Gauging the performance of a blind source separation (BSS) algorithm is non-trivial. The usual SNR improvement metrics do not suffice because there are multiple desired target signals. A metric that has gained traction over the past decade is the signal-to-distortion ratio (SDR).
To set up the problem, consider the system illustrated in Figure 1 below.
Figure 1: $m$ source signals impinging on two microphones
Now, suppose several algorithms are used to extract the desired target signals and their performance has to be ranked. Consider an extracted target made up of three components: the $k^{th}$ time-frequency desired signal, the noise component in the extracted signal, and the time-frequency distortion artifact in the desired signal. Based on the SNR concept, it is intuitive to define the so-called SDR as the ratio of the power of the desired signal to the power of the remaining components of the extracted signal.
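In symbols, this can be sketched as follows. The notation is an assumption made for illustration: $\hat{s}_k(t,f)$ is the extracted target, $s_k(t,f)$ the desired signal, $n_k(t,f)$ the noise component, and $d_k(t,f)$ the distortion artifact, with $t$ the time frame and $f$ the frequency bin. The additive form and the decibel scaling are likewise assumed, following the usual SNR convention.

$$\hat{s}_k(t,f) = s_k(t,f) + n_k(t,f) + d_k(t,f)$$

$$\mathrm{SDR}_k = 10 \log_{10} \frac{\sum_{t,f} \left| s_k(t,f) \right|^2}{\sum_{t,f} \left| n_k(t,f) + d_k(t,f) \right|^2}$$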
Suppose the mixing matrix is a time-invariant gain. The temporal dependency can then be dropped from the SDR definition.
To compute this metric in practice, the extracted signals have to be decomposed. Almost all BSS algorithms explicitly assume that the target signals are statistically independent. In the absence of noise, the target signals in the mixture can be decomposed by orthogonal projection onto the subspace they span. The target component of an extracted signal is obtained by applying the orthogonal subspace projector to it, and what remains after the projection is the distortion.
Inner products are then used to compute the SDR.
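The projection step can be illustrated with a minimal Python sketch using NumPy. It assumes the noise-free case, that the true source signals are available as references (as in a simulation), and that each signal is a one-dimensional vector (time-domain samples or flattened time-frequency bins). The function names `decompose` and `sdr_db` are hypothetical, and splitting the residual into interference and artifact terms is one common choice rather than the only one.

```python
import numpy as np

def decompose(estimate, references, k):
    """Split an extracted signal into target, interference and artifact parts.

    estimate:   1-D array, the extracted signal for source k
    references: 2-D array of shape (m, n), one true source per row (assumed known)
    """
    S = np.asarray(references)
    e = np.asarray(estimate)
    s_k = S[k]
    # Orthogonal projection of the estimate onto the desired source alone.
    # np.vdot conjugates its first argument, so this also works for complex data.
    s_target = (np.vdot(s_k, e) / np.vdot(s_k, s_k)) * s_k
    # Orthogonal projection onto the subspace spanned by all sources,
    # obtained from the least-squares coefficients.
    coeffs, *_ = np.linalg.lstsq(S.T, e, rcond=None)
    p_all = S.T @ coeffs
    e_interf = p_all - s_target   # leakage from the other sources
    e_artif = e - p_all           # part orthogonal to every source
    return s_target, e_interf, e_artif

def sdr_db(estimate, references, k):
    """SDR in dB: target energy over the energy of everything else."""
    s_target, e_interf, e_artif = decompose(estimate, references, k)
    distortion = e_interf + e_artif
    num = np.vdot(s_target, s_target).real
    den = np.vdot(distortion, distortion).real
    return 10.0 * np.log10(num / den)

# Toy example: two orthogonal sources plus an artifact outside their span.
refs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0]])
est = 2.0 * refs[0] + 0.5 * refs[1] + np.array([0.0, 0.0, 0.5, 0.0])
print(sdr_db(est, refs, k=0))
```

Because the target component is defined by projection, the resulting SDR does not change when the extracted signal is multiplied by a constant gain, which is convenient when ranking algorithms whose outputs are scaled differently.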