A first-order endfire differential beamformer is widely used when the noise source to be suppressed arrives from the direction opposite the desired signal. Even in the absence of such an interfering signal, however, the algorithm should still afford some performance improvement against uncorrelated noise. The signal-to-noise ratio (SNR) improvement of a differential beamformer differs from that of the conventional delay-and-sum beamformer. Here we derive the expected SNR gain for uncorrelated noise on two microphones.

Consider a far-field source impinging on two microphones in an anechoic chamber, as shown in Figure 1:

Figure 1: 2 microphone array

Suppose the signal at each microphone $i \in \{1, \cdots, N\}$, with $N = 2$ here, is given as

$x_i(w) = s(w) e^{\left(-jw \frac{(i-1) d}{c} \sin{\theta} \right)} + n_i(w)$

where $s(w)$ is the desired speech signal at angular frequency $w$, $d$ is the separation distance between adjacent microphones, $c$ is the speed of sound, $\theta$ is the direction of arrival (DOA) of the speech signal with respect to the normal to the axis joining the microphones, and $n_i(w)$ is the noise at microphone $i$. The noise is uncorrelated with the desired signal and across microphones, such that

$\mathbb{E} [s(w) n_i^*(w)] = 0, \quad \forall i \in \{1,2\}$

and

$\mathbb{E} [n_i(w) n_j^*(w)] = 0, \quad i \neq j, \quad i,j \in \{1,2\}$

where $\mathbb{E}[\cdot]$ is the expectation operator. For the endfire configuration, we assume $\theta = -90^{\circ}$. The input SNR per frequency bin $w$, denoted $iSNR(w)$, is given as:

$iSNR(w) = \frac{\mathbb{E}\left[|s(w)|^2 \right]}{\mathbb{E}\left[\left |n_1(w)\right|^2 \right]} = \frac{|s(w)|^2}{\sigma_n^2}$

where $\sigma_n^2 = \mathbb{E}\left[|n_i(w)|^2\right]$ is the noise power, assumed equal at both microphones.
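The signal model and noise assumptions above can be checked numerically. The following is a minimal NumPy sketch and not part of the original derivation: the function name `simulate_mics`, the speed of sound `c = 343` m/s, the 2 cm spacing, the noise level, and the unit-power random-phase stand-in for $s(w)$ are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_mics(s, omega, d, theta, sigma_n, n_mics=2, c=343.0, rng=None):
    """Hypothetical helper: snapshots of x_i(w) for one frequency bin.

    Returns (x, noise), each of shape (n_mics, n_snapshots).
    """
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(s, dtype=complex)
    idx = np.arange(n_mics)[:, None]  # (i - 1) = 0, 1, ...
    phase = np.exp(-1j * omega * idx * d / c * np.sin(theta))
    # Circular complex Gaussian noise, independent per microphone,
    # scaled so that E[|n_i|^2] = sigma_n**2 (an assumed noise model).
    noise = sigma_n / np.sqrt(2) * (
        rng.standard_normal((n_mics, s.size))
        + 1j * rng.standard_normal((n_mics, s.size))
    )
    return s[None, :] * phase + noise, noise

rng = np.random.default_rng(0)
n_snap = 200_000
s = np.exp(1j * rng.uniform(0, 2 * np.pi, n_snap))  # unit power, random phase
x, n = simulate_mics(s, omega=2 * np.pi * 1000.0, d=0.02,
                     theta=-np.pi / 2, sigma_n=0.5, rng=rng)

cross_sn = np.mean(s * np.conj(n[0]))     # estimate of E[s n_1^*]
cross_nn = np.mean(n[0] * np.conj(n[1]))  # estimate of E[n_1 n_2^*]
noise_power = np.mean(np.abs(n[0]) ** 2)  # estimate of sigma_n^2
i_snr = np.mean(np.abs(s) ** 2) / noise_power

print(f"|E[s n1*]|  ~ {abs(cross_sn):.4f}")
print(f"|E[n1 n2*]| ~ {abs(cross_nn):.4f}")
print(f"iSNR        ~ {i_snr:.3f}")
```

With enough snapshots, both cross-correlation estimates should sit close to zero, and the estimated input SNR should approach $|s(w)|^2 / \sigma_n^2$ for the chosen values.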

The differential beamformer delays the first microphone signal by $d/c$ and subtracts the second microphone signal, so the output becomes:

$x(w) = s(w) \left(e^{\left(-jw \frac{d}{c}\right)} -e^{\left(-jw \frac{d}{c} \sin{\theta} \right)}\right) + \left(n_1(w) e^{\left(-jw \frac{d}{c} \right)} -n_2(w)\right)$
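The beamformer output above can also be simulated directly to estimate the SNR improvement empirically. This is a hedged sketch under stated assumptions, not the author's implementation: the quarter-wavelength spacing, `c = 343` m/s, the Gaussian noise model, the unit-power random-phase signal, and all variable names (including `y` for the beamformer output) are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
c = 343.0            # assumed speed of sound, m/s
f = 1000.0           # frequency of the bin under study, Hz
d = c / (4 * f)      # assumed spacing: a quarter wavelength at f
theta = -np.pi / 2   # endfire direction of arrival
omega = 2 * np.pi * f
sigma_n = 1.0        # assumed noise standard deviation per microphone
n_snap = 400_000

def cnoise(size):
    """Circular complex Gaussian noise with E[|n|^2] = sigma_n**2."""
    return sigma_n / np.sqrt(2) * (
        rng.standard_normal(size) + 1j * rng.standard_normal(size)
    )

s = np.exp(1j * rng.uniform(0, 2 * np.pi, n_snap))  # unit-power signal
n1, n2 = cnoise(n_snap), cnoise(n_snap)

# Microphone signals from the model: mic 1 is the phase reference.
steer = np.exp(-1j * omega * d / c * np.sin(theta))
x1 = s + n1
x2 = s * steer + n2

# Differential beamformer: delay mic 1 by d/c, then subtract mic 2.
tau = np.exp(-1j * omega * d / c)
y = x1 * tau - x2

# Split the output into its signal and noise parts to measure each power.
y_sig = s * (tau - steer)
y_noise = n1 * tau - n2

i_snr = np.mean(np.abs(s) ** 2) / np.mean(np.abs(n1) ** 2)
o_snr = np.mean(np.abs(y_sig) ** 2) / np.mean(np.abs(y_noise) ** 2)
snri = o_snr / i_snr

print(f"iSNR ~ {i_snr:.3f}, oSNR ~ {o_snr:.3f}, "
      f"SNRI ~ {10 * np.log10(snri):.2f} dB")
```

Changing `d`, `f`, or `theta` and rerunning gives an empirical value that can be compared against the closed-form SNRI.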

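The output SNR per frequency bin $w$, denoted $oSNR(w)$, is the ratio of the signal power to the noise power at the beamformer output:

$oSNR(w) = \frac{\mathbb{E}\left[|s(w)|^2 \left|e^{-jw \frac{d}{c}} - e^{-jw \frac{d}{c} \sin{\theta}}\right|^2\right]}{\mathbb{E}\left[|n_1(w)|^2 + |n_2(w)|^2 - 2 \, \mathbb{R}e\{n_1(w) n_2^*(w) e^{-jw\frac{d}{c}} \}\right]}$

Since $\left|e^{-ja} - e^{-jb}\right|^2 = 2\left(1 - \cos{(a-b)}\right)$ and the noise cross term has zero expectation, this reduces to

$oSNR(w) = \frac{2 |s(w)|^2 \left(1 - \cos{(w \frac{d}{c} (1-\sin{\theta}))}\right)}{2 \sigma_n^2} = \frac{|s(w)|^2 \left(1 - \cos{(w \frac{d}{c} (1-\sin{\theta}))}\right)}{\sigma_n^2}$

The SNR improvement (SNRI) then becomes

$SNRI(w) = \frac{oSNR(w)}{iSNR(w)} = 1 - \cos{\left(w \frac{d}{c} (1-\sin{\theta})\right)}$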

For a given frequency and direction of arrival, the SNRI is a function of only the separation distance $d$. A sample expected SNRI at a frequency of 1 kHz is shown in Figure 2 below for different separation distances $d$, with the desired direction at $-90^{\circ}$. It can be seen that the SNRI degrades smoothly as $d$ increases, but periodic distortions appear if a large value of $d$ is used, because the cosine term is periodic in $d$. Thus, the magnitude of $d$ plays a major role in the expected SNRI.

Figure 2: SNRI (dB) for different $d$
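The dependence of the SNRI on $d$ can be explored numerically without relying on a closed form, by evaluating the beamformer's response to the signal and dividing by its noise power gain of 2 (two uncorrelated noise terms of equal power). The sketch below is illustrative only: the function name `snri_db`, the speed of sound `c = 343` m/s, and the list of spacings are assumptions, and it does not reproduce the data behind Figure 2.

```python
import numpy as np

def snri_db(d, f=1000.0, theta=-np.pi / 2, c=343.0):
    """Hypothetical helper: SNRI in dB for spacing d (metres) at frequency f."""
    omega = 2 * np.pi * f
    d = np.asarray(d, dtype=float)
    # Signal response of the delay-and-subtract beamformer.
    response = (np.exp(-1j * omega * d / c)
                - np.exp(-1j * omega * d / c * np.sin(theta)))
    # Signal power gain divided by the noise power gain (2).
    gain = np.abs(response) ** 2 / 2.0
    with np.errstate(divide="ignore"):  # d = 0 gives gain 0, i.e. -inf dB
        return 10 * np.log10(gain)

# Assumed spacings in metres, up to half a wavelength at 1 kHz.
spacings = np.array([0.01, 0.02, 0.05, 0.343 / 4, 0.10, 0.343 / 2])
for d, g in zip(spacings, snri_db(spacings)):
    print(f"d = {100 * d:6.2f} cm   SNRI = {g:8.2f} dB")
```

Sweeping a denser grid of `d` values and plotting the result shows where the periodic behaviour sets in for a chosen frequency.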
