Two microphone endfire beamformer signal to noise plus interferer improvements

Two microphones can implement a differential beamformer that jointly forms a beam and a null in specified directions. Signals from undesired directions are treated as interfering signals, which are correlated across microphones, whereas omnidirectional background noise is uncorrelated across microphones. We derive the expected signal to noise plus interference ratio (SNIR) gain for the two microphone solution. Consider a far field source impinging on two microphones in an anechoic chamber, as shown in Figure 1:

Figure 1: Two microphone array

Suppose the signal at each microphone $i \in \{1, 2\}$ is given as:

$x_i(w) = s(w) e^{\left(-jw \frac{(i-1) d}{c} \sin{\theta} \right)} + v(w) e^{\left(-jw \frac{(i-1) d}{c} \sin{\beta} \right)} + n_i(w)$

where $s(w)$ is the desired speech signal, $\theta$ is the direction of arrival (DOA) of the speech signal with respect to the normal to the axis joining the microphones, $v(w)$ is the directional interferer, $\beta$ is the DOA of the directional interferer, $d$ is the microphone separation, $c$ is the speed of sound and $n_i(w)$ is independent and identically distributed zero mean noise with variance $\sigma_n^2$. For the endfire configuration, we assume $\theta = -90^{\circ}$ and $\beta = 90^{\circ}$. The input SNIR per frequency bin $w$, denoted $iSNIR(w)$, is given as

$iSNIR = \frac{\mathbb{E}\left[\left|s(w) e^{\left(-jw \frac{(i-1) d}{c} \sin{\theta} \right)}\right|^2 \right]}{\mathbb{E}\left[\left |v(w) e^{\left(-jw \frac{(i-1) d}{c} \sin{\beta} \right)} + n_i(w)\right|^2 \right]} = \frac{|s(w)|^2}{|v(w)|^2 + \sigma_n^2}$

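where $\mathbb{E}[\cdot]$ is the expectation operator; the cross term vanishes because the noise is zero mean and uncorrelated with the interferer. The differential beamformer delays the first microphone signal by $d/c$ and subtracts the second, $x(w) = x_1(w) e^{\left(-jw \frac{d}{c}\right)} - x_2(w)$, so the output becomes: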

$x(w) = s(w) \left(e^{\left(-jw \frac{d}{c}\right)} -e^{\left(-jw \frac{d}{c} \sin{\theta} \right)}\right) + v(w) \left(e^{\left(-jw \frac{d}{c} \right)} -e^{\left(-jw \frac{d}{c}\sin{\beta} \right)}\right) + \left(n_1(w) e^{\left(-jw \frac{d}{c} \right)} -n_2(w)\right)$
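The directional factor multiplying $s(w)$ and $v(w)$ in the output can be evaluated numerically. The following is a minimal sketch, not the author's implementation: the function name `beam_response`, the speed of sound of 343 m/s and the 2 cm spacing are assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the text


def beam_response(doa_deg, freq_hz, d, c=SPEED_OF_SOUND):
    """Complex gain seen by a unit plane wave arriving from doa_deg.

    Evaluates e^{-j w d/c} - e^{-j w (d/c) sin(doa)}: microphone 1 is
    delayed by d/c and microphone 2 is subtracted from it.
    """
    omega = 2.0 * math.pi * freq_hz
    tau = d / c
    delayed_mic1 = cmath.exp(-1j * omega * tau)
    mic2 = cmath.exp(-1j * omega * tau * math.sin(math.radians(doa_deg)))
    return delayed_mic1 - mic2


if __name__ == "__main__":
    # Illustrative spacing of 2 cm at 4 kHz (hypothetical values).
    for doa in (-90, -45, 0, 45, 90):
        gain = abs(beam_response(doa, freq_hz=4000.0, d=0.02))
        print(f"DOA {doa:+4d} deg: |response| = {gain:.3f}")
```

With this delay, the two exponentials coincide at a DOA of $90^{\circ}$, so the response vanishes there (the null), while a source at $-90^{\circ}$ sees the largest path difference.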

The output SNIR per frequency bin $w$, denoted $oSNIR(w)$, is given as:

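$oSNIR = \frac{|s(w)|^2 \left(1 - \cos{\left(w \frac{d}{c} (1-\sin{\theta})\right)}\right)}{|v(w)|^2 \left(1 - \cos{\left(w \frac{d}{c} (1-\sin{\beta})\right)}\right) + \sigma_n^2}$

where we used $|e^{-ja} - e^{-jb}|^2 = 2\left(1 - \cos{(a-b)}\right)$ for the signal and interferer terms and the output noise power $2\sigma_n^2$, and then cancelled the common factor of $2$. The SNIR improvement, SNIRI, then becomes:

$SNIRI = \frac{oSNIR}{iSNIR} = \left(\alpha(w) + 1\right) \frac{1 - \cos{\left(w \frac{d}{c} (1-\sin{\theta})\right)}}{\alpha(w) \left(1 - \cos{\left(w \frac{d}{c} (1-\sin{\beta})\right)}\right) + 1}$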

where $\alpha(w) = \frac{|v(w)|^2}{\sigma_n^2}$.
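As a rough numerical check, the SNIRI can be computed directly from the beamformer output expression rather than from a closed form. The sketch below assumes a speed of sound of 343 m/s; the names `power_gain` and `sniri` are hypothetical. It takes the squared magnitude of the directional factor for the signal and for the interferer, and uses the fact that the two independent noise terms add to an output noise power of $2\sigma_n^2$.

```python
import cmath
import math


def power_gain(doa_deg, freq_hz, d, c=343.0):
    """|e^{-j w d/c} - e^{-j w (d/c) sin(doa)}|^2 for a plane wave from doa_deg."""
    omega_tau = 2.0 * math.pi * freq_hz * d / c
    resp = cmath.exp(-1j * omega_tau) - cmath.exp(
        -1j * omega_tau * math.sin(math.radians(doa_deg))
    )
    return abs(resp) ** 2


def sniri(theta_deg, beta_deg, alpha_db, freq_hz, d, c=343.0):
    """SNIR improvement (linear scale) of the two microphone differential beamformer.

    alpha_db is the interferer to noise ratio |v|^2 / sigma_n^2 in dB.
    oSNIR = |s|^2 g_s / (|v|^2 g_v + 2 sigma_n^2) and
    iSNIR = |s|^2 / (|v|^2 + sigma_n^2), so their ratio is
    (alpha + 1) g_s / (alpha g_v + 2).
    """
    alpha = 10.0 ** (alpha_db / 10.0)
    g_s = power_gain(theta_deg, freq_hz, d, c)  # gain on the desired signal
    g_v = power_gain(beta_deg, freq_hz, d, c)   # gain on the interferer
    return (alpha + 1.0) * g_s / (alpha * g_v + 2.0)


if __name__ == "__main__":
    # Endfire case from the text; the 2 cm spacing is an illustrative choice.
    value = sniri(theta_deg=-90.0, beta_deg=90.0, alpha_db=20.0,
                  freq_hz=4000.0, d=0.02)
    print(f"SNIRI = {10.0 * math.log10(value):.2f} dB")
```

In the endfire case the interferer sits exactly in the null, so its gain is zero and the improvement is limited only by the doubled noise power at the output.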

A sample expected SNIRI at a frequency of $4$ kHz is shown in Figure 2 below for different separation distances $d$ for $16$ microphones. The desired direction is $-90^{\circ}$. It can be seen that the SNIRI improves smoothly for small distances $d$, but there are distortions at large $d$ (not shown in Figure 2). The value of the interferer to noise ratio $\alpha$ also plays a role in the expected SNIRI.

Figure 2: SNIRI (dB) for different DOA (degrees) and $d$. Left is for $\alpha = 20$ dB, right is for $\alpha = 30$ dB
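The dependence on $d$ can be explored with a small sweep. This is a sketch under stated assumptions and not a reproduction of Figure 2: it assumes two microphones, a speed of sound of 343 m/s, the endfire angles $\theta = -90^{\circ}$ and $\beta = 90^{\circ}$, and a hypothetical grid of spacings from 0.5 cm to 4 cm. For $\theta = -90^{\circ}$ the desired signal gain is $2\left(1 - \cos{(2 w d / c)}\right)$, which peaks at $d = c/(4f)$ and falls back to zero at $d = c/(2f)$; this may be related to the degradation at large $d$ noted above.

```python
import cmath
import math

C = 343.0       # speed of sound in m/s (assumed)
FREQ = 4000.0   # Hz, the frequency quoted for Figure 2


def sniri_db(d, theta_deg=-90.0, beta_deg=90.0, alpha_db=20.0):
    """SNIRI in dB for microphone spacing d (metres) at FREQ."""
    omega_tau = 2.0 * math.pi * FREQ * d / C

    def gain(doa_deg):
        # Squared magnitude of the delay-and-subtract response.
        phase = omega_tau * math.sin(math.radians(doa_deg))
        return abs(cmath.exp(-1j * omega_tau) - cmath.exp(-1j * phase)) ** 2

    alpha = 10.0 ** (alpha_db / 10.0)
    ratio = (alpha + 1.0) * gain(theta_deg) / (alpha * gain(beta_deg) + 2.0)
    return 10.0 * math.log10(ratio) if ratio > 0.0 else float("-inf")


# Hypothetical grid of spacings: 0.5 cm to 4.0 cm in 0.5 cm steps.
spacings_cm = [0.5 * k for k in range(1, 9)]
sweep = [(d_cm, sniri_db(d_cm / 100.0)) for d_cm in spacings_cm]
best_d_cm, best_db = max(sweep, key=lambda pair: pair[1])

if __name__ == "__main__":
    for d_cm, value_db in sweep:
        print(f"d = {d_cm:.1f} cm: SNIRI = {value_db:6.2f} dB")
    print(f"largest SNIRI on this grid at d = {best_d_cm:.1f} cm")
    print(f"quarter wavelength c/(4f) = {100.0 * C / (4.0 * FREQ):.2f} cm")
```

The grid point with the largest SNIRI should sit next to the quarter wavelength spacing, which gives a quick sanity check on the chosen $d$ for a given frequency.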

VOCAL Technologies offers custom designed solutions for beamforming with a robust voice activity detector, acoustic echo cancellation and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific beamforming task. Contact us today to discuss your solution!