Complete Communications Engineering

The drawback of using the Geigel algorithm for detecting cross-talk (double-talk) is that, when the far-end speech is at least an order of magnitude higher in sound pressure level (SPL), it becomes difficult to choose the threshold level for comparison. An alternative approach that addresses this issue is to band-limit the signal from the far end. For example, in telephone systems, signals are constrained to frequencies of approximately 300 Hz to 3.4 kHz. The near-end speech, however, occupies the full bandwidth. The received signal can then be band-limited to frequency ranges outside the far-end speech. For systems other than telephony, frequency constraints can be placed on the far-end speech before it is played through the loudspeaker.
Consider the system depicted in Figure 1 below:
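As a rough illustration of placing a frequency constraint on the far-end speech, the sketch below zeroes every DFT bin of a short frame that falls outside a chosen pass band. The function name `band_limit`, the frame-wise DFT masking, and the sample rate and frame length in the usage note are assumptions made for this example; a practical system would more likely use a band-pass filter.

```python
import cmath
import math


def band_limit(frame, fs, f_lo, f_hi):
    """Zero every DFT bin of `frame` whose frequency lies outside [f_lo, f_hi] Hz."""
    n = len(frame)
    # Forward DFT in direct O(n^2) form, adequate for a short illustrative frame.
    spectrum = [
        sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies, so fold them back
        # to an absolute frequency before testing against the pass band.
        freq = min(k, n - k) * fs / n
        if not (f_lo <= freq <= f_hi):
            spectrum[k] = 0.0
    # Inverse DFT. The mask is symmetric, so the imaginary part is only
    # numerical round-off and is discarded.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n).real
        for m in range(n)
    ]
```

For instance, with an assumed sample rate of 16 kHz and a 160-sample frame, a frame containing tones at 1 kHz and 5 kHz passed through `band_limit(frame, 16000, 300.0, 3400.0)` keeps only the 1 kHz tone.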


Figure 1: Single line AEC architecture

x[n] is the far-end speech, whilst s[n] is the near-end speech. The far-end speech is frequency-constrained such that

$latex x[n]: x(w) = \begin{cases} f(x[n],w), & w_1 \le w \le w_2 \\ 0, & \text{otherwise} \end{cases}$

where $latex f(x[n],w)$ is the time-frequency representation of the far-end speech. The problem then becomes one of correctly detecting the presence of near-end speech in the reverberant signal at the near-end microphone purely from the energy in the suppressed frequency bands. Now consider a received signal

$latex r[n] = \alpha s[n] + \sum\limits_{l=0}^{L} h[l]x[n-l] + v[n], \quad \alpha \in [0,1], \alpha \in \mathbb{Z}$

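where h[l] is the impulse response of the echo path, v[n] is additive noise, and α indicates whether near-end speech is present (α = 1) or absent (α = 0). The time-frequency domain representation becomes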

$latex r(w,t) = \alpha s(w,t) + x(w,t)\sum\limits_{l=0}^{L} h[l]e^{-jwl} + v(w,t), \quad \alpha \in [0,1], \alpha \in \mathbb{Z}$
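The time-domain form of the received-signal model can be simulated directly, which is handy for generating test input for a detector. The sketch below is only an illustration: the function name `received_signal`, the Gaussian noise model, and the `noise_std` and `seed` parameters are assumptions, not part of the model itself.

```python
import random


def received_signal(s, x, h, alpha, noise_std=0.0, seed=0):
    """Return r[n] = alpha*s[n] + sum_l h[l]*x[n-l] + v[n] for n = 0..len(x)-1."""
    rng = random.Random(seed)
    r = []
    for n in range(len(x)):
        # Echo term: far-end speech convolved with the echo-path response h.
        # Samples of x before n = 0 are treated as zero.
        echo = sum(h[l] * x[n - l] for l in range(len(h)) if n - l >= 0)
        # Additive noise v[n]; Gaussian noise is an assumption of this sketch.
        v = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        r.append(alpha * s[n] + echo + v)
    return r
```

Setting `alpha=0` gives an echo-only signal (far-end talker alone), while `alpha=1` adds the near-end speech on top of the echo.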

If we consider only the out-of-band frequencies, where x(w,t) = 0 by construction, the echo term vanishes and

$latex \hat{r}(w,t) = \alpha s(w,t) + v(w,t), \quad w \notin [w_1,w_2]$

The energy-based detection scheme is then given as:

$latex \beta[n] = \begin{cases} 1, & |\hat{r}[n]| > \gamma \\ 0, & \text{otherwise} \end{cases}$
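A minimal sketch of this decision rule is shown below. It assumes frame-wise processing, uses the energy of the DFT bins outside the far-end band as the detection statistic (standing in for the magnitude of the out-of-band signal), and treats the threshold γ as a tuning parameter. The function names, the default band edges of 300 Hz and 3.4 kHz, and the default `gamma` are hypothetical choices for illustration only.

```python
import cmath
import math


def out_of_band_energy(frame, fs, f_lo, f_hi):
    """Energy of the DFT bins of `frame` outside [f_lo, f_hi] Hz, scaled by 1/n^2."""
    n = len(frame)
    total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequencies only
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            continue  # skip the band occupied by the far-end speech
        # Direct DFT of bin k; O(n) per bin, fine for a short frame.
        acc = sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        total += abs(acc) ** 2
    return total / (n * n)


def detect_near_end(frame, fs, f_lo=300.0, f_hi=3400.0, gamma=1e-3):
    """Return 1 if the out-of-band energy exceeds the threshold gamma, else 0."""
    return 1 if out_of_band_energy(frame, fs, f_lo, f_hi) > gamma else 0
```

Because the far-end speech contributes nothing outside its band, the threshold only has to separate near-end speech from the noise floor, rather than from a far-end signal that may be much louder. In practice γ would be set relative to a measured estimate of the out-of-band noise energy.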

VOCAL Technologies offers custom-designed solutions for AEC with robust double-talk detection, voice activity detection, beamforming and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific task. Contact us today to discuss your solution!
