Typical algorithms for acoustic echo cancellation (AEC) operate in the time domain on a convolutive mixture, with a runtime on the order of \mathcal{O}(NL), where L is the length of the echo tail and N is the length of the time window used. Computing the optimal filter in the time-frequency domain reduces the computational burden by replacing the convolution with a multiplication in each frequency bin. Consider the system depicted in Figure 1 below:


Figure 1: Single line AEC architecture
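
To make the convolution-versus-multiplication point concrete, the following sketch compares a direct time-domain convolution with the same result obtained by multiplying spectra. It assumes Python with NumPy. The window length N, the tail length L, and the decaying random echo path are illustrative choices rather than values from this article, and the taps are indexed from zero here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 256, 32                      # window length and echo-tail length (illustrative)
x = rng.standard_normal(N)          # stand-in for the reference signal
h = rng.standard_normal(L) * np.exp(-0.1 * np.arange(L))  # hypothetical decaying echo path

# Time domain: each output sample needs up to L multiply-adds, so O(N*L) overall.
y_time = np.convolve(x, h)          # full linear convolution, length N + L - 1

# Frequency domain: zero-pad so that circular convolution equals linear
# convolution, then multiply the spectra bin by bin.
n_out = N + L - 1
n_fft = 1 << (n_out - 1).bit_length()   # next power of two >= n_out
Y = np.fft.rfft(x, n_fft) * np.fft.rfft(h, n_fft)
y_freq = np.fft.irfft(Y, n_fft)[:n_out]

max_diff = float(np.max(np.abs(y_time - y_freq)))
print(f"max |y_time - y_freq| = {max_diff:.2e}")
```

The two results should agree to within floating-point rounding. The FFT route costs on the order of n_fft log n_fft operations per block, which is where the savings over \mathcal{O}(NL) come from when the echo tail is long.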


The problem at hand is to estimate the filter taps h[.] so that the filter output f(h[.],x[.]) = \sum\limits_{i=1}^M h[i]x[k-i] minimizes the error. The presence of the signal s[n] may cause the adaptive system to cancel out the speech signal, which is why most AECs update the filter coefficients only when no speech is detected. Now consider a received signal

r[k] = \sum\limits_{i=1}^M h[i]x[k-i] + v[k]

where v[k] is additive noise. Assume that x[k-i] is known for all i \in \{1, \cdots, M\} and all k. We want to find the estimates \hat{h}[i], \forall i \in \{1, \cdots, M\}, such that |e[n]|^2 is minimized, where

e[n] = \sum\limits_{i=1}^M (h[i]-\hat{h}[i])x[n-i] + v[n]
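
One common way to drive |e[n]|^2 down sample by sample is the normalized LMS (NLMS) update. The sketch below is a minimal illustration under stated assumptions, not necessarily the method used here: it assumes Python with NumPy, white-noise x, a synthetic decaying echo path, and a hypothetical `adapt` mask that freezes the taps while speech is detected. The function name `nlms`, the step size `mu`, and all signal parameters are chosen for illustration only.

```python
import numpy as np

def nlms(x, r, M, mu=0.5, eps=1e-8, adapt=None):
    """Estimate taps h[1..M] of r[k] = sum_i h[i] x[k-i] + v[k] with NLMS.

    adapt is an optional boolean array (hypothetical); where it is False the
    taps are frozen, e.g. while near-end speech is detected.
    """
    h_hat = np.zeros(M)
    past = np.zeros(M)              # past[i-1] holds x[k-i]
    e = np.zeros(len(x))
    for k in range(len(x)):
        e[k] = r[k] - h_hat @ past  # error using the current estimate
        if adapt is None or adapt[k]:
            h_hat += mu * e[k] * past / (past @ past + eps)
        past = np.roll(past, 1)     # shift the delay line ...
        past[0] = x[k]              # ... and store the newest sample
    return h_hat, e

# Synthetic test signal following the model r[k] = sum_{i=1}^M h[i] x[k-i] + v[k].
rng = np.random.default_rng(1)
N, M = 4000, 16
x = rng.standard_normal(N)
h = rng.standard_normal(M) * np.exp(-0.2 * np.arange(M))    # h[0] here is tap h[1]
v = 0.01 * rng.standard_normal(N)
echo = np.concatenate(([0.0], np.convolve(x, h)[:N - 1]))   # one-sample minimum delay
r = echo + v

h_hat, e = nlms(x, r, M)
misalignment = float(np.linalg.norm(h - h_hat) / np.linalg.norm(h))
print(f"relative tap error: {misalignment:.4f}")
```

Passing `adapt=np.zeros(N, dtype=bool)` leaves the taps at zero, which mirrors the practice of updating the coefficients only when no speech is detected.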

A sample result for estimating the filter weights in the frequency domain is shown in Figure 2.


Figure 2: Performance of frequency domain AEC
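
As a rough illustration of estimating the weights in the frequency domain, the sketch below computes a least-squares filter independently in each frequency bin from averaged spectra and then transforms it back to M taps. This is an assumed, simplified estimator and not necessarily the method behind Figure 2. It assumes Python with NumPy and white-noise x; the function name `estimate_taps_freq`, the frame length, and the synthetic echo path are hypothetical. Treating each frame as a circular convolution is only approximate, and the mismatch shrinks as `frame_len` grows relative to M.

```python
import numpy as np

def estimate_taps_freq(x, r, M, frame_len=512):
    """Per-bin least-squares estimate of h[1..M] from averaged spectra."""
    n_frames = len(x) // frame_len
    n_bins = frame_len // 2 + 1
    S_xr = np.zeros(n_bins, dtype=complex)   # accumulated cross-spectrum conj(X) * R
    S_xx = np.zeros(n_bins)                  # accumulated power spectrum |X|^2
    for m in range(n_frames):
        seg = slice(m * frame_len, (m + 1) * frame_len)
        X = np.fft.rfft(x[seg])
        R = np.fft.rfft(r[seg])
        S_xr += np.conj(X) * R
        S_xx += np.abs(X) ** 2
    H = S_xr / (S_xx + 1e-12)                # minimizes sum |R - H X|^2 in each bin
    h_full = np.fft.irfft(H, frame_len)      # back to an impulse response
    return h_full[1:M + 1]                   # taps for delays 1..M

# Synthetic data following r[k] = sum_{i=1}^M h[i] x[k-i] + v[k].
rng = np.random.default_rng(2)
N, M = 65536, 16
x = rng.standard_normal(N)
h = rng.standard_normal(M) * np.exp(-0.2 * np.arange(M))    # h[0] here is tap h[1]
v = 0.01 * rng.standard_normal(N)
r = np.concatenate(([0.0], np.convolve(x, h)[:N - 1])) + v

h_hat_freq = estimate_taps_freq(x, r, M)
misalignment_freq = float(np.linalg.norm(h - h_hat_freq) / np.linalg.norm(h))
print(f"relative tap error: {misalignment_freq:.4f}")
```

Each bin is solved with a single complex division, so no time-domain convolution is needed during estimation. A practical AEC would typically use an adaptive, block-by-block variant rather than this batch average.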


VOCAL Technologies offers custom-designed solutions for AEC with a robust voice activity detector, beamforming and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific task. Contact us today to discuss your solution!