Stereo or surround sound acoustic echo cancellation can be considered a special case of multi-channel cancellation with one main difference: the reference (speaker output) signals are heavily mutually dependent. This dependence creates a troublesome problem for all Wiener-filtering-based cancellation algorithms: the solution is not unique.

We use stereo music playback as the example. The setup has the left and right channels of the stereo music and a single microphone on the right; s(n) is a local sound source.

The microphone captures the sum of the local sound and the two speaker signals, which arrive through the acoustic channels h1(n) and h2(n) respectively:

y(n) = s(n) + x1(n)*h1(n) + x2(n)*h2(n),

where * denotes convolution, and x1(n)*h1(n) and x2(n)*h2(n) are the echo contributions to be removed.
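The signal model can be sketched in a few lines of Python. This is only an illustration, assuming the two echoes add to the local sound at the microphone; the helper names `convolve` and `microphone_signal` and the short echo paths are hypothetical values chosen for the example.

```python
import random

def convolve(x, h):
    """Full linear convolution of two sequences (length len(x) + len(h) - 1)."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out

def microphone_signal(s, x1, x2, h1, h2):
    """Local sound plus both speaker echoes, truncated to len(s) samples."""
    echo1 = convolve(x1, h1)
    echo2 = convolve(x2, h2)
    return [s[n] + echo1[n] + echo2[n] for n in range(len(s))]

random.seed(0)
N = 8
x1 = [random.uniform(-1, 1) for _ in range(N)]  # left speaker signal
x2 = [random.uniform(-1, 1) for _ in range(N)]  # right speaker signal
s = [0.0] * N        # no local sound, so y is pure echo
h1 = [0.5, 0.25]     # hypothetical echo path, left speaker to microphone
h2 = [0.3, -0.1]     # hypothetical echo path, right speaker to microphone
y = microphone_signal(s, x1, x2, h1, h2)
```

Real room responses are hundreds or thousands of taps long; the two-tap paths here only keep the example readable.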

The main feature of stereo and surround sound signals is that they describe the same audio scene. The difference between channels may be due simply to panning. Therefore,

∆(n) = x1(n) - x2(n),

may be much smaller in power than either the left or the right channel.
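A small numerical sketch makes the point concrete. It assumes a single mono source that is amplitude-panned slightly off center; the gains `g1` and `g2` and the test tone are hypothetical values chosen for illustration.

```python
import math

def mean_power(x):
    """Average power of a sequence."""
    return sum(v * v for v in x) / len(x)

# Hypothetical mono source panned slightly off center with gains g1 and g2.
source = [math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
g1, g2 = 0.75, 0.65
x1 = [g1 * v for v in source]             # left channel
x2 = [g2 * v for v in source]             # right channel
delta = [a - b for a, b in zip(x1, x2)]   # channel difference

# Power of the difference relative to the left channel, in dB.
ratio_db = 10 * math.log10(mean_power(delta) / mean_power(x1))
print(round(ratio_db, 1))  # about -17.5 dB for these gains
```

For pure amplitude panning the ratio depends only on the gains, (g1 - g2)/g1, so the closer the source sits to the center, the weaker the difference signal becomes.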

We use the extreme case ∆(n) = 0 to demonstrate the non-uniqueness of the cancellation solution. In this case x1(n) = x2(n), so if h1(n) and h2(n) are the true pair of impulse responses, then for any h(n) the pair h1(n) + h(n) and h2(n) - h(n) is also a solution, because the contributions of h(n) cancel. As a result, Wiener-filtering-based algorithms may converge to a false solution.
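The non-uniqueness is easy to check numerically. The sketch below assumes identical left and right signals and additive echoes; the impulse responses and the perturbation `h` are arbitrary hypothetical values.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out

def total_echo(x1, x2, h1, h2):
    """Sum of both speaker echoes, truncated to the input length."""
    e1 = convolve(x1, h1)
    e2 = convolve(x2, h2)
    return [e1[n] + e2[n] for n in range(len(x1))]

x = [1.0, -0.5, 0.25, 0.8, -0.3, 0.6]
x1 = x2 = x                      # extreme case: the channel difference is zero

h1 = [0.5, 0.25, 0.0]            # hypothetical true echo paths
h2 = [0.3, -0.1, 0.05]
h = [0.4, -0.7, 0.2]             # arbitrary perturbation

h1_alt = [a + b for a, b in zip(h1, h)]   # h1(n) + h(n)
h2_alt = [a - b for a, b in zip(h2, h)]   # h2(n) - h(n)

true_echo = total_echo(x1, x2, h1, h2)
alt_echo = total_echo(x1, x2, h1_alt, h2_alt)
max_diff = max(abs(a - b) for a, b in zip(true_echo, alt_echo))
# max_diff is zero up to floating-point rounding, although h1_alt != h1.
```

An adaptive filter that only observes the microphone signal cannot distinguish the two filter pairs. The wrong pair stops cancelling as soon as the relation between x1(n) and x2(n) changes.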

In most situations the non-unique solution, or false convergence, does not lead to significant performance degradation. However, it may cause noticeable or annoying artifacts when sudden changes occur in the environment. To mitigate this issue, VOCAL Technologies implemented a two-rate cancellation method with a reference transformation scheme. Our approach does not totally remove the ambiguity, but it dramatically reduces its perceptual impact.

For the kth subband, we have

e1k(n) = yk(n) - x1k(n)*h1k(n),
e2k(n) = e1k(n) - x2k(n)*h2k(n),

where k indicates the kth subband and the subscripts 1 and 2 indicate the cancellation stages. h1k(n) and h2k(n) are the estimated echo paths in the kth subband.
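The two cancellation stages for a single subband can be sketched as follows. This is a minimal illustration with fixed filter estimates and additive echoes; the adaptation of the estimates (for example by NLMS) is left out, and the function names and complex subband samples are hypothetical.

```python
def filter_output(x, h, n):
    """Sample n of the convolution x*h, treating x as zero before n = 0."""
    return sum(h[j] * x[n - j] for j in range(len(h)) if n - j >= 0)

def cascade_cancel(y, x1, x2, h1_est, h2_est):
    """Stage 1 removes the first speaker's echo, stage 2 the second's."""
    e1 = [y[n] - filter_output(x1, h1_est, n) for n in range(len(y))]
    e2 = [e1[n] - filter_output(x2, h2_est, n) for n in range(len(y))]
    return e1, e2

# Hypothetical complex samples for one subband k.
x1k = [1 + 0.5j, -0.3 + 0.2j, 0.7 - 0.1j, 0.2 + 0.9j]
x2k = [0.4 - 0.6j, 0.8 + 0.1j, -0.5 + 0.3j, 0.1 - 0.2j]
h1k = [0.5 + 0.1j, 0.2 - 0.05j]   # echo path of speaker 1 in subband k
h2k = [0.3 - 0.2j, -0.1 + 0.1j]   # echo path of speaker 2 in subband k

# Microphone subband signal with no local sound: the sum of both echoes.
yk = [filter_output(x1k, h1k, n) + filter_output(x2k, h2k, n)
      for n in range(len(x1k))]

# With perfect estimates, stage 1 leaves only the second echo and
# stage 2 removes it, so the final residual is numerically zero.
e1k, e2k = cascade_cancel(yk, x1k, x2k, h1k, h2k)
residual = max(abs(v) for v in e2k)
```

In a real canceller each stage adapts its own filter from its own error signal, so the first stage has to converge while the echo of the second speaker is still present in e1k(n).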

The above approach can easily be extended to as many channels as needed. However, the concatenated cancellation approach suffers from several drawbacks. It does not work well when the speaker output signals are correlated, as in stereo or surround sound music playback. In addition, the concatenation may dramatically slow down convergence in some cases. Further, the computational complexity increases proportionally with the number of speakers.

To address these theoretical and practical implementation limitations, VOCAL Technologies developed a solution library based on our proprietary intellectual property.