Complete Communications Engineering

Two Sound Sources
Figure 1: Conceptual diagram showing desired speech and distractor signal sources

Several powerful methods of noise reduction and cancellation are used in telecommunication applications, particularly in mobile devices such as cellular phones and smartphones.

In a typical acoustic environment, distractors (i.e., undesired noise signals) are complex: they can be stationary or non-stationary, and localized (that is, coming mostly from a confined space or from specific points in the acoustic scene) or diffuse. When distractors mix with desired signals such as speech, the observed audio mixture can be barely intelligible.

Methods of Noise Reduction

To improve speech intelligibility, de-noising methods are used. Depending on the number of sensors (microphones), either a single-channel method or, with multiple sensors (typically referred to as a sensor array), a multi-channel method is used. An example of a single-channel method, which uses only one microphone, is spectral subtraction, which produces good results for stationary distractors. An example of a multi-channel method is beamforming, which in essence adjusts the spatial characteristic of the microphone array so that the sensors “listen” mostly to the desired sound source (identified by spatial as well as temporal and spectral parameters), while undesired signals emitted from locations away from the look direction of the desired sound source are attenuated.
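As a rough illustration of the single-channel case, the following is a minimal sketch of magnitude spectral subtraction in Python with NumPy. It is not the method of any particular product. The function name, the frame length, the spectral floor, and the use of a separate noise-only recording to estimate the noise spectrum are all assumptions made for this example.

```python
import numpy as np

def spectral_subtraction(noisy, noise_only, frame_len=256, floor=0.01):
    """Subtract an average noise magnitude spectrum from each frame of `noisy`.

    noise_only: a recording containing only the (assumed stationary) noise.
    floor: fraction of the original magnitude kept when subtraction goes negative.
    """
    hop = frame_len // 2
    # Periodic Hann window: overlapping copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]

    # Average noise magnitude spectrum over all frames of the noise-only input.
    noise_mags = [
        np.abs(np.fft.rfft(noise_only[i:i + frame_len] * window))
        for i in range(0, len(noise_only) - frame_len + 1, hop)
    ]
    noise_mag = np.mean(noise_mags, axis=0)

    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        spec = np.fft.rfft(noisy[start:start + frame_len] * window)
        mag = np.abs(spec)
        # Subtract the noise estimate, never dropping below the spectral floor.
        clean_mag = np.maximum(mag - noise_mag, floor * mag)
        # Reuse the noisy phase and overlap-add the frame back into the output.
        clean_spec = clean_mag * np.exp(1j * np.angle(spec))
        out[start:start + frame_len] += np.fft.irfft(clean_spec, n=frame_len)
    return out

# Illustrative use: a sine tone (stand-in for speech) in stationary white noise.
rng = np.random.default_rng(1)
n = np.arange(8192)
clean = np.sin(2 * np.pi * 0.05 * n)
noisy = clean + 0.5 * rng.standard_normal(n.size)
noise_ref = 0.5 * rng.standard_normal(4096)
denoised = spectral_subtraction(noisy, noise_ref)
```

The sketch relies on the noise being stationary, so that one average spectrum describes it throughout. That reliance is why the method suits stationary distractors and struggles with non-stationary ones.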

Blind Source Separation

Here we focus on a method called Blind Source Separation (BSS), also known as Independent Component Analysis (ICA)-based signal separation. It too is a multi-channel approach, yet it differs considerably from beamforming. Since BSS-based noise reduction is a broad subject, we limit ourselves here to stating the problem and outlining the underlying assumptions.

Let’s assume two sound sources in an acoustic scene: one producing a desired signal (speech) and one producing an undesired signal (noise). This acoustic scene is depicted in Figure 1, where the signals x1 and x2 are convolutional mixtures of the source signals s1 and s2. The mixing matrix A is composed of the impulse responses represented by the respective arrows in Figure 1. Note that the impulse responses include all acoustical effects, such as direct sound, early reflections, and reverberation.
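The convolutional mixing model can be written out explicitly. The notation below is an assumption made for illustration: a_ij(k) denotes the impulse response from source j to microphone i, L is an assumed common filter length, and S(z) is the Z-transform of the source vector.

```latex
x_i(n) = \sum_{j=1}^{2} \sum_{k=0}^{L-1} a_{ij}(k)\, s_j(n-k), \qquad i = 1, 2
\quad\Longleftrightarrow\quad
X(z) = A(z)\, S(z), \qquad
A(z) = \begin{bmatrix} A_{11}(z) & A_{12}(z) \\ A_{21}(z) & A_{22}(z) \end{bmatrix}
```

Each microphone signal is thus a sum of filtered versions of both sources, not a simple weighted sum of them.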

Figure 1 also shows the desired signal (typically speech) and the undesired signal, a distractor of sorts (typically noise, although the distractor can be uncorrelated speech too). Mixtures of these two signals, along with all reflections, are captured by the microphones MIC1 and MIC2. One practical case corresponding to this diagram is a mobile phone user speaking into a phone (typically equipped with two microphones) in far-talk mode in a noisy environment.
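The two-source, two-microphone scene can be simulated in a few lines. The sketch below is illustrative only: the function name convolutive_mix, the short impulse responses, and the random stand-in signals are hypothetical and are not measurements of a real room or phone.

```python
import numpy as np

def convolutive_mix(sources, A):
    """Return microphone signals x_i(n) = sum_j (a_ij * s_j)(n).

    sources: sequence of equal-length 1-D arrays, one per source.
    A: nested list where A[i][j] is the impulse response from source j to mic i.
    """
    longest = max(len(h) for row in A for h in row)
    n_out = len(sources[0]) + longest - 1
    x = np.zeros((len(A), n_out))
    for i, row in enumerate(A):
        for j, h in enumerate(row):
            contrib = np.convolve(h, sources[j])
            x[i, :len(contrib)] += contrib
    return x

# Hypothetical impulse responses: a direct-path tap plus one weaker reflection.
A = [[np.array([1.0, 0.0, 0.3]), np.array([0.6, 0.2, 0.0])],
     [np.array([0.5, 0.1, 0.0]), np.array([1.0, 0.0, 0.25])]]

rng = np.random.default_rng(0)
s1 = rng.standard_normal(1000)    # stand-in for the desired speech signal
s2 = rng.standard_normal(1000)    # stand-in for the noise distractor
x = convolutive_mix([s1, s2], A)  # x[0] is MIC1, x[1] is MIC2
```

Real room impulse responses are far longer than three taps, which is one reason the convolutional de-mixing problem is difficult.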

The general concept of using BSS for noise reduction has been explored for many years, with applications not only in telecommunication but in other disciplines as well. The principal task of BSS in mobile telephony is to provide faithful estimates of the source signals (that is, the desired speech signal and the undesired noise signal) and thereby reduce the acoustic pollution of the desired speech by the noise.

BSS Based Noise Reduction
Figure 2: System model for BSS-based method of Noise Reduction

BSS is essentially a suite of statistical DSP techniques performing the following convolutional de-mixing operation:

Y(z) = W(z)X(z)

where X(z) and Y(z) are the Z-transforms of the vectors x(n) = [x1(n) x2(n)]^T and y(n) = [y1(n) y2(n)]^T, and W(z) is the de-mixing matrix of filters applied to the microphone signals.
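The de-mixing operation is a matrix of filters applied to the microphone signals. The sketch below shows only that filtering step and is not a BSS algorithm: in practice W(z) is estimated blindly from the statistics of the mixtures. Here, as a loudly stated assumption, the hypothetical mixing filters are known and W(z) is set to the adjugate of A(z). Each output then contains exactly one source, filtered by det A(z), which also illustrates that separation is generally achieved only up to a filtering of each source.

```python
import numpy as np

def mimo_fir(signals, F):
    """Apply a matrix of FIR filters: out_i(n) = sum_j (F[i][j] * signals[j])(n)."""
    longest = max(len(h) for row in F for h in row)
    n_out = len(signals[0]) + longest - 1
    out = np.zeros((len(F), n_out))
    for i, row in enumerate(F):
        for j, h in enumerate(row):
            c = np.convolve(h, signals[j])
            out[i, :len(c)] += c
    return out

# Hypothetical mixing impulse responses (illustrative values only).
a11, a12 = np.array([1.0, 0.3]), np.array([0.5, 0.2])
a21, a22 = np.array([0.4, 0.1]), np.array([1.0, 0.25])
A = [[a11, a12], [a21, a22]]

# Assumed-known de-mixing filters: W(z) = adj A(z), so W(z) A(z) = det A(z) * I.
W = [[a22, -a12], [-a21, a11]]

rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))  # two stand-in source signals
x = mimo_fir(s, A)                  # X(z) = A(z) S(z): microphone mixtures
y = mimo_fir(x, W)                  # Y(z) = W(z) X(z): de-mixed outputs

# Each output should equal one source filtered by det A(z), with no cross-talk.
det = np.convolve(a11, a22) - np.convolve(a12, a21)
assert np.allclose(y[0], np.convolve(det, s[0]))
assert np.allclose(y[1], np.convolve(det, s[1]))
```

A blind algorithm has no access to A(z) and must find a suitable W(z) from the mixtures alone, typically by making the outputs as statistically independent as possible.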

Once the de-mixing stage (i.e., the BSS-based stage) is completed, the desired and undesired signals are identified, typically using energy-based methods. The third phase of BSS-based noise reduction is post-processing, during which the de-noising is improved further. One method used in this phase is classical spectral subtraction, which takes advantage of the observation that the residual noise component remaining after the de-mixing stage has the properties of a semi-stationary signal. This property allows the noise spectrum to be estimated adequately for the subtraction.

Vocal Technologies Ltd have developed several versions of noise cancellation and noise reduction software. A BSS-based version has successfully completed lab testing and is available for customer use. Please contact us to discuss your mobile noise reduction requirements.
