In highly degraded environments, the objective is sometimes only to detect automatically the presence or absence of speech, or simply of a human actor. This scenario arises in military aircraft, cargo trains and military combat environments. In such scenarios, the conventional approach of applying an energy detector or a zero-crossing count to the input signals will not suffice, because the signal energy is buried beneath the noise floor. We consider the use of an aggressive adaptive noise reduction approach to raise the speech profile above the noise floor so that it can be detected. Note that the objective here is not speech recognition, but merely detection.
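For reference, the conventional detectors mentioned above can be sketched as follows. This is a minimal illustration, not a production detector: the function names, the 160-sample frame length and the fixed threshold are all assumptions made for the example.

```python
import math

def frame_features(signal, frame_len=160):
    """Return (mean energy, zero-crossing rate) for each non-overlapping frame."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features

def energy_vad(signal, threshold, frame_len=160):
    """Flag a frame as active when its mean energy exceeds the threshold."""
    return [energy > threshold
            for energy, _ in frame_features(signal, frame_len)]
```

When the noise energy dominates, frames with and without speech have nearly the same mean energy, so no fixed threshold separates them. That is the failure mode described above.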
Consider a far-field source impinging on two microphones, as shown in Figure 1 below:
Figure 1: Two microphones
Suppose the signal at each microphone consists of the desired time-frequency speech signal, arriving from a direction of arrival (DOA) measured with respect to endfire, plus uncorrelated noise.
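The description above matches a standard far-field two-microphone model. One common way to write such a model is sketched below. Every symbol is an assumption made for illustration and may not match the article's own notation: $X_i$ is the signal at microphone $i$, $S$ the speech signal, $N_i$ the uncorrelated noise, $\theta$ the DOA measured from endfire, $d$ the microphone spacing and $c$ the speed of sound.

```latex
\begin{aligned}
X_1(\omega, t) &= S(\omega, t) + N_1(\omega, t) \\
X_2(\omega, t) &= S(\omega, t)\, e^{-j \omega d \cos\theta / c} + N_2(\omega, t)
\end{aligned}
```

With $\theta$ measured from endfire, the inter-microphone delay $d\cos\theta / c$ is largest at $\theta = 0$ and vanishes at broadside ($\theta = 90^\circ$).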
Because of the noise level, the direction of arrival cannot be recovered from the two microphone signals, so an expected angle of arrival is assumed instead. The two microphone signals can then be transformed into a pair of signals: one that represents the noise, and one that carries the correlated speech signal along with noise.
An adaptive noise reduction algorithm treats the noise signal as its noise reference. Figure 2 below illustrates an example, showing the initial SNR, the SNR improvement achieved, and the resulting final SNR.
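To make the role of the noise reference concrete, here is a minimal sketch of an adaptive noise canceller. The article does not name a specific algorithm. The normalized LMS (NLMS) update, the function name `nlms_cancel`, the filter length, the step size and the synthetic test signals are all assumptions chosen for illustration. The reference here is also idealized as a speech-free noise signal.

```python
import math
import random

def nlms_cancel(primary, reference, taps=8, mu=0.1, eps=1e-8):
    """Subtract the part of `primary` that is predictable from `reference`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # the error is the cleaned output sample
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        output.append(error)
    return output

# Synthetic example (assumed values): a weak tone buried in filtered noise.
random.seed(0)
n = 4000
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
speech = [0.1 * math.sin(2 * math.pi * 0.05 * k) for k in range(n)]
primary = [speech[k] + 0.8 * noise[k] + (0.3 * noise[k - 1] if k else 0.0)
           for k in range(n)]

cleaned = nlms_cancel(primary, noise)

# Compare residual noise power before and after, over the converged half.
half = n // 2
noise_in = sum((primary[k] - speech[k]) ** 2 for k in range(half, n)) / half
noise_out = sum((cleaned[k] - speech[k]) ** 2 for k in range(half, n)) / half
print("SNR improvement (dB):", 10 * math.log10(noise_in / noise_out))
```

Because the filter adapts to whatever in the primary channel is correlated with the reference, any speech that leaks into the reference would be partially cancelled as well. This is why the quality of the noise reference matters.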
Figure 2: Aggressive adaptive noise reduction output
A detector applied after the noise reduction stage therefore performs far better than one applied directly to the unprocessed input.
VOCAL Technologies offers custom-designed solutions for robust voice activity detection, beamforming, acoustic echo cancellation and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific beamforming task. Contact us today to discuss your solution!