Pitch detection in speech has many uses, such as speaker diarization and speech activity detection. The average magnitude difference function (AMDF) is prevalent in RTOS-based systems because of its low complexity. A natural alternative to the AMDF is the magnitude average product function (MAPF).
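For reference, the AMDF mentioned above is usually given in textbooks as the mean absolute difference between a frame and a lagged copy of itself, with the pitch period taken as the lag that minimizes it. The following is a minimal sketch of that textbook form only, not of the MAPF. The function names, the 60 to 400 Hz search range, and the synthetic test tone are assumptions made for illustration.

```python
import math

def amdf(frame, max_lag):
    """Textbook AMDF: mean |x(i) - x(i + lag)| for lags 0..max_lag."""
    n = len(frame)
    values = []
    for lag in range(max_lag + 1):
        count = n - lag  # number of overlapping sample pairs at this lag
        total = sum(abs(frame[i] - frame[i + lag]) for i in range(count))
        values.append(total / count)
    return values

def amdf_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch in Hz as sample_rate / (lag of the deepest AMDF valley).

    f_min and f_max are assumed search limits, not values from the article.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    d = amdf(frame, lag_max)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda k: d[k])
    return sample_rate / best_lag

# Synthetic 200 Hz tone at 8 kHz with a slow decay, so that the first
# period (40 samples) gives a unique minimum rather than tying with its
# multiples at 80 and 120 samples.
fs = 8000
frame = [math.exp(-i / 2000.0) * math.sin(2 * math.pi * 200 * i / fs)
         for i in range(400)]
pitch_hz = amdf_pitch(frame, fs)
```

The cost per lag is one subtraction and one absolute value per sample, with no multiplications, which is the usual reason the AMDF is considered low-complexity.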
Suppose the received signal at the microphone is given as:
$$y(n) = s(n) + v(n)$$
where $s(n)$ is the desired speech signal and $v(n)$ is i.i.d. zero-mean Gaussian noise. The MAPF algorithm proceeds on a frame-by-frame basis. Suppose a frame of length $N$ is available, $y(n)$ for $n = 0, \ldots, N-1$. Then the MAPF is defined as:
A smoother version, which we use, is the binarized MAPF (bMAPF), defined as:
The pitch is found using:
A smoothing function can be applied to the pitch estimate to remove spurious values. A sample of bMAPF performance is shown in Figure 1 below:
Figure 1: Pitch detection in speech using bMAPF
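The text does not say which smoothing function is applied to the pitch estimates. As one possibility, a short median filter over the per-frame pitch track suppresses isolated outliers such as octave errors. The sketch below assumes that choice; the function name `median_smooth`, the window width of 5, and the sample track are hypothetical.

```python
from statistics import median

def median_smooth(track, width=5):
    """Median-filter a per-frame pitch track; the window shrinks at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(track)):
        lo = max(0, i - half)
        hi = min(len(track), i + half + 1)
        smoothed.append(median(track[lo:hi]))
    return smoothed

# Hypothetical pitch track in Hz with one doubled and one halved estimate.
raw_track = [120.0, 121.0, 240.0, 122.0, 123.0, 61.0, 124.0, 125.0]
smooth_track = median_smooth(raw_track)
```

A median is used here rather than a moving average because a single outlier frame is discarded instead of being spread across its neighbors.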