Complete Communications Engineering

Using a perceptual frequency scale instead of a mathematically linear one often improves speech enhancement results, even when the algorithms themselves are unchanged. Perceptual scales mimic how the ear perceives frequency, and are therefore better matched to the receiver (our ear). The warped frequency scale can be applied to these standard algorithms directly, without a filterbank to perform the warping, which makes perceptual filtering an attractive option for real-time implementation.

Perceptual scales are generally logarithmic functions of frequency: the frequency axis is warped into something more perceptually relevant, which often has a roughly logarithmic shape. Examples include the mel scale, the Bark scale, the equivalent rectangular bandwidth (ERB) scale, and the Greenwood scale.
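
As an illustration of such a warping, the sketch below converts between hertz and two of these scales. It assumes the commonly quoted 2595 log10(1 + f/700) form of the mel scale and the Glasberg and Moore form of the ERB-rate scale; other published variants exist, and the function names are chosen here for illustration only.

```python
import math

def hz_to_mel(f_hz):
    # Assumed mel formula: 2595 * log10(1 + f / 700)
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_erb_rate(f_hz):
    # Assumed ERB-rate formula (Glasberg and Moore): 21.4 * log10(1 + 0.00437 f)
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def mel_spaced_frequencies(f_lo_hz, f_hi_hz, n):
    # n frequencies (n >= 2) equally spaced on the mel axis, returned in Hz
    m_lo, m_hi = hz_to_mel(f_lo_hz), hz_to_mel(f_hi_hz)
    step = (m_hi - m_lo) / (n - 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n)]

if __name__ == "__main__":
    for f in mel_spaced_frequencies(100.0, 8000.0, 8):
        print(round(f, 1), round(hz_to_mel(f), 1), round(hz_to_erb_rate(f), 2))
```

Frequencies that are equally spaced in mel come out closely spaced in Hz at the low end and widely spaced at the high end, which is the warping described above.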

In addition to perceptual scales, there are perceptual filters. These are bandpass filters that isolate perceptually relevant portions of the spectrum for further processing. They include the gammatone filterbank, the gammachirp filterbank, and the third-octave and full-octave filterbanks.
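
To make the idea of a perceptual bandpass filter concrete, here is a minimal sketch of a single gammatone impulse response. It assumes the textbook form t^(n-1) exp(-2*pi*b*t) cos(2*pi*fc*t) with order n = 4 and a bandwidth b of 1.019 times the Glasberg and Moore ERB at the centre frequency; these parameter choices and the function names are assumptions for illustration, not a specific implementation from this article.

```python
import math

def erb_bandwidth(fc_hz):
    # Assumed ERB in Hz at centre frequency fc (Glasberg and Moore form)
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4):
    # Sampled gammatone impulse response, normalised to unit peak magnitude
    b = 1.019 * erb_bandwidth(fc_hz)
    n = int(round(duration_s * fs_hz))
    ir = []
    for i in range(n):
        t = i / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

if __name__ == "__main__":
    h = gammatone_ir(1000.0, 16000.0)
    print(len(h), round(erb_bandwidth(1000.0), 2))
```

A filterbank would be built by repeating this for a set of centre frequencies, typically spaced evenly on the ERB scale, and convolving the signal with each response.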

In general, these perceptual scales and perceptual filterbanks have been derived either from neurophysiological experiments on the auditory systems of animals or from psychoacoustic experiments on people. Either way, applying them often yields a marked improvement over a linear frequency scale, that is, over the usual fast Fourier transform (FFT) bins.
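
One simple way to move from linear FFT bins to a perceptual scale, without designing a filterbank, is to group the bins into bands of equal width on the warped axis. The sketch below does this for the mel scale; the grouping scheme, the mel formula, and the function names are assumptions made for illustration.

```python
import math

def hz_to_mel(f_hz):
    # Assumed mel formula: 2595 * log10(1 + f / 700)
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_band_of_bin(n_fft, fs_hz, n_bands):
    # Band index for each FFT bin 0..n_fft/2, using bands of equal mel width
    m_max = hz_to_mel(fs_hz / 2.0)
    bands = []
    for k in range(n_fft // 2 + 1):
        f_hz = k * fs_hz / n_fft
        idx = int(hz_to_mel(f_hz) / m_max * n_bands)
        bands.append(min(idx, n_bands - 1))  # keep the Nyquist bin in the last band
    return bands

def band_energies(power_spectrum, bands, n_bands):
    # Sum the per-bin power inside each perceptual band
    out = [0.0] * n_bands
    for p, b in zip(power_spectrum, bands):
        out[b] += p
    return out

if __name__ == "__main__":
    bands = mel_band_of_bin(512, 16000.0, 20)
    flat = [1.0] * len(bands)
    print(band_energies(flat, bands, 20))
```

With a flat spectrum the printed values are simply the number of bins per band, which shows that the low bands cover only a few FFT bins while the high bands cover many.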

For more information: