Perceptual Linear Prediction Cepstral Coefficients

A perceptual front end for computing Linear Prediction Cepstral Coefficients has been applied in several ways to improve speech detection and coding, as well as noise reduction, reverberation suppression, and echo cancellation. In doing so, we improve the performance of these algorithms while reducing their computational load. Contact us for more information or to discuss your speech application requirements.

Linear prediction of a signal is done via Autoregressive Moving Average (ARMA) modeling of the time series. In an ARMA model, we express the current sample as:

$$y[n] = \sum_{k=1}^{p} a_k\, y[n-k] + \sum_{k=0}^{q} b_k\, x[n-k] \tag{1}$$

where x[n] is the current input sample, y[n] is the current output sample, a_k are the autoregressive coefficients of order p, and b_k are the moving-average coefficients of order q. In speech processing, we do not have access to the input signal x, so we perform only autoregressive modeling. This is fortunate, because the resulting equations can be solved efficiently with the Levinson-Durbin recursion. Perceptual linear prediction coefficients differ from ordinary linear prediction coefficients in that perceptual processing is applied before the autoregressive modeling. This perceptual front end takes the following form:
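The autoregressive step described above can be sketched in a few lines. The following is a minimal illustration, not a production implementation: it assumes the predictor convention y[n] ≈ sum over k of a[k] * y[n-k], and the function names `autocorrelation` and `levinson_durbin` are hypothetical names chosen for this example.

```python
def autocorrelation(frame, order):
    """Biased autocorrelation r[0..order] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the autoregressive normal equations from autocorrelation r.

    Assumed convention: y[n] is predicted as sum_k a[k] * y[n-k].
    Returns (coefficients a[1..order], final prediction error energy).
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate frame (e.g. silence): stop early
            break
        # Reflection coefficient for model order i
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err
        # Update the lower-order coefficients, then append the new one
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= (1.0 - k_i * k_i)
    return a[1:], err
```

A typical use would be `coeffs, err = levinson_durbin(autocorrelation(frame, 10), 10)` for one windowed frame; the order of 10 here is only an illustrative choice.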

Figure 1: Calculating Perceptual Frontend Processing [1]
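As a rough illustration of what such a front end involves, the sketch below follows the stages commonly attributed to the PLP analysis in [1]: Bark-scale warping, critical-band integration, equal-loudness weighting, and cube-root intensity-to-loudness compression. The constants and the orientation of the masking curve are written from the commonly cited forms and should be checked against [1]. The function names, the uniform spacing of band centres, and the `n_bands` parameter are assumptions made for this example only.

```python
import math


def hz_to_bark(f_hz):
    """Bark warping in the form 6 * asinh(f / 600)."""
    return 6.0 * math.asinh(f_hz / 600.0)


def bark_to_hz(bark):
    """Inverse of hz_to_bark."""
    return 600.0 * math.sinh(bark / 6.0)


def equal_loudness(f_hz):
    """Approximate equal-loudness weight at frequency f_hz."""
    w2 = (2.0 * math.pi * f_hz) ** 2
    return ((w2 + 56.8e6) * w2 * w2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))


def critical_band_weight(d):
    """Piecewise masking curve versus Bark distance d from the band centre."""
    if d < -1.3 or d > 2.5:
        return 0.0
    if d <= -0.5:
        return 10.0 ** (2.5 * (d + 0.5))
    if d < 0.5:
        return 1.0
    return 10.0 ** (-1.0 * (d - 0.5))


def perceptual_spectrum(power, sample_rate, n_bands):
    """Map a power spectrum (bins evenly spaced from 0 Hz to Nyquist)
    to n_bands compressed critical-band values (n_bands >= 2)."""
    n_bins = len(power)
    bin_barks = [hz_to_bark(0.5 * sample_rate * i / (n_bins - 1))
                 for i in range(n_bins)]
    top = bin_barks[-1]
    out = []
    for b in range(n_bands):
        centre = top * b / (n_bands - 1)      # assumed uniform Bark spacing
        energy = sum(p * critical_band_weight(z - centre)
                     for p, z in zip(power, bin_barks))
        # Equal-loudness weighting, then cube-root style compression
        out.append((equal_loudness(bark_to_hz(centre)) * energy) ** 0.33)
    return out
```

In a complete front end, the compressed spectrum would then be converted to an autocorrelation sequence by an inverse Fourier transform before the autoregressive modeling; that step is omitted here to keep the sketch short.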

After this processing, we perform cepstral conversion, because linear prediction coefficients are very sensitive to frame synchronization and numerical error. In other words, the linear prediction cepstral coefficients are much more stable than the linear prediction coefficients themselves. The conversion is a recursion that takes the perceptual linear prediction coefficients and produces the cepstral coefficients.
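The conversion to cepstral coefficients can be sketched with the standard recursion for an all-pole model. This is a minimal example under stated assumptions: the model is H(z) = 1 / (1 - sum over k of a[k] z^-k), the gain term c[0] is left out, and `lpc_to_cepstrum` is a hypothetical name used only here.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients a = [a_1, ..., a_p] to cepstral
    coefficients [c_1, ..., c_n_ceps].

    Assumed convention: y[n] is predicted as sum_k a_k * y[n-k], so
        c_m = a_m + sum_{k=1}^{m-1} (k/m) * c_k * a_{m-k}     for m <= p
        c_m =       sum_{k=m-p}^{m-1} (k/m) * c_k * a_{m-k}   for m >  p
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)              # c[0] (gain term) is not computed
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]
```

For a first-order model with a_1 = 0.5, the recursion reproduces the series a^m / m, which is a convenient sanity check for the sign convention.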

References

[1] H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Am., vol. 87, no. 4, pp. 1738-1752, 1990.

Complete Communications Engineering
