Complete Communications Engineering

Subjective tests are considered the more reliable measure of a speech enhancement algorithm's efficacy, since the output is intended for human listening. However, subjective tests can take days to complete, whereas an objective test can be completed in less than an hour. Objective tests are therefore preferred over subjective tests during the intermediate stages of development. To be as reliable as subjective tests, objective speech intelligibility tests must attempt to model how humans perceive speech, so that their results correlate highly with subjective results.

The two objective intelligibility tests that correlate most highly with subjective results are the Articulation Band Correlation Modified Rhyme Test (ABC-MRT) and the Short-Time Objective Intelligibility (STOI) measure. The ABC-MRT uses the same paradigm as the MRT used for subjective speech intelligibility testing: it evaluates the signal under test and selects the rhyming word believed to be present in the signal. This selection is then compared with the ground truth to determine the intelligibility score. The ABC-MRT performs the selection algorithmically through a temporal correlation across 17 frequency bands deemed most important to the understanding of speech. The word with the maximum correlation score is selected.
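The selection step described above can be sketched as follows. This is a minimal illustration, not the ABC-MRT reference implementation: it assumes the per-band temporal envelopes have already been extracted and time-aligned, and the names `pearson`, `select_word`, and `intelligibility` are hypothetical. The final score here is a plain fraction of correct selections, which is also an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0

def select_word(test_bands, templates):
    """Pick the candidate word whose band envelopes correlate best with the test signal.

    test_bands: list of temporal envelopes, one list per frequency band.
    templates:  dict mapping each candidate word to envelopes of the same shape.
    """
    scores = {}
    for word, template_bands in templates.items():
        per_band = [pearson(t, r) for t, r in zip(test_bands, template_bands)]
        scores[word] = sum(per_band) / len(per_band)  # average over bands
    best = max(scores, key=scores.get)
    return best, scores

def intelligibility(trials):
    """Fraction of (selected, truth) pairs in which the selection is correct."""
    correct = sum(1 for selected, truth in trials if selected == truth)
    return correct / len(trials)

# Toy data with two bands instead of 17, purely for illustration.
test_bands = [[0, 1, 2, 3], [3, 2, 1, 0]]
templates = {
    "bed": [[0, 1, 2, 3], [3, 2, 1, 0]],
    "led": [[3, 2, 1, 0], [0, 1, 2, 3]],
    "fed": [[0, 1, 0, 1], [1, 0, 1, 0]],
}
best, scores = select_word(test_bands, templates)
print(best, scores)
```

Averaging the per-band correlations is one simple way to combine the bands; a real system may weight or combine them differently.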

STOI is a relative speech metric, in that it requires both a clean speech signal and a modified speech signal. Both signals are transformed by the short-time Fourier transform, and the frequency bins are grouped into 15 one-third octave bands. To generate the intelligibility score, a normalized cross-correlation between the clean and modified signals is calculated in each one-third octave band for each frame and averaged over the length of the file. The short-time windows make it possible to capture localized degradation of the speech signal.
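The band grouping and correlation averaging can be sketched as below. This is a simplified illustration rather than a full STOI implementation: it assumes the STFT power spectra are computed elsewhere, it omits steps such as silent-frame removal and clipping, and the lowest center frequency of 150 Hz, the 30-frame segment length, and all function names are assumptions made for the example.

```python
import math

def third_octave_edges(num_bands=15, lowest_center_hz=150.0):
    """Return (low, high) edge frequencies of consecutive one-third octave bands."""
    edges = []
    for k in range(num_bands):
        center = lowest_center_hz * 2.0 ** (k / 3.0)
        edges.append((center * 2.0 ** (-1.0 / 6.0), center * 2.0 ** (1.0 / 6.0)))
    return edges

def group_bins(power_spectrum, sample_rate_hz, fft_size, edges):
    """Collapse one frame's STFT bin powers into one magnitude per band."""
    bin_hz = sample_rate_hz / fft_size
    bands = []
    for low, high in edges:
        total = sum(p for i, p in enumerate(power_spectrum)
                    if low <= i * bin_hz < high)
        bands.append(math.sqrt(total))  # square root of the summed bin power
    return bands

def normalized_correlation(x, y):
    """Mean-removed, norm-scaled correlation (0.0 if either input is constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0

def stoi_like_score(clean_bands, processed_bands, segment_frames=30):
    """Average the correlation over every band and every short-time segment.

    clean_bands[t][j] and processed_bands[t][j] hold band j at frame t.
    """
    num_frames = len(clean_bands)
    num_bands = len(clean_bands[0])
    values = []
    for end in range(segment_frames, num_frames + 1):
        start = end - segment_frames
        for j in range(num_bands):
            x = [clean_bands[t][j] for t in range(start, end)]
            y = [processed_bands[t][j] for t in range(start, end)]
            values.append(normalized_correlation(x, y))
    if not values:
        raise ValueError("signal is shorter than one segment")
    return sum(values) / len(values)
```

Because each segment is scored separately before averaging, a short burst of degradation lowers only the segments it touches, which is how the short-time windows expose localized damage.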

Objective Intelligibility Testing of Speech Enhancement Algorithms

For both objective tests, the effectiveness of a speech enhancement algorithm is evaluated by comparing the intelligibility score of the unprocessed noisy signal with the score of the same noisy signal after enhancement.
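As a rough sketch of this comparison, the helper below takes per-file scores for the unprocessed and enhanced versions of the same noisy recordings and reports the change. The function name `score_improvement` and the example scores are hypothetical; the scores themselves would come from whichever objective test is in use.

```python
def score_improvement(noisy_scores, enhanced_scores):
    """Return (mean change, per-file changes) between paired intelligibility scores.

    A positive change means the enhanced signal scored higher than the
    unprocessed noisy signal for that file.
    """
    if not noisy_scores or len(noisy_scores) != len(enhanced_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    deltas = [e - n for n, e in zip(noisy_scores, enhanced_scores)]
    return sum(deltas) / len(deltas), deltas

# Made-up scores for three files, for illustration only.
mean_delta, deltas = score_improvement([0.50, 0.60, 0.70], [0.70, 0.60, 0.65])
print(mean_delta, deltas)
```

Keeping the per-file changes alongside the mean makes it easier to spot files where enhancement reduced intelligibility even when the average improved.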
