PSQM, or Perceptual Speech Quality Measure, is a computational modeling algorithm defined in ITU-T P.861 (1996) that objectively evaluates and quantifies the voice quality of voice-band (300 Hz – 3400 Hz) speech codecs on a perceptual scale. It may be used to rank the performance of these speech codecs across differing speech input levels, talkers, bit rates and transcodings.

VOCAL provides a range of voice codecs for use in a wide variety of applications. Careful selection of appropriate speech codec(s) is necessary to match system requirements. VoIP vocoders and their associated PSQM/PSQM+ values under various network conditions are available. Please contact us for more information regarding a specific speech coder and to discuss your voice/VoIP application requirements.


PSQM+ was developed to overcome limitations in the original PSQM algorithm. As originally conceived, PSQM was not designed to account for the network Quality of Service perturbations common in Voice over IP (VoIP) applications, such as packet loss, delay variance (jitter) or out-of-order packets. Under these conditions, for example in heavy network load simulations, PSQM usually gives inappropriate results that fail to account for a very real perceived loss of voice quality. Attempts to duplicate network fault conditions by introducing significant packet loss result in PSQM values that correspond to falsely inflated Mean Opinion Score (MOS) values. PSQM+, by contrast, generates results that appear to reflect more accurately the degraded performance of speech codecs under realistic network load conditions.

Perceptual Scale

PSQM converts the physical-domain signals into a meaningful psychoacoustic domain, a perceptual scale, through a series of nonlinear processes such as time-frequency mapping, frequency warping and intensity warping. The quality of the coded speech is judged on the difference between the internal representations of the reference signal and the coded signal. This difference is used to calculate the noise disturbance as a function of time and frequency. Besides perceptual modeling, the PSQM algorithm uses cognitive modeling, such as loudness scaling and asymmetric masking, to achieve high correlation between subjective and objective measurements.
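The processing chain described above can be illustrated with a minimal sketch. This is not the P.861 algorithm: the frame length, the Hann window, the Zwicker-Terhardt Bark approximation, the 18-band split, the 0.23 compression exponent and all function names are assumptions chosen for illustration, and the cognitive steps (loudness scaling, asymmetric masking) are left out entirely.

```python
import cmath
import math

def power_spectrum(frame):
    """Time-frequency mapping: Hann-windowed DFT power spectrum of one frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def hz_to_bark(f):
    """Zwicker-Terhardt approximation of the Bark scale (an assumed choice)."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)

def bark_bands(spectrum, sample_rate, n_bands=18):
    """Frequency warping: pool DFT bins into bands equally spaced in Bark."""
    n_fft = 2 * (len(spectrum) - 1)
    top = hz_to_bark(sample_rate / 2)
    bands = [0.0] * n_bands
    for k, power in enumerate(spectrum):
        z = hz_to_bark(k * sample_rate / n_fft)
        bands[min(int(z / top * n_bands), n_bands - 1)] += power
    return bands

def compress(bands, exponent=0.23):
    """Intensity warping: power-law compression as a stand-in for loudness."""
    return [b ** exponent for b in bands]

def internal_representation(frame, sample_rate):
    return compress(bark_bands(power_spectrum(frame), sample_rate))

def noise_disturbance(reference, degraded, sample_rate=8000, frame_len=256):
    """Per-frame mean absolute difference between internal representations."""
    usable = min(len(reference), len(degraded))
    per_frame = []
    for start in range(0, usable - frame_len + 1, frame_len):
        ref = internal_representation(reference[start:start + frame_len], sample_rate)
        deg = internal_representation(degraded[start:start + frame_len], sample_rate)
        per_frame.append(sum(abs(r - d) for r, d in zip(ref, deg)) / len(ref))
    return per_frame
```

Calling `noise_disturbance(reference, degraded)` on two time-aligned lists of float samples returns one disturbance value per frame. Identical inputs give zero for every frame, and the value grows as the degraded signal departs from the reference in the warped, compressed domain.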

PSQM uses psychoacoustic mathematical modeling (both perceptual and cognitive) to analyze the voice signals before and after transmission, yielding a PSQM value that measures signal quality degradation and ranges from 0 (no degradation) to 6.5 (highest degradation). In turn, this result may be translated into a Mean Opinion Score (MOS), an accepted measure of the perceived quality of received media on a numeric perceptual scale ranging from 1 to 5. A value of 1 indicates unacceptable, poor-quality voice, while a value of 5 indicates high voice quality with no perceptible issues.
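The translation from a PSQM value to a MOS can be sketched as interpolation over calibration points. The function name `psqm_to_mos` and the default calibration below are hypothetical: the two default points only pin the ends of the scales to each other (0 to 5, 6.5 to 1) and are not a mapping taken from any ITU recommendation. A real mapping would be fit to subjective test data and supplied through the `calibration` argument.

```python
from bisect import bisect_right

# Hypothetical default: only ties the ends of the two scales together.
# Replace with (psqm, mos) points fit to subjective test results.
DEFAULT_CALIBRATION = [(0.0, 5.0), (6.5, 1.0)]

def psqm_to_mos(psqm, calibration=DEFAULT_CALIBRATION):
    """Piecewise-linear MOS estimate from (psqm, mos) points sorted by psqm."""
    xs = [point[0] for point in calibration]
    # Clamp to the calibrated range so out-of-range inputs do not extrapolate.
    psqm = max(xs[0], min(xs[-1], psqm))
    # Index of the right-hand point of the segment that contains psqm.
    i = max(1, min(bisect_right(xs, psqm), len(xs) - 1))
    (x0, y0), (x1, y1) = calibration[i - 1], calibration[i]
    return y0 + (y1 - y0) * (psqm - x0) / (x1 - x0)
```

With the placeholder calibration, `psqm_to_mos(0.0)` gives 5.0 and `psqm_to_mos(6.5)` gives 1.0. Intermediate values fall on a straight line, which is a simplification.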

PSQM Testing

The PSQM standard allows automated and simulation-based test methodologies to objectively rate both speech clarity and transmitted voice quality. Various software and hardware products have been developed to facilitate this testing. Automated testing yields considerable savings in cost and time over the traditional practice of using large groups of people to subjectively evaluate voice signals and assess voice quality. Moreover, it yields objective results that are reliable and reproducible. This is very important to telephony providers who are mandated to maintain high Quality of Service (QoS) standards.

The lack of standardization in test signals is an issue when evaluating various speech codecs. PSQM provides more reliable and consistent MOS values if used in accordance with the ITU recommended methods for objective and subjective assessment of quality (ITU-T P.800/P.830/P.861). These recommendations include using both male and female voice reference signals at an average level of -20 dB. The type, gender, duration and gain of the voice or signal can all have a minor impact on the PSQM value or MOS, as do the threshold levels, number of calls made and other configuration settings of the environment. When comparing voice quality measurements, the signal, environment and configuration should all be taken into account.
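Bringing reference signals to a common average level before testing can be sketched as follows. This is a minimal illustration under stated assumptions: samples are floats with full scale at 1.0, the -20 dB target is read as dB relative to full scale, and the level is a plain RMS over the whole signal rather than a measurement that skips silent portions. The function names are hypothetical.

```python
import math

def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0) for float samples."""
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square)

def normalize_level(samples, target_db=-20.0):
    """Scale samples so their RMS level equals target_db (assumed dBFS)."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # nothing to scale
    gain = 10.0 ** ((target_db - level) / 20.0)
    return [x * gain for x in samples]
```

Applying `normalize_level` to every reference file, male and female talkers alike, removes level differences as a source of variation between measurements.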