Room and HATS equalization are prerequisites for technically sound verification of Speech/Voice Enhancement devices in laboratory conditions. This note expands on frequency band selection and on the equalization process described in “EQUALIZATION FOR ETSI-COMPLIANT AUDIO FACILITY” (cf. [1]). Although related, the Head and Torso Simulator (HATS) itself, cf. [3], is not covered here.

Figure 1: ETSI Test Room and HATS equalization is required for sound verification of Speech Enhancement devices under laboratory conditions

Figure 1 outlines the ETSI Audio/Test Room from the perspective of loudspeaker and HATS equalization; the recommended delays are $\Delta t_{FL} = 0$ ms, $\Delta t_{RL} = 11$ ms, $\Delta t_{FR} = 17$ ms, $\Delta t_{RR} = 29$ ms.
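As a rough illustration of how these delays map onto a digital playback chain, the sketch below converts them to sample counts and delays a channel by prepending zeros. The 48 kHz sample rate and the helper names are assumptions made for illustration; the text does not specify them.

```python
# Recommended playback delays from the text, in milliseconds.
DELAYS_MS = {"FL": 0, "RL": 11, "FR": 17, "RR": 29}

SAMPLE_RATE_HZ = 48000  # assumed; not specified in the text


def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def apply_delay(signal, delay_samples):
    """Prepend zeros so the signal starts delay_samples later."""
    return [0.0] * delay_samples + list(signal)


# Per-channel delays expressed in samples at the assumed sample rate.
delays = {ch: delay_in_samples(ms, SAMPLE_RATE_HZ) for ch, ms in DELAYS_MS.items()}
print(delays)
```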

Equalization in 3-D space can only be approximate. Some of the applicable standards are [3, 4]. The main task in ETSI Audio Room Equalization is the equalization of four loudspeakers, as observed at the HATS location. According to [4], the actual points of observation are the microphones at the left and right artificial ears, so the equalization is done binaurally. The procedure is outlined as follows:

Step 1:  Separate equalization of the four speakers and separate level adjustment for each speaker. The front left (FL) speaker and the rear left (RL) speaker equalization and level adjustment use the left ear simulator.  The front right (FR) speaker and the rear right (RR) speaker equalization and level adjustment use the right ear simulator.

Step 2: Joint equalization of the two left-hand loudspeakers (FL and RL) using the left ear simulator.

Step 3: Joint equalization of the two right-hand loudspeakers (FR and RR) using the right ear simulator.

Step 4: Joint equalization of all loudspeakers using the left ear and the right ear simulators.
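The four steps above can be written down as a list of measurement passes, each naming the loudspeakers that play and the ear simulators that record. This is only an illustrative data layout; the name `PASSES` and the tuple structure are hypothetical.

```python
# Each pass: (step number, loudspeakers playing, ear simulators used).
PASSES = [
    (1, ("FL",), ("left",)),
    (1, ("RL",), ("left",)),
    (1, ("FR",), ("right",)),
    (1, ("RR",), ("right",)),
    (2, ("FL", "RL"), ("left",)),
    (3, ("FR", "RR"), ("right",)),
    (4, ("FL", "RL", "FR", "RR"), ("left", "right")),
]

for step, speakers, ears in PASSES:
    print(f"Step {step}: play {'+'.join(speakers)}, measure at {' and '.join(ears)} ear")
```

Step 1 expands into four separate passes, so the whole procedure amounts to seven measurements under this layout.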

The equalization process uses suitably filtered pink noise. The equalization is done in 1/3-octave bands over the frequency range [100 Hz, 10000 Hz]. The center frequencies $f_{n,c}$ of the 1/3-octave filters are 100 Hz, 125 Hz, 160 Hz, …, 6300 Hz, 8000 Hz, 10000 Hz (21 bands in total) and are selected according to the Preferred Numbers R10 (cf. [4, 5]). Note that the R10 numbers are not exactly geometrically spaced: the quotient of adjacent center frequencies is not exactly the 1/3-octave ratio $2^{1/3}$. They are, however, sufficiently close to the values this ratio gives (for example, 160 Hz vs. 125 Hz × $2^{1/3}$ ≈ 157.5 Hz). The cutoff frequencies of the bandpass filters are spaced geometrically, though, namely:

$$f_{n,lo} = f_{n,c} \cdot 2^{-1/6} \tag{1}$$

$$f_{n,hi} = f_{n,c} \cdot 2^{1/6} \tag{2}$$

where $f_{n,c}$ ($n = 1, 2, \ldots, 21$) is the center frequency of the $n$th 1/3-octave band, and $f_{n,lo}$ and $f_{n,hi}$ are the low and high cutoff frequencies of that band.
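To make the band layout concrete, the sketch below lists the 21 nominal center frequencies, compares the ratio of adjacent centers with the exact 1/3-octave ratio 2^(1/3), and computes band edges. It assumes the common 1/3-octave convention in which the cutoffs lie a factor of 2^(1/6) below and above the center frequency; the function name `band_edges` is hypothetical.

```python
# Nominal 1/3-octave center frequencies (R10 preferred numbers), in Hz.
CENTERS_HZ = [100, 125, 160, 200, 250, 315, 400, 500, 630, 800, 1000,
              1250, 1600, 2000, 2500, 3150, 4000, 5000, 6300, 8000, 10000]

THIRD_OCTAVE = 2 ** (1 / 3)   # exact ratio between adjacent centers, about 1.26
SIXTH_OCTAVE = 2 ** (1 / 6)   # assumed ratio between a center and its band edges


def band_edges(center_hz):
    """Return (low, high) cutoff frequencies for one band (assumed convention)."""
    return center_hz / SIXTH_OCTAVE, center_hz * SIXTH_OCTAVE


# Ratios of adjacent nominal centers stay close to, but not exactly at, 2^(1/3).
ratios = [hi / lo for lo, hi in zip(CENTERS_HZ, CENTERS_HZ[1:])]

for fc in CENTERS_HZ:
    lo, hi = band_edges(fc)
    print(f"{fc:>6} Hz: {lo:8.1f} Hz to {hi:8.1f} Hz")
```

For the 1000 Hz band this gives edges of roughly 891 Hz and 1122 Hz.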

Although the standard, as per Ref.[3], recommends using filtered pink noise, filtered white noise is a technically sound alternative.
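The two excitations differ in how power is distributed over the bands: idealized pink noise (power density proportional to 1/f) carries equal power in every 1/3-octave band, while idealized white noise (constant power density) carries power proportional to the bandwidth, so its band level rises by about 1 dB per band. The sketch below computes both relative band levels; it assumes band edges a factor of 2^(1/6) either side of the center, and all names are illustrative.

```python
import math

SIXTH_OCTAVE = 2 ** (1 / 6)  # assumed center-to-edge ratio


def white_band_power(fc):
    """Constant power density: band power equals the bandwidth (arbitrary units)."""
    return fc * SIXTH_OCTAVE - fc / SIXTH_OCTAVE


def pink_band_power(fc):
    """1/f power density: band power is the integral of 1/f, i.e. ln(f_hi / f_lo)."""
    return math.log((fc * SIXTH_OCTAVE) / (fc / SIXTH_OCTAVE))


def level_db(power, reference):
    """Power ratio expressed in dB."""
    return 10 * math.log10(power / reference)


centers = [100, 1000, 10000]
white_db = [level_db(white_band_power(f), white_band_power(100)) for f in centers]
pink_db = [level_db(pink_band_power(f), pink_band_power(100)) for f in centers]
print(white_db, pink_db)
```

Under these idealized assumptions the white-noise band level at 10 kHz sits about 20 dB above the 100 Hz band, whereas the pink-noise band levels are equal.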

The signal played back from the loudspeakers (as per Steps 1-4) is recorded synchronously at the corresponding ear simulators of the artificial head (HATS). The power spectral densities are then computed for the recorded signal and for the original signal. The averaged magnitude values $H_n$ ($n = 1, 2, \ldots, 21$), in dB, for the equalization filters are computed as follows:

$$H_n = 20\log_{10}\left(\mathrm{ave}\,|S_{n,out}(f)|\right) - 20\log_{10}\left(\mathrm{ave}\,|S_{n,in}(f)|\right) \tag{3}$$

where the ‘ave’ operator is the arithmetic mean over all frequencies of the $n$th 1/3-octave band, $S_{n,in}(f)$ and $S_{n,out}(f)$ are the spectra of the original and the recorded signal within that band, and $|\cdot|$ denotes the absolute value.
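A minimal sketch of the band-averaged computation in Equation (3), assuming the magnitude spectra are already available as lists sampled on a common frequency grid (for example from an FFT, which is not shown). The function names and the half-open band interval are assumptions.

```python
import math


def band_average(freqs_hz, magnitudes, f_lo, f_hi):
    """Arithmetic mean of |S(f)| over the bins with f_lo <= f < f_hi."""
    in_band = [abs(m) for f, m in zip(freqs_hz, magnitudes) if f_lo <= f < f_hi]
    if not in_band:
        raise ValueError("no frequency bins fall inside the band")
    return sum(in_band) / len(in_band)


def band_level_difference_db(freqs_hz, s_in, s_out, f_lo, f_hi):
    """20*log10(ave|S_out|) - 20*log10(ave|S_in|) for one band, in dB."""
    return (20 * math.log10(band_average(freqs_hz, s_out, f_lo, f_hi))
            - 20 * math.log10(band_average(freqs_hz, s_in, f_lo, f_hi)))


# Toy data: the recorded spectrum is twice the original in the 1 kHz band.
freqs = [900, 950, 1000, 1050, 1100]
original = [1.0] * 5
recorded = [2.0] * 5
difference_db = band_level_difference_db(freqs, original, recorded, 890.9, 1122.5)
print(round(difference_db, 2))
```

With the toy data the result is about 6.02 dB, i.e. a doubling of magnitude in that band.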

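After applying the gains $G_n$ ($n = 1, 2, \ldots, 21$), computed as follows,

$$G_n = 1/H_n \tag{4}$$

to the individual filters (with $H_n$ taken as a linear magnitude ratio; in dB the gain is simply $-H_n$), the resultant frequency response (as in Step 4) shall be flat between 100 Hz and 10 kHz within a tolerance of ±3 dB.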
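The inversion and the tolerance check can be sketched as follows, working in dB, where inverting a response amounts to negating it. Measuring the ±3 dB deviation relative to the mean band level is an assumption made here for illustration, as are the function names.

```python
def correction_gains_db(response_db):
    """Inverse of the measured per-band response, expressed in dB."""
    return [-h for h in response_db]


def db_to_linear(gain_db):
    """Convert a dB gain to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def within_tolerance(levels_db, tol_db=3.0):
    """True if every band lies within +/- tol_db of the mean level (assumed reference)."""
    reference = sum(levels_db) / len(levels_db)
    return all(abs(level - reference) <= tol_db for level in levels_db)


measured_db = [2.0, -4.0, 1.0]            # toy per-band response, not real data
gains_db = correction_gains_db(measured_db)
equalized_db = [h + g for h, g in zip(measured_db, gains_db)]
print(within_tolerance(measured_db), within_tolerance(equalized_db))
```

In practice the equalized response would be re-measured rather than computed, since the room and the joint playback of several loudspeakers make the result only approximately predictable.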

There are several other details regarding the equalization procedure and its implementation. Contact us to discuss the details with our engineering staff.

VOCAL’s Voice Enhancement practice includes lab facility equalization according to the applicable standards, using DSP-based techniques such as the ones discussed in this note.


  2. Head and torso simulator for telephonometry, ITU-T P.58 (05/2013)
  3. Background noise simulation technique and background noise database; ETSI EG 202 396-1 V1.2.2 (2008-09) – newer versions are available
  4. ISO 266 (in section related to R10)
  5. British Standard BS2045:1965 Preferred Numbers (there are ISO and ANSI versions (1973)).
  6. Echo Cancellation Design
  7. Voice Enhancement Design