RT60 estimation in bands extends the general practice of RT60 estimation [1]. The recommendations here assume that the reverberant properties of the system under consideration are estimated from its impulse response. Other methods can also be used (e.g., generating noise, stopping it, and measuring how quickly the acoustic energy in the room decays), but for simplicity and convenience we refer to the impulse-response-based method throughout.
In acoustic and audio practice involving experiments in rooms or other enclosed spaces, it is desirable to know the acoustic characteristics of the given room, such as RT60.
While full-band (i.e., [0, Fs/2]) characterization is common in engineering practice, it does not answer the fundamental question: what is RT60 in the frequency range of interest for a given application? If the application, considered in the broad sense, relates to speech propagation and speech processing in the narrow band (which implies Fs = 8000 Hz), then RT60 characterized in the frequency band [300 Hz, 3400 Hz] is more relevant than RT60 determined for a much wider frequency band. Likewise, if the application relates to the wide band (which implies Fs = 16000 Hz), then RT60 characterized in the frequency band [50 Hz, 7000 Hz] is more relevant than RT60 characterized in, for example, [20 Hz, 16000 Hz].
These considerations do not lead us to ultimate rules regarding the frequency band choice. They only indicate that the choice of the band of interest when estimating RT60 has practical meaning and that there is no universal method of selecting the right frequency band. This is of particular importance in cases where there is a strong relationship between RT60 and the width and center location of the frequency band under consideration.
These considerations also raise the question of how practical it is to know RT60 in the wide band of interest. Here the term “wide band” refers to a band close to [0, Fs/2], yet not covering it entirely. It is always good practice to choose a low cutoff frequency that corresponds to the given application (as opposed to the theoretical low limit) and a high cutoff frequency that is measurably lower than the Nyquist frequency Fs/2 (as opposed to the theoretical high limit).
The room response, and specifically the RIR (Room Impulse Response), contains all the information needed to establish the RT60 characteristic for the two points of reference, namely the location of the excitation source (typically an equalized omni-directional loudspeaker) and the location of the sensor (typically a high-quality omni-directional microphone). Therefore, to generate RT60 for a specific band, some pre-processing of the RIR is required before it is used to generate RT60. It is also assumed that the RIR is available prior to estimating RT60 and that it has been established experimentally using a sampling rate Fs that is sufficiently high for the application at hand.
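The band-limiting step can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: it assumes NumPy is available and that the RIR is a one-dimensional array sampled at `fs`, and it uses a windowed-sinc linear-phase FIR filter as one possible choice of pass-band filter. The function names `bandpass_fir` and `prefilter_rir` and the default of 1025 taps are hypothetical choices made for this example.

```python
import numpy as np


def bandpass_fir(f_lo, f_hi, fs, num_taps=1025):
    """Windowed-sinc, linear-phase band-pass FIR (Hamming window).

    Assumption of this sketch: an odd tap count, so the group delay is an
    integer number of samples and can be removed exactly.
    """
    if not (0.0 < f_lo < f_hi < fs / 2.0):
        raise ValueError("need 0 < f_lo < f_hi < fs/2")
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    n = np.arange(num_taps) - (num_taps - 1) / 2.0

    def ideal_lowpass(fc):
        # Impulse response of an ideal low-pass with cutoff fc (Hz).
        return 2.0 * fc / fs * np.sinc(2.0 * fc / fs * n)

    # A band-pass is the difference of two low-pass responses.
    h = ideal_lowpass(f_hi) - ideal_lowpass(f_lo)
    return h * np.hamming(num_taps)


def prefilter_rir(rir, f_lo, f_hi, fs, num_taps=1025):
    """Band-limit the RIR and remove the filter's group delay."""
    h = bandpass_fir(f_lo, f_hi, fs, num_taps)
    y = np.convolve(np.asarray(rir, dtype=float), h)
    delay = (num_taps - 1) // 2
    return y[delay:delay + len(rir)]
```

For example, `prefilter_rir(rir, 200.0, 3600.0, 8000.0)` would band-limit an RIR sampled at 8000 Hz to [200 Hz, 3600 Hz]. The transition width of such a filter scales with `fs / num_taps`, so cutoffs of only a few tens of Hz would likely need a considerably longer filter than the default used here.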
Specific considerations and recommendations:
- For the generation of RT60 reflecting room properties in the narrow band, [300 Hz, 3400 Hz], with Fs of 8000 Hz, the RIR should be pre-processed (filtered) by a filter corresponding to the narrow band. It is good practice to apply a filter band that comfortably encompasses the narrow band; thus, the pre-processing filter for the RIR can be [200 Hz, 3600 Hz];
- For the generation of RT60 that would reflect room properties in the wide band, [50 Hz, 7000 Hz], with Fs of 16000 Hz, it is good practice to apply a pass-band filter of approximately [40 Hz, 7400 Hz];
- If there is a need to know the RT60 for signals whose spectrum reflects human speech, a shaping filter should be applied to the RIR. A good reference for the shaping filter details is Ref.3;
- When characterizing RT60 for general purposes, the RIR should be generated using Fs of 24000 Hz or higher;
- In the case of generating RT60 in frequency bands covering the wide band [50 Hz, 7000 Hz] or larger, two typical choices are made:
  - RT60 in octave bands; then the standard bands are
    - [44 Hz, 88 Hz], with center frequency of 63 Hz,
    - [88 Hz, 177 Hz], with center frequency of 125 Hz,
    - [177 Hz, 355 Hz], with center frequency of 250 Hz,
    - [355 Hz, 710 Hz], with center frequency of 500 Hz,
    - [710 Hz, 1420 Hz], with center frequency of 1000 Hz,
    - [1420 Hz, 2840 Hz], with center frequency of 2000 Hz,
    - [2840 Hz, 5680 Hz], with center frequency of 4000 Hz,
    - [5680 Hz, 11360 Hz], with center frequency of 8000 Hz;
  - RT60 in 1/3rd octave bands (standard bands can be found in Ref.4).
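The octave-band edges listed above can be approximated from the nominal center frequencies. The sketch below is illustrative only: it assumes the common base-2 definition, in which the edges of a 1/b-octave band lie at fc · 2^(±1/(2b)). This reproduces the rounded edges in the list to within about 2% and should not be read as a substitute for the standardized values in Ref.4. The names `band_edges` and `usable_bands` are hypothetical.

```python
import math

# Nominal octave-band center frequencies from the list above (Hz).
NOMINAL_CENTERS_HZ = [63, 125, 250, 500, 1000, 2000, 4000, 8000]


def band_edges(fc, fraction=1):
    """Lower and upper edge (Hz) of a 1/fraction-octave band centered on fc.

    fraction=1 gives octave bands, fraction=3 gives 1/3rd octave bands
    (base-2 definition, assumed for this sketch).
    """
    half_band = math.pow(2.0, 1.0 / (2.0 * fraction))
    return fc / half_band, fc * half_band


def usable_bands(fs, centers=NOMINAL_CENTERS_HZ, fraction=1):
    """Return (fc, lo, hi) for each band that lies entirely below Nyquist."""
    nyquist = fs / 2.0
    bands = []
    for fc in centers:
        lo, hi = band_edges(fc, fraction)
        if hi < nyquist:
            bands.append((fc, lo, hi))
    return bands


for fc, lo, hi in usable_bands(24000):
    print(f"{fc:5d} Hz band: [{lo:8.1f} Hz, {hi:8.1f} Hz]")
```

Under this definition, the 8000 Hz octave band lies entirely below the Nyquist frequency only when Fs exceeds roughly 22.6 kHz, which may be one reason to generate the RIR at 24000 Hz or higher when all eight octave bands are of interest.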
Figure 2 depicts the pre-processing steps applied to the RIR used for generating RT60. The pass-band filter can use the cutoff frequencies mentioned earlier. In the case of octave bands and 1/3rd octave bands, the center frequencies and the cutoff frequencies are standardized and available in numerous documents (for example, cf. Ref.4). The ABS(y) block computes the absolute values of the filtered RIR. FIR AVE FILTER is an averaging FIR filter. Typically the averaging extends over 35 ms (or 70 ms); cf. ITU-T G.168-2012. Note that the results for RT60 are not very sensitive to the filter length. The output yFIR of the averaging filter, expressed in dB, is illustrated in Figure 2d. To compute RT60, linear interpolation over the middle section of this curve is performed. The argument range for the interpolation typically starts some 20 milliseconds after time 0 and ends at least several dozen milliseconds before the curve begins to flatten (which means that the desired signal becomes buried in the environmental noise). To extend the time range available for the interpolation, generating the RIR using the Golay method is recommended (cf. Ref.5).
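The chain described above (absolute value, averaging FIR filter, conversion to dB, straight-line fit over the middle section) can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure behind Figure 2: it assumes NumPy, takes the "linear interpolation" to be a least-squares line fit, centers the averaging window on each sample, and leaves the choice of fit range (`t_start`, `t_end`) to the caller. The function names are hypothetical.

```python
import numpy as np


def decay_curve_db(rir, fs, avg_ms=35.0):
    """ABS(y) block followed by an averaging FIR filter, expressed in dB."""
    env = np.abs(np.asarray(rir, dtype=float))
    n_avg = max(1, int(round(avg_ms * 1e-3 * fs)))   # e.g. 35 ms or 70 ms
    if len(env) < n_avg:
        raise ValueError("RIR is shorter than the averaging window")
    kernel = np.ones(n_avg) / n_avg
    # mode="same" centers the window on each sample (assumption of this sketch).
    y_fir = np.convolve(env, kernel, mode="same")
    return 20.0 * np.log10(np.maximum(y_fir, 1e-12))  # floor avoids log10(0)


def rt60_from_decay(decay_db, fs, t_start, t_end):
    """Fit a line to the decay curve on [t_start, t_end) seconds; return RT60 in s."""
    i0, i1 = int(round(t_start * fs)), int(round(t_end * fs))
    if not (0 <= i0 < i1 <= len(decay_db)) or i1 - i0 < 2:
        raise ValueError("fit range must lie inside the decay curve")
    t = np.arange(i0, i1) / fs
    slope, _intercept = np.polyfit(t, decay_db[i0:i1], 1)  # slope in dB per second
    if slope >= 0:
        raise ValueError("decay curve does not fall within the fit range")
    return -60.0 / slope  # time for the level to drop by 60 dB


# Synthetic check: noise whose amplitude decays by 60 dB in 0.5 s (no noise floor).
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
synthetic_rir = rng.standard_normal(fs) * 10.0 ** (-3.0 * t / 0.5)
estimate = rt60_from_decay(decay_curve_db(synthetic_rir, fs), fs, 0.02, 0.40)
print(f"estimated RT60: {estimate:.3f} s")
```

With a measured RIR, the end of the fit range would have to be chosen by inspecting where the decay curve starts to flatten into the noise floor; the synthetic signal here has no noise floor, so the 0.40 s end point is arbitrary.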
VOCAL Technologies engineering practices include characterization of acoustical environments where voice enhancement devices are verified. These practices include estimation of RT60 in full band and in sub-bands where required. Contact us to discuss your audio application needs.
REFERENCES
1. Methods of RT60 Estimation
2. Tashev, I., De-reverberation (Section 8), in Sound Capture and Processing, John Wiley and Sons, Ltd., 2009
3. ITU-T G.168-2012, Figure C-3
4. Frequency band
5. Impulse Response Estimation for Audio via Complementary Sequences
6. Echo Cancellation Design