Several parameters are used to characterize the reverberation effect in rooms. One of the most frequently used is RT60, the time it takes for a sustained sound in the room to drop by 60 dB after the sound source is inhibited.
Figure 1 illustrates typical behavior of the acoustical energy increase and decay when the room is excited by a moderate-duration noise pulse. There are several reliable methods for estimating room reverberation time RT60 using noise pulse and Dirac delta trains.
Room Reverberation Time
Reverberation time is one of the important parameters characterizing the properties of a room (in the context of an acoustic scene). An average listener perceives a long reverberation time as a sustained sound, often with low-frequency components. A long reverberation time makes spoken communication more difficult, because speech intelligibility is affected and sometimes significantly impaired. A short reverberation time produces a “dry” acoustic environment, which may feel unnatural; however, a “dry” acoustic environment improves verbal communication because speech intelligibility is better in such an environment.
Voice enhancement devices such as acoustic echo cancellation (AEC) and noise reduction (NR) mitigate, to a certain extent, the adverse effects associated with long reverberation time. Generally speaking, however, their voice enhancement performance degrades when the reverberation time is excessive.
To properly design and evaluate voice enhancement devices, it is important to measure reverberation characteristics and, if necessary, modify the acoustical environment so it represents the desired acoustical properties. Figure 2 provides an example of a typical audio system installation for evaluating reverberation properties.
RT60 Estimation using Noise Pulse Method
The first method is based on playing a train of noise pulses (as illustrated in Figure 3) and recording the room response with a microphone. Once the microphone signal has been recorded as an audio file, it should be preprocessed by applying a high-pass filter. A preferred filter is a high-pass FIR filter with a cut-off frequency fc greater than approximately 40 Hz. A typical choice of fc is between approximately 50 Hz and 70 Hz. The rationale for applying a high-pass filter is to restrict the reverberation measurement to the frequency band typical for voice enhancement applications.
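The pre-processing step can be sketched as follows. This is a minimal illustration, not a prescribed design: the windowed-sinc construction, the Hamming window, the 255-tap length, the 8 kHz sampling rate, and the function names `highpass_fir`, `gain_at`, and `fir_filter` are all assumptions made for the example. Only the standard library is used to keep it dependency-free; a numerical library would be the usual choice in practice.

```python
import math


def highpass_fir(cutoff_hz, fs_hz, num_taps=255):
    """Windowed-sinc high-pass FIR taps (Hamming window, odd length)."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs_hz  # normalized cut-off in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response, tapered by a Hamming window.
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    # Normalize the low-pass to unity DC gain, then apply spectral inversion
    # (delta minus low-pass) so the result blocks DC and passes high frequencies.
    dc_gain = sum(taps)
    taps = [-t / dc_gain for t in taps]
    taps[mid] += 1.0
    return taps


def gain_at(taps, freq_hz, fs_hz):
    """Magnitude response of the FIR filter at a single frequency."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)


def fir_filter(taps, signal):
    """Causal direct-form FIR filtering; output has the input's length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(len(taps), n + 1)):
            acc += taps[k] * signal[n - k]
        out.append(acc)
    return out


FS_HZ = 8000        # assumed sampling rate
CUTOFF_HZ = 60.0    # within the 50-70 Hz range mentioned above
taps = highpass_fir(CUTOFF_HZ, FS_HZ)
for f in (0.0, 60.0, 1000.0):
    print(f"{f:7.1f} Hz gain: {gain_at(taps, f, FS_HZ):.4f}")
```

With a cut-off this low relative to the sampling rate, the transition band of a short FIR filter is wide, so the tap count would need to grow if a sharper cut-off is required.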
Once the output signal is pre-processed, the short-term energy of the output should be computed. Only the energy profiles representing the signal decay portions are needed for further processing. To increase the statistical accuracy of the reverberation measurement, the individual decay portions should be averaged and the averaged short-term energy expressed on a log (dB) scale versus time. The time during which the signal level drops by 60 dB (when linearly interpolated) is RT60.
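The energy analysis described above can be sketched as follows. The sketch assumes the decay portions have already been cut out of the recording and aligned at the instant the noise is switched off; the frame length, the non-overlapping framing, and the function names are illustrative choices, and the input data here are synthetic energy profiles rather than a real recording.

```python
import math


def short_term_energy(signal, frame_len):
    """Mean-square energy over consecutive non-overlapping frames."""
    n_frames = len(signal) // frame_len
    return [
        sum(s * s for s in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def average_decay_db(decays):
    """Average several energy profiles frame by frame, then convert to dB
    relative to the first frame of the averaged profile."""
    n = min(len(d) for d in decays)
    mean = [sum(d[i] for d in decays) / len(decays) for i in range(n)]
    floor = 1e-20  # guards against log10(0)
    ref = max(mean[0], floor)
    return [10.0 * math.log10(max(e, floor) / ref) for e in mean]


def time_of_drop(levels_db, frame_s, drop_db=60.0):
    """Time at which the level first falls drop_db below its starting value,
    linearly interpolated between frames. Returns None if never reached."""
    target = levels_db[0] - drop_db
    for i in range(1, len(levels_db)):
        if levels_db[i] <= target:
            prev, cur = levels_db[i - 1], levels_db[i]
            frac = (prev - target) / (prev - cur)
            return (i - 1 + frac) * frame_s
    return None


# Synthetic example: ten identical decays whose energy falls 60 dB in 0.5 s.
frame_s = 0.01
true_rt60 = 0.5
decays = [
    [10.0 ** (-6.0 * i * frame_s / true_rt60) for i in range(80)]
    for _ in range(10)
]
levels = average_decay_db(decays)
rt60 = time_of_drop(levels, frame_s)
print(f"estimated RT60: {rt60:.3f} s")
```

Averaging is done on the linear energy values before the conversion to dB; averaging the dB curves instead would weight the noisy tail of each decay differently.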
It is recommended that at least 10 pulses are played back and the respective outputs recorded. Note that the repetition time ΔT should be carefully chosen based on prior knowledge of the room reverberation characteristics. ΔT should be greater than the expected RT60 value. The duration of the individual noise pulses should be greater than the energy rise time. As a rule of thumb, the pulse duration should be at least 3 to 5 times the estimated RT60.
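A possible way to generate such an excitation signal is sketched below. Several things here are assumptions rather than part of the method as described: the noise is uniform white noise, the silent gap after each pulse is the quantity kept longer than the expected RT60 (how ΔT is measured is defined by Figure 3), and the function name `noise_pulse_train` and its parameters are hypothetical.

```python
import random


def noise_pulse_train(fs_hz, pulse_s, gap_s, n_pulses=10, seed=0):
    """Return (samples, off_indices): noise bursts separated by silence.

    off_indices[i] is the sample index at which pulse i is switched off,
    i.e. where the corresponding decay portion of the recording begins.
    """
    rng = random.Random(seed)
    pulse_len = int(round(pulse_s * fs_hz))
    gap_len = int(round(gap_s * fs_hz))
    samples = []
    off_indices = []
    for _ in range(n_pulses):
        # Uniform white noise in [-1, 1] (an assumed noise type).
        samples.extend(rng.uniform(-1.0, 1.0) for _ in range(pulse_len))
        off_indices.append(len(samples))
        samples.extend([0.0] * gap_len)
    return samples, off_indices


expected_rt60_s = 0.5             # prior estimate for the room (assumed)
pulse_s = 3.0 * expected_rt60_s   # lower end of the 3-5x rule of thumb
gap_s = 1.5 * expected_rt60_s     # silence longer than the expected RT60
x, offs = noise_pulse_train(8000, pulse_s, gap_s, n_pulses=10)
print(len(x) / 8000.0, "seconds of excitation,", len(offs), "pulses")
```

The switch-off indices are returned so that the decay portions can later be located in the recording, after compensating for any playback and capture latency.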
RT60 Estimation using Dirac Delta Method
An alternate method of estimating RT60 is illustrated in Figure 4. The only difference between the two approaches is the choice of the excitation signal x. In Figure 4, signal x is a train of short pulses (in the ideal representation, these pulses are Dirac deltas).
The time at which the signal energy has dropped by 60 dB may be hard to assess. The output signal is often buried in the background noise before its level has dropped by 60 dB from the initial value measured when the noise generation is inhibited. In that case, an alternate formula for RT60 can be used:
RT60 = 60 · (T2 – T1) / |L2 – L1|
where the quotient represents the inverse of the sound decay speed (in dB/second), and the specific quantities T1, T2, L1, and L2 are explained and illustrated in Figure 4 (d). The caveat here is that any reduction of the level delta (i.e., ΔL = L2 – L1) has an adverse impact on the accuracy of the RT60 estimate.
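As a worked illustration, the sketch below reads the alternate formula as 60 dB multiplied by the inverse decay speed, that is, 60 · (T2 – T1) / |L2 – L1| with times in seconds and levels in dB. That reading, the function name `rt60_from_slope`, and the 1 dB reading error used in the example are assumptions. The loop shows why a smaller level delta hurts accuracy: the same reading error produces a larger RT60 error when ΔL is 10 dB than when it is 30 dB.

```python
def rt60_from_slope(t1_s, l1_db, t2_s, l2_db):
    """Extrapolate RT60 from two points on the decay curve."""
    delta_l_db = abs(l2_db - l1_db)
    if delta_l_db == 0.0:
        raise ValueError("L1 and L2 must differ")
    return 60.0 * abs(t2_s - t1_s) / delta_l_db


decay_rate_db_per_s = 120.0  # synthetic room: RT60 = 60 / 120 = 0.5 s
for delta_l in (10.0, 30.0):
    t2 = delta_l / decay_rate_db_per_s
    exact = rt60_from_slope(0.0, 0.0, t2, -delta_l)
    # Same points, but L2 misread by 1 dB.
    misread = rt60_from_slope(0.0, 0.0, t2, -(delta_l + 1.0))
    print(f"dL = {delta_l:4.1f} dB: exact {exact:.3f} s, "
          f"with 1 dB error {misread:.3f} s")
```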
Note that the two approaches described above focus on the full-band RT60 measurement. Often, knowledge of RT60 in predefined frequency bands is required. A typical approach to the spectral characterization of RT60 is to apply 1/3-octave filters to the output signal and analyze the short-term energy decay in the respective bands.
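The band layout for such an analysis can be sketched as follows. The example assumes exact base-2 spacing around a 1 kHz reference, with band edges at the center frequency multiplied or divided by 2^(1/6); the function name `third_octave_bands` and the 100 Hz to 4 kHz range are illustrative. Standardized filter banks use nominal rounded center frequencies, so the values may differ slightly from these.

```python
import math


def third_octave_bands(f_min_hz, f_max_hz, ref_hz=1000.0):
    """Return (lower_edge, center, upper_edge) in Hz for every 1/3-octave
    band whose center frequency lies in [f_min_hz, f_max_hz]."""
    k_lo = math.ceil(3.0 * math.log2(f_min_hz / ref_hz))
    k_hi = math.floor(3.0 * math.log2(f_max_hz / ref_hz))
    half_band = 2.0 ** (1.0 / 6.0)  # half of a 1/3-octave step
    bands = []
    for k in range(k_lo, k_hi + 1):
        center = ref_hz * 2.0 ** (k / 3.0)
        bands.append((center / half_band, center, center * half_band))
    return bands


bands = third_octave_bands(100.0, 4000.0)
for lo, fc, hi in bands:
    print(f"{lo:8.1f} {fc:8.1f} {hi:8.1f}")
```

Each band would then be isolated with a band-pass filter between its lower and upper edge, and the short-term energy decay analyzed per band in the same way as for the full-band signal.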
In many acoustical environments, estimations of RT60 may require data pre-processing as well as pre-sorting, and thus may blur the lines between science and art. VOCAL engineers are ready to consult in such cases or otherwise help in achieving proper estimations of the customer’s acoustic environment. Contact us to discuss your voice application verification requirements.