Speech reverberation can be a significant problem for speech enhancement, especially in free-space systems such as videoconferencing setups, laptops, and tablets, where environmental echoes can disrupt conversations.
What is Reverberation?
Speech reverberation is the result of multi-path propagation of the desired speaker's voice to the receiving microphone, a process characterized by the room impulse response. Reverberation occurs when the microphone picks up multiple attenuated and delayed copies of a single signal. In speech communication applications, these copies are generated when sound reflects off surfaces in the environment. Attenuation occurs because the surfaces absorb some of the sound energy.
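The "attenuated and delayed copies" description can be sketched in a few lines of Python. This is only an illustration: the function name `add_reflections` and the delay and gain values are hypothetical, chosen for readability rather than measured in a real room.

```python
import numpy as np

def add_reflections(signal, sample_rate, paths):
    """Sum delayed, attenuated copies of `signal`.

    paths: list of (delay_seconds, gain) pairs; (0.0, 1.0) is the direct path.
    """
    max_delay = max(int(round(d * sample_rate)) for d, _ in paths)
    out = np.zeros(len(signal) + max_delay)
    for delay_s, gain in paths:
        n = int(round(delay_s * sample_rate))
        # Each propagation path contributes one shifted, scaled copy.
        out[n:n + len(signal)] += gain * signal
    return out

fs = 16000                      # sample rate in Hz (assumed)
click = np.zeros(160)
click[0] = 1.0                  # a unit impulse stands in for the source
# Hypothetical paths: direct sound plus two wall reflections.
paths = [(0.0, 1.0), (0.005, 0.6), (0.012, 0.35)]
reverberant = add_reflections(click, fs, paths)
```

Because the source here is a unit impulse, `reverberant` is itself a very sparse room impulse response; convolving any speech signal with it would produce the same sum of copies.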
Early Reflections and Late Reflections – Voice Reverberation
Reverberation is usually broken up into two components: early reflections and late reflections. The early reflections consist of single reflections off surfaces within the room. Although they are delayed copies of the speech, early reflections can improve speech intelligibility because they strengthen the direct sound component. This is referred to as the precedence effect.
Late reflections refer to the section of the reverberation where multiple reflections of the speaker’s voice combine to form a diffuse, exponentially decaying echo of the direct sound component. The late reflections result in noticeable echoes and colouration, which cause the speaker to sound distant.
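The early/late split can be illustrated with a synthetic room impulse response. The sketch below assumes a common toy model (exponentially decaying white noise) and a 50 ms boundary between early and late parts; neither is specified in the text above, and real rooms are more structured. The resulting early-to-late energy ratio is the quantity usually called the clarity index C50.

```python
import numpy as np

def synthetic_rir(rt60, sample_rate, length_s, seed=0):
    """Toy RIR: white noise under an exponential decay envelope."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(length_s * sample_rate)) / sample_rate
    # Amplitude envelope that has fallen by 60 dB after rt60 seconds.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(len(t)) * decay
    rir[0] = 1.0  # direct sound
    return rir

def split_early_late(rir, sample_rate, boundary_s=0.05):
    """Split at an assumed 50 ms boundary after the direct sound."""
    k = int(round(boundary_s * sample_rate))
    return rir[:k], rir[k:]

def early_to_late_ratio_db(rir, sample_rate, boundary_s=0.05):
    early, late = split_early_late(rir, sample_rate, boundary_s)
    return 10.0 * np.log10(np.sum(early ** 2) / np.sum(late ** 2))

fs = 16000
short_rir = synthetic_rir(rt60=0.3, sample_rate=fs, length_s=1.5)
long_rir = synthetic_rir(rt60=1.0, sample_rate=fs, length_s=1.5)
c50_short = early_to_late_ratio_db(short_rir, fs)
c50_long = early_to_late_ratio_db(long_rir, fs)
```

In this model the room with the longer reverberation time puts a larger share of its energy into the late part, so `c50_long` comes out lower than `c50_short`.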
Reverberation tends to spread speech energy over time, which reduces the information available for phonemic identification. This time-energy spreading has two distinct yet equally important effects on speech intelligibility.
- The energy in individual phonemes becomes spread out. Thus, plosives have a markedly delayed onset and decay, and fricatives are smoothed.
- Preceding phonemes blur into the current ones. This effect is most apparent when a vowel precedes a consonant.
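The second effect, a preceding phoneme blurring into the current one, can be demonstrated with a toy signal. Everything in this sketch is an assumption made for illustration: noise bursts stand in for a loud 100 ms "vowel" and a quiet 30 ms "consonant", and the room is modelled as decaying noise with a 0.5 s reverberation time.

```python
import numpy as np

fs = 16000
rng = np.random.default_rng(0)

# Hypothetical stand-ins: a loud "vowel" followed by a quiet "consonant".
vowel_len = int(round(0.100 * fs))
cons_len = int(round(0.030 * fs))
vowel = np.zeros(vowel_len + cons_len)
vowel[:vowel_len] = rng.standard_normal(vowel_len)
consonant = np.zeros(vowel_len + cons_len)
consonant[vowel_len:] = 0.1 * rng.standard_normal(cons_len)

# Toy RIR: direct sound plus a noise tail decaying 60 dB in 0.5 s.
t = np.arange(int(0.5 * fs)) / fs
rir = rng.standard_normal(len(t)) * 10.0 ** (-3.0 * t / 0.5)
rir[0] = 1.0

def energy_in_consonant_window(x):
    """Energy of x inside the time slot occupied by the dry consonant."""
    return float(np.sum(x[vowel_len:vowel_len + cons_len] ** 2))

# Vowel energy that falls into the consonant's slot, dry vs. reverberant.
dry_leak = energy_in_consonant_window(vowel)
wet_leak = energy_in_consonant_window(np.convolve(vowel, rir))
consonant_energy = energy_in_consonant_window(np.convolve(consonant, rir))
```

Without reverberation the vowel contributes nothing to the consonant's time slot (`dry_leak` is zero). With the toy room applied, the vowel's tail lands in that slot, and in this setup it carries more energy there than the consonant itself.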
The greater the distance between the speaker and the microphone, the larger the effect reverberation will have on speech quality and the greater the need for dereverberation processing. The critical distance is the distance at which the reverberant sound level equals the direct sound level. It shrinks as the reverberation time of the room grows: in the standard diffuse-field approximation, it is inversely proportional to the square root of the reverberation time.
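For a rough sense of scale, the critical distance can be estimated with the widely used Sabine-based diffuse-field approximation, d_c ≈ 0.057 · sqrt(Q · V / RT60), with V in cubic metres and RT60 in seconds. In this approximation the distance scales with the square root of volume over reverberation time. The formula and the example room values below are assumptions brought in for illustration; they do not come from the text above.

```python
import math

def critical_distance_m(room_volume_m3, rt60_s, directivity=1.0):
    """Approximate critical distance in metres (diffuse-field assumption).

    directivity is the source directivity factor Q (1.0 = omnidirectional).
    """
    return 0.057 * math.sqrt(directivity * room_volume_m3 / rt60_s)

# Hypothetical meeting room: 100 cubic metres, RT60 of 0.5 s.
d_room = critical_distance_m(100.0, 0.5)
# Same room with four times the reverberation time.
d_reverberant = critical_distance_m(100.0, 2.0)
```

Under this approximation, quadrupling the reverberation time halves the critical distance, so a microphone placed at a fixed distance from the talker moves further into the reverberation-dominated region.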