Dereverberation techniques vary widely, but they fall into two broad categories. Techniques that require an estimate of the acoustic impulse response (AIR) belong to the reverberation cancellation category; those that do not require an estimate of the AIR belong to the reverberation suppression category.
In hands-free communication systems, the desired signal is degraded by the acoustic channel between the source and the receiver (the microphone). Within a room there are multiple sources of speech degradation, including additive noise sources and the reverberation described by the room's acoustic impulse response. The difference between the two is that additive noise is assumed to be independent of the desired speech, while reverberation depends on it, because it consists of delayed and attenuated copies of that speech. Noise reduction algorithms address the additive noise sources, while dereverberation tackles the reflections.
Cancellation of Speech Reverberation
Reverberation cancellation attempts to inverse filter the AIR. The model for this type of dereverberation is $\hat{d}(t) = h^{-1}(t) * m(t) = h^{-1}(t) * (h(t) * d(t))$, where $h(t)$ is the acoustic impulse response, $d(t)$ is the desired signal, $\hat{d}(t)$ is the estimate of the desired signal, $m(t)$ is the microphone signal, $h^{-1}(t)$ is the inverse of the AIR, and $*$ denotes convolution. Since $d(t)$ is not known and only $m(t)$ is observed, this becomes a blind deconvolution problem. Blind deconvolution problems are solved by estimating either the source signal or the system it passed through.
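As a minimal sketch of the inverse-filtering model above, the following Python example covers only the non-blind case: it assumes the AIR is known exactly and is minimum phase, so that a stable recursive inverse exists. The toy signal `d`, the three-tap `h`, and the helper names `convolve` and `inverse_filter` are illustrative assumptions; real room responses are typically thousands of taps long and generally not minimum phase.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def inverse_filter(m, h, n_out):
    """Recover d from m = h * d by recursive inverse filtering.

    Assumes h[0] != 0 and that h is minimum phase (stable inverse).
    """
    d_hat = []
    for n in range(n_out):
        acc = m[n]
        # Remove the contribution of earlier, already recovered samples.
        for k in range(1, min(n, len(h) - 1) + 1):
            acc -= h[k] * d_hat[n - k]
        d_hat.append(acc / h[0])
    return d_hat


d = [1.0, -0.5, 0.25, 0.0, 2.0]   # hypothetical desired signal
h = [1.0, 0.6, 0.3]               # hypothetical toy AIR (minimum phase)
m = convolve(d, h)                # microphone signal m = h * d
d_hat = inverse_filter(m, h, len(d))
print(d_hat)
```

In the blind setting described above, `h` is not available and would have to be estimated from `m` first, which is where the difficulty lies.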
Eigenvector-based algorithms, which perform an eigenvalue decomposition of the correlation matrix of the microphone signal, are one approach to blind deconvolution. The null space of the correlation matrix contains information on the AIR. The problem with this approach is that its computational complexity is high and it is very sensitive to estimation errors in the correlation matrix.
Another approach attempts to take advantage of the harmonic structure of speech. The idea is to construct the dereverberation filter by adapting a harmonic filter to match the microphone signal. The disadvantage of this approach is that it has a very long adaptation time and does not track time-varying AIRs well.
Suppression of Speech Reverberation
The second class of dereverberation techniques is reverberation suppression. Reverberation suppression does not try to estimate the AIR, but instead tries to model either speech production or the AIR with just a few parameters. The information from the model is then used to remove the components of the microphone signal that are likely to be reverberant.
Linear prediction (LP) analysis is one approach to modeling speech production. For clean speech, the LP residual remains close to zero, while reverberation causes the LP residual to be time-varying. Thus the LP residual components related to reverberation can be suppressed, and the estimated clean speech residual is used to synthesize the clean speech. The LP residual approach is more effective in mildly reverberant environments than in highly reverberant ones, because reverberation degrades the estimation of the LP coefficients and therefore the dereverberation process.
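The LP analysis step can be sketched as follows. This is only an illustration of how an LP residual is computed, using the autocorrelation method with the Levinson-Durbin recursion; it does not implement the suppression or resynthesis stages. The helper names, the toy two-pole "voiced" signal, and the prediction order of 2 are assumptions chosen for the example.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return prediction-error filter a (with a[0] = 1) and error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lp_residual(x, a):
    """Residual e[n] = sum_k a[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


# Toy "voiced" signal: an impulse train driving a hypothetical 2-pole resonator.
excitation = [1.0 if n % 40 == 0 else 0.0 for n in range(400)]
x = []
for n, e in enumerate(excitation):
    x1 = x[n - 1] if n >= 1 else 0.0
    x2 = x[n - 2] if n >= 2 else 0.0
    x.append(e + 1.3 * x1 - 0.8 * x2)

order = 2
a, err = levinson_durbin(autocorr(x, order), order)
residual = lp_residual(x, a)
print(a, sum(e * e for e in residual) / sum(v * v for v in x))
```

Running the same analysis on a reverberant version of `x` (for example, `x` convolved with a toy AIR) is one way to see how reflections change both the estimated coefficients and the residual.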
Another approach, spectral enhancement, is similar to the spectral subtraction algorithms used for noise reduction. The spectral contribution of the reverberation is estimated and subtracted from the spectrum of the microphone signal. The reverberant spectral component can be derived from an estimate of the reverberation time, which can itself be estimated in various ways. The late reflections can be modeled by an exponentially decaying function that depends on the reverberation time.
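A minimal sketch of this idea, for a single frequency bin, is shown below. It assumes one simple form of the exponential decay model: the late reverberant power in a frame is taken to be the microphone power from a fixed number of frames earlier, attenuated by exp(-2 * delta * delay), with delta = 3 ln(10) / T60 so that the energy drops by 60 dB after T60 seconds. The function names, the 0.5 s reverberation time, the 10 ms frame shift, the 5-frame delay, and the spectral floor of 0.1 are all illustrative assumptions.

```python
import math


def decay_rate(t60):
    """Decay constant (1/s) giving a 60 dB energy drop after t60 seconds."""
    return 3.0 * math.log(10.0) / t60


def late_reverb_power(frame_power, t60, frame_shift, delay_frames):
    """Late reverberant power: delayed, exponentially attenuated input power."""
    atten = math.exp(-2.0 * decay_rate(t60) * delay_frames * frame_shift)
    return [atten * frame_power[l - delay_frames] if l >= delay_frames else 0.0
            for l in range(len(frame_power))]


def subtraction_gains(frame_power, reverb_power, floor=0.1):
    """Amplitude gains from power spectral subtraction, with a spectral floor."""
    gains = []
    for p, r in zip(frame_power, reverb_power):
        ratio = max(1.0 - r / p, 0.0) if p > 0.0 else 0.0
        gains.append(max(math.sqrt(ratio), floor))
    return gains


t60 = 0.5            # assumed reverberation time in seconds
frame_shift = 0.01   # assumed frame shift in seconds
delay_frames = 5     # assumed boundary between early and late reflections

# Toy input: a burst at frame 0 followed by a purely reverberant decaying tail.
frame_power = [math.exp(-2.0 * decay_rate(t60) * l * frame_shift)
               for l in range(20)]
reverb_power = late_reverb_power(frame_power, t60, frame_shift, delay_frames)
gains = subtraction_gains(frame_power, reverb_power)
print(gains)
```

In this toy case the first frames are passed unchanged, while the frames that contain only the decaying tail are attenuated down to the floor. The floor limits how aggressively a bin is suppressed, which is a common way to reduce the artifacts of spectral subtraction.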