How Does Acoustic Imaging Work? | VOCAL Technologies

Acoustic imaging is built on microphone array beamforming. Under the far-field propagation model, sound from each source arrives at an array of N microphones as a plane wave from a given pair of polar and azimuth angles (or a 2D location).
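As a rough illustration of the far-field model, the sketch below computes the relative arrival time of a plane wave at each microphone. The speed of sound, the angle convention (polar angle measured from the z axis, azimuth in the x-y plane), the function names and the 4-microphone layout are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C

def direction_vector(polar, azimuth):
    """Unit vector pointing from the array toward the source (angles in radians)."""
    return np.array([
        np.sin(polar) * np.cos(azimuth),
        np.sin(polar) * np.sin(azimuth),
        np.cos(polar),
    ])

def plane_wave_delays(mic_positions, polar, azimuth, c=SPEED_OF_SOUND):
    """Arrival time at each microphone relative to the array origin, in seconds.

    mic_positions is an (N, 3) array of coordinates in metres.
    Microphones closer to the source hear the wavefront earlier (negative delay).
    """
    u = direction_vector(polar, azimuth)
    return -(mic_positions @ u) / c

# Hypothetical 4-microphone line array along the x axis with 5 cm spacing.
mics = np.array([[0.05 * n, 0.0, 0.0] for n in range(4)])
broadside = plane_wave_delays(mics, polar=0.0, azimuth=0.0)
endfire = plane_wave_delays(mics, polar=np.pi / 2, azimuth=0.0)
```

For the broadside direction every microphone sees the wavefront at the same time, while for the endfire direction the delay grows linearly along the array.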

The array cross spectral matrix (CSM) is defined as:

$C(t,f)\ =\ \sum_{j}a_j(t,f)\,\textbf{s}_j(f)\,\textbf{s}_j^{\prime}(f)$

Here $a_j\left(t,f\right)$ is the amplitude of the sound source at position j at a given time and frequency, $\textbf{s}_j$ is the corresponding N-dimensional steering vector, and the prime denotes the conjugate transpose. The steering vectors are normalized to unit length. The beamforming function for a steering vector $\textbf{s}$ is:

$b(\textbf{s})=\ \textbf{s}\prime C\textbf{s}$

Substituting the CSM gives $b(\textbf{s})=\sum_{j}a_j\left(t,f\right)\left|\textbf{s}\prime(f)\textbf{s}_j(f)\right|^2$. A source whose steering vector aligns with the beamforming steering vector therefore contributes its full amplitude $a_j\left(t,f\right)$ to the output, while a source that is not aligned contributes $a_j\left(t,f\right)\left|\textbf{s}\prime(f)\textbf{s}_j(f)\right|^2$. The factor $\left|\textbf{s}\prime(f)\textbf{s}_j(f)\right|^2$ is called the point spread function (PSF). It equals 1 at the true source location and at any aliasing points, and is less than 1 at all other locations.
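The relationship between the CSM, the beamforming function and the PSF can be checked numerically. The following is a minimal sketch that assumes a uniform line array, a speed of sound of 343 m/s and two hypothetical sources; the function names and all numeric values are illustrative and not taken from any particular system.

```python
import numpy as np

def steering_vector(n_mics, spacing, freq, angle, c=343.0):
    """Unit-norm far-field steering vector for a uniform line array.

    angle is measured from broadside, in radians (an assumed convention).
    """
    n = np.arange(n_mics)
    phase = -2j * np.pi * freq * n * spacing * np.sin(angle) / c
    return np.exp(phase) / np.sqrt(n_mics)

def cross_spectral_matrix(amplitudes, steering_vectors):
    """C = sum_j a_j s_j s_j' for one time-frequency bin."""
    return sum(a * np.outer(s, s.conj()) for a, s in zip(amplitudes, steering_vectors))

def beamform(C, s):
    """b(s) = s' C s; the result is real because C is Hermitian."""
    return np.real(s.conj() @ C @ s)

def psf(s, s_j):
    """|s' s_j|^2 for unit-norm steering vectors."""
    return np.abs(np.vdot(s, s_j)) ** 2

# Hypothetical setup: 8 microphones, 4 cm spacing, one 2 kHz bin, two sources.
N, d, f = 8, 0.04, 2000.0
s1 = steering_vector(N, d, f, np.deg2rad(-20.0))
s2 = steering_vector(N, d, f, np.deg2rad(35.0))
a1, a2 = 1.0, 0.5
C = cross_spectral_matrix([a1, a2], [s1, s2])

on_source = beamform(C, s1)           # steering at source 1
expected = a1 + a2 * psf(s1, s2)      # full a1 plus PSF-weighted leakage from source 2
```

Steering at the first source returns its full amplitude plus the second source scaled by the PSF between the two directions, which is the sum written out above.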

Two common problems in acoustic imaging are spatial aliasing and microphone self-noise. Spatial aliasing occurs when the microphone spacing is larger than half the wavelength at the frequency of interest. The result is a beamforming output that contains multiple locations for which the PSF is equal to 1, of which only one is the true sound source. Simply decreasing the microphone spacing reduces aliasing at higher frequencies, but for a fixed number of microphones it also shrinks the array aperture, which increases the beamwidth (i.e. lowers the spatial resolution) at lower frequencies. To manage this trade-off, it is common to apply shading to the beamforming function.
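The half-wavelength condition can be illustrated with a small numerical check. The sketch below assumes a uniform line array and a speed of sound of 343 m/s; the spacings, frequency and angles are hypothetical values chosen so that an alias falls at a convenient direction.

```python
import numpy as np

C_SOUND = 343.0  # m/s, assumed

def max_unaliased_frequency(spacing, c=C_SOUND):
    """Highest frequency whose half wavelength is still at least the spacing."""
    return c / (2.0 * spacing)

def psf_line_array(n_mics, spacing, freq, look_angle, source_angle, c=C_SOUND):
    """|s' s_j|^2 for a uniform line array; angles from broadside, in radians."""
    n = np.arange(n_mics)
    k = 2.0 * np.pi * freq / c
    s = np.exp(-1j * k * n * spacing * np.sin(look_angle)) / np.sqrt(n_mics)
    s_j = np.exp(-1j * k * n * spacing * np.sin(source_angle)) / np.sqrt(n_mics)
    return np.abs(np.vdot(s, s_j)) ** 2

freq = 3430.0                  # wavelength of 0.1 m at 343 m/s
source = np.deg2rad(30.0)      # true source direction
ghost = np.deg2rad(-30.0)      # direction tested for an alias

wide = psf_line_array(8, 0.10, freq, ghost, source)    # spacing = one wavelength
narrow = psf_line_array(8, 0.05, freq, ghost, source)  # spacing = half a wavelength
```

With the one-wavelength spacing the PSF at the ghost direction is 1, so the beamformer cannot tell it apart from the true source. With the half-wavelength spacing the PSF at that same direction drops to essentially zero.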

The shading matrix W holds a frequency-dependent weight for each microphone, chosen to help maintain a constant beamwidth across a range of frequencies. The shaded beamforming function is:

$b_{shad}(\textbf{s})=\ \textbf{s}\prime\textbf{W}C\textbf{W}\prime\textbf{s}$
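The text does not say how the weights are chosen, so the sketch below uses one plausible choice purely for illustration: a diagonal W with a Gaussian taper whose width shrinks in proportion to wavelength, so that the effective aperture measured in wavelengths stays roughly fixed. The taper shape, `f_ref`, `sigma_ref`, the rescaling step and the function names are all assumptions.

```python
import numpy as np

def shading_matrix(mic_x, freq, f_ref=1000.0, sigma_ref=0.15):
    """Diagonal W with a Gaussian taper that narrows as frequency rises.

    f_ref (Hz) and sigma_ref (metres) are illustrative values, not from the text.
    The weights are rescaled to average 1 so an on-source peak keeps its level.
    """
    sigma = sigma_ref * f_ref / freq
    centred = mic_x - mic_x.mean()
    w = np.exp(-0.5 * (centred / sigma) ** 2)
    w *= len(w) / w.sum()
    return np.diag(w)

def beamform_shaded(C, s, W):
    """b_shad(s) = s' W C W' s, evaluated as v' C v with v = W' s."""
    v = W.conj().T @ s
    return np.real(v.conj() @ C @ v)

# Hypothetical 8-microphone line array, 4 cm spacing, single source at 10 degrees.
N, d, f = 8, 0.04, 2000.0
x = d * np.arange(N)
s_j = np.exp(-2j * np.pi * f * x * np.sin(np.deg2rad(10.0)) / 343.0) / np.sqrt(N)
C = 2.0 * np.outer(s_j, s_j.conj())     # one source with amplitude 2

W = shading_matrix(x, f)
peak = beamform_shaded(C, s_j, W)                    # shaded output on the source
unshaded = beamform_shaded(C, s_j, np.eye(N))        # W = I recovers b(s)
```

Because the weights average to 1 and every steering-vector element has the same magnitude, the shaded output on the source stays at the source amplitude; only the shape of the response away from the source changes.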

To reduce the effect of microphone self-noise, which is additive and incoherent between microphones, a diagonal optimization procedure is applied to the CSM. The goal is to remove any components of the array response that do not correspond to an actual sound source, which improves the dynamic range of the acoustic imaging system.
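The optimization procedure itself is not described here. To show why the diagonal is the natural place to act, the sketch below uses plain diagonal removal, a simpler stand-in and not the optimization procedure referred to above. It assumes self-noise of equal power on every microphone, which adds a multiple of the identity to the CSM; the function names and values are hypothetical.

```python
import numpy as np

def remove_diagonal(C):
    """Return a copy of the CSM with its diagonal set to zero."""
    C_dr = C.copy()
    np.fill_diagonal(C_dr, 0.0)
    return C_dr

def beamform_diagonal_removed(C, s):
    """s' (C - diag(C)) s, rescaled by N / (N - 1).

    The rescaling restores the source level lost with the diagonal when every
    steering-vector element has magnitude 1 / sqrt(N).
    """
    n = len(s)
    return np.real(s.conj() @ remove_diagonal(C) @ s) * n / (n - 1)

N = 8
phase = -2j * np.pi * 0.2 * np.arange(N)     # arbitrary illustrative phase ramp
s_j = np.exp(phase) / np.sqrt(N)
a, noise_power = 1.0, 0.3

# One source plus incoherent self-noise, which appears only on the diagonal.
C_noisy = a * np.outer(s_j, s_j.conj()) + noise_power * np.eye(N)

raw = np.real(s_j.conj() @ C_noisy @ s_j)           # source amplitude plus noise power
cleaned = beamform_diagonal_removed(C_noisy, s_j)   # source amplitude only
```

In this idealized case the noise term disappears from the on-source output. A known drawback of plain diagonal removal is that the modified matrix is no longer positive semidefinite, so the output can go negative away from the sources.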