The Mixed Excitation Linear Predictive (MELP) specification was first released in 1999, and the standard is commonly referred to as MIL-STD-3005 (U.S.) or STANAG 4591 (NATO). The algorithm evolved from work done for the military in the mid-1990s on an earlier linear predictive coding (LPC) vocoder known as LPC-10. MELP development addressed the deficiencies of the original LPC-10 algorithm. An enhanced mixed-excitation linear predictive version, MELPe, has also been produced.
MELP was selected as the new 2400 bps Federal Standard speech coder by the United States Department of Defense (DoD) Digital Voice Processing Consortium (DDVPC) after an extensive multi-year testing program. The vocoder selection criteria concentrated on four areas: intelligibility, voice quality, talker recognizability, and communicability. The selection criteria also included hardware parameters such as processing power, memory usage, and algorithm delay. MELP at 2400 bps was selected as the best of the seven candidates and even outperformed the FS1016 4800 bps codec, which operates at twice the bit rate.
Many modifications were made to LPC-10 to improve speech quality in the MELP speech coder. These include:
- Mixed pulse and noise excitation
- Periodic or aperiodic impulses
- Adaptive spectral enhancement
- Pulse dispersion filter
- Fourier magnitude modeling
Mixed Pulse and Noise Excitation
MELP mixed excitation is implemented using a multi-band mixing model. This model can simulate frequency-dependent voicing strength using a novel adaptive filtering structure based on a fixed filterbank. The primary effect of this multi-band mixed excitation is to reduce the buzz usually associated with LPC coders, especially in broadband acoustic noise.
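The mixing idea can be sketched in a few lines of Python. This is a minimal illustration, not the filters from the standard: the function name `mix_filters`, the two-band filterbank, and its tap values are all hypothetical. Each fixed band filter is weighted by that band's voicing strength to build the pulse-shaping filter, and by the complement to build the noise-shaping filter.

```python
def mix_filters(band_filters, voicing):
    """Combine fixed band filters into pulse and noise shaping filters.

    band_filters: list of FIR tap lists, one per band (all the same length).
    voicing: per-band voicing strength in [0, 1]; 1 means fully voiced.
    """
    if len(band_filters) != len(voicing):
        raise ValueError("one voicing strength per band is required")
    taps = len(band_filters[0])
    pulse = [0.0] * taps
    noise = [0.0] * taps
    for h, v in zip(band_filters, voicing):
        for i, c in enumerate(h):
            pulse[i] += v * c          # voiced share of this band
            noise[i] += (1.0 - v) * c  # unvoiced share of this band
    return pulse, noise


# Toy two-band filterbank (hypothetical coefficients, for illustration only).
BANDS = [
    [0.25, 0.5, 0.25],    # crude low-pass
    [-0.25, 0.5, -0.25],  # crude high-pass
]

# Low band fully voiced, high band fully unvoiced.
pulse_fir, noise_fir = mix_filters(BANDS, [1.0, 0.0])
```

The pulse train would then be filtered with `pulse_fir`, the noise source with `noise_fir`, and the two results added to form the mixed excitation. Because the two weights in each band sum to one, the combined response always equals the sum of the band filters.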
Periodic/Aperiodic Impulses
When the input speech is voiced, the MELP vocoder can synthesize speech using either periodic or aperiodic pulses. Aperiodic pulses are most often used during transition regions between voiced and unvoiced segments of the speech signal. This feature allows the synthesizer to reproduce erratic glottal pulses without introducing tonal noises.
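The difference between the two pulse modes can be illustrated with a short Python sketch. It assumes that aperiodic pulses are produced by randomly jittering each pitch interval; the 25% jitter bound, the 180-sample frame, and the name `pulse_positions` are illustrative assumptions rather than values taken from this text.

```python
import random


def pulse_positions(pitch_period, frame_len, aperiodic=False, jitter=0.25, rng=None):
    """Return the sample indices of the excitation pulses within one frame.

    In aperiodic mode each interval is scaled by a random factor in
    [1 - jitter, 1 + jitter] (an assumed jitter model).
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    positions, t = [], 0.0
    while t < frame_len:
        positions.append(int(t))
        step = float(pitch_period)
        if aperiodic:
            step *= 1.0 + rng.uniform(-jitter, jitter)
        t += step
    return positions


periodic = pulse_positions(40, 180)                  # evenly spaced pulses
erratic = pulse_positions(40, 180, aperiodic=True)   # jittered spacing
```

With evenly spaced pulses the excitation has a strong harmonic structure; jittering the spacing breaks that structure, which is how erratic glottal pulses can be imitated without tonal artifacts.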
Adaptive Spectral Enhancement
The MELP adaptive spectral enhancement filter is based on the poles of the LPC vocal tract filter and is used to enhance the formant structure in the synthetic speech. This filter improves the match between synthetic and natural bandpass waveforms, and introduces a more natural quality to the speech output.
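One common way to derive a formant-enhancing filter from the LPC poles is a pole/zero filter of the form A(z/alpha) / A(z/beta), where A(z) is the LPC polynomial and both terms are bandwidth-expanded copies of it. The Python sketch below assumes that form; the weights 0.5 and 0.8 and the function names are illustrative assumptions, not values stated in this text.

```python
def bandwidth_expand(a, gamma):
    """Return the coefficients of A(z/gamma): a[k] scaled by gamma**k.

    a is [1, a1, a2, ...] for A(z) = 1 + a1*z^-1 + a2*z^-2 + ...
    """
    return [c * gamma ** k for k, c in enumerate(a)]


def pole_zero_filter(x, num, den):
    """Direct-form IIR filter; den[0] is assumed to be 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


def enhance(x, a, alpha=0.5, beta=0.8):
    """Filter x with A(z/alpha) / A(z/beta) (assumed enhancement form)."""
    return pole_zero_filter(x, bandwidth_expand(a, alpha), bandwidth_expand(a, beta))


# First-order toy LPC polynomial A(z) = 1 - 0.9 z^-1 and a unit impulse.
impulse_response = enhance([1.0, 0.0, 0.0], [1.0, -0.9])
```

Because beta is larger than alpha, the poles sit closer to the unit circle than the zeros, so the response keeps peaks at the formant frequencies while the zeros partly cancel the overall spectral shape. Setting alpha equal to beta reduces the filter to the identity.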
Pulse Dispersion Filter
MELP pulse dispersion is implemented using a fixed pulse dispersion filter based on a spectrally flattened triangle pulse. This filter spreads the excitation energy within a pitch period, which in turn reduces the harsh quality of the synthetic speech.
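Spectral flattening can be sketched as follows: take the DFT of a triangle pulse, set every magnitude to one while keeping the phase, and transform back. The result is an all-pass FIR filter that spreads an impulse over several samples without changing its energy. The triangle shape (asymmetric, 12-sample rise and 4-sample fall), the 65-tap length, and the function names are assumptions made for this illustration. A symmetric triangle would not work here, because its phase is linear and flattening would leave only a delayed impulse.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def asymmetric_triangle(taps, rise, fall):
    """Triangle pulse peaking at the centre tap with unequal slopes."""
    peak = taps // 2
    pulse = []
    for i in range(taps):
        if peak - rise <= i <= peak:
            pulse.append((i - (peak - rise)) / rise)
        elif peak < i <= peak + fall:
            pulse.append((peak + fall - i) / fall)
        else:
            pulse.append(0.0)
    return pulse


def flattened_triangle(taps=65, rise=12, fall=4):
    """Keep the triangle's phase but force every DFT magnitude to one."""
    spectrum = dft(asymmetric_triangle(taps, rise, fall))
    flat = [c / abs(c) if abs(c) > 1e-12 else 1.0 for c in spectrum]
    return [v.real for v in dft(flat, inverse=True)]


def disperse(excitation, fir):
    """Convolve the excitation with the dispersion filter."""
    out = [0.0] * (len(excitation) + len(fir) - 1)
    for i, e in enumerate(excitation):
        if e != 0.0:
            for j, h in enumerate(fir):
                out[i + j] += e * h
    return out


fir = flattened_triangle()
spread = disperse([1.0], fir)  # a single pulse becomes a spread-out pulse
```

Since the filter has unit magnitude at every DFT bin, the long-term spectrum of the excitation is unchanged; only the time-domain peakiness of each pulse is reduced.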
Fourier Magnitude Modeling
Ten Fourier magnitudes are coded with an 8-bit vector quantizer. The encoder transmits the index of the code vector that minimizes the weighted Euclidean distance between the input vector and the code vectors.
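The codebook search described here is a nearest-neighbour search under a weighted squared distance. The Python sketch below uses a randomly generated 256-entry codebook and uniform weights purely as placeholders; the real codebook and weights are defined by the standard, and the name `nearest_codevector` is hypothetical.

```python
import random


def nearest_codevector(x, codebook, weights):
    """Return the index of the code vector closest to x.

    Distance is the weighted squared Euclidean distance
    sum_i weights[i] * (x[i] - code[i])**2; ties go to the lowest index.
    """
    best_index, best_dist = -1, float("inf")
    for index, code in enumerate(codebook):
        dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


# Placeholder codebook: 2**8 = 256 entries of 10 magnitudes each.
rng = random.Random(1)
codebook = [[rng.uniform(0.5, 1.5) for _ in range(10)] for _ in range(256)]
weights = [1.0] * 10  # placeholder perceptual weights

magnitudes = [1.0] * 10
index = nearest_codevector(magnitudes, codebook, weights)  # fits in 8 bits
```

Only `index` is sent; the decoder looks up the same entry in its own copy of the codebook. Changing the weights changes which errors matter most, so the same input can map to a different code vector under different weightings.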