Complete Communications Engineering

ITU voice codec
VOCAL’s optimized ITU speech coder source code provides performance and portability

VOCAL’s ITU speech coders use proprietary techniques and are optimized for all modern processors, including DSPs and conventional processors from AMD, Intel, TI, ADI and other vendors. Benchmarks have shown that VOCAL’s highly optimized C with limited assembly code compares well against other vendors’ implementations that use significantly more assembly language. While the difference in performance is usually within a couple of MIPS, the portability and maintainability of our code benefits our customers by lowering initial costs, easing integration, reducing maintenance costs with compiler upgrades, and increasing the availability of optimized code for different modern processors. Contact us to discuss your application requirements.


The following ITU speech coders are available in optimized C/assembly code for embedded processors, DSPs and general purpose processors:

ITU Voice Codecs

Many speech coders have been standardized under the auspices of the International Telecommunication Union (ITU). These have typically used voice compression algorithms first designed for additional bandwidth reduction in the telephone network. As such, these algorithms often took G.711 μ-law or A-law TDM signals as input and carried the compressed speech on multiplexed digital channels. Many of these early speech coders did not use silence suppression techniques, as the benefit could only be exploited on statistically multiplexed digital carrier systems. At the same time, a number of proprietary speech coders were developed for satellite links, which relied heavily on this statistical multiplexing.

With the advent of Voice over IP, many of these legacy vocoders were retrofitted with silence detection and comfort noise generation. G.723.1 and G.729 treat these functions as add-ons, defined in Annex A and Annex B respectively. G.711 was retrofitted with similar functionality through G.711 Appendix II. Packet loss concealment was also missing from many of these legacy speech coders. The G.711 packet loss concealment (Appendix I) and silence suppression techniques are also commonly used with other speech coders that lack such functionality, such as the wideband G.722 and the narrowband G.726 and G.728 vocoders.
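The idea behind silence detection can be illustrated with a minimal energy-threshold sketch. This is not the algorithm specified in G.729 Annex B, G.723.1 Annex A or G.711 Appendix II, which use more elaborate decision logic. The function names, the -50 dBFS threshold and the two-frame hangover are assumptions chosen only for illustration.

```python
import math

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit PCM sample


def frame_energy_dbfs(samples):
    """Mean-square energy of one PCM frame, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (FULL_SCALE ** 2))


def classify_frames(frames, threshold_dbfs=-50.0, hangover_frames=2):
    """Label each frame 'speech' or 'silence'.

    The hangover keeps a few frames after the last loud frame marked as
    speech, so that quiet word endings are not clipped. Both parameters
    are hypothetical tuning values, not figures from any standard.
    """
    decisions = []
    hangover = 0
    for frame in frames:
        if frame_energy_dbfs(frame) > threshold_dbfs:
            hangover = hangover_frames
            decisions.append("speech")
        elif hangover > 0:
            hangover -= 1
            decisions.append("speech")
        else:
            decisions.append("silence")
    return decisions


# Example: 10 msec frames at 8 kHz (80 samples each).
quiet = [0] * 80
loud = [8000] * 80
labels = classify_frames([quiet, loud, quiet, quiet, quiet])
```

In a real system, frames labeled as silence would not be transmitted as coded speech. The sender would instead send occasional noise descriptors so the receiver can generate comfort noise.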

Certain speech coders, like G.723.1, were designed for specific applications. Early video conferencing systems used G.723.1, which has a frame duration similar to that of 30 frame-per-second video (30 msec versus 33.333 msec). Since video and audio needed to be synchronized, relatively compatible frame durations were desirable. However, for Voice over IP applications, large frame sizes tend to be strongly undesirable, as they contribute very quickly to round-trip conversation latency. One-way delay should be limited to 150 msec (100 msec preferred) for acceptable Voice over IP deployments and applications.
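The effect of frame size on the delay budget can be sketched with a short calculation. The frame and look-ahead durations below are the published values for G.729 and G.723.1. The dictionary and function names are illustrative, and the calculation covers only the encoder's algorithmic delay, not network transit, jitter buffering or processing time.

```python
# (frame duration, look-ahead) in msec for each codec.
CODEC_FRAMING_MS = {
    "G.729": (10.0, 5.0),
    "G.723.1": (30.0, 7.5),
}

ONE_WAY_BUDGET_MS = 150.0  # one-way limit quoted in the text


def algorithmic_delay_ms(codec, frames_per_packet=1):
    """Delay before a packet can leave the encoder: buffered frames plus look-ahead."""
    frame_ms, lookahead_ms = CODEC_FRAMING_MS[codec]
    return frames_per_packet * frame_ms + lookahead_ms


def remaining_budget_ms(codec, frames_per_packet=1):
    """What is left of the one-way budget for the network, jitter buffer and decoder."""
    return ONE_WAY_BUDGET_MS - algorithmic_delay_ms(codec, frames_per_packet)


# Frame period of 30 frame-per-second video, for comparison with G.723.1.
video_frame_period_ms = 1000.0 / 30.0

for name in CODEC_FRAMING_MS:
    print(name, algorithmic_delay_ms(name), remaining_budget_ms(name))
```

Under these assumptions a single G.723.1 frame consumes 37.5 msec of the 150 msec budget before the packet reaches the network, compared with 15 msec for G.729.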

The greater availability of network bandwidth found in high-speed cable modem networks and fiber deployments, such as Verizon FiOS and AT&T U-verse (SM), means that less and less speech compression needs to be performed. In fact, these systems benefit from the simplicity of ordinary G.711 μ-law or A-law. This is how the clarity of hearing “a pin drop” was claimed as a selling point by Sprint for their long distance services. Unlike other carriers, they had enough excess capacity on their fiber networks to carry voice in the native telephone network format. VoIP systems not only meet this clarity standard, but often exceed it by offering an internet wideband audio codec capability using G.722 or Speex for high-definition (HD) audio.
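A rough calculation shows what uncompressed G.711 costs on an IP network compared with a compressed coder. The sketch assumes IPv4, UDP and RTP headers with no header compression and no link-layer framing, plus a 20 msec packet interval. The function name and those parameter choices are illustrative assumptions.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no link-layer framing


def wire_rate_kbps(codec_kbps, packet_ms):
    """IP-level bit rate for one voice stream, in kbit/s.

    codec_kbps is the coder's payload rate; packet_ms is how much speech
    each packet carries.
    """
    payload_bytes = codec_kbps * packet_ms / 8.0  # kbit/s * msec = bits
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8.0 / packet_ms


g711_rate = wire_rate_kbps(64.0, 20.0)  # G.711 at 64 kbit/s
g729_rate = wire_rate_kbps(8.0, 20.0)   # G.729 at 8 kbit/s
print(g711_rate, g729_rate)
```

With these assumptions G.711 needs about 80 kbit/s per direction and G.729 about 24 kbit/s. Packet headers narrow the 8:1 payload ratio to roughly 3:1 on the wire, and on a multi-megabit access link either figure is small.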