Complete Communications Engineering

Immersive Voice and Audio Services (IVAS) Codec

VOCAL offers the IVAS codec software as a standalone algorithm, as part of a modular software library suite, or bundled with a VoIP stack. VOCAL’s software is written in ANSI C and does not use any GPL-licensed software components.

  • Both stereo and immersive audio coding with backward compatibility for mono operation
  • Mobile, VoIP, and Voice Conferencing for LTE
  • Optimized for DSP, RISC, and CISC processors
  • 3GPP TS 26.250-258 compliant
Diagram of IVAS Codec Procedure

IVAS Overview

Immersive Voice and Audio Services (IVAS), standardized by 3GPP, is an audio codec supporting mono, stereo, and immersive audio formats. This codec provides support for spatial audio coding, decoding, and rendering and was specifically designed with immersive mobile communication services over 4G and 5G cellular networks in mind.

The IVAS encoder receives mono, stereo, or immersive audio input. The supported immersive formats are Scene-Based Audio (SBA) (1st–3rd order ambisonics), Metadata-Assisted Spatial Audio (MASA), Object-Based Audio (1–4 objects), Multichannel-Based Audio (5.1, 5.1+2, 5.1+4, 7.1, 7.1+4), and certain combinations (objects + SBA/MASA). The IVAS decoder is coupled with a jitter buffer management system and an integrated rendering algorithm. As an alternative to the integrated renderer, a standalone IVAS renderer is also offered.
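As a rough illustration of what these input formats mean in terms of channel count, the sketch below computes the number of ambisonic channels for an SBA input of a given order, using the standard (order + 1)^2 relation, and looks up the loudspeaker count for the multichannel layouts named above (5.1+4, for example, is six bed channels plus four height channels). The function and type names are hypothetical and are not part of the IVAS reference code or of VOCAL's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Channels in a full-sphere ambisonic signal of the given order:
 * (order + 1)^2. The text lists orders 1 to 3 for SBA input. */
int sba_channel_count(int order)
{
    assert(order >= 1 && order <= 3);
    return (order + 1) * (order + 1);
}

/* Hypothetical table of the loudspeaker layouts named in the text. */
struct mc_layout {
    const char *name;
    int channels;
};

static const struct mc_layout MC_LAYOUTS[] = {
    { "5.1",   6 },   /* 5 main + 1 LFE */
    { "5.1+2", 8 },   /* plus 2 height channels */
    { "5.1+4", 10 },  /* plus 4 height channels */
    { "7.1",   8 },   /* 7 main + 1 LFE */
    { "7.1+4", 12 }   /* plus 4 height channels */
};

/* Returns the channel count for a layout name, or -1 if it is not listed. */
int mc_channel_count(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof MC_LAYOUTS / sizeof MC_LAYOUTS[0]; ++i) {
        if (strcmp(MC_LAYOUTS[i].name, name) == 0)
            return MC_LAYOUTS[i].channels;
    }
    return -1;
}
```

With these helpers, `sba_channel_count(3)` gives 16 and `mc_channel_count("7.1+4")` gives 12, which is the number of input channels an application would need to buffer per sample for those formats.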

The IVAS renderer provides binaural reproduction on headphones as well as reproduction on customizable loudspeaker configurations. The renderer supports head-tracking, scene rotation, orientation tracking, customizable room acoustics parameters, object directivity, and distance attenuation. The head-related transfer function (HRTF) and binaural room impulse response (BRIR) sets used by the renderer can also be customized for individual listeners and environments.

IVAS is fully backward compatible with the previous 3GPP standards EVS and AMR-WB, which support mono audio at sampling rates of 8, 16, 32, and 48 kHz and bandwidths up to 20 kHz. Stereo and immersive audio formats require a sampling rate of at least 16 kHz and a bandwidth of at least 8 kHz.
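The sampling-rate constraints above can be expressed as a small configuration check. The following is a minimal sketch, assuming a hypothetical `audio_format` enum and helper names that do not come from the IVAS reference code: 8 kHz is accepted for mono only, while 16, 32, and 48 kHz are accepted for every format. The second helper gives the Nyquist limit, which is why a 16 kHz sampling rate corresponds to the 8 kHz minimum bandwidth quoted for stereo and immersive formats.

```c
#include <assert.h>

/* Hypothetical format categories, used only for this illustration. */
enum audio_format { FMT_MONO, FMT_STEREO, FMT_IMMERSIVE };

/* Returns 1 if the sampling rate is listed for the format, else 0.
 * Per the text: mono supports 8, 16, 32, and 48 kHz; stereo and
 * immersive formats require at least 16 kHz. */
int sample_rate_supported(enum audio_format fmt, long fs_hz)
{
    switch (fs_hz) {
    case 8000L:
        return fmt == FMT_MONO;
    case 16000L:
    case 32000L:
    case 48000L:
        return 1;
    default:
        return 0;
    }
}

/* Highest audio frequency representable at a given sampling rate. */
long nyquist_bandwidth_hz(long fs_hz)
{
    assert(fs_hz > 0);
    return fs_hz / 2;
}
```

For example, `sample_rate_supported(FMT_STEREO, 8000L)` returns 0, while the same rate is accepted for `FMT_MONO`.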

Mono Audio

| Bandwidth | Bitrates (kbps) |
| --- | --- |
| Narrowband (NB) (20 – 4000 Hz) | 5.9, 7.2, 8, 9.6, 13.2, 16.4, 24.4 |
| Wideband (WB) (20 – 8000 Hz) | 5.9, 7.2, 8, 9.6, 13.2, 13.2 channel-aware, 16.4, 24.4, 32, 48, 64, 96, 128 |
| AMR-WB IO (20 – 8000 Hz) | 6.6, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 |
| Superwideband (SWB) (20 – 16000 Hz) | 9.6, 13.2, 13.2 channel-aware, 16.4, 24.4, 32, 48, 64, 96, 128 |
| Fullband (FB) (20 – 20000 Hz) | 16.4, 24.4, 32, 48, 64, 96, 128 |

Multichannel Audio

| Audio Format | Bitrates (kbps) |
| --- | --- |
| Stereo/binaural | 13.2, 16.4, 24.4, 32, 48, 64, 96, 128, 160, 192, 256 |
| SBA | 13.2, 16.4, 24.4, 32, 48, 64, 96, 128, 160, 192, 256, 384, 512 |
| MASA | 13.2, 16.4, 24.4, 32, 48, 64, 96, 128, 160, 192, 256, 384, 512 |
| Object-Based Audio | 13.2, 16.4, 24.4, 32, 48, 64, 96, 128, 160, 192, 256, 384, 512 |
| Multichannel-Based Audio | 13.2, 16.4, 24.4, 32, 48, 64, 96, 128, 160, 192, 256, 384, 512 |
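To relate the bitrates listed above to packet sizes, the sketch below converts a constant bitrate into bits and bytes per frame, assuming the 20 ms frame length used by EVS and AMR-WB (50 frames per second). It also checks whether a bitrate appears in the stereo or immersive lists above. All names are hypothetical. The 5.9 kbps mode is a variable-rate average, so a fixed per-frame size does not apply to it, and RTP payload headers are not counted.

```c
/* Bitrates listed for the immersive formats, in bits per second.
 * The stereo/binaural list is the first N_STEREO entries. */
static const long IMMERSIVE_BPS[] = {
    13200L, 16400L, 24400L, 32000L, 48000L, 64000L, 96000L,
    128000L, 160000L, 192000L, 256000L, 384000L, 512000L
};
#define N_IMMERSIVE 13
#define N_STEREO    11

/* Bits in one frame at a constant bitrate, assuming 20 ms frames
 * (50 frames per second). */
long frame_bits(long bitrate_bps)
{
    return bitrate_bps / 50L;
}

/* Frame size rounded up to whole bytes. */
long frame_bytes(long bitrate_bps)
{
    return (frame_bits(bitrate_bps) + 7L) / 8L;
}

/* Returns 1 if the bitrate appears in the list for stereo
 * (is_stereo != 0) or for the other immersive formats, else 0. */
int bitrate_listed(int is_stereo, long bitrate_bps)
{
    int n = is_stereo ? N_STEREO : N_IMMERSIVE;
    int i;
    for (i = 0; i < n; ++i) {
        if (IMMERSIVE_BPS[i] == bitrate_bps)
            return 1;
    }
    return 0;
}
```

At 13.2 kbps this gives 264 bits, or 33 bytes, per frame. At the AMR-WB IO rate of 12.65 kbps it gives 253 bits, which rounds up to 32 bytes.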

IVAS Features

  • Transcoding support
  • Compliant with 3GPP TS 26.258 source code
  • 16, 32, and 48 kHz sampling rate support for stereo and immersive audio formats with additional 8 kHz support for mono audio
  • WB, SWB, and FB bandwidth support for stereo and immersive audio formats with additional NB support for mono audio
  • Detailed Algorithmic Description (3GPP TS 26.253)
  • Rendering (3GPP TS 26.254)
  • Error concealment of lost packets (3GPP TS 26.255)
  • Jitter Buffer Management (3GPP TS 26.256)

IVAS Use Cases

VOCAL’s solution is available for the above platforms. Ask about our Rust programming language implementations. Please contact us for specific supported platforms.

