The H.264/AVC standard specifies a number of profiles. Each profile uses a subset of the coding tools defined by the H.264 standard. The tools are algorithms or processes used for video coding and decoding. An encoder will compress video based on a specific profile, and this will define which tools the decoder must use in order to decompress the video. A decoder may support some profiles, while it does not support others. Each profile is intended to be useful to a class of applications.
H.264/AVC Profiles
The baseline profile is the simplest of the original profiles. It may be useful for real-time applications such as video conferencing, where the encoder and decoder must run quickly. The main profile is widely used. It offers a good compromise between compression performance and computational complexity. It may be suitable for basic SD TV broadcasting. The constrained baseline profile is the common subset of the baseline, main and high profiles, and is popular for low-complexity, low-delay applications such as mobile video. The high profile has additional tools to improve the compression ratio for HD TV.
The baseline profile supports I and P slices and a basic 4×4 integer transform, and it uses context-adaptive variable-length coding (CAVLC) for entropy coding. It is aimed at real-time applications such as video conferencing and at platforms with low processing power. It has low complexity but also the lowest coding efficiency of the profiles. It also supports three tools for improved transport efficiency: flexible macroblock ordering (FMO), arbitrary slice ordering (ASO) and redundant slices. None of these three tools is supported by the constrained baseline profile.
The extended profile is a superset of the baseline profile, which it extends with several error resilience techniques. It uses B slices and supports interlaced video coding. This profile is targeted at streaming video. It is characterized by higher compression but also higher complexity. Unlike the other profiles, it supports special slices designed for streaming: the SI and SP slices. These allow the server to switch between streams encoded at different bit-rates when needed.
The main profile is a superset of the constrained baseline profile. The main profile uses I, P and B slices. It supports CABAC entropy encoding, but also CAVLC. It supports B slices with prediction modes such as weighted prediction. It can work with either progressive or interlaced video. It lacks some of the error resilience techniques supported by the baseline and extended profiles. This profile is mainly used for digital non-HD television broadcast.
The high profile is a superset of the main profile. The high profile offers a higher compression ratio than the others, at a slightly increased implementation complexity and computation cost. It adds several tools: an 8×8 transform and 8×8 intra prediction, quantizer scaling matrices that support frequency-dependent quantizer weightings, separate quantizer parameters for Cr and Cb, and monochrome video. It is mainly used for high definition applications. For example, it is used to store HD video on Blu-ray discs and for HDTV broadcasts.
The High 10 Profile (Hi10P) is built on top of the high profile, adding support for up to 10 bits per sample of decoded picture precision. This is beyond today’s mainstream consumer product capabilities.
The High 4:2:2 Profile (Hi422P) adds support for the 4:2:2 chroma subsampling format while using up to 10 bits per sample of decoded picture precision. It’s primarily targeting professional applications that use interlaced video. This profile builds on top of the high 10 profile.
The High 4:4:4 Predictive Profile (Hi444PP) is built on top of the high 4:2:2 profile. It supports up to 4:4:4 chroma sampling, up to 14 bits per sample, efficient lossless region coding, and the coding of each picture as three separate color planes.
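The subset and superset relationships among the first five profiles can be made concrete by modelling each profile as a set of tools. The following is a minimal sketch, not a complete model of the standard: it lists only some of the tools named in the text above, and the tool labels, the `PROFILES` table and the `profiles_supporting` helper are hypothetical names chosen for illustration.

```python
# Illustrative tool sets per profile. Only a subset of the tools named in
# the text is modelled; the string labels are made up for this sketch.
CONSTRAINED_BASELINE = frozenset({"I", "P", "CAVLC"})
BASELINE = CONSTRAINED_BASELINE | {"FMO", "ASO", "redundant_slices"}
EXTENDED = BASELINE | {"B", "interlaced", "SI", "SP"}
MAIN = CONSTRAINED_BASELINE | {"B", "CABAC", "interlaced"}
HIGH = MAIN | {"transform_8x8", "scaling_matrices", "monochrome"}

PROFILES = {
    "constrained_baseline": CONSTRAINED_BASELINE,
    "baseline": BASELINE,
    "extended": EXTENDED,
    "main": MAIN,
    "high": HIGH,
}


def profiles_supporting(tools):
    """Return the names of the profiles whose tool set covers `tools`."""
    needed = frozenset(tools)
    return [name for name, available in PROFILES.items() if needed <= available]


# The superset claims from the text, expressed as set comparisons.
assert CONSTRAINED_BASELINE <= BASELINE <= EXTENDED
assert CONSTRAINED_BASELINE <= MAIN <= HIGH
# Main lacks the baseline error resilience tools, so it is not a superset.
assert not BASELINE <= MAIN

print(profiles_supporting({"I", "P", "CAVLC"}))  # every profile listed
print(profiles_supporting({"B", "CABAC"}))       # main and high only
print(profiles_supporting({"FMO"}))              # baseline and extended only
```

A stream that uses only I and P slices with CAVLC falls inside every set, which is why the constrained baseline profile is a safe target when the decoder's capabilities are unknown.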
H.264/AVC Levels
Levels specify the size of the video a decoder must be able to handle. Each level sets, among other limits, a maximum bit-rate for the video and a maximum number of macroblocks per second. Level numbers start at 1 and increase in intermediate steps (e.g., 1.1, 1.2, 1.3, 2, 2.1); the original standard stops at 5.1, and later editions add levels up to 6.2. A decoder that conforms to a particular level must also handle all levels below it.
Each level specifies a value for the following constraints:
- Maximum macroblock processing rate (MaxMBPS) that a decoder must handle per second.
- Maximum frame size (MaxFS), defined as the maximum number of macroblocks in a decoded frame.
- Maximum decoded picture buffer size (MaxDPB), defined as the memory required to store decoded pictures that are kept for reference or are waiting to be output.
- Maximum video bitrate (MaxBR).
- Maximum coded picture buffer size (MaxCPB), defined as the memory required to store coded data prior to decoding.
- Vertical motion vector range (MaxVmvR).
- Minimum compression ratio (MinCR), defined as the minimum ratio between uncompressed video frames and compressed or coded data size.
- Maximum motion vectors per two consecutive macroblocks (MaxMvsPer2Mb), specified only for level 3 and above.
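To see how these limits interact, the sketch below picks the lowest level whose MaxFS, MaxMBPS and MaxBR limits accommodate a given picture size, frame rate and bit-rate. It rests on several assumptions: only levels 3 to 5.1 are listed, the limit values are transcribed from the standard's level table and should be checked against it before use, the MaxBR figures are the ones that apply to the baseline, main and extended profiles (in units of 1000 bits/s), and the remaining constraints in the list above are ignored. The names `LevelLimits`, `LEVELS` and `lowest_level` are hypothetical.

```python
import math
from typing import NamedTuple, Optional


class LevelLimits(NamedTuple):
    max_mbps: int     # MaxMBPS: macroblocks per second
    max_fs: int       # MaxFS: macroblocks per frame
    max_br_kbps: int  # MaxBR: units of 1000 bits/s (baseline/main/extended)


# Partial level table in ascending order (levels below 3 omitted).
# Values are transcribed from the standard; verify before relying on them.
LEVELS = {
    "3":   LevelLimits(40_500, 1_620, 10_000),
    "3.1": LevelLimits(108_000, 3_600, 14_000),
    "3.2": LevelLimits(216_000, 5_120, 20_000),
    "4":   LevelLimits(245_760, 8_192, 20_000),
    "4.1": LevelLimits(245_760, 8_192, 50_000),
    "4.2": LevelLimits(522_240, 8_704, 50_000),
    "5":   LevelLimits(589_824, 22_080, 135_000),
    "5.1": LevelLimits(983_040, 36_864, 240_000),
}


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of 16x16 macroblocks needed to cover a width x height picture."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def lowest_level(width: int, height: int, fps: float,
                 bitrate_kbps: float) -> Optional[str]:
    """Lowest listed level that fits the stream, or None if none does."""
    frame_mbs = macroblocks_per_frame(width, height)
    mb_rate = frame_mbs * fps
    for name, limits in LEVELS.items():
        if (frame_mbs <= limits.max_fs
                and mb_rate <= limits.max_mbps
                and bitrate_kbps <= limits.max_br_kbps):
            return name
    return None


print(macroblocks_per_frame(1920, 1080))       # 120 * 68 macroblocks
print(lowest_level(1920, 1080, 30, 15_000))
print(lowest_level(1920, 1080, 60, 15_000))
```

A 1920×1080 frame needs 120 × 68 = 8160 macroblocks, and 30 such frames per second need 244,800 macroblocks per second. Both figures sit just under the level 4 limits in the table, whereas doubling the frame rate exceeds MaxMBPS for levels 4 and 4.1 and pushes the stream to level 4.2.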