Wavelet compression is an efficient technique for image and video compression. It is based on the wavelet transform, which provides a multi-scale representation of images and video in the space-frequency domain. There are several approaches to video encoding based on wavelet compression.
Intraframe wavelet compression is implemented in the Motion JPEG 2000 (MJ2) standard, which builds on the JPEG 2000 standard developed for still images. In Motion JPEG 2000, each frame is an independent entity encoded by either the lossy or the lossless variant of JPEG 2000. Intraframe JPEG 2000 coding of this kind is used in digital cinema.
For interframe wavelet compression, the video sequence is represented as groups of pictures (GOPs), each made up of full frames and motion-compensated residual frames, both encoded by wavelet compression. The residual frames are much less correlated than the original frames, so the wavelet transform represents them less efficiently. In addition, because the wavelet transform is usually critically sampled, it is shift variant, and this shift variance limits the effectiveness of motion compensation.
Full 3-D wavelet video coding, which applies the transform in both the spatial and temporal domains, is also possible. With this approach, scalability in rate, quality and resolution is inherent. However, because of motion, spatially co-located pixels in different frames are usually misaligned, which reduces compression efficiency.
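The effect of misalignment can be illustrated with a minimal sketch. It assumes 1-D "frames" and a one-level temporal Haar filter for brevity; the function names and sample values are hypothetical and not taken from any particular codec.

```python
def temporal_haar(frame_a, frame_b):
    """One-level Haar transform along time for two equal-length 1-D frames."""
    low = [(a + b) / 2 for a, b in zip(frame_a, frame_b)]
    high = [(b - a) / 2 for a, b in zip(frame_a, frame_b)]
    return low, high


def energy(band):
    """Sum of squared coefficients; a rough proxy for coding cost."""
    return sum(c * c for c in band)


frame0 = [0, 0, 10, 20, 10, 0, 0, 0]
static = list(frame0)          # next frame without motion
moved = [0] + frame0[:-1]      # same content shifted right by one pixel
realigned = moved[1:] + [0]    # shift undone before filtering

high_static = temporal_haar(frame0, static)[1]
high_moved = temporal_haar(frame0, moved)[1]
high_realigned = temporal_haar(frame0, realigned)[1]

print(energy(high_static), energy(high_moved), energy(high_realigned))
```

Without motion the temporal high-pass band is empty. A one-pixel shift between co-located pixels puts energy into that band, and the energy vanishes again once the second frame is realigned before filtering.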
To take full advantage of the 3-D wavelet transform, wavelet filtering has to be performed along motion trajectories. One way to do this is motion threading, which exploits the long-term correlation across frames along each motion trajectory. Backward motion estimation is performed on each pair of frames at the macroblock level. Pixels on the same motion trajectory are then linked into a thread by following the motion vectors of the macroblocks they belong to, and the 3-D wavelet transform is applied to the threads.
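The linking step can be sketched as follows. This is a simplified illustration rather than a reference implementation: it assumes 1-D frames whose length is a multiple of the block size, integer-pixel block matching between consecutive frames, and threads traced back from the last frame, so merging threads (many-to-one mappings) and non-referred pixels are not treated specially. All names are hypothetical.

```python
def estimate_backward_mv(cur, ref, block, search):
    """Per-block displacement d into `ref` that minimises the SAD against `cur`."""
    mvs = []
    for start in range(0, len(cur), block):
        best_d, best_sad = 0, None
        # Try small displacements first so that ties favour less motion.
        for d in sorted(range(-search, search + 1), key=abs):
            if start + d < 0 or start + d + block > len(ref):
                continue  # candidate block would fall outside the frame
            sad = sum(abs(cur[start + i] - ref[start + d + i]) for i in range(block))
            if best_sad is None or sad < best_sad:
                best_d, best_sad = d, sad
        mvs.append(best_d)
    return mvs


def build_threads(frames, block=2, search=2):
    """Return threads; thread[t] is the pixel index the thread visits in frame t."""
    mvs = [None] + [estimate_backward_mv(frames[t], frames[t - 1], block, search)
                    for t in range(1, len(frames))]
    threads = []
    for x_last in range(len(frames[-1])):
        positions = [x_last]
        for t in range(len(frames) - 1, 0, -1):
            x = positions[-1]
            positions.append(x + mvs[t][x // block])  # follow the block's vector
        threads.append(positions[::-1])
    return threads


def temporal_highpass_energy(frames, threads):
    """Energy of one-level temporal Haar high-pass coefficients along the threads."""
    total = 0.0
    for thread in threads:
        values = [frames[t][x] for t, x in enumerate(thread)]
        for k in range(0, len(values) - 1, 2):
            total += ((values[k + 1] - values[k]) / 2) ** 2
    return total


# A small object moving right by one pixel per frame.
frames = [
    [0, 0, 5, 9, 0, 0, 0, 0],
    [0, 0, 0, 5, 9, 0, 0, 0],
    [0, 0, 0, 0, 5, 9, 0, 0],
    [0, 0, 0, 0, 0, 5, 9, 0],
]
threads = build_threads(frames)
colocated = [[x] * len(frames) for x in range(len(frames[0]))]

threaded_energy = temporal_highpass_energy(frames, threads)
colocated_energy = temporal_highpass_energy(frames, colocated)
print(threads[5], threaded_energy, colocated_energy)
```

In this toy sequence the thread ending at pixel 5 of the last frame follows the object back through pixels 4, 3 and 2, so filtering along the threads leaves no temporal high-pass energy, whereas filtering co-located pixels does.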
To increase coding efficiency, motion threading has to support fractional-pixel (e.g. ½ or ¼) motion compensation. An efficient threading mechanism has to use a lifting-based wavelet implementation. Real-world video usually contains different types of motion, such as panning and zooming, as well as scene changes. These result in many-to-one pixel mappings and non-referred pixels, which the threading scheme has to handle.
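One property that makes lifting attractive here is that each lifting step is undone by subtracting exactly what was added, so the transform stays invertible even when the prediction uses interpolated sub-pixel samples. The sketch below illustrates this under several assumptions: 1-D integer frames, a single global motion vector `mv2` given in half-pixel units, rounded linear interpolation, and a simple Haar-like predict/update pair. The function names are hypothetical.

```python
def sample_half_pel(frame, pos2):
    """Integer sample of `frame` at position pos2 / 2 (linear interpolation, clamped)."""
    last = len(frame) - 1
    i0 = min(max(pos2 // 2, 0), last)
    i1 = min(max((pos2 + 1) // 2, 0), last)
    return (frame[i0] + frame[i1] + 1) // 2


def mc_lift_forward(a, b, mv2):
    """Motion-compensated temporal lifting of frames a and b."""
    # Predict: high-pass = b minus a sampled along the (half-pel) motion vector.
    high = [b[x] - sample_half_pel(a, 2 * x + mv2) for x in range(len(b))]
    # Update: low-pass = a plus half of the high-pass sampled along the reverse vector.
    low = [a[x] + sample_half_pel(high, 2 * x - mv2) // 2 for x in range(len(a))]
    return low, high


def mc_lift_inverse(low, high, mv2):
    """Undo the lifting steps in reverse order with the signs flipped."""
    a = [low[x] - sample_half_pel(high, 2 * x - mv2) // 2 for x in range(len(low))]
    b = [high[x] + sample_half_pel(a, 2 * x + mv2) for x in range(len(high))]
    return a, b


frame_a = [0, 0, 4, 8, 4, 0, 0, 0]
frame_b = [0, 1, 3, 6, 7, 3, 1, 0]
low, high = mc_lift_forward(frame_a, frame_b, mv2=-1)  # motion of half a pixel
rec_a, rec_b = mc_lift_inverse(low, high, mv2=-1)
print(rec_a == frame_a and rec_b == frame_b)
```

The round trip is exact even though the interpolation rounds, because the inverse recomputes the same rounded quantities from data it has already reconstructed.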
Another approach to efficient wavelet video compression is overcomplete motion-compensated wavelet coding. The overcomplete wavelet transform is similar to the ordinary wavelet transform except that there is no downsampling, which makes the representation redundant but shift invariant. Complexity becomes an issue with this approach: the inverse transform is computationally more expensive than the forward transform, since the reconstruction filters are typically longer than the forward filters. In return, the overcomplete wavelet transform allows for improved motion estimation and compensation in the wavelet domain. In fact, only the motion references need to be overcomplete; texture coding can still be performed in the complete (critically sampled) wavelet domain.
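The difference that dropping the downsampling makes can be seen in a small sketch. It assumes a one-level Haar high-pass filter on a 1-D signal with circular boundary handling; the function names are hypothetical.

```python
def haar_detail_decimated(x):
    """Critically sampled Haar high-pass band: one coefficient per pixel pair."""
    return [(x[2 * i + 1] - x[2 * i]) / 2 for i in range(len(x) // 2)]


def haar_detail_undecimated(x):
    """Overcomplete Haar high-pass band: one coefficient per pixel (circular)."""
    n = len(x)
    return [(x[(i + 1) % n] - x[i]) / 2 for i in range(n)]


def shift(x, k):
    """Circular right shift by k samples."""
    return x[-k:] + x[:-k]


signal = [0, 0, 0, 8, 8, 8, 0, 0]
moved = shift(signal, 1)

dec_before = haar_detail_decimated(signal)
dec_after = haar_detail_decimated(moved)
und_before = haar_detail_undecimated(signal)
und_after = haar_detail_undecimated(moved)

print(dec_before, dec_after)                 # coefficients change under the shift
print(und_after == shift(und_before, 1))     # coefficients simply move with the signal
```

In the critically sampled band a one-pixel shift changes which edge is detected and with which sign, so a block of coefficients in one frame has no matching block in the next. In the overcomplete band the coefficients move with the signal, which is what makes motion search in the wavelet domain workable, at the cost of twice as many coefficients per level.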