Standards Related to Coding of Mobile 3DTV Content

Standardization of digital audio and video is carried out by the Moving Picture Experts Group (MPEG), a working group of ISO/IEC, and the corresponding standards are issued with ISO/IEC designations.

MPEG-C, Part 3
The purpose of ISO/IEC 23002-3 Auxiliary Video Data Representations (MPEG-C, Part 3) is to support applications in which additional data must be efficiently attached to the individual pixels of a regular video. ISO/IEC 23002-3 describes how this can be achieved in a generic way using existing (and even future) video codecs available within MPEG. The auxiliary video data consists of an array of N-bit values associated with the individual pixels of a regular video stream. These data can be compressed like conventional luminance signals with any such codec. The format allows optional subsampling of the auxiliary data in both the spatial and the temporal domain, which, depending on the application and its requirements, can allow very low bitrates for the auxiliary data.
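The optional spatial and temporal subsampling can be illustrated with a minimal sketch. It assumes the auxiliary data is held as a list of 2D frames of integer samples and uses plain decimation; the function names, the step parameters and the sample values are hypothetical and are not taken from ISO/IEC 23002-3.

```python
def subsample_aux(frames, spatial_step=2, temporal_step=2):
    """Keep every temporal_step-th frame and every spatial_step-th sample
    in each row and column. Plain decimation, no anti-alias filtering."""
    if spatial_step < 1 or temporal_step < 1:
        raise ValueError("steps must be >= 1")
    return [
        [row[::spatial_step] for row in frame[::spatial_step]]
        for frame in frames[::temporal_step]
    ]


def sample_count(frames):
    """Total number of auxiliary samples in a list of 2D frames."""
    return sum(len(row) for frame in frames for row in frame)


# Four 4x4 frames of made-up 8-bit auxiliary values (N = 8).
frames = [
    [[(t * 16 + y * 4 + x) % 256 for x in range(4)] for y in range(4)]
    for t in range(4)
]
reduced = subsample_aux(frames, spatial_step=2, temporal_step=2)
print(sample_count(frames), sample_count(reduced))  # prints: 64 8
```

With a step of 2 in each spatial direction and in time, only one eighth of the auxiliary samples remain before any compression is applied.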

The specification is very flexible in that it defines a new 8-bit code word, aux_video_type, that specifies the type of the associated data; currently, for example, a value of 0x10 signals a depth map and a value of 0x11 signals a parallax map. New values for additional data representations can easily be added to meet future demands. The specification is directly applicable to 3D video because it allows such video to be specified in the format of single view + associated depth, where the single-channel video is augmented by per-pixel depth attached as auxiliary data. This representation lends itself to efficient compression. Rendering of virtual views (at least one in the case of stereo) is required at the receiver side. The specification has been a standard since 2007.
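A receiver could interpret aux_video_type roughly as in the sketch below. Only the two values named above (0x10 and 0x11) come from the text; the table name, the function name and the fallback string are hypothetical, and other values defined by the standard are not listed here.

```python
# Values taken from the text above; the mapping itself is an illustrative
# assumption about how a receiver might store them.
AUX_VIDEO_TYPES = {
    0x10: "depth map",
    0x11: "parallax map",
}


def describe_aux_video_type(code):
    """Return a readable name for an 8-bit aux_video_type code word."""
    if not 0 <= code <= 0xFF:
        raise ValueError("aux_video_type is an 8-bit code word")
    return AUX_VIDEO_TYPES.get(code, "not listed in this sketch")


print(describe_aux_video_type(0x10))  # prints: depth map
```

Supporting a new data representation then amounts to adding one entry to the table.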

Multiview Video Coding
3D video (3DV) and free viewpoint video (FVV) are new types of visual media that expand the user's experience beyond what is offered by 2D video. 3DV offers a 3D depth impression of the observed scenery, while FVV allows interactive selection of viewpoint and viewing direction within a certain operating range. A common element of 3DV and FVV systems is the use of multiple views of the same scene that are transmitted to the user. Multiview Video Coding (MVC, ISO/IEC 14496-10:2008 Amendment 1) is an extension of the Advanced Video Coding (AVC) standard that provides efficient coding of such multiview video. The overall structure of MVC defines the following interfaces: the encoder receives N temporally synchronized video streams and generates one bitstream; the decoder receives the bitstream, decodes it and outputs the N video signals. The video representation format is based on N views. For stereo video, this means two separate views coded together. A promising extension is view subsampling, i.e. one full-resolution view + one subsampled view. The idea behind this approach is that the human visual system is able to perceive the stereo pair at the quality of the better channel. MVC has been a standard since 2008 (version 1).
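The N-in, one-bitstream, N-out interface can be mimicked with a toy sketch. This is not the MVC bitstream syntax: there is no compression and no inter-view prediction, and the record layout (time, view index, frame) and both function names are assumptions made purely for illustration.

```python
def encode_views(views):
    """Interleave N temporally synchronized views into one record list.

    views: list of N views, each a list of frames of equal length.
    Returns a flat list of (time, view_index, frame) records.
    """
    if not views:
        raise ValueError("at least one view is required")
    n_frames = len(views[0])
    if any(len(v) != n_frames for v in views):
        raise ValueError("views must be temporally synchronized")
    return [
        (t, v, views[v][t])
        for t in range(n_frames)
        for v in range(len(views))
    ]


def decode_views(stream, n_views):
    """Split the single record list back into N separate views."""
    views = [[] for _ in range(n_views)]
    for _t, v, frame in sorted(stream, key=lambda rec: (rec[0], rec[1])):
        views[v].append(frame)
    return views


# Stereo case (N = 2) with placeholder strings standing in for frames.
left = ["L0", "L1", "L2"]
right = ["R0", "R1", "R2"]
stream = encode_views([left, right])
print(decode_views(stream, 2) == [left, right])  # prints: True
```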

3D Video Coding
3D Video Coding (3DVC) is a standard under development that targets a variety of 3D displays. The displays in focus here, so-called multiview displays, present N views (e.g. N = 9) simultaneously to the user. For efficiency reasons only a smaller number K of views (K = 1, ..., 3) is to be transmitted, and depth data is to be provided for each of those K views. At the receiver side the N views to be displayed are generated from the K transmitted views with depth by depth-image-based rendering (DIBR). This application scenario imposes specific constraints such as narrow angle acquisition (<>K out of N views,) augmented with K depth sequences. This representation, as it relates to stereo video, generalizes the possibilities of MPEG-C, Part 3 and MVC: the two separate views can be coded together, or they can be reduced to single view + depth with the second view synthesized at the receiver. 3DVC is an ongoing MPEG activity, and a standard is expected in 2011.
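The receiver-side view generation can be sketched for a single image row and K = 1. The sketch rests on several assumptions that are not part of any MPEG specification: depth is an 8-bit value with 255 as the nearest point, disparity is linear in that value, the cameras differ only by a horizontal shift, and disoccluded pixels are left as holes (None) rather than filled. The function name, parameters and sign convention are hypothetical.

```python
def render_virtual_row(color_row, depth_row, max_disparity, view_offset):
    """Forward-warp one image row to a virtual viewpoint.

    color_row:     pixel values of the transmitted view
    depth_row:     8-bit depth per pixel, 255 = nearest (assumption)
    max_disparity: shift in pixels of the nearest depth at view_offset 1.0
    view_offset:   signed position of the virtual camera relative to the
                   transmitted view, in units of the baseline
    """
    width = len(color_row)
    out = [None] * width      # None marks a hole (disocclusion)
    out_depth = [-1] * width  # depth of the pixel currently stored
    for x in range(width):
        d = depth_row[x]
        shift = round(view_offset * max_disparity * d / 255)
        tx = x + shift
        # Nearer pixels overwrite farther ones at the same target position.
        if 0 <= tx < width and d > out_depth[tx]:
            out[tx] = color_row[x]
            out_depth[tx] = d
    return out


color = [10, 20, 30, 40, 50, 60]
depth = [0, 0, 255, 255, 0, 0]  # a near object over a far background

# Generate N = 3 display views from the single transmitted view.
for offset in (-1.0, 0.0, 1.0):
    print(offset, render_virtual_row(color, depth, 2, offset))
# prints:
# -1.0 [30, 40, None, None, 50, 60]
# 0.0 [10, 20, 30, 40, 50, 60]
# 1.0 [10, 20, None, None, 30, 40]
```

The holes in the shifted views are background regions hidden in the transmitted view. Transmitting more than one view with depth (K > 1) gives the renderer a second source from which to fill them.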

Source: Mobile3DTV