MPEG-2 Basic Training

The MPEG-2 standard is defined by ISO/IEC 13818 as "the generic coding of moving pictures and associated audio information." It combines lossy video compression and lossy audio data compression to meet bandwidth constraints. All MPEG compression systems are asymmetric: the encoder is more sophisticated than the decoder.

MPEG encoders are always algorithmic. Some are also adaptive, using a feedback path. MPEG decoders are not adaptive and perform a fixed function. This works well for applications like broadcasting, where the number of expensive complex encoders is few and the number of simple inexpensive decoders is huge.

The MPEG standards provide little information about encoder process and operation. Rather, they specifically define how a decoder interprets metadata in a bit stream. MPEG metadata tells the decoder what rate the video was encoded at, and it defines the audio coding, channels and other vital stream information.

A decoder that successfully deciphers MPEG streams is called compliant. The genius of MPEG is that it allows different encoder designs to evolve simultaneously. Generic low-cost and proprietary high-performance encoders and encoding schemes all work because they are all designed to talk to compliant decoders.

Before SDI
Asynchronous Serial Interface (ASI) is a serial interface signal where a start bit is sent before each byte, and a stop signal is sent after each byte. This type of start-stop communication without the use of synchronized fixed time intervals was patented in 1916 and was the key technology that made teletype machines possible. Today, an ASI signal is often the final product of MPEG video compression, ready for transmission to a transmitter, microwave or fiber. Unlike uncompressed SDI, an ASI signal can carry one or multiple compressed SD, HD or audio streams. ASI transmission speeds are variable and depend on the user's requirements.

There are two transmission formats used by the ASI interface, a 188-byte format and a 204-byte format. The 188-byte format is the more common. If Reed-Solomon error correction data is included, the packet can grow an extra 16 bytes to 204 bytes total.
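As a rough illustration of the two formats, test gear can guess which one it is receiving by checking where the packet sync bytes fall. The sketch below assumes the standard TS sync byte value of 0x47; the function names, the `min_packets` threshold and the synthetic test stream are hypothetical and exist only for this example.

```python
SYNC_BYTE = 0x47  # assumed: the sync byte that starts every TS packet


def detect_packet_size(data: bytes, candidates=(188, 204), min_packets=4):
    """Return the packet size whose spacing lines up with sync bytes, else None."""
    for size in candidates:
        if len(data) < size * min_packets:
            continue  # not enough data to judge this candidate
        if all(data[i * size] == SYNC_BYTE for i in range(min_packets)):
            return size
    return None


def make_stream(size: int, count: int) -> bytes:
    """Build a synthetic stream: a sync byte followed by zero padding, repeated."""
    packet = bytes([SYNC_BYTE]) + bytes(size - 1)
    return packet * count


print(detect_packet_size(make_stream(188, 5)))
print(detect_packet_size(make_stream(204, 5)))
```

A real analyzer would check many more packets, because a payload byte can equal 0x47 by chance.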

Making MPEG-2
An MPEG-2 stream can be either an Elementary Stream (ES), a Packetized Elementary Stream (PES) or a Transport Stream (TS). The ES and PES are files.

Starting with analog video and audio content, individual ESs are created by applying MPEG-2 compression algorithms to the source content in the MPEG-2 encoder. This process is typically called ingest. The encoder creates an individual compressed ES for each audio and video stream. The output of an optimally functioning encoder will look transparent when decoded in a set-top box and displayed on a professional video monitor for technical inspection.

A good ES depends on several factors, such as the quality of the original source material, and the care used in monitoring and controlling audio and video variables upon ingest. The better the baseband signal, the better the quality of the digital file. Also influencing ES quality are the encoded stream bit rate and how well the encoder applies its MPEG-2 compression algorithms within the allowable bit rate.

MPEG-2 has two main compression components: intraframe spatial compression and interframe motion compression. Encoders use various techniques, some proprietary, to stay within the maximum allowed bit rate while allocating bits to both compression components. It is a tradeoff between allocating bits for detail in a single frame and bits to represent the changes (motion) from frame to frame, and the balancing act is not always successful.

Researchers are currently investigating what constitutes a good picture. Presently, there is no direct correlation between the data in the ES and subjective picture quality. For now, the only way of checking encoding quality is with the human eye, after decoding.

The Packetized Elementary Stream
Individual ESs are essentially endless because an ES is as long as the program itself. To create a PES, each ES is broken into variable-length packets, each of which contains a header and payload bytes.

The PES header carries data about the encoding process that the MPEG decoder needs to successfully decompress the ES. Each individual ES results in an individual PES. At this point, audio and video information still reside in separate PESs. The PES is primarily a logical construct and is not really intended to be used for interchange, transport and interoperability. The PES also serves as a common conversion point between Transport Streams (TSs) and Program Streams (PSs).
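To make the header-plus-payload idea concrete, here is a minimal sketch of breaking an ES into PES packets. It assumes the basic PES layout from ISO/IEC 13818-1: a 0x000001 start code prefix, a stream ID (0xE0 is assumed here for a first video stream), a 16-bit length field and a three-byte optional header with no timestamps. The function name and the `max_payload` value are hypothetical. Real packetizers also align packets to pictures or audio frames and carry presentation time stamps, which this sketch omits.

```python
def packetize_es(es: bytes, stream_id: int = 0xE0, max_payload: int = 2048):
    """Split an elementary stream into simplified PES packets (no timestamps)."""
    packets = []
    for start in range(0, len(es), max_payload):
        payload = es[start:start + max_payload]
        # Optional header: '10' marker bits, no flags set, zero extra header bytes.
        optional = bytes([0x80, 0x00, 0x00])
        # The length field counts every byte that follows it in this packet.
        length = len(optional) + len(payload)
        header = b"\x00\x00\x01" + bytes([stream_id]) + length.to_bytes(2, "big")
        packets.append(header + optional + payload)
    return packets


es = bytes(range(256)) * 20  # 5,120 bytes of stand-in "compressed" data
pes_packets = packetize_es(es)
print(len(pes_packets), [len(p) for p in pes_packets])
```

Because the final packet only carries what is left of the ES, the packets come out with different lengths, which is the variable-length behavior described above.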

Transport Streams
Both the TS and PS are formed by packetizing PES files. During the formation of the TS, additional packets containing tables needed to demultiplex the TS are inserted. These tables are collectively called Program Specific Information (PSI). Null packets, containing a dummy payload, may also be inserted to fill the intervals between information-bearing packets. Some packets contain timing information for their associated program, called the Program Clock Reference (PCR). The PCR is inserted into one of the optional header fields of the TS packet. Recovery of the PCR allows the decoder to synchronize its clock to the rate of the original encoder clock.
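The PCR recovery step can be sketched as follows. The field layout is taken from ISO/IEC 13818-1 rather than from this article, so treat it as an assumption: the optional field is the adaptation field, and the PCR is a 33-bit base counted at 90 kHz plus a 9-bit extension, giving a value in 27 MHz ticks. The helper `build_pcr_packet` is hypothetical and only creates a test packet.

```python
def extract_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("expected one 188-byte TS packet starting with 0x47")
    if not packet[3] & 0x20:       # no adaptation field in this packet
        return None
    if packet[4] < 7:              # adaptation field too short to hold a PCR
        return None
    if not packet[5] & 0x10:       # PCR flag not set
        return None
    raw = int.from_bytes(packet[6:12], "big")  # 33-bit base, 6 reserved, 9-bit ext
    base = raw >> 15
    extension = raw & 0x1FF
    return base * 300 + extension


def build_pcr_packet(pid: int, pcr: int) -> bytes:
    """Hypothetical helper: an adaptation-field-only packet carrying a PCR."""
    base, extension = divmod(pcr, 300)
    raw = (base << 15) | (0x3F << 9) | extension
    header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x20])
    field = bytes([183, 0x10]) + raw.to_bytes(6, "big")
    return header + field + b"\xff" * (188 - len(header) - len(field))


pcr = extract_pcr(build_pcr_packet(0x100, 27_000_000 * 10 + 123))
print(pcr, pcr / 27_000_000)  # ticks, then seconds on the encoder clock
```

A decoder compares successive PCR values with its own 27 MHz counter and steers its clock until the two advance at the same rate.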

The Transport Stream is defined by the syntax and structure of the TS header.

TS packets are fixed in length at 188 bytes with a minimum 4-byte header and a maximum 184-byte payload. The key fields in the minimum 4-byte header are the sync byte and the Packet ID (PID). The sync byte's function is indicated by its name. It is a fixed 8-bit value (0x47) that marks the beginning of a TS packet.
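The 4-byte header can be unpacked with a few bit masks. This is a sketch that assumes the field layout of ISO/IEC 13818-1 (a 0x47 sync byte, a 13-bit PID, a 2-bit adaptation field control and a 4-bit continuity counter); the `TsHeader` type and the function name are hypothetical.

```python
from typing import NamedTuple


class TsHeader(NamedTuple):
    transport_error: bool
    payload_unit_start: bool
    pid: int
    adaptation_field_control: int
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the fixed 4-byte header at the start of a TS packet."""
    if len(packet) < 4 or packet[0] != 0x47:
        raise ValueError("not aligned to the start of a TS packet")
    return TsHeader(
        transport_error=bool(packet[1] & 0x80),
        payload_unit_start=bool(packet[1] & 0x40),
        pid=((packet[1] & 0x1F) << 8) | packet[2],   # 13-bit Packet ID
        adaptation_field_control=(packet[3] >> 4) & 0x03,
        continuity_counter=packet[3] & 0x0F,
    )


header = parse_ts_header(bytes([0x47, 0x41, 0x00, 0x17]))
print(hex(header.pid), header.payload_unit_start, header.continuity_counter)
```

The continuity counter increments per PID, so a jump in its value is a quick indicator of lost packets.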

The PID is a unique address identifier. Every video and audio stream, as well as each PSI table, needs to have a unique PID. The PID value is provisioned in the MPEG multiplexing equipment. Certain PID values are reserved and specified by organizations such as the Digital Video Broadcasting Group (DVB) and the Advanced Television Systems Committee (ATSC) for electronic program guides and other tables.

In order to reconstruct a program from all its video, audio and table components, it is necessary to ensure that the PID assignment is done correctly and that there is consistency between PSI table contents and the associated video and audio streams.

Program Specific Information
Program Specific Information (PSI) is part of the Transport Stream (TS). PSI is a set of tables needed to demultiplex and sort out PIDs that are tagged to programs. A Program Map Table (PMT) must be decoded to find the audio and video PIDs that identify the content of a particular program. Each program requires its own PMT with a unique PID value.

The master PSI table is the Program Association Table (PAT). If the PAT can’t be found and decoded in the Transport Stream, no programs can be found, decompressed or viewed.
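As an illustration of why the PAT comes first, the sketch below reads a PAT section and returns the PMT PID for each program number. The byte layout is assumed from ISO/IEC 13818-1 (table ID 0x00, a 12-bit section length, four-byte program entries and a trailing CRC-32). The function name and sample bytes are hypothetical, and the CRC is not verified here, which real test gear would do.

```python
def parse_pat_section(section: bytes) -> dict:
    """Map program_number -> PMT PID from one PAT section (CRC not checked)."""
    if section[0] != 0x00:
        raise ValueError("table_id is not 0x00, so this is not a PAT")
    section_length = ((section[1] & 0x0F) << 8) | section[2]
    end = 3 + section_length - 4          # stop before the 4-byte CRC_32
    programs = {}
    for offset in range(8, end, 4):       # program entries start at byte 8
        program_number = (section[offset] << 8) | section[offset + 1]
        pid = ((section[offset + 2] & 0x1F) << 8) | section[offset + 3]
        if program_number != 0:           # program 0 points to the network table
            programs[program_number] = pid
    return programs


sample = bytes([
    0x00, 0xB0, 0x11,              # table_id, flags + section_length (17)
    0x00, 0x01, 0xC1, 0x00, 0x00,  # TS id, version, section numbers
    0x00, 0x01, 0xE1, 0x00,        # program 1 -> PMT PID 0x100
    0x00, 0x02, 0xE2, 0x00,        # program 2 -> PMT PID 0x200
    0x00, 0x00, 0x00, 0x00,        # placeholder CRC_32 (not valid)
])
print({number: hex(pid) for number, pid in parse_pat_section(sample).items()})
```

Each PMT PID found this way is then parsed in turn to list the audio and video PIDs of that program.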

PSI tables must be sent periodically and with a fast repetition rate so channel-surfers don’t feel that program selection takes too long. A critical aspect of MPEG testing is to check and verify the PSI tables for correct syntax and repetition rate.
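A repetition-rate check comes down to simple arithmetic on packet positions. The sketch below assumes a constant-bit-rate TS, so the time between two packets is their distance in packets multiplied by the duration of one packet. The function names, the 0.5-second default limit and the sample numbers are assumptions for illustration; use the limit your own specification calls for.

```python
def repetition_intervals(packet_indexes, bitrate_bps, packet_size=188):
    """Seconds between successive occurrences of a table in a CBR stream."""
    seconds_per_packet = packet_size * 8 / bitrate_bps
    return [
        (later - earlier) * seconds_per_packet
        for earlier, later in zip(packet_indexes, packet_indexes[1:])
    ]


def slow_repetitions(packet_indexes, bitrate_bps, max_interval_s=0.5):
    """Return only the gaps that exceed the allowed repetition interval."""
    gaps = repetition_intervals(packet_indexes, bitrate_bps)
    return [gap for gap in gaps if gap > max_interval_s]


# Hypothetical capture: PAT packets seen at these positions in a 1.504 Mb/s TS,
# where one 188-byte packet lasts one millisecond.
pat_positions = [0, 100, 700]
print(repetition_intervals(pat_positions, 1_504_000))
print(slow_repetitions(pat_positions, 1_504_000))
```

In this sample the second gap is about 0.6 seconds, so it would be flagged against a 0.5-second limit.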

Another PSI testing scenario is to determine the accuracy and consistency of PSI contents. As programs change or multiplexer provisioning is modified, errors may appear. One is an “Unreferenced PID,” where the TS contains packets whose PID value is not referred to in any table. Another is a “Missing PID,” where a PID value is referenced in a PSI table but no packets with that value exist in the TS.
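Both error types reduce to a comparison of two sets: the PIDs actually seen in the stream and the PIDs the PSI tables refer to. The following is a minimal sketch; the function name is hypothetical, and it assumes the PAT PID (0x0000) and the null-packet PID (0x1FFF) should be ignored because no table refers to them. Other reserved PIDs may need the same treatment, depending on the system.

```python
def check_pid_consistency(observed_pids, referenced_pids,
                          ignore=frozenset({0x0000, 0x1FFF})):
    """Compare PIDs seen in the TS with PIDs listed in the PSI tables."""
    observed = set(observed_pids) - ignore
    referenced = set(referenced_pids) - ignore
    return {
        "unreferenced": sorted(observed - referenced),  # in the TS, in no table
        "missing": sorted(referenced - observed),       # in a table, not in the TS
    }


# Hypothetical capture: PIDs seen on the wire versus PIDs named in the tables.
seen = [0x0000, 0x0100, 0x0101, 0x0300, 0x1FFF]
listed = [0x0100, 0x0101, 0x0102]
report = check_pid_consistency(seen, listed)
print({kind: [hex(pid) for pid in pids] for kind, pids in report.items()})
```

Here PID 0x300 is unreferenced and PID 0x102 is missing.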

Good broadcast engineers never forget common sense. Just because there aren’t any unreferenced or missing PIDs doesn’t guarantee the viewer is necessarily receiving the correct program. There could be a mismatch of the audio content from one program being delivered with the video content from another.

Because MPEG-2 allows for multiple audio and video channels, a real-world “air check” is the most common-sense test to ensure that viewers are receiving the correct language and video. It’s possible to use a set-top box with a TV set to do the air check, but it’s preferable to use dedicated MPEG test gear that allows PSI table checks. It’s also handy if the test set includes a built-in decoder with picture and audio displays.

By Ned Soseman, Broadcast Engineering