Using Business Process Management to Succeed in a World of TV Everywhere

What has Business Process Management got to do with broadcasting? Surely, it is more relevant to a manufacturing operation; broadcasting is a creative business, isn’t it? Sure, production is creative, but to an outsider, broadcasting operations are much like manufacturing. Raw materials — the program tape or file — arrive from the production company, and at the other end, the broadcaster delivers programs to the viewer.

Through all the many processes, the program has been formatted for the viewer’s television, tablet or smartphone, and any necessary subtitles, captions or dubbed audio have been added to the program. The channel schedule has been delivered to the third-party listings guides with information about each program, and the promotions department may also have created trailers for the program.

Is this not a manufacturing operation, a media factory?

Over the last decade, many verticals outside the media-and-entertainment sector have moved to a software architecture where the processes are managed as a whole, rather than using islands of software to process the product. This holistic management of business processes potentially offers a more efficient way to command, control and optimize media operations.

Business Process Management (BPM) is a formalized method to manage the activities of the business. Through process management, company executives can view and monitor the running of the business, and make informed decisions about developing and optimizing it. BPM allows business processes and goals to be abstracted from the underlying technical platforms. At the business level, a broadcaster may want to add social media interactivity. At the technical level, this could involve synchronizing playout automation with social media platforms through a Web-services interface. The business decision maker should not be concerned with the technical minutiae.

Often linked to BPM is Service-Oriented Architecture (SOA), which is a configuration of the systems that implement the processes of the business. We use SOAs all the time when interacting with e-commerce services. As an example, when a user books a vacation, a travel website calls up hotels and airlines via Web services. From these loosely coupled services, the user is presented with a federated view of combined flight/hotel/rental-car deals from 10 or so airlines and hundreds of hotels.

It is key to note here that each airline may have its own booking system, but presenting the flight schedules and availability as a Web service abstracts the travel Web site from the airline’s software. This removes the need to intimately link the booking systems with inflexible APIs that are costly to maintain as software is updated or changed.

The business capabilities of each party are presented as re-useable services. In implementing an SOA, services are wrapped via adaptors that hide the complexity and present only the data necessary to make the business transaction.
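
The adaptor pattern described above can be sketched in a few lines of Python. This is purely illustrative; all service names, classes and fares here are invented, not any real booking API:

```python
from abc import ABC, abstractmethod

# Each booking system is wrapped in an adaptor exposing one minimal
# service interface; the complexity of the vendor's own API stays hidden.
class FlightService(ABC):
    @abstractmethod
    def search(self, origin: str, dest: str, date: str) -> list:
        ...

class LegacyAirlineAdaptor(FlightService):
    # Stands in for a proprietary legacy API (canned data for illustration).
    def search(self, origin, dest, date):
        return [{"carrier": "Legacy Air", "fare": 420.0}]

class ModernAirlineAdaptor(FlightService):
    def search(self, origin, dest, date):
        return [{"carrier": "Modern Jet", "fare": 385.0}]

def federated_search(services, origin, dest, date):
    """The travel site sees one uniform interface, not many booking systems."""
    results = []
    for svc in services:
        results.extend(svc.search(origin, dest, date))
    return sorted(results, key=lambda r: r["fare"])

deals = federated_search([LegacyAirlineAdaptor(), ModernAirlineAdaptor()],
                         "JFK", "LAX", "2012-09-01")
print(deals[0])  # cheapest deal first
```

Swapping one airline's software for another then means writing a new adaptor, not rewiring the travel site.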

BPM is a software layer above the workflow orchestration within the SOA. It provides a toolset to simulate and model business processes. Workflows can be designed, and business rules applied to processes. The workflow orchestration engine calls on services as appropriate, using the information from the business model. It also feeds reports to the “dashboard” with data appropriate to the role of the user.

The orchestration engine and the services that implement the business processes are linked with an Enterprise Service Bus (ESB) that carries the control messages.
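
As a rough illustration of loose coupling over a bus, and with all topic names and payloads hypothetical, the control-message pattern might look like this in Python:

```python
from collections import defaultdict

# Services register handlers for named control messages; the orchestration
# engine publishes messages on the bus instead of calling services directly.
class ServiceBus:
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, payload):
        # Every subscriber to the topic receives the message.
        return [handler(payload) for handler in self.handlers[topic]]

bus = ServiceBus()
bus.subscribe("transcode", lambda job: "transcoded " + job["asset"])
bus.subscribe("report", lambda job: "dashboard: " + job["status"])

# One workflow step: request a transcode, then report status for the BPM.
result = bus.publish("transcode", {"asset": "movie_1234"})
bus.publish("report", {"status": "transcode complete"})
print(result)
```

Because neither side calls the other directly, a service can be replaced or taken offline without rewiring the orchestration layer.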

Media Businesses and SOA
Although DPs or editors may view their part of content production as creative, in reality the distribution and delivery of television programming can be viewed as a content factory. The business processes in the factory include sales, acquisition and channel management. The staff running these processes are not concerned with which make and model of transcoder is used or how the media storage is designed. Their focus is on revenues, margins and costs.

Running a broadcast operation requires the management of content — both programs and commercials — through many processes before the final transmission. A typical business process could be “prepare this movie for air, and transmit at 9 p.m. on Friday.” To implement this process, the broadcaster will have designed a workflow that orchestrates many tasks and activities.

However, monitoring the processes, following content as it moves from one operator to another, and from one department to another, is difficult. An executive may rely on monthly reports from supervisors. These do not give a detailed view of how the business is running, how efficient it is or how efficiently resources are deployed.

The old way, moving tapes from one department to another, could be managed, but it was difficult to get an overall view of the efficiency of operations. Department heads provided a filtered view of operations under their command. Staff could confuse the status of jobs. The linear nature of tape operations led to a rigid style of operation, with tapes passed like a baton in a relay from one operation to the next. Even with the move to file-based operations, much of the workflow orchestration was done by saving files to watch folders or by e-mailing notifications to the next operator down the line.

It is difficult to analyze or audit such operations. There may be logs stored by individual workstations or software applications, but there is no federated view. BPM changes this by two means:

  • Management dashboard - To give a comprehensive view of the business, the operation of the workflow and underlying technical processes must be presented in a simple, customizable interface that can present key information to the executive. This is referred to as the “dashboard.”

    From the dashboard, a manager can check the utilization of equipment or human resources, view any bottlenecks, such as files waiting for transcoding, and take corrective action. It provides better intelligence for decisions on strategic issues, such as capital investment vs. outsourcing.

  • Service-oriented architecture - An unconventional view is to look at broadcast operations as a number of services: ingest, transcode, edit, etc. These services are then orchestrated to speed the program from ingest to air. The existing method of orchestrating processes is through a mix of written procedures, e-mail notifications, work orders and even phone calls.


Services could be called from a Web interface, much like the travel booking example. But the real strength of the SOA is when the workflow is orchestrated by a middle layer of software. This middleware carries out the business requests from the BPM and calls on the services to implement those business operations. Flowing the other way is reporting on the services, which allows the BPM to view the operations and to optimize them if required.

File-based operations remove many of the constraints on operations, allowing processes to be orchestrated in a way that serves the needs of the business, rather than as dictated by the limitations of the medium that carries the content (videotape).

Services can be provided within the facility, or they could be outsourced or even cloud-based. Through the abstraction of the service adaptor, the workflow orchestration engine calls a service in the same way whether it is in the same room or on the other side of the world.

The Advanced Media Workflow Association (AMWA) and European Broadcasting Union (EBU) have been working jointly to develop a Framework for Interoperable Media Services (FIMS), as well as specifications for service adaptors, including transfer, transform and capture. The Transfer service copies or moves files and is typically used to deliver files, to archive them or to move them through a hierarchical storage system. The Transform service encompasses transcoding, transrating, rewrapping and changes of resolution. The Capture service is the ingest of real-time video and audio streams, typically from HD-SDI.
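
FIMS defines its job messages as XML schemas. Purely as an illustration of the idea, and with field names invented rather than taken from the actual schema, a transform or transfer request can be pictured as a simple data structure that a service adaptor would serialize to the wire:

```python
from dataclasses import dataclass

# Invented field names for illustration; the real FIMS schemas are richer
# and expressed in XML.
@dataclass
class TransferJob:
    source: str
    destination: str

@dataclass
class TransformJob:
    source: str
    target_codec: str
    target_bitrate_mbps: float
    notify_endpoint: str  # where the service posts status updates

def describe(job) -> str:
    """A service adaptor would serialize this to the wire format."""
    return type(job).__name__ + ": " + str(vars(job))

job = TransformJob("store://ingest/movie.mxf", "H.264", 8.0,
                   "http://orchestrator.example/status")
print(describe(job))
```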

These services lie at the heart of multiformat delivery. It is envisioned that further services will be added to the project over time.

An SOA is technology-agnostic, and the middleware can be obtained from any number of software vendors. The FIMS framework constrains the requirements for the middleware, and some vendors are already offering systems that comply with the framework and the service adaptors.

One of the aims of FIMS is to move away from proprietary adaptors to reusable ones. This will save costs for the broadcaster, and save manufacturers from what is often unprofitable custom development.

In the past, broadcasters may have had little incentive to change processes and operations, or to look at what other vertical sectors were doing. But intensified global competition in the media marketplace, and the demands of multi-platform delivery, have sharpened the need to improve efficiency. Broadcasters must be more agile. If a new consumer device comes along, with a new way to view content, then the broadcaster should be in a position to capitalize on the opportunity, and at an optimum cost.

Conventional IT architectures do not serve the needs of the media business well. For example, some services, like editing and grading, are human processes. The editor takes the source files and, days or weeks later, delivers the final cut. Even transcoding a file can be a lengthy process. Media files are large by IT standards, up to terabytes.

This requires a separate media service bus, in parallel with the ESB, with the bandwidth to carry media files as they move between services. Any file-based facility probably already has a media network that can be adapted to this task.

To some, such a radical change may imply risk: Will the broadcast operations stop if the SOA middleware or the BPM crash? Do they become single points of failure?

There is risk in any business system, but the risk is mitigated by careful design. There is a wealth of knowledge from other vertical sectors in the application of SOA and BPM; they have been doing it for years. A feature of the SOA is that services are loosely coupled. The playout automation exchanges schedules and logs with the middleware, but if the middleware is down for some reason, playout continues. The automation and master control operator continue as normal until the middleware restores service messaging.

BPM and SOA present a good way to build an agile business that can quickly adapt to the changing media landscape. They can aid the executive to optimize use of capital equipment and staff resources. Through agility and efficiency, the business can better compete and succeed in a world of TV everywhere.

By David Austerberry, Broadcast Engineering

Digital Video and Audio Interfaces

Professional video interfaces are undergoing a change, in part due to the age of the initial digital systems, and also because of the emergence of high-performance interconnects for consumer use. First, let’s summarize the existing solutions for high-bandwidth audio/video transfer.

Existing Interfaces
Standard-definition Serial Digital Interface (SD-SDI) is a serial link that can transmit uncompressed digital video and audio (usually up to eight channels) over 75Ω coaxial cable. Without repeaters, rates of up to 270Mb/s over 1000ft are customarily used. Digital Video Broadcasting, Asynchronous Serial Interface (DVB-ASI) was defined for the transmission of MPEG Transport Streams, and is electrically similar to SDI, with a data rate of 270Mb/s.

HD-SDI is the second-generation version of SDI and allows the transmission of HD (1080i and 720p) signals over the same 75Ω cables as SD-SDI. It can handle rates up to 1.485Gb/s. A dual-link HD-SDI provides up to 2.97Gb/s and supports 1080p resolution, but it is being replaced by the single-link 3G-SDI, the third-generation version of SDI that can reach a maximum bit rate of 2.97Gb/s over a 75Ω coax cable.
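
The HD-SDI figure can be sanity-checked from the raster dimensions. A minimal Python sketch, assuming a 1080i raster of 2200 total samples per line (active plus blanking) and 1125 total lines at 30 frames/s:

```python
# SMPTE 292 HD-SDI rate from first principles: each sample carries 10-bit
# luma plus 10-bit multiplexed chroma.
samples_per_line = 2200
lines_per_frame = 1125
frames_per_sec = 30
bits_per_sample = 10 + 10   # Y + Cb/Cr multiplexed

hd_sdi = samples_per_line * lines_per_frame * frames_per_sec * bits_per_sample
print(hd_sdi)        # 1485000000 -> 1.485 Gb/s
print(hd_sdi * 2)    # 2970000000 -> 2.97 Gb/s, dual-link HD-SDI or 3G-SDI
```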

Consumer electronics are catching up with pro interfaces. Although driven from the non-professional side, evolving consumer electronics interfaces are affecting pro equipment, especially displays. The legacy analog VGA and hybrid analog/digital DVI interfaces used to interconnect PCs with displays could be obsolete by 2015, as chipset manufacturers have announced their intent to withdraw support by that year, which means PC motherboard manufacturers will likely drop these functions from new designs. Replacing them on PCs, DVD players and other consumer video devices are HDMI and DisplayPort.

HDMI 1.4a has a throughput of 8.2Gb/s, allowing it to carry up to 4096p24 video (or 1920p60) at 24 bits per pixel, as well as various 3-D formats, eight channels of audio, Consumer Electronics Control (CEC) and High-bandwidth Digital Content Protection (HDCP). DisplayPort 1.2 supports up to 8.6Gb/s, and thus can carry payloads similar to those of HDMI. Functionally, the interfaces differ in the way they handle video and audio, with HDMI using a raster-based protocol and DisplayPort transporting content in packets. From a market standpoint, the main difference between HDMI and DisplayPort is that the former was designed primarily as a digital TV interface, while the latter was intended as a PC-centric interface. The two interfaces also have license and royalty differences.
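
A quick back-of-the-envelope check, counting active pixels only and ignoring blanking and coding overhead, shows that the formats mentioned fit within HDMI 1.4a's throughput:

```python
def payload_gbps(width, height, fps, bits_per_pixel):
    """Active-pixel video payload in Gb/s, ignoring blanking and coding."""
    return width * height * fps * bits_per_pixel / 1e9

# Formats named in the text vs. HDMI 1.4a's ~8.2 Gb/s of throughput.
for name, args in [("4096x2160p24", (4096, 2160, 24, 24)),
                   ("1920x1080p60", (1920, 1080, 60, 24))]:
    rate = payload_gbps(*args)
    print(f"{name}: {rate:.2f} Gb/s, fits in 8.2 Gb/s: {rate < 8.2}")
```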

The USB 3.0 (also called Super Speed USB) specification is used almost exclusively as a PC (or tablet) interface to support peripherals. It supports transfer rates up to 5Gb/s, over a maximum distance of about 16ft. As a data-transfer protocol, USB is payload-agnostic, so the transfer of audio and video is essentially limited to the latency characteristics of the interface. IEEE 1394 was originally designed to support bit rates of up to 400Mb/s, but newer versions of the standard support speeds as high as 3.2Gb/s.

Thunderbolt is a newer 10Gb/s bidirectional serial interface. Developed by Intel in collaboration with Apple, it provides full-bandwidth data and video transfer between a PC and peripheral and display devices, up to a distance of 10ft. Serving as the hardware layer below the PCI Express (the bus used inside PCs) and DisplayPort protocol stacks, the interface uses a time-synchronization protocol that allows up to seven daisy-chained Thunderbolt devices to synchronize their time to within 8ns of each other. Like USB, Thunderbolt can supply power to the peripheral, a key differentiator from other display-interface technologies; it delivers up to 10W, exceeding USB 3.0's 4.5W capacity.

HDBaseT is a recent standard that uses CAT-5e Ethernet cable to transmit uncompressed video at rates up to 10Gb/s, along with two-way control signals and power, with enough capacity left for simultaneous 100BaseT Ethernet. The great attraction of this interface is that it can be deployed over existing Ethernet infrastructure, greatly reducing implementation cost. As with other data-based interfaces, the video can be conventional uncompressed HD, 3-D, 4K or high frame rate. The maximum specified distance for HDBaseT is 328ft, which can be extended through eight hops, and the standard supports carrying up to 100W of power.

Wireless Video Products
Several wireless standards are vying for use in driving displays. Wireless Home Digital Interface (WHDI) is an interface that uses the same 5GHz band as Wi-Fi, and is designed to transmit uncompressed HD video at data rates of up to 3Gb/s in a 40MHz channel. The range is said to be greater than 100ft, with a latency of less than 1ms.

WiGig (from the Wireless Gigabit Alliance) is a specification based on 802.11 that supports generic data transmission rates of up to 7Gb/s. A different approach is being taken by WirelessHD, a specification that defines a wireless protocol enabling consumer devices to create a wireless video area network (WVAN) that can stream uncompressed audio and video up to Quad Full HD (QFHD, or 4K) resolution, at 48-bit color and 240Hz refresh rates, with support for 3-D video formats. The specification, which is based on 802.15, supports data transmission rates from 10Gb/s to 28Gb/s.

The Wi-Fi Alliance has also announced a certification program, called Miracast, through which certified devices can make use of an existing Wi-Fi connection to deliver audio and video content from one device to another, without cables or a connection to an existing Wi-Fi network.

In another industry development, Mobile High-Definition Link (MHL) is being used to connect tablets and smartphones to displays. MHL defines an HD video and digital audio interface optimized for connecting mobile phones and portable devices to HDTVs, displays and other home entertainment products. MHL features a single cable with a five-pin interface that is able to support up to 1080p60 HD video and 192kHz digital 7.1-channel audio, as well as simultaneously providing control and power (2.5W) to the mobile device.

Because MHL does not specify a unique connector, various mechanical interfaces have emerged, including five-pin and 11-pin MHL-USB connectors. MHL fully supports the HDCP specification (used elsewhere on DVI and HDMI interfaces) for the safeguarding of digital motion pictures, television programs and audio against unauthorized access and copying.

Maintaining High-Speed Networks
High-speed networks are challenging to maintain. When any of these high-speed interfaces are combined with long runs of cable, performance will degrade, primarily from inter-symbol interference caused by cable-based dispersion of different signal frequencies, as well as jitter caused by processing equipment, as shown in the figure below. The result will be an increase in error rate at the receiving end.

Binary digital signal with interference

To minimize this, video plants should be designed and maintained with low-jitter equipment and with cable runs kept as short as practical, using repeaters for lengths nearing the maximum specifications. Adhering to these precautions will result in reliable operations.
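
A toy Python sketch, using an invented impulse response rather than a measured cable model, shows how dispersion smears each bit into its neighbors and erodes the detector's margin:

```python
# A dispersive channel leaks part of each symbol's energy into the
# following symbol periods, so the sampled level at the receiver depends
# on surrounding bits, not just the current one.
bits = [1, 0, 1, 1, 0, 0, 1, 0]
impulse_response = [0.7, 0.2, 0.1]   # invented: energy spread over 3 symbols

received = []
for i in range(len(bits)):
    level = sum(impulse_response[k] * bits[i - k]
                for k in range(len(impulse_response)) if i - k >= 0)
    received.append(round(level, 2))

# A simple threshold detector still decodes correctly here, but with
# degraded margins: a transmitted 0 following two 1s sits at 0.3, not 0.0,
# so added jitter or noise pushes it toward an error.
decisions = [1 if v > 0.5 else 0 for v in received]
print(received)
print(decisions)
```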

By Aldo Cugnini, Broadcast Engineering

FIMS 1.0 Jointly Published by EBU and AMWA

FIMS - the Framework for Interoperable Media Services - is an open standard for Service-Oriented Architecture (SOA). An SOA-based approach replaces the tightly coupled devices and functions found in traditional systems with a set of "Media Services" that are interoperable, interchangeable and reusable. The interfaces between these different services and the centralized system that runs them are defined by FIMS.

Broadcasters that adopt FIMS will be able to overcome expensive, disruptive incompatibilities in IT-based broadcast production technologies. It will be easier for them to adapt to future delivery formats and platforms.

The FIMS 1.0 specification is now available, comprising Part 1, the General Description, and Part 2, a multi-section document describing the Base Schema and the Transfer, Transform and Capture Services. An accompanying package of schema files is also available for download.

The documentation and files are available from both the EBU (as Tech 3356) and AMWA:
Part 1: General Description
Part 2, S0: Base Schema
Part 2, S1: Transfer Service
Part 2, S2: Transform Service
Part 2, S3: Capture Service
XML Schemas

Source: EBU

Discover the Benefits of AVC-Ultra

There is no doubt that we are in the midst of a rapid evolution of codec design. Traditional codecs, some might call them legacy codecs, are gaining evolutionary improvements. These codecs include HDCAM, AVC-Intra 50 and 100 as well as AVCHD 1.0. This article will, after a brief overview of AVC-Intra and ProRes 422 as well as the new sensors that drive codec evolution, focus on AVC-Ultra.

ProRes 422
There are five flavors of ProRes, compared in the table below with uncompressed video. Although ProRes 422 codecs are 10-bit codecs, they may carry 12-bit data values, and they vary in terms of color space and compression ratios. ProRes 4444, however, has additional functionality. The first three 4's indicate that the codec can carry either RGB values or luminance plus two chroma components, with all three values present for each pixel. The fourth 4 indicates that an alpha value can be carried along with each pixel. When cameras record ProRes 4444, the fourth value is not present, making the data stream simply 4:4:4.

ProRes 422 formats

The advantage of the ProRes proxy codec is best experienced in Final Cut Pro X. When you import any type of media, you have the option of automatically creating, in the background, a ProRes 422 or proxy version of the original file. You then edit the 4:2:2 10-/12-bit proxy video, which allows real-time editing of almost any format on almost any Mac. During export, the original file is used as the source of all image data.

AVCHD has evolved to version 2, which has two new features: the ability to record at frame rates of 50fps or 60fps, and to record at 28Mb/s at these higher frame rates. To date, the AVCHD specification has not been enhanced to support Quad HD or 4K2K images. For this reason, cameras, such as the JVC HMQ10, record Quad HD in generic AVC/H.264. Using Level 5.1 or Level 5.2, 24fps or 60fps respectively can be recorded.

Panasonic’s AVC-Intra is available in two formats: a 50Mb/s codec and a 100Mb/s codec. AVC-Intra records a complete range of frame rates. At 1920x1080: 23.98p, 25p, 29.97p, 50i and 59.94i. At 1280x720: 23.98p, 25p, 29.97p, 50p and 59.94p. The characteristics of each of these two flavors differ.

AVC-Intra formats

Codec Parameters
All codecs have a similar set of parameters. These include image resolution, image composition (single frame versus two fields), de-Bayered versus raw (progressive-only), image frame rate or field rate, color sampling (4:4:4, 4:2:2, 4:1:1 or 4:2:0), RGB versus YCrCb, compression ratio, and bit depth.

Traditional codecs employ bit depths of either 8 or 10 bits. The number of bits used for recording is independent of the number of bits output by the sensor’s analog-to-digital converter.

Nevertheless, a camera’s dynamic range is a function of sensor performance (low noise is critical), number of A/D bits and the number of codec recording bits. Each stop requires a doubling of sensor output voltage, and each bit represents a doubling of voltage. Therefore, a 12-bit A/D has the potential to capture a 12-stop dynamic range.
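
This bits-to-stops relationship can be stated in a few lines of Python; the roughly 6.02dB-per-bit figure is the standard rule of thumb for converters:

```python
import math

# One stop doubles the sensor voltage and one A/D bit doubles the code
# range, so an n-bit converter can span at most n stops.
def dynamic_range(adc_bits):
    stops = math.log2(2 ** adc_bits)        # == adc_bits
    db = 20 * math.log10(2 ** adc_bits)     # ~6.02 dB per bit
    return stops, round(db, 1)

print(dynamic_range(12))  # a 12-bit A/D: 12 stops, about 72 dB
```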

As a camera’s bit depth increases, the smoothness of the camera’s gray scale increases (banding is reduced). Therefore, the A/D and post-A/D processing traditionally have more bits than necessary to capture the sensor’s dynamic range, thereby realizing the sensor’s full potential.

ProRes 422 formats

Both ProRes 4444 and AVC-Ultra can provide 12-bit sample depth. Alternately, data can be converted to log values. In this case, 16 bits can be represented by only 10 bits. Thus, when looking at bit depth specifications, it’s important to know whether it’s log data.

Consider an illumination range of 18 stops. Assuming older sensor technology, at best only 12 stops can be captured by the sensor. However, these 12 stops are not all usable. Low illumination causes several stops to be lost because of high levels of noise. Likewise, at high illumination, several stops are lost due to clipping under extreme light levels. The effective dynamic range is only about six stops.

Legacy video sensor and processing

In the above figure, the brown diagonal line shows a perfectly linear gamma. In order for a video signal to be displayed correctly on a monitor, a nonlinear gamma must be applied to the signal from the A/D. In the HD world, it’s called Rec. 709. (Red curve.) This curve provides the video image that we are used to looking at. When video will be transferred to film, a lower contrast video image is required. (Blue curve.) The “X” marks the point where the filmic curve yields a brighter mid-tone image that reduces apparent contrast.

Now consider a contemporary sensor. The illumination range remains the same at 18 stops. The potential sensor range, however, has increased to 15 stops. Because of improved technology, fewer stops are lost to noise and bright light clipping. Thus, the sensor is able to capture a usable 12-stop dynamic range.

Contemporary cinema sensor and processing

Once again, the brown diagonal shows a linear gamma curve, and the red curve shows Rec. 709 gamma. To record a 12-stop signal, a 12-bit codec can be employed. Alternately, some cinema cameras utilize a logarithmic gamma (green curve) that is applied to sensor data. At point “Y,” the logarithmic curve yields a brighter picture that reduces apparent contrast. Likewise, at point “Z,” the logarithmic curve yields a darker picture that also reduces apparent contrast.

This explains why a logarithmic image looks so much “flatter” than a Rec. 709 image. After log conversion, only 10 bits are required to carry the 12-stop signal range.
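
The idea behind log recording can be sketched as follows. This is not any camera's actual transfer curve, just a uniform stops-to-codes mapping for illustration:

```python
import math

# Map a linear signal spanning 12 stops onto 10-bit code values, giving
# every stop the same number of codes instead of spending half of them
# on the brightest stop.
STOPS = 12
CODES = 2 ** 10   # 1024 code values in a 10-bit recording

def log_encode(linear):
    """linear is normalized so 1.0 is clip; 2**-STOPS is the floor."""
    linear = max(linear, 2.0 ** -STOPS)
    stops_below_clip = -math.log2(linear)               # 0 .. STOPS
    return round((1 - stops_below_clip / STOPS) * (CODES - 1))

# Clip maps to the top code; the darkest representable stop maps to 0.
print(log_encode(1.0))       # 1023
print(log_encode(2 ** -4))   # mid-tones keep plenty of codes: 682
```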

Today’s sophisticated sensors demand a recording system that is capable of carrying a much higher-level quality image. For this reason, Panasonic has announced AVC-Ultra. AVC-Ultra is backward compatible with AVC-Intra. That means that an AVC-Ultra decoder can decompress all of Panasonic’s P2 codecs. AVC-Ultra offers several quality levels.

AVC-Ultra formats

The Panasonic AVC-Ultra family defines three new encoding parameters from the MPEG-4 Part 10 standard. Unlike the Intra codecs, Ultra codecs can utilize the AVC/H.264 4:4:4 Predictive Profile.

AVC-Intra Class 50 and 100 are extended to Class 200 and Class 4:4:4. The Class 200 mode extends the bit rate to 226Mb/s for 1080/23.98p, while Class 4:4:4 extends the possible resolution from 720p to 4K with value depths of 10 and 12 bits. It’s possible that Class 4:4:4 at 10 or 12 bits with a 4K frame size will be employed in the 4K camera Panasonic showcased at NAB2012. The Class 4:4:4 bit rate varies between 200Mb/s and 440Mb/s depending on resolution, frame rate and bit depth.

There is also a new 8-bit AVC-Proxy mode that enables offline edits of 720p and 1080p video at bit rates varying between 800kb/s and 3.5Mb/s.

Both the Class 200 and the Class 4:4:4 are intra-frame codecs. Although Panasonic has always promoted intra-frame encoding, its new AVC-LongG is an inter-frame codec. AVC-LongG enables compression of video resolutions up to 1920x1080 at 23.98p, 25p and 29.97p. Amazingly, 4:2:2 color sampling with 10-bit pixel depth can be recorded at data rates as low as 25Mb/s.
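
To put these bit rates in perspective, a short calculation of the storage consumed per hour of recording (nominal video data rate only, excluding audio and metadata overhead):

```python
# Gigabytes consumed per hour of recording at a given video data rate.
def gb_per_hour(mbps):
    return mbps * 1e6 * 3600 / 8 / 1e9

for name, rate in [("AVC-LongG 4:2:2 10-bit", 25),
                   ("AVC-Intra Class 200", 226),
                   ("Class 4:4:4 maximum", 440)]:
    print(f"{name} at {rate}Mb/s: {gb_per_hour(rate):.1f} GB/hour")
```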

By Steve Mullen, Broadcast Engineering

An Introduction to LTFS for Digital Media

We live in an age where content is king. For the entertainment and media industry, the majority of this content is now produced in digital form, and virtually all of this content has digital distribution; that content is now digital data. Protecting that content, the lifeblood of this industry, with the right data storage system is more important than ever.

Tape is already firmly established in the media production environment, whether for securing on-set content or for long-term archiving. LTFS broadens the usefulness of LTO technology by making it easier to use and more robust. The open and self-contained LTFS format is useful if the tapes are to be sent offsite, archived or shared with a variety of recipients.

The LTFS standard was adopted by the LTO Program in April 2010. It is an open format and software specification that supports simpler and new ways to access data on tape. Although the tape model hasn’t changed dramatically over the years, the speed, storage density and features of data tape have improved significantly, ultimately providing reliable and inexpensive storage as a sequential storage medium. With LTFS, accessing files stored on the LTFS-formatted media is similar to accessing files stored on other forms of storage media, such as disks or removable USB flash drives. It has no application software dependencies, offers support for large and numerous files, and often can have a lower total cost than traditional managed tape storage.

How LTFS Works
LTFS consists of a software driver and the format specification. Drivers, some free and open source, are available for various operating and tape systems from the tape hardware vendor websites.

The format is a self-describing tape format and defines the organization of data and metadata on tape. Within the partitions, the tape contents are still stored as usual, as blocks of data and file marks. Files are mapped by LTFS to a hierarchical directory structure.

LTFS uses media partitioning, where tape is logically divided “lengthwise” into two partitions:

  • Index partition: Contains file system info, index, metadata.
  • Content partition: Contains the files and content bodies.

The standard LTO-5 tape cartridge is segmented into two partitions, one for the index and one for the data, so the index partition can be modified as needed without affecting the append-only data partition. This creates a self-describing tape where a user can see the tape cartridge and its contents in the operating system directory tree browser and can copy, paste or drag and drop files/folders to and from the tape. Similarly, applications can access the data on tape directly, unaware they are using tape, though there may be differences in latency inherent to tape.

When the tape is inserted and mounted in a tape drive, the information in the index partition is read and cached in the workstation’s memory. From that point on, the index is accessed and updated in the workstation memory for fast performance and so the tape head can stay positioned in the content portion of the tape to more quickly access files or write new ones.

As the tape is used, the index is updated in workstation memory, eliminating the need to seek back to the beginning of the tape. To protect the index, LTFS periodically copies it from the workstation’s memory to the data partition. When the tape is unmounted, the index is copied one final time to the end of the data partition, the tape is rewound, and the index is then written twice to the index partition. Essentially, there are multiple copies of the index on tape for restoring in the unlikely event that the index partition becomes unusable, or if a user wants to roll the tape back to a previous version.

For instance, you may want to restore it to the way the tape appeared last Monday by choosing an index from that date. In some tape library implementations, the tape cartridge’s indexes are cached to a server disk for fast searching without having to remount the tape cartridges. LTFS also supports extended attributes, which enable custom file metadata.
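
Because a mounted LTFS tape appears as an ordinary filesystem, a backup script needs nothing tape-specific. A minimal sketch, with a hypothetical mount point supplied by the caller (on a real system, wherever the LTFS driver mounted the cartridge):

```python
import shutil
from pathlib import Path

def archive_dailies(source_dir, tape_mount):
    """Copy a day's footage to tape using only standard file calls.
    The writes land in the content partition; LTFS keeps the index."""
    source_dir, tape_mount = Path(source_dir), Path(tape_mount)
    copied = []
    for clip in sorted(source_dir.glob("*.mxf")):
        # Large sequential writes suit the streaming nature of tape.
        shutil.copy2(clip, tape_mount / clip.name)
        copied.append(clip.name)
    return copied
```

The same call works unchanged against a disk directory, which is exactly the point: applications do not need to know they are talking to tape.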

LTFS opens up new opportunities for media and entertainment storage and distribution, providing support to media workflows. LTFS directly affects two major trends having significant impact on media and entertainment companies and digital media producers: the shift to file-based workflows and increasing storage demands.

With the advent of digital technologies, moving image (video and film) content producers and distributors are transitioning from analog/linear workflows based on film or videotape technology to digital/nonlinear workflows based on the manipulation of data files. Many organizations have already completed this transition. This has been commonly referred to as “moving to a tapeless workflow,” though it is more accurate to call it moving to a “videotapeless” workflow.

The other fact of life in media and entertainment is the demand of ever-increasing visual resolution and complexity (HD, 4K, 3-D, etc.) creating more and larger files that must be managed. Keeping hours of such media online on disk quickly becomes cost prohibitive. LTO tape, especially with LTFS, can address the challenge. LTFS can work with LTO hardware-based lossless compression, which can provide bandwidth and capacity benefits depending on the data content.

LTO tape is already firmly established in the media production environment. For production, LTFS efficiently supports several important requirements.
  • Camera media reuse: Digital cameras encode motion images directly to SSDs or removable disks in the camera. These media are quite expensive (three to more than 350 times the cost of the equivalent capacity on LTO). Fast transfer of their contents to tape with LTFS enables reuse of this expensive media, reducing the number of SSDs or disks that must be purchased or rented.

  • Backup: Backup of daily footage to LTO tape is a common requirement as the loss of a day’s worth of production is costly. LTFS facilitates backup by enabling small portable independent systems to easily write daily content to tape.

  • Transport: The density and cost of LTO-5 tapes with the self-describing capabilities of LTFS combine to create an effective transport medium. Large amounts of data can be sent more quickly and economically than network-based transmission methods. This is especially compelling for digital productions, which can produce terabytes of data for every day of shooting. The encryption features of LTO tape help secure the data in transit.

  • Economic direct access to data: For any file-based production workflow, an LTFS-enabled tape drive can feed workstations or networks with content directly and relatively quickly, much like a disk and unlike most traditional tape systems. Through the operating system, an application always has a direct and persistent view of a mounted LTFS tape and the files it contains. Consequently, in a workflow where access to a file is expected to be fast but not instantaneous, such as a stock footage collection or archive footage of an ongoing news story, an LTFS tape is an effective and economical choice for storage.

  • Archive: LTFS-formatted tapes can be easily imported into an LTFS-compatible archive by simply reading the index and adding the file metadata to an archive manager’s catalog.

Conversely, traditional systems that use separate media for transport and archive require all the data be recopied. With LTFS, there is no need to read the much larger data partition or transfer the data to other storage media. The transport media and the archive storage media are one and the same under this scenario. The “import bandwidth” of tapes being added directly to a library en masse far exceeds any system that requires movement of the actual data.
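The transport and import arguments are easy to quantify with a back-of-envelope calculation. This sketch uses LTO-5's 1.5TB native capacity and an assumed 24-hour courier; the 100Mbps comparison link is purely illustrative:

```python
# Back-of-envelope "sneakernet" bandwidth of a shipped LTO-5 cartridge.
# Assumptions: 1.5 TB native capacity per cartridge and 24-hour courier
# delivery; the 100 Mbps network figure is illustrative only.

LTO5_NATIVE_BYTES = 1.5e12          # 1.5 TB native (uncompressed) capacity
COURIER_SECONDS = 24 * 60 * 60      # overnight shipping

effective_bps = LTO5_NATIVE_BYTES * 8 / COURIER_SECONDS
print(f"One shipped LTO-5 cartridge ~= {effective_bps / 1e6:.0f} Mbps sustained")

# Time to move the same 1.5 TB over a dedicated 100 Mbps line:
network_seconds = LTO5_NATIVE_BYTES * 8 / 100e6
print(f"Same data over 100 Mbps: {network_seconds / 3600:.1f} hours")
```

A single overnighted cartridge sustains roughly 139Mbps of effective bandwidth, and a box of them multiplies that figure accordingly.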

By Rainer Richter, Broadcast Engineering

Reliable UDP (RUDP): The Next Big Streaming Protocol?

Anyone with even a little networking experience will probably have heard of TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). They are transport protocols that run over IP links, and they define two different ways to send data from one point to another over an IP network path. TCP running over IP is written TCP/IP; UDP in the same format is UDP/IP.

TCP has a set of instructions that ensures that each packet of data gets to its recipient. It is comparable to recorded delivery in its most basic form. However, while it seems obvious at first that "making sure the message gets there" is paramount when sending something to someone else, there are a few extra considerations that must be noted. If a network link using TCP/IP notices that a packet has arrived out of sequence, then TCP stops the transmission, discards anything from the out-of-sequence packet forward, sends a "go back to where it went wrong" message, and starts the transmission again.

If you have all the time in the world, this is fine. For transferring my salary information from my company to me, I frankly don't care whether it takes a microsecond or an hour; I want it done right. TCP is fantastic for that.

In a video-centric service model, however, there is simply so much data that if a few packets don't make it over the link there are situations where I would rather skip those packets and carry on with the overall flow of the video than get every detail of the original source. Our brain can imagine the skipped bits of the video for us as long as it's not distracted by jerky audio and stop-motion video. In these circumstances, having an option to just send as much data from one end of the link to the other in a timely fashion, regardless of how much gets through accurately, is clearly desirable.

It is for this type of application that UDP is optimal. If a packet seems not to have arrived, the recipient waits a few moments to see if it turns up -- potentially right up to the moment when the viewer needs to see that block of video. If the playout buffer reaches the point where the missing packet should be, the application simply skips the gap and carries on to the next packet, maintaining the time base of the video. You may see a flicker or some artifacting, but the moment passes almost instantly, and more than likely your brain will fill the gap.
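The skip-and-carry-on behavior can be sketched in a few lines of Python. This toy example runs entirely over the loopback interface; packet 2 is deliberately never sent, standing in for a packet lost on the network, and the receiver notes the gap and moves on rather than requesting a retransmit:

```python
# Toy illustration of UDP's "skip and carry on" behavior, using two sockets
# on the loopback interface. Packet 2 is deliberately withheld to simulate
# network loss; the receiver detects the gap and simply continues.
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))            # let the OS pick a free port
addr = receiver.getsockname()
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

for seq in [0, 1, 3, 4]:                   # packet 2 is "lost"
    sender.sendto(seq.to_bytes(4, "big") + b"frame-data", addr)

received, expected = [], 0
receiver.settimeout(1.0)
for _ in range(4):
    packet, _ = receiver.recvfrom(2048)
    seq = int.from_bytes(packet[:4], "big")
    if seq != expected:
        print(f"gap detected: skipping packet {expected}")  # no retransmit
    received.append(seq)
    expected = seq + 1

print("played out:", received)
sender.close(); receiver.close()
```

A real player would render what it has and let the codec's error concealment cover the hole; the key point is that the time base never stalls.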

If this error happens under TCP, it can take upward of 3 seconds to renegotiate the sequence and restart from the missing point, discarding all the subsequent data, which must be requeued to be sent again. Just one lost packet can cause an entire "window" of TCP data to be re-sent. That can be a considerable amount of data, particularly when the link is what is known as a Long Fat Network (LFN or eLeFaNt; it's true -- Google it!).

All this adds overhead to the network and to the operations of both computers using that link, as the CPU and network card's processing units have to manage all the retransmission and sync between the applications and these components.

For this reason HTTP (which is always a TCP transfer) generally introduces startup delays and playback latency, as the media players need to buffer more than 3 seconds of playback to manage any lost packets.

Indeed, TCP is very sensitive to something called window size, and knowing that very few of you will ever have adjusted the window size of your contribution feeds as you set up for your live Flash Streaming encode, I can estimate that all but those same very few have been wasting available capacity in your network links. You may not care. The links you use are good enough to do whatever it is you are trying to do.
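Window size matters because TCP can only have one window's worth of unacknowledged data in flight per round trip. A quick, purely illustrative calculation (the 100Mbps link and 80ms round trip are assumed figures, not measurements) shows how an untuned window caps throughput on a long fat network:

```python
# Bandwidth-delay product: why an untuned TCP window wastes capacity on a
# "long fat network". All figures here are illustrative assumptions.

link_bps = 100e6             # 100 Mbps path
rtt_s = 0.08                 # 80 ms round trip (e.g., transatlantic)
default_window = 64 * 1024   # a classic 64 KB default window, in bytes

# The window needed to keep the pipe full is the bandwidth-delay product:
bdp_bytes = link_bps / 8 * rtt_s
print(f"bandwidth-delay product: {bdp_bytes / 1024:.0f} KB")

# Throughput ceiling imposed by the default window (one window per RTT):
max_throughput_bps = default_window * 8 / rtt_s
print(f"ceiling with 64 KB window: {max_throughput_bps / 1e6:.2f} Mbps")
```

On this assumed link, a 64KB window caps you at about 6.6Mbps of a 100Mbps pipe -- over 90% of the capacity sits idle until the window is raised toward the bandwidth-delay product.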

In today's disposable culture of "use and discard" and "don't fix and reuse," it's no surprise that most streaming engineers just shrug and assume that the ability to get more bang for your buck out of your internet connection is beyond your control.

For example, did you know that if you set your Maximum Transmission Unit (MTU) -- ultimately your video packet size -- too large then the network has to break it in two in a process called fragmentation? Packet fragmentation has a negative impact on network performance for several reasons. First, a router has to perform the fragmentation -- an expensive operation. Second, all the routers in the path between the router performing the fragmentation and the destination have to carry additional packets with the requisite additional headers.

Also, larger packets increase the amount of data that must be re-sent in the event of a retransmission.

Alternatively, if you set the MTU too small then the amount of data you can transfer in any one packet is reduced and relatively increases the amount of signaling overhead (the data about the sending of the data, equivalent to the addresses and parcel tracking services in real post). If you set the MTU as small as you can for an Ethernet connection, you could find that the overhead nears 50% of all traffic.
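The overhead argument is easy to quantify. This small sketch counts only the IPv4 and UDP headers (20 + 8 bytes) and ignores Ethernet framing, so real-world overhead is somewhat higher than shown:

```python
# Per-packet header overhead as a function of MTU, assuming plain IPv4 + UDP
# (20 + 8 header bytes) and ignoring Ethernet framing for simplicity.

IP_HEADER = 20
UDP_HEADER = 8

def overhead_fraction(mtu: int) -> float:
    """Fraction of each packet consumed by IP + UDP headers."""
    return (IP_HEADER + UDP_HEADER) / mtu

for mtu in (1500, 576, 68):   # typical Ethernet, classic minimum, tiny MTU
    print(f"MTU {mtu:4}: {overhead_fraction(mtu):.1%} of each packet is headers")
```

At a typical 1,500-byte Ethernet MTU the headers cost under 2%, but at the 68-byte IPv4 minimum they consume over 40% before link-layer framing is even counted.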

UDP offers some advantages over TCP. But UDP is not a panacea for all video transmissions.

Where you are trying to do large-video file transfer, UDP should be a great help, but its lossy nature is rarely acceptable for stages in the workflow that require absolute file integrity. Imagine studios transferring master encodes to LOVEFiLM or Netflix for distribution. If that transfer to the LOVEFiLM or Netflix playout lost packets, then every single subscriber of those services would have to accept the degraded master copy as the best possible copy. In fact, if UDP were used in these back-end workflows, the content would degrade the user's experience in much the way that tape-to-tape dubbing and other analog replication processes historically did. Digital media would lose the perfect-replica quality that has been central to its success.

Getting back to the focus on who may want to reduce their network capacity inefficiencies: Studios, playouts, news desks, broadcast centers, and editing suites all want their video content intact/lossless, but naturally they want to manipulate that data between machines as fast as possible. Having video editors drinking coffee while videos transfer from one place to another is inefficient (even if the coffee is good).

Given that they cannot operate in a lossy way, are these production facilities stuck with TCP and all the inefficiencies inherent in reliable transfer? Because TCP ensures all the data gets from point to point, it is called a "reliable" protocol. In UDP's case, that reliability is "left to the user," so UDP in its native form is known as an "unreliable" protocol.

The good news is that there are indeed options out there in the form of a variety of "reliable UDP" protocols, and we'll be looking at those in the rest of this article. One thing worth noting at the outset, though, is that if you want to optimize links in your workflow, you can either do it the little-bit-hard way and pay very little, or you can do it the easy way and pay a considerable amount to have a solution fitted for you.

Reliable UDP transports can offer the ideal situation for enterprise workflows -- one that has the benefit of high-capacity throughput, minimal overhead, and the highest possible "goodput" (a rarely used but useful term that refers to the part of the throughput that you can actually use for your application's data, excluding other overheads such as signaling). In the Internet Engineering Task Force (IETF) world, from which the IP standards arise, for nearly 30 years there has been considerable work in developing reliable data transfer protocols. RFC-908, dating from way back in 1984, is a good example.

Essentially, RDP (Reliable Data Protocol) was proposed as a transport layer protocol, positioned in the stack as a peer to UDP and TCP. It was published as an RFC (Request for Comments) but never matured into a standard in its own right. Indeed, RDP appears to have been eclipsed in the late 1990s by the Reliable UDP Protocol (RUDP), and both Cisco and Microsoft have released RUDP versions of their own within their stacks for specific tasks. Probably because of the task-specific nature of these implementations, though, RUDP hasn't become a formal standard either, never progressing beyond "draft" status.

One way to think about how RUDP-type transports work is to use a basic model in which all the data is sent in UDP format and each missing packet is indexed. Once the main body of the transfer is done, the recipient sends the sender the index list, and the sender resends only the packets on that list. Because it avoids retransmitting whole windows of already-sent data that happen to follow a missed packet, this simple model is much more efficient than TCP. However, it couldn't work for live data, and even for archives a protocol must be agreed upon for sending the index and for responding to the re-request in a structured way (a naive responder could, for example, generate a lot of random-seek disk access).
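That basic model is simple enough to simulate. The following toy sketch is an in-memory simulation, not a real network protocol -- the 20% loss rate and 100-chunk transfer are arbitrary illustrative figures -- but it shows the blast-then-NACK loop resending only the indexed gaps on each pass:

```python
# In-memory sketch of the NACK-list model: blast all numbered packets over a
# simulated lossy link, have the receiver report the index of missing
# packets, and resend only those until the transfer is complete.
import random

random.seed(7)  # deterministic "loss" for repeatability

def lossy_send(packets, loss_rate=0.2):
    """Simulate an unreliable link: each packet independently survives or not."""
    return {seq: data for seq, data in packets.items() if random.random() > loss_rate}

source = {seq: f"chunk-{seq}".encode() for seq in range(100)}
received = {}
rounds = 0

to_send = source
while True:
    rounds += 1
    received.update(lossy_send(to_send))
    missing = [seq for seq in source if seq not in received]  # the NACK index
    if not missing:
        break
    to_send = {seq: source[seq] for seq in missing}  # resend only the gaps

print(f"complete after {rounds} rounds; only missing packets were re-sent")
assert received == source
```

Note how nothing that arrived intact is ever sent twice, which is exactly the saving over TCP's go-back behavior.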

There are many reasons the major vendor implementations are task-specific. For example, if one uses UDP to avoid TCP's retransmission-after-error behavior, but the entire data set must still be faultlessly delivered to the application, then one needs to actually understand the application.

If the application requires control data to be sent, it is important for the application to have all the data required to make that decision at any point. If the RUDP system (for example) only looked for and re-requested all the missing packets every 5 minutes (!) then the logical operations that lacked the data could be held up waiting for that re-request to complete. This could break the key function of the application if the control decision needed to be made sooner than within 5 minutes.

On the other hand, if the data is a large archive of videos being sent overnight for precaching at CDN edges, then it may be that the retransmission requests could be managed during the morning. So the retransmission could be delayed until the entire archive has been sent, following up with just the missing packets on a few iterations until all the data is delivered. So the flow, in this case, has to have some user-determined and application-specific control.

TCP is easy because it works in all cases, but it is less efficient because of that. On the other hand, UDP either needs its applications to be resilient to loss or the application developer needs to write in a system for ensuring that missing/corrupted packets are retransmitted. And such systems are in effect proprietary RUDP protocols.

There is an abundance of these, both free and open source, and I am going to look at several of each option (Table 1). Most of you who use existing streaming servers will be tied to the streaming protocols that your chosen vendor offers in its application. However, for those of you developing your own streaming applications, or bespoke aspects of workflows yourselves, this list should be a good start to some of the protocols you could consider. It will also be useful for those of you who are currently using FTP for nonlinear workflows, since the swap-out is likely to be relatively straightforward, given that most nonlinear systems do not have the same stage-to-stage interdependence that linear or live streaming infrastructures do.

Let's zip (and I do mean zip) through this list. Note that it is not meant to be a comprehensive selection but purely a sampler.

The first ones to explore, in my mind, are UDP-Lite and the Datagram Congestion Control Protocol (DCCP). These two have essentially become IETF standards, which means that inter-vendor operation is possible (so you won't get locked into a particular vendor).

Table 1: A Selection of Reliable UDP Transports

Let's look at DCCP first. Initial code implementations of DCCP are available for those so inclined. From the point of view of a broadcast video engineer, this is really deeply technical stuff for low-level software coders. However, if you happen to be (or simply have access to) an engineer of this skill level, then DCCP is freely available.

DCCP is a protocol worth considering if you are using shared network infrastructure (as opposed to private or leased-line connectivity) and want to ensure you get as much throughput as UDP can enable while also ensuring that you "play fair" with other users. It is worth noting that "just turning on UDP" and filling the wire with UDP data, with no consideration for any other user on the wire, can saturate the link and effectively make it unusable for others. That is congestion; DCCP, by contrast, fills the pipe as much as possible while still inherently enabling other users to use the wire too.

Some of the key DCCP features include the following:
  • Connection setup, teardown, and acknowledgments layered on top of UDP-style datagrams (note that the datagrams themselves remain unreliable; DCCP adds control, not guaranteed delivery)
  • Discovery of the right MTU size as part of the protocol design (so you fill the pipe while avoiding fragmentation)
  • Congestion control
Indeed, to quote the RFC: "DCCP is intended for applications such as streaming media that can benefit from control over the tradeoffs between delay and reliable in-order delivery."

The next of these protocols is UDP-Lite. Also an IETF standard, this nearly-identical-to-UDP protocol differs in one key way: it carries a checksum (a number computed from the data which, if it differs after a transfer, indicates that the data is corrupt) together with a coverage field specifying how much of the datagram that checksum protects. Vanilla UDP, by contrast, has just a single checksum over the whole datagram -- optional in IPv4, mandatory in IPv6 -- covering the entire payload when present.

Let's simplify that a little: What this means is that in UDP-Lite you can define part of the UDP datagram as something that must arrive with "integrity," i.e., a part that must be error-free. But another part of the datagram, for example the much bigger payload of video data itself, can contain errors (remain unchecked against a checksum) since it could be assumed that the application (for example, the H.264 codec) has error handling or tolerance in it.
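The partial-coverage idea can be illustrated with the standard Internet checksum (RFC 1071), which UDP family protocols use. In this sketch, the 8-byte "header", coverage value, and datagram contents are all arbitrary choices for illustration: corruption in the unprotected payload goes unnoticed by design, while corruption in the covered region is caught:

```python
# Illustration of UDP-Lite-style partial checksum coverage using the
# Internet checksum (RFC 1071). Only the first `coverage` bytes are
# protected; bit errors beyond that range go undetected by design.

def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement sum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length data
    total = sum(int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    while total >> 16:                        # fold end-around carries
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

coverage = 8                                  # protect only the 8-byte "header"
datagram = b"SEQ=0042" + b"\x00" * 64         # header + (dummy) video payload
original = internet_checksum(datagram[:coverage])

corrupted = bytearray(datagram)
corrupted[40] ^= 0xFF                         # flip bits deep in the payload
assert internet_checksum(bytes(corrupted)[:coverage]) == original  # undetected, by design

corrupted[2] ^= 0xFF                          # now damage the covered header region
assert internet_checksum(bytes(corrupted)[:coverage]) != original  # detected
print("payload corruption tolerated; header corruption caught")
```

The codec absorbs the payload damage, while the sequence numbering that keeps the stream coherent stays verified.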

This UDP-Lite method is very pragmatic. On a noisy network link, the video data, which makes up the larger part of the payload, may be subject to errors, while the important sequence numbers occupy only a small part of the data and are statistically less prone to errors. If the covered portion fails its check, the application can request a resend of that packet. Note that it is up to the application to request the resend; the UDP-Lite protocol simply flags up the failure, and the software can prioritize a resend request or simply plan to work around a "discard" of the failed data. It is also worth noting that most underlying link layer protocols, such as Ethernet or similar MAC-based systems, may discard damaged frames of data anyway unless something interfaces with those link layer devices. So to work reliably, UDP-Lite needs to interface with the network drivers to "override" these frame discards. This adds complexity to the deployment strategy and most likely takes away the opportunity for it to be "free." However, it's fundamentally possible.

So I wanted to see what was available "ready to use" for free, or close to free at least. I went looking for a compiled, user-friendly, simple-to-use application with a user-friendly GUI, thinking of the videographers having to learn all this code and deep packet stuff just to upload a video to the office.

While it's not really a protocol per se, I found UDPXfer, a really simple application with just a UDP "send" and "listener" mode for file transfer.

I set up the software on my laptop and a machine in Amazon EC2, fiddled with the firewall, and sent a file. I got very excited about the prompt 5MB UDP file transfer taking 2 minutes and 27 seconds, and I then set up an FTP of the same file over the same link but was disappointed to find that the FTP took only 1 minute and 50 seconds -- considerably faster. When I looked deeper, however, the UDPXfer sender had a "packets per second" slider. I nudged the slider to its highest setting, but that still amounted to only about 100Kbps, far slower than the effective TCP rate. So I wrote to the developer, Richard Stanway, about this ceiling. He sent a new version that allowed me to set a transmission rate of 1,300 packets per second. He commented that this would saturate the IP link from me to the server, and that in a shared network environment a better approach would be to tune the TCP window size to implement some congestion control. His software was actually geared to resiliency over noisy network links that cause problems for TCP.

Given that I see this technology being used on private wires, the effective saturation that Stanway was concerned about was less of a concern for my enterprise video workflow tests, so I decided to give the new version a try. As expected, I managed to bring the transfer time down to 1 minute and 7 seconds. So while the software I was using is not on general release, it is clearly possible to implement simple software-only UDP transfer applications that can balance reliability with speed to find a maximum goodput.

Commercial Solutions
But what of the commercial vendors? Do they differentiate significantly enough from "free" to cause me to reach into my pocket?

I caught up with Aspera, Inc. and Motama GmbH, and I also reached out to ZiXi. All of this software is complex to procure at the best of times, so sadly I haven't had a chance to play practically with these. Also, the vendors do not publish rate cards, so it's difficult to comment on their pricing and value proposition.

Aspera co-presented at a recent Amazon conference with my company, and we had an opportunity to dig into its technology model a bit. Aspera is indeed essentially providing variations on the RUDP theme. It provides protocols and applications that sit on top of those protocols to enable fast file distribution over controlled network links. In Aspera's case, it was selling in behind Amazon Web Services Direct Connect to offer optimal upload speeds. It has a range of similar arrangements in place targeting enterprises that handle high volumes of latency-sensitive data. You can license the software or, through the Amazon model, pay for the service by the hour as a premium AWS service. This is a nice flexible option for occasional users.

I had a very interesting chat with the CEO of Motama, which takes a very appliance-based approach to its products. Its RUDP-like protocol (called RelayCaster Streaming Protocol, or RCSP) is used internally by the company's appliances to move live video from the TVCaster origination appliances to RelayCaster devices. These can then be set up hierarchically in a traditional hub-and-spoke arrangement or potentially other, more complex topologies. The software is available (under license) to run on server platforms of your choice, which is good for data center models. Motama has also recently started licensing the protocol to a wider range of client devices, and prides itself on being available for set-top boxes.

The last player in the sector I wanted to note was ZiXi. While I briefly spoke with ZiXi representatives while writing this, I didn't manage to communicate properly before my deadline, so here is what I know from the company's literature and a few customer comments: ZiXi offers a platform that optimizes video transfer for OTT, internet, and mobile applications. The platform offers a richer range of features than just UDP-optimized streaming; it has P2P negotiation and transmuxing, so you can flip your video from standards such as RTMP out to MPEG-TS, as you can with servers such as Wowza. Internally, within its own ecosystem, the company uses its own hybrid ZiXi protocol, which includes features such as forward error correction. The application-layer software is packaged in a product called Broadcaster, which looks like a server with several common muxes (RTMP, HLS, etc.) and includes ZiXi. If you have an encoder with ZiXi running, you can contribute directly to the server using the company's RUDP-type transport.

Worth the Cost?
I am aware that none of these companies licenses its software trivially. The software packages are their core intellectual property, and defending them is vital to the companies' success. I also realize that some of the problems they purport to address may "go away" when you deploy their technology, but in all honesty, that may be a little like replacing the engine of your car because a spark plug is misfiring.

I am left wondering how customers can weigh the productivity gains of accelerating their workflows with these techniques (free or commercial) against the cost of a private connection plus either the development time to implement one of the open/free standards or the cost of buying a supported solution.

The pricing indication I have from a few undisclosed sources is that you should expect to spend a few thousand on a commercial vendor's licensing, and then more for applications, appliances, and support. This can quickly rise to a significant number.

For this increased cost to pay off, your workflow must operate at considerable scale, since I personally think that a little TCP window sizing, and perhaps paying for slightly "fatter" internet access, may resolve most problems -- particularly in archive transfer and so on -- and is unlikely to cost thousands.

However, at scale, where those optimizations start to make a significant productivity difference, it clearly makes a lot of sense to engage with a commercially supported provider to see if its offering can help.

At the end of the day, regardless of the fact that with a good developer you can do most things for free, there are important drivers in large businesses that will force an operator to choose to pay for a supported, tested, and robust option. For many of the same reasons, Red Hat Linux was a premium product, despite Linux itself being free.

I urge you to explore this space. To misquote James Brown: "Get on the goodput!"

By Dom Robinson, streamingMedia