How to Create ABR Content with FFmpeg in One Pass
A curation about new media technologies
If you need to deliver to mobile devices and via OTT platforms, you need to deliver HTTP Live Streaming (HLS). Apple provides plenty of advice for compressionists, but here are some tips and tricks for encoding and testing your HLS files.
By Jan Ozer, StreamingMedia.com
Until recently, simulcast streaming to connected devices was performed using protocols like RTMP, RTSP and MMS. In 2009, when the iPhone 3GS was launched, iOS 3.0 included a new streaming protocol called HTTP Live Streaming (HLS), part of a new class of video delivery protocol.
HLS differed from its predecessors by relying only upon HTTP to carry video and flow control data to the device. It made the protocol far more firewall-friendly and easier to scale, as it required no specialist streaming server technology distributed throughout the Internet to deliver streams to end users. The regular HTTP caching proxies that serve as the backbone of all Content Delivery Networks (CDNs) would suffice.
Apple was not alone in making this paradigm switch. Microsoft and Adobe also introduced their own protocols — SmoothStreaming and HDS, respectively. Today, work is ongoing to standardize these approaches into a single unified protocol, under a framework known as MPEG-DASH.
What is significant about all these is that they separate the control aspects of the protocol from the video data. They share the general concept that video data is encoded into chunks and placed onto an origin server or a CDN. To start a streaming session, client devices load a manifest file from that server that tells them what chunks to load and in what order. The infrastructure that serves the manifest can be completely separate from the infrastructure that serves the chunks.
The separation of these concerns provides a basis for dynamic content replacement, as it is possible to dynamically manipulate the manifest file to point the client device at an alternative sequence of video chunks that have been pre-encoded and placed on the CDN. The ability to swap chunks out in this way relies on the encoding workflow generating video chunks whose boundaries match possible replacement events.
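As a rough illustration of the manifest manipulation described above, the sketch below swaps a run of chunk entries for pre-encoded replacements. The manifest is modeled as a plain list of chunk URLs. The function name, the example.com URLs and the list representation are assumptions made for illustration and do not correspond to any particular protocol's manifest format.

```python
from typing import List


def replace_chunks(manifest: List[str], start: int, count: int,
                   replacement: List[str]) -> List[str]:
    """Return a new manifest with `count` chunks from `start` swapped out."""
    if start < 0 or count < 0 or start + count > len(manifest):
        raise ValueError("replacement window falls outside the manifest")
    # Chunk boundaries are assumed to already line up with the replacement
    # event, so whole entries can be swapped without touching video data.
    return manifest[:start] + replacement + manifest[start + count:]


# Hypothetical chunk URLs, for illustration only.
original = [f"https://cdn.example.com/show/chunk{i}.ts" for i in range(6)]
alt_ad = ["https://cdn.example.com/ads/alt0.ts",
          "https://cdn.example.com/ads/alt1.ts"]
personalized = replace_chunks(original, start=2, count=2, replacement=alt_ad)
```

Only the manifest changes here; the chunks themselves stay on the CDN untouched, which is why the manifest service can be kept separate from chunk delivery.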
Stream Conditioning and ESAM
Multi-screen encoding workflows must deal with encoding the video, as well as packaging it for delivery into the protocols required by devices. Stream conditioning for dynamic content replacement is about ensuring that the encoding workflow knows the events at which replacement could occur, and ensuring that the video is processed correctly at those points. It is important to emphasize that the replacement does not happen at this point: It is done closer to the end user.
When the encoder is informed about a splice point, it starts a new group of pictures (GOP), and when this GOP is encountered downstream by the packager, a new video chunk is created, as shown in Figure 1. Broadcasters should be wary of how their encoder and packager handle edge cases, such as when a splice point comes just before or after where a natural GOP and video chunk boundary would have been, so that extremely small video chunks and GOPs are avoided.
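One possible policy for the edge case mentioned above can be sketched as follows. It assumes times are whole seconds, that natural chunk boundaries fall at multiples of a nominal chunk length, and that a natural boundary is simply dropped when it lands too close to a splice point. The function and parameter names are hypothetical, and real encoders and packagers may behave differently (for example, by restarting the chunk cadence at the splice).

```python
def chunk_boundaries(duration, chunk_len, splice_points, min_len):
    """Return sorted boundary times, avoiding very short chunks near splices."""
    # Natural boundaries the packager would use with no splice points.
    natural = range(chunk_len, duration, chunk_len)
    # Splice points always become boundaries (a new GOP starts there).
    splices = sorted(p for p in splice_points if 0 < p < duration)
    # Drop any natural boundary closer than min_len to a splice point.
    kept = [b for b in natural
            if all(abs(b - s) >= min_len for s in splices)]
    return sorted(set(kept) | set(splices))


# A splice at 19 s suppresses the natural boundary at 20 s.
boundaries = chunk_boundaries(duration=40, chunk_len=10,
                              splice_points=[19], min_len=2)
```

This sketch does not handle two splice points that are themselves closer together than `min_len`; that case would need a rule of its own.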


Bridge Technologies has launched PocketProbe, an iPhone app that enables objective analysis of real network performance of streaming media, in a simple to use, easy to understand tool that technical staff can carry anywhere.
Available now from Apple’s App Store, PocketProbe extends both the existing capabilities of digital media monitoring systems built from Bridge Technologies hardware probes, and the monitoring software environment.
Already providing the most comprehensive end-to-end monitoring and analysis capability, with a range of fixed and portable probes, the system now extends right into the engineer’s pocket.
PocketProbe contains the same OTT Engine found in the company’s VB1, VB2 and 10G VB3 series digital media monitoring probes, enabling confidence validation and analysis of http variable bit-rate streams from any location.

PocketProbe is available in two versions: the free application can validate five HLS streams in round-robin mode, provide analysis and manifest consistency alarms, play back media in the various profile bit-rates, and graphically display the actual chunk download patterns and bit-rates.
The full version also offers the ability to validate HDS and SmoothStream manifest files and store twenty-five streams with all profiles.
PocketProbe is easy to use, with a fully automatic set up: once the stream URL is input, the app finds all related profiles and validates the consistency.
Since the PocketProbe uses exactly the same metric as in the hardware probes, the PocketProbe can be used by service engineers and operational staff to test real world behaviors post-cloud with various operators.
Accurate status of bit-rates used and profile changes is displayed in realtime, giving instant understanding of provider delivery capability. Together with hardware probes used pre-cloud, the post-cloud location of the PocketProbe enables excellent correlative understanding of CDN and provider abilities.
Source: TVB Europe
Have you ever played with the settings of a YouTube video to make it look better? YouTube Mobile and TV engineering head Andy Berkheimer would like you to stop doing that.
Berkheimer headed a project last year that brought adaptive bitrate streaming to the YouTube desktop player, enabling the player to automatically switch between different video quality settings based on your internet connection speed, among other factors.
Now he is bringing the same technology to mobile devices and TVs. “We are making it work just as it should,” Berkheimer told me during an interview this week.
From 240p to 4K
That may sound simple, but optimizing video playback has been a long journey for the Google-owned video site. Berkheimer joined YouTube six years ago, when there was just one default video quality — 320×240, also known as 240p. “That was really, really grainy video,” recalled Berkheimer.
His team used Google’s cloud infrastructure to allow for additional codecs, bringing HD and eventually even 4k to the site. But with higher bitrates, buffering also became more of a problem.
The solution? Adaptive bitrate streaming, which is industry-speak for switching the quality of a video in midstream, without the need to re-buffer and start over. YouTube started switching from progressive downloads to adaptive bitrate streaming in its desktop player a year ago, and completed the process late last year.
The new player keeps a close eye on the speed and health of your internet connection, explained Berkheimer: “It’s continuously monitoring the bandwidth and the throughput it is seeing,” he said, adding that it also keeps tabs on the size of your player.
Are you watching a video in full screen? Then you can expect YouTube to send you more bits, as long as your connection is fast enough.
YouTube’s Take on Adaptive Streaming
Adaptive streaming isn’t new: Companies like Netflix and Hulu have used the technology for some time to optimize their streaming experience. But YouTube had some unique challenges to solve when it rolled out its own implementation.
For example, Netflix often starts with a lower-bitrate stream and then slowly scales up, which is why it can take a minute or so before full HD quality sets in.
That approach doesn’t really work for YouTube videos that only last a minute or two. YouTube tends to be more aggressive in sending out higher-quality video, and then scales down the video if necessary, Berkheimer explained. The site also makes use of the fact that you often watch more than one YouTube video in a row, and optimizes your bit rate across an entire session.
The results of these efforts have been encouraging. YouTube has seen buffering reduced by 20 percent since it launched adaptive streaming for its desktop player. That’s why the company is now taking the technology to TVs and mobile devices.
Next Up: Mobile and TVs
Of course, TVs require a lot more HD video, and buffering becomes even more obvious when you compare it to the nonstop experience of a traditional broadcast. Berkheimer told me that YouTube is working with the majority of the TV industry to bring adaptive streaming to TV sets, and that virtually all models introduced at CES this year already support the technology. The company is also working to bring adaptive streaming of YouTube videos to game consoles.
Mobile, on the other hand, comes with different challenges, as people move in and out of the reach of cell towers while they get their video fix on public transport.
And then there is this: “One of the biggest challenges we have is the global nature of YouTube,” said Berkheimer. Average mobile internet speeds are much slower in India and Brazil than in the U.S. and Europe, but videos still have to play without long and tiresome buffering. Broadband in Canada on the other hand is fast, but tightly rationed, with major ISPs charging their customers extra if they go over their caps.
That’s also one reason that those settings that allow you to manually change the bitrate of a YouTube video haven’t disappeared from the player yet — even though Berkheimer would very much like them gone. He told me that there have been some passionate discussions within the company about these manual settings. The result? For now, they’re staying.
But Berkheimer and his team are still working hard so that you can completely ignore them. “The most rewarding thing is that users don’t have to think about it,” he said.
By Janko Roettgers, GigaOM
Microsoft today announced that it is launching a preview version of a Smooth Streaming plugin for the Open Source Media Framework (OSMF) player. Developers can use Smooth Streaming capabilities in any OSMF-compliant player, as well as Adobe's own Strobe player.
"We are pleased to announce that Windows Azure Media Services team released a preview of Microsoft Smooth Streaming plugin for OSMF," wrote Cenk Dingiloglu, a program manager on the Windows Azure Media Services team, in a Microsoft IIS blog posting. He also provided a link, for developers who want to integrate the plugin, to a set of documents and licensing requirements.
In a series of meetings last Thursday on the Microsoft campus in Redmond, Washington, the Windows Azure Media Services team laid out their strategy on a number of fronts, including the extension of Smooth Streaming client software development kits (SDKs) to embedded devices, iOS devices, and player frameworks.
During one of those Microsoft-sponsored meetings, hosted by Microsoft senior technical evangelist Alex Zambelli, Dingiloglu and Mike Downey discussed the most recent addition of OSMF support, noting that Smooth Streaming shares similarities when it comes to codecs and the use of the fragmented MP4 file.
"Support for the same audio and video codecs, H.264 and AAC, respectively," said Dingiloglu, "provides the opportunity to use fMP4, leveraging the best of both the OSMF framework and the Smooth Streaming Client SDK."
The Smooth Streaming plugin will provide some key features of Smooth Streaming, such as on-demand functionality (play, pause, seek, stop), but will also use OSMF built-in API hooks to support two key features: multiple audio language switching and maximum playback quality selection.
OSMF supports late binding, based on its use of fMP4, allowing multiple languages to be accessible to the end user without requiring all possible languages' audio tracks to be multiplexed together into a single Transport Stream, the way that iOS devices require.
Support for OSMF and the Strobe player also gives Microsoft a way onto the Android OS platform, making it possible for Smooth Streaming content to reach Android-powered smartphones and tablets.
"You can build rich media experiences for Adobe Flash Player endpoints using the same back-end infrastructure you use today to target Smooth Streaming playback to other devices like Win8 store apps, browser and so on," Dingiloglu wrote in the IIS blog post.
Microsoft isn't claiming the new OSMF plugin is ready for prime time quite yet, but I was able to see a working version of Smooth Streaming within an OSMF player during last week's visit.
In fact, one of the more impressive demonstrations was that of a playlist/manifest file that contained both Adobe .f4v files and Microsoft .ism files. The OSMF player seamlessly switched between the two fMP4 file formats, allowing content owners to intermix content from either format for playback.
"As this is a preview release, you're likely to hit issues, have feature requests, or want to provide general feedback," wrote Dingiloglu. "We want to hear it all! Please use the Smooth Streaming plugin for OSMF forum thread to let us know what's working, what isn't, and how we can improve your Smooth Streaming development experience for OSMF applications."
All of this raises the question around Smooth Streaming as it relates to MPEG DASH, the ratified dynamic adaptive streaming standard. Like Adobe, which noted it will continue to develop its own HTTP Dynamic Streaming (HDS) flavor of HTTP-delivered adaptive bitrate streaming, Microsoft sees a benefit in continuing to push the envelope with Smooth Streaming.
The company made it clear that it fully supports DASH, and yet it sees Smooth Streaming as a test bed in which it can continue to innovate for major events like the Olympic Games, which served as a catalyst - over the past three Games - for a number of innovations that now find their way into both Windows Azure Media Services and DASH.
The Smooth Streaming plugin requires browsers supporting Flash Player 10.2 or higher and also requires OSMF 2.0. Microsoft provides licensing details for the Smooth Streaming plugin for interested developers.
By Tim Siglin, StreamingMedia
Today’s media landscape is radically more diverse than just a few years ago. The delivery of consistently acceptable image and sound quality is taken for granted by viewers, despite uncertain or fluctuating bandwidth. Adaptive-Bit-Rate (ABR) streaming technology makes this possible.
What is ABR Streaming?
ABR streaming is a delivery technology designed to provide consistent, high-quality viewing in situations where bandwidth may fluctuate, and where viewers may be on a wide range of devices.
Prior to ABR streaming, Web or mobile video delivery was typically done by encoding a single downloadable file or stream at a fixed bit rate and frame size. Viewers could buffer some of the video, and then simultaneously download and play it back. This delivery model was similar to cable transmission, where a single bit rate is transmitted over a reliable medium.
Unfortunately, transmission mediums for Web and mobile devices are unreliable, and bandwidths vary. During fixed-rate video playback, viewers with low bandwidth suffer from excessive buffering (delaying playback). To compensate, providers have tended to encode at lower bit rates, punishing viewers with high bandwidth. Even then, any fluctuations in bandwidth can cause buffering delays.
To solve this problem, ABR streaming content is encoded into multiple layers, each potentially a different bit rate, frame size and/or frame rate. These layers are combined into a single package that represents the original content. ABR players switch between layers depending upon the device and available bandwidth, to ensure consistent high-quality playback.
For example, a single ABR package might include six layers, each encoded at progressively higher bit rates. As a viewer watches content on his/her mobile phone during a train ride, the player will adaptively switch between low bit rates and high bit rates, depending upon the connectivity of the device.
How Does it Work?
Most ABR streaming technologies use standard Web protocol (HTTP delivery) to send video. This offers advantages over specialized streaming protocols such as RTSP or RTP, as HTTP-based delivery works immediately on Internet networks and can take advantage of edge technologies designed to cache HTTP requests.
During playback, video and audio are delivered via HTTP in small fragments, each representing some small amount of video, typically between 2 and 10 seconds in length. Each content package includes multiple layers, and each layer may include many fragments. For example, an hour-long movie may have 12 layers, each with a thousand fragments. The player is provided with a package manifest file outlining which layers are available and the location of the fragments for each layer.
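The numbers in the example above can be checked with a few lines of arithmetic. The variable names are illustrative only.

```python
# Figures taken from the example: a one-hour movie, 12 layers,
# and 1,000 fragments per layer.
movie_seconds = 60 * 60
layers = 12
fragments_per_layer = 1000

# Duration of a single fragment, in seconds.
fragment_seconds = movie_seconds / fragments_per_layer
# Number of fragments the whole package contains across all layers.
total_fragments = layers * fragments_per_layer
```

Each fragment works out to 3.6 seconds, which sits inside the typical 2-to-10-second range, and the package as a whole holds 12,000 fragments.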
During playback, the player requests and downloads a fragment from a layer. While the fragment is played, the connection speed is monitored, and the player may opt to switch layers, either increasing or decreasing the video bit rate based upon the connection speed. Players may also choose layers with different frame sizes or frame rates to optimize the visual experience for the device. This adaptive behavior is what ensures consistent playback regardless of connection speed or device.
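The switching decision described above might look something like the following minimal sketch. It assumes the player has a recent throughput measurement in kilobits per second and picks the highest layer that fits within a safety margin. The bit rate ladder, the 0.8 safety factor and the function name are assumptions for illustration; actual players use their own, often more elaborate, heuristics.

```python
def pick_layer(layer_bitrates_kbps, measured_kbps, safety=0.8):
    """Pick the highest layer bit rate that fits the measured throughput."""
    # Leave headroom so small bandwidth dips do not immediately stall playback.
    budget = measured_kbps * safety
    affordable = [b for b in sorted(layer_bitrates_kbps) if b <= budget]
    # Fall back to the lowest layer when nothing fits the budget.
    return affordable[-1] if affordable else min(layer_bitrates_kbps)


# Hypothetical six-layer package, bit rates in kbps.
ladder = [200, 400, 800, 1500, 3000, 6000]
choice_on_good_link = pick_layer(ladder, measured_kbps=2500)
choice_on_poor_link = pick_layer(ladder, measured_kbps=150)
```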
There are several different ABR streaming technologies available: Apple HTTP Live Streaming (HLS), Adobe HTTP Dynamic Streaming (HDS), Microsoft Smooth Streaming (MSS), and more recently MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH). Each technology requires a complete ecosystem. The content must be prepared correctly, and the correct player must be used. All of the technologies work fundamentally in the same manner, using HTTP for content delivery in fragments.
Where these technologies differ is largely related to the structure of the underlying packages. For example, HLS for older versions of iOS requires a separate file for each video fragment. In contrast, most other packages store fragments for a layer in a single file, allowing the player to download fragments using HTTP byte range requests, which download a small part of a larger file.
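To make the byte range idea concrete, here is a small sketch that builds the HTTP `Range` header for one fragment stored inside a larger file. It assumes the player already knows each fragment's byte offset and length from some index; the index values and function name below are made up for illustration.

```python
def range_header(offset, length):
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Hypothetical fragment index for one layer: (byte offset, byte length).
fragment_index = [(0, 1000), (1000, 1500), (2500, 1200)]
offset, length = fragment_index[1]
headers = range_header(offset, length)
```

A server that supports range requests would normally answer such a request with status 206 (Partial Content) and only the requested bytes.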
Other differences in ABR technology relate to the viewer experience. Apple HLS, for example, provides for a dedicated key frame layer, allowing users to scrub through the video quickly. Other packages allow an audio-only stream with a poster image for extreme low-bit-rate situations.
Preparing Content
Preparing ABR content takes several steps. First, the desired packaging and layer structures need to be identified. Next, content must be encoded, checked for quality, packaged, encrypted and delivered.

Today's TV viewers want more content from an increasing number of sources, and that means that Internet delivery is a growing phenomenon. With hybrid technologies emerging, it is reasonable to expect that television broadcast will increasingly use the Internet to expand throughput beyond that afforded by a single RF channel. But there are limitations to the Internet that must be understood in order to capitalize on this commodity, and some of those constraints are being overcome by new technologies.
Streaming Can Now Provide a High Quality of Service
In general, Internet TV is a means to provide streamed video content to a PC, STB or Internet-connected TV, by means of an Internet connection. Internet Protocol Television, or IPTV, refers to a special case where a full-time TV subscriber connection is established by means of a dedicated line (and channel) to the telephone system central office. It is envisioned, however, that many Internet TV viewers will get their content through their Internet connection, and as such, receive OTT video service that shares bandwidth with other Internet traffic.
This sharing of bandwidth creates a QoS challenge for Internet TV service: While a terrestrial channel has a fixed bandwidth (i.e., 19.2Mb/s in the U.S.), an Internet TV service must share the bandwidth, both locally (e.g., within a viewer's household) as well as regionally (e.g., with other subscribers). This means the bandwidth available to a receiver can vary continuously over a wide range, and different subscribers may have different levels of guaranteed service, as well. Lowering the video bit rate to the least common denominator would result in poor video quality to everyone; to deal with this, several technologies are available.
Progressive Download vs. Streaming
The simplest way to deliver video over the Internet is to use progressive download, sometimes called “HTTP streaming.” This is simply a bulk download of a video file to the viewer's terminal (i.e., Internet-connected TV, STB, PC, etc.). A temporary copy of the file is stored on the user's device, typically on a hard drive, and playback can start after a sufficient amount of the file has been downloaded. This means that content will always incur a considerable delay before it is available to be viewed, which makes a live service rather difficult to implement. However, because the files are downloaded using TCP, there can be a nearly 100 percent assurance that every single bit was transferred correctly.
True streaming, on the other hand, opens up a handshaking connection between the server and client using a set of Internet protocols to deliver streams, such as Real Time Streaming Protocol (RTSP), Real Time Messaging Protocol (RTMP) and Microsoft Media Services (MMS). A streaming connection delivers a video stream with minimal buffering, allowing a nearly real-time presentation of the source content. In this respect, streaming has an advantage over progressive download, as continuous delivery is the goal, but the associated downside is that corrupted or missing packets are not detected. The consequence is that audio and video can have ongoing glitches when network congestion is experienced.
Adaptive Bit Rate Streaming
To solve the QoS issue, Adaptive Bit Rate (ABR) streaming has been developed. ABR allows each device to determine the quality of its connection and then use that metric to select the best-coded stream from a number of different quality streams. At the server end, a series of encoders encode a set of multiple streams at different bit rates, and these streams are then sliced up into segments or “chunks.” An ABR client in the viewer terminal detects the incoming stream bandwidth on the fly and uses this, along with a model based on the device's CPU capability, to select a segment among the various streams.
A special manifest file precedes the first segment, providing the client with a list of URLs from which each segment can be accessed. As each segment is received, the client progresses to the next segment in that stream, or it can jump to a parallel segment in one of the other streams if the channel bandwidth changes because of congestion, etc. In principle, a handful of streams will provide enough granularity so that the viewer does not detect a change in picture quality.
Note that ABR provides high transmission bandwidth efficiency when a unicast transmission (i.e., one-to-one) is used, but it can also work well with multicast and broadcast scenarios depending on how well the Internet infrastructure distributes bandwidth to users. ABR has the potential to deliver an audio/visual experience that we have come to expect from linear transmission: low delay, fast start time and a consistent experience across viewers.
Several manufacturers have developed different solutions for ABR streaming. Adobe HTTP Dynamic Streaming (HDS) uses a format called F4F to deliver Flash videos over RTMP and HTTP. Apple HTTP adaptive Live Streaming (HLS) was developed for the iPhone and iPad, and is implemented using HTTP, H.264 and MPEG-2 Transport Streams, with a manifest file called M3U8. Microsoft Internet Information Services (IIS) Smooth Streaming is used within Silverlight on Windows Phone 7 and incorporates fragmented MP4 (fMP4) encapsulation, again with H.264 for video compression.
With these different enterprise systems, an interoperability problem exists because of proprietary protocols and manifest structures. Multiple ABR systems mean that different devices must either pick and choose which systems to support, leading to service-constrained devices, or must include all at increased cost. This situation has motivated companies and experts around the world to propose a single, standard ABR system.
DASH-ing to the Rescue: a Universal ABR System
MPEG-DASH (Dynamic Adaptive Streaming over HTTP) is a newly standardized method for defining Stream Segments and Manifest Files for the purpose of ABR streaming. The specification (ISO-IEC 23009-1) defines a Media Presentation Description (MPD) that formalizes the stream manifest, which includes Segment timing, URLs and media characteristics such as video resolution and bit rates. While compatible Segments can contain any media data — with arbitrary compression — two types of containers are exemplified in the standard: MPEG-4 file format and MPEG-2 Transport Stream.
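To give a feel for what an MPD carries, the sketch below assembles a very small MPD-like document with Python's standard XML library. The element and attribute names reflect a common reading of the MPD schema, but the result is illustrative only and has not been validated against ISO/IEC 23009-1. The file names, bit rates and the `build_mpd` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"


def build_mpd(duration_s, representations):
    """Return an illustrative MPD string listing one video AdaptationSet."""
    mpd = ET.Element("MPD", {
        "xmlns": MPD_NS,
        "type": "static",
        "profiles": "urn:mpeg:dash:profile:isoff-on-demand:2011",
        "minBufferTime": "PT2S",
        "mediaPresentationDuration": f"PT{duration_s}S",
    })
    period = ET.SubElement(mpd, "Period")
    aset = ET.SubElement(period, "AdaptationSet", {"mimeType": "video/mp4"})
    for rep in representations:
        # Media characteristics: resolution and bit rate per Representation.
        node = ET.SubElement(aset, "Representation", {
            "id": rep["id"],
            "bandwidth": str(rep["bandwidth"]),
            "width": str(rep["width"]),
            "height": str(rep["height"]),
        })
        ET.SubElement(node, "BaseURL").text = rep["url"]
    return ET.tostring(mpd, encoding="unicode")


# Hypothetical two-rung ladder.
mpd_text = build_mpd(600, [
    {"id": "v0", "bandwidth": 800000, "width": 640, "height": 360,
     "url": "video_360p.mp4"},
    {"id": "v1", "bandwidth": 3000000, "width": 1280, "height": 720,
     "url": "video_720p.mp4"},
])
```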
Cisco Systems' Visual Networking Index (VNI) predicts that more than 50 percent of all global Internet traffic will be attributed to video by the end of 2012. It also confirms that, in addition to television screens, video delivery to cell phone and computer screens will be increasingly common. Globally, Internet video traffic is projected to be 58 percent of all consumer Internet traffic in 2015, up from 40 percent in 2010. At that time, three trillion minutes of video content are projected to cross the Internet each month, up from 664 billion in 2010, and 16 percent of consumer Internet video traffic in 2015 will be TV video. There is no doubt that if you are in the business of transmitting video, you will likely be using IP in the near future.
Delivering acceptable video quality over IP to TV viewers and other devices has led to a still-evolving delivery infrastructure. The required network scale has higher packet loss and error rates than smaller managed networks. Adaptive Bit Rate (ABR) delivery protocols like Apple's HLS and Microsoft's Silverlight, among others, help address these issues. These protocols use HTTP over TCP to mitigate data loss by dynamically adapting bit rates to adjust to networks that can provide only unpredictable instantaneous bandwidths.
Using a CDN to distribute the content to a range of servers located close to the viewers is another key feature to successful deployments to avoid the congestion and bottlenecks of centralized servers. Yet, despite more complex protocols to handle a range of transport issues, high-quality performance is not guaranteed. Cost-effective operations and a good viewer experience depend on good monitoring observability and targeted performance metrics for rapid problem identification, location and resolution.
ABR Protocols
ABR video delivery mechanisms over IP that enable this rapidly growing Internet video market are effective, but complex. Not only do they require the usual video compression encoders to achieve practical bit rates, but they also require a host of other devices and infrastructures, including segmenting servers, origin servers, a CDN and a last-mile delivery network.
ABR protocols help deliver a quality video experience to viewers by overcoming common IP data network performance issues such as packet arrival jitter, high loss rates, unpredictable bandwidth and security firewall issues. HTTP delivery solves most firewall issues as it is almost universally unblocked since it is also used for web browsing. HTTP, which uses TCP, assures loss-free payload delivery as well. While predictable instantaneous bandwidth levels are a challenge in unmanaged networks, by using variable encoding rates and these protocols, the viewer's client device can dynamically select the best stream bit rate for the instantaneously available bandwidth.
Apple's HTTP Live Streaming (HLS) is an example of a protocol that successfully navigates the challenges of unmanaged networks to transfer multimedia streams using HTTP. To play a stream, an HLS client first obtains the playlist file, which contains an ordered URI list of media files to be played. It then successively obtains each of the media files in the playlist. Each media file is, typically, a 10-second segment of the desired multimedia stream. A playlist file is simply a plain text file containing the locations of one or more media files that together make up the desired program.
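The playlist-then-segments flow described above can be illustrated with a small parser for a media playlist. The sample playlist and segment names are invented for illustration, and the parser handles only the `#EXTINF` tag and URI lines; it is a sketch, not a complete implementation of the playlist format.

```python
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.0,
segment0.ts
#EXTINF:10.0,
segment1.ts
#EXTINF:7.5,
segment2.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text):
    """Return (uri, duration_seconds) pairs in play order."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an extended M3U playlist")
    segments = []
    pending_duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>" precedes each media URI.
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif not line.startswith("#"):
            segments.append((line, pending_duration))
            pending_duration = None
    return segments


segments = parse_media_playlist(SAMPLE_PLAYLIST)
total_seconds = sum(duration for _, duration in segments)
```

A client would then fetch each URI in order, resolving relative names such as `segment0.ts` against the playlist's own URL.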
The media file is a segment, or “chunk,” of the overall presentation. For HLS, it is always formatted as an ISO 13818 MPEG-2 TS or an MPEG-2 audio elementary stream. The content server divides the media stream into media files of approximately equal durations at packet and key frame boundaries to support effective decoding of individual media files. The server creates a URI for each media file that allows clients to obtain the file and creates the playlist file that lists the URIs in play order.
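Dividing a stream at key frame boundaries into roughly equal media files can be sketched as below. The sketch assumes the key frame timestamps (in whole seconds) are already known and cuts at the first key frame that falls at or beyond the target duration since the previous cut. The function name and sample timestamps are illustrative assumptions, not Apple's segmenter logic.

```python
def cut_points(keyframe_times, target_duration):
    """Return the key frame times at which a new media file should start."""
    cuts = []
    last_cut = 0
    for t in keyframe_times:
        # Only key frames are eligible, so each file decodes independently.
        if t - last_cut >= target_duration:
            cuts.append(t)
            last_cut = t
    return cuts


# Key frames every 2 seconds line up exactly with a 10-second target.
even_cuts = cut_points(list(range(0, 32, 2)), target_duration=10)
# Key frames every 4 seconds give slightly longer, 12-second files.
uneven_cuts = cut_points(list(range(0, 32, 4)), target_duration=10)
```

The second case shows why durations are only approximately equal: the segmenter has to wait for the next key frame after the target is reached.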

"The biggest advantage to us of a standard like MPEG DASH is that everything can be encoded one way and encapsulated one way, and stored on our CDN servers just once. That's a benefit both in terms of saving our CDN costs from a storage perspective and a benefit because you have greater cache efficiency," said Mark Watson, senior engineer for Netflix.
Watson made his comments in a red carpet interview at the recent Streaming Media West conference in Los Angeles, shortly before taking part in a panel on the MPEG DASH specification. MPEG DASH would be a great help to Netflix, he said, because then it could avoid saving several different copies of its entire movie and TV show library.
While there are several different profiles defined in MPEG DASH, Netflix will use the on-demand profile, Watson said, because all of its online content is on-demand. Between the two types of stream segments defined -- MPEG-2 Transport Streams and fragmented MP4 files -- Netflix sides with fragmented MP4. It works well for adaptive streaming and is simpler, he offered.
Netflix, Watson said, contracts with multiple CDNs and allows the client devices to determine which works best for them at any time. The company is also sensitive to the amount of traffic it's putting across networks.
MPEG DASH (Dynamic Adaptive Streaming over HTTP) is a developing ISO Standard (ISO/IEC 23009-1) that should be finalized by early 2012. As the name suggests, DASH is a standard for adaptive streaming over HTTP that has the potential to replace existing proprietary technologies like Microsoft Smooth Streaming, Adobe HTTP Dynamic Streaming (HDS), and Apple HTTP Live Streaming (HLS). A unified standard would be a boon to content publishers, who could produce one set of files that play on all DASH-compatible devices.
The DASH working group has industry support from a range of companies, with contributors including critical stakeholders like Apple, Adobe, Microsoft, Netflix, Qualcomm, and many others. However, while Microsoft has indicated that it will likely support the standard as soon as it’s finalized, Adobe and Apple have not given the same guidance, and until DASH is supported by these two major players, it will gain little traction in the market.
A more serious problem is that MPEG DASH doesn’t resolve the HTML5 codec issue. That is, DASH is codec agnostic, which means that it can be implemented in either H.264 or WebM. Since neither codec is universally supported by all HTML5 browsers, this may mean that DASH users will have to create multiple streams using multiple codecs, jacking up encoding, storage, and administrative costs.
Finally, at this point, it remains unclear whether DASH usage will be royalty-free. This may impact adoption by many potential users, including Mozilla, which has already commented that it’s “unlikely to implement” DASH unless and until it’s completely royalty-free. With Firefox currently sitting at around 22% of market share, this certainly dims DASH’s impact in the HTML5 market.
Introduction to MPEG DASH
Adaptive streaming involves producing several instances of a live or on-demand source file and making them available to various clients depending upon their delivery bandwidth and CPU processing power. By monitoring CPU utilization and/or buffer status, adaptive streaming technologies can change streams when necessary to ensure continuous playback or to improve the experience.
One key difference between adaptive streaming technologies is the streaming protocol utilized. For example, Adobe’s RTMP-based Dynamic Streaming uses Adobe’s proprietary Real Time Messaging Protocol (RTMP), which requires a streaming server and a near-continuous connection between the server and player. Requiring a streaming server can increase implementation cost, while RTMP-based packets can be blocked by firewalls.
A near-continuous connection means that RTMP can’t take advantage of caching on plain-vanilla servers like those used for Hypertext Transfer Protocol (HTTP) delivery, the delivery protocol used by Apple’s HTTP Live Streaming (HLS), Microsoft’s Smooth Streaming, and Adobe’s HTTP-based Dynamic Streaming (HDS). All three of these delivery solutions use standard HTTP web servers to deliver streaming content, obviating the need for a streaming server.
In addition, HTTP packets are firewall friendly and can utilize HTTP caching mechanisms on the web. This latter capability should both decrease total bandwidth costs associated with delivering the video, since more data can be served from web-based caches rather than the origin server, and improve quality of service, since cached data is generally closer to the viewer and more easily retrievable.
While most of the video streamed over the web today is still delivered via RTMP, an increasing number of companies will convert to HTTP delivery over time.
All HTTP-based adaptive streaming technologies use a combination of encoded media files and manifest files that identify alternative streams and their respective URLs. The respective players monitor buffer status (HLS) and CPU utilization (Smooth Streaming and HTTP Dynamic Streaming) and change streams as necessary, locating the alternate stream from the URLs specified in the manifest file.
HLS uses MPEG-2 Transport Stream (M2TS) segments, stored as thousands of tiny M2TS files, while Smooth Streaming and HDS use time-code to find the necessary fragment of the appropriate MP4 elementary streams.
DASH is an attempt to combine the best features of all HTTP-based adaptive streaming technologies into a standard that can be utilized from mobile to OTT devices.
MPEG DASH Technology Overview
As mentioned, all HTTP-based adaptive streaming technologies have two components: the encoded A/V streams themselves and manifest files that identify the streams for the player and contain their URL addresses. For DASH, the actual A/V streams are called the Media Presentation, while the manifest file is called the Media Presentation Description.
As you can see in Figure 1, the Media Presentation is a collection of structured audio/video content that incorporates periods, adaptation sets, representations, and segments.
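The hierarchy described above can be sketched as nested data structures. This is a minimal Python illustration, not the actual Media Presentation Description schema: the four levels (periods, adaptation sets, representations, segments) come from the text, while every field name, the example URLs, and the helper `build_example` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    url: str            # hypothetical field: where the player fetches this chunk
    duration_s: float   # hypothetical field: playback duration in seconds


@dataclass
class Representation:
    bandwidth_bps: int  # one encoded version of the content at a given data rate
    segments: List[Segment] = field(default_factory=list)


@dataclass
class AdaptationSet:
    content_type: str   # e.g. "video" or "audio"
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Period:
    start_s: float      # where this period begins on the presentation timeline
    adaptation_sets: List[AdaptationSet] = field(default_factory=list)


@dataclass
class MediaPresentation:
    periods: List[Period] = field(default_factory=list)

    def total_segments(self) -> int:
        """Count every segment across all periods, sets, and representations."""
        return sum(
            len(rep.segments)
            for period in self.periods
            for aset in period.adaptation_sets
            for rep in aset.representations
        )


def build_example() -> MediaPresentation:
    """Build one period with one video adaptation set and two representations."""
    def make_rep(name: str, bps: int) -> Representation:
        segs = [
            Segment(url=f"https://example.com/{name}/seg{i}.m4s", duration_s=4.0)
            for i in (1, 2)
        ]
        return Representation(bandwidth_bps=bps, segments=segs)

    video = AdaptationSet("video", [make_rep("low", 400_000), make_rep("high", 1_200_000)])
    return MediaPresentation([Period(start_s=0.0, adaptation_sets=[video])])
```

A player would walk this structure top-down: pick the current period, pick an adaptation set per media type, then choose among that set's representations according to playback conditions.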
Consistent multi-platform audio and video content delivery presents an ongoing challenge for broadcasters. Explosive smartphone and tablet growth on varying operating systems (Android, Apple iOS, or Windows Phone) threatens to create a user-experience divide between users on mobile devices, at the desktop, or in the living room.
Broadcasters must address multi-platform consumption demands without compromising content security or network efficiencies. Many broadcasters are assessing efficiency of transport protocols used for content delivery, to see how they stack up for web and mobile delivery. Some legacy solutions, such as MPEG-2 Transport Stream (M2TS), lack basic web delivery functions.
What key information do broadcasters and network operators need to know as they look for more efficient approaches to media delivery? This white paper explores fragmented MP4 files (fMP4) and considers whether the fMP4 format can replace legacy file formats.
Along the way, we’ll explore four key areas that impact both broadcasters and network operators:
HTTP Live Streaming (or HLS) is an adaptive streaming protocol created by Apple to communicate with iOS and Apple TV devices and Macs running OS X Snow Leopard or later. HLS can distribute both live and on-demand files and is the sole technology available for adaptively streaming to Apple devices, which are an increasingly important target segment for streaming publishers.
HLS is widely supported in streaming servers from vendors like Adobe, Microsoft, RealNetworks, and Wowza, as well as real time transmuxing functions in distribution platforms like those from Akamai. The popularity of iOS devices and this distribution-related technology support has also led to increased support on the player side, most notably from Google in Android 3.0.
In the Apple App Store, if you produce an app that delivers video longer than ten minutes or greater than 5MB of data, you must use HTTP Live Streaming, and provide at least one stream at 64Kbps or lower bandwidth. Any streaming publisher targeting iOS devices via a website or app should know the basics of HLS and how it’s implemented.
How HLS Works
At a high level, HLS works like all adaptive streaming technologies; you create multiple files for distribution to the player, which can adaptively change streams to optimize the playback experience. Because HLS is HTTP-based, no streaming server is required, so all the switching logic resides on the player.
To distribute to HLS clients, you encode the source into multiple files at different data rates and divide them into short chunks, usually 5 to 10 seconds long. These are loaded onto an HTTP server along with a text-based manifest file with a .M3U8 extension that directs the player to additional manifest files for each of the encoded streams.
The player monitors changing bandwidth conditions. If these dictate a stream change, the player checks the original manifest file for the location of additional streams, and then the stream-specific manifest file for the URL of the next chunk of video data. Stream switching is generally seamless to the viewer.
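The manifest lookup and stream switch described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any real player is implemented: the playlist text, its file paths, and the 400 Kbps and 1.2 Mbps data rates are made up for the example (the 64 Kbps stream echoes the App Store requirement mentioned earlier), and the selection rule (highest data rate that fits the measured bandwidth) is a deliberately simple stand-in for real player heuristics. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags and the `BANDWIDTH` attribute are standard HLS playlist syntax.

```python
import re
from typing import List, Tuple

# Hypothetical master manifest: each #EXT-X-STREAM-INF line describes one
# encoded stream and is followed by the URI of that stream's own manifest.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=64000
audio_only/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=400000
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000
high/index.m3u8
"""

Variant = Tuple[int, str]  # (bandwidth in bits per second, stream manifest URI)


def parse_master(text: str) -> List[Variant]:
    """Return the (bandwidth, uri) pairs in a master manifest, lowest rate first."""
    variants: List[Variant] = []
    pending_bandwidth = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = line.split(":", 1)[1]
            # Anchor on start-of-list or a comma so AVERAGE-BANDWIDTH is not matched.
            match = re.search(r"(?:^|,)BANDWIDTH=(\d+)", attrs)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return sorted(variants)


def choose_variant(variants: List[Variant], measured_bps: int) -> Variant:
    """Pick the highest-rate stream that fits; fall back to the lowest one."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    return affordable[-1] if affordable else variants[0]
```

With roughly 500 Kbps of measured bandwidth, `choose_variant(parse_master(MASTER_PLAYLIST), 500_000)` selects the 400 Kbps stream; the player would then read that stream's own manifest to find the URL of the next chunk.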
The boom in OTT and TV Anywhere services is underlined by rapid growth in IP video transmission at all stages of the content lifecycle, and this is greatly expanding the scope and demand for Quality Assurance (QA) products. Even leading proponents of OTT services still admit there is some way to go to provide acceptable Quality of Experience (QoE) for high-definition premium content, over unmanaged networks in particular.
“One of the main obstacles to OTT is the lack of a great user experience,” says Helge Høibraaten, CEO of Vimond Media Solutions, a spin-off of Norwegian commercial TV station TV 2, which is commercialising its OTT broadcast platform internationally.
Speaking at a conference during the recent IBC exhibition in Amsterdam, Høibraaten indicated that an OTT platform was defined by the quality it delivers and must meet the needs of all devices including tablets, PCs and smartphones. Vimond itself has only just extended its applications suite to Apple iOS devices (iPad and iPhone), Android and Windows phones, in addition to Windows desktop PCs which it already supported. The message for vendors of OTT platforms, and for the services that run on them, is that they should only embrace new device types when acceptable quality can be guaranteed.
The definition of acceptable quality is admittedly rather subjective. It is certain, though, that IP networks are creating new challenges for providers of QA video products. These vendors have been extending their portfolios to tackle video delivery over both managed and unmanaged IP networks, with various announcements made at IBC.
While unmanaged networks including the Internet pose the greatest challenge, even managed IP networks require careful handling to avoid packet loss and latency resulting from congestion within the infrastructure. This can happen because, unlike traditional broadcast networks, IP infrastructures do not have fixed end-to-end paths and have no pre-determined transmission times for each IP packet. It is possible for more packets to enter the network than can be delivered within an acceptable time frame, leading to congestion and to dropped packets, delays, or both. Either of these can cause loss of quality on receiving devices.
The remedy is to apply traffic shaping, which involves holding up IP packets that are less critical or which can afford a little delay in order to preserve capacity for the most important packets. This can be performed at the point of entry to the network or within the network by routers themselves or other dedicated devices, and the key with managed networks is that operators can control the traffic shaping process better. Potentially, packet loss can be eliminated and latency kept within acceptable limits, according to Per Lindgren, VP Business Development and Co-Founder of Net Insight, the Swedish-owned vendor of the Nimbra IP media transport platform. Net Insight tackles the managed IP quality issue by breaking the network down into separate segments and applying QoE mechanisms including traffic shaping to each.
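The basic idea of traffic shaping, holding back less critical packets so that the most important ones get the available capacity, can be illustrated with a small Python simulation. This is a simplified sketch and not Net Insight's implementation: the class name `PriorityShaper`, the per-tick byte budget, the packet names, and the rule that no packet may exceed the budget are all assumptions made for the example.

```python
import heapq
from itertools import count
from typing import List, Tuple


class PriorityShaper:
    """Send the most important packets first; delay the rest instead of dropping them."""

    def __init__(self, budget_bytes_per_tick: int) -> None:
        self.budget = budget_bytes_per_tick
        # Heap entries: (priority, arrival order, name, size). Lower priority
        # numbers are more important; arrival order breaks ties fairly.
        self._queue: List[Tuple[int, int, str, int]] = []
        self._arrival = count()

    def enqueue(self, name: str, size_bytes: int, priority: int) -> None:
        # Assumption: a packet larger than one tick's budget could never be sent.
        if size_bytes > self.budget:
            raise ValueError("packet exceeds the per-tick budget")
        heapq.heappush(self._queue, (priority, next(self._arrival), name, size_bytes))

    def tick(self) -> List[str]:
        """Release packets for one time interval without exceeding the budget."""
        remaining = self.budget
        sent: List[str] = []
        # Stop as soon as the most important waiting packet no longer fits,
        # so less important traffic never overtakes it.
        while self._queue and self._queue[0][3] <= remaining:
            _, _, name, size = heapq.heappop(self._queue)
            remaining -= size
            sent.append(name)
        return sent
```

If two video packets and two bulk-data packets of 1,500 bytes each arrive interleaved at a shaper with a 3,000-byte budget, the first tick releases both video packets and the bulk packets wait for the second tick. Nothing is lost; the less critical traffic simply absorbs the delay.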
The first step is to ensure that the routers themselves do not create problems under congestion by dropping packets as they pass through, so Net Insight has applied traffic shaping at this level to ensure this does not happen. “By traffic shaping even inside our MSRs (Media Switch Routers), we can traffic shape down until we ensure we do not lose any packets there,” says Lindgren.
The next step is to address the links through the core network between the routers and ensure that the QoS needs of each individual service are met. “Traditionally telcos have not been treating media traffic as a special service,” says Lindgren. “So we propose building service aware media networks. MSRs aggregate traffic so that the core network (provided by a telco) only handles aggregated flows rather than individual services. Our MSRs then handle the different protection needs of each service, and can add QoS enhanced links inside a media service network rather than just at the edges.”
In this way, by addressing both the routers and links between them separately as part of a coordinated traffic management approach, the network can achieve much higher levels of quality. Even then, though, the possibility of packet loss or delay cannot be discounted, and so the third element of Net Insight’s QA strategy is to monitor every link. “We can do continuous real-time monitoring of traffic between MSRs and see any packet loss sent between one MSR and another,” Lindgren explains. “That makes it much easier to troubleshoot.”
Within unmanaged IP networks, on the other hand, it is impossible for broadcasters or operators to do either traffic shaping or performance monitoring since they do not own the infrastructure. This is an increasing issue with the growth of cloud-based services where the infrastructure is normally owned and managed by a third-party with video delivered over some Content Distribution Network (CDN). In that case there is an apparent black hole between the cloud and the end user, making it difficult for a content provider to know what quality the customer is getting.
Another Swedish vendor specialising in distributed video delivery, Edgeware, has tackled this problem with its Convoy VDN, which is software operating within the company’s Distributed Video Delivery Network (D-VDN) platform. Announced at IBC, this operates by combining the receiving device’s capability with the QoS known to be provided by the delivery infrastructure, according to Edgeware’s Chief Marketing Officer Duncan Potter.
The point is that CDNs usually operate via adaptive streaming protocols to improve network efficiency and performance, breaking video up into multiple small file chunks that can take different routes before being reassembled at the destination. The network detects each user’s CPU capacity and bandwidth continuously and adjusts the quality of the stream in real-time to ensure that QoE is always as good as it can be at that point in time. But breaking up video into chunks does make it hard to monitor what is going on within the CDN, and this is the problem Edgeware has addressed with Convoy VDN. “As we are a network device we can see what is going through,” said Potter. “We work out what is sent, collect statistics via a central reporting engine, and that is integrated with the higher level CDN management system.”
Such measures may help ensure optimum quality when a service is working normally but do not cater for major outages within the infrastructure. While IP networks are becoming more reliable, there is rising dependence in an increasingly global content market on external communication links that may be unreliable. This is a particular problem for the growing number of niche and ethnic services that have a global audience distributed across numerous, often small, communities around the world.
Such ethnic services can be lucrative, with high profit margins for operators because consumers are prepared to pay a premium or a separate subscription to receive them, but the total revenue in a given region is usually relatively small. This means operators cannot afford to spend too much capital on protecting against failure of the service in a region beyond their control, according to Danny Wilson, CEO of TV performance monitoring vendor Pixelmetrix. “Typically if an operator imports content from, say, India, they are vulnerable to loss of signal from Delhi,” he points out.
Pixelmetrix is tackling this with software announced at IBC that enables its DVStor recording and playback platform to perform disaster recovery and start playing out the content in the event of an outage. “We are recording what is going on at a downlink coming in from overseas and have integrated this with our test and measurement devices,” says Wilson. “Then if there is any interruption, the sensor detects that input signal is lost, and this DVStor solution can then provide back-up recovery on a real-time basis.”
This, in effect, is a cloud-based disaster recovery service and could be incorporated within IP-based delivery infrastructures. It highlights the growing scope of Quality Assurance, bringing together elements of disaster recovery, troubleshooting and performance monitoring within an overall QoE package.
By Philip Hunter, Videonet
Multi-screen TV is approaching a tipping point now as the Pay TV pioneers look to expand their offers to cover more channels as well as more devices, and more service providers launch TV Everywhere packages. One of the important tasks for many operators walking around IBC this year is to work out how they can scale their multi-screen services beyond a sub-set of the channels they offer on the set-top box. Ultimately consumers will expect all their channels on all screens, of course.
Ericsson Harnesses Hardware and Software to Support More Channels
Ericsson is using IBC to highlight the scalability issue and has two new products that it believes will help operators expand their offers. These are the Ericsson SPR1200 Multiscreen Stream Processor, a true hardware approach to multi-screen compression, and the Ericsson NPR1200 Multiscreen Network Processor, a dense software-based adaptive streaming segmentation and encryption processor, designed to track dynamic updates in adaptive streaming formats and DRM systems associated with the needs of delivery to different types of devices.
The combined solution enables high quality and cost-effective processing of hundreds of channels into thousands of adaptive streaming profiles, the company says. It claims the SPR1200 and NPR1200 represent the most powerful and flexible solution for the growing multi-screen market.
Ericsson’s ConsumerLab research shows that 93% of consumers still watch linear TV and will continue to do so. “The expectation by consumers for multi-screen TV is that all of their content choices available in the home on the large screen will also be available on every screen,” it adds.
RGB has Multi-platform Headend for Large and Mid-sized Deployments
Meanwhile, RGB Networks claims that the combination of its Video Multiprocessing Gateway (VMG) (a carrier-class platform for multi-screen video delivery) and its adaptive streaming solution, the TransAct Packager, provides the most scalable solution available for deployment of advanced IP video services to any device, enabling operators to go straight from trial to deployment.
The company recently added a new member to the VMG product family, in the form of the VMG-8, which it says is ideal for small to medium-sized deployments or deployments at the edge. This product inherits the field-proven transcoding, transrating, ad insertion and other advanced video processing capabilities of the VMG family and packages them in a new 7RU high carrier-grade chassis. The VMG-8 holds up to eight modules and provides a compact alternative to RGB’s larger VMG-14.
In its fully redundant configuration the VMG-8 can be configured with three video transcoder modules, one audio transcoder module and a single controller module for transcoding programmes to over 140 streams for delivery to any IP-enabled device. In this redundant configuration, each module type has a back-up which can take over operation should the primary fail. Complementing its module redundancy, the VMG-8’s reliability is further enhanced with back-up power supplies and cooling fans which automatically take over if a primary unit fails.
Like the company’s VMG-14, the VMG-8 also benefits from recent enhancements to the TCM transcoder module, enabling transcoding of up to 60 SD or HD inputs and 240 adaptive bitrate outputs per VMG-8 chassis. The VMG-14 can now support up to 132 SD or HD inputs and 528 outputs per chassis.
Harmonic Supports Live and File-based Multi-screen Delivery
Harmonic is also focusing on the needs of content distributors and creators as they deliver more of their content to more screens. The company recently announced the ProMedia family of software solutions for optimizing live and file-based multi-screen video production and processing. The ProMedia family performs a broad range of functions, including transcoding, packaging and origination to enable high-quality video creation and delivery of live streaming, live-to-VOD, and VOD services to TVs, PCs, tablets, smartphones and other IP-connected devices. ProMedia is also considered an ideal solution for content creation in file-based workflows such as tapeless production environments.
The ProMedia family provides a suite of software products that can be deployed individually or as an end-to-end video processing solution, offering great flexibility. This solution is also integrated with leading DRM systems, asset management systems and content distribution networks, in addition to other Harmonic products including encoders, receivers, playout servers, and storage.
The ProMedia family leverages Harmonic’s strong H.264 video codec expertise and is based on the same intellectual property behind Harmonic's Electra encoders. The family includes ProMedia Live for real-time video processing and transcoding, featuring enhanced H.264 video codec technology developed by Harmonic and optimized for creating high-quality Internet video streams.
Another important product in the family is ProMedia Package, a carrier-grade adaptive streaming preparation system for secure, high-value Internet video services. ProMedia Package supports numerous HTTP streaming protocol standards and is capable of packaging in multiple output formats from a single video source, enabling a more scalable, distributed architecture.
Envivio Helps Move Content Package and Delivery to the Edge
Envivio has introduced a number of notable new products for multi-screen TV. These are the Halo Network Media Processor (NMP), 4Caster C4 Gen III multi-screen encoder, and the Envivio Genesis universal mezzanine output format.
Halo NMP enables operators to shift their content packaging and delivery processing to the edges of their existing video distribution infrastructure. “Moving these operations makes it possible to add support for delivering high quality, protected video to new devices without altering the headend,” Envivio declares. “Halo NMP complements existing broadcast infrastructure and simplifies distribution to the latest smartphones, tablets, connected TVs and PCs.”
Halo lets operators take advantage of the Genesis universal output format to control the bandwidth demands multi-screen TV makes on backbone networks. Genesis merges the bitrates and resolutions needed to deliver adaptive streams for major standards and technologies into a single, efficient output format. Envivio claims the result is a reduction of as much as 50% in the bandwidth demands multi-screen TV makes on backbone resources.
Video headends powered by the Envivio 4Caster C4 family of encoders provide support for the full spectrum of IPTV, Internet TV, mobile TV, cable, satellite and terrestrial applications. They enable operators to support the growing variety of formats needed to deliver video to any device at any time, including simultaneous video delivery from a single encoder to digital set-top boxes, connected TVs, PCs and Macs, as well as tablets and mobile screens.
Imagine Communications Supports 1,000 Multi-profile Transcoders
Imagine Communications will showcase its ICE Streaming System for streaming live multi-format video to multiple tablets. ICE is a new network-side transcoding platform that allows multi-screen service providers to deliver what it claims is uncompromised video quality across multiple devices with unmatched compression efficiency. The ICE Streaming System supports up to 1,000 stream-aligned, multi-profile transcodes from a single carrier-class blade system platform.
The ICE Streaming System is based on Imagine's widely deployed ICE Video Platform and combines picture quality, scalability and full support for integrated fragmentation, encryption and HTTP streaming.
ISILON and ATEME Partner to Boost Media Processing Performance
On the eve of IBC, ATEME announced that it has partnered with ISILON to support high performance content processing for video delivery to multiple screens. This results from the combination of the ISILON IQ Series NAS storage and ATEME’s TITAN File transcoding platform. The TITAN video processing speed is enhanced by ultra-fast storage. Meanwhile more content titles can be stored thanks to the superior compression efficiency of TITAN.
The companies say the partnership dramatically simplifies the operational challenges of multi-screen transcoding workflows. “Installed in a matter of hours, the solution scales out linearly with the expansion of the content catalogue, the migration to HD, or as new output formats are added to support more viewing devices. It takes only minutes to add transcoding blades or storage capacity: there is no need for re-design and no downtime.”
The combination of ISILON IQ NAS Storage and the ATEME TITAN transcoder is proven and delivers content for more than 40 million pay TV subscribers worldwide already. The partnership, announced in late August, will make it easy for many more service operators to access the solution as they move from tape to file based workflows or enhance their VOD offerings.
By John Moulding, Videonet
Online and mobile viewing of widely-available, high-quality video content including TV programming, movies, sports events, and news is now poised to go mainstream. Driven by the recent availability of low-cost, high-resolution desktop/laptop/tablet PCs, smart phones, set-top boxes and now Ethernet-enabled TV sets, consumers have rapidly moved through the ‘novelty’ phase of acceptance into expectation that any media should be available essentially on any device over any network connection. Whether regarded as a disruption for cable TV, telco or satellite TV providers, or an opportunity for service providers to extend TV services onto the web for on-demand, time-shifted and place-shifted programming environments – often referred to as ‘three screen delivery’ or ‘TV Anywhere’ – this new video delivery model is here to stay.
While tremendous advancements in core and last mile bandwidth have been achieved in the last decade around the world – primarily driven by web-based data consumption – video traffic represents a quantum leap in bandwidth requirements. Coupled with the fact that the Internet at large is not a managed quality-of-service environment, this requires that new methods of video transport be considered to provide, across any device and network, the quality of video experience that we have come to expect from managed TV-delivery networks.
The evolution of video delivery transport has led to a new set of de facto standard adaptive delivery protocols from Apple, Microsoft, and Adobe that are now positioned for broad adoption. Consequently, networks must now be equipped with servers that can take high-quality video content from its live or file-based source and ‘package’ it for transport to devices ready to accept these new delivery protocols.
Video Delivery Background
The Era of Stateful Protocols
For many years, stateful protocols including Real Time Streaming Protocol (RTSP), Adobe’s Real Time Messaging Protocol (RTMP), and Real Networks' RTSP over Real Data Transport (RDT) protocol were utilized to stream video content to desktop and mobile clients. Stateful protocols require that from the time a client connects to a streaming server until the time it disconnects, the server tracks client state. If the client needs to perform any video session control commands like start, stop, pause or fast-forward it must do so by communicating state information back to the streaming server.
Once a session between the client and the server has been established, the server sends media as a stream of small packets typically representing a few milliseconds of video. These packets can be transmitted over UDP or TCP. TCP overcomes firewall blocking of UDP packets, but may also incur increased latency as packets are sent, and resent if not acknowledged, until received at the far end.
These protocols served the market well, particularly during the era when desktop and mobile viewing experiences were limited in frequency, quality, duration, and screen or window size and resolution, as well as by the constrained processor, memory, and storage capabilities of mobile devices.
However, the above experience factors have all changed dramatically in the last few years. And that has exposed a number of stateful protocol implementation weaknesses:


