Live Streaming to the Browser Using MSE and MPEG-DASH
For the next two weeks, we’re running a trial in conjunction with Radio 3 to deliver surround sound to your browser for a series of classical music concerts. On the Radio 3 blog, Rupert Brun explains the background to the trial and how to get involved.
Here in Broadcast & Connected Systems at BBC R&D, we are always looking for new ways to apply our technology research to extend the reach of BBC content to the maximum number of users. Surround sound isn’t new, but delivering it to the home via the Internet has traditionally meant installing plugins or other applications, limiting the platforms and consumers we can target.
In this experiment, we believe we are the first broadcaster in the world to stream a live outside broadcast in discrete multichannel audio to the home using MPEG-DASH, and we're doing it using just a compliant web browser - no plugins, no separate software installation required.
Why Stream to the Browser, and Why Now?
The beauty of the browser is that it is (almost) ubiquitous. Every PC, tablet and smartphone has a browser installed when it ships. Increasingly, smart TVs, set top boxes and games consoles have some form of browser environment available, bringing HTML5, CSS and JavaScript functionality to the majority of consumer electronic devices.
People expect to be able to consume BBC content on any platform. To enable this, the BBC currently has to support a number of streaming protocols and has to maintain many different applications with differing code bases and levels of functionality in order to support hundreds of different set top boxes, smart TVs, mobile devices and desktop environments.
What if we could have a single encoding and distribution workflow and a single cross-platform client application, reducing the complexity of distribution and allowing our developers to concentrate on delivering great user experiences?
From a listener’s perspective, the browser “just works”, which makes accessing our services much easier. Removing the requirement to install plugins or other software removes a significant barrier for some users. Indeed, for cross-platform compatibility, security and stability, many browser vendors have decided not to support plugins in the future, so we need to move away from these anyway.
Three particular technical standards should enable us to do this in the future: HTML5, MPEG-DASH and W3C Media Source Extensions.
HTML5 and Media Source Extensions (MSE)
In HTML5 the HTMLMediaElement, typically a video or audio tag, exposes a src attribute (or child source elements) which accepts a URL of the content to be played. The browser retrieves, decodes and plays the media data automatically, provided it knows how to handle your media type. This offers simplicity for the developer and, in theory, has removed the need for plugins, but the trade-off is that there is no control over many important variables: how data is downloaded and from where, how much data is buffered, which adaptive streaming algorithms to use or what to do in case of failure.
The ability to control these variables is key to providing a world-class user experience, but by default they are hard-coded into the browser. Ideally we want to hand as much control as possible to the JavaScript application, while still deferring to the browser for parsing, decoding and rendering the media data. Typically, Adobe Flash or Microsoft Silverlight applications have been used to provide all of this functionality on platforms that support those plugins.
Most of these features can be replicated in JavaScript but, until now, it has not been possible for a script to feed media data to the HTMLMediaElement. The Media Source Extensions define a JavaScript API which allows media streams to be constructed dynamically within a JavaScript application.
At the heart of MSE is the MediaSource object. This object is created by the application and attached to the media element. Its purpose is to provide the media data for playback as requested by the media element.
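As a rough illustration of that attachment step, the sketch below wires a MediaSource to a media element and waits for the "sourceopen" event. The helper name attachMediaSource and the two small interfaces are our own inventions so that the code type-checks outside a browser; only addEventListener, the "sourceopen" event, the src attribute and URL.createObjectURL come from the platform.

```typescript
// Minimal structural types (assumptions for this sketch) so it type-checks
// outside a browser. In a page, the built-in MediaSource and
// HTMLMediaElement fit these shapes.
interface MediaSourceLike {
  addEventListener(type: "sourceopen", listener: () => void): void;
}
interface MediaElementLike {
  src: string;
}

// Hypothetical helper: attach a MediaSource to a media element and call
// onOpen once the element has opened it.
function attachMediaSource<S extends MediaSourceLike>(
  element: MediaElementLike,
  mediaSource: S,
  createObjectUrl: (source: S) => string,
  onOpen: () => void
): void {
  // Register first so the "sourceopen" event cannot be missed.
  mediaSource.addEventListener("sourceopen", onOpen);
  // Pointing the element at the object URL is what attaches the MediaSource.
  element.src = createObjectUrl(mediaSource);
}

// In a browser this might be called as:
//   const audio = document.querySelector("audio")!;
//   const mediaSource = new MediaSource();
//   attachMediaSource(audio, mediaSource, (s) => URL.createObjectURL(s), () => {
//     // Safe to call mediaSource.addSourceBuffer(...) from here on.
//   });
```

SourceBuffers can only be added once the MediaSource is open, which is why the callback matters.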
The MediaSource object maintains a collection of SourceBuffers. These are the interface through which the application appends media data to the source, and they provide methods to insert, remove and manage media data. They are essentially an abstraction of a timeline: media data can be appended to the buffer based on media playback timestamp, or it can be appended sequentially, ignoring timestamps. The latter mode enables unrelated media to be spliced together, which allows uses such as advert insertion or even video editing in the browser.
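The difference between the two append modes is easier to see with numbers. The following is a deliberately simplified model, not the browser's actual algorithm: it assumes the mode is chosen before anything is appended, that no timestamp offset is applied, and that each segment is described only by a start timestamp and a duration. In a real page the mode is selected by setting the SourceBuffer's mode attribute to "segments" or "sequence"; the function and type names here are hypothetical.

```typescript
// Simplified description of a media segment (an assumption for this model).
interface SegmentInfo {
  timestamp: number; // presentation timestamp carried in the media, seconds
  duration: number;  // seconds
}

type AppendModeName = "segments" | "sequence";

// Returns where each appended segment would start on the buffer timeline.
function placeOnTimeline(segments: SegmentInfo[], mode: AppendModeName): number[] {
  let nextStart = 0;
  return segments.map((segment) => {
    // "segments": honour the timestamp in the media.
    // "sequence": ignore it and butt each segment up against the previous one.
    const start = mode === "segments" ? segment.timestamp : nextStart;
    nextStart = start + segment.duration;
    return start;
  });
}

// A programme segment followed by an unrelated advert with its own timestamps.
const programmeThenAdvert: SegmentInfo[] = [
  { timestamp: 100, duration: 4 },
  { timestamp: 30, duration: 4 },
];
const byTimestamp = placeOnTimeline(programmeThenAdvert, "segments"); // [100, 30]
const bySequence = placeOnTimeline(programmeThenAdvert, "sequence");  // [0, 4]
```

In the timestamp-based case the advert lands far away from the programme on the timeline; in the sequential case the two play back to back.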
The application handles the requesting of media data from the server and appends the response to the SourceBuffer. Decoupling the fetching of media data from playback allows the media data to be sourced using novel transport mechanisms or from different locations.
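One practical detail when appending: SourceBuffer.appendBuffer() works asynchronously, and calling it again while the buffer's updating flag is still true throws an error. A small queue that waits for the "updateend" event is one way to cope with that. The AppendQueue class and the SourceBufferLike interface below are our own names for this sketch; how the data is fetched is left entirely to the application.

```typescript
// The parts of SourceBuffer this sketch relies on. The built-in SourceBuffer
// fits this shape; the interface itself is an assumption for illustration.
interface SourceBufferLike {
  updating: boolean;
  appendBuffer(data: ArrayBuffer): void;
  addEventListener(type: "updateend", listener: () => void): void;
}

// Hypothetical helper: serialises appends so that appendBuffer() is never
// called while a previous append is still in progress.
class AppendQueue {
  private pending: ArrayBuffer[] = [];
  private buffer: SourceBufferLike;

  constructor(buffer: SourceBufferLike) {
    this.buffer = buffer;
    // Each finished append gives the next queued chunk its turn.
    buffer.addEventListener("updateend", () => this.flush());
  }

  push(data: ArrayBuffer): void {
    this.pending.push(data);
    this.flush();
  }

  private flush(): void {
    if (this.buffer.updating) return; // wait for "updateend"
    const next = this.pending.shift();
    if (next !== undefined) this.buffer.appendBuffer(next);
  }
}

// In a browser, the fetching side might look like:
//   const queue = new AppendQueue(sourceBuffer);
//   fetch(segmentUrl)
//     .then((response) => response.arrayBuffer())
//     .then((data) => queue.push(data));
```

Because the queue only sees ArrayBuffers, the data could equally come from XMLHttpRequest, a WebSocket or a local cache.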
SourceBuffers can contain audio, video or timed text and an instance is created for each stream that needs to be presented. Typically there might be one video stream, one audio stream and perhaps a subtitle stream. Since each media type is handled separately, access services such as audio description or subtitling can be selected simply by requesting a different stream.
Finally, the specification also includes extensions to the HTMLVideoElement allowing measurement of video decode and rendering performance which could be used to help decide the most appropriate video stream to present if a number of options are available.
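As a sketch of how those measurements might feed a decision, the video element's getVideoPlaybackQuality() method returns an object whose totalVideoFrames and droppedVideoFrames counters can be compared. The 10% threshold and the function names below are assumptions chosen purely for illustration.

```typescript
// The two counters this sketch reads from the object returned by
// video.getVideoPlaybackQuality().
interface PlaybackQualityLike {
  totalVideoFrames: number;
  droppedVideoFrames: number;
}

// Fraction of frames dropped so far (0 when nothing has been decoded yet).
function droppedFrameRatio(quality: PlaybackQualityLike): number {
  if (quality.totalVideoFrames === 0) return 0;
  return quality.droppedVideoFrames / quality.totalVideoFrames;
}

// Hypothetical policy: suggest a less demanding video stream when more than
// maxDroppedRatio of frames are being dropped. The default is an assumption.
function shouldStepDown(quality: PlaybackQualityLike, maxDroppedRatio: number = 0.1): boolean {
  return droppedFrameRatio(quality) > maxDroppedRatio;
}

// In a browser:
//   const video = document.querySelector("video")!;
//   if (shouldStepDown(video.getVideoPlaybackQuality())) { /* pick a lower stream */ }
```

The counters are cumulative, so a real player would more likely compare successive readings than the totals since playback began.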
An additional benefit of not hard-coding features into the browser is that functionality upgrades, such as improved adaptive algorithms or defect fixes, only require updating the JavaScript application, which is freshly fetched each time the page is loaded, rather than requiring every user to upgrade their browser. Software updates to the browser itself might be fairly easy on a PC but happen infrequently on a smart TV or set top box.
Content Delivery Using MPEG-DASH
MPEG-DASH is the new standard for delivering media content over the Internet. It is designed to allow content to be delivered efficiently in a segmented form, making use of standard caching techniques for web content in order to deliver to large audiences. It supports bitrate adaptation, allowing each viewer to receive a stream in the best quality that their Internet connection can deliver.
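To make bitrate adaptation concrete, here is a minimal sketch of one very simple selection rule: pick the highest-bitrate representation that fits within a fraction of the measured throughput. MPEG-DASH itself does not mandate any particular algorithm; the rule, the 0.8 safety factor and the names below are assumptions for illustration only.

```typescript
// One encoded version of the content. "bandwidth" is in bits per second,
// matching the attribute of the same name in a DASH manifest.
interface RepresentationInfo {
  id: string;
  bandwidth: number;
}

// Hypothetical selection rule: the best representation whose bitrate is at
// most safetyFactor times the measured throughput, else the lowest one.
function selectRepresentation(
  representations: RepresentationInfo[],
  measuredBitsPerSecond: number,
  safetyFactor: number = 0.8
): RepresentationInfo {
  if (representations.length === 0) {
    throw new Error("no representations to choose from");
  }
  const budget = measuredBitsPerSecond * safetyFactor;
  const ascending = [...representations].sort((a, b) => a.bandwidth - b.bandwidth);
  // Start from the lowest bitrate and upgrade while candidates still fit.
  return ascending.reduce((best, candidate) =>
    candidate.bandwidth <= budget ? candidate : best
  );
}

const exampleLadder: RepresentationInfo[] = [
  { id: "low", bandwidth: 128000 },
  { id: "mid", bandwidth: 320000 },
  { id: "high", bandwidth: 640000 },
];
// With 500 kbps measured, the budget is 400 kbps, so "mid" is chosen.
const chosenId = selectRepresentation(exampleLadder, 500000).id;
```

Real players usually also take buffer occupancy into account and smooth the throughput estimate to avoid switching too often.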
Since even surround audio streams need only a low bitrate connection, we are not using bitrate adaptation for this trial. The audio stream is simply encoded at a constant rate of 320 kbps using AAC-LC. However, MPEG-DASH still takes care of dividing the live audio stream into short segments that the client can retrieve using HTTP. Most importantly, MPEG-DASH is a streaming standard which can be implemented for a browser using the W3C Media Source Extensions.
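A little arithmetic shows what segmentation means in practice. The sketch below assumes 4-second segments and a number-based URL template, neither of which is a detail of our trial; the $RepresentationID$ and $Number$ placeholders are standard DASH template identifiers, while the function names, the example URL pattern and the simplified availability calculation are our own.

```typescript
// Approximate payload size of one constant-bitrate segment, in bytes.
function segmentSizeBytes(bitsPerSecond: number, durationSeconds: number): number {
  return (bitsPerSecond * durationSeconds) / 8;
}

// Number of the most recent segment that has been completely produced in a
// live stream. Simplified: ignores period offsets and encoder latency.
// A result below startNumber means no segment is available yet.
function latestCompleteSegment(
  nowMs: number,
  availabilityStartMs: number,
  segmentDurationSeconds: number,
  startNumber: number = 1
): number {
  const elapsedSeconds = (nowMs - availabilityStartMs) / 1000;
  const completed = Math.floor(elapsedSeconds / segmentDurationSeconds);
  return startNumber + completed - 1;
}

// Fill in a DASH-style URL template. Format tags such as $Number%05d$ are
// not handled in this sketch.
function expandTemplate(template: string, representationId: string, segmentNumber: number): string {
  return template
    .split("$RepresentationID$").join(representationId)
    .split("$Number$").join(String(segmentNumber));
}

// 320 kbps for 4 seconds is 1,280,000 bits, i.e. 160,000 bytes per segment.
const bytesPerSegment = segmentSizeBytes(320000, 4);
// Ten seconds into the stream, segments 1 and 2 are complete.
const latestNumber = latestCompleteSegment(10000, 0, 4);
const exampleUrl = expandTemplate("audio/$RepresentationID$/seg-$Number$.m4s", "surround", latestNumber);
```

At roughly 160 kB per request, each segment is a small, cacheable HTTP object, which is what lets ordinary web caches serve a large audience.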
Building an MPEG-DASH Player Using MSE
In order to deliver the surround audio to you, a DASH player application needs to perform at least the following tasks:
1. Create a MediaSource object and set it as the source of the media element
2. Request and parse the manifest, and create a SourceBuffer object for each enabled stream
3. Request segments for each stream and append them to the SourceBuffers
4. Repeat step 3
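The order of those tasks can be sketched as a short asynchronous routine. This is not the code of our trial player: every browser-facing operation is passed in through a hypothetical PlayerDeps object, the manifest is assumed to have been reduced to a list of streams with a MIME type and a segment URL function, and the loop stops after a fixed number of segments, where a live player would keep going and pace its requests.

```typescript
// What the sketch needs to know about each enabled stream (an assumption;
// a real manifest carries far more detail).
interface StreamInfo {
  mimeType: string; // e.g. 'audio/mp4; codecs="mp4a.40.2"' for AAC-LC
  segmentUrl(segmentNumber: number): string;
}

// Hypothetical hooks onto the browser, injected so the flow can be read
// (and exercised) without a real media element.
interface PlayerDeps {
  openMediaSource(): Promise<void>;                    // create and attach
  loadManifest(url: string): Promise<StreamInfo[]>;    // request and parse
  createBuffer(mimeType: string): (data: ArrayBuffer) => Promise<void>;
  loadSegment(url: string): Promise<ArrayBuffer>;
}

async function playStreams(
  deps: PlayerDeps,
  manifestUrl: string,
  firstSegment: number,
  segmentCount: number
): Promise<void> {
  // Create the MediaSource and wait until the media element has opened it.
  await deps.openMediaSource();

  // Fetch the manifest, then create one SourceBuffer per enabled stream.
  const streams = await deps.loadManifest(manifestUrl);
  const tracks = streams.map((stream) => ({
    stream,
    append: deps.createBuffer(stream.mimeType),
  }));

  // Fetch and append segments, repeating for each segment number in turn.
  for (let n = firstSegment; n < firstSegment + segmentCount; n++) {
    for (const track of tracks) {
      const data = await deps.loadSegment(track.stream.segmentUrl(n));
      await track.append(data);
    }
  }
}
```

Keeping the browser calls behind an interface like this is a design choice for the sketch; it also happens to make the download and append policy easy to change without touching the control flow.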
Where Next?
MSE has recently reached Candidate Recommendation stage, meaning that it should be complete enough to allow implementation, but browser support is still limited.
Right now, Chrome (33 or higher), and IE11 on Windows 8.1, are the only browsers we’ve seen which support enough features for our trial. If you’re a fan of Firefox, Safari or other browsers, these currently have incomplete support, though many of these vendors have publicly stated they are working on it.
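A page can check for this support at run time before trying to start playback. The sketch below relies on the static MediaSource.isTypeSupported() method; the function name canPlayWithMse and the interface are our own, and the codec string shown is the usual identifier for AAC-LC in an MP4 container.

```typescript
// The static part of the MediaSource constructor that this check uses.
interface MediaSourceStatics {
  isTypeSupported(mimeType: string): boolean;
}

// Hypothetical helper: true only if MSE exists in this browser and it
// reports support for the given container and codec.
function canPlayWithMse(
  mediaSourceCtor: MediaSourceStatics | undefined,
  mimeType: string
): boolean {
  return mediaSourceCtor !== undefined && mediaSourceCtor.isTypeSupported(mimeType);
}

// In a browser:
//   const supported = canPlayWithMse(
//     (window as any).MediaSource,
//     'audio/mp4; codecs="mp4a.40.2"'
//   );
//   if (!supported) { /* show a message or fall back to another player */ }
```

Passing the constructor in as a parameter means the check fails gracefully, rather than throwing, in browsers where MediaSource is not defined at all.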
As support for these features becomes more widespread, we expect that more and more content on the Internet will be delivered this way. Other content providers are also starting to use these techniques: Netflix has deployed an MSE-based player, which is the default player if you are using IE11 on Windows 8.1. YouTube has also deployed an MSE-based player for some content on some platforms.
Although we are actively experimenting with MSE in BBC R&D, there are no immediate plans to launch any BBC services using the technology. Nevertheless, HTML5, MPEG-DASH and MSE are a powerful set of standards that are sure to play a significant role in delivering media content on the Internet in the coming years.
By Dave Evans, BBC R&D