Performance Comparison of HEVC, VP9, and H.264 Encoders

This work presents a performance comparison of the two latest video coding standards, H.264/MPEG-AVC and H.265/MPEG-HEVC (High Efficiency Video Coding), as well as the recently published proprietary video coding scheme VP9.

According to the experimental results, which were obtained for a whole test set of video sequences by using similar encoding configurations for all three examined representative encoders, H.265/MPEG-HEVC provides significant average bit-rate savings of 43.3% and 39.3% relative to VP9 and H.264/MPEG-AVC, respectively.

As a particular aspect of the conducted experiments, it turned out that the VP9 encoder produces an average bit-rate overhead of 8.4% at the same objective quality, when compared to an open H.264/MPEG-AVC encoder implementation – the x264 encoder. On the other hand, the typical encoding times of the VP9 encoder are more than 100 times higher than those measured for the x264 encoder.

When compared to the full-fledged H.265/MPEG-HEVC reference software encoder implementation, the VP9 encoding times are lower by a factor of 7.35, on average.

Building an Online Video Player with DASH-264

This article discusses the current state of online video, delves into the DASH standard, explores the challenges of building a DASH player, and, finally, walks through the basics of implementing the open source Dash.js player.

FIMS, SOA and Media Applications

An interesting white paper by David Austerberry.

x265 Evaluator's Guide

An interesting guide about x265, an open source HEVC implementation.

Quiptel Brings Greater Intelligence to Streaming

Quiptel, a five-year-old start-up company, claims it has an online media platform that improves the user experience of streaming audio and video while making more efficient use of available network bandwidth. It is a bold claim, which acting chief executive Richard Baker put to the test at a low-key launch in a London hotel.

Invited media and analysts were shown multiple high-definition streams over a modest broadband connection. It appeared to work well in the demonstration, with rapid media start-up a notable feature, said to be three to six times faster than conventional online video approaches, although there had been problems earlier in the day when the hotel apparently lost its network access.

The patented technology appears to be based on using multiple logical network routes and a network overlay that intelligently manages traffic to optimise use of available access network bandwidth.

The result, Quiptel claims, is that more of the available network capacity is used to deliver sound and pictures, while operators can serve more customers with equivalent infrastructure.

Quiptel says its approach means an operator can deliver up to 30% more streams than HTTP Live Streaming for the same capacity. Some of the claims are documented in a detailed technical white paper that benchmarks QMP QFlow against Apple HLS. It says that HLS incurs more bandwidth overhead, above the bitrate of the audio and video themselves.

However, given the increasing capacity of connections and falling connectivity costs, the assumed savings may be less significant than potential improvements to quality of service, particularly over constrained connections or in congested network conditions.

The concept of adaptive bitrate streaming has been around since the turn of the century. Many online video services currently use it: files are broken into chunks, and the player can change stream quality dynamically to maintain continuity in changing network conditions.

What Quiptel appears to be doing is adding multipath connections to optimise delivery over diverse network routes.

It is an approach that is familiar to informitv from its groundbreaking work with Livestation, a pioneer of live peer to peer streaming.

In principle it can provide more robust delivery in changing network conditions. In theory that is inherent in internet protocols, but so-called overlay networks can add more intelligent routing that can optimise distribution dynamically.

One of the challenges is that it requires more intelligence to be built into the player application. Quiptel has clients for multiple operating systems, including personal computers and iOS and Android devices, and they can also be embedded in smart televisions, set-top boxes or media players. Quiptel showed an end-to-end system using an Android set-top box.

Quiptel is aiming to offer QMP, the Quiptel Media Platform, to service providers on a white label basis or as a licensed technology.

The company is based in Hong Kong. The founder is Peter Do, who previously worked with voice over internet protocol systems. Just as VOIP services, of which Skype, now owned by Microsoft, is the best known, have disrupted traditional telephony by enabling audio and video over broadband networks, so Quiptel hopes to enable a premium media experience over either managed or unmanaged networks.

“It provides a greater than 50% capex and opex saving for service providers over traditional IPTV systems, enabling a quicker time to market while expressly focusing on quality of service video delivery,” he said.

QMP includes various components, known as QFlow, QNav and QRouter, based on patented core technologies and intelligent, network-optimising traffic management designed to deliver high-quality video across managed and unmanaged mobile and broadband networks to multiple devices.

Quiptel faces a challenge in deploying its approach with service providers that have already made their technology choices, but it could provide an advantage to those looking to roll out new services.

Source: informitv

MPEG-DASH Ecosystem Status

An interesting article by Nicolas Weil.

EBU Unveils QC Criteria

The EBU has published the first release of its QC Criteria (EBU Tech 3363) developed by its Strategic Programme on Quality Control. EBU Tech 3363 is a large collection of QC checks that can be applied to file-based audiovisual content.

Examples include the detection of visual test patterns, loudness level compliance checks, colour gamut verification, looking for image sequences that may trigger epileptic seizures, etc. This collection of QC tests can be seen as a 'shopping list' which media professionals can use to, for example, create their own delivery specifications or establish test protocols for archive transfer projects.

Each QC Criterion in the list features a definition, references, tolerances, and an example, to help users reproduce the same tests with potentially different equipment (e.g. think of a broadcaster who receives material from a wide range of post production houses). Obviously such information can get quite technically detailed.

That is why the EBU group decided to present the overview of QC Criteria in a tabular format, similar to the well-known Periodic Table of chemical elements. Several characteristics support this metaphor, including the concept of specifying the tests as 'atomically' as possible, the fact that some tests possess (much) more complex properties than others, and the idea of categorizing them into groups with similar characteristics.

For the EBU QC Criteria these are: audio, video, format/bitstream and metadata/other.

Source: EBU

Interoperability, Digital Rights Management and the Web

We are on the verge of an important inflection point for the Web. In the next few years commercial web video delivery utilizing new, international standards (DASH Media Ecosystem) will become commonplace. These standards will enable cross-platform, interoperable media applications and will transform the media entertainment industry, delight consumers and expand the nature of the Web.

Although all of the standards outlined below are necessary, the most significant change was the introduction of interoperable digital rights management technologies which enable the distribution of digital media on the open web while respecting the rights of content producers.

Download the white paper

Via Video Breakthroughs

A Report on the Impact of Cloud on Broadcast

This white paper has been commissioned by the Digital Production Partnership (DPP) to provide an assessment of the applicability of cloud technology and services to broadcast processes and key media business areas.

Initial Report of the UHDTV Ecosystem Study Group

This report provides an overview of image and audio technology standards and requirements for UHDTV production in the professional broadcast domain. This report represents a SMPTE study primarily focused on real time broadcasting and distribution and is therefore not an exhaustive analysis of UHDTV1 and UHDTV2.

Click here for the report

Source: SMPTE

New DASH-AVC/264 Guidelines Include Support for 1080p Video

Version 2.0 of the DASH-AVC/264 guidelines, with support for 1080p video and multichannel audio, is now publicly available on the DASH Industry Forum (IF) website.

The new guidelines include several promised extensions, including one on HD video that moves the recommended baseline from 720p to 1080p.

720p had originally been chosen, according to the initial guidelines released in May, as a "tradeoff between content availability, support in existing devices and compression efficiency." At that time, the baseline video support used the Progressive High Profile Level 3.1 decoder and supported up to 1280x720p at 30 fps.

"The choice for HD extensions up to 1920x1080p and 30 fps is H.264 (AVC) Progressive High Profile Level 4.0 decoder," the new guidelines state, adding support for Level 4.0 decoders that was lacking in the previous set of guidelines.

In addition, the guidelines also provide a way to handle standard definition (SD) content.

"It is recognized that certain clients may only be capable to operate with H.264/AVC Main Profile," the guidelines state. "Therefore content authors may provide and signal a specific subset of DASH-AVC/264 by providing a dedicated interoperability identifier referring to a standard definition presentation. This interoperability point is defined as DASH-AVC/264 SD."

The new guidelines also cover several multichannel audio options.

"The baseline 1.0 version of DASH-AVC/264 only required support for HE-AACv2 stereo," says Will Law, secretary of DASH IF and Chairman of its Promotions Working Group. "Version 2.0 introduces multichannel Dolby, DTS and also Fraunhofer profiles."

Law also says that there will be a number of DASH-AVC demonstrations at IBC at Amsterdam's RAI Convention Centre on September 12, 2013. "These demonstrations will show the latest advancements in the DASH workflow, from encoding, through delivery and playback, including 4K video, HEVC and multichannel audio," says Law. "You'll also see HbbTV and multi-screen applications as well as solutions for DASH use in the broadcast world."

These demonstrations will occur at various booths, including Akamai—the company where Law works as a Principal Architect for Media—Ericsson, Haivision, Microsoft, Nagra, and a host of others.

Digital Primates has been hosting a demonstration of a JavaScript version of a DASH-AVC/264 reference player. Dash.js is also being reviewed for its 1.0 release, according to Law, with release planned just prior to IBC.

The official version will be launched soon at the DASH IF site but until then the Digital Primates demo can be found on their site. The demo requires Chrome or Internet Explorer 11 (IE); PlayReady DRM playback is currently only available with IE for this demo.

As the DASH IF points out, DASH-AVC/264 "does not intend to specify a full end-to-end DRM system" but it does provide a framework for multiple DRMs to protect DASH content. The guidelines allow the addition of "instructions or Protection System Specific, proprietary information in predetermined locations to DASH content" that has previously been encrypted with what's generally known as the Common Encryption Scheme (ISO/IEC 23001-7).

By Tim Siglin, StreamingMedia

Telestream Announces Open-Source HEVC Encoder Project

Telestream has announced the public availability of an open source H.265 (HEVC) encoder. The new project aims to create the world’s most efficient, highest quality H.265 codec.

The initiative is being introduced under both an open source and a commercial license model and is being managed by co-founder MulticoreWare Inc, Telestream’s development partner.

“Telestream and MulticoreWare have had great success in the acceleration and commercial deployment of x264 and believe that a similar approach with the collaborative development of the next generation of high-efficiency codecs will benefit the industry,” commented Shawn Carnahan, CTO at Telestream. “The x264 project proved the effectiveness of developing a codec of this complexity. Leveraging the x264 technology in this new project will ensure that the new codec is as robust, efficient and high quality as its predecessor.”

Jason Garrett-Glaser, lead developer of the x264 project added: "Previous collaboration between Telestream and MulticoreWare led to successful work on the GPU acceleration of x264, a task deemed by many to be incredibly difficult, if not impossible. With these accomplishments in mind, I am excited to support Telestream in the founding of the x265 project, which follows in the x264 tradition of high performance, quality, and flexibility under an open source license and business model."

Access is free under GNU LGPL licensing, and commercial licenses are available for companies wishing to use the resulting implementation in their products. More information can be found on the project website, where companies and individuals can contribute to the project.

Source: TVBEurope

ASSIMILATE's Universal Media Player

SCRATCH Play supports a wide range of media formats. From cinematic RAW files (RED, Arri, Sony, Canon, Phantom, etc) to DSLR RAW files (Canon 5D, Nikon N600, etc) to editorial formats (MXF, WAV, etc) to pro VFX/still formats (DPX, EXR, etc). Even web-based media (QuickTime, Windows Media, MP4, H.264, etc) and still image formats (TIFF, JPG, PNG, etc).

SCRATCH Play features powerful color-correction tools that let you set looks on-set, and generate LUTs, CDLs or JPEG snapshots. This is the same color technology found in SCRATCH - ASSIMILATE’s world-class professional color-grading application.

SCRATCH Play is not just any media player. Sure, it plays virtually anything, but it also supports features only found in professional applications including camera metadata display, clip framing, rotating and resizing.

Download the FREE version:
For Windows
For Mac OS X

Who is Powering the Rise of Online Video?

A confluence of technologies, evolving business models and changing consumer lifestyles is propelling the rise of online video and fundamentally transforming TV, advertising and content delivery methods.

Here are the online video ecosystem segments and companies that are giving rise to this transformation.

Working with HEVC

A nice introduction to HEVC.

By Ken McCann, DVB-SCENE Magazine

Media Storage Performance

This document details work carried out by BBC R&D for the EBU Project Group on Future Storage Systems (FSS).

Media storage is still expensive and very specialist, with a few suppliers providing high performance network storage solutions to the industry. Specifying, selecting and configuring storage is very complex, with technical decisions having far reaching cost and performance implications.

The viability of true file based production will not be determined by storage performance alone. How applications, networks and protocols behave is fundamental to getting the best performance out of network storage.

This paper details the experience gained from testing two different approaches to high performance network storage and examines the key issues that determine performance on a generic Ethernet network.

All the graphs shown were produced from measured performance data using the BBC R&D Media Storage Meter open source test tool.

Open Source DPP Creation

Open Broadcast Systems has published an interesting DPP creation workflow based on open source tools.

Source: Video Breakthroughs

SOA: The Business of Broadcast

Broadcasters have many challenges as the media business evolves, driven by new consumer devices and the increase in mobile viewing. National broadcasters are facing more competition from global operators. New entrants like YouTube and Netflix have changed the VOD landscape. Broadcasters that once aired one channel now air a multiplex of linear channels, as well as providing catch-up and mobile services. Summing up, broadcasters must deliver to more platforms, linear and on-demand, in a more competitive business environment.

To meet these challenges, a business must become more agile. Many other sectors have faced similar challenges, and part of the solution for many has been to turn to new software applications, particularly Business Process Management (BPM) and the Service-Oriented Architecture (SOA). Although each can be used stand-alone, BPM and SOA are frequently used in concert as a platform to improve the performance of a business.

Operations that use videotape were constrained by the need for manual handling, but as content migrates from videotape to digital files, the way is open to use IT-based methodologies, including BPM and SOA, to aid broadcast operations.

What is SOA?
SOA is a design methodology for software systems. SOA is not a product, but an architecture to deploy loosely coupled software systems to implement the processes that deliver a business workflow. SOA provides a more viable architecture to build large and complex systems because it is a better fit to the way human activity itself is managed — by delegation. SOA has its roots in object-oriented software and component-based programming.

In the context of the media and entertainment sector, SOA can be used to implement a “media factory,” processing content from the production phase through to multi-platform delivery.

Legacy Broadcast Systems
Traditional broadcast systems comprise processing silos coupled by real-time SDI connections, file transfer and an assortment of control protocols. Such systems are optimized for a specific application and may provide a good price/performance ratio with high efficiency.

The tight coupling of legacy systems makes it difficult to upgrade or replace one or more components. These applications are typically coupled via proprietary APIs, as shown in Figure 1. If the software is upgraded to a new version, the API can change, necessitating changes to other applications that are using the API — work that is usually custom.

Figure 1. Legacy systems use many applications, all tightly coupled by proprietary APIs.

This tight coupling makes it difficult to extend the system with new functionality to meet the ever-changing demands of multi-platform delivery and evolving codec standards. Storage architectures are also changing, with object-based and cloud storage becoming alternatives to on-premise NAS and SAN arrays.

When vendors upgrade products, the new versions often do not support legacy operating systems, leading to the need to replace underlying computer hardware platforms.

Traditional systems are just not agile enough to easily support the new demands of the media business. Often new multi-platform systems are tacked on to existing linear playout systems in an ad-hoc manner to support an immediate demand. The system grows in a way that eventually becomes difficult to maintain and operate.

Monitoring — No Overall Visibility
Traditional systems also suffer from a lack of visibility of the internal processes. Individual processes may display the status on a local user interface, but it is difficult to obtain an overall view (dashboard) of the operation of the business.

As broadcast strives for more efficiency, it is vital to have an overall view of technical operations as an aid to manage existing systems and guide future investment. Many broadcasters already have end-to-end alarm monitoring, but resource usage may only be monitored for billing purposes, and not to gain intimate knowledge of hardware and software utilization.

SOA in Media
SOA is not new; it has been in use for a decade or more in other sectors including defense, pharmaceuticals, banking and insurance. It developed from the principles of object-oriented software design and distributed processing.

If SOA is common in other sectors, why not just buy a system from a middleware provider? The problem lies with the special nature of media operations. The media sector has lagged other sectors in the adoption of such systems for a number of reasons. These include the sheer size of media objects and the duration of some processes. A query for an online airline reservation may take a minute at most; a transcode of a movie can take several hours. Conventional SOA implementations are not well suited to handling such long-running processes.

What is a Service?
A service is a mechanism to provide access to a capability. A transcoding application could expose its capability to transcode files as a transform service. Examples of services in the broadcast domain include ingest, transform, playout, file moves and file archiving. The service is defined at the business level rather than the detailed technical level.

It could be said that many broadcasters already operate service-oriented systems; they just don’t extend the methodology to the architecture of technical systems.

Services share a formal contract; service contracts are already commonplace in broadcasting and across the M&E sector, with companies calling on each other for capabilities such as playout, subtitling and effects. The service-level agreement for playout will include quality aspects such as permitted downtime (for example, 99.999 percent availability). Service contracts operate at the business level, and ultimately may result in monetary exchange.
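As a rough worked example of what a 99.999 percent availability figure means in practice (a small illustrative sketch; the function name below is ours, not from any standard):

```python
# Illustrative only: converting a "five nines" service-level figure
# into a permitted downtime budget per year.

def permitted_downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime budget, in minutes per year, for a given availability."""
    minutes_per_year = 365.25 * 24 * 60  # average year, including leap years
    return minutes_per_year * (1 - availability_percent / 100.0)

# 99.999% availability leaves only about 5.3 minutes of downtime per year.
print(round(permitted_downtime_minutes_per_year(99.999), 1))  # prints 5.3
```

This is why "five nines" playout contracts are so demanding: the entire annual outage budget is a few minutes.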

The business management logic may call for a file to be transcoded from in-house mezzanine to YouTube delivery format, but not define the specifics of a particular make and model of transcoder or the detail of the file formats. This abstracts the business logic from the underlying technical platforms. A generic service interface for file transform can be defined, and then each transcoder is wrapped by a service adaptor that handles the complexity of the transcode process. To the business logic, the transcode is simply a job. The abstraction of the capability is a key principle of the SOA.

In a legacy system, the ingest job is delegated to an operator who configures an encoder, and then starts and stops the encoding at the appropriate times. The operator is functioning autonomously during the processes of the job. These concepts of delegation and autonomy are key to the SOA design philosophy. The encoding may well be automated as a computer process, but the principles remain the same.

Because the service is abstracted, it opens the way for broadcasters to leverage cloud services more easily. As an example, at times of peak transcode demand, a cloud transcode service could be used to supplement in-house resources. With a standard service interface for transcoding, the implementation can be an on-premise or cloud-based service. The operation of the services is orchestrated by a layer of middleware, software that manages business processes according to the needs of the business.

A transform service can be used for different business processes. For example, a transcoder could be used to transform files at ingest to the house codec or used to create multiple versions of content for multi-platform delivery. The transform services can be redeployed to different departments as the needs of the file traffic change from hour to hour.
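The transform-service abstraction described above can be sketched in a few lines of Python. This is an illustrative sketch only, not FIMS or any vendor API; every class and method name here is hypothetical.

```python
# Sketch of a generic transform service: business logic requests a transcode
# as a job, while service adaptors hide which transcoder performs the work.

from abc import ABC, abstractmethod

class TransformService(ABC):
    """Generic service interface for file transcoding."""

    @abstractmethod
    def submit(self, source: str, target_format: str) -> str:
        """Submit a transcode job and return a job identifier."""

class OnPremiseTranscoder(TransformService):
    def submit(self, source: str, target_format: str) -> str:
        # A service adaptor wrapping a specific make/model of transcoder.
        return f"onprem-job:{source}->{target_format}"

class CloudTranscoder(TransformService):
    def submit(self, source: str, target_format: str) -> str:
        # Same contract, different implementation; used to absorb peak demand.
        return f"cloud-job:{source}->{target_format}"

def orchestrate(service: TransformService, source: str) -> str:
    # The business logic says only *what* it needs (mezzanine to delivery
    # format), not which transcoder performs it.
    return service.submit(source, "youtube-delivery")

print(orchestrate(OnPremiseTranscoder(), "master.mxf"))
print(orchestrate(CloudTranscoder(), "master.mxf"))
```

Because the orchestration depends only on the generic interface, swapping the on-premise adaptor for a cloud one at peak demand requires no change to the business logic, which is the point of the abstraction.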

Planning for a SOA
Migrating from traditional tightly coupled systems to use SOA principles is a big step for a media business. The efficient operation of SOA requires detailed analysis of business needs and definition of services. It also requires rigorous planning of the IT infrastructure, computers and networks for efficient operation of the services. Without the involvement of senior management down to IT services, the benefits of SOA are unlikely to be fully realized.

For broadcasters used to running departmental silos, many with real-time elements, the move to SOA will be a radical change to the way the business operates. However, the advantages of the SOA and allied systems like BPM are proving attractive propositions for the broadcaster — or service provider — running complex file-based operations for multi-platform delivery.

The problems facing a media company looking to embrace SOA and BPM include change management and the sheer difficulty of staying on-air through such huge changes in the technical infrastructure supporting broadcast operations.

Many media companies have embraced the architecture, with early adopters undertaking considerable original development of such components as service adapters — the vital link between a service like transcoding and the workflow orchestration middleware.

The use of consultants or internal software services to build a media SOA will achieve the goal, but does it make sense for all media businesses to go their own way?

It was this issue on which the Advanced Media Workflow Association (AMWA) and the European Broadcasting Union (EBU) independently agreed when, in 2010, they decided to pool resources and set up the joint Framework for Interoperable Media Services (FIMS) Project, which would develop standards for a framework to implement a media-friendly SOA.

The road will be long, and many obstacles remain to be resolved, but the success of this project will benefit both vendors and media companies in the long run.

The FIMS solution aims to provide a flexible and cost-effective solution that is reliable and future-proof. It should allow best-of-breed content processing products to be integrated with media business systems.

The FIMS team released V1.0 in 2012 as an EBU specification, Tech 3356. Three service interfaces have been specified: transform, transfer and capture.

The FIMS Project has expanded on the conventional SOA with additional features to meet the needs of media operations. Specifically FIMS adds asynchronous operation, resource management, a media bus and security.

Asynchronous operation allows for long-running services: a transcode may take hours, whereas conventional SOA implementations assume processes that complete in seconds or minutes.

Although services are loosely coupled to the orchestration, jobs can still be run with time constraints. This may be simply to start a job at a certain time, but services can also be real time, like the capture and playout of streams. In these cases, the job requests for the service will also include start and stop times for the process. For playout, this concept is no different from a playlist or schedule.
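The asynchronous, long-running job pattern described above can be sketched as follows. This is an illustrative Python sketch, not the actual FIMS (EBU Tech 3356) schema; all class, field and state names are hypothetical.

```python
# Sketch of asynchronous job handling: the caller submits a job, optionally
# with scheduled start/stop times, and polls its status rather than blocking
# on a synchronous reply.

from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class TransformJob:
    job_id: str
    start_time: Optional[str] = None   # e.g. a scheduled capture/playout start
    stop_time: Optional[str] = None
    status: str = "queued"             # queued -> running -> completed/failed

class JobService:
    def __init__(self) -> None:
        self._jobs: Dict[str, TransformJob] = {}

    def submit(self, job: TransformJob) -> str:
        self._jobs[job.job_id] = job
        return job.job_id

    def poll(self, job_id: str) -> str:
        # In a real system this would query the remote service asynchronously.
        return self._jobs[job_id].status

    def advance(self, job_id: str, status: str) -> None:
        # Stand-in for the service reporting progress back to the orchestrator.
        self._jobs[job_id].status = status

svc = JobService()
svc.submit(TransformJob("tx-001", start_time="2013-09-12T10:00:00Z"))
svc.advance("tx-001", "running")
print(svc.poll("tx-001"))  # prints: running
```

The orchestrator can check progress at intervals over a job that runs for hours, which is exactly what a synchronous request/reply model cannot accommodate.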

SOA typically is based on an Enterprise Service Bus (ESB) that carries XML messages between service providers and consumers. A media bus provides a parallel bus to the ESB to carry the large media essence files. Many file-based operations will already have media IP networks that can be adapted to provide the platform for the media bus, as shown in Figure 2.

Figure 2. SOA loosely couples autonomous services with a central orchestration engine
calling services to deliver the business requirements.

The Future
Software methodologies like SOA and BPM help media businesses manage file-based operations more efficiently and better serve the needs of multi-platform delivery. They provide a holistic approach to running business operations, offering better visibility of operations and simpler ways to leverage cloud services.

They have proved successful in other sectors and are ready to meet the unique needs of the media sector.

By David Austerberry, Broadcast Engineering

The Current State of Android and Video

In the OS landscape, Android is by far the most widely used for tablets and mobile devices. With over 900 million activations and approximately 70% market share in the space, the platform is not only the most popular, but also the most fragmented in terms of OEMs and OS versions.

Android History and Origins
Android is a Java-based operating system used for mobile phones and tablet computers. It was introduced and initially developed by Android Inc., with financial backing from Google, which acquired the company in 2005. Google announced its mobile plans for Android in 2007, and the first iteration of Android hit the shelves in 2008. Android is the world’s most widely distributed and installed mobile OS. However, in terms of application usage and video consumption, iOS devices lead the way; a consistent user experience and standardized video playback are two of the reasons.

Performance of Video on Android
When running video on Android devices, the experience varies from OEM to OS version to media source. Because of this lack of standardization with video, we wanted to give an overall look at the top mobile devices running Android to determine how video is delivered and how it performs across a few key sites and platforms.

We tested the following top devices running the most used versions of Android (2.3, 4.0, 4.1 or 4.2):

  • Google Nexus 7
  • Google Nexus 4
  • Samsung Galaxy 4
  • HTC One
  • Samsung Galaxy II
  • HTC EVO 4G

In summary, the newer versions had overall better video capabilities and quality. On devices running 4.1 or higher, the video players were generally built in, and most were shown in one-third of the screen and ran with little interruption.

On the devices running older versions of Android, the experience was inconsistent across sites and video performance wasn’t as strong. Some of the top video sites offered video display either in a dedicated player or in the web browser, and some had very poor video viewing capabilities.

A sample look at the different variations of video transfer and display on the Android devices is below:

Open Source on Android
Android is open source: Google releases the code under the Apache License. As a result, every OEM modifies the code for its own devices.

OEMs create their own code and specifications for Android for each device, which makes any standardization very difficult. When testing different versions of Android on different target devices, there are many inconsistencies.

Google regularly releases updates for Android, which further complicates things. End users often do not upgrade, either because they don’t know how or because their device does not support the new release. The scattered uptake of updates further frustrates any efforts at standardization.

Two of the largest and most widely used Android OEMs both released their latest open source code earlier this year. For the Samsung Galaxy code, please click here. For the HTC One, click here.

Versions of Android
In 2008 Android v1.0 was released to consumers. Starting in 2009, Android started using dessert and confection code names which were released in alphabetical order: Cupcake, Donut, Eclair, Froyo, Gingerbread, Honeycomb, Ice Cream Sandwich, and the latest, Jelly Bean.

A historical look at the Android devices is below:

In terms of market share, Gingerbread remains the most popular.

Top Android Devices per OS version
The top devices running Android in terms of both sales and popularity come from various OEMs, with the majority from Samsung, HTC, LG and Asus. A few of the top devices from the most widely used Android OS versions are as follows:

DRM Content on Android
Android offers a DRM framework for all devices running OS 3.0 and higher. On top of that framework, Google’s Widevine DRM (free on all compatible Android devices) provides consistent DRM across devices. On all devices running 3.0 and higher, the Widevine plug-in is integrated with the Android DRM framework to protect content and credentials; the level of content protection, however, depends on the OEM device capabilities. The plug-in provides licensing, safe distribution and protected playback of media content.

The image below shows how the framework and Widevine work together.


Closed Captions on Android
As developers know, closed captioning is not a simple video “feature” that can just be switched on. There are a number of formats, standards, and approaches, and it is especially challenging for multiscreen publishers. On Android devices, closed captioning support varies from app to app. However, any device running Jelly Bean 4.1 or higher can use the platform media player, which supports internal and external subtitles.

For devices running Gingerbread or lower, which have no support for rendering subtitles, you can either add subtitle support yourself or integrate a third-party solution.

Most larger broadcasters pushing content to OTT devices now serve closed captioning on Android (Hulu Plus, HBO GO, and Max Go to name a few).

Does Android support HLS?
Android has limited support for HLS (Apple’s HTTP Live Streaming protocol), and support differs from one version or device to the next. Devices before Android 3.0 do not support HLS at all. Android 3.0 (Honeycomb) nominally added support, but excessive buffering often caused streams to crash. Devices running Android 4.x and above support HLS, but inconsistencies and problems remain.
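The version-dependent behavior described above often ends up encoded as a capability check in player code. A minimal sketch follows; the API-level thresholds mirror the version history above, but real devices vary even within a version, so treat this as a rough guide rather than a device matrix:

```python
def hls_support(api_level):
    """Rough HLS support classification by Android API level; real-world
    behavior varies by device and OEM firmware, so this is only a guide."""
    if api_level < 11:       # pre-Honeycomb (Android 2.x and below)
        return "none"
    if api_level < 14:       # Honeycomb 3.x: nominal support, unreliable
        return "unreliable"
    return "partial"         # 4.x and above: works, with inconsistencies

def choose_protocol(api_level):
    """Fall back to progressive MP4 when HLS can't be trusted."""
    return "hls" if hls_support(api_level) == "partial" else "mp4"
```

A publisher would typically pair a check like this with server-side device detection, since the API level alone does not capture OEM-specific quirks.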

Best Practices for Video on Android
For deploying video on Android, there are several suggested specifications to follow. Below is a list of file types supported by Android devices. Developers can use the media codecs provided by any Android-powered device, or additional media codecs developed by third-party companies. To play videos that fall outside these formats, find a multi-format video player or convert the videos to Android-compatible formats using an encoding service.

Video Specifications for Android
Below are the recommended encoding parameters for Android video from the Android developer homepage. Any video encoded with these parameters should be playable on Android phones.

Video Encoding Recommendations
The table below lists examples of video encoding profiles and parameters that the Android media framework supports for playback. In addition to these encoding parameter recommendations, a device’s available video recording profiles can be used as a proxy for media playback capabilities. These profiles can be inspected using the CamcorderProfile class, which is available since API level 8.

For video content that is streamed over HTTP or RTSP, there are additional requirements:
  • For 3GPP and MPEG-4 containers, the moov atom must precede any mdat atoms, but must follow the ftyp atom.
  • For 3GPP, MPEG-4, and WebM containers, audio and video samples corresponding to the same time offset may be no more than 500 KB apart. To minimize this audio/video drift, consider interleaving audio and video in smaller chunk sizes.
For information about how to target your application to devices based on platform version, read Supporting Different Platform Versions.
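The moov-before-mdat requirement above can be verified programmatically. Below is a minimal sketch that scans the top-level boxes of an ISO base media (MP4/3GPP) byte string and checks the required ordering; it handles only plain 32-bit box sizes, so it is illustrative rather than production-ready:

```python
import struct

def top_level_boxes(data):
    """Yield (type, offset) for each top-level box of an MP4/3GPP byte
    string. Only plain 32-bit box sizes are handled (no largesize)."""
    off = 0
    while off + 8 <= len(data):
        size, = struct.unpack(">I", data[off:off + 4])
        box_type = data[off + 4:off + 8].decode("ascii")
        yield box_type, off
        if size < 8:
            break  # malformed or unsupported size field; stop scanning
        off += size

def streamable(data):
    """True if ftyp precedes moov and moov precedes every mdat atom."""
    order = [t for t, _ in top_level_boxes(data)]
    if "ftyp" not in order or "moov" not in order:
        return False
    return (order.index("ftyp") < order.index("moov")
            and all(order.index("moov") < i
                    for i, t in enumerate(order) if t == "mdat"))
```

Tools such as qt-faststart and MP4Box perform the corresponding fix, relocating the moov atom to the front of the file so playback can begin before the download completes.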


NTU Invention Allows Clear Photos in Dim Light

Cameras fitted with a revolutionary new sensor will soon be able to take clear and sharp photos in dim conditions, thanks to an image sensor invented at Nanyang Technological University (NTU).

The new sensor, made from graphene, is believed to be the first that can detect broad-spectrum light, from the visible to the mid-infrared, with high photoresponse, or sensitivity. This makes it suitable for use in all types of cameras, including infrared cameras, traffic speed cameras, satellite imaging, and more.

Not only is the graphene sensor 1,000 times more sensitive to light than the low-cost imaging sensors found in today’s compact cameras, it also uses 10 times less energy because it operates at lower voltages. When mass produced, graphene sensors are expected to cost at least five times less.

Graphene, made of pure carbon atoms arranged in a honeycomb structure, is only one atom thick, about a million times thinner than the thickest human hair. It is known for its high electrical conductivity, among other properties such as durability and flexibility.

The inventor of the graphene sensor, Assistant Professor Wang Qijie of NTU’s School of Electrical & Electronic Engineering, said it is believed to be the first time a broad-spectrum, highly photosensitive sensor has been developed using pure graphene.

His breakthrough, made by fabricating a graphene sheet into novel nanostructures, was published in Nature Communications, a highly rated research journal.

“We have shown that it is now possible to create cheap, sensitive and flexible photo sensors from graphene alone. We expect our innovation will have great impact not only on the consumer imaging industry, but also in satellite imaging and communication industries, as well as the mid-infrared applications,” said Asst Prof Wang, who also holds a joint appointment in NTU’s School of Physical and Mathematical Sciences.

“While designing this sensor, we have kept current manufacturing practices in mind. This means the industry can in principle continue producing camera sensors using the CMOS (complementary metal-oxide-semiconductor) process, which is the prevailing technology used by the majority of factories in the electronics industry. Therefore manufacturers can easily replace the current base material of photo sensors with our new nano-structured graphene material.”

If adopted by industry, Asst Prof Wang expects the cost of manufacturing imaging sensors to fall, eventually leading to cheaper cameras with longer battery life.

How the Graphene Nanostructure Works
Asst Prof Wang came up with an innovative idea to create nanostructures on graphene which will “trap” light-generated electron particles for a much longer time, resulting in a much stronger electric signal. Such electric signals can then be processed into an image, such as a photograph captured by a digital camera.

The “trapped electrons” are the key to achieving high photoresponse in graphene, which makes it far more effective than normal CMOS or CCD (charge-coupled device) image sensors, said Asst Prof Wang. Essentially, the stronger the electric signals generated, the clearer and sharper the photos.

“The performance of our graphene sensor can be further improved, such as the response speed, through nanostructure engineering of graphene, and preliminary results already verified the feasibility of our concept,” Asst Prof Wang added.

This research, costing about $200,000, is funded by the Nanyang Assistant Professorship start-up grant and supported partially by the Ministry of Education Tier 2 and 3 research grants.

Development of this sensor took Asst Prof Wang a total of 2 years to complete. His team consisted of two research fellows, Dr Zhang Yongzhe and Dr Li Xiaohui, and four doctoral students Liu Tao, Meng Bo, Liang Guozhen and Hu Xiaonan, from EEE, NTU. Two undergraduate students were also involved in this ground-breaking work.

Asst Prof Wang has filed a patent through NTU’s Nanyang Innovation and Enterprise Office for his invention. The next step is to work with industry collaborators to develop the graphene sensor into a commercial product.

Source: Nanyang Technological University

Digital Camera Add-On Means the Light's Fantastic

KaleidoCamera is developed by Alkhazur Manakov of Saarland University in Saarbrücken, Germany, and his colleagues. It attaches directly to the front of a normal digital SLR camera, and the camera's detachable lens is then fixed to the front of the KaleidoCamera.

After light passes through the lens, it enters the KaleidoCamera, which splits it into nine image beams according to the angle at which the light arrives. Each beam is filtered, before mirrors direct them onto the camera's sensor in a grid of separate images, which can be recombined however the photographer wishes.

This set-up allows users far more control over what type of light reaches the camera's sensor. Each filter could allow a single colour through, for example; colours can then be selected and recombined at will in software after the shot is taken. Similarly, swapping in filters that mimic different aperture settings allows users to compose other-worldly images with high dynamic range in a single shot.

And because light beams are split up by the angle at which they arrive, each one contains information about how far objects in a scene are from the camera. With a slight tweak to its set-up, the prototype KaleidoCamera can capture this information, allowing photographers to refocus images after the photo has been taken.

Roarke Horstmeyer at the California Institute of Technology in Pasadena says the device could make digital SLR photos useful for a range of visual tasks that are normally difficult for computers, like distinguishing fresh fruit from rotten, or picking out objects from a similarly coloured background. "These sorts of tasks are essentially impossible when applying computer vision to conventional photos," says Horstmeyer.

The ability to focus images after taking them is already commercially available in the Lytro – a camera designed solely for that purpose. But while Lytro is a stand-alone device which costs roughly the same as an entry-level digital SLR, KaleidoCamera's inventors plan to turn their prototype into an add-on for any SLR camera.

Manakov will present the paper at the SIGGRAPH conference in Anaheim, California, this month. He says the team is working on miniaturising the device, and that much of the prototype's current bulk simply makes it easier for the researchers to tweak it for new experiments.

"A considerable engineering effort will be required to downsize the add-on and increase image quality and effective resolution," says Yosuke Bando, a visiting scientist at the MIT Media Lab. "But it has potential to lead to exchangeable SLR lenses and cellphone add-ons."

In fact, there are already developments to bring post-snap refocusing to smartphone cameras, with California-based start-up Pelican aiming to release something next year.

"Being able to convert a standard digital SLR into a camera that captures multiple optical modes – and back again – could be a real game-changer," says Andrew Lumsdaine of Indiana University in Bloomington.

By Hal Hodson, New Scientist

MPEG-DASH: Making Tracks Toward Widespread Adoption

The need to reach multiple platforms and consumer electronics devices has long presented a technical and business headache, not to mention a cost, for service providers looking to deliver online video. The holy grail of a common file format to rule them all has always seemed a quest too far.

Enter MPEG-DASH, a technology with the scope to significantly improve the way content is delivered to any device by cutting complexity and providing a common ecosystem of content and services.

The MPEG-DASH standard was ratified in December 2011 and tested in 2012, with deployments across the world now underway. Yet just as MPEG-DASH is poised to become a universal point for interoperable OTT delivery comes concern that slower-than-expected initial uptake will dampen wider adoption.

A Brief History of DASH
The early days of video streaming, reaching back to the mid-1990s, were characterized by battles between the different technologies of RealNetworks, Microsoft, and then Adobe. By the mid-2000s, the vast majority of internet traffic was HTTP-based, and Content Delivery Networks (CDNs) were increasingly being used to ensure delivery of popular content to large audiences.

“[The] hodgepodge of proprietary protocols -- all mostly based on the far less popular UDP -- suddenly found itself struggling to keep up with demand,” explains Alex Zambelli, formerly of Microsoft and now principal video specialist for iStreamPlanet, in his succinct review of the streaming media timeline for The Guardian.

That changed in 2007 when Move Networks introduced HTTP-based adaptive streaming, adjusting the quality of a video stream according to the user’s bandwidth and CPU capacity.

“Instead of relying on proprietary streaming protocols and leaving users at the mercy of the internet bandwidth gods, Move Networks used the dominant HTTP protocol to deliver media in small file chunks while using the player application to monitor download speeds and request chunks of varying quality (size) in response to changing network conditions,” explains Zambelli in the article. “The technology had a huge impact because it allowed streaming media to be distributed ... using CDNs (over standard HTTP) and cached for efficiency, while at the same time eliminating annoying buffering and connectivity issues for customers.”
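The player-side logic Zambelli describes -- measure download speed, then request the next chunk at a suitable quality -- boils down to a selection rule like the following sketch (the bitrate ladder and the 0.8 safety factor are illustrative assumptions, not values from any particular player):

```python
# Available renditions in bits per second (an illustrative bitrate ladder)
RENDITIONS = [400_000, 800_000, 1_500_000, 3_000_000, 6_000_000]

def pick_rendition(measured_bps, safety=0.8):
    """Choose the highest rendition whose bitrate fits within a safety
    margin of the throughput measured while downloading previous chunks."""
    budget = measured_bps * safety
    fitting = [r for r in RENDITIONS if r <= budget]
    return max(fitting) if fitting else min(RENDITIONS)
```

Real players layer buffer-occupancy heuristics and switch-damping on top of this basic throughput rule, but the core idea, requesting chunks of varying quality in response to changing network conditions, is exactly what this expresses.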

Other HTTP-based adaptive streaming solutions followed: Microsoft launched Smooth Streaming in 2008, Apple debuted HTTP Live Streaming (HLS) for delivery to iOS devices a year later, and Adobe joined the party in 2010 with HTTP Dynamic Streaming (HDS).

HTTP-based adaptive streaming quickly became the weapon of choice for high-profile live streaming events, from the 2010 Vancouver Winter Olympics to Felix Baumgartner’s record-breaking 2012 Red Bull Stratos jump (watched live online by 8 million people).

These and other competing protocols created fresh market fragmentation in tandem with multiple DRM providers and encryption systems, all of which contributed to a barrier to further growth of the online video ecosystem.

In 2009, efforts began among telecommunications group 3rd Generation Partnership Project (3GPP) to establish an industry standard for adaptive streaming. More than 50 companies were involved -- Microsoft, Netflix, and Adobe included -- and the effort was coordinated at ISO level with other industry organizations such as the studio-backed digital locker initiative Digital Entertainment Content Ecosystem (DECE), the Open IPTV Forum (OIPF), and the World Wide Web Consortium (W3C).

MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH, or DASH for short) was ratified as an international standard late in 2011. It was published as ISO/IEC 23009-1 the following April and was immediately heralded as a breakthrough because of its potential to embrace and replace existing proprietary ABR technologies and its ability to run on any device.

At the time, Thierry Fautier, senior director of convergence solutions at Harmonic, said the agreement on a single protocol would decrease the cost of production, encoding, storage, and transport: “This is why everyone is craving to have DASH. It will enable content providers, operators and vendors to scale their OTT business,” he told CSI magazine in February 2012.

In the same article, Jean-Marc Racine, managing partner at Farncombe, said, “By enabling operators to encode and store content only once, [DASH] will reduce the cost of offering content on multiple devices. Combined with Common Encryption (CENC), DASH opens the door for use with multiple DRMs, further optimising the cost of operating an OTT platform.”

The Benefits of DASH
The technical and commercial benefits outlined for MPEG-DASH on launch included the following:

  • It decouples the technical issues of delivery formats and video compression from the more typically proprietary issues of a protection regime. No longer does the technology of delivery have to develop in lockstep with the release cycle of a presentation engine or security vendor.

  • It is not blue sky technology -- the standard acknowledged adoption of existing commercial offerings in its profiles and was designed to represent a superset of all existing solutions.

  • It represented a drive for a vendor-neutral, single delivery protocol to reduce the balkanization of streaming support in CE devices, cutting technical headaches and transcoding costs. Content publishers could generate a single set of files for encoding and streaming, compatible with as many devices as possible, from mobile to OTT to the desktop via plug-ins or HTML5. Consumers, in turn, would not have to worry about whether their devices could play the content they want to watch.

“DASH offers the potential to open up the universe of multi-network, multi-screen and multi-operator delivery, beyond proprietary content silos,” forecast Steve Christian, VP of marketing at Verimatrix. “In combination with a robust protection mechanism, a whole new generation of premium services are likely to become available in the market.”

Perhaps the biggest plus was that, unlike previous attempts to create a truly interoperable file format, all the major players without exception participated in its development. Microsoft, Adobe, and Apple -- as well as Netflix, Qualcomm, and Cisco -- were integral to the DASH working group.

These companies, minus Apple, formed a DASH Promoters Group (DASH-PG), which eventually boasted nearly 60 members and would be formalized as the DASH Industry Forum (DASH-IF), to develop DASH across mobile, broadcast, and internet and to enable interoperability between DASH profiles and connected devices -- exactly what was missing in the legacy adaptive streaming protocols.

The European Broadcasting Union (EBU) was the first broadcast organization to join DASH-IF, helping recommend and adopt DASH in version 1.5 of the European hybrid internet-TV platform HbbTV. Other members have since come on board, including broadcasters in France and Spain, which have already begun deploying DASH for connected TVs, with Germany and Italy expected to follow. In the U.S., DASH is attracting mobile operators, such as Verizon, wanting to deploy eMBMS for mobile TV broadcast over LTE.

What about HLS?
However, there remain some flies in the ointment. The format for DASH is similar to Apple’s HLS, using index files and segmented content to stream to a device where the index file indicates the order in which segments are played. But even though representatives from Apple participated in drawing up DASH, Apple is holding fast to HLS and hasn’t yet publicly expressed its support for DASH.
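The index file in DASH, the Media Presentation Description (MPD), is an XML document listing periods, adaptation sets, and the representations and segments beneath them. The following sketch assembles a toy MPD with Python's ElementTree; the element names follow the DASH schema, but the attribute values are made-up placeholders, not a playable manifest:

```python
import xml.etree.ElementTree as ET

def build_mpd(representations):
    """Assemble a toy static MPD: one Period, one AdaptationSet, and a
    Representation per entry in `representations` (id -> bandwidth, bps)."""
    mpd = ET.Element("MPD", type="static",
                     mediaPresentationDuration="PT60S")
    adaptation = ET.SubElement(ET.SubElement(mpd, "Period"),
                               "AdaptationSet", mimeType="video/mp4")
    ET.SubElement(adaptation, "SegmentTemplate", duration="4",
                  media="$RepresentationID$/segment-$Number$.m4s")
    for rep_id, bandwidth in representations.items():
        ET.SubElement(adaptation, "Representation",
                      id=rep_id, bandwidth=str(bandwidth))
    return ET.tostring(mpd, encoding="unicode")
```

A player reads the Representation entries, picks one that suits its measured bandwidth, and expands the SegmentTemplate to form the URLs of the chunks it requests, which is exactly the index-plus-segments pattern HLS uses with its playlists.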

Neither has Google, though it has confirmed that the standard is being tested in Google Chrome. Some believe that until DASH is explicitly backed by these major players, it will struggle to gain traction in the market.

“Right now there are multiple streaming options and until Apple and Google agree on DASH, it will be a while before there is widespread adoption,” says Hiren Hindocha, president and CEO of Digital Nirvana.

Adobe has encouragingly adopted the emerging video standard across its entire range of video streaming, playback, protection, and monetization technologies. Its backing will greatly reduce fragmentation and costs caused by having to support multiple video formats.

“We believe that if we have Microsoft, Adobe, and to some extent Google implementing MPEG-DASH, this will create a critical mass that will open the way to Apple,” says Fautier. “Timing for each of those companies is difficult to predict though.”

While Apple HLS has considerable momentum, other adaptive streaming protocols are being dropped in favor of DASH. Observers such as David Price, head of TV business development for Ericsson, and Brian Kahn, director of product management for Brightcove, reckon this will leave only two mainstream protocols in use for the vast majority of streaming services.

“Since both Adobe and Microsoft have been pushing DASH as a standard, we can assume that HDS and Smooth Streaming will be replaced by DASH helping to reduce the number of formats,” wrote Kahn in a Brightcove blog post. In an email to me, Kahn wrote, “Additionally, Google Canary has a plugin for MPEG-DASH and it is rumoured that Google created the plug-in internally. In the end, we will probably end up with two main streaming formats: HLS and DASH.”

So why doesn’t the industry just adopt HLS instead of adding another streaming protocol? Kahn’s email points to two reasons. “First, it’s not officially a standard -- it’s a specification dictated by Apple, subject to change every six months. It also doesn’t have support for multi-DRM solutions -- DASH does, which is why most major studio houses have given it their endorsement.”

Other Roadblocks to Adoption
But the road to DASH adoption won’t be a straight one. Kahn highlights in particular the challenge of intellectual property and royalties. “This is undoubtedly an issue which will need to be addressed before DASH can achieve widespread adoption,” he told Streaming Media. “DASH participants such as Microsoft and Qualcomm have consented to collate patents for a royalty free solution, but the likes of Adobe have not agreed.

“Mozilla does not include royalty standards in its products, but without the inclusion of its browser, the likelihood of DASH reaching its goal of universal adoption for OTT streaming looks difficult,” Kahn adds. “Another potential obstacle to standardisation is video codecs -- namely, the need for a standard codec for HTML5 video. Even with universal adoption of DASH by HTML5 browsers, content would still need to be encoded in multiple codecs.”

Ericsson’s Price also notes some concern about the way in which DASH is being implemented: “In regards to the elements that are discretionary, particularly in the area of time synchronization, it is hoped that as adoption becomes wider, there will be industry consensus on the implementation details; the best practice guidelines being created by DASH-IF will further accelerate adoption.”

There are further warnings that delays in implementing DASH could harm its success as a unifying format. A standards effort necessarily involves compromises, and probably the biggest compromises get hidden in the profile support in the overall standards effort. MPEG-DASH in its original specification arguably tried to be everything to everyone and perhaps suffered from excessive ambiguity (a story familiar to anyone acquainted with HTML5 video, notes Zambelli wryly).

“There are several trials and lots of noise about MPEG-DASH, but we’ve yet to see concrete demand that points to DASH being the great unifier,” warns AmberFin CTO Bruce Devlin. “In fact, unless there is some operational agreement on how to use the standard between different platform operators, then it might become yet another format to support.”

“DASH has taken quite a while to gather a following among consumer electronics and software technology vendors, delaying its adoption,” reports RGB Networks’ senior director of product marketing Nabil Kanaan. “The various profiles defined by DASH have added too much flexibility in the ecosystem, at the cost of quick standardisation. We still believe it’s a viable industry initiative and are supporting it from a network standpoint and working with ecosystem partners to make it a deployable technology.”

Elemental Technologies’ VP of marketing, Keith Wymbs, adds, “To date the impact of MPEG-DASH has been to spur the discussion about the proliferation of streaming technologies.”

“MPEG-DASH isn’t in a position where people are thinking that it will be the only spec they’ll need to support in the near to mid-term,” says Digital Rapids marketing director Mike Nann, “but most believe that it will reduce the number of adaptive streaming specifications that they’ll need to prepare their content for.”

Jamie Sherry, senior product manager at Wowza Media Systems, also thinks DASH has had very little impact to date, other than to re-emphasise what must happen for high-quality online video to really become profitable and widespread: “Issues like streaming media format fragmentation must be addressed.

“If the ideals of MPEG-DASH become a reality and traction occurs in terms of deployments, the impact to the market will be positive as operators and content publishers in general will have a real opportunity to grow their audiences while keeping costs in line,” he says.

To address this, the DASH-IF has been hard at work defining a subset of the standard to serve as a base profile that all implementations have to include. This is driven by the decision to focus on H.264/MPEG-4 encoding rather than MPEG-2 (initially both were supported). The result, DASH-AVC/264, was announced in May and is widely tipped to achieve broad adoption by speeding up the development of common profiles that can be used as the basis for interoperability testing.

“As an analogy, look back at the evolution of MPEG-2 and Transport Streams,” says Nann. “If every cable operator, encoder, middleware vendor, and set-top box vendor supported a different subset of parameters, profiles, levels, and features, they might all be within the MPEG-2 and TS specs, but we probably wouldn’t have the widespread interoperability (and thus adoption) we have today. DASH-AVC/264 is a means of doing the same for MPEG-DASH, providing a constrained set of requirements for supporting DASH across the devices that support it, and giving vendors interoperability targets.”

Aside from requiring support for H.264, the DASH-AVC/264 guidelines define other essential interoperability requirements such as support for the HE-AAC v2 audio codec, ISO base media file format, SMPTE-TT subtitle format, and MPEG Common Encryption for content protection (DRM).
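Those interoperability points amount to a checklist, and vendors' conformance tooling reduces to checking a stream's properties against it. In the sketch below, the dictionary keys and value spellings are invented for illustration (and the subtitle requirement is omitted for brevity); the requirements themselves are the ones listed above:

```python
# DASH-AVC/264 interoperability points from the guidelines above; the
# dictionary keys and value spellings here are invented for illustration.
REQUIRED = {
    "video_codec": "h264",
    "audio_codec": "he-aac-v2",
    "container": "isobmff",     # ISO base media file format
    "encryption": "cenc",       # MPEG Common Encryption
}

def davc264_violations(stream):
    """Return the fields of `stream` that don't match the profile."""
    return [key for key, value in REQUIRED.items()
            if stream.get(key) != value]
```

The value of a constrained profile is precisely that such a check is small and unambiguous, which is what makes interoperability testing between encoder and player vendors tractable.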

“The Common Encryption element is particularly interesting because it enables competing DRM technologies such as Microsoft PlayReady, Adobe Access, and Widevine to be used inclusively without locking customers into a particular digital store,” writes Zambelli. “DASH-AVC/264 provides the details desperately needed by the industry to adopt MPEG-DASH and is expected to gain significant traction over the next one to two years.”

Digital Rapids’ Nann says he expects to see increased adoption in 2013 with “considerably more pilot projects as well as commercial deployments,” with growing device support (particularly for consumer viewing devices). “The client device support is going to be one of the biggest factors in how quickly MPEG-DASH rolls out,” says Nann.

Telestream product marketing director John Pallett concurs: “The primary driver for adoption will be the player technology to support it. The companies that develop players are generally working to support MPEG-DASH alongside their legacy formats. Most of the major player companies want to migrate to DASH, but real adoption will come when a major consumer product supports DASH natively. This has not yet happened, but we anticipate that it will change over the next year.”

For Peter Maag, CMO of Haivision Network Video, the value proposition is simple: “MPEG-DASH will simplify the delivery challenge if it is ubiquitously embraced. Realistically, there will always be a number of encapsulations and compression technologies required to address every device.”

The number of trials is growing, and they already include the world’s first large-scale test of MPEG-DASH OTT multiscreen at the 2012 London Olympics with Belgian broadcaster VRT, as well as the first commercial MPEG-DASH OTT multiscreen service, launched with NAGRA and Abertis Telecom in 2012 -- both powered by Harmonic.

“Over the next few years, we believe a significant number of operators will deploy OTT and multiscreen services based on DASH,” suggests Fautier.

In an interview with Streaming Media, Kevin Towes, senior product manager at Adobe, declared 2012 as the year of DASH awareness and 2013 as the year of discovery.

“How can you attach some of these encoders and CDNs and players and devices together to really demonstrate the resiliency and the vision of what DASH is trying to present?” he said. “And then as we go through that it’s about then operationalizing it, getting DASH into the hands of the consumers from a more viewable point of view.”

Elemental Technologies’ Wymbs believes the discussion will evolve in the next 12 months “to one centering on the use of MPEG-DASH as a primary distribution format from centralized locations to the edge of the network where it will then be repackaged to the destination format as required.”

Given the number of elements of the value chain that need to line up for full commercialization -- encoders, servers, CDNs, security systems, and clients as a minimum -- significant commercial rollouts were always likely to take time.

In conclusion, while there are still hurdles to clear, DASH is clearly on the path toward widespread adoption, especially now that DASH-AVC/264 has been approved. According to contributing editor Tim Siglin: “If there is some form of rationalization between HLS and DASH, including the ability to include Apple’s DRM scheme in the Common Encryption Scheme, we might just note 2013 not only as the beginning of true online video delivery growth but also as the point at which cable and satellite providers begin to pay attention to delivery to all devices -- including set-top boxes -- for a true TV Everywhere experience.”

By Adrian Pennington, StreamingMedia

Introduction to Video Coding

By Iain Richardson, Vcodex

Google Adds its Free and Open-Source VP9 Video Codec to Latest Chrome Build

Google announced it has enabled its VP9 video codec by default on the Chrome dev channel. The addition means users of the company’s browser can expect to see the next-generation compression technology available out-of-the-box before the end of the year.

In May, Google revealed it was planning to finish defining VP9 on June 17, after which it would start using the technology in Chrome and on YouTube. On that day, the company enabled the free video compression standard by default in the latest Chromium build, and now it has arrived in the latest Chrome build.

VP9 is the successor to VP8, both of which fall under Google’s WebM project of freeing Web codecs from royalty constraints. Despite the fact that Google unveiled WebM three years ago at its I/O conference, VP8 is still rarely used when compared to H.264, today’s most popular video codec.

“A key goal of the WebM Project is to speed up the pace of video-compression innovation (i.e., to get better, faster), and the WebM team continues to work hard to achieve that goal,” Google says. “As always, WebM technology is 100% free, and open-sourced under a BSD-style license.”

For users, the main advantage of VP9 is that it is claimed to be 50 percent more efficient than H.264, meaning you’ll use roughly half the bandwidth on average when watching a video on the Internet. That doesn’t yet take into account H.265, the successor to H.264, which offers comparable video quality at half the number of bits per second but requires its implementers to pay patent royalties.
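The practical effect of a 50 percent saving is easy to put in concrete terms. A back-of-the-envelope sketch, assuming an illustrative 5 Mbps rate for 1080p H.264 (the figure is an assumption, not a quoted number):

```python
def stream_gigabytes(bitrate_bps, hours):
    """Data transferred by a constant-bitrate stream, in gigabytes."""
    return bitrate_bps * hours * 3600 / 8 / 1e9

h264_bps = 5_000_000       # assumed 1080p H.264 rate (illustrative)
vp9_bps = h264_bps * 0.5   # VP9 at the claimed 50% saving

# Two hours of 1080p: 4.5 GB with H.264 vs. 2.25 GB with VP9
saving = stream_gigabytes(h264_bps, 2) - stream_gigabytes(vp9_bps, 2)
```

At those assumed rates, a single two-hour movie saves a couple of gigabytes, which is why codec efficiency matters so much to both capped mobile users and the services paying for delivery bandwidth.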

Google today claimed VP9 “shows video quality that is slightly better than HEVC (H.265).” The company is of course biased, but we’re sure that comparisons by third parties will start to surface soon.

In the meantime, Google says it is working on refining the VP9 toolset for developers and content creators as well as integrating it with the major encoding tools and consumer platforms. VP9 is already available in the open-source libvpx reference encoder and decoder, but Google still plans to optimize it for speed and performance, as well as roll out improved tools and documentation “over the coming months.”

VP9 is also meant to become part of WebRTC, an open project that lets users communicate in real-time via voice and video sans plugins, later this year. Google has previously said it wants to build VP9 into Chrome, and YouTube has also declared it would add support once the video codec lands in the browser.

The dev channel for Chrome is updated once or twice weekly. Since the feature has made it in there, it won’t be long before it shows up in the beta channel, and then eventually the stable channel.

By Emil Protalinski, The Next Web

HTML5 Video in IE 11 on Windows 8.1

We've previously discussed our plans to use HTML5 video with the proposed "Premium Video Extensions" in any browser which implements them. These extensions are the future of premium video on the web, since they allow playback of premium video directly in the browser without the need to install plugins.

Today, we're excited to announce that we've been working closely with Microsoft to implement these extensions in Internet Explorer 11 on Windows 8.1. If you install the Windows 8.1 Preview from Microsoft, you can visit Netflix today in Internet Explorer 11 and watch your favorite movies and TV shows using HTML5!

Microsoft implemented the Media Source Extensions (MSE) using the Media Foundation APIs within Windows. Since Media Foundation supports hardware acceleration using the GPU, this means that we can achieve high quality 1080p video playback with minimal CPU and battery utilization. Now a single charge gets you more of your favorite movies and TV shows!

Microsoft also has an implementation of the Encrypted Media Extensions (EME) using Microsoft PlayReady DRM. This provides the content protection needed for premium video services like Netflix.

Finally, Microsoft implemented the Web Cryptography API (WebCrypto) in Internet Explorer, which allows us to encrypt and decrypt communication between our JavaScript application and the Netflix servers.

We expect premium video on the web to continue to shift away from using proprietary plugin technologies to using these new Premium Video Extensions. We are thrilled to work so closely with the Microsoft team on advancing the HTML5 platform, which gets a big boost today with Internet Explorer’s cutting edge support for premium video. We look forward to these APIs being available on all browsers.

By Anthony Park and Mark Watson, The Netflix Tech Blog

Captioning for Streaming Video still a "Wild West"

Captioning video on demand “is a wild west,” at least in the U.S., where the service is mandated by law and therefore a key topic for content owners and their partners to get their heads around.

Delivering a measured analysis of the issues, Telestream product manager Kevin Louden captioned his own session, "Practicalities of Putting Captions on IP-Delivered Video Content" with the question: "Can anyone see your subtitles?"

Louden began by pointing out that there are legal, moral, and business reasons to make sure your content is captioned.

“The 21st Century Communications Act in the U.S. mandates that content previously broadcast or intended for broadcast have captions to it,” he explained. “This comes into effect in stages between now and 2016.

“Even if you don't do it by law [no other region of the world has quite the same legislation] some people say it's simply the right thing to do, and from a business perspective you can broaden audiences for your content by reaching out to multiple language groups.”

So how is it done? Just as there are many different video and audio formats for streaming and progressive download protocols, there are many caption file formats for video on demand, the main ones being W3C TT/DFXP and WebVTT/SRT.

The former is an open standard which contains a lot of information about position, font size, color, and so on for a rich presentation of the information and is “potentially very complicated,” he said.

WebVTT/SRT, on the other hand, is a text-based format native to HTML5 video tags, “very simple in its current iteration” but with little or no control of presentation features in the file.

“This is what people cobbled together before there were any standards in place, and because of that there are a lot of entrenched workflows,” Louden said.
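To make the contrast between the two families concrete, here is a minimal sketch that renders the same caption cue in both SRT and WebVTT form. The cue timing and text are hypothetical examples, not taken from any real workflow:

```python
# Minimal sketch: one caption cue rendered as SRT and as WebVTT.
# SRT uses a numeric cue index and a comma as the decimal separator;
# WebVTT uses a period and requires a "WEBVTT" file header.

def srt_cue(index, start, end, text):
    return f"{index}\n{start.replace('.', ',')} --> {end.replace('.', ',')}\n{text}\n"

def webvtt_cue(start, end, text):
    return f"{start} --> {end}\n{text}\n"

srt = srt_cue(1, "00:00:01.000", "00:00:04.000", "Hello, world.")
vtt = "WEBVTT\n\n" + webvtt_cue("00:00:01.000", "00:00:04.000", "Hello, world.")
print(srt)
print(vtt)
```

As the article notes, the simplicity of this text-based layout is exactly why SRT-style workflows became so entrenched before formal standards existed.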

To smooth the multiplicity of formats, two leading standards bodies are attempting to create a universal file interchange format, as a sort of mezzanine or master level.

SMPTE 2052, being proposed in the U.S., is an XML-based time text file which emerged from the Act so that content owners or their partner organisations could create deliverable formats from broadcast content for IP distribution in all its streaming media end user forms.

In Europe, EBU-TT is a similar proposition and a subset of the TTML format, for use as a universal handoff.

For organisations wanting to generate captioning information for linear video on their own websites there are several options. JW Player, for example, has built-in support for WebVTT, SRT and DFXP, while Flow Player supports W3C TT and SRT.

Numerous video encoding tools, perhaps already in situ at a facility, contain subtitling and captioning capabilities for translating between formats.

Alternatively, one can employ graphical overlays or physically burn the subtitles onto the picture, a practice which is still remarkably common, reported Louden. “You don't need special players or sidecar files, but obviously there's not much flexibility.”

Engaging a third-party service provider is a useful way of delegating the problem but, said Louden, “in theory you hand over your master SMPTE TT or EBU TT safe harbour file as the interchange format, but the reality is that people are used to their own existing profiles and will request an SRT, WebVTT format since this is the way it's always been done.”

Turning to adaptive bitrate provision, Louden noted that the main ABR formats cater for different captioning files.

The HLS specification for iOS devices contains a means of embedding 608 captions in a video's MPEG headers, while Smooth Streaming and HTTP Dynamic Streaming both support the sidecar formats DFXP and TTML (useful for repurposing linear and non-linear VoD). Where MPEG-DASH fits into this equation is up in the air.

Louden pointed out a couple of bumps in the road for anyone looking to caption their content, which included taking care of rights, especially when repurposing legacy broadcast content.

“If you sent the work out to a caption house then beware that many of them work on individual negotiations, so while you may have a licence to broadcast that information you may not have the web rights for it,” he advised.

“Also be careful editing content," he said. "Any retiming of the content will have a knock-on to the timecode-synced caption information. You have to be sure when you do your format translation that the captions are retimed too, perhaps manually.”

The demand for a universal captioning standard was agreed on by delegates in the room, but no one really believed that a standard could be agreed or made to work in practice because of commercial pressures among competing vendors.

By way of addendum: Louden noted the little differences in definition between the two continents.

“In the U.S. 'captions' display text and other sound information for the hearing impaired," he said. "Subtitles are translations to different languages, whereas in Europe both of these things are commonly referred to as the same thing—a subtitle.”

By Adrian Pennington, Streaming Media

The 2013 Fletcher Digital Camera Comparison Chart

The 2013 Digital Camera Comparison Chart is a great resource for anyone who wants to quickly see what sort of specifications they're looking at for a particular digital cinema camera.

DASH Industry Forum Releases Implementation Guidelines

DASH Industry Forum (DASH-IF) has released DASH-AVC/264, its recommended guidelines for deployment of the MPEG-DASH standard.

The MPEG-DASH specification provides a standard method for streaming multimedia content over the Internet by simplifying and converging delivery of IP video. This improves user experience while decreasing cost and simplifying workflow.

The DASH-AVC/264 Implementation Guidelines recommend using a subset of MPEG-DASH, with specific video and audio codecs in order to promote interoperability among deployments in the industry.

The DASH-AVC/264 Implementation Guidelines:

  • Provide an open, interoperable standard solution in the multimedia streaming market
  • Address major multimedia streaming use cases including on-demand, live and time-shift services
  • Outline the use of two specific MPEG-DASH profiles
  • Describe specific audio and video codecs and closed caption format
  • Outline the use of common encryption

The Guidelines were created by the 67 industry-leading DASH-IF member companies, including founding members Akamai, Ericsson, Microsoft, Netflix, Qualcomm and Samsung.

The DASH-AVC/264 Implementation Guidelines are also aligned with initiatives from other industry standards bodies and consortia, including HbbTV, DECE, DTG, HD-Forum, OIPF, EBU and DLNA.

Future guideline updates will address advanced services and applications, including multi-channel audio and HEVC support.

Dynamic Target Tracking Camera System Keeps its Eye on the Ball

Source: DigInfo TV

8K Ultra HD Compact Camera and H.265 Encoder Developed by NHK

Source: DigInfo TV

Survey of European Broadcasters on MPEG-DASH

At the EBU’s BroadThinking 2013 event in March 2013, the DASH Industry Forum conducted a survey of European broadcasters on MPEG-DASH. Thirteen major broadcasters responded to the survey.

This report summarizes the results and examines whether the DASH-IF’s activities and focus are aligned with the needs expressed in the survey.

ButtleOFX: an Open Source Compositing Software

The aim of ButtleOFX is to create an open source compositing software based on the TuttleOFX library. TuttleOFX is an image and video processing framework based on the OpenFX open standard for visual effects plug-ins.

Color Grading

There are two things that are at the core of doing good color grading for video:

  • Ensuring that the image on the screen looks great and is graded to best tell the story.
  • Making sure that the image can be properly reproduced and delivered to a variety of media and screens.

This article will focus on the latter. Understanding the importance of legal and valid gamut and determining the color balance are critical to maintaining proper color reproduction across a variety of media and broadcast methods. Before we examine the concepts of color balance, let’s quickly review the concepts of color space.

The HSL Color Space
Video is composed of three color components: red, green and blue (RGB). Various combinations of these colors make up the colors we see. One way to understand the Hue, Saturation and Luma (HSL) representation of the RGB color space is to imagine it as two cones joined at their widest point.
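The mapping between the two representations can be sketched with Python's standard `colorsys` module, which converts RGB into hue, lightness and saturation coordinates (note its result order is H, L, S; this is an illustrative conversion, not a broadcast-accurate one):

```python
import colorsys

# Convert a pure-red pixel from RGB into hue/lightness/saturation coordinates.
# In the two-cone picture, hue is the angle around the axis, saturation the
# distance from the axis, and lightness the position along the axis.
h, l, s = colorsys.rgb_to_hls(1.0, 0.0, 0.0)

# Pure red sits at hue 0, mid lightness, full saturation.
print(h, l, s)  # 0.0 0.5 1.0
```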

The Waveform Monitor
The waveform monitor or rasterizer (scope) is key to providing a legal output of your creative product. Being a brilliant editor and colorist doesn’t mean much if no one will air your product. Even if your product isn’t being broadcast, legal levels affect the proper duplication of your project and the way it will look on a regular TV monitor.

One of the most basic uses of the waveform monitor is determining whether your luma and setup levels are legal. This means that the brightest part of the luma signal does not extend beyond 100 percent and that the darkest part of the picture does not drop below 0 percent.

Determining Color Balance
Color balance is indicated by the relative strength of each color channel. With neutral (pure black, white and gray) colors, the strength of each color channel should, technically, be equal.

The usual goal of color balancing is to achieve an image where neutral colors are represented with all channels being equal. The most common reason for unbalanced colors is how the camera is white balanced on location. For example, if a camera is set up to record in tungsten light, when it is actually capturing a scene lit with daylight, the blue channel will be stronger than the red and green channels.

Some camera sensors have a natural tendency to be more sensitive to certain colors in certain tonal ranges. These errors, in sensitivity or white balance, can be corrected by monitoring the image with a waveform monitor and making adjustments to the signal until the signal strength of all three channels is equal when displaying a neutral color.
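As a rough numeric sketch of that correction, one can compute a per-channel gain that equalizes the channels for a known neutral sample. The readings below are hypothetical values from a gray patch shot with a tungsten white-balance setting under daylight, so the blue channel reads high:

```python
# Sketch: gains that would balance a neutral sample by scaling each
# channel toward the channels' mean, as one might do with gain controls.

def balance_gains(r, g, b):
    target = (r + g + b) / 3.0
    return (target / r, target / g, target / b)

# Hypothetical gray-patch readings: blue-heavy due to wrong white balance.
sample = (0.42, 0.45, 0.60)
gains = balance_gains(*sample)
balanced = tuple(v * k for v, k in zip(sample, gains))
print(balanced)  # all three channels now read equal
```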

There are two types of waveform displays colorists consult that are defined as “parade” displays because they show channels of information in a “parade,” from left to right. The most common of these is the RGB Parade shown in the following figure, which shows the red, green and blue channels of color information horizontally across the display.

The reference marks are referred to as the graticule. On a waveform monitor, these are the horizontal lines describing the millivolts, IRE or percentages from black to full power (white).

Component video levels are represented in terms of millivolts, with black being set at 0mV and white at 700mV. This range of video levels is also represented in terms of a percentage scale with 0 percent equal to 0mV, and 100 percent equal to 700mV.
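The millivolt and percentage scales described above are a simple linear mapping, which makes the luma-legality check easy to express:

```python
# The scale described above: 0 mV maps to 0 percent, 700 mV to 100 percent.

def mv_to_percent(mv):
    return mv / 700.0 * 100.0

def is_legal_luma(mv):
    # Legal luma stays between 0 and 100 percent (0-700 mV).
    return 0.0 <= mv_to_percent(mv) <= 100.0

print(mv_to_percent(350))   # 50.0
print(is_legal_luma(750))   # False: brighter than 100 percent
```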

The Vectorscope
Whereas a waveform monitor normally displays a plot of signal vs. time, a vectorscope, shown in the following figure, is an XY plot of color (hue) as an angular component of a polar display, with the signal amplitude represented by the distance from the center (black). On a vectorscope graticule, there are color targets and other markings that provide a reference as to which vector, or position, a specific color is in.

In color grading applications, the vectorscope helps analyze hue and chroma levels, keeping colors legal and helping to eliminate unwanted color casts. With the gain, setup and gamma corrections done while monitoring primarily the waveform monitor, the colorist’s attention focuses more on the vectorscope for the hue and chroma work.

The chroma strength of the signal is indicated by its distance from the center of the vectorscope. The closer the trace is to the outer edge of the vectorscope, the greater the chrominance, or the more vivid the color. The hue of the image is indicated by its rotational position around the circle. An important relationship to understand is the position of the various colors around the periphery of the vectorscope. The targets for red, blue and green form a triangle. In between each of these primary colors are the colors formed by mixing those primaries.
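The polar reading described above is just a rectangular-to-polar conversion of the color-difference components. A minimal sketch, using (Cb, Cr) as the two axes of the display:

```python
import math

# Sketch: the polar reading a vectorscope makes from color-difference
# components. Chroma is the distance from center; hue is the angle
# around the display.

def vectorscope_reading(cb, cr):
    chroma = math.hypot(cb, cr)
    hue_deg = math.degrees(math.atan2(cr, cb)) % 360
    return chroma, hue_deg

# A pure gray has no color-difference signal, so it sits at the center:
print(vectorscope_reading(0.0, 0.0))  # (0.0, 0.0)
```

This is why neutral whites, blacks and grays should cluster at the center of the display: their (Cb, Cr) components are zero, so their distance from center is zero.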

The chroma information presented on the vectorscope is instrumental in trying to eliminate color casts in images. As stated earlier, the chroma strength of a signal is represented by its distance from the center of the vectorscope. Because white, black and pure grays are devoid of chroma information, they all should sit neatly in the center of the vectorscope. Although most video images will have a range of colors, they also usually have some amount of whites, blacks and neutral grays. The key is to be able to see where these parts of the picture sit on the vectorscope and then use the color correction tools at your disposal to move them toward the center of the vectorscope.

For nearly all professional colorists, the various waveform displays — Flat, Low Pass, Luma only, RGB Parade and YCbCr Parade — plus the vectorscope are the main methods for analyzing their image. Although experienced colorists often rely on their eyes, they use these scopes to provide an unchanging reference to guide them as they spend hours color correcting. Without them, their eyes and grades would eventually drift off course. Spend time becoming comfortable with these scopes, and what part of the video image corresponds to the images on the scopes.

By Richard Duvall, Broadcast Engineering

HEVC Walkthrough

A walkthrough of some of the features of the new HEVC video compression standard. Video playback using the Osmo4 player from GPAC and stream analysis using Elecard's HEVC Analyzer software.

By Iain Richardson, Vcodex

Ethernet and IP

Ethernet and IP are terms broadcast engineers use many times every day. But what, exactly, is the difference between the two?

Many years ago, manufacturers needed to include, inside their application programs, software that was written to allow the program to interface to a specific network interface card. This meant that an application would only work with specific networking hardware; change the hardware, and you had to rewrite the application. Very quickly, vendors faced an escalating number of possible network devices and a number of different underlying physical network implementations (such as RG-11, RG-59 and UTP cable).

Manufacturers wanted to separate the development of their application from all of the chaos going on at the networking level. They also wanted to be able to sell their applications to different users who might have different networking technologies in their facilities. The seven-layer Open System Interconnection (OSI) data model was developed to address this issue in a standardized way. While it is not too important to understand all layers of the OSI model, it is good to understand the first four layers.

This shows the first four of seven layers to the OSI data model.

Layer 1 is the physical layer, sometimes referred to as PHY. This is the physical network hardware, and it operates at the bit level.

Layer 2 is called the data link layer and consists of a mixture of specifications describing the electrical signals on a cable and the way that data is encapsulated into frames. Ethernet is a Layer 2 protocol.

Layer 3 is referred to as the network layer, and it is here that data is encapsulated into packets. IP packets are referred to as datagrams.

Layer 4 is the transport layer. Here, we speak in terms of sessions. Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are Layer 4 protocols.

Each of these layers plays a role in networking. Importantly, Layer 2 is responsible for physical addressing, meaning that the network architecture can rely on the fact that a Layer 2 address is unique and permanently assigned to a specific piece of hardware.

Layer 3 is responsible for logical addressing, meaning that addresses are assigned at this level by a network engineer in a way that organizes network clients into logical groups. But, neither of these two layers are responsible for anything more than “best effort” delivery of packets from one place to another. If the packets get lost, duplicated, rearranged or corrupted, neither Layer 2 nor Layer 3 will do anything about it.

Layer 4 protocols are responsible for establishing end-to-end connections called sessions, and these protocols may or may not recognize the loss of packets and do something about it. TCP, for example, will request that lost packets be resent. UDP will not. (Remember that Ethernet operates at Layer 2, and IP operates at Layer 3.)
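The session distinction can be seen directly in socket code. A minimal loopback sketch: TCP must establish a session before any data flows, while UDP just fires a datagram with no connection and no delivery guarantee (port 9, the "discard" service, is used here purely as a black hole):

```python
import socket
import threading

# TCP: a Layer 4 session is established, then data is delivered reliably.
def tcp_echo_once(server):
    conn, _ = server.accept()          # session established here
    conn.sendall(conn.recv(1024))      # echo the data back
    conn.close()

tcp_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_server.bind(("127.0.0.1", 0))
tcp_server.listen(1)
threading.Thread(target=tcp_echo_once, args=(tcp_server,)).start()

client = socket.create_connection(tcp_server.getsockname())
client.sendall(b"hello")
reply = client.recv(1024)              # TCP guarantees this arrives, in order
client.close()
tcp_server.close()

# UDP: no connection, no session, best-effort only.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello", ("127.0.0.1", 9))  # may simply vanish; UDP won't notice
udp.close()
print(reply)
```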

A Hierarchy
Ethernet and IP are part of a hierarchy of interchangeable technologies in the seven-layer OSI stack. While most of the networks that broadcast engineers are likely to encounter today are Ethernet and IP, it is important to understand that, in the early days, Ethernet was just one of a large number of competing Layer 2 technologies. Other Data Link Layer technologies include ATM, Token Ring and ARCNET.

While Layer 2 does a great job of organizing data into frames and passing them on to a physical network, it is not capable of allowing network architects to group computers into logical networks and allowing messages from those computers to be sent (or routed) to a much larger campus or even worldwide network. This is where Layer 3 comes in.

IP operating at Layer 3 organizes data into datagrams, which can be sent across any Layer 2 networking technology. IP datagrams are the same size and the same format, regardless of whether these packets are sent across Ethernet, Token Ring or some other network. In fact, you might be surprised to learn that IP packets from your computer may first travel over Ethernet, then over a SONET link for long-distance transmission, and then be put back into Ethernet for delivery on a local network at the destination.

IP is not the only Layer 3 protocol in common use today. There are a number of other critical protocols that operate at this layer, many of which have to do with configuring and maintaining networks. Internet Control Message Protocol (ICMP) and Open Shortest Path First (OSPF) are two examples of this.

In summary, Ethernet and IP are part of a hierarchy. IP packets are carried in Ethernet frames. IP packets that travel over long distances are likely to be carried in the payload portion of SONET frames. Furthermore, when you see the notation TCP/IP, remember that TCP is also part of this hierarchy: every time you see this notation used in reference to traffic on a LAN, you are actually talking about TCP over IP over Ethernet.
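The nesting can be sketched as successive wrapping of payloads. The header contents below are simplified placeholders, not spec-accurate field layouts:

```python
# Sketch of the encapsulation hierarchy: application data wrapped in a
# TCP segment, wrapped in an IP packet, wrapped in an Ethernet frame.

payload = b"application data"
tcp_segment = b"[TCP hdr]" + payload          # Layer 4 wraps the data
ip_packet = b"[IP hdr]" + tcp_segment         # Layer 3 carries Layer 4
ethernet_frame = b"[Eth hdr]" + ip_packet     # Layer 2 carries Layer 3

print(ethernet_frame)
```

Swapping the Layer 2 technology (say, re-framing onto SONET for a long-haul hop) re-wraps the same IP packet unchanged, which is the whole point of the layering.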

Other Differences
There are many other differences between Ethernet and IP that are derived from the fact that they are on different layers of the OSI model.

Ethernet frames contain source and destination addresses. The frames also contain a payload area. IP packets are transported in the payload area of the Ethernet frames. The IP packets also contain source and destination addresses and a payload section. If both Ethernet and IP contain similar structures, why use Ethernet at all?

Remember that the point of adding IP as a layer on top of Ethernet is to allow the IP packet layout (and the application software above that layer) to remain constant while providing different underlying Layer 2 structures. Basically, if you change the network, then you change Layer 2 drivers, and perhaps Layer 1 hardware, but everything above that layer remains the same.

From a practical standpoint, there are significant differences between Ethernet addresses and IP addresses. Ethernet addresses are assigned to a network interface card or chip set at the factory. They are globally unique, cannot be changed (in practice they can be overridden, but this was the original assumption), and getting an Ethernet configuration up and running is essentially a plug-and-play process.

IP addresses, on the other hand, are not assigned from the factory. These addresses need to be configured (sometimes dynamically), and while Dynamic Host Configuration Protocol (DHCP) works very well most of the time to automatically configure IP addresses, there are many cases where broadcast engineers must manually configure the IP address of a device before it can be used on the network.

Another difference is that the first three octets of an Ethernet address convey meaning; they are an Organizationally Unique Identifier (OUI). It is possible to look up an OUI and determine who assigned the Ethernet address to the hardware. IP addresses, however, have absolutely no meaning assigned to them. That is not to say that there are not special IP address ranges because there are. But, the numbers themselves do not convey any specific information.
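Extracting the OUI is just a matter of taking the first three octets of the address. A minimal sketch (the MAC address here is made up for illustration):

```python
# Sketch: extracting the OUI (first three octets) from a MAC address.
# The OUI identifies the organization that assigned the address.

def oui(mac):
    octets = mac.replace("-", ":").split(":")
    return ":".join(octets[:3]).upper()

print(oui("00:1a:2b:3c:4d:5e"))  # 00:1A:2B
```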

Perhaps the most important difference between Ethernet and IP is that Ethernet frames are not routable, while IP packets are. In practical terms, what this means is that an Ethernet network is limited in terms of the number of devices that can be connected to a single network segment and the distance Ethernet frames can travel. Limits vary, but, as an example, Gigabit Ethernet on Cat 7 cable is limited to about 330ft. Gigabit Ethernet on single mode fiber (expensive) is limited to 43mi.

In order to extend the network beyond this length, you need to be able to route signals from one physical Ethernet network to another. This was one of the original design goals of IP. Network architects assign computers to IP addresses based upon physical location and corporate function (news vs. production, for example), and then IP routers automatically recognize whether packets should remain within the local network or whether they need to be routed to the Internet.

By Brad Gilmer, Broadcast Engineering