Netflix’s Many-Pronged Plan to Eliminate Video Playback Problems

For all of Netflix’s complaints about Internet service providers harming video performance, one of the company’s top technology experts is confident that the streaming company can solve most of its customers’ problems.

David Fullagar, Netflix’s director of content delivery architecture, spoke about the company’s plans Monday at the Content Delivery Summit in New York. He described the hardware Netflix uses in its Open Connect content delivery network (CDN), noting that the company has a technological advantage over traditional CDNs because it’s always delivering content to devices running Netflix’s own software rather than using a hodgepodge of products built by other companies.

The best-known parts of Open Connect are probably the storage boxes that Internet service providers can take into their own networks to bring content closer to consumers. ISPs can also peer with Netflix, exchanging traffic directly without hosting Netflix equipment. But these aren’t the only ways Netflix’s Open Connect technology can deliver good quality.

Netflix used to use third-party CDNs such as Akamai, but it has moved most of its traffic over to Open Connect in the past couple of years. Outside the US, 100 percent of Netflix traffic is distributed using Open Connect equipment. The percentage is in the “high 90s” in the US, with plans to hit 100 percent this summer. Even if the storage boxes aren’t inside an ISP’s network, they’re not too far away. They could even be in the same data centers: the Internet exchange points where Netflix’s transit providers connect to ISPs.

Fullagar was asked by an audience member how Netflix works with ISPs who offer competing products. “From a quality point of view we don’t need to be that close to the end user for the sort of video we serve,” Fullagar said. “Having extremely low latency is nice” because it allows videos to start playing faster. However, “what we’re most interested in is a good, uncongested link, and that doesn’t necessarily have to be very low latency.”

Netflix’s peering with ISPs has been controversial because some of the Internet providers have demanded payment in exchange for accepting Netflix traffic. Netflix gave in to Verizon and Comcast, agreeing to pay both companies, but it has argued that the Federal Communications Commission should force the ISPs to provide free peering. Netflix has sent its traffic through congested links when its business disputes have gone unresolved, degrading quality despite the other steps Netflix takes to improve it. (Comcast and analyst Dan Rayburn accused Netflix of purposely sending traffic through congested links.)

When asked how much Netflix can affect streaming performance given that it controls the server end of the connection as well as the user’s software, Fullagar said, “I think we’re on the tip of the iceberg of being able to do quite a lot there.” Netflix’s access to information about each customer’s device and Internet connection will fuel some as-yet-unrevealed strategies for improving quality, he said.

“We have extra information beyond just, hey this is someone wanting this file,” he said. “At connection time we know the sort of client they are, whether it’s a Wii or a PS4 or a streaming stick. We know the network they’re on, we know a bunch of historical information about latency and quality of service we’ve had to those networks. We know whether they’re connected on a device that’s wired or wireless. There’s a bunch of hints that we have there.”

The company has started some “experiments that are working out really well, and in the future we’ll talk more about that.”

Netflix itself has equipment at about 20 Internet exchange points in North America and Europe and has “tens if not hundreds of embedded caches in ISP networks,” Fullagar said.

The Network Team
Netflix’s Open Connect division has about 40 people, Fullagar said. About 20 are software engineers who either build software for Netflix servers or work on the company’s management software, which runs on Amazon’s cloud network and performs functions such as load balancing. Another 10 Open Connect employees are network architects, and another 10 are in operations.

Netflix stores video on two types of boxes that it designed, one that’s heavy on HDDs and another that’s all SSDs. Netflix built them in part because it couldn’t find the right mix of compute and storage capabilities in products from hardware vendors.

The HDD unit is a 4U-sized chassis that holds 216TB on 36 drives of 6TB each. It has 64GB RAM, a 10 Gigabit NIC, and some SSD for frequently accessed content.

The smaller, 1U, SSD-only unit contains 14 drives of a terabyte each, 256GB of RAM and a 40 Gigabit NIC. About 75 percent of the cost of both the HDD and SSD boxes is taken up by storage. Each unit uses Intel CPUs.
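
To put the two designs side by side, the quoted figures can be run through some quick arithmetic. The short Python sketch below is only an illustration: the helper names are hypothetical, it assumes decimal units (1TB = 10^12 bytes), and it ignores formatting overhead, the SSD cache in the HDD unit, and any real-world limit on how fast the disks can feed the NIC.

```python
# Back-of-the-envelope figures for the two box types described above.
# Helper names are hypothetical; drive counts, drive sizes, and NIC speeds
# are the ones quoted in the article. Decimal units are assumed.

def raw_capacity_tb(drive_count: int, drive_size_tb: int) -> int:
    """Raw (unformatted) storage capacity in terabytes."""
    return drive_count * drive_size_tb


def full_read_seconds(capacity_tb: int, nic_gbps: int) -> float:
    """Seconds needed to send the whole capacity once at NIC line rate.

    1 TB = 8 * 10**12 bits and 1 Gbit = 10**9 bits, so the ratio
    simplifies to capacity_tb * 8000 / nic_gbps.
    """
    return capacity_tb * 8000 / nic_gbps


hdd_tb = raw_capacity_tb(36, 6)   # 4U HDD unit
ssd_tb = raw_capacity_tb(14, 1)   # 1U SSD unit

hdd_hours = full_read_seconds(hdd_tb, 10) / 3600   # 10 Gigabit NIC
ssd_minutes = full_read_seconds(ssd_tb, 40) / 60   # 40 Gigabit NIC

print(f"HDD unit: {hdd_tb}TB, about {hdd_hours:.0f} hours to send once")
print(f"SSD unit: {ssd_tb}TB, about {ssd_minutes:.0f} minutes to send once")
```

Under these assumptions the HDD unit would need about two days to push its full contents through its network port once, while the SSD unit would need under an hour, which is one way to read the split between a large-capacity box and a high-throughput one.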

Netflix refreshes hardware annually to improve performance. At its biggest locations, Netflix keeps multiple copies of its entire video library in case of failure. That’s more than a petabyte of video files for its North American catalog.

The company relies heavily on open source software, including FreeBSD and the Nginx Web server, as well as several management applications the company wrote itself.

Netflix distributes multiple terabits per second and accounts for an astonishing one-third of North American Internet traffic at peak times, i.e. the traditional TV “prime time” each evening. During off-peak hours in the middle of the night, Netflix fills disks with the videos its algorithms say people are most likely to watch the next day. This dramatically reduces network utilization during peak hours.
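
Netflix did not describe how its algorithms pick what to load overnight. As a rough illustration of the general idea only, the Python sketch below fills a cache greedily with the titles that have the highest predicted demand. The `Title` fields, the `plan_prefill` function, and the sample numbers are all hypothetical and are not taken from Netflix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    name: str
    size_gb: float
    predicted_views: int  # hypothetical next-day demand forecast


def plan_prefill(titles, capacity_gb):
    """Greedy sketch: take titles in order of predicted demand,
    skipping any that no longer fit in the remaining space."""
    chosen = []
    remaining = capacity_gb
    for title in sorted(titles, key=lambda t: t.predicted_views, reverse=True):
        if title.size_gb <= remaining:
            chosen.append(title)
            remaining -= title.size_gb
    return chosen


# Made-up catalog and cache size, for illustration only.
catalog = [
    Title("A", size_gb=40, predicted_views=900),
    Title("B", size_gb=30, predicted_views=500),
    Title("C", size_gb=50, predicted_views=700),
]
plan = plan_prefill(catalog, capacity_gb=100)
print([t.name for t in plan])
```

A real system would presumably also weigh what is already on disk and how many encodings of each title exist; this sketch ignores both.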

The management software Netflix runs on Amazon Web Services handles distribution of content, analyzes network performance, and connects users to the proper video sources. Netflix wrote its own adaptive bitrate algorithms to react to changes in throughput, and a CDN selection algorithm to adapt to changing network conditions such as overloaded links, overloaded servers, and errors, the company said.
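
Netflix’s own adaptive bitrate algorithms are not public. The simplest textbook form of the technique is throughput-based selection: pick the highest encoding that fits within a safety fraction of the recently measured throughput. The Python sketch below shows that generic approach; the function name, the bitrate ladder, and the 0.8 safety factor are assumptions chosen for illustration.

```python
def pick_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Return the highest bitrate that fits within a safety fraction of
    the measured throughput, or the lowest rung if none fits."""
    budget = throughput_kbps * safety
    candidates = [rate for rate in ladder_kbps if rate <= budget]
    return max(candidates) if candidates else min(ladder_kbps)


# Hypothetical encoding ladder in kilobits per second.
LADDER = [235, 750, 1750, 3000, 5800]

print(pick_bitrate(LADDER, throughput_kbps=4000))  # a healthy connection
print(pick_bitrate(LADDER, throughput_kbps=200))   # a badly congested link
```

Production players typically also account for how much video is already buffered, so that a brief dip in throughput does not trigger an immediate drop in quality.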

When Netflix used multiple third-party CDNs, connections would fail over from one to another in case of error. Netflix still uses the same failover technology, but with “multiple hierarchies” within Open Connect instead of multiple CDNs, Fullagar said.
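
Fullagar did not go into how the failover is implemented. In general terms, the pattern is to give the client an ordered list of sources and have it move down the list on error. The Python sketch below illustrates that pattern; `fetch_with_failover`, the cache names, and the fake fetcher are hypothetical stand-ins rather than anything Netflix described.

```python
def fetch_with_failover(sources, fetch):
    """Try each source in order; return (source, data) from the first
    one that succeeds. Raise RuntimeError if every source fails."""
    errors = []
    for source in sources:
        try:
            return source, fetch(source)
        except OSError as exc:  # ConnectionError and timeouts are OSErrors
            errors.append((source, exc))
    raise RuntimeError(f"all sources failed: {errors}")


# Fake fetcher for illustration: only the last cache in the list is reachable.
def fake_fetch(source):
    if source != "cache-c":
        raise ConnectionError(f"{source} unavailable")
    return b"video-bytes"


served_by, payload = fetch_with_failover(
    ["cache-a", "cache-b", "cache-c"], fake_fetch
)
print(served_by)
```

Whether the list names separate CDNs or separate tiers inside a single CDN makes no difference to the client logic, which fits Fullagar’s remark that the same failover technology carried over to Open Connect.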

Although Netflix is moving all its data onto Open Connect hardware, that doesn’t automatically reduce the controversial role its transit providers Level 3 and Cogent have played in carrying traffic. Level 3 and Cogent have warred with ISPs over whether they should have to pay in order to send Netflix traffic onto their networks. As a result, interconnections between these transit providers and ISPs have gotten congested, reducing the quality of Netflix and other Web services that travel over the links.

The role of transit providers is only reduced when Netflix signs direct interconnection agreements with ISPs, as it has done with Verizon and Comcast, a Netflix spokesperson said. In the absence of such agreements, Netflix data passes through the company’s own CDN and then through a transit provider before hitting an ISP’s network.

The payment controversies don’t necessarily affect the working relationship between the technical teams of Netflix and ISPs, though. “Engineering people at companies, whether large or small, operate independently of commercial interests,” Fullagar said. “In the UK, one of our biggest competitors is one of our best networking partners.”

Source: Ars Technica