2D to 3D for the Networks

At NAB I had a chance to talk to Jay Wiskerchen, VP Technical Operations at DDD (Santa Monica, CA), about the company's new business in the 2D to 3D conversion field. Currently, multiple companies do cinema-quality 2D to 3D conversion for the studios. This is a labor-intensive process that involves massaging every frame to get the image as good as possible. It is not cheap: it typically costs $5M to $10M for a 100-minute film, or $50,000 to $100,000 per minute. For blockbusters like Alice in Wonderland and Clash of the Titans, this is (relatively) small change and easily earned back by the ticket price difference between 2D and 3D.

On the other hand, multiple sources offer fully automatic conversion, including DDD. According to Wiskerchen, an ASIC to do this can cost as little as a half-dollar. These chips will be included in many of the current and upcoming generation of 3DTV systems. After all, you can’t watch the Masters Tournament all the time, in part because it isn’t on the air all the time. By including these ASICs in their sets, the TV manufacturers ensure the consumer will always have something in 3D to watch, even if the 3D really isn’t very good. Insight Media will release a new report on this real-time 2D-to-3D conversion shortly after NAB, covering the technology and forecasts for this feature in 3DTVs, Blu-ray players and set-top boxes.

The networks are planning 3D channels, but what are they going to show on them? They would love to show popular shows like CSI: Miami in 3D but there is no way CBS could afford even $50,000 per minute, or about $2M per episode, to convert it at one of the cinema-grade conversion houses. DDD is now working on a solution for them.

This process, according to Wiskerchen, involves semi-automatic conversion. The operators don’t intervene for every frame, but they do intervene in the process at key frames. In addition, the client may have a target "look and feel" for the 3D and they can work with DDD to achieve this.

The system starts by extracting, fully automatically, a depth map from the visual cues in the image. The algorithms used to extract these depth maps can be considerably more sophisticated than the algorithms that can run on a chip a TV set maker can afford to embed, even in a high-end TV. In addition, according to Wiskerchen, DDD has more than one algorithm to extract depth maps from an image, and in this process they apply them all, generating alternate depth maps for each image and scene. This is where the human intervention begins, with the operator choosing the best depth map, controlling the total depth of the scene, determining if any of the image is suitable for a modest amount of out-of-screen effect with negative parallax, etc. This is also where client input can play a role in determining the final image appearance.
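To make the operator controls concrete, here is a minimal sketch of how a depth map can be turned into left and right eye views by shifting pixels horizontally, a generic technique usually called depth-image-based rendering. This is not DDD's algorithm, which Wiskerchen did not describe at this level. The function names, the 0-to-1 depth convention and the `total_depth_px` and `screen_plane` parameters are all assumptions made for illustration.

```python
# Illustrative sketch only: generic depth-image-based rendering on one row
# of pixels. All names and conventions here are hypothetical, not DDD's.

def parallax_px(depth, total_depth_px, screen_plane=1.0):
    """Signed on-screen parallax in pixels for one pixel.

    depth: 0.0 = farthest, 1.0 = nearest (assumed convention).
    total_depth_px: the operator's "total depth" control, in pixels.
    screen_plane: depth value that sits exactly on the screen.
    Positive result = behind the screen; negative = out of the screen.
    """
    return total_depth_px * (screen_plane - depth)


def fill_holes(row):
    """Fill gaps left by shifting, using the nearest known pixel to the left
    (or the first known pixel for gaps at the start of the row)."""
    filled = list(row)
    last = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last
        else:
            last = value
    first = next((v for v in filled if v is not None), None)
    return [first if v is None else v for v in filled]


def render_row(pixels, depths, total_depth_px, screen_plane=1.0):
    """Return (left_view, right_view) for one row of pixels."""
    width = len(pixels)
    left = [None] * width
    right = [None] * width
    # Paint far-to-near so that nearer pixels overwrite farther ones.
    for x in sorted(range(width), key=lambda i: depths[i]):
        half = parallax_px(depths[x], total_depth_px, screen_plane) / 2
        xl, xr = round(x - half), round(x + half)
        if 0 <= xl < width:
            left[xl] = pixels[x]
        if 0 <= xr < width:
            right[xr] = pixels[x]
    return fill_holes(left), fill_holes(right)
```

In this sketch, raising `total_depth_px` stretches the scene in depth, while lowering `screen_plane` below 1.0 gives the nearest objects negative parallax so they appear in front of the screen. A wrong depth value, such as a ceiling treated as sky, simply shifts those pixels the wrong way.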

The system is not yet perfect, as could be seen in the sample images shown in the DDD booth. Two types of content seemed to confound the current system. First were the title sequences, which are often visually confusing even in 2D. Since this 2D confusion is intentional, maybe the networks will accept or even prefer a visually confusing 3D title sequence, at least as long as the stars’ names can be read easily.

The second problem was more subtle and corresponded to errors in the depth map. For example, there was one scene where a worker in a hard hat was bumping his head on the concrete ceiling of a building under construction, at least in 2D. In 3D, you wondered what the problem was because the algorithm had put the concrete ceiling in the background, where sky would be. Apparently, everything over a person’s head is "sky." According to Wiskerchen, a similar problem occurred in over-the-shoulder shots where the camera is looking at a person’s back, with someone’s face seen over the shoulder in the background. The depth map assumes all faces are in the foreground. Apparently the visual result of this is weird enough that DDD wasn’t showing any sample footage with this problem.

DDD is not in production yet, although they have converted entire sample episodes. Apparently CBS is planning consumer focus groups to determine if the results from the new DDD process are good enough for TV. Wiskerchen says that DDD plans to start using this process in production in Q3, with the results airing before the end of the year. This should give DDD time to fix any problems in the algorithms, or perhaps give the operators the tools needed to override the depth maps where the algorithms fail.

Cost of the process? According to Wiskerchen, it will be $25,000 per TV hour (i.e. 45 minutes), with a 40-hour minimum. If a network commits to 500 hours in advance, the price will fall to $10,000 per TV hour, with the first hour of content starting to flow a week later. DDD will do this as work for hire, with no residuals and no on-screen credit for the conversion. He said this price fits into the business model for network 3DTV.
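To put these prices next to the cinema-grade figures, the short calculation below converts the quoted rates to cost per minute. It uses only the numbers quoted in this article; the variable and function names are mine, and the comparison assumes the low end ($50,000 per minute) of the cinema conversion range.

```python
# Back-of-the-envelope comparison using only figures quoted in the article.

TV_HOUR_MINUTES = 45  # a "TV hour" as defined above

def per_minute(price_per_tv_hour, minutes=TV_HOUR_MINUTES):
    """Convert a price per TV hour into a price per minute of content."""
    return price_per_tv_hour / minutes

standard_rate = per_minute(25_000)   # 40-hour minimum commitment
volume_rate = per_minute(10_000)     # 500-hour advance commitment
cinema_low = 50_000                  # low end of cinema-grade cost per minute

print(f"Standard rate: ${standard_rate:,.2f} per minute")
print(f"Volume rate:   ${volume_rate:,.2f} per minute")
print(f"Cinema-grade low end is about {cinema_low / standard_rate:.0f}x the standard rate")
print(f"Minimum commitment:  ${40 * 25_000:,}")
print(f"500-hour commitment: ${500 * 10_000:,}")
```

At roughly $556 per minute, the standard rate is about one-ninetieth of the low-end cinema price, and the minimum 40-hour commitment comes to $1M, about the cost of converting 20 minutes of film at a cinema-grade house.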

When I asked him what the network business model was for 3DTV, he couldn’t really answer. One thing he did say was that the networks were planning on broadcasting two versions of each 3D show, one in 2D on the regular channel and the other in 3D on a special 3D channel. According to him, there are no plans at the networks, for the time being, to broadcast in a backward-compatible 2D/3D format. He added, however, that this wasn’t really a DDD issue. DDD’s responsibility ends when it provides the network with uncompressed left and right eye views of an episode. How the network compresses, broadcasts and profits from that 3D episode is the network’s business, not DDD’s.

By Matt Brennesholtz, DisplayDaily