A Stereo 3D Production Masterclass for Europe
If Europe is to develop its own stereoscopic 3D market it will need a native skills base offering engineering, production and finishing expertise to match anything Los Angeles can muster in the service of Hollywood, starting with the availability of many more shooting rigs to rent or buy.
To that end, good-quality conferences, workshops and tutorials will be vital motivators, so it was a pity that sponsors outnumbered delegates at Peter Wilson’s recent, exhaustively detailed Pinewood event, ‘The theory and practice of stereoscopic production (3D movies) and complementary sound techniques’.
This was a two-day journey through the minutiae of 3D production and the conditions under which human sensory systems will perceive it comfortably. The trainers who shared the tutorial duties were Wilson himself, technical bard John Watkinson, Dr Bernard Harper from the School of Psychology at Liverpool University, and cinematographer Kommer Kleijn.
Armed with blue-chip support from the UK Film Council and Skillset, and with sponsorship backing from the likes of Arri Media, P+S Technik, Quantel, Codex, Snell & Wilcox and Xpand, Wilson will be able to repackage and hone what he produced for future events. He was already planning a tutorial on digital cinema mastering.
The purposes behind the 3D event were clearly defined: to target experienced people looking for retraining. “The Hollywood studios rushed into 3D and broke loads of the DCI rules they had set for 2D, such as 16 foot candles,” said Wilson. “This workshop is without dogma. We will follow a more fundamental, scientific approach so we can give the home market a good technical base.”
His sidekicks would detail the relevant principles of human image and sound perception, explain the clues that are important for depth perception, and show why the correct interocular distance is essential for realism.
“We will also describe what causes the major defects in stereoscopy and how to avoid them, explain why the grammar of cinematography may be different with stereoscopy, and list the processes needed in stereoscopic post production,” he said.
“We will move on to how the ears extract directional clues from sound, and to stating why timing accuracy is important to sound images,” he added. He also promised some alternative approaches to surround sound production, and a session on why cinema and TV audio are so different.
The eye is not a camera
Bernard Harper presented the sessions ‘Introduction and history of stereoscopic production and psycho-visual theory’, and offered basic ground rules worth learning parrot fashion, the way we might recall Archimedes’ Principle. He started his first module by offering three reasons for 3D being so hot. First, Lucas, Cameron, Zemeckis, Jackson and Rodriguez need new toys to play with. Second, filmmakers need to be ahead of the audience in standards, technology and innovation. Third, stereoscopic imaging is perceptually superior to 2D.
Harper focussed on human vision and how it works, using fabulously detailed images depicting the anatomy of the eye, a plan view of the optic nerves and the visual cortex, and pathways of our vision system.
“The eye is not a camera. Vision involves the sampling and creation of a visual memory,” he explained. “We extrapolate scenes based on experience (and) create an interpretation based on sampled data. Visual sensation refers to the light falling onto the retina. Visual perception refers to the interpretation of the sensory information coming from the eye,” he added.
Harper moved on to depth cues and the variety of methods humans use to derive depth information from the images we see. “Some 5-8% of the population do not have full binocular vision, but just about everyone can perceive depth,” he said. “Most depth cues are monocular, requiring only one eye.”
The typical 2D depth cues he had in mind were light and shade, linear perspective, occlusion, relative size, textural gradient, aerial perspective, motion parallax, proprioception and retinal disparity. Harper produced diagrammatic explanations of all of these cues, and it is proprioception, the sense of awareness about the position of elements of our body, that enables us to touch our nose with a finger with our eyes closed.
He moved on to cover the various forms of parallax, which in its fundamental form superimposes L and R images and lets us define the relationship of objects in a scene in terms of retinal disparity. We experience zero parallax when the two views of an object are exactly superimposed.
“You will perceive the object to be located in space at the surface of the display. This is given the special name of zero parallax plane, and it is an important point of reference for all stereo imaging,” said Harper.
Positive parallax happens when the right eye location of an object is to the right of the left eye’s location for it. “It will be perceived to be behind the surface of the display screen. It will be perceived to be at the location where two views converge,” he said. “Negative parallax, the magical part of 3D, happens when the right eye sees an object shifted to the left of the location seen by the left eye. The brain interprets the object as appearing at the intersection of the left and right views, in front of the surface of the screen as if floating in space.”
And then there is divergent parallax, and the issue of inter-pupillary distance. “There is a biological limit to the amount of positive parallax an image may have. The eyes must never be required to diverge beyond parallel; that is, the R and L location of an object must never be displaced by more than the distance between the eyes of the viewers,” Harper said.
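Harper’s parallax categories reduce to simple arithmetic on the on-screen separation of an object’s two views. A minimal sketch in Python, illustrative only: the 65mm limit reflects the average adult interocular discussed above, and the sample separations are invented.

```python
# Illustrative sketch: classify the on-screen parallax of one object.
# All values are in millimetres measured at the screen surface.
INTEROCULAR_MM = 65.0  # assumed average adult eye separation

def classify_parallax(separation_mm: float) -> str:
    """separation_mm = right-eye position minus left-eye position
    of the same object at the screen surface."""
    if separation_mm > INTEROCULAR_MM:
        return "divergent: eyes forced beyond parallel, never acceptable"
    if separation_mm > 0:
        return "positive: perceived behind the screen"
    if separation_mm == 0:
        return "zero: perceived at the screen (zero parallax plane)"
    return "negative: perceived in front of the screen"

if __name__ == "__main__":
    for p in (80.0, 30.0, 0.0, -25.0):  # invented sample separations
        print(f"{p:+6.1f} mm -> {classify_parallax(p)}")
```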
The origin of the ‘Cha-Cha’ technique lies in Talbot’s photos for Wheatstone — a single camera shot followed by another at a distance of 65mm — and today this method is used to capture images as if from each eye.
“Images are captured as if staring straight ahead,” said Harper. “Another approach for capturing 3D images is to use converged optical axes — the ‘Toe-in’. The point where the optical axes converge establishes the zero parallax plane. There is a controversy in 3D production, between parallel and convergent.”
Everyone knows that all elements along a horizon will have a divergent parallax, but Harper had an interesting take on this. “Theoretically we could place the zero parallax plane at the horizon and thus have the entire image located in front of the screen, but there are limits to the amount of convergence a viewer can tolerate,” he said. “In general, the skilled 3D photographer will place the zero parallax plane at the centre of interest and will not create elements with too much negative parallax.”
Harper’s ground rules start with convergent ortho-stereography and parallel ortho-stereography. The first, with a life-sized image, will give a very accurate reproduction of the size and depth of the original scene for a viewer at or near the ideal viewpoint. The other offers much more: it gives very good reproduction under most circumstances, can be modified and shifted to control on-screen separations, and the raw images can be integrated easily into CGI composites.
“When you pull focus you have to converge,” said Harper. “Un-natural stereoscopic disparities (parallax) will lead to 3D effects on screen that are unlike direct vision and can lead to eye-strain. This is the result of a mismatch between the natural and the presented.
“Reducing the lens interaxial separations can have perceptual benefits and reduce the likelihood of retinal rivalry. Telephoto lenses should be used in parallel and at small interaxial separations like 60mm,” he added. “Telephoto lenses can be the real killer of 3D. You might get retinal rivalry if you don’t use telephotos with great care.”
The fat and thin of 3D
Harper’s second module covered Sexual Dimorphism in Photographic Portraiture. This was about the flattering of male talent and fattening of female talent under identical photographic conditions — he cited Kate Winslet’s renown as the ‘fat actress’. This phenomenon will largely vanish with the coming of 3D because objects occlude less background and appear thinner. The one exception will be parallel axis 3D.
“2D is not the accurate medium of record it is assumed to be; there will always be levels of distortion,” said Harper. “In 3D we have a chance. The other big subject is camera to object distance – changes in distance will lead to changes in perceived weight.”
He ended with a firm proposal regarding inter-pupillary distance. Many 3D fans seem to think 65mm is right. “It averages out with children so why not accept 60mm as standard? The adult average is 63mm, and in proposing 60mm you would only get a slight dwarfism that would not be noticed by the eye,” Harper suggested.
In our heads
John Watkinson presented three sessions, with the titles of ‘Introduction to Human Perception’, ‘Sound for 3D Cinema’, and ‘The Cinematographic Aspects of Stereoscopic Techniques’.
In the first he warned, “If you don’t understand the human visual system you have no chance of success. The eye is nothing like a camera, bar the lens. We create a 3D model of our surroundings in our head.”
Watkinson looked in detail at foveal and peripheral vision, and at the fact that our colour vision is not absolute: because the rotating Earth constantly changes the colour of daylight, the visual system adapts rather than measures. Noting that the light from TVs is brighter than what we see in cinemas, he identified the poor brightness of current 3D as something we endure under subconscious stress.
One of his main planks was a call for the industry to review frame rates, the point being that ghosting needs to be addressed. He concluded by saying, “3D will throw up all the artifacts that could be hidden in 2D. It is not horrible, it is the only way to reproduce reality.”
Harper observed that with the help of “clever stuff” good work could be created at 30Hz. Kleijn observed that, “24Hz with stereo will be dire. We need to put pressure on the industry for higher frame rates.”
At the start of his audio module, Watkinson observed, “If we don’t put more science in we are lost. When engineering uses science it progresses further.
“Developments in the human auditory system (HAS) have been ignored. We will get conflict if we go to 3D without addressing the audio issues. Everything said about stereoscopy relates to the human visual system,” he added.
The criteria for the HAS exist in three dimensions – frequency, time and space. The first things we process are the clues, and the delay is significant. Watkinson then reviewed the functions of each part of the ear, right down to the basilar membrane, which vibrates in a different place according to pitch. The key fact is that the ear cannot hear above 20kHz.
Banging on about the need for distortion-free stereophony, he said, “The ear works in different domains depending on what we are trying to do. When dealing with transient/event type sources it starts working in the time domain. Don’t destroy the time domain at any stage during production; time domain information is our clue to the size of things and any impairment will impair realism.
“What is the unit of accuracy of sound images? There isn’t one,” he added. “By not measuring it, we can pretend everything is alright. The specs though are inadequate.”
Of the issues concerning Sound for 3D Cinema, Watkinson said: “The audio illusion must also be better and must agree more closely with the visual stimulus. Hi-Fi has descended into a squalid pit of pseudo science, and unless we learn from those problems there is the risk that the sound for 3D will also be sub-optimal.”
One key issue he raised concerned an old chestnut. “The accuracy required for lip-sync could be even higher than in HDTV,” he said. “In 3D cinema, the key extra parameter is depth, and changing depth must cause a corresponding change in the timing of the sound, as well as more subtle changes such as the direct-to-reverberant ratio.”
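Watkinson’s timing point is easy to quantify: sound travels at roughly 343m/s in air, so an object perceived at a given depth implies a proportional arrival delay, and at 24fps a single frame lasts about 42ms. A minimal sketch, in which the speed of sound and the sample depths are the only assumptions:

```python
# Illustrative sketch: audio delay implied by an object's perceived depth.
SPEED_OF_SOUND_M_S = 343.0  # metres per second in air at room temperature

def implied_delay_ms(perceived_depth_m: float) -> float:
    """Time-of-flight delay, in milliseconds, for a sound source at
    the perceived depth of the on-screen object."""
    return perceived_depth_m / SPEED_OF_SOUND_M_S * 1000.0

if __name__ == "__main__":
    for depth in (2.0, 10.0, 30.0):  # invented sample depths in metres
        print(f"{depth:5.1f} m -> {implied_delay_ms(depth):6.1f} ms")
```

At 30m the implied delay approaches 90ms, more than two frames at 24fps, which is why changing depth without adjusting sound timing undermines the illusion.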
Limited experience
Kommer Kleijn had already worked on nine 3D projects, making him the ideal person to front the practical capture module. On the issue of finding a camera rig or a savvy cinematographer, he warned of difficulties.
“Systems are prototype or custom built, and most likely they are linked to particular companies,” he said. “Amongst cinematographers there is limited experience, and it is not easy to find independent advice. All projects need good collaboration between the stereographer and the director.
“If you find your DOP you have to ask about experience, credits, and what cameras he/she wants to use. I hope 3D will develop into a creative art, but for now 3D stories have to be 2D compatible,” he added. The secrets are to write and design for 3D, record excellent images, and avoid high contrast and ghosting situations.
Kleijn ran through Europe’s rig options, starting with the Binocle rig and the Mini DV, ‘prosumer’ and professional versions designed by Alain Derobe and sold through P+S. Only one of the three P+S versions was available, and Kleijn cited stability and reflection issues in his review. One big issue is the loss of 1.5 stops of light (nearly a factor of three), and he noted that specific grip solutions are often required.
Considering the plusses and minuses of the over/under rig systems he said, “Small, reliable, stable, fast in use, and there is a wide choice of camera bodies, but there is very severe light loss, a fixed interocular, limited lens options, and those lenses are not easily available.”
Parallel HD bodies offer the plusses of miniature potential, stability, reliability, speed, out-of-the-box use, and no light loss. “But,” said Kleijn, “there is no low interocular, no off-the-shelf availability, and the lens choice is bad.”
He moved on to visualisation on the set, which is hampered by the poor choice of 3D viewfinders. He was working with a single eyepiece and had to use a specially adapted visualisation screen to check alignments when not operating the shot.
Kleijn mentioned CGI in one context. He said: “A lot of cheating has been done with distance, but no more. Avoid this if possible because you need to change the interocular.”
The demo set up featured Sony HDCAMs working via an Arri/P+S rig, a Snell & Wilcox Kahuna vision mixer, and the visualisation screen. Kleijn explained: “I would not converge on set, and I need to consider if I want to move the black borders to satisfy a 2D version.”
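Converging in post rather than on set, as Kleijn prefers, is usually realised with horizontal image translation (HIT): cropping or shifting the two eyes against each other so that every object’s parallax changes by the same amount. The NumPy sketch below is a hypothetical illustration of the idea, not Kleijn’s or Quantel’s actual pipeline; the frame size and shift value are invented.

```python
import numpy as np

def hit_convergence(left: np.ndarray, right: np.ndarray, shift_px: int):
    """Horizontal image translation: crop the two eyes against each
    other so every object's parallax changes by shift_px pixels.
    Positive values push the whole scene back; negative pull it forward."""
    if shift_px > 0:
        return left[:, shift_px:], right[:, :-shift_px]
    if shift_px < 0:
        return left[:, :shift_px], right[:, -shift_px:]
    return left, right

if __name__ == "__main__":
    L = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in left eye
    R = np.zeros_like(L)                           # stand-in right eye
    L2, R2 = hit_convergence(L, R, 24)  # invented 24px re-convergence
    print(L2.shape, R2.shape)  # both eyes keep identical dimensions
```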
Asked about 3ality’s rigs, he said: “They are more sophisticated than anything we have in Europe, but they are more sophisticated than required – mechanical pieces of art.”
Kleijn had cracked the passing of metadata from the rig for post applications. Wilson explained the presence of the Kahuna. “The filters that do the manipulation need to be very precise or you get insertion loss. You program the DVE to do the manipulation in realtime and bring the two pictures together,” he explained. “The Kahuna has a lot of assignable DVE tiles, and you can assign those to different inputs.”
The aspect ratio converters can be assigned to handle simple camera moves, and the keystone converter can do more – correcting the camera directionally. Wilson and Watkinson have ambitions for live 3D presentations. “If you put a system together in an artful way, you can get a good result, and then use the Quantel 3D system in post,” said Wilson.
Paying a price in post
Quantel staged a hugely impressive fix-it-in-post demo fronted by Mark Horton, who avowed that the big thing in stereoscopic 3D is sport. With 3ality having produced a live sportscast of an NFL game just 12 hours earlier, the point was well made.
“One issue is that the action moves, the camera moves, and the background doesn’t refresh fast enough. Problems, but 2D has the same issues,” he said. “It is the same problem, but more of your brain is involved.
“The big trick in post is that you are shafted,” he added. “If material comes in converged, the key advantage is less post. If the content is parallel it requires more post: with every single shot you have to do the convergence, and this is time and money. You also lose some resolution.”
If the operators/artists fiddle too much with parallel-sourced content, they could incur double-imaging problems, but both parallel and converged material bring issues to the finishing stage.
“Talking convergence, we can move things into space, but we are paying a price. Use zooms and you reduce the resolution,” said Horton. “Convergence also gives you a keystoning effect, but we can fix that in post with corner pinning.” The real answer regarding convergence versus parallel is simply to “shoot correctly”. Horton added: “Parallel is safer but more costly in post. Convergence can be done quickly, but it is effectively lost time.”
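Corner pinning of the kind Horton describes is a four-point projective warp. A minimal sketch using OpenCV as a stand-in for Quantel’s tools, which are not shown here; the corner coordinates are invented, where a real pipeline would measure them from the keystoned eye.

```python
# Illustrative corner-pin: warp four measured corners of a keystoned
# image back onto the rectangle they should occupy.
import numpy as np
import cv2

def corner_pin(image, measured_corners, width, height):
    """measured_corners: four (x, y) points in order TL, TR, BR, BL."""
    src = np.float32(measured_corners)
    dst = np.float32([[0, 0], [width, 0], [width, height], [0, height]])
    H = cv2.getPerspectiveTransform(src, dst)  # 3x3 homography
    return cv2.warpPerspective(image, H, (width, height))

if __name__ == "__main__":
    frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in frame
    # Invented corners: one edge slightly shorter, as toe-in would make it.
    corners = [(0, 0), (1920, 12), (1920, 1068), (0, 1080)]
    print(corner_pin(frame, corners, 1920, 1080).shape)
```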
He went on to leave a series of key tips, starting with the commercial fact that the images might end up on a silver or white screen, viewed through different glasses technologies from the likes of Xpand and Dolby.
“If you are a colorist, you need to think what kind of system it will be eventually shown on. Another critical factor is screen size. Stuff created for the small screen will ghost on big screens. The other way round, you will lose 3D,” Horton said.
“We cannot do anything about the interocular, you have to get that one right. It’s all about persistence and training the cameraman beforehand and reaching the comfort zone over time,” he added. “We (Quantel) are simply taking someone’s good ideas and finessing them, or taking production problems and fixing them.”
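Horton’s screen-size warning can be put into numbers: parallax authored as a fraction of image width grows linearly with the screen, while the viewer’s eye separation does not. A sketch with assumed values; the 1% figure and the screen widths are illustrative, not from the event.

```python
# Illustrative sketch: the same authored parallax on different screens.
AUTHORED_PARALLAX = 0.01  # background parallax as 1% of image width (assumed)
EYE_LIMIT_MM = 65.0       # average adult interocular

for name, width_mm in (("50in living-room TV", 1100.0),
                       ("10m cinema screen", 10000.0)):
    parallax_mm = AUTHORED_PARALLAX * width_mm
    verdict = ("exceeds eye separation: divergence and ghosting risk"
               if parallax_mm > EYE_LIMIT_MM else "comfortable")
    print(f"{name}: {parallax_mm:.0f}mm background parallax -> {verdict}")
```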
This was very much a European event, but Hollywood sat in the shadows right to the end. Asked about where subtitles will sit in 3D movies, Wilson said: “The studios are still trying to decide what plane to put them on.”
The language behind the imminent gold rush
Stereoscopic 3D introduces a great range of new terminology and creative restrictions. In some ways it means going backwards to go forwards. For this reason we need common international terminology, and the understanding that making things like zooms work requires a new level of expertise.
Our Interocular, the distance between our pupils, averages between 63mm and 65mm. The (camera rig) Interaxial is the degree of separation between the L/R lens axes. When the interaxial is smaller than the human interocular, objects seem larger than in reality; if it is wider, objects appear unnaturally small (the sketch after this glossary puts numbers on the effect). Top-specification mirror rigs can take the interaxial down to zero by overlapping the two optical axes.
Stereopsis: Human binocular depth sense.
Camera convergence: Involves toe-in, using converged optical axes. These, in combination with the chosen Interaxial, place the zero parallax plane along the Z-axis at, behind or in front of the point of interest. As observed by experienced 3D producer Phil Streather at IBC, “The more you converge on a point of interest, the further away on the Z-axis that object moves.” Excessive camera convergence can result in Keystoning (a trapezoidal distortion that introduces vertical misalignment between the two eyes), which can be fixed in post with corner pinning.
Shooting converged: Vastly more popular than shooting parallel, perhaps because parallel footage incurs vastly heavier finishing costs in post production. Camera convergence is the principal mode of shooting converged, but some systems feature converging lenses.
Shooting parallel: Camera axes are always parallel to each other, with a set Interaxial. ‘Side-by-side’ rigs may have had their day (apart from miniaturisation) because mirror/split beam rigs now dominate the market, despite light loss issues.
Positive Parallax: The right eye object location is to the right of the left eye location for the object. Images are perceived to be behind the surface of the display screen.
Negative Parallax: The R object view sits to the left of the L view, so the lines of sight cross. Consequent images appear to sit in front of the screen plane, floating in space.
The Depth or Parallax Budget: The matched values from the maximum acceptable negative parallax in the foreground to the maximum positive parallax in the background that result in a comfortable viewing experience. Depth Range describes the distance in camera space between those points.
Ghosting: Perceived crosstalk that looks like a double exposure, caused by leakage between the L and R images. Bad 3D quickly causes headaches.
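A short numeric sketch tying these terms together. The formulas are standard first-order stereography approximations rather than material from the event, and all sample values are assumptions.

```python
# Illustrative stereography arithmetic for the glossary terms above.
import math

INTEROCULAR_MM = 65.0  # assumed average adult eye separation

def toe_in_degrees(interaxial_mm: float, convergence_dist_mm: float) -> float:
    """Per-camera toe-in angle that places the zero parallax plane at
    convergence_dist_mm (simple symmetric-convergence model)."""
    return math.degrees(math.atan((interaxial_mm / 2) / convergence_dist_mm))

def apparent_scale(interaxial_mm: float) -> float:
    """First-order rule of thumb: an interaxial below the human
    interocular makes the world look larger, above it smaller."""
    return INTEROCULAR_MM / interaxial_mm

if __name__ == "__main__":
    print(f"toe-in at 3m with 65mm interaxial: {toe_in_degrees(65, 3000):.2f} deg")
    print(f"apparent scale at 30mm interaxial: {apparent_scale(30):.2f}x (gigantism)")
    print(f"apparent scale at 90mm interaxial: {apparent_scale(90):.2f}x (miniaturisation)")
```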
By George Jarrett, TVB Europe