Large-Sensor Camcorders and DSLRs

When Canon announced its EOS C300 camcorder, which has a 4K2K sensor but does not record 4K2K video, the company also announced it was developing a DSLR that will record 4K2K video. It may seem strange that a camcorder designed to take high-quality cinema lenses is limited to full-HD recording while a still camera will be able to record 4K2K video, but it is less surprising in light of how DSLRs evolved.

When large CCD and CMOS chips replaced an SLR's 35mm film, the logical next step was to place an LCD on the digital camera so one could review shots in the field. That was followed by a “live view” mode that let the shooter see what the sensor was capturing. From there, it was only a small step to compress the live-view images and record them as video.

Large-sensor digital camcorders have evolved from DSLRs. It is primarily a marketing decision whether to release digital motion picture technology in a still camera package, a camcorder package or both. However, one clear advantage of a camcorder package is space for a mic jack (even XLRs), a headphone jack and manual audio controls.

A potential buyer who is still learning about 4K2K production and post production may find it confusing to encounter the same technology in two different packages.

When a videographer shoots with a still camera, he or she will find expected camcorder functions missing. For example, every professional camcorder has some form of ND filtration; DSLRs do not.

A primary differentiator between DSLRs and traditional camcorders is their optical system. This is true for current HD and future 4K2K products.

Frame Size
While video cameras have frame sizes that relate directly to sensor size, such as 2/3in, DSLR frame size relates to 35mm film — in particular, 35mm still film. When shooting 35mm slide or negative film, each 36mm × 24mm image is placed with perforations above and below the frame.

DSLRs with 36mm × 24mm sensors are called full-frame cameras. The Canon EOS-1D X, announced for March 2012, employs an 18-megapixel full-frame sensor. Canon also uses a smaller 28.7mm × 19.1mm size, which it calls APS-H.

There are small variations in APS frame size: Canon APS-C (22.2mm × 14.8mm) and Nikon/Sony APS-C (23.4mm × 15.6mm). Both full-frame and APS sensors, when taking photos, have a 3:2 (1.50:1) aspect ratio. Panasonic uses a slightly smaller Micro Four Thirds (M43) sensor in its AF100 camcorder and GH2 still camera; it has a 1.33:1 aspect ratio and a 17.3mm × 13mm frame.



Sensors smaller than full frame reduce the potential for shallow (minimum) DOF. Minimum DOF, of course, also depends on the maximum aperture; a large sensor alone does not guarantee a shallow DOF.
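To make that relationship concrete, here is a back-of-the-envelope sketch using the standard hyperfocal-distance formulas. The focal lengths, apertures, distances and circle-of-confusion values are illustrative assumptions, not measurements of any particular camera.

```python
import math

# Sketch: depth of field from the standard hyperfocal-distance formulas.
# Illustrative values only; coc (circle of confusion) shrinks with sensor size.

def dof_mm(focal_mm, f_number, distance_mm, coc_mm):
    h = focal_mm ** 2 / (f_number * coc_mm) + focal_mm           # hyperfocal distance
    if distance_mm >= h:
        return float("inf")                                      # focused beyond hyperfocal
    near = distance_mm * (h - focal_mm) / (h + distance_mm - 2 * focal_mm)
    far = distance_mm * (h - focal_mm) / (h - distance_mm)
    return far - near

# Same framing of a subject 2m away, both lenses at f/2.8:
print(dof_mm(50, 2.8, 2000, 0.030))   # full frame, 50mm               -> roughly 260mm of DOF
print(dof_mm(33, 2.8, 2000, 0.020))   # APS-C, ~33mm for the same view -> roughly 410mm
```

The smaller sensor needs a shorter focal length for the same framing, and that, more than the sensor itself, is what deepens the DOF.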

When a lens designed for a full-frame camera is mounted on a camera with a smaller sensor, the lens' focal length is multiplied by the lens crop factor. (Crop factor equals the ratio of a 35mm frame's 43.3mm diagonal to the diagonal of the image sensor.) A Sony APS-C camera, for example, has a crop factor of 1.5. A 50mm “normal” lens becomes a 75mm tele lens.

When shooting video, a 16:9 window on the sensor is employed. This has three ramifications. First, the viewfinder image will shrink when switching a DSLR to video mode. (This shift can be minimized by shooting 16:9 photos.) Second, the number of pixels read out will be reduced, which is a positive. Third, the lens crop factor will slightly increase. For example, when a Sony APS-C camera is switched to video mode, the crop factor increases to 1.8, thus a 50mm lens acts as a 90mm lens.
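The crop-factor arithmetic is simple enough to sketch. The sensor dimensions below are published APS-C stills figures, and the 16:9 video window is assumed to span the full sensor width; actual cameras may read out a smaller window, which is why the in-camera video crop factor quoted above (1.8) can exceed this estimate.

```python
import math

# Sketch: crop factor = full-frame diagonal / sensor (or window) diagonal.
FULL_FRAME_DIAG = math.hypot(36.0, 24.0)          # ~43.3mm

def crop_factor(width_mm, height_mm):
    return FULL_FRAME_DIAG / math.hypot(width_mm, height_mm)

stills = crop_factor(23.4, 15.6)                  # Sony APS-C, 3:2 stills -> ~1.5
video  = crop_factor(23.4, 23.4 * 9 / 16)         # full-width 16:9 window -> ~1.6

print(f"50mm acts like {50 * stills:.0f}mm for stills")
print(f"50mm acts like {50 * video:.0f}mm in a full-width 16:9 video window")
```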

The earliest 35mm movie film had a 22mm × 18mm image, with perforations on the sides of each frame. In 1929, the Academy ratio was established; its roughly 22mm × 16mm image has a 1.37:1 aspect ratio. To obtain wide-screen, but not anamorphic, images, a Super 35 frame can be employed.

A 24.9mm × 13.9mm Super 35 frame has a native aspect ratio of 1.79:1 — a perfect match to 1.78 (16:9) HD. It also matches Quad-HD (3840 × 2160 pixels) and almost matches 4K2K, which is 4096 × 2160 pixels — a 1.90:1 aspect ratio. Not surprisingly, frame sizes that come from cinema cameras do not require the use of a 16:9 window when shooting video.


A 24.9mm × 13.9mm Super 35 frame has a native aspect ratio of 1.79:1


Lens Zoom System
While the videographer likely knows that DSLR lenses do not have power zoom, he or she may not know that photo lenses have other issues. For example, the zoom ring may have high friction because of the need to significantly extend the lens when zooming. Pressure exerted to start a zoom while shooting can easily cause a visible disturbance.

Better Sony lenses, such as A-mount lenses that use micro ball bearings, may cause noise that will be picked up by an on-camera mic.

AF System
Photographers are used to trusting auto-focus, even on action shots where the shooter is following a moving subject. When a DSLR's mirror is in the 45-degree position so the shooter can see the subject, AF is possible. A portion of the image passes through a semitransparent area of the mirror, reflects off a small secondary mirror mounted on the back of the main mirror and is cast onto a small sensor at the bottom of the camera. That sensor, in conjunction with a processor, commands the lens' AF motor to move to the position calculated to give precise focus. This system is called phase detection AF.


A phase detection AF system uses a series of mirrors, a small sensor at the bottom of the camera
and a processor to calculate precise focus.


DSLRs employ a different AF system when shooting video because the mirror must stay up continuously. The processor therefore obtains focus information directly from the CMOS image sensor by analysing image contrast, which is why this approach is called contrast detection AF.

Mirrorless cameras such as the Panasonic GH2 and Sony NEX-5N must use contrast detection AF. (Strictly speaking, digital cameras without a mirror do not have a reflex system and, therefore, are not DSLRs.)

Contrast detection AF systems work by having a microprocessor rapidly command the lens servomotor to step forward and backward by a tiny amount. The processor notes whether contrast increases or decreases. If contrast increases, then current focus is not perfect. Therefore, stepping forward and backward continues. When there is no change in contrast, the current focus is the best possible.
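In code, that search is essentially a hill climb over lens positions. The sketch below is a minimal illustration; measure_contrast() is a hypothetical function that returns a sharpness score (for example, summed image gradients) for a given focus position.

```python
# Minimal hill-climb sketch of contrast detection AF.
# measure_contrast(position) is a hypothetical sharpness metric, not a real API.

def contrast_detect_af(measure_contrast, position, step=1.0, min_step=0.05, max_iters=200):
    best = measure_contrast(position)
    for _ in range(max_iters):
        if step < min_step:
            break                                  # step is tiny: contrast has peaked
        for candidate in (position + step, position - step):
            score = measure_contrast(candidate)    # nudge the focus and re-measure
            if score > best:                       # contrast rose: keep moving this way
                position, best = candidate, score
                break
        else:
            step /= 2                              # no gain either way: refine the step
    return position                                # best focus position found
```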

Contrast detection tends to be slower than phase detection and becomes slower at low light levels. And, unless the lens is designed to be quiet, AF noise may be recorded.

Aperture System
Photography lenses are designed to click into key f-stops: f/2.8, f/4, f/5.6, f/8, f/11, f/16 and f/22. Those clicks are visible as abrupt exposure jumps if the aperture is changed during a shot, so cinema and video lenses are designed so the aperture changes in a continuous manner. One solution is to use camera lenses designed by the camera's manufacturer for video shooting. The other solution is to use cinema lenses.

ND Capability
To obtain a shallow DOF with a large-chip camera under bright light — at the slow shutter speed required for the correct amount of video motion blur — an ND is a must. (ND filtration also will be required to keep the aperture under f/11 to minimize diffraction.) When a camcorder does not have a built-in filter, a shooter has three choices: mount the camera on rails on which a matte box is mounted, attach one of several ND filters to the lens or employ a vario-ND filter.
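The arithmetic behind that requirement can be sketched with the standard exposure-value relation EV = log2(N²/t). The metered values below are assumptions chosen to represent a bright exterior; ISO is held constant.

```python
import math

# How many stops of ND are needed to open up from f/11 to f/2.8
# while keeping the shutter at 1/50s for normal video motion blur?

def ev(f_number, shutter_s):
    return math.log2(f_number ** 2 / shutter_s)

metered = ev(11, 1 / 50)      # assumed correct exposure in bright light
wanted  = ev(2.8, 1 / 50)     # same shutter, shallow-DOF aperture

nd_stops = metered - wanted
print(round(nd_stops, 1))     # ~4 stops, i.e. an ND 1.2 filter (0.3 per stop)
```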

Lens Mount Type
Both cameras and camcorders that employ large sensors use a lens mount designed to work with their brand of lenses. For example, Sony's NEX family — including the FS100 and VG20 camcorders — uses Sony's E-mount. Sony markets the LA-EA2 adaptor, which enables the use of Sony and Minolta A-mount lenses. The LA-EA2 has a translucent mirror system that provides phase detection AF to many A-mount lenses.

For most interchangeable lens cameras, third-party adaptors are available. These enable you to use your favorite photo lenses on a new camera or camcorder. For example, a Sony NEX camera can use Nikon F, Canon FD, Leica M, Leica R, Pentax, Minolta MD, Olympus and Contax/Yashica lenses by using an E-mount adaptor.

Only a few adaptors, such as the LA-EA2, provide electrical signals to a lens. Without electrical connections, in-lens optical stabilization, AF and aperture control cannot function, and the camera receives no information from the lens. A modern photo lens that normally reports its aperture setting to the camera's AE system therefore cannot do so.

Solutions to these issues include working with still camera and cinema lenses in a fully manual way (which may be a camera operator's first choice) or using a manufacturer's lenses that have electrical contacts.

Bringing it Home
No matter whether you shoot with a still camera or a camcorder, images from the sensor must be compressed and recorded. Currently, two codecs are used for recording 4K2K: the Sony F65RAW (16-bit RAW) codec, recorded by a docking SRMaster field recorder to SRMemory cards, and the RED R3D wavelet codec, recorded to a REDMAG solid-state drive.

Future 4K2K codec options include H.264 (as a single stream or as four HD streams) and 4K2K versions of current HD formats.

By Steve Mullen, Broadcast Engineering

DPP Unveils Technical & Metadata Standards for File-based Programme Delivery

The Digital Production Partnership (DPP) – a partnership between ITV, Channel 4 and the BBC – has unveiled its new Technical and Metadata Standards for File-based programme delivery in the UK.

Through the DPP, seven major broadcasters (BBC, ITV, C4, Sky, Channel Five, S4C and UKTV), have all agreed the UK’s first common file format, structure and wrapper to enable TV programme delivery by digital file. These new guidelines will complement the common standards already published by the DPP for tape delivery of HD and SD TV programmes.

Working closely with the Advanced Media Workflow Association (AMWA) in the US, the DPP has been the driving force behind the creation of the organisation’s ‘AS-11,’ a new international file format for HD files. The new DPP guidelines will require files delivered to UK broadcasters to be compliant with a specified subset of this new, internationally recognised standard.

By implementing one set of pan-industry technical standards for the UK, the DPP aims to minimise confusion and expense for programme-makers, and avoid a situation where a number of different file types and specifications proliferate.

The new DPP standards aim to remove any ambiguity during the production and delivery process. A key aspect is the inclusion of editorial and technical metadata, which will ensure a consistent set of information for the processing, review, and scheduling of programmes, as well as their onward archiving, sale and distribution.

As part of the file-based guidelines, the DPP’s member broadcasters have agreed a minimum set of common metadata to be delivered with a file-based programme. And, in a bid to encourage international adoption of its metadata standards, the DPP has worked closely with the European Broadcasting Union (EBU), mapping its minimum set of common metadata to existing ‘EBU-Core’ and ‘TV-Anytime’ metadata sets.

Alongside these new standards, the DPP is currently building a free-to-use, downloadable, metadata application to enable production companies to enter the required editorial and technical metadata easily. The new application is due to launch in spring 2012.

The agreement of these new file-based technical standards does not signal an immediate move to file-based delivery. Instead, the DPP seeks to provide clarity around digital delivery, which will become the expected standard in the future.

During 2012 BBC, ITV and Channel 4 will begin to take delivery of programmes on file on a selective basis. Production companies wishing to deliver by file should discuss this at the point of commission, and seek formal agreement with their broadcaster at the outset of production. After a period of selective piloting, file-based delivery will be the preferred delivery format for these broadcasters by 2014.

Source: Digital Production Partnership

AS-11: MXF for Contribution

AS-11 is a vendor-neutral subset of the MXF file format for the delivery of finished programming from program producers and distributors to broadcast stations. AS-11 files are intended to be complete and ready for playout.

AS-11 supports playout while the file transfer is in progress, a workflow referred to as “late delivery”. It is preferable for AS-11 files to be used by playout servers directly, without rewrapping of the MXF data structures.

The content may be delivered at the ultimate bit-rate, picture format and aspect ratio, or it may be transcoded at the broadcast station to the required bit-rates and formats. Similar transcoding may be applied to audio and captions; additionally, specific audio and caption tracks may be selected for different broadcast channels.

The content may be pre-packaged for broadcast without further splicing or it may be segmented for ease of insertion or replacement of interstitials.

AS-11 supports SD video encoded as D-10, 50Mbit/s, and HD as AVC-Intra Class 100. Audio can be PCM, AC-3 or Dolby E.

AS-11 defines a minimal core metadata set required in all AS-11 files, a program segmentation metadata scheme, and permits inclusion of custom shim-specific metadata in the MXF file.

Source: Advanced Media Workflow Association

EBU-TT Subtitling Format Published for Industry Comments

The EBU has published a new Subtitling Format specification (EBU Tech 3350). The new format is called EBU Timed Text (EBU-TT) and provides an easy-to-use method to interchange and archive subtitles in XML.

EBU-TT is based on the W3C Timed Text Markup Language (TTML) specification. The EBU format can be seen as a constrained version of the W3C spec, aimed at providing a solution more tailored to broadcast operation. This is especially relevant as broadcasters are increasingly moving to file-based HDTV facilities, where subtitles are created, edited, exchanged and archived together with the content.
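For readers who have not seen TTML, the fragment below shows the general shape of such a document. It is a plain, illustrative W3C TTML sample only; the timings and text are invented, and it makes no claim to satisfy every EBU-TT (Tech 3350) constraint.

```python
import xml.etree.ElementTree as ET

# An invented, minimal TTML document (the W3C format EBU-TT constrains).
TTML_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:03.500">First subtitle line</p>
      <p begin="00:00:04.000" end="00:00:06.000">Second subtitle line</p>
    </div>
  </body>
</tt>
"""

root = ET.fromstring(TTML_SAMPLE.encode("utf-8"))
ns = {"tt": "http://www.w3.org/ns/ttml"}
for p in root.findall(".//tt:p", ns):
    print(p.get("begin"), "->", p.get("end"), ":", p.text)
```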



The previous EBU subtitling format was EBU STL (Tech 3264), developed at a time when information was still exchanged on floppy discs. However, as many broadcasters still use STL or have archived STL files, great care was taken in the development of EBU-TT to make sure that it provides backwards compatibility with its predecessor.

The EBU is also providing an XML Schema for EBU-TT.

Source: EBU

Tech Upstarts Kicking Glasses in 3D

There's no shortage of innovation from the major TV manufacturers on display at the huge booths at CES: OLED, 4K -- even 8K -- resolution, new interfaces, connectivity, exclusive content. What's absent here, though, are any prototypes to indicate that glasses-free (autostereo) 3D TV is anywhere close to market.

That's not to say that autostereo TV can't be found at CES. It's just coming from smaller companies in smaller booths -- one with barely a booth at all. They are pressing ahead with -- and showing off -- autostereo screens for television, tablets and smartphones while the big makers remain oddly quiet on the topic.

"Consumer electronics companies wanted to get into the home market quickly," said Raja Rajan, chief operating officer of Stream TV, whose booth in Central Hall, of the mammoth Las Vegas Convention Center, is not far from Sony's. "The consumer electronics companies have tremendous financial pressures to get to market with the fastest, easiest technologies."

That is echoed by one of Rajan's competitors, Stephen Blumenthal of 3D Fusion, a late addition to the floor that has one of its models tucked into the 3D Bee booth at the periphery of Central Hall.

"They brought (3D with glasses) to the market as a very straightforward consumer play, and until they burn through the opportunity to make as much revenue off of it as possible, this adventure with the next step is on the back burner," Blumenthal said.

His partner Ilya Sorokin noted, "The 3D with glasses technology was much easier to incorporate into their existing infrastructure because it was already there, and just lying on a shelf."

Both 3D Fusion and Stream TV are using advanced, lens-based tech that, according to Rajan, was abandoned by the big companies.

Rajan said he toured Asia showing Stream TV's screens and its real-time 2D-to-3D converter to major hardware makers, who responded enthusiastically. Stream TV is looking to be a technology provider, not to manufacture under its own name.

"We expect in the next few weeks to start announcing some of the first brands and products rolling out," Rajan said.

He said there is strong interest from Hollywood in the converter box, because it can be built into cable and satellite boxes, enabling all channels to be in 3D. At the same time, Stream's units come with controllers so the consumer can turn the 3D down, or off altogether, for comfort or personal preference.

"Our cost is incrementally 10% to 15% max over the cost of goods for a 2D television," Rajan said. "That's significant because a big re-seller can get into the consumer market at a cost consumers can afford."

MasterImage 3D, which has a solid worldwide business projecting 3D in theaters, is in the South Hall. It has been in the autostereo screen business for some time, and this year is at CES with two screens aimed straight at state-of-the-art mobile devices: a 720p 4.3-inch smartphone display and a WUXGA (1920x1200) display for tablets.

Royston Taylor, exec VP and general manager for MasterImage, said he welcomes the competition from Stream TV, which is also showing tablet screens.

"First, it validates what you're trying to do," Taylor said. "Being on your own is nice in terms of no competition, but it's very lonely in terms of being the only voice saying how great something is. The second thing is competition is always good for the consumer."

Despite strong sales of the Nintendo 3DS, the poor critical response to it and to the HTC Evo 3D and LG Optimus 3D phones has made some makers nervous, Taylor said. He now expects to be making announcements of deals with consumer electronics companies by April and to have gear with MasterImage 3D screens in stores by Thanksgiving.

One hurdle that had to be overcome was the lack of technical standards for judging the quality of a 3D display.

"Right now it's almost entirely subjective," he said. "Big companies won't risk a $250 million phone line on 3D just because it looks nice."

But a French company, Eldim, has come up with a product for testing 3D displays on objective, technical measurements. With standards in place, it will be possible to compare products and establish quality control in manufacturing.

3D Fusion is already selling autostereo TVs for use in digital signage. Blumenthal said the company is selling its turnkey solution, which includes a 42-inch autostereo display, at CES. Cost is $8,000. His sales are to retailers, small mom-and-pop chains, malls. Blumenthal and Sorokin recognize that their company is small and they're in no position to ramp up to consumer volumes on their own. Like Stream TV, they'd be happy to license their technology.

By David S. Cohen, Variety

3-D Cameras for Cellphones

Researchers at Massachusetts Institute of Technology (MIT) have developed a system that uses specially designed algorithms to produce a detailed 3D image with just a cheap photodetector and the processor power found in a smartphone.

Like other sophisticated depth-sensing devices, the MIT system, called CoDAC (compressive depth acquisition camera), uses the “time of flight” of light particles to gauge depth: A pulse of infrared laser light is fired at a scene, and the camera measures the time it takes the light to return from objects at different distances.

Traditional time-of-flight systems use one of two approaches to build up a “depth map” of a scene. LIDAR (for LIght Detection And Ranging) uses a scanning laser beam that fires a series of pulses, each corresponding to a point in a grid, and separately measures their time of return. But that makes data acquisition slower, and it requires a mechanical system to continually redirect the laser.

The alternative, employed by so-called time-of-flight cameras, is to illuminate the whole scene with laser pulses and use a bank of sensors to register the returned light. But sensors able to distinguish small groups of light particles — photons — are expensive: A typical time-of-flight camera costs thousands of dollars.

The MIT researchers’ system, by contrast, uses only a single light detector — a one-pixel camera. But by using some clever mathematical tricks, it can get away with firing the laser a limited number of times.

The first trick is a common one in the field of compressed sensing: The light emitted by the laser passes through a series of randomly generated patterns of light and dark squares, like irregular checkerboards. Remarkably, this provides enough information that algorithms can reconstruct a two-dimensional visual image from the light intensities measured by a single pixel.
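A toy version of that measurement model fits in a few lines. The sketch below simulates the single-pixel measurements with random masks and recovers the image with plain least squares; a practical system would use a sparsity-exploiting (l1-regularised) solver instead, which is what allows the small number of flashes the researchers report. All sizes here are illustrative.

```python
import numpy as np

# Toy single-pixel measurement model: one scalar per random light/dark pattern.
rng = np.random.default_rng(0)
n = 16 * 16                          # pixels in a tiny target image
m = n // 2                           # number of patterns / laser flashes

scene = rng.random(n)                                      # unknown image (flattened)
patterns = rng.integers(0, 2, size=(m, n)).astype(float)   # checkerboard-like masks
measurements = patterns @ scene                            # light totals at the one pixel

# Naive recovery (minimum-norm least squares). A sparsity prior would tolerate
# far fewer measurements, in the spirit of the ~5 percent figure quoted below.
estimate, *_ = np.linalg.lstsq(patterns, measurements, rcond=None)
print(np.linalg.norm(estimate - scene) / np.linalg.norm(scene))   # relative error
```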

In experiments, the researchers found that the number of laser flashes — and, roughly, the number of checkerboard patterns — that they needed to build an adequate depth map was about 5 percent of the number of pixels in the final image. A LIDAR system, by contrast, would need to send out a separate laser pulse for every pixel.

To add the crucial third dimension to the depth map, the researchers use another technique, called parametric signal processing. Essentially, they assume that all of the surfaces in the scene, however they’re oriented toward the camera, are flat planes. Although that’s not strictly true, the mathematics of light bouncing off flat planes is much simpler than that of light bouncing off curved surfaces. The researchers’ parametric algorithm fits the information about returning light to the flat-plane model that best fits it, creating a very accurate depth map from a minimum of visual information.
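As a loose illustration of that flat-plane assumption (and not the researchers' actual parametric time-of-flight algorithm), the sketch below fits a plane z = ax + by + c to noisy depth samples by least squares; three fitted numbers then summarise many individual measurements.

```python
import numpy as np

# Illustration only: least-squares fit of a flat plane to noisy depth samples.
rng = np.random.default_rng(1)
x, y = rng.random(200), rng.random(200)            # sample positions across a surface
true_a, true_b, true_c = 0.3, -0.8, 2.0
z = true_a * x + true_b * y + true_c + 0.002 * rng.standard_normal(200)

A = np.column_stack([x, y, np.ones_like(x)])       # design matrix for z = ax + by + c
(a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
print(a, b, c)                                     # recovers ~0.3, -0.8, 2.0
```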




Indeed, the algorithm lets the researchers get away with relatively crude hardware. Their system measures the time of flight of photons using a cheap photodetector and an ordinary analog-to-digital converter — an off-the-shelf component already found in all cellphones. The sensor takes about 0.7 nanoseconds to register a change to its input.

That's enough time for light to travel 21 centimeters, says Vivek Goyal from MIT's Research Lab of Electronics. “So for an interval of depth of 10 and a half centimeters — I'm dividing by two because light has to go back and forth — all the information is getting blurred together.”
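The round-trip arithmetic behind those figures is easy to check:

```python
c = 3.0e8                   # speed of light, m/s (approximate)
response = 0.7e-9           # detector/ADC response time, s

path = c * response         # light path covered in that time -> ~0.21 m
depth_blur = path / 2       # halved for the round trip       -> ~0.105 m
print(path, depth_blur)
```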

Because of the parametric algorithm, however, the researchers’ system can distinguish objects that are only two millimeters apart in depth. “It doesn’t look like you could possibly get so much information out of this signal when it’s blurred together,” Goyal says.

The researchers’ algorithm is also simple enough to run on the type of processor ordinarily found in a smartphone. To interpret the data provided by the Kinect, by contrast, the Xbox requires the extra processing power of a graphics-processing unit, or GPU, a powerful special-purpose piece of hardware.

“This is a brand-new way of acquiring depth information,” says Yue M. Lu, an assistant professor of electrical engineering at Harvard University. “It’s a very clever way of getting this information.” One obstacle to deployment of the system in a handheld device, Lu speculates, could be the difficulty of emitting light pulses of adequate intensity without draining the battery.

But the light intensity required to get accurate depth readings is proportional to the distance of the objects in the scene, Goyal explains, and the applications most likely to be useful on a portable device — such as gestural interfaces — deal with nearby objects. Moreover, he explains, the researchers’ system makes an initial estimate of objects’ distance and adjusts the intensity of subsequent light pulses accordingly.

Telecoms company Qualcomm has awarded the research team one of its $100,000 Innovation Fellowship grants to continue the research.

By Larry Hardesty, Massachusetts Institute of Technology

BetterView

BetterView's up-conversion technology uses Super-Resolution (SR) reconstruction, which fuses several low-quality images into a higher-quality result with improved optical resolution. The task goes beyond simple scaling-up of the visual content: it introduces true (optical) resolution enhancement.

Several low-resolution images of the same scene. Note they are slightly different from each other.



One HR image (with more pixels, better optical resolution, and less noise), obtained by fusing the previous images.


It has been known for the past 20 years that, in principle, one could take several low-quality images and fuse them into a single, higher-resolution outcome. Researchers have demonstrated this with a variety of techniques and algorithms, and the problem became a hot field in image processing, with thousands of academic papers published during the past two decades on the problem and ways to handle it. The classical approach to fusing the low-quality images requires finding an exact correspondence between their pixels, a process known as "motion estimation".
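To make the classical idea concrete, here is a minimal shift-and-add sketch that assumes the sub-pixel shifts are already known. It illustrates the textbook approach only, not BetterView's patent-pending algorithm, and it omits the motion estimation and deblurring a real system needs.

```python
import numpy as np

# Classical shift-and-add super-resolution with known sub-pixel shifts.
# frames: list of HxW arrays; shifts: per-frame (dy, dx) in low-res pixels.
def shift_and_add(frames, shifts, scale=2):
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))          # high-resolution accumulator
    hits = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        for y in range(h):
            for x in range(w):
                hy = int(round((y + dy) * scale)) % (h * scale)
                hx = int(round((x + dx) * scale)) % (w * scale)
                acc[hy, hx] += frame[y, x]          # place each sample on the fine grid
                hits[hy, hx] += 1
    hits[hits == 0] = 1                             # unobserved cells stay at zero
    return acc / hits
```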

BetterView technology is based on a recently developed, patent-pending family of novel SR algorithms proposed by a world leader in this field, Prof. Michael Elad (Technion – Israel Institute of Technology). Elad and his collaborator, Dr. Matan Protter, devised the first method that overcomes the requirement for very accurate and explicit motion estimation found in previous SR technologies.

Exact motion estimation has been a crucial stage in every earlier SR algorithm, considerably limiting the scenes that can be handled. The new family of SR techniques avoids exact motion estimation and replaces it with a probabilistic estimate. This makes it possible to handle general scenes containing extremely complex motion patterns. The results are impressive, with no visual artifacts, and the process is completely robust.

