Researchers at Carnegie Mellon
University have demonstrated that they can combine iPhone videos shot in the
wild by separate cameras to create 4D visualizations that allow viewers to
watch action from various angles, or even erase people or objects that temporarily
block sight lines. Imagine a visualization of a wedding reception, where
dancers can be seen from as many angles as there were cameras, and the tipsy
guest who walked in front of the bridal party is nowhere to be seen. The videos
can be shot independently from variety of vantage points, as might occur at a
wedding or birthday celebration. It also is possible to record actors in one
setting and then insert them into another.
Virtualized reality is nothing
new, but in the past, it has been restricted to studio setups, such as CMU's
Panoptic Studio, which boasts more than 500 video cameras embedded in its
geodesic walls. Fusing visual information of real-world scenes shot from
multiple, independent, handheld cameras into a single comprehensive model that
can reconstruct a dynamic 3D scene simply has not been possible. Researchers
worked around that limitation by using convolutional neural nets (CNNs), a type
of deep learning program that has proven adept at analyzing visual data. They
found that scene-specific CNNs could be used to compose different parts of the
scene.
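As a rough, hypothetical illustration of what such a scene-specific composition
network might look like, the sketch below is a small PyTorch encoder-decoder
CNN that takes a coarse, incomplete view of the scene (for example, a
projection of reconstructed 3D points for a chosen virtual camera) and outputs
a dense RGB image. The architecture, layer sizes, and input format are
assumptions made for illustration, not the CMU implementation.

    import torch
    import torch.nn as nn

    class SceneCompositionCNN(nn.Module):
        # Hypothetical sketch: an encoder-decoder CNN that fills in a coarse
        # rendering of one virtual viewpoint. Channel counts and layer choices
        # are illustrative assumptions only.
        def __init__(self, in_channels=3, base=32):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, base, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(base, base * 2, 3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(base, 3, 4, stride=2, padding=1),
                nn.Sigmoid(),  # RGB values in [0, 1]
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))

    if __name__ == "__main__":
        # One such network would be trained per scene ("scene-specific"),
        # supervised by that scene's own video frames; here we only check shapes.
        model = SceneCompositionCNN()
        coarse_view = torch.rand(1, 3, 256, 256)  # stand-in for a projected point image
        dense_view = model(coarse_view)
        print(dense_view.shape)  # torch.Size([1, 3, 256, 256])

Because each network is trained only on footage of its own scene, it can learn
to fill gaps and remove occluders for that scene without needing a general
model of the world.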