17 April 2014

Orienteering for Robots

Suppose you’re trying to navigate an unfamiliar section of a big city, and you’re using a particular cluster of skyscrapers as a reference point. Traffic and one-way streets force you to take some odd turns, and for a while you lose sight of your landmarks. When they reappear, in order to use them for navigation, you have to be able to identify them as the same buildings you were tracking before, and to determine your orientation relative to them. That type of re-identification is second nature for humans, but it’s difficult for computers. MIT researchers have developed a new algorithm that could make it much easier, by identifying the major orientations in 3D scenes. The same algorithm could also simplify the problem of scene understanding, one of the central challenges in computer vision research.

The algorithm is primarily intended to aid robots navigating unfamiliar buildings, not motorists navigating unfamiliar cities, but the principle is the same. It works by identifying the dominant orientations in a given scene, which it represents as sets of axes, called ‘Manhattan frames’, embedded in a sphere. As a robot moved, it would, in effect, observe the sphere rotating in the opposite direction, and could gauge its orientation relative to the axes. Whenever it wanted to reorient itself, it would know which of its landmarks’ faces should be facing it, making them much easier to identify. As it turns out, the same algorithm also drastically simplifies the problem of plane segmentation, or deciding which elements of a visual scene lie in which planes, at what depth.
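In broad strokes, estimating a Manhattan frame amounts to fitting three mutually orthogonal directions to the surface normals a depth sensor observes, then labeling each normal by the axis it best aligns with. The sketch below illustrates that idea in Python with a simple alternating assign-and-refit scheme. It is not the researchers’ algorithm, and every function name, parameter, and threshold in it is an assumption made for illustration.

    # Illustrative sketch only: estimate a "Manhattan frame" (three dominant,
    # mutually orthogonal directions) from surface normals, then label each
    # normal by the axis it aligns with. Names and thresholds are assumptions.
    import numpy as np

    def estimate_manhattan_frame(normals, iters=20):
        """Fit a rotation R whose columns are the three dominant axes.

        normals: (N, 3) array of unit surface normals (e.g. from a depth camera).
        Returns a 3x3 rotation matrix.
        """
        R = np.eye(3)
        for _ in range(iters):
            # Assign each normal to the closest axis (sign-invariant).
            dots = normals @ R                        # (N, 3) cosines with each axis
            labels = np.argmax(np.abs(dots), axis=1)
            # Re-estimate each axis as the mean of its assigned normals,
            # flipping normals so they point along (not against) the axis.
            M = np.zeros((3, 3))
            for k in range(3):
                sel = normals[labels == k]
                if len(sel) == 0:
                    M[:, k] = R[:, k]                 # keep the old axis if empty
                    continue
                signs = np.sign(sel @ R[:, k])
                M[:, k] = (sel * signs[:, None]).mean(axis=0)
            # Project back onto the nearest rotation (orthonormalize via SVD).
            U, _, Vt = np.linalg.svd(M)
            R = U @ Vt
            if np.linalg.det(R) < 0:                  # keep a right-handed frame
                U[:, -1] *= -1
                R = U @ Vt
        return R

    def segment_by_axis(normals, R, cos_thresh=0.9):
        """Label each normal with the Manhattan axis it aligns with, or -1."""
        dots = np.abs(normals @ R)
        labels = np.argmax(dots, axis=1)
        labels[np.max(dots, axis=1) < cos_thresh] = -1   # not axis-aligned
        return labels

    # Toy usage: normals drawn near the three coordinate axes, plus noise.
    rng = np.random.default_rng(0)
    picks = np.eye(3)[rng.integers(0, 3, size=300)]
    noisy = picks + 0.05 * rng.standard_normal((300, 3))
    noisy /= np.linalg.norm(noisy, axis=1, keepdims=True)
    R = estimate_manhattan_frame(noisy)
    print(segment_by_axis(noisy, R)[:10])

In this toy version, normals labeled -1 are those that align with none of the three axes, which in a real scene would correspond to clutter or curved surfaces; the axis labels themselves give a rough plane segmentation of the axis-aligned structure.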
