30 July 2020

Human-Like Perception for Robots

To carry out high-level tasks, researchers believe robots will have to be able to perceive their physical environment as humans do. Researchers at MIT have developed a representation of spatial perception for robots that is modeled after the way humans perceive and navigate the world. The new model, which they call 3D Dynamic Scene Graphs, enables a robot to quickly generate a 3D map of its surroundings that also includes objects and their semantic labels (a chair versus a table, for instance), as well as people, rooms, walls, and other structures the robot is likely to see in its environment. The model also allows the robot to extract relevant information from the 3D map: it can query the locations of objects and rooms, or the movement of people in its path.

Until now, robotic vision and navigation have advanced mainly along two routes: 3D mapping, which enables robots to reconstruct their environment in three dimensions as they explore in real time, and semantic segmentation, which helps a robot classify features in its environment as semantic objects, such as a car versus a bicycle, but which so far is mostly done on 2D images.
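To make the layered scene-graph idea concrete, here is a minimal, hypothetical Python sketch of how such a structure might organize objects, rooms, and moving people, and answer the kinds of queries described above. The class names, layer names, and query functions are illustrative assumptions for exposition, not the actual Kimera or 3D Dynamic Scene Graphs interface.

    from collections import defaultdict

    # Hypothetical, simplified sketch of a layered 3D dynamic scene graph.
    # Layer names and node fields are illustrative only.

    class Node:
        def __init__(self, node_id, layer, label, position):
            self.node_id = node_id    # unique identifier, e.g. "chair_3"
            self.layer = layer        # "object", "agent", "room", "structure"
            self.label = label        # semantic class, e.g. "chair"
            self.position = position  # (x, y, z) centroid in the map frame

    class SceneGraph:
        def __init__(self):
            self.nodes = {}                        # node_id -> Node
            self.edges = defaultdict(set)          # node_id -> connected node_ids
            self.trajectories = defaultdict(list)  # agent_id -> timestamped positions

        def add_node(self, node):
            self.nodes[node.node_id] = node

        def add_edge(self, a, b):
            # e.g. "chair_3 is inside room_2", "room_2 is adjacent to room_5"
            self.edges[a].add(b)
            self.edges[b].add(a)

        def record_agent_pose(self, agent_id, t, position):
            # Dynamic entities (people) keep a time-stamped track, not a single pose.
            self.trajectories[agent_id].append((t, position))

        def find(self, label):
            # Query: where are all nodes with this semantic label?
            return [n for n in self.nodes.values() if n.label == label]

        def room_of(self, node_id):
            # Query: which room contains this object?
            return [self.nodes[m] for m in self.edges[node_id]
                    if self.nodes[m].layer == "room"]

    # Example queries a robot might issue while planning a path:
    # graph.find("chair")            -> all chairs and their 3D positions
    # graph.room_of("chair_3")       -> the room that contains chair_3
    # graph.trajectories["person_1"] -> where person_1 has been moving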


The key component of the team’s new model is Kimera, an open-source library the team previously developed to construct a 3D geometric model of an environment while simultaneously encoding the likelihood of what an object is (a chair versus a desk, for instance). Kimera works by taking in streams of images from a robot’s camera, along with inertial measurements from onboard sensors, to estimate the trajectory of the robot or camera and to reconstruct the scene as a 3D mesh in real time. To generate a semantic 3D mesh, Kimera uses an existing neural network trained on millions of real-world images to predict the label of each pixel, and then projects these labels into 3D using ray casting, a technique commonly used in computer graphics for real-time rendering. The result is a map of the robot’s environment that resembles a dense 3D mesh, where each face is color-coded according to the object, structure, or person it belongs to. The team tested the new model in a photo-realistic simulator, developed in collaboration with MIT Lincoln Laboratory, that simulates a robot navigating through a dynamic office environment filled with people moving around.
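As a rough illustration of the semantic-labeling step, the Python sketch below carries per-pixel class predictions into 3D along the camera rays and accumulates them as votes on the map. This is a simplification made for exposition: the depth image, camera intrinsics, voxel voting, and function names are assumptions, and Kimera's actual pipeline casts rays against its reconstructed mesh and attaches label probabilities to mesh vertices rather than voting into voxels.

    import numpy as np

    # Simplified sketch: each pixel's predicted class is carried along the
    # camera ray into 3D and accumulated as a vote on the map.

    def back_project(u, v, depth, K):
        """Return the 3D point (camera frame) seen at pixel (u, v)."""
        fx, fy = K[0, 0], K[1, 1]
        cx, cy = K[0, 2], K[1, 2]
        z = depth[v, u]
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.array([x, y, z])

    def accumulate_labels(label_image, depth, K, T_world_cam, votes, voxel=0.1):
        """Cast each labeled pixel into the world map and vote for its class.

        label_image : HxW array of per-pixel class ids from a 2D segmentation net
        depth       : HxW depth image aligned with label_image
        K           : 3x3 camera intrinsics
        T_world_cam : 4x4 camera-to-world pose from visual-inertial odometry
        votes       : dict mapping a voxel index to per-class vote counts
        """
        h, w = label_image.shape
        for v in range(h):
            for u in range(w):
                if depth[v, u] <= 0:
                    continue  # no depth measurement along this ray
                p_cam = back_project(u, v, depth, K)
                p_world = (T_world_cam @ np.append(p_cam, 1.0))[:3]
                key = tuple(np.floor(p_world / voxel).astype(int))
                votes.setdefault(key, {})
                cls = int(label_image[v, u])
                votes[key][cls] = votes[key].get(cls, 0) + 1

    # After many frames, each voxel's most-voted class becomes its semantic
    # label, and mesh faces inherit the label of the region they fall in.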

More information:

https://lids.mit.edu/news-and-events/news/new-model-aims-give-robots-human-perception-their-physical-environments