In a major step forward for robotics, researchers from MIT, Amazon Robotics, and the University of British Columbia have developed a novel technique that allows robots to assess the properties of objects—such as weight, softness, or internal contents—by simply picking them up and giving them a gentle shake. Remarkably, this method relies solely on internal sensors, eliminating the need for external cameras or tactile systems.

This innovative approach mimics a common human behavior: gauging what’s inside a box by lifting and shaking it. By enabling robots to do the same, the team has created a low-cost, efficient method for robots to interpret the physical world, especially in environments where vision-based systems are impractical—like dark basements or disaster-stricken buildings.

The technique hinges on a robot's proprioception, its ability to sense the movement and position of its own body. When a robot lifts an object, its joint encoders record how the joints move, and that motion is shaped by the object's properties: a heavier object, for example, slows the arm's motion more than a lighter one when the same force is applied.
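
As a rough illustration of that principle, and not the authors' algorithm, the toy sketch below estimates a payload's mass with Newton's second law: the controller applies a known lifting force, the joint encoders (after forward kinematics) yield the gripper's vertical position over time, and the ratio of force to measured acceleration gives the mass. The force value, sampling rate, and the simple finite-difference treatment are all assumptions made for illustration.

```python
import numpy as np

def estimate_payload_mass(gripper_heights, dt, applied_force):
    """Toy proprioceptive mass estimate for a straight vertical lift.

    gripper_heights: vertical position of the gripper over time (m),
                     reconstructed from joint-encoder readings (assumed given).
    dt:              encoder sampling interval (s).
    applied_force:   upward force the gripper exerts on the object (N).
    """
    # Differentiate the encoder-derived trace twice to get acceleration.
    velocity = np.gradient(gripper_heights, dt)
    acceleration = np.gradient(velocity, dt)

    # Newton's second law for the payload: F - m*g = m*a  =>  m = F / (a + g)
    g = 9.81
    mean_accel = float(np.mean(acceleration[5:-5]))  # trim noisy endpoints
    return applied_force / (mean_accel + g)

# Example: a 12 N lifting force that yields ~1.2 m/s^2 of upward acceleration
# implies a payload of roughly 12 / (1.2 + 9.81) ≈ 1.1 kg.
```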

These sensors are already present in most robotic joints, making the system highly cost-effective. Unlike traditional approaches that depend on computer vision or additional hardware, this method leverages what’s already built into the robot.

To interpret the data gathered during an interaction, the system uses two simulations: one that models the robot’s motion and another that models the object’s physical dynamics. This digital twin approach allows the algorithm to analyze how the robot’s movements differ based on changes in an object’s properties.
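
To make the pairing concrete, here is a minimal, hypothetical stand-in for the two models: a single rotary joint (the robot side) whose effective inertia grows with the mass of a payload held at the end of its link (the object side). Running it with different candidate masses shows how the same commanded torque produces different simulated joint trajectories, which can then be compared against what the real encoders recorded. The link length, torque, and integration scheme are assumptions chosen only for illustration.

```python
import numpy as np

def simulate_joint(payload_mass, torque=2.0, link_length=0.5,
                   arm_inertia=0.3, steps=200, dt=0.01):
    """Forward-simulate one rotary joint swinging a point-mass payload.

    The payload contributes m * L^2 to the joint's effective inertia, so a
    heavier object makes the same torque produce a slower swing.
    """
    inertia = arm_inertia + payload_mass * link_length ** 2
    angle, velocity = 0.0, 0.0
    trajectory = np.empty(steps)
    for i in range(steps):
        accel = torque / inertia        # gravity and friction omitted for clarity
        velocity += accel * dt
        angle += velocity * dt
        trajectory[i] = angle
    return trajectory

# Compare a few candidate payloads against the robot's logged joint angles.
candidates = [0.5, 1.0, 2.0]            # kg
simulated = {m: simulate_joint(m) for m in candidates}
# encoder_trace = ...                   # joint angles logged during the real lift
# best = min(candidates, key=lambda m: np.sum((simulated[m] - encoder_trace) ** 2))
```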

By comparing the robot's real movement to the simulated outcomes, the algorithm can infer object characteristics such as mass or softness in just seconds. Differentiable simulations, which expose how small changes in an object's parameters change the resulting motion, are what make this rapid analysis possible. The simulations were built with NVIDIA's Warp library, a Python framework for GPU-accelerated, differentiable physics simulation.
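
The gradient-based search that differentiable simulation enables can be sketched with Warp's autodiff tape. The snippet below is not the authors' code; it fits a single unknown mass to one observed displacement of a toy point mass pushed by a constant force, using wp.Tape() to back-propagate a squared error between simulated and "measured" motion to the mass parameter, then updating the estimate by gradient descent. The kernel name, force, duration, observation, and learning rate are all made-up values for illustration.

```python
import warp as wp

wp.init()

@wp.kernel
def predict_and_score(mass: wp.array(dtype=float),
                      force: float,
                      duration: float,
                      x_obs: wp.array(dtype=float),
                      loss: wp.array(dtype=float)):
    # Constant-force point mass: x(T) = 0.5 * (F / m) * T^2.
    x_pred = 0.5 * (force / mass[0]) * duration * duration
    diff = x_pred - x_obs[0]
    wp.atomic_add(loss, 0, diff * diff)   # squared error between sim and "reality"

force, duration = 10.0, 0.5               # assumed push and its duration
x_obs = wp.array([0.625], dtype=float)    # "measured" displacement (true mass = 2 kg)
mass_value, lr = 0.5, 0.05                # initial guess and gradient-descent step size

for _ in range(200):
    mass = wp.array([mass_value], dtype=float, requires_grad=True)
    loss = wp.array([0.0], dtype=float, requires_grad=True)
    tape = wp.Tape()
    with tape:
        wp.launch(predict_and_score, dim=1,
                  inputs=[mass, force, duration, x_obs, loss])
    tape.backward(loss=loss)               # d(loss)/d(mass) via the adjoint kernel
    mass_value -= lr * float(mass.grad.numpy()[0])

print(f"estimated mass ≈ {mass_value:.3f} kg")   # converges toward 2.0
```

In the system the article describes, the same kind of gradient would flow through full simulators of the robot and the object rather than this one-line physics model.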

The researchers demonstrated the method by estimating the mass and softness of various objects. The algorithm proved to be as accurate as more complex, vision-based techniques, and it worked effectively after seeing just one motion trajectory from the robot.

Unlike vision-based systems that require large datasets and extensive training, this model generalizes well to new scenarios and environments. The team believes it can be extended to infer other physical properties, such as the viscosity of liquids or the behavior of granular materials like sand.

Looking ahead, the researchers plan to integrate this proprioceptive method with computer vision to create a multimodal sensing approach that combines the strengths of both systems. They’re also interested in applying the technique to more complex robots—like soft robotic arms—and more intricate objects.

Ultimately, the goal is to equip robots with the ability to independently explore and understand their environments. This development could accelerate robot learning, enhance adaptability, and lead to more intelligent, capable machines operating in dynamic, real-world settings.

Funded in part by Amazon and the GIST-CSAIL Research Program, this work marks a key advance in robotics, highlighting how perception built solely on a robot's internal sensors, without cameras or added hardware, could change the way machines interact with the world.

By Impact Lab