Scientists discover smarter way to teach robots complex skills



Teaching robots to manipulate objects with human-like dexterity remains one of the biggest challenges in robotics. A new study suggests that the answer may not lie in giving robots more complex training data, but in giving them more consistent examples to learn from.

Researchers at New York University’s Tandon School of Engineering and Robotics and AI Institute found that robots trained with structured, predictable demonstrations performed significantly better than those trained with highly variable examples. This research could help improve how robots learn tasks that involve complex hand movements, grip changes, and coordination between multiple limbs.

Many robot learning systems rely on imitation learning, in which machines learn by imitating human demonstrations. However, demonstrations of highly dexterous tasks are hard to collect, because teleoperation systems struggle to capture fine finger movements and touch-heavy interactions.

To overcome this limitation, the researchers turned to motion planning algorithms that automatically generate demonstrations within physics simulations. Rather than learning from humans, the robot learned from synthetic demonstrations created by software.

The team quickly ran into a problem. A common planning technique known as rapidly-exploring random trees (RRT) produced a different solution for each demonstration, making it difficult for the robot to identify the behavior it should imitate.

Consistency trumps randomness

“These planners are very good at finding solutions,” says lead author Huaijiang Zhu.

“But if all the solutions look different, the learning system has a hard time understanding what behavior to mimic.”

According to the researchers, the randomness of RRT-generated demonstrations produces so-called high-entropy data. That diversity helps planning algorithms explore different solutions, but it can reduce the effectiveness of imitation learning.
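To make "high-entropy data" concrete, here is a minimal Python sketch that measures the Shannon entropy of the actions demonstrated at a single state. It is an illustration only, not the metric or code used in the study: the action names, the sample size, and the two toy "planners" are all assumptions made up for this example.

```python
import math
import random
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution over labels."""
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical discrete action set, assumed for illustration only.
ACTIONS = ["rotate_left", "rotate_right", "regrasp", "push"]

rng = random.Random(0)

# Toy stand-in for a randomized planner: at the same state, each
# demonstration may show a different action.
random_demos = [rng.choice(ACTIONS) for _ in range(1000)]

# Toy stand-in for a consistent planner: every demonstration shows the
# same action at that state.
consistent_demos = ["rotate_left"] * 1000

random_entropy = shannon_entropy(random_demos)
consistent_entropy = shannon_entropy(consistent_demos)

print(f"randomized planner: {random_entropy:.3f} bits")
print(f"consistent planner: {consistent_entropy:.3f} bits")
```

With four equally likely actions, the entropy approaches its maximum of 2 bits, while the consistent demonstrations carry 0 bits. An imitation learner has an unambiguous target only in the second case.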

To address this issue, the team developed two alternative planning approaches designed to produce more consistent demonstrations. One prioritized steady progress toward the goal, while the other drew on a library of predefined motions to reduce demonstration-to-demonstration variability.
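The idea of favoring steady progress toward a goal can be sketched on a toy grid world. The Python example below is a hedged illustration under invented assumptions: the 4x4 grid, the goal cell, and both planner functions are hypothetical, the random walk is only a crude stand-in for a sampling-based planner (it is not RRT), and neither function is the researchers' actual method.

```python
import random

GRID_SIZE = 4                              # assumed toy grid, 4x4 cells
GOAL = (3, 3)                              # assumed goal cell
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def in_bounds(cell):
    return 0 <= cell[0] < GRID_SIZE and 0 <= cell[1] < GRID_SIZE


def distance_to_goal(cell):
    """Manhattan distance from a cell to the goal."""
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])


def random_planner(rng, start=(0, 0)):
    """Random walk until the goal is reached (crude stand-in, not RRT)."""
    pos, path = start, [start]
    while pos != GOAL:
        dx, dy = rng.choice(MOVES)
        nxt = (pos[0] + dx, pos[1] + dy)
        if in_bounds(nxt):
            pos = nxt
            path.append(pos)
    return tuple(path)


def progress_planner(start=(0, 0)):
    """Always take the first move that strictly reduces distance to the goal."""
    pos, path = start, [start]
    while pos != GOAL:
        for dx, dy in MOVES:
            nxt = (pos[0] + dx, pos[1] + dy)
            if in_bounds(nxt) and distance_to_goal(nxt) < distance_to_goal(pos):
                pos = nxt
                path.append(pos)
                break
    return tuple(path)


rng = random.Random(0)
random_paths = {random_planner(rng) for _ in range(50)}
progress_paths = {progress_planner() for _ in range(50)}

print("distinct random-walk demonstrations:", len(random_paths))
print("distinct steady-progress demonstrations:", len(progress_paths))
```

Every run of the progress-based planner yields the same trajectory, so a learner sees one behavior repeated. The random walk solves the same task but rarely repeats a trajectory, which is the kind of variability the article describes as harmful for imitation.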

The researchers evaluated these planners on two difficult manipulation tasks. In one, two robotic arms had to rotate a large cylinder 180 degrees while repeatedly adjusting their grips. In the other, a dexterous robotic hand manipulated a cube in its palm to match a target orientation.

Virtual lessons, real results

Robots trained with more consistent demonstrations achieved significantly higher success rates than robots trained with standard RRT-generated data. In the dual-arm task, the system reached near-perfect performance with just 100 demonstrations.

The team also transferred the learned policies directly from simulation to physical hardware without additional retraining. The dual-arm robot succeeded in 90% of its real-world trials, and the robotic hand in about 62%.

The findings highlight a growing trend in robotics that combines traditional motion planning with machine learning. Researchers are increasingly using planning algorithms to generate training data for learning systems, rather than treating the two approaches separately.

The study also reinforces a broader lesson about artificial intelligence: more data does not always lead to better learning. In some cases, a small set of carefully structured demonstrations can be more valuable than a large collection of noisy or inconsistent ones.


