Grasp: AI and robotics

Machine Learning


Adaptive hand

Although Katzschmann is harnessing the power of artificial intelligence, he prefers the more precise term machine learning and emphasizes that it is far from anything resembling genuine living intelligence. “Traditionally, we solved robotics problems by simplifying them and using physical models and control engineering,” he says. “Today, we rely primarily on machine learning.” This data-driven approach has permeated nearly every area of robotics, from generative design with 3D simulation to skill acquisition from video to algorithm-based motion control. “About half of my group is actively working on applying and improving machine learning techniques,” Katzschmann says.

Traditional methods such as control engineering excel at structured, repetitive processes like those found on the factory floor, but they fall short in chaotic environments and unstructured tasks. As Katzschmann points out, even the seemingly simple task of sorting different glass bottles into wooden crates remains a major challenge for robots, because the bottles come in a variety of shapes and sizes. To address this, his group developed a robotic hand with 21 degrees of freedom. Trained using a combination of reinforcement and imitation learning, this dexterous hand is part of a larger system that, together with the robotic arm, offers a remarkable 28 degrees of freedom.

To teach the robot, researchers wear gloves fitted with motion sensors and cameras. As they demonstrate how to grip a bottle, their movements are recorded by an external camera. This rich dataset, sometimes supplemented with virtual reality footage, is used to train a transformer model, an architecture similar to the large language models that power modern AI. Once trained, the robotic hand can pick up even unfamiliar objects and move them to the appropriate location. “Traditionally, you would have to create a 3D point cloud model of the environment and program all the finger positions needed to grab the bottle,” says Katzschmann. “If the position of a bottle or crate changed even slightly, the robot hand would not know how to react.” That is no longer the case: “The movements required to pick up the bottle have been fully learned, making the hand incredibly adaptable,” he explains.

In 2024, this research gave rise to Mimic Robotics, an ETH spin-off founded by Katzschmann and four former PhD and master's students. The startup aims to revolutionize manufacturing and logistics with AI-controlled robotic hands.
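The core idea of imitation learning can be sketched in a few lines. This is a deliberately simplified illustration, not the group's actual system: in place of a transformer trained on glove recordings, it fits a linear policy mapping synthetic "sensor readings" to the 21 joint angles of a hand, purely to show how demonstrated behavior becomes a reusable policy.

```python
import numpy as np

# Toy behaviour-cloning sketch (illustrative only). A linear policy stands
# in for the transformer model; the demonstration data is synthetic.
rng = np.random.default_rng(0)

n_demos, n_sensors, n_joints = 500, 6, 21   # 21 degrees of freedom in the hand
true_mapping = rng.normal(size=(n_sensors, n_joints))  # hidden "human" behaviour

# "Demonstrations": glove sensor readings paired with the joint angles the
# human's movements imply (generated here from the hidden map, plus noise).
sensor_readings = rng.normal(size=(n_demos, n_sensors))
joint_angles = sensor_readings @ true_mapping + 0.01 * rng.normal(size=(n_demos, n_joints))

# Supervised fit: least-squares imitation of the demonstrated behaviour.
policy, *_ = np.linalg.lstsq(sensor_readings, joint_angles, rcond=None)

# The learned policy now responds to readings it has never seen before.
new_reading = rng.normal(size=(1, n_sensors))
predicted_angles = new_reading @ policy
print(predicted_angles.shape)  # one target angle per degree of freedom
```

The point of the sketch is the workflow, not the model class: record demonstrations, fit a policy to them, then apply it to new inputs. The real system replaces the linear fit with a transformer and combines it with reinforcement learning.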

Learning in the cloud

Stelian Coros is a computer scientist who develops algorithms for robotics, visual computing, and computer-aided manufacturing. His main focus is on the software that shapes robot brains, and his research over the past decade has been shaped by advances in deep learning, a type of machine learning that uses artificial neural networks. “We have reached a stage where we have enough data and computational power to use deep learning for certain applications in robotics, such as automatic object recognition in images,” he says.

Neural networks also form the basis of another type of machine learning known as reinforcement learning, a method by which robots improve through trial and error. Researchers give the robot a desired outcome, such as moving forward without falling, and the robot continually refines its behavior to maximize its score. “Just as people learn to play tennis, you learn by doing,” Coros says. “It's not enough for a robot to watch YouTube videos of humans performing tasks; it needs to try things out for itself.” His team therefore generates vast amounts of training data through remote control, which allows the robot to replicate the movements of a human operator. It also uses motion capture technology from the animation industry to record human behavior. With this data and the appropriate algorithms, robots can perform human-like movements suited to the situation, which Coros argues is a key prerequisite for seamless human-robot interaction.
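The trial-and-error loop described above can be shown with a toy example. This is not the lab's algorithm, just the bare mechanism: a "robot" chooses a step length, is scored on forward progress, falls over (negative reward) if it overreaches, and keeps whichever behavior scores best.

```python
import random

# Toy reinforcement-style trial and error (illustrative only). The reward
# function and the 0.8 "falling" threshold are invented for the example.
def reward(step_length: float) -> float:
    if step_length > 0.8:       # too ambitious: the robot falls over
        return -1.0
    return step_length          # otherwise, reward = distance travelled

random.seed(0)
best_step = 0.1                 # start with a timid, safe behaviour
best_reward = reward(best_step)

# Trial and error: perturb the behaviour, keep whatever scores better.
for _ in range(2000):
    candidate = min(max(best_step + random.gauss(0.0, 0.05), 0.0), 1.0)
    if reward(candidate) > best_reward:
        best_step, best_reward = candidate, reward(candidate)

print(round(best_step, 2))      # converges toward 0.8, the largest safe step
```

Real reinforcement learning replaces this random search with gradient-based policy updates over neural networks, but the principle is the same: the desired outcome is encoded as a score, and behavior is adjusted to maximize it.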

Parallel training

At the Robotic Systems Lab (RSL), led by Professor Marco Hutter, researchers also rely on reinforcement learning, but applied at scale in a virtual setting. “We use simulation to train thousands of robots simultaneously,” says Cesar Cadena, a senior scientist at the lab. “We can now generate as much data in an hour as we once did in a year.” These simulations were made possible by significant advances in microchips and graphics processors. RSL works closely with NVIDIA, one of the world's largest developers of graphics processors and chipsets, whose massively parallel processors can perform thousands of tasks simultaneously and form the basis for AI applications. The collaboration has already led to two papers produced jointly with the California-based company.
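Why simulation produces data so quickly can be seen in a minimal sketch. The dynamics below are invented for illustration: instead of stepping one robot at a time, a vectorized simulator advances thousands of simulated robots with a single array operation, which is exactly the kind of workload GPUs accelerate.

```python
import numpy as np

# Illustrative vectorised "simulation" (the dynamics are invented): each
# robot tries a different policy, here reduced to a constant forward speed.
n_robots, n_steps, dt = 10_000, 100, 0.01

rng = np.random.default_rng(42)
positions = np.zeros(n_robots)
velocities = rng.uniform(0.5, 1.5, size=n_robots)   # one policy per robot

for _ in range(n_steps):
    positions += velocities * dt    # one line advances all 10,000 robots

# The outcomes of all 10,000 trials are now available to the learner at once.
print(positions.shape)
```

On a GPU, the same pattern scales to full physics simulation: thousands of environments stepped in lockstep, each contributing experience to a shared policy.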

Virtual reinforcement learning takes place in the cloud and requires huge amounts of computing power. In continuous learning mode, however, this dependency can compromise a robot's autonomy. Factory robots, for example, can stay connected to the cloud to optimize their performance on complex tasks. But what about rescue robots searching for survivors in remote disaster areas? How can they make quick decisions where there is no network coverage? To get around this problem, researchers install some of the computing power on the robot itself, along with data pre-generated in the cloud. “You sacrifice some processing power,” says Cadena. “But for well-defined tasks, that's usually sufficient.”

Goal: Multipurpose robot

Does the current AI boom automatically herald a robot revolution? Coros thinks not. “The data needed for AI and the data needed for robotics are fundamentally different,” he says. Robots have bodies and must learn through physical interaction in order to generalize their movements and apply them to different environments. AI, by contrast, achieves generalization through a never-ending stream of data, primarily text but also images and videos. Some in the robotics community continue to pursue purely data-driven approaches, training robots on terabytes of human movement data. “But that's not realistic,” Coros insists. He points to a research group that developed a shirt-folding robot, which required about 10,000 hours of demonstration data before it could complete the task, and even then it made mistakes. “If you need that much data to learn a skill, the approach is fundamentally unscalable.”

His group therefore took a different path, combining learned data with physical models to fill the demonstration gap. “We understand the physics of how a ball moves through the air,” says Coros, citing the example of a robotic arm that throws a ball. The robot can use this physics to adjust its throws so that the ball lands exactly where intended. “And you can do it without requiring a lot of data,” Coros says. In 2023, Coros launched a spin-off, Flink Robotics, together with a former PhD student. The company uses AI-powered image processing and physical models to make standard industrial robotic arms more intelligent, allowing them to pack, unload, and sort materials with greater precision. Flink Robotics' first customer, Swiss Post, plans to use the technology to automate its parcel operations.
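The ball-throwing example shows why a physical model replaces so much training data. Under simplifying assumptions (flat ground, no air drag, launch and landing at the same height), projectile physics gives the required release speed in closed form, so no demonstrations are needed at all:

```python
import math

# Model-based sketch of the throwing example. Assumptions: flat ground,
# no air resistance, ball released and caught at the same height.
g = 9.81  # gravitational acceleration, m/s^2

def release_speed(target_distance: float, angle_rad: float) -> float:
    # Projectile range R = v^2 * sin(2*theta) / g, solved for v.
    return math.sqrt(target_distance * g / math.sin(2 * angle_rad))

def landing_distance(v: float, angle_rad: float) -> float:
    # Simulate the flight: time aloft, then horizontal distance covered.
    t_flight = 2 * v * math.sin(angle_rad) / g
    return v * math.cos(angle_rad) * t_flight

angle = math.radians(45)
v = release_speed(3.0, angle)                 # throw the ball 3 metres
print(round(landing_distance(v, angle), 3))   # lands at the 3 m target
```

A learned component can then handle what the model leaves out, such as drag or release timing, which is the division of labor between physics and data that the article describes.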

Tendons are better than motors

Back in the soft robotics lab, biologists are growing tissue for artificial tendons and chemists are bringing artificial muscles to life with electrical impulses. Katzschmann believes that traditional motor-driven robots, no matter how sophisticated their AI, are reaching their limits when it comes to generalization. “Those kinds of systems are not adaptable enough to deal with all the situations we encounter in the real world.” The body is as important to him as the brain, which is why he works to develop musculoskeletal robots that mirror nature's designs. “Muscle provides flexibility, and the skeleton provides the load-bearing capacity needed for complex physical tasks,” he says. Nature has produced incredibly stable and versatile systems without motors or metals. “That should be our model,” he insists.
