While the current study tested the AI algorithm only on a simulated robot, the researchers developed NoodleBot to test the algorithm in the real world in the future. Credit: Northwestern University
Northwestern University engineers have developed a new artificial intelligence (AI) algorithm designed specifically for smart robotics. By enabling robots to learn complex skills quickly and reliably, the new method could greatly improve the utility and safety of robots in a variety of applications, including self-driving cars, delivery drones, domestic assistants, and automation.
The success of this algorithm, called Maximum Diffusion Reinforcement Learning (MaxDiff RL), lies in its ability to encourage the robot to explore the environment as randomly as possible to gain diverse experience.
This “designed randomness” improves the quality of the data robots collect about their surroundings. Using this higher-quality data, the simulated robots learned faster and more efficiently, improving overall reliability and performance.
When tested against other AI platforms, robots simulated using Northwestern's new algorithm consistently outperformed state-of-the-art models. In fact, the new algorithm worked so well that robots could learn a new task and then successfully perform it on the very first attempt. This stands in sharp contrast to current AI models, which learn slowly through trial and error.
The study, titled “Maximum Diffusion Reinforcement Learning,” was published in the journal Nature Machine Intelligence.
“Other AI frameworks can be a little less reliable,” said Northwestern's Thomas Berrueta, who led the study. “Sometimes they accomplish a task perfectly; other times, they fail completely. With our framework, as long as the robot is capable of solving the task at all, you can expect it to perform every time you turn it on. This makes it easier to interpret robot successes and failures, which is critical in a world increasingly dependent on AI.”
Berrueta is a Presidential Fellow at Northwestern University and a Ph.D. candidate in mechanical engineering at the McCormick School of Engineering. Robotics expert Todd Murphey, a professor of mechanical engineering at McCormick and Berrueta's advisor, is the paper's senior author. Berrueta and Murphey co-authored the paper with Allison Pinosky, also a Ph.D. candidate in the Murphey lab.
A disembodied disconnect
To train machine learning algorithms, researchers and developers use massive datasets that humans have carefully filtered and curated. The AI learns from this training data, iterating through trial and error until it reaches the best results.
This process works well for disembodied systems such as ChatGPT and Google Gemini (formerly known as Bard), but not for embodied AI systems such as robots. Instead, robots must collect data themselves, without the luxury of human curators.
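The contrast between the two settings can be sketched with a toy fitting problem. The function names and the one-parameter model below are hypothetical, chosen only to illustrate the difference: a disembodied system makes repeated passes over a fixed, curated dataset, while an embodied agent must generate each data point by acting in the world.

```python
import random

def train_offline(dataset, lr=0.1, epochs=50):
    """Disembodied setting: fit y = w*x by repeated passes over a
    fixed, human-curated dataset. Trial and error is cheap here."""
    w = 0.0
    for _ in range(epochs):
        for x, y in dataset:
            w += lr * (y - w * x) * x   # gradient step on squared error
    return w

def train_online(env_step, episodes=200, lr=0.1):
    """Embodied setting: each (x, y) pair exists only because the
    agent chose to probe the environment there -- no curated replay."""
    w, rng = 0.0, random.Random(0)
    for _ in range(episodes):
        x = rng.uniform(-1, 1)          # the agent picks where to act
        y = env_step(x)                 # one real interaction
        w += lr * (y - w * x) * x
    return w

true_w = 3.0
data = [(x / 10, true_w * x / 10) for x in range(-10, 11)]
w_off = train_offline(data)
w_on = train_online(lambda x: true_w * x)
print(w_off, w_on)   # both recover w close to 3.0
```

Both agents recover the same model, but the online agent's data quality depends entirely on where it chose to act, which is why exploration strategy matters so much for robots.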
“Traditional algorithms are incompatible with robotics in two distinct ways,” Murphey said.
“First, disembodied systems can take advantage of a world where the laws of physics do not apply. Second, individual failures have no consequences. For most computer science applications, all that matters is that it succeeds most of the time. In robotics, a single failure can be catastrophic.”
To address this disconnect, Berrueta, Murphey, and Pinosky aimed to develop a new algorithm that ensures robots collect high-quality data while on the move.
The core of MaxDiff RL is to command the robot to behave more randomly in order to collect exhaustive and diverse data about the environment. By learning through self-curated random experiences, robots acquire the skills needed to accomplish useful tasks.
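The flavor of this idea can be illustrated with a toy sketch. To be clear, this is not the actual MaxDiff RL algorithm (which optimizes a trajectory-entropy objective); the count-based `diversity_bonus` below is a crude stand-in that likewise rewards the agent for reaching rarely visited states, and all names and parameters are assumptions made for illustration.

```python
import random
from collections import defaultdict

def diversity_bonus(counts, state, scale=1.0):
    """Pay more reward for rarely visited states -- a crude count-based
    stand-in for the exploration pressure MaxDiff RL provides."""
    return scale / (1 + counts[state]) ** 0.5

def explore_corridor(episodes=200, size=5, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor (start at 0, goal at size-1).
    The sparse goal reward is shaped with the diversity bonus, pushing
    the agent to cover the whole state space instead of dithering."""
    rng = random.Random(seed)
    q = defaultdict(float)        # (state, action) -> value estimate
    counts = defaultdict(int)     # state visitation counts
    actions = (-1, +1)            # step left / step right
    for _ in range(episodes):
        s = 0
        for _ in range(50):       # episode length cap
            counts[s] += 1
            # epsilon-greedy over the bonus-shaped value estimates
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda a: q[(s, a)])
            s2 = min(max(s + a, 0), size - 1)
            r = (1.0 if s2 == size - 1 else 0.0) + diversity_bonus(counts, s2)
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions) - q[(s, a)])
            s = s2
            if s == size - 1:
                counts[s] += 1
                break
    return q, counts

q, counts = explore_corridor()
print(sorted(counts))   # the bonus drives the agent to visit every state
```

The key observation is the visitation pattern: because novelty itself is rewarded, the agent systematically sweeps the entire corridor rather than exploiting whatever it stumbled on first, yielding a broader, higher-quality dataset to learn from.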
Getting it right the first time
To test the new algorithm, the researchers compared it against current state-of-the-art models. Using computer simulations, they asked a simulated robot to perform a series of standard tasks. Overall, robots using MaxDiff RL learned faster than the other models and performed the tasks with far more consistency and reliability.
Perhaps even more impressive, robots using the MaxDiff RL method often succeeded at performing a task correctly in a single attempt, even when starting with no prior knowledge.
“Our robot was faster and more agile. It was able to effectively generalize what it learned and apply it to new situations,” Berrueta said. “This is a huge advantage for real-world applications where robots can't spend endless hours on trial and error.”
MaxDiff RL is a general-purpose algorithm, so it can be used for a variety of applications. The researchers hope this will address the fundamental problems holding back the field and ultimately pave the way for reliable decision-making in smart robotics.
“This doesn't have to be used only for robotic vehicles that move around,” Pinosky said. “It could also be used for stationary robots, such as a robotic arm in a kitchen that learns how to load a dishwasher. This is an important step toward real systems that perform more complex, more interesting tasks.”
More information:
Maximum Diffusion Reinforcement Learning, Nature Machine Intelligence (2024). DOI: 10.1038/s42256-024-00829-3
Journal information:
Nature Machine Intelligence
