Robots are rapidly entering mainstream culture, but their programming generally limits their capabilities. Despite the potential benefits of incorporating recent AI advances into robot design, progress in developing general-purpose robots remains slow due to the time required to acquire real-world training data.
The development of robots with the ability to learn many tasks at once and integrate the understanding of language models with the practical abilities of helper robots is an area of extensive research.
DeepMind describes RoboCat as the first agent capable of solving and adapting to multiple tasks across different types of real robots. According to the research, RoboCat learns significantly faster than other state-of-the-art models. Because it draws on a large and diverse dataset, it can pick up a new skill from as few as 100 demonstrations. This ability is crucial for building multi-purpose robots, and it eases robotics research by reducing the need for human-supervised training.
The company’s multimodal model Gato (“cat” in Spanish) is the foundation of RoboCat because it can process language, images and actions in both simulated and real-world settings. The team combined Gato’s architecture with a large training dataset containing image and action sequences from various robotic arms performing hundreds of different tasks. After this initial training phase, the team put RoboCat through a “self-improvement” training cycle on a set of previously unseen tasks. Each new task was learned in five steps:
- Collect 100-1000 demonstrations of a new task or robot, using a robotic arm controlled by a human.
- Fine-tune RoboCat on the new task/arm to create a specialized spin-off agent.
- The spin-off agent practices the new task/arm about 10,000 times, generating additional training data.
- Merge the demonstration data and the self-generated data into RoboCat’s existing training dataset.
- Retrain RoboCat with the updated dataset.
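The five steps above can be sketched as a small loop. This is only an illustrative toy, not DeepMind's code: the `Agent` class and the functions `collect_demonstrations`, `fine_tune`, `practice`, `train` and `self_improvement_cycle` are hypothetical names, and "training" here just counts trajectories instead of fitting a model.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical simplification: a trajectory is just a (task, source) label pair.
Trajectory = Tuple[str, str]


@dataclass(frozen=True)
class Agent:
    """Toy stand-in for a policy; it only records how much data it has seen."""
    name: str
    num_training_trajectories: int


def train(name: str, dataset: List[Trajectory]) -> Agent:
    """Placeholder for training a generalist agent on the full dataset."""
    return Agent(name=name, num_training_trajectories=len(dataset))


def collect_demonstrations(task: str, count: int) -> List[Trajectory]:
    """Step 1: human-controlled demonstrations (simulated here as labels)."""
    return [(task, "human_demo")] * count


def fine_tune(base: Agent, demos: List[Trajectory], task: str) -> Agent:
    """Step 2: specialize the generalist into a spin-off agent for one task."""
    seen = base.num_training_trajectories + len(demos)
    return Agent(name=f"spin_off_{task}", num_training_trajectories=seen)


def practice(agent: Agent, task: str, episodes: int) -> List[Trajectory]:
    """Step 3: the spin-off agent practices and logs its own trajectories."""
    return [(task, f"generated_by_{agent.name}")] * episodes


def self_improvement_cycle(
    generalist: Agent,
    dataset: List[Trajectory],
    task: str,
    num_demos: int = 100,
    num_practice: int = 10_000,
) -> Tuple[Agent, List[Trajectory]]:
    """Run one pass of the five-step cycle and return the new agent and dataset."""
    demos = collect_demonstrations(task, num_demos)        # step 1
    spin_off = fine_tune(generalist, demos, task)          # step 2
    generated = practice(spin_off, task, num_practice)     # step 3
    new_dataset = dataset + demos + generated              # step 4
    new_generalist = train("generalist", new_dataset)      # step 5
    return new_generalist, new_dataset


if __name__ == "__main__":
    base_data = [("stack_blocks", "human_demo")] * 50
    agent = train("generalist", base_data)
    agent, data = self_improvement_cycle(agent, base_data, "pick_gear")
    print(agent.num_training_trajectories, len(data))
```

Because each pass returns a larger dataset and a retrained generalist, calling `self_improvement_cycle` repeatedly with new tasks mirrors how the training pool grows from one RoboCat version to the next.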
As a result of all this training, the latest version of RoboCat is based on a dataset containing millions of trajectories from real and simulated robotic arms, including data generated by the system itself. The team used four different types of robots and many robotic arms to collect vision-based data representing the tasks RoboCat would be trained to perform.
RoboCat learned to operate different robotic arms within a few hours. Although it had been trained on arms with two-finger grippers, it adapted to a more complex arm with a three-finger gripper and twice as many control inputs.
After observing 1,000 human-controlled demonstrations, collected in a matter of hours, RoboCat could control this new arm deftly enough to pick up gears successfully 86% of the time. With the same number of demonstrations, it could also adapt to tasks that require both precision and understanding, such as picking the right fruit out of a bowl or solving a shape-matching puzzle.
RoboCat’s training is self-reinforcing: the more tasks it learns, the better it becomes at learning new ones. After learning from 500 demonstrations per task, the initial version of RoboCat succeeded only 36% of the time on previously unseen tasks. The latest RoboCat, which had been trained on a greater variety of tasks, more than doubled that success rate.
Because RoboCat can learn autonomously and develop skills quickly, especially when applied to multiple robotic devices, the team believes RoboCat will pave the way for a new generation of more useful general-purpose robotic agents.
Check out the DeepMind article and paper.
Dhanshree Shenwai is a computer science engineer with extensive experience in FinTech companies covering the fields of finance, cards and payments, and banking, with a strong interest in AI applications. She is passionate about exploring new technologies and advancements in today’s evolving world to make life easier for everyone.
