Image credit: MineDojo
AI researchers have built a Minecraft bot that can explore and expand its capabilities in the game’s open world. But unlike other bots, this one basically wrote its own code, through trial and error and a lot of GPT-4 queries.
This experimental system, called Voyager, is an example of an “embodied agent,” an AI that can move and act freely and purposefully in a simulated or real environment. Personal assistant-type AIs and chatbots don’t need to actually do anything physical, much less navigate a complex world to get it done. But that is exactly what household robots may be expected to do in the future, so there is a lot of research into how they might do it.
Minecraft is a good place to test such things because it is a very (very) approximate representation of the real world, with simple, straightforward rules and physics. At the same time, it is complex and open enough that there is a lot to achieve and try. Dedicated simulators are good too, but they have their own limitations.
MineDojo is a simulation framework built around Minecraft, since you can’t just drop some random AI in there and expect it to figure out what all these blocks and pigs are doing. Its creators (there is a lot of overlap with the Voyager team) put together YouTube videos, transcripts, wiki articles, a large number of Reddit posts from r/minecraft, and other data about the game so that users can create or fine-tune an AI model on them. It also lets those models be evaluated more or less objectively by seeing how well they do tasks like building a fence around a llama or finding and mining a diamond.
Voyager excels at these tasks, performing significantly better than the only other model that comes close, Auto-GPT. The two take a similar approach, though: both use GPT-4 to write their own code as they go.
Normally you would just train a model on all that good Minecraft data and hope it figures out how to fight skeletons when the sun goes down. Voyager, however, starts out relatively naive, and as it encounters things in the game, it has a little internal conversation with GPT-4 about what to do and how to do it.
Directing the next action and adding skills to the pile. Image credit: MineDojo
For example, night falls and skeletons come out. The agent has a rough idea of this, but it asks itself: what would a good player of this game do when there are monsters nearby? According to GPT-4, if you want to explore the world safely, you should craft a sword, equip it, and hit the skeleton with it while dodging its attacks. That general sense of what to do is then translated into concrete goals: gather stone and wood, craft a sword at the crafting table, equip it, and fight a skeleton.
Once the agent has done these things, they are entered into a general skill library, so that later, when the task is “go deeper into the cave to find iron ore,” it doesn’t have to learn to fight from scratch again. It still relies on a GPT model at this step, but it uses GPT-3.5, which is cheaper and faster, to tell it which skills are most relevant to the situation at hand. That way it doesn’t try to mine the skeleton or fight the ore.
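To make the skill library idea concrete, here is a minimal sketch in Python. It is not Voyager’s actual code: the `Skill` and `SkillLibrary` names, the stored descriptions, and the retrieval method are all hypothetical. In particular, a simple word-overlap score stands in for the GPT-3.5 relevance step described above, purely to keep the example self-contained.

```python
from dataclasses import dataclass, field

# Hypothetical stopword list so that filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "to", "at", "in", "and", "with", "into", "of"}


def keywords(text: str) -> set[str]:
    """Lowercase the text and drop stopwords."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


@dataclass
class Skill:
    name: str
    description: str
    code: str  # the program that accomplished the task (placeholder here)


@dataclass
class SkillLibrary:
    skills: dict[str, Skill] = field(default_factory=dict)

    def add(self, skill: Skill) -> None:
        """Store a skill once the agent has completed the task."""
        self.skills[skill.name] = skill

    def most_relevant(self, task: str, top_k: int = 2) -> list[str]:
        """Rank stored skills by word overlap with the task.

        Assumption: this overlap score is only a stand-in for the
        language-model relevance step mentioned in the article.
        """
        task_words = keywords(task)
        scored = []
        for skill in self.skills.values():
            overlap = len(task_words & keywords(skill.description))
            if overlap > 0:
                scored.append((overlap, skill.name))
        # Highest overlap first; ties broken alphabetically by name.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:top_k]]


library = SkillLibrary()
library.add(Skill("craft_sword", "craft a stone sword at the crafting table", "# placeholder"))
library.add(Skill("fight_skeleton", "equip a sword and fight a skeleton", "# placeholder"))
library.add(Skill("mine_iron", "mine iron ore in a cave with a pickaxe", "# placeholder"))

print(library.most_relevant("go deeper into the cave to find iron ore"))
# prints ['mine_iron']
```

The point of the sketch is the shape of the loop, not the scoring: skills are saved once they work, and a cheap lookup picks the relevant ones for a new task instead of relearning them.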
This is similar to how an agent like Auto-GPT, when faced with an interface it doesn’t already know, must learn how to navigate it in order to reach its goal. But Minecraft is a much deeper environment than anything those agents have worked in, so a specialized agent like Voyager does much better. It finds more things, learns more skills, and explores a much larger area than the other bots.
Interestingly, but perhaps not surprisingly, GPT-4 far outperforms GPT-3.5 (that is, ChatGPT) when it comes to generating useful code. In tests that replaced the former with the latter, the agent hit a wall early, perhaps even literally, and couldn’t improve. From talking to the two models, it may not be obvious that one is much smarter than the other, but the truth is that you don’t have to be particularly smart to keep a seemingly intelligent conversation going (ask me how I know). Coding is much harder, and GPT-4 was a big upgrade there.
The purpose of this research is not to make Minecraft players obsolete, but to find ways in which relatively simple AI models can improve themselves based on their “experience,” for lack of a better word. If we are to have robots assisting us in our homes, hospitals, and offices, they will need to learn lessons and apply them to their future actions.
Learn more about Voyager here.
