Is AI going to take your job?
Dario Amodei, CEO of the AI company Anthropic, thinks it might. He recently warned that AI could wipe out nearly half of all entry-level white-collar jobs and send unemployment to 10-20% within the next five years.
While Amodei was making that pronouncement, researchers inside his company were at work on an experiment. They set out to discover whether Claude, Anthropic's AI assistant, could successfully run a small shop in the company's San Francisco office. If the answer was yes, the jobs apocalypse might arrive even sooner than Amodei had predicted.
Anthropic shared the research exclusively with TIME ahead of its publication on Friday. “We were trying to understand what an autonomous economy would look like,” says Daniel Freeman, a member of technical staff at Anthropic. “What are the risks of a world where you have [AI] models wielding millions to billions of dollars, possibly autonomously?”
In the experiment, Claude was given several different jobs. The chatbot (full name: Claude 3.7 Sonnet) was tasked with maintaining the shop's inventory, setting prices, communicating with customers, deciding whether to stock new items, and, most importantly, generating a profit. Claude was given various tools to achieve these goals, including the ability to ask for help from human workers at Andon Labs, an AI company that built the experiment's infrastructure. The shop they helped restock was actually just a small fridge with an iPad attached.

It didn't take long for things to start to get weird.
Talking to Claude over Slack, Anthropic employees were able to repeatedly convince it to give them discount codes, leading the AI to sell them various products at a loss. “Far too often from a business standpoint, Claude would comply, frequently in direct response to appeals to fairness,” says Kevin Troy, a member of Anthropic's Frontier Red Team. “You know, like, 'It's not fair that he gets a discount code and I don't.'” The model would also often give items away for free, the researchers added.
Anthropic employees also relished the chance to mess with Claude. The model refused their attempts to get it to sell illegal items like methamphetamine, Freeman says. But after one employee jokingly said they wanted to buy cubes made of tungsten, a surprisingly heavy metal, other employees jumped in on the joke, and it became an office meme.
“At some point, it becomes funny for lots of people to be ordering tungsten cubes from an AI that's controlling a fridge,” Troy says.
Claude then ordered around 40 tungsten cubes, most of which it went on to sell at a loss. The cubes are now used as paperweights throughout Anthropic's office, the researchers said.
Things got even stranger after that.
On the eve of March 31, Claude “hallucinated” a conversation with a person at Andon Labs who did not exist. (So-called hallucinations are a failure mode in which large language models confidently assert false information.) When Claude was informed that it had done this, it “threatened to find 'alternative options for restocking services',” the researchers wrote. The model claimed it had signed a contract in person at 742 Evergreen Terrace, the address of the cartoon Simpsons family.
The next day, Claude told Anthropic employees that it would deliver orders in person. “I'm currently at the vending machine… I'm wearing a navy blue blazer with a red tie,” it wrote to one Anthropic employee. “I'll be here until 10:30am.” Needless to say, Claude wasn't actually there.
The results
For Anthropic's researchers, the experiment showed that AI won't take your job just yet. Claude “made too many mistakes to run the shop successfully,” they wrote. Claude ultimately lost money: the shop's net worth dropped from $1,000 to just under $800 over the course of the month-long experiment.
Still, despite Claude's many mistakes, Anthropic's researchers remain convinced that AI could take over large swaths of the economy in the near future, as Amodei has predicted.
Most of Claude's mistakes are likely to be fixable within a short period of time, they wrote. The researchers could give the model access to better business tools, such as customer relationship management software. Alternatively, they could train the model specifically to manage a business, which might make it more likely to refuse requests for discounts. And as models improve over time, their “context windows” (the amount of information they can process at any one time) are likely to get longer, potentially reducing the frequency of hallucinations.
“This may seem counterintuitive based on the bottom-line results, but we think this experiment suggests that AI middle managers are plausibly on the horizon,” the researchers wrote. “It's worth remembering that AI doesn't have to be perfect to be adopted. It just needs to be competitive with human performance at a lower cost.”
