🔮 Autoresearch and the experimental society

Science is the most reliable method humanity has discovered for generating knowledge. And for most of history, it has been expensive to run.

A few weeks ago, Andrej Karpathy released 600 lines of Python code, and that started to change. His autoresearch project (see EV#565) runs an autonomous experimental loop in which humans set the strategic direction and define what good looks like, and agents iterate towards success within guardrails. Andrej's first experiment ran for two days on a GPT-2-level model and showed an 11% speedup and 20 genuine improvements.

Shortly after its release, Shopify CEO Tobi Lütke ran autoresearch on an internal model for qmd. After 37 experiments overnight, Tobi woke up to a 0.8-billion-parameter model that scored 19% better than the previous 1.6-billion-parameter version. Tobi is not a machine learning engineer.

Autoresearch is powerful because it solves two problems at once. First, it automates parts of the knowledge production process. Second, it addresses the agent control problem and keeps agents on task. Give an AI free rein and it will often drift or optimize in the wrong direction. Autoresearch prevents this by design. Humans decide where the car goes; autoresearch stays at the wheel.

I’ve spent the last month adapting autoresearch to knowledge work beyond machine learning, with the goal of standing up a system that can run structured, low-cost experiments on the types of decisions most teams make every week. I'm calling this version AutoBeta, and I am making the complete playbook/skill available to paid members below.

Let’s go!

When I first saw autoresearch, my immediate impression was that it doesn’t have to be limited to machine learning. The loop is general: hypothesize, test, score, iterate. So I cloned it and started running it in other parts of my work.
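As a rough illustration of that loop, here is a minimal sketch. It is not Karpathy's code: `run_loop`, `toy_propose` and `toy_score` are hypothetical names, the "work product" is a single number, and "good" is assumed to mean "close to 3.0". The only guardrails shown are a fixed experiment budget and a rule that a change is kept only if it scores strictly higher.

```python
import random
from typing import Callable, List, Tuple


def run_loop(
    baseline: float,
    propose: Callable[[float, random.Random], float],
    score: Callable[[float], float],
    budget: int,
    seed: int = 0,
) -> Tuple[float, float, List[Tuple[int, float, bool]]]:
    """Hypothesize, test, score, iterate; keep a change only if it scores higher."""
    rng = random.Random(seed)  # seeded so a run can be replayed
    best, best_score = baseline, score(baseline)
    history: List[Tuple[int, float, bool]] = []
    for step in range(budget):  # guardrail: a fixed experiment budget
        candidate = propose(best, rng)      # hypothesize a variation of the current best
        candidate_score = score(candidate)  # test it and score the result
        kept = candidate_score > best_score
        if kept:
            best, best_score = candidate, candidate_score
        history.append((step, candidate_score, kept))
    return best, best_score, history


# Toy stand-ins (hypothetical): nudge a number at random, reward closeness to 3.0.
def toy_propose(current: float, rng: random.Random) -> float:
    return current + rng.uniform(-1.0, 1.0)


def toy_score(x: float) -> float:
    return -(x - 3.0) ** 2


best, best_score, history = run_loop(0.0, toy_propose, toy_score, budget=50)
kept_count = sum(1 for _, _, kept in history if kept)
print(f"best={best:.3f} score={best_score:.3f} kept={kept_count} of {len(history)}")
```

The loop itself is trivial. Everything interesting lives in `score`, which is exactly the piece that is hard to come by outside machine learning.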

Things didn’t go as I expected. The output looked good, but I couldn’t tell whether it was an improvement. In ML, the agent gets a built-in feedback signal from each training run; knowledge work has no such signal. A pricing decision cannot be validated in five minutes. And in the paragraphs I write, most of the time I can’t tell whether the argument is getting better or just changing.

This is why autoresearch is genuinely difficult to apply to knowledge work. The loop needs something to measure and optimize against, and in knowledge work that signal doesn’t exist naturally. It has to be constructed.

So I built a version of autoresearch called AutoBeta that can address a wide range of business problems. Although not as technically robust as Karpathy's, it follows the same design principles. You set the objectives and constraints, and the experiments run in a loop.

What I changed was the scoring. I created an “oracle”: a panel of judges that scores each output against predefined criteria and collapses those scores into a single number the loop can optimize.
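To make the idea concrete, here is a minimal sketch of such an oracle. It is an illustration under stated assumptions, not the AutoBeta implementation: the `Judge` class, the three toy criteria and the weighted-mean aggregation are all hypothetical, and in practice each judge would more plausibly be a model call with a rubric than a string check.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Judge:
    name: str
    weight: float
    rate: Callable[[str], float]  # expected to return a score in [0, 1]


def oracle(output: str, panel: Sequence[Judge]) -> float:
    """Collapse the panel's scores into one number (weighted mean, an assumed choice)."""
    total_weight = sum(judge.weight for judge in panel)
    if total_weight <= 0:
        raise ValueError("panel needs a positive total weight")
    weighted = sum(
        judge.weight * min(1.0, max(0.0, judge.rate(output)))  # clamp to [0, 1]
        for judge in panel
    )
    return weighted / total_weight


# Hypothetical criteria, standing in for real judges.
def brevity(text: str) -> float:
    words = len(text.split())
    return 1.0 if words <= 20 else 20 / words


def has_number(text: str) -> float:
    return 1.0 if any(ch.isdigit() for ch in text) else 0.0


def recommends(text: str) -> float:
    return 1.0 if "should" in text.lower() else 0.0


panel = [
    Judge("brevity", 1.0, brevity),
    Judge("quantified", 2.0, has_number),
    Judge("actionable", 1.0, recommends),
]

for draft in ("Pricing is hard.", "We should raise the price by 5%."):
    print(f"{oracle(draft, panel):.2f}  {draft}")
```

Because the result is a single float, it can be handed to an experiment loop as its scoring function. The weights are where a human encodes what "good" means for a given decision.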
