Despite promising early progress, current MLE agents face several limitations that reduce their effectiveness. First, their heavy reliance on existing LLM knowledge often biases them toward familiar, frequently used methods (e.g., the scikit-learn library for tabular data), so they overlook potentially superior task-specific approaches. Second, these agents typically employ a search strategy that changes the entire code structure at once in each iteration. As a result, they frequently shift their focus to other stages (such as model selection or hyperparameter tuning), because they lack the ability to explore a particular pipeline component deeply and iteratively, for example by thoroughly experimenting with different feature engineering options.
A recent paper introduces MLE-STAR, a new ML engineering agent that integrates web search with targeted code block refinement. Unlike alternatives, MLE-STAR first searches the web for suitable models to establish a solid foundation. It then carefully improves this foundation by testing which parts of the code are most important. MLE-STAR also uses a new method to blend several models for even better results. The approach is highly effective: it won medals in 63% of the Kaggle competitions in MLE-Bench-Lite, significantly outperforming the alternatives.
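The idea of testing which part of the code matters most, and then refining only that part, can be pictured with a toy sketch. This is not MLE-STAR's implementation: the component names, the options, and the scores in `TOY_SCORES` are made up for illustration, and `evaluate` is a hypothetical stand-in for actually training and validating a pipeline.

```python
from typing import Dict

# A pipeline is a mapping from component name to the option chosen for it.
Pipeline = Dict[str, str]

# Made-up score contributions per option (assumption: stands in for real
# validation scores obtained by running the generated code).
TOY_SCORES = {
    "features": {"raw": 0.00, "scaled": 0.03, "interactions": 0.05},
    "model": {"linear": 0.70, "random_forest": 0.76, "boosted_trees": 0.80},
    "tuning": {"default": 0.00, "searched": 0.01},
}

# The simplest option for each component, used when ablating it.
BASELINE = {"features": "raw", "model": "linear", "tuning": "default"}


def evaluate(pipeline: Pipeline) -> float:
    """Hypothetical scorer: sums the toy contribution of each chosen option."""
    return round(sum(TOY_SCORES[c][o] for c, o in pipeline.items()), 4)


def ablation_impact(pipeline: Pipeline) -> Dict[str, float]:
    """Score drop observed when one component at a time is reset to baseline."""
    full_score = evaluate(pipeline)
    impact = {}
    for component in pipeline:
        ablated = dict(pipeline)
        ablated[component] = BASELINE[component]
        impact[component] = full_score - evaluate(ablated)
    return impact


def refine_component(pipeline: Pipeline, component: str) -> Pipeline:
    """Try every option for one component only; leave the rest untouched."""
    best = dict(pipeline)
    for option in TOY_SCORES[component]:
        candidate = dict(pipeline)
        candidate[component] = option
        if evaluate(candidate) > evaluate(best):
            best = candidate
    return best


def targeted_refinement(pipeline: Pipeline) -> Pipeline:
    """Pick the component with the largest ablation impact and refine it."""
    impact = ablation_impact(pipeline)
    target = max(impact, key=impact.get)
    return refine_component(pipeline, target)


initial = {"features": "scaled", "model": "random_forest", "tuning": "searched"}
refined = targeted_refinement(initial)
print("before:", initial, evaluate(initial))
print("after: ", refined, evaluate(refined))
```

The point of the sketch is the contrast with the search strategy described earlier: each step changes one component rather than the whole pipeline, so the other components stay fixed while that one is explored in depth.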
