The framework is called SEAL (Self-Adapting Language Models). Unlike traditional approaches such as fine-tuning or in-context learning on pre-collected datasets, SEAL allows a model to generate its own training examples and the update steps for its internal parameters. In other words, the model does not merely adapt to new tasks; it updates its own internal structures to retain new knowledge.
How SEAL works
At the center of SEAL is reinforcement learning. The model learns to generate self-edits: textual instructions that lead to changes in its internal parameters. The process resembles a model writing its own textbook. Rather than simply reading the data, it reformats the information into a version optimized for learning.
Training takes place in two phases. First, the model makes a small weight update based on the self-generated instructions (the inner loop). The system then checks whether performance on the task has improved (the outer loop). If the update turns out to be effective, it is retained; otherwise, it is discarded. Over time, the model becomes more effective at teaching itself.
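The two loops above can be sketched in a few lines of toy Python. This is not the authors' implementation (the real inner loop is a fine-tuning pass on a language model, and self-edits are sampled from the model itself); it only illustrates the control flow: candidate self-edits are proposed, each triggers a small update, and only updates that improve downstream performance are kept.

```python
def candidate_self_edits(passage):
    """Propose candidate self-edits. Toy stand-in: a few fixed reformattings
    of the passage; real SEAL samples these from the model itself."""
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return [[passage], sentences, sentences + [passage]]

def inner_loop_update(weights, self_edit):
    """Inner loop: a small 'weight' update driven by the self-edit.
    Toy stand-in for a gradient step: bump a score per memorized statement."""
    updated = dict(weights)
    for statement in self_edit:
        updated[statement] = updated.get(statement, 0.0) + 0.1
    return updated

def evaluate(weights, questions):
    """Downstream check: fraction of questions the updated 'model' can answer."""
    return sum(weights.get(q, 0.0) > 0 for q in questions) / len(questions)

def outer_loop(weights, passage, questions):
    """Outer loop: keep an update only if it improves task performance."""
    best_weights, best_score = weights, evaluate(weights, questions)
    for self_edit in candidate_self_edits(passage):
        candidate = inner_loop_update(weights, self_edit)
        score = evaluate(candidate, questions)
        if score > best_score:
            best_weights, best_score = candidate, score
    return best_weights, best_score

passage = "SEAL lets a model write its own training data. Ineffective updates are discarded"
weights, score = outer_loop({}, passage,
                            questions=["SEAL lets a model write its own training data"])
print(score)  # 1.0: the self-edit that split the passage into statements was kept
```

The key point the sketch preserves is that the reward signal comes from task performance after the update, not from the text of the self-edit itself.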

Interestingly, the SEAL architecture can be split into two parts. One AI module acts as a “teacher”, generating self-edits, while the other acts as a “student”, updating itself based on those instructions. This setup may prove particularly valuable in enterprise applications that require highly specialized training workflows.
From theory to practice
The SEAL framework was tested in two areas: incorporating new knowledge and learning from a few examples.
In the first case, the model was tasked with memorizing facts from a text and answering questions without access to the original material. Traditional fine-tuning provided only minor improvements, but SEAL, by generating self-edits and synthetic training examples, raised answer accuracy to 47%. Notably, this result surpassed similar attempts that used synthetic data generated by the stronger GPT-4.1.

In the second case, the model tackled visual puzzles from the Abstraction and Reasoning Corpus (ARC), a benchmark designed to test AI's ability to reason abstractly and generalize from limited data. Here the model not only had to find the correct answer but also to devise its own learning strategy: which data to use, how to reformat it, and what learning rate to follow. With SEAL, the model reached 72.5% accuracy. Without reinforcement learning, performance was roughly four times lower, and standard in-context learning produced no meaningful results.
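A self-edit in this setting is essentially a small training recipe chosen by the model. The sketch below is hypothetical (the field names and augmentations are invented for illustration), but it captures the idea: the self-edit specifies which data augmentations to apply to the few demonstrations and which hyperparameters the subsequent weight update should use.

```python
# Hypothetical shape of a self-edit for an ARC-style task: the model picks
# augmentations and optimization settings rather than just an answer.
self_edit = {
    "augmentations": ["rotate_90", "flip_horizontal", "transpose"],
    "learning_rate": 1e-4,
    "epochs": 3,
}

def augment(grid, name):
    """Toy grid transformations over a list-of-lists grid."""
    if name == "rotate_90":
        return [list(row) for row in zip(*grid[::-1])]
    if name == "flip_horizontal":
        return [row[::-1] for row in grid]
    if name == "transpose":
        return [list(row) for row in zip(*grid)]
    return grid

def build_training_set(examples, self_edit):
    """Expand the few demonstrations into a larger fine-tuning set,
    following the augmentations named in the model's self-edit."""
    data = list(examples)
    for name in self_edit["augmentations"]:
        data += [(augment(x, name), augment(y, name)) for x, y in examples]
    return data

# One (input grid, output grid) demonstration, expanded four-fold.
examples = [([[1, 0], [0, 0]], [[0, 0], [0, 1]])]
train = build_training_set(examples, self_edit)
print(len(train))  # 4: the original pair plus three augmented copies
```

The learning rate and epoch count in the self-edit would then parameterize the inner-loop fine-tuning run on this expanded set.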
Outlook and restrictions
Researchers note that the shortage of high-quality training data is quickly becoming a major obstacle to progress in AI. SEAL offers a partial solution: it allows the model to generate its own useful training signals. For example, an AI system could read scientific papers and create hundreds of explanations and takeaways to deepen its understanding of the subject.
However, the method has limitations. Frequent updates can lead to what is known as catastrophic forgetting: the loss of previously acquired knowledge. To address this, the researchers propose a hybrid approach: factual or frequently changing information is stored in external memory, while core knowledge is integrated into the model's weights via SEAL.
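That division of labor could look something like the sketch below (class and method names are illustrative, not from the paper): volatile facts are simply overwritten in an external store, where no forgetting can occur, while only stable knowledge becomes a candidate for SEAL-style weight updates.

```python
class HybridKnowledge:
    """Illustrative hybrid store: external memory for fast-changing facts,
    a consolidated set standing in for knowledge baked into the weights."""

    def __init__(self):
        self.external = {}          # fast-changing facts: retrieve, don't train
        self.consolidated = set()   # stable knowledge consolidated via SEAL

    def observe(self, fact, stable):
        if stable:
            # candidate for a SEAL-style weight update (placeholder here)
            self.consolidated.add(fact)
        else:
            key = fact.split(":")[0]
            self.external[key] = fact   # overwrite; no catastrophic forgetting

    def answer(self, key):
        # Prefer fresh external memory, fall back to consolidated knowledge.
        if key in self.external:
            return self.external[key]
        return next((f for f in self.consolidated if f.startswith(key)), None)

kb = HybridKnowledge()
kb.observe("capital_of_france: Paris", stable=True)
kb.observe("stock_price: 101.2", stable=False)
kb.observe("stock_price: 99.8", stable=False)
print(kb.answer("stock_price"))        # latest external value
print(kb.answer("capital_of_france"))  # consolidated fact
```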
There are also practical constraints. Editing model parameters in real time is not yet feasible. Instead, the proposed solution is a delayed learning cycle: the model collects data throughout the day and updates itself at set intervals.
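Such a cycle is straightforward to sketch (the names here are illustrative): interactions accumulate in a buffer, and only once the configured interval has elapsed is the batch handed off for a SEAL-style update.

```python
from datetime import datetime, timedelta

class DelayedUpdater:
    """Buffer interactions during the day; release them for fine-tuning
    only at scheduled intervals, instead of editing weights in real time."""

    def __init__(self, interval=timedelta(hours=24)):
        self.buffer = []
        self.interval = interval
        self.last_update = datetime(2025, 1, 1)  # fixed start for the demo

    def record(self, interaction):
        self.buffer.append(interaction)

    def maybe_update(self, now):
        """Return a batch for a SEAL-style update if it is time, else None."""
        if now - self.last_update >= self.interval and self.buffer:
            batch, self.buffer = self.buffer, []
            self.last_update = now
            return batch
        return None

updater = DelayedUpdater()
updater.record("user asked about protein folding")
updater.record("user corrected a wrong date")
print(updater.maybe_update(datetime(2025, 1, 1, 12)))  # None: interval not elapsed
print(updater.maybe_update(datetime(2025, 1, 2, 1)))   # the two buffered items
```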
Previously, the Kazinform news agency reported on how ChatGPT weakens our minds.
