Apple releases OpenELM, a slightly more accurate LLM • The Register

Apple, not normally known for its openness, has released a generative AI model called OpenELM, which apparently outperforms a set of other language models trained on public datasets.

Not that it outperforms them by much. Compared to OLMo, which debuted in February, OpenELM is 2.36 percent more accurate while using half as many pre-training tokens. But it may be enough to remind people that Apple is no longer content to be the AI industry's wallflower.

Apple's claim to openness comes from its decision to release not just its models, but its training and evaluation framework.

“Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations,” explain the eleven Apple researchers behind the associated technical paper.

And in a departure from academic practice, the authors' email addresses are not listed. Chalk it up to Apple's interpretation of openness, which is somewhat comparable to the not-very-open OpenAI.

The accompanying software is not released under a recognized open source license. It is not unreasonably restrictive, but it does make clear that Apple reserves the right to pursue patent claims if derivative works based on OpenELM are deemed to infringe on its rights.

OpenELM uses a technique called layer-wise scaling to allocate parameters more efficiently in the transformer model. So instead of each layer having an identical set of parameters, OpenELM's transformer layers have different configurations and parameter counts. The result is better accuracy, measured as the percentage of correct predictions from the model in benchmark tests.
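
To give a rough sense of the idea (a generic sketch, not Apple's code): a layer-wise scaling scheme interpolates the per-layer attention head count and feed-forward width across the depth of the network, rather than keeping them constant, along these lines.

    # Generic illustration of layer-wise scaling (not Apple's implementation).
    # Instead of giving every transformer layer the same width, the number of
    # attention heads and the feed-forward width grow from the first layer to
    # the last, so parameters are not spread uniformly across the stack.

    def layerwise_configs(num_layers, base_heads, base_ffn_dim,
                          min_scale=0.5, max_scale=1.5):
        configs = []
        for i in range(num_layers):
            # Linearly interpolate a scale factor across the depth of the model.
            t = i / max(num_layers - 1, 1)
            scale = min_scale + t * (max_scale - min_scale)
            configs.append({
                "layer": i,
                "num_heads": max(1, round(base_heads * scale)),
                "ffn_dim": int(base_ffn_dim * scale),
            })
        return configs

    for cfg in layerwise_configs(num_layers=4, base_heads=8, base_ffn_dim=2048):
        print(cfg)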

OpenELM was pre-trained on publicly available datasets, including RedPajama (which draws on GitHub, a pile of books, Wikipedia, StackExchange posts, ArXiv papers, and more) and Dolma (which includes Reddit, Wikibooks, Project Gutenberg, and more). The model can be used as you might expect: give it a prompt and it attempts to respond or auto-complete it.

One of the highlights of the release is that it comes with “code to convert models to MLX library for inference and fine-tuning on Apple devices.”

MLX is a framework released last year for running machine learning on Apple silicon. The ability to work locally on Apple devices, rather than over a network, should make OpenELM more interesting to developers.
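
As a rough illustration (not Apple's reference code), running a converted checkpoint locally with the open source mlx-lm package might look something like this; the local model path is hypothetical and assumes the conversion step has already been run on an Apple silicon Mac.

    # Illustrative sketch only, not Apple's reference code.
    # Assumes: `pip install mlx-lm` on an Apple silicon Mac, and an OpenELM
    # checkpoint already converted to MLX format at the hypothetical path below.
    from mlx_lm import load, generate

    model, tokenizer = load("./openelm-270m-mlx")  # hypothetical converted checkpoint
    reply = generate(
        model,
        tokenizer,
        prompt="Briefly explain layer-wise scaling in transformers.",
        max_tokens=128,
    )
    print(reply)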

“Apple's release of OpenELM is a significant advancement for the AI community, delivering efficient on-device AI processing that is ideal for mobile apps and IoT devices with limited computing power,” Shahar Chen, CEO and co-founder of AI services company Aquant, told The Register. “This will enable rapid local decision-making, which is essential for everything from smartphones to smart home devices, expanding the potential of AI in everyday technology.”

Apple has been keen to demonstrate the capabilities of its chip architecture for hardware-accelerated machine learning ever since Cupertino introduced its Neural Engine in 2017. Even so, while OpenELM may score higher on accuracy benchmarks, it comes up short in terms of performance.

“Despite OpenELM's higher accuracy for a similar parameter count, we observe that it is slower than OLMo,” the paper explains, citing tests run using Nvidia's CUDA on Linux as well as the MLX version of OpenELM on Apple silicon.

The less-than-winning showing, Apple's researchers explained, is due to their “naive implementation of RMSNorm,” a technique for normalizing data in machine learning. They plan to explore further optimizations in the future.
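
For context, RMSNorm (root mean square layer normalization) rescales each activation vector by its root mean square and applies a learned per-feature gain; a minimal sketch of the operation, not Apple's implementation, looks like this.

    # Minimal RMSNorm sketch (not Apple's implementation).
    # y = x / sqrt(mean(x^2) + eps) * weight
    import numpy as np

    def rms_norm(x, weight, eps=1e-6):
        # x: (..., hidden_dim) activations; weight: (hidden_dim,) learned gain
        rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
        return (x / rms) * weight

    hidden = np.random.randn(4, 8).astype(np.float32)  # toy batch of activations
    gain = np.ones(8, dtype=np.float32)                 # learned gain, initialized to 1
    print(rms_norm(hidden, gain).shape)                 # (4, 8)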

OpenELM is available in pre-trained and instruction-tuned models with 270 million, 450 million, 1.1 billion, and 3 billion parameters. Anyone using it is cautioned to do their due diligence before trying the model for anything meaningful.

“The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models,” the paper states. “Trained on publicly available datasets, these models are made available without any safety guarantees.” ®


