Apertus: A completely open and transparent multilingual model

Melissa Anchis and Florian Meyer

In July, EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) announced a joint initiative to build a large language model (LLM). That model is now available and serves as a building block for developers and organizations creating future applications such as chatbots, translation systems, and educational tools.

The model's name, Apertus, is Latin for “open” and highlights its distinctive feature: the entire development process, including architecture, model weights, training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can access the model through the strategic partner Swisscom, or download it from Hugging Face, a platform for AI models and applications, and deploy it in their own projects. Apertus is available in two sizes, with 8 billion and 70 billion parameters; the smaller model is better suited to individual use. Both models are released under a permissive open-source license, allowing them to be used in education, research and a wide range of social and commercial applications.

Completely open source LLM

As a completely open language model, Apertus allows researchers, professionals and enthusiasts to build on the model, adapt it to their specific needs, and examine any part of the training process. This distinguishes Apertus from models that make only selected components accessible.

“With this release, we aim to provide a blueprint for how trustworthy, sovereign and comprehensive AI models can be developed,” said Martin Jaggi, professor of machine learning at EPFL and member of the steering committee of the Swiss AI Initiative. The model is regularly updated by a development team that includes specialized engineers and numerous researchers from CSCS, ETH Zurich and EPFL.

The driver of innovation

EPFL, ETH Zurich and CSCS are breaking new ground with their open approach. “Apertus is not a traditional case of technology transfer from research to product. Instead, we consider it a means to enhance AI expertise across innovation, society and industry.” In line with their tradition, EPFL, ETH Zurich and CSCS provide both the basic technology and the infrastructure to promote innovation across the economy.

Apertus was trained on 15 trillion tokens across more than 1,000 languages; 40% of the data is not in English. It covers many languages that have previously been underrepresented in LLMs, such as Swiss German and Romansh.

“Apertus is built for the public good. It is one of a small number of fully open LLMs at this scale, and the first to embody multilingualism, transparency and compliance as fundamental design principles.”

“Swisscom is proud to be one of the first to deploy this pioneering large language model on the Swiss AI platform. As a strategic partner of the Swiss AI Initiative, we are supporting access to Apertus during Swiss {ai} Week,” said Daniel Dobos, Director of Research at Swisscom.

Accessibility

Setting up Apertus is straightforward for professionals and skilled users, but using it in practice requires additional components such as servers, cloud infrastructure, and specific user interfaces. The upcoming Swiss {ai} Week hackathon will be the first opportunity for developers to work with Apertus hands-on, test its capabilities, and provide feedback for improving future versions.

Swisscom provides a dedicated interface for hackathon participants, making it easier to interact with the model. As of today, Swisscom business customers can access the Apertus model through Swisscom's sovereign Swiss AI platform.

Furthermore, for people outside Switzerland, the Public AI Inference Utility will make Apertus accessible as part of the global movement for public AI. “Apertus is now a major AI model, built by public institutions for the public interest. It is our best evidence that AI can be a form of public infrastructure, like highways, water or electricity.”

Transparency and compliance

Apertus is designed with transparency at its core, ensuring complete reproducibility of the training process. Alongside the model, the research team has published a range of resources: comprehensive documentation, the source code of the training process, and the model weights including intermediate checkpoints, all released under a permissive open-source license. The terms of use are available on Hugging Face.

Apertus was developed with full consideration of Swiss data protection law, Swiss copyright law, and the transparency obligations under the EU AI Act. Special attention was paid to data integrity and ethical standards. The training corpus is built solely from publicly available data. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data and other unwanted content before training begins.
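To make the idea of a machine-readable opt-out concrete, the following is a minimal sketch, not the Apertus team's actual pipeline. It assumes the opt-out signal is a site's robots.txt file and that "retroactive" means applying the rules a site publishes today to pages that were collected earlier. The crawler name, the function names, and the document layout (dictionaries with "url" and "text" keys) are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name, used only for this illustration.
USER_AGENT = "example-crawler"


def is_allowed(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def filter_corpus(documents, robots_by_host, user_agent=USER_AGENT):
    """Keep only documents whose host does not opt out of crawling their URL.

    documents: list of dicts with "url" and "text" keys (assumed layout).
    robots_by_host: maps a host name to the robots.txt text it serves today.
    Hosts without an entry are treated as having no restrictions (an assumption).
    """
    kept = []
    for doc in documents:
        host = urlsplit(doc["url"]).netloc
        robots_txt = robots_by_host.get(host, "")
        if is_allowed(robots_txt, doc["url"], user_agent):
            kept.append(doc)
    return kept


if __name__ == "__main__":
    robots_by_host = {"example.org": "User-agent: *\nDisallow: /private/\n"}
    documents = [
        {"url": "https://example.org/public/a", "text": "kept"},
        {"url": "https://example.org/private/b", "text": "dropped"},
    ]
    for doc in filter_corpus(documents, robots_by_host):
        print(doc["url"])
```

A real pipeline would also need to fetch and cache the robots.txt files and handle other opt-out signals; this sketch covers only the filtering step.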

The beginning of the journey

“Apertus shows that generative AI can be both powerful and open,” says Antoine Bosselut, professor at EPFL's Institute of Natural Language Processing and co-lead of the Swiss AI Initiative. “The release of Apertus is not a final step but the beginning of a journey: a long-term commitment to open, trustworthy, sovereign AI foundations, in the public interest worldwide. We are excited to see developers engage with the model at the Swiss {ai} Week hackathon.”

Future versions aim to expand the model family, improve efficiency and explore domain-specific adaptations in areas such as law, climate, health, and education. They are also expected to integrate additional capabilities while maintaining strong standards of transparency.

EPFL (École Polytechnique Fédérale de Lausanne) is a research institute and university located in Lausanne, Switzerland, specializing in natural sciences and engineering.
