Swiss institutions jointly release Apertus, a fully open-source large language model

By Melissa Anchisi and Florian Meyer

In July, EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) announced their joint initiative to build a large language model (LLM). Now, this model is available and serves as a building block for developers and organisations for future applications such as chatbots, translation systems, or educational tools.

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face (a platform for AI models and applications) and deploy it for their own projects. Apertus is freely available in two sizes, with 8 billion and 70 billion parameters; the smaller model is better suited to individual use. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications.
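To make the difference between the two sizes concrete, the following rough sketch estimates the memory needed just to hold the model weights. It assumes 16-bit weights (2 bytes per parameter), which the announcement does not state, and it ignores activations and caches. The function name `weight_memory_gb` is illustrative only.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption (not from the announcement): weights are stored in a 16-bit
# format, i.e. 2 bytes per parameter. Activations and caches are ignored.

BYTES_PER_PARAM_16BIT = 2  # assumed precision


def weight_memory_gb(num_params: int,
                     bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 10**9


if __name__ == "__main__":
    # Parameter counts as given in the announcement: 8 billion and 70 billion.
    for label, params in [("8B", 8 * 10**9), ("70B", 70 * 10**9)]:
        print(f"Apertus {label}: about {weight_memory_gb(params):.0f} GB of weights")
```

Under that assumption the smaller model needs roughly 16 GB for its weights and the larger one roughly 140 GB, which is one way to see why the 8-billion-parameter version is the more practical choice for individual users.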

A fully open-source LLM

As a fully open language model, Apertus allows researchers, professionals and enthusiasts to build upon the model and adapt it to their specific needs, as well as to inspect any part of the training process. This distinguishes Apertus from models that make only selected components accessible.

“With this release, we aim to provide a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed,” says Martin Jaggi, Professor of Machine Learning at EPFL and member of the Steering Committee of the Swiss AI Initiative. The model will be regularly updated by the development team, which includes specialized engineers and a large number of researchers from CSCS, ETH Zurich and EPFL.

A driver of innovation

With their open approach, EPFL, ETH Zurich and CSCS are venturing into new territory. “Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry,” says Thomas Schulthess, Director of CSCS and Professor at ETH Zurich. In line with their tradition, EPFL, ETH Zurich and CSCS are providing both foundational technology and infrastructure to foster innovation across the economy.

Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others.

“Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles”, says Imanol Schlag, technical lead of the LLM project and Research Scientist at ETH Zurich.

“Swisscom is proud to be among the first to deploy this pioneering large language model on our sovereign Swiss AI Platform. As a strategic partner of the Swiss AI Initiative, we are supporting the access of Apertus during the Swiss {ai} Weeks. This underscores our commitment to shaping a secure and responsible AI ecosystem that serves the public interest and strengthens Switzerland’s digital sovereignty”, commented Daniel Dobos, Research Director at Swisscom.

Accessibility

While setting up Apertus is straightforward for professionals and proficient users, additional components such as servers, cloud infrastructure or specific user interfaces are required for practical use. The upcoming Swiss {ai} Weeks hackathons will be the first opportunity for developers to experiment hands-on with Apertus, test its capabilities, and provide feedback for improvements to future versions.

Swisscom will provide a dedicated interface to hackathon participants, making it easier to interact with the model. As of today, Swisscom business customers will be able to access the Apertus model via Swisscom’s sovereign Swiss AI platform.

Furthermore, for people outside of Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI. “Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity,” says Joshua Tan, Lead Maintainer of the Public AI Inference Utility.

Transparency and compliance

Apertus is designed with transparency at its core, thereby ensuring full reproducibility of the training process. Alongside the models, the research team has published a range of resources: comprehensive documentation and source code of the training process and datasets used, model weights including intermediate checkpoints – all released under the permissive open-source license, which also allows for commercial use. The terms and conditions are available via Hugging Face.

Apertus was developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on data that is publicly available. Before training begins, it is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data and other undesired content.
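As an illustration of what retroactive opt-out filtering can look like, the sketch below drops already-collected documents whose URLs are disallowed by a site's current robots.txt. This is not the Apertus pipeline: the use of robots.txt as the opt-out signal, the crawler name `ExampleCrawler`, the document layout, and the helper names are all assumptions made for the example.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str) -> bool:
    """Check one URL against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def filter_documents(documents, robots_by_host, user_agent="ExampleCrawler"):
    """Keep only documents whose URL is still allowed for the crawler.

    documents      -- list of dicts with at least a "url" key (hypothetical layout)
    robots_by_host -- maps hostname to the *current* robots.txt text, so an
                      opt-out added after crawling is applied retroactively
    Hosts with no robots.txt entry are treated as allowed (an assumption).
    """
    kept = []
    for doc in documents:
        host = urlparse(doc["url"]).hostname
        robots_txt = robots_by_host.get(host)
        if robots_txt is None or allowed_by_robots(robots_txt, doc["url"], user_agent):
            kept.append(doc)
    return kept


if __name__ == "__main__":
    robots = {
        "example.org": "User-agent: *\nDisallow: /private/\n",
        "optout.example": "User-agent: ExampleCrawler\nDisallow: /\n",
    }
    docs = [
        {"url": "https://example.org/public/a"},
        {"url": "https://example.org/private/b"},
        {"url": "https://optout.example/page"},
    ]
    for doc in filter_documents(docs, robots):
        print(doc["url"])
```

Because the filter consults the latest robots.txt rather than the one in force at crawl time, a site that opts out later still has its pages removed from the corpus on the next filtering pass.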

The beginning of a journey

“Apertus demonstrates that generative AI can be both powerful and open,” says Antoine Bosselut, Professor and Head of the Natural Language Processing Laboratory at EPFL and Co-Lead of the Swiss AI Initiative. “The release of Apertus is not a final step, rather it’s the beginning of a journey, a long-term commitment to open, trustworthy, and sovereign AI foundations, for the public good worldwide. We are excited to see developers engage with the model at the Swiss {ai} Weeks hackathons. Their creativity and feedback will help us to improve future generations of the model.”

Future versions aim to expand the model family, improve efficiency, and explore domain-specific adaptations in fields like law, climate, health and education. They are also expected to integrate additional capabilities, while maintaining strong standards for transparency.
