ΑΙhub 09月12日
瑞士联合发布全开源大语言模型Apertus
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

瑞士洛桑联邦理工学院(EPFL)、苏黎世联邦理工学院(ETH Zurich)及瑞士国家超级计算中心(CSCS)联合发布了名为Apertus的大语言模型。该模型最大的特点是完全开源,包括其架构、模型权重、训练数据和方法等所有开发过程均公开可查。Apertus提供80亿和700亿参数两个版本,并采用宽松的开源许可,支持教育、研究、社会及商业应用。模型训练数据包含15万亿个token,覆盖1000多种语言,其中40%为非英语数据,特别关注了瑞士德语、罗曼什语等代表性语言。Apertus旨在成为可信赖、主权和包容性AI模型的典范,并推动AI领域的创新与专业知识发展。

💡 **完全开源的开发模式**: Apertus模型最大的亮点在于其完全开放的开发过程,包括模型架构、权重、训练数据和训练方法等,用户可以深入了解和审查模型的每一个环节。这种透明度使其区别于仅部分公开组件的模型,为构建可信赖、主权和包容性AI提供了范例。

🌍 **广泛的多语言支持**: Apertus在训练时使用了15万亿个token,覆盖了超过1000种语言,其中40%的数据是非英语的。特别的是,它包含了如瑞士德语、罗曼什语等在大型语言模型中此前代表性不足的语言,显著提升了模型的语言多样性。

🚀 **推动AI创新与应用**: Apertus的发布不仅是技术转移,更是作为创新的驱动力,旨在加强瑞士乃至全球在研究、社会和产业界的AI专业知识。其宽松的开源许可允许广泛的商业应用,为开发者和组织提供了构建新一代AI应用的坚实基础。

🔒 **注重透明度与合规性**: Apertus在开发过程中严格遵守瑞士和欧盟的数据保护与版权法律,并特别关注数据完整性和道德标准。训练数据经过仔细筛选,过滤了个人信息和不当内容,并尊重网站的机器可读退出请求,确保了模型的合规性和伦理安全性。

By Melissa Anchisi and Florian Meyer

In July, EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) announced their joint initiative to build a large language model (LLM). Now, this model is available and serves as a building block for developers and organisations for future applications such as chatbots, translation systems, or educational tools.

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face – a platform for AI models and applications – and deploy it for their own projects. Apertus is freely available in two sizes – featuring 8 billion and 70 billion parameters, the smaller model being more appropriate for individual usage. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications.

A fully open-source LLM

As a fully open language model, Apertus allows researchers, professionals and enthusiasts to build upon the model and adapt it to their specific needs, as well as to inspect any part of the training process. This distinguishes Apertus from models that make only selected components accessible.

“With this release, we aim to provide a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed,” says Martin Jaggi, Professor of Machine Learning at EPFL and member of the Steering Committee of the Swiss AI Initiative. The model will be regularly updated by the development team which includes specialized engineers and a large number of researchers from CSCS, ETH Zurich and EPFL.

A driver of innovation

With its open approach, EPFL, ETH Zurich and CSCS are venturing into new territory. “Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry,” says Thomas Schulthess, Director of CSCS and Professor at ETH Zurich. In line with their tradition, EPFL, ETH Zurich and CSCS are providing both foundational technology and infrastructure to foster innovation across the economy.

Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others.

“Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles”, says Imanol Schlag, technical lead of the LLM project and Research Scientist at ETH Zurich.

“Swisscom is proud to be among the first to deploy this pioneering large language model on our sovereign Swiss AI Platform. As a strategic partner of the Swiss AI Initiative, we are supporting the access of Apertus during the Swiss {ai} Weeks. This underscores our commitment to shaping a secure and responsible AI ecosystem that serves the public interest and strengthens Switzerland’s digital sovereignty”, commented Daniel Dobos, Research Director at Swisscom.

Accessibility

While setting up Apertus is straightforward for professionals and proficient users, additional components such as servers, cloud infrastructure or specific user interfaces are required for practical use. The upcoming Swiss {ai} Weeks hackathons will be the first opportunity for developers to experiment hands-on with Apertus, test its capabilities, and provide feedback for improvements to future versions.

Swisscom will provide a dedicated interface to hackathon participants, making it easier to interact with the model. As of today, Swisscom business customers will be able to access the Apertus model via Swisscom’s sovereign Swiss AI platform.

Furthermore, for people outside of Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI. “Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity,” says Joshua Tan, Lead Maintainer of the Public AI Inference Utility.

Transparency and compliance

Apertus is designed with transparency at its core, thereby ensuring full reproducibility of the training process. Alongside the models, the research team has published a range of resources: comprehensive documentation and source code of the training process and datasets used, model weights including intermediate checkpoints – all released under the permissive open-source license, which also allows for commercial use. The terms and conditions are available via Hugging Face.

Apertus was developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on data which is publicly available. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data, and other undesired content before training begins.

The beginning of a journey

“Apertus demonstrates that generative AI can be both powerful and open,” says Antoine Bosselut, Professor and Head of the Natural Language Processing Laboratory at EPFL and Co-Lead of the Swiss AI Initiative. “The release of Apertus is not a final step, rather it’s the beginning of a journey, a long-term commitment to open, trustworthy, and sovereign AI foundations, for the public good worldwide. We are excited to see developers engage with the model at the Swiss {ai} Weeks hackathons. Their creativity and feedback will help us to improve future generations of the model.”

Future versions aim to expand the model family, improve efficiency, and explore domain-specific adaptations in fields like law, climate, health and education. They are also expected to integrate additional capabilities, while maintaining strong standards for transparency.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

Apertus 大型语言模型 开源AI 瑞士AI 多语言模型 Apertus Large Language Model Open-Source AI Swiss AI Multilingual Model
相关文章