cs.AI updates on arXiv.org, September 8
PLaMo 2: New Advances in Japanese Large Language Models

This report introduces PLaMo 2, a series of Japanese-focused large language models built on a hybrid Samba-based architecture and extended through continual pre-training to support 32K-token contexts. The models are trained on large synthetic corpora, and an efficient structured-pruning method yields an 8B model whose performance matches the team's previous 100B model. Post-training combines supervised fine-tuning and direct preference optimization with synthetic instruction data and model merging; after inference optimization, the models achieve leading results on Japanese benchmarks.
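
As background on the post-training recipe: direct preference optimization (DPO) trains the policy to prefer chosen over rejected responses relative to a frozen reference model. Below is a minimal sketch of the standard DPO loss (Rafailov et al., 2023); it illustrates the general objective and is not code from the PLaMo 2 report.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss; each argument is a tensor of per-sequence
    log-probabilities (summed token log-probs) under the policy or the
    frozen reference model."""
    # Implicit rewards: scaled log-ratio of policy vs. reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic loss on the reward margin pushes chosen above rejected.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```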

arXiv:2509.04897v1 Announce Type: cross Abstract: In this report, we introduce PLaMo 2, a series of Japanese-focused large language models featuring a hybrid Samba-based architecture that transitions to full attention via continual pre-training to support 32K token contexts. Training leverages extensive synthetic corpora to overcome data scarcity, while computational efficiency is achieved through weight reuse and structured pruning. This efficient pruning methodology produces an 8B model that achieves performance comparable to our previous 100B model. Post-training further refines the models using a pipeline of supervised fine-tuning (SFT) and direct preference optimization (DPO), enhanced by synthetic Japanese instruction data and model merging techniques. Optimized for inference using vLLM and quantization with minimal accuracy loss, the PLaMo 2 models achieve state-of-the-art results on Japanese benchmarks, outperforming similarly-sized open models in instruction-following, language fluency, and Japanese-specific knowledge.
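
The abstract mentions model merging but does not specify the method; a common baseline is plain weight averaging of fine-tuned checkpoints ("model soups"). A generic, purely illustrative sketch:

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Weighted averaging of model checkpoints.

    A generic merging sketch; the report does not say which merging
    technique PLaMo 2 uses, so plain weight averaging stands in here.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        # Assumes all checkpoints share the same architecture/keys.
        merged[key] = sum(w * sd[key].float()
                          for w, sd in zip(weights, state_dicts))
    return merged
```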

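Since the models are stated to be optimized for inference with vLLM, a minimal serving sketch follows. The model id "pfnet/plamo-2-8b" is an assumption based on the vendor's Hugging Face namespace and is not given in the abstract.

```python
from vllm import LLM, SamplingParams

# Model id is assumed, not confirmed by the abstract.
llm = LLM(model="pfnet/plamo-2-8b", trust_remote_code=True)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

# A simple Japanese prompt to exercise the model.
outputs = llm.generate(["日本で一番高い山はどこですか?"], params)
print(outputs[0].outputs[0].text)
```
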
Related Tags

PLaMo 2 · Japanese large language models · Samba architecture · continual pre-training · supervised fine-tuning