cs.AI updates on arXiv.org 10月02日
跨语言神经语音分离模型研究
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出了一种支持单一框架内多种语言无约束范围的神经语音分离模型。通过结合可学习的基于查询的架构和多语言意识,以及大规模模拟代码转换数据预训练,该模型克服了传统方法在数据稀缺和架构优化方面的限制,并在多个语言分离基准测试中实现了最先进的性能。

arXiv:2510.00582v1 Announce Type: cross Abstract: In this paper, we present a neural spoken language diarization model that supports an unconstrained span of languages within a single framework. Our approach integrates a learnable query-based architecture grounded in multilingual awareness, with large-scale pretraining on simulated code-switching data. By jointly leveraging these two components, our method overcomes the limitations of conventional approaches in data scarcity and architecture optimization, and generalizes effectively to real-world multilingual settings across diverse environments. Experimental results demonstrate that our approach achieves state-of-the-art performance on several language diarization benchmarks, with a relative performance improvement of 23% to 52% over previous methods. We believe that this work not only advances research in language diarization but also establishes a foundational framework for code-switching speech technologies.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

神经语音分离 跨语言 模型研究 代码转换 预训练
相关文章