Articles related to "Speech Processing"
MERaLiON-SER: Robust Speech Emotion Recognition Model for English and SEA Languages
cs.AI updates on arXiv.org
2025-11-10T05:12:18.000000Z
UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement
cs.AI updates on arXiv.org
2025-10-24T04:27:43.000000Z
[Share & Create] Seeking recommendations for a way to replace an Indian accent 🇮🇳 with a standard American accent 🇺🇸 in YouTube videos
V2EX
2025-10-21T03:39:47.000000Z
2025.09.30 | SLA sparse attention cuts compute; StableToken resists noise without model training
HuggingFace Daily AI Paper Digest
2025-10-02T17:36:43.000000Z
Xiaomi Released MiMo-Audio, a 7B Speech Language Model Trained on 100M+ Hours with High-Fidelity Discrete Tokens
MarkTechPost@AI
2025-09-20T08:25:54.000000Z
MERaLiON-SpeechEncoder: Towards a Speech Foundation Model for Singapore and Beyond
cs.AI updates on arXiv.org
2025-09-12T04:19:21.000000Z
From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach
cs.AI updates on arXiv.org
2025-09-08T04:51:47.000000Z
StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation
cs.AI updates on arXiv.org
2025-08-05T11:10:22.000000Z
Synthetic Data Generation for Phrase Break Prediction with Large Language Model
cs.AI updates on arXiv.org
2025-07-25T04:28:45.000000Z
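ByteDance launches new Chinese-English simultaneous interpretation model: mimics the speaker's timbre, with latency close to that of professional simultaneous interpreters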
Cnbeta
2025-07-24T08:07:46.000000Z
On the Relationship between Accent Strength and Articulatory Features
cs.AI updates on arXiv.org
2025-07-08T05:54:07.000000Z
K-Function: Joint Pronunciation Transcription and Feedback for Evaluating Kids Language Function
cs.AI updates on arXiv.org
2025-07-08T05:54:00.000000Z
PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection
cs.AI updates on arXiv.org
2025-07-01T06:49:15.000000Z
This company is using AI to give people American-sounding accents
The Verge - Artificial Intelligences
2025-03-26T15:39:36.000000Z
Alibaba Speech Lab Releases ClearerVoice-Studio: An Open-Sourced Voice Processing Framework Supporting Speech Enhancement, Separation, and Target Speaker Extraction
MarkTechPost@AI
2024-12-07T20:19:55.000000Z
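Quickly create 3D digital human heads; an open-source multi-purpose photo-editing tool; Runway adds advanced camera-movement features; Tongyi generates coherent images from prompts; an audio version of LoRa for music creation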
三花AI
2024-11-04T03:00:09.000000Z
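A new large-scale, dynamic benchmark for speech enhancement/separation! Tsinghua releases SonicSim, a moving-sound-source simulation platform, with 950+ hours of training data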
新智元
2024-10-31T09:32:02.000000Z
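A new large-scale, dynamic benchmark for speech enhancement/separation: Tsinghua releases SonicSim, a moving-sound-source simulation platform, with 950+ hours of training data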
36Kr - Technology Channel
2024-10-31T07:29:09.000000Z
SpeechBrain: A PyTorch-based Speech Toolkit
MarkTechPost@AI
2024-10-08T07:21:16.000000Z
This AI Paper by NVIDIA Introduces NEST: A Fast and Efficient Self-Supervised Model for Speech Processing
MarkTechPost@AI
2024-09-13T05:05:44.000000Z