Articles related to "Automatic Speech Recognition"
Overview of the MEDIQA-OE 2025 Shared Task on Medical Order Extraction from Doctor-Patient Consultations
cs.AI updates on arXiv.org
2025-11-03T05:18:58.000000Z
Arabic Little STT: Arabic Children Speech Recognition Dataset
cs.AI updates on arXiv.org
2025-10-28T04:14:35.000000Z
Probing the Hidden Talent of ASR Foundation Models for L2 English Oral Assessment
cs.AI updates on arXiv.org
2025-10-21T04:24:09.000000Z
RWKV 2025 Ecosystem Content Contest | September Submissions and Review Results
RWKV元始智能
2025-10-18T11:36:31.000000Z
Personal Attribute Leakage in Federated Speech Models
cs.AI updates on arXiv.org
2025-10-16T04:27:02.000000Z
A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation
cs.AI updates on arXiv.org
2025-10-16T04:24:08.000000Z
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
cs.AI updates on arXiv.org
2025-10-16T04:23:08.000000Z
Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion
cs.AI updates on arXiv.org
2025-10-13T04:11:32.000000Z
An Investigation of Incorporating Mamba for Speech Enhancement
cs.AI updates on arXiv.org
2025-10-08T04:15:24.000000Z
EvolveCaptions: Empowering DHH Users Through Real-Time Collaborative Captioning
cs.AI updates on arXiv.org
2025-10-03T04:18:36.000000Z
Automatic Speech Recognition with Hugging Face's Transformers and Amazon SageMaker
philschmid RSS feed
2025-09-30T11:13:57.000000Z
Managed Transcription with OpenAI Whisper and Hugging Face Inference Endpoints
philschmid RSS feed
2025-09-30T11:13:01.000000Z
Decoding Deception: Understanding Automatic Speech Recognition Vulnerabilities in Evasion and Poisoning Attacks
cs.AI updates on arXiv.org
2025-09-29T04:15:34.000000Z
MNV-17: A High-Quality Performative Mandarin Dataset for Nonverbal Vocalization Recognition in Speech
cs.AI updates on arXiv.org
2025-09-26T04:24:05.000000Z
i-LAVA: Insights on Low Latency Voice-2-Voice Architecture for Agents
cs.AI updates on arXiv.org
2025-09-26T04:22:24.000000Z
Variational Low-Rank Adaptation for Personalized Impaired Speech Recognition
cs.AI updates on arXiv.org
2025-09-26T04:21:19.000000Z
Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling
cs.AI updates on arXiv.org
2025-09-26T04:21:18.000000Z
After the Release of Qwen3-ASR-Flash, Here Is What We Heard People Say...
通义
2025-09-25T10:01:43.000000Z