Automatic Speech Recognition_Fishai

Trending

Articles related to "Automatic Speech Recognition"

Overview of the MEDIQA-OE 2025 Shared Task on Medical Order Extraction from Doctor-Patient Consultations

cs.AI updates on arXiv.org 2025-11-03T05:18:58.000000Z

Arabic Little STT: Arabic Children Speech Recognition Dataset

cs.AI updates on arXiv.org 2025-10-28T04:14:35.000000Z

Probing the Hidden Talent of ASR Foundation Models for L2 English Oral Assessment

cs.AI updates on arXiv.org 2025-10-21T04:24:09.000000Z

RWKV 2025 Ecosystem Content Contest | September Submissions and Judging Results

RWKV元始智能 2025-10-18T11:36:31.000000Z

Personal Attribute Leakage in Federated Speech Models

cs.AI updates on arXiv.org 2025-10-16T04:27:02.000000Z

A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation

cs.AI updates on arXiv.org 2025-10-16T04:24:08.000000Z

Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation

cs.AI updates on arXiv.org 2025-10-16T04:23:08.000000Z

Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion

cs.AI updates on arXiv.org 2025-10-13T04:11:32.000000Z

An Investigation of Incorporating Mamba for Speech Enhancement

cs.AI updates on arXiv.org 2025-10-08T04:15:24.000000Z

EvolveCaptions: Empowering DHH Users Through Real-Time Collaborative Captioning

cs.AI updates on arXiv.org 2025-10-03T04:18:36.000000Z

Automatic Speech Recognition with Hugging Face's Transformers and Amazon SageMaker

philschmid RSS feed 2025-09-30T11:13:57.000000Z

Managed Transcription with OpenAI Whisper and Hugging Face Inference Endpoints

philschmid RSS feed 2025-09-30T11:13:01.000000Z

Decoding Deception: Understanding Automatic Speech Recognition Vulnerabilities in Evasion and Poisoning Attacks

cs.AI updates on arXiv.org 2025-09-29T04:15:34.000000Z

MNV-17: A High-Quality Performative Mandarin Dataset for Nonverbal Vocalization Recognition in Speech

cs.AI updates on arXiv.org 2025-09-26T04:24:05.000000Z

i-LAVA: Insights on Low Latency Voice-2-Voice Architecture for Agents

cs.AI updates on arXiv.org 2025-09-26T04:22:24.000000Z

Variational Low-Rank Adaptation for Personalized Impaired Speech Recognition

cs.AI updates on arXiv.org 2025-09-26T04:21:19.000000Z

Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling

cs.AI updates on arXiv.org 2025-09-26T04:21:18.000000Z

After the Qwen3-ASR-Flash release, here is what we heard people say...

通义 2025-09-25T10:01:43.000000Z

Copyright © 2019 FISHAI. All Rights Reserved