Speech Processing_Fishai

Articles related to "Speech Processing"

MERaLiON-SER: Robust Speech Emotion Recognition Model for English and SEA Languages

cs.AI updates on arXiv.org 2025-11-10T05:12:18.000000Z

UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement

cs.AI updates on arXiv.org 2025-10-24T04:27:43.000000Z

[Share & Create] Looking for recommendations: a way to replace Indian accents 🇮🇳 in YouTube videos with a standard American accent 🇺🇸

V2EX 2025-10-21T03:39:47.000000Z

2025.09.30 | SLA sparse attention cuts compute; StableToken resists noise without retraining the model

HuggingFace Daily AI Paper Digest 2025-10-02T17:36:43.000000Z

Xiaomi Released MiMo-Audio, a 7B Speech Language Model Trained on 100M+ Hours with High-Fidelity Discrete Tokens

MarkTechPost@AI 2025-09-20T08:25:54.000000Z

MERaLiON-SpeechEncoder: Towards a Speech Foundation Model for Singapore and Beyond

cs.AI updates on arXiv.org 2025-09-12T04:19:21.000000Z

From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach

cs.AI updates on arXiv.org 2025-09-08T04:51:47.000000Z

StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation

cs.AI updates on arXiv.org 2025-08-05T11:10:22.000000Z

Synthetic Data Generation for Phrase Break Prediction with Large Language Model

cs.AI updates on arXiv.org 2025-07-25T04:28:45.000000Z

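ByteDance launches a new Chinese-English simultaneous interpretation model that mimics the speaker's voice, with latency approaching that of professional simultaneous interpreters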

Cnbeta 2025-07-24T08:07:46.000000Z

On the Relationship between Accent Strength and Articulatory Features

cs.AI updates on arXiv.org 2025-07-08T05:54:07.000000Z

K-Function: Joint Pronunciation Transcription and Feedback for Evaluating Kids Language Function

cs.AI updates on arXiv.org 2025-07-08T05:54:00.000000Z

PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection

cs.AI updates on arXiv.org 2025-07-01T06:49:15.000000Z

This company is using AI to give people American-sounding accents

The Verge - Artificial Intelligences 2025-03-26T15:39:36.000000Z

Alibaba Speech Lab Releases ClearerVoice-Studio: An Open-Sourced Voice Processing Framework Supporting Speech Enhancement, Separation, and Target Speaker Extraction

MarkTechPost@AI 2024-12-07T20:19:55.000000Z

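Quickly create 3D digital human heads; an open-source multifunctional photo-editing tool; Runway adds advanced camera controls; Tongyi generates coherent images from prompts; an audio version of LoRA for music creation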

三花AI 2024-11-04T03:00:09.000000Z

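A new large-scale, dynamic benchmark for speech enhancement/separation! Tsinghua releases SonicSim, a moving-sound-source simulation platform, with 950+ hours of training data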

新智元 2024-10-31T09:32:02.000000Z

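A new large-scale, dynamic benchmark for speech enhancement/separation: Tsinghua releases SonicSim, a moving-sound-source simulation platform, with 950+ hours of training data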

36Kr - Technology Channel 2024-10-31T07:29:09.000000Z

SpeechBrain: A PyTorch-based Speech Toolkit

MarkTechPost@AI 2024-10-08T07:21:16.000000Z

This AI Paper by NVIDIA Introduces NEST: A Fast and Efficient Self-Supervised Model for Speech Processing

MarkTechPost@AI 2024-09-13T05:05:44.000000Z

Copyright © 2019 FISHAI. All Rights Reserved