cs.AI updates on arXiv.org 09月03日
语音情感识别:机器学习新进展
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文探讨了语音情感识别的难题,通过构建SVM、LTSM和CNN等机器学习模型,结合迁移学习和数据增强技术,在较小数据集上实现了较好的性能,最佳模型ResNet34准确率达到66.7%,F1分数为0.631。

arXiv:2509.00025v1 Announce Type: cross Abstract: Speech emotion recognition (SER) has been a challenging problem in spoken language processing research, because it is unclear how human emotions are connected to various components of sounds such as pitch, loudness, and energy. This paper aims to tackle this problem using machine learning. Particularly, we built several machine learning models using SVMs, LTSMs, and CNNs to classify emotions in human speeches. In addition, by leveraging transfer learning and data augmentation, we efficiently trained our models to attain decent performances on a relatively small dataset. Our best model was a ResNet34 network, which achieved an accuracy of $66.7\%$ and an F1 score of $0.631$.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

语音情感识别 机器学习 模型构建 数据增强 迁移学习
相关文章