PCLMMPLUS Dataset and CPCLDetector Model Improve Discrimination Detection on Chinese Video Platforms

cs.AI updates on arXiv.org, September 25

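This paper proposes the PCLMMPLUS dataset and the CPCLDetector model to improve the detection of discriminatory language on Chinese video platforms and to protect vulnerable groups.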

arXiv:2509.18562v2 Announce Type: replace-cross

Abstract: Chinese Patronizing and Condescending Language (CPCL) is an implicitly discriminatory form of toxic speech that targets vulnerable groups on Chinese video platforms. The existing dataset lacks user comments, which directly reflect video content. This gap weakens a model's understanding of video content and causes some CPCL videos to go undetected. To close it, this research builds a new dataset, PCLMMPLUS, which includes 103k comment entries and expands the dataset size. We also propose the CPCLDetector model, which has alignment selection and knowledge-enhanced comment content modules. Extensive experiments show that the proposed CPCLDetector outperforms the SOTA on PCLMM and achieves higher performance on PCLMMPLUS. CPCL videos are detected more accurately, which supports content governance and protects vulnerable groups. Code and dataset are available at https://github.com/jiaxunyang256/PCLD.

Related tags

PCLMMPLUS dataset, CPCLDetector model, discrimination detection, Chinese video platforms, vulnerable groups
