Trending
Articles related to "data contamination"
No Need to Teach It to Lie, LLMs Still "Say One Thing and Mean Another": Shanghai AI Lab Exposes Model Deception in High-Stakes Settings
PaperWeekly 2025-11-02T21:05:20.000000Z
Competition-Level Programming Problems Auto-Generated with "Evolution + Stress Testing": Which Large Models Hold Up Best?
机器之心 2025-10-27T15:13:47.000000Z
Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models
cs.AI updates on arXiv.org 2025-10-24T04:27:00.000000Z
Texas A&M University: LLMs Exposed to Low-Quality Social Media Content Suffer Cognitive Decline
互联网数据资讯网-199IT 2025-10-23T13:26:57.000000Z
Deep Associations, High Creativity: A Simple yet Effective Metric for Evaluating Large Language Models
cs.AI updates on arXiv.org 2025-10-15T04:56:31.000000Z
A small amount of bad data can ‘poison’ even the largest AI models, researchers warn
Fortune | FORTUNE 2025-10-14T16:28:55.000000Z
Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models
cs.AI updates on arXiv.org 2025-10-13T04:14:14.000000Z
RADAR: Mechanistic Pathways for Detecting Data Contamination in LLM Evaluation
cs.AI updates on arXiv.org 2025-10-13T04:09:12.000000Z
Detecting Distillation Data from Reasoning Models
cs.AI updates on arXiv.org 2025-10-07T04:17:41.000000Z
Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
cs.AI updates on arXiv.org 2025-10-01T06:02:37.000000Z
A Practical Guide to Maintaining Machine Learning in Production
https://eugeneyan.com/rss 2025-09-30T11:14:19.000000Z
How Google’s $60M Reddit Deal Undermines the Future of Knowledge
Jeffbullas's Blog 2025-09-29T04:00:06.000000Z
How Generative Models Are Ruining Themselves
Communications of the ACM - Artificial Intelligence 2025-09-25T10:00:48.000000Z
How Generative Models Are Ruining Themselves
Communications of the ACM - Artificial Intelligence 2025-09-24T16:05:20.000000Z
Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM
cs.AI updates on arXiv.org 2025-09-23T06:11:36.000000Z
VisMoDAl: Visual Analytics for Evaluating and Improving Corruption Robustness of Vision-Language Models
cs.AI updates on arXiv.org 2025-09-19T04:37:50.000000Z