Articles related to "Multimodal Evaluation"
TREAT: A Code LLMs Trustworthiness / Reliability Evaluation and Testing Framework
cs.AI updates on arXiv.org
2025-10-21T04:27:54.000000Z
MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation
cs.AI updates on arXiv.org
2025-10-13T04:12:21.000000Z
Multimodal Carotid Risk Stratification with Large Vision-Language Models: Benchmarking, Fine-Tuning, and Clinical Insights
cs.AI updates on arXiv.org
2025-10-06T04:27:55.000000Z
PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation
cs.AI updates on arXiv.org
2025-10-02T04:17:55.000000Z
Text2VLM: Adapting Text-Only Datasets to Evaluate Alignment Training in Visual Language Models
cs.AI updates on arXiv.org
2025-07-29T04:21:57.000000Z
MEGA-Bench: A Comprehensive AI Benchmark that Scales Multimodal Evaluation to Over 500 Real-World Tasks at a Manageable Inference Cost
MarkTechPost@AI
2024-10-15T07:21:08.000000Z
MJ-BENCH: A Multimodal AI Benchmark for Evaluating Text-to-Image Generation with Focus on Alignment, Safety, and Bias
MarkTechPost@AI
2024-07-13T04:31:18.000000Z