machinelearning apple, October 28, 05:15
Pico-Banana-400K: Advancing Text-Guided Image Editing

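This article introduces Pico-Banana-400K, a 400K-image dataset for text-guided image editing that aims to advance research in the field through high-quality, diverse edit pairs.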

Recent advances in multimodal models have demonstrated remarkable text-guided image editing capabilities, with systems like GPT-4o and Nano-Banana setting new benchmarks. However, the research community’s progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built from real images. We introduce Pico-Banana-400K, a comprehensive 400K-image dataset for instruction-based image editing. Our dataset is constructed by leveraging Nano-Banana to generate diverse edit pairs from real photographs in the OpenImages collection. What distinguishes Pico-Banana-400K from previous synthetic datasets is our systematic approach to quality and diversity. We employ a fine-grained image editing taxonomy to ensure comprehensive coverage of edit types while maintaining precise content preservation and instruction faithfulness through MLLM-based quality scoring and careful curation. Beyond single-turn editing, Pico-Banana-400K enables research into complex editing scenarios. The dataset includes three specialized subsets: (1) a 72K-example multi-turn collection for studying sequential editing, reasoning, and planning across consecutive modifications; (2) a 56K-example preference subset for alignment research and reward model training; and (3) paired long-short editing instructions for developing instruction rewriting and summarization capabilities. By providing this large-scale, high-quality, and task-rich resource, Pico-Banana-400K establishes a robust foundation for training and benchmarking the next generation of text-guided image editing models.
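
The curation step described above (scoring each generated edit and keeping only those that pass) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `EditCandidate` fields, the `curate` and `count_by_type` helpers, the taxonomy labels, and the 0.7 threshold are all hypothetical, and the scorer is passed in as a plain function standing in for an MLLM-based judge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class EditCandidate:
    """One candidate edit pair. All field names are hypothetical."""
    source_image: str  # path or ID of the real photograph
    edited_image: str  # path or ID of the generated edit
    instruction: str   # text instruction that was applied
    edit_type: str     # taxonomy label (illustrative values only)


def curate(
    candidates: Iterable[EditCandidate],
    score_fn: Callable[[EditCandidate], float],
    threshold: float = 0.7,  # assumed cutoff, not taken from the source
) -> List[EditCandidate]:
    """Keep candidates whose quality score meets the threshold."""
    return [c for c in candidates if score_fn(c) >= threshold]


def count_by_type(examples: Iterable[EditCandidate]) -> Dict[str, int]:
    """Count kept examples per edit type to check taxonomy coverage."""
    counts: Dict[str, int] = {}
    for ex in examples:
        counts[ex.edit_type] = counts.get(ex.edit_type, 0) + 1
    return counts
```

In use, `score_fn` would wrap whatever judge model is available; counting the survivors per edit type is one simple way to see whether filtering has left any category of the taxonomy under-represented.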

