arXiv:2511.00686v1 Announce Type: cross Abstract: Text-to-image diffusion models, while proficient at generating high-fidelity images, often suffer from limited output diversity, hindering their application in exploratory and ideation tasks. Existing prompt optimization techniques typically target aesthetic fitness or are ill-suited to the creative visual domain. To address this shortcoming, we introduce WANDER, a novelty search-based approach to generating diverse sets of images from a single input prompt. WANDER operates directly on natural language prompts, employing a Large Language Model (LLM) to semantically evolve the prompts and using CLIP embeddings of the generated images to quantify novelty. We additionally apply emitters to guide the search into distinct regions of the prompt space, and show that they boost the diversity of the generated images. Empirical evaluations using FLUX-DEV for generation and GPT-4o-mini for mutation demonstrate that WANDER significantly outperforms existing evolutionary prompt optimization baselines on diversity metrics. Ablation studies confirm the efficacy of the emitters.
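To make the described loop concrete, below is a minimal sketch of novelty search over prompts under the assumptions the abstract states: an LLM mutates prompts, a diffusion model renders them, and novelty is the mean distance to the k nearest CLIP embeddings in an archive. The functions `generate_image`, `mutate_prompt`, and `embed_image` are hypothetical stand-ins for the FLUX-DEV, GPT-4o-mini, and CLIP calls; emitters are omitted. This is not the authors' released code.

```python
# Illustrative novelty-search-over-prompts loop (not the authors' implementation).
# generate_image / mutate_prompt / embed_image are hypothetical placeholders for
# a diffusion model, an LLM mutation call, and a CLIP image encoder.
import numpy as np

def novelty(embedding, archive, k=5):
    """Mean Euclidean distance to the k nearest CLIP embeddings in the archive."""
    if not archive:
        return float("inf")  # the first individual is novel by definition
    dists = np.linalg.norm(np.stack(archive) - embedding, axis=1)
    return float(np.sort(dists)[:k].mean())

def wander_sketch(seed_prompt, generations=10, pop_size=8, k=5):
    archive = []                      # CLIP embeddings of accepted images
    population = [seed_prompt] * pop_size
    for _ in range(generations):
        scored = []
        for prompt in population:
            image = generate_image(prompt)    # diffusion model (e.g. FLUX-DEV)
            emb = embed_image(image)          # CLIP image embedding
            scored.append((novelty(emb, archive, k), prompt, emb))
        # Archive the most novel half and use it as parents.
        scored.sort(key=lambda t: t[0], reverse=True)
        parents = scored[: pop_size // 2]
        archive.extend(emb for _, _, emb in parents)
        # LLM-driven semantic mutation produces the next population.
        population = [mutate_prompt(p) for _, p, _ in parents for _ in range(2)]
    return archive
```

In the paper's full method, emitters would additionally steer successive searches toward distinct regions of prompt space; here a single global archive stands in for that mechanism.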
