cs.AI updates on arXiv.org 07月08日
Personalized Image Generation from an Author Writing Style
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出一种利用作者写作表(AWS)输入大型语言模型(LLM)生成文本到图像提示,并通过扩散模型生成视觉图像的方法。实验结果表明,该方法能够较好地捕捉作者风格,但存在高度抽象叙述元素表现上的挑战。

arXiv:2507.03313v1 Announce Type: cross Abstract: Translating nuanced, textually-defined authorial writing styles into compelling visual representations presents a novel challenge in generative AI. This paper introduces a pipeline that leverages Author Writing Sheets (AWS) - structured summaries of an author's literary characteristics - as input to a Large Language Model (LLM, Claude 3.7 Sonnet). The LLM interprets the AWS to generate three distinct, descriptive text-to-image prompts, which are then rendered by a diffusion model (Stable Diffusion 3.5 Medium). We evaluated our approach using 49 author styles from Reddit data, with human evaluators assessing the stylistic match and visual distinctiveness of the generated images. Results indicate a good perceived alignment between the generated visuals and the textual authorial profiles (mean style match: $4.08/5$), with images rated as moderately distinctive. Qualitative analysis further highlighted the pipeline's ability to capture mood and atmosphere, while also identifying challenges in representing highly abstract narrative elements. This work contributes a novel end-to-end methodology for visual authorial style personalization and provides an initial empirical validation, opening avenues for applications in creative assistance and cross-modal understanding.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

AI生成 作者风格 视觉图像 LLM 扩散模型
相关文章