POE Blog, October 11, 00:37
Poe AI Model Usage Trends Report: New Developments in Text, Image, Video, and Audio

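This report analyzes aggregated user data on the Poe platform from May to August 2025 to offer a provider-agnostic view of usage trends in the AI ecosystem. It shows Google's strong momentum across modalities, sustained user interest in models with reasoning capabilities, and intensifying global competition in multimedia generation. OpenAI keeps its lead in text, where GPT-5, GPT-4o, and GPT-4.1 dominate, while Google's Gemini series is rising quickly. In image generation, Google's Gemini 2.5 Flash Image took the leading position soon after launch. In video generation, Google's Veo 3 is popular for its native synchronized audio, but models from Chinese AI companies together account for the majority of usage. In audio, ElevenLabs keeps its lead in text-to-speech (TTS), while competition is beginning to emerge in music generation. Overall, the Poe data reflects a diversifying model landscape and active market competition.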

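💬 **Shifting competition among text models**: OpenAI's GPT models (GPT-5, GPT-4o, GPT-4.1) account for over 50% of message usage on Poe, showing a continued advantage in text generation. Google's Gemini 2.5 series gained momentum after a slow start, growing from 3.5% to roughly 10% of text messages, and has become a significant competitor. Anthropic's Claude models saw their overall share decline but remain favored for complex tasks such as programming.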

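🖼️ **Upheaval in image generation**: Google's Gemini 2.5 Flash Image (codenamed “nano-banana”) disrupted the image generation market within a short time, capturing over 48% of usage. This reflects strong demand for gains in speed, performance, and precision, especially in conversational image editing.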

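🎬 **A more diverse and global video generation market**: Google's Veo 3 became the most popular video model thanks to its native synchronized audio, but Kling, Hailuo, Wan, and Seedance, all from Chinese AI companies, together account for 52.6% of usage. This points to intense global competition in the category and has put pressure on the share of early market leaders such as Runway.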

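🔊 **The audio landscape and emerging opportunities**: ElevenLabs holds a commanding lead in text-to-speech (TTS) with a 74.4% share, and its new v3 model is quickly winning users. Meanwhile, music generation is emerging as a new and contested sub-category of audio AI, with models such as ElevenLabs Music and Google's Lyria starting to gain ground.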

This report examines aggregated subscriber data on Poe from May to August 2025 and presents a provider-agnostic view of emerging usage trends that may signal broader shifts in the AI ecosystem. The period is characterized by Google’s growing momentum across modalities, sustained user interest in models with reasoning capabilities (a key finding in our spring report), and increasing global competition in multi-media generation.

We hope this analysis provides a valuable, data-driven perspective on the dynamic state of AI for researchers and those who are watching the space. [1] [2]

OpenAI extends its lead in text as Google’s Gemini series gains traction

The text landscape continues to show a preference for models that balance cost efficiency with powerful reasoning. This trend has benefited OpenAI and Google, while affecting the overall share of usage among other LLM providers.

  • The OpenAI family of models has expanded its usage on Poe, now accounting for over 50% of subscriber messages. This is led by the recently launched GPT-5 (27.7%), and strong continued usage of GPT-4o (19.8%) and GPT-4.1 (7.3%).

  • After a period of slower initial adoption following its April launch, Google’s Gemini 2.5 series gained significant momentum, steadily growing from 3.5% to approximately 10% of message usage in the text category.

  • Even with the new releases of Opus 4/4.1 and Sonnet 4, the total usage share for Anthropic’s Claude models declined. However, they remain a preferred choice for specific, complex tasks, with programming-related queries accounting for roughly 40% of Claude Sonnet 4 usage.

  • While Grok 4's adoption slope on Poe was one of the steepest we've seen among LLMs in the days following launch, its usage has declined to 1% of total messages in the category.
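
As a quick sanity check on the figures above, the following minimal Python sketch adds up the three OpenAI model shares named in the report and compares the two Gemini 2.5 data points. The variable and function names are illustrative only, and the Gemini end value is the report's approximate figure, so the growth factor is a rough estimate rather than a published number.

```python
# Shares of Poe subscriber messages in the text category, as quoted in the report (percent).
openai_shares = {"GPT-5": 27.7, "GPT-4o": 19.8, "GPT-4.1": 7.3}

# Gemini 2.5 series share at the start and end of its growth period (percent).
# The end value is stated as "approximately 10%" in the report.
gemini_start, gemini_end = 3.5, 10.0


def combined_share(shares):
    """Sum per-model percentage shares, rounded to one decimal place."""
    return round(sum(shares.values()), 1)


total = combined_share(openai_shares)
growth_factor = round(gemini_end / gemini_start, 1)

print(f"Named OpenAI models combined: {total}%")
print(f"Gemini 2.5 growth factor: about {growth_factor}x")
```

The three named models sum to 54.8%, which is consistent with the "over 50%" figure for the OpenAI family as a whole.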

Google gains dominance in image generation with a single launch

The image generation space has been significantly disrupted by Google’s latest release, Gemini 2.5 Flash Image (codenamed “nano-banana”), demonstrating the market’s readiness to adopt models that offer a leap in speed, performance, and precision, specifically in conversational image editing.

  • Gemini 2.5 Flash Image has reshaped the category, capturing over 48% of all image generation usage just one week after its launch and sustaining its lead by a large margin.

  • Despite this major shift, OpenAI’s GPT-Image-1 (11.7%) and the FLUX family from Black Forest Labs (10.5% combined) have held their positions as the next most popular image generation options on Poe.

Innovation and global competition redefine the video landscape

The video generation space has seen a breakthrough in model capability, even as the market has become more fragmented and globally competitive.

  • Google’s Veo 3 became a viral sensation upon its release, introducing the novel capability of generating video with native, synchronized audio. This technological leap drove its rapid adoption, establishing it as the single most-used video model on Poe with approximately 24% usage share, despite its premium cost.

  • The market remains intensely competitive with a diverse range of models from Chinese AI firms (Kling, Hailuo, Wan, and Seedance) collectively representing the majority of usage, capturing 52.6% of all video generation messages during the period.

  • This hyper-competitive environment has put pressure on early market leaders. Runway, one of the category’s previous leaders, saw its market share decline to 12.3%, a trend that the release of its updated Runway Gen 4 Turbo model did not reverse.

ElevenLabs defends its lead in TTS as music generation emerges

The audio AI space is simultaneously maturing and expanding, with a clear leader in the established text-to-speech (TTS) domain and early competition forming in the nascent music generation sub-category.

  • In the TTS domain, ElevenLabs retained the strongest overall usage, fulfilling 74.4% of all subscriber requests. Its new, higher-performance v3 model is seeing swift adoption, growing to 29.7% of usage and signaling interest in breakthrough capabilities such as audio tags and multi-speaker support.

  • The competitive field for TTS is diversifying, with the emergence of new models like Hailuo Speech 02 joining established alternatives such as Cartesia, Unreal Speech, PlayAI, and Orpheus. These alternatives offer stronger performance in certain languages, distinctive voice options, effects, and varied performance-price profiles.

  • The recent introduction of music generation models has opened a new front in audio AI with ElevenLabs Music and Lyria 2. In its first week, ElevenLabs Music took 10.1% of the total audio generation category’s usage, compared to 3.3% for Google’s Lyria, launched in May 2025.

Conclusion

By sharing these insights from Poe’s unique vantage point, we aim to offer a clear, real-world view of the competitive dynamics shaping the future of AI. The growing diversity of models and providers underscores the value of a platform where users and creators can freely explore the best tools for their needs at an affordable price point.

We look forward to sharing further observations as new patterns emerge. To experience these models firsthand, you can explore our library of 200+ official text, image, video, and audio models by visiting poe.com today.
