Poe AI Model Usage Trends Report: New Developments in Text, Image, Video, and Audio

This report examines aggregated subscriber data on Poe from May to August 2025 and presents a provider-agnostic view of emerging usage trends that may signal broader shifts in the AI ecosystem. The period is characterized by Google’s growing momentum across modalities, sustained user interest in models with reasoning capabilities (a key finding in our spring report), and increasing global competition in multimedia generation.

We hope this analysis provides a valuable, data-driven perspective on the dynamic state of AI for researchers and those who are watching the space. [1] [2]

OpenAI extends its lead in text as Google’s Gemini series gains traction

The text landscape continues to show a preference for models that balance cost efficiency with powerful reasoning. This trend has benefited OpenAI and Google, while affecting the overall share of usage among other LLM providers.

The OpenAI family of models has expanded its usage on Poe, now accounting for over 50% of subscriber messages. This is led by the recently launched GPT-5 (27.7%), and strong continued usage of GPT-4o (19.8%) and GPT-4.1 (7.3%).
After a period of slower initial adoption following its April launch, Google’s Gemini 2.5 series gained significant momentum, steadily growing from 3.5% to approximately 10% of message usage in the text category.
Even with the new releases of Opus 4, Opus 4.1, and Sonnet 4, the total usage share for Anthropic’s Claude models declined. However, they remain a preferred choice for specific, complex tasks, with programming-related queries accounting for roughly 40% of Claude Sonnet 4 usage.
While Grok 4's adoption slope on Poe was one of the steepest we've seen among LLMs in the days following launch, its usage has declined to 1% of total messages in the category.
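
As a quick check on the "over 50%" figure above, the three OpenAI shares quoted in this section can be added up. This is only a minimal sketch: the dictionary and variable names are illustrative, and it assumes the three percentages are measured on the same basis (share of text-category subscriber messages). Any OpenAI models not itemized in the report are left out.

```python
# Shares of text-category subscriber messages quoted in the report (percent).
# Only the three models named in the text are included; this is an assumption,
# since the report does not itemize other OpenAI models.
openai_shares = {"GPT-5": 27.7, "GPT-4o": 19.8, "GPT-4.1": 7.3}

# Round to one decimal place to avoid floating-point noise in the sum.
openai_named_total = round(sum(openai_shares.values()), 1)

print(f"Named OpenAI models combined: {openai_named_total}%")
print(f"Above 50%: {openai_named_total > 50}")
```

The three named models alone come to 54.8%, which is consistent with the stated majority share.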

Google gains dominance in image generation with a single launch

The image generation space has been significantly disrupted by Google’s latest release, Gemini 2.5 Flash Image (codenamed “nano-banana”), demonstrating the market’s readiness to adopt models that offer a leap in speed, performance, and precision, specifically in conversational image editing.

Gemini 2.5 Flash Image has reshaped the category, capturing over 48% of all image generation usage just one week after its launch and sustaining its lead by a large margin.
Despite this major shift, OpenAI’s GPT-Image-1 (11.7%) and the FLUX family from Black Forest Labs (10.5% combined) have held their positions as the next most popular image generation options on Poe.

Innovation and global competition redefine the video landscape

The video generation space has seen a breakthrough in model capability, even as the market has become more fragmented and globally competitive.

Google’s Veo 3 became a viral sensation upon its release, introducing the novel capability of generating video with native, synchronized audio. This technological leap drove its rapid adoption, establishing it as the single most-used video model on Poe with approximately 24% usage share, despite its premium cost.
The market remains intensely competitive with a diverse range of models from Chinese AI firms (Kling, Hailuo, Wan, and Seedance) collectively representing the majority of usage, capturing 52.6% of all video generation messages during the period.
This hyper-competitive environment has put pressure on early market leaders. Runway, one of the category’s previous leaders, saw its market share decline to 12.3%, a trend that the release of its updated Runway Gen 4 Turbo model did not reverse.
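
The video figures above can be combined to estimate how much of the category is left for everyone else. This is a rough sketch under stated assumptions: the three figures are treated as non-overlapping and measured on the same basis, the Veo 3 share is taken as exactly 24% even though the report gives it as approximate, and the names used are illustrative.

```python
# Video generation usage shares quoted in the report (percent).
# Assumption: the three groups do not overlap and use the same measurement basis.
video_shares = {
    "Veo 3 (approximate)": 24.0,
    "Kling, Hailuo, Wan, Seedance combined": 52.6,
    "Runway": 12.3,
}

# Share covered by the groups named in the text.
accounted = round(sum(video_shares.values()), 1)

# Whatever remains belongs to models the report does not itemize.
remainder = round(100.0 - accounted, 1)

print(f"Accounted for: {accounted}%")
print(f"All other video models: {remainder}%")
```

On these assumptions, the named groups cover about 88.9% of video generation usage, leaving roughly 11.1% for all other models.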

ElevenLabs defends its lead in TTS as music generation emerges

The audio AI space is simultaneously maturing and expanding, with a clear leader in the established text-to-speech (TTS) domain and early competition forming in the nascent music generation sub-category.

In the TTS domain, ElevenLabs retained the strongest overall usage, fulfilling 74.4% of all subscriber requests. Its new, higher-performance v3 model is seeing swift adoption, growing to 29.7% of usage and signaling interest in breakthrough capabilities such as audio tags and multi-speaker support.
The competitive field for TTS is diversifying, with the emergence of new models like Hailuo Speech 02 joining established alternatives such as Cartesia, Unreal Speech, PlayAI, and Orpheus. These alternatives offer stronger performance in certain languages, distinctive voice options, effects, and varied performance-price profiles.
The recent introduction of music generation models, ElevenLabs Music and Lyria 2, has opened a new front in audio AI. In its first week, ElevenLabs Music took 10.1% of the total audio generation category’s usage, compared to 3.3% for Google’s Lyria 2, which launched in May 2025.

Conclusion

By sharing these insights from Poe’s unique vantage point, we aim to offer a clear, real-world view of the competitive dynamics shaping the future of AI. The growing diversity of models and providers underscores the value of a platform where users and creators can freely explore the best tools for their needs at an affordable price point.

We look forward to sharing further observations as new patterns emerge. To experience these models firsthand, you can explore our library of 200+ official text, image, video, and audio models by visiting poe.com today.
