Stability AI news, September 19
Stable Audio Open Small: Phones Can Generate Audio Too

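Stability AI and Arm have released Stable Audio Open Small, a 341-million-parameter text-to-audio model optimized for Arm CPUs that can quickly generate short audio clips on a smartphone. The model can generate up to 11 seconds of audio in under 8 seconds, and an Arm Learning Path guides developers through using it. Stable Audio Open Small is openly licensed for commercial and non-commercial use, marking a step toward generative audio creation on mobile devices.

🎵 **Lightweight and fast**: Stable Audio Open Small has 341 million parameters, far fewer than the 1.1 billion of its predecessor, Stable Audio Open. The leaner design lets it generate up to 11 seconds of stereo audio on a smartphone in under 8 seconds, which Stability AI believes makes it the fastest model of its kind on the market.

📱 **Mobile deployment and ease of use**: The model is optimized to run on Arm CPUs, the technology that powers 99% of smartphones worldwide, so it does not need heavy compute resources. This brings generative audio creation to mobile devices and makes it more convenient for developers and users.

⚖️ **Open license and broad applications**: Stable Audio Open Small is available for free under the Stability AI Community License for both commercial and non-commercial use. This lowers the barrier to entry and encourages developers to explore the technology for short audio samples, sound effects, drum loops, instrument riffs, and ambient textures.

📚 **Learning resources and community support**: The release is accompanied by an Arm Learning Path with hands-on guidance for deploying Stable Audio Open Small on Arm hardware. The research paper is available on arXiv, the model weights on Hugging Face, and the code on GitHub.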

Bringing generative audio creation to mobile phones

We’re open-sourcing Stable Audio Open Small in partnership with Arm, whose technology powers 99% of smartphones globally. Building on the industry-leading text-to-audio model Stable Audio Open, the new compact variant is smaller and faster, while preserving output quality and prompt adherence. 

This release follows our previously announced breakthrough that Stable Audio Open is now optimized to run on Arm CPUs, powered by Arm KleidiAI to enable AI-generated audio on a mobile phone. After demonstrating the technology in action at Mobile World Congress, Stability AI and Arm are now making the model weights available for anyone to access and deploy.

Technical advancements

To our knowledge, Stable Audio Open Small is the fastest stereo text-to-audio model on the market. You can read more about the technical advancements of the model in the research paper. Here are a few highlights:

Lightweight: Stable Audio Open Small has 341M parameters, compared to Stable Audio Open’s 1.1B parameters.

Fast: Stable Audio Open Small is optimized to generate audio on a mobile phone in less than 8 seconds. It is faster at generation and faster to fine-tune than Stable Audio Open.

Efficient: Leveraging Arm’s KleidiAI libraries, we designed this new model to run even more efficiently at the edge, so users get results faster and spend less on compute time. Because it runs entirely on Arm CPUs, Stable Audio Open Small is also accessible without heavy hardware requirements.
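
The size and speed figures can be turned into rough ratios. The sketch below is back-of-the-envelope arithmetic only. The parameter counts come from the highlights above, the 11-second clip length comes from the summary at the top of this page, and the 8-second figure is treated as an upper bound. The bytes-per-parameter values are generic assumptions about numeric precision, not published details of how the weights are stored or run on device.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the announcement.
SMALL_PARAMS = 341e6  # Stable Audio Open Small
FULL_PARAMS = 1.1e9   # Stable Audio Open

# Assumption: generic storage precisions, not published deployment details.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def size_ratio(small: float, full: float) -> float:
    """Parameter count of the small model as a fraction of the full model."""
    return small / full


def weight_megabytes(params: float, bytes_per_param: int) -> float:
    """Approximate raw weight size in MB (10**6 bytes), ignoring file overhead."""
    return params * bytes_per_param / 1e6


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / wall_seconds


if __name__ == "__main__":
    print(f"size ratio: {size_ratio(SMALL_PARAMS, FULL_PARAMS):.2f}")
    for name, nbytes in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_megabytes(SMALL_PARAMS, nbytes):.0f} MB")
    # 8 seconds is an upper bound on wall time, so this is a lower bound.
    print(f"real-time factor: at least {realtime_factor(11, 8):.3f}x")
```

Under these assumptions the small model has roughly a third of the parameters of Stable Audio Open, and generating an 11-second clip in under 8 seconds means it runs faster than real time.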

When to use the model

Like Stable Audio Open, Stable Audio Open Small is optimized for generating short audio samples, sound effects and production elements using text prompts. It is well suited for creating drum loops, foley, instrument riffs, and ambient textures. 

Its compact size and fast inference make it a perfect fit for on-device deployment on Arm-powered smartphones and edge devices, where real-time generation and responsiveness matter.

As AI-driven creative media workloads move to the edge, smaller models help align compute resources with task complexity. By using different model sizes, organizations can allocate workloads to the processors best suited to their use case, like generating short sound effects versus full-length songs.
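
One way to picture matching model size to task complexity is a small routing function. This is a hypothetical sketch, not anything Stability AI or Arm describes: the tier names, the `pick_tier` helper, and the deployment labels are invented for illustration, and the 11-second cutoff simply reuses the clip length quoted in the summary at the top of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    max_seconds: float  # longest clip this tier is asked to produce
    runs_on: str        # where the tier is deployed


# Hypothetical tiers, ordered from cheapest to most expensive.
TIERS = [
    ModelTier("small-on-device", max_seconds=11.0, runs_on="phone CPU"),
    ModelTier("large-hosted", max_seconds=float("inf"), runs_on="server"),
]


def pick_tier(requested_seconds: float, tiers=TIERS) -> ModelTier:
    """Return the cheapest tier able to cover the requested duration."""
    if requested_seconds <= 0:
        raise ValueError("requested_seconds must be positive")
    for tier in tiers:
        if requested_seconds <= tier.max_seconds:
            return tier
    raise ValueError("no tier covers the requested duration")


if __name__ == "__main__":
    for seconds in (3, 11, 180):
        tier = pick_tier(seconds)
        print(f"{seconds:>4}s -> {tier.name} ({tier.runs_on})")
```

A short sound effect stays on the phone, while a request that exceeds the small tier's limit falls through to the larger, hosted tier.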

Getting started

Stable Audio Open Small is now free for commercial and non-commercial use under the permissive Stability AI Community License. You can read the paper on arXiv, download the model weights on Hugging Face, and access the code on GitHub.

Visit the Arm Learning Path to walk through deploying Stable Audio Open Small on Arm hardware, and see the Arm Community Blog for a deep technical dive into how the model was optimized for on-device performance.

To stay updated on our progress, follow us on X, LinkedIn, and Instagram, and join our Discord community.
