The Rundown AI - Daily Picks, August 29
Microsoft Releases In-House AI Models, Reshaping Its AI Strategy

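Microsoft recently introduced MAI-Voice-1 and MAI-1-preview, its first AI models developed entirely in-house, marking a significant shift in its AI strategy. Until now, Microsoft has relied heavily on its partnership with OpenAI. MAI-Voice-1 is a speech generation model that can produce a minute of speech in under a second and is already integrated into Copilot Daily and Podcasts. MAI-1-preview is a text model that specializes in instruction following and everyday queries, and it was trained on far fewer GPUs than rival models. Benchmarks have not yet been published, but Microsoft AI CEO Mustafa Suleyman said MAI-1 is "up there with some of the best models in the world." The move signals that Microsoft intends to take more control of its own AI direction, and it adds a new variable to partnerships across the AI industry.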

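🤖 Microsoft launches in-house AI models: Microsoft released its first in-house AI models, MAI-Voice-1 and MAI-1-preview, a key step in moving its AI strategy from reliance on OpenAI toward in-house development. MAI-Voice-1 is an efficient speech generation model that produces high-quality speech quickly and is already used in Microsoft products. MAI-1-preview, a text model, performs well at instruction following and everyday question answering and was trained efficiently, which points to Microsoft taking a more active position in AI.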

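🗣️ OpenAI upgrades voice agent capabilities: OpenAI moved its Realtime API out of beta and introduced a new speech-to-speech model, gpt-realtime. The model can pick up nonverbal cues and switch languages within a fluent conversation, and it scores substantially higher on accuracy than its predecessor. It also supports image input and the Model Context Protocol (MCP), which lets voice agents connect to external data sources and tools without custom integrations and broadens where voice agents can be used.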

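✉️ AI-driven email support automation: The issue walks through an automated email support workflow built with Zapier's AI agents. The agent classifies incoming emails into categories such as spam, customer support, and feedback, routes them to team members according to preset rules, and drafts initial replies. Connected to tools like Gmail and Slack, and given FAQ links and documents as context, it turns tedious inbox handling into an organized process.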

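🌍 Cohere offers an enterprise translation solution: Cohere introduced Command A Translate, an enterprise translation model that outperforms rivals such as GPT-5, DeepSeek-V3, and Google Translate on several translation benchmarks. The model supports 23 major business languages and can be customized with industry-specific terminology to keep translations accurate. It can also be deployed on a company's own servers, which keeps sensitive data private and addresses a core concern for enterprises adopting AI translation.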

Good morning, AI enthusiasts. For years, Microsoft's AI strategy has been synonymous with OpenAI, but that narrative just got complicated.

The company's new MAI-Voice-1 and MAI-1-preview models mark its first homegrown AI, signaling a shift that could throw yet another wrench into the AI world’s most-watched partnership.

Reminder: Our next live workshop is today at 4 PM EST with The Rundown’s AI Educator, Nate Grahek — join and learn all the latest tips and tricks for getting the most out of ChatGPT. RSVP here.


In today’s AI rundown:

    Microsoft releases homegrown AI

    OpenAI’s gpt-realtime for voice agents

    Create an AI agent to handle email support

    Cohere’s SOTA enterprise translation model

    4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

MICROSOFT

🤖 Microsoft releases homegrown AI

The Rundown: Microsoft just introduced MAI-Voice-1 and MAI-1-preview, its first fully in-house AI models, after years of relying on OpenAI's technology in a turbulent partnership.

The details:

    MAI-Voice-1 is a speech generation model capable of generating a minute of speech in under a second, already integrated into Copilot Daily and Podcasts.

    MAI-1-preview is a text-based model trained on a fraction of the GPUs of rivals, specializing in instruction following and everyday queries.

    Microsoft AI CEO Mustafa Suleyman said MAI-1 is “up there with some of the best models in the world,” though benchmarks have yet to be publicly released.

    The text model is currently being tested on LM Arena and via API, with Microsoft saying it will roll out in “certain text use cases” in the coming weeks.

Why it matters: Microsoft's shift toward building in-house models introduces a new dynamic to its OpenAI partnership and positions it to better control its own AI destiny. Benchmarks and more real-world testing are still needed for a clearer picture, but the tech giant looks ready to pave its own path instead of being viewed as OpenAI’s sidekick.

TOGETHER WITH AUGMENT CODE

👋 Meet Auggie CLI

The Rundown: Augment Code is bringing the power of its AI coding agent and context engine right to your terminal with Auggie CLI, now generally available.

From standalone terminal sessions to every piece of your dev stack, with Auggie CLI, you can:

    Build features and debug issues

    Get instant feedback suggestions for your PRs and builds

    Triage customer issues and alerts from your observability stack

    Build with the AI coding platform that gets you, your team, and your code

Try Auggie CLI today.

OPENAI

🗣️ OpenAI’s gpt-realtime for voice agents

The Rundown: OpenAI moved its Realtime API out of beta, also introducing a new gpt-realtime speech-to-speech model and new developer tools like image input and Model Context Protocol server integrations.

The details:

    gpt-realtime features nuanced abilities like detecting nonverbal cues and switching languages while keeping a naturally flowing conversation.

    The model achieves 82.8% accuracy on audio reasoning benchmarks, a massive increase over the 65.6% score from its predecessor.

    OpenAI also added MCP support, allowing voice agents to connect with external data sources and tools without custom integrations.

    gpt-realtime can also handle image inputs like photos or screenshots, giving the voice agent the ability to reason on visuals alongside the conversation.
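
For a sense of scale, the benchmark jump quoted above can be worked out directly. This is a quick arithmetic sketch that uses only the two scores given here; the variable names are illustrative.

```python
# Audio reasoning accuracy scores quoted above, in percent.
previous_score = 65.6
gpt_realtime_score = 82.8

# Absolute gain in percentage points, and gain relative to the old score.
absolute_gain = round(gpt_realtime_score - previous_score, 1)
relative_gain = round((gpt_realtime_score - previous_score) / previous_score * 100, 1)

print(f"Absolute gain: {absolute_gain} percentage points")
print(f"Relative gain: {relative_gain}%")
```

In other words, the new score is roughly a quarter higher than the predecessor's, measured against the predecessor's own result.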

Why it matters: Mainstream adoption of voice agents feels inevitable. OpenAI’s upgraded conversational abilities, plus integrations like MCP and image understanding, give enterprises and devs more functionality to plug directly into customer support channels or customized voice applications.

AI TRAINING

✉️ Create an AI agent to handle email support

The Rundown: In this tutorial, you will learn how to build an AI agent that automatically triages incoming emails, tags team members in Slack, and drafts professional responses, turning your overwhelming inbox into an organized workflow.

Step-by-step:

    Go to Zapier Agents, click "New Agent", name it "Email Triage Assistant", and set it to run daily at 9 AM (batch processing saves Zapier calls)

    Click Copilot and paste: "Every day at 9 AM PST, retrieve all emails from the last 24 hours. Classify as: Spam, Auto-replies, PR/Marketing, Customer Support, Feedback, or General Inquiry"

    Add team tagging rules customized for your team members to funnel to specific departments or responsibilities

    Click "Add tools" and connect Gmail, Slack, and your FAQ URLs — grant full permissions for autonomous operation

    Test with your current inbox, verify categorization accuracy, then enable the daily schedule

Pro tip: Feed your agent FAQ URLs, Notion docs, and previous support threads in the instructions. The more context you provide, the better it handles edge cases and knows exactly who to loop in.
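
To make the triage logic in the steps above concrete, here is a minimal Python sketch of the classify-then-route idea. It is not Zapier's API: Zapier Agents is configured through prompts rather than code, and the agent classifies with a language model. The keyword rules, the `Email` class, and the Slack handles below are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical keyword rules standing in for the agent's classification.
# Order matters: the first matching category wins.
KEYWORD_RULES = [
    ("Auto-replies", ("out of office", "automatic reply", "auto-reply")),
    ("Spam", ("you have won", "claim your prize")),
    ("PR/Marketing", ("press release", "sponsorship", "partnership opportunity")),
    ("Customer Support", ("refund", "cannot log in", "not working")),
    ("Feedback", ("feedback", "suggestion", "feature request")),
]

# Hypothetical team tagging rules: category -> Slack handle to notify.
OWNERS = {
    "Customer Support": "@support-lead",
    "Feedback": "@product-lead",
    "PR/Marketing": "@marketing-lead",
}


@dataclass
class Email:
    sender: str
    subject: str
    body: str


def classify(email: Email) -> str:
    """Return the first category whose keywords appear in the email."""
    text = f"{email.subject} {email.body}".lower()
    for category, keywords in KEYWORD_RULES:
        if any(keyword in text for keyword in keywords):
            return category
    # Anything that matches no rule falls through to the catch-all category.
    return "General Inquiry"


def triage(emails: list[Email]) -> list[dict]:
    """Classify each email and attach the teammate to tag, if any."""
    results = []
    for email in emails:
        category = classify(email)
        results.append({
            "subject": email.subject,
            "category": category,
            "owner": OWNERS.get(category),  # None means nobody is tagged
        })
    return results


inbox = [
    Email("a@example.com", "Out of office", "I am away until Monday."),
    Email("b@example.com", "Refund request", "My order arrived damaged."),
    Email("c@example.com", "Hello", "Just saying hi."),
]
for row in triage(inbox):
    print(row)
```

A real agent would replace `classify` with a model call that has your FAQ and past threads as context, which is where the pro tip above pays off.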

PRESENTED BY STACK AI

🛠️ Your secure enterprise AI toolkit

The Rundown: Deploy 10 AI agents that actually drive ROI on StackAI—the secure enterprise AI toolkit trusted by finance, legal, ops, & IT teams who move 80% faster than the rest.

With StackAI’s toolkit, you’ll get:

    Drag and drop platform + ship as chatbots, forms, apps

    Built-in PII protections, guardrails, audit trails, SSO, and compliance

    Seamless integrations with 100+ tools you already use

Download 10+ ready-to-deploy AI agents now.

COHERE

🌍 Cohere’s SOTA enterprise translation model

The Rundown: Cohere introduced Command A Translate, a new enterprise model that claims top scores on key translation benchmarks while allowing for deep customization and secure, private deployment options.

The details:

    Command A Translate outperforms rivals like GPT-5, DeepSeek-V3, and Google Translate on key benchmarks across 23 major business languages.

    The model also features an optional ‘Deep Translation’ agentic workflow that double-checks complex and high-stakes content, boosting performance.

    Cohere offers customization for industry-specific terms, letting pharmaceutical companies teach their drug names or banks add their financial terminology.

    Companies can also install it on their own servers, keeping contracts, medical records, and confidential emails completely offline and secure.

Why it matters: Security has been one of the biggest issues for companies wanting to leverage AI tools, and global enterprises face a choice of uploading sensitive documents to the cloud or paying for time-consuming human translators. Cohere’s model gives businesses customizable translation in-house without data privacy risks.

QUICK HITS

🛠️ Trending AI Tools

    🎥 Google Vids - Create and edit videos with AI-powered tools

    🔊 MAI-Voice-1 - Microsoft’s new in-house voice generation model

    🗣️ gpt-realtime - OpenAI’s new advanced speech-to-speech model

    🥁 HunyuanVideo-Foley - Open-source model for professional-grade audio

📰 Everything else in AI today

Free Event: The Future of AI Agents in Coding with Guy Gur-Ari & Igor Ostrovsky, co-founders of Augment Code. Ask them anything today in r/webdev.*

xAI released Grok Code Fast 1, a new advanced coding model (previously launched under the codename sonic) that features very low costs for agentic coding tasks.

Anthropic published a new threat report revealing that cybercriminals exploited its Claude Code platform to automate a multi-million dollar extortion scheme.

OpenAI rolled out new features for its Codex software development tool, including an extension to run in IDEs, code reviews, CLI agentic upgrades, and more.

Krea introduced a waitlist for a new Realtime Video feature, enabling users to create and edit video using canvas painting, text, or live webcam feeds with consistency.

Tencent open-sourced HunyuanVideo-Foley, a new model that creates professional-grade soundtracks and effects with SOTA audio-visual synchronization.

TIME Magazine released its 2025 TIME100 AI list, featuring many of the top CEOs, researchers, and thought leaders across the industry.

*Sponsored Listing

COMMUNITY

🤝 Community AI workflows

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader Scott M. in Franklin, TN:

"My client was using a legacy version of QuickBooks Desktop, which lacked the feature for sending automated follow-up emails for overdue invoices. To address this, I built a custom automation using Zapier AI: the workflow logs into the accounting email, IDs invoices that are more than 60 days past due, and follows the invoice link to verify whether it's been paid. If payment has not been made, the automation sends a reminder email stating that the invoice is late and includes the original payment link. Every communication includes the accounting department, ensuring they stay informed about delinquent payments."
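
The core rule in Scott's workflow can be sketched in a few lines of Python. This is an illustration under assumptions, not his actual Zapier setup: the `Invoice` fields, the `paid` flag (which stands in for following the invoice link to check payment), and the email addresses are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

OVERDUE_DAYS = 60  # threshold taken from the workflow description


@dataclass
class Invoice:
    number: str
    customer_email: str
    due_date: date
    payment_link: str
    paid: bool  # assumption: stands in for checking the invoice link


def needs_reminder(invoice: Invoice, today: date) -> bool:
    """True when the invoice is unpaid and more than 60 days past due."""
    days_past_due = (today - invoice.due_date).days
    return not invoice.paid and days_past_due > OVERDUE_DAYS


def build_reminder(invoice: Invoice, accounting_email: str) -> dict:
    """Draft a reminder that includes the payment link and copies accounting."""
    return {
        "to": invoice.customer_email,
        "cc": accounting_email,
        "subject": f"Reminder: invoice {invoice.number} is past due",
        "body": (
            f"Our records show invoice {invoice.number} is late. "
            f"You can pay here: {invoice.payment_link}"
        ),
    }


def reminders_for(invoices, today, accounting_email):
    """Build one reminder per invoice that qualifies."""
    return [
        build_reminder(invoice, accounting_email)
        for invoice in invoices
        if needs_reminder(invoice, today)
    ]


invoices = [
    Invoice("1001", "client@example.com", date(2025, 6, 1), "https://example.com/pay/1001", False),
    Invoice("1002", "client@example.com", date(2025, 7, 15), "https://example.com/pay/1002", False),
    Invoice("1003", "client@example.com", date(2025, 5, 1), "https://example.com/pay/1003", True),
]
for reminder in reminders_for(invoices, date(2025, 8, 29), "accounting@example.com"):
    print(reminder["subject"])
```

Keeping the overdue check separate from the message builder makes it easy to change the 60-day threshold or the wording without touching the other half.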

How do you use AI? Tell us here.

See you soon,

Rowan, Joey, Zach, Shubham, and Jennifer—the humans behind The Rundown
