Fast coding model SWE-1.5 released, focusing on speed and performance

Windsurf has released SWE-1.5, its latest model optimized for software engineering: a frontier-size model with hundreds of billions of parameters that achieves near-SOTA coding performance. In partnership with Cerebras, the model sets a new standard for speed, reaching inference rates of up to 950 tok/s, 6x faster than Haiku 4.5 and 13x faster than Sonnet 4.5. Like Cursor's Composer-1, SWE-1.5 is currently only available through its own editor, with no separate API. Windsurf has also not disclosed details of the "leading open-source base model" it is built on. For training, SWE-1.5 used an advanced cluster of thousands of GB200 NVL72 chips, and its RL rollouts ran in high-fidelity environments supporting code execution and web browsing, enabled by the otterlink VM hypervisor, which can scale Devin to tens of thousands of concurrent machines.

🚀 Windsurf has released SWE-1.5, a frontier-size model optimized for software engineering with hundreds of billions of parameters, achieving near-SOTA coding performance.

💨 SWE-1.5 is served in partnership with Cerebras, which pushes inference speed to as high as 950 tok/s, well ahead of other well-known models, giving users a much faster coding experience.

💻 Like Cursor's Composer-1, SWE-1.5 is currently only available inside its own editor; no standalone API is offered yet.

💡 Windsurf used an advanced training setup, including a cluster of thousands of GB200 NVL72 chips, and leveraged its otterlink VM hypervisor to scale Devin environments, supporting high concurrency and realistic coding-environment simulation. This mirrors the sandboxed environments Cursor used for Composer-1's RL training, pointing to a trend of optimizing coding models with reinforcement learning and large-scale simulated environments.

Introducing SWE-1.5: Our Fast Agent Model (via) Here's the second fast coding model released by a coding agent IDE on the same day - the first was Composer-1 by Cursor. This time it's Windsurf releasing SWE-1.5:

Today we’re releasing SWE-1.5, the latest in our family of models optimized for software engineering. It is a frontier-size model with hundreds of billions of parameters that achieves near-SOTA coding performance. It also sets a new standard for speed: we partnered with Cerebras to serve it at up to 950 tok/s – 6x faster than Haiku 4.5 and 13x faster than Sonnet 4.5.
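
Taking the quoted multipliers at face value, you can back out rough implied throughputs for the comparison models. Here's a back-of-envelope sketch in Python; the 950 tok/s figure and the 6x/13x multipliers come from Windsurf's post, the derived numbers are just arithmetic and assume the multipliers were measured on the same workload:

```python
# Back-of-envelope: implied throughput of the comparison models,
# assuming Windsurf's 6x/13x multipliers were measured on the same workload.
swe_15_tps = 950                          # tok/s, quoted peak on Cerebras

implied_haiku_45_tps = swe_15_tps / 6     # ~158 tok/s
implied_sonnet_45_tps = swe_15_tps / 13   # ~73 tok/s

print(f"Haiku 4.5:  ~{implied_haiku_45_tps:.0f} tok/s")
print(f"Sonnet 4.5: ~{implied_sonnet_45_tps:.0f} tok/s")
```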

Like Composer-1 it's only available via their editor, no separate API yet. Also like Composer-1 they don't appear willing to share details of the "leading open-source base model" they based their new model on.

I asked it to generate an SVG of a pelican riding a bicycle and got this:

This one felt really fast. Partnering with Cerebras for inference is a very smart move.

They share a lot of details about their training support in the post:

SWE-1.5 is trained on our state-of-the-art cluster of thousands of GB200 NVL72 chips. We believe SWE-1.5 may be the first public production model trained on the new GB200 generation. [...]

Our RL rollouts require high-fidelity environments with code execution and even web browsing. To achieve this, we leveraged our VM hypervisor otterlink that allows us to scale Devin to tens of thousands of concurrent machines (learn more about blockdiff). This enabled us to smoothly support very high concurrency and ensure the training environment is aligned with our Devin production environments.

That's another similarity to Cursor's Composer-1! They talked about how they ran "hundreds of thousands of concurrent sandboxed coding environments in the cloud" in their description of their RL training as well.

This is a notable trend: if you want to build a really great agentic coding tool there's clearly a lot to be said for using reinforcement learning to fine-tune a model against your own custom set of tools using large numbers of sandboxed simulated coding environments as part of that process.
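
Neither Windsurf nor Cursor has published their training code, but the pattern both posts describe (many isolated coding environments, each running one agent episode whose reward comes from actually executing the code) maps onto a loop like the minimal sketch below. Every name in it (Sandbox, run_agent_episode, collect_rollouts) is hypothetical, and the reward is a stand-in for a real test run:

```python
import asyncio
import random

# Hypothetical sketch of the RL-rollouts-in-sandboxes pattern described above.
# Every name here is invented for illustration; it is not Windsurf's or
# Cursor's actual training code.

class Sandbox:
    """Stand-in for an isolated VM or container with a repo checked out."""

    async def __aenter__(self):
        await asyncio.sleep(0)   # pretend to boot a VM via a hypervisor API
        return self

    async def __aexit__(self, *exc):
        await asyncio.sleep(0)   # pretend to tear the VM down

    async def run_tests(self) -> float:
        # Reward signal: fraction of the task's tests that pass after the
        # agent's edits. Random here; a real setup would execute the suite.
        return random.random()


async def run_agent_episode(task: str, sandbox: Sandbox) -> float:
    # A real episode would loop: call the model, apply its tool calls
    # (edit files, run commands, browse docs) inside the sandbox, repeat.
    await asyncio.sleep(0)
    return await sandbox.run_tests()


async def collect_rollouts(tasks: list[str], max_concurrency: int = 1000):
    sem = asyncio.Semaphore(max_concurrency)   # cap concurrent sandboxes

    async def one(task: str):
        async with sem:
            async with Sandbox() as sandbox:
                return task, await run_agent_episode(task, sandbox)

    return await asyncio.gather(*(one(t) for t in tasks))


# rollouts = asyncio.run(collect_rollouts(["fix-issue-123", "add-feature-x"]))
# The resulting (trajectory, reward) pairs would then feed the policy update.
```

The interesting engineering is all in the Sandbox piece: booting environments fast enough, and in large enough numbers (Windsurf cites tens of thousands of concurrent machines), that reward computation never starves the trainer.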
