VentureBeat · November 10, 22:04
Baseten Launches Model Training Platform to Help Enterprises Reduce Dependence on Closed-Source AI

AI infrastructure company Baseten has made a major strategic pivot into model training. Its new Baseten Training platform is designed to help enterprises fine-tune open-source AI models without managing complex GPU clusters or doing cloud capacity planning. The move is a key step in Baseten's expansion from its core inference business to the full AI deployment lifecycle, responding to customers' pressing demand to reduce their dependence on closed-source model providers such as OpenAI. With multi-cloud GPU orchestration and sub-minute job scheduling, the platform offers enterprises a more flexible, more economical solution for training AI models while laying the groundwork for subsequent inference deployment.

🚀 Baseten Training aims to solve the challenges enterprises face when training AI models, in particular how to reduce their dependence on closed-source AI providers such as OpenAI. By supporting fine-tuning of open-source models, Baseten enables enterprises to build more cost-effective and controllable AI solutions, gaining greater autonomy and flexibility in their AI applications.

💡 The platform provides a simplified infrastructure layer: users need no deep knowledge of GPU cluster management, multi-node orchestration, cloud capacity planning, or other complex technical details. By automating these processes, Baseten Training lets enterprises focus on the models themselves and on business applications, significantly lowering the barriers to entry and the operating costs of AI model development.

🔗 Baseten Training's core strengths are its multi-cloud GPU orchestration and sub-minute job scheduling. By dynamically provisioning GPU resources across multiple cloud providers, Baseten can cut costs for customers while avoiding the capacity constraints and long-term contracts typical of traditional cloud providers, giving model training unprecedented flexibility.

🔄 Baseten's strategy of integrating model training and inference underscores how tightly the two are linked. By offering an end-to-end solution from training through deployment, Baseten can optimize the entire AI lifecycle, ensuring that trained models are deployed and run efficiently and reliably, and thereby delivering greater value to customers.

🔧 Baseten's policy is that customers fully own their model weights, in sharp contrast to some competitors. This open stance is built on confidence in the superiority of its own inference performance: the aim is to retain customers through technical excellence rather than contractual restrictions, encouraging enterprises to pursue deeper customization and innovation in AI.

Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean themselves off OpenAI and other closed-source AI providers.

The San Francisco-based company announced Thursday the general availability of Baseten Training, an infrastructure platform designed to help companies fine-tune open-source AI models without the operational headaches of managing GPU clusters, multi-node orchestration, or cloud capacity planning. The move is a calculated expansion beyond Baseten's core inference business, driven by what CEO Amir Haghighat describes as relentless customer demand and a strategic imperative to capture the full lifecycle of AI deployment.

"We had a captive audience of customers who kept coming to us saying, 'Hey, I hate this problem,'" Haghighat said in an interview. "One of them told me, 'Look, I bought a bunch of H100s from a cloud provider. I have to SSH in on Friday, run my fine-tuning job, then check on Monday to see if it worked. Sometimes I realize it just hasn't been working all along.'"

The launch comes at a critical inflection point in enterprise AI adoption. As open-source models from Meta, Alibaba, and others increasingly rival proprietary systems in performance, companies face mounting pressure to reduce their reliance on expensive API calls to services like OpenAI's GPT-5 or Anthropic's Claude. But the path from off-the-shelf open-source model to production-ready custom AI remains treacherous, requiring specialized expertise in machine learning operations, infrastructure management, and performance optimization.

Baseten's answer: provide the infrastructure rails while letting companies retain full control over their training code, data, and model weights. It's a deliberately low-level approach born from hard-won lessons.

How a failed product taught Baseten what AI training infrastructure really needs

This isn't Baseten's first foray into training. The company's previous attempt, a product called Blueprints launched roughly two and a half years ago, failed spectacularly — a failure Haghighat now embraces as instructive.

"We had created the abstraction layer a little too high," he explained. "We were trying to create a magical experience, where as a user, you come in and programmatically choose a base model, choose your data and some hyperparameters, and magically out comes a model."

The problem? Users didn't have the intuition to make the right choices about base models, data quality, or hyperparameters. When their models underperformed, they blamed the product. Baseten found itself in the consulting business rather than the infrastructure business, helping customers debug everything from dataset deduplication to model selection.

"We became consultants," Haghighat said. "And that's not what we had set out to do."

Baseten killed Blueprints and refocused entirely on inference, vowing to "earn the right" to expand again. That moment arrived earlier this year, driven by two market realities: the vast majority of Baseten's inference revenue comes from custom models that customers train elsewhere, and competing training platforms were using restrictive terms of service to lock customers into their inference products.

"Multiple companies who were building fine-tuning products had in their terms of service that you as a customer cannot take the weights of the fine-tuned model with you somewhere else," Haghighat said. "I understand why from their perspective — I still don't think there is a big company to be made purely on just training or fine-tuning. The sticky part is in inference, the valuable part where value is unlocked is in inference, and ultimately the revenue is in inference."

Baseten took the opposite approach: customers own their weights and can download them at will. The bet is that superior inference performance will keep them on the platform anyway.

Multi-cloud GPU orchestration and sub-minute scheduling set Baseten apart from hyperscalers

The new Baseten Training product operates at what Haghighat calls "the infrastructure layer" — lower-level than the failed Blueprints experiment, but with opinionated tooling around reliability, observability, and integration with Baseten's inference stack.

Key technical capabilities include multi-node training support across clusters of NVIDIA H100 or B200 GPUs, automated checkpointing to protect against node failures, sub-minute job scheduling, and integration with Baseten's proprietary Multi-Cloud Management (MCM) system. That last piece is critical: MCM allows Baseten to dynamically provision GPU capacity across multiple cloud providers and regions, passing cost savings to customers while avoiding the capacity constraints and multi-year contracts typical of hyperscaler deals.

"With hyperscalers, you don't get to say, 'Hey, give me three or four B200 nodes while my job is running, and then take it back from me and don't charge me for it,'" Haghighat said. "They say, 'No, you need to sign a three-year contract.' We don't do that."

Baseten's approach mirrors broader trends in cloud infrastructure, where abstraction layers increasingly allow workloads to move fluidly across providers. When AWS experienced a major outage several weeks ago, Baseten's inference services remained operational by automatically routing traffic to other cloud providers — a capability now extended to training workloads.
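
The scheduling pattern behind that kind of failover can be sketched generically. The snippet below illustrates cheapest-available-capacity selection across providers; the provider names, prices, and function are hypothetical illustrations of the pattern, not Baseten's MCM internals:

```python
from typing import Dict, Optional

def pick_provider(capacity: Dict[str, int], price: Dict[str, float],
                  gpus_needed: int) -> Optional[str]:
    """Choose the cheapest cloud with enough free GPUs right now; return
    None to queue the job until capacity opens up somewhere."""
    candidates = [p for p, free in capacity.items() if free >= gpus_needed]
    return min(candidates, key=lambda p: price[p]) if candidates else None

# Hypothetical snapshot: cloud_a is out of capacity, so the job lands on
# the cheaper of the two remaining clouds instead of waiting on a contract.
choice = pick_provider(
    capacity={"cloud_a": 0, "cloud_b": 16, "cloud_c": 8},
    price={"cloud_a": 2.10, "cloud_b": 2.90, "cloud_c": 2.40},
    gpus_needed=8,
)
print(choice)  # -> "cloud_c"
```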

The technical differentiation extends to Baseten's observability tooling, which provides per-GPU metrics for multi-node jobs, granular checkpoint tracking, and a refreshed UI that surfaces infrastructure-level events. The company also introduced an "ML Cookbook" of open-source training recipes for popular models like Gemma, GPT OSS, and Qwen, designed to help users reach "training success" faster.
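
Granular checkpoint tracking presupposes a training loop that checkpoints regularly in the first place. For reference, here is a minimal sketch of such a fault-tolerant loop in PyTorch; the function, its `loss_fn` argument, and the `ckpt.pt` path are illustrative assumptions, not Baseten's implementation:

```python
import os
import torch

def train_with_checkpoints(model, optimizer, batches, loss_fn,
                           ckpt_path="ckpt.pt", every=500):
    """Persist model/optimizer state periodically so a job rescheduled after
    a node failure resumes from the last checkpoint instead of restarting."""
    step = 0
    if os.path.exists(ckpt_path):                 # resume after crash/preemption
        state = torch.load(ckpt_path)
        model.load_state_dict(state["model"])
        optimizer.load_state_dict(state["optimizer"])
        step = state["step"]
    for batch in batches[step:]:                  # skip work already done
        loss = loss_fn(model, batch)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        step += 1
        if step % every == 0:
            torch.save({"model": model.state_dict(),
                        "optimizer": optimizer.state_dict(),
                        "step": step}, ckpt_path)
```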

Early adopters report 84% cost savings and 50% latency improvements with custom models

Two early customers illustrate the market Baseten is targeting: AI-native companies building specialized vertical solutions that require custom models.

Oxen AI, a platform focused on dataset management and model fine-tuning, exemplifies the partnership model Baseten envisions. CEO Greg Schoeninger articulated a common strategic calculus, telling VentureBeat: "Whenever I've seen a platform try to do both hardware and software, they usually fail at one of them. That's why partnering with Baseten to handle infrastructure was the obvious choice."

Oxen built its customer experience entirely on top of Baseten's infrastructure, using the Baseten CLI to programmatically orchestrate training jobs. The system automatically provisions and deprovisions GPUs, fully concealing Baseten's interface behind Oxen's own. For one Oxen customer, AlliumAI — a startup bringing structure to messy retail data — the integration delivered 84% cost savings compared to previous approaches, reducing total inference costs from $46,800 to $7,530.

"Training custom LoRAs has always been one of the most effective ways to leverage open-source models, but it often came with infrastructure headaches," said Daniel Demillard, CEO of AlliumAI. "With Oxen and Baseten, that complexity disappears. We can train and deploy models at massive scale without ever worrying about CUDA, which GPU to choose, or shutting down servers after training."

Parsed, another early customer, tackles a different pain point: helping enterprises reduce dependence on OpenAI by creating specialized models that outperform generalist LLMs on domain-specific tasks. The company works in mission-critical sectors like healthcare, finance, and legal services, where model performance and reliability aren't negotiable.

"Prior to switching to Baseten, we were seeing repetitive and degraded performance on our fine-tuned models due to bugs with our previous training provider," said Charles O'Neill, Parsed's co-founder and chief science officer. "On top of that, we were struggling to easily download and checkpoint weights after training runs."

With Baseten, Parsed achieved 50% lower end-to-end latency for transcription use cases, spun up HIPAA-compliant EU deployments for testing within 48 hours, and kicked off more than 500 training jobs. The company also leveraged Baseten's modified vLLM inference framework and speculative decoding — a technique that generates draft tokens to accelerate language model output — to cut latency in half for custom models.

"Fast models matter," O'Neill said. "But fast models that get better over time matter more. A model that's 2x faster but static loses to one that's slightly slower but improving 10% monthly. Baseten gives us both — the performance edge today and the infrastructure for continuous improvement."

Why training and inference are more interconnected than the industry realizes

The Parsed example illuminates a deeper strategic rationale for Baseten's training expansion: the boundary between training and inference is blurrier than conventional wisdom suggests.

Baseten's model performance team uses the training platform extensively to create "draft models" for speculative decoding, a cutting-edge technique that can dramatically accelerate inference. The company recently announced it achieved 650+ tokens per second on OpenAI's GPT OSS 120B model — a 60% improvement over its launch performance — using EAGLE-3 speculative decoding, which requires training specialized small models to work alongside larger target models.
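
For readers unfamiliar with the technique, a minimal sketch of the greedy form of speculative decoding is below. EAGLE-3 itself trains dedicated draft heads and verifies proposals probabilistically, so treat this only as the core accept-or-correct loop, with toy callables standing in for both models:

```python
from typing import Callable, List

Token = int

def speculative_decode(
    target_next: Callable[[List[Token]], Token],  # expensive model (argmax)
    draft_next: Callable[[List[Token]], Token],   # small, fast draft model
    prompt: List[Token],
    max_new: int = 32,
    k: int = 4,                                   # tokens drafted per round
) -> List[Token]:
    """Greedy speculative decoding: the draft proposes k tokens; the target
    verifies them (a single batched forward pass in practice), keeping the
    agreed prefix and replacing the first disagreement with its own token."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        ctx, draft = list(out), []
        for _ in range(k):                        # 1) cheap drafting
            tok = draft_next(ctx)
            draft.append(tok)
            ctx.append(tok)
        for i in range(k):                        # 2) verification
            expected = target_next(out + draft[:i])
            if expected != draft[i]:
                out += draft[:i] + [expected]     # agreed prefix + correction
                break
        else:
            out += draft                          # all k drafted tokens accepted
    return out[: len(prompt) + max_new]

# Toy check: identical models mean every draft is accepted.
step = lambda ctx: (ctx[-1] * 31 + 7) % 50
print(speculative_decode(step, step, prompt=[1, 2, 3], max_new=8))
```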

"Ultimately, inference and training plug in more ways than one might think," Haghighat said. "When you do speculative decoding in inference, you need to train the draft model. Our model performance team is a big customer of the training product to train these EAGLE heads on a continuous basis."

This technical interdependence reinforces Baseten's thesis that owning both training and inference creates defensible value. The company can optimize the entire lifecycle: a model trained on Baseten can be deployed with a single click to inference endpoints pre-optimized for that architecture, with deployment-from-checkpoint support for chat completion and audio transcription workloads.

The approach contrasts sharply with vertically integrated competitors like Replicate or Modal, which also offer training and inference but with different architectural tradeoffs. Baseten's bet is on lower-level infrastructure flexibility and performance optimization, particularly for companies running custom models at scale.

As open-source AI models improve, enterprises see fine-tuning as the path away from OpenAI dependency

Underpinning Baseten's entire strategy is a conviction about the trajectory of open-source AI models — namely, that they're getting good enough, fast enough, to unlock massive enterprise adoption through fine-tuning.

"Both closed and open-source models are getting better and better in terms of quality," Haghighat said. "We don't even need open source to surpass closed models, because as both of them are getting better, they unlock all these invisible lines of usefulness for different use cases."

He pointed to the proliferation of reinforcement learning and supervised fine-tuning techniques that allow companies to take an open-source model and make it "as good as the closed model, not at everything, but at this narrow band of capability that they want."

That trend is already visible in Baseten's Model APIs business, launched alongside Training earlier this year to provide production-grade access to open-source models. The company was the first provider to offer access to DeepSeek V3 and R1, and has since added models like Llama 4 and Qwen 3, optimized for performance and reliability. Model APIs serves as a top-of-funnel product: companies start with off-the-shelf open-source models, realize they need customization, move to Training for fine-tuning, and ultimately deploy on Baseten's Dedicated Deployments infrastructure.

Yet Haghighat acknowledged the market remains "fuzzy" around which training techniques will dominate. Baseten is hedging by staying close to the bleeding edge through its Forward Deployed Engineering team, which works hands-on with select customers on reinforcement learning, supervised fine-tuning, and other advanced techniques.

"As we do that, we will see patterns emerge about what a productized training product can look like that really addresses the user's needs without them having to learn too much about how RL works," he said. "Are we there as an industry? I would say not quite. I see some attempts at that, but they all seem like almost falling to the same trap that Blueprints fell into—a bit of a walled garden that ties the hands of AI folks behind their back."

The roadmap ahead includes potential abstractions for common training patterns, expansion into image, audio, and video fine-tuning, and deeper integration of advanced techniques like prefill-decode disaggregation, which separates the initial processing of prompts from token generation to improve efficiency.
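
On that last technique: prefill (reading the prompt) is compute-bound while decode (emitting tokens one by one) is memory-bandwidth-bound, so separating them lets each phase run on a pool of hardware sized for its bottleneck. A toy sketch of the split, with a stand-in for the model and its attention cache:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class KVCache:
    """Stand-in for the attention key/value state built while reading a prompt."""
    tokens: List[int]

def prefill(prompt: List[int]) -> KVCache:
    # Compute-bound phase: the whole prompt is processed in one parallel pass.
    return KVCache(tokens=list(prompt))

def decode_step(cache: KVCache) -> int:
    # Bandwidth-bound phase: generate one token at a time, extending the cache.
    nxt = (sum(cache.tokens) % 100) + 1           # toy "model"
    cache.tokens.append(nxt)
    return nxt

# Disaggregated serving: run prefill on a compute-heavy GPU pool, ship the
# cache, then run decode steps on a pool sized for memory bandwidth.
cache = prefill([3, 1, 4, 1, 5])
generated = [decode_step(cache) for _ in range(8)]
print(generated)
```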

Baseten faces crowded field but bets developer experience and performance will win enterprise customers

Baseten enters an increasingly crowded market for AI infrastructure. Hyperscalers like AWS, Google Cloud, and Microsoft Azure offer GPU compute for training, while specialized providers like Lambda Labs, CoreWeave, and Together AI compete on price, performance, or ease of use. Then there are vertically integrated platforms like Hugging Face, Replicate, and Modal that bundle training, inference, and model hosting.

Baseten's differentiation rests on three pillars: its MCM system for multi-cloud capacity management, deep performance optimization expertise built from its inference business, and a developer experience tailored for production deployments rather than experimentation.

The company's recent $150 million Series D and $2.15 billion valuation provide runway to invest in both products simultaneously. Major customers include Descript, which uses Baseten for transcription workloads; Decagon, which runs customer service AI; and Sourcegraph, which powers coding assistants. All three operate in domains where model customization and performance are competitive advantages.

Timing may be Baseten's biggest asset. The confluence of improving open-source models, enterprise discomfort with dependence on proprietary AI providers, and growing sophistication around fine-tuning techniques creates what Haghighat sees as a sustainable market shift.

"There is a lot of use cases for which closed models have gotten there and open ones have not," he said. "Where I'm seeing in the market is people using different training techniques — more recently, a lot of reinforcement learning and SFT — to be able to get this open model to be as good as the closed model, not at everything, but at this narrow band of capability that they want. That's very palpable in the market."

For enterprises navigating the complex transition from closed to open AI models, Baseten's positioning offers a clear value proposition: infrastructure that handles the messy middle of fine-tuning while optimizing for the ultimate goal of performant, reliable, cost-effective inference at scale. The company's insistence that customers own their model weights — a stark contrast to competitors using training as a lock-in mechanism — reflects confidence that technical excellence, not contractual restrictions, will drive retention.

Whether Baseten can execute on this vision depends on navigating tensions inherent in its strategy: staying at the infrastructure layer without becoming consultants, providing power and flexibility without overwhelming users with complexity, and building abstractions at exactly the right level as the market matures. The company's willingness to kill Blueprints when it failed suggests a pragmatism that could prove decisive in a market where many infrastructure providers over-promise and under-deliver.

"Through and through, we're an inference company," Haghighat emphasized. "The reason that we did training is at the service of inference."

That clarity of purpose — treating training as a means to an end rather than an end in itself — may be Baseten's most important strategic asset. As AI deployment matures from experimentation to production, the companies that solve the full stack stand to capture outsized value. But only if they avoid the trap of technology in search of a problem.

At least Baseten's customers no longer have to SSH into boxes on Friday and pray their training jobs complete by Monday. In the infrastructure business, sometimes the best innovation is simply making the painful parts disappear.
