MarkTechPost@AI · October 3, 12:55
Thinking Machines Releases Tinker: A Low-Level API That Simplifies Distributed LLM Fine-Tuning

Thinking Machines has introduced Tinker, a Python API that lets researchers and engineers write training loops locally and execute them on managed distributed GPU clusters. Tinker focuses on giving full control over data, objectives, and optimization steps while handling scheduling, fault tolerance, and multi-node orchestration. The platform supports LoRA fine-tuning and ships ready-made training loops and recipes. Tinker is currently in private beta with a free trial and will move to usage-based pricing. It aims to give users a flexible, controllable way to fine-tune a range of open-weight models while abstracting away the complexity of the underlying distributed compute.

✨ The Tinker API exposes low-level primitives (forward_backward, optim_step, save_state, sample) rather than a high-level train() wrapper. This gives users full control over gradient computation, optimizer stepping, checkpointing, and evaluation/inference, while multi-node orchestration and fault tolerance are handed off to the platform.

🚀 The platform favors LoRA (Low-Rank Adaptation) over full fine-tuning. Tinker's own analysis argues that, configured properly, LoRA can match full fine-tuning on many practical workloads (reinforcement learning in particular), which helps cut cost and shorten iteration cycles; see the LoRA sketch after this list.

📚 The Tinker Cookbook ships a rich set of reference training loops and post-training recipes covering supervised learning, reinforcement learning (including RLHF), multi-agent setups, and more. These ready-made templates reduce boilerplate and give users approachable starting points for complex training tasks.

🔧 Tinker supports a broad range of open-weight models, including the Llama and Qwen families and even large mixture-of-experts (MoE) models. Trained adapter weights can be exported for use outside Tinker, offering substantial flexibility and interoperability.

🛡️ Although Tinker abstracts away the complexity of distributed compute, its design keeps users in control of the algorithm. Users can customize loss functions, RLHF pipelines, and data handling, lowering the barrier to experimenting and iterating relative to closed systems.
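To make the LoRA point above concrete: LoRA freezes the pretrained weight matrix W and learns a low-rank update BA, so the adapted layer computes Wx + (alpha/r)·BAx and only the small A and B matrices receive gradients. A minimal PyTorch sketch, illustrative only and not Tinker's implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W + (alpha/r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scale = alpha / r
        # A projects down to rank r, B projects back up; B starts at zero,
        # so training begins exactly at the pretrained behavior.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because the base weights stay frozen, many adapter runs can share one pool of capacity for the same base model, which is the utilization argument the article makes below.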

Thinking Machines has released Tinker, a Python API that lets researchers and engineers write training loops locally while the platform executes them on managed distributed GPU clusters. The pitch is narrow and technical: keep full control of data, objectives, and optimization steps; hand off scheduling, fault tolerance, and multi-node orchestration. The service is in private beta with a waitlist and starts free, moving to usage-based pricing “in the coming weeks.”

Alright, but what is it, exactly?

Tinker exposes low-level primitives—not high-level “train()” wrappers. Core calls include forward_backward, optim_step, save_state, and sample, giving users direct control over gradient computation, optimizer stepping, checkpointing, and evaluation/inference inside custom loops. A typical workflow: instantiate a LoRA training client against a base model (e.g., Llama-3.2-1B), iterate forward_backward/optim_step, persist state, then obtain a sampling client to evaluate or export weights.
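Based on that description, here is a hedged sketch of what such a loop might look like. The import name, client constructor, and exact method signatures are assumptions inferred from the primitives the article names, not confirmed API:

```python
# Hedged sketch of a Tinker-style training loop. The constructor and
# method signatures below are assumptions based on the article's
# description of forward_backward / optim_step / save_state / sample;
# consult https://thinkingmachines.ai/tinker/ for the actual API.
import tinker  # private-beta SDK (assumed import name)

service = tinker.ServiceClient()
training = service.create_lora_training_client(
    base_model="meta-llama/Llama-3.2-1B",  # base model named in the article
)

batches = load_my_batches()  # user-supplied data pipeline (placeholder)
for step, batch in enumerate(batches):
    training.forward_backward(batch, loss_fn="cross_entropy")  # gradients computed on the cluster
    training.optim_step()                                      # apply the optimizer update
    if step % 100 == 0:
        training.save_state()                                  # checkpoint adapter/optimizer state

# Obtain a sampling client to evaluate or export the tuned adapter.
sampler = training.save_weights_and_get_sampling_client(name="my-adapter")
print(sampler.sample(prompt="Hello, Tinker!", max_tokens=64))
```

The division of labor is the point: the loop body, loss choice, and checkpoint cadence stay in user code, while each call executes on the managed cluster.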

https://thinkingmachines.ai/tinker/


What runs on it?

The Thinking Machines team positions Tinker as a managed post-training platform for open-weight models, from small LLMs up to large mixture-of-experts systems such as the supported Qwen3-235B-A22B. Switching models is intentionally minimal: change a string identifier and rerun. Under the hood, runs are scheduled on Thinking Machines’ internal clusters; the LoRA approach lets many runs share compute pools, keeping utilization high.
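Under that design, moving from a small dense model to a large MoE is meant to be a one-line change, reusing the assumed service client from the sketch above:

```python
# Same hypothetical client as in the earlier sketch; only the base-model
# string changes, and the platform schedules onto appropriate hardware.
small = service.create_lora_training_client(base_model="meta-llama/Llama-3.2-1B")
large = service.create_lora_training_client(base_model="Qwen/Qwen3-235B-A22B")
```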


Tinker Cookbook: Reference Training Loops and Post-Training Recipes

To reduce boilerplate while keeping the core API lean, the team published the Tinker Cookbook (Apache-2.0). It contains ready-to-use reference loops for supervised learning and reinforcement learning, plus worked examples for RLHF (three-stage SFT → reward modeling → policy RL), math-reasoning rewards, tool-use / retrieval-augmented tasks, prompt distillation, and multi-agent setups. The repo also ships utilities for LoRA hyperparameter calculation and integrations for evaluation (e.g., InspectAI).
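As a flavor of what those reference loops compose, here is a hedged REINFORCE-style sketch built from the same four primitives, reusing the training and sampler clients from the earlier sketch. The reward function is a toy stand-in and make_training_batch is a hypothetical helper, not Cookbook API; the real recipes are considerably more elaborate:

```python
# Reward-weighted update sketch in the spirit of the Cookbook's RL loops.
# `make_training_batch` is a hypothetical helper, not Cookbook API.
def reward_fn(completion: str) -> float:
    # Toy verifier echoing the math-reasoning rewards: score completions
    # that end with a boxed final answer.
    return 1.0 if "\\boxed{" in completion else 0.0

prompts = ["Compute 17 * 24.", "What is the derivative of x**2?"]
completions = [sampler.sample(prompt=p, max_tokens=256) for p in prompts]
rewards = [reward_fn(c) for c in completions]

# Weight each completion's log-likelihood objective by its reward and
# take one optimizer step on the LoRA adapter.
batch = make_training_batch(prompts, completions, weights=rewards)
training.forward_backward(batch, loss_fn="cross_entropy")
training.optim_step()
```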

Who’s already using it?

Early users include groups at Princeton (Gödel prover team), Stanford (Rotskoff Chemistry), UC Berkeley (SkyRL, async off-policy multi-agent/tool-use RL), and Redwood Research (RL on Qwen3-32B for control tasks).

Tinker is in private beta as of now, with waitlist sign-up. The service is free to start, and usage-based pricing is planned shortly; organizations are asked to contact the team directly for onboarding.

My thoughts/comments

I like that Tinker exposes low-level primitives (forward_backward, optim_step, save_state, sample) instead of a monolithic train()—it keeps objective design, reward shaping, and evaluation in my control while offloading multi-node orchestration to their managed clusters. The LoRA-first posture is pragmatic for cost and turnaround, and their own analysis argues LoRA can match full fine-tuning when configured correctly, but I’d still want transparent logs, deterministic seeds, and per-step telemetry to verify reproducibility and drift. The Cookbook’s RLHF and SL reference loops are useful starting points, yet I’ll judge the platform on throughput stability, checkpoint portability, and guardrails for data governance (PII handling, audit trails) during real workloads.

Overall I prefer Tinker’s open, flexible API: it lets me customize open-weight LLMs via explicit training-loop primitives while the service handles distributed execution. Compared with closed systems, this preserves algorithmic control (losses, RLHF workflows, data handling) and lowers the barrier for new practitioners to experiment and iterate.


Check out the technical details and sign up for the waitlist at thinkingmachines.ai/tinker. If you’re a university or organization looking for wide-scale access, contact tinker@thinkingmachines.ai.


