MarkTechPost@AI Oct 04, 14:05
A Unified Code Regression Model Predicts Performance

Researchers from Cornell University and Google propose a unified model called the Regression Language Model (RLM), which predicts numeric outcomes directly from code strings, covering GPU kernel latency, program memory usage, and neural network accuracy and latency, without hand-designed features. The model uses a 300M-parameter encoder-decoder initialized from T5-Gemma and emits constrained numbers through a single text-to-number decoder, achieving strong rank correlations across heterogeneous tasks and languages.

💡 **Unified code-to-metric regression:** The newly proposed Regression Language Model (RLM) predicts performance metrics directly from code text (e.g., Python, C/C++, ONNX graphs), including peak memory usage, Triton GPU kernel latency, and neural network accuracy and hardware-specific latency. Its novelty lies in requiring no hand-extracted features, graph encoders, or zero-cost proxies: code is treated as plain-text input, and numeric outputs are decoded directly.

🚀 **Strong predictive performance:** The model achieves excellent results across multiple benchmarks. On APPS LeetCode memory prediction, the Spearman rank correlation (ρ) reaches 0.93; on Triton GPU kernel latency prediction, ρ is about 0.52; across 17 languages in CodeNet, the average ρ exceeds 0.5; and on five classic neural architecture search (NAS) spaces, the Kendall τ coefficient is about 0.46. In many cases these results match or surpass graph-based predictors.
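
These benchmarks score predictions by rank correlation rather than absolute error, since a predictor that orders candidates correctly is already useful for selection. As an illustration only (not the paper's evaluation code), here is a minimal pure-Python Spearman ρ for tie-free data, using the simplified formula ρ = 1 - 6Σd²/(n(n²-1)):

```python
# Illustrative only: a minimal pure-Python Spearman rank correlation,
# the kind of metric used to score predicted vs. measured values.
# Assumes no ties, so the simplified closed-form formula applies.

def spearman_rho(xs, ys):
    """Spearman rank correlation for tie-free paired samples."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Perfectly monotone predictions give rho = 1.0 even when the
# absolute values are far off -- ordering is all that counts.
predicted_latency = [1.2, 3.4, 2.1, 5.0]      # hypothetical model outputs
measured_latency = [10.0, 31.0, 18.0, 52.0]   # hypothetical ground truth
print(spearman_rho(predicted_latency, measured_latency))  # -> 1.0
```

This is why a ρ of 0.93 on memory prediction is actionable for ranking candidate programs even if the raw byte counts are imperfect.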

🔗 **Multi-objective decoding and Pareto optimization:** The RLM decoder is autoregressive, so the model can condition later metric predictions (e.g., per-device latency) on earlier ones (e.g., accuracy). This lets the model capture real performance trade-offs and optimize along the Pareto frontier, which is essential for multi-objective optimization.
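
The conditioning mechanism can be sketched as follows. This toy is not the regress-lm API; the "model" is a stub returning canned strings, and the metric names are hypothetical. The point is only that each decoded metric is appended to the context, so later metrics can depend on earlier ones:

```python
# Toy sketch (not the regress-lm API): multi-metric autoregressive
# decoding, where the latency prediction is conditioned on the
# already-decoded accuracy. The "model" is a stub with canned outputs.

def fake_decode_metric(context: str, metric: str) -> str:
    """Stand-in for one constrained decoding pass of the model."""
    canned = {
        "accuracy": "0.92",
        # A real model would shift its latency estimate based on the
        # accuracy already present in `context`; the stub only shows
        # that the context grows between passes.
        "latency_ms": "3.4" if "accuracy=0.92" in context else "9.9",
    }
    return canned[metric]

def predict_metrics(code: str, metrics):
    context = code
    out = {}
    for m in metrics:
        value = fake_decode_metric(context, m)
        out[m] = float(value)
        context += f" {m}={value}"   # later metrics see earlier ones
    return out

preds = predict_metrics("def kernel(x): ...", ["accuracy", "latency_ms"])
print(preds)  # {'accuracy': 0.92, 'latency_ms': 3.4}
```

Because the joint distribution over metrics is modeled sequentially rather than independently, correlated trade-offs (high accuracy tending to cost latency) can be expressed directly.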

🛠️ **A standardized stack that lowers maintenance cost:** Traditional performance-prediction methods typically rely on task-specific features, syntax trees, or graph neural network (GNN) encoders, which become brittle when facing new operators or languages. By recasting regression as a text-to-number generation task, the RLM standardizes the whole pipeline: inputs are tokenized as plain text, and outputs are generated by decoding numbers digit by digit. This substantially lowers maintenance cost and improves transfer to new tasks via fine-tuning.

Researchers from Cornell and Google introduce a unified Regression Language Model (RLM) that predicts numeric outcomes directly from code strings—covering GPU kernel latency, program memory usage, and even neural network accuracy and latency—without hand-engineered features. A 300M-parameter encoder–decoder initialized from T5-Gemma achieves strong rank correlations across heterogeneous tasks and languages, using a single text-to-number decoder that emits digits with constrained decoding.

What exactly is new?

https://arxiv.org/abs/2509.26476

Why is this important?

Performance prediction pipelines in compilers, GPU kernel selection, and NAS typically rely on bespoke features, syntax trees, or GNN encoders that are brittle to new ops/languages. Treating regression as next-token prediction over numbers standardizes the stack: tokenize inputs as plain text (source code, Triton IR, ONNX), then decode calibrated numeric strings digit-by-digit with constrained sampling. This reduces maintenance cost and improves transfer to new tasks via fine-tuning.

Data and benchmarks

How does it work?

Stats that matter

Key Takeaways

    Unified code-to-metric regression works. A single ~300M-parameter T5Gemma-initialized model ("RLM") predicts: (a) memory from high-level code, (b) Triton GPU kernel latency, and (c) model accuracy + device latency from ONNX, directly from text, no hand-engineered features.

    The research shows Spearman ρ > 0.9 on APPS memory, ≈0.52 on Triton latency, >0.5 average across 17 CodeNet languages, and Kendall-τ ≈ 0.46 on five NAS spaces.

    Numbers are decoded as text with constraints. Instead of a regression head, RLM emits numeric tokens with constrained decoding, enabling multi-metric, autoregressive outputs (e.g., accuracy followed by multi-device latencies) and uncertainty via sampling.

    The Code-Regression dataset unifies APPS/LeetCode memory, Triton kernel latency, and CodeNet memory; the regress-lm library provides the training/decoding stack.
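
The "uncertainty via sampling" point can be sketched simply: draw several decoded numbers instead of one and summarize their spread. The sampler below is a stub (a Gaussian stand-in with hypothetical parameters), not the model's actual sampling loop, which would draw token by token:

```python
# Sketch: uncertainty from sampled numeric outputs. Rather than one
# point estimate, sample several decoded numbers and report median
# and spread. The sampler is a stub with hypothetical parameters.
import random
import statistics

def sample_prediction(rng: random.Random) -> float:
    """Stand-in for one sampled decode from the model."""
    return round(rng.gauss(3.4, 0.2), 2)   # hypothetical latency in ms

def predict_with_uncertainty(n_samples: int = 32, seed: int = 0):
    rng = random.Random(seed)
    samples = [sample_prediction(rng) for _ in range(n_samples)]
    return statistics.median(samples), statistics.stdev(samples)

med, spread = predict_with_uncertainty()
print(f"latency ~ {med:.2f} ms +/- {spread:.2f}")
```

A wide spread flags inputs the model is unsure about, which is useful when the predictor gates an expensive downstream step like compiling and benchmarking a kernel.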

Our Comments

It is very interesting how this work reframes performance prediction as text-to-number generation: a compact T5Gemma-initialized RLM reads source (Python/C++), Triton kernels, or ONNX graphs and emits calibrated numerics via constrained decoding. The reported correlations—APPS memory (ρ>0.9), Triton latency on RTX A6000 (~0.52), and NAS Kendall-τ ≈0.46—are strong enough to matter for compiler heuristics, kernel pruning, and multi-objective NAS triage without bespoke features or GNNs. The open dataset and library make replication straightforward and lower the barrier to fine-tuning on new hardware or languages.


Check out the Paper, GitHub Page and Dataset Card.

The post Can a Small Language Model Predict Kernel Latency, Memory, and Model Accuracy from Code? A New Regression Language Model (RLM) Says Yes appeared first on MarkTechPost.

