The EU Is Asking for Feedback on Frontier AI Regulation (Open to Global Experts)—This Post Breaks Down What’s at Stake for AI Safety

 

The EU is drafting rules for the regulation of general-purpose AI (GPAI) models and is openly soliciting feedback. This article outlines what the GPAI Code of Practice will cover, including the definition of GPAI models, how providers are identified, and how training compute is estimated. It focuses on AI safety, stressing the importance of independent voices in shaping responsible governance, and encourages researchers to submit technical, safety-focused feedback. It also examines the distinction between model versions and new models, and the responsibilities downstream fine-tuners may take on.

💡 The EU is drafting a Code of Practice for GPAI models, aiming to govern the development and use of general-purpose AI; it covers key issues such as the definition of GPAI models, how providers are identified, and how training compute is estimated.

⚖️ The EU treats training compute (FLOP) as the legal basis for a model's generality and risk, proposing for the first time a training-compute threshold as a regulatory signal for GPAI classification, which may become key to future AI model regulation.

🔄 The EU distinguishes between "new models" and "model versions": a new model means a fresh large pre-training run, while fine-tunes and upgrades built on that run are treated as versions. A model update that uses substantial compute and significantly changes the risk profile may count as a new model.

🧑‍💻 The EU pays particular attention to downstream actors, i.e. those who fine-tune or modify existing GPAI models. Low-compute fine-tuning carries limited responsibility, but modifiers whose changes involve high compute or affect systemic risk take on full legal responsibility, including documentation and risk-assessment obligations.

Published on April 22, 2025 8:39 PM GMT

The European AI Office is currently writing the rules for how general-purpose AI (GPAI) models will be governed under the EU AI Act.

They are explicitly asking for feedback on how to interpret and operationalize key obligations under the AI Act.

This includes the thresholds for systemic risk, the definition of GPAI, how to estimate training compute, and when downstream fine-tuners become legally responsible. 

Why this matters for AI Safety:

The largest labs (OpenAI, Anthropic, Google DeepMind) have already expressed willingness to sign on to the Codes of Practice voluntarily.

These codes will become the de facto compliance baseline, and potentially a global reference point. 

So far, AI safety perspectives are severely underrepresented.

Input is urgently needed to ensure the guidelines reflect concerns around misalignment, loss of control, emergent capabilities, robust model evaluation, and the need for interpretability audits.

Key intervention points include how "high-impact capabilities" are defined, what triggers systemic risk obligations, and how documentation, transparency, and ongoing risk mitigation should be operationalized for frontier models. 

Without this input, we risk locking in a governance regime that optimizes for PR risk (copyright and vague definitions of bias), not existential or alignment-relevant risk.

This is a rare opportunity for independent voices to influence what responsible governance of frontier models should actually require. If left unchallenged, vague obligations could crystallize into compliance practices that are performative, or what I would call fake.

Purpose of this post:

You do not need to be a European citizen: anyone can provide feedback (but please make sure to select the correct category under "Which stakeholder category would you consider yourself in?").

📅 Feedback is open until 22 May 2025, 12:00 CET.
🗳️ Submit your response here

 

I haven’t yet seen a technical, safety-focused summary of what the GPAI Codes of Practice are actually aiming to regulate, so I’ve put one together. 

I hope it’s useful to the AI safety community. Since the full breakdown is long, here’s a TL;DR:

TL;DR 

What the GPAI Codes of Practice Will Actually Regulate

What AI safety researchers should weigh in on:


1. Content of the Guidelines

The Commission’s upcoming guidelines will define how the EU interprets the obligations for general-purpose AI providers under the AI Act. The guidelines will cover:

2. What counts as a General-Purpose AI Model?

The AI Act defines "general-purpose AI models" as foundation models that display significant generality: they can perform a wide range of tasks and are used downstream across multiple applications.

Critically, the EU is trying to separate GPAI models from full AI systems, making it clear that the underlying model (e.g. LLaMA, GPT-4) can trigger obligations even if it's not wrapped in a user-facing product.

This means that training and release decisions upstream carry regulatory weight, even before fine-tuning or deployment.

Notably, models used exclusively for research, development, or prototyping are excluded, until they are released. Once placed on the market, obligations kick in.

2.1 Conditions for Sufficient Generality and Capabilities

Because there’s no mature benchmark for generality yet, the EU is anchoring its definition of GPAI on training compute. Their proposed threshold:

A model is presumed to be GPAI if it can generate text or images and was trained with >10²² FLOP.

This threshold acts as a presumption of generality.

If your model meets this compute threshold and has generative capacity, the EU assumes it can perform many distinct tasks and should be governed as a GPAI model.

 However, the presumption is rebuttable: you can argue your model is too narrow despite the compute, or that a low-compute model has generality due to capabilities.

This is a crude but actionable standard. The EU is effectively saying:

Examples show how this might play out:

This marks the first time a training compute threshold is being proposed as a regulatory signal for generality. 

While imperfect, it sets a precedent for model-based regulation, and may evolve into a cornerstone of GPAI classification across other jurisdictions. 

It also suggests a future where training disclosures and FLOP estimates become a key part of legal compliance.
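To make the presumption concrete, here is a minimal sketch of how the classification check might be expressed in code. The 10²² FLOP figure and the generative-capacity condition come from the draft guidelines; the function and variable names are my own illustrative choices, and because the presumption is rebuttable, the output is a starting point rather than a final classification.

```python
# Minimal sketch of the GPAI presumption of generality (illustrative names; the
# 1e22 FLOP figure is the threshold proposed in the draft guidelines).

GPAI_PRESUMPTION_FLOP = 1e22  # training compute above which generality is presumed

def is_presumed_gpai(training_flop: float, can_generate_text_or_images: bool) -> bool:
    """Return True if a model falls under the rebuttable presumption of generality:
    it has generative capacity for text or images AND was trained with >1e22 FLOP.
    Providers can still rebut the presumption, and a low-compute model can still be
    GPAI on capability grounds, so this is a screening check, not a verdict.
    """
    return can_generate_text_or_images and training_flop > GPAI_PRESUMPTION_FLOP


# Example: a text-generating model trained with ~3e23 FLOP is presumed to be GPAI.
print(is_presumed_gpai(3e23, can_generate_text_or_images=True))  # True
```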

2.2 Differentiation Between Distinct Models and Model Versions

The EU is trying to draw a regulatory line between what counts as a new model versus a new version of an existing model.

This matters because many obligations are triggered per model, so if you release a "new" model, you might have to redo everything.

The preliminary approach is simple:

A “distinct model” starts with a new large pre-training run.
Everything based on that run (fine-tunes, upgrades, or checkpoints) is a model version.

However, if a model update by the same provider uses a large amount of compute (defined as >⅓ of the original model’s threshold), and it significantly changes the model’s risk profile, then it might count as a new model even if it's technically a version. 
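As a rough sketch under this reading, the version-versus-new-model rule could look like the check below. The one-third fraction comes from the draft text; what counts as a significant change in risk profile is not defined there, so it is passed in as a judgment call, and all names are hypothetical.

```python
def counts_as_new_model(update_flop: float,
                        original_model_threshold_flop: float,
                        risk_profile_significantly_changed: bool) -> bool:
    """Sketch of the draft rule: an update by the same provider counts as a new model
    (rather than a version) if it uses more than one third of the compute threshold
    applicable to the original model AND it significantly changes the risk profile.
    "Significant" is not operationalized in the draft, so it is a judgment input here.
    """
    large_update = update_flop > original_model_threshold_flop / 3
    return large_update and risk_profile_significantly_changed


# Example: a 4e24 FLOP update to a model governed by the 1e25 FLOP systemic-risk
# threshold exceeds the one-third mark, so if it also shifts the risk profile it
# would be treated as a new model under this reading.
print(counts_as_new_model(4e24, 1e25, risk_profile_significantly_changed=True))  # True
```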

The thresholds are:

Why this matters for AI Safety:

This distinction has direct implications for:

If this holds, providers could:

This is effectively a compliance modularity rule. It lets labs scale governance across model variants, but still holds them to task if new versions introduce emergent risk.

For safety researchers, this section could be leveraged to advocate for stronger triggers for reclassification, especially in the face of rapid capability shifts from relatively small training updates.

3. What counts as a Provider of a General-Purpose AI Model?

The EU distinguishes between providers (entities that develop or significantly modify general-purpose AI models) and deployers (those who build or use AI systems based on those models).

There’s special attention here on downstream actors: those who fine-tune or otherwise modify an existing GPAI model. 

The guidelines introduce a framework to determine when downstream entities become providers in their own right, triggering their own set of obligations.

3.1 What Triggers Provider Status?

Key scenarios where an entity is considered a provider:

Even collaborative projects can count as providers, usually via the coordinator or lead entity.

Downstream Modifiers as Providers

This is one of the most consequential parts of the guidelines for open-source model ecosystems and fine-tuning labs.

The EU draws a line between minor modifications (e.g., light fine-tunes) and substantial overhauls that make you legally responsible for the resulting model.

You’re presumed to become a new “provider” (with specific compliance obligations) if:

In this case, you are only responsible for the modification, not the full model: meaning your documentation, data disclosure, and risk assessment duties apply only to the part you changed.

3.2 GPAI with Systemic Risk: Stricter Thresholds

Things get more serious if you are modifying or contributing to a model that crosses the systemic risk threshold (currently 10²⁵ FLOP total training compute).

You’re treated as a new provider of a GPAI model with systemic risk if:

- You modify a model already classified as high-risk, and your modification adds >⅓ of its training compute (i.e. >3 × 10²⁴ FLOP), or
- You take a lower-risk model and push it over the systemic risk threshold with cumulative compute.

In these cases:

While no current modifications are assumed to meet this bar, the guidelines are explicitly future-proofing for when downstream players (including open-source projects) have access to higher compute.
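A minimal sketch of the two triggers listed above; the 10²⁵ FLOP systemic-risk threshold and the >3 × 10²⁴ FLOP modification figure are taken from the draft, while the function and parameter names are illustrative.

```python
SYSTEMIC_RISK_FLOP = 1e25         # systemic-risk threshold on cumulative training compute
MODIFICATION_TRIGGER_FLOP = 3e24  # roughly one third of the threshold, the figure quoted in the draft

def becomes_systemic_risk_provider(base_model_flop: float,
                                   base_model_has_systemic_risk: bool,
                                   modification_flop: float) -> bool:
    """Sketch of when a downstream modifier is treated as a provider of a GPAI model
    with systemic risk, following the two triggers in the draft guidelines:
      1. the base model is already classified as systemic-risk and the modification
         adds more than ~one third of the threshold compute (> 3e24 FLOP), or
      2. the base model is below the threshold but base + modification compute
         together push the cumulative total above 1e25 FLOP.
    """
    if base_model_has_systemic_risk:
        return modification_flop > MODIFICATION_TRIGGER_FLOP
    return base_model_flop + modification_flop > SYSTEMIC_RISK_FLOP


# Example: a 2e24 FLOP base model plus a 9e24 FLOP fine-tune crosses 1e25 cumulatively.
print(becomes_systemic_risk_provider(2e24, False, modification_flop=9e24))  # True
```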

4. Exemptions for Open-Source models

The AI Act includes limited exemptions for open-source GPAI models—but only if specific criteria are met.

- The model must be released under a free and open-source license, which permits:
  - Free access (no monetary or access restrictions)
  - Free use (no license enforcement to limit how it’s used)
  - Free modification (no paywalls or rights reservations)
  - Free redistribution (under comparable terms, e.g. attribution clauses)
- Model weights, architecture, and usage documentation must be made public in a way that enables downstream use, modification, and redistribution.
- The model must not meet the criteria for systemic risk (e.g., trained above 10²⁵ FLOP or posing major cross-domain safety concerns).

Critically, monetized distribution invalidates the exemption: this includes charging for access, offering paid support services, or collecting user data for anything other than basic interoperability or security.
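As a sketch of how the exemption criteria above compose, the check below strings the conditions together as booleans. All field and function names are hypothetical; the 10²⁵ FLOP figure is the systemic-risk threshold discussed elsewhere in the guidelines.

```python
from dataclasses import dataclass

SYSTEMIC_RISK_FLOP = 1e25  # systemic-risk threshold that disqualifies the exemption

@dataclass
class OpenRelease:
    """Hypothetical container for the facts the open-source exemption turns on."""
    free_access: bool              # no monetary or access restrictions
    free_use: bool                 # no license enforcement limiting how it's used
    free_modification: bool        # no paywalls or rights reservations
    free_redistribution: bool      # redistribution allowed under comparable terms
    weights_and_docs_public: bool  # weights, architecture, and usage docs published
    training_flop: float
    monetized_distribution: bool   # paid access/support, or data collection beyond interoperability/security

def qualifies_for_open_source_exemption(r: OpenRelease) -> bool:
    """Sketch of the criteria in Section 4: a genuinely free and open license,
    public weights/architecture/documentation, no systemic risk, and no monetization."""
    license_is_open = all([r.free_access, r.free_use, r.free_modification, r.free_redistribution])
    below_systemic_risk = r.training_flop < SYSTEMIC_RISK_FLOP
    return (license_is_open and r.weights_and_docs_public
            and below_systemic_risk and not r.monetized_distribution)
```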

5. Estimating Compute: The First Scalable Safety Trigger

The EU is formalizing training compute as a trigger for systemic risk obligations, creating a scalable governance mechanism that doesn't rely on subjective capability benchmarks. 

While I personally do not fully believe that this is the best benchmark, these rules open the door to continuous monitoring and reporting infrastructure for high-compute training runs. And I think it's something alignment researchers could potentially build around.

If your model crosses certain FLOP thresholds, you're presumed to be developing:

To apply these thresholds, the EU is proposing methods to estimate compute—and defining when you need to notify the Commission if you're approaching or crossing them.

5.1 How to Estimate Compute

The EU outlines two accepted methods:

1. Hardware-Based Approach (track GPU usage)
   - Count GPUs × time × theoretical FLOP/s × utilization
   - Approximations within 5% are acceptable
2. Architecture-Based Approach (model-based FLOP estimate)
   - Estimate FLOP based on architecture and number of training tokens
   - For transformers: FLOP ≈ 6 × Parameters × Training Examples

Either method is valid. Providers choose based on feasibility, but must document assumptions if using approximations (e.g., for synthetic data generation).
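Below is a minimal sketch of both estimation methods: the hardware-based formula multiplies GPU count, wall-clock time, theoretical peak FLOP/s, and utilization, and the architecture-based formula uses the transformer approximation quoted above. The example hardware figures (GPU count, peak throughput, utilization) are hypothetical, not from the guidelines.

```python
def hardware_based_flop(num_gpus: int,
                        training_hours: float,
                        peak_flop_per_sec: float,
                        utilization: float) -> float:
    """Hardware-based estimate: GPUs x time x theoretical peak FLOP/s x utilization.
    The draft accepts approximations within about 5%."""
    return num_gpus * training_hours * 3600 * peak_flop_per_sec * utilization


def architecture_based_flop(num_parameters: float, num_training_examples: float) -> float:
    """Architecture-based estimate for transformers: FLOP ~= 6 x parameters x training examples."""
    return 6 * num_parameters * num_training_examples


# Hypothetical run: 10,000 accelerators at ~1e15 FLOP/s peak, 40% utilization, 90 days.
print(f"{hardware_based_flop(10_000, 90 * 24, 1e15, 0.4):.2e}")  # ~3.1e+25 FLOP

# Hypothetical model: 70B parameters trained on 15T tokens.
print(f"{architecture_based_flop(70e9, 15e12):.2e}")             # ~6.3e+24 FLOP
```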

5.2 What "Counts" Toward Cumulative Compute

The “cumulative compute” for regulatory purposes includes:

It does not include:

This cumulative total determines whether a model passes the systemic risk threshold (10²⁵ FLOP).

The guidance also covers model compositions. E.g., Mixture-of-Experts architectures must include compute from all contributing models.

You are expected to estimate compute before the large pre-training run begins (based on planned GPU allocations or token counts). If you're not above the threshold at first, you're still required to monitor ongoing compute usage and notify the EU Commission if you cross the threshold later.
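A sketch of what that ongoing monitoring could look like in practice: estimate the planned pre-training compute up front, keep a running cumulative total as further compute is spent on the same model, and flag when the 10²⁵ FLOP threshold is crossed so the Commission can be notified. The class and method names are mine, and the exact accounting of what counts toward the cumulative total should follow the inclusion rules in the guidelines rather than this simplified running sum.

```python
SYSTEMIC_RISK_FLOP = 1e25  # cumulative training-compute threshold for systemic risk

class CumulativeComputeTracker:
    """Hypothetical tracker for the estimate-then-monitor obligation described above."""

    def __init__(self, planned_pretraining_flop: float):
        # Pre-run estimate based on planned GPU allocations or token counts.
        self.total_flop = planned_pretraining_flop
        self.notified = False

    def add(self, extra_flop: float) -> None:
        """Record additional compute spent on the same model (e.g. later training stages)."""
        self.total_flop += extra_flop

    def must_notify_commission(self) -> bool:
        """True the first time the cumulative total crosses the systemic-risk threshold."""
        if not self.notified and self.total_flop >= SYSTEMIC_RISK_FLOP:
            self.notified = True
            return True
        return False


# Example: an 8e24 FLOP planned pre-training run, followed by 3e24 FLOP of further training.
tracker = CumulativeComputeTracker(planned_pretraining_flop=8e24)
tracker.add(3e24)
print(tracker.must_notify_commission())  # True: 1.1e25 >= 1e25
```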

6. Other Legal & Enforcement Details

Entry into application: 2 August 2025.

Providers who placed GPAI models on the market before that date have until 2 August 2027 to comply.

This includes documentation, risk assessment, and training data transparency, though retroactive compliance is not required if it would involve retraining, unlearning, or disproportionate effort.

This gives labs and developers a two-year compliance window—but only for models placed on the market before August 2025. Anything new after that date must comply immediately.

Enforcement of the Code of Practice 

The EU’s Code of Practice (CoP) will become the default pathway for demonstrating compliance with the AI Act’s obligations for general-purpose AI models. It’s not legally binding, but adhering to it comes with clear enforcement advantages:

- Signatories are less likely to face intrusive audits or additional information requests.
- The Commission may even treat CoP participation as a mitigating factor when calculating fines (up to 3% of global revenue).

Non-signatories must prove compliance through alternative methods, such as detailed reporting, gap analyses, or independent evaluations, and may face higher scrutiny.

Supervision by the AI Office

The AI Office will be the lead enforcement authority for all GPAI model obligations. Enforcement officially begins on 2 August 2026, following a one-year grace period.

The AI Office can:

- Demand information and model access
- Enforce risk mitigations or market recalls
- Impose fines of up to 3% of global turnover or €15M, whichever is greater
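For the fine ceiling in the last bullet, "whichever is greater" is just a maximum over the two figures; a one-function sketch with hypothetical numbers:

```python
def max_gpai_fine_eur(global_turnover_eur: float) -> float:
    """Ceiling on fines for GPAI obligations: 3% of global turnover or EUR 15M, whichever is greater."""
    return max(0.03 * global_turnover_eur, 15_000_000)

# Hypothetical example: a provider with EUR 2B global turnover faces a ceiling of EUR 60M.
print(f"{max_gpai_fine_eur(2e9):,.0f}")  # 60,000,000
```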

Confidential business information and IP will be protected under Article 78, but the Commission is investing in a long-term regulatory infrastructure for evaluating frontier models.


If you have any questions or would like to discuss specific sections in more detail, feel free to comment below.

I’ll do my best to answer, either directly or by reaching out to contacts in the Working Groups currently drafting and negotiating the Codes of Practice.

You can also find additional contact details in my LessWrong profile.


If you're considering submitting feedback and would like to coordinate, compare notes, or collaborate on responses, please reach out! I'd be happy to connect.


