Responsible Scaling Policy

A Responsible Scaling Policy (RSP) is a commitment framework for AI development that structures safety obligations as capability-triggered if-then rules rather than as arbitrary timelines or purely voluntary norms. It was introduced by Anthropic, and OpenAI and Google subsequently adopted similar frameworks.

The if-then structure

The core design choice: instead of pre-specifying what a company will do at some future date, an RSP commits it to specific actions if and when a model crosses a defined capability threshold. This is meant to avoid two failure modes:

False alarms: Imposing heavy burdens on current models that are demonstrably not dangerous erodes credibility and produces regulatory backlash
Complacency: Open-ended ‘we take safety seriously’ pledges with no concrete triggers provide no accountability

The trigger conditions are empirically testable: each new model is evaluated against capability benchmarks targeting the specific risks (CBRN uplift, autonomous AI research) associated with each AI Safety Level (ASL).
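The if-then structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trigger` type, the `required_asl` function, the numeric thresholds, and the baseline level of 2 are all hypothetical and are not taken from any published policy. Only the two risk domains come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    risk: str         # risk domain named in the text, e.g. "cbrn_uplift"
    threshold: float  # hypothetical evaluation score at which the trigger fires
    asl: int          # safety level whose measures the trigger activates


# Hypothetical thresholds, for illustration only.
TRIGGERS = [
    Trigger("cbrn_uplift", 0.5, 3),
    Trigger("autonomous_ai_research", 0.5, 3),
    Trigger("cbrn_uplift", 0.9, 4),
    Trigger("autonomous_ai_research", 0.9, 4),
]


def required_asl(eval_scores: dict[str, float], baseline: int = 2) -> int:
    """Return the highest ASL whose trigger condition is met.

    A risk domain with no score is treated as 0.0 (an assumption of this
    sketch). If no trigger fires, the assumed baseline level is returned.
    """
    fired = [
        t.asl
        for t in TRIGGERS
        if eval_scores.get(t.risk, 0.0) >= t.threshold
    ]
    return max(fired, default=baseline)
```

The point of the sketch is that the obligation is a function of measured capability alone: nothing in `required_asl` refers to a date, a release schedule, or the size of the company.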

What it commits to

For each ASL, Anthropic specifies in advance the security and deployment requirements that will activate:

ASL-3: Enhanced security to prevent model theft by non-state actors; targeted deployment filters for CBRN-adjacent queries
ASL-4: Interpretability-based verification (not just model self-report); isolation from autonomy-expanding deployments until alignment is verified

Under the policy, the company commits not to train or deploy a model that crosses an ASL threshold unless the corresponding measures are already in place, regardless of competitive pressure.
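The gating rule above can be sketched as a simple set check. The measure names below paraphrase the ASL-3 and ASL-4 items listed above; the identifiers, the function names, and the assumption that measures accumulate across levels (a model at ASL-4 also needs the ASL-3 measures) are hypothetical choices made for this illustration.

```python
# Measures paraphrased from the list above; identifiers are hypothetical.
REQUIRED_MEASURES = {
    3: {"enhanced_security", "cbrn_deployment_filters"},
    4: {"interpretability_verification", "autonomy_isolation"},
}


def missing_measures(asl: int, in_place: set[str]) -> set[str]:
    """Return the measures required at or below `asl` that are not yet in place.

    Assumes requirements are cumulative across levels.
    """
    needed: set[str] = set()
    for level, measures in REQUIRED_MEASURES.items():
        if level <= asl:
            needed |= measures
    return needed - in_place


def may_proceed(asl: int, in_place: set[str]) -> bool:
    """Training or deployment is allowed only when nothing is missing."""
    return not missing_measures(asl, in_place)
```

Note that `may_proceed` takes no argument for schedule or competitive position, which mirrors the commitment that those factors do not relax the gate.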

Limitations and critiques

The policy is self-imposed; there is no external enforcement mechanism
Capability thresholds may be difficult to measure reliably, and models at ASL-4 may learn to sandbag evaluations (deliberately underperform on them)
Competitive pressure may cause the triggers to be set too high or the measures to be too weak
Other frontier labs are not bound by Anthropic’s RSP; if they don’t adopt equivalent frameworks, the externality problem persists

Dario Amodei acknowledges these limitations and argues that external regulation modelled on the RSP structure is necessary to address the free-rider problem. See Dario Amodei on Claude, AGI and the Future of AI.

As a regulatory design pattern

The RSP’s if-then structure is potentially exportable beyond AI: tie regulatory obligations to measurable capability milestones rather than to calendar dates or company size. Comparable milestone-gated structures appear in pharmaceutical clinical trial phasing and nuclear materials safeguards.

Where mainstream views differ

Some researchers argue that:

Capability thresholds cannot be reliably specified in advance for systems that may develop capabilities discontinuously
Self-assessment of threshold crossing is inherently conflicted when the company benefits from continued development
The framework provides legitimacy to a development trajectory that should instead be debated at a societal level before proceeding