Responsible Scaling Policy
A Responsible Scaling Policy (RSP) is a commitment framework for AI development that structures safety obligations as capability-triggered if-then rules, rather than as arbitrary timelines or purely voluntary norms. It was introduced by Anthropic, and OpenAI and Google subsequently adopted frameworks of similar form.
The if-then structure
The core design choice: instead of pre-specifying what a company will do at some future date, an RSP commits to specific actions if and when a model crosses a defined capability threshold. This is meant to avoid two failure modes:
- False alarms: Imposing heavy burdens on current models that are demonstrably not dangerous erodes credibility and produces regulatory backlash
- Complacency: Open-ended “we take safety seriously” pledges with no concrete triggers provide no accountability
The trigger conditions are empirically testable: each new model is evaluated against capability benchmarks that target the specific risks (such as CBRN uplift and autonomous AI research) associated with each AI Safety Level (ASL).
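The trigger logic described above can be sketched in a few lines of Python. This is an illustration only: the benchmark names, the 0-to-1 score scale, the threshold values, and the baseline level of 2 are all hypothetical and are not taken from any published RSP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """One if-then rule: if the benchmark score reaches the threshold, the ASL applies."""
    benchmark: str    # hypothetical benchmark name
    threshold: float  # hypothetical score on a 0-to-1 scale
    asl: int          # AI Safety Level that the trigger maps to


# Hypothetical triggers; real thresholds would be defined by the policy itself.
TRIGGERS = [
    Trigger("cbrn_uplift", 0.5, 3),
    Trigger("autonomous_ai_research", 0.5, 3),
    Trigger("cbrn_uplift", 0.9, 4),
    Trigger("autonomous_ai_research", 0.9, 4),
]

BASELINE_ASL = 2  # assumed level when no trigger fires


def required_asl(scores: dict[str, float]) -> int:
    """Return the highest ASL whose trigger condition is met by the evaluation scores."""
    # An unevaluated risk is not treated as safe: every benchmark must have a score.
    missing = {t.benchmark for t in TRIGGERS} - set(scores)
    if missing:
        raise ValueError(f"missing evaluations: {sorted(missing)}")
    fired = [t.asl for t in TRIGGERS if scores[t.benchmark] >= t.threshold]
    return max(fired, default=BASELINE_ASL)
```

With these made-up numbers, `required_asl({"cbrn_uplift": 0.6, "autonomous_ai_research": 0.2})` returns 3: a single risk area crossing its threshold is enough to raise the level. Raising an error on a missing score is a design choice of this sketch, not something the policy text specifies.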
What it commits to
For each ASL, Anthropic pre-specifies the security and deployment requirements that will activate:
- ASL-3: Enhanced security to prevent model theft by non-state actors; targeted deployment filters for CBRN-adjacent queries
- ASL-4: Interpretability-based verification (not just model self-report); isolation from autonomy-expanding deployments until alignment is verified
Under the policy, the company may not train or deploy a model that crosses an ASL threshold without the corresponding measures in place, regardless of competitive pressure.
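The gating rule above amounts to a precondition check before training or deployment. A minimal sketch, assuming hypothetical measure names that paraphrase the ASL-3 and ASL-4 bullets, and assuming (this is not stated in the text) that a higher level inherits the requirements of the levels below it:

```python
# Hypothetical measure names, paraphrasing the ASL-3 and ASL-4 requirements.
REQUIRED_MEASURES = {
    3: {"enhanced_security", "cbrn_deployment_filters"},
    4: {"interpretability_verification", "autonomy_isolation"},
}


def missing_measures(asl: int, in_place: set[str]) -> set[str]:
    """Return measures required at `asl` (and below) that are not yet in place."""
    required: set[str] = set()
    for level, measures in REQUIRED_MEASURES.items():
        if level <= asl:  # assumption: requirements accumulate across levels
            required |= measures
    return required - in_place


def may_proceed(asl: int, in_place: set[str]) -> bool:
    """True only if every measure required for this ASL is already in place."""
    return not missing_measures(asl, in_place)
```

For example, `missing_measures(3, {"enhanced_security"})` returns `{"cbrn_deployment_filters"}`, so `may_proceed` is false until the deployment filters exist. Competitive pressure is deliberately not an input to the function, which mirrors the "regardless of competitive pressure" clause.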
Limitations and critiques
- The policy is self-imposed; there is no external enforcement mechanism
- Capability thresholds may be difficult to measure reliably, and sufficiently capable models (at ASL-4, for instance) may learn to sandbag evaluations, that is, deliberately underperform on them
- Competitive pressure may cause the triggers to be set too high or the measures to be too weak
- Other frontier labs are not bound by Anthropic’s RSP; if they don’t adopt equivalent frameworks, the externality problem persists
Dario Amodei acknowledges these limitations and argues that external regulation modelled on the RSP structure is necessary to address the free-rider problem. See Dario Amodei on Claude, AGI and the Future of AI.
As a regulatory design pattern
The RSP’s if-then structure is potentially exportable beyond AI: tie regulatory obligations to measurable capability milestones rather than to calendar dates or company size. This approach is also used in pharmaceutical clinical trial phasing and nuclear materials safeguards.
Where mainstream views differ
Some researchers argue that:
- Capability thresholds cannot be reliably specified in advance for systems that may develop capabilities discontinuously
- Self-assessment of threshold crossing is inherently conflicted when the company benefits from continued development
- The framework provides legitimacy to a development trajectory that should instead be debated at a societal level before proceeding