If an AI model has capability X, then risk mitigations Y must be in place. And if needed, we will delay AI deployment and/or development to ensure the mitigations can be in place in time.
1. Prototyping, battle-testing, and building consensus around a potential framework for regulation.
2. Helping AI developers and others build roadmaps of what risk mitigations need to be in place, and by when.
There's no better way to learn how these commitments mitigate AI harms than to pick an incident, real or hypothetical, and trace how the commitments would check or gate progress.
Pick from our repository of incidents to begin. Read through the incident report and understand what harm was caused. Now imagine you were the developer who built the system: how would you write if-then commitments to guard against your AI model gaining the capability that enabled the harm?
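The exercise above can be made concrete with a small sketch. Assuming a hypothetical representation (the class, function, and capability/mitigation names below are illustrative, not part of any real framework), an if-then commitment pairs a capability trigger with the mitigations that must exist before deployment proceeds:

```python
from dataclasses import dataclass

# Hypothetical sketch: an if-then commitment pairs a capability trigger
# (the "if") with the mitigations that must be in place (the "then").
@dataclass
class IfThenCommitment:
    capability: str                   # dangerous capability that triggers the rule
    required_mitigations: list[str]   # mitigations that must be active

def deployment_allowed(observed_capabilities: set[str],
                       active_mitigations: set[str],
                       commitments: list[IfThenCommitment]) -> bool:
    """Gate deployment: every triggered commitment must be fully satisfied."""
    for c in commitments:
        if c.capability in observed_capabilities:
            if not set(c.required_mitigations) <= active_mitigations:
                return False  # delay deployment until mitigations are ready
    return True

# Illustrative example: a forgery capability gates on two mitigations.
commitments = [IfThenCommitment("document-forgery",
                                ["output-watermarking", "kyc-partner-alerts"])]
print(deployment_allowed({"document-forgery"},
                         {"output-watermarking"}, commitments))  # False: one mitigation missing
```

The point of the sketch is the gating logic: a commitment only binds once the capability is observed, and deployment is delayed until every required mitigation for every triggered commitment is active.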
Robotic Malfunction: Amazon Warehouse Bear Spray Incident Hospitalizes 24
Severity 4/10: An autonomous machine punctures a can of bear spray, causing mass chemical exposure and worker hospitalizations at an Amazon facility.
AI-Generated Fake Passport Exposes Critical KYC Verification Vulnerabilities
Severity 6/10: ChatGPT-4o creates a convincing passport forgery in 5 minutes, threatening digital identity verification systems globally.
AI-Powered Online Scams: A Significant Threat to Trust and Security
Severity 6/10: AI is making it easier for scammers to create sophisticated phishing and fraud attempts, posing risks of financial losses, privacy breaches, and erosion of public trust.
Cursor's AI Support Bot Hallucination Leads to Customer Backlash
Severity 4/10: Cursor faced customer frustration and subscription cancellations after an AI support bot provided an incorrect, fabricated response about a new login policy.
AI-Powered Online Scams: Streamlining Fraud at Scale
Severity 8/10: Scammers use AI tools to rapidly create convincing fraudulent websites and social engineering tactics, leading to a surge in online scams.
AI-Enabled Financial Aid Fraud Exploits California Community Colleges
Severity 6/10: Scammers use AI tools to create fake student identities, stealing millions in federal aid from California colleges.
YouTube AI Perpetuates Eating Disorder in 14-Year-Old
Severity 5/10: YouTube's algorithm recommended harmful content that worsened a teenager's eating disorder, causing significant physical and mental health impacts.
