Worst-case testing
Types of worst-case scenarios
Measuring the scale of potential harm
1. Worst-case testing
The goal of worst-case testing is to identify risks that might not be apparent in normal operating conditions. These tests help in:
• Preventing catastrophic failures in critical systems (e.g., defence, healthcare, finance, and autonomous vehicles).
• Understanding unintended consequences, such as biases, security breaches, or unpredictable decision-making.
• Evaluating how AI behaves in extreme or adversarial situations.
2. Types of worst-case scenarios
The types of worst-case scenarios for AI in a military context vary, but could include:
1. Adversarial Attacks – Testing AI against deliberate attempts to manipulate or deceive it, such as misleading inputs to machine learning models (e.g., adversarial examples in image recognition; see the FGSM sketch after this list).
2. Ethical Failures – Examining how AI responds to morally complex situations, such as prioritising human lives in autonomous vehicle crashes.
3. Data Poisoning – Assessing the impact of biased, corrupted, or incomplete data on AI decision-making (see the label-flipping sketch after this list).
4. Operational Failures – Simulating real-world stress conditions where AI systems might fail (e.g., loss of internet connectivity for AI-driven infrastructure).
5. Human-AI Interaction Risks – Evaluating scenarios where AI misinterprets human intent, leading to dangerous or unintended actions (e.g., AI misidentifying a target in a military context).
6. Runaway AI Scenarios – Stress-testing self-learning systems for risks related to autonomous goal misalignment (e.g., AI optimising for efficiency at the expense of human safety).
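The adversarial-attack scenario can be made concrete with a small test harness. Below is a minimal sketch of the fast gradient sign method (FGSM) in Python with PyTorch; `model`, `image`, and `label` are hypothetical stand-ins for the system under test, and the `epsilon` perturbation budget is an illustrative value rather than a recommended one.

```python
# Minimal FGSM sketch (illustrative only).
# `model`, `image`, and `label` are hypothetical placeholders for the system under test.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Return a copy of `image` perturbed to increase the model's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, then keep pixel values valid.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```

A worst-case test would then compare the model's prediction on the perturbed input with its prediction on the original, flagging cases where an imperceptible change flips the decision.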
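The data-poisoning scenario can likewise be probed by deliberately corrupting a fraction of training labels and measuring the effect on held-out accuracy. The sketch below uses a synthetic scikit-learn dataset and a logistic regression model purely as stand-ins, and the flip fractions are illustrative assumptions.

```python
# Illustrative label-flipping (data poisoning) sketch; dataset and model are stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_after_poisoning(flip_fraction):
    """Train on data with `flip_fraction` of labels flipped; return test accuracy."""
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    idx = rng.choice(len(y_poisoned), size=int(flip_fraction * len(y_poisoned)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # flip binary labels
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return accuracy_score(y_test, model.predict(X_test))

for fraction in (0.0, 0.1, 0.3):
    print(f"flipped {fraction:.0%} of labels -> test accuracy {accuracy_after_poisoning(fraction):.3f}")
```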
3. Measuring the scale of potential harm
When trying to measure the scale of the potential harm, it is useful to consider both the impact of a failure and the likelihood of it occurring. Although it would be easy to assume that lethal autonomous weapon systems (LAWS) would automatically sit at the far end of the spectrum of harms if something goes wrong, a strategic-level decision made by a person and informed by erroneous or false recommendations from a trusted AI-enabled system could have far greater ethical impact than any single action by a LAWS.
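One simple way to make such comparisons explicit, assuming a basic impact-times-likelihood scoring scheme (the scales and example scores below are illustrative, not a prescribed methodology), is a small scoring grid:

```python
# Minimal impact x likelihood scoring sketch; scales and values are illustrative assumptions.
IMPACT = {"negligible": 1, "moderate": 2, "severe": 3, "catastrophic": 4}
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3}

def risk_score(impact, likelihood):
    """Combine impact and likelihood into a single comparable score."""
    return IMPACT[impact] * LIKELIHOOD[likelihood]

# Comparing, say, a single engagement error with a flawed strategic recommendation:
print(risk_score("severe", "rare"))           # 3
print(risk_score("catastrophic", "possible")) # 8
```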
Worst-case scenario testing is an essential aspect of AI safety and ethics, particularly in defence and security applications. By rigorously assessing how AI behaves under extreme conditions and by thinking through worst-case scenarios in advance, policymakers and developers can mitigate risks and help ensure that AI serves human interests while aligning with ethical and legal frameworks.