Reliability for AI in UK Defence
Reliability is the fifth and final principle of the AI Principles. It demands that AI systems operate consistently and predictably, delivering intended outcomes under varied and potentially extreme conditions. It requires that systems perform as expected, mitigating the risks associated with errors or unexpected behaviours, which could have severe consequences in defence contexts, including mission failure or loss of life. Its official definition is as follows:
Definition of Reliability:
AI-enabled systems must be demonstrably reliable, robust and secure. The MOD’s AI-enabled systems must be suitably reliable; they must fulfil their intended design and deployment criteria and perform as expected, within acceptable performance parameters. Those parameters must be regularly reviewed and tested for reliability to be assured on an ongoing basis, particularly as AI-enabled systems learn and evolve over time, or are deployed in new contexts.
Given Defence’s unique operational context and the challenges of the information environment, this principle also requires AI-enabled systems to be secure, and a robust approach to cybersecurity, data protection and privacy. MOD personnel working with or alongside AI-enabled systems can build trust in those systems by ensuring that they have a suitable level of understanding of the performance and parameters of those systems, as articulated in the principle of understanding.
Achieving reliability extends beyond technical robustness—it encompasses building trust with users, ensuring transparency in performance limitations, and maintaining operational effectiveness in adversarial or degraded environments. The principle of reliability also integrates considerations of robustness and security, requiring proactive testing, fallback mechanisms, and continuous updates to address vulnerabilities.
What does it mean to be demonstrably Reliable, Robust, and Secure?
1. Reliability
2. Robustness
3. Security
Reliability
Reliability refers to the ability of an AI system to consistently perform its intended functions within defined operational parameters. In practice, this means the system delivers predictable outcomes aligned with its mission-critical objectives under a range of conditions, including extreme or adverse environments. Reliability focuses on performance consistency and the system's ability to meet expected outcomes within acceptable thresholds. It is less about responding to unforeseen challenges and more about fulfilling the defined operational requirements. To establish reliability, developers must define performance metrics, conduct rigorous testing, and provide clear evidence that the system meets these standards in both normal and degraded conditions.
(See card: Measuring Reliability: how do we decide if an AI system is “suitably” reliable?)
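To make the idea of "acceptable performance parameters" concrete, the sketch below shows one way a test harness might check that a classifier stays within pre-agreed accuracy floors under both normal and degraded conditions. It is illustrative only: the model, test data, and thresholds are all hypothetical placeholders, not drawn from MOD guidance.

```python
# Minimal, illustrative reliability check: verify that a classifier meets
# pre-agreed accuracy floors under normal and degraded conditions.
# The model, data, and thresholds are hypothetical.
import random

def model(signal: float) -> int:
    """Hypothetical stand-in for a deployed classifier."""
    return 1 if signal > 0.5 else 0

def accuracy(samples: list[tuple[float, int]]) -> float:
    correct = sum(1 for x, label in samples if model(x) == label)
    return correct / len(samples)

# Hypothetical test set: (input, expected label) pairs.
xs = [random.uniform(0.0, 1.0) for _ in range(500)]
normal_set = [(x, 1 if x > 0.5 else 0) for x in xs]
# Degraded condition: the same inputs with simulated sensor noise, clamped
# back into the valid range.
degraded_set = [(min(max(x + random.gauss(0.0, 0.05), 0.0), 1.0), y)
                for x, y in normal_set]

# In practice these floors would come from the system's agreed design and
# deployment criteria; the numbers here are placeholders.
for name, data, floor in [("normal", normal_set, 0.95),
                          ("degraded", degraded_set, 0.85)]:
    score = accuracy(data)
    status = "PASS" if score >= floor else "FAIL"
    print(f"{name:>8}: accuracy={score:.3f} (floor {floor:.2f}) -> {status}")
```

Running such a check on every release, rather than once at acceptance, is one way of supporting the principle's demand that performance parameters be "regularly reviewed and tested".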
Robustness
Robustness is the capacity of an AI system to handle inputs and scenarios outside its intended design without failing catastrophically. It addresses how well the system adapts to uncertainty, unexpected inputs, and adversarial attacks. Robustness concerns the system’s resilience when operating beyond its expected conditions, such as handling edge cases or mitigating failures caused by environmental factors or malicious interference. Developers must stress-test systems against varied and extreme inputs, evaluate the system’s behaviour under adversarial pressure, and implement fail-safe mechanisms.
(See card: Measuring Robustness: how do we decide if an AI system is “suitably” robust?)

Carl von Clausewitz, the 19th-century Prussian military theorist, gave us a good understanding of what to expect, and his lessons are just as applicable today. He famously described the chaos of warfare, which is defined by friction, uncertainty, chance, and the clash of wills. Success in such an unpredictable setting depends on understanding these forces and adapting to them.

Clausewitz also introduced the concept of "friction" as the force that makes even the simplest tasks in war extraordinarily difficult. Friction is the accumulation of minor difficulties, uncertainties, and errors that hinder military operations. These could include miscommunication or misunderstanding (e.g. the Charge of the Light Brigade), unexpected weather conditions, human error, or equipment failure. Friction causes plans to deviate from expectations and can disrupt even the most well-prepared strategies. Clausewitz writes, "Everything in war is very simple, but the simplest thing is difficult." Added to this is the challenge of what the adversary is doing or has done to confuse, spoof, undermine, distract or misdirect. Just as people do not always act as expected when placed in extreme situations, any human-designed system can be affected by multiple factors that influence how it works in the real world.
(See card: Why do good people do bad things? What can we do about it?)

Robustness is not a static quality; it must be continuously reassessed as the AI system evolves, learns, or is deployed in new contexts. Regular testing under updated conditions and adversarial scenarios is essential to maintain robustness over time.
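The sketch below illustrates one possible shape for such testing; every name and threshold in it is hypothetical rather than a mandated method. A harness feeds a model nominal, edge-case, and malformed inputs, and a simple confidence-based fail-safe defers to a human operator instead of emitting a low-confidence output.

```python
# Illustrative robustness stress test with a confidence-based fail-safe.
# The model, confidence floor, and inputs are hypothetical.
import math
import random

def model_with_confidence(x: float) -> tuple[int, float]:
    """Hypothetical classifier returning (label, confidence in [0, 1])."""
    confidence = abs(math.tanh(4 * (x - 0.5)))
    return (1 if x > 0.5 else 0), confidence

CONFIDENCE_FLOOR = 0.6  # hypothetical fail-safe threshold

def decide(x: float) -> str:
    label, conf = model_with_confidence(x)
    # The inverted comparison deliberately routes NaN confidences to the
    # fail-safe: (nan >= floor) evaluates False, so NaN never yields a label.
    if not conf >= CONFIDENCE_FLOOR:
        return "DEFER_TO_OPERATOR"  # fail-safe path
    return f"label={label}"

# Stress inputs: nominal values, out-of-range values, and a malformed NaN.
stress_inputs = ([random.uniform(0.0, 1.0) for _ in range(5)] +
                 [-1.0, 2.0, 0.5, float("nan")])

for x in stress_inputs:
    try:
        print(f"x={x!r:>22} -> {decide(x)}")
    except Exception as exc:  # a robust harness records failures, not crashes
        print(f"x={x!r:>22} -> UNHANDLED ERROR: {exc}")
```

A harness like this would be re-run whenever the model, its data, or its deployment context changes, so that robustness claims stay current rather than reflecting a single point in time.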
Security
Security involves safeguarding the system against threats such as data breaches, adversarial attacks, or systemic vulnerabilities. It ensures the integrity, confidentiality, and availability of the AI system throughout its lifecycle. Security emphasises the protection of both the system and its data, ensuring that adversaries cannot exploit vulnerabilities to compromise functionality or outcomes. This requires comprehensive threat analysis and the application of robust cybersecurity measures to reduce the system’s attack surface.
(See card: Measuring Security: how does one decide if an AI system is “suitably” secure?)
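As one small, hedged illustration of reducing the attack surface (the file name and digest below are placeholders, not a prescribed MOD control), a deployment pipeline might verify the integrity of a model artefact against a known-good SHA-256 digest, distributed through a separate trusted channel, before loading it:

```python
# Illustrative integrity check: refuse to load a model artefact whose
# SHA-256 digest does not match a known-good value. Names are hypothetical.
import hashlib
from pathlib import Path

# In practice the known-good digest would come from a trusted source,
# e.g. a signed manifest; this is a placeholder value.
KNOWN_GOOD_SHA256 = "0" * 64

def verify_artefact(path: Path, expected_digest: str) -> bool:
    """Return True only if the file's SHA-256 matches the expected digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest

artefact = Path("model_weights.bin")  # hypothetical artefact
if artefact.exists() and verify_artefact(artefact, KNOWN_GOOD_SHA256):
    print("Integrity check passed: safe to load.")
else:
    print("Integrity check failed or artefact missing: refusing to load.")
```

A check like this addresses only one narrow threat (tampered artefacts); it would sit alongside, not replace, the broader cybersecurity, data protection, and privacy measures the principle requires.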