Article 15: Accuracy, Robustness & Cybersecurity

Article 15 of the EU AI Act requires that high-risk AI systems achieve an appropriate level of accuracy, robustness, and cybersecurity, and that they perform consistently in those respects throughout their lifecycle. It covers three distinct but related dimensions of technical performance — each with specific documentation requirements.

Applies to: Providers of high-risk AI systems, including the use cases listed in Annex III. The requirements cover the system as deployed, including any GPAI foundation model layer. Providers using pre-trained models are responsible for the combined system's performance against these standards.

The Three Dimensions of Article 15

1. Accuracy — Article 15(1) and (3)

Article 15(1) requires that high-risk AI systems be designed and developed to achieve an appropriate level of accuracy in light of their intended purpose. "Appropriate" is not a universal standard — it is calibrated to the risk and context of the specific system.

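Article 15(3) adds a declaration requirement: the accuracy levels and the relevant accuracy metrics must be declared in the accompanying instructions of use. The technical documentation required under Annex IV must also describe the metrics used to measure accuracy. (Article 15(2), by contrast, tasks the Commission with encouraging the development of benchmarks and measurement methodologies.) This makes accuracy not just a performance target but a documented commitment to deployers.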

In practice, this means providers must:

  • Select accuracy metrics appropriate to the task (precision, recall, F1, AUC, calibration error, etc.)
  • Evaluate the system against those metrics on appropriate test sets
  • Document the metrics, the test methodology, and the results
  • State the accuracy levels in the instructions of use in a way that deployers can understand
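The steps above can be sketched in a few lines of Python. This is a minimal illustration for a binary classifier, not a format prescribed by the Act: the `AccuracyDeclaration` record, the function names, and the wording of the generated statement are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AccuracyDeclaration:
    """Hypothetical record of declared metrics; not a format defined by the Act."""
    precision: float
    recall: float
    f1: float
    test_set: str
    n_samples: int


def evaluate_binary(y_true, y_pred, test_set):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return AccuracyDeclaration(precision, recall, f1, test_set, len(pairs))


def to_statement(d):
    """Render the result as a plain sentence a deployer can read."""
    return (f"On {d.n_samples} samples from '{d.test_set}': "
            f"precision {d.precision:.2f}, recall {d.recall:.2f}, F1 {d.f1:.2f}.")
```

Keeping the test set name and sample count next to the numbers matters as much as the metrics themselves: a figure without its evaluation context tells a deployer very little.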

2. Robustness — Article 15(4)

Article 15(4) requires that high-risk AI systems be as resilient as possible regarding errors, faults, or inconsistencies that may occur within the system or the environment in which it operates, in particular through interaction with natural persons or other systems. Systems should perform consistently even when inputs are noisy, incomplete, or outside their training distribution.

Article 15(4) names technical redundancy solutions, which may include backup or fail-safe plans, as a means of achieving robustness. In practice, when a high-risk AI system cannot produce a reliable output (due to out-of-distribution inputs, hardware failure, or data quality problems), it should fail safely. "Safe failure" means a defined, documented, and predictable behaviour: defaulting to human decision-making, returning no output with an alert, or triggering a manual review flag.
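One way to make "safe failure" concrete is a thin wrapper around the model call. The sketch below is illustrative only: the `Decision` type, the range-based out-of-distribution check, and the 0.8 confidence threshold are assumptions chosen for the example, and a real system would likely need a stronger out-of-distribution detector.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Decision:
    output: Optional[str]   # None means the system withheld an output
    route: str              # "automated" or "human_review"
    reason: str


def safe_predict(
    model: Callable[[Sequence[float]], Tuple[str, float]],
    features: Sequence[float],
    feature_ranges: Sequence[Tuple[float, float]],
    min_confidence: float = 0.8,   # illustrative threshold, not taken from the Act
) -> Decision:
    # Crude out-of-distribution check: each feature must lie in the range seen in training.
    if len(features) != len(feature_ranges):
        return Decision(None, "human_review", "unexpected input shape")
    for value, (low, high) in zip(features, feature_ranges):
        if not (low <= value <= high):
            return Decision(None, "human_review", "input outside training range")
    try:
        label, confidence = model(features)
    except Exception as exc:
        # Any model failure is routed to a person instead of surfacing an arbitrary output.
        return Decision(None, "human_review", f"model error: {type(exc).__name__}")
    if confidence < min_confidence:
        return Decision(None, "human_review", "confidence below threshold")
    return Decision(label, "automated", "ok")
```

Every non-automated path returns the same shape with a stated reason, which makes the fallback behaviour easy to log, test, and describe in documentation.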

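Article 15(4) also addresses high-risk AI systems that continue to learn after being placed on the market or put into service: they must be developed so as to eliminate or reduce as far as possible the risk of possibly biased outputs influencing input for future operations (feedback loops), with appropriate mitigation measures. The same paragraph names interaction with other systems as a source of errors, so where the outputs of one AI system serve as inputs to another, error propagation across the pipeline is a distinct robustness risk that should be addressed in documentation.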

3. Cybersecurity — Article 15(5)

Article 15(5) requires that high-risk AI systems be resilient against attempts by unauthorised third parties to alter the system's use, outputs, or performance by exploiting system vulnerabilities. This covers adversarial attacks specific to AI systems — in addition to general IT security requirements under other regulation.

Article 15(5) names data poisoning, model poisoning, adversarial examples (model evasion), confidentiality attacks, and model flaws. AI-specific vulnerabilities to cover in practice include:

  • Data poisoning: Manipulating training or fine-tuning data to embed adversarial behaviours in the model
  • Model inversion (a confidentiality attack): Recovering training data from system outputs, a particular concern for systems trained on sensitive personal data
  • Adversarial examples: Carefully crafted inputs designed to cause the system to produce incorrect outputs while appearing normal
  • Prompt injection (LLM-based systems): Manipulating inputs to override system-level instructions or extract sensitive information. The Act does not name this attack, but it falls within the general resilience requirement
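As a rough illustration of testing for adversarial examples, the sketch below probes a classifier with small random perturbations and reports how many inputs change label. It is a deliberately weak, black-box probe under assumed parameters (`epsilon`, `trials`, and the function name are all hypothetical); gradient-based or optimisation-based attacks find evasions that random sampling misses, so a low rate here is not evidence of resilience on its own.

```python
import random
from typing import Callable, List, Sequence


def evasion_rate(
    classify: Callable[[Sequence[float]], int],
    inputs: List[Sequence[float]],
    epsilon: float = 0.05,
    trials: int = 50,
    seed: int = 0,
) -> float:
    """Fraction of inputs whose label flips under some sampled perturbation.

    Each feature is shifted by at most `epsilon` (an L-infinity bound).
    """
    if not inputs:
        raise ValueError("inputs must not be empty")
    rng = random.Random(seed)  # fixed seed so the probe is repeatable
    flipped = 0
    for x in inputs:
        original = classify(x)
        for _ in range(trials):
            perturbed = [v + rng.uniform(-epsilon, epsilon) for v in x]
            if classify(perturbed) != original:
                flipped += 1
                break  # one successful perturbation is enough for this input
    return flipped / len(inputs)
```

Recording the perturbation budget alongside the measured rate keeps the result interpretable: the same model can look robust at one `epsilon` and fragile at another.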

Consistency Throughout the Lifecycle

Article 15(1) requires consistent performance throughout the system's lifecycle — not just at initial deployment. This means accuracy and robustness must be monitored in production, and performance degradation (model drift, distribution shift, data quality changes in real-world inputs) must be detected and addressed through the post-market monitoring system required under Article 72.

Any significant performance change detected through post-market monitoring must feed back into the Article 9 risk management system as an updated risk, and may require updating the technical documentation and instructions of use.
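Distribution shift can be tracked with a simple statistic such as the Population Stability Index (PSI), which compares a production sample of a feature or model score against a reference sample. The sketch below is one possible implementation, not a method the Act prescribes; the bin count and the floor applied to empty bins are assumptions. Commonly quoted alert levels (around 0.1 for moderate shift and 0.25 for significant shift) are industry rules of thumb, not regulatory thresholds.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    production: Sequence[float],
    bins: int = 10,
) -> float:
    """PSI between a reference sample and a production sample of one feature or score."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # avoid zero width when reference is constant

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            index = int((v - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        # Small floor so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, prod = proportions(reference), proportions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))
```

A check like this would typically run on a schedule over recent production inputs, with any breach of the chosen alert level logged as an input to the risk management process.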

Common Mistakes

Measuring accuracy on test data that mirrors training data distribution

Accuracy measured on a test set drawn from the same distribution as training data does not represent real-world performance. Article 15 requires performance in the intended deployment context — test sets must reflect the actual population the system will encounter, including edge cases and underrepresented groups.
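A per-group breakdown is one simple way to see whether a headline accuracy figure hides weak performance on part of the deployment population. The sketch below assumes each test record carries a group label; the function names and the 0.05 tolerance are illustrative choices, not values taken from the Act.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def accuracy_by_group(
    records: Iterable[Tuple[Hashable, int, int]],
) -> Dict[Hashable, Tuple[float, int]]:
    """Map each group to (accuracy, sample count).

    Each record is (group, true_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: (correct[g] / total[g], total[g]) for g in total}


def underperforming(results, overall, tolerance=0.05):
    """Groups whose accuracy trails the overall figure by more than `tolerance`."""
    return sorted(g for g, (acc, _) in results.items() if overall - acc > tolerance)
```

The sample count is returned with each accuracy because a group with very few test records needs more data before any conclusion is drawn about it.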

No declared accuracy metrics in instructions of use

Article 15(3) explicitly requires the accuracy metrics and achieved levels to be stated in the instructions of use. Many providers treat accuracy as internal technical data. It must be documented and disclosed to deployers.

Treating cybersecurity as a general IT security obligation

Article 15(5) targets AI-specific vulnerabilities — adversarial examples, data poisoning, model inversion — that general IT security frameworks do not typically address. Standard penetration testing is insufficient on its own.

No documented fallback plan

Article 15(4) points to technical redundancy solutions, including backup or fail-safe plans, as the way to achieve robustness. A system with no defined behaviour when its output confidence falls below a threshold, when inputs are out of distribution, or when errors occur will struggle to demonstrate the resilience that paragraph requires.

Article 15 — Frequently Asked Questions

What level of accuracy does Article 15 require?

Article 15 requires an 'appropriate' level of accuracy in light of the system's intended purpose. There is no universal numerical threshold; appropriateness is contextual. A credit scoring model that is correct 95% of the time may be inappropriate if a 5% error rate causes material harm to affected individuals. A medical image classifier may require higher accuracy than a content recommendation engine operating at the same deployment volume. The provider must determine and document what accuracy level is appropriate for their specific use case and demonstrate the system meets it.

Does Article 15 require adversarial testing?

Article 15(5) specifically requires that high-risk AI systems be resilient against attempts by third parties to alter their use, outputs, or performance by exploiting system vulnerabilities. While the Act does not prescribe specific adversarial testing methodologies, demonstrating this resilience in practice typically requires testing against known attack types relevant to the system — data poisoning, model inversion, adversarial examples, and prompt injection where applicable. For GPAI model-based systems, this should also address jailbreak resilience.

What are 'fallback mechanisms' under Article 15?

Article 15(4) provides that robustness may be achieved through technical redundancy solutions, which may include backup or fail-safe plans, so that the system continues to operate or fails safely when errors or unexpected outputs occur. In practice, this means the system should fail in a defined, documented, and safe way rather than producing arbitrary outputs under error conditions. This could include defaulting to human decision-making, returning no output rather than a potentially harmful one, or triggering an alert for human review.

How does Article 15 interact with Article 9?

Closely. Accuracy gaps, known robustness weaknesses, and cybersecurity vulnerabilities identified under Article 15 are risks that must feed into the Article 9 risk management system. Conversely, the risk assessment under Article 9 should identify technical performance as a risk category and inform what Article 15 documentation must address. In practice, accuracy and robustness findings should be cross-referenced between the two documentation sections.

Are Article 15 requirements different for GPAI-based systems?

The Article 15 obligations apply to the AI system as a whole, regardless of whether it is built on a general-purpose AI model. If you are the provider of a high-risk AI application built on a foundation model, you are responsible for the accuracy, robustness, and cybersecurity of the application as a whole; you cannot fully discharge Article 15 by relying on the GPAI provider's model-level compliance. You must assess and document the combined system's performance.