Writing
EU AI Act Article 15 Deferred: What Changes for AI Security Testing
The Digital Omnibus pushed Article 15's cybersecurity requirements to December 2027. The deadline moved; the obligation didn't. Here's what it means for SaaS teams shipping AI features.
TL;DR. The EU AI Act's Article 15 cybersecurity requirements for high-risk AI systems were deferred from August 2026 to December 2027 under the May 2026 Digital Omnibus agreement. The deadline shifted, but the underlying obligation, preventing, detecting, responding to, and controlling attacks on AI systems, hasn't changed. Enterprise customers in regulated verticals are already asking for evidence of compliance via vendor questionnaires, which means the market clock runs ahead of the regulatory one. SaaS teams shipping AI features that fall under Article 15 should start their audit cycle now, not in 2027.
What happened
On 7 May 2026, EU institutions reached a provisional agreement on the Digital Omnibus on AI, the first set of amendments to the EU AI Act since its adoption in 2024. The most significant change for cybersecurity: Article 15's requirements for high-risk AI systems, originally enforceable from 2 August 2026, are deferred to 2 December 2027 for standalone Annex III systems, and to 2 August 2028 for AI embedded in regulated products under Annex I.
One caveat worth stating plainly: this is a provisional political agreement, not yet adopted law. It only takes legal effect once formally adopted and published in the Official Journal. Until then, 2 August 2026 technically remains a live compliance date, though adoption ahead of that date is widely expected.
The deferral gives affected companies more time to build meaningful compliance mechanisms. The Omnibus also tightens watermarking requirements for AI-generated images, video, and audio, a topic I covered in depth in my master's thesis, and adds a new prohibition on AI-generated non-consensual intimate imagery and child sexual abuse material to Article 5.
The deadline math. A full audit cycle, from internal assessment through patching and re-testing, takes three to six months. A SaaS team starting today completes the work in 2026. A team that waits for the deadline is still patching when the regulation comes into force.

What Article 15 actually requires
Article 15(1) requires high-risk AI systems to be designed so that they achieve an appropriate level of cybersecurity, robustness, and accuracy. This article focuses on the cybersecurity dimension. Given the stochastic nature of how current large language models work, accuracy deserves an article of its own.
On cybersecurity, Article 15 requires that attacks on training data (data poisoning) or on pre-trained components (model poisoning) are prevented, detected, responded to, and controlled. I wrote extensively in my thesis about data poisoning, sometimes called “poisoning the well”, and the literature is clear that a small fraction of adversarial documents in a training corpus is enough to degrade model performance.
A related risk is the backdoor. A backdoored model behaves as intended until a trigger activates it, a specific phrase, a token, a hash value; the trigger is arbitrary, chosen by the attacker. Until then it sits like a sleeper agent. Once activated, it acts on behalf of the adversary that planted it: extracting user data, returning intentionally wrong answers (to damage a business until a ransom is paid), or simply burning through tokens to drive up the cost of running the model.
The term “prompt injection” never appears in Article 15, but it's covered there under the phrase “inputs designed to cause the AI model to make a mistake.” The article also names confidentiality attacks and model flaws. Model flaws are the hardest to test for, because the regulation leaves room for interpretation about what the term means. My read is that it covers things like bias, the model behaving in ways it shouldn't. The clearest example is the recruiting model Amazon built and later scrapped, which systematically favored male candidates even when everything else in the CV was identical.
The four verbs in Article 15 each describe a distinct security capability:
- Prevented. This is the purpose of penetration testing, and one of the most effective measures against the attack classes Article 15 covers.
- Detected. The attack has already happened, and the job is to surface it. There is no such thing as 100% security in anything human-made; the goal is reducing failure rates and ensuring detection mechanisms are in place when prevention fails. Effective detection in production means continuous red teaming (costly, but necessary for larger deployments), plus monitoring and logging of model performance and generations, with anonymized data where the context requires it.
- Responded. What happens after detection. Useful measures include rollback to a stable model version, isolation, retraining, or replacing the foundation model. Fine-tuning can also help when errors stem from out-of-distribution training data, comparably cheap, and it shifts the model's focus to a different region of the training corpus.
- Controlled. The safeguards standing between the model and misuse: rate limits, access controls, output filters. As token costs decline, both output and input filters are increasingly enforced by separate LLMs acting as auditors, judging what the model receives as input, and whether its output is appropriate before it reaches the user.
In practice, most current AI deployments handle “prevented” weakly, through generic input filtering and prompt engineering rather than systematic red teaming against documented attack patterns. The deferral gives organizations time to fix this. Whether they actually use it is the open question.
What the deferral changes for security testing
The deferral buys time; it does not eliminate the need.
Before the Omnibus agreement, the timeframe was undeniably too short. Companies are deploying AI at scale in affected industries, and reputable penetration testing takes time. Depending on the size of the deploying organization, there's a bureaucratic ritual to finding a suitable pen-tester in the first place. Once the report lands, fixing the findings takes further time, scaled to their number and severity, and a buffer in that patching phase is always welcome.
Some fixes can't be handled by the deploying company's IT team at all; they require an upstream patch from the foundation model provider, though competitive pressure tends to keep those providers fast on security-related updates. And because the commercial foundation models with the deepest market penetration are black boxes, much of the testing can only happen at the inference level.
There's also a cultural gap. In many organizations, the willingness to thoroughly test an AI system in a staging environment simply isn't there yet, so the system gets deployed before it is properly tested. With the deadline approaching, that creates real regulatory exposure, and substantial business risk beyond the regulation itself. A company running an inadequately tested model can suffer a cyber attack and be extorted, damage its reputation with current and prospective clients, and face fines of up to €15 million or 3% of global annual turnover, whichever is higher.
Meanwhile, the market is moving faster than the regulator. Vendor security questionnaires already ask for SOC 2 attestation and AI security evidence. The market clock runs ahead of the regulatory clock, which is the real reason the December 2027 date is not the date that should drive your planning.
What this means for SaaS teams shipping AI features
If you've already deployed an AI feature, act now.
The first step is determining whether the deployed model falls under Article 15, that is, whether it's a high-risk AI system, since that classification significantly broadens what applies to you. If it does, the next step is to evaluate the model itself. Sometimes there's an obvious mismatch worth fixing first, a 7B open-source model running clinical decision support, for instance. Assuming the model choice fits the use case, the audit cycle looks roughly like this:
- Internal assessment: one to two weeks, depending on organization size.
- Vendor selection: two to four weeks to identify and engage a qualified pen-tester.
- Testing: typically four to six weeks, depending on scope. This can run in a staging environment with synthetic data, so it neither affects the live business nor raises data-protection concerns. Over those weeks, automated testing runs alongside manual testing, and the report is written, a technical section with concrete remediation guidance for your IT team, and an executive section aimed at management.
- Patching: two to eight weeks, depending on the findings.
- Re-testing: one to two weeks. Any credible pen-tester will offer to re-test critical, high, and medium findings to confirm they were fixed correctly.
That's three to six months end to end. The math is the argument: a team starting today is done well within 2026. A team that waits for the deadline to feel urgent is still mid-cycle when the rules take effect.
Why “we'll handle this in 2027” is the wrong answer
The EU AI Act creates real regulatory overhead, but its scoping to high-risk AI systems is appropriate. Waiting until 2027 to begin is not. The steps above take time, and finding the right specialist is not trivial.
By late 2027, hiring an AI security auditor will be a procurement scramble. Right now, it's a strategic choice.
If you're shipping an AI feature that may fall under Article 15 and you want to know where you stand before the deadline becomes urgent, I run security audits scoped exactly to this problem. Book a 20-minute scoping call or email me a description of your system at s.tagwercher@tagwercher.io. I'll tell you honestly whether your system is in scope and what a realistic audit timeline looks like.