2026-05-23 EU AI ACT ~12 min

EU AI Act readiness assessment for high-risk AI providers before 2 August 2026

High-risk obligations under Regulation (EU) 2024/1689 enter application on 2 August 2026. This page is a practical, no-spin walk through the six articles that bind a high-risk provider on that date, a ten-question gap check we use with clients, and where cryptographic evaluation evidence fits. Written by the author of the PRML specification.

High-risk enforcement begins
2 August 2026
Regulation (EU) 2024/1689 · Article 113 application schedule

What "readiness" actually means under Regulation (EU) 2024/1689

The phrase "EU AI Act readiness" is doing several different jobs in the market right now. Some vendors use it to mean having a policy document. Others use it to mean having completed a fundamental rights impact assessment. Others mean a fully built quality management system under Article 17. None of these are wrong; none of them is the whole picture.

For a high-risk provider — the category that covers most enterprise AI systems used in credit scoring, recruitment, education, critical infrastructure, law enforcement, migration, justice administration, and biometric identification (Annex III), plus AI as a safety component in regulated products (Annex I) — readiness means having defensible answers to six concrete obligations on the day enforcement begins.

This page covers those six. It does not try to compress the entire regulation (113 articles and 13 annexes) into a checklist; it focuses on what a provider has to be able to show, in writing, on 2 August 2026.

The six articles that bind a high-risk provider on 2 August 2026

Article | What it requires | Evidence shape
Art 9 | Risk management system covering the entire AI system lifecycle, iteratively documented and tested | Risk register, test results per risk, mitigation log
Art 12 | Automated logging of events relevant to risk identification, post-market monitoring, and traceability across the system lifetime | Tamper-evident logs, log retention policy, log integrity controls
Art 13 | Transparency to deployers: instructions for use, intended purpose, accuracy and robustness levels, human oversight provisions | Instructions for use document, accuracy claims with evidence
Art 14 | Effective human oversight measures appropriate to the risk and context of use | Oversight procedure documentation, training records, override logs
Art 17 | Quality management system covering the regulatory strategy, design controls, examination procedures, post-market monitoring, communication with competent authorities | QMS documentation (ISO 9001-compatible structure), audit-ready
Art 18 | Retention of technical documentation, QMS documentation, notified-body decisions, and the EU declaration of conformity for ten years after the system is placed on the market or put into service (log retention itself is set by Article 19) | Documented retention infrastructure, content-addressed where possible

Article 12 and Article 18 are the two that most procurement teams underestimate when first reading the regulation. The text is short. The implementation is system design.

Why Article 12 is the article that breaks naive compliance

Article 12 requires that high-risk AI systems "technically allow for the automatic recording of events (logs) over the lifetime of the system", at a level of traceability sufficient to support three things: identifying situations that may result in the system presenting a risk to health, safety, or fundamental rights, or in a substantial modification; post-market monitoring; and monitoring of the system's operation by deployers.

This is not the same thing as application logging. The logs have to be designed in. They have to survive operational pressure (a provider cannot just delete them when storage is tight). They have to be retrievable on request by a market surveillance authority or notified body. They have to be tamper-evident enough that the absence of tampering is itself part of the audit trail.

"Tamper-evident" is the load-bearing phrase. A log file with no integrity control is, from an audit perspective, indistinguishable from a log file that was edited after the fact. The regulation does not prescribe a specific cryptographic mechanism, but in practice the providers who get clean conformity assessments will be the ones who can demonstrate that the logs have not been modified after recording. Hash chains, cryptographic timestamps, signed log entries, content-addressed storage — these are not novelties. They are the standard answer.

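As an illustration of the hash-chain option named above, here is a minimal sketch in Python. It is not a mechanism prescribed by the regulation, and the event fields (`ts`, `type`, `model`, `operator`) and the `GENESIS` placeholder are assumptions made for the example. Each entry stores the hash of its predecessor, so editing, removing, or reordering an earlier entry invalidates every later link.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor hash for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    """Append an event, chaining it to the hash of the last entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    """Recompute every link; an edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


# Hypothetical events, for illustration only.
log = []
append(log, {"ts": "2026-08-02T09:00:00Z", "type": "inference", "model": "m-1.4.2"})
append(log, {"ts": "2026-08-02T09:00:05Z", "type": "override", "operator": "op-17"})
print(verify(log))  # True

log[0]["event"]["model"] = "m-1.4.3"  # simulate an after-the-fact edit
print(verify(log))  # False
```

A chain on its own does not reveal truncation of the most recent entries, and anyone who can rewrite the whole file can recompute it. In practice the latest hash would be anchored periodically in a system the provider does not control.
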
Why Article 18 is the article that breaks naive storage strategy

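Article 18 requires the provider to keep the technical documentation referred to in Article 11, the quality management system documentation referred to in Article 17, notified-body decisions and approved changes where applicable, and the EU declaration of conformity at the disposal of national competent authorities for ten years after the AI system has been placed on the market or put into service. The automatically generated logs referred to in Article 12 have their own retention rule in Article 19: a period appropriate to the intended purpose, and at least six months. Providers who want a log-backed evidence trail to last as long as the documentation it supports will need to retain the relevant logs for longer than that minimum.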

Ten years is a long time in cloud infrastructure. Storage providers change pricing, deprecate APIs, get acquired, retire services. A retention strategy that depends on a single vendor's continued existence is fragile. The providers who get this right will use content-addressed retention (the artefact's hash is the address, not a vendor-specific URL), with multiple geographic mirrors, and a documented chain of custody.
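
The content-addressed retention idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not a storage product: the `sha256-` address prefix, the mirror directory names, and the `put`/`get` helpers are all hypothetical. Because the address is derived from the bytes, any mirror's copy can be checked independently of whoever hosts it.

```python
import hashlib
import tempfile
from pathlib import Path


def content_address(data: bytes) -> str:
    """The address is derived from the bytes, not from a vendor-specific location."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


def put(store: Path, data: bytes) -> str:
    """Write the artefact into one mirror under its content address."""
    store.mkdir(parents=True, exist_ok=True)
    address = content_address(data)
    (store / address).write_bytes(data)
    return address


def get(stores: list, address: str) -> bytes:
    """Return the first mirror copy whose bytes still hash to the address."""
    for store in stores:
        path = store / address
        if path.exists():
            data = path.read_bytes()
            if content_address(data) == address:
                return data
    raise LookupError(f"no intact copy of {address}")


# Two local directories stand in for geographically separate mirrors.
root = Path(tempfile.mkdtemp())
mirrors = [root / "mirror-a", root / "mirror-b"]
artefact = b"technical documentation v1\n"
address = [put(mirror, artefact) for mirror in mirrors][0]

(mirrors[0] / address).write_bytes(b"silently corrupted")  # one mirror degrades
print(get(mirrors, address) == artefact)  # True: the intact mirror still serves it
```

The same address remains valid if a mirror is moved to a different vendor, which is the property that matters over a ten-year window.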

This is also where the evaluation evidence specifically lives. The accuracy and robustness claims a provider makes under Article 13 need an evidence trail. That trail has to be re-derivable in 2036 against the original artefact, not against a snapshot of how a vendor's dashboard looked in 2026.

The ten-question readiness self-check

The following is the gap analysis we run with regulated AI clients before any engagement. Honest answers to all ten take roughly thirty minutes. There is no scoring rubric; this is a triage instrument, not a maturity assessment.

  1. For each high-risk AI system you provide, can you point to a single document that names the metric, the threshold, and the dataset against which the published accuracy claim was evaluated? If the answer involves a slide deck, a Slack message, or "ask the team," that is the gap.
  2. Can you re-derive the SHA-256 (or equivalent content hash) of every published evaluation claim from the canonical record, six months from now, without depending on a vendor portal? If the claim is "in our MLflow," that is a single-vendor dependency.
  3. Is the threshold for each claim verifiable as having been committed before the result was observed? A threshold chosen after seeing the result looks identical to one chosen beforehand unless a pre-commitment timestamp anchor exists, and Article 12 logs alone will not reveal the difference.
  4. For the logs Article 12 requires, can you produce evidence that they have not been edited since recording? The mechanism does not have to be exotic; it has to be auditable. Sequential timestamps in a database are not, on their own, tamper-evident.
  5. If your evaluation dataset is updated by an upstream provider (a benchmark publisher, a labeling vendor, a data licensor), can you tell from your own records which version of the dataset was used for the claim you made? Dataset content hashing is the standard answer. Naming alone is not.
  6. For each model version named in your conformity documentation, can you cryptographically pin the build that was evaluated against the claim? Model substitution is the most common reason published claims diverge from operational behaviour. Pinning is mostly free; the absence of it is mostly expensive.
  7. Does your retention infrastructure cover the ten-year window in Article 18, including a scenario where your current cloud provider deprecates the relevant service? Content-addressed storage and multiple geographic mirrors are the conservative answer.
  8. If a market surveillance authority requested all evidence relating to a single published claim, could your team assemble that evidence package in less than five business days? If the answer is no, the gap is process, not technology.
  9. For the regulatory crosswalk between your internal evidence and the EU AI Act articles, is there a documented mapping that an auditor can read without translation by your team? Internal vocabulary that does not line up with the regulation is a friction point under scrutiny.
  10. When you publish an evaluation claim externally (in a model card, a paper, a customer report, a product page), does the same claim exist verbatim in your internal evidence, with a re-derivable hash? Divergence between external and internal versions of the same claim is the single most common audit finding.
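
Questions 5 and 6 both come down to pinning an artefact by its content rather than its name. The sketch below shows one way to do that for a dataset directory (a single model file would use `file_sha256` directly). The directory layout, file names, and the `tree_sha256` construction are assumptions for illustration, not a format defined by any specification cited on this page.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Stream a file through SHA-256 so large artefacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_sha256(root: Path) -> str:
    """Hash a directory as the sorted list of (relative path, file hash) pairs."""
    files = sorted(
        (path.relative_to(root).as_posix(), path)
        for path in root.rglob("*")
        if path.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        digest.update(f"{rel}\0{file_sha256(path)}\n".encode("utf-8"))
    return digest.hexdigest()


# Hypothetical evaluation dataset, for illustration only.
root = Path(tempfile.mkdtemp())
(root / "test.csv").write_bytes(b"id,label\n1,0\n2,1\n")
(root / "README.txt").write_bytes(b"benchmark v1\n")
pinned = tree_sha256(root)  # record this value alongside the claim

(root / "test.csv").write_bytes(b"id,label\n1,0\n2,0\n")  # upstream relabels a row
print(tree_sha256(root) == pinned)  # False: same name, different dataset
```

Sorting by relative path makes the result independent of filesystem listing order, so the same files produce the same hash on a different machine.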

Most regulated AI teams we work with answer yes to four or five of these, partially to two or three, and no to the rest. That distribution is normal. It is also the distribution that turns into procurement budget for compliance evidence services between now and 2 August 2026.

Where cryptographic evaluation evidence fits

The questions above are framework-agnostic. A provider could answer all ten with bespoke internal tooling, with a Big-4 consultancy retainer, or with open primitives. We work on the open-primitives end of that spectrum. The honest part of this page is that PRML — the Pre-Registered ML Manifest specification we author — addresses questions 1, 2, 3, 5, 6, and 10 directly. It does not address questions 4, 7, 8, or 9 on its own.

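What PRML closes

Questions 1, 2, 3, 5, 6, and 10: the single record naming metric, threshold, and dataset; re-derivable content hashes; threshold pre-commitment; dataset version pinning; model build pinning; and verbatim parity between external and internal claims.

What PRML deliberately does not close

Questions 4, 7, 8, and 9: Article 12 log integrity, ten-year retention infrastructure, the process for assembling an evidence package on request, and the regulatory crosswalk.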

This page is not legal advice. The regulation is binding; the interpretive mappings we publish are an engineering pattern. A high-risk AI provider's actual readiness depends on a qualified-counsel review against the specifics of the system and the notified body's expectations. We are not lawyers; the spec we author is one piece of the evidence layer, not the whole.

What "good evidence" actually looks like in 2026

The shape of an audit-quality evidence package for a single published evaluation claim under the EU AI Act, as of 2026, has converged across the providers who have gone through early conformity assessments:

  1. A pre-registered claim manifest. Metric, comparator, threshold, dataset content hash, seed, producer identity, claim identifier, timestamp. Canonically serialised, content-addressed, hashable offline by anyone.
  2. An anchor timestamp. Not the producer-declared timestamp inside the manifest — a separate, externally observable timestamp from a system the producer does not control. Git commit time in a public repository, registry receipt time, Sigstore Rekor entry, RFC 3161 timestamping authority, arXiv submission, DOI registration, or CI run log.
  3. A re-derivation script. Forty lines of Python (or any language). Inputs: manifest text. Output: the SHA-256 hash. If the hash matches the value recorded at the anchor, the manifest is verified as unchanged since the anchor time.
  4. An execution attestation. Article 12 log integrity for the actual eval run. Sigstore Rekor, in-toto attestations with SLSA provenance, or equivalent. Cookbook Pattern 11 walks through PRML + Sigstore.
  5. An independence attestation. Where the claim depends on multi-party validation (clinical trial-shaped evaluations, benchmark leaderboards with external validators), the validators' verdicts must be committed before they see each other's results. Blind commit-reveal, as documented in Cookbook Pattern 13, is the standard approach.
  6. A regulator-facing summary. Two to four pages, plain prose, that explains what was claimed, what evidence supports it, what residual risk remains, and what controls the provider has put in place. This is the document the notified body or market surveillance authority reads first.
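
Items 1 and 3 of the list above can be sketched together. This is an illustration only: the canonical form used here (JSON with sorted keys, compact separators, UTF-8) is an assumption and should not be read as the PRML canonical serialisation, which the specification defines. The field names and values are likewise hypothetical.

```python
import hashlib
import json


def canonical_bytes(manifest: dict) -> bytes:
    # ASSUMPTION: canonical form = JSON, sorted keys, compact separators, UTF-8.
    # The PRML specification defines its own canonical serialisation; use that in practice.
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def manifest_hash(manifest_text: str) -> str:
    """Parse the manifest text and return the SHA-256 of its canonical bytes."""
    return hashlib.sha256(canonical_bytes(json.loads(manifest_text))).hexdigest()


# Hypothetical field names and values, for illustration only.
original = json.dumps({
    "claim_id": "claim-001",
    "metric": "accuracy",
    "comparator": ">=",
    "threshold": 0.91,
    "dataset_sha256": "<dataset content hash>",
    "seed": 7,
})
anchored_hash = manifest_hash(original)  # the value recorded by the external anchor

reordered = json.dumps(dict(reversed(list(json.loads(original).items()))))
edited = original.replace("0.91", "0.89")

print(manifest_hash(reordered) == anchored_hash)  # True: key order does not matter
print(manifest_hash(edited) == anchored_hash)     # False: the threshold was changed
```

The point of canonicalising before hashing is that two parties who hold the same manifest content, formatted differently, still derive the same hash offline.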

None of these six artefacts is exotic. None requires a proprietary platform. All of them can be produced with open specifications and open-source tooling. The hard part is not the technology; it is the discipline of producing them before the claim is published, not retroactively after a regulator asks.
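
The blind commit-reveal step in item 5 above follows a generic pattern that can be sketched briefly. The construction below (SHA-256 over a random nonce and the verdict) is a textbook commitment, offered as an assumption about what such a scheme might look like. It is not taken from Cookbook Pattern 13, and the validator names and verdict strings are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(verdict: str) -> tuple:
    """Return (commitment, nonce). Publish the commitment; keep the nonce private."""
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256(f"{nonce}:{verdict}".encode("utf-8")).hexdigest()
    return digest, nonce


def reveal_ok(commitment: str, nonce: str, verdict: str) -> bool:
    """Check that a revealed (nonce, verdict) pair matches the earlier commitment."""
    digest = hashlib.sha256(f"{nonce}:{verdict}".encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, commitment)


# Phase 1: each validator commits without seeing the others' verdicts.
verdicts = {"validator-a": "pass", "validator-b": "fail"}
sealed = {name: commit(verdict) for name, verdict in verdicts.items()}
published = {name: commitment for name, (commitment, _) in sealed.items()}

# Phase 2: once every commitment is published, validators reveal nonce and verdict.
for name, verdict in verdicts.items():
    print(name, reveal_ok(published[name], sealed[name][1], verdict))

# A validator who changes their verdict after seeing the others cannot match the commitment.
print(reveal_ok(published["validator-b"], sealed["validator-b"][1], "pass"))
```

The random nonce matters because verdicts come from a tiny set of values. Without it, anyone could hash "pass" and "fail" and read the verdict straight off the published commitment.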

What changes on 2 August 2026, in practice

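The regulation has been in force since 1 August 2024. The prohibitions on certain AI practices and the AI literacy obligations applied from 2 February 2025, and the general-purpose AI obligations from 2 August 2025. The high-risk obligations for Annex III systems enter application on 2 August 2026; those for AI that is a safety component of products covered by Annex I (Article 6(1)) follow on 2 August 2027. Member State authorities and the AI Office can begin enforcement of the Annex III obligations on 2 August 2026.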

For a high-risk system already on the EU market before 2 August 2026, whether and when the obligations apply depends on the transitional provisions in Article 111, which turn on, among other things, whether the system undergoes significant changes in its design after that date. A qualified counsel review against the specific system and its placement-on-market date is the appropriate consultation. We do not give legal advice on the transitional provisions; we point to the regulation.

The pattern we see across early conformity assessments: the gap between "we have a compliance policy" and "we can produce the six artefacts above for every published claim within five business days" is the gap that defines readiness. The compliance policy is necessary. It is not sufficient.

Three concrete next steps

Whatever vendor or framework you ultimately pick — Big-4 audit firm, boutique compliance vendor, in-house build, open-source primitive — the next three steps are the same:

  1. Inventory every published evaluation claim. Internal claims, customer-facing claims, model card claims, technical report claims, marketing claims. Anything where a number is stated about how the system performs. This is usually a longer list than teams expect.
  2. Identify which claims will face external scrutiny first. Notified body conformity assessment is one path. Customer due-diligence is another. Procurement security review is a third. Each has different evidence weight requirements.
  3. Build the evidence for the top three claims before the rest. The marginal cost of evidence per claim falls quickly; the marginal cost of the first claim is high. Do not try to do all of them at once.

For the third step specifically, we offer a Diagnostic Sprint engagement that takes one to twelve claims through this process, produces the audit-quality artefacts in writing, and leaves you with re-derivable cryptographic evidence and a regulator-mappable summary. The full tier definitions are at /sprint/#tiers. Audit Review (€15,000, one claim, five business days) is the entry tier. Enterprise (€180,000-250,000, up to twelve claims, isolated registry, two written executive briefings) is the full envelope.

Frequently raised questions during scope conversations

Does PRML replace our existing model risk management process?

No. PRML is a manifest format for one specific kind of artefact: a pre-committed evaluation claim. Your model risk management process — internal MRM under SR 11-7, internal model validation, the QMS under Article 17 — sits around it and contains it. PRML manifests are inputs into the MRM evidence package, not a replacement for the package.

What if our claims are continuous (streaming evaluation, live leaderboards)?

The v0.2 specification (frozen 22 May 2026) added a streaming mode for exactly this case. Continuous evaluations commit to an aggregation rule and a window rather than a single batch. The cryptographic shape is the same; the semantic shape is different. The v0.2 RFC notes document the streaming mode in detail.
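
To make the idea of committing to an aggregation rule and a window concrete, here is a rough sketch. It does not reproduce the v0.2 streaming format: the manifest fields (`aggregation`, `window`), the tumbling-window interpretation, and the hard-coded `>=` comparison are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical streaming manifest; in practice it would be hashed and anchored
# before any scores are observed, like a batch manifest.
manifest = {
    "metric": "accuracy",
    "comparator": ">=",
    "threshold": 0.75,
    "aggregation": "mean",
    "window": 4,
}

AGGREGATIONS = {"mean": mean, "min": min}


def window_verdicts(scores: list, manifest: dict) -> list:
    """Apply the committed aggregation rule to each complete, non-overlapping window."""
    size = manifest["window"]
    aggregate = AGGREGATIONS[manifest["aggregation"]]
    verdicts = []
    for start in range(0, len(scores) - size + 1, size):
        value = aggregate(scores[start:start + size])
        verdicts.append(value >= manifest["threshold"])  # comparator assumed to be ">="
    return verdicts


# Per-example correctness as it arrives; the trailing partial window is not judged yet.
scores = [1, 1, 0, 1, 1, 0, 0, 1, 1, 1]
print(window_verdicts(scores, manifest))  # [True, False]
```

Fixing the window and the aggregation rule in advance is what prevents a producer from choosing, after the fact, the slice of the stream that happens to clear the threshold.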

Does this work for non-EU regulators (NIST AI RMF, ISO/IEC 42001, FDA SaMD, MHRA AI as a medical device)?

The evidence shape is the same. The regulatory mapping is different. We publish standard crosswalks for the three most-requested frameworks (EU AI Act Article 12, NIST AI RMF 1.0, ISO/IEC 42001:2023) and produce personalised mappings to other frameworks under the Enterprise tier of the Diagnostic Sprint.

Can we self-attest without engaging Studio 11?

Yes. The PRML specification is CC BY 4.0 and the reference implementations are MIT. There are four byte-equivalent implementations, in Python, JavaScript, Go, and Rust. The full toolchain is open. We charge for the engagement (scope memo, audit report, regulatory crosswalk, re-derivation script, support window), not for the specification or the tooling. If your team has the bandwidth to do this internally, the documentation is there for you to do it.

How to start a scope conversation

Email [email protected] with subject [EU AI Act readiness] and four lines naming your high-risk use case, the count of claims you want to lock, the target regulator or notified body, and your internal deadline before 2 August 2026. We respond within one business day with a 1-2 page scope confirmation and the relevant tier's invoice. No scoping calls. Everything in writing. The invoice rides with the scope memo, so you can hand both to procurement in one envelope.

Cüneyt Öztürk — falsify track lead, Studio 11. [email protected]. This page is CC BY 4.0. Studio 11 is a Türkiye-incorporated company. The PRML specification we author is CC BY 4.0; the reference implementations are MIT. We are not lawyers. The regulation is binding; qualified counsel reviews are appropriate for compliance decisions. The crosswalk pages we publish carry the interpretive-mapping disclaimer; this page does the same.