2026-05-23 EU AI ACT · ARTICLE 12 ~10 min

Article 12 checklist: ten questions to close the automated logging gap before 2 December 2027

Article 12 of Regulation (EU) 2024/1689 is three short paragraphs and two sub-lists. The implementation is a system-design problem that most high-risk providers underestimate on first reading. This page is the working ten-item checklist we built for audit preparation, with six event categories to log, the ten-year retention floor under Article 18, and a printable single-page version for compliance reviews.

What Article 12 actually says

The operative text of Article 12(1):

"High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system."

Article 12(2) adds the scope test: the logging capabilities must ensure a level of traceability of the system's functioning that is appropriate to its intended purpose. Read together, that is three commitments: automatic recording (no human in the loop for the log itself), over the lifetime of the system (not a development-time facility), and traceability appropriate to intended purpose (a scope test, not a uniform standard).

Article 12(2) then specifies what the logs must enable: identification of situations that may result in the system presenting a risk under Article 79(1) or in a substantial modification, facilitation of post-market monitoring under Article 72, and monitoring of operation under Article 26(5). Article 12(3) sets a minimum of four logged items for remote biometric identification systems under Annex III(1)(a): the period of each use, the reference database checked, the input data that led to a match, and the natural persons who verified the results. The principle generalises to any high-risk system.

The six log event categories

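For Annex III(1)(a) remote biometric identification, Article 12(3) enumerates four minimum items: use periods, reference database, input data, and verification persons. For other high-risk systems the categories are not enumerated but are derived from Article 12(2)'s traceability test. The six below add system decisions and risk events to the four enumerated items; they are the categories notified bodies consistently expect across system classes: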

Category	What to log
Use periods	Start time, end time, and duration of each session in which the system was operated. The minimum traceable unit of system activity.
Reference database	For systems matching against a reference set: the identifier of the reference database used at the time of each use, with version or content hash. Changes to the reference database between uses must be reconstructible.
Input data	The input data for which the search or inference produced a match or output. For biometric systems this is explicit; for other systems this maps to the inference inputs sufficient to reproduce the output, subject to data-minimisation principles.
Verification persons	Identification of the natural persons involved in verifying the results, where Article 14 human oversight is implemented through a verification step.
System decisions	The output of the system per session, including confidence scores or other uncertainty quantifications surfaced to the deployer.
Risk events	Any incident, near-miss, or deviation from expected behaviour that may trigger Article 73 reporting or Article 72 post-market monitoring. This is the category most providers under-instrument.
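
The six categories map naturally onto one record per use. The sketch below is one possible shape, not a schema taken from the regulation or from any standard: the class name `UseRecord`, every field name, and the `SCHEMA_VERSION` label are assumptions made for illustration. It serialises each record as one JSON line.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import List, Optional

SCHEMA_VERSION = "1.0.0"  # hypothetical version label, not from the regulation


@dataclass
class UseRecord:
    """One logged use of the system, covering the six categories in the table."""
    session_id: str
    started_at: str                    # use period start, ISO 8601 with UTC offset
    ended_at: str                      # use period end
    reference_db_id: Optional[str]     # reference database identifier, if any
    reference_db_hash: Optional[str]   # version or content hash of that database
    input_ref: str                     # identifier or content hash of the input
    verifier_ids: List[str]            # natural persons who verified the result
    decision: str                      # system output for the session
    confidence: Optional[float]        # uncertainty surfaced to the deployer
    risk_events: List[str] = field(default_factory=list)
    schema_version: str = SCHEMA_VERSION

    def duration_seconds(self) -> float:
        start = datetime.fromisoformat(self.started_at)
        end = datetime.fromisoformat(self.ended_at)
        return (end - start).total_seconds()

    def to_json_line(self) -> str:
        record = asdict(self)
        record["duration_seconds"] = self.duration_seconds()
        return json.dumps(record, sort_keys=True)


# Illustrative values only; the hashes are placeholders.
record = UseRecord(
    session_id="s-0001",
    started_at="2027-01-15T09:00:00+00:00",
    ended_at="2027-01-15T09:00:30+00:00",
    reference_db_id="watchlist",
    reference_db_hash="sha256:placeholder",
    input_ref="sha256:placeholder",
    verifier_ids=["operator-17", "operator-22"],
    decision="match",
    confidence=0.93,
)
print(record.to_json_line())
```

Keeping `risk_events` on the same record as the decision is a design choice: it makes the most commonly under-instrumented category hard to forget, at the cost of a second write path for incidents detected after the session ends.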

The ten-item Article 12 checklist

Each item below is one closeable question. If you can answer "yes, documented, tested" to all ten, your Article 12 posture is defensible at a notified body assessment. If any answer is "we are working on it," that becomes the work order between now and 2 December 2027.

01 Is logging automatic at the system level, not at the deployment script level? A log that fires only when the operator remembered to enable it does not satisfy Article 12. The capability must be in the system itself, not bolted on around it. Article 12(1)

02 Does the log cover the lifetime of the system, not only the training run? The phrase "over the lifetime" is non-trivial. Logs from training, validation, deployment, inference, and decommissioning must all be retrievable, not just the most recent. Article 12(1)

03 Are the six event categories instrumented? Use periods, reference database state, input data, verification persons, system decisions, risk events. Missing risk-event instrumentation is the single most common gap. Article 12(2), 12(3)

04 Is the log schema versioned and stable? A schema change without a version bump turns a ten-year audit trail into ten years of inconsistent records. The schema is part of the technical documentation under Annex IV §2(h). Annex IV §2(h)

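05 Are logs retained for at least ten years after the system is placed on the market? Article 18 sets a ten-year floor for the technical documentation and related records. The statutory minimum for the logs themselves is in Article 19: a period appropriate to the intended purpose, and at least six months unless other Union or national law provides otherwise. Ten years is the conservative working floor used here, so that the logs remain available for as long as the documentation they support. Retention is not "best effort." The storage infrastructure and its lifecycle policies must support the floor unconditionally. Article 18, Article 19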

06 Can a log entry be edited after the fact without detection? The regulation does not mandate cryptographic guarantees, but a notified body assessor will ask. Append-only storage, signed event streams, or Merkle-anchored batches are the standard answers. Article 12 in context of Article 17 QMS

07 Is each evaluation claim bound to a tamper-evident artefact? An accuracy number reported to a deployer or a regulator must be retrievable to a specific run, a specific test set, a specific seed. Pre-registration via a manifest hash is the cleanest pattern. Article 15 + Article 12 join

08 Can the log be read by a market surveillance authority without your involvement? Article 21 obliges providers to give a competent authority access to the logs on reasoned request. A log readable only through your proprietary tooling is hard to defend; a documented schema in a standard format is easy to hand over. Article 21, Article 74

09 Is there a documented log-integrity control under the QMS? Article 17 requires the QMS to cover examination and verification procedures. Log integrity is one such procedure. Document it, test it, log the tests. Article 17(1)(d), 17(1)(j)

10 Do logs feed into Article 72 post-market monitoring and Article 73 incident reporting? A log nobody reads is paperwork. The feedback loop into risk management (Article 9), monitoring (Article 72), and incident reporting (Article 73) is what makes Article 12 evidentiary. Articles 9, 72, 73
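
Items 03, 04, and 08 can be spot-checked with nothing but a standard library. The sketch below assumes a JSON Lines log; the required field names and the set of known schema versions are hypothetical and would come from your own documented schema.

```python
import json

# Hypothetical values: take these from your own documented schema.
REQUIRED_FIELDS = {"schema_version", "session_id", "started_at", "ended_at", "decision"}
KNOWN_SCHEMA_VERSIONS = {"1.0.0", "1.1.0"}


def audit_lines(lines):
    """Return (line_number, problem) tuples for a JSON Lines log."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            problems.append((number, "missing fields: " + ", ".join(sorted(missing))))
        version = record.get("schema_version")
        if version is not None and version not in KNOWN_SCHEMA_VERSIONS:
            problems.append((number, "unknown schema_version: " + str(version)))
    return problems


sample = [
    '{"schema_version": "1.0.0", "session_id": "s-1", "started_at": "t0", "ended_at": "t1", "decision": "match"}',
    '{"schema_version": "1.0.0", "session_id": "s-2", "started_at": "t0", "ended_at": "t1"}',
    '{"schema_version": "0.9", "session_id": "s-3", "started_at": "t0", "ended_at": "t1", "decision": "no match"}',
    "not json at all",
]
for line_number, problem in audit_lines(sample):
    print(line_number, problem)
```

If a script this small can read and validate the log, an authority can too, which is the practical test behind item 08.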

Retention: the ten-year floor under Article 18

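Article 18 binds providers to keep the technical documentation, the quality management system documentation, the EU declaration of conformity, and any decisions of notified bodies for ten years after the system is placed on the market or put into service. The automatically generated logs fall under Article 19, which requires retention for a period appropriate to the intended purpose and at least six months, unless other Union or national law provides otherwise. This checklist applies the ten-year period to the logs as well, since they are the evidence behind the documentation. For systems with a long operational life (medical-device-adjacent AI, infrastructure control systems, biometric identification deployed by public authorities), this is a multi-decade obligation when production lifetime is added.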

The retention infrastructure has three operational requirements that are easy to miss:

Storage durability across staff turnover. The person who set up the log storage in 2026 will not be on the team in 2036. Documentation of the storage architecture is part of the technical documentation.
Format survivability. A log written in a proprietary binary format will become unreadable when the tooling that produced it is deprecated. Open, documented formats (JSON Lines, Parquet, Arrow) are the safer choice. Cleartext is acceptable; obscure binary is risky.
Storage-cost stability. Object storage at scale over ten years has a non-trivial cost trajectory. Cold storage tiers and lifecycle policies are the standard answer, but a restore from the cold tier must be tested before that tier is relied on to hold evidence.
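
A restore drill can be as simple as recording a digest when a batch is archived and checking it again after restore. The sketch below uses local directories as a stand-in for a cold storage tier; the function names and file names are hypothetical, and a real drill would go through your storage provider's own restore path.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large batches do not load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_batch(batch: Path, archive_dir: Path) -> str:
    """Copy a batch into the archive and return the digest to record alongside it."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(batch, archive_dir / batch.name)
    return sha256_file(batch)


def restore_and_verify(archive_dir: Path, name: str, restore_dir: Path, recorded_digest: str) -> bool:
    """Restore a batch and confirm it still matches the digest recorded at archive time."""
    restore_dir.mkdir(parents=True, exist_ok=True)
    restored = restore_dir / name
    shutil.copy2(archive_dir / name, restored)
    return sha256_file(restored) == recorded_digest


def run_restore_drill() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        batch = root / "logs-2027-01.jsonl"
        batch.write_text('{"session_id": "s-1"}\n', encoding="utf-8")
        digest = archive_batch(batch, root / "cold")
        return restore_and_verify(root / "cold", batch.name, root / "restored", digest)


print(run_restore_drill())
```

The recorded digest belongs somewhere other than the archive itself, otherwise a corrupted archive can take its own checksum down with it.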

Where tamper-evident logging matters

The regulation does not explicitly mandate cryptographic guarantees on logs. It mandates a level of traceability "appropriate to the intended purpose." That phrase does most of the work in practice.

For a system whose intended purpose is high-stakes — credit scoring, recruitment screening, biometric identification, public-sector eligibility — the level of traceability that an audit will treat as appropriate is one where the log cannot have been edited after the event. For low-stakes systems an append-only file is sufficient. The boundary is fuzzy and notified bodies will draw it case by case.

The cheapest defensible posture is to make logs tamper-evident at the design level and stop arguing about whether it was required:

Append-only storage with documented controls preventing modification of existing entries.
Periodic Merkle anchoring of log batches to an external timestamping service (RFC 3161 TSA, OpenTimestamps, or a comparable public timestamping authority).
Signed evaluation manifests at the per-run level, where each evaluation claim is committed to a SHA-256 hash before the run produces an output. This is the PRML pattern, an open spec we maintain. See the v0.1 specification.
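
To make the Merkle anchoring step concrete, here is a minimal sketch of computing one root over a batch of log entries. It is a simplified construction under our own assumptions (SHA-256, one-byte prefixes to distinguish leaves from interior nodes, an odd node carried up unchanged), not the tree of any particular timestamping service. Submitting the root to an RFC 3161 TSA or OpenTimestamps is out of scope here.

```python
import hashlib
from typing import List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: List[bytes]) -> str:
    """Return the hex Merkle root of a batch of serialised log entries."""
    if not entries:
        raise ValueError("cannot anchor an empty batch")
    # Prefix 0x00 marks a leaf, 0x01 an interior node, so the two cannot collide.
    level = [_sha256(b"\x00" + entry) for entry in entries]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(_sha256(b"\x01" + level[i] + level[i + 1]))
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd node is carried up unchanged
        level = next_level
    return level[0].hex()


batch = [b'{"session_id":"s-1"}', b'{"session_id":"s-2"}', b'{"session_id":"s-3"}']
root = merkle_root(batch)

# Editing any entry after the fact changes the root that was anchored.
tampered = [batch[0], b'{"session_id":"s-X"}', batch[2]]
print(root != merkle_root(tampered))
```

Only the root needs to leave your infrastructure, so the anchoring step discloses nothing about the log contents.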

The PRML hash anchor pattern, in three lines

For the evaluation-claim subset of Article 12 logging (the part that binds metrics, thresholds, datasets, and seeds), the load-bearing artefact is a manifest hash committed before the run starts. The pattern is small enough to fit in a YAML file and one line of tooling:

version: prml/0.1
metric: accuracy
comparator: '>='
threshold: 0.85
dataset:
  id: imagenet-val-2012
  hash: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
seed: 42
producer:
  id: falsify.dev

The hash of the canonical bytes of this manifest becomes the run's identity. It is computed before the experiment runs. If a deployer later sees an accuracy claim and wants to verify it was not retroactively softened, they recompute the hash and compare. The threshold becomes a structural commitment rather than a footnote.
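
The commit-then-compare step can be sketched as follows. One loud assumption: the canonical form used here (JSON with sorted keys, compact separators, UTF-8) is chosen for illustration only. The PRML v0.1 specification defines its own canonical bytes, and a real implementation should follow that definition. The manifest values are copied from the example above.

```python
import hashlib
import json

manifest = {
    "version": "prml/0.1",
    "metric": "accuracy",
    "comparator": ">=",
    "threshold": 0.85,
    "dataset": {
        "id": "imagenet-val-2012",
        "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    },
    "seed": 42,
    "producer": {"id": "falsify.dev"},
}


def canonical_bytes(m: dict) -> bytes:
    # ASSUMPTION: illustrative canonical form, not the one defined by the PRML spec.
    return json.dumps(m, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def manifest_hash(m: dict) -> str:
    return hashlib.sha256(canonical_bytes(m)).hexdigest()


# Committed before the experiment runs.
committed = manifest_hash(manifest)

# A later, quietly softened threshold no longer matches the commitment.
softened = dict(manifest, threshold=0.80)
print(manifest_hash(softened) == committed)
```

The commitment only carries weight if `committed` is published or timestamped somewhere the team cannot rewrite, which is where the anchoring options from the previous section come in.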

This is one specific pattern for one specific subset of Article 12 logging. It does not cover use-period logging, verification-person logging, or risk-event logging. It does cover the most contested part of an Article 12 audit: "did your team change the accuracy threshold between the report and the audit?"

FAQ

Does Article 12 apply to GPAI models?

Not directly. Article 12 is in Chapter III Section 2, which binds high-risk systems. GPAI obligations live in Chapter V (Articles 51 to 56). Article 53(1)(a) imposes a technical documentation obligation on GPAI providers that resembles Article 12 in spirit but is structured differently. If your GPAI model is built into a high-risk system, the provider of that high-risk system carries the Article 12 obligations.

Is JSON Lines acceptable as a log format?

Yes. The regulation does not specify a format. JSON Lines is widely documented, parseable without proprietary tooling, and supports the schema-versioning requirement (item 04 on the checklist). Parquet and Arrow are also acceptable. Proprietary binary formats are not recommended for the ten-year retention requirement.

Do we need a separate log retention system, or can we use our existing observability stack?

The Article 12 logs are a defined regulatory artefact and benefit from clean separation from operational observability. Many teams take their existing observability stack as the source-of-truth, with a periodic export to a regulatory log store that satisfies the retention and integrity requirements. The export approach is simpler than rebuilding logging on top of a new system.

What is "tamper-evident" in regulatory practice?

Tamper-evident means modifications after the fact are detectable, not that they are impossible. Append-only file systems, cryptographic signatures over batches, and external timestamping are the standard mechanisms. Tamper-proof (impossibility of modification) is a stronger property and is not required by Article 12.
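
The distinction is easy to show with a hash chain, where each entry commits to the one before it. This is a minimal sketch with hypothetical field names, not a production log format. The edit in the example succeeds, which is why the log is not tamper-proof; it is also detected, which is what makes it tamper-evident.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: List[dict], payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev_hash, "hash": entry_hash(prev_hash, payload)})


def first_broken_index(chain: List[dict]) -> Optional[int]:
    """Return the index of the first entry that fails verification, or None if intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return index
        prev_hash = entry["hash"]
    return None


chain: List[dict] = []
for decision in ("match", "no match", "match"):
    append_entry(chain, {"decision": decision})

print(first_broken_index(chain))           # prints None: chain verifies as written
chain[1]["payload"]["decision"] = "match"  # after-the-fact edit
print(first_broken_index(chain))           # prints 1: the edited entry is detected
```

Someone with write access to the whole chain could recompute every hash after an edit, so the chain on its own is only evidence against partial tampering. External timestamping of the latest hash closes that gap.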

How does Article 12 interact with the GDPR data-minimisation principle?

The regulation requires logging of input data that triggered system decisions. GDPR requires data minimisation. The standard reconciliation is to log identifiers and content hashes rather than the raw input data, with the raw data accessible from a separate store under access controls. Recital 27 of the AI Act specifically references this interaction.
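
A minimal sketch of that reconciliation, assuming SHA-256 as the content hash: the log entry carries only an identifier and a digest, and the raw input sits in a separate store. The in-memory dictionary `raw_store` is a hypothetical stand-in for an access-controlled system, and all function and field names are illustrative.

```python
import hashlib
import uuid

# Hypothetical stand-in for a separate, access-controlled raw data store.
raw_store = {}


def log_reference(raw_input: bytes) -> dict:
    """Store the raw input separately and return the entry that goes in the log."""
    input_id = str(uuid.uuid4())
    raw_store[input_id] = raw_input
    return {"input_id": input_id, "input_sha256": hashlib.sha256(raw_input).hexdigest()}


def matches_log(entry: dict) -> bool:
    """Check that the stored raw input still corresponds to the logged digest."""
    raw = raw_store.get(entry["input_id"])
    return raw is not None and hashlib.sha256(raw).hexdigest() == entry["input_sha256"]


entry = log_reference(b"applicant record, illustrative bytes only")
print(sorted(entry))        # the log entry holds an ID and a digest, no raw data
print(matches_log(entry))   # the raw input can still be tied back to the entry
```

A plain digest of a short or guessable input can be reversed by trial. Where inputs are low-entropy, a keyed hash (for example HMAC-SHA-256 with the key held alongside the raw store) is one option worth considering.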

About this page. Written by Cüneyt Öztürk, independent researcher and author of the PRML specification. The Falsify Sprint programme runs for ML evaluation evidence under the AI Act. This page is not legal advice. Article references are to Regulation (EU) 2024/1689 as published in the Official Journal on 12 July 2024. The text quoted in the "What Article 12 actually says" section is the operative text of Article 12(1) of the regulation; see also the published consolidated text at eur-lex.europa.eu/eli/reg/2024/1689/oj. CC BY 4.0. Printable single-page summary: use the "Print or save as PDF" button at the top of the page.