EU AI ACT · ARTICLE 12 EVIDENCE PACK · SAMPLE · v1 · 2026-05-26

Article 12 Evidence Pack (sample)

A tamper-evident, cryptographically self-verifying technical artifact for EU AI Act Article 12 record-keeping and Annex IV section 2(d) documentation. This is a sample of the deliverable a Sprint Audit Review produces. It is suitable for forwarding to internal compliance leads, notified-body assessors, and accredited audit firms.

Provider: Sample Provider Inc. (placeholder)
AI System: Image classification model, ResNet50 fp16, deployed for content-moderation pre-screen on user-uploaded media
Annex III category: Point 1(a), biometric-categorisation-adjacent (placeholder)
Evaluation claim: Top-1 accuracy on ImageNet-1k validation (2012), 10-crop disabled, batch size 64, fp16 inference, ≥ 0.92
PRML version: v0.1 (stable) / v0.2 (RFC compatible)
Commit date (pre-run): 2026-05-08 20:00:00 UTC
Manifest SHA-256: PLACEHOLDER-a3f4b9c2e7d1f8e4a9c0b3d6f5e2a1c8b4d7e0a3c6b9f2e5d8a1b4c7d0e3f6a9
Authored by: Cüneyt Öztürk, independent researcher and maintainer of PRML
Issued through: Cüneyt Öztürk

What this document is, and is not. This is a technical evidence artifact. It is not a legal opinion, not a notified-body certification, and not an accredited audit report. Auditors, notified bodies, and compliance teams may use it as input to their own assessment. The cryptographic manifest hash in section 2 is self-verifying: any party may re-hash the manifest and check it against the published commit, using one of the four byte-equivalent reference implementations. Spec section 8.1 enumerates what PRML does not cover.

1. What this document is

A complete Article 12 record for one evaluation claim contains, at minimum, four things: a pre-registered manifest that fixes the claim, a cryptographic commit of that manifest before the run, the result of the run, and a procedure for any third party to verify that the manifest and the commit match. This document collects all four in a single artifact, plus the mappings to the regulatory and standards text that an assessor will look for.

It is structured as a self-contained PDF: the assessor does not need to navigate to any other URL to reproduce the cryptographic check, although the references in section 4 and section 5 let them cross-check against the regulation text and the ISO control text directly.

2. The pre-registered manifest and its commit

The manifest below is the nine-field YAML document fixed before the evaluation ran, in the PRML v0.1 schema. The SHA-256 hash is computed over the canonicalised bytes of this document (the spec's canonicalisation section defines the canonical form rules). The hash was published to the registry at the timestamp shown, before any result was recorded.

version: prml/0.1
claim_id: 01900000-0000-7000-8000-000000000001
created_at: "2026-05-08T20:00:00Z"
metric: accuracy
comparator: ">="
threshold: 0.92
dataset:
  id: imagenet-val-2012
  hash: PLACEHOLDER-DATASET-SHA256-1f2e3d4c5b6a7980abcdef0123456789fedcba9876543210
  uri: https://image-net.org/data/ILSVRC2012_img_val.tar
seed: 42
producer:
  id: sample-provider.example
model:
  id: resnet50-fp16
  hash: PLACEHOLDER-MODEL-SHA256-9a8b7c6d5e4f30210fedcba9876543210abcdef0123456789
notes: |
  ImageNet-1k val (2012). Top-1 accuracy at fp16 inference,
  10-crop disabled, deterministic dataloader, batch size 64.
  Eval harness pinned to torchvision 0.18.0, pytorch 2.3.1,
  CUDA 12.4. Hardware: single A100 80GB.

Manifest SHA-256 (canonical bytes):

PLACEHOLDER-a3f4b9c2e7d1f8e4a9c0b3d6f5e2a1c8b4d7e0a3c6b9f2e5d8a1b4c7d0e3f6a9
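The hashing step itself can be sketched in a few lines of Python. This is a minimal illustration, assuming the canonical bytes have already been produced by a conforming canonicaliser; the canonical form rules live in the spec and are not reproduced here. The names `sha256_hex` and `matches_commit` are hypothetical and are not part of any reference implementation.

```python
import hashlib


def sha256_hex(canonical_bytes: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of already-canonicalised bytes."""
    return hashlib.sha256(canonical_bytes).hexdigest()


def matches_commit(canonical_bytes: bytes, committed_hex: str) -> bool:
    """Compare the derived digest with a published commit string.

    Surrounding whitespace and letter case in the published string are
    ignored; this normalisation is an assumption of the sketch, not a
    rule taken from the spec.
    """
    return sha256_hex(canonical_bytes) == committed_hex.strip().lower()
```

Hashing the raw manifest text instead of its canonical bytes will generally produce a different digest, which is why the reference implementations, not a bare hash call, are the verification path described in section 3.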

Commit publication record:

Channel	Reference	Timestamp
Public registry	`registry.falsify.dev/m/<hash>`	2026-05-08 20:00:14 UTC
Git commit witness	`github.com/<provider>/<repo>@<sha>`	2026-05-08 20:00:31 UTC
Third-party archive	Zenodo DOI 10.5281/zenodo.<id>	2026-05-09 09:14 UTC

The spec does not require three independent witnesses; one durable, publicly resolvable commit is sufficient. Multiple witnesses raise the cost of a coordinated retraction attack and are recommended for systems classified as high-risk under Article 6(2) and Annex III.
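What matters about the witness timestamps in the table above is their ordering: at least one commit must precede the moment the result was recorded. The following Python sketch shows one way an assessor might check that. The function names are hypothetical, and the result timestamp is whatever the assessor accepts as the result-recording time; timestamps are assumed to carry an explicit UTC designator.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-05-08T20:00:14Z into aware UTC."""
    # Older Python versions do not accept a trailing "Z", so rewrite it first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)


def witnesses_before(witnesses: dict[str, str], result_recorded_at: str) -> dict[str, bool]:
    """For each witness channel, report whether its commit precedes the result."""
    cutoff = parse_utc(result_recorded_at)
    return {channel: parse_utc(ts) < cutoff for channel, ts in witnesses.items()}


def is_pre_registered(witnesses: dict[str, str], result_recorded_at: str) -> bool:
    """True if at least one witness is dated before the result was recorded."""
    return any(witnesses_before(witnesses, result_recorded_at).values())
```

Returning the per-channel result alongside the single boolean lets an assessor see which witnesses, if any, were published only after the result.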

3. Verification: how an auditor re-derives the hash

The verification step is mechanical. An auditor with access to the manifest text (this document, section 2) and any one of the four byte-equivalent reference implementations can re-derive the SHA-256 in under one minute, without contacting the provider.

Reference implementations (spec under CC BY 4.0, code under MIT):

In-browser (no install): paste the manifest at registry.falsify.dev
JavaScript: npx falsify-js verify manifest.yaml
Go: falsify-go verify manifest.yaml
Rust: falsify-rs verify manifest.yaml

Sample verification output (any implementation):

$ falsify verify manifest.yaml --commit PLACEHOLDER-a3f4b9c2...

  manifest         : manifest.yaml
  canonical bytes  : 612 bytes
  derived sha256   : PLACEHOLDER-a3f4b9c2e7d1f8e4a9c0b3d6f5e2a1c8b4d7e0a3c6b9f2e5d8a1b4c7d0e3f6a9
  registry commit  : PLACEHOLDER-a3f4b9c2e7d1f8e4a9c0b3d6f5e2a1c8b4d7e0a3c6b9f2e5d8a1b4c7d0e3f6a9
  match            : PASS

  registry record  : registry.falsify.dev/m/PLACEHOLDER-a3f4b9c2...
  committed at     : 2026-05-08T20:00:14Z
  resolved at      : 2026-06-08T20:00:00Z (T+30 days)

  result           : 0.918 (claimed: >= 0.92) — FAIL
  failure honoured : yes (no edit to manifest after result)
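The result line in the sample output is a plain comparison of the measured value against the manifest's comparator and threshold fields. A minimal Python sketch follows, assuming the comparator is one of the usual relational operators. Only ">=" appears in this document; the other entries in the table are assumptions, and `claim_holds` is a hypothetical name.

```python
import operator

# Only ">=" is taken from the sample manifest; the remaining entries are
# assumed for illustration and may not match the spec's allowed set.
COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def claim_holds(result: float, comparator: str, threshold: float) -> bool:
    """Return True if `result comparator threshold` is satisfied."""
    try:
        compare = COMPARATORS[comparator]
    except KeyError:
        raise ValueError(f"unsupported comparator: {comparator!r}") from None
    return compare(result, threshold)
```

With the sample values, `claim_holds(0.918, ">=", 0.92)` returns `False`, which corresponds to the FAIL shown above.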

The four implementations produce byte-equivalent output across 21 conformance vectors. If two implementations disagree on the canonical form of any manifest, that is a spec bug and the spec is the side that has to be fixed, not the implementation. The conformance suite is open at github.com/studio-11-co/falsify/tree/main/conformance.

4. Mapping to EU AI Act Article 12 and Annex IV section 2

The table below states what this evidence pack covers and what it does not, against the operative text of Article 12 of Regulation (EU) 2024/1689 and Annex IV section 2.

Reference	Requirement (summary)	Coverage	Notes
Article 12(1)	Automatic recording of events over system lifetime	PARTIAL	PRML records each evaluation event as an immutable commit. It does not record runtime inference events; those need a separate Article 12(1) log pipeline.
Article 12(2)(a)	Identification of situations causing risk	NONE	Out of PRML scope. Covered by the provider's risk management system under Article 9.
Article 12(2)(b)	Facilitation of post-market monitoring (Article 72)	PARTIAL	PRML manifests are queryable by claim_id and producer.id, supporting longitudinal performance monitoring.
Article 12(2)(c)	Monitoring of operation, especially Article 26(5)	PARTIAL	Evaluation-time monitoring. Runtime monitoring is out of scope.
Annex IV §2(d)	Evaluation methods, performance metrics, validation procedures, accuracy/robustness metrics with statistical significance	FULL	Direct fit. The manifest names the metric, comparator, threshold, dataset, seed, and model identifier; the commit makes retroactive modification mechanically detectable.
Annex IV §2(b)	Design specifications, key design choices, methodologies	PARTIAL	Notes field captures the eval methodology declaration. Full design rationale belongs elsewhere in the technical documentation file.
Annex IV §2(h)	Cybersecurity measures	NONE	Composes with Sigstore / in-toto / SLSA for code-supply-chain integrity. Out of PRML scope.
Article 15(1)	Accuracy, robustness, cybersecurity throughout lifecycle	PARTIAL	PRML provides the accuracy-claim attestation layer. Robustness testing and cybersecurity belong to other primitives.
Article 15(3)	Levels of accuracy declared in instructions for use	PARTIAL	Declared accuracy is mechanically traceable to a hashed manifest, so the declaration can be checked against the eval that produced it.
Article 18(1)	Retention of technical documentation and related records (10 years)	FULL	Pro and Enterprise registry tiers provide a ten-year retention SLA. Manifests in the Developer tier rely on the provider to maintain durable storage.

The "FULL / PARTIAL / NONE" column is deliberately strict: it reports what the evidence pack mechanically provides, not what the provider's broader compliance program covers. Most rows that read PARTIAL or NONE here will read FULL elsewhere in the provider's technical documentation file under Annex IV.

5. Mapping to ISO/IEC 42001 controls

For providers certified to, or pursuing certification under, ISO/IEC 42001:2023 (AI management systems), the mapping below shows where this evidence pack contributes. References are to the published 42001 control text.

Control	Title	Coverage	Notes
A.6.2.4	Documented information for AI system	PARTIAL	Manifest is one piece of documented information per evaluation claim.
A.6.2.6	System impact assessment	NONE	Out of scope.
A.7.4	Data quality for AI systems	PARTIAL	Dataset hash and URI in manifest support dataset-identity attestation.
A.8	Information for interested parties (record-keeping family)	PARTIAL / FULL on eval records	The cryptographic-commit pattern is a strong fit for record-keeping that must survive disputes about retroactive edit.
A.8.2	System log information	PARTIAL	Eval logs covered; runtime inference logs out of scope.
A.9.3	Performance evaluation of AI system	FULL on the bound claim	Direct fit. The manifest pre-registers the performance evaluation; the claim is bound to the hash before the run, and the result is recorded against that hash afterwards.

This sample mapping uses control numbering from the 42001:2023 published text. For a provider running parallel certifications under 42001 and the AI Act, the Annex IV §2(d) row in section 4 above and the A.9.3 row here will overlap heavily; both describe the same evaluation event from different normative perspectives.

6. Limitations and out-of-scope items

This section is taken from spec section 8.1, restated here for the assessor's convenience. The spec is the controlling document.

Selective non-publication. A provider could commit one hundred manifests, run all one hundred evaluations, and only publish the three that came out well. PRML by itself does not catch this; a separate completeness mechanism is required (registry-side claim-set commitments are on the v0.3 backlog).
Execution-time data tampering inside the eval harness. If the eval harness silently truncates the dataset before computing the metric, the manifest hash will still verify; the dishonesty happens between data load and metric computation. Detecting this requires independent re-execution by a reviewer with their own dataset copy.
Model binary integrity. The manifest references the model by id and hash but does not attest that the binary on disk matches the hash. This belongs to Sigstore / in-toto / SLSA in the supply chain layer.
Runtime gating. PRML does not block model loading or inference if a manifest fails to verify. It is an audit primitive, not a runtime gate.
Legal-compliance certification. This document does not certify legal compliance with Article 12, 15, or 18. It is one input to the compliance file. Certification under the AI Act is the responsibility of the notified body and, for self-assessment routes, the provider's quality-management system under Article 17.
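The selective non-publication gap in the first item above comes down to a set difference. If an assessor can obtain both the full list of claim_id values a producer committed and the list of claim_id values for which results were reported, the unreported claims fall out directly. The sketch below is an auditor-side illustration only, not a PRML feature; how the two lists are obtained is outside its scope, and the function name is hypothetical.

```python
def completeness_gaps(committed_ids, reported_ids):
    """Compare committed claim_ids with those that have a reported result.

    Returns a pair of sorted lists:
      - committed claims with no reported result (possible selective
        non-publication), and
      - reported results with no matching commit (not pre-registered).
    """
    committed = set(committed_ids)
    reported = set(reported_ids)
    return sorted(committed - reported), sorted(reported - committed)
```

An empty first list shows only that every known commit has a reported result. It says nothing about commits the assessor was never shown, which is why a registry-side mechanism is still needed.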

7. For auditors and notified-body assessors

Suggested workflow for an assessor receiving this document as part of an Annex IV technical documentation file:

1. Extract the manifest from section 2 and save it to a local file.
2. Install any one of the four reference implementations (one-line install). Validate the manifest against the PRML JSON Schema (the canonical schema is in SchemaStore, indexed since 2026-05-11).
3. Re-canonicalise and re-hash. Compare the result against the SHA-256 string in section 2 and against the registry record.
4. If you have independent access to the dataset and model artefacts, hash them and compare against the dataset.hash and model.hash fields. This catches identity-substitution attacks but not within-run tampering.
5. Cross-check the registry timestamp against the provider's claim of "pre-registered before the run." Any commit timestamped after the result was reported is an audit red flag.
6. For high-risk systems, request a second independent reviewer to re-execute the eval against the provider's dataset and model copies. The manifest gives that reviewer everything they need to reproduce the run byte-for-byte modulo hardware non-determinism (which the seed and notes fields partly mitigate).

An assessor who completes steps 1-3 has a tamper-evident attestation of the claim that does not depend on trusting the provider. Steps 4-6 raise the assurance level further.
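The artefact check that compares local copies against the dataset.hash and model.hash fields can be sketched in Python as a streaming hash, so that large dataset archives and model binaries need not fit in memory. This sketch assumes each field holds the SHA-256 hex digest of a single artefact file; how a multi-file dataset is reduced to one hash is a matter for the spec and is not covered here. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artefact_matches(path, expected_hex: str) -> bool:
    """Compare a local artefact's digest with the value from the manifest."""
    return file_sha256(path) == expected_hex.strip().lower()
```

A mismatch here indicates that the assessor's copy differs from the artefact named in the manifest. A match says nothing about what the eval harness did with the data after loading it.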

8. About this artifact

Authored by Cüneyt Öztürk, independent researcher and maintainer of the PRML specification. The reference implementations, conformance vectors, and JSON Schema are maintained by the same author at github.com/studio-11-co/falsify under CC BY 4.0 (spec) and MIT (code).

Issued through Cüneyt Öztürk, the legal entity that produces Sprint engagements. Contact: [email protected]. Invoicing is in EUR, USD or GBP via wire transfer.

Disclaimer. This document is a technical evidence artifact. It is not a legal opinion, not a notified-body certification, and not an accredited audit report. The author is not a notified body and has no accreditation under Regulation (EU) 2024/1689 to issue conformity assessments. Auditors, notified bodies, and compliance teams may use this document as input to their own assessment, and are encouraged to verify all cryptographic claims independently using the open reference implementations.

Sample disclosure. The provider name, system description, Annex III category, dataset hash, model hash, and manifest SHA-256 in this document are placeholders chosen to illustrate the format. They do not correspond to any real evaluation claim. A real Evidence Pack delivered as part of a Sprint Audit Review (/sprint/) substitutes the provider's actual values and publishes the manifest to the public registry under the provider's producer.id.

This sample was published 2026-05-26 at falsify.dev/evidence-pack-sample/. CC BY 4.0.
