What is PRML?
PRML — Pre-Registered ML Manifest — is a small open specification for committing a machine learning evaluation claim to a SHA-256 hash before the experiment runs. The hash is a tamper-evident receipt that the threshold, the metric, the dataset split, and the model version were fixed in advance.
The problem PRML solves
When a paper, model card, or system card publishes an evaluation claim — an accuracy of 0.76 on ImageNet, a refusal rate of 0.95 on HarmBench, a pass-rate of 0.42 on HumanEval — there is currently no cryptographic way to prove that the threshold and the metric were chosen before the evaluation was run.
This is not a hypothetical. Every published eval result implicitly asserts “we picked the threshold in advance,” but almost none of them prove it. A reviewer, a regulator, or a competitor can always argue: you tuned the threshold after seeing the model’s behavior. Without an audit trail, the claim is unfalsifiable.
PRML provides that audit trail. The hash is a 64-character receipt anyone can re-derive from the canonical bytes of the manifest. If the manifest is altered — threshold raised, metric swapped, dataset split changed — the hash changes. The cryptographic anchor makes post-hoc tuning detectable.
The problem PRML does not solve
PRML addresses commitment integrity, not publication completeness. A submitter can pre-register ten evaluation claims and publish only the two that look favorable. That is a real failure mode and the spec acknowledges it directly in §8.1. PRML is a primitive, not a full audit system.
The spec also does not address dataset contamination, capability elicitation, or peer review. Those are separate problems with separate solutions. PRML is a small piece of plumbing for one specific gap.
The eight fields
version: prml/0.1
metric: <name of the metric, e.g. top1_accuracy>
threshold: <numeric value the model must clear>
dataset_split: <identifier for the eval set>
model_version: <model identifier or content hash>
claim: <one-line description>
submitter: <handle or organization>
timestamp: <ISO 8601 datetime>
Canonicalization rules: trim trailing whitespace, normalize line endings, sort top-level keys. SHA-256 over the canonical bytes. The full spec is at spec.falsify.dev/v0.1.
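The commit-then-verify flow above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the exact canonical serialization (YAML quoting, number formatting, key layout) is defined by the spec, and every field value below is an invented example.

```python
import hashlib

def canonicalize(manifest: dict) -> bytes:
    # Sketch of the spec's rules: sort top-level keys, normalize line
    # endings to LF, trim trailing whitespace on each line.
    lines = []
    for key in sorted(manifest):
        value = str(manifest[key]).replace("\r\n", "\n").rstrip()
        lines.append(f"{key}: {value}")
    return ("\n".join(lines) + "\n").encode("utf-8")

def prml_hash(manifest: dict) -> str:
    # SHA-256 over the canonical bytes yields the 64-character receipt.
    return hashlib.sha256(canonicalize(manifest)).hexdigest()

# Illustrative manifest -- all values are made up for this example.
manifest = {
    "version": "prml/0.1",
    "metric": "top1_accuracy",
    "threshold": 0.76,
    "dataset_split": "imagenet-val-2012",
    "model_version": "sha256:aabbcc",
    "claim": "Model clears 0.76 top-1 accuracy on the validation split",
    "submitter": "example-lab",
    "timestamp": "2026-01-15T09:00:00Z",
}

receipt = prml_hash(manifest)
print(receipt)  # 64 hex characters

# Any alteration -- here, lowering the threshold -- changes the hash:
assert prml_hash(dict(manifest, threshold=0.74)) != receipt
```

The tamper check at the end is the whole point: a reviewer who holds the published receipt can recompute the hash and detect any post-hoc edit to threshold, metric, split, or model version.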
Why cross-language byte-equivalence matters
The four reference implementations — Python, JavaScript, Go, Rust — produce identical hashes for all 12 conformance vectors. That parity is not cosmetic. It means external auditors can verify a hash with whatever toolchain they trust without having to trust a specific language runtime or library version. The spec is portable enough that two parties with different infrastructures can independently confirm a commitment.
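Because verification is nothing more than SHA-256 over canonical bytes, an auditor can check a conformance vector with any standard library. A minimal Python sketch; the vector bytes and digest below are illustrative stand-ins, not real locked digests from the suite:

```python
import hashlib

def verify_vector(canonical_bytes: bytes, locked_digest: str) -> bool:
    """Recompute SHA-256 over the canonical bytes and compare to the locked digest."""
    return hashlib.sha256(canonical_bytes).hexdigest() == locked_digest

# Stand-in vector: in the real suite, both the bytes and the digest
# ship with the spec, so the digest would be a fixed constant here.
canonical = b"claim: example\nversion: prml/0.1\n"
locked = hashlib.sha256(canonical).hexdigest()

assert verify_vector(canonical, locked)
assert not verify_vector(canonical + b"x", locked)
```

The same check written in Go, Rust, or JavaScript must produce the identical digest; that is what the 12 conformance vectors lock down.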
Quick facts
- Spec name: PRML — Pre-Registered ML Manifest Specification
- Version: v0.1 (Working Draft, public review)
- Spec license: CC BY 4.0
- Code license: MIT (reference implementations)
- Author: Cüneyt Öztürk · Studio 11
- Format: 8 YAML fields, SHA-256 over canonical bytes
- Implementations: Python · JavaScript · Go · Rust (byte-equivalent)
- Conformance: 12 vectors, locked SHA-256 digests
- v0.2 RFC freeze: 2026-05-22
- Public registry: registry.falsify.dev
- Specification: spec.falsify.dev/v0.1
- Source: github.com/studio-11-co/falsify
- Contact: [email protected]
Frequently asked
Is this the same thing as supply-chain attestation tools such as Sigstore?
No. Those are supply-chain provenance systems for software artifacts (which binary, which build, which signer). PRML is narrower: a commitment receipt for a numeric evaluation claim. The two are complementary; nothing prevents anchoring a PRML hash inside a Sigstore attestation.
Does PRML help with regulatory compliance?
PRML maps cleanly onto Article 12 (record-keeping) and Article 18 (post-market monitoring) when high-risk AI systems publish evaluation claims. It is not a compliance product — it is an open primitive that compliance documentation can cite.
Why is the licensing so permissive?
PRML is intended to be cited, embedded, and re-used by anyone — auditors, labs, regulators, academic groups. Restrictive licensing would defeat the point. The reference implementations are MIT for the same reason.
Why isn’t a Git commit hash enough?
A Git commit hash anchors the code, not the claim. A repo can contain an evaluation script that is run repeatedly with shifting thresholds; the commit hash does not change unless the script does. PRML anchors the claim — threshold, metric, split, model version — explicitly and atomically.
Can I verify a hash without the registry?
Yes. The reference CLI computes the hash locally without contacting any server. The public registry at registry.falsify.dev is optional; it provides discoverability and a permalink for sharing, but the spec itself works fully offline.
How do I get started?
If you write evaluations: read the spec, then commit a manifest at the registry. If you publish papers and want a defense against accusations of post-hoc adjustment: the same path applies. If you need help authoring a manifest for an existing published claim, see the Diagnostic Sprint engagement.