v0.2 RFC, briefly — what's open and why.
Five proposals are open for community comment until 2026-05-22 23:59 UTC. v0.1 stays the stable spec. v0.2 will be additive: every v0.1 manifest hashes identically under v0.2 canonicalisation rules. What follows is the honest read of what changes, what doesn't, and why the comment window matters more than the proposals themselves.
The five proposals, in one sentence each
P-01 — Streaming variant. An optional prml_mode: streaming for live evaluations (Chatbot Arena Elo, A/B-tested production, drift monitors), where the threshold commits to an aggregation rule applied to a window rather than a fixed batch.
P-02 — Runner attestation. An optional runner_attestation URI to an out-of-band execution attestation (Sigstore, in-toto, TEE). PRML records that an attestation was emitted; it does not interpret the attestation. This narrows — but does not close — the gap between what was claimed and what was run.
P-03 — Revocation. Optional revoked_at + revocation_reason with a small controlled vocabulary (dataset_compromised, model_recalled, author_request, other). The hash continues to verify after revocation; verifiers must surface revocation status separately.
P-04 — Conformance vector format. Standardise the test-vector directory layout and a stdin/stdout runner protocol so any implementation can mechanically prove byte-equivalence with the reference.
P-05 — Patent grant placement. Move the existing patent non-assertion grant from Appendix C into a new §1.5 (preamble). Standards-body reviewers requested this for inclusion in Annex Z reference checks. No textual change to the grant.
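The separation P-03 asks for, where hash validity and revocation status are reported side by side rather than merged, can be sketched as follows. This is a minimal illustration, not the reference verifier: the `VerificationReport` type, the `build_report` function, the idea that the revocation fields are read from a plain dict, and the example timestamp are all assumptions made for the sketch. Only the two field names and the four vocabulary values come from the proposal text.

```python
from dataclasses import dataclass
from typing import Optional

# Controlled vocabulary as listed in P-03.
REVOCATION_REASONS = {"dataset_compromised", "model_recalled", "author_request", "other"}


@dataclass(frozen=True)
class VerificationReport:
    """Hypothetical report shape: hash result and revocation are separate fields."""
    hash_ok: bool                     # did the recomputed hash match the committed one?
    revoked: bool                     # is a revocation recorded?
    revocation_reason: Optional[str]  # one of REVOCATION_REASONS, or None


def build_report(hash_ok: bool, manifest: dict) -> VerificationReport:
    """Attach revocation status to a hash check without changing its result."""
    revoked_at = manifest.get("revoked_at")
    reason = manifest.get("revocation_reason")
    if revoked_at is None:
        return VerificationReport(hash_ok, False, None)
    if reason not in REVOCATION_REASONS:
        raise ValueError(f"unknown revocation_reason: {reason!r}")
    return VerificationReport(hash_ok, True, reason)


# Example values are illustrative only.
report = build_report(
    True,
    {"revoked_at": "2026-06-01T00:00:00Z", "revocation_reason": "model_recalled"},
)
print(report.hash_ok, report.revoked, report.revocation_reason)
```

The point of the shape is that a revoked manifest still reports `hash_ok=True`: revocation is information about the claim, not about the integrity of the bytes.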
What v0.2 doesn't try to do
Three classes of problem stay outside the v0.2 envelope, deliberately:
- Selective publication. A publisher can pre-register ten claims and publish two. PRML §8.1 names this. v0.2 keeps it as the named limit. Closing it is a publication-norms question, not a serialisation primitive question.
- Algorithm agility. SHA-256 only. Post-quantum migration is a v0.3 conversation with its own threat model.
- Multi-claim manifests. Still single-claim per manifest. Composition is a registry concern.
The non-negotiable
Every v0.1 manifest hashes identically under v0.2 canonicalisation. This is normative: any proposed v0.2 change that breaks v0.1 hash-equivalence for v0.1-shaped inputs is rejected at design time. The 12 v0.1 conformance vectors plus 8 new v0.2 vectors mechanically verify the property: 20/20 vectors pass on every commit through CI.
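As a rough illustration of what such a check looks like, the sketch below compares digests from a v0.1-style and a v0.2-style canonicaliser over v0.1-shaped inputs. Everything about the canonicalisation here is an assumption (sorted-key compact JSON encoded as UTF-8 stands in for the real rules), and the manifest field names `metric`, `threshold`, and `dataset` are hypothetical. Only SHA-256 and the four optional v0.2 field names come from the text above.

```python
import hashlib
import json

# Optional field names taken from proposals P-01 to P-03.
V02_OPTIONAL_FIELDS = ("prml_mode", "runner_attestation", "revoked_at", "revocation_reason")


def canonical_v01(manifest: dict) -> bytes:
    # ASSUMPTION: stand-in canonicalisation, not PRML's actual rules.
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_v02(manifest: dict) -> bytes:
    # ASSUMPTION: optional v0.2 fields are serialised only when set,
    # so a v0.1-shaped input produces exactly the v0.1 bytes.
    kept = {
        key: value
        for key, value in manifest.items()
        if not (key in V02_OPTIONAL_FIELDS and value is None)
    }
    return canonical_v01(kept)


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def equivalence_failures(vectors: list) -> list:
    """Return every v0.1-shaped manifest whose digest differs between versions."""
    return [m for m in vectors if digest(canonical_v01(m)) != digest(canonical_v02(m))]


# Hypothetical v0.1-shaped vectors; real vectors live in the conformance suite.
vectors = [
    {"metric": "accuracy", "threshold": 0.85, "dataset": "example-eval"},
    {"metric": "f1", "threshold": 0.7, "dataset": "example-eval-2"},
]
print(equivalence_failures(vectors))
```

A CI job built on this idea would fail the commit whenever the returned list is non-empty.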
Why the window matters more than the proposals
Five named proposals sound like the substance of an RFC. They are not. The substance is what falls outside the proposals — the questions a reader thinks of while reading them, the gaps a working auditor notices, the mismatch between the spec's framing and a real institution's obligations. We want those.
If you publish ML eval claims professionally, run an audit programme, write papers on eval methodology, or work with EU AI Act Article 12 logging, your position on any of these proposals can directly shape what the v0.2 freeze looks like. The window closes 2026-05-22 23:59 UTC. After that, comments roll into v0.3 unless they identify a security flaw.
Two ways to comment, in roughly increasing seriousness:
- GitHub Discussions, label rfc-v0.2: github.com/studio-11-co/falsify/discussions
- Email the editor: [email protected] (subject prefix [v0.2 RFC]; preferred for institutional or confidential comments).
The editor reads everything. A two-line comment with one specific concern beats a ten-line comment with five general ones. Vague approval is welcome but doesn't shape the freeze.
What's already shipped alongside the RFC
The infrastructure that makes the RFC accessible has been built out in parallel:
- Formal JSON Schema for both v0.1 and v0.2 — drop-in IDE autocomplete, CI validation, no installation. spec.falsify.dev/schema/
- In-browser playground that hashes a manifest with the Web Crypto API, byte-equivalent to the Python reference. falsify.dev/playground
- Compliance landing mapping v0.2 fields to EU AI Act Articles 12 / 17 / 18 / 50 / 72 / 73, NIST AI RMF, and ISO/IEC 42001 clauses. falsify.dev/compliance
- Reference implementations: four codebases. Python on PyPI (pip install falsify) and JavaScript on npm (npm install falsify-js) are published packages; Go and Rust live as source under studio-11-co/falsify/impl/. Byte-equivalent across 20 conformance vectors.
- Cookbook: 10 patterns, 4 anti-patterns, 4 working examples. github.com/studio-11-co/falsify-cookbook
None of that is required to comment. The full RFC text is 1,500 words.
The line we keep
PRML closes one specific gap: it makes post-hoc threshold tuning mechanically detectable. That's a small claim. The spec is fifteen pages and the schema is eight fields. v0.2 adds four optional fields; the core stays small. We are not building a publication-integrity system, an attestation framework, or a benchmark. We are building one primitive that has to be small enough to be implementable in any language and durable enough to outlive any specific tooling.
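To see why committing a hash before the run makes threshold tuning detectable, consider this minimal sketch. It does not use the falsify package or PRML's real canonicalisation: the `commit` helper, the sorted-key JSON encoding, and the manifest keys `metric`, `comparator`, and `threshold` are all assumptions chosen for illustration.

```python
import hashlib
import json


def commit(manifest: dict) -> str:
    # ASSUMPTION: sorted-key compact JSON stands in for the spec's canonical form.
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical claim, hashed and published before the evaluation runs.
pre_registered = {"metric": "accuracy", "comparator": ">=", "threshold": 0.85}
committed_hash = commit(pre_registered)

# After seeing results, someone lowers the bar.
tuned = dict(pre_registered, threshold=0.80)

print(commit(pre_registered) == committed_hash)  # True: the original claim verifies
print(commit(tuned) == committed_hash)           # False: the tuned claim does not
```

Anyone holding the published hash can rerun the comparison, which is what makes the detection mechanical rather than a matter of trust.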
The proposals are how the primitive gets one degree more useful without ceasing to be small. Tell us where we got that wrong.