Diagnostic Sprint · Engagement v1

Turn an ML evaluation claim into auditable evidence. Written deliverables, fixed prices.

Three tiers, all async. We author PRML manifests for your existing evaluation claims, anchor them on the public registry, and write the audit memo your reviewer, regulator, or customer can cite. No calls. No discovery loop. No payment processor in the middle. Wire transfer and a single invoice.

Audit Review · €15K
Full Sprint · €65K
Enterprise · €180–250K
Format · 100% async

Three tiers

Pick the engagement that matches your claim count and timeline.

Each tier is fixed-price, fixed-scope. No discovery phase, no hourly billing. The differences below are the differences. Everything ships in writing; everything is yours to cite, fork, and reuse.

Audit Review · €15,000 · 5 business days

Single existing evaluation claim, locked into a PRML manifest, registry-anchored, audit memo delivered. For a compliance officer or eval engineer who needs a third-party signature on one specific claim before a regulator or customer asks.

Deliverables:

  • One PRML manifest, registry-anchored at registry.falsify.dev
  • Audit memo, 6-8 pages PDF: threat model, manifest fields, residual risk
  • Re-derivation script (Python, 30-50 lines) for independent verification
  • Citation block (BibTeX + Zenodo DOI format) for your model card or paper
  • Written Q&A window for 5 business days post-delivery, 48-hour response SLA

Payment: 100% upfront, single invoice, wire transfer

Email — Audit Review →

Full Sprint · €65,000 · 3 weeks

Three evaluation claims, CI pipeline deployed, full audit report. For an AI team at a regulated provider that needs CI-level enforcement (block merges that tamper or regress against locked thresholds) and an audit-grade evidence package for the next regulator interaction.

Deliverables:

  • Three PRML manifests, registry-anchored
  • CI pipeline deployment: prml-verify-action wired in, set to fail builds on tamper or regression
  • Audit report, 12-15 pages PDF: per-manifest threat model, regulator-mappable evidence (EU AI Act Article 12 / NIST AI RMF / ISO 42001), residual risk register
  • Re-derivation scripts per manifest
  • 30-day support window post-delivery, unlimited written Q&A, 48-hour response SLA
  • 1-page closing memo: what's defensible now, what's still open

Payment: 50% / 50% wire transfer in two milestones (kickoff and audit-report delivery), single invoice

Email — Full Sprint →

Enterprise Engagement · €180,000 – €250,000 · by inquiry · 8–10 weeks

Up to 12 manifests, isolated private registry instance, personalized regulatory crosswalk PDF, two written executive briefings. For a regulated AI provider, notified body, or large research org under EU AI Act high-risk status (2 Dec 2027 deadline) or pursuing ISO 42001 certification.

Deliverables:

  • Up to 12 PRML manifests, registry-anchored on your isolated instance
  • Isolated private registry: registry.<your-org>.com (Cloudflare Worker + KV), basic auth, audit log endpoint, monitored uptime during the engagement window
  • CI pipeline deployment across up to 5 repositories
  • Personalized regulatory crosswalk PDF, 20-30 pages: EU AI Act Articles 12 & 18 + optional NIST AI RMF + optional ISO/IEC 42001 mapping, each control linked to a specific manifest hash
  • Two written executive briefings, 4-6 pages each (mid-engagement status & close-out defensibility statement)
  • 60-day support window post-delivery, including responses to your auditor or notified body during their review
  • Right of citation in your compliance documentation and regulator filings

Three variants:

  • Standard €180K — up to 8 manifests, 3 repos, one framework
  • Plus €220K — up to 12 manifests, 5 repos, two frameworks
  • Premium €250K — full envelope, all three frameworks + custom mapping to one additional standard (e.g. MITRE ATLAS, NIST AI 600-1)

Payment: 33% / 33% / 34% in three milestones, three invoices. Net-30 procurement terms available. Scope and terms agreed per engagement. Optional on-site visit: half-day, €15,000 surcharge.

Email — Enterprise →

Pricing is fixed at engagement start and locked into the scope memo. No mid-engagement upsells, no scope-creep billing, no "out-of-scope" surcharges except the explicit Enterprise add-ons listed above. The full tier definitions live at spec/COMMERCIAL.md in the public repo — same source as this page.

Why this exists

Threshold pre-registration is fifteen years old. The cryptographic version is what survives a hostile reviewer.

SR 11-7 was 2011. EBA/GL/2017/16 came later. Internal MRM teams have been documenting threshold commitments for over a decade. What binds those commitments has always been signature plus workflow plus trust: defensible inside an institution that wants to defend it, increasingly thin in front of a reviewer that doesn't.

PRML doesn't replace any of that. It produces a SHA-256 hash that any toolchain can re-derive against canonical bytes — mechanically verifiable rather than procedurally trustable. Same content. Same fields. Different proof shape.
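
A minimal Python sketch of what re-deriving a hash against canonical bytes can look like. The canonical form used here (JSON with sorted keys, compact separators, UTF-8) and the key names are assumptions for illustration only; the PRML spec defines the actual canonicalization, which may differ.

```python
import hashlib
import json


def canonical_bytes(manifest: dict) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def manifest_hash(manifest: dict) -> str:
    # SHA-256 over the canonical bytes, returned as a 64-character hex string.
    return hashlib.sha256(canonical_bytes(manifest)).hexdigest()


# Hypothetical manifest; key names and values are illustrative, not the PRML schema.
example = {
    "metric": "accuracy",
    "threshold": 0.92,
    "dataset_split": "test",
    "model_version": "1.4.0",
    "submitter": "example-org",
    "timestamp": "2025-01-01T00:00:00Z",
}

if __name__ == "__main__":
    print(manifest_hash(example))
```

Because the bytes are canonicalized before hashing, two parties who hold the same field values obtain the same digest regardless of key order, and any change to a committed value produces a different digest.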

The Sprint is the wrapper that produces that hash for your published claims: 5 business days for a single claim under Audit Review, three weeks for three claims under Full Sprint. If your model is sitting in an MRM queue or a notified-body submission and the binding constraint is "show us the receipt", that's what we ship.

What you get

Five artefacts you can hand to a reviewer or an auditor.

Every deliverable is yours to keep, share, and re-use. The PRML manifest and the verifier are MIT-licensed; the audit report is plain Markdown with cryptographic citations.

[I] PRML manifest

An 8-field YAML file committing one published claim to a SHA-256 hash; the fields include threshold, metric, dataset split, model version, submitter, and timestamp. Anchored on registry.falsify.dev so anyone can re-derive and verify.

[II] Deployed verifier

Reference verifier (Python, JS, Go, or Rust — your pick) wired into your CI. Re-runs the claim, computes the canonical hash, exits 0 / 10 / 3 for pass / fail / tamper. No vendor lock-in.
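
The three exit codes can be sketched in Python as follows. This is not the reference verifier: the function name, the JSON-based canonical form, the order of checks, and the assumption that the metric is higher-is-better are all illustrative. Only the codes 0 / 10 / 3 come from the description above.

```python
import hashlib
import json

EXIT_PASS, EXIT_FAIL, EXIT_TAMPER = 0, 10, 3


def digest(manifest: dict) -> str:
    # Assumed canonical form (sorted keys, compact separators); the spec may differ.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(manifest: dict, locked_hash: str, observed: float) -> int:
    # Tamper check first: a manifest that no longer matches its locked hash
    # cannot be trusted for a pass/fail decision.
    if digest(manifest) != locked_hash:
        return EXIT_TAMPER
    # Assumes a higher-is-better metric compared against a lower bound.
    return EXIT_PASS if observed >= manifest["threshold"] else EXIT_FAIL


if __name__ == "__main__":
    claim = {"metric": "accuracy", "threshold": 0.90}  # hypothetical claim
    raise SystemExit(verify(claim, digest(claim), observed=0.93))
```

A CI job that runs such a script fails the build on any non-zero exit, and the distinct codes let the pipeline report a regression differently from a tampered manifest.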

[III] Audit report

A 6–10 page Markdown report covering: what was claimed, what was committed, how to verify, and §8.1 limitations specific to your claim. Citable in papers, model cards, and notified-body submissions.

[IV] Public permalink

A registry.falsify.dev/<hash> page anyone can land on. Shareable in tweets, README badges, and reviewer responses. Optional: keep it private until you publish the underlying claim.

[V] CI integration template

A wired-in studio-11-co/prml-verify-action@v2 workflow in your repo: five-line composite Action, exits non-zero on tampered or regressed claims, optional public registry anchor. Same artefact that gets cited in the audit report.

How it runs (Full Sprint example)

Fixed scope. Fixed price. No discovery loop.

Three weeks, five written checkpoints. Audit Review collapses the same flow into 5 business days for a single claim. Enterprise stretches it across 8–10 weeks with bi-weekly progress notes. Every step is written; nothing requires a call.

D0
Scope memo · written exchange

You email [email protected]. We reply within 48 business hours with a 2-page scope document naming the claims, target regulators, CI integration points, and the invoice. You sign the scope memo electronically. No scoping call required.

W1
Claim selection · manifest drafts

Three claims locked from the candidate list in your scope memo. Three PRML manifest drafts written, sent for written review. You comment in writing; one revision round; manifests frozen by end of week 1.

W2
Verifier deployment · rerun green

prml-verify-action wired into your CI. Re-runs the claims end-to-end against your dataset hashes. If a claim doesn't reproduce, you find out before the receipt goes public — that's a feature. Progress note delivered end of week 2.
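
A rerun against dataset hashes presupposes a check that the evaluation data on disk is the data that was committed. A minimal Python sketch of that check, assuming the committed value is a hex-encoded SHA-256 of the raw dataset file (the function names are hypothetical, and how PRML actually identifies datasets is defined by the spec):

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file in chunks so large datasets need not fit in memory.
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def dataset_matches(path: Path, expected_hex: str) -> bool:
    # Compare case-insensitively; hexdigest() always returns lowercase.
    return file_sha256(path) == expected_hex.strip().lower()
```

If the check returns False, the rerun is evaluating different data from what the manifest committed to, and a metric computed on it says nothing about the locked claim.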

W3
Audit report draft · written review

Draft of the 12–15 page report sent for review. You comment in writing; one revision round. Includes per-manifest §8.1 limitations and regulator-mappable evidence for the framework(s) you named at scope time.

D21
Hand-off · permalinks live · support window opens

Final report delivered as PDF. Manifests committed to the registry (public or private depending on tier). Re-derivation scripts attached. 30-day written-Q&A support window opens with 48-hour response SLA.

Fit

A sprint is the right shape for some teams. For others it isn’t.

Honest signal upfront beats a discovery call that ends in a no.

Right fit:

  • You’ve published or are about to publish a numeric eval claim (an accuracy, a refusal rate, a pass-rate) that you’d like to be able to defend.
  • A reviewer, a regulator, or an internal red-team has questioned whether your threshold was fixed in advance.
  • You can spare ~6 hours of an engineer’s time across two weeks for the verifier integration and one rerun.
  • Your legal or compliance team is comfortable with an MIT-licensed deliverable and an open spec (CC BY 4.0).

Wrong fit:

  • You need an end-to-end eval pipeline built from scratch. That’s a longer engagement; let’s talk separately.
  • You need a managed compliance service. We ship an artefact and walk away; we don’t hold ongoing audit retainers.
  • You want a generic AI-safety audit. The sprint is narrow on purpose: one claim, one receipt, one report.
  • You need the audit kept fully private with no public permalink. Possible, but ask; the public option is the default.

Who delivers it

You work with the person who wrote the spec.

No project managers, no associates, no offshore subcontractors. The Diagnostic Sprint is delivered by Cüneyt Öztürk personally.

Cüneyt Öztürk · PRML / falsify

Author of PRML v0.1 and the four reference implementations (Python, JS, Go, Rust). Independent researcher working on AI evaluation infrastructure and the PRML / falsify track. Engagements are structured under English-law SOWs.

◆ Next step
Send an email naming the tier and the claim(s) you'd like to lock. We reply within one business day, every time.

If your scope doesn't fit any of the three tiers, the reply will say so plainly. No follow-up sequence, no nurture flow, no calls. The scope document arrives in writing within 48 business hours of your first email; the invoice rides with it. From there it's wire transfer and start.

Want only the hosted platform without a written engagement? See Pro pricing — subscription registry, ten-year retention, no founder hours involved.

Email [email protected]