Diagnostic Sprint · Engagement v1

Turn an ML evaluation claim into auditable evidence. Written deliverables, fixed prices.

Three tiers, all async. We author PRML manifests for your existing evaluation claims, anchor them on the public registry, and write the audit memo your reviewer, regulator, or customer can cite. No calls. No discovery loop. No payment processor in the middle. Wire transfer and a single invoice.

Audit Review: €15K
Full Sprint: €65K
Enterprise: €180–250K
Format: 100% async

◆ Why async, why these prices

Async is the product, not a limitation. All deliverables in writing means everything is citable, dateable, and re-readable months later when your auditor reopens the file. A 30-minute call leaves no defensible record; a 6-page memo does. Compliance officers run 3-4 audits in parallel; removing the calendar friction is a feature.

Pricing anchors. Audit Review at €15,000 matches a solo-senior-consultant one-week rate. Full Sprint at €65,000 sits inside the boutique audit firm sprint band (Eticas, BABL AI, ORCAA range). Enterprise at €180–250,000 is mid-tier Big-4 range (Deloitte / KPMG sub-300K engagements). No discount for being solo + AI-leveraged on delivery; we charge the value of the artefact, not the hours behind it.

Invoicing. Single invoice, wire transfer to our corporate bank account, currency USD / EUR / GBP at customer choice. No third-party merchant of record, no payment-processor surcharge.

Platform access bundled. Every Sprint tier includes twelve months of Falsify Pro platform access (private hosted registry, ten-year Article 18 retention, conformance badge, written email support). Your manifests stay live and queryable after delivery without a second purchase. After twelve months you can continue on Pro at the standard rate, downgrade to Developer, or commission another Sprint.

Sample of the deliverable. An Audit Review produces a tamper-evident, cryptographically self-verifying technical evidence artifact: PRML manifest plus SHA-256 commit, verification steps, Annex IV §2(d) and ISO/IEC 42001 control mapping, scope limitations. See a sample Evidence Pack →

See the three tiers ↓ Email [email protected] → Read PRML v0.1 first →

Three tiers

Pick the engagement that matches your claim count and timeline.

Each tier is fixed-price, fixed-scope. No discovery phase, no hourly billing. The differences below are the differences. Everything ships in writing; everything is yours to cite, fork, and reuse.

Audit Review · €15,000 · 5 business days

A single existing evaluation claim, locked into a PRML manifest, registry-anchored, with an audit memo delivered. For a compliance officer or eval engineer who needs a third-party signature on one specific claim before a regulator or customer asks.

Deliverables:

One PRML manifest, registry-anchored at registry.falsify.dev
Audit memo, 6-8 pages PDF: threat model, manifest fields, residual risk
Re-derivation script (Python, 30-50 lines) for independent verification
Citation block (BibTeX + Zenodo DOI format) for your model card or paper
5 business days written Q&A window post-delivery, 48-hour response SLA

Payment: 100% upfront, single invoice, wire transfer

Email — Audit Review →

Full Sprint · €65,000 · 3 weeks

Three evaluation claims, CI pipeline deployed, full audit report. For an AI team at a regulated provider that needs CI-level enforcement (block merges that tamper or regress against locked thresholds) and an audit-grade evidence package for the next regulator interaction.

Deliverables:

Three PRML manifests, registry-anchored
CI pipeline deployment: prml-verify-action wired in, set to fail builds on tamper or regression
Audit report, 12-15 pages PDF: per-manifest threat model, regulator-mappable evidence (EU AI Act Article 12 / NIST AI RMF / ISO 42001), residual risk register
Re-derivation scripts per manifest
30-day support window post-delivery, unlimited written Q&A, 48-hour response SLA
1-page closing memo: what's defensible now, what's still open

Payment: 50% / 50% wire transfer in two milestones (kickoff and audit-report delivery), single invoice

Email — Full Sprint →

Enterprise Engagement · €180,000 – €250,000 · by inquiry · 8–10 weeks

Up to 12 manifests, isolated private registry instance, personalized regulatory crosswalk PDF, two written executive briefings. For a regulated AI provider, notified body, or large research org under EU AI Act high-risk status (2 Dec 2027 deadline) or pursuing ISO 42001 certification.

Deliverables:

Up to 12 PRML manifests, registry-anchored on your isolated instance
Isolated private registry: registry.<your-org>.com (Cloudflare Worker + KV), basic auth, audit log endpoint, monitored uptime during the engagement window
CI pipeline deployment across up to 5 repositories
Personalized regulatory crosswalk PDF, 20-30 pages: EU AI Act Articles 12 & 18 + optional NIST AI RMF + optional ISO/IEC 42001 mapping, each control linked to a specific manifest hash
Two written executive briefings, 4-6 pages each (mid-engagement status & close-out defensibility statement)
60-day support window post-delivery, including responses to your auditor or notified body during their review
Right of citation in your compliance documentation and regulator filings

Three variants:

Standard €180K — up to 8 manifests, 3 repos, one framework
Plus €220K — up to 12 manifests, 5 repos, two frameworks
Premium €250K — full envelope, all three frameworks + custom mapping to one additional standard (e.g. MITRE ATLAS, NIST AI 600-1)

Payment: 33% / 33% / 34% in three milestones, three invoices. Net-30 procurement terms available. Scope and terms agreed per engagement. Optional on-site visit: half-day, €15,000 surcharge.
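As a worked example of the 33% / 33% / 34% split, the sketch below computes the three milestone amounts for each Enterprise variant. The `milestones` helper is hypothetical, and it assumes whole-euro amounts with any rounding remainder falling on the final milestone; the actual invoice amounts are set in the scope memo.

```python
def milestones(total_eur: int, shares=(33, 33, 34)) -> list[int]:
    """Split a fixed price into milestone amounts (whole euros)."""
    if sum(shares) != 100:
        raise ValueError("shares must add up to 100")
    # All but the last milestone are rounded down to whole euros.
    amounts = [total_eur * share // 100 for share in shares[:-1]]
    # Assumption: the final milestone absorbs any rounding remainder.
    amounts.append(total_eur - sum(amounts))
    return amounts

if __name__ == "__main__":
    for name, price in [("Standard", 180_000), ("Plus", 220_000), ("Premium", 250_000)]:
        print(name, milestones(price))
```

For the Standard variant this gives €59,400, €59,400, and €61,200.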

Email — Enterprise →

Pricing is fixed at engagement start and locked into the scope memo. No mid-engagement upsells, no scope-creep billing, no "out-of-scope" surcharges except the explicit Enterprise add-ons listed above. The full tier definitions live at spec/COMMERCIAL.md in the public repo — same source as this page.

Why this exists

Threshold pre-registration is fifteen years old. The cryptographic version is what survives a hostile reviewer.

SR 11-7 was 2011. EBA/GL/2017/16 came later. Internal MRM teams have been documenting threshold commitments for over a decade. The bind has always been signature plus workflow plus trust — defensible inside an institution that wants to defend it, increasingly thin in front of a reviewer that doesn't.

PRML doesn't replace any of that. It produces a SHA-256 hash that any toolchain can re-derive against canonical bytes — mechanically verifiable rather than procedurally trustable. Same content. Same fields. Different proof shape.

The Sprint is the wrapper that produces that hash for your published claims on a fixed timeline: 5 business days for a single claim under Audit Review, three weeks for three claims under Full Sprint. If your model is sitting in an MRM queue or a notified-body submission and the binding constraint is "show us the receipt", that's what we ship.

What you get

Five artefacts you can hand to a reviewer or an auditor.

Every deliverable is yours to keep, share, and re-use. The PRML manifest and the verifier are MIT-licensed; the audit report is plain Markdown with cryptographic citations.

[I] PRML manifest

An 8-field YAML file committing one published claim (including threshold, metric, dataset split, model version, submitter, and timestamp) to a SHA-256 hash. Anchored on registry.falsify.dev so anyone can re-derive and verify.
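A minimal sketch of how a manifest of this kind could be reduced to a SHA-256 hash. The six field names follow the ones named above; the example values, the `canonical_bytes` helper, and the choice of sorted-key compact JSON as the canonical form are assumptions for illustration only. The canonicalization that PRML v0.1 actually specifies is not reproduced on this page and may differ.

```python
import hashlib
import json

# Hypothetical manifest: field names follow the six named above,
# every value is made up for illustration.
manifest = {
    "threshold": 0.90,
    "metric": "accuracy",
    "dataset_split": "test",
    "model_version": "example-model-1.0",
    "submitter": "eval-team@example.org",
    "timestamp": "2025-01-01T00:00:00Z",
}

def canonical_bytes(fields: dict) -> bytes:
    # Assumed canonical form: sorted keys, no extra whitespace, UTF-8.
    # This is a stand-in, not the PRML canonicalization itself.
    text = json.dumps(fields, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def manifest_hash(fields: dict) -> str:
    # Hex digest that anyone holding the same bytes can re-derive.
    return hashlib.sha256(canonical_bytes(fields)).hexdigest()

if __name__ == "__main__":
    print(manifest_hash(manifest))
```

Because the bytes are canonical, key order does not affect the digest, while changing any committed value (for example, lowering the threshold after the fact) produces a different one.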

[II] Deployed verifier

Reference verifier (Python, JS, Go, or Rust — your pick) wired into your CI. Re-runs the claim, computes the canonical hash, exits 0 / 10 / 3 for pass / fail / tamper. No vendor lock-in.
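The three exit codes can be pictured with a short sketch. This is not the reference verifier: the `verify` function and its parameters are hypothetical, and the pass rule assumes a higher-is-better metric, which will not hold for every claim.

```python
import hashlib

# Exit codes quoted in the text above: pass / fail / tamper.
EXIT_PASS, EXIT_FAIL, EXIT_TAMPER = 0, 10, 3

def verify(manifest_bytes: bytes, locked_hash: str,
           observed: float, threshold: float) -> int:
    # Tamper check first: the manifest bytes must still hash to the locked value.
    if hashlib.sha256(manifest_bytes).hexdigest() != locked_hash:
        return EXIT_TAMPER
    # Assumption: higher is better, so the claim holds when observed >= threshold.
    return EXIT_PASS if observed >= threshold else EXIT_FAIL

if __name__ == "__main__":
    data = b"threshold: 0.90\nmetric: accuracy\n"  # illustrative bytes only
    locked = hashlib.sha256(data).hexdigest()
    print(verify(data, locked, observed=0.93, threshold=0.90))
```

Checking for tampering before checking the metric means an edited threshold is reported as tamper rather than as a misleading pass or fail.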

[III] Audit report

A written report (6-8 pages for Audit Review, 12-15 pages for Full Sprint) covering what was claimed, what was committed, how to verify, and the §8.1 limitations specific to your claim. Citable in papers, model cards, and notified-body submissions.

[IV] Public permalink

A registry.falsify.dev/<hash> page anyone can land on. Shareable in tweets, README badges, and reviewer responses. Optional: keep it private until you publish the underlying claim.

[V] CI integration template

A wired-in studio-11-co/prml-verify-action@v2 workflow in your repo: five-line composite Action, exits non-zero on tampered or regressed claims, optional public registry anchor. Same artefact that gets cited in the audit report.

How it runs (Full Sprint example)

Fixed scope. Fixed price. No discovery loop.

Three weeks, five written checkpoints. Audit Review collapses the same flow into 5 business days for a single claim. Enterprise stretches it across 8–10 weeks with bi-weekly progress notes. Every step is written; nothing requires a call.

Scope memo · written exchange

You email [email protected]. We reply within 48 business hours with a 2-page scope document naming the claims, target regulators, CI integration points, and the invoice. You sign the scope memo electronically. No scoping call required.

Claim selection · manifest drafts

Three claims locked from the candidate list in your scope memo. Three PRML manifest drafts written, sent for written review. You comment in writing; one revision round; manifests frozen by end of week 1.

Verifier deployment · rerun green

prml-verify-action wired into your CI. Re-runs the claims end-to-end against your dataset hashes. If a claim doesn't reproduce, you find out before the receipt goes public — that's a feature. Progress note delivered end of week 2.
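Checking a rerun against a dataset hash can be sketched as follows. The helper names and the temporary CSV file are hypothetical, and how prml-verify-action itself locates and hashes datasets is not described here; the sketch only shows the general technique of streaming a file through SHA-256 and comparing the result with a recorded digest.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file in chunks so large dataset splits need not fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def dataset_matches(path: Path, recorded_hash: str) -> bool:
    # True only if the file on disk is byte-identical to the one that was hashed.
    return sha256_file(path) == recorded_hash.lower()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        split = Path(tmp) / "test_split.csv"  # stand-in for a real dataset split
        split.write_bytes(b"id,label\n1,0\n2,1\n")
        recorded = sha256_file(split)
        print(dataset_matches(split, recorded))  # file unchanged since hashing
        split.write_bytes(b"id,label\n1,0\n2,0\n")
        print(dataset_matches(split, recorded))  # one label edited afterwards
```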

Audit report draft · written review

Draft of the 12–15 page report sent for review. You comment in writing; one revision round. Includes per-manifest §8.1 limitations and regulator-mappable evidence for the framework(s) you named at scope time.

D21 · Hand-off · permalinks live · support window opens

Final report delivered as PDF. Manifests committed to the registry (public or private depending on tier). Re-derivation scripts attached. 30-day written-Q&A support window opens with 48-hour response SLA.

Fit

A sprint is the right shape for some teams. For others it isn’t.

Honest signal upfront beats a discovery call that ends in a no.

Right fit

You’ve published or are about to publish a numeric eval claim (an accuracy, a refusal rate, a pass rate) that you’d like to be able to defend.
A reviewer, a regulator, or an internal red-team has questioned whether your threshold was fixed in advance.
You can spare ~6 hours of an engineer’s time across two weeks for the verifier integration and one rerun.
Your legal or compliance team is comfortable with an MIT-licensed deliverable and an open spec (CC BY 4.0).

Wrong fit

You need an end-to-end eval pipeline built from scratch. That’s a longer engagement; let’s talk separately.
You need a managed compliance service. We ship an artefact and walk away; we don’t hold ongoing audit retainers.
You want a generic AI-safety audit. The sprint is narrow on purpose: one claim, one receipt, one report.
You need the audit kept fully private with no public permalink. Possible, but ask; the public option is the default.

Who delivers it

You work with the person who wrote the spec.

No project managers, no associates, no offshore subcontractors. The Diagnostic Sprint is delivered by Cüneyt Öztürk personally.

Cüneyt Öztürk

Author of PRML v0.1 and the four reference implementations (Python, JS, Go, Rust). Independent researcher working on AI evaluation infrastructure and the PRML / falsify track. Engagements are structured under English-law SOWs.

[email protected] LinkedIn → GitHub → PRML v0.1 →

◆ Next step

Send an email naming the tier and the claim(s) you'd like to lock. You get a reply within one business day, every time.

If your scope doesn't fit any of the three tiers, the reply will say so plainly. No follow-up sequence, no nurture flow, no calls. The scope document arrives in writing within 48 business hours of your first email; the invoice rides with it. From there it's wire transfer and start.

Want only the hosted platform without a written engagement? See Pro pricing — subscription registry, ten-year retention, no founder hours involved.

Email [email protected] →