PRML Lie Detector · v0.1 · heuristic

Paste a benchmark claim. See what it forgot to say.

A 30-second forensic breakdown of any ML accuracy, refusal-rate, or pass-rate claim. We check it against 9 falsifiability criteria that the PRML spec considers minimum hygiene. The check is heuristic, not authoritative, but it surfaces what most published claims quietly omit.

This is a heuristic regex-based check, not a formal PRML verifier. Some honest claims will score low because they describe their methodology elsewhere; some dishonest ones may gain points by mentioning the right keywords without backing them up. The score surfaces structural omissions only, so interpret it accordingly. The real spec is at spec.falsify.dev/v0.1.
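
To make the caveat concrete, here is a minimal sketch of what a keyword-style regex check can look like. The three patterns and their names are hypothetical, chosen only for illustration; they are not the 9 criteria this tool uses, and the real patterns may differ substantially.

```python
import re

# Hypothetical patterns for illustration only; NOT the tool's actual criteria.
CHECKS = {
    "metric_named": re.compile(r"\b(accuracy|pass rate|refusal rate)\b", re.I),
    "number_stated": re.compile(r"\d+(?:\.\d+)?\s*%"),
    "seed_mentioned": re.compile(r"\bseed\b", re.I),
}

def score(claim: str) -> dict:
    """Map each hypothetical criterion to whether its pattern appears in the claim."""
    return {name: bool(pattern.search(claim)) for name, pattern in CHECKS.items()}

honest_but_terse = score("Our model reaches 94% accuracy.")
keyword_only = score("We care about the seed.")
```

The second claim gets credit for "seed_mentioned" without stating any seed value, which is the kind of false positive described above: a pattern match shows that a word is present, not that the commitment behind it exists.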

If this surfaced something you didn't expect

The 9 criteria checked here aren't arbitrary. They are the structural fields PRML asks you to commit, as a SHA-256 hash, before you run an evaluation. Once committed, the hash is your tamper-evident receipt that the threshold and metric were fixed in advance.
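
The commit-then-evaluate idea can be sketched in a few lines. This is an illustration under stated assumptions, not the PRML format: the field names (`metric`, `threshold`, `dataset`, `seed`) and the canonical-JSON serialization are hypothetical, and the spec at spec.falsify.dev/v0.1 defines the actual fields and encoding.

```python
import hashlib
import json

def commit_hash(fields: dict) -> str:
    """SHA-256 hex digest over a canonical JSON encoding of the fields.

    Sorting keys and fixing separators makes the digest independent of
    dict ordering and whitespace. This canonicalization is an assumption,
    not the PRML-specified one.
    """
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical pre-registered claim, fixed before the evaluation runs.
claim = {"metric": "accuracy", "threshold": 0.90, "dataset": "example-dataset-v1", "seed": 42}
receipt = commit_hash(claim)

# Lowering the threshold after the fact produces a different digest.
tampered = dict(claim, threshold=0.85)
reordered = {"seed": 42, "dataset": "example-dataset-v1", "threshold": 0.90, "metric": "accuracy"}
```

Publishing `receipt` before running the evaluation lets anyone later recompute the digest from the disclosed fields and confirm they match. The hash alone does not prove when it was created, so it needs to be recorded somewhere with a trustworthy timestamp.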
