See it in action
How a Verdict Is Made

Two real claims, one from Fox News, one from CNN, walk through every step of the pipeline, from broadcast to published verdict. Same methodology, same standard, applied identically.

Read the full walkthrough →
MediaReceipts launched with a single founding editor applying this methodology to every published verdict. The process described here reflects both current practice and the standards that will govern the platform as it scales. Every published claim in the database has been adjudicated under this methodology.
Section 1

What counts as a claim

Not everything said on air is a factual claim. MediaReceipts tracks only verifiable factual assertions: statements that make a specific, testable claim about the world that can be confirmed or contradicted using authoritative sources.

A valid claim has three elements: an identifiable subject, a statement about a concrete fact, and enough specificity that a fact-checker knows what to look for. The question we ask is: could a trained researcher determine whether this statement is true or false by consulting primary sources? If yes, it's in scope.

In scope
  • Specific statistics and figures ("unemployment fell to 3.4%")
  • Claims about events ("the Senate voted 62–38")
  • Official findings and rulings ("the court ruled X unconstitutional")
  • Attributions ("the President said…", when the quote is checkable)
  • Claims about capabilities or institutional facts
  • Polling and survey data citations
Out of scope
  • Opinions, predictions, and value judgments ("this is a disaster")
  • Editorial characterizations without testable predicates
  • Rhetorical questions and non-declarative statements
  • Claims requiring non-public information to verify
  • Vague assertions without a falsifiable referent ("things are going well")
  • Personal conduct and private life (only professional, on-air statements are tracked)
Section 2

How claims are selected

Claims currently enter the MediaReceipts pipeline through editorial selection: the MediaReceipts team identifies factual claims made by tracked figures on monitored cable news programming. A public nomination channel is in development; when it launches, community-submitted clips and claims will enter the same review pipeline as editorially sourced claims, and not every nomination will lead to a published verdict.

Editorial selection maintains rough parity across tracked figures and networks when prioritizing claims. Because nomination volume may skew toward figures from one end of the political spectrum, MediaReceipts actively monitors nomination patterns and manually balances the workload. Aggregated nomination data by network is published publicly so anyone can verify that coverage is balanced.

Source transcripts are obtained from broadcast audio of the tracked program. The primary method is machine transcription of the broadcast itself, producing a speaker-aligned, timestamped transcript. Where machine transcription is unavailable, the Internet Archive TV News Archive is the configured secondary source. No paywalled or private transcript services are used. Each transcript records how it was obtained, and that provenance is exposed on a claim's public record via the episode_source field.
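
To illustrate how that provenance could travel with a transcript, here is a minimal sketch of a transcript record. Only the episode_source field is named in this methodology; the other field names and values are hypothetical, not the platform's actual schema.

```python
from dataclasses import dataclass

# Minimal sketch of a transcript record carrying its provenance.
# Only episode_source appears in the methodology text; everything else is hypothetical.
@dataclass
class TranscriptRecord:
    program: str              # tracked program the transcript covers
    air_date: str             # ISO broadcast date, e.g. "2026-05-09"
    episode_source: str       # e.g. "machine_transcription" (primary method) or
                              # "internet_archive_tv_news" (configured secondary source)
    speaker_aligned: bool     # machine transcripts carry per-utterance speaker labels
    timestamped: bool         # and timestamps tying each utterance to the broadcast
```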

Section 3

Source standards

MediaReceipts uses a six-tier source hierarchy. Verdicts require sources from the appropriate tier for the claim type. Primary sources (Tiers 1–3) are sufficient on their own to support a verdict. Secondary sources (Tiers 4–6) require independent corroboration from at least two sources.

Source independence matters as much as source tier. Five outlets reporting the same government press briefing represent one source, not five. When all confirming sources trace to a single institutional origin, this is flagged in the decision memo and affects the confidence level assigned to the verdict.

Tier · Category · Examples
Tier 1 · Primary Institutional Records: Congressional Record, BLS/BEA/Census data releases, court filings, legislation text, official vote tallies, executive orders
Tier 2 · Peer-Reviewed Research & Official Statistical Releases: Peer-reviewed journals, WHO/OECD/IMF data, CBO/GAO analyses, National Academy reports
Tier 3 · Direct Official Statements with Traceable Provenance: Press conference transcripts, C-SPAN footage, verified official social media posts, on-record attributed quotes in press pool reports
Tier 4 · Major Wire Services & Papers of Record: AP, Reuters, AFP dispatches; New York Times, Wall Street Journal, Washington Post, BBC, NPR, when directly citing primary documents
Tier 5 · Subject-Matter Expert Analysis: Named, credentialed experts at recognized institutions commenting within their domain of expertise: economists, legal scholars, military analysts
Tier 6 · Reputable Secondary Reporting: Established outlet investigative reporting, recognized fact-check organizations (PolitiFact, FactCheck.org), named-source reporting in established trade publications
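
As an illustration of the sufficiency rule above, here is a minimal sketch in Python. It assumes each source record carries its tier number and an origin identifier used for data-origin independence; the field names are hypothetical, not the platform's actual schema.

```python
# Sketch of the tier rule: a Tier 1-3 (primary) source can support a verdict on
# its own; Tier 4-6 (secondary) sources need at least two sources that trace to
# independent data origins (see the independence note above).
def sourcing_sufficient(sources: list[dict]) -> bool:
    if any(s["tier"] <= 3 for s in sources):
        return True
    secondary_origins = {s["origin"] for s in sources if 4 <= s["tier"] <= 6}
    return len(secondary_origins) >= 2

# Example: five outlets citing the same briefing share one origin,
# so they do not corroborate one another.
briefing_reports = [{"tier": 4, "origin": "pentagon_briefing_2026_05_01"}] * 5
assert sourcing_sufficient(briefing_reports) is False
```
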
Section 4

The two-gate review process

Every claim passes through two sequential editorial gates before a verdict is published. This structure separates the initial viability check (is this claim worth fact-checking?) from the substantive research and verdict assignment.

Gate 1
Claim Viability Assessment
An AI research assistant reviews the nominated claim and assesses whether it meets the standards for a verifiable factual assertion. It checks claim structure, source availability in the claim's domain, and whether initial verification attempts turn up meaningful evidence. A human editor then reviews the AI's assessment and makes the final call: proceed to Gate 2, or reject with a documented reason. Rejected claims are logged with their rejection code. (See §5 for the full Gate 1 rubric.)
Gate 2
Deep Verification & Verdict
Claims that pass Gate 1 receive a full independent research pass. The AI assistant searches for primary sources, evaluates source quality and independence, and produces a structured evidence summary and a verdict nomination. A human editor reviews the evidence independently (recording their own preliminary verdict before seeing the AI's recommendation) and makes the final determination. The published verdict, sources, and full reasoning are stored in the decision memo attached to each claim.

The human editor is always the quality gate. AI tools assist with research at scale, but no verdict is published without a human editor's review and sign-off. The AI's role is to surface evidence and flag patterns, not to determine outcomes.
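
A minimal sketch of what this division of labor could look like in a claim record, assuming illustrative field names; only the roles themselves (AI nominates, human editor decides and signs off) come from the process described above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record structure; field names are illustrative, not the platform's schema.
@dataclass
class ClaimReview:
    claim_id: str
    gate1_ai_assessment: Optional[str] = None       # AI's viability assessment
    gate1_editor_decision: Optional[str] = None     # "approve" or a rejection code
    gate2_ai_nomination: Optional[str] = None       # AI's evidence summary and verdict nomination
    gate2_editor_preliminary: Optional[str] = None  # recorded before seeing the AI's nomination
    gate2_editor_final: Optional[str] = None        # human sign-off; the only path to publication

    def publishable(self) -> bool:
        # No verdict is published without a human editor's final determination.
        return (self.gate1_editor_decision == "approve"
                and self.gate2_editor_final is not None)
```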

Section 5

Gate 1 selection criteria

Gate 1 is the platform's viability triage. It does not assign a verdict. It answers a single question for each extracted claim: is this claim worth the cost of deep verification at Gate 2? The criteria below govern that decision.

§5.1

The four assessment dimensions

Every claim entering Gate 1 is assessed across four dimensions in sequence. Each dimension feeds the next, and the four together produce a single binary nomination: approve forward, or reject with a coded reason.

01 · Claim structure
Is the extracted claim a discrete, falsifiable assertion?
Approve signal: Identifiable subject, testable predicate, concrete referent a record can confirm or refute. Rhetorical characterizations that assert something specific about observable reality (governmental structure, legal status, measurable outcome, physical event) route forward. They carry a verifiable factual core.
Reject / reduced-confidence signal: Vagueness, pure opinion, motive attribution, rhetorical questions, predictions. Hard stop: if the extracted text does not faithfully represent what the speaker said, reject as EXTRACTION_ERROR. The AI does not proceed on a structurally invalid record.
Decision-maker: AI nominates

02 · Attribution quality
Can the claim be cleanly tied back to the speaker?
Approve signal: Machine transcription of the actual broadcast paired with a verbatim quote from a named speaker.
Reject / reduced-confidence signal: Journalist's reconstruction in place of direct broadcast capture (mediation layer recorded, attribution confidence reduced, but proceeds). Hedged or anonymous attributions in the source language ("officials say," "sources familiar with") are flagged but do not by themselves trigger rejection.
Decision-maker: AI nominates

03 · Domain-based source availability
Is the domain one where authoritative, independent sources realistically exist?
Approve signal: High availability: domestic policy, legislation, public economic and health statistics, official findings. Medium-high (with scrutiny): polling and survey data, with extra scrutiny on framing and temporal selection.
Reject / reduced-confidence signal: Low availability: active military operations, classified programs, sealed legal proceedings, with verification confidence capped. Very low: diplomatic negotiations, intelligence assessments, typically rejected unless they reference a publicly released document.
Decision-maker: AI nominates

04 · Verification output interpretation
What did the preliminary research pass actually find?
Approve signal: True with independent sources (strongest signal). True with single-origin sourcing (approved with sourcing flag). Mostly True (approved; specific inaccuracy documented). Misleading (approved; among the most editorially valuable findings; trigger and distortion mechanism documented). False (approved; core to the platform's mission).
Reject / reduced-confidence signal: Unverifiable in a structurally constrained domain: reject. On a recent claim that may not yet be indexed, the call goes to human judgment.
Decision-maker: AI nominates
§5.1 · Compound
Compound claims have two paths. Sub-assertions that are independently verifiable and would plausibly produce different verdicts are split back to extraction. Sub-assertions too tightly linked to separate are approved forward as a compound, with notes that travel into Gate 2. Gate 1 does not determine which sub-assertion governs.
§5.1 · Source independence
Outlet diversity does not substitute for data-origin independence. Five outlets citing the same Pentagon briefing is one source, not five.
§5.1 · Lifecycle check
Official findings (court rulings, regulatory decisions, inspector general reports) are high-availability for the finding's existence. However, verification must also confirm the finding's current lifecycle status: appealed, stayed, overturned, superseded, or amended since the speaker cited it.
§5.1 · Hard stop
When the extracted text does not faithfully represent what the speaker said (a malformed extraction, a conflation, or a misparse), the claim is rejected immediately as EXTRACTION_ERROR. The AI does not proceed to verification on a structurally invalid claim record.
The four dimensions together produce one binary nomination. The AI verification pass produces this nomination; a human reviewer makes the final decision.
AI nominates · Human decides
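
A minimal sketch of how the four dimensions could fold into that single nomination, assuming a claim represented as a plain dict with illustrative keys; the rejection code names (EXTRACTION_ERROR, TOO_VAGUE, UNVERIFIABLE) come from §5.2, everything else is hypothetical.

```python
# Sketch only: the AI's nomination is advisory; a human reviewer makes the final call.
def gate1_nominate(claim: dict) -> tuple[str, str | None]:
    """Return ("approve", None) or ("reject", code)."""
    # 01 Claim structure: hard stop on a malformed extraction.
    if not claim["extraction_faithful"]:
        return ("reject", "EXTRACTION_ERROR")
    if not claim["testable_predicate"]:
        return ("reject", "TOO_VAGUE")
    # 02 Attribution quality: weak attribution reduces confidence but proceeds.
    # 03 Domain availability: very-low-availability domains reject unless the
    #    claim references a publicly released document.
    if claim["domain_availability"] == "very_low" and not claim["cites_public_document"]:
        return ("reject", "UNVERIFIABLE")
    # 04 Verification output: any substantive preliminary finding approves forward.
    return ("approve", None)
```
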
§5.2

Two-tier rejection codes

Every Gate 1 rejection carries a coded reason. Codes are split into two tiers. Tier 1 codes can be assigned by either the AI verification pass or a human reviewer. Tier 2 codes are reserved for human reviewers; they capture editorial judgments the AI is not authorized to make.

Tier 1 · Assignable by AI or human reviewer · 8 codes
TOO_VAGUE
AI or human
Claim lacks a testable predicate; no source can be targeted.
OPINION
AI or human
Editorial framing, attribution of motive, or characterization, not a factual assertion.
DUPLICATE
AI or human
Subsumed by an identical or more specific claim already in the queue. Applies within and across episodes. Cross-episode repetitions of false or misleading claims are generally assessed independently. Each broadcast is a separate editorial act, and routine deduplication would undercount inaccuracy. DUPLICATE is appropriate only when the second instance adds no new editorial signal.
TRIVIAL
AI or human
Technically verifiable but carries no editorial weight. Includes universally known facts, tautologies, and assertions so banal that fact-checking them would undermine the platform's credibility.
NOT_A_CLAIM
AI or human
Question, rhetorical assertion, or non-declarative statement.
UNVERIFIABLE
AI or human
No public primary-source path exists in currently indexed public sources. Common in classified, sealed, or active-conflict domains.
EXTRACTION_ERROR
AI or human
The extracted text does not faithfully represent what the speaker said. A pipeline artifact, not a speaker problem.
VERIFICATION_FAILED
System or human
Technical failure in the verification pass after retries exhausted. System-assigned by mr_verify.py, not by the AI's system prompt. The reviewer can manually re-trigger verification.
Tier 2 · Reserved for human reviewers only · 2 codes
SINGLE_SOURCE_INSTITUTIONAL
Human only
Verifiable in principle, but every available source traces to a single institution with no independent corroboration. The claim may be true, but the platform cannot publish a verdict with confidence.
PENDING_CONTEXT
Human only
Hold for transcript-context review. Used when the surrounding broadcast may change the claim's meaning and human assessment is required.

Why the two-tier split exists. It prevents the AI from both approving and rejecting the same fact pattern. When sources are single-origin, the AI flags the limitation in its reviewer summary and nominates approve. The human reviewer then decides whether the sourcing gap is disqualifying and, if so, applies SINGLE_SOURCE_INSTITUTIONAL as an override.
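
The tier split could be expressed as a simple allow-list check. A minimal sketch, using the code names from §5.2 and the reopen-eligible set from §5.4; the structure itself is illustrative, not the platform's implementation.

```python
# Tier 1: assignable by the AI verification pass or a human reviewer.
# (VERIFICATION_FAILED is system-assigned by mr_verify.py rather than by the AI prompt.)
TIER1_CODES = frozenset({
    "TOO_VAGUE", "OPINION", "DUPLICATE", "TRIVIAL",
    "NOT_A_CLAIM", "UNVERIFIABLE", "EXTRACTION_ERROR", "VERIFICATION_FAILED",
})
# Tier 2: reserved for human reviewers.
TIER2_CODES = frozenset({"SINGLE_SOURCE_INSTITUTIONAL", "PENDING_CONTEXT"})
# Rejections designed to be re-evaluable (see section 5.4).
REOPEN_ELIGIBLE = frozenset({"UNVERIFIABLE", "VERIFICATION_FAILED", "EXTRACTION_ERROR"})

def code_allowed(code: str, assigned_by_human: bool) -> bool:
    """A human reviewer may assign any code; the AI pass may assign only Tier 1 codes."""
    if assigned_by_human:
        return code in TIER1_CODES | TIER2_CODES
    return code in TIER1_CODES
```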

§5.3

Reviewer actions

A Gate 1 reviewer has three actions on any claim card.

Approve
Send the claim into Gate 2 for deep verification.
The Gate 1 preliminary verdict, source list, and sourcing signals travel forward as context; Gate 2 conducts its own independent research and reaches its own verdict.
Approval at Gate 1 is not a verdict; it is a judgment that the claim warrants the cost of the deep pass.
When the reviewer's decision differs from the AI's nomination, the override is recorded with a required one-sentence justification.
Reject
Remove the claim from the verification pipeline with one of the §5.2 coded reasons.
Human reviewers may assign any Tier 1 or Tier 2 code.
The claim record is preserved (see §5.4); rejection is not deletion.
Flag for review
Escalate to a senior reviewer when a confident call is not possible.
Flagging is not failure; it is the editor's structured way to say "I cannot commit on this one." The editor selects an escalation reason and provides a written explanation (minimum 20 characters).
Flagged claims are excluded from the editor's override-rate calculation. The editor made no approve/reject decision, so there is no agreement signal to count.
Senior queue
Flagged claims route to a senior-reviewer queue. The senior reviewer has three dispositions: approve forward into Gate 2, reject with a standard coded reason, or return to the editor's queue with a guidance note. Every disposition requires a written note of at least 20 characters; all escalation and disposition fields become part of the claim's permanent record.
Escalation reasons attached to a flag
Rerun verification
The verification pass surfaced something off and warrants a second run.
Source-fidelity concern
A primary source's faithfulness to the underlying record is in question.
Editorial sensitivity
The claim raises publication-judgment considerations, beyond standard sourcing and structure, that warrant senior review.
Objectivity concern
The reviewer wants a second pair of eyes on a sensitive editorial call.
Decision difficulty
A genuine close call between approve and reject.
Other
Free-text disposition note required; claim routes to senior queue.
§5.4

What rejection at Gate 1 means

Rejection at Gate 1 removes a claim from the verification pipeline. It does not delete the claim. Every rejected claim is retained in the platform's database with its rejection reason, the reviewer who made the call, the time of rejection, and (where applicable) any senior-reviewer disposition notes from a prior flag. This record is internal. Rejected claims do not appear on figure profiles, network scorecards, or platform-wide accuracy figures, because they have not been verified and have not received a verdict.

Reopen-eligible
Designed to be re-evaluable
UNVERIFIABLE
May reopen when public sources later become available, which is common when a sealed proceeding is unsealed or a classified matter is declassified.
VERIFICATION_FAILED
May reopen by manual re-trigger after the technical failure that caused it has been resolved.
EXTRACTION_ERROR
Held for re-extraction rather than discarded; the rejection identifies a pipeline artifact, not a speaker problem.
Typically terminal
No verifiable predicate or editorial value
OPINION · NOT_A_CLAIM · TOO_VAGUE · DUPLICATE · TRIVIAL
These categories are typically terminal because the claim itself does not contain a verifiable factual predicate (or editorial value, in the case of DUPLICATE and TRIVIAL) that subsequent source availability would change.

The rejection record is permanent and traceable. If a rejection is challenged (by the figure named, by an outside researcher, or by a contributor), the platform can produce the full chain of reasoning that produced it.

Note: Re-evaluation of previously rejected claims is an editorial capability the platform supports in principle. Formal procedures governing when and how re-evaluation is triggered are under development and will be documented separately.
Section 6

Quality controls

Consistent verdicts across different editors and different claims require structured quality controls. The standards below are drawn from academic content analysis. The founding editor is currently the sole human reviewer; the second-review sampling and inter-rater reliability mechanisms described below will activate once second-reviewer positions are filled.

≥ 0.7
Target Cohen's kappa
The statistical threshold for "substantial agreement" between independent reviewers on the same claim.
20%
Independent second review
Once second-reviewer positions are filled, a minimum of one in five published claims will be independently reviewed by a second editor before publication.
Quarterly
Reliability reporting
Inter-rater reliability scores will be tracked and published quarterly once a second human reviewer is in place. If Cohen's κ falls below 0.6 in any tracked quarter, the editorial team will convene a calibration session.
New editor certification

When new editors are added, they will be required to score a minimum Cohen's κ of 0.7 on a set of 20 pre-scored calibration claims before being permitted to review claims independently. The calibration set covers all four verdict categories and includes claims from across the political spectrum.
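
For reference, Cohen's kappa compares the observed agreement between two reviewers against the agreement expected by chance from each reviewer's individual verdict frequencies. A minimal sketch of the calculation; the reviewer labels in the example are illustrative.

```python
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two reviewers who labeled the same set of claims."""
    n = len(rater_a)
    # Observed agreement: share of claims where both reviewers chose the same verdict.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each reviewer's marginal verdict frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[v] / n) * (freq_b[v] / n) for v in freq_a.keys() | freq_b.keys())
    return (p_o - p_e) / (1 - p_e)

# Example: two reviewers agree on 19 of 20 calibration claims;
# kappa comes out to about 0.92, above the 0.7 certification threshold.
a = ["True"] * 10 + ["Mostly True"] * 5 + ["Misleading"] * 3 + ["False"] * 2
b = ["True"] * 10 + ["Mostly True"] * 5 + ["Misleading"] * 2 + ["False"] * 3
print(round(cohens_kappa(a, b), 2))
```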

Section 7

Gate 2 verdict categories

MediaReceipts uses four verdict categories. Each has a precise, operationalizable definition designed to produce consistent results when applied to the same claim. The categories are ordered by the nature of the discrepancy, not by severity of implied wrongdoing.

True
The claim's core assertion is fully supported by authoritative sources, with no material omissions that would change a reasonable viewer's understanding.
Minor imprecisions (rounding, approximate timeframes, common-usage simplifications) do not prevent a True verdict if they don't change the substance of what's being asserted.
Mostly True
The claim's core direction is correct, but it contains one or more identifiable factual errors that a competent fact-checker would note, even if they don't reverse the overall point.
Example: "Unemployment fell to 3.4%" when the actual figure is 3.6%. Directionally accurate, but the specific number is wrong.
Misleading
Every specific fact cited is technically accurate, but the overall impression the claim creates is demonstrably false: through omission, selective data, or distortive framing.
The key distinction from Mostly True: a Mostly True claim has a wrong detail but a right picture. A Misleading claim has right details but a wrong picture.
Held from public display · To be released upon advisory board launch
False
The claim's core factual assertion is directly contradicted by authoritative sources.
"Directly contradicted" means the authoritative source provides a different answer to the same factual question the claim addresses, not merely a different framing or emphasis.
Held from public display · To be released upon advisory board launch

A fifth outcome, Unverifiable, is used when a claim is a legitimate factual assertion but authoritative public sources sufficient to assess it do not exist. Unverifiable claims are retained in the internal database for future re-evaluation and do not count toward a figure's accuracy record.

Section 8

The Misleading category

The Misleading verdict is the most carefully defined category in our methodology, because technically true statements can cause real informational harm. We apply Misleading only when the distortion is material, meaning a viewer who heard the claim would form a substantially different understanding of the issue than one who received the complete picture.

Every Misleading verdict must identify which of four defined triggers applies and provide specific sourced evidence for how the distortion operates.

Omission
The claim, as stated, lacks context that, had it been included, would materially reverse the directional implication for a reasonable viewer. The omitted context must be established by independent sources.
Cherry-pick
The claim cites a real data point drawn from a time period, subcategory, or population that is not representative of the broader dataset, producing a conclusion substantially different from what the broader dataset supports. Both the cited figure and the broader dataset must be independently verified.
Framing
The claim's construction (comparison structure, denominator choice, implied causation, juxtaposition) supports a false inference not supported by the underlying facts.
Outdated
The claim cites data or references an event whose underlying facts are no longer current, and the claim's construction does not qualify the age of the cited material. The material difference between the cited and current figures must be documented.

The Misleading and False categories are currently collected internally and will be released publicly upon seating of the independent advisory board. The True and Mostly True categories are displayed publicly now. When Misleading verdicts are released, they will appear alongside a standard disclosure explaining that the verdict assesses informational effect, not speaker intent. See Section 12 for the full intent policy.

Section 9

How accuracy figures are calculated

Every accuracy percentage shown on a figure profile, network scorecard, or platform-wide chart is a direct count of published verdicts. The platform does not apply a weighted formula, a confidence multiplier, a time-decay function, or a correction modifier to the displayed numbers. The number you see is the number of claims in each verdict category, expressed as a percentage of that figure's total published verdicts.

A figure's accuracy rate is the count of True verdicts divided by the count of all published verdicts on that figure. While the platform is in its True/Mostly True public verdict phase, the calculation includes only those two categories. When Misleading and False are released publicly, those categories will enter the calculation under the same direct-count method: Misleading and False verdicts will count toward "all published verdicts" but not toward "True verdicts." Network and platform-level accuracy figures aggregate the same way.
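
The direct-count arithmetic, as a minimal sketch once Misleading and False are released publicly; the verdict labels come from Section 7, and the function itself is illustrative.

```python
def accuracy_rate(published_verdicts: list[str]) -> float:
    """Direct count: True verdicts as a share of all published verdicts.
    No weighting, confidence multiplier, time decay, or correction modifier."""
    return sum(v == "True" for v in published_verdicts) / len(published_verdicts)

# Example: 6 True, 3 Mostly True, 1 False published on a figure -> 60% accuracy.
verdicts = ["True"] * 6 + ["Mostly True"] * 3 + ["False"]
assert accuracy_rate(verdicts) == 0.6
```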

The platform intends to surface correction data alongside verdict records so that post-publication remediation by a figure or outlet is visible to the public. The display mechanism will be defined in a separate design specification. The original verdict's classification does not change regardless of correction status, consistent with the principle that a False claim was still False at the time it aired. During the True/Mostly True-only publication phase, percentage-based accuracy displays are suppressed; only verdict counts are shown publicly.

Section 10

Corrections tracking

For every claim, the Gate 2 research pass includes a structured correction search at the time of verdict preparation. The Adjudicator checks the network's correction page, keyword-searches for correction language, scans the figure's on-air and social statements, and reviews external fact-checker feeds. Any correction found at research time is recorded with the verdict in five structured fields and is part of the verdict's published record.

Continuous post-publication monitoring (automated re-checking of published verdicts against new corrections as they appear in the wild) is in active development under a published design (Decision Memo DM-2026-010). When live, late-arriving corrections will be surfaced on the affected verdict and on the figure's record, and a "last-checked" timestamp will be visible on every verdict so that the act of re-checking is auditable, not just the result.

A correction does not change the original verdict. A False claim that was later corrected is still False at the time it aired. Corrections are recorded with the affected verdict and remain part of that verdict's permanent published record. How corrections are reflected in figure-level accuracy figures is described in §9.

Section 11

Appeals process

Any individual (including the figure in question, their representatives, or members of the public) may submit a formal appeal of any published verdict. Appeals are reviewed on the merits. All verdict revisions are documented in a public changelog; original verdicts remain visible with a "revised" notation.

1
Submit an appeal
Anyone can submit an appeal through the public form. Include the claim ID, the specific factual assertion you believe is incorrect, and the source you believe contradicts our verdict.
2
Acknowledgment within 7 days
All appeals receive an initial acknowledgment confirming receipt within 7 calendar days.
3
Substantive editorial review within 30 days
A full editorial review is completed within 30 days of submission. The reviewer assesses the submitted evidence against the original sourcing and may request additional information.
4
Public changelog updated
If a verdict is revised, the change is logged in a public changelog with the original verdict, new verdict, date of revision, and the grounds for the change. The original verdict remains visible with a "revised" label.
Section 12

A note on intent

MediaReceipts verdicts assess the informational effect of a claim on a reasonable viewer, not the speaker's intent, motivation, or character. This applies to all four verdict categories, including Misleading.

The test for every verdict is purely effect-based: does the claim's specific factual content match the authoritative record? Does its construction, framing, or omissions create a false impression? These questions have answers grounded in evidence. Whether a speaker intended to mislead does not.

A Misleading verdict means: this claim, as stated, creates a false impression in a reasonable viewer's mind. It does not mean the speaker lied, intended to deceive, or acted in bad faith. Intent is unknowable from a transcript, and a methodology that required proving intent would be both editorially subjective and legally vulnerable.

This principle is stated here in full and will appear as a standard disclosure on every published Misleading verdict when that category is released publicly.

The trigger labels in Section 8 describe properties of the published claim's content, not the speaker's selection process or motive. Whether a speaker chose to omit, cherry-pick, frame, or rely on outdated material is unknowable from a transcript and outside the scope of any verdict.

Changelog

Public version history

Gate 1 v1.12 · Gate 2 v1.10 · May 9, 2026
Trigger language clarification; escalation reason rename; correction-display direction; senior queue label standardization; changelog added.

Gate 1 v1.11 · Gate 2 v1.9 · May 5, 2026
Extraction model update (Haiku); companion document synchronization.