Tactics·Jul 5, 2026

A 6-phase pipeline for AI agents that separates fact from inference

By Maya · Tactics desk·Human-reviewed·✓ Verified Jul 5, 2026·4 min read·1 source

To prevent AI research agents from presenting inferences as facts, use a deterministic pipeline where the LLM only extracts claims. Rule-based code must handle all scoring and labeling.

An AI research agent that mixes retrieved facts with its own conclusions is a liability. The model might report a market size of 1.2 trillion won (a retrieved data point) and then infer the market is "growing fast" (a conclusion), presenting both with equal confidence. For any decision with stakes, this ambiguity is unacceptable.

The solution is not a better prompt. It is a structural division of labor. The LLM should never be allowed to decide what constitutes a fact. That judgment must be handled by deterministic, rule-based code that is auditable and reproducible.

The LLM extracts, code judges

The core of the architecture is a hard separation of duties. The LLM is used for its strength in parsing unstructured text into structured claims. All subsequent steps, including scoring, cross-checking, and labeling, are executed by deterministic code.

The LLM does	Deterministic code does
Extract claims from a fetched page; summarize a passage	Score, cross-check, sort, deduplicate, label FACT/INFERENCE, decide freshness

This split provides two critical guarantees. First, reproducibility: given the same harvested content and extracted claims, the scoring code produces the same labeled facts and inferences on every run. Second, it prevents laundering: the model cannot promote its own guess to the status of a fact because it never controls the labeling process.
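
One way to make the laundering guarantee concrete is to give the extractor an output type that has no label or confidence field at all. The sketch below is a minimal illustration, not the source's implementation: `ExtractedClaim`, `ALLOWED_KEYS`, and `parse_llm_output` are hypothetical names, and it assumes the LLM returns a JSON array of objects.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: the only fields the extractor is allowed to fill in.
ALLOWED_KEYS = {"subject", "predicate", "value", "source_url"}


@dataclass(frozen=True)
class ExtractedClaim:
    """A claim as extracted by the LLM. Note there is no label or confidence field."""
    subject: str
    predicate: str
    value: str
    source_url: str


def parse_llm_output(raw_json: str) -> list[ExtractedClaim]:
    """Parse extractor output, rejecting any field the model is not allowed to set."""
    claims = []
    for item in json.loads(raw_json):
        extra = set(item) - ALLOWED_KEYS
        if extra:
            raise ValueError(f"model tried to set disallowed fields: {sorted(extra)}")
        # Missing fields raise TypeError here, so incomplete claims also fail loudly.
        claims.append(ExtractedClaim(**item))
    return claims
```

If the model's output carries a `label` key, parsing fails instead of passing the label downstream, so the FACT/INFERENCE decision can only come from code that runs later.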

A six-phase pipeline for provenance

The source's author proposes an explicit six-phase pipeline to enforce this separation. Each phase is a distinct, testable component.

PLAN: The initial user query is broken down into specific sub-queries and a list of sources to consult.
HARVEST: The system fetches data from the planned sources. This stage is purely data collection and does not involve an LLM.
NORMALIZE: This is the only phase where the LLM operates. It reads the raw, harvested content and extracts structured claims from each source.
CORROBORATE: Claims are grouped, and the system counts the number of independent sources backing each one.
SCORE: Rules are applied to assign labels. A claim might be labeled FACT only if it meets a strict criterion, such as corroboration from two or more independent sources.
RENDER: The final output presents the labeled FACTs and INFERENCEs, along with an explicit list of information gaps.
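
The six phases above can be sketched as six small functions wired in sequence. This is a simplified illustration under stated assumptions, not the source's code: every function name is hypothetical, `fetch` and `extract` are injected so the example runs offline (in practice `extract` would be the LLM call), identical claim strings stand in for real claim matching, distinct URLs stand in for independent sources, and the list of information gaps is left out for brevity.

```python
from collections import defaultdict
from typing import Callable


def plan(query: str, sources: list[str]) -> list[tuple[str, str]]:
    """PLAN: pair the query with each source to consult (sub-query splitting omitted)."""
    return [(query, url) for url in sorted(sources)]


def harvest(tasks: list[tuple[str, str]], fetch: Callable[[str], str]) -> dict[str, str]:
    """HARVEST: plain fetching, no LLM involved."""
    return {url: fetch(url) for _, url in tasks}


def normalize(pages: dict[str, str], extract: Callable[[str], list[str]]) -> list[tuple[str, str]]:
    """NORMALIZE: the only LLM step; returns (claim_text, source_url) pairs."""
    return [(claim, url) for url, text in pages.items() for claim in extract(text)]


def corroborate(claims: list[tuple[str, str]]) -> dict[str, set[str]]:
    """CORROBORATE: group identical claims and collect their distinct sources."""
    backing: dict[str, set[str]] = defaultdict(set)
    for claim, url in claims:
        backing[claim].add(url)
    return dict(backing)


def score(backing: dict[str, set[str]], min_sources: int = 2) -> dict[str, str]:
    """SCORE: a fixed rule assigns the label; min_sources is an assumed threshold."""
    return {
        claim: ("FACT" if len(urls) >= min_sources else "INFERENCE")
        for claim, urls in backing.items()
    }


def render(labels: dict[str, str]) -> str:
    """RENDER: sorted output, so identical inputs give identical text."""
    return "\n".join(f"[{label}] {claim}" for claim, label in sorted(labels.items()))


def run(query: str, sources: list[str], fetch, extract) -> str:
    pages = harvest(plan(query, sources), fetch)
    return render(score(corroborate(normalize(pages, extract))))


# Toy data: two fake pages and a stub extractor that splits on ';'.
pages = {
    "https://a.example": "size=1.2T",
    "https://b.example": "size=1.2T;growing fast",
}
report = run("market size", list(pages), pages.__getitem__, lambda text: text.split(";"))
print(report)
# [INFERENCE] growing fast
# [FACT] size=1.2T
```

Because each phase is a plain function with explicit inputs and outputs, each one can be unit-tested in isolation, and only `normalize` needs a stub for the model.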

Earning the FACT label

Under this model, FACT is not a default state. It is a status that a claim must earn by satisfying a predefined, programmatic rule. The system is designed so a claim is an INFERENCE unless it passes a specific gate, like being verified by an official API or appearing in multiple, distinct sources. If your research agent gives different confidence on the same question across runs, an LLM is scoring somewhere in the pipeline. This architecture is designed to eliminate that variability.
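
A gate of this kind can be written as a pure function with INFERENCE as the fall-through result. The sketch below is illustrative only: the `Evidence` fields, the `label` function, and the threshold of two sources are assumptions chosen to mirror the examples in the text, not rules taken from the source.

```python
from dataclasses import dataclass

MIN_SOURCES = 2  # assumed threshold; the text gives "two or more" only as an example


@dataclass(frozen=True)
class Evidence:
    independent_sources: int        # count produced by the corroboration step
    verified_by_official_api: bool  # True if an authoritative API confirmed the value


def label(evidence: Evidence) -> str:
    """Return FACT only when a gate is passed; everything else stays INFERENCE."""
    if evidence.verified_by_official_api:
        return "FACT"
    if evidence.independent_sources >= MIN_SOURCES:
        return "FACT"
    return "INFERENCE"


print(label(Evidence(independent_sources=1, verified_by_official_api=False)))  # INFERENCE
print(label(Evidence(independent_sources=3, verified_by_official_api=False)))  # FACT
print(label(Evidence(independent_sources=0, verified_by_official_api=True)))   # FACT
```

The function has no randomness and no model call, so the same evidence always yields the same label.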

What we'd change

This playbook offers a robust path to auditable AI outputs, but its implementation introduces significant trade-offs. The architecture is more complex and expensive than a standard Retrieval-Augmented Generation (RAG) pipeline. The HARVEST and CORROBORATE stages require fetching from and processing multiple sources for a single query, increasing both latency and operational cost.

The system's integrity is also entirely dependent on the quality of its source material. The CORROBORATE phase assumes the availability of multiple independent and accurate sources. If the top-ranked sources for a query all cite the same incorrect information, the pipeline will confidently label a falsehood as a FACT. The architecture does not solve the garbage-in, garbage-out problem; it simply makes the provenance of the garbage transparent.
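
One partial mitigation is to count distinct origins rather than distinct URLs, so that several articles repeating one press release count once. The sketch below assumes a hypothetical `cites` mapping from each source URL to the upstream URL it cites (or `None` if it cites nothing); building that mapping reliably is its own problem and is not shown. It narrows the failure mode described above but does not make an inaccurate origin accurate.

```python
from typing import Optional


def origin_of(url: str, cites: dict[str, Optional[str]]) -> str:
    """Follow citation links upstream until reaching a source with no known upstream."""
    seen: list[str] = []
    while url not in seen:
        seen.append(url)
        parent = cites.get(url)
        if parent is None:
            return url
        url = parent
    # Citation loop: pick a stable representative from the loop itself.
    return min(seen[seen.index(url):])


def independent_origins(backing: list[str], cites: dict[str, Optional[str]]) -> int:
    """Count how many distinct origins stand behind a claim's backing sources."""
    return len({origin_of(url, cites) for url in backing})


# Two news stories repeat one agency release; the statistics report is original.
cites = {
    "https://news-a.example/story": "https://agency.example/release",
    "https://news-b.example/story": "https://agency.example/release",
    "https://stats.example/report": None,
}
print(independent_origins(
    ["https://news-a.example/story", "https://news-b.example/story"], cites))  # 1
print(independent_origins(list(cites), cites))  # 2
```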

Finally, the process of corroborating claims is non-trivial. Simple token matching is brittle. Determining if two differently worded statements from separate sources make the same semantic claim is a difficult computer science problem in itself. A naive implementation risks misclassifying nuanced or complex information.
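
To see why token matching is brittle, take two sentences that state the same amount in different words. The sketch below handles only one narrow case, amounts written with "million", "billion", or "trillion", by reducing them to a canonical integer; `canonical_amount` and its regular expression are hypothetical, and a usable claim key would also need the subject, currency, and date. General semantic matching is not attempted here.

```python
import re
from decimal import Decimal
from typing import Optional

SCALE = {"million": 10**6, "billion": 10**9, "trillion": 10**12}
AMOUNT = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*(million|billion|trillion)")


def canonical_amount(text: str) -> Optional[int]:
    """Return the first scaled amount in the text as an integer, or None if absent."""
    match = AMOUNT.search(text.lower())
    if match is None:
        return None
    number = Decimal(match.group(1).replace(",", ""))  # Decimal avoids float rounding
    return int(number * SCALE[match.group(2)])


a = "The market is worth 1.2 trillion won."
b = "Market size reached 1,200 billion won last year."

# Naive token overlap finds almost nothing in common between the two sentences.
shared_tokens = set(a.lower().split()) & set(b.lower().split())
print(shared_tokens)

# The canonical amounts match, so the two sources can be grouped on this value.
print(canonical_amount(a) == canonical_amount(b))  # True
```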

Landing

Building a deterministic pipeline is a deliberate choice to prioritize trust and reproducibility over speed and simplicity. It moves an AI agent from a probabilistic tool to an auditable system of record. For founders building products in high-stakes domains like finance, law, or scientific research, where the cost of a hallucination is catastrophic, this architectural discipline is not optional. It is the foundation of an enterprise-ready product.

The investor read

This playbook signals a maturation of the AI/RAG market, moving from 'magic' to auditable reliability. An architecture that separates LLM-based extraction from rule-based verification is a defensive moat against simple API wrappers. It indicates a product built for high-stakes enterprise use cases (finance, legal, medical) where hallucinations create significant liability. While more complex and costly to build and run, this approach creates a stickier, more valuable product. For investors, this is a blueprint for an enterprise-grade AI tool, not a bootstrapped side project. It's a deliberate trade of short-term velocity for long-term defensibility and trust.

Sources · how we verified

How to make an AI research agent label facts vs inferences — a deterministic provenance pipeline

Reported by the Maya desk on Founderr Pulse’s Tactics beat. Every factual claim is tied to a primary source and linked; anything that can’t be stood up doesn’t run. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.

