No intelligence output reaches a client unless it has passed multi-model cross-verification, claim-level evidence tracing, and confidence scoring. This is not a feature — it is the architecture.
Stage 1 — Signal Ingestion & Deduplication
Raw signals are ingested from regulatory feeds, clinical databases, and monitored sources. Each signal receives a content hash for deduplication and a novelty score to suppress repetition. Duplicate or near-duplicate signals are flagged before entering the pipeline.
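The hash-and-novelty step can be sketched as follows. This is a minimal illustration, not the production implementation: the function names (`content_hash`, `novelty_score`) and the 3-token shingle size are assumptions for the example.

```python
import hashlib

def content_hash(text: str) -> str:
    # Normalize whitespace and case so trivially reformatted copies
    # of the same signal collide to one hash (exact-duplicate check).
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def novelty_score(text: str, seen_shingles: list[set]) -> float:
    # 1.0 = entirely new signal; near 0.0 = near-duplicate of a
    # previously seen signal (suppressed before entering the pipeline).
    tokens = text.lower().split()
    shingles = {tuple(tokens[i:i + 3]) for i in range(max(1, len(tokens) - 2))}
    if not seen_shingles:
        return 1.0
    return 1.0 - max(jaccard(shingles, s) for s in seen_shingles)
```

Exact duplicates are caught by the hash; paraphrased near-duplicates fall through to the shingle overlap check, which lowers their novelty score.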
Stage 2 — Multi-Axis Scoring
Every signal is scored across six dimensions: relevance, urgency, credibility, novelty, contradiction risk, and decision impact. Scoring is performed by a cost-efficient model (Gemini 2.5 Flash) so the pipeline can process signals at high volume without eroding the inference budget. Composite scores determine pipeline priority.
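A composite score over the six dimensions could be computed as a weighted sum. The weights below are hypothetical, chosen only to make the sketch concrete; the pipeline's actual weighting is not specified here.

```python
# Hypothetical weights over the six scoring dimensions (sum to 1.0).
WEIGHTS = {
    "relevance": 0.25, "urgency": 0.20, "credibility": 0.20,
    "novelty": 0.15, "contradiction_risk": 0.10, "decision_impact": 0.10,
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each expected in [0, 1]."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)
```

Requiring all six dimensions up front means an incompletely scored signal fails loudly instead of silently ranking low.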
Stage 3 — Evidence Packet Assembly
High-scoring signals are clustered into structured evidence packets containing extracted claims, entities, timestamps, and source quality scores. These packets — not raw signals — are what the generation model receives. The generator never sees unstructured input.
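An evidence packet might be modeled as a small typed structure like the one below. The field names are illustrative assumptions based on the contents listed above (claims, entities, timestamps, source quality scores).

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str        # the extracted factual assertion
    source_id: str   # which ingested source it came from

@dataclass
class EvidencePacket:
    packet_id: str
    claims: list[Claim]
    entities: list[str]
    timestamp: str   # ISO 8601, e.g. "2025-01-01T00:00:00Z"
    # Per-source quality score in [0, 1], keyed by source_id.
    source_quality: dict[str, float] = field(default_factory=dict)
```

Because the generator only ever receives objects of this shape, malformed or unstructured input is rejected at the type boundary rather than reaching the model.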
Stage 4 — Primary Generation (Claude)
The primary model generates briefings or dossiers from evidence packets only. Prompts enforce evidence-only reasoning: the model must cite specific evidence for every claim. Client memory and prior context are injected to prevent repetition and ensure continuity.
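Evidence-only prompting can be sketched as a prompt builder that admits nothing but the structured packet and the client's memory. The prompt wording and the `claims` dict shape here are assumptions for illustration.

```python
def build_generation_prompt(packet: dict, client_memory: list[str]) -> str:
    # Only structured evidence enters the prompt; raw source text never does.
    evidence_lines = [
        f"[{c['id']}] {c['text']} (source quality: {c['quality']:.2f})"
        for c in packet["claims"]
    ]
    memory = "\n".join(f"- {m}" for m in client_memory) or "- (none)"
    return (
        "Write a briefing using ONLY the evidence below.\n"
        "Cite an evidence ID like [E1] for every claim you make.\n"
        "If the evidence does not support a statement, omit it.\n\n"
        "Previously reported to this client (do not repeat):\n"
        f"{memory}\n\n"
        "Evidence:\n" + "\n".join(evidence_lines)
    )
```

The evidence IDs embedded here are what the later verification and claim-tracking stages trace against.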
Stage 5 — Cross-Model Verification (OpenAI)
A separate verifier model — from a different provider — reviews the output against the original evidence. The verifier never sees the generator's reasoning chain. It independently identifies unsupported claims, contradictions, overconfident assertions, and missing caveats. Outputs that fail verification are revised and re-verified.
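The structural half of verification, checking a draft's citations against the evidence it was given, can be illustrated without any model call. This sketch flags only two of the failure modes named above (unsupported claims and citations of nonexistent evidence); the naive sentence split on "." is an assumption for brevity.

```python
import re

def verify_output(draft: str, evidence_ids: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the draft passes.

    The check sees only the draft and the evidence IDs, never the
    generator's reasoning chain.
    """
    problems = []
    cited = set(re.findall(r"\[(E\d+)\]", draft))
    unknown = cited - evidence_ids
    if unknown:
        problems.append(f"cites nonexistent evidence: {sorted(unknown)}")
    for sentence in filter(None, (s.strip() for s in draft.split("."))):
        if not re.search(r"\[E\d+\]", sentence):
            problems.append(f"uncited claim: {sentence!r}")
    return problems
```

In the real pipeline the verifier model additionally judges semantic support, contradiction, overconfidence, and missing caveats; this sketch covers only the mechanically checkable layer.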
Stage 6 — Claim-Level Tracking & Confidence Classification
Every claim in the final output is extracted, stored, and tagged with source evidence IDs, support strength, and a confidence classification: High (multi-source corroboration), Medium (single strong source), Low (limited evidence), or Blocked (contradicted by evidence). No claim reaches a client without a traceable evidence chain.
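The four-tier confidence labeling maps naturally onto counts of supporting and contradicting evidence. The specific thresholds below are assumptions that mirror the tier definitions in the text.

```python
def classify_confidence(supporting: int, contradicting: int) -> str:
    # Hypothetical thresholds matching the four tiers described above.
    if contradicting > 0:
        return "Blocked"   # contradicted by evidence
    if supporting >= 2:
        return "High"      # multi-source corroboration
    if supporting == 1:
        return "Medium"    # single strong source
    return "Low"           # limited evidence
```

Note that contradiction dominates: even a multiply corroborated claim is Blocked if any evidence contradicts it, which is what keeps contradicted claims out of client output entirely.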
Anti-Hallucination Stack — 6 Layers
1. Evidence-Only Prompting — LLM receives structured evidence packets, not open-ended prompts
2. Cross-Model Verification — Generator and verifier are different providers; verifier never sees generator reasoning
3. Claim Extraction & Tracking — Every claim mapped to source evidence with support strength scoring
4. Confidence Classification — High / Medium / Low / Blocked labeling on every assertion
5. Revision Loops — Failed verifications trigger automated revision and re-verification (max 3 cycles)
6. Full Audit Trail — Every model call, decision, and revision logged with timestamps, models used, and token counts
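Layer 5's bounded revision loop can be sketched as a generate-verify cycle capped at three passes, as stated above. The function signatures are illustrative; in practice `generate` and `verify` would wrap calls to the two different providers.

```python
from typing import Callable

MAX_CYCLES = 3  # per layer 5: failed verifications trigger at most 3 cycles

def revise_until_verified(
    generate: Callable[[list[str]], str],
    verify: Callable[[str], list[str]],
) -> tuple[str, bool, int]:
    """Run generate -> verify, feeding failures back into the next draft.

    Returns (final_draft, passed, cycles_used). If the cap is hit,
    passed is False and the output would be withheld or escalated.
    """
    problems: list[str] = []
    for cycle in range(1, MAX_CYCLES + 1):
        draft = generate(problems)   # prior problems guide the revision
        problems = verify(draft)
        if not problems:
            return draft, True, cycle
    return draft, False, MAX_CYCLES
```

Capping the loop keeps cost bounded and ensures a persistently failing draft surfaces as a failure rather than looping indefinitely.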