Knowledge Ledger

The Project Knowledge Ledger is Aira's structured memory. It's called a ledger because it's auditable, append-only, and evidence-backed — like financial accounting applied to project intelligence.

Why a ledger

Traditional PM tools store flat text: "User wants faster onboarding." That's an opinion, not knowledge. You can't trace where it came from, verify it, or detect when it contradicts something else.

Aira's ledger stores atoms — atomic knowledge items with provenance, confidence, and evidence chains. Every atom can be traced back to the source text that produced it. Every synthesis (PRD, features, tasks) cites the atoms it was derived from. Nothing is asserted without evidence.

Core principles:

Raw inputs are transient intake material
Ledger atoms + evidence are long-lived truth
Derived artifacts (PRD, features, tasks) are materialized views over atoms, not independent truth sources
The working summary is always regenerated from ledger state — never carried forward

Pipeline stages

flowchart TD
    SRC["Raw Sources"] --> S0["Stage 0: Hygiene<br/>deterministic — no LLM"]
    S0 --> S1["Stage 1: Map<br/>atom extraction per chunk"]
    S1 --> S1B["Stage 1b: Reflection<br/>validation + confidence scoring"]
    S1B --> S2["Stage 2: Normalize + Dedupe<br/>fingerprint · merge · contradictions"]
    S2 --> S3["Stage 3: Reduce<br/>PRD · features · tasks<br/>(all cite atom IDs)"]

Stage 0: Hygiene (deterministic gate)

No LLM call happens until this stage passes. All of its operations are deterministic:

Normalize — Encoding, newlines, stable line numbering. Compute raw_normalized_hash.
Redact — Replace secrets with placeholders. Preserve line count. Compute sanitized_hash.
Annotate — Detect instruction-like or prompt-injection spans. Store as annotations without mutating chunk text.
Filter — Skip binaries, minified files, generated noise via deterministic rules.
Chunk — Structure-first chunking (by headings, functions, classes), fallback to overlap windows. Persist chunk metadata.
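The Normalize and Redact steps can be sketched as follows. This is a minimal illustration, not Aira's implementation: the secret pattern, the placeholder text, and the function names are assumptions, and only the two hash names come from the list above.

```python
import hashlib
import re

# ASSUMPTION: a toy secret pattern. The real redaction rules are not given here.
SECRET_RE = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*\S+")


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize(raw: bytes) -> str:
    """Decode and unify newlines so line numbers stay stable."""
    text = raw.decode("utf-8", errors="replace")
    return text.replace("\r\n", "\n").replace("\r", "\n")


def redact(text: str) -> str:
    """Replace secret values line by line, so the line count is preserved."""
    return "\n".join(
        SECRET_RE.sub(lambda m: f"{m.group(1)}=[REDACTED]", line)
        for line in text.split("\n")
    )


def hygiene(raw: bytes) -> dict:
    normalized = normalize(raw)
    sanitized = redact(normalized)
    if normalized.count("\n") != sanitized.count("\n"):
        raise ValueError("redaction changed the line count")
    return {
        "raw_normalized_hash": sha256_hex(normalized),
        "sanitized_hash": sha256_hex(sanitized),
        "sanitized_text": sanitized,
    }
```

Redacting per line keeps evidence anchors valid: a line-based anchor computed on the sanitized text points at the same line in the normalized text.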

Stage 1: Map (per-chunk extraction)

Each chunk is processed by the LLM to extract atoms. Input: (text_payload, annotations[], compact_project_state). Output: a list of atoms, each of one of the 8 kinds listed under Atom structure.

Stage 1b: Reflection / validation

A validation pass after Map:

Reject atoms with missing or weak evidence
Enforce required fields per kind
Downgrade confidence on indirect evidence
Flag contradiction candidates for expansion

Deterministic rules include: anchor validity checks, snippet quality floor (minimum 20 chars), kind-specific evidence floors, indirect evidence penalty. Only borderline cases go to a higher-tier model for adjudication.
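A rough sketch of how three of these deterministic rules might combine (anchor validity, the 20-character snippet floor, and the indirect evidence penalty). The Evidence class, the direct flag, and the penalty value of 0.2 are assumptions made for illustration; only the 20-character floor comes from the text.

```python
from dataclasses import dataclass

MIN_SNIPPET_CHARS = 20   # snippet quality floor stated above
INDIRECT_PENALTY = 0.2   # ASSUMPTION: the real penalty is not documented here


@dataclass
class Evidence:
    anchor_start: int
    anchor_end: int
    snippet: str
    direct: bool = True  # ASSUMPTION: hypothetical flag for direct vs. indirect evidence


def validate_atom(confidence, evidence, chunk_line_count):
    """Return (verdict, adjusted_confidence); verdict is 'accept' or 'reject'."""
    usable = [
        e for e in evidence
        # Anchor validity: a well-ordered line range inside the chunk.
        if 1 <= e.anchor_start <= e.anchor_end <= chunk_line_count
        # Snippet quality floor.
        and len(e.snippet.strip()) >= MIN_SNIPPET_CHARS
    ]
    if not usable:
        return "reject", 0.0
    if not any(e.direct for e in usable):
        # Only indirect evidence survived, so downgrade confidence.
        confidence = max(0.0, confidence - INDIRECT_PENALTY)
    return "accept", round(confidence, 4)
```

Borderline outcomes, which the text routes to a higher-tier model, are left out of this sketch.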

Stage 2: Normalize + Dedupe + Contradictions

Canonicalize atom text and compute stable fingerprints
Cluster similar atoms and merge duplicates
Detect contradictions between atoms
Create immutable records: merge_ops, ledger_contradictions

Stage 2 does not create ledger_links. The atom↔atom semantic graph is derived from signals the pipeline already produces — see Derived link graph below.

Stage 3: Reduce (synthesis)

Synthesize from atoms only
Output must cite atom IDs
A synthesis validator rejects uncited claims
The working summary is regenerated from current ledger state — the previous summary is disposable
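The citation rule lends itself to a simple mechanical check. The sketch below assumes a hypothetical inline citation syntax, [atom:<id>], which this document does not specify, and treats each sentence as one claim.

```python
import re

# ASSUMPTION: hypothetical citation syntax; the real format is not documented here.
CITATION_RE = re.compile(r"\[atom:([A-Za-z0-9-]+)\]")


def validate_synthesis(sentences, known_atom_ids):
    """Return sentences to reject: uncited, or citing atoms not in the ledger."""
    rejected = []
    for sentence in sentences:
        cited = set(CITATION_RE.findall(sentence))
        if not cited or not cited <= set(known_atom_ids):
            rejected.append(sentence)
    return rejected
```

For example, given known atoms a1 and b2, a sentence with no citation and a sentence citing [atom:zz] would both be rejected, while one citing [atom:a1] passes.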

Atom structure

An atom has 8 possible kinds:

Kind	Required fields	Evidence floor
claim	title, body, confidence, fingerprint	1 evidence row
decision	title, body, confidence, fingerprint	2 evidence rows (unless draft)
requirement	title, body, confidence, fingerprint	2 evidence rows (unless draft)
risk	title, body, severity, impact, confidence, fingerprint	1 direct evidence row
unknown	title, body, confidence, fingerprint	1 evidence row
entity	canonical entity linkage, fingerprint	1 mention/evidence linkage
action_item	title, body, confidence, fingerprint	1 evidence row
domain_signal	title, body, confidence, fingerprint	1 evidence row
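The evidence floors in the table can be expressed as a lookup plus a small check. This is a sketch under one stated assumption: the table says "unless draft" without giving the draft floor, so the code relaxes it to a single row. The function and constant names are illustrative.

```python
# Floors transcribed from the table above.
EVIDENCE_FLOOR = {
    "claim": 1, "decision": 2, "requirement": 2, "risk": 1,
    "unknown": 1, "entity": 1, "action_item": 1, "domain_signal": 1,
}
DRAFT_RELAXED_KINDS = {"decision", "requirement"}


def meets_evidence_floor(kind, total_rows, direct_rows=0, is_draft=False):
    if kind not in EVIDENCE_FLOOR:
        raise ValueError(f"unknown atom kind: {kind}")
    if kind == "risk":
        # The risk floor counts direct evidence only.
        return direct_rows >= EVIDENCE_FLOOR[kind]
    floor = EVIDENCE_FLOOR[kind]
    if is_draft and kind in DRAFT_RELAXED_KINDS:
        floor = 1  # ASSUMPTION: the draft floor is not specified in the table
    return total_rows >= floor
```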

Concrete example

{
  "id": "a1b2c3d4-...",
  "project_id": "p5e6f7g8-...",
  "kind": "risk",
  "title": "Single point of failure in payment gateway",
  "body": "The payment processing flow relies on a single Stripe webhook endpoint with no retry queue. If the endpoint is down, payments are silently dropped.",
  "status": "active",
  "confidence": 0.85,
  "severity": "high",
  "impact": "Revenue loss during outages; no alerting on failed payments",
  "polarity": "negative",
  "domain": "payments",
  "tags": ["infrastructure", "reliability", "stripe"],
  "atom_fingerprint": "e3b0c44298fc1c14...",
  "created_by_run_id": "run-abc123"
}

A matching evidence row for this atom:

{
  "atom_id": "a1b2c3d4-...",
  "source_id": "src-xyz789",
  "chunk_id": "chk-456def",
  "anchor_type": "line",
  "anchor_start": 142,
  "anchor_end": 158,
  "snippet": "stripe_webhook_handler processes events synchronously. No queue, no retry. If this endpoint 500s, we lose the event.",
  "chunk_hash": "sha256:...",
  "sanitized_hash": "sha256:...",
  "confidence": 0.9
}

The fingerprint

The atom_fingerprint is a SHA-256 hash of canonicalized atom fields:

Build a canonical object: kind, entity references, normalized body text, polarity/severity (where applicable), sorted tags
Canonicalize: lowercase, normalize whitespace, trim punctuation edges, stable JSON key ordering
Hash: SHA-256(canonical_json) hex digest

The fingerprint expresses deterministic identity, not semantic similarity. Canonicalization is pure, deterministic Python (_canonicalize_atom_for_fingerprint) with no LLM involved, so the same canonical form always yields the same fingerprint, and a matching fingerprint triggers an upsert instead of a duplicate insert.

Runtime fields (confidence, timestamps, run IDs) are excluded from the fingerprint. Two runs that extract the same knowledge produce the same fingerprint.
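The three steps can be illustrated with a short, self-contained sketch. It is not the project's _canonicalize_atom_for_fingerprint; the field names (entities, body, and so on) and the exact punctuation rules are assumptions chosen to mirror the description above.

```python
import hashlib
import json
import re
import string

# ASSUMPTION: illustrative field names. Runtime fields such as confidence,
# timestamps, and run IDs are deliberately absent from this tuple.
FINGERPRINT_FIELDS = ("kind", "entities", "body", "polarity", "severity", "tags")


def _canon_text(value: str) -> str:
    """Lowercase, collapse whitespace, and trim punctuation at the edges."""
    collapsed = re.sub(r"\s+", " ", value.lower()).strip()
    return collapsed.strip(string.punctuation + " ")


def atom_fingerprint(atom: dict) -> str:
    canonical = {}
    for name in FINGERPRINT_FIELDS:
        value = atom.get(name)
        if value is None:
            continue
        if isinstance(value, list):
            canonical[name] = sorted(_canon_text(v) for v in value)
        else:
            canonical[name] = _canon_text(value)
    # sort_keys plus fixed separators give a stable byte representation.
    canonical_json = json.dumps(
        canonical, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical_json.encode("utf-8")).hexdigest()
```

Two extractions that differ only in casing, whitespace, tag order, or confidence produce the same digest, while a change in severity or body text produces a different one.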

Evidence anchors

Every atom links to one or more evidence rows. Each evidence row contains:

source_id — Which source document
chunk_id — Which chunk within the source
anchor_start / anchor_end — Line range or byte range within the chunk
snippet — The exact quoted text
chunk_hash / sanitized_hash — Cryptographic verification that the evidence hasn't been modified

If a hash mismatch is detected (source was modified after extraction), the evidence is marked stale and targeted re-ingestion is queued.
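Staleness detection amounts to recomputing a chunk's hash and comparing it with the hash stored on the evidence row. The sketch below assumes the "sha256:" prefix shown in the example above and uses in-memory dicts in place of the real tables; the function names are hypothetical.

```python
import hashlib


def chunk_hash(text: str) -> str:
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale_chunks(evidence_rows, current_chunks):
    """Return sorted chunk IDs whose stored hash no longer matches the source.

    evidence_rows: dicts with 'chunk_id' and 'chunk_hash'.
    current_chunks: mapping of chunk_id to current chunk text.
    A chunk that has disappeared is treated as stale too (an assumption).
    """
    stale = set()
    for row in evidence_rows:
        text = current_chunks.get(row["chunk_id"])
        if text is None or chunk_hash(text) != row["chunk_hash"]:
            stale.add(row["chunk_id"])
    return sorted(stale)
```

The returned chunk IDs are what a targeted re-ingestion job would need to reprocess.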

Contradiction detection and resolution

When Stage 2 finds atoms that conflict, it creates a ledger_contradictions record:

Atom A: "The system uses JWT for authentication"
Atom B: "Session-based auth is the current approach"

Contradiction:
  severity: medium
  status: open
  atom_a_id: ...
  atom_b_id: ...

A contradiction's status starts at open and moves to either resolved or dismissed; there is no investigating status. Independently of status, each contradiction carries a state, open_blocking or open_non_blocking, that indicates whether it blocks downstream synthesis.

Resolution requires human input:

Choose which atom is correct
Provide a rationale
Optionally mark atoms as human-verified with a confidence floor

All resolution actions write immutable audit records.
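The status transitions and the audit requirement can be modelled as a small guarded state change. This is a sketch only: the class, the in-memory audit_log list, and the choice to require a rationale for dismissals as well are assumptions.

```python
from dataclasses import dataclass, field

# Transitions stated in this section: open -> resolved | dismissed.
ALLOWED_TRANSITIONS = {"open": {"resolved", "dismissed"}}


@dataclass
class Contradiction:
    atom_a_id: str
    atom_b_id: str
    status: str = "open"
    audit_log: list = field(default_factory=list)  # stand-in for immutable audit records


def close_contradiction(c, new_status, user, rationale, winner_atom_id=None):
    if new_status not in ALLOWED_TRANSITIONS.get(c.status, set()):
        raise ValueError(f"illegal transition {c.status} -> {new_status}")
    if not rationale.strip():
        raise ValueError("a rationale is required")  # ASSUMPTION: also for dismissals
    if new_status == "resolved" and winner_atom_id not in (c.atom_a_id, c.atom_b_id):
        raise ValueError("resolution must pick one of the two atoms")
    # Record the change before applying it; entries are only ever appended.
    c.audit_log.append({
        "before_status": c.status,
        "after_status": new_status,
        "user": user,
        "rationale": rationale,
        "winner_atom_id": winner_atom_id,
    })
    c.status = new_status
    return c
```

Once a contradiction leaves open, any further transition raises, which matches the terminal nature of resolved and dismissed.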

Derived artifacts as materialized views

PRD, features, and tasks are not independent truth. They are materialized views over ledger atoms:

Artifact	Depends on	Must reference
PRD	decision, requirement, risk, unknown, domain_signal atoms	Atom IDs by section
Features	requirement, decision, risk, domain_signal atoms, entity clusters	Source atom clusters, contradictions considered
Tasks	Selected features + risk + unknown atoms + team/capacity data	Upstream feature and atom IDs
Reports	Current derived artifacts + ledger deltas	Supporting atom IDs

When atoms are superseded or contradictions resolved, only affected artifact sections are recomputed.
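Selective recomputation follows from the atom references each section stores. Assuming those references can be read as (artifact, section, atom_id) tuples, which is an illustrative shape and not the actual derived_artifact_atom_refs schema, the affected sections are a simple filter:

```python
def affected_sections(atom_refs, changed_atom_ids):
    """Return the sorted (artifact, section) pairs that cite any changed atom."""
    changed = set(changed_atom_ids)
    return sorted(
        {(artifact, section) for artifact, section, atom_id in atom_refs
         if atom_id in changed}
    )
```

Sections that cite none of the changed atoms are absent from the result and keep their existing content.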

Artifact stability states

State	Meaning
draft	Generated with incomplete evidence coverage
preview_stable	Preview-lane quality bar met; suitable for onboarding progression
stable	Deep-lane validation complete; used for downstream planning

Regressions in supporting atoms can downgrade stable → preview_stable until recomputation closes gaps.

Preview lane vs deep lane

The two-lane UX strategy affects how the ledger serves the frontend:

Preview lane — Process high-value sources first (README, architecture docs, manifests). Fast atom extraction. Produce a PRD v0, initial features, and a sprint-task seed. Target: first atom preview in under 3 seconds, minimal viable plan in under 30 seconds.

Deep lane — Broader ingestion in the background. Incremental refinement of atoms and artifacts. Updates arrive every 5–10 seconds. Artifacts may upgrade from preview_stable to stable as deep-lane processing completes.

The frontend doesn't wait for deep-lane completion. It renders from preview-lane results and progressively updates as refinements arrive.

Contradiction assessments

When contradictions are detected in Stage 2, an automated assessment evaluates severity and suggests resolution. The ledger_contradiction_assessments table records each assessment with:

decision — The conflict state the assessment resolves to: open_blocking, open_non_blocking, or suppressed (written from decision.state). The keep_* / both_valid / needs_human verdicts belong to the separate dreaming engine, not this column.
decision_confidence — How confident the assessment is (0-1)
reason_codes — Machine-readable justification codes

Assessments help prioritize which contradictions need urgent human attention vs. which can wait.

Human-in-the-loop events

All HITL actions are tracked in the ledger_hitl_events table with full before/after state snapshots:

Event type	What happened
`atom_verified`	Human verified an atom, setting a confidence floor
`atom_rejected`	Human rejected an atom during HITL verification
`atom_needs_review`	Human flagged an atom as needing further review
`contradiction_resolved`	Human resolved a conflict between atoms
`contradiction_dismissed`	Human dismissed a contradiction
`merge_approved`	Human approved a candidate merge operation
`merge_overridden`	Human rejected an auto-merge and split atoms back

Each event captures before_state and after_state as JSON snapshots, the acting user, and the ingestion run context.

Structured claims

The ledger_claims table decomposes atoms into fine-grained subject-predicate assertions. Each claim has:

subject — The entity or concept being described
predicate — The assertion about the subject
strength — How strong the assertion is: hard or soft (default soft)
scope — The context in which the claim applies
claim_fingerprint — Deterministic identity for deduplication

Claims enable more precise contradiction detection and cross-referencing between atoms.

Entity graph

The ledger_entities table maintains canonical named entities (systems, services, teams, integrations) with aliases and external IDs. The ledger_entity_mentions table links entities to the atoms and chunks where they appear, with anchor positions for precise source tracing.

This entity graph powers:

Entity-based knowledge retrieval (find all atoms about a specific system)
Cross-source entity resolution (same entity mentioned differently in different sources)
Dependency mapping between project components

Derived link graph (`ledger_links`)

The atom↔atom semantic graph in ledger_links is derived from signals the ledger already computes — it is not authored by the extraction LLM and is not created during Stage 2. Two relations are produced today:

supersedes: derived from merge_ops. When a merge proposed by the dreaming engine or by normalization collapses a duplicate, the surviving canonical atom gets a supersedes edge to the losing atom, which is now superseded or obsolete (direction canonical → loser). Pending, unapplied proposals are skipped, so an edge always points at a real terminal atom.
relates_to: derived from atom embeddings. Each active atom's nearest neighbours (cosine similarity above a threshold, via the pgvector HNSW index) become undirected relates_to edges. Edges are created only between two active atoms, so a merge that collapses one endpoint cannot resurrect a stale edge.

Derivation is idempotent (upsert on the unique relation tuple) and runs incrementally as part of source analysis. The depends_on (task→atom), supports, and refines relations are planned follow-ons; contradicts is intentionally omitted because ledger_contradictions already models it.
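The two derivations can be sketched in plain Python. This illustration swaps the pgvector HNSW lookup for a brute-force pairwise comparison, and the merge_ops field names (canonical_id, loser_id, status) and the 0.8 threshold are assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def derive_links(merge_ops, embeddings, active_ids, threshold=0.8):
    """Return a set of (from_id, relation, to_id) tuples.

    Using a set keyed on the full tuple makes re-running the derivation
    idempotent, mirroring an upsert on the unique relation tuple.
    """
    links = set()
    for op in merge_ops:
        if op["status"] == "applied":  # pending proposals are skipped
            links.add((op["canonical_id"], "supersedes", op["loser_id"]))
    # relates_to only ever connects two active atoms.
    active = sorted(i for i in active_ids if i in embeddings)
    for a, b in combinations(active, 2):
        if cosine(embeddings[a], embeddings[b]) > threshold:
            links.add((a, "relates_to", b))  # undirected: stored once, smaller ID first
    return links
```

Brute force is quadratic in the number of active atoms, which is why a production system would rely on an approximate nearest-neighbour index instead.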

A one-time backfill materialises links for a project's existing atoms + merge_ops (POST /projects/{project_id}/knowledge/derive-links, admin-gated). It is run once by an operator post-deploy, not automatically on deploy. The read endpoint GET /knowledge/atoms/{atom_id}/related exposes the graph grouped by relation type; the "Related" UI panel that consumes it ships as a separate follow-on frontend change.

Knowledge regression testing

When prompt templates or pipeline heuristics change, regression tests verify that the updated pipeline produces equivalent or better results:

knowledge_regression_runs — Test runs comparing candidate vs. current prompt versions
knowledge_regression_cases — Individual test fixtures with pass/fail and metric comparisons

Passing regression runs can be promoted to update the active prompt version. This prevents quality regressions in the extraction pipeline.
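A promotion gate for "equivalent or better" might look like the sketch below. The per-case fields (passed, baseline_score, candidate_score) and the tolerance parameter are hypothetical; the actual columns of knowledge_regression_cases are not described here.

```python
def can_promote(cases, tolerance=0.0):
    """Allow promotion only if every case passed and no metric regressed.

    cases: dicts with 'passed', 'baseline_score', and 'candidate_score'.
    An empty run never promotes, since it proves nothing.
    """
    if not cases:
        return False
    return all(
        case["passed"]
        and case["candidate_score"] >= case["baseline_score"] - tolerance
        for case in cases
    )
```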

Database model

Key tables (19 listed below — some are shared with the Planning Pipeline domain; see Data Architecture for the full ERD):

source_chunks — Chunked source content with hashes and anchors
chunk_annotations — Prompt-injection and untrusted content annotations
ledger_atoms — The atoms themselves
ledger_evidence — Evidence rows linking atoms to source chunks
ledger_entities — Canonical named entities with aliases
ledger_entity_mentions — Entity references within atoms with anchor positions
ledger_links — Derived atom↔atom relationships (supersedes from merge_ops, relates_to from atom embeddings); depends_on / supports / refines are planned follow-ons — see Derived link graph
ledger_claims — Fine-grained subject-predicate assertions within atoms
merge_ops — Immutable merge/supersede/obsolete operations with review status
ledger_contradictions — Conflict records with resolution workflow
ledger_contradiction_assessments — Automated severity/resolution assessments
ledger_hitl_events — Human-in-the-loop action audit trail
knowledge_state — Per-project working_summary, open_questions, active_contradictions, evidence_index
ingestion_runs — Run tracking with checkpoints and cost attribution
derived_artifacts / derived_artifact_sections / derived_artifact_atom_refs — Materialized views over atoms
knowledge_regression_runs / knowledge_regression_cases — Pipeline quality regression tests