Back to Aira

Data Architecture

Aira's PostgreSQL schema contains 106 tables organized into seven logical domains. Each section below shows an entity-relationship diagram for that domain's key tables, with primary keys, foreign keys, and the fields that drive behaviour.

Convention used in diagrams

  • PK — primary key
  • FK — foreign key
  • ||--o{ — one-to-many (zero or more)
  • ||--|{ — one-to-many (one or more)
  • ||--|| — one-to-one
  • }o--|| — many-to-one

1. Core — Projects, Auth & Identity

Users, projects, access control, billing, authentication tokens, and identity providers. Every other domain hangs off projects.

erDiagram
    users {
        string id PK
        string email
        string name
        string password_hash
        string google_id
        string status
        bool   is_admin
        datetime created_at
    }

    projects {
        string id PK
        string name
        string description
        string vision
        string onboarding_mode
        jsonb  health_snapshot
        string created_by FK
        datetime created_at
    }

    project_members {
        string id PK
        string project_id FK
        string user_id FK
        string role
        datetime invited_at
        datetime accepted_at
    }

    project_settings {
        string id PK
        string project_id FK
        string timezone
        string llm_provider
        string llm_model
        bool   heartbeat_enabled
        string auto_rebalance_mode
        bool   pr_automation_enabled
    }

    project_billing {
        string id PK
        string project_id FK
        string plan
        datetime trial_started_at
        string stripe_customer_id
        bool   is_trial_active
    }

    api_keys {
        string id PK
        string user_id FK
        string project_id FK
        string key_hash
        string name
        jsonb  scopes
        datetime expires_at
    }

    webauthn_credentials {
        string id PK
        string user_id FK
        bytes  credential_id
        bytes  public_key
        int    sign_count
        string friendly_name
    }

    auth_refresh_tokens {
        string jti PK
        string user_id FK
        string token_hash
        string family_id
        datetime expires_at
        datetime revoked_at
    }

    auth_oidc_identities {
        string id PK
        string user_id FK
        string provider
        string issuer
        string subject
        string email
    }

    project_invitations {
        string id PK
        string project_id FK
        string email
        string role
        string invited_by FK
        string status
        datetime expires_at
    }

    mcp_oauth_clients {
        string client_id PK
        string client_secret_hash
        jsonb  redirect_uris
        string client_name
        string scope
        bool   is_active
    }

    users ||--o{ project_members : "joins"
    projects ||--o{ project_members : "has members"
    projects ||--|| project_settings : "has settings"
    projects ||--|| project_billing : "has billing"
    users ||--o{ api_keys : "owns"
    projects ||--o{ api_keys : "scoped to"
    users ||--o{ webauthn_credentials : "has passkeys"
    users ||--o{ auth_refresh_tokens : "has sessions"
    users ||--o{ auth_oidc_identities : "has SSO"
    projects ||--o{ project_invitations : "sends invites"

Additional tables in this domain

TablePurpose
auth_email_verificationsEmail verification tokens with send counts
auth_password_reset_otpsOTP-based password reset with attempt tracking
auth_magic_linksPasswordless login links with next-path routing
webauthn_challengesTemporary WebAuthn registration/authentication challenges
user_audit_logSecurity activity log (event type, IP, user agent)
user_preferencesPer-user per-project UI preferences
system_configGlobal key-value system configuration
mcp_oauth_authorization_codesOAuth 2.1 authorization codes for MCP clients
mcp_oauth_refresh_tokensMCP OAuth refresh tokens with scopes
project_integrationsConfigured integrations per project (provider, encrypted config)
project_issuesIssue tracking (title, status, priority, reporter)

2. Planning Pipeline

Sources feed insights, insights drive features, features break into tasks, and tasks land in sprints. Repository analysis, PR tracking, and sprint planning requests extend this backbone.

erDiagram
    projects {
        string id PK
    }

    sources {
        string id PK
        string project_id FK
        string type
        string title
        text   content
        bool   analyzed
        string content_hash
        string content_type
        string storage_uri
        datetime last_analyzed_at
    }

    insights {
        string id PK
        string project_id FK
        string source_id FK
        string type
        string title
        string impact
        float  confidence
        string status
        string linked_feature_id FK
        jsonb  tags
    }

    features {
        string id PK
        string project_id FK
        string title
        text   description
        text   rationale
        float  priority_score
        string status
        string effort_estimate
        string impact_estimate
        text   objective
        jsonb  definition_of_done
        datetime shipped_at
    }

    feature_insights {
        string feature_id FK
        string insight_id FK
    }

    tasks {
        string id PK
        string project_id FK
        string feature_id FK
        string sprint_id FK
        string assignee_id FK
        string title
        string type
        string status
        string priority
        string complexity
        float  estimated_hours
        int    progress
        string due_date
    }

    sprints {
        string id PK
        string project_id FK
        int    number
        string name
        text   goal
        string status
        float  committed_hours
        float  completed_hours
        float  velocity
        jsonb  story_ids
        datetime start_date
        datetime end_date
    }

    team_members {
        string id PK
        string project_id FK
        string name
        string role
        string member_type
        string seniority
        jsonb  skills
        float  capacity_hours_per_sprint
        float  current_load_hours
    }

    task_pull_requests {
        string id PK
        string task_id FK
        string project_id FK
        string provider
        string repo_full_name
        int    pr_number
        string pr_state
        string pr_url
    }

    repository_analysis_runs {
        string id PK
        string project_id FK
        string provider
        string status
        int    total_count
        int    completed_count
    }

    sprint_planning_requests {
        string id PK
        string project_id FK
        string request_type
        string status
        string sprint_id FK
    }

    projects ||--o{ sources : "has"
    projects ||--o{ insights : "has"
    projects ||--o{ features : "has"
    projects ||--o{ tasks : "has"
    projects ||--o{ sprints : "has"
    sources ||--o{ insights : "generates"
    insights }o--o{ features : "linked via feature_insights"
    features ||--o{ tasks : "breaks into"
    sprints ||--o{ tasks : "contains"
    team_members ||--o{ tasks : "assigned"
    tasks ||--o{ task_pull_requests : "linked to"
    projects ||--o{ repository_analysis_runs : "scans repos"

Additional tables in this domain

TablePurpose
source_chunksChunked source content with hashes, anchors, and token counts
chunk_annotationsPrompt-injection and untrusted content annotations per chunk
repository_analysis_itemsPer-repo status within a repo analysis run
repository_full_analysis_runsDeep repository analysis runs (file-level)
repository_full_analysis_itemsPer-repo items for full analysis with file/source counts

3. Knowledge Ledger (PKL)

The PKL pipeline (Stages 0-3) extracts structured atoms from raw sources, deduplicates them, tracks contradictions, and synthesizes derived artifacts. This is the AI brain's memory.

erDiagram
    source_chunks {
        string id PK
        string project_id FK
        string source_id FK
        string sanitized_hash
        string chunk_hash
        string anchor_type
        int    anchor_start
        int    anchor_end
        text   text_payload
        int    token_count
    }

    ingestion_runs {
        string id PK
        string project_id FK
        string status
        int    atoms_created
        int    atoms_merged
        int    contradictions_found
        jsonb  coverage_metrics
        float  total_cost_usd
    }

    ledger_atoms {
        string id PK
        string project_id FK
        string kind
        string title
        text   body
        string status
        float  confidence
        string verification_status
        string atom_fingerprint
        string domain
        jsonb  tags
    }

    ledger_entities {
        string id PK
        string project_id FK
        string canonical_name
        string type
        jsonb  aliases
        jsonb  external_ids
    }

    ledger_evidence {
        string id PK
        string atom_id FK
        string source_id FK
        string chunk_id FK
        text   snippet
        float  confidence
        bool   stale
    }

    ledger_links {
        string id PK
        string from_atom_id FK
        string to_atom_id FK
        string relation_type
        float  weight
    }

    ledger_claims {
        string id PK
        string atom_id FK
        string domain
        string subject
        string predicate
        string strength
        float  claim_confidence
        string claim_fingerprint
    }

    ledger_contradictions {
        string id PK
        string project_id FK
        string atom_a_id FK
        string atom_b_id FK
        string severity
        string status
        string state
        string pair_hash
        string resolution_atom_id FK
    }

    ledger_contradiction_assessments {
        string id PK
        string contradiction_id FK
        string decision
        float  decision_confidence
        jsonb  reason_codes
    }

    ledger_hitl_events {
        string id PK
        string project_id FK
        string event_type
        string target_type
        string target_id
        string action_by_user_id
        jsonb  before_state
        jsonb  after_state
    }

    knowledge_state {
        string id PK
        string project_id FK
        text   working_summary
        jsonb  open_questions
        jsonb  evidence_index
        string updated_by_run_id FK
    }

    derived_artifacts {
        string id PK
        string project_id FK
        string artifact_type
        string artifact_state
        string title
        text   body
        string parent_artifact_id FK
    }

    derived_artifact_sections {
        string id PK
        string artifact_id FK
        string section_key
        string section_title
        int    position
        text   content
        string section_fingerprint
    }

    derived_artifact_atom_refs {
        string id PK
        string artifact_id FK
        string section_id FK
        string atom_id FK
        string relation_type
    }

    insight_atom_refs {
        string id PK
        string insight_id FK
        string atom_id FK
        string relation_type
    }

    sources ||--o{ source_chunks : "chunked into"
    ingestion_runs ||--o{ ledger_atoms : "creates"
    source_chunks ||--o{ ledger_evidence : "evidences"
    ledger_atoms ||--o{ ledger_evidence : "supported by"
    ledger_atoms ||--o{ ledger_links : "from"
    ledger_atoms ||--o{ ledger_links : "to"
    ledger_atoms ||--o{ ledger_claims : "asserts"
    ledger_atoms ||--o{ ledger_contradictions : "conflicts A"
    ledger_atoms ||--o{ ledger_contradictions : "conflicts B"
    ledger_contradictions ||--o{ ledger_contradiction_assessments : "assessed by"
    derived_artifacts ||--o{ derived_artifact_sections : "has sections"
    derived_artifact_sections ||--o{ derived_artifact_atom_refs : "cites"
    ledger_atoms ||--o{ derived_artifact_atom_refs : "cited by"
    ledger_atoms ||--o{ insight_atom_refs : "materializes"
    ingestion_runs ||--|| knowledge_state : "updates"

Additional tables in this domain

TablePurpose
chunk_annotationsAnnotations per chunk (prompt-injection, untrusted content flags)
ledger_entity_mentionsEntity references within atoms, linking to chunks with anchor positions
merge_opsImmutable merge/supersede/obsolete records with review status
derived_artifact_contradiction_refsContradictions relevant to specific artifact sections
knowledge_regression_runsRegression test runs for knowledge pipeline prompt changes
knowledge_regression_casesIndividual test fixtures within a regression run

4. Team Ledger

Tracks team events, PR activity, daily metrics per member, chat testimony, escalation workflows, collaboration networks, and fairness policies. The Team Operations Ledger (TOL) feeds the Assigner agent.

erDiagram
    teams {
        string id PK
        string project_id FK
        string name
        jsonb  default_working_hours_policy
    }

    team_members {
        string id PK
        string project_id FK
        string team_id FK
        string name
        string role
        string member_type
        string seniority
        string agent_instance_id FK
        float  capacity_hours_per_sprint
        float  current_load_hours
    }

    task_events {
        string id PK
        string project_id FK
        string task_id
        string member_id FK
        string event_type
        datetime event_ts_utc
        string idempotency_key
    }

    pr_events {
        string id PK
        string project_id FK
        string repo_id
        string pr_id
        string member_id FK
        string event_type
        datetime event_ts_utc
    }

    member_metrics_daily {
        string project_id PK
        string member_id PK
        date   date_utc PK
        int    tasks_completed
        float  cycle_time_p50
        float  cycle_time_p95
        float  rework_rate
        float  wip_avg
    }

    team_chat_messages {
        string id PK
        string project_id FK
        string team_id FK
        string author_member_id FK
        string source_type
        string risk_level
        text   minimized_text
        datetime message_ts_utc
    }

    team_chat_message_annotations {
        string id PK
        string message_id FK
        string annotation_type
        string label
        string severity
    }

    escalation_cases {
        string id PK
        string project_id FK
        string source_message_id FK
        string category
        string severity
        string status
        text   summary
    }

    team_affinity_edges {
        string id PK
        string project_id FK
        string member_a_id FK
        string member_b_id FK
        float  edge_weight
        float  confidence
        jsonb  signals
    }

    fairness_policy_versions {
        string policy_version_id PK
        string project_id FK
        jsonb  rules_json
        string status
        bool   is_active
        float  rollout_percentage
    }

    teams ||--o{ team_members : "contains"
    team_members ||--o{ task_events : "generates"
    team_members ||--o{ pr_events : "generates"
    team_members ||--o{ member_metrics_daily : "tracked in"
    teams ||--o{ team_chat_messages : "receives"
    team_members ||--o{ team_chat_messages : "authors"
    team_chat_messages ||--o{ team_chat_message_annotations : "annotated"
    team_chat_messages ||--o{ escalation_cases : "triggers"
    team_members ||--o{ team_affinity_edges : "member A"
    team_members ||--o{ team_affinity_edges : "member B"

Additional tables in this domain

TablePurpose
escalation_case_eventsAudit trail for escalation lifecycle (open, assign, resolve)
collaboration_outcomesMeasured results of member pairings (cycle time delta, rework delta)
team_ledger_ingestion_offsetsCursor positions for incremental team signal ingestion
team_ledger_review_eventsHuman review decisions on team ledger items

5. Rebalancing

Multi-stage workload rebalancing with candidate generation, shadow simulation, challenge consensus, and human approval workflows. Implements the Decision Constitution (safety > privacy > corroboration > fairness > delivery).

erDiagram
    rebalance_proposals {
        string id PK
        string project_id FK
        string sprint_id FK
        text   summary
        jsonb  proposed_actions
        string constraint_policy_version
        string proposal_hash
        bool   approval_required
        string status
        datetime approved_at
        datetime applied_at
    }

    rebalance_decisions {
        string id PK
        string project_id FK
        string proposal_id FK
        string model_provider
        string fairness_policy_version_id
        string input_window_hash
        bool   abstained
    }

    rebalance_candidate_sets {
        string id PK
        string proposal_id FK
        string decision_id FK
        int    candidate_count
        string selection_method
    }

    rebalance_candidate_plans {
        string id PK
        string proposal_id FK
        string candidate_set_id FK
        string label
        string strategy
        float  total_hours_redistributed
        bool   approval_required
    }

    rebalance_shadow_simulations {
        string id PK
        string proposal_id FK
        string candidate_plan_id FK
        float  predicted_delivery_delta
        float  risk_score
        jsonb  mitigations
    }

    rebalance_challenge_consensus {
        string id PK
        string proposal_id FK
        string decision_id FK
        bool   triggered
        bool   strong_negative_consensus
        bool   confidence_downgrade
        jsonb  findings
    }

    rebalance_negotiation_records {
        string id PK
        string proposal_id FK
        string candidate_set_id FK
        string selected_candidate_plan_id FK
        string selection_method
        bool   challenge_required
    }

    rebalance_proposal_approvals {
        string id PK
        string proposal_id FK
        string decision
        text   reason
        string decided_by_user_id
    }

    rebalance_proposals ||--o{ rebalance_decisions : "evaluated by"
    rebalance_decisions ||--o{ rebalance_candidate_sets : "generates"
    rebalance_candidate_sets ||--o{ rebalance_candidate_plans : "contains"
    rebalance_candidate_plans ||--o{ rebalance_shadow_simulations : "simulated"
    rebalance_decisions ||--o{ rebalance_challenge_consensus : "challenged"
    rebalance_candidate_sets ||--o{ rebalance_negotiation_records : "negotiated"
    rebalance_proposals ||--o{ rebalance_proposal_approvals : "approved by"
    sprints ||--o{ rebalance_proposals : "rebalanced by"

Rebalancing pipeline

The 8 tables form a multi-stage pipeline: proposal generation -> decision (LLM-backed) -> candidate set enumeration -> candidate plans with strategies -> shadow simulation of delivery impact -> challenge consensus (adversarial review) -> negotiation/selection -> human approval.


6. AI Agents & Observability

AI agent instances, message queues, execution logs, LLM cost tracking, quality assessment, the append-only audit event log, and the onboarding/orchestration pipeline.

erDiagram
    agent_instances {
        string id PK
        string project_id FK
        string team_member_id FK
        string agent_type
        string display_name
        string model_tier
        string status
        string llm_provider
        string llm_model
        float  total_cost_cents
        int    total_runtime_seconds
    }

    agent_messages {
        string id PK
        string agent_instance_id FK
        string project_id FK
        string direction
        string message_type
        text   content
        string status
        datetime delivered_at
    }

    agent_execution_log {
        string id PK
        string agent_instance_id FK
        string project_id FK
        string action_type
        string task_id FK
        int    input_tokens
        int    output_tokens
        float  cost_cents
        int    duration_ms
    }

    llm_calls {
        string id PK
        string project_id FK
        string agent
        string provider
        string model
        int    input_tokens
        int    output_tokens
        float  cost_usd
        int    latency_ms
    }

    budgets {
        string id PK
        string project_id FK
        float  monthly_limit_usd
        float  daily_limit_usd
        float  alert_threshold
        bool   hard_limit
    }

    events {
        string id PK
        string project_id FK
        int    sequence
        string entity_type
        string entity_id
        string action
        jsonb  changes
        jsonb  snapshot
        string actor_type
        string actor_id
    }

    trace_spans {
        string id PK
        string trace_id
        string agent
        string graph
        int    latency_ms
        float  cost_usd
        string status
        jsonb  metadata
    }

    quality_runs {
        string id PK
        string project_id FK
        string target_type
        string target_id
        string mode
        string status
        string tier_achieved
        int    iterations
    }

    onboarding_pipeline_runs {
        string id PK
        string project_id FK
        string status
        string current_stage
        jsonb  stages
        jsonb  artifacts
        int    version
    }

    orchestration_jobs {
        string id PK
        string project_id FK
        string job_type
        string run_id
        jsonb  payload
        string status
        int    priority
        int    attempt_count
        string lease_owner
    }

    agent_instances ||--o{ agent_messages : "queues"
    agent_instances ||--o{ agent_execution_log : "logs"
    agent_execution_log }o--|| tasks : "against"
    projects ||--o{ llm_calls : "tracks"
    projects ||--|| budgets : "has budget"
    projects ||--o{ events : "audit log"
    projects ||--o{ onboarding_pipeline_runs : "onboards via"
    projects ||--o{ orchestration_jobs : "schedules"

Additional tables in this domain

TablePurpose
billing_usageMonthly usage rollups (AI calls, tokens, team members, storage)
quality_assessmentsCompleted quality assessments with dimension scores and tier
quality_run_eventsStreaming events for quality assessment progress
activity_logHuman-readable activity entries (category, level, agent, description)
prompt_versionsPrompt content versioning with change tracking
error_logError tracking (service, severity, message, stack trace, context)
onboarding_pipeline_eventsStreaming events for onboarding pipeline progress
manual_input_atomization_jobsJobs to atomize manually created insights/features
heartbeat_itemsCustom autonomous monitoring items with frequency
heartbeat_logHeartbeat execution logs (functions run, actions generated, tokens)
pulse_actionsAI-generated action items with priority and dismissal tracking
stakeholder_reportsGenerated stakeholder reports (question, answer, health, metrics)

7. Integrations & Communication

Telegram bindings, chat sessions, notifications, PR processing, and webhook receipts.

erDiagram
    telegram_user_bindings {
        string id PK
        string telegram_user_id
        string telegram_chat_id
        string telegram_username
        string user_id FK
        string project_id FK
        string active_chat_session_id FK
    }

    telegram_connect_states {
        string id PK
        string state_token
        string project_id FK
        string user_id FK
        datetime expires_at
        datetime consumed_at
    }

    chat_sessions {
        string id PK
        string project_id FK
        string user_id FK
        string title
        bool   is_archived
    }

    chat_messages {
        string id PK
        string project_id FK
        string session_id FK
        string role
        text   content
        datetime created_at
    }

    notifications {
        string id PK
        string project_id FK
        string user_id FK
        string type
        string title
        text   body
        jsonb  action_payload
        bool   read
    }

    pr_processing_runs {
        string id PK
        string project_id FK
        string provider
        string repo_full_name
        int    pr_number
        string action
        string status
        string delivery_key
    }

    webhook_receipts {
        string id PK
        string provider
        string project_id FK
        string delivery_key
        string payload_hash
    }

    project_knowledge_deltas {
        string id PK
        string project_id FK
        string source_type
        string source_ref
        string delta_kind
        text   summary
    }

    chat_sessions ||--o{ chat_messages : "contains"
    telegram_user_bindings }o--|| chat_sessions : "active session"
    projects ||--o{ telegram_user_bindings : "binds users"
    projects ||--o{ chat_sessions : "has sessions"
    projects ||--o{ notifications : "delivers"
    projects ||--o{ pr_processing_runs : "processes PRs"
    projects ||--o{ webhook_receipts : "deduplicates"

Table count by domain

DomainTablesKey concern
Core — Projects, Auth & Identity22Identity, access control, MCP OAuth
Planning Pipeline16Source-to-sprint lineage
Knowledge Ledger (PKL)18Evidence-backed atoms, derived artifacts
Team Ledger14Operations, dynamics, fairness
Rebalancing8Multi-stage approval workflow
AI Agents & Observability20Agent lifecycle, cost tracking, quality
Integrations & Communication8Telegram, chat, webhooks
Total106
Documentation