The Failure Mode

Long-running AI agent missions keep running into the same wall. It is not a capability failure; it is an infrastructure failure.

Example 1 — Database migration

An agent is told: "Migrate our production database to Postgres 16." Round one: it reads the schema and plans the steps. Round two: it has lost the thread. It re-reads the schema. It re-discovers that staging already has the migration script. Rounds three through ten: the agent is effectively starting over every session.

Result: a two-day migration becomes a week. The agent re-proposes solutions it already ruled out.

Example 2 — Code refactor

An agent is tasked with refactoring a 50k-line codebase over three weeks. Day 1: it builds a clear plan. Day 5: a new session starts with no record of what is done, what was decided, or what was deferred. It re-reads the same files. It re-makes decisions it already made. It misses the context that would tell it "this approach was already rejected in PR #14."

The failure mode is the same: high-frequency forgetfulness. The model is brilliant inside a moment. But across moments, there's no persistence layer. No state. No continuity. The mission doesn't compound.


The Five Primitives

AetherForge's response is five intentional primitives. Not a framework — a minimal set of concepts that, together, provide everything an agent needs to maintain mission continuity.

2.1 mission

The mission is the top-level unit. A persistent object with an identity that survives sessions. It has: an objective, a set of constraints, a status (planning/active/blocked/completed/failed), and timestamps. Everything else in the system is attached to it.

What it replaces: ad hoc session initialization prompts. Instead of "here's the context for this session," the agent asks: "what's the current state of Mission #47?"
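As a rough illustration, the mission object could be modelled in memory as below. This is a minimal sketch, not AetherForge's published API: the names Mission, MissionStatus, transition, and describe are hypothetical, and representing constraints as a list of strings is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from uuid import UUID, uuid4


class MissionStatus(Enum):
    # The five statuses named in the text.
    PLANNING = "planning"
    ACTIVE = "active"
    BLOCKED = "blocked"
    COMPLETED = "completed"
    FAILED = "failed"


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Mission:
    objective: str
    constraints: list[str] = field(default_factory=list)  # assumed shape
    status: MissionStatus = MissionStatus.PLANNING
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)

    def transition(self, new_status: MissionStatus) -> None:
        """Change status and bump updated_at; the identity never changes."""
        self.status = new_status
        self.updated_at = _utc_now()


def describe(mission: Mission) -> str:
    """What a new session could read instead of a hand-written context prompt."""
    return f"[{mission.status.value}] {mission.objective}"
```

The point of the sketch is that the id outlives any single session: a later session looks the mission up by id and reads its state rather than being told it again.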

2.2 reality_fact

A timestamped, source-attributed observation. Not a belief, not an assumption — a record of something observed, with provenance.

id              UUID PRIMARY KEY
mission_id      UUID NOT NULL REFERENCES mission(id)
content         TEXT NOT NULL        -- "staging DB has 2.3GB of data"
source          TEXT NOT NULL        -- "pg_dump stdout", "user email", "schema query"
confidence      DECIMAL(3,2)         -- 0.00–1.00
observed_at     TIMESTAMPTZ NOT NULL
created_at      TIMESTAMPTZ DEFAULT now()

Reality facts are immutable after creation. They represent what was observed, not what is currently true. This is critical for audit trails — if the agent made a decision based on a fact that later turned out to be wrong, you can reconstruct the decision context exactly as it was.
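One way to picture the immutability rule is a frozen record plus an "as of" query. The sketch below is illustrative only; RealityFact mirrors the columns above, while facts_as_of is a hypothetical helper for reconstructing what the agent knew at a given moment.

```python
from dataclasses import dataclass, field
from datetime import datetime
from uuid import UUID, uuid4


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class RealityFact:
    mission_id: UUID
    content: str
    source: str
    confidence: float
    observed_at: datetime
    id: UUID = field(default_factory=uuid4)

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.00 and 1.00")


def facts_as_of(facts: list[RealityFact], moment: datetime) -> list[RealityFact]:
    """Hypothetical helper: facts observed at or before `moment`, oldest first."""
    known = [f for f in facts if f.observed_at <= moment]
    return sorted(known, key=lambda f: f.observed_at)
```

Under this model a correction is never an update. A newer observation is written as a new fact, and the older one stays in place for the audit trail.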

What it replaces: vector store entries. Retrievals from a vector store are opaque; reality facts are auditable, source-attributed, and temporally ordered.

2.3 decision

A record of what the agent chose and why.

id                  UUID PRIMARY KEY
mission_id          UUID NOT NULL REFERENCES mission(id)
context_slice_id    UUID REFERENCES context_slice(id)
action              TEXT NOT NULL        -- "Use pg_upgrade instead of pg_dump"
rationale           TEXT NOT NULL        -- "pg_upgrade is in-place, requires less downtime"
expected_outcome    TEXT NOT NULL        -- "Migration completes in ~20 min inside the maintenance window"
outcome             TEXT                 -- filled in after validation
decided_at          TIMESTAMPTZ NOT NULL
created_at          TIMESTAMPTZ DEFAULT now()

Decisions are not execution logs — they're the agent's reasoning made durable. When expected_outcome doesn't match what actually happened, that delta is the most valuable signal in the system.
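To make "reasoning made durable" concrete, here is a small sketch of how a later session might consult stored decisions before re-deciding. The helper names already_decided and awaiting_outcome are hypothetical, and the keyword match is a stand-in for whatever lookup the real system uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Decision:
    action: str
    rationale: str
    expected_outcome: str
    decided_at: datetime
    outcome: Optional[str] = None  # filled in after validation


def already_decided(decisions: list[Decision], keyword: str) -> list[Decision]:
    """Earlier decisions whose action mentions `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [d for d in decisions if needle in d.action.lower()]


def awaiting_outcome(decisions: list[Decision]) -> list[Decision]:
    """Decisions that have not been validated yet."""
    return [d for d in decisions if d.outcome is None]
```

A session that starts on day 5 can call already_decided before proposing an approach, which is the check the refactoring example in the first section was missing.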

What it replaces: scratchpads, internal monologue, "let me think through this" chains that evaporate when the session ends.

2.4 context_slice

A relevance-scored, temporally weighted assembly of facts and decisions relevant to a specific task. The agent's working memory for a given moment — assembled fresh from the mission's accumulated history.

id              UUID PRIMARY KEY
mission_id      UUID NOT NULL REFERENCES mission(id)
task_label      TEXT NOT NULL        -- "Run the migration script"
relevance_score DECIMAL(5,2)         -- aggregate relevance of included facts
time_weight     DECIMAL(3,2)         -- decay factor based on fact recency
drift_flag      BOOLEAN              -- true if new facts contradict older ones
slice_json      JSONB NOT NULL       -- { facts: [...], decisions: [...], scores: [...] }
assembled_at    TIMESTAMPTZ NOT NULL
created_at      TIMESTAMPTZ DEFAULT now()

Drift detection: if a new reality_fact contradicts an earlier one, context_slice gets drift_flag = true. The agent knows its model of reality has shifted.
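The text does not specify how relevance, recency, or contradiction are computed, so the following is only a sketch under stated assumptions: recency uses exponential decay with a 24-hour half-life, relevance is scored upstream, and two facts "contradict" when they share a hypothetical subject key but carry different values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Fact:
    subject: str      # hypothetical key, e.g. "prod.pg_version"
    value: str
    relevance: float  # 0.0 to 1.0, assumed to be scored upstream
    observed_at: datetime


def time_weight(observed_at: datetime, now: datetime,
                half_life: timedelta = timedelta(hours=24)) -> float:
    """Exponential decay: a fact exactly one half-life old gets weight 0.5."""
    age = max(now - observed_at, timedelta(0))
    return 0.5 ** (age / half_life)


def assemble_slice(facts: list[Fact], now: datetime, limit: int = 5) -> dict:
    """Rank facts by relevance * recency and flag drift on conflicting values."""
    ranked = sorted(
        facts,
        key=lambda f: f.relevance * time_weight(f.observed_at, now),
        reverse=True,
    )
    values_by_subject: dict[str, set[str]] = {}
    for f in facts:
        values_by_subject.setdefault(f.subject, set()).add(f.value)
    drift = any(len(values) > 1 for values in values_by_subject.values())
    return {"facts": ranked[:limit], "drift_flag": drift}
```

Because facts are never edited, both sides of a contradiction remain available, and the slice can surface the conflict instead of silently keeping one version.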

What it replaces: manual context window management, chat history dumps, "let me summarize what we know" prompts.

2.5 validation_event

The system closing the loop — outcome verification that connects decisions back to reality.

id              UUID PRIMARY KEY
mission_id      UUID NOT NULL REFERENCES mission(id)
decision_id     UUID REFERENCES decision(id)
check_label     TEXT NOT NULL        -- "pg_upgrade completed without error"
check_result    TEXT NOT NULL        -- "exit code 0, 2.3GB migrated in 18 min"
status          TEXT NOT NULL        -- "pass" | "fail" | "inconclusive"
validated_at    TIMESTAMPTZ NOT NULL
created_at      TIMESTAMPTZ DEFAULT now()

The delta between expected_outcome (decision) and check_result (validation_event) is the feedback signal that drives the loop.
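A minimal sketch of closing the loop, assuming the three status values listed above. The function name close_loop and the shape of the returned feedback record are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

VALID_STATUSES = {"pass", "fail", "inconclusive"}


@dataclass
class Decision:
    action: str
    expected_outcome: str
    outcome: Optional[str] = None  # filled in after validation


@dataclass(frozen=True)
class ValidationEvent:
    check_label: str
    check_result: str
    status: str


def close_loop(decision: Decision, event: ValidationEvent) -> dict:
    """Write the observed result back onto the decision and return the feedback."""
    if event.status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {event.status!r}")
    decision.outcome = event.check_result
    return {
        "action": decision.action,
        "expected": decision.expected_outcome,
        "observed": event.check_result,
        "needs_review": event.status != "pass",  # fail and inconclusive both need a look
    }
```

Keeping expected and observed side by side in one record is what lets a later session see the delta without replaying the whole run.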

What it replaces: nothing in today's agent stacks. Validation is the missing piece.


The Schema — Six Tables

In addition to the five primitives, there is one supporting table:

capability

id              UUID PRIMARY KEY
mission_id      UUID REFERENCES mission(id)  -- null means general capability
skill_name      TEXT NOT NULL               -- "postgres_upgrade", "rollback_procedure"
proficiency     DECIMAL(3,2)                -- learned confidence, 0–1
learned_from    UUID REFERENCES decision(id)
times_used      INTEGER DEFAULT 0
last_used_at    TIMESTAMPTZ
created_at      TIMESTAMPTZ DEFAULT now()

When a validation_event confirms a successful outcome and the agent did something novel to achieve it, that skill is recorded here. Future missions with similar needs can be routed to the relevant capability, so the system is meant to get faster the more it runs.
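The recording rule can be sketched as follows. How proficiency is actually learned is not described in the text, so this sketch simply takes it as an input; on_validation and the in-memory registry are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Capability:
    skill_name: str
    proficiency: float          # 0 to 1; supplied by the caller in this sketch
    learned_from: str           # id of the decision that first produced the skill
    times_used: int = 0
    last_used_at: Optional[datetime] = None


def on_validation(registry: dict[str, Capability], skill_name: str,
                  decision_id: str, status: str, proficiency: float,
                  now: datetime) -> Optional[Capability]:
    """Record a novel skill, or count a reuse, but only when validation passed."""
    if status != "pass":
        return None
    cap = registry.get(skill_name)
    if cap is None:
        # Novel skill: remember which decision taught it.
        cap = Capability(skill_name, proficiency, learned_from=decision_id)
        registry[skill_name] = cap
    else:
        # Known skill: count the reuse and keep the original provenance.
        cap.times_used += 1
        cap.last_used_at = now
    return cap
```

Failed or inconclusive validations leave the registry untouched, so only verified outcomes propagate forward.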

Rationale per table

mission: The primary unit. Without this, everything else floats with no anchor.

reality_fact: Source-attributed, immutable, temporally ordered. Replaces vector stores with something auditable.

decision: Captures not just what the agent chose but why. Enables retrospective analysis.

context_slice: Assembles working memory per task, relevance-scored, drift-detected. Prevents context window flooding.

validation_event: Closes the loop. Every decision needs to know if it worked.

capability: Compositional memory. What the system learned propagates forward.
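The "anchor" role of the mission table can be demonstrated with foreign keys. The sketch below uses an in-memory SQLite database from Python's standard library as a stand-in for Postgres, so TEXT replaces UUID and TIMESTAMPTZ, and only two of the six tables are included. The helper names open_store and add_fact are hypothetical.

```python
import sqlite3
import uuid

# SQLite stand-in for the Postgres schema: TEXT replaces UUID and TIMESTAMPTZ.
SCHEMA = """
CREATE TABLE mission (
    id        TEXT PRIMARY KEY,
    objective TEXT NOT NULL,
    status    TEXT NOT NULL
);
CREATE TABLE reality_fact (
    id          TEXT PRIMARY KEY,
    mission_id  TEXT NOT NULL REFERENCES mission(id),
    content     TEXT NOT NULL,
    source      TEXT NOT NULL,
    observed_at TEXT NOT NULL
);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def add_fact(conn: sqlite3.Connection, mission_id: str, content: str,
             source: str, observed_at: str) -> str:
    """Insert a fact; raises sqlite3.IntegrityError if the mission does not exist."""
    fact_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO reality_fact (id, mission_id, content, source, observed_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (fact_id, mission_id, content, source, observed_at),
    )
    return fact_id
```

With the constraint enforced, a fact cannot be written against a mission that does not exist, which is the database-level version of "everything else is attached to it."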


Worked Example — "Migrate Production Database to Postgres 16"

Iteration 1 — Planning

Mission created: { "id": "m47", "objective": "Migrate prod DB to Postgres 16, minimize downtime, preserve all data", "status": "active" }

Agent queries prod DB schema. Rows written: reality_fact ("prod DB is PostgreSQL 14.5, 2.3GB, dual AZ"), reality_fact ("app connection string is in DATABASE_URL env var"). Decision: "Run discovery before touching prod." Context slice assembled: shows schema, current version, connection config.

Iteration 2 — Option evaluation

Agent evaluates pg_upgrade vs pg_dump vs logical replication. Rows written: reality_fact ("pg_upgrade is in-place, ~0 dump/restore time, requires downtime window"), reality_fact ("logical replication needs PG 16 on both ends"). Decision: "Use pg_upgrade --link for fastest in-place migration." Rationale: "pg_upgrade is fastest for in-place migration with this schema size."

Iteration 3 — Staging validation

Agent spins up staging DB (Postgres 16), runs pg_upgrade there first. Rows written: reality_fact ("staging migration completed in 18 min, 0 data loss, no schema incompatibilities"). Validation_event: check_label "staging pg_upgrade succeeds", check_result "exit code 0, 2.3GB in 18 min", status "pass". Decision expected_outcome (~20 min) vs actual (18 min): 2 minutes faster than expected.
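Since expected_outcome and check_result are both free text, comparing them needs some extraction step. The sketch below is one naive possibility, assuming durations are written as "N min"; the helper names minutes_in and duration_delta and the regular expression are illustrative, not part of AetherForge.

```python
import re
from typing import Optional

# Matches durations such as "18 min" or "~20 min" (assumed phrasing).
_MINUTES = re.compile(r"(\d+(?:\.\d+)?)\s*min")


def minutes_in(text: str) -> Optional[float]:
    """Pull the first 'N min' duration out of free text, or None if absent."""
    match = _MINUTES.search(text)
    return float(match.group(1)) if match else None


def duration_delta(expected_outcome: str, check_result: str) -> Optional[float]:
    """Expected minus actual minutes; a positive value means faster than expected."""
    expected = minutes_in(expected_outcome)
    actual = minutes_in(check_result)
    if expected is None or actual is None:
        return None
    return expected - actual
```

A production system would more likely store structured measurements alongside the text, but even this crude comparison turns the staging run into a usable signal.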

Iteration 4 — Prod migration decision

Agent asks: "has anything changed since staging run?" Drift_flag: false. Decision: "Proceed with prod migration using pg_upgrade --link, maintenance window 2–4 AM."

Iteration 5 — Prod migration execution

Agent runs pg_upgrade --link on prod. Rows written: reality_fact ("prod migration took 21 min, exit code 0, all data present"). Validation_event: check_label "prod pg_upgrade completed", check_result "exit code 0, 2.3GB in 21 min", status "pass."

Iteration 6 — App reconfiguration

Agent updates DATABASE_URL, runs smoke tests. Rows written: reality_fact ("app reconnects to new DB, all queries return expected results"). Validation_event: check_label "app queries against new DB", check_result "14 smoke tests passed, 0 failures", status "pass."

Iteration 7 — Capability recorded

Agent creates capability record for "postgres_upgrade_pg_upgrade_link". Proficiency: 0.95 (validated in staging and prod). learned_from: decision_id from iteration 4.

Iteration 8 — Cleanup

Agent confirms old PG14 instance is backed up. Mission status: completed.

Context slice at iteration 8 includes the 7 facts, 3 decisions, and 3 validation events recorded above, drift_flag false, relevance scores { pg_upgrade: 0.98, downtime: 0.95, data_loss: 0.92 }. The agent has full visibility: what it decided, why, what happened, and which capability it recorded.


What's Next

Open API (Q3 2026): Full public API with SDK support for Python, TypeScript, and Go. Design partners get early access in July.

Design Partner Program: We are working with 12 agent framework teams that need persistent mission infrastructure underneath their execution layer. If you're building a coding agent, a research agent, or a workflow automation system, you have the same problem. Apply for early access.

MCP Server (Q4 2026): AetherForge as a Model Context Protocol server — so any MCP-compatible agent can connect directly to its mission state. No custom integration required.

The core loop is the foundation. What's built on top of it — autonomous code agents, research synthesizers, multi-agent coordination, long-horizon planning — that's where it gets interesting.
