Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Storage layer

Argon’s storage layer is a trait abstraction (The storage trait) with two backends behind it: an in-memory backend for tests and REPL, and a Postgres backend. Both implement the same trait so the rest of the runtime is backend-agnostic.

The architectural decision that frames every other choice is the rejection of triple-decomposition: storage is a single append-only log of axiom-ADT events, not entity-attribute-value triples (The single-event-log architecture). The single log carries every cross-cutting concern — discriminator axes, bitemporality, defeasibility, provenance, capability gates — once, instead of per-relation.

The contract is ontology-neutral. No UFO axis is a column. Rigidity, sortality, identity-provision, and any vocabulary-specific axis live as meta_property events whose body declares (axis, target, value). A UFO vocabulary package ships its axes as data, not as type-system primitives.

The single-event-log architecture

Storage is one canonical relation: axiom_events. Every kernel write — concept declaration, subsumption axiom, relation declaration, rule declaration, type assertion, relation tuple, property assertion, meta-property, standpoint declaration, module declaration, retraction — inserts exactly one row. Rows are never updated in place except for the transaction-time-end field on supersession. Rows are deleted only via the capability-gated forget operation (Capability surface).

Each row carries:

  1. Identity — a UUIDv7 event identifier (time-orderable, monotonic) and a stable axiom_id UUID identifying the logical axiom across edits.
  2. The axiom-ADT body — a variant tag (one of the 26 axiom kinds catalogued in Axiom-ADT variant catalog) plus a canonical-CBOR payload mirroring Argon’s CoreIR 1:1. The content_id is SHA-256(body), so identical bodies deduplicate.
  3. Discriminator axes — tenant, fork, standpoint, module. Always present; default singletons in single-tenant deployments.
  4. Bitemporality — a valid-time range and transaction-time begin/end (Temporal substrate, RP-004).
  5. Polarity — assert (+) or retract (-); retractions point at the assert they supersede.
  6. Decidability classification — the tier-pair (main_tier, temporal_tier) per Tier ladder and RP-004.
  7. Defeasibility proof tags — Governatori-Rotolo 4-slot tags per defeasible reasoning.
  8. Provenance — PosBool(M) DNF for derived events (absent for axioms).

The four discriminator axes plus the bitemporal pair give the 4-axis versioning contract: any event is located by (tenant, fork, standpoint, module) in space and by (valid-time, transaction-time) in time. A query at an AsOf point (AsOf semantics — bitemporal point) reads the events live in scope at that bitemporal coordinate.

Why one log, not many tables. A split design (mutable normalized TBox tables plus an append-only ABox log) cannot answer “what was the TBox as of TT₀” without rebuilding state from a separate ledger, and forces every discriminator axis to be added redundantly to every TBox table. A unified log factors these axes once. Event sourcing is the correct foundation for bitemporal retraction, copy-on-write forking, and PosBool(M) derivation provenance; any of these over a mutable schema introduces impedance mismatches that surface as edge-case correctness bugs.

Why ADT-shaped bodies, not RDF-style decomposition. Reifying an axiom into triples loses the ADT pattern-matching benefit, inflates row count 5–50×, and adds a join layer on every read. The empirical case is clear: horned-owl (Phillip Lord, Newcastle) achieves 1–2 orders of magnitude faster parsing and ~20× less memory than the OWL API’s triple store by using Rust enums for OWL axioms; Konclude and Tawny-OWL converge on the same shape. Pattern matching on the ADT is reasoning in the matching-logic sense (Roșu 2017, LMCS).

The body encoding is canonical CBOR (deterministic per RFC 8949 §4.2.1). The body schema is the variant’s body shape from the axiom-ADT variant catalog.

The storage trait

Backends implement:

pub trait StorageBackend: Send + Sync {
    type Error: std::error::Error + Send + Sync;

    /// Open or create storage for a (tenant, fork). Includes any
    /// per-backend schema migrations.
    fn open(
        &mut self,
        tenant_id: TenantId,
        fork_id: ForkId,
    ) -> Result<(), Self::Error>;

    /// Append one axiom event (assert or retract).
    fn append(&mut self, event: &AxiomEvent) -> Result<TxTime, Self::Error>;

    /// Append a batch of axiom events as one transaction.
    fn append_batch(
        &mut self,
        events: &[AxiomEvent],
    ) -> Result<TxTime, Self::Error>;

    /// Physically erase axiom events for compliance. Capability checked
    /// by the caller; the backend logs to its forget log.
    fn forget(
        &mut self,
        targets: &[AxiomId],
        actor: PrincipalId,
        reason: &str,
    ) -> Result<ForgetReceipt, Self::Error>;

    /// Scan live events of a kind in scope at a bitemporal point.
    /// Returns events with op='+' and (tx_to absent OR tx_to > as_of.tt).
    fn scan_live(
        &self,
        scope: Scope,
        kind: AxiomKind,
        as_of: AsOf,
    ) -> Result<EventCursor<'_>, Self::Error>;

    /// Lookup events by extracted body field. Backends with the right
    /// index serve this in O(log n); others fall back to scan.
    fn lookup_by_field(
        &self,
        scope: Scope,
        kind: AxiomKind,
        field: BodyField,
        value: BodyValue,
        as_of: AsOf,
    ) -> Result<EventCursor<'_>, Self::Error>;

    /// Per-axiom-id history (across retractions / reasserts).
    fn axiom_history(
        &self,
        scope: Scope,
        axiom_id: AxiomId,
    ) -> Result<EventCursor<'_>, Self::Error>;

    /// Apply CQRS projection maintenance after a batch of appends.
    fn maintain_projections(
        &mut self,
        delta: &EventDelta,
    ) -> Result<(), Self::Error>;
}

Scope carries (tenant_id, fork_id, standpoint_id_set, module_id_set). The standpoint_id_set is typically a single standpoint plus its lattice ancestors (pre-computed at compose time, cached in global-control).

Backends manage their own schema versioning. The schema version is not carried in the .oxbin preamble.

The Postgres backend

The Postgres backend (oxc-storage-pg) materializes the event log as a single canonical axiom_events table plus a set of CQRS projection tables (modules, concepts, relations, meta-axes, standpoints and their ancestry, per-(standpoint, concept) world assumption, forks, and a forget-audit log). The event log is the source of truth; projections are recomputable views maintained incrementally by the Argon runtime calling maintain_projections after each batch (so projection logic stays in Rust and is shared with the in-memory backend) rather than by DB triggers.

Body fields are extracted into DB-enforced generated columns (STORED GENERATED ALWAYS AS) via an oxc-pgx pgrx extension that reads canonical CBOR; because deterministic CBOR is byte-deterministic, these extractions cannot drift from the body. Hot-path access patterns — live-in-scope-by-kind, axiom-id history, valid-time and transaction-time ranges, and per-variant / per-relation hot fields — are served by partial, GiST, and BRIN indexes that ox build auto-generates from the declared schema; modelers never write storage indexes by hand.

The concrete Postgres DDL — table definitions, index definitions, the cbor_extract_* extension signatures, and the projection catalog — is an operational artifact of this one backend, not part of the language surface. It is not mechanized (see Mechanization) and lives in the oxc-storage-pg crate. ox kernel init installs the extension and runs migrations.

The Postgres backend crate exists but is not the default runtime path; the in-memory backend is what the runtime exercises by default (see Serving surface).

The in-memory backend

The in-memory backend (oxc-storage-mem) is the live backend — for tests, REPL, and ephemeral runtimes. It implements StorageBackend with BTreeMap-backed indexes plus an interval tree for O(log n) bitemporal range queries, sharing all query and mutation code with the Postgres backend. It is the default for cargo nextest run -j 4 (tests need no Postgres). It has no durability (drops on shutdown), no IAM (trusts the calling process; capability checks pass unconditionally), and no partitioning.

Mechanization

The storage layer is mechanized in Lean at the contract level — abstract over backend, ontology-neutral, axiom-ADT-shaped:

  • Argon/Storage/Trait.lean — abstract StorageBackend signature (no DDL)
  • Argon/Storage/AxiomEvent.lean — the canonical event row shape (@[language_interface])
  • Argon/Storage/AxiomKind.lean — the 26 variants as a Lean inductive (@[language_interface])
  • Argon/Storage/AxiomBody.lean — per-variant body payloads (@[language_interface])
  • Argon/Storage/CatalogProjections.lean — projection shapes

oxc-protocol::storage mirrors the @[language_interface]-tagged inductives; CI fails on drift. The Postgres DDL is not mechanized — it is an operational artifact of the Postgres backend. The trait admits future backends (analytical / RDF-bridge / property-graph / object-store) without changing the canonical axiom_events abstraction; none are standard backends, and none would change the event-log contract above.