mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-09 01:35:18 +02:00
Branches, Commits, Snapshots
L1 — Lance per-dataset branches
Lance supports branching at the dataset level: a branch is a named lineage of versions, and fork_branch_from_state(source_branch, target_branch, source_version) creates a copy-on-write fork.
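To make "named lineage of versions" and "copy-on-write fork" concrete, here is a minimal toy model. It is not the Lance API: `ToyDataset`, `ToyBranch`, `commit`, `visible_versions`, and the globally increasing version counter are all assumptions made for illustration. Only the name and argument order of `fork_branch_from_state` come from the text above. The fork copies no data; the new branch only records its fork point and inherits every version up to it.

```rust
use std::collections::HashMap;

/// Toy branch: its fork point (if any) plus the versions written on it since.
#[derive(Debug, Clone)]
struct ToyBranch {
    forked_from: Option<(String, u64)>,
    own_versions: Vec<u64>,
}

/// Toy dataset with named branches. Hypothetical model, not the Lance API.
#[derive(Debug)]
struct ToyDataset {
    next_version: u64,
    branches: HashMap<String, ToyBranch>,
}

impl ToyDataset {
    /// Start with a "main" branch that already holds version 1.
    fn new() -> Self {
        let mut branches = HashMap::new();
        branches.insert(
            "main".to_string(),
            ToyBranch { forked_from: None, own_versions: vec![1] },
        );
        ToyDataset { next_version: 2, branches }
    }

    /// Append a new version to `branch`; returns None if the branch is unknown.
    fn commit(&mut self, branch: &str) -> Option<u64> {
        let b = self.branches.get_mut(branch)?;
        let v = self.next_version;
        b.own_versions.push(v);
        self.next_version += 1;
        Some(v)
    }

    /// Versions readable from `branch`: inherited ones up to the fork point,
    /// followed by the branch's own versions.
    fn visible_versions(&self, branch: &str) -> Vec<u64> {
        let Some(b) = self.branches.get(branch) else {
            return Vec::new();
        };
        let mut out: Vec<u64> = match &b.forked_from {
            Some((src, at)) => self
                .visible_versions(src)
                .into_iter()
                .filter(|v| v <= at)
                .collect(),
            None => Vec::new(),
        };
        out.extend(b.own_versions.iter().copied());
        out
    }

    /// Copy-on-write fork: nothing is copied, only the fork point is recorded.
    fn fork_branch_from_state(
        &mut self,
        source: &str,
        target: &str,
        source_version: u64,
    ) -> Result<(), String> {
        if self.branches.contains_key(target) {
            return Err(format!("branch {target} already exists"));
        }
        if !self.visible_versions(source).contains(&source_version) {
            return Err(format!("version {source_version} is not on branch {source}"));
        }
        self.branches.insert(
            target.to_string(),
            ToyBranch {
                forked_from: Some((source.to_string(), source_version)),
                own_versions: Vec::new(),
            },
        );
        Ok(())
    }
}
```

In this sketch, forking `feature` from `main` at version 1 and then committing on both branches leaves the two lineages sharing version 1 and diverging afterwards.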
L2 — Graph-level branches
OmniGraph builds graph branches on top by branching every sub-table coherently:
- branch_create(name) / branch_create_from(target, name): rejects the reserved name main; fails if the branch already exists; ensures the schema-apply lock is idle.
- branch_list(): returns public branches, filtering out the internal __run__… and __schema_apply_lock__ prefixes.
- branch_delete(name): refuses if the branch has descendants or active runs; cleans up owned per-branch fragments.
- Lazy forking: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source.
- sync_branch(branch): re-binds the in-memory handle to the latest head of the branch.
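The creation rules and lazy forking described above can be sketched with a small in-memory model. This is an illustration under stated assumptions, not OmniGraph's implementation: `ToyGraph`, `BranchError`, `mutate`, `forked_tables`, the `schema_apply_in_progress` flag, and the `node:Person` table key are all hypothetical names. Only `branch_create` and the rules it enforces are taken from the list.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum BranchError {
    ReservedName,
    AlreadyExists,
    SchemaApplyInProgress,
    UnknownBranch,
}

/// Hypothetical in-memory model of graph-level branches.
struct ToyGraph {
    /// Stand-in for "the schema-apply lock is not idle".
    schema_apply_in_progress: bool,
    /// Branch name -> sub-tables that this branch has forked so far.
    forked: BTreeMap<String, BTreeSet<String>>,
}

impl ToyGraph {
    fn new() -> Self {
        let mut forked = BTreeMap::new();
        forked.insert("main".to_string(), BTreeSet::new());
        ToyGraph { schema_apply_in_progress: false, forked }
    }

    /// Applies the three creation rules in the order they are listed above.
    fn branch_create(&mut self, name: &str) -> Result<(), BranchError> {
        if name == "main" {
            return Err(BranchError::ReservedName);
        }
        if self.forked.contains_key(name) {
            return Err(BranchError::AlreadyExists);
        }
        if self.schema_apply_in_progress {
            return Err(BranchError::SchemaApplyInProgress);
        }
        // A new branch forks nothing up front: every sub-table is still shared.
        self.forked.insert(name.to_string(), BTreeSet::new());
        Ok(())
    }

    /// Records a mutation. Returns Ok(true) when this is the first mutation of
    /// `table_key` on `branch`, which is the moment the sub-table gets forked.
    fn mutate(&mut self, branch: &str, table_key: &str) -> Result<bool, BranchError> {
        let owned = self
            .forked
            .get_mut(branch)
            .ok_or(BranchError::UnknownBranch)?;
        Ok(owned.insert(table_key.to_string()))
    }

    /// Sub-tables the branch owns; everything else is shared with its source.
    fn forked_tables(&self, branch: &str) -> Vec<&str> {
        self.forked
            .get(branch)
            .map(|s| s.iter().map(String::as_str).collect::<Vec<&str>>())
            .unwrap_or_default()
    }
}
```

A branch that is only read keeps an empty `forked_tables` set in this model, which corresponds to the "pure-read branches share fragments with their source" rule.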
L2 — Commit graph (db/commit_graph.rs)
In-memory shape of a graph commit:
GraphCommit {
graph_commit_id: ULID,
manifest_branch: Option<String>,
manifest_version: u64,
parent_commit_id: Option<String>,
merged_parent_commit_id: Option<String>, // populated for merge commits
actor_id: Option<String>, // joined in-memory from _graph_commit_actors.lance, NOT a column on _graph_commits.lance
created_at: i64, // microseconds since epoch
}
Storage is split across two Lance datasets (both with stable row IDs):
- _graph_commits.lance: every column above except actor_id.
- _graph_commit_actors.lance: optional separate (graph_commit_id, actor_id) map, created on demand. The actor_id field above is populated by joining this dataset in-memory at load time.
Notes:
- Every successful publish (load / change / merge / schema_apply) appends one commit.
- Merge commits have two parents; linear commits have one.
- API: list_commits(branch), get_commit(id), head_commit_id_for_branch(branch).
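The commit shape, the in-memory actor join, and the one-parent versus two-parent distinction can be sketched as follows. The field names mirror the GraphCommit shape above; everything else is an assumption for illustration: the ULID is held as a plain `String`, and `parents`, `is_merge`, `join_actors`, and the `commit` helper are hypothetical names, not functions from db/commit_graph.rs.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct GraphCommit {
    graph_commit_id: String, // a ULID in the source; a plain String in this sketch
    manifest_branch: Option<String>,
    manifest_version: u64,
    parent_commit_id: Option<String>,
    merged_parent_commit_id: Option<String>, // populated for merge commits
    actor_id: Option<String>,                // filled in by join_actors below
    created_at: i64,                         // microseconds since epoch
}

impl GraphCommit {
    /// Parent ids in order: the regular parent first, then the merged parent.
    fn parents(&self) -> Vec<&str> {
        self.parent_commit_id
            .iter()
            .chain(self.merged_parent_commit_id.iter())
            .map(String::as_str)
            .collect()
    }

    fn is_merge(&self) -> bool {
        self.merged_parent_commit_id.is_some()
    }
}

/// Hypothetical join step: `actors` stands in for the rows of the optional
/// (graph_commit_id, actor_id) dataset. Commits without a row keep None.
fn join_actors(commits: &mut [GraphCommit], actors: &HashMap<String, String>) {
    for c in commits.iter_mut() {
        c.actor_id = actors.get(&c.graph_commit_id).cloned();
    }
}

/// Hypothetical constructor used only to keep the example short.
fn commit(id: &str, parent: Option<&str>, merged: Option<&str>) -> GraphCommit {
    GraphCommit {
        graph_commit_id: id.to_string(),
        manifest_branch: None,
        manifest_version: 1,
        parent_commit_id: parent.map(str::to_string),
        merged_parent_commit_id: merged.map(str::to_string),
        actor_id: None,
        created_at: 0,
    }
}
```

Keeping the actor map outside the commit rows, as the storage split above does, means a missing actors dataset simply leaves every `actor_id` as `None` in this sketch.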
L2 — Snapshots & time travel
- snapshot(): current snapshot for the bound branch; cached.
- snapshot_of(target): snapshot at a ReadTarget (branch | snapshot id).
- snapshot_at_version(v: u64): historical snapshot from any manifest version.
- entity_at(table_key, id, version): single-entity time travel without building a full snapshot.
- A Snapshot is a (version, HashMap<table_key, SubTableEntry>) pair: cheap to build, with snapshot-isolated cross-table reads.
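A small sketch of why a (version, map) pair gives snapshot-isolated reads: once built, a snapshot holds its own table map, so later publishes cannot change what it sees. This is a toy under stated assumptions. `ToyManifest`, `publish`, the `table_version` field on `SubTableEntry`, and the `node:Person` / `edge:Knows` keys are hypothetical; only the `snapshot` and `snapshot_at_version` names come from the list above, and their real signatures may differ.

```rust
use std::collections::HashMap;

/// Hypothetical entry: the real SubTableEntry fields are not shown in this document.
#[derive(Debug, Clone, PartialEq)]
struct SubTableEntry {
    table_version: u64,
}

/// A snapshot as described above: a manifest version plus a table map.
#[derive(Debug, Clone)]
struct Snapshot {
    version: u64,
    tables: HashMap<String, SubTableEntry>,
}

/// Toy manifest that keeps every published snapshot in memory.
struct ToyManifest {
    history: Vec<Snapshot>,
}

impl ToyManifest {
    fn new() -> Self {
        ToyManifest {
            history: vec![Snapshot { version: 1, tables: HashMap::new() }],
        }
    }

    /// Publish a new manifest version in which `table_key` points at `table_version`.
    fn publish(&mut self, table_key: &str, table_version: u64) -> u64 {
        let head = self.history.last().expect("history is never empty");
        let mut tables = head.tables.clone();
        let version = head.version + 1;
        tables.insert(table_key.to_string(), SubTableEntry { table_version });
        self.history.push(Snapshot { version, tables });
        version
    }

    /// Current snapshot: a copy of the head, unaffected by later publishes.
    fn snapshot(&self) -> Snapshot {
        self.history.last().expect("history is never empty").clone()
    }

    /// Historical snapshot for any manifest version that exists.
    fn snapshot_at_version(&self, v: u64) -> Option<Snapshot> {
        self.history.iter().find(|s| s.version == v).cloned()
    }
}
```

A reader holding a snapshot taken at version 2 keeps seeing the table versions recorded at version 2, even after other tables or the same table are republished.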
L2 — Internal system branches
Filtered from branch_list() but visible to internals:
- __schema_apply_lock__: serializes schema migrations.
- __run__<run-id>: legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate is_internal_run_branch is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).
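The filtering and the defense-in-depth guard can be sketched with plain prefix checks. The name `is_internal_run_branch` and the two prefixes come from the text above, but its signature here is an assumption, and `is_internal_branch`, `public_branches`, and `check_user_branch_name` are hypothetical helpers written for this example only.

```rust
const LEGACY_RUN_PREFIX: &str = "__run__";
const SCHEMA_APPLY_LOCK_PREFIX: &str = "__schema_apply_lock__";

/// Guard predicate for the legacy run-branch prefix (signature assumed).
fn is_internal_run_branch(name: &str) -> bool {
    name.starts_with(LEGACY_RUN_PREFIX)
}

/// True for any branch that a public listing should hide.
fn is_internal_branch(name: &str) -> bool {
    is_internal_run_branch(name) || name.starts_with(SCHEMA_APPLY_LOCK_PREFIX)
}

/// Keep only public branches, preserving input order.
fn public_branches<'a>(all: &[&'a str]) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|name| !is_internal_branch(name))
        .collect()
}

/// Reject user-supplied names that collide with the legacy run prefix.
fn check_user_branch_name(name: &str) -> Result<(), String> {
    if is_internal_run_branch(name) {
        Err(format!("branch name {name} uses a reserved internal prefix"))
    } else {
        Ok(())
    }
}
```

Keeping the guard separate from the listing filter means the filter can be dropped later, as planned in MR-770, without loosening what names users are allowed to create.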