omnigraph/docs/branches-commits.md
Ragnor Comerford a335d98854
Refactor AGENTS.md from encyclopedia to map; move spec into docs/
Splits the 990-line AGENTS.md into a 184-line map (architecture,
where-to-find index, always-on invariants, capability matrix,
maintenance contract) plus 18 new docs/*.md files holding the deep
content per topic (storage, schema and query languages, indexes,
embeddings, branches/commits, runs, merge, changes, execution, policy,
server, CLI reference, audit, errors, CI, constants, v0.3.1 notes).

Adds scripts/check-agents-md.sh and a check_agents_md CI job that
verifies every docs/ link in AGENTS.md resolves and every doc in the
canonical set is linked. CLAUDE.md remains a symlink to AGENTS.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 23:31:08 +02:00

2.4 KiB

Branches, Commits, Snapshots

L1 — Lance per-dataset branches

Lance supports branching at the dataset level: a branch is a named lineage of versions, and fork_branch_from_state(source_branch, target_branch, source_version) creates a copy-on-write fork.

L2 — Graph-level branches

OmniGraph builds graph branches on top by branching every sub-table coherently:

  • branch_create(name) / branch_create_from(target, name) — disallowed name main; fails if branch exists; ensures the schema-apply lock is idle.
  • branch_list() — returns public branches, filters internal __run__… and __schema_apply_lock__ prefixes.
  • branch_delete(name) — refuses if there are descendants or active runs on the branch; cleans up owned per-branch fragments.
  • Lazy forking: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source.
  • sync_branch(branch) — re-binds the in-memory handle to the latest head of the branch.

L2 — Commit graph (db/commit_graph.rs)

Stored as a Lance dataset _graph_commits.lance (with stable row IDs):

GraphCommit {
  graph_commit_id: ULID,
  manifest_branch: Option<String>,
  manifest_version: u64,
  parent_commit_id: Option<String>,
  merged_parent_commit_id: Option<String>,   // populated for merge commits
  actor_id: Option<String>,
  created_at: i64 (microseconds since epoch),
}
  • Every successful publish (load / change / merge / schema_apply / publish_run) appends one commit.
  • Merge commits have two parents; linear commits have one.
  • _graph_commit_actors.lance — optional separate actor map (created on demand).
  • API: list_commits(branch), get_commit(id), head_commit_id_for_branch(branch).

L2 — Snapshots & time travel

  • snapshot() — current snapshot for the bound branch; cached.
  • snapshot_of(target) — snapshot at a ReadTarget (branch | snapshot id).
  • snapshot_at_version(v: u64) — historical snapshot from any manifest version.
  • entity_at(table_key, id, version) — single-entity time travel without building a full snapshot.
  • A Snapshot is a (version, HashMap<table_key, SubTableEntry>) — cheap to build, snapshot-isolated cross-table reads.

L2 — Internal system branches

Filtered from branch_list() but visible to internals:

  • __run__<run-id> — ephemeral isolation branch for a transactional run.
  • __schema_apply_lock__ — serializes schema migrations.