omnigraph/docs/user/branching/index.md
Andrew Altshuler 612741b387
docs(user): split language/branching pages + add front-door pages (Phase 2) (#225)
Content build-out on top of the Phase 1 topic move. No behavior changes.

Splits (existing content relocated, cross-linked):
- queries/index.md → mutations/index.md (insert/update/delete + the
  inserts-vs-deletes rule) and search/index.md (the multi-modal search
  functions + a hybrid-ranking overview tying nearest/bm25/rrf together).
  queries/index.md now covers the read shape and points at both.
- branching/index.md → branching/time-travel.md (snapshots/time travel) and
  branching/merge.md (three-way merge + the 7 conflict kinds, verified against
  error.rs MergeConflictKind).

New pages (written from the code, user-facing):
- quickstart.md — init → load → query → branch, with verified CLI flags.
- concepts/index.md — what OmniGraph is + the L1/L2 (Lance/OmniGraph) framing.

Expanded operations/audit.md from a 7-line struct dump into a real
actor-tracking page (server token-resolved vs CLI --as chain; reading the
trail; the omnigraph:recovery reserved actor).

Index wiring: docs/user/index.md and AGENTS.md's topic table link every new
page; also normalized AGENTS.md's docs/user link display text to match the
Phase 1 retargeted paths.

Verified: zero broken .md links; check-agents-md.sh green (57 links, 54 docs).

Deferred to Phase 3: de-dev polish (grammar paths, IR internals still in
queries/branching), guides/, and a possible reference/config.md split (the
config schema is already coherent in cli/reference.md).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 13:53:46 +03:00

5.5 KiB

Branches, Commits, Snapshots

L1 — Lance per-dataset branches

Lance supports branching at the dataset level: a branch is a named lineage of versions, and fork_branch_from_state(source_branch, target_branch, source_version) creates a copy-on-write fork.

L2 — Graph-level branches

OmniGraph builds graph branches on top by branching every sub-table coherently:

  • branch_create(name) / branch_create_from(target, name) — disallowed name main; fails if branch exists; ensures the schema-apply lock is idle. Atomic and authority-first like branch_delete: it flips the __manifest branch (authority), then creates the derived commit-graph branch, force-dropping any orphaned commit-graph ref left by an incomplete prior delete (the manifest branch is fresh, so a same-named commit-graph branch is provably a zombie). If commit-graph creation fails, the manifest branch is rolled back so the name never half-exists.
  • branch_list() — returns public branches, filters the internal __schema_apply_lock__ branch.
  • branch_delete(name) — refuses if there are descendants on the branch, or if it is the current branch. The manifest is the single authority for branch existence: deletion flips the __manifest branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with force_delete_branch after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the cleanup reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next cleanup surfaces a clear error pointing at cleanup (the stale fork would otherwise collide on first write).
  • Lazy forking: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source. A fork collision is classified by the manifest authority, not by Lance branch versions: if the live manifest already records the fork on the active branch, a concurrent first-write won and the caller gets a retryable "refresh and retry"; if the manifest does not, a physical branch there is an orphan and the caller is pointed at cleanup.
  • sync_branch(branch) — re-binds the in-memory handle to the latest head of the branch.

L2 — Commit graph (db/commit_graph.rs)

In-memory shape of a graph commit:

GraphCommit {
  graph_commit_id: ULID,
  manifest_branch: Option<String>,
  manifest_version: u64,
  parent_commit_id: Option<String>,
  merged_parent_commit_id: Option<String>,   // populated for merge commits
  actor_id: Option<String>,                  // joined in-memory from _graph_commit_actors.lance, NOT a column on _graph_commits.lance
  created_at: i64 (microseconds since epoch),
}

Storage is split across two Lance datasets (both with stable row IDs):

  • _graph_commits.lance — every column above except actor_id.
  • _graph_commit_actors.lance — optional separate (graph_commit_id, actor_id) map, created on demand. The actor_id field above is populated by joining this dataset in-memory at load time.

Notes:

  • Every successful publish (load / change / merge / schema_apply) appends one commit.
  • Merge commits have two parents; linear commits have one.
  • API: list_commits(branch), get_commit(id), head_commit_id_for_branch(branch).

L2 — Snapshots & time travel

Reading a branch at a past version, or a single entity at a past version, is covered on the time travel page. Merging branches and the conflict kinds are on the merge page.

L2 — Internal system branches

Internal or legacy branch refs:

  • __schema_apply_lock__ — serializes schema migrations; filtered from branch_list() but visible to internals.
  • __run__<run-id> — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). These are swept off __manifest on the first read-write open by the v2→v3 internal-schema migration (MR-770), and __run__* is no longer a reserved name. Known limitation: a pre-v0.4.0 graph opened read-only still surfaces any stale __run__* branch in branch_list() until its first read-write open (the migration is write-path-only, like all manifest migrations).

L2 — Recovery audit trail

The five migrated writers (MutationStaging::finalize, schema_apply, branch_merge, ensure_indices, optimize_all_tables) protect their multi-table commits with a sidecar at __recovery/{ulid}.json written before Phase B and deleted after Phase C. The next Omnigraph::open (gated on OpenMode::ReadWrite) runs the recovery sweep in crates/omnigraph/src/db/manifest/recovery.rs: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.

Audit rows live in _graph_commit_recoveries.lance (sibling to _graph_commits.lance) and reference the commit graph by graph_commit_id. The linked recovery commit is identified by that same graph_commit_id, and actor_id="omnigraph:recovery" is stored in _graph_commit_actors.lance (joined by graph_commit_id) — _graph_commits.lance itself does not carry the actor_id column. To find recoveries for a specific original actor: omnigraph commit list --filter actor=omnigraph:recovery, then join to _graph_commit_recoveries.lance by graph_commit_id to read recovery_for_actor. Schema: see crates/omnigraph/src/db/recovery_audit.rs.