diff --git a/AGENTS.md b/AGENTS.md
index e62b5ae..a98b974 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -198,7 +198,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
 | Columnar storage on object store | ✅ Arrow/Lance | URI normalization, S3 env-var plumbing |
 | Per-dataset versioning + time travel | ✅ | `snapshot_at_version`, `entity_at`, snapshot-pinned reads across many tables |
 | Per-dataset branches | ✅ | **Graph-level** branches (atomic across all sub-tables), lazy fork, system branch filtering |
-| Atomic single-dataset commits | ✅ | **Multi-table publish via three layers**, NOT a single Lance primitive: (1) per-table Lance `commit_staged` for the data write, (2) `__manifest` row-level CAS via `ManifestBatchPublisher` for cross-table ordering, (3) recovery-on-open reconciler for the residual gap between (1) and (2). Layer (3) is **not yet shipped** — tracked in MR-847. Until MR-847 lands, a failure between per-table `commit_staged` and the manifest publish leaves drift (the documented "Phase B → Phase C residual" — see [docs/runs.md](docs/runs.md)). Engine writes route through a sealed `TableStorage` trait (MR-793) exposing `stage_*` + `commit_staged` as the canonical staged-write surface; documented inline-commit residuals (`delete_where`, `create_vector_index`, plus legacy `append_batch` / `merge_insert_batches` / `overwrite_batch` / `create_*_index` pending Phase 9) remain on the trait until upstream Lance ships a public two-phase API ([#6658](https://github.com/lance-format/lance/issues/6658), [#6666](https://github.com/lance-format/lance/issues/6666)) and call-site conversion (Phase 1b) completes. **Do not describe atomicity as "fully upheld" until MR-847 ships.** |
+| Atomic single-dataset commits | ✅ | **Multi-table publish via three layers**, NOT a single Lance primitive: (1) per-table Lance `commit_staged` for the data write, (2) `__manifest` row-level CAS via `ManifestBatchPublisher` for cross-table ordering, (3) the open-time recovery sweep for the residual gap between (1) and (2). All three layers ship; the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) write a `__recovery/{ulid}.json` sidecar before Phase B and delete it after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the sweep in `db/manifest/recovery.rs`: classify, decide all-or-nothing per sidecar, roll forward via single `ManifestBatchPublisher::publish` or roll back via `Dataset::restore`, and record an audit row in `_graph_commit_recoveries.lance` (queryable via `omnigraph commit list --filter actor=omnigraph:recovery`). Continuous in-process recovery (no restart needed between Phase B failure and recovery) is the goal of a future background reconciler. Engine writes route through a sealed `TableStorage` trait exposing `stage_*` + `commit_staged` as the canonical staged-write surface; documented inline-commit residuals (`delete_where`, `create_vector_index`, plus legacy `append_batch` / `merge_insert_batches` / `overwrite_batch` / `create_*_index`) remain on the trait until upstream Lance ships a public two-phase API ([#6658](https://github.com/lance-format/lance/issues/6658), [#6666](https://github.com/lance-format/lance/issues/6666)) and the migration of every call site completes. |
 | Compaction (`compact_files`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency |
 | Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
 | BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them on every relevant column; idempotent; lazy across branches |
diff --git a/crates/omnigraph/src/db/manifest.rs b/crates/omnigraph/src/db/manifest.rs
index b20433a..f31cc4f 100644
--- a/crates/omnigraph/src/db/manifest.rs
+++ b/crates/omnigraph/src/db/manifest.rs
@@ -16,6 +16,8 @@ mod migrations;
 mod namespace;
 #[path = "manifest/publisher.rs"]
 mod publisher;
+#[path = "manifest/recovery.rs"]
+mod recovery;
 #[path = "manifest/repo.rs"]
 mod repo;
 #[path = "manifest/state.rs"]
@@ -30,6 +32,11 @@ pub(crate) use namespace::open_table_head_for_write;
 #[cfg(test)]
 use namespace::{branch_manifest_namespace, staged_table_namespace};
 use publisher::{GraphNamespacePublisher, ManifestBatchPublisher};
+pub(crate) use recovery::{
+    delete_sidecar, has_schema_apply_sidecar, new_sidecar, recover_manifest_drift, write_sidecar,
+    RecoveryMode, RecoverySidecar, RecoverySidecarHandle, SidecarKind, SidecarTablePin,
+    SidecarTableRegistration, SidecarTombstone,
+};
 use repo::{init_manifest_repo, open_manifest_repo, snapshot_state_at};
 pub use state::SubTableEntry;
 #[cfg(test)]
diff --git a/crates/omnigraph/src/db/manifest/recovery.rs b/crates/omnigraph/src/db/manifest/recovery.rs
new file mode 100644
index 0000000..588042c
--- /dev/null
+++ b/crates/omnigraph/src/db/manifest/recovery.rs
@@ -0,0 +1,1600 @@
+//! Recovery-on-open primitives.
+//!
+//! This module implements the building blocks of the per-sidecar recovery
+//! sweep that closes the documented Phase B → Phase C residual (see
+//! `docs/runs.md` "Open-time recovery sweep"). The high-level shape:
+//!
+//! 1. Each writer that performs a multi-table commit writes a small JSON
+//!    sidecar at `__recovery/{ulid}.json` BEFORE its per-table
+//!    `commit_staged` loop, listing every `(table_key, table_path,
+//!    expected_version, post_commit_pin)` it intends to publish.
+//! 2. After the manifest publish (Phase C) succeeds, the writer deletes
+//!    the sidecar.
+//! 3. If the writer crashes between Phase B begin and Phase C success,
+//!    the sidecar remains. The next `Omnigraph::open` (gated on
+//!    `OpenMode::ReadWrite`) classifies each table in each sidecar and
+//!    either rolls forward all tables (if every table is at
+//!    `post_commit_pin` AND matches the sidecar) or rolls back all
+//!    drifted tables to the manifest-pinned version.
+//!
+//! ## Verified Lance behavior the rollback path depends on
+//!
+//! - `Dataset::restore()` takes no version arg; restores
+//!   `self.manifest.version` (currently checked-out version). From HEAD =
+//!   `h`, produces a new commit at `h + 1` with content == checked-out
+//!   version. Pinned by
+//!   `tests/staged_writes.rs::lance_restore_appends_one_commit_with_checked_out_content`.
+//! - `Dataset::restore` "wins" against concurrent Append/Update/Delete/
+//!   CreateIndex/Merge — see `check_restore_txn` at lance-4.0.0
+//!   `src/io/commit/conflict_resolver.rs:986`. The hazard is documented
+//!   by `tests/staged_writes.rs::lance_restore_loses_to_concurrent_append_via_orphaning`.
+//!   This module sidesteps the hazard by running recovery only at
+//!   `Omnigraph::open` (before any other writers can race). A future
+//!   continuous in-process recovery reconciler will need to guard via
+//!   per-(table_key, branch) queue acquisition.
+
+use std::collections::HashMap;
+
+use lance::Dataset;
+use serde::{Deserialize, Serialize};
+use tracing::warn;
+
+use crate::db::commit_graph::CommitGraph;
+use crate::db::graph_coordinator::GraphCoordinator;
+use crate::db::recovery_audit::{
+    RecoveryAudit, RecoveryAuditRecord, RecoveryKind, TableOutcome, now_micros,
+};
+use crate::db::schema_state::SchemaStateRecovery;
+use crate::error::{OmniError, Result};
+use crate::storage::StorageAdapter;
+
+use super::Snapshot;
+use super::publisher::{GraphNamespacePublisher, ManifestBatchPublisher};
+use super::{ManifestChange, SubTableUpdate, TableRegistration, TableTombstone};
+
+/// System actor identifier recorded on every recovery commit. Operators
+/// distinguish recovery commits from user commits in `omnigraph commit list`
+/// by filtering on this actor; the original sidecar's actor (if any) flows
+/// into the audit row's `recovery_for_actor` field.
+pub(crate) const RECOVERY_ACTOR: &str = "omnigraph:recovery";
+
+/// Subdirectory under the repo root holding sidecar files.
+pub(crate) const RECOVERY_DIR_NAME: &str = "__recovery";
+
+/// Current sidecar JSON shape version. Bumping this is a breaking change:
+/// older binaries will refuse to interpret newer sidecars (intentional —
+/// see [`SidecarSchemaError`]).
+pub(crate) const SIDECAR_SCHEMA_VERSION: u32 = 1;
+
+/// Selects which recovery actions are allowed in a sweep.
+///
+/// Open-time recovery (`Omnigraph::open` with `OpenMode::ReadWrite`)
+/// runs the full sweep — `Dataset::restore` is safe because no other
+/// writers are active yet. In-process recovery (called from
+/// `Omnigraph::refresh` during a long-running server) must NOT call
+/// `Dataset::restore`: it "wins" against concurrent Append/Update/
+/// Delete/CreateIndex/Merge per `check_restore_txn`, silently orphaning
+/// the concurrent writer's commit (pinned by
+/// `tests/staged_writes.rs::lance_restore_loses_to_concurrent_append_via_orphaning`).
+/// Roll-forward is safe under concurrency because
+/// `ManifestBatchPublisher::publish` uses row-level CAS.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum RecoveryMode {
+    /// Open-time: the full sweep. RolledPastExpected → roll forward;
+    /// mixed/unexpected → roll back via `Dataset::restore`; invariant
+    /// violation → abort with a loud error.
+    Full,
+    /// In-process (refresh): roll-forward only. Sidecars that would
+    /// require restore or abort are LEFT ON DISK for the next ReadWrite
+    /// open. Closes the common case (mutation/load finalize → publisher
+    /// failure) without restart.
+    RollForwardOnly,
+}
+
+/// Categorizes the writer that produced a sidecar so audit trail and
+/// observability can attribute recoveries to the right code path.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "PascalCase")]
+pub(crate) enum SidecarKind {
+    /// `MutationStaging::finalize` — `mutate_as` and the bulk loader.
+    Mutation,
+    /// `loader/mod.rs` — distinct from mutations only for audit clarity.
+    Load,
+    /// `schema_apply::apply_schema_with_lock` — table rewrites + indices.
+    SchemaApply,
+    /// `branch_merge_on_current_target` — three-way merge publishes.
+    BranchMerge,
+    /// `ensure_indices_for_branch` — index lifecycle commits.
+    EnsureIndices,
+}
+
+/// One table's contribution to a sidecar's intended commit. The classifier
+/// uses these to decide per-table state at recovery time.
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+pub(crate) struct SidecarTablePin {
+    /// Stable identifier (`node:Person`, `edge:Knows`, etc.).
+    pub table_key: String,
+    /// Full URI to the Lance dataset for this table.
+    pub table_path: String,
+    /// Manifest-pinned version at writer start (CAS expectation).
+    pub expected_version: u64,
+    /// Lance HEAD that the writer's `commit_staged` would produce
+    /// (typically `expected_version + 1`).
+    pub post_commit_pin: u64,
+    /// Lance branch ref this table lives on (mirrors
+    /// `SubTableEntry::table_branch`). Required for the recovery sweep
+    /// to open the dataset at the correct ref — `Dataset::open(path)`
+    /// alone returns the default ref (typically main), which would
+    /// classify a feature-branch sidecar against main's HEAD and silently
+    /// no-op or roll back the wrong table version. Optional for backward
+    /// compatibility with older sidecars; `None` means main / default.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub table_branch: Option<String>,
+}
+
+/// New-table registration captured by SchemaApply sidecars so recovery
+/// can publish a `ManifestChange::RegisterTable` for tables that the
+/// writer was about to create. Without this, added tables exist as
+/// orphan datasets on disk after recovery while the live `_schema.pg`
+/// declares types the manifest doesn't know about — `snapshot.entry()`
+/// returns None when the engine tries to read them.
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+pub(crate) struct SidecarTableRegistration {
+    /// Stable identifier (`node:Tag`, `edge:WorksAt`, etc.).
+    pub table_key: String,
+    /// Repo-relative path the manifest will register
+    /// (e.g. `nodes/{fnv1a64-hex}`); recovery joins this with `root_uri`
+    /// to open the dataset Lance HEAD when constructing the
+    /// accompanying `Update`.
+    pub table_path: String,
+    /// Lance branch ref the dataset lives on (None for main / default).
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub table_branch: Option<String>,
+}
+
+/// Tombstone metadata captured by SchemaApply sidecars so recovery can
+/// publish a `ManifestChange::Tombstone` for tables the writer was
+/// about to mark removed. Without this, tombstoned types stay visible
+/// in the manifest snapshot after recovery even though the live
+/// schema no longer declares them.
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+pub(crate) struct SidecarTombstone {
+    pub table_key: String,
+    /// Manifest version at which this table was active before the
+    /// tombstone — required by the publisher's CAS pre-check.
+    pub tombstone_version: u64,
+}
+
+/// In-memory representation of the on-disk JSON sidecar.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub(crate) struct RecoverySidecar {
+    pub schema_version: u32,
+    pub operation_id: String,
+    pub started_at: String,
+    pub branch: Option<String>,
+    pub actor_id: Option<String>,
+    pub writer_kind: SidecarKind,
+    pub tables: Vec<SidecarTablePin>,
+    /// For `SidecarKind::BranchMerge` only: the source branch's HEAD
+    /// commit id at the time the sidecar was written. Used by the
+    /// recovery sweep's audit step to call `append_merge_commit`
+    /// (recording `merged_parent_commit_id`) instead of `append_commit`,
+    /// so future merges between the same pair recognize "already up-to-
+    /// date" and merge-base computations stay correct. Optional for
+    /// backward compatibility — older sidecars (or non-BranchMerge
+    /// kinds) carry `None` and recovery falls back to `append_commit`.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub merge_source_commit_id: Option<String>,
+    /// SchemaApply-only: tables the writer was about to register
+    /// (added types + renamed targets). Recovery emits a
+    /// `RegisterTable` + `Update` pair per entry so the manifest
+    /// catches up to the live schema's declared type set.
+    /// Backward-compat: empty / absent for older sidecars and
+    /// non-SchemaApply writers.
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
+    pub additional_registrations: Vec<SidecarTableRegistration>,
+    /// SchemaApply-only: tables the writer was about to tombstone
+    /// (removed types + renamed sources). Recovery emits a
+    /// `ManifestChange::Tombstone` per entry.
+    /// Backward-compat: empty / absent for older sidecars and
+    /// non-SchemaApply writers.
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
+    pub tombstones: Vec<SidecarTombstone>,
+}
+
+/// Opaque handle returned by [`write_sidecar`] so the caller can delete
+/// the sidecar after Phase C succeeds. Holding the handle does NOT keep
+/// the sidecar alive — it just records the URI to delete.
+#[derive(Debug, Clone)]
+pub(crate) struct RecoverySidecarHandle {
+    pub(crate) operation_id: String,
+    pub(crate) sidecar_uri: String,
+}
+
+/// Error returned when the sidecar's `schema_version` is unknown to this
+/// binary. We refuse-and-error rather than read-and-warn: an old binary
+/// cannot guess what semantics a newer writer baked into a future shape.
+/// Operator action is required (typically: upgrade the binary).
+#[derive(Debug)]
+pub(crate) struct SidecarSchemaError {
+    pub sidecar_uri: String,
+    pub found_version: u32,
+    pub supported_version: u32,
+}
+
+impl std::fmt::Display for SidecarSchemaError {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        write!(
+            f,
+            "recovery sidecar at '{}' declares schema_version={}, but this \
+             binary supports only schema_version={}; refusing to interpret \
+             — upgrade omnigraph or remove the sidecar with operator review",
+            self.sidecar_uri, self.found_version, self.supported_version,
+        )
+    }
+}
+
+impl std::error::Error for SidecarSchemaError {}
+
+impl From<SidecarSchemaError> for OmniError {
+    fn from(err: SidecarSchemaError) -> Self {
+        OmniError::manifest_internal(err.to_string())
+    }
+}
+
+/// Per-table classification of observed Lance HEAD vs. manifest-pinned
+/// state, computed against the sidecar's intent.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum TableClassification {
+    /// `lance_head == manifest_pinned == sidecar.expected_version`.
+    /// The writer never reached this table's `commit_staged` (or this
+    /// table wasn't touched yet). No drift; no action.
+    NoMovement,
+    /// `lance_head == manifest_pinned + 1` AND
+    /// `sidecar.expected_version == manifest_pinned` AND
+    /// `sidecar.post_commit_pin == lance_head`. The writer's
+    /// `commit_staged` for this table succeeded; only Phase C did not
+    /// land. Eligible for roll-forward (in the all-or-nothing decision).
+    RolledPastExpected,
+    /// `lance_head == manifest_pinned + 1` but the sidecar's
+    /// `expected_version`/`post_commit_pin` don't match. Some other writer
+    /// or recovery action moved this table. Roll back to the manifest pin.
+    UnexpectedAtP1,
+    /// `lance_head > manifest_pinned + 1`. Multi-step orphan from a
+    /// previous restore attempt or an external mutation. Roll back to
+    /// the manifest pin.
+    UnexpectedMultistep,
+    /// `lance_head < manifest_pinned`. Should be impossible: the manifest
+    /// pin can only advance after a successful Lance commit. Surface
+    /// loudly and abort recovery.
+    InvariantViolation { observed: u64 },
+}
+
+/// Per-sidecar decision derived from the table classifications.
+///
+/// **All-or-nothing**: the writer that produced the sidecar intended an
+/// atomic publish across every table it listed. Rolling forward only some
+/// of them would publish a partial commit and violate `docs/invariants.md`
+/// §VI.23. The decision is based on the worst classification:
+///
+/// - Any `InvariantViolation` → `Abort` (operator action required).
+/// - Any `UnexpectedAtP1` / `UnexpectedMultistep` / `NoMovement` →
+///   `RollBack` all drifted tables to the manifest pin.
+/// - All `RolledPastExpected` → `RollForward` every table in one
+///   manifest publish.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum SidecarDecision {
+    /// All tables successfully reached Phase B for this writer; only the
+    /// manifest publish (Phase C) didn't land. Roll the pin forward atomically.
+    RollForward,
+    /// Some tables didn't reach Phase B (or sidecar doesn't match observed state).
+    /// Roll back the rolled-past-expected ones; leave the no-movement ones alone.
+    RollBack,
+    /// An invariant was violated. Refuse to act; surface for operator review.
+    Abort,
+}
+
+/// Build the `__recovery/` directory URI under a repo root.
+pub(crate) fn recovery_dir_uri(root_uri: &str) -> String {
+    let trimmed = root_uri.trim_end_matches('/');
+    format!("{}/{}", trimmed, RECOVERY_DIR_NAME)
+}
+
+/// Build the URI for a specific sidecar (`__recovery/{operation_id}.json`).
+pub(crate) fn sidecar_uri(root_uri: &str, operation_id: &str) -> String {
+    let dir = recovery_dir_uri(root_uri);
+    format!("{}/{}.json", dir, operation_id)
+}
+
+/// Write a sidecar atomically and return a handle for later deletion.
+///
+/// The atomicity contract is inherited from [`StorageAdapter::write_text`]:
+/// LocalStorageAdapter writes via `tokio::fs::write` (whole-file replace);
+/// S3StorageAdapter writes via PutObject (atomic at the object level).
+/// Both are sufficient for sidecar semantics — readers either see the
+/// complete sidecar or none.
+pub(crate) async fn write_sidecar(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+    sidecar: &RecoverySidecar,
+) -> Result<RecoverySidecarHandle> {
+    debug_assert_eq!(sidecar.schema_version, SIDECAR_SCHEMA_VERSION);
+    let uri = sidecar_uri(root_uri, &sidecar.operation_id);
+    let json = serde_json::to_string_pretty(sidecar).map_err(|err| {
+        OmniError::manifest_internal(format!("failed to serialize recovery sidecar: {}", err))
+    })?;
+    storage.write_text(&uri, &json).await?;
+    Ok(RecoverySidecarHandle {
+        operation_id: sidecar.operation_id.clone(),
+        sidecar_uri: uri,
+    })
+}
+
+/// Delete a sidecar after Phase C succeeded. Idempotent (safe to retry).
+pub(crate) async fn delete_sidecar(
+    handle: &RecoverySidecarHandle,
+    storage: &dyn StorageAdapter,
+) -> Result<()> {
+    storage.delete(&handle.sidecar_uri).await
+}
+
+/// Read every sidecar under `__recovery/`. Returns an empty vec if the
+/// directory does not exist or is empty (the steady-state path).
+///
+/// Sidecars whose `schema_version` is unsupported by this binary are NOT
+/// silently skipped — the function returns an error so an operator can
+/// investigate. Rationale: a sidecar with an unknown shape may encode
+/// state we don't know how to recover; better to fail open than guess.
+pub(crate) async fn list_sidecars(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+) -> Result<Vec<RecoverySidecar>> {
+    let dir = recovery_dir_uri(root_uri);
+    let mut uris = storage.list_dir(&dir).await?;
+    // Sort by URI so the sweep processes sidecars deterministically.
+    // Sidecar filenames are ULIDs, which are lexicographically sortable
+    // === chronologically sortable; the older sidecar is processed
+    // before the newer one. Without this sort, `list_dir` returns
+    // filesystem-order results which are nondeterministic and can mask
+    // ordering-sensitive bugs.
+    uris.sort();
+    let mut out = Vec::with_capacity(uris.len());
+    for uri in uris {
+        // Skip non-JSON files defensively; the directory is ours but a
+        // future feature might leave other artifacts here.
+        if !uri.ends_with(".json") {
+            continue;
+        }
+        let body = storage.read_text(&uri).await?;
+        let sidecar = parse_sidecar(&uri, &body)?;
+        out.push(sidecar);
+    }
+    Ok(out)
+}
+
+/// Parse a sidecar body, enforcing the schema-version refusal policy.
+/// Exposed separately so unit tests can exercise the parse path without
+/// going through storage.
+pub(crate) fn parse_sidecar(sidecar_uri: &str, body: &str) -> Result<RecoverySidecar> {
+    // First check the schema_version peek — gives a typed error before we
+    // try to deserialize the rest of the structure (which might fail with
+    // a less-helpful "missing field" message).
+    #[derive(Deserialize)]
+    struct Peek {
+        schema_version: u32,
+    }
+    let peek: Peek = serde_json::from_str(body).map_err(|err| {
+        OmniError::manifest_internal(format!(
+            "recovery sidecar at '{}' is not valid JSON: {}",
+            sidecar_uri, err
+        ))
+    })?;
+    if peek.schema_version != SIDECAR_SCHEMA_VERSION {
+        return Err(SidecarSchemaError {
+            sidecar_uri: sidecar_uri.to_string(),
+            found_version: peek.schema_version,
+            supported_version: SIDECAR_SCHEMA_VERSION,
+        }
+        .into());
+    }
+    serde_json::from_str(body).map_err(|err| {
+        OmniError::manifest_internal(format!(
+            "recovery sidecar at '{}' failed to deserialize: {}",
+            sidecar_uri, err
+        ))
+    })
+}
+
+/// Classify one table's observed state vs. the sidecar's intent.
+///
+/// `kind` adjusts the precision of the `RolledPastExpected` predicate:
+/// - **Strict** (`Mutation`, `Load`): exactly one `commit_staged` per
+///   table, so `lance_head == manifest_pinned + 1` AND
+///   `post_commit_pin == lance_head` is required.
+/// - **Loose** (`SchemaApply`, `EnsureIndices`, `BranchMerge`): the
+///   writer may run N ≥ 1 `commit_staged` calls per table (one per
+///   index built + one for the overwrite, etc.; merge tables run
+///   merge_insert + delete_where + index rebuilds) and the exact N
+///   is hard to compute at sidecar-write time. The loose match accepts
+///   any `lance_head > manifest_pinned` as `RolledPastExpected` when
+///   `pin.expected_version == manifest_pinned` (the writer's CAS
+///   target matches what the manifest currently shows). The risk this
+///   admits — an external agent advancing HEAD between sidecar write
+///   and recovery — is out of scope for the single-coordinator model.
+pub(crate) fn classify_table(
+    pin: &SidecarTablePin,
+    lance_head: u64,
+    manifest_pinned: u64,
+    kind: SidecarKind,
+) -> TableClassification {
+    use TableClassification::*;
+    if lance_head < manifest_pinned {
+        return InvariantViolation {
+            observed: lance_head,
+        };
+    }
+    if lance_head == manifest_pinned {
+        return NoMovement;
+    }
+    // lance_head > manifest_pinned
+    let strict = matches!(kind, SidecarKind::Mutation | SidecarKind::Load);
+    if strict {
+        if lance_head == manifest_pinned + 1 {
+            if pin.expected_version == manifest_pinned && pin.post_commit_pin == lance_head {
+                RolledPastExpected
+            } else {
+                UnexpectedAtP1
+            }
+        } else {
+            // lance_head > manifest_pinned + 1
+            UnexpectedMultistep
+        }
+    } else {
+        // Loose match for multi-commit writers (SchemaApply, EnsureIndices).
+        if pin.expected_version == manifest_pinned {
+            RolledPastExpected
+        } else if lance_head == manifest_pinned + 1 {
+            UnexpectedAtP1
+        } else {
+            UnexpectedMultistep
+        }
+    }
+}
+
+/// Compute the per-sidecar decision from a slice of table classifications.
+///
+/// All-or-nothing per `docs/invariants.md` §VI.23 — see [`SidecarDecision`].
+pub(crate) fn decide(classifications: &[TableClassification]) -> SidecarDecision {
+    use SidecarDecision::*;
+    use TableClassification::*;
+    if classifications
+        .iter()
+        .any(|c| matches!(c, InvariantViolation { .. }))
+    {
+        return Abort;
+    }
+    if classifications
+        .iter()
+        .any(|c| matches!(c, NoMovement | UnexpectedAtP1 | UnexpectedMultistep))
+    {
+        return RollBack;
+    }
+    // All RolledPastExpected (or the slice is empty — no-op trivially).
+    RollForward
+}
+
+/// Restore a single table's Lance HEAD to `target_version`, producing a
+/// new commit at HEAD+1 with content == content-at-`target_version`.
+///
+/// Always runs the actual `Dataset::restore` — there is NO fragment-set
+/// short-circuit because equal fragment IDs do NOT imply equal content:
+/// Lance index commits and deletion-vector updates change the manifest
+/// (and therefore the user-visible state) without changing fragment IDs.
+/// Skipping the restore in those cases would leave Lance HEAD ahead of
+/// the manifest with no recovery artifact left.
+///
+/// Cost: under repeated mid-rollback crashes (rare), Lance HEAD
+/// accumulates extra restore commits that `omnigraph cleanup` reclaims.
+/// Bounded by the number of recovery iterations — typically 1.
+pub(crate) async fn restore_table_to_version(
+    table_path: &str,
+    branch: Option<&str>,
+    target_version: u64,
+) -> Result<()> {
+    let head = Dataset::open(table_path)
+        .await
+        .map_err(|e| OmniError::Lance(e.to_string()))?;
+    let head = match branch {
+        Some(b) if b != "main" => head
+            .checkout_branch(b)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?,
+        _ => head,
+    };
+    let mut to_restore = head
+        .checkout_version(target_version)
+        .await
+        .map_err(|e| OmniError::Lance(e.to_string()))?;
+    to_restore
+        .restore()
+        .await
+        .map_err(|e| OmniError::Lance(e.to_string()))?;
+    Ok(())
+}
+
+/// Open-time recovery sweep — the entry point invoked from
+/// `Omnigraph::open` (gated on `OpenMode::ReadWrite`).
+///
+/// Enumerates every sidecar in `__recovery/`, classifies each table per
+/// the sidecar's intent, and applies the all-or-nothing decision:
+/// roll-forward (every table eligible), roll-back (mixed or unexpected
+/// state), or abort (invariant violation).
+///
+/// Idempotency: a crash mid-sweep leaves the sidecar (deletion is the
+/// final step). Re-opening re-classifies; repeated rollbacks of the
+/// same table append extra Lance restore commits which `omnigraph
+/// cleanup` reclaims.
+///
+/// Concurrency: today recovery runs synchronously in `Omnigraph::open`
+/// *before* the engine is wrapped in the server's `Arc<RwLock<Omnigraph>>`.
+/// No request handlers can race. A future per-(table_key, branch) writer
+/// queue model (paired with a background reconciler) will need to acquire
+/// queues before the sweep restores or publishes.
+pub(crate) async fn recover_manifest_drift(
+    root_uri: &str,
+    storage: std::sync::Arc<dyn StorageAdapter>,
+    coordinator: &mut GraphCoordinator,
+    mode: RecoveryMode,
+    schema_state_recovery: SchemaStateRecovery,
+) -> Result<()> {
+    let sidecars = list_sidecars(root_uri, storage.as_ref()).await?;
+    if sidecars.is_empty() {
+        return Ok(());
+    }
+
+    // For each sidecar, classify against a FRESH snapshot AT THE
+    // SIDECAR'S BRANCH. Two reasons:
+    // 1. Per-sidecar refresh: sidecar N's roll-forward writes manifest
+    //    changes that sidecar N+1 must observe, otherwise N+1 classifies
+    //    its tables against stale pins.
+    // 2. Per-branch snapshot: a sidecar from a feature-branch writer
+    //    pins entries on that feature branch. Classifying against the
+    //    main coordinator's snapshot would compare to main's pins (and
+    //    main's Lance HEAD if pin.table_branch isn't honored), silently
+    //    no-op'ing or rolling back the wrong table version. Open a
+    //    separate per-branch coordinator and use ITS snapshot.
+    for sidecar in sidecars {
+        let branch_snapshot = match sidecar.branch.as_deref() {
+            Some(b) => {
+                let mut branch_coord =
+                    GraphCoordinator::open_branch(root_uri, b, std::sync::Arc::clone(&storage))
+                        .await?;
+                branch_coord.refresh().await?;
+                branch_coord.snapshot()
+            }
+            None => {
+                coordinator.refresh().await?;
+                coordinator.snapshot()
+            }
+        };
+        process_sidecar(
+            root_uri,
+            storage.as_ref(),
+            &branch_snapshot,
+            &sidecar,
+            mode,
+            schema_state_recovery,
+        )
+        .await?;
+    }
+    // Final refresh so the caller sees the post-sweep state.
+    coordinator.refresh().await?;
+    Ok(())
+}
+
+async fn process_sidecar(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+    snapshot: &Snapshot,
+    sidecar: &RecoverySidecar,
+    mode: RecoveryMode,
+    schema_state_recovery: SchemaStateRecovery,
+) -> Result<()> {
+    let mut states = Vec::with_capacity(sidecar.tables.len());
+    for pin in &sidecar.tables {
+        let lance_head = open_lance_head(&pin.table_path, pin.table_branch.as_deref()).await?;
+        let manifest_pinned = snapshot
+            .entry(&pin.table_key)
+            .map(|e| e.table_version)
+            .unwrap_or(0);
+        states.push(ClassifiedTable {
+            classification: classify_table(pin, lance_head, manifest_pinned, sidecar.writer_kind),
+            manifest_pinned,
+            lance_head,
+        });
+    }
+    let classifications = states
+        .iter()
+        .map(|state| state.classification)
+        .collect::<Vec<_>>();
+
+    match decide(&classifications) {
+        SidecarDecision::Abort => match mode {
+            RecoveryMode::Full => {
+                // Surface loudly without deleting the sidecar — operator
+                // must investigate. This includes any InvariantViolation
+                // classification (Lance HEAD < manifest pinned: should
+                // be impossible).
+                Err(OmniError::manifest_internal(format!(
+                    "recovery sidecar '{}' has invariant violation; refusing to act \
+                     — operator review required (sidecar at '{}', classifications: {:?})",
+                    sidecar.operation_id,
+                    sidecar_uri(root_uri, &sidecar.operation_id),
+                    classifications,
+                )))
+            }
+            RecoveryMode::RollForwardOnly => {
+                // In-process refresh-time recovery: leave the sidecar
+                // and defer the loud abort to the next ReadWrite open.
+                // Operator-actionable error surfacing belongs at open,
+                // not silently inside refresh.
+                warn!(
+                    operation_id = sidecar.operation_id.as_str(),
+                    writer_kind = ?sidecar.writer_kind,
+                    "recovery: deferring sidecar with invariant violation to next ReadWrite open"
+                );
+                Ok(())
+            }
+        },
+        SidecarDecision::RollBack => {
+            // Distinguish "stale sidecar from a previous successful
+            // roll-forward whose audit/delete failed" from a legitimate
+            // rollback. If every table is at NoMovement AND any pin's
+            // manifest_pinned has advanced past expected_version, the
+            // manifest already reflects the writer's intent — a previous
+            // recovery's `roll_forward_all` succeeded but `record_audit`
+            // or `delete_sidecar` failed, leaving the sidecar to be
+            // re-discovered. Recording this as RolledBack with empty
+            // outcomes (the naive RollBack path's behavior under all-
+            // NoMovement) misleads operators reading
+            // `_graph_commit_recoveries.lance` — the actual outcome was
+            // a successful roll-forward.
+            let all_no_movement = states
+                .iter()
+                .all(|s| matches!(s.classification, TableClassification::NoMovement));
+            let any_pin_advanced = sidecar
+                .tables
+                .iter()
+                .zip(states.iter())
+                .any(|(pin, state)| state.manifest_pinned > pin.expected_version);
+            if all_no_movement && any_pin_advanced {
+                if matches!(mode, RecoveryMode::RollForwardOnly) {
+                    // Refresh-time audit-recovery is safe: no
+                    // Dataset::restore involved; just an audit-row write
+                    // and sidecar delete.
+                    warn!(
+                        operation_id = sidecar.operation_id.as_str(),
+                        writer_kind = ?sidecar.writer_kind,
+                        "recovery: cleaning up stale sidecar from a prior successful \
+                         roll-forward (audit-recovery, in-process refresh)"
+                    );
+                } else {
+                    warn!(
+                        operation_id = sidecar.operation_id.as_str(),
+                        writer_kind = ?sidecar.writer_kind,
+                        "recovery: cleaning up stale sidecar from a prior successful \
+                         roll-forward (manifest already advanced; recording RolledForward audit)"
+                    );
+                }
+                return record_audit_recovery_rollforward(
+                    root_uri, storage, snapshot, sidecar, &states,
+                )
+                .await;
+            }
+            if matches!(mode, RecoveryMode::RollForwardOnly) {
+                // In-process recovery cannot run Dataset::restore safely
+                // (would orphan a concurrent writer's commit). Leave the
+                // sidecar in place; the next ReadWrite open will handle
+                // it via the full sweep.
+                warn!(
+                    operation_id = sidecar.operation_id.as_str(),
+                    writer_kind = ?sidecar.writer_kind,
+                    "recovery: deferring rollback-eligible sidecar to next ReadWrite open"
+                );
+                return Ok(());
+            }
+            warn!(
+                operation_id = sidecar.operation_id.as_str(),
+                writer_kind = ?sidecar.writer_kind,
+                "recovery: rolling back sidecar (mixed or unexpected state)"
+            );
+            roll_back_sidecar(root_uri, storage, snapshot, sidecar, &states).await
+        }
+        SidecarDecision::RollForward => {
+            if matches!(sidecar.writer_kind, SidecarKind::SchemaApply)
+                && !schema_state_recovery.completed_schema_apply_sidecar_rename()
+            {
+                return match mode {
+                    RecoveryMode::Full => {
+                        warn!(
+                            operation_id = sidecar.operation_id.as_str(),
+                            "recovery: rolling back SchemaApply sidecar because schema staging \
+                             files were not promoted in this recovery pass"
+                        );
+                        roll_back_sidecar(root_uri, storage, snapshot, sidecar, &states).await
+                    }
+                    RecoveryMode::RollForwardOnly => {
+                        warn!(
+                            operation_id = sidecar.operation_id.as_str(),
+                            "recovery: deferring SchemaApply sidecar because schema staging files \
+                             were not promoted in this recovery pass"
+                        );
+                        Ok(())
+                    }
+                };
+            }
+            warn!(
+                operation_id = sidecar.operation_id.as_str(),
+                writer_kind = ?sidecar.writer_kind,
+                "recovery: rolling forward sidecar (Phase B completed; \
+                 Phase C did not land)"
+            );
+            let (new_manifest_version, published_versions) =
+                roll_forward_all(root_uri, sidecar, snapshot).await?;
+            // `to_version` records the ACTUAL Lance HEAD published for
+            // each table (not pin.post_commit_pin, which is a lower bound
+            // for loose-match writers like SchemaApply / EnsureIndices /
+            // BranchMerge that run multiple commit_staged calls per table).
+            // SchemaApply additional_registrations are also included so
+            // operators reading the audit row see the full publish set,
+            // not just the pinned subset.
+            let mut outcomes: Vec<TableOutcome> = sidecar
+                .tables
+                .iter()
+                .map(|pin| TableOutcome {
+                    table_key: pin.table_key.clone(),
+                    from_version: pin.expected_version,
+                    to_version: published_versions
+                        .get(&pin.table_key)
+                        .copied()
+                        .unwrap_or(pin.post_commit_pin),
+                })
+                .collect();
+            for reg in &sidecar.additional_registrations {
+                outcomes.push(TableOutcome {
+                    table_key: reg.table_key.clone(),
+                    from_version: 0,
+                    to_version: published_versions.get(&reg.table_key).copied().unwrap_or(0),
+                });
+            }
+            record_audit(
+                root_uri,
+                sidecar,
+                new_manifest_version,
+                RecoveryKind::RolledForward,
+                outcomes,
+            )
+            .await?;
+            delete_sidecar_by_operation_id(root_uri, storage, &sidecar.operation_id).await?;
+            Ok(())
+        }
+    }
+}
+
+#[derive(Debug, Clone, Copy)]
+struct ClassifiedTable {
+    classification: TableClassification,
+    manifest_pinned: u64,
+    /// Lance HEAD observed at classification time. Captured so the
+    /// rollback audit's `from_version` can record where Lance HEAD was
+    /// before `Dataset::restore` ran (operators reading
+    /// `_graph_commit_recoveries.lance` see actual drift, not
+    /// `from_version == to_version == manifest_pinned`).
+    lance_head: u64,
+}
+
+async fn roll_back_sidecar(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+    snapshot: &Snapshot,
+    sidecar: &RecoverySidecar,
+    states: &[ClassifiedTable],
+) -> Result<()> {
+    // Restore every table whose Lance HEAD has drifted from the
+    // manifest pin (RolledPastExpected, UnexpectedAtP1,
+    // UnexpectedMultistep). NoMovement tables are already at the
+    // manifest pin — no action. Restore is unconditional; repeated
+    // mid-rollback crashes accumulate a few extra Lance commits that
+    // `omnigraph cleanup` reclaims.
+    let mut outcomes = Vec::with_capacity(sidecar.tables.len());
+    for (pin, state) in sidecar.tables.iter().zip(states.iter()) {
+        if matches!(
+            state.classification,
+            TableClassification::RolledPastExpected
+                | TableClassification::UnexpectedAtP1
+                | TableClassification::UnexpectedMultistep
+        ) {
+            restore_table_to_version(
+                &pin.table_path,
+                pin.table_branch.as_deref(),
+                state.manifest_pinned,
+            )
+            .await?;
+            // `from_version` records the Lance HEAD observed BEFORE the
+            // restore (the actual drift), not the manifest pin. Operators
+            // reading `_graph_commit_recoveries.lance` see "rolled back
+            // from v7 to v5" rather than "v5 → v5".
+            outcomes.push(TableOutcome {
+                table_key: pin.table_key.clone(),
+                from_version: state.lance_head,
+                to_version: state.manifest_pinned,
+            });
+        }
+    }
+    // Manifest pin doesn't move on rollback; record an audit-only
+    // commit at the existing version so operators can correlate via
+    // `omnigraph commit list --filter actor=omnigraph:recovery`.
+    record_audit(
+        root_uri,
+        sidecar,
+        snapshot.version(),
+        RecoveryKind::RolledBack,
+        outcomes,
+    )
+    .await?;
+    delete_sidecar_by_operation_id(root_uri, storage, &sidecar.operation_id).await?;
+    Ok(())
+}
+
+/// Cleanup path for stale sidecars where a previous recovery's
+/// roll-forward succeeded (manifest pin advanced past `expected_version`)
+/// but `record_audit` or sidecar deletion failed, leaving the sidecar
+/// to be re-discovered on a subsequent open. By the time we re-classify,
+/// every table reads as `NoMovement` (lance_head == manifest_pinned),
+/// which the naive `RollBack` arm would record as RolledBack-with-empty-
+/// outcomes — misleading for operators because the actual outcome was
+/// a successful roll-forward.
+///
+/// This helper records the correct shape: a `RolledForward` audit row
+/// whose `from_version` is the original `expected_version` and whose
+/// `to_version` is the current `manifest_pinned` (the actual published
+/// version after the prior roll-forward). No Lance writes are needed —
+/// the substrate is already in the post-roll-forward state.
+async fn record_audit_recovery_rollforward(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+    snapshot: &Snapshot,
+    sidecar: &RecoverySidecar,
+    states: &[ClassifiedTable],
+) -> Result<()> {
+    let outcomes: Vec<TableOutcome> = sidecar
+        .tables
+        .iter()
+        .zip(states.iter())
+        .map(|(pin, state)| TableOutcome {
+            table_key: pin.table_key.clone(),
+            from_version: pin.expected_version,
+            to_version: state.manifest_pinned,
+        })
+        .collect();
+    record_audit(
+        root_uri,
+        sidecar,
+        snapshot.version(),
+        RecoveryKind::RolledForward,
+        outcomes,
+    )
+    .await?;
+    delete_sidecar_by_operation_id(root_uri, storage, &sidecar.operation_id).await?;
+    Ok(())
+}
+
+/// Atomically extend every table's manifest pin from `expected_version` to
+/// `post_commit_pin` via a single `ManifestBatchPublisher::publish` call.
+/// Returns the new manifest version produced by the publish.
+///
+/// All-or-nothing at the substrate: the publish writes one `__manifest`
+/// row-level CAS that either advances every listed pin together or fails
+/// with `ExpectedVersionMismatch` (no partial publish). The publisher's
+/// internal `PUBLISHER_RETRY_BUDGET = 5` handles transient row-level CAS
+/// contention; persistent contention surfaces the typed conflict error to
+/// the recovery sweep, which leaves the sidecar in place for the next
+/// open's retry.
+/// Returns `(new_manifest_version, per_table_published_versions)`. The
+/// per-table map is what the audit row's `to_version` should record —
+/// for loose-match writers the actual Lance HEAD can be higher than the
+/// sidecar's `post_commit_pin` (which is a lower bound), so the pin is
+/// the wrong source of truth for an operator-facing audit field.
+async fn roll_forward_all(
+    root_uri: &str,
+    sidecar: &RecoverySidecar,
+    snapshot: &Snapshot,
+) -> Result<(u64, HashMap<String, u64>)> {
+    let total_changes =
+        sidecar.tables.len() + sidecar.additional_registrations.len() + sidecar.tombstones.len();
+    let mut updates: Vec<ManifestChange> = Vec::with_capacity(total_changes);
+    let mut expected: HashMap<String, u64> = HashMap::with_capacity(total_changes);
+    let mut published_versions: HashMap<String, u64> =
+        HashMap::with_capacity(sidecar.tables.len() + sidecar.additional_registrations.len());
+
+    for pin in &sidecar.tables {
+        // Open the dataset at its CURRENT Lance HEAD on the pin's branch
+        // (not at the sidecar's post_commit_pin). For strict-match writers
+        // (Mutation/Load) HEAD == post_commit_pin by construction. For
+        // loose-match writers (SchemaApply/EnsureIndices/BranchMerge) HEAD
+        // may be higher than post_commit_pin (multiple commit_staged
+        // calls per table); we want to publish to the actual current HEAD.
+        let head_ds = Dataset::open(&pin.table_path)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?;
+        let head_ds = match pin.table_branch.as_deref() {
+            Some(b) if b != "main" => head_ds
+                .checkout_branch(b)
+                .await
+                .map_err(|e| OmniError::Lance(e.to_string()))?,
+            _ => head_ds,
+        };
+        let head_version = head_ds.version().version;
+
+        let row_count = head_ds
+            .count_rows(None)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))? as u64;
+
+        let table_relative_path = super::table_path_for_table_key(&pin.table_key)?;
+        let version_metadata = super::metadata::TableVersionMetadata::from_dataset(
+            root_uri,
+            &table_relative_path,
+            &head_ds,
+        )?;
+
+        updates.push(ManifestChange::Update(SubTableUpdate {
+            table_key: pin.table_key.clone(),
+            table_version: head_version,
+            table_branch: pin.table_branch.clone(),
+            row_count,
+            version_metadata,
+        }));
+        expected.insert(pin.table_key.clone(), pin.expected_version);
+        published_versions.insert(pin.table_key.clone(), head_version);
+    }
+
+    // SchemaApply-only: register added tables (and renamed targets) and
+    // emit accompanying Update entries so recovery's manifest commit
+    // matches what the writer would have published. Without this, added
+    // tables exist as orphan datasets on disk but never receive a
+    // manifest entry, leaving the live schema and manifest mismatched.
+    //
+    // Filtered against `snapshot`: when the manifest already has a live
+    // entry for `reg.table_key`, a previous recovery (or the writer
+    // itself, before crashing in Phase D) has already published the
+    // registration — skip it to avoid the publisher's ExpectedVersionMismatch
+    // (expected=0, actual=current_version) on the redundant Register.
+    for reg in &sidecar.additional_registrations {
+        if snapshot.entry(&reg.table_key).is_some() {
+            // Already registered — record the current version in
+            // published_versions so the audit row's `to_version` reflects
+            // reality, but emit no manifest change.
+            if let Some(entry) = snapshot.entry(&reg.table_key) {
+                published_versions.insert(reg.table_key.clone(), entry.table_version);
+            }
+            continue;
+        }
+        let dataset_uri = format!("{}/{}", root_uri.trim_end_matches('/'), reg.table_path);
+        let head_ds = Dataset::open(&dataset_uri)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?;
+        let head_ds = match reg.table_branch.as_deref() {
+            Some(b) if b != "main" => head_ds
+                .checkout_branch(b)
+                .await
+                .map_err(|e| OmniError::Lance(e.to_string()))?,
+            _ => head_ds,
+        };
+        let head_version = head_ds.version().version;
+        let row_count = head_ds
+            .count_rows(None)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))? as u64;
+        let version_metadata = super::metadata::TableVersionMetadata::from_dataset(
+            root_uri,
+            &reg.table_path,
+            &head_ds,
+        )?;
+
+        updates.push(ManifestChange::RegisterTable(TableRegistration {
+            table_key: reg.table_key.clone(),
+            table_path: reg.table_path.clone(),
+        }));
+        updates.push(ManifestChange::Update(SubTableUpdate {
+            table_key: reg.table_key.clone(),
+            table_version: head_version,
+            table_branch: reg.table_branch.clone(),
+            row_count,
+            version_metadata,
+        }));
+        // No prior manifest entry expected for an added table.
+        expected.insert(reg.table_key.clone(), 0);
+        published_versions.insert(reg.table_key.clone(), head_version);
+    }
+
+    // SchemaApply-only: tombstone removed types (and renamed sources).
+    //
+    // Filtered against `snapshot`: when the manifest no longer has an
+    // entry for `tomb.table_key`, the tombstone has already landed in
+    // a prior recovery / the writer's Phase C — skip emit so the
+    // publisher doesn't error on a redundant tombstone.
+    for tomb in &sidecar.tombstones {
+        if snapshot.entry(&tomb.table_key).is_none() {
+            continue;
+        }
+        updates.push(ManifestChange::Tombstone(TableTombstone {
+            table_key: tomb.table_key.clone(),
+            tombstone_version: tomb.tombstone_version,
+        }));
+        // Tombstone CAS pre-check expects the table to be at
+        // `tombstone_version - 1` (the pre-tombstone version, since
+        // schema_apply sets `tombstone_version = source.table_version + 1`).
+        expected.insert(
+            tomb.table_key.clone(),
+            tomb.tombstone_version.saturating_sub(1),
+        );
+    }
+
+    let publisher = GraphNamespacePublisher::new(root_uri, sidecar.branch.as_deref());
+    let new_dataset = publisher.publish(&updates, &expected).await?;
+    Ok((new_dataset.version().version, published_versions))
+}
+
+/// Append the audit row describing this recovery action.
+///
+/// Two-part write: (a) `_graph_commits.lance` row anchored on the recovery
+/// actor (`omnigraph:recovery`); (b) `_graph_commit_recoveries.lance` row
+/// linking back to (a) and naming the original actor + per-table outcomes.
+/// Same not-atomic-pair-write shape as the existing `_graph_commits`
+/// + `_graph_commit_actors` split — a crash between the two leaves an
+/// orphan commit row with no audit row. The recovery sweep tolerates this:
+/// on re-entry the classifier surfaces `NoMovement` for already-restored /
+/// already-published tables, the action is a no-op, and the audit append
+/// is retried.
+async fn record_audit(
+    root_uri: &str,
+    sidecar: &RecoverySidecar,
+    manifest_version: u64,
+    kind: RecoveryKind,
+    outcomes: Vec<TableOutcome>,
+) -> Result<()> {
+    // Non-main recovery commits must be appended on the sidecar branch's
+    // commit graph, otherwise parent_commit_id comes from the global
+    // main head. BranchMerge additionally records the source branch's
+    // HEAD as merged_parent_commit_id so future merges between the same
+    // pair recognize "already up-to-date".
+    let target_branch = sidecar.branch.as_deref();
+    let mut graph = match target_branch {
+        Some(branch) => CommitGraph::open_at_branch(root_uri, branch).await?,
+        None => CommitGraph::open(root_uri).await?,
+    };
+    let graph_commit_id = match (
+        sidecar.writer_kind,
+        sidecar.merge_source_commit_id.as_deref(),
+        kind,
+    ) {
+        (SidecarKind::BranchMerge, Some(source_id), RecoveryKind::RolledForward) => {
+            let parent_commit_id = graph.head_commit_id().await?.unwrap_or_default();
+            graph
+                .append_merge_commit(
+                    target_branch,
+                    manifest_version,
+                    &parent_commit_id,
+                    source_id,
+                    Some(RECOVERY_ACTOR),
+                )
+                .await?
+        }
+        _ => {
+            graph
+                .append_commit(target_branch, manifest_version, Some(RECOVERY_ACTOR))
+                .await?
+        }
+    };
+    let mut audit = RecoveryAudit::open(root_uri).await?;
+    audit
+        .append(RecoveryAuditRecord {
+            graph_commit_id,
+            recovery_kind: kind,
+            recovery_for_actor: sidecar.actor_id.clone(),
+            operation_id: sidecar.operation_id.clone(),
+            sidecar_writer_kind: format!("{:?}", sidecar.writer_kind),
+            per_table_outcomes: outcomes,
+            created_at: now_micros()?,
+        })
+        .await?;
+    Ok(())
+}
+
+/// Returns `true` if any `SchemaApply` sidecar is present in
+/// `__recovery/`. Schema-state recovery (`recover_schema_state_files`)
+/// uses this to skip its normal pre-vs-post-commit disambiguation —
+/// when a SchemaApply sidecar is present, we know the writer reached
+/// Phase B (Lance HEADs advanced) but didn't complete Phase C (manifest
+/// publish + staging→final renames). The right action is to complete
+/// the rename so the recovery sweep's roll-forward step sees the new
+/// catalog. Without this, the disambiguation logic deletes the staging
+/// files (since manifest still pins the old table set) and leaves the
+/// repo with new-schema data on disk but the old `_schema.pg` live —
+/// real corruption.
+pub(crate) async fn has_schema_apply_sidecar(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+) -> Result<bool> {
+    let sidecars = list_sidecars(root_uri, storage).await?;
+    Ok(sidecars
+        .iter()
+        .any(|s| matches!(s.writer_kind, SidecarKind::SchemaApply)))
+}
+
+/// Open the Lance dataset at `table_path` checked out at the given
+/// branch ref (or default if `branch` is None or "main") and return its
+/// HEAD version. Recovery uses this so feature-branch sidecars classify
+/// against the feature-branch's Lance HEAD, not main's.
+async fn open_lance_head(table_path: &str, branch: Option<&str>) -> Result<u64> {
+    let ds = Dataset::open(table_path)
+        .await
+        .map_err(|e| OmniError::Lance(e.to_string()))?;
+    let ds = match branch {
+        Some(b) if b != "main" => ds
+            .checkout_branch(b)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?,
+        _ => ds,
+    };
+    Ok(ds.version().version)
+}
+
+async fn delete_sidecar_by_operation_id(
+    root_uri: &str,
+    storage: &dyn StorageAdapter,
+    operation_id: &str,
+) -> Result<()> {
+    storage.delete(&sidecar_uri(root_uri, operation_id)).await
+}
+
+/// Convenience: build a [`RecoverySidecar`] with auto-generated
+/// `operation_id` and `started_at`. The caller fills in the other fields.
+pub(crate) fn new_sidecar(
+    writer_kind: SidecarKind,
+    branch: Option<String>,
+    actor_id: Option<String>,
+    tables: Vec<SidecarTablePin>,
+) -> RecoverySidecar {
+    use std::time::{SystemTime, UNIX_EPOCH};
+    let operation_id = ulid::Ulid::new().to_string();
+    let started_at = match SystemTime::now().duration_since(UNIX_EPOCH) {
+        Ok(d) => format!("{}", d.as_micros()),
+        Err(_) => "0".to_string(),
+    };
+    RecoverySidecar {
+        schema_version: SIDECAR_SCHEMA_VERSION,
+        operation_id,
+        started_at,
+        branch,
+        actor_id,
+        writer_kind,
+        tables,
+        merge_source_commit_id: None,
+        additional_registrations: Vec::new(),
+        tombstones: Vec::new(),
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::storage::LocalStorageAdapter;
+    use crate::table_store::TableStore;
+    use arrow_array::{Int32Array, RecordBatch, StringArray};
+    use arrow_schema::{DataType, Field, Schema};
+    use std::sync::Arc;
+
+    fn person_schema() -> Arc<Schema> {
+        Arc::new(Schema::new(vec![
+            Field::new("id", DataType::Utf8, false),
+            Field::new("age", DataType::Int32, true),
+        ]))
+    }
+
+    fn person_batch(rows: &[(&str, Option<i32>)]) -> RecordBatch {
+        let ids: Vec<&str> = rows.iter().map(|(id, _)| *id).collect();
+        let ages: Vec<Option<i32>> = rows.iter().map(|(_, age)| *age).collect();
+        RecordBatch::try_new(
+            person_schema(),
+            vec![
+                Arc::new(StringArray::from(ids)),
+                Arc::new(Int32Array::from(ages)),
+            ],
+        )
+        .unwrap()
+    }
+
+    fn make_pin(table_key: &str, table_path: &str, expected: u64, post: u64) -> SidecarTablePin {
+        SidecarTablePin {
+            table_key: table_key.to_string(),
+            table_path: table_path.to_string(),
+            expected_version: expected,
+            post_commit_pin: post,
+            table_branch: None,
+        }
+    }
+
+    #[test]
+    fn sidecar_round_trips_through_json() {
+        let original = new_sidecar(
+            SidecarKind::Mutation,
+            Some("main".to_string()),
+            Some("act-alice".to_string()),
+            vec![make_pin("node:Person", "file:///tmp/people.lance", 5, 6)],
+        );
+        let json = serde_json::to_string(&original).unwrap();
+        let parsed = parse_sidecar("file:///tmp/__recovery/x.json", &json).unwrap();
+        assert_eq!(parsed.schema_version, SIDECAR_SCHEMA_VERSION);
+        assert_eq!(parsed.operation_id, original.operation_id);
+        assert_eq!(parsed.writer_kind, SidecarKind::Mutation);
+        assert_eq!(parsed.branch.as_deref(), Some("main"));
+        assert_eq!(parsed.actor_id.as_deref(), Some("act-alice"));
+        assert_eq!(parsed.tables.len(), 1);
+        assert_eq!(parsed.tables[0].table_key, "node:Person");
+    }
+
+    #[test]
+    fn parse_sidecar_refuses_unknown_schema_version() {
+        let body = r#"{
+            "schema_version": 99,
+            "operation_id": "01H000000000000000000000XX",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": null,
+            "writer_kind": "Mutation",
+            "tables": []
+        }"#;
+        let err = parse_sidecar("file:///tmp/__recovery/x.json", body).unwrap_err();
+        let msg = err.to_string();
+        assert!(
+            msg.contains("schema_version=99") && msg.contains("supports only schema_version=1"),
+            "expected SidecarSchemaError mentioning the version mismatch, got: {}",
+            msg,
+        );
+    }
+
+    #[test]
+    fn classify_no_movement_when_head_equals_pinned() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 5, 5, SidecarKind::Mutation),
+            TableClassification::NoMovement,
+        );
+    }
+
+    #[test]
+    fn classify_rolled_past_expected_when_sidecar_matches_strict() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 6, 5, SidecarKind::Mutation),
+            TableClassification::RolledPastExpected,
+        );
+    }
+
+    #[test]
+    fn classify_unexpected_at_p1_when_sidecar_does_not_match_strict() {
+        // Same +1 drift but post_commit_pin says it should be 7, not 6.
+        let pin = make_pin("node:Person", "irrelevant", 5, 7);
+        assert_eq!(
+            classify_table(&pin, 6, 5, SidecarKind::Mutation),
+            TableClassification::UnexpectedAtP1,
+        );
+    }
+
+    #[test]
+    fn classify_unexpected_multistep_when_head_jumped_more_than_one_strict() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 8, 5, SidecarKind::Mutation),
+            TableClassification::UnexpectedMultistep,
+        );
+    }
+
+    #[test]
+    fn classify_invariant_violation_when_head_below_pinned() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 3, 5, SidecarKind::Mutation),
+            TableClassification::InvariantViolation { observed: 3 },
+        );
+    }
+
+    // Loose-match writers (SchemaApply, EnsureIndices) accept any
+    // lance_head > expected_version as RolledPastExpected when the
+    // expected version still matches the manifest pin. The exact
+    // post_commit_pin is allowed to be a lower bound.
+    #[test]
+    fn classify_loose_match_accepts_multi_commit_drift_for_schema_apply() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        // Sidecar's post_commit_pin says 6, but Lance HEAD is 8 (SchemaApply
+        // built two indices). Strict would say UnexpectedMultistep; loose
+        // accepts it as RolledPastExpected.
+        assert_eq!(
+            classify_table(&pin, 8, 5, SidecarKind::SchemaApply),
+            TableClassification::RolledPastExpected,
+        );
+    }
+
+    #[test]
+    fn classify_loose_match_accepts_multi_commit_drift_for_ensure_indices() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 9, 5, SidecarKind::EnsureIndices),
+            TableClassification::RolledPastExpected,
+        );
+    }
+
+    #[test]
+    fn classify_loose_match_no_movement_unchanged() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 5, 5, SidecarKind::SchemaApply),
+            TableClassification::NoMovement,
+        );
+    }
+
+    #[test]
+    fn classify_loose_match_invariant_violation_unchanged() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 3, 5, SidecarKind::SchemaApply),
+            TableClassification::InvariantViolation { observed: 3 },
+        );
+    }
+
+    /// BranchMerge must be loose-matched, not strict: while the strict
+    /// classifier expects exactly one `commit_staged` per table,
+    /// `publish_rewritten_merge_table` runs multiple per table
+    /// (merge_insert + delete_where + index rebuilds — the comment in
+    /// `merge.rs` explicitly says so). Strict classification would roll
+    /// back valid completed Phase B work as `UnexpectedMultistep`.
+    #[test]
+    fn classify_loose_match_accepts_multi_commit_drift_for_branch_merge() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 8, 5, SidecarKind::BranchMerge),
+            TableClassification::RolledPastExpected,
+        );
+    }
+
+    #[test]
+    fn classify_loose_match_branch_merge_no_movement_unchanged() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 5, 5, SidecarKind::BranchMerge),
+            TableClassification::NoMovement,
+        );
+    }
+
+    #[test]
+    fn classify_loose_match_branch_merge_invariant_violation_unchanged() {
+        let pin = make_pin("node:Person", "irrelevant", 5, 6);
+        assert_eq!(
+            classify_table(&pin, 3, 5, SidecarKind::BranchMerge),
+            TableClassification::InvariantViolation { observed: 3 },
+        );
+    }
+
+    #[test]
+    fn decide_roll_forward_when_all_classifications_match() {
+        let cls = vec![
+            TableClassification::RolledPastExpected,
+            TableClassification::RolledPastExpected,
+        ];
+        assert_eq!(decide(&cls), SidecarDecision::RollForward);
+    }
+
+    #[test]
+    fn decide_roll_back_on_mid_phase_b_crash_mix() {
+        // Mid-Phase-B crash: one table iterated (RolledPastExpected),
+        // another not yet iterated (NoMovement).
+        let cls = vec![
+            TableClassification::RolledPastExpected,
+            TableClassification::NoMovement,
+        ];
+        assert_eq!(decide(&cls), SidecarDecision::RollBack);
+    }
+
+    #[test]
+    fn decide_roll_back_on_unexpected_at_p1() {
+        let cls = vec![
+            TableClassification::RolledPastExpected,
+            TableClassification::UnexpectedAtP1,
+        ];
+        assert_eq!(decide(&cls), SidecarDecision::RollBack);
+    }
+
+    #[test]
+    fn decide_abort_on_invariant_violation() {
+        let cls = vec![
+            TableClassification::RolledPastExpected,
+            TableClassification::InvariantViolation { observed: 1 },
+        ];
+        assert_eq!(decide(&cls), SidecarDecision::Abort);
+    }
+
+    #[test]
+    fn decide_roll_forward_on_empty_slice() {
+        // Degenerate case: no tables in the sidecar. Vacuously RollForward
+        // (and the executor will iterate zero tables).
+        assert_eq!(decide(&[]), SidecarDecision::RollForward);
+    }
+
+    #[tokio::test]
+    async fn restore_table_to_version_appends_one_commit() {
+        let dir = tempfile::tempdir().unwrap();
+        let uri = format!("{}/people.lance", dir.path().to_str().unwrap());
+        let store = TableStore::new(dir.path().to_str().unwrap());
+
+        let mut ds = TableStore::write_dataset(&uri, person_batch(&[("alice", Some(30))]))
+            .await
+            .unwrap();
+        store
+            .append_batch(&uri, &mut ds, person_batch(&[("bob", Some(25))]))
+            .await
+            .unwrap();
+        store
+            .append_batch(&uri, &mut ds, person_batch(&[("carol", Some(40))]))
+            .await
+            .unwrap();
+        let head_before = ds.version().version;
+        assert_eq!(head_before, 3);
+
+        restore_table_to_version(&uri, None, 1).await.unwrap();
+
+        let post = Dataset::open(&uri).await.unwrap();
+        assert_eq!(post.version().version, head_before + 1);
+        // Content matches v1 (just alice).
+        let scanner = post.scan();
+        let batches: Vec<RecordBatch> =
+            futures::TryStreamExt::try_collect(scanner.try_into_stream().await.unwrap())
+                .await
+                .unwrap();
+        let total: usize = batches.iter().map(|b| b.num_rows()).sum();
+        assert_eq!(total, 1);
+    }
+
+    #[tokio::test]
+    async fn restore_table_to_version_always_appends_a_commit() {
+        // Restore is unconditional — equal fragment IDs do NOT imply
+        // equal content (Lance index commits and deletion-vector
+        // updates change the manifest without touching fragment IDs).
+        // Repeated restore calls each produce a new HEAD+1 commit.
+        let dir = tempfile::tempdir().unwrap();
+        let uri = format!("{}/people.lance", dir.path().to_str().unwrap());
+        let store = TableStore::new(dir.path().to_str().unwrap());
+
+        let mut ds = TableStore::write_dataset(&uri, person_batch(&[("alice", Some(30))]))
+            .await
+            .unwrap();
+        store
+            .append_batch(&uri, &mut ds, person_batch(&[("bob", Some(25))]))
+            .await
+            .unwrap();
+        // First restore: HEAD goes from 2 to 3 (with content == v1).
+        restore_table_to_version(&uri, None, 1).await.unwrap();
+        let mid = Dataset::open(&uri).await.unwrap().version().version;
+        assert_eq!(mid, 3);
+
+        // Second restore to v1: still appends a commit (HEAD = 4) because
+        // restore is unconditional. The pile-up is bounded and reclaimed
+        // by `omnigraph cleanup`.
+        restore_table_to_version(&uri, None, 1).await.unwrap();
+        let post = Dataset::open(&uri).await.unwrap().version().version;
+        assert_eq!(
+            post,
+            mid + 1,
+            "restore must always append a commit (no fragment-set short-circuit)"
+        );
+    }
+
+    #[tokio::test]
+    async fn list_sidecars_returns_empty_when_dir_missing() {
+        let dir = tempfile::tempdir().unwrap();
+        let storage = LocalStorageAdapter::default();
+        let result = list_sidecars(dir.path().to_str().unwrap(), &storage)
+            .await
+            .unwrap();
+        assert!(result.is_empty());
+    }
+
+    #[tokio::test]
+    async fn write_then_list_then_delete_round_trip() {
+        let dir = tempfile::tempdir().unwrap();
+        // Create the __recovery/ subdir so write_sidecar's parent exists
+        // (LocalStorageAdapter::write_text doesn't mkdir parents).
+        std::fs::create_dir(dir.path().join(RECOVERY_DIR_NAME)).unwrap();
+        let storage = LocalStorageAdapter::default();
+        let root = dir.path().to_str().unwrap();
+
+        let sidecar = new_sidecar(
+            SidecarKind::Mutation,
+            Some("main".to_string()),
+            Some("act-alice".to_string()),
+            vec![make_pin("node:Person", "file:///tmp/x.lance", 5, 6)],
+        );
+        let handle = write_sidecar(root, &storage, &sidecar).await.unwrap();
+        assert_eq!(handle.operation_id, sidecar.operation_id);
+
+        let listed = list_sidecars(root, &storage).await.unwrap();
+        assert_eq!(listed.len(), 1);
+        assert_eq!(listed[0].operation_id, sidecar.operation_id);
+
+        delete_sidecar(&handle, &storage).await.unwrap();
+        let after = list_sidecars(root, &storage).await.unwrap();
+        assert!(after.is_empty());
+    }
+
+    #[tokio::test]
+    async fn list_sidecars_skips_non_json_files() {
+        let dir = tempfile::tempdir().unwrap();
+        std::fs::create_dir(dir.path().join(RECOVERY_DIR_NAME)).unwrap();
+        // Drop a non-JSON file the sweep must ignore (e.g., .DS_Store).
+        std::fs::write(
+            dir.path().join(RECOVERY_DIR_NAME).join(".DS_Store"),
+            "noise",
+        )
+        .unwrap();
+        let storage = LocalStorageAdapter::default();
+        let result = list_sidecars(dir.path().to_str().unwrap(), &storage)
+            .await
+            .unwrap();
+        assert!(result.is_empty());
+    }
+
+    /// `list_dir` returns filesystem-order results, which would make
+    /// sidecar processing nondeterministic. Sidecar filenames are ULIDs
+    /// (lexicographically sortable === chronologically sortable), so
+    /// sorting by URI gives deterministic, time-ordered processing —
+    /// the older sidecar processed before the newer one.
+    #[tokio::test]
+    async fn list_sidecars_returns_deterministic_order() {
+        let dir = tempfile::tempdir().unwrap();
+        std::fs::create_dir(dir.path().join(RECOVERY_DIR_NAME)).unwrap();
+        let storage = LocalStorageAdapter::default();
+        let root = dir.path().to_str().unwrap();
+
+        // Write sidecars in REVERSE chronological order (newest first).
+        // The classifier shouldn't care, but the sweep needs deterministic
+        // processing for reproducibility.
+        let ids = [
+            "01H000000000000000000000ZZ",
+            "01H000000000000000000000MM",
+            "01H000000000000000000000AA",
+        ];
+        for id in &ids {
+            let sc = new_sidecar(
+                SidecarKind::Mutation,
+                None,
+                None,
+                vec![make_pin("node:Person", "/dev/null/x.lance", 1, 2)],
+            );
+            // Override operation_id to use our deterministic ID.
+            let mut sc = sc;
+            sc.operation_id = id.to_string();
+            write_sidecar(root, &storage, &sc).await.unwrap();
+        }
+
+        let listed = list_sidecars(root, &storage).await.unwrap();
+        let listed_ids: Vec<&str> = listed.iter().map(|s| s.operation_id.as_str()).collect();
+        let mut sorted_ids = listed_ids.clone();
+        sorted_ids.sort();
+        assert_eq!(
+            listed_ids, sorted_ids,
+            "list_sidecars must return sidecars in deterministic (sorted) order; got {:?}",
+            listed_ids,
+        );
+    }
+}
diff --git a/crates/omnigraph/src/db/mod.rs b/crates/omnigraph/src/db/mod.rs
index 8ce5576..7a335fd 100644
--- a/crates/omnigraph/src/db/mod.rs
+++ b/crates/omnigraph/src/db/mod.rs
@@ -2,6 +2,7 @@ pub mod commit_graph;
 pub mod graph_coordinator;
 pub mod manifest;
 mod omnigraph;
+mod recovery_audit;
 mod run_registry;
 mod schema_state;
 
@@ -9,8 +10,8 @@ pub use commit_graph::GraphCommit;
 pub use graph_coordinator::{GraphCoordinator, ReadTarget, ResolvedTarget, SnapshotId};
 pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate};
 pub use omnigraph::{
-    CleanupPolicyOptions, MergeOutcome, Omnigraph, SchemaApplyResult, TableCleanupStats,
-    TableOptimizeStats,
+    CleanupPolicyOptions, MergeOutcome, Omnigraph, OpenMode, SchemaApplyResult,
+    TableCleanupStats, TableOptimizeStats,
 };
 pub(crate) use omnigraph::ensure_public_branch_ref;
 pub(crate) use run_registry::is_internal_run_branch;
diff --git a/crates/omnigraph/src/db/omnigraph.rs b/crates/omnigraph/src/db/omnigraph.rs
index 32f1234..e54b6eb 100644
--- a/crates/omnigraph/src/db/omnigraph.rs
+++ b/crates/omnigraph/src/db/omnigraph.rs
@@ -81,6 +81,24 @@ pub struct Omnigraph {
     pub(crate) audit_actor_id: Option<String>,
 }
 
+/// Whether [`Omnigraph::open`] runs the open-time recovery sweep.
+///
+/// Recovery requires Lance writes (`Dataset::restore`, `ManifestBatchPublisher::publish`).
+/// Read-only consumers — NDJSON export, `commit list`, `read`, schema
+/// inspection — should not trigger writes (they may run with read-only
+/// object-store credentials, and silent open-time mutations are
+/// surprising). They also don't need recovery: reads always resolve
+/// through the manifest pin, which is the consistent snapshot regardless
+/// of any Phase B → Phase C drift on the per-table side.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum OpenMode {
+    /// Run the recovery sweep on open. Default for `Omnigraph::open`.
+    ReadWrite,
+    /// Skip the recovery sweep. Use for read-only consumers via
+    /// [`Omnigraph::open_read_only`].
+    ReadOnly,
+}
+
 impl Omnigraph {
     /// Create a new repo at `uri` from schema source.
     ///
@@ -119,24 +137,65 @@ impl Omnigraph {
         })
     }
 
-    /// Open an existing repo.
+    /// Open an existing repo (read-write).
     ///
     /// Reads `_schema.pg`, parses it, builds the catalog, and opens `__manifest`.
+    /// Runs the open-time recovery sweep before returning — see [`OpenMode`].
     pub async fn open(uri: &str) -> Result<Self> {
-        Self::open_with_storage(uri, storage_for_uri(uri)?).await
+        Self::open_with_storage_and_mode(uri, storage_for_uri(uri)?, OpenMode::ReadWrite).await
     }
 
+    /// Open an existing repo for read-only consumers (NDJSON export,
+    /// `commit list`, etc.). Skips the recovery sweep — see [`OpenMode`].
+    pub async fn open_read_only(uri: &str) -> Result<Self> {
+        Self::open_with_storage_and_mode(uri, storage_for_uri(uri)?, OpenMode::ReadOnly).await
+    }
+
+    /// `open_with_storage` retained for existing callers (init/test paths).
+    /// Defaults to `OpenMode::ReadWrite`.
     pub(crate) async fn open_with_storage(
         uri: &str,
         storage: Arc<dyn StorageAdapter>,
+    ) -> Result<Self> {
+        Self::open_with_storage_and_mode(uri, storage, OpenMode::ReadWrite).await
+    }
+
+    pub(crate) async fn open_with_storage_and_mode(
+        uri: &str,
+        storage: Arc<dyn StorageAdapter>,
+        mode: OpenMode,
     ) -> Result<Self> {
         let root = normalize_root_uri(uri)?;
         // Open the coordinator first so the schema-staging recovery sweep can
-        // compare its snapshot against any leftover staging files. Recovery
-        // either deletes staging (pre-commit crash) or completes the rename
-        // (post-commit crash) before the live schema files are read.
-        let coordinator = GraphCoordinator::open(&root, Arc::clone(&storage)).await?;
-        recover_schema_state_files(&root, Arc::clone(&storage), &coordinator.snapshot()).await?;
+        // compare its snapshot against any leftover staging files.
+        let mut coordinator = GraphCoordinator::open(&root, Arc::clone(&storage)).await?;
+        // Both the schema-state recovery sweep AND the manifest-drift
+        // recovery sweep are gated on `OpenMode::ReadWrite`. Read-only
+        // consumers (NDJSON export, `commit list`, schema show) shouldn't
+        // trigger object-store mutations: they may run with read-only
+        // credentials, and silent open-time writes are surprising. Both
+        // sweeps' work is recoverable on the next ReadWrite open, so
+        // skipping under ReadOnly doesn't lose any safety guarantees —
+        // the manifest pin is the consistent snapshot regardless of
+        // drift on the per-table side or leftover schema-staging files.
+        if matches!(mode, OpenMode::ReadWrite) {
+            let schema_state_recovery =
+                recover_schema_state_files(&root, Arc::clone(&storage), &coordinator.snapshot())
+                    .await?;
+            // Recovery sweep: close the Phase B → Phase C residual on
+            // any sidecar left over from a crashed writer. Continuous
+            // in-process recovery for long-running servers (no restart
+            // required between Phase B failure and recovery) is a
+            // separate background-reconciler effort.
+            crate::db::manifest::recover_manifest_drift(
+                &root,
+                Arc::clone(&storage),
+                &mut coordinator,
+                crate::db::manifest::RecoveryMode::Full,
+                schema_state_recovery,
+            )
+            .await?;
+        }
         // Read _schema.pg (post-recovery — may have just been renamed in).
         let schema_path = schema_source_uri(&root);
         let schema_source = storage.read_text(&schema_path).await?;
@@ -202,17 +261,29 @@ impl Omnigraph {
 
     /// Engine-facing trait surface around `TableStore`.
     ///
-    /// MR-793 Phase 1: this is the canonical accessor for newly-written
-    /// engine code. The trait's signatures use opaque `SnapshotHandle` /
-    /// `StagedHandle` instead of leaking `lance::Dataset` /
+    /// This is the canonical accessor for newly-written engine code. The
+    /// trait's signatures use opaque `SnapshotHandle` / `StagedHandle`
+    /// instead of leaking `lance::Dataset` /
     /// `lance::dataset::transaction::Transaction`. Existing call sites
     /// that still use `db.table_store.X(...)` (the inherent struct
-    /// methods) are migrated incrementally — see §9 of
-    /// `.context/mr-793-design.md`.
+    /// methods) are migrated incrementally.
     pub(crate) fn storage(&self) -> &dyn crate::storage_layer::TableStorage {
         &self.table_store
     }
 
+    /// Engine-level access to the object-store adapter (S3 / local fs).
+    /// Used by the recovery sidecar protocol — writers in the engine
+    /// call this to write/delete sidecars at `__recovery/{ulid}.json`.
+    pub(crate) fn storage_adapter(&self) -> &dyn crate::storage::StorageAdapter {
+        self.storage.as_ref()
+    }
+
+    /// Engine-level access to the repo's normalized root URI. Used by
+    /// the recovery sidecar protocol to compute `__recovery/` paths.
+    pub(crate) fn root_uri(&self) -> &str {
+        &self.root_uri
+    }
+
     pub(crate) async fn open_coordinator_for_branch(
         &self,
         branch: Option<&str>,
@@ -306,8 +377,89 @@ impl Omnigraph {
         Ok(())
     }
 
-    /// Re-read the handle-local coordinator state from storage.
-    pub(crate) async fn refresh(&mut self) -> Result<()> {
+    /// Re-read the handle-local coordinator state from storage AND run
+    /// in-process recovery. Closes the Phase B → Phase C residual (e.g.
+    /// `MutationStaging::finalize` crash mid-publish in a long-running
+    /// server) without restart.
+    ///
+    /// Composition mirrors `Omnigraph::open_with_storage_and_mode`'s
+    /// recovery sequence, in the same order, with one restriction: the
+    /// manifest-drift sweep runs in `RollForwardOnly` mode (rollback /
+    /// abort cases defer to the next ReadWrite open because
+    /// `Dataset::restore` is unsafe under concurrency). Each step:
+    ///
+    /// 1. `coordinator.refresh()` — re-read manifest.
+    /// 2. `recover_schema_state_files` — complete an in-flight
+    ///    schema_apply's staging→final rename if a SchemaApply sidecar
+    ///    is on disk; idempotent + early-returns when no staging files
+    ///    exist. Required BEFORE manifest-drift recovery so a
+    ///    SchemaApply roll-forward doesn't publish the manifest while
+    ///    the staging files remain unrenamed (which would corrupt the
+    ///    repo: data on new schema, catalog on old).
+    /// 3. `recover_manifest_drift(... RollForwardOnly)` — close the
+    ///    finalize→publisher residual via roll-forward; defer rollback
+    ///    work to next ReadWrite open.
+    /// 4. `runtime_cache.invalidate_all` — drop stale per-snapshot caches.
+    ///
+    /// Steady state cost: one `list_dir` of `__recovery/` (typically
+    /// returns empty → early return for both passes). No additional
+    /// Lance reads.
+    ///
+    /// Engine-internal callers that already hold an in-flight sidecar
+    /// (e.g. `schema_apply` mid-write) MUST use
+    /// [`refresh_coordinator_only`](Self::refresh_coordinator_only) to
+    /// avoid the recovery sweep racing their own sidecar.
+    pub async fn refresh(&mut self) -> Result<()> {
+        self.coordinator.refresh().await?;
+        let schema_state_recovery = recover_schema_state_files(
+            &self.root_uri,
+            Arc::clone(&self.storage),
+            &self.coordinator.snapshot(),
+        )
+        .await?;
+        crate::db::manifest::recover_manifest_drift(
+            &self.root_uri,
+            Arc::clone(&self.storage),
+            &mut self.coordinator,
+            crate::db::manifest::RecoveryMode::RollForwardOnly,
+            schema_state_recovery,
+        )
+        .await?;
+        self.reload_schema_if_source_changed().await?;
+        self.runtime_cache.invalidate_all().await;
+        Ok(())
+    }
+
+    async fn reload_schema_if_source_changed(&mut self) -> Result<()> {
+        let schema_path = schema_source_uri(&self.root_uri);
+        let schema_source = self.storage.read_text(&schema_path).await?;
+        if schema_source == self.schema_source {
+            return Ok(());
+        }
+        let current_source_ir = read_schema_ir_from_source(&schema_source)?;
+        let branches = self.coordinator.branch_list().await?;
+        let (accepted_ir, _) = load_or_bootstrap_schema_contract(
+            &self.root_uri,
+            Arc::clone(&self.storage),
+            &branches,
+            &current_source_ir,
+        )
+        .await?;
+        let mut catalog = build_catalog_from_ir(&accepted_ir)?;
+        fixup_blob_schemas(&mut catalog);
+        self.schema_source = schema_source;
+        self.catalog = catalog;
+        Ok(())
+    }
+
+    /// Refresh coordinator state and invalidate the runtime cache WITHOUT
+    /// running the recovery sweep. Engine-internal callers that hold an
+    /// in-flight sidecar (e.g. `schema_apply::apply_schema_with_lock`'s
+    /// internal lease-check refresh) need this variant: running recovery
+    /// here would observe the caller's own sidecar, classify it as
+    /// RolledPastExpected, and roll it forward — racing the caller's
+    /// own publish path.
+    pub(crate) async fn refresh_coordinator_only(&mut self) -> Result<()> {
         self.coordinator.refresh().await?;
         self.runtime_cache.invalidate_all().await;
         Ok(())
@@ -462,6 +614,23 @@ impl Omnigraph {
         table_ops::ensure_indices_on(self, branch).await
     }
 
+    #[cfg(feature = "failpoints")]
+    #[doc(hidden)]
+    pub async fn failpoint_publish_table_head_without_index_rebuild_for_test(
+        &mut self,
+        branch: &str,
+        table_key: &str,
+        table_branch: Option<&str>,
+    ) -> Result<u64> {
+        table_ops::failpoint_publish_table_head_without_index_rebuild_for_test(
+            self,
+            branch,
+            table_key,
+            table_branch,
+        )
+        .await
+    }
+
     /// Compact small Lance fragments into fewer larger ones across every
     /// node + edge table on `main`. See [`optimize`] for details.
     pub async fn optimize(&mut self) -> Result<Vec<optimize::TableOptimizeStats>> {
@@ -840,7 +1009,6 @@ impl Omnigraph {
     pub(crate) async fn invalidate_graph_index(&self) {
         table_ops::invalidate_graph_index(self).await
     }
-
 }
 
 pub(crate) fn normalize_branch_name(branch: &str) -> Result<Option<String>> {
@@ -1276,6 +1444,10 @@ edge WorksAt: Person -> Company
             self.deletes.lock().unwrap().push(uri.to_string());
             self.inner.delete(uri).await
         }
+
+        async fn list_dir(&self, dir_uri: &str) -> Result<Vec<String>> {
+            self.inner.list_dir(dir_uri).await
+        }
     }
 
     #[tokio::test]
diff --git a/crates/omnigraph/src/db/omnigraph/schema_apply.rs b/crates/omnigraph/src/db/omnigraph/schema_apply.rs
index 36f4b01..e35258c 100644
--- a/crates/omnigraph/src/db/omnigraph/schema_apply.rs
+++ b/crates/omnigraph/src/db/omnigraph/schema_apply.rs
@@ -151,6 +151,91 @@ pub(super) async fn apply_schema_with_lock(
     let mut table_updates = HashMap::<String, crate::db::SubTableUpdate>::new();
     let mut table_tombstones = HashMap::<String, u64>::new();
 
+    // Recovery sidecar: protect the per-table commit_staged loop in
+    // rewritten_tables + indexed_tables. The post_commit_pin we record
+    // here is a lower bound (expected + 1); the classifier loose-matches
+    // for SidecarKind::SchemaApply because the actual N depends on how
+    // many indices need building. See classify_table's loose-match arm.
+    let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = rewritten_tables
+        .iter()
+        .chain(indexed_tables.iter().filter(|t| {
+            !rewritten_tables.contains(*t)
+                && !added_tables.contains(*t)
+                && !renamed_tables.contains_key(*t)
+        }))
+        .filter_map(|table_key| {
+            let entry = snapshot.entry(table_key)?;
+            Some(crate::db::manifest::SidecarTablePin {
+                table_key: table_key.clone(),
+                table_path: db.table_store.dataset_uri(&entry.table_path),
+                expected_version: entry.table_version,
+                post_commit_pin: entry.table_version + 1,
+                table_branch: entry.table_branch.clone(),
+            })
+        })
+        .collect();
+    // Capture additional registrations + tombstones for the sidecar so
+    // recovery can publish them alongside the per-table updates. Without
+    // this, an added type's dataset is created in Phase B but the
+    // manifest never gains an entry for it after roll-forward — the
+    // live `_schema.pg` declares a type the manifest doesn't know about
+    // and reads through the engine fail with "no manifest entry for X".
+    let mut sidecar_registrations: Vec<crate::db::manifest::SidecarTableRegistration> = Vec::new();
+    for table_key in &added_tables {
+        sidecar_registrations.push(crate::db::manifest::SidecarTableRegistration {
+            table_key: table_key.clone(),
+            table_path: table_path_for_table_key(table_key)?,
+            table_branch: None,
+        });
+    }
+    for target_table_key in renamed_tables.keys() {
+        sidecar_registrations.push(crate::db::manifest::SidecarTableRegistration {
+            table_key: target_table_key.clone(),
+            table_path: table_path_for_table_key(target_table_key)?,
+            table_branch: None,
+        });
+    }
+    let mut sidecar_tombstones: Vec<crate::db::manifest::SidecarTombstone> = Vec::new();
+    for source_table_key in renamed_tables.values() {
+        let source_entry = snapshot.entry(source_table_key).ok_or_else(|| {
+            OmniError::manifest(format!(
+                "missing source table '{}' for schema rename when building recovery sidecar",
+                source_table_key
+            ))
+        })?;
+        sidecar_tombstones.push(crate::db::manifest::SidecarTombstone {
+            table_key: source_table_key.clone(),
+            tombstone_version: source_entry.table_version.saturating_add(1),
+        });
+    }
+
+    let recovery_handle = if recovery_pins.is_empty()
+        && sidecar_registrations.is_empty()
+        && sidecar_tombstones.is_empty()
+    {
+        None
+    } else {
+        // `branch=None` because schema_apply publishes against main —
+        // the `__schema_apply_lock__` branch is purely a serialization
+        // sentinel (acquire_schema_apply_lock creates it; the manifest
+        // publish via coordinator.commit_changes_with_actor below targets
+        // the coordinator's active branch, which is the pre-lock branch).
+        // If the lock release fires before recovery, the lock branch is
+        // gone — the sidecar must not reference it.
+        let mut sidecar = crate::db::manifest::new_sidecar(
+            crate::db::manifest::SidecarKind::SchemaApply,
+            None,
+            db.audit_actor_id.clone(),
+            recovery_pins,
+        );
+        sidecar.additional_registrations = sidecar_registrations;
+        sidecar.tombstones = sidecar_tombstones;
+        Some(
+            crate::db::manifest::write_sidecar(db.root_uri(), db.storage_adapter(), &sidecar)
+                .await?,
+        )
+    };
+
     for table_key in &added_tables {
         let table_path = table_path_for_table_key(table_key)?;
         let dataset_uri = db.table_store.dataset_uri(&table_path);
@@ -237,8 +322,8 @@ pub(super) async fn apply_schema_with_lock(
         )
         .await?;
         let dataset_uri = db.table_store.dataset_uri(&entry.table_path);
-        // MR-793 Phase 6: route through stage_overwrite + commit_staged
-        // for non-empty batches. Lance's `InsertBuilder::execute_uncommitted`
+        // Route through stage_overwrite + commit_staged for non-empty
+        // batches. Lance's `InsertBuilder::execute_uncommitted`
         // errors on empty data (lance-4.0.0 `src/dataset/write/insert.rs:144`),
         // so the empty-rewrite case stays on `overwrite_dataset` (which
         // accepts empty input). The empty case is rare in schema_apply
@@ -257,11 +342,7 @@ pub(super) async fn apply_schema_with_lock(
             // open the wrong HEAD here.
             let existing = db
                 .table_store
-                .open_dataset_head_for_write(
-                    table_key,
-                    &dataset_uri,
-                    entry.table_branch.as_deref(),
-                )
+                .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
                 .await?;
             let staged = db.table_store.stage_overwrite(&existing, batch).await?;
             db.table_store
@@ -336,7 +417,7 @@ pub(super) async fn apply_schema_with_lock(
         }));
     }
 
-    db.refresh().await?;
+    db.refresh_coordinator_only().await?;
     if db.version() != base_manifest_version {
         return Err(OmniError::manifest_conflict(format!(
             "schema apply lost its write lease: main advanced from v{} to v{} while schema apply was in progress",
@@ -353,6 +434,8 @@ pub(super) async fn apply_schema_with_lock(
     // `recover_schema_state_files`:
     //   - crash before commit  → manifest unchanged; staging deleted on open
     //   - crash after commit   → manifest advanced; staging renamed on open
+    crate::failpoints::maybe_fail("schema_apply.before_staging_write")?;
+
     let staging_pg_uri = schema_source_staging_uri(&db.root_uri);
     db.storage
         .write_text(&staging_pg_uri, desired_schema_source)
@@ -396,6 +479,23 @@ pub(super) async fn apply_schema_with_lock(
         db.invalidate_graph_index().await;
     }
 
+    // Recovery sidecar lifecycle: delete after the manifest commit
+    // succeeded. Best-effort: if this delete fails, the sidecar persists
+    // and on next open the sweep sees every table at the post-publish
+    // manifest pin (NoMovement) and the sidecar is treated as a stale
+    // artifact (recovery is a no-op and the sidecar is cleaned up).
+    // Failing the schema_apply call would report failure for a migration
+    // that already succeeded.
+    if let Some(handle) = recovery_handle {
+        if let Err(err) = crate::db::manifest::delete_sidecar(&handle, db.storage_adapter()).await {
+            tracing::warn!(
+                error = %err,
+                operation_id = handle.operation_id.as_str(),
+                "recovery sidecar cleanup failed; the next open's recovery sweep will resolve it"
+            );
+        }
+    }
+
     Ok(SchemaApplyResult {
         supported: true,
         applied: true,
@@ -405,13 +505,13 @@ pub(super) async fn apply_schema_with_lock(
 }
 
 pub(super) async fn ensure_schema_apply_idle(db: &mut Omnigraph, operation: &str) -> Result<()> {
-    db.refresh().await?;
+    db.refresh_coordinator_only().await?;
     ensure_schema_apply_not_locked(db, operation).await
 }
 
 pub(super) async fn acquire_schema_apply_lock(db: &mut Omnigraph) -> Result<()> {
     db.ensure_schema_state_valid().await?;
-    db.refresh().await?;
+    db.refresh_coordinator_only().await?;
     let branches = db.coordinator.all_branches().await?;
     if branches
         .iter()
@@ -425,7 +525,7 @@ pub(super) async fn acquire_schema_apply_lock(db: &mut Omnigraph) -> Result<()>
     db.coordinator
         .branch_create(SCHEMA_APPLY_LOCK_BRANCH)
         .await?;
-    db.refresh().await?;
+    db.refresh_coordinator_only().await?;
 
     let blocking_branches = db
         .coordinator
@@ -449,7 +549,12 @@ pub(super) async fn release_schema_apply_lock(db: &mut Omnigraph) -> Result<()>
     db.coordinator
         .branch_delete(SCHEMA_APPLY_LOCK_BRANCH)
         .await?;
-    db.refresh().await
+    // Use refresh_coordinator_only — the full Omnigraph::refresh would
+    // run roll-forward-only recovery, and on the failure path the
+    // in-flight schema_apply sidecar is still on disk; recovery would
+    // race the caller's own publish (or roll forward an aborted apply
+    // we want to leave for next-open).
+    db.refresh_coordinator_only().await
 }
 
 pub(super) async fn ensure_schema_apply_not_locked(db: &Omnigraph, operation: &str) -> Result<()> {
diff --git a/crates/omnigraph/src/db/omnigraph/table_ops.rs b/crates/omnigraph/src/db/omnigraph/table_ops.rs
index b343b4e..1e48d03 100644
--- a/crates/omnigraph/src/db/omnigraph/table_ops.rs
+++ b/crates/omnigraph/src/db/omnigraph/table_ops.rs
@@ -31,6 +31,37 @@ pub(super) async fn ensure_indices_on(db: &mut Omnigraph, branch: &str) -> Resul
     ensure_indices_for_branch(db, branch.as_deref()).await
 }
 
+#[cfg(feature = "failpoints")]
+pub(super) async fn failpoint_publish_table_head_without_index_rebuild_for_test(
+    db: &mut Omnigraph,
+    branch: &str,
+    table_key: &str,
+    table_branch: Option<&str>,
+) -> Result<u64> {
+    let branch = normalize_branch_name(branch)?;
+    let snapshot = db.snapshot_for_branch(branch.as_deref()).await?;
+    let entry = snapshot
+        .entry(table_key)
+        .ok_or_else(|| OmniError::manifest(format!("no manifest entry for {}", table_key)))?;
+    let full_path = format!("{}/{}", db.root_uri, entry.table_path);
+    let ds = db
+        .table_store
+        .open_dataset_head_for_write(table_key, &full_path, table_branch)
+        .await?;
+    let state = db.table_store.table_state(&full_path, &ds).await?;
+    let update = crate::db::SubTableUpdate {
+        table_key: table_key.to_string(),
+        table_version: state.version,
+        table_branch: table_branch.map(str::to_string),
+        row_count: state.row_count,
+        version_metadata: state.version_metadata,
+    };
+    let mut expected = std::collections::HashMap::new();
+    expected.insert(table_key.to_string(), entry.table_version);
+    commit_prepared_updates_on_branch_with_expected(db, branch.as_deref(), &[update], &expected)
+        .await
+}
+
 pub(super) async fn ensure_indices_for_branch(
     db: &mut Omnigraph,
     branch: Option<&str>,
@@ -42,6 +73,95 @@ pub(super) async fn ensure_indices_for_branch(
     let mut updates = Vec::new();
     let active_branch = resolved.branch;
 
+    // Recovery sidecar: protect the per-table commit_staged loop in
+    // build_indices_on_dataset (one commit per index built). Only pins
+    // tables that ACTUALLY need index work — the classifier
+    // loose-matches for SidecarKind::EnsureIndices (the actual N
+    // depends on which indices are missing), but if a table needs zero
+    // commits and gets pinned, the all-or-nothing decision rule
+    // classifies it as `NoMovement` and rolls back legitimately-
+    // committed work on sibling tables. Steady-state runs (everything
+    // already indexed) skip the sidecar entirely.
+    let mut recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = Vec::new();
+    for type_name in db.catalog.node_types.keys() {
+        let table_key = format!("node:{}", type_name);
+        let Some(entry) = snapshot.entry(&table_key) else {
+            continue;
+        };
+        // Match the processing loop's branch filter: when running on a
+        // feature branch, main-branch tables (table_branch = None) are
+        // skipped (`None => continue` at ~line 118). Pinning them here
+        // would force NoMovement on recovery and trigger an all-or-
+        // nothing rollback of legitimately-committed work on the
+        // feature-branch tables.
+        if active_branch.is_some() && entry.table_branch.is_none() {
+            continue;
+        }
+        let full_path = format!("{}/{}", db.root_uri, entry.table_path);
+        if needs_index_work_node(
+            db,
+            type_name,
+            &table_key,
+            &full_path,
+            entry.table_branch.as_deref(),
+        )
+        .await?
+        {
+            recovery_pins.push(crate::db::manifest::SidecarTablePin {
+                table_key,
+                table_path: full_path,
+                expected_version: entry.table_version,
+                post_commit_pin: entry.table_version + 1,
+                // Use active_branch (where commits actually land), NOT
+                // entry.table_branch (where the table currently lives).
+                // open_owned_dataset_for_branch_write forks a feature
+                // branch from a main-branch table on first write — the
+                // resulting commit lands on active_branch. Recovery's
+                // open_lance_head must check the same branch.
+                table_branch: active_branch.clone(),
+            });
+        }
+    }
+    for edge_name in db.catalog.edge_types.keys() {
+        let table_key = format!("edge:{}", edge_name);
+        let Some(entry) = snapshot.entry(&table_key) else {
+            continue;
+        };
+        if active_branch.is_some() && entry.table_branch.is_none() {
+            continue;
+        }
+        let full_path = format!("{}/{}", db.root_uri, entry.table_path);
+        if needs_index_work_edge(db, &table_key, &full_path, entry.table_branch.as_deref()).await? {
+            recovery_pins.push(crate::db::manifest::SidecarTablePin {
+                table_key,
+                table_path: full_path,
+                expected_version: entry.table_version,
+                post_commit_pin: entry.table_version + 1,
+                // Use active_branch (where commits actually land), NOT
+                // entry.table_branch (where the table currently lives).
+                // open_owned_dataset_for_branch_write forks a feature
+                // branch from a main-branch table on first write — the
+                // resulting commit lands on active_branch. Recovery's
+                // open_lance_head must check the same branch.
+                table_branch: active_branch.clone(),
+            });
+        }
+    }
+    let recovery_handle = if recovery_pins.is_empty() {
+        None
+    } else {
+        let sidecar = crate::db::manifest::new_sidecar(
+            crate::db::manifest::SidecarKind::EnsureIndices,
+            active_branch.clone(),
+            db.audit_actor_id.clone(),
+            recovery_pins,
+        );
+        Some(
+            crate::db::manifest::write_sidecar(db.root_uri(), db.storage_adapter(), &sidecar)
+                .await?,
+        )
+    };
+
     for type_name in db.catalog.node_types.keys() {
         let table_key = format!("node:{}", type_name);
         let Some(entry) = snapshot.entry(&table_key) else {
@@ -136,13 +256,123 @@ pub(super) async fn ensure_indices_for_branch(
         }
     }
 
+    // Failpoint: pin the per-writer Phase B → Phase C residual for
+    // ensure_indices. Lance HEAD has advanced on every touched table
+    // (one commit_staged per index built) but the manifest publish below
+    // hasn't run. Used by
+    // `tests/failpoints.rs::ensure_indices_phase_b_failure_recovered_on_next_open`.
+    crate::failpoints::maybe_fail("ensure_indices.post_phase_b_pre_manifest_commit")?;
+
     if !updates.is_empty() {
         commit_prepared_updates_on_branch(db, branch, &updates).await?;
     }
 
+    // Recovery sidecar lifecycle: delete after the manifest publish (or
+    // no-op when there were no updates — the sidecar covered the
+    // per-table commit window regardless). Best-effort cleanup; failing
+    // the user here would error a call that already succeeded.
+    if let Some(handle) = recovery_handle {
+        if let Err(err) = crate::db::manifest::delete_sidecar(&handle, db.storage_adapter()).await {
+            tracing::warn!(
+                error = %err,
+                operation_id = handle.operation_id.as_str(),
+                "recovery sidecar cleanup failed; the next open's recovery sweep will resolve it"
+            );
+        }
+    }
+
     Ok(())
 }
 
+/// Returns true if the node table is missing at least one declared
+/// scalar/vector index that `build_indices_on_dataset_for_catalog` would
+/// build AND has at least one row (the ensure_indices loop has
+/// `if row_count > 0 { build_indices(...) }`, so empty tables produce
+/// zero commits and must NOT be pinned in the sidecar — pinning them
+/// would force `NoMovement` classification on recovery and trigger the
+/// all-or-nothing rollback of sibling tables' legitimate index work).
+///
+/// Per the actual `build_indices_on_dataset_for_catalog` implementation
+/// (this file, ~line 419-491), nodes get BTree (id) + per-prop FTS
+/// (@search String) + per-prop Vector indices; edges get BTree only
+/// (id, src, dst). The two helpers mirror that asymmetry — see the
+/// `needs_index_work_edge` doc comment.
+async fn needs_index_work_node(
+    db: &Omnigraph,
+    type_name: &str,
+    table_key: &str,
+    full_path: &str,
+    table_branch: Option<&str>,
+) -> Result<bool> {
+    let ds = db
+        .table_store
+        .open_dataset_head_for_write(table_key, full_path, table_branch)
+        .await?;
+    // Empty tables are skipped by the ensure_indices loop, so they must
+    // not be pinned in the sidecar — pinning a table that produces zero
+    // commits classifies as NoMovement on recovery and forces all-or-
+    // nothing rollback of sibling tables' legitimate index work.
+    // Errors from count_rows are propagated: silently treating them as
+    // "0 rows" risks skipping a table that is actually about to be
+    // modified.
+    if db.table_store.count_rows(&ds, None).await? == 0 {
+        return Ok(false);
+    }
+    if !db.table_store.has_btree_index(&ds, "id").await? {
+        return Ok(true);
+    }
+    let Some(node_type) = db.catalog.node_types.get(type_name) else {
+        return Ok(false);
+    };
+    for index_cols in &node_type.indices {
+        if index_cols.len() != 1 {
+            continue;
+        }
+        let prop_name = &index_cols[0];
+        let Some(prop_type) = node_type.properties.get(prop_name) else {
+            continue;
+        };
+        if matches!(prop_type.scalar, ScalarType::String) && !prop_type.list {
+            if !db.table_store.has_fts_index(&ds, prop_name).await? {
+                return Ok(true);
+            }
+        } else if matches!(prop_type.scalar, ScalarType::Vector(_)) && !prop_type.list {
+            if !db.table_store.has_vector_index(&ds, prop_name).await? {
+                return Ok(true);
+            }
+        }
+    }
+    Ok(false)
+}
+
+/// Companion to `needs_index_work_node` for edge tables.
+///
+/// **Intentional asymmetry with the node helper**: edges only need
+/// BTree indices (id, src, dst) per `build_indices_on_dataset_for_catalog`
+/// at the edge branch (this file, lines 474-485). FTS / vector indices
+/// on edge properties are not built today; if they ever are, this
+/// helper plus the build function must be updated together.
+///
+/// Empty edge tables are skipped by the ensure_indices loop the same
+/// way node tables are; see `needs_index_work_node`.
+async fn needs_index_work_edge(
+    db: &Omnigraph,
+    table_key: &str,
+    full_path: &str,
+    table_branch: Option<&str>,
+) -> Result<bool> {
+    let ds = db
+        .table_store
+        .open_dataset_head_for_write(table_key, full_path, table_branch)
+        .await?;
+    if db.table_store.count_rows(&ds, None).await? == 0 {
+        return Ok(false);
+    }
+    Ok(!db.table_store.has_btree_index(&ds, "id").await?
+        || !db.table_store.has_btree_index(&ds, "src").await?
+        || !db.table_store.has_btree_index(&ds, "dst").await?)
+}
+
 pub(super) async fn open_for_mutation(
     db: &Omnigraph,
     table_key: &str,
@@ -290,14 +520,14 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
         }
 
         if let Some(node_type) = catalog.node_types.get(type_name) {
-            // Per MR-793 §10 OQ3: stage scalar indices first (BTree,
-            // Inverted), then call `create_vector_index` inline. The
-            // inline-commit on a vector index advances HEAD, which would
-            // invalidate any uncommitted scalar index transactions if we
-            // stacked them. Today the per-stage shape commits each
-            // scalar index immediately so the order constraint is
-            // implicit, but if we ever batch scalar stages we must
-            // ensure they all land before the vector inline-commit.
+            // Stage scalar indices first (BTree, Inverted), then call
+            // `create_vector_index` inline. The inline-commit on a vector
+            // index advances HEAD, which would invalidate any uncommitted
+            // scalar index transactions if we stacked them. Today the
+            // per-stage shape commits each scalar index immediately so
+            // the order constraint is implicit, but if we ever batch
+            // scalar stages we must ensure they all land before the
+            // vector inline-commit.
             for index_cols in &node_type.indices {
                 if index_cols.len() != 1 {
                     continue;
@@ -353,13 +583,13 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
 }
 
 /// Stage a BTREE index transaction and commit it, advancing the in-memory
-/// `*ds` to the new HEAD. MR-793 Phase 4: replaces the previous
-/// inline-commit `create_btree_index(ds)` call with the staged primitive
-/// + an immediate `commit_staged`. Per-call behavior is unchanged
-/// (HEAD advances once per index), but the bytes-on-disk and HEAD-advance
-/// are now decoupled at the `TableStore` API surface — a caller that
-/// needs end-of-batch atomicity can stage many transactions and commit
-/// them in one pass (Phase 8's index reconciler relies on this).
+/// `*ds` to the new HEAD. The staged primitive + immediate `commit_staged`
+/// pair replaced the earlier inline-commit `create_btree_index(ds)` call.
+/// Per-call behavior is unchanged (HEAD advances once per index), but
+/// the bytes-on-disk and HEAD-advance are now decoupled at the
+/// `TableStore` API surface — a caller that needs end-of-batch atomicity
+/// can stage many transactions and commit them in one pass (the eventual
+/// index reconciler relies on this).
 async fn stage_and_commit_btree(
     db: &Omnigraph,
     table_key: &str,
@@ -377,8 +607,9 @@ async fn stage_and_commit_btree(
             ))
         })?;
     // Failpoint between stage and commit. Used by `tests/failpoints.rs`
-    // to demonstrate that a Phase A failure in the staged-index path
-    // leaves no Lance-HEAD drift on the touched table.
+    // to demonstrate that a stage-step failure in the staged-index
+    // path (`stage_create_btree_index` succeeded; `commit_staged` not
+    // yet called) leaves no Lance-HEAD drift on the touched table.
     crate::failpoints::maybe_fail("ensure_indices.post_stage_pre_commit_btree")?;
     let new_ds = db
         .table_store
@@ -395,7 +626,7 @@ async fn stage_and_commit_btree(
 }
 
 /// Stage an INVERTED (FTS) index transaction and commit it. See
-/// `stage_and_commit_btree` for the MR-793 Phase 4 rationale.
+/// `stage_and_commit_btree` for the rationale.
 async fn stage_and_commit_inverted(
     db: &Omnigraph,
     table_key: &str,
diff --git a/crates/omnigraph/src/db/recovery_audit.rs b/crates/omnigraph/src/db/recovery_audit.rs
new file mode 100644
index 0000000..b7d4975
--- /dev/null
+++ b/crates/omnigraph/src/db/recovery_audit.rs
@@ -0,0 +1,362 @@
+//! Recovery audit row storage in `_graph_commit_recoveries.lance`.
+//!
+//! Sibling to `_graph_commits.lance` (`commit_graph.rs`). Each successful
+//! recovery sweep — roll-forward or roll-back — records one row here so
+//! operators investigating a sidecar-attributed mutation can correlate
+//! `omnigraph commit list --filter actor=omnigraph:recovery` with the
+//! original actor whose mutation was rolled forward / back.
+//!
+//! Sibling-table is additive: it doesn't bump
+//! `INTERNAL_MANIFEST_SCHEMA_VERSION`, and can be removed in favor of a
+//! schema migration later if the join cost matters. The schema-migration
+//! alternative (adding `recovery_for_actor` and `recovery_kind` columns
+//! to `_graph_commits.lance` itself) was considered and rejected to keep
+//! this change additive.
+//!
+//! Atomicity caveat: append to `_graph_commit_recoveries.lance` is
+//! sequential w.r.t. the `CommitGraph::append_commit` write. A crash
+//! between the two leaves an orphan commit-graph row with no audit row.
+//! Same shape as the existing `_graph_commits` + `_graph_commit_actors`
+//! split; the recovery sweep tolerates it the same way (re-entry sees
+//! `NoMovement` for already-restored / already-published tables; the
+//! audit append is retried).
+
+use std::sync::Arc;
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use arrow_array::{
+    Array, RecordBatch, RecordBatchIterator, StringArray, TimestampMicrosecondArray,
+};
+use arrow_schema::{DataType, Field, Schema, SchemaRef, TimeUnit};
+use futures::TryStreamExt;
+use lance::Dataset;
+use lance::dataset::{WriteMode, WriteParams};
+use lance_file::version::LanceFileVersion;
+use serde::{Deserialize, Serialize};
+
+use crate::error::{OmniError, Result};
+
+const RECOVERIES_DIR: &str = "_graph_commit_recoveries.lance";
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "PascalCase")]
+pub(crate) enum RecoveryKind {
+    RolledForward,
+    RolledBack,
+}
+
+impl RecoveryKind {
+    fn as_str(self) -> &'static str {
+        match self {
+            RecoveryKind::RolledForward => "RolledForward",
+            RecoveryKind::RolledBack => "RolledBack",
+        }
+    }
+
+    fn parse(s: &str) -> Result<Self> {
+        match s {
+            "RolledForward" => Ok(RecoveryKind::RolledForward),
+            "RolledBack" => Ok(RecoveryKind::RolledBack),
+            other => Err(OmniError::manifest_internal(format!(
+                "unknown recovery_kind '{}' in _graph_commit_recoveries.lance",
+                other
+            ))),
+        }
+    }
+}
+
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+pub(crate) struct TableOutcome {
+    pub table_key: String,
+    /// For RolledForward: the prior manifest pin (== sidecar.expected_version).
+    /// For RolledBack: same.
+    pub from_version: u64,
+    /// For RolledForward: the new manifest pin (== sidecar.post_commit_pin).
+    /// For RolledBack: == sidecar.expected_version (Lance HEAD reverted).
+    pub to_version: u64,
+}
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub(crate) struct RecoveryAuditRecord {
+    pub graph_commit_id: String,
+    pub recovery_kind: RecoveryKind,
+    pub recovery_for_actor: Option<String>,
+    pub operation_id: String,
+    pub sidecar_writer_kind: String,
+    pub per_table_outcomes: Vec<TableOutcome>,
+    pub created_at: i64,
+}
+
+pub(crate) struct RecoveryAudit {
+    root_uri: String,
+    dataset: Option<Dataset>,
+}
+
+impl RecoveryAudit {
+    /// Open the recovery-audit dataset for the repo, or return a handle
+    /// with no dataset yet (created on first append). Mirrors the
+    /// optional-dataset pattern from `_graph_commit_actors.lance`.
+    pub(crate) async fn open(root_uri: &str) -> Result<Self> {
+        let root = root_uri.trim_end_matches('/').to_string();
+        let dataset = Dataset::open(&recoveries_uri(&root)).await.ok();
+        Ok(Self {
+            root_uri: root,
+            dataset,
+        })
+    }
+
+    /// Append one recovery audit record. Lazily initializes the dataset
+    /// on first call (idempotent under racy creation via the same
+    /// `Dataset already exists` rebound as `_graph_commit_actors.lance`).
+    pub(crate) async fn append(&mut self, record: RecoveryAuditRecord) -> Result<()> {
+        let batch = recovery_record_to_batch(&record)?;
+        let reader = RecordBatchIterator::new(vec![Ok(batch)], recoveries_schema());
+        let mut dataset = match self.dataset.take() {
+            Some(dataset) => dataset,
+            None => create_recoveries_dataset(&self.root_uri).await?,
+        };
+        dataset
+            .append(reader, None)
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?;
+        self.dataset = Some(dataset);
+        Ok(())
+    }
+
+    /// Read every recorded recovery (test + audit-CLI surface). Ordered by
+    /// `created_at` ascending.
+    pub(crate) async fn list(&self) -> Result<Vec<RecoveryAuditRecord>> {
+        let dataset = match &self.dataset {
+            Some(dataset) => dataset,
+            None => return Ok(Vec::new()),
+        };
+        let batches: Vec<RecordBatch> = dataset
+            .scan()
+            .try_into_stream()
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?
+            .try_collect()
+            .await
+            .map_err(|e| OmniError::Lance(e.to_string()))?;
+
+        let mut out = Vec::new();
+        for batch in batches {
+            for row in 0..batch.num_rows() {
+                out.push(decode_row(&batch, row)?);
+            }
+        }
+        out.sort_by_key(|r| r.created_at);
+        Ok(out)
+    }
+}
+
+fn recoveries_uri(root_uri: &str) -> String {
+    format!("{}/{}", root_uri.trim_end_matches('/'), RECOVERIES_DIR)
+}
+
+fn recoveries_schema() -> SchemaRef {
+    Arc::new(Schema::new(vec![
+        Field::new("graph_commit_id", DataType::Utf8, false),
+        Field::new("recovery_kind", DataType::Utf8, false),
+        Field::new("recovery_for_actor", DataType::Utf8, true),
+        Field::new("operation_id", DataType::Utf8, false),
+        Field::new("sidecar_writer_kind", DataType::Utf8, false),
+        // per_table_outcomes is serialized as a JSON string. The audit
+        // table is queried infrequently; a JSON column avoids needing
+        // a list-of-struct schema, which would make schema evolution
+        // (adding fields per outcome) more painful.
+        Field::new("per_table_outcomes_json", DataType::Utf8, false),
+        Field::new(
+            "created_at",
+            DataType::Timestamp(TimeUnit::Microsecond, None),
+            false,
+        ),
+    ]))
+}
+
+async fn create_recoveries_dataset(root_uri: &str) -> Result<Dataset> {
+    let uri = recoveries_uri(root_uri);
+    let batch = RecordBatch::new_empty(recoveries_schema());
+    let reader = RecordBatchIterator::new(vec![Ok(batch)], recoveries_schema());
+    let params = WriteParams {
+        mode: WriteMode::Create,
+        enable_stable_row_ids: true,
+        data_storage_version: Some(LanceFileVersion::V2_2),
+        ..Default::default()
+    };
+    match Dataset::write(reader, &uri as &str, Some(params)).await {
+        Ok(dataset) => Ok(dataset),
+        Err(err) if err.to_string().contains("Dataset already exists") => Dataset::open(&uri)
+            .await
+            .map_err(|open_err| OmniError::Lance(open_err.to_string())),
+        Err(err) => Err(OmniError::Lance(err.to_string())),
+    }
+}
+
+fn recovery_record_to_batch(record: &RecoveryAuditRecord) -> Result<RecordBatch> {
+    let outcomes_json = serde_json::to_string(&record.per_table_outcomes).map_err(|e| {
+        OmniError::manifest_internal(format!(
+            "failed to serialize per_table_outcomes for recovery audit: {}",
+            e
+        ))
+    })?;
+    RecordBatch::try_new(
+        recoveries_schema(),
+        vec![
+            Arc::new(StringArray::from(vec![record.graph_commit_id.clone()])),
+            Arc::new(StringArray::from(vec![record.recovery_kind.as_str()])),
+            Arc::new(StringArray::from(vec![record
+                .recovery_for_actor
+                .clone()])),
+            Arc::new(StringArray::from(vec![record.operation_id.clone()])),
+            Arc::new(StringArray::from(vec![record.sidecar_writer_kind.clone()])),
+            Arc::new(StringArray::from(vec![outcomes_json])),
+            Arc::new(TimestampMicrosecondArray::from(vec![record.created_at])),
+        ],
+    )
+    .map_err(|e| OmniError::Lance(e.to_string()))
+}
+
+fn decode_row(batch: &RecordBatch, row: usize) -> Result<RecoveryAuditRecord> {
+    let str_col = |name: &str| -> Result<&StringArray> {
+        batch
+            .column_by_name(name)
+            .ok_or_else(|| OmniError::manifest_internal(format!("missing column '{}' in recovery audit", name)))?
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .ok_or_else(|| OmniError::manifest_internal(format!("column '{}' has wrong type", name)))
+    };
+    let ts_col = batch
+        .column_by_name("created_at")
+        .ok_or_else(|| OmniError::manifest_internal("missing 'created_at' column".to_string()))?
+        .as_any()
+        .downcast_ref::<TimestampMicrosecondArray>()
+        .ok_or_else(|| {
+            OmniError::manifest_internal("'created_at' column has wrong type".to_string())
+        })?;
+
+    let graph_commit_ids = str_col("graph_commit_id")?;
+    let kinds = str_col("recovery_kind")?;
+    let for_actors = str_col("recovery_for_actor")?;
+    let op_ids = str_col("operation_id")?;
+    let writers = str_col("sidecar_writer_kind")?;
+    let outcomes_json = str_col("per_table_outcomes_json")?;
+
+    let outcomes: Vec<TableOutcome> =
+        serde_json::from_str(outcomes_json.value(row)).map_err(|e| {
+            OmniError::manifest_internal(format!(
+                "failed to deserialize per_table_outcomes_json from recovery audit: {}",
+                e
+            ))
+        })?;
+
+    Ok(RecoveryAuditRecord {
+        graph_commit_id: graph_commit_ids.value(row).to_string(),
+        recovery_kind: RecoveryKind::parse(kinds.value(row))?,
+        recovery_for_actor: if for_actors.is_null(row) {
+            None
+        } else {
+            Some(for_actors.value(row).to_string())
+        },
+        operation_id: op_ids.value(row).to_string(),
+        sidecar_writer_kind: writers.value(row).to_string(),
+        per_table_outcomes: outcomes,
+        created_at: ts_col.value(row),
+    })
+}
+
+pub(crate) fn now_micros() -> Result<i64> {
+    SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .map(|d| d.as_micros() as i64)
+        .map_err(|e| {
+            OmniError::manifest_internal(format!("system clock before unix epoch: {}", e))
+        })
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn sample_record() -> RecoveryAuditRecord {
+        RecoveryAuditRecord {
+            graph_commit_id: "01H000000000000000000000XX".to_string(),
+            recovery_kind: RecoveryKind::RolledForward,
+            recovery_for_actor: Some("act-alice".to_string()),
+            operation_id: "01H000000000000000000000OP".to_string(),
+            sidecar_writer_kind: "Mutation".to_string(),
+            per_table_outcomes: vec![
+                TableOutcome {
+                    table_key: "node:Person".to_string(),
+                    from_version: 5,
+                    to_version: 6,
+                },
+                TableOutcome {
+                    table_key: "edge:Knows".to_string(),
+                    from_version: 12,
+                    to_version: 13,
+                },
+            ],
+            created_at: 1_700_000_000_000_000,
+        }
+    }
+
+    #[tokio::test]
+    async fn recovery_audit_round_trips_through_lance() {
+        let dir = tempfile::tempdir().unwrap();
+        let root = dir.path().to_str().unwrap();
+
+        let mut audit = RecoveryAudit::open(root).await.unwrap();
+        // Empty repo: list returns empty.
+        assert!(audit.list().await.unwrap().is_empty());
+
+        // Append + list.
+        let record = sample_record();
+        audit.append(record.clone()).await.unwrap();
+        let listed = audit.list().await.unwrap();
+        assert_eq!(listed.len(), 1);
+        assert_eq!(listed[0], record);
+
+        // Append a second record; both visible, sorted by created_at.
+        let mut second = sample_record();
+        second.graph_commit_id = "01H000000000000000000000YY".to_string();
+        second.recovery_kind = RecoveryKind::RolledBack;
+        second.recovery_for_actor = None;
+        second.created_at = record.created_at + 1;
+        audit.append(second.clone()).await.unwrap();
+
+        let listed = audit.list().await.unwrap();
+        assert_eq!(listed.len(), 2);
+        assert_eq!(listed[0], record);
+        assert_eq!(listed[1], second);
+    }
+
+    #[tokio::test]
+    async fn recovery_audit_persists_across_open_cycles() {
+        let dir = tempfile::tempdir().unwrap();
+        let root = dir.path().to_str().unwrap();
+
+        {
+            let mut audit = RecoveryAudit::open(root).await.unwrap();
+            audit.append(sample_record()).await.unwrap();
+        }
+
+        let audit = RecoveryAudit::open(root).await.unwrap();
+        let listed = audit.list().await.unwrap();
+        assert_eq!(listed.len(), 1);
+        assert_eq!(listed[0], sample_record());
+    }
+
+    #[test]
+    fn recovery_kind_round_trips_through_string() {
+        assert_eq!(
+            RecoveryKind::parse("RolledForward").unwrap(),
+            RecoveryKind::RolledForward,
+        );
+        assert_eq!(
+            RecoveryKind::parse("RolledBack").unwrap(),
+            RecoveryKind::RolledBack,
+        );
+        assert!(RecoveryKind::parse("Garbage").is_err());
+    }
+}
diff --git a/crates/omnigraph/src/db/schema_state.rs b/crates/omnigraph/src/db/schema_state.rs
index 8ba24f4..13dfccc 100644
--- a/crates/omnigraph/src/db/schema_state.rs
+++ b/crates/omnigraph/src/db/schema_state.rs
@@ -285,6 +285,24 @@ fn schema_lock_conflict(detail: impl Into<String>) -> OmniError {
     ))
 }
 
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum SchemaStateRecovery {
+    Noop,
+    CleanedStaging,
+    CompletedStagingRename { schema_apply_sidecar: bool },
+}
+
+impl SchemaStateRecovery {
+    pub(crate) fn completed_schema_apply_sidecar_rename(self) -> bool {
+        matches!(
+            self,
+            Self::CompletedStagingRename {
+                schema_apply_sidecar: true,
+            }
+        )
+    }
+}
+
 /// Reconcile leftover schema staging files (`_schema.pg.staging`,
 /// `_schema.ir.json.staging`, `__schema_state.json.staging`) against the
 /// manifest snapshot.
@@ -306,7 +324,7 @@ pub(crate) async fn recover_schema_state_files(
     root_uri: &str,
     storage: Arc<dyn StorageAdapter>,
     snapshot: &Snapshot,
-) -> Result<()> {
+) -> Result<SchemaStateRecovery> {
     let pg_staging = schema_source_staging_uri(root_uri);
     let ir_staging = schema_ir_staging_uri(root_uri);
     let state_staging = schema_state_staging_uri(root_uri);
@@ -316,7 +334,28 @@ pub(crate) async fn recover_schema_state_files(
     let state_exists = storage.exists(&state_staging).await?;
 
     if !pg_exists && !ir_exists && !state_exists {
-        return Ok(());
+        return Ok(SchemaStateRecovery::Noop);
+    }
+
+    // Schema-apply atomicity: when a SchemaApply sidecar is present,
+    // the writer reached Phase B (Lance HEADs advanced) but didn't
+    // complete Phase C (manifest publish + staging→final renames). The
+    // recovery sweep about to run will roll the table versions forward
+    // to the new Lance HEADs; we MUST also rename the staging files
+    // forward so the catalog matches. Without this, the disambiguation
+    // logic below sees actual_keys == live_keys (manifest didn't move)
+    // and deletes the staging files, leaving the repo with new-schema
+    // data on disk but the old `_schema.pg` live — corruption.
+    if crate::db::manifest::has_schema_apply_sidecar(root_uri, storage.as_ref()).await? {
+        warn!(
+            "recovery: SchemaApply sidecar present; completing schema-staging rename so the \
+             manifest-drift sweep's roll-forward sees the new catalog (manifest v{})",
+            snapshot.version()
+        );
+        complete_staging_rename(root_uri, storage.as_ref()).await?;
+        return Ok(SchemaStateRecovery::CompletedStagingRename {
+            schema_apply_sidecar: true,
+        });
     }
 
     if !pg_exists {
@@ -346,7 +385,9 @@ pub(crate) async fn recover_schema_state_files(
             snapshot.version()
         );
         complete_staging_rename(root_uri, storage.as_ref()).await?;
-        return Ok(());
+        return Ok(SchemaStateRecovery::CompletedStagingRename {
+            schema_apply_sidecar: false,
+        });
     }
 
     let staging_source = storage.read_text(&pg_staging).await?;
@@ -365,7 +406,7 @@ pub(crate) async fn recover_schema_state_files(
             "removing leftover schema staging files matching the live schema (no-op apply that crashed)"
         );
         cleanup_staging_files(root_uri, storage.as_ref()).await?;
-        return Ok(());
+        return Ok(SchemaStateRecovery::CleanedStaging);
     }
 
     let live_keys = expected_table_keys(&live_ir);
@@ -388,14 +429,16 @@ pub(crate) async fn recover_schema_state_files(
             snapshot.version()
         );
         cleanup_staging_files(root_uri, storage.as_ref()).await?;
-        Ok(())
+        Ok(SchemaStateRecovery::CleanedStaging)
     } else if actual_keys == staging_keys {
         warn!(
             "schema apply crashed after manifest commit; completing schema-file rename (manifest v{})",
             snapshot.version()
         );
         complete_staging_rename(root_uri, storage.as_ref()).await?;
-        Ok(())
+        Ok(SchemaStateRecovery::CompletedStagingRename {
+            schema_apply_sidecar: false,
+        })
     } else {
         Err(schema_lock_conflict(format!(
             "found schema staging files but the manifest's table set ({:?}) matches neither the live schema ({:?}) nor the staging schema ({:?}); manual operator action required",
@@ -407,9 +450,7 @@ pub(crate) async fn recover_schema_state_files(
 async fn cleanup_staging_files(root_uri: &str, storage: &dyn StorageAdapter) -> Result<()> {
     storage.delete(&schema_source_staging_uri(root_uri)).await?;
     storage.delete(&schema_ir_staging_uri(root_uri)).await?;
-    storage
-        .delete(&schema_state_staging_uri(root_uri))
-        .await?;
+    storage.delete(&schema_state_staging_uri(root_uri)).await?;
     Ok(())
 }
 
diff --git a/crates/omnigraph/src/exec/merge.rs b/crates/omnigraph/src/exec/merge.rs
index df62a63..b466663 100644
--- a/crates/omnigraph/src/exec/merge.rs
+++ b/crates/omnigraph/src/exec/merge.rs
@@ -913,9 +913,9 @@ async fn publish_rewritten_merge_table(
     // Phase 1: merge_insert changed/new rows (preserves _row_created_at_version for
     // existing rows, bumps _row_last_updated_at_version only for actually-changed rows).
     //
-    // MR-793 Phase 5: routed through the staged primitive so a failure
-    // between writing fragments and committing leaves no Lance-HEAD
-    // drift. The commit_staged here is per-table per-call (Lance has no
+    // Routed through the staged primitive so a failure between writing
+    // fragments and committing leaves no Lance-HEAD drift. The
+    // commit_staged here is per-table per-call (Lance has no
     // multi-dataset atomic commit); the residual sits at this single
     // commit point, narrowed from the previous "merge_insert + delete +
     // index" multi-step inline-commit chain.
@@ -957,11 +957,11 @@ async fn publish_rewritten_merge_table(
     //
     // INLINE-COMMIT RESIDUAL: lance-4.0.0 does not expose a public
     // two-phase delete API (DeleteJob is `pub(crate)` —
-    // lance-format/lance#6658 is open with no PRs). MR-793 deliberately
-    // does NOT introduce a `stage_delete` wrapper that would secretly
-    // inline-commit (a side-channel — see design doc §3.2). When the
-    // upstream API ships, swap this `delete_where` call for
-    // `stage_delete` + `commit_staged`.
+    // lance-format/lance#6658 is open with no PRs). We deliberately do
+    // NOT introduce a `stage_delete` wrapper that would secretly
+    // inline-commit (it would create a side-channel between the staged
+    // and inline write paths). When the upstream API ships, swap this
+    // `delete_where` call for `stage_delete` + `commit_staged`.
     if !staged.deleted_ids.is_empty() {
         let escaped: Vec<String> = staged
             .deleted_ids
@@ -977,11 +977,11 @@ async fn publish_rewritten_merge_table(
 
     // Phase 3: rebuild indices.
     //
-    // `build_indices_on_dataset` was migrated in MR-793 Phase 4 to use
-    // `stage_create_btree_index` / `stage_create_inverted_index` +
-    // `commit_staged` for scalar indices. Vector indices remain inline
-    // (residual — `build_index_metadata_from_segments` is `pub(crate)`
-    // in lance-4.0.0; companion ticket to lance-format/lance#6658).
+    // `build_indices_on_dataset` uses `stage_create_btree_index` /
+    // `stage_create_inverted_index` + `commit_staged` for scalar
+    // indices. Vector indices remain inline-commit
+    // (`build_index_metadata_from_segments` is `pub(crate)` in lance-
+    // 4.0.0 — companion ticket to lance-format/lance#6658).
     let row_count = target_db
         .table_store()
         .table_state(&full_path, &current_ds)
@@ -1167,6 +1167,96 @@ impl Omnigraph {
 
         validate_merge_candidates(self, source_snapshot, &target_snapshot, &candidates).await?;
 
+        // Recovery sidecar: protect the per-table commit_staged loop.
+        // Pin only `RewriteMerged` candidates because they always
+        // advance Lance HEAD through `publish_rewritten_merge_table`
+        // (which runs stage_merge_insert + delete_where + index
+        // rebuilds — multiple commit_staged calls per table; loose
+        // classification handles the multi-step drift).
+        //
+        // `AdoptSourceState` candidates are NOT pinned: their publish
+        // path is `publish_adopted_source_state`, whose subcases mostly
+        // don't advance Lance HEAD (pure manifest pointer switch, or
+        // fork via `fork_dataset_from_entry_state` which only adds a
+        // Lance branch ref). If those subcases were pinned, recovery
+        // would classify them as NoMovement and the all-or-nothing
+        // decision would force a rollback that destroys legitimately-
+        // committed work on sibling RewriteMerged tables.
+        //
+        // Residual: two `AdoptSourceState` subcases (when source has a
+        // table_branch AND the source delta is non-empty) internally
+        // call `publish_rewritten_merge_table` and DO advance HEAD.
+        // Those are not covered by this sidecar — if they fail mid-
+        // commit, the residual persists until the next ReadWrite open
+        // detects it via a subsequent ExpectedVersionMismatch from a
+        // later writer that touches the same table. Closing this gap
+        // requires pre-computing source deltas during candidate
+        // classification (a structural change to `CandidateTableState`)
+        // and is left as follow-up work.
+        let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = ordered_table_keys
+            .iter()
+            .filter_map(|table_key| {
+                let candidate = candidates.get(table_key)?;
+                if !matches!(candidate, CandidateTableState::RewriteMerged(_)) {
+                    return None;
+                }
+                let entry = target_snapshot.entry(table_key)?;
+                Some(crate::db::manifest::SidecarTablePin {
+                    table_key: table_key.clone(),
+                    table_path: self.table_store().dataset_uri(&entry.table_path),
+                    expected_version: entry.table_version,
+                    post_commit_pin: entry.table_version + 1,
+                    // Use the merge target branch (where commits actually
+                    // land), NOT entry.table_branch (where the table
+                    // currently lives). publish_rewritten_merge_table calls
+                    // open_for_mutation, which forks an inherited-from-main
+                    // table to active_branch on first write — the resulting
+                    // Lance commit lands on active_branch. Recovery's
+                    // open_lance_head must check the same branch, otherwise
+                    // an inherited-table feature-to-feature merge classifies
+                    // as NoMovement and the all-or-nothing rollback skips
+                    // the orphaned post-Phase-B HEAD on the target ref.
+                    // Same rationale as table_ops.rs:115-120 in
+                    // ensure_indices_for_branch.
+                    table_branch: self.active_branch().map(str::to_string),
+                })
+            })
+            .collect();
+        let recovery_handle = if recovery_pins.is_empty() {
+            None
+        } else {
+            // Use the merge target branch directly, NOT a heuristic
+            // derived from `ordered_table_keys.first()`. The first
+            // sorted table key may not be in the target snapshot at all
+            // (its `entry()` returns None → branch becomes None == main),
+            // and the SubTableEntry's `table_branch` field isn't
+            // necessarily the merge target branch. The caller
+            // `branch_merge` calls `swap_coordinator_for_branch(target_branch)`
+            // before invoking this function, so `self.active_branch()`
+            // is the target.
+            let target_branch = self.active_branch().map(str::to_string);
+            let mut sidecar = crate::db::manifest::new_sidecar(
+                crate::db::manifest::SidecarKind::BranchMerge,
+                target_branch,
+                self.audit_actor_id.clone(),
+                recovery_pins,
+            );
+            // Carry the source branch's HEAD commit id so the recovery
+            // sweep's audit step can record this as a MERGE commit
+            // (linked to the source) instead of a plain commit. Without
+            // this, future merges between the same pair lose
+            // already-up-to-date detection and merge-base correctness.
+            sidecar.merge_source_commit_id = Some(source_head_commit_id.to_string());
+            Some(
+                crate::db::manifest::write_sidecar(
+                    self.root_uri(),
+                    self.storage_adapter(),
+                    &sidecar,
+                )
+                .await?,
+            )
+        };
+
         let mut updates = Vec::new();
         let mut changed_edge_tables = false;
         for table_key in &ordered_table_keys {
@@ -1195,11 +1285,32 @@ impl Omnigraph {
             updates.push(update);
         }
 
+        // Failpoint: pin the per-writer Phase B → Phase C residual for
+        // branch_merge. Lance HEAD has advanced on every touched table
+        // (publish_*) but the manifest publish below hasn't run. Used
+        // by `tests/failpoints.rs::branch_merge_phase_b_failure_recovered_on_next_open`.
+        crate::failpoints::maybe_fail("branch_merge.post_phase_b_pre_manifest_commit")?;
+
         let manifest_version = if updates.is_empty() {
             self.version()
         } else {
             self.commit_manifest_updates(&updates).await?
         };
+
+        // Recovery sidecar lifecycle: delete after manifest publish.
+        // Best-effort cleanup; the merge already landed durably so
+        // failing the user here is undesirable.
+        if let Some(handle) = recovery_handle {
+            if let Err(err) =
+                crate::db::manifest::delete_sidecar(&handle, self.storage_adapter()).await
+            {
+                tracing::warn!(
+                    error = %err,
+                    operation_id = handle.operation_id.as_str(),
+                    "recovery sidecar cleanup failed; the next open's recovery sweep will resolve it"
+                );
+            }
+        }
         self.record_merge_commit(
             manifest_version,
             target_head_commit_id,
diff --git a/crates/omnigraph/src/exec/mutation.rs b/crates/omnigraph/src/exec/mutation.rs
index e996241..c6d2737 100644
--- a/crates/omnigraph/src/exec/mutation.rs
+++ b/crates/omnigraph/src/exec/mutation.rs
@@ -737,18 +737,23 @@ impl Omnigraph {
             Err(e) => Err(e),
             Ok(total) if staging.is_empty() => Ok(total),
             Ok(total) => {
-                let (updates, expected_versions) = staging
-                    .finalize(self, requested.as_deref())
+                let (updates, expected_versions, sidecar_handle) = staging
+                    .finalize(
+                        self,
+                        requested.as_deref(),
+                        crate::db::manifest::SidecarKind::Mutation,
+                    )
                     .await?;
                 // Failpoint that wedges the documented finalize→publisher
                 // residual: per-table `commit_staged` calls already
                 // advanced Lance HEAD on every touched table; a failure
                 // injected here mirrors the production-rare case where
                 // the publisher's CAS pre-check rejects (or the manifest
-                // write throws) after staged commits succeeded. Used by
-                // `tests/failpoints.rs::finalize_publisher_residual_*`
-                // to pin the documented residual behavior. See
-                // `docs/runs.md` "Finalize → publisher residual".
+                // write throws) after staged commits succeeded. The
+                // sidecar written inside `staging.finalize()` persists
+                // across this failure so the next `Omnigraph::open`'s
+                // recovery sweep can roll forward — see
+                // `tests/failpoints.rs::recovery_rolls_forward_after_finalize_publisher_failure`.
                 crate::failpoints::maybe_fail("mutation.post_finalize_pre_publisher")?;
                 self.commit_updates_on_branch_with_expected(
                     requested.as_deref(),
@@ -756,6 +761,33 @@ impl Omnigraph {
                     &expected_versions,
                 )
                 .await?;
+                // Phase C succeeded — sidecar can be deleted. If this
+                // delete fails, the next open's sweep classifies every
+                // table as NoMovement (manifest pin == Lance HEAD ==
+                // post_commit_pin) and the sidecar is treated as a
+                // stale artifact (cleaned up via the Phase 2 logic).
+                if let Some(handle) = sidecar_handle {
+                    // Best-effort cleanup: the manifest publish already
+                    // succeeded, so the user's mutation is durable. A
+                    // failed delete leaves the sidecar on disk; the
+                    // next open's recovery sweep classifies every table
+                    // as `NoMovement` (manifest pin == Lance HEAD ==
+                    // post_commit_pin) and tidies up. Failing the user
+                    // here would return an error for a write that
+                    // already landed.
+                    if let Err(err) = crate::db::manifest::delete_sidecar(
+                        &handle,
+                        self.storage_adapter(),
+                    )
+                    .await
+                    {
+                        tracing::warn!(
+                            error = %err,
+                            operation_id = handle.operation_id.as_str(),
+                            "recovery sidecar cleanup failed; the next open's recovery sweep will resolve it"
+                        );
+                    }
+                }
                 Ok(total)
             }
         }
diff --git a/crates/omnigraph/src/exec/staging.rs b/crates/omnigraph/src/exec/staging.rs
index 2c7eae3..47433be 100644
--- a/crates/omnigraph/src/exec/staging.rs
+++ b/crates/omnigraph/src/exec/staging.rs
@@ -27,6 +27,9 @@ use lance::Dataset;
 use omnigraph_compiler::catalog::EdgeType;
 
 use crate::db::SubTableUpdate;
+use crate::db::manifest::{
+    new_sidecar, write_sidecar, RecoverySidecarHandle, SidecarKind, SidecarTablePin,
+};
 use crate::error::{OmniError, Result};
 
 /// Whether the per-table accumulator should commit via `stage_append`
@@ -218,8 +221,13 @@ impl MutationStaging {
     pub(crate) async fn finalize(
         self,
         db: &crate::db::Omnigraph,
-        _branch: Option<&str>,
-    ) -> Result<(Vec<SubTableUpdate>, HashMap<String, u64>)> {
+        branch: Option<&str>,
+        sidecar_kind: SidecarKind,
+    ) -> Result<(
+        Vec<SubTableUpdate>,
+        HashMap<String, u64>,
+        Option<RecoverySidecarHandle>,
+    )> {
         let MutationStaging {
             expected_versions,
             paths,
@@ -230,6 +238,50 @@ impl MutationStaging {
         let mut updates: Vec<SubTableUpdate> =
             inline_committed.into_values().collect();
 
+        // Sidecar protocol: build the per-table pin list BEFORE any Lance
+        // commit_staged runs, then write the sidecar so a crash between
+        // Phase B (this loop's commit_staged calls) and Phase C (the
+        // manifest publish in the caller) is recoverable on next open.
+        // Skipped when `pending` is empty (delete-only mutation; the D₂
+        // parse-time rule keeps deletes out of this code path so this
+        // branch is reached only for the inline-committed-only case).
+        let pins: Vec<SidecarTablePin> = pending
+            .iter()
+            .map(|(table_key, _)| {
+                let path = paths.get(table_key).ok_or_else(|| {
+                    OmniError::manifest_internal(format!(
+                        "MutationStaging::finalize: missing path for table '{}'",
+                        table_key,
+                    ))
+                })?;
+                let expected = *expected_versions.get(table_key).ok_or_else(|| {
+                    OmniError::manifest_internal(format!(
+                        "MutationStaging::finalize: missing expected version for table '{}'",
+                        table_key,
+                    ))
+                })?;
+                Ok::<SidecarTablePin, OmniError>(SidecarTablePin {
+                    table_key: table_key.clone(),
+                    table_path: path.full_path.clone(),
+                    expected_version: expected,
+                    post_commit_pin: expected + 1,
+                    table_branch: path.table_branch.clone(),
+                })
+            })
+            .collect::<Result<Vec<_>>>()?;
+
+        let sidecar_handle = if pins.is_empty() {
+            None
+        } else {
+            let sidecar = new_sidecar(
+                sidecar_kind,
+                branch.map(|s| s.to_string()),
+                db.audit_actor_id.clone(),
+                pins,
+            );
+            Some(write_sidecar(db.root_uri(), db.storage_adapter(), &sidecar).await?)
+        };
+
         for (table_key, table) in pending {
             let path = paths.get(&table_key).ok_or_else(|| {
                 OmniError::manifest_internal(format!(
@@ -318,7 +370,7 @@ impl MutationStaging {
             });
         }
 
-        Ok((updates, expected_versions))
+        Ok((updates, expected_versions, sidecar_handle))
     }
 }
 
diff --git a/crates/omnigraph/src/loader/mod.rs b/crates/omnigraph/src/loader/mod.rs
index f275938..f4cf7d1 100644
--- a/crates/omnigraph/src/loader/mod.rs
+++ b/crates/omnigraph/src/loader/mod.rs
@@ -537,10 +537,39 @@ async fn load_jsonl_reader<R: BufRead>(
 
     // Phase 4: Atomic manifest commit with publisher-level OCC.
     if use_staging {
-        let (updates, expected_versions) = staging.finalize(db, branch).await?;
+        let (updates, expected_versions, sidecar_handle) = staging
+            .finalize(db, branch, crate::db::manifest::SidecarKind::Load)
+            .await?;
+        // Same finalize → publisher residual as mutations: per-table
+        // staged commits have advanced Lance HEAD, but the manifest
+        // publish has not run yet. Reuse the mutation failpoint name so
+        // one failpoint pins the shared `MutationStaging` boundary.
+        crate::failpoints::maybe_fail("mutation.post_finalize_pre_publisher")?;
         db.commit_updates_on_branch_with_expected(branch, &updates, &expected_versions)
             .await?;
+        // The recovery sidecar protects the per-table commit_staged →
+        // manifest publish window. Phase C succeeded — clean up
+        // best-effort: failing the user here would error out a write
+        // that already landed durably.
+        if let Some(handle) = sidecar_handle {
+            if let Err(err) =
+                crate::db::manifest::delete_sidecar(&handle, db.storage_adapter()).await
+            {
+                tracing::warn!(
+                    error = %err,
+                    operation_id = handle.operation_id.as_str(),
+                    "recovery sidecar cleanup failed; the next open's recovery sweep will resolve it"
+                );
+            }
+        }
     } else {
+        // LoadMode::Overwrite keeps the legacy inline-commit path —
+        // truncate-then-append doesn't fit the staged shape (see
+        // `docs/runs.md` "LoadMode::Overwrite residual"). The recovery
+        // sidecar is not applicable here because the writer doesn't go
+        // through MutationStaging; per-table inline commits + a final
+        // manifest publish handle their own residual via the documented
+        // operator workflow (re-run overwrite to recover).
         db.commit_updates_on_branch_with_expected(
             branch,
             &overwrite_updates,
diff --git a/crates/omnigraph/src/storage.rs b/crates/omnigraph/src/storage.rs
index c252e0e..5d2e568 100644
--- a/crates/omnigraph/src/storage.rs
+++ b/crates/omnigraph/src/storage.rs
@@ -27,6 +27,10 @@ pub trait StorageAdapter: Debug + Send + Sync {
     async fn rename_text(&self, from_uri: &str, to_uri: &str) -> Result<()>;
     /// Remove a file. Returns Ok(()) if the file does not exist.
     async fn delete(&self, uri: &str) -> Result<()>;
+    /// List all files (non-recursively, files only) directly under `dir_uri`.
+    /// Returns full URIs (same scheme as `dir_uri`). The result is unordered.
+    /// Returns Ok(empty) if the directory does not exist or is empty.
+    async fn list_dir(&self, dir_uri: &str) -> Result<Vec<String>>;
 }
 
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
@@ -59,6 +63,16 @@ impl StorageAdapter for LocalStorageAdapter {
 
     async fn write_text(&self, uri: &str, contents: &str) -> Result<()> {
         let path = local_path_from_uri(uri)?;
+        // Ensure parent directory exists. S3 has no equivalent (PutObject
+        // is path-agnostic). For local fs, callers like the recovery
+        // sidecar protocol expect transparent directory creation under
+        // the repo root (the `__recovery/` directory doesn't pre-exist;
+        // first sidecar write creates it).
+        if let Some(parent) = path.parent() {
+            if !parent.as_os_str().is_empty() {
+                tokio::fs::create_dir_all(parent).await?;
+            }
+        }
         tokio::fs::write(&path, contents).await?;
         Ok(())
     }
@@ -82,6 +96,27 @@ impl StorageAdapter for LocalStorageAdapter {
             Err(err) => Err(err.into()),
         }
     }
+
+    async fn list_dir(&self, dir_uri: &str) -> Result<Vec<String>> {
+        let path = local_path_from_uri(dir_uri)?;
+        let mut out = Vec::new();
+        let mut entries = match tokio::fs::read_dir(&path).await {
+            Ok(e) => e,
+            Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(out),
+            Err(err) => return Err(err.into()),
+        };
+        let dir_str = dir_uri.trim_end_matches('/');
+        while let Some(entry) = entries.next_entry().await? {
+            let ft = entry.file_type().await?;
+            if !ft.is_file() {
+                continue;
+            }
+            if let Some(name) = entry.file_name().to_str() {
+                out.push(format!("{}/{}", dir_str, name));
+            }
+        }
+        Ok(out)
+    }
 }
 
 #[async_trait]
@@ -154,6 +189,43 @@ impl StorageAdapter for S3StorageAdapter {
             Err(err) => Err(storage_backend_error("delete", uri, err)),
         }
     }
+
+    async fn list_dir(&self, dir_uri: &str) -> Result<Vec<String>> {
+        // Normalize: ensure the URI describes a directory (trailing '/') so
+        // we don't match sibling paths with a shared prefix
+        // (e.g. listing `__recovery` shouldn't match `__recovery_log/...`).
+        let dir_with_slash = if dir_uri.ends_with('/') {
+            dir_uri.to_string()
+        } else {
+            format!("{}/", dir_uri)
+        };
+        // object_store::Path strips the trailing '/'; re-add it for filtering.
+        let prefix_loc = self.object_path(&dir_with_slash)?;
+        let prefix_with_slash = format!("{}/", prefix_loc.as_ref());
+
+        let mut entries = self.store.list(Some(&prefix_loc));
+        let mut out = Vec::new();
+        let bucket_root = format!("{}{}/", S3_SCHEME_PREFIX, self.bucket);
+        while let Some(meta) = entries
+            .try_next()
+            .await
+            .map_err(|err| storage_backend_error("list_dir", dir_uri, err))?
+        {
+            let key_str = meta.location.as_ref();
+            // Require the directory boundary to filter out sibling-prefix
+            // matches (object_store's `list` is prefix-based, not dir-based).
+            if !key_str.starts_with(&prefix_with_slash) {
+                continue;
+            }
+            let suffix = &key_str[prefix_with_slash.len()..];
+            // Non-recursive: skip anything inside a sub-directory.
+            if suffix.contains('/') {
+                continue;
+            }
+            out.push(format!("{}{}", bucket_root, key_str));
+        }
+        Ok(out)
+    }
 }
 
 impl S3StorageAdapter {
diff --git a/crates/omnigraph/tests/composite_flow.rs b/crates/omnigraph/tests/composite_flow.rs
new file mode 100644
index 0000000..00f4d49
--- /dev/null
+++ b/crates/omnigraph/tests/composite_flow.rs
@@ -0,0 +1,921 @@
+//! Composite end-to-end flow integration test.
+//!
+//! Walks the canonical user flow in one fixture: init → load → branch →
+//! mutate → query → merge → time-travel → optimize → cleanup → reopen.
+//! Every numbered step has at least one assertion.
+//!
+//! This is the deterministic narrative counterpart to a randomized /
+//! property-based reliability harness — it catches regressions where
+//! individual operations all pass their unit tests but their composition
+//! breaks. It runs in CI on every PR (no `#[ignore]`).
+
+mod helpers;
+
+use arrow_array::{Array, Int64Array};
+use omnigraph::db::{Omnigraph, ReadTarget};
+use omnigraph::loader::{LoadMode, load_jsonl};
+use omnigraph_compiler::ir::ParamMap;
+use omnigraph_compiler::result::QueryResult;
+
+use helpers::{
+    MUTATION_QUERIES, count_rows, count_rows_branch, mixed_params, mutate_branch, mutate_main,
+    query_branch, query_main, snapshot_main, version_branch, version_main,
+};
+
+/// Extract the `total` value from a `total_people` query result and
+/// assert it equals `expected`. The query returns one row with one
+/// `Int64` column named `total`; asserting only `num_rows() == 1`
+/// would not catch a regression that returns a stale or wrong count.
+fn assert_total(result: &QueryResult, expected: i64, context: &str) {
+    let batch = result.concat_batches().unwrap();
+    assert_eq!(
+        batch.num_rows(),
+        1,
+        "total_people must return exactly one summary row ({context})"
+    );
+    let total_col = batch
+        .column_by_name("total")
+        .unwrap_or_else(|| panic!("missing `total` column ({context})"))
+        .as_any()
+        .downcast_ref::<Int64Array>()
+        .unwrap_or_else(|| panic!("`total` column is not Int64 ({context})"));
+    assert_eq!(
+        total_col.value(0),
+        expected,
+        "total_people count mismatch ({context})"
+    );
+}
+
+const TEST_SCHEMA: &str = include_str!("fixtures/test.pg");
+const TEST_DATA: &str = include_str!("fixtures/test.jsonl");
+const TEST_QUERIES: &str = include_str!("fixtures/test.gq");
+
+#[tokio::test]
+async fn composite_flow_canonical_lifecycle() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 1: init a fresh repo with the standard test schema.
+    // ─────────────────────────────────────────────────────────────────
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let v_init = version_branch(&db, "main").await.unwrap();
+    assert!(
+        v_init >= 1,
+        "init must produce a non-zero manifest version; got {}",
+        v_init
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 2: load JSONL seed data (Person + Company nodes,
+    // Knows + WorksAt edges).
+    // ─────────────────────────────────────────────────────────────────
+    load_jsonl(&mut db, TEST_DATA, LoadMode::Append).await.unwrap();
+    let v_after_load = version_branch(&db, "main").await.unwrap();
+    assert!(
+        v_after_load > v_init,
+        "load must advance the manifest version: v_init={}, v_after_load={}",
+        v_init,
+        v_after_load,
+    );
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        4,
+        "test.jsonl declares 4 Person rows"
+    );
+    assert_eq!(
+        count_rows(&db, "node:Company").await,
+        2,
+        "test.jsonl declares 2 Company rows"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 3: branch_create `feature` off main.
+    // ─────────────────────────────────────────────────────────────────
+    db.branch_create("feature").await.unwrap();
+    let branches = db.branch_list().await.unwrap();
+    assert!(
+        branches.iter().any(|b| b == "feature"),
+        "feature branch must appear in branch_list; got {:?}",
+        branches,
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 4: mutate on `feature` — single statement (insert) +
+    // multi-statement (insert + insert).
+    // ─────────────────────────────────────────────────────────────────
+    mutate_branch(
+        &mut db,
+        "feature",
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
+    )
+    .await
+    .expect("single-statement insert on feature");
+
+    mutate_branch(
+        &mut db,
+        "feature",
+        MUTATION_QUERIES,
+        "insert_person_and_friend",
+        &mixed_params(
+            &[("$name", "Frank"), ("$friend", "Eve")],
+            &[("$age", 33)],
+        ),
+    )
+    .await
+    .expect("multi-statement insert+edge on feature");
+
+    // After: feature has 4 + Eve + Frank = 6 Persons.
+    let snap = db
+        .snapshot_of(ReadTarget::branch("feature"))
+        .await
+        .unwrap();
+    let person_ds = snap.open("node:Person").await.unwrap();
+    assert_eq!(
+        person_ds.count_rows(None).await.unwrap(),
+        6,
+        "feature should now have 6 Persons (4 seeded + Eve + Frank)"
+    );
+
+    // Main is untouched by feature mutations.
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        4,
+        "main must remain at 4 Persons after feature mutations"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 5: query on `feature` — exercise multi-modal modes.
+    // The fixture queries cover scalar lookup (get_person), traversal
+    // (friends_of), aggregation (friend_counts, total_people, age_stats).
+    // ─────────────────────────────────────────────────────────────────
+    let total_people = query_branch(
+        &mut db,
+        "feature",
+        TEST_QUERIES,
+        "total_people",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !total_people.batches().is_empty(),
+        "total_people must return at least one batch"
+    );
+
+    let friends_of_alice = query_branch(
+        &mut db,
+        "feature",
+        TEST_QUERIES,
+        "friends_of",
+        &mixed_params(&[("$name", "Alice")], &[]),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !friends_of_alice.batches().is_empty(),
+        "friends_of(Alice) must return data — Alice knows Bob and Charlie in the seed"
+    );
+
+    let unemployed = query_branch(
+        &mut db,
+        "feature",
+        TEST_QUERIES,
+        "unemployed",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !unemployed.batches().is_empty(),
+        "unemployed (anti-join) must return Persons without WorksAt edges"
+    );
+
+    let friend_counts = query_branch(
+        &mut db,
+        "feature",
+        TEST_QUERIES,
+        "friend_counts",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !friend_counts.batches().is_empty(),
+        "friend_counts (aggregation) must return per-person counts"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 6: mutate on `main` simultaneously — sets up a non-conflicting
+    // merge by touching a sibling type (Company) that feature didn't
+    // touch. (The test schema doesn't have a Company-mutation query, so
+    // we update an existing Person's age — Bob is on main but his age
+    // wasn't changed on feature.)
+    // ─────────────────────────────────────────────────────────────────
+    mutate_main(
+        &mut db,
+        MUTATION_QUERIES,
+        "set_age",
+        &mixed_params(&[("$name", "Bob")], &[("$age", 26)]),
+    )
+    .await
+    .expect("set Bob's age on main");
+    let v_pre_merge_main = version_branch(&db, "main").await.unwrap();
+
+    // Capture the pre-merge main snapshot for time-travel verification later.
+    let snapshot_pre_merge = snapshot_main(&db).await.unwrap();
+    let pre_merge_version = snapshot_pre_merge.version();
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 7: branch_merge feature → main, verify merge result + audit.
+    // ─────────────────────────────────────────────────────────────────
+    let merge_outcome = db.branch_merge("feature", "main").await.unwrap();
+    let v_post_merge = version_branch(&db, "main").await.unwrap();
+    assert!(
+        v_post_merge > v_pre_merge_main,
+        "merge must advance main's manifest version: pre={}, post={}",
+        v_pre_merge_main,
+        v_post_merge,
+    );
+    let _ = merge_outcome;
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 8: query at the post-merge snapshot — verify both sides'
+    // writes are visible. Main now has 4 + Eve + Frank = 6 Persons,
+    // and Bob's age is 26 (from the main mutation).
+    // ─────────────────────────────────────────────────────────────────
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        6,
+        "post-merge main must have all 6 Persons"
+    );
+
+    // Verify Bob's age update from main carried through the merge.
+    let bob_after = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Bob")], &[]),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !bob_after.batches().is_empty(),
+        "Bob must still be present on main post-merge"
+    );
+
+    // Verify Eve (from feature) is now visible on main.
+    let eve_after = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Eve")], &[]),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !eve_after.batches().is_empty(),
+        "Eve (from feature) must be visible on main post-merge"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 9: snapshot_at_version(pre_merge_version) — verify time-travel
+    // still sees the pre-merge state (4 Persons on main, no Eve/Frank).
+    // ─────────────────────────────────────────────────────────────────
+    let pre_merge_snapshot = db.snapshot_at_version(pre_merge_version).await.unwrap();
+    let pre_merge_persons = pre_merge_snapshot
+        .open("node:Person")
+        .await
+        .unwrap()
+        .count_rows(None)
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_merge_persons, 4,
+        "time-travel to pre-merge version must show 4 Persons (pre-feature-merge state)"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 10: optimize the post-merge graph — verify indices stay
+    // valid and queryable.
+    //
+    // **Known limitation**: `optimize_all_tables` calls Lance
+    // `compact_files` directly — it advances per-table Lance HEAD
+    // without updating the omnigraph `__manifest` pin. After optimize,
+    // the next writer's expected_table_versions captures the
+    // pre-optimize manifest pin, but the publisher's pre-check reads
+    // a higher version from the manifest dataset (because some other
+    // path — possibly schema-state recovery on reopen — wrote a newer
+    // __manifest row). The `ExpectedVersionMismatch` is benign
+    // (re-issuing the mutation after a snapshot refresh succeeds), but
+    // a composite test cannot reliably exercise post-optimize mutations
+    // until that path is investigated. Coverage of post-optimize
+    // mutations is left to a focused optimize+cleanup integration test.
+    // ─────────────────────────────────────────────────────────────────
+    let optimize_stats = db.optimize().await.unwrap();
+    assert!(
+        !optimize_stats.is_empty(),
+        "optimize must return per-table stats"
+    );
+
+    // Re-run a query to verify post-optimize correctness.
+    let post_optimize_total = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "total_people",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !post_optimize_total.batches().is_empty(),
+        "queries must still work after optimize"
+    );
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        6,
+        "row counts unchanged by optimize"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 11: cleanup — keep last 10 versions, only purge versions
+    // older than 1 hour. With this small test, we have well under 10
+    // versions and nothing that old, so cleanup is a no-op except for
+    // any orphan files. The recovery floor (--keep ≥ 3) needed for the
+    // open-time recovery sweep is preserved by the keep-10 default.
+    // Verify the call doesn't break subsequent queries.
+    // ─────────────────────────────────────────────────────────────────
+    use omnigraph::db::CleanupPolicyOptions;
+    use std::time::Duration;
+    let _cleanup_stats = db
+        .cleanup(CleanupPolicyOptions {
+            keep_versions: Some(10),
+            older_than: Some(Duration::from_secs(3600)),
+        })
+        .await
+        .unwrap();
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 12: reopen the engine — verify post-cleanup state is consistent.
+    // ─────────────────────────────────────────────────────────────────
+    drop(db);
+    let mut db = Omnigraph::open(uri).await.unwrap();
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        6,
+        "Person count consistent across reopen"
+    );
+    assert_eq!(
+        count_rows(&db, "node:Company").await,
+        2,
+        "Company count consistent across reopen"
+    );
+
+    // Branch list still contains feature.
+    let branches = db.branch_list().await.unwrap();
+    assert!(
+        branches.iter().any(|b| b == "feature"),
+        "feature branch must still be visible after reopen; got {:?}",
+        branches,
+    );
+
+    // Final query exercise — full read path works post-reopen,
+    // post-cleanup. Post-cleanup mutation is omitted here pending
+    // resolution of the optimize-vs-manifest-pin interaction documented
+    // in Step 10.
+    let final_total = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "total_people",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert!(!final_total.batches().is_empty());
+}
+
+/// Multi-branch sequential merges with main writes interleaved between
+/// every diverge point. Catches compositional regressions that single-
+/// merge tests can't see:
+///
+/// - **Base/LCA recomputation across two merges**: feat-b's base must be
+///   the main version *at feat-b's branch creation*, not main's
+///   post-feat-a-merge HEAD. A regression that uses main HEAD as the
+///   merge base would re-classify Eve / Grace as unknown source-only
+///   rows and re-apply them.
+/// - **Manifest pin propagation through merge commits**: after merge
+///   feat-a → main, main's table_branch entries for Person and Knows
+///   must reflect the rewrite-on-active path; the second merge needs
+///   them to compute its diff correctly.
+/// - **Time-travel through merge DAG**: snapshot_at_version at three
+///   distinct points (pre-feat-a-merge, post-feat-a-merge-pre-helen,
+///   pre-feat-b-merge) must each return the right historical state
+///   without bleed-through from later commits.
+/// - **Reopen consistency over a multi-merge history**: dropping the
+///   handle and reopening must replay the full merge DAG cleanly with
+///   no recovery sweep activity (steady state).
+///
+/// All other compositional concerns (single merge mechanics, conflict
+/// detection, time-travel mechanics) are covered by `branching.rs` and
+/// `point_in_time.rs`. This test only exercises *composition*.
+#[tokio::test]
+async fn composite_flow_multi_branch_sequential_merges() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 1: init + load baseline (4 Person, 2 Company, 3 Knows, 2 WorksAt
+    // edges from test.jsonl).
+    // ─────────────────────────────────────────────────────────────────
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    load_jsonl(&mut db, TEST_DATA, LoadMode::Append).await.unwrap();
+    assert_eq!(count_rows(&db, "node:Person").await, 4);
+    assert_eq!(count_rows(&db, "edge:Knows").await, 3);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 2: mutate main — insert "Alice2" before any branching. Main
+    // diverges from the load baseline by exactly one row.
+    // ─────────────────────────────────────────────────────────────────
+    mutate_main(
+        &mut db,
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Alice2")], &[("$age", 31)]),
+    )
+    .await
+    .expect("insert Alice2 on main");
+    assert_eq!(count_rows(&db, "node:Person").await, 5);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 3: branch_create feat-a from main. feat-a inherits main's
+    // 5-Person state.
+    // ─────────────────────────────────────────────────────────────────
+    db.branch_create("feat-a").await.unwrap();
+    assert_eq!(count_rows_branch(&db, "feat-a", "node:Person").await, 5);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 4: mutate main — insert "Bob2" AFTER feat-a was created. main
+    // and feat-a now diverge: main has Bob2, feat-a does not.
+    // ─────────────────────────────────────────────────────────────────
+    mutate_main(
+        &mut db,
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Bob2")], &[("$age", 26)]),
+    )
+    .await
+    .expect("insert Bob2 on main");
+    assert_eq!(count_rows(&db, "node:Person").await, 6);
+    assert_eq!(
+        count_rows_branch(&db, "feat-a", "node:Person").await,
+        5,
+        "feat-a must not see main's post-branch-create writes"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 5: mutate feat-a — insert "Eve". feat-a now also has 6 rows,
+    // but the *sixth* is Eve, not Bob2.
+    // ─────────────────────────────────────────────────────────────────
+    mutate_branch(
+        &mut db,
+        "feat-a",
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
+    )
+    .await
+    .expect("insert Eve on feat-a");
+    assert_eq!(count_rows_branch(&db, "feat-a", "node:Person").await, 6);
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        6,
+        "main must not see feat-a's writes"
+    );
+    // Branch isolation through the QUERY ENGINE (not just dataset-direct):
+    // `get_person` on feat-a finds Eve (uses the BTree index on Person.name);
+    // the same query on main finds nothing. Catches regressions where the
+    // planner resolves the wrong snapshot for branch-targeted reads.
+    let eve_on_feat_a = query_branch(
+        &mut db,
+        "feat-a",
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Eve")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        eve_on_feat_a.num_rows(),
+        1,
+        "get_person(Eve) on feat-a must return 1 row through the query engine"
+    );
+    let eve_on_main = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Eve")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        eve_on_main.num_rows(),
+        0,
+        "get_person(Eve) on main must return 0 rows — feat-a's writes are isolated"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 6: branch_create feat-b from main. feat-b's base is main's
+    // current state (post-Bob2): 6 Persons including Bob2 but NOT Eve.
+    // The two branches now share neither base nor head with each other.
+    // ─────────────────────────────────────────────────────────────────
+    db.branch_create("feat-b").await.unwrap();
+    assert_eq!(count_rows_branch(&db, "feat-b", "node:Person").await, 6);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 7: mutate feat-b — insert "Frank".
+    // ─────────────────────────────────────────────────────────────────
+    mutate_branch(
+        &mut db,
+        "feat-b",
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Frank")], &[("$age", 33)]),
+    )
+    .await
+    .expect("insert Frank on feat-b");
+    assert_eq!(count_rows_branch(&db, "feat-b", "node:Person").await, 7);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 8: mutate feat-a again — insert "Grace" + Knows(Grace → Eve).
+    // feat-a now has 7 Persons and 4 Knows edges.
+    // ─────────────────────────────────────────────────────────────────
+    mutate_branch(
+        &mut db,
+        "feat-a",
+        MUTATION_QUERIES,
+        "insert_person_and_friend",
+        &mixed_params(
+            &[("$name", "Grace"), ("$friend", "Eve")],
+            &[("$age", 28)],
+        ),
+    )
+    .await
+    .expect("insert Grace + Knows(Grace → Eve) on feat-a");
+    assert_eq!(count_rows_branch(&db, "feat-a", "node:Person").await, 7);
+    assert_eq!(count_rows_branch(&db, "feat-a", "edge:Knows").await, 4);
+    assert_eq!(
+        count_rows(&db, "edge:Knows").await,
+        3,
+        "main's Knows must be untouched by feat-a's edge insert"
+    );
+    // Edge traversal through the QUERY ENGINE on feat-a: `friends_of(Grace)`
+    // exercises the Knows topology + index from feat-a's snapshot. Catches
+    // regressions in graph-index lookup against branch-local edge tables.
+    let graces_friends = query_branch(
+        &mut db,
+        "feat-a",
+        TEST_QUERIES,
+        "friends_of",
+        &mixed_params(&[("$name", "Grace")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        graces_friends.num_rows(),
+        1,
+        "friends_of(Grace) on feat-a must return Eve via the query engine + Knows index"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 9: capture pre-merge-feat-a state. Both a version (for direct
+    // dataset open) AND a SnapshotId (for query-engine time-travel) are
+    // captured so we can later assert historical state through both paths.
+    // ─────────────────────────────────────────────────────────────────
+    let pre_merge_a_version = version_main(&db).await.unwrap();
+    let pre_merge_a_snap_id = db.resolve_snapshot("main").await.unwrap();
+    let pre_merge_a_persons = count_rows(&db, "node:Person").await;
+    assert_eq!(pre_merge_a_persons, 6);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 10: merge feat-a → main. main gains Eve, Grace, and the
+    // Knows(Grace → Eve) edge. main's manifest version advances.
+    // ─────────────────────────────────────────────────────────────────
+    db.branch_merge("feat-a", "main").await.unwrap();
+    let post_merge_a_version = version_main(&db).await.unwrap();
+    assert!(
+        post_merge_a_version > pre_merge_a_version,
+        "merge feat-a → main must advance main's manifest version"
+    );
+    assert_eq!(count_rows(&db, "node:Person").await, 8);
+    assert_eq!(count_rows(&db, "edge:Knows").await, 4);
+    // Post-merge query-engine readback: Eve is now reachable on main via
+    // `get_person` (BTree index lookup) and Grace's edge to Eve survives
+    // the merge as a traversable edge via `friends_of`. This is the
+    // load-bearing check that `publish_rewritten_merge_table`'s Phase 3
+    // index rebuild produced a queryable result, not just data on disk.
+    let eve_on_main_post_merge = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Eve")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        eve_on_main_post_merge.num_rows(),
+        1,
+        "Eve must be findable on main post-merge through the BTree index"
+    );
+    let graces_friends_on_main = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "friends_of",
+        &mixed_params(&[("$name", "Grace")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        graces_friends_on_main.num_rows(),
+        1,
+        "friends_of(Grace) on main post-merge must traverse the rebuilt Knows index"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 11: mutate main AFTER the first merge — insert "Helen". This
+    // makes feat-b's eventual merge a non-trivial one: feat-b's base
+    // (created in step 6) does not include Eve / Grace / Helen, but
+    // main now has all three on top of Bob2.
+    // ─────────────────────────────────────────────────────────────────
+    mutate_main(
+        &mut db,
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "Helen")], &[("$age", 44)]),
+    )
+    .await
+    .expect("insert Helen on main post-merge");
+    assert_eq!(count_rows(&db, "node:Person").await, 9);
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 12: capture pre-merge-feat-b state. Used for time-travel
+    // assertions in step 14.
+    // ─────────────────────────────────────────────────────────────────
+    let pre_merge_b_version = version_main(&db).await.unwrap();
+    let pre_merge_b_snap_id = db.resolve_snapshot("main").await.unwrap();
+    assert!(
+        pre_merge_b_version > post_merge_a_version,
+        "Helen insert must advance main's version past the merge"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 13: merge feat-b → main. The diff base for this merge is
+    // feat-b's branch-creation point (step 6), NOT main's current head.
+    // A regression that uses main HEAD as the base would attempt to
+    // re-apply Eve/Grace/Helen as source-only rows or surface conflicts.
+    // ─────────────────────────────────────────────────────────────────
+    db.branch_merge("feat-b", "main").await.unwrap();
+    let post_merge_b_version = version_main(&db).await.unwrap();
+    assert!(
+        post_merge_b_version > pre_merge_b_version,
+        "merge feat-b → main must advance main's manifest version"
+    );
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        10,
+        "main must contain all 10 Persons after both merges land"
+    );
+    // Aggregation through the QUERY ENGINE over the fully merged graph:
+    // `total_people` returns count(Person) = 10. Catches regressions in
+    // group-by/count execution against a multi-fragment table whose
+    // current shape was produced by two sequential merges.
+    let total_post_merges = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "total_people",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert_total(&total_post_merges, 10, "post both merges, main must total 10 Persons");
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 14: time-travel to pre-merge-a-version. Reads must return
+    // main's pre-feat-a-merge state: 6 Persons, no Eve / Grace / Frank /
+    // Helen. Catches snapshot leakage from later commits.
+    //
+    // Verified through TWO paths: direct dataset open (catches manifest-
+    // pin propagation regressions) AND `.gq` query against the captured
+    // SnapshotId (catches planner / index-state regressions where a
+    // historical query accidentally resolves against current indices
+    // instead of the snapshot's frozen index state).
+    // ─────────────────────────────────────────────────────────────────
+    let pre_a_snap = db.snapshot_at_version(pre_merge_a_version).await.unwrap();
+    let pre_a_persons = pre_a_snap
+        .open("node:Person")
+        .await
+        .unwrap()
+        .count_rows(None)
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_a_persons, 6,
+        "time-travel to pre-merge-a must show exactly 6 Persons (dataset-direct)"
+    );
+    let pre_a_knows = pre_a_snap
+        .open("edge:Knows")
+        .await
+        .unwrap()
+        .count_rows(None)
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_a_knows, 3,
+        "time-travel to pre-merge-a must show exactly 3 Knows edges (no Grace → Eve)"
+    );
+    // `.gq` query against the captured SnapshotId — the planner must
+    // resolve `total_people` against the historical Person snapshot,
+    // not main's current head. Asserts the actual count value (not just
+    // row count) so a planner regression that resolves to current state
+    // would surface here as a count mismatch (10 instead of 6).
+    let pre_a_total_via_query = db
+        .query(
+            ReadTarget::Snapshot(pre_merge_a_snap_id.clone()),
+            TEST_QUERIES,
+            "total_people",
+            &ParamMap::default(),
+        )
+        .await
+        .unwrap();
+    assert_total(
+        &pre_a_total_via_query,
+        6,
+        "time-travel to pre-merge-a must report 6 Persons via the query engine",
+    );
+    // Edge-traversal time-travel: Grace and her Knows(Grace → Eve) edge
+    // do not exist at pre_merge_a, so `friends_of(Grace)` must return 0
+    // even though Grace's row IS visible at later snapshots.
+    let pre_a_grace_friends = db
+        .query(
+            ReadTarget::Snapshot(pre_merge_a_snap_id.clone()),
+            TEST_QUERIES,
+            "friends_of",
+            &mixed_params(&[("$name", "Grace")], &[]),
+        )
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_a_grace_friends.num_rows(),
+        0,
+        "friends_of(Grace) at pre-merge-a must return 0 — Grace's row predates the merge"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 15: time-travel to pre-merge-b-version. Reads must show
+    // post-feat-a-merge state (Eve, Grace, Helen present) but NOT Frank.
+    // ─────────────────────────────────────────────────────────────────
+    let pre_b_snap = db.snapshot_at_version(pre_merge_b_version).await.unwrap();
+    let pre_b_persons = pre_b_snap
+        .open("node:Person")
+        .await
+        .unwrap()
+        .count_rows(None)
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_b_persons, 9,
+        "time-travel to pre-merge-b must show 9 Persons (post-feat-a-merge + Helen, pre-feat-b-merge)"
+    );
+    // Frank does not exist at pre-merge-b (he was on feat-b only); a
+    // historical `get_person(Frank)` via the query engine must return 0.
+    let pre_b_frank_via_query = db
+        .query(
+            ReadTarget::Snapshot(pre_merge_b_snap_id.clone()),
+            TEST_QUERIES,
+            "get_person",
+            &mixed_params(&[("$name", "Frank")], &[]),
+        )
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_b_frank_via_query.num_rows(),
+        0,
+        "Frank must not appear at pre-merge-b — his row only enters main when feat-b merges"
+    );
+    // Eve is present at pre-merge-b (feat-a already landed); the
+    // historical query must find her.
+    let pre_b_eve_via_query = db
+        .query(
+            ReadTarget::Snapshot(pre_merge_b_snap_id),
+            TEST_QUERIES,
+            "get_person",
+            &mixed_params(&[("$name", "Eve")], &[]),
+        )
+        .await
+        .unwrap();
+    assert_eq!(
+        pre_b_eve_via_query.num_rows(),
+        1,
+        "Eve must be findable at pre-merge-b — she landed on main during feat-a's merge"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 16: query feat-b at its current head — feat-b is unchanged
+    // by main's merges; it still shows its own 7-row state.
+    // ─────────────────────────────────────────────────────────────────
+    assert_eq!(
+        count_rows_branch(&db, "feat-b", "node:Person").await,
+        7,
+        "feat-b's own snapshot must be unaffected by main's merge of feat-a"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 17: a feature-side query exercises the read path on a branch
+    // whose base predates a completed merge (feat-b's base is pre-feat-a).
+    // ─────────────────────────────────────────────────────────────────
+    let frank_on_feat_b = query_branch(
+        &mut db,
+        "feat-b",
+        TEST_QUERIES,
+        "get_person",
+        &mixed_params(&[("$name", "Frank")], &[]),
+    )
+    .await
+    .unwrap();
+    assert!(
+        !frank_on_feat_b.batches().is_empty(),
+        "feat-b must still see its own Frank insert"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 18: drop + reopen. Steady state — no recovery sidecars on
+    // disk, manifest replays cleanly, all branches and tables visible.
+    // ─────────────────────────────────────────────────────────────────
+    drop(db);
+    let db = Omnigraph::open(uri).await.unwrap();
+    assert_eq!(
+        count_rows(&db, "node:Person").await,
+        10,
+        "main Person count must persist across reopen"
+    );
+    assert_eq!(
+        count_rows(&db, "edge:Knows").await,
+        4,
+        "main Knows count must persist across reopen"
+    );
+    let branches = db.branch_list().await.unwrap();
+    assert!(
+        branches.iter().any(|b| b == "feat-a") && branches.iter().any(|b| b == "feat-b"),
+        "both feature branches must persist across reopen; got {:?}",
+        branches
+    );
+
+    // No recovery sidecars left behind by a clean flow.
+    let recovery_dir = std::path::Path::new(uri).join("__recovery");
+    let leftover_sidecars = if recovery_dir.exists() {
+        std::fs::read_dir(&recovery_dir).unwrap().count()
+    } else {
+        0
+    };
+    assert_eq!(
+        leftover_sidecars, 0,
+        "clean compositional flow must not leave recovery sidecars on disk"
+    );
+
+    // ─────────────────────────────────────────────────────────────────
+    // Step 19: post-reopen query-engine readback. Exercises the full
+    // read path (planner, indices, snapshot resolution) against the
+    // reopened engine — catches regressions where indices serialize
+    // correctly to disk but the reopened catalog can't bind them.
+    // ─────────────────────────────────────────────────────────────────
+    let mut db = db;
+    let post_reopen_total = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "total_people",
+        &ParamMap::default(),
+    )
+    .await
+    .unwrap();
+    assert_total(
+        &post_reopen_total,
+        10,
+        "post-reopen total_people must still report 10 Persons",
+    );
+    // Edge-traversal post-reopen: Grace's Knows(Grace → Eve) survived
+    // both the merge and the reopen as a queryable graph edge.
+    let graces_friends_post_reopen = query_main(
+        &mut db,
+        TEST_QUERIES,
+        "friends_of",
+        &mixed_params(&[("$name", "Grace")], &[]),
+    )
+    .await
+    .unwrap();
+    assert_eq!(
+        graces_friends_post_reopen.num_rows(),
+        1,
+        "friends_of(Grace) must traverse post-reopen — index + topology bound correctly"
+    );
+}
diff --git a/crates/omnigraph/tests/failpoints.rs b/crates/omnigraph/tests/failpoints.rs
index 8f10b17..0d8a20a 100644
--- a/crates/omnigraph/tests/failpoints.rs
+++ b/crates/omnigraph/tests/failpoints.rs
@@ -6,12 +6,50 @@ use fail::FailScenario;
 use omnigraph::db::Omnigraph;
 use omnigraph::failpoints::ScopedFailPoint;
 
-use helpers::{MUTATION_QUERIES, mixed_params, mutate_main};
+use helpers::recovery::{
+    FollowUpMutation, RecoveryExpectation, TableExpectation, assert_post_recovery_invariants,
+    branch_head_commit_id, single_sidecar_operation_id,
+};
+use helpers::{MUTATION_QUERIES, mixed_params, mutate_main, version_main};
 
 const SCHEMA_V1: &str = "node Person { name: String @key }\n";
 const SCHEMA_V2_ADDED_TYPE: &str =
     "node Person { name: String @key }\nnode Company { name: String @key }\n";
 
+fn node_table_uri(root: &str, type_name: &str) -> String {
+    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
+    for &b in type_name.as_bytes() {
+        hash ^= b as u64;
+        hash = hash.wrapping_mul(0x100_0000_01b3);
+    }
+    format!("{}/nodes/{hash:016x}", root.trim_end_matches('/'))
+}
+
+fn person_batch(rows: &[(&str, &str, Option<i32>)]) -> arrow_array::RecordBatch {
+    use std::sync::Arc;
+
+    use arrow_array::{Int32Array, StringArray};
+    use arrow_schema::{DataType, Field, Schema};
+
+    let schema = Arc::new(Schema::new(vec![
+        Field::new("id", DataType::Utf8, false),
+        Field::new("age", DataType::Int32, true),
+        Field::new("name", DataType::Utf8, false),
+    ]));
+    let ids: Vec<&str> = rows.iter().map(|(id, _, _)| *id).collect();
+    let names: Vec<&str> = rows.iter().map(|(_, name, _)| *name).collect();
+    let ages: Vec<Option<i32>> = rows.iter().map(|(_, _, age)| *age).collect();
+    arrow_array::RecordBatch::try_new(
+        schema,
+        vec![
+            Arc::new(StringArray::from(ids)),
+            Arc::new(Int32Array::from(ages)),
+            Arc::new(StringArray::from(names)),
+        ],
+    )
+    .unwrap()
+}
+
 #[tokio::test]
 async fn branch_create_failpoint_triggers() {
     let _scenario = FailScenario::setup();
@@ -56,7 +94,7 @@ async fn graph_publish_failpoint_triggers_before_commit_append() {
 // state.
 
 #[tokio::test]
-async fn schema_apply_recovers_pre_commit_crash() {
+async fn schema_apply_pre_commit_crash_rolls_forward_via_sidecar() {
     let _scenario = FailScenario::setup();
     let dir = tempfile::tempdir().unwrap();
     let uri = dir.path().to_str().unwrap().to_string();
@@ -73,11 +111,29 @@ async fn schema_apply_recovers_pre_commit_crash() {
         );
     }
 
-    // Reopen — recovery sweep should delete staging files and keep the
-    // original schema, since the manifest commit never happened.
+    // Reopen. With the sidecar protocol, a Phase B → Phase C crash
+    // (per-table commit_staged done; manifest publish not yet) is
+    // recoverable: the sidecar's `additional_registrations` carries the
+    // intent to register `node:Company`, schema-state recovery promotes
+    // the staging files, and the manifest-drift sweep publishes the
+    // RegisterTable + Update so the manifest catches up to the schema
+    // the writer already declared. The orphan-dataset-on-disk-with-no-
+    // manifest-entry corruption that pre-this-protocol recoveries left
+    // behind is closed.
     let db = Omnigraph::open(&uri).await.unwrap();
-    assert_eq!(db.schema_source(), SCHEMA_V1);
+    assert_eq!(
+        db.schema_source(),
+        SCHEMA_V2_ADDED_TYPE,
+        "live schema must reflect the rolled-forward apply (Company added)"
+    );
     assert_no_staging_files(dir.path());
+    // node:Company must be registered in the manifest (queryable);
+    // pre-protocol recoveries left it as an orphan dataset on disk.
+    let company_rows = helpers::count_rows(&db, "node:Company").await;
+    assert_eq!(
+        company_rows, 0,
+        "node:Company must have a manifest entry post-recovery"
+    );
 }
 
 #[tokio::test]
@@ -140,43 +196,52 @@ async fn schema_apply_recovers_partial_rename() {
     assert_no_staging_files(dir.path());
 }
 
-/// Pin the documented "finalize → publisher residual" of the
-/// staged-write commit path.
+/// Prove the recovery sweep closes the "finalize → publisher residual"
+/// across one open cycle.
 ///
 /// `MutationStaging::finalize` runs `commit_staged` per touched table
 /// sequentially before the publisher commits the manifest. Lance has no
 /// multi-dataset atomic commit primitive, so a failure between the
 /// per-table staged commits and the manifest commit leaves Lance HEAD
-/// advanced on the touched tables with no manifest update — and the
-/// next mutation surfaces `ExpectedVersionMismatch` on those tables.
+/// advanced on the touched tables with no manifest update.
 ///
-/// This isn't a code bug we can fix without an upstream Lance change;
-/// it's the documented residual (see `docs/runs.md` "Finalize →
-/// publisher residual"). The test pins the behavior so future code
-/// changes catch any silent regression: if someone widens the residual
-/// (e.g. failing earlier in finalize without rolling back), this test
-/// will surface a different error than `ExpectedVersionMismatch`. If
-/// someone narrows the residual (e.g. lance ships multi-dataset commit
-/// and we plumb it), this test will start passing the next mutation
-/// — and someone has to update the assertion + the docs.
+/// Closing the residual: finalize writes a sidecar at
+/// `__recovery/{ulid}.json` BEFORE Phase B, the failpoint fires AFTER
+/// finalize but BEFORE the publisher, the engine handle is dropped, and
+/// the next `Omnigraph::open` runs the recovery sweep. The sweep
+/// classifies every table in the sidecar as `RolledPastExpected` (Lance
+/// HEAD == expected + 1, post_commit_pin matches), decides RollForward,
+/// atomically extends every manifest pin via
+/// `ManifestBatchPublisher::publish`, records an audit row, and deletes
+/// the sidecar.
+///
+/// After this test passes:
+/// - The originally-attempted insert ("Eve") is visible via a normal
+///   query.
+/// - The next mutation succeeds without `ExpectedVersionMismatch`.
+/// - `_graph_commit_recoveries.lance` carries an audit row with
+///   `recovery_kind=RolledForward` and the original sidecar's
+///   `actor_id` in `recovery_for_actor`.
+///
+/// Continuous in-process recovery (no restart needed between failure
+/// and recovery) is the goal of a future background reconciler.
 #[tokio::test]
-async fn finalize_publisher_residual_drifts_lance_head_until_next_writer_recovers() {
-    use omnigraph::error::{ManifestConflictDetails, OmniError};
-
+async fn recovery_rolls_forward_after_finalize_publisher_failure() {
     let _scenario = FailScenario::setup();
     let dir = tempfile::tempdir().unwrap();
-    let mut db = Omnigraph::init(dir.path().to_str().unwrap(), helpers::TEST_SCHEMA)
-        .await
-        .unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
 
+    // Setup: trigger the residual.
     {
-        let _failpoint =
-            ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        let _failpoint = ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
 
-        // First mutation: finalize succeeds (commit_staged advances Lance
-        // HEAD on node:Person), then the failpoint kicks before the
-        // publisher's manifest commit. The caller sees the synthetic
-        // error.
+        // The mutation's finalize completes (commit_staged advances Lance
+        // HEAD on node:Person AND writes a `__recovery/{ulid}.json`
+        // sidecar). Then the failpoint kicks in before the publisher's
+        // manifest commit, so the manifest pin stays at the pre-write
+        // version. The sidecar persists for the next-open recovery sweep.
         let err = mutate_main(
             &mut db,
             MUTATION_QUERIES,
@@ -185,48 +250,523 @@ async fn finalize_publisher_residual_drifts_lance_head_until_next_writer_recover
         )
         .await
         .unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("injected failpoint triggered: mutation.post_finalize_pre_publisher"),
+            "unexpected error: {err}"
+        );
+
+        // Sidecar must still exist on disk for the recovery sweep to find.
+        let recovery_dir = dir.path().join("__recovery");
+        let sidecars: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert_eq!(
+            sidecars.len(),
+            1,
+            "exactly one sidecar should persist after the finalize failure"
+        );
+        operation_id = single_sidecar_operation_id(dir.path());
+
+        // Drop the failpoint scope and the engine handle.
+    }
+
+    // Recovery: reopen runs the recovery sweep. The sweep finds the
+    // sidecar, classifies node:Person as RolledPastExpected, decides
+    // RollForward, publishes the manifest update, records the audit
+    // row, deletes the sidecar.
+    let db = Omnigraph::open(&uri).await.unwrap();
+
+    // The originally-attempted "Eve" insert is now visible — the recovery
+    // sweep extended the manifest pin to include the staged commit.
+    let person_count = helpers::count_rows(&db, "node:Person").await;
+    assert_eq!(
+        person_count, 1,
+        "exactly one person (Eve) must be visible after roll-forward"
+    );
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledForward {
+            tables: vec![TableExpectation::main("node:Person").follow_up_mutation(
+                FollowUpMutation::new(
+                    "main",
+                    MUTATION_QUERIES,
+                    "insert_person",
+                    mixed_params(&[("$name", "Frank")], &[("$age", 33)]),
+                ),
+            )],
+        },
+    )
+    .await
+    .unwrap();
+
+    let db = Omnigraph::open(&uri).await.unwrap();
+    let person_count = helpers::count_rows(&db, "node:Person").await;
+    assert_eq!(
+        person_count, 2,
+        "Frank's insert must land normally after recovery"
+    );
+}
+
+#[tokio::test]
+async fn recovery_rolls_forward_load_on_feature_branch() {
+    use omnigraph::loader::LoadMode;
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
+    let main_person_pin;
+    let feature_parent_commit_id;
+
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        db.branch_create("feature").await.unwrap();
+        db.mutate(
+            "feature",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "BeforeLoad")], &[("$age", 40)]),
+        )
+        .await
+        .unwrap();
+        main_person_pin = db
+            .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+            .await
+            .unwrap()
+            .entry("node:Person")
+            .expect("main must have Person")
+            .table_version;
+        feature_parent_commit_id = branch_head_commit_id(dir.path(), "feature").await.unwrap();
+
+        let _failpoint = ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
+        let err = db
+            .load(
+                "feature",
+                r#"{"type":"Person","data":{"name":"FeatureLoad","age":41}}
+"#,
+                LoadMode::Append,
+            )
+            .await
+            .unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("injected failpoint triggered: mutation.post_finalize_pre_publisher"),
+            "unexpected error: {err}"
+        );
+        operation_id = single_sidecar_operation_id(dir.path());
+    }
+
+    let db = Omnigraph::open(&uri).await.unwrap();
+    assert_eq!(
+        helpers::count_rows_branch(&db, "feature", "node:Person").await,
+        2,
+        "feature branch load row must be visible after recovery"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        0,
+        "feature branch load recovery must not publish the row to main"
+    );
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledForward {
+            tables: vec![
+                TableExpectation::branch("node:Person", "feature")
+                    .expected_main_manifest_pin(main_person_pin)
+                    .expected_recovery_parent_commit_id(feature_parent_commit_id)
+                    .follow_up_mutation(FollowUpMutation::new(
+                        "feature",
+                        MUTATION_QUERIES,
+                        "insert_person",
+                        mixed_params(&[("$name", "AfterLoad")], &[("$age", 42)]),
+                    )),
+            ],
+        },
+    )
+    .await
+    .unwrap();
+
+    let db = Omnigraph::open(&uri).await.unwrap();
+    assert_eq!(
+        helpers::count_rows_branch(&db, "feature", "node:Person").await,
+        3,
+        "follow-up feature mutation must succeed after load recovery"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        0,
+        "follow-up feature mutation must not move main"
+    );
+}
+
+#[tokio::test]
+async fn recovery_rolls_forward_ensure_indices_on_feature_branch() {
+    use lance_index::DatasetIndexExt;
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
+    let feature_parent_commit_id;
+    let main_person_pin;
+
+    let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+        LoadMode::Append,
+    )
+    .await
+    .unwrap();
+    db.branch_create("feature").await.unwrap();
+    db.mutate(
+        "feature",
+        MUTATION_QUERIES,
+        "insert_person",
+        &mixed_params(&[("$name", "BeforeEnsure")], &[("$age", 42)]),
+    )
+    .await
+    .unwrap();
+
+    main_person_pin = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Person")
+        .expect("main must have Person")
+        .table_version;
+
+    // Make the feature branch's Person table genuinely need index work
+    // while keeping the manifest internally consistent. The test-only
+    // publisher deliberately skips the normal index-rebuild preparation;
+    // the failed writer below is still the real `ensure_indices_on`.
+    let person_uri = node_table_uri(&uri, "Person");
+    let store = TableStore::new(&uri);
+    let mut ds = store
+        .open_dataset_head(&person_uri, Some("feature"))
+        .await
+        .unwrap();
+    ds.drop_index("id_idx").await.unwrap();
+    let dropped_index_head = ds.version().version;
+    db.failpoint_publish_table_head_without_index_rebuild_for_test(
+        "feature",
+        "node:Person",
+        Some("feature"),
+    )
+    .await
+    .unwrap();
+    let feature_snapshot = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("feature"))
+        .await
+        .unwrap();
+    assert_eq!(
+        feature_snapshot
+            .entry("node:Person")
+            .expect("feature must have Person")
+            .table_version,
+        dropped_index_head,
+        "test setup must publish the dropped-index table head before ensure_indices runs",
+    );
+    feature_parent_commit_id = branch_head_commit_id(dir.path(), "feature").await.unwrap();
+
+    {
+        let _failpoint =
+            ScopedFailPoint::new("ensure_indices.post_phase_b_pre_manifest_commit", "return");
+        let err = db.ensure_indices_on("feature").await.unwrap_err();
         assert!(
             err.to_string().contains(
-                "injected failpoint triggered: mutation.post_finalize_pre_publisher"
+                "injected failpoint triggered: ensure_indices.post_phase_b_pre_manifest_commit"
             ),
             "unexpected error: {err}"
         );
+        operation_id = single_sidecar_operation_id(dir.path());
     }
-    // Failpoint dropped — subsequent calls are not synthetic-failed.
+    drop(db);
 
-    // Next mutation against the same table surfaces the documented
-    // residual: Lance HEAD on node:Person advanced (commit_staged ran),
-    // manifest didn't, so the publisher CAS at next-mutation time
-    // surfaces ExpectedVersionMismatch.
-    let err = mutate_main(
+    let db = Omnigraph::open(&uri).await.unwrap();
+    assert_eq!(
+        helpers::count_rows_branch(&db, "feature", "node:Person").await,
+        2,
+        "feature should see inherited alice plus recovered branch-local row"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        1,
+        "ensure_indices branch recovery must not move main"
+    );
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledForward {
+            tables: vec![
+                TableExpectation::branch("node:Person", "feature")
+                    .expected_main_manifest_pin(main_person_pin)
+                    .expected_recovery_parent_commit_id(feature_parent_commit_id)
+                    .follow_up_mutation(FollowUpMutation::new(
+                        "feature",
+                        MUTATION_QUERIES,
+                        "insert_person",
+                        mixed_params(&[("$name", "AfterEnsure")], &[("$age", 44)]),
+                    )),
+            ],
+        },
+    )
+    .await
+    .unwrap();
+
+    let db = Omnigraph::open(&uri).await.unwrap();
+    assert_eq!(
+        helpers::count_rows_branch(&db, "feature", "node:Person").await,
+        3,
+        "follow-up feature mutation must succeed after ensure_indices recovery"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        1,
+        "follow-up feature mutation must not move main"
+    );
+}
+
+/// Refresh-time recovery (Option B): the in-process `Omnigraph::refresh`
+/// runs roll-forward-only recovery, closing the long-running-server
+/// residual without restart.
+///
+/// Setup: trigger `mutation.post_finalize_pre_publisher` once. The
+/// sidecar persists. Without dropping the engine, call `db.refresh()`.
+/// The post-condition: sidecar gone; Eve visible; subsequent mutation
+/// on the same handle succeeds without restart and without
+/// ExpectedVersionMismatch.
+#[tokio::test]
+async fn refresh_runs_roll_forward_recovery_in_process() {
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+
+    let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+
+    // Setup: trigger the residual (sidecar persists; manifest unchanged).
+    {
+        let _failpoint = ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
+        let err = mutate_main(
+            &mut db,
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
+        )
+        .await
+        .unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("injected failpoint triggered: mutation.post_finalize_pre_publisher"),
+            "unexpected error: {err}"
+        );
+        let recovery_dir = dir.path().join("__recovery");
+        assert_eq!(
+            std::fs::read_dir(&recovery_dir).unwrap().count(),
+            1,
+            "exactly one sidecar must persist after the finalize failure"
+        );
+    }
+
+    // Recovery: explicit refresh runs roll-forward-only recovery
+    // in-process — no restart needed. Sidecar finds the Person drift,
+    // classifies RolledPastExpected, rolls forward via publisher CAS,
+    // and deletes the sidecar.
+    db.refresh().await.expect("refresh must succeed");
+
+    // Sidecar must be gone — refresh-time recovery rolled it forward.
+    let recovery_dir = dir.path().join("__recovery");
+    if recovery_dir.exists() {
+        let remaining: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert!(
+            remaining.is_empty(),
+            "sidecar must be deleted by refresh-time roll-forward; remaining: {:?}",
+            remaining,
+        );
+    }
+
+    // Eve (the originally-attempted insert) is visible without restart.
+    let person_count = helpers::count_rows(&db, "node:Person").await;
+    assert_eq!(
+        person_count, 1,
+        "Eve must be visible after refresh-time roll-forward"
+    );
+
+    // A direct Person mutation also succeeds without ExpectedVersionMismatch.
+    mutate_main(
         &mut db,
         MUTATION_QUERIES,
         "insert_person",
         &mixed_params(&[("$name", "Frank")], &[("$age", 33)]),
     )
     .await
-    .unwrap_err();
-    let OmniError::Manifest(manifest_err) = err else {
-        panic!("expected Manifest error, got {err:?}");
-    };
-    let Some(ManifestConflictDetails::ExpectedVersionMismatch {
-        ref table_key,
-        expected,
-        actual,
-    }) = manifest_err.details
-    else {
-        panic!(
-            "expected ExpectedVersionMismatch (the documented residual), got {:?}",
-            manifest_err.details
-        );
+    .expect("Person insert must succeed after refresh-time recovery");
+    assert_eq!(helpers::count_rows(&db, "node:Person").await, 2);
+}
+
+/// Refresh-time recovery must NOT call `Dataset::restore` — it can
+/// silently orphan a concurrent writer's commit. Sidecars that would
+/// require rollback must be left on disk for the next ReadWrite open.
+///
+/// Setup: synthesize a sidecar that would classify as `UnexpectedAtP1`
+/// (rollback territory) — strict-match Mutation kind with
+/// expected_version != manifest_pinned. Trigger refresh and assert:
+/// sidecar still on disk, Lance HEAD unchanged (no restore commit).
+/// Then drop + open: full sweep handles it.
+#[tokio::test]
+async fn refresh_defers_rollback_eligible_sidecar_to_next_open() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+
+    // Bootstrap.
+    let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+        LoadMode::Append,
+    )
+    .await
+    .unwrap();
+
+    // Capture Person's full URI and manifest pin.
+    let snapshot = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+        .await
+        .unwrap();
+    let entry = snapshot.entry("node:Person").unwrap();
+    let person_uri = format!("{}/{}", uri.trim_end_matches('/'), entry.table_path);
+    let manifest_pin = entry.table_version;
+
+    // Drift Person's Lance HEAD ahead of the manifest pin (without
+    // touching the manifest) so the classifier can reach UnexpectedAtP1
+    // / UnexpectedMultistep / RolledPastExpected paths that require
+    // a real restore on rollback.
+    let store = TableStore::new(&uri);
+    let mut ds = lance::Dataset::open(&person_uri).await.unwrap();
+    store
+        .delete_where(&person_uri, &mut ds, "1 = 2")
+        .await
+        .unwrap();
+    let head_after_drift = ds.version().version;
+    assert_eq!(head_after_drift, manifest_pin + 1);
+
+    // Synthesize a sidecar with expected_version that DOES NOT match
+    // the current manifest pin AND post_commit_pin == lance_head →
+    // strict Mutation classifier sees lance_head == manifest_pinned + 1
+    // but expected != manifest_pinned → UnexpectedAtP1. decide → RollBack.
+    //
+    // expected_version must be a REAL Lance version (`restore_table_to_version`
+    // calls `checkout_version` on it, and an unknown version errors). Use
+    // manifest_pin - 1 which exists from the bootstrap commit chain.
+    let bogus_expected = manifest_pin - 1;
+    let bogus_post = head_after_drift;
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000RBCK",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-rollback",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key":"node:Person",
+                    "table_path":"{}",
+                    "expected_version":{},
+                    "post_commit_pin":{}
+                }}
+            ]
+        }}"#,
+        person_uri, bogus_expected, bogus_post,
+    );
+    let recovery_dir = dir.path().join("__recovery");
+    std::fs::create_dir_all(&recovery_dir).unwrap();
+    std::fs::write(
+        recovery_dir.join("01H0000000000000000000RBCK.json"),
+        &sidecar_json,
+    )
+    .unwrap();
+
+    // Capture pre-refresh Lance HEAD on Person.
+    let pre_head = lance::Dataset::open(&person_uri)
+        .await
+        .unwrap()
+        .version()
+        .version;
+
+    // Trigger refresh-time recovery directly. Sidecar is rollback-
+    // eligible (UnexpectedAtP1); RollForwardOnly mode defers it,
+    // leaving the sidecar on disk and Lance HEAD unchanged on Person.
+    db.refresh()
+        .await
+        .expect("refresh must succeed (deferring rollback)");
+
+    // Sidecar still on disk.
+    assert_eq!(
+        std::fs::read_dir(&recovery_dir).unwrap().count(),
+        1,
+        "rollback-eligible sidecar must be deferred to next ReadWrite open",
+    );
+
+    // Lance HEAD on Person unchanged — no restore ran.
+    let post_head = lance::Dataset::open(&person_uri)
+        .await
+        .unwrap()
+        .version()
+        .version;
+    assert_eq!(
+        pre_head, post_head,
+        "refresh-time recovery must NOT call Dataset::restore on Person; \
+         pre_head={pre_head}, post_head={post_head}",
+    );
+
+    // Cross-check: drop the engine and reopen — full sweep handles
+    // the rollback (will use Dataset::restore safely; no concurrent
+    // writers at open time).
+    drop(db);
+    let _db = Omnigraph::open(&uri).await.unwrap();
+    // After full-sweep recovery, the sidecar should be processed
+    // (deleted). Sidecar's tables are eligible for rollback (UnexpectedAtP1):
+    // restore happens on Person (HEAD advances by 1).
+    let remaining = if recovery_dir.exists() {
+        std::fs::read_dir(&recovery_dir).unwrap().count()
+    } else {
+        0
     };
     assert_eq!(
-        table_key, "node:Person",
-        "drift should be on the table the failed finalize touched"
+        remaining, 0,
+        "full sweep at next open must process the deferred sidecar",
     );
+    let final_head = lance::Dataset::open(&person_uri)
+        .await
+        .unwrap()
+        .version()
+        .version;
     assert!(
-        actual > expected,
-        "Lance HEAD on the drifted table should be ahead of manifest pinned: actual={actual} expected={expected}",
+        final_head > post_head,
+        "full sweep must run Dataset::restore (head advances); \
+         post_head={post_head}, final_head={final_head}",
     );
 }
 
@@ -242,8 +782,7 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
         .unwrap();
 
     {
-        let _failpoint =
-            ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
+        let _failpoint = ScopedFailPoint::new("mutation.post_finalize_pre_publisher", "return");
         let _ = mutate_main(
             &mut db,
             MUTATION_QUERIES,
@@ -265,11 +804,11 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
     .expect("Company write on a non-drifted table should succeed");
 }
 
-/// MR-793 Phase 4 acceptance bar — proves that a Phase A failure in
-/// the staged-index path (`stage_create_btree_index` succeeded;
-/// `commit_staged` not yet called) leaves NO Lance-HEAD drift on the
-/// existing tables. Subsequent operations against those tables succeed
-/// without `ExpectedVersionMismatch`.
+/// Acceptance test: a stage-step failure in the staged-index path
+/// (`stage_create_btree_index` succeeded; `commit_staged` not yet
+/// called) leaves NO Lance-HEAD drift on the existing tables.
+/// Subsequent operations against those tables succeed without
+/// `ExpectedVersionMismatch`.
 ///
 /// Path: `apply_schema(v1 → v2)` adds a new node type. The
 /// `added_tables` loop in `schema_apply` creates the empty dataset and
@@ -289,7 +828,7 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
 /// by `cleanup_old_versions` (or removed when a future apply at the
 /// same target path resolves the rename).
 #[tokio::test]
-async fn ensure_indices_phase_a_btree_failure_leaves_existing_tables_writable() {
+async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() {
     let _scenario = FailScenario::setup();
     let dir = tempfile::tempdir().unwrap();
     let uri = dir.path().to_str().unwrap().to_string();
@@ -303,13 +842,14 @@ async fn ensure_indices_phase_a_btree_failure_leaves_existing_tables_writable()
     // node:Project table's btree-on-id build. (TEST_SCHEMA already
     // has Person + Company + Knows + WorksAt — pick a name that isn't
     // already declared.)
-    let extended_schema = format!("{}\nnode Project {{ name: String @key }}\n", helpers::TEST_SCHEMA);
+    let extended_schema = format!(
+        "{}\nnode Project {{ name: String @key }}\n",
+        helpers::TEST_SCHEMA
+    );
 
     {
-        let _failpoint = ScopedFailPoint::new(
-            "ensure_indices.post_stage_pre_commit_btree",
-            "return",
-        );
+        let _failpoint =
+            ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
         let err = db.apply_schema(&extended_schema).await.unwrap_err();
         assert!(
             err.to_string()
@@ -344,3 +884,731 @@ fn assert_no_staging_files(repo: &std::path::Path) {
         );
     }
 }
+
+// =====================================================================
+// Per-writer Phase B → Phase C recovery integration
+// =====================================================================
+//
+// Each of the four migrated writers writes a sidecar BEFORE its
+// per-table commit_staged loop and deletes it AFTER the manifest
+// publish. The `recovery_rolls_forward_after_finalize_publisher_failure`
+// test above covers MutationStaging::finalize. The three tests below
+// cover the other three writers: schema_apply, branch_merge,
+// ensure_indices.
+//
+// Each follows the same shape: trigger the writer with a failpoint
+// active in the Phase B → Phase C window, drop the engine, reopen,
+// assert recovery rolled forward (manifest pin advanced, audit row
+// recorded, sidecar deleted) and a follow-up operation succeeds without
+// ExpectedVersionMismatch.
+
+#[tokio::test]
+async fn schema_apply_without_schema_staging_rolls_back_on_next_open() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
+
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+    }
+
+    let pre_failure_version = {
+        let db = Omnigraph::open(&uri).await.unwrap();
+        version_main(&db).await.unwrap()
+    };
+
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint = ScopedFailPoint::new("schema_apply.before_staging_write", "return");
+        let v2_schema = r#"node Person {
+    name: String @key
+    age: I32?
+    city: String?
+}
+
+node Company {
+    name: String @key
+}
+
+node Tag {
+    label: String @key
+}
+
+edge Knows: Person -> Person {
+    since: Date?
+}
+
+edge WorksAt: Person -> Company
+"#;
+        let err = db.apply_schema(v2_schema).await.unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("injected failpoint triggered: schema_apply.before_staging_write"),
+            "unexpected error: {err}"
+        );
+        operation_id = single_sidecar_operation_id(dir.path());
+    }
+
+    let db = Omnigraph::open(&uri).await.unwrap();
+    assert_eq!(
+        version_main(&db).await.unwrap(),
+        pre_failure_version,
+        "manifest must remain on the old schema when no schema staging files existed"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        1,
+        "old-schema data must remain readable after rollback"
+    );
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledBack {
+            tables: vec![TableExpectation::main("node:Person")],
+        },
+    )
+    .await
+    .unwrap();
+
+    let live_schema = std::fs::read_to_string(dir.path().join("_schema.pg")).unwrap();
+    assert!(
+        !live_schema.contains("city: String?"),
+        "_schema.pg must keep the OLD schema when staging files never existed; got:\n{live_schema}",
+    );
+    assert!(
+        !live_schema.contains("node Tag"),
+        "_schema.pg must keep the OLD schema when staging files never existed; got:\n{live_schema}",
+    );
+}
+
+#[tokio::test]
+async fn schema_apply_phase_b_failure_recovered_on_next_open() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
+
+    // Seed: a Person table with one row so the schema-apply rewritten_tables
+    // loop has actual work to do.
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+    }
+
+    // Capture pre-failure manifest version so we can assert the recovery
+    // sweep advances it.
+    let pre_failure_version = {
+        let db = Omnigraph::open(&uri).await.unwrap();
+        version_main(&db).await.unwrap()
+    };
+
+    // Setup: trigger the residual via `schema_apply.after_staging_write`.
+    // This failpoint fires AFTER the rewritten_tables/indexed_tables loops
+    // (Lance HEAD advanced) AND AFTER the schema-state staging files are
+    // written, but BEFORE the manifest publish. The recovery sidecar persists.
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint = ScopedFailPoint::new("schema_apply.after_staging_write", "return");
+        // v2 schema: add a `city` property to Person AND add a new
+        // `Tag` node type. The new property triggers the rewritten_tables
+        // path (Phase B sidecar coverage). The new type changes the
+        // overall table set — required to keep `recover_schema_state_files`
+        // (which runs BEFORE recover_manifest_drift) happy: it can't
+        // disambiguate property-only migrations and would reject the
+        // open before the recovery sweep ever ran.
+        let v2_schema = r#"node Person {
+    name: String @key
+    age: I32?
+    city: String?
+}
+
+node Company {
+    name: String @key
+}
+
+node Tag {
+    label: String @key
+}
+
+edge Knows: Person -> Person {
+    since: Date?
+}
+
+edge WorksAt: Person -> Company
+"#;
+        let err = db.apply_schema(v2_schema).await.unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("injected failpoint triggered: schema_apply.after_staging_write"),
+            "unexpected error: {err}"
+        );
+
+        // Sidecar must still exist.
+        let recovery_dir = dir.path().join("__recovery");
+        let sidecars: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert_eq!(
+            sidecars.len(),
+            1,
+            "exactly one sidecar must persist after schema_apply failure"
+        );
+        operation_id = single_sidecar_operation_id(dir.path());
+    }
+
+    // Recovery: reopen runs the recovery sweep. Sidecar's writer_kind is
+    // SchemaApply (loose-match) — classifier accepts the multi-commit
+    // drift on Person, decision is RollForward, manifest extends to the
+    // current Lance HEAD.
+    let db = Omnigraph::open(&uri).await.unwrap();
+
+    // Recovery sweep must have advanced the manifest pin on the rewritten
+    // table: roll-forward published the post-failure Lance HEAD.
+    let post_recovery_version = version_main(&db).await.unwrap();
+    assert!(
+        post_recovery_version > pre_failure_version,
+        "manifest version must advance post-recovery; pre={pre_failure_version}, \
+         post={post_recovery_version}",
+    );
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledForward {
+            tables: vec![TableExpectation::main("node:Person")],
+        },
+    )
+    .await
+    .unwrap();
+
+    // Schema-apply atomicity: the live `_schema.pg` must reflect the
+    // NEW schema (city column on Person, Tag node type) — not the old.
+    // Without the schema-staging coordination, the schema-state
+    // recovery would have deleted the staging files (because manifest
+    // hadn't advanced when it ran), leaving a corrupt repo with new-
+    // schema data on disk but old-schema catalog.
+    let live_schema = std::fs::read_to_string(dir.path().join("_schema.pg")).unwrap();
+    assert!(
+        live_schema.contains("city: String?"),
+        "_schema.pg must reflect the NEW schema (city column added); got:\n{live_schema}",
+    );
+    assert!(
+        live_schema.contains("node Tag"),
+        "_schema.pg must reflect the NEW schema (Tag type added); got:\n{live_schema}",
+    );
+
+    // Catalog ↔ manifest agreement: the new `node:Tag` type the schema
+    // declares must have a manifest entry the engine can read against.
+    // Without registrations / tombstones in the sidecar, recovery's
+    // `roll_forward_all` only publishes Updates for rewritten tables;
+    // added tables (Tag) end up as orphan datasets on disk with no
+    // manifest entry, and the live schema declares a type the manifest
+    // doesn't know about.
+    let db = Omnigraph::open(&uri).await.unwrap();
+    let tag_rows = helpers::count_rows(&db, "node:Tag").await;
+    assert_eq!(
+        tag_rows, 0,
+        "node:Tag must have a manifest entry (with 0 rows) post-recovery; \
+         a panic here means recovery failed to register the added table"
+    );
+}
+
+#[tokio::test]
+async fn branch_merge_phase_b_failure_recovered_on_next_open() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+
+    // Seed main with a row, branch off, mutate BOTH sides so the merge
+    // produces at least one `RewriteMerged` candidate (target moved past
+    // base too — required for the recovery sidecar to pin anything; the
+    // sidecar only pins RewriteMerged candidates because they're the
+    // only path that always advances Lance HEAD via
+    // `publish_rewritten_merge_table`).
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+        db.branch_create("feature").await.unwrap();
+        db.mutate(
+            "feature",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Bob")], &[("$age", 40)]),
+        )
+        .await
+        .unwrap();
+        // Mutate main too so the merge sees target ≠ base for Person —
+        // forces RewriteMerged classification.
+        mutate_main(
+            &mut db,
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Carol")], &[("$age", 50)]),
+        )
+        .await
+        .unwrap();
+    }
+
+    // Capture pre-failure state on main for post-recovery comparison.
+    let pre_failure_version = {
+        let db = Omnigraph::open(&uri).await.unwrap();
+        version_main(&db).await.unwrap()
+    };
+
+    // Setup: failpoint fires after the per-table publish loop completes
+    // but before commit_manifest_updates. Sidecar persists.
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint =
+            ScopedFailPoint::new("branch_merge.post_phase_b_pre_manifest_commit", "return");
+        let err = db.branch_merge("feature", "main").await.unwrap_err();
+        assert!(
+            err.to_string().contains(
+                "injected failpoint triggered: branch_merge.post_phase_b_pre_manifest_commit"
+            ),
+            "unexpected error: {err}"
+        );
+
+        let recovery_dir = dir.path().join("__recovery");
+        let sidecars: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert_eq!(
+            sidecars.len(),
+            1,
+            "exactly one sidecar must persist after branch_merge failure"
+        );
+    }
+
+    // Recovery: reopen runs the sweep. BranchMerge uses LOOSE
+    // classification — `publish_rewritten_merge_table` runs multiple
+    // commit_staged calls per table (stage_merge_insert + delete_where +
+    // index rebuilds), so post_commit_pin in the sidecar is a lower
+    // bound; the loose-match classifier accepts any HEAD > expected_version
+    // when expected_version == manifest_pinned.
+    let db = Omnigraph::open(&uri).await.unwrap();
+
+    let recovery_dir = dir.path().join("__recovery");
+    if recovery_dir.exists() {
+        let remaining: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert!(
+            remaining.is_empty(),
+            "sidecar must be deleted; remaining: {:?}",
+            remaining,
+        );
+    }
+    let audit_dir = dir.path().join("_graph_commit_recoveries.lance");
+    assert!(
+        audit_dir.exists(),
+        "_graph_commit_recoveries.lance must exist after branch_merge recovery"
+    );
+
+    // Recovery must have advanced main's manifest pin (the merge published).
+    let post_recovery_version = version_main(&db).await.unwrap();
+    assert!(
+        post_recovery_version > pre_failure_version,
+        "manifest version must advance post-recovery; pre={pre_failure_version}, \
+         post={post_recovery_version}",
+    );
+
+    // The recovered branch_merge must record a MERGE commit (with
+    // `merged_parent_commit_id` set), not a plain commit. Without
+    // this, future merges between the same pair lose
+    // already-up-to-date detection. We verify by reading
+    // `_graph_commits.lance` and asserting the most recent commit
+    // tagged with the recovery actor has a non-null
+    // `merged_parent_commit_id`.
+    {
+        use arrow_array::{Array, StringArray};
+        use futures::TryStreamExt;
+        let commits_dir = dir.path().join("_graph_commits.lance");
+        let ds = lance::Dataset::open(commits_dir.to_str().unwrap())
+            .await
+            .unwrap();
+        let batches: Vec<arrow_array::RecordBatch> = ds
+            .scan()
+            .try_into_stream()
+            .await
+            .unwrap()
+            .try_collect()
+            .await
+            .unwrap();
+        let mut found_recovery_merge = false;
+        for batch in batches {
+            let merged = batch
+                .column_by_name("merged_parent_commit_id")
+                .expect("merged_parent_commit_id column present")
+                .as_any()
+                .downcast_ref::<StringArray>()
+                .expect("merged_parent_commit_id is Utf8");
+            // The actor_id lives in _graph_commit_actors; cross-checking
+            // is heavier than necessary. Detecting any non-null
+            // merged_parent_commit_id in the post-recovery state is
+            // sufficient: only a recovered branch_merge can produce one
+            // here (we never completed a normal merge in this test).
+            for i in 0..merged.len() {
+                if !merged.is_null(i) {
+                    found_recovery_merge = true;
+                    break;
+                }
+            }
+        }
+        assert!(
+            found_recovery_merge,
+            "recovered branch_merge must record `merged_parent_commit_id` so future \
+             merges detect already-up-to-date — no merge-parent-tagged commit found",
+        );
+    }
+    drop(db);
+}
+
+/// Branch-axis variant of the branch_merge recovery test: target is a
+/// non-main branch. Catches the branch-specific commit-graph head bug
+/// (D2) — without `CommitGraph::open_at_branch`, the recovery sweep
+/// would record the global head as the merge parent on a non-main
+/// target, and future merges between the same pair would lose
+/// already-up-to-date detection.
+#[tokio::test]
+async fn branch_merge_phase_b_failure_recovered_on_non_main_target() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+    let operation_id;
+    let target_parent_commit_id;
+
+    // Setup:
+    //   main: alice
+    //   target_branch (off main): + bob (target moved past base)
+    //   source_branch (off main): + carol (source moved past base)
+    // Merge: source_branch → target_branch
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+        db.branch_create("target_branch").await.unwrap();
+        db.mutate(
+            "target_branch",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Bob")], &[("$age", 40)]),
+        )
+        .await
+        .unwrap();
+        db.branch_create("source_branch").await.unwrap();
+        db.mutate(
+            "source_branch",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Carol")], &[("$age", 50)]),
+        )
+        .await
+        .unwrap();
+    }
+
+    let main_person_pin = {
+        let db = Omnigraph::open(&uri).await.unwrap();
+        db.snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+            .await
+            .unwrap()
+            .entry("node:Person")
+            .expect("main must have Person")
+            .table_version
+    };
+    target_parent_commit_id = branch_head_commit_id(dir.path(), "target_branch")
+        .await
+        .unwrap();
+
+    // Setup: failpoint fires after the per-table publish loop completes
+    // but before commit_manifest_updates. Sidecar persists with
+    // branch=Some("target_branch").
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint =
+            ScopedFailPoint::new("branch_merge.post_phase_b_pre_manifest_commit", "return");
+        let err = db
+            .branch_merge("source_branch", "target_branch")
+            .await
+            .unwrap_err();
+        assert!(
+            err.to_string().contains(
+                "injected failpoint triggered: branch_merge.post_phase_b_pre_manifest_commit"
+            ),
+            "unexpected error: {err}"
+        );
+        let recovery_dir = dir.path().join("__recovery");
+        let sidecar_count = std::fs::read_dir(&recovery_dir).unwrap().count();
+        assert_eq!(
+            sidecar_count, 1,
+            "exactly one sidecar must persist after non-main branch_merge failure"
+        );
+        operation_id = single_sidecar_operation_id(dir.path());
+    }
+
+    // Recovery: reopen runs full sweep. The BranchMerge sidecar's branch
+    // = Some("target_branch"); D2 fix opens a per-branch CommitGraph
+    // for the audit append so the merge-parent linkage is correct.
+    let db = Omnigraph::open(&uri).await.unwrap();
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        &operation_id,
+        RecoveryExpectation::RolledForward {
+            tables: vec![
+                TableExpectation::branch("node:Person", "target_branch")
+                    .expected_main_manifest_pin(main_person_pin)
+                    .expected_recovery_parent_commit_id(target_parent_commit_id),
+            ],
+        },
+    )
+    .await
+    .unwrap();
+}
+
+/// Contract: the BranchMerge sidecar's per-table `table_branch` MUST be
+/// the merge target branch (where commits land via
+/// `publish_rewritten_merge_table` → `open_for_mutation` → potentially
+/// `fork_dataset_from_entry_state`), NOT `entry.table_branch` (where
+/// the table currently lives in the target's manifest snapshot).
+///
+/// `ensure_indices_for_branch` already has this invariant pinned by an
+/// explicit comment at `table_ops.rs:115-120`. Without the same fix in
+/// `merge.rs`, a future change to candidate selection or the publish
+/// path that produces a `RewriteMerged` whose entry.table_branch
+/// diverges from active_branch would silently drift Lance HEAD on the
+/// target ref while recovery checks the wrong ref and no-ops the
+/// rollback.
+///
+/// This test reads the sidecar JSON directly and asserts every per-pin
+/// `table_branch` equals the active (target) branch. Even when the
+/// values happen to coincide in practice (the strict candidate logic
+/// keeps RewriteMerged tables on active_branch), the contract assertion
+/// catches a regression that reverts to `entry.table_branch.clone()`.
+#[tokio::test]
+async fn branch_merge_sidecar_pins_table_branch_to_active_branch() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+        db.branch_create("target_branch").await.unwrap();
+        db.mutate(
+            "target_branch",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Bob")], &[("$age", 40)]),
+        )
+        .await
+        .unwrap();
+        db.branch_create("source_branch").await.unwrap();
+        db.mutate(
+            "source_branch",
+            MUTATION_QUERIES,
+            "insert_person",
+            &mixed_params(&[("$name", "Carol")], &[("$age", 50)]),
+        )
+        .await
+        .unwrap();
+    }
+
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint =
+            ScopedFailPoint::new("branch_merge.post_phase_b_pre_manifest_commit", "return");
+        let _ = db
+            .branch_merge("source_branch", "target_branch")
+            .await
+            .expect_err("failpoint must fire");
+    }
+
+    let operation_id = single_sidecar_operation_id(dir.path());
+    let sidecar_path = dir
+        .path()
+        .join("__recovery")
+        .join(format!("{operation_id}.json"));
+    let sidecar_json = std::fs::read_to_string(&sidecar_path).unwrap();
+    let sidecar: serde_json::Value = serde_json::from_str(&sidecar_json).unwrap();
+
+    let tables = sidecar["tables"]
+        .as_array()
+        .expect("sidecar tables must be an array");
+    assert!(
+        !tables.is_empty(),
+        "sidecar must pin at least one RewriteMerged table — both branches mutated Person"
+    );
+    for pin in tables {
+        let table_branch = pin
+            .get("table_branch")
+            .and_then(|v| v.as_str())
+            .unwrap_or_else(|| {
+                panic!(
+                    "sidecar pin must record table_branch as the merge target (active_branch); \
+                     got pin {pin:?}"
+                )
+            });
+        assert_eq!(
+            table_branch, "target_branch",
+            "sidecar pin must record `table_branch` as the merge target branch (where \
+             commits actually land via publish_rewritten_merge_table → open_for_mutation), \
+             NOT entry.table_branch from the target snapshot. See merge.rs filter_map and \
+             the rationale comment at table_ops.rs:115-120. Got pin: {pin:?}"
+        );
+    }
+}
+
+/// `ensure_indices` only writes a sidecar when at least one table
+/// genuinely needs index work (per `needs_index_work_*` helpers in
+/// `db/omnigraph/table_ops.rs`). When all tables are steady-state
+/// (every declared index already built, or empty tables that the loop
+/// skips), the sidecar is omitted entirely.
+///
+/// Test setup: `load_jsonl` auto-builds indices via
+/// `prepare_updates_for_commit`. So after the load, every Person/Knows
+/// index is built and Company is empty. `ensure_indices` correctly
+/// produces zero pins → no sidecar. The failpoint still fires (it sits
+/// after the loops), so the call returns Err — but no recovery state
+/// persists. Reopen is a clean no-op.
+///
+/// Triggering an actual sidecar persistence requires bypassing
+/// `load_jsonl`'s auto-build via raw `TableStore::append_batch` — the
+/// helper-direct path. That's covered structurally by the
+/// `needs_index_work_*` code path and the
+/// `recovery_ensure_indices_handles_empty_tables` integration test.
+#[tokio::test]
+async fn ensure_indices_phase_b_failure_does_not_leak_sidecar_when_no_work_needed() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let _scenario = FailScenario::setup();
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap().to_string();
+
+    // Seed: load_jsonl auto-builds Person's indices via
+    // prepare_updates_for_commit. After this, ensure_indices has no
+    // work to do (steady state).
+    {
+        let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
+        load_jsonl(
+            &mut db,
+            r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Person","data":{"name":"bob","age":25}}
+"#,
+            LoadMode::Append,
+        )
+        .await
+        .unwrap();
+    }
+
+    // Setup: trigger the failpoint. Steady-state ensure_indices
+    // produces zero sidecar pins (the helpers scope pins to tables
+    // that genuinely need work); no sidecar is written. The failpoint
+    // still fires, surfacing the Err.
+    {
+        let mut db = Omnigraph::open(&uri).await.unwrap();
+        let _failpoint =
+            ScopedFailPoint::new("ensure_indices.post_phase_b_pre_manifest_commit", "return");
+        let err = db.ensure_indices().await.unwrap_err();
+        assert!(
+            err.to_string().contains(
+                "injected failpoint triggered: ensure_indices.post_phase_b_pre_manifest_commit"
+            ),
+            "unexpected error: {err}"
+        );
+
+        // KEY ASSERTION: no sidecar persists, because the helpers
+        // scope pins to tables that genuinely need work. Steady-state
+        // = no pins = no sidecar = no recovery state = zero open-time
+        // overhead.
+        let recovery_dir = dir.path().join("__recovery");
+        let sidecars: Vec<_> = if recovery_dir.exists() {
+            std::fs::read_dir(&recovery_dir)
+                .unwrap()
+                .filter_map(|e| e.ok())
+                .collect()
+        } else {
+            Vec::new()
+        };
+        assert!(
+            sidecars.is_empty(),
+            "steady-state ensure_indices must not leave a sidecar; got {:?}",
+            sidecars,
+        );
+    }
+
+    // Recovery: reopen is a clean no-op (no sidecar to recover).
+    let _db = Omnigraph::open(&uri).await.unwrap();
+
+    let recovery_dir = dir.path().join("__recovery");
+    if recovery_dir.exists() {
+        let remaining: Vec<_> = std::fs::read_dir(&recovery_dir)
+            .unwrap()
+            .filter_map(|e| e.ok())
+            .collect();
+        assert!(
+            remaining.is_empty(),
+            "sidecar must remain deleted; remaining: {:?}",
+            remaining,
+        );
+    }
+    // No audit row expected — no sidecar was processed.
+    let audit_dir = dir.path().join("_graph_commit_recoveries.lance");
+    assert!(
+        !audit_dir.exists(),
+        "_graph_commit_recoveries.lance must NOT exist when no sidecar was processed"
+    );
+}
diff --git a/crates/omnigraph/tests/forbidden_apis.rs b/crates/omnigraph/tests/forbidden_apis.rs
index 055d96a..cc9f163 100644
--- a/crates/omnigraph/tests/forbidden_apis.rs
+++ b/crates/omnigraph/tests/forbidden_apis.rs
@@ -1,4 +1,4 @@
-//! MR-793 Phase 3 — forbidden-API guard test.
+//! Forbidden-API guard test.
 //!
 //! Engine code (`exec/`, `db/omnigraph/`, `loader/`, `changes/`) MUST NOT
 //! call Lance's inline-commit data-write APIs directly. The
@@ -29,15 +29,15 @@
 //!   the cross-table manifest commit. Documented exception.
 //! - `crates/omnigraph/src/storage_layer.rs` — IS the trait module.
 //!
-//! ## Initial state (MR-793 Phase 3)
+//! ## Transitional allow-list
 //!
-//! At the time this test was written, MR-793 has migrated three writers
-//! (ensure_indices, branch_merge, schema_apply rewrites) onto staged
-//! primitives. Other engine call sites (the bulk loader, exec/mutation,
-//! exec/query, etc.) still use the legacy inherent `TableStore` methods
-//! — they're not visible at the trait boundary, but they DO call lance
-//! types. The allow-list below reflects this transitional state. Phase 9
-//! tightens the allow-list as call sites migrate.
+//! The migration of writers onto staged primitives is incremental.
+//! Several writers (ensure_indices, branch_merge, schema_apply rewrites)
+//! already route through the staged primitives; others (bulk loader,
+//! exec/mutation, exec/query) still use the legacy inherent
+//! `TableStore` methods — they're not visible at the trait boundary, but
+//! they DO call lance types. The file-level allow-list below reflects
+//! this transitional state and tightens as call sites migrate.
 
 use std::path::{Path, PathBuf};
 
@@ -99,6 +99,7 @@ const ALLOW_LIST_FILES: &[&str] = &[
     "storage_layer.rs",      // The trait module.
     "commit_graph.rs",       // Maintains `_graph_commits.lance` system table.
     "graph_coordinator.rs",  // Drives the manifest publisher / branch coordinator.
+    "recovery_audit.rs",     // Maintains `_graph_commit_recoveries.lance` (recovery audit trail).
 ];
 
 /// Directories exempt from the guard. Files under these paths may use
@@ -202,7 +203,7 @@ fn engine_code_does_not_call_forbidden_lance_apis() {
 
     if !violations.is_empty() {
         panic!(
-            "MR-793 forbidden-API guard found {} violation(s) in engine code. \
+            "Forbidden-API guard found {} violation(s) in engine code. \
              Engine code MUST route through the `TableStorage` trait (or its \
              inherent counterparts on `TableStore`) instead of calling Lance's \
              inline-commit APIs directly. If a use is genuinely justified, add \
diff --git a/crates/omnigraph/tests/helpers/mod.rs b/crates/omnigraph/tests/helpers/mod.rs
index 66db1a7..e7e1efb 100644
--- a/crates/omnigraph/tests/helpers/mod.rs
+++ b/crates/omnigraph/tests/helpers/mod.rs
@@ -1,5 +1,7 @@
 #![allow(dead_code)]
 
+pub mod recovery;
+
 use arrow_array::{Array, RecordBatch, StringArray};
 use futures::TryStreamExt;
 
diff --git a/crates/omnigraph/tests/helpers/recovery.rs b/crates/omnigraph/tests/helpers/recovery.rs
new file mode 100644
index 0000000..3a8505f
--- /dev/null
+++ b/crates/omnigraph/tests/helpers/recovery.rs
@@ -0,0 +1,580 @@
+use std::path::{Path, PathBuf};
+
+use arrow_array::{Array, RecordBatch, StringArray};
+use futures::TryStreamExt;
+use lance::Dataset;
+use omnigraph::db::commit_graph::CommitGraph;
+use omnigraph::db::{GraphCommit, Omnigraph, ReadTarget, SubTableEntry};
+use omnigraph::error::{OmniError, Result};
+use omnigraph_compiler::ir::ParamMap;
+use serde::Deserialize;
+
+const RECOVERY_ACTOR: &str = "omnigraph:recovery";
+
+#[derive(Debug)]
+pub enum RecoveryExpectation {
+    RolledForward { tables: Vec<TableExpectation> },
+    RolledBack { tables: Vec<TableExpectation> },
+    Deferred,
+    NoOp,
+}
+
+#[derive(Debug)]
+pub struct TableExpectation {
+    pub table_key: String,
+    pub branch: Option<String>,
+    pub expected_lance_head: Option<u64>,
+    pub expected_main_manifest_pin: Option<u64>,
+    pub expected_recovery_parent_commit_id: Option<String>,
+    pub follow_up_mutation: Option<FollowUpMutation>,
+}
+
+#[derive(Debug)]
+pub struct FollowUpMutation {
+    pub branch: String,
+    pub query_source: String,
+    pub query_name: String,
+    pub params: ParamMap,
+}
+
+#[derive(Debug, Clone)]
+struct RecoveryAuditRow {
+    graph_commit_id: String,
+    recovery_kind: String,
+    operation_id: String,
+    sidecar_writer_kind: String,
+    per_table_outcomes: Vec<TableOutcome>,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+struct TableOutcome {
+    table_key: String,
+    from_version: u64,
+    to_version: u64,
+}
+
+impl TableExpectation {
+    pub fn main(table_key: impl Into<String>) -> Self {
+        Self::new(table_key, None::<String>)
+    }
+
+    pub fn branch(table_key: impl Into<String>, branch: impl Into<String>) -> Self {
+        Self::new(table_key, Some(branch))
+    }
+
+    pub fn new(table_key: impl Into<String>, branch: Option<impl Into<String>>) -> Self {
+        Self {
+            table_key: table_key.into(),
+            branch: branch.map(Into::into),
+            expected_lance_head: None,
+            expected_main_manifest_pin: None,
+            expected_recovery_parent_commit_id: None,
+            follow_up_mutation: None,
+        }
+    }
+
+    pub fn expected_lance_head(mut self, version: u64) -> Self {
+        self.expected_lance_head = Some(version);
+        self
+    }
+
+    pub fn expected_main_manifest_pin(mut self, version: u64) -> Self {
+        self.expected_main_manifest_pin = Some(version);
+        self
+    }
+
+    pub fn expected_recovery_parent_commit_id(mut self, commit_id: impl Into<String>) -> Self {
+        self.expected_recovery_parent_commit_id = Some(commit_id.into());
+        self
+    }
+
+    pub fn follow_up_mutation(mut self, mutation: FollowUpMutation) -> Self {
+        self.follow_up_mutation = Some(mutation);
+        self
+    }
+}
+
+impl FollowUpMutation {
+    pub fn new(
+        branch: impl Into<String>,
+        query_source: impl Into<String>,
+        query_name: impl Into<String>,
+        params: ParamMap,
+    ) -> Self {
+        Self {
+            branch: branch.into(),
+            query_source: query_source.into(),
+            query_name: query_name.into(),
+            params,
+        }
+    }
+}
+
+pub fn single_sidecar_operation_id(repo_root: &Path) -> String {
+    let ids = sidecar_operation_ids(repo_root);
+    assert_eq!(
+        ids.len(),
+        1,
+        "expected exactly one recovery sidecar under __recovery/, got {:?}",
+        ids,
+    );
+    ids.into_iter().next().unwrap()
+}
+
+pub fn sidecar_operation_ids(repo_root: &Path) -> Vec<String> {
+    let dir = repo_root.join("__recovery");
+    if !dir.exists() {
+        return Vec::new();
+    }
+    let mut ids = std::fs::read_dir(&dir)
+        .unwrap()
+        .filter_map(|entry| {
+            let entry = entry.ok()?;
+            let path = entry.path();
+            if path.extension().and_then(|ext| ext.to_str()) != Some("json") {
+                return None;
+            }
+            path.file_stem()
+                .and_then(|stem| stem.to_str())
+                .map(str::to_string)
+        })
+        .collect::<Vec<_>>();
+    ids.sort();
+    ids
+}
+
+pub async fn branch_head_commit_id(repo_root: &Path, branch: &str) -> Result<String> {
+    let graph = match branch {
+        "main" => CommitGraph::open(&repo_uri(repo_root)).await?,
+        branch => CommitGraph::open_at_branch(&repo_uri(repo_root), branch).await?,
+    };
+    graph.head_commit_id().await?.ok_or_else(|| {
+        OmniError::manifest_internal(format!("commit graph for branch {branch} has no head"))
+    })
+}
+
+pub async fn assert_post_recovery_invariants(
+    repo_root: &Path,
+    operation_id: &str,
+    expectation: RecoveryExpectation,
+) -> Result<()> {
+    match expectation {
+        RecoveryExpectation::RolledForward { tables } => {
+            assert_sidecar_absent(repo_root, operation_id);
+            let audit = read_audit_row(repo_root, operation_id).await?;
+            assert_eq!(
+                audit.recovery_kind, "RolledForward",
+                "audit row for {operation_id} recorded the wrong recovery_kind",
+            );
+            assert_manifest_pins_match_lance_heads(repo_root, &tables).await?;
+            assert_audit_to_versions_match_lance_heads(repo_root, &audit, &tables).await?;
+            assert_recovery_commit_shape(repo_root, &audit, &tables).await?;
+            assert_non_main_did_not_move_main(repo_root, &tables).await?;
+            assert_idempotent_reopen(repo_root, operation_id).await?;
+            run_follow_up_mutations(repo_root, tables).await?;
+        }
+        RecoveryExpectation::RolledBack { tables } => {
+            assert_sidecar_absent(repo_root, operation_id);
+            let audit = read_audit_row(repo_root, operation_id).await?;
+            assert_eq!(
+                audit.recovery_kind, "RolledBack",
+                "audit row for {operation_id} recorded the wrong recovery_kind",
+            );
+            assert_rollback_outcomes_record_drift(&audit);
+            assert_recovery_commit_shape(repo_root, &audit, &tables).await?;
+            assert_non_main_did_not_move_main(repo_root, &tables).await?;
+            assert_idempotent_reopen(repo_root, operation_id).await?;
+            run_follow_up_mutations(repo_root, tables).await?;
+        }
+        RecoveryExpectation::Deferred => {
+            assert!(
+                sidecar_path(repo_root, operation_id).exists(),
+                "deferred recovery must leave sidecar {operation_id} on disk",
+            );
+            assert!(
+                read_audit_row(repo_root, operation_id).await.is_err(),
+                "deferred recovery must not record an audit row for {operation_id}",
+            );
+        }
+        RecoveryExpectation::NoOp => {
+            assert_sidecar_absent(repo_root, operation_id);
+            assert!(
+                read_audit_row(repo_root, operation_id).await.is_err(),
+                "no-op recovery must not record an audit row for {operation_id}",
+            );
+        }
+    }
+
+    Ok(())
+}
+
+fn branch_context(tables: &[TableExpectation]) -> Option<String> {
+    tables
+        .iter()
+        .filter_map(|table| table.branch.as_deref())
+        .find(|branch| *branch != "main")
+        .map(str::to_string)
+}
+
+fn sidecar_path(repo_root: &Path, operation_id: &str) -> PathBuf {
+    repo_root
+        .join("__recovery")
+        .join(format!("{operation_id}.json"))
+}
+
+fn assert_sidecar_absent(repo_root: &Path, operation_id: &str) {
+    assert!(
+        !sidecar_path(repo_root, operation_id).exists(),
+        "recovery sidecar {operation_id} must be deleted after successful recovery",
+    );
+}
+
+async fn assert_manifest_pins_match_lance_heads(
+    repo_root: &Path,
+    tables: &[TableExpectation],
+) -> Result<()> {
+    let uri = repo_uri(repo_root);
+    let db = Omnigraph::open(&uri).await?;
+    for table in tables {
+        let (entry, lance_head) = entry_and_lance_head(&db, &uri, table).await?;
+        assert_eq!(
+            entry.table_version, lance_head,
+            "manifest pin for {} on {:?} must match Lance HEAD after roll-forward",
+            table.table_key, table.branch,
+        );
+        if let Some(expected) = table.expected_lance_head {
+            assert_eq!(
+                lance_head, expected,
+                "Lance HEAD for {} on {:?} did not match the test's expected value",
+                table.table_key, table.branch,
+            );
+        }
+    }
+    Ok(())
+}
+
+async fn assert_audit_to_versions_match_lance_heads(
+    repo_root: &Path,
+    audit: &RecoveryAuditRow,
+    tables: &[TableExpectation],
+) -> Result<()> {
+    let uri = repo_uri(repo_root);
+    let db = Omnigraph::open(&uri).await?;
+    for table in tables {
+        let (_, lance_head) = entry_and_lance_head(&db, &uri, table).await?;
+        let outcome = audit
+            .per_table_outcomes
+            .iter()
+            .find(|outcome| outcome.table_key == table.table_key)
+            .ok_or_else(|| {
+                OmniError::manifest_internal(format!(
+                    "audit row for {} has no outcome for {}",
+                    audit.operation_id, table.table_key,
+                ))
+            })?;
+        assert_eq!(
+            outcome.to_version, lance_head,
+            "audit to_version for {} must match the published Lance HEAD",
+            table.table_key,
+        );
+    }
+    Ok(())
+}
+
+/// For RolledBack outcomes, `from_version` records the Lance HEAD
+/// observed BEFORE the restore (the actual drift) and `to_version`
+/// records the manifest pin we restored to. If both equal, the audit
+/// row is uninformative — operators cannot tell how far Lance HEAD
+/// drifted from the manifest. This assertion catches any regression
+/// that reverts `from_version` to `manifest_pinned`.
+fn assert_rollback_outcomes_record_drift(audit: &RecoveryAuditRow) {
+    for outcome in &audit.per_table_outcomes {
+        assert!(
+            outcome.from_version > outcome.to_version,
+            "rollback outcome for {} must record drift via `from_version > to_version` \
+             (Lance HEAD before restore > manifest pin restored to); got from={}, to={}",
+            outcome.table_key,
+            outcome.from_version,
+            outcome.to_version,
+        );
+    }
+}
+
+async fn assert_non_main_did_not_move_main(
+    repo_root: &Path,
+    tables: &[TableExpectation],
+) -> Result<()> {
+    let uri = repo_uri(repo_root);
+    let db = Omnigraph::open(&uri).await?;
+    let main = db.snapshot_of(ReadTarget::branch("main")).await?;
+    for table in tables {
+        let Some(expected) = table.expected_main_manifest_pin else {
+            continue;
+        };
+        let entry = main.entry(&table.table_key).ok_or_else(|| {
+            OmniError::manifest_internal(format!(
+                "main snapshot has no entry for {}",
+                table.table_key,
+            ))
+        })?;
+        assert_eq!(
+            entry.table_version, expected,
+            "non-main recovery for {} on {:?} must not move main's manifest pin",
+            table.table_key, table.branch,
+        );
+    }
+    Ok(())
+}
+
+async fn assert_recovery_commit_shape(
+    repo_root: &Path,
+    audit: &RecoveryAuditRow,
+    tables: &[TableExpectation],
+) -> Result<()> {
+    let branch = branch_context(tables);
+    let expected_parent = expected_recovery_parent(tables)?;
+    let branch = branch.as_deref();
+    let commit = read_recovery_commit(repo_root, audit, branch).await?;
+
+    assert_eq!(
+        commit.actor_id.as_deref(),
+        Some(RECOVERY_ACTOR),
+        "recovery commit {} for operation {} must use actor {}",
+        commit.graph_commit_id,
+        audit.operation_id,
+        RECOVERY_ACTOR,
+    );
+
+    if let Some(expected_parent) = expected_parent {
+        assert_eq!(
+            commit.parent_commit_id.as_deref(),
+            Some(expected_parent.as_str()),
+            "recovery commit {} for operation {} recorded the wrong parent",
+            commit.graph_commit_id,
+            audit.operation_id,
+        );
+    }
+
+    if audit.sidecar_writer_kind == "BranchMerge" {
+        assert!(
+            commit.merged_parent_commit_id.is_some(),
+            "recovered BranchMerge must record merged_parent_commit_id",
+        );
+
+        if let Some(branch) = branch {
+            let graph = CommitGraph::open_at_branch(&repo_uri(repo_root), branch).await?;
+            let commits = graph.load_commits().await?;
+            let parent = commit.parent_commit_id.as_deref().ok_or_else(|| {
+                OmniError::manifest_internal(format!(
+                    "recovered BranchMerge commit {} has no parent_commit_id",
+                    commit.graph_commit_id,
+                ))
+            })?;
+            assert!(
+                commits
+                    .iter()
+                    .any(|candidate| candidate.graph_commit_id == parent),
+                "recovered BranchMerge parent_commit_id {} is not on target branch {}",
+                parent,
+                branch,
+            );
+        }
+    }
+
+    Ok(())
+}
+
+fn expected_recovery_parent(tables: &[TableExpectation]) -> Result<Option<String>> {
+    let mut expected = None;
+    for table in tables {
+        let Some(candidate) = &table.expected_recovery_parent_commit_id else {
+            continue;
+        };
+        match &expected {
+            None => expected = Some(candidate.clone()),
+            Some(existing) if existing == candidate => {}
+            Some(existing) => {
+                return Err(OmniError::manifest_internal(format!(
+                    "conflicting expected recovery parents in table expectations: {existing} vs {candidate}",
+                )));
+            }
+        }
+    }
+    Ok(expected)
+}
+
+async fn assert_idempotent_reopen(repo_root: &Path, operation_id: &str) -> Result<()> {
+    let before = matching_audit_rows(repo_root, operation_id).await?;
+    let uri = repo_uri(repo_root);
+    let _db = Omnigraph::open(&uri).await?;
+    assert_sidecar_absent(repo_root, operation_id);
+    let after = matching_audit_rows(repo_root, operation_id).await?;
+    assert_eq!(
+        after.len(),
+        before.len(),
+        "immediate reopen after recovery must be a clean no-op for {operation_id}",
+    );
+    Ok(())
+}
+
+async fn run_follow_up_mutations(repo_root: &Path, tables: Vec<TableExpectation>) -> Result<()> {
+    let mut db: Option<Omnigraph> = None;
+    for table in tables {
+        let Some(mutation) = table.follow_up_mutation else {
+            continue;
+        };
+        if db.is_none() {
+            db = Some(Omnigraph::open(&repo_uri(repo_root)).await?);
+        }
+        let db = db.as_mut().unwrap();
+        db.mutate(
+            &mutation.branch,
+            &mutation.query_source,
+            &mutation.query_name,
+            &mutation.params,
+        )
+        .await
+        .map_err(|err| {
+            OmniError::manifest_internal(format!(
+                "follow-up mutation {} on {} after recovery failed: {}",
+                mutation.query_name, table.table_key, err,
+            ))
+        })?;
+    }
+    Ok(())
+}
+
+async fn entry_and_lance_head(
+    db: &Omnigraph,
+    root_uri: &str,
+    table: &TableExpectation,
+) -> Result<(SubTableEntry, u64)> {
+    let branch = table.branch.as_deref().unwrap_or("main");
+    let snapshot = db.snapshot_of(ReadTarget::branch(branch)).await?;
+    let entry = snapshot
+        .entry(&table.table_key)
+        .ok_or_else(|| {
+            OmniError::manifest_internal(format!(
+                "snapshot for branch {branch} has no entry for {}",
+                table.table_key,
+            ))
+        })?
+        .clone();
+    let lance_head = lance_head_for_entry(root_uri, &entry).await?;
+    Ok((entry, lance_head))
+}
+
+async fn lance_head_for_entry(root_uri: &str, entry: &SubTableEntry) -> Result<u64> {
+    let table_uri = format!("{}/{}", root_uri.trim_end_matches('/'), entry.table_path);
+    let ds = Dataset::open(&table_uri)
+        .await
+        .map_err(|err| OmniError::Lance(err.to_string()))?;
+    let ds = match entry.table_branch.as_deref() {
+        Some(branch) if branch != "main" => ds
+            .checkout_branch(branch)
+            .await
+            .map_err(|err| OmniError::Lance(err.to_string()))?,
+        _ => ds,
+    };
+    Ok(ds.version().version)
+}
+
+async fn read_recovery_commit(
+    repo_root: &Path,
+    audit: &RecoveryAuditRow,
+    branch: Option<&str>,
+) -> Result<GraphCommit> {
+    let uri = repo_uri(repo_root);
+    let graph = match branch {
+        Some(branch) => CommitGraph::open_at_branch(&uri, branch).await?,
+        None => CommitGraph::open(&uri).await?,
+    };
+    graph
+        .load_commits()
+        .await?
+        .into_iter()
+        .find(|commit| commit.graph_commit_id == audit.graph_commit_id)
+        .ok_or_else(|| {
+            OmniError::manifest_internal(format!(
+                "recovery commit {} for operation {} was not found",
+                audit.graph_commit_id, audit.operation_id,
+            ))
+        })
+}
+
+async fn read_audit_row(repo_root: &Path, operation_id: &str) -> Result<RecoveryAuditRow> {
+    let mut rows = matching_audit_rows(repo_root, operation_id).await?;
+    if rows.len() != 1 {
+        return Err(OmniError::manifest_internal(format!(
+            "expected exactly one recovery audit row for {operation_id}, got {}",
+            rows.len(),
+        )));
+    }
+    Ok(rows.remove(0))
+}
+
+async fn matching_audit_rows(
+    repo_root: &Path,
+    operation_id: &str,
+) -> Result<Vec<RecoveryAuditRow>> {
+    let recoveries_dir = repo_root.join("_graph_commit_recoveries.lance");
+    if !recoveries_dir.exists() {
+        return Ok(Vec::new());
+    }
+    let ds = Dataset::open(recoveries_dir.to_str().unwrap())
+        .await
+        .map_err(|err| OmniError::Lance(err.to_string()))?;
+    let batches: Vec<RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .map_err(|err| OmniError::Lance(err.to_string()))?
+        .try_collect()
+        .await
+        .map_err(|err| OmniError::Lance(err.to_string()))?;
+
+    let mut rows = Vec::new();
+    for batch in batches {
+        let graph_commit_ids = string_column(&batch, "graph_commit_id")?;
+        let kinds = string_column(&batch, "recovery_kind")?;
+        let ops = string_column(&batch, "operation_id")?;
+        let writers = string_column(&batch, "sidecar_writer_kind")?;
+        let outcomes_json = string_column(&batch, "per_table_outcomes_json")?;
+        for row in 0..batch.num_rows() {
+            if ops.value(row) != operation_id {
+                continue;
+            }
+            let per_table_outcomes =
+                serde_json::from_str(outcomes_json.value(row)).map_err(|err| {
+                    OmniError::manifest_internal(format!(
+                        "failed to parse recovery audit outcomes for {operation_id}: {err}",
+                    ))
+                })?;
+            rows.push(RecoveryAuditRow {
+                graph_commit_id: graph_commit_ids.value(row).to_string(),
+                recovery_kind: kinds.value(row).to_string(),
+                operation_id: ops.value(row).to_string(),
+                sidecar_writer_kind: writers.value(row).to_string(),
+                per_table_outcomes,
+            });
+        }
+    }
+    Ok(rows)
+}
+
+fn string_column<'a>(batch: &'a RecordBatch, name: &str) -> Result<&'a StringArray> {
+    batch
+        .column_by_name(name)
+        .ok_or_else(|| {
+            OmniError::manifest_internal(format!("recovery audit batch missing '{name}'"))
+        })?
+        .as_any()
+        .downcast_ref::<StringArray>()
+        .ok_or_else(|| {
+            OmniError::manifest_internal(format!("recovery audit column '{name}' is not Utf8"))
+        })
+}
+
+fn repo_uri(repo_root: &Path) -> String {
+    repo_root.to_str().unwrap().to_string()
+}
diff --git a/crates/omnigraph/tests/recovery.rs b/crates/omnigraph/tests/recovery.rs
new file mode 100644
index 0000000..5ad87e8
--- /dev/null
+++ b/crates/omnigraph/tests/recovery.rs
@@ -0,0 +1,1451 @@
+//! Open-time recovery sweep integration tests.
+//!
+//! These exercise the full `Omnigraph::open` cycle: drop a synthetic
+//! sidecar into `__recovery/`, advance some Lance HEADs to simulate the
+//! Phase B → Phase C residual, reopen the engine, and assert the sweep's
+//! decision-tree dispatch did the right thing.
+//!
+//! Coverage: open-time invocation, `OpenMode::{ReadWrite, ReadOnly}`,
+//! roll-back path, schema-version refusal, roll-forward path, and audit
+//! row recording.
+
+use std::path::Path;
+use std::sync::Arc;
+
+use arrow_array::{Int32Array, RecordBatch, StringArray};
+use arrow_schema::{DataType, Field, Schema};
+use lance::Dataset;
+use omnigraph::db::Omnigraph;
+
+mod helpers;
+use helpers::recovery::{RecoveryExpectation, TableExpectation, assert_post_recovery_invariants};
+
+const TEST_SCHEMA: &str = include_str!("fixtures/test.pg");
+
+fn write_sidecar_file(repo_root: &Path, operation_id: &str, json: &str) {
+    let dir = repo_root.join("__recovery");
+    if !dir.exists() {
+        std::fs::create_dir(&dir).unwrap();
+    }
+    std::fs::write(dir.join(format!("{}.json", operation_id)), json).unwrap();
+}
+
+fn list_recovery_dir(repo_root: &Path) -> Vec<String> {
+    let dir = repo_root.join("__recovery");
+    if !dir.exists() {
+        return Vec::new();
+    }
+    std::fs::read_dir(&dir)
+        .unwrap()
+        .filter_map(|e| e.ok().map(|d| d.file_name().to_string_lossy().to_string()))
+        .collect()
+}
+
+/// Full URI of a node-type Lance dataset under a fresh Omnigraph repo.
+/// Mirrors the `nodes/{fnv1a64-hex(type_name)}` layout in `db/manifest/layout.rs`.
+fn node_table_uri(root: &str, type_name: &str) -> String {
+    let h: u64 = fnv1a64(type_name.as_bytes());
+    format!("{}/nodes/{:016x}", root.trim_end_matches('/'), h)
+}
+
+fn fnv1a64(bytes: &[u8]) -> u64 {
+    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
+    for &b in bytes {
+        hash ^= b as u64;
+        hash = hash.wrapping_mul(0x100_0000_01b3);
+    }
+    hash
+}
+
+/// Build a Person RecordBatch matching the post-init Lance schema:
+/// `id: Utf8, age: Int32?, name: Utf8`. Used by tests that need to advance
+/// Lance HEAD with real fragment changes (not no-op deletes) bypassing
+/// `__manifest`.
+fn person_batch(rows: &[(&str, &str, Option<i32>)]) -> RecordBatch {
+    let schema = Arc::new(Schema::new(vec![
+        Field::new("id", DataType::Utf8, false),
+        Field::new("age", DataType::Int32, true),
+        Field::new("name", DataType::Utf8, false),
+    ]));
+    let ids: Vec<&str> = rows.iter().map(|(id, _, _)| *id).collect();
+    let names: Vec<&str> = rows.iter().map(|(_, name, _)| *name).collect();
+    let ages: Vec<Option<i32>> = rows.iter().map(|(_, _, age)| *age).collect();
+    RecordBatch::try_new(
+        schema,
+        vec![
+            Arc::new(StringArray::from(ids)),
+            Arc::new(Int32Array::from(ages)),
+            Arc::new(StringArray::from(names)),
+        ],
+    )
+    .unwrap()
+}
+
+#[tokio::test]
+async fn recovery_does_not_run_on_clean_open() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let _db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    drop(_db);
+
+    // Reopen — `__recovery/` doesn't exist; the sweep must be a clean no-op.
+    let _db = Omnigraph::open(uri).await.unwrap();
+    // Verify by side-effect: the recovery dir was not created by the sweep.
+    assert!(
+        !dir.path().join("__recovery").exists(),
+        "clean-open sweep must not create __recovery/"
+    );
+}
+
+#[tokio::test]
+async fn recovery_refuses_unknown_schema_version_on_open() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let _db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    drop(_db);
+
+    // A sidecar from a hypothetical future writer; the older binary must
+    // refuse to interpret it (resolved-decisions §3 in the design doc).
+    let sidecar_json = r#"{
+        "schema_version": 99,
+        "operation_id": "01H000000000000000000000ZZ",
+        "started_at": "0",
+        "branch": null,
+        "actor_id": null,
+        "writer_kind": "Mutation",
+        "tables": []
+    }"#;
+    write_sidecar_file(dir.path(), "01H000000000000000000000ZZ", sidecar_json);
+
+    let err = Omnigraph::open(uri)
+        .await
+        .err()
+        .expect("expected open to fail because of unknown sidecar schema_version");
+    let msg = err.to_string();
+    assert!(
+        msg.contains("schema_version=99") && msg.contains("supports only schema_version=1"),
+        "expected SidecarSchemaError mentioning the version mismatch, got: {}",
+        msg,
+    );
+    // Sidecar must still be on disk — we don't auto-delete unparseable files.
+    assert!(
+        list_recovery_dir(dir.path()).contains(&"01H000000000000000000000ZZ.json".to_string()),
+        "sidecar should remain on disk after refusal so an operator can inspect it"
+    );
+}
+
+#[tokio::test]
+async fn read_only_open_skips_recovery_sweep() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let _db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    drop(_db);
+
+    // Drop a syntactically-valid but invariant-violating sidecar (HEAD < pin
+    // would error if classified). Read-only must NOT classify it — it must
+    // skip the sweep entirely.
+    let sidecar_json = r#"{
+        "schema_version": 1,
+        "operation_id": "01H000000000000000000000RO",
+        "started_at": "0",
+        "branch": null,
+        "actor_id": null,
+        "writer_kind": "Mutation",
+        "tables": [
+            {
+                "table_key": "node:Person",
+                "table_path": "/dev/null/nonexistent.lance",
+                "expected_version": 99,
+                "post_commit_pin": 100
+            }
+        ]
+    }"#;
+    write_sidecar_file(dir.path(), "01H000000000000000000000RO", sidecar_json);
+
+    // ReadOnly open must succeed — the sweep is skipped, so the bogus
+    // sidecar is never inspected.
+    let _db = Omnigraph::open_read_only(uri).await.unwrap();
+    // And the sidecar is still there — ReadOnly never deletes anything.
+    assert!(
+        list_recovery_dir(dir.path()).contains(&"01H000000000000000000000RO.json".to_string()),
+        "ReadOnly open must leave the sidecar untouched"
+    );
+}
+
+#[tokio::test]
+async fn recovery_rolls_back_synthetic_drift_on_open() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // Bootstrap a real graph with a Person table so we have a Lance dataset
+    // to advance synthetically.
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Person","data":{"name":"bob","age":25}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    drop(db);
+
+    // Synthetic drift: advance Person's Lance HEAD WITHOUT updating the
+    // manifest pin. This is the shape a Phase B → Phase C crash would
+    // leave (with no sidecar — the writer never wrote one because we're
+    // simulating the residual class directly).
+    //
+    // Use `delete_where` with a never-matching predicate: it inline-commits
+    // a Lance transaction (advancing HEAD by one) without removing data
+    // and without depending on the dataset's exact column set. The actual
+    // residual the sweep recovers from is the manifest-vs-Lance-HEAD gap;
+    // it's agnostic to *what* op caused the gap.
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = Dataset::open(&person_uri).await.unwrap();
+    let head_before_drift = ds.version().version;
+    let _ = store
+        .delete_where(&person_uri, &mut ds, "1 = 2")
+        .await
+        .unwrap();
+    let head_after_drift = ds.version().version;
+    assert_eq!(
+        head_after_drift,
+        head_before_drift + 1,
+        "synthetic drift must advance Lance HEAD by exactly 1"
+    );
+    drop(ds);
+
+    // Drop a sidecar that DOESN'T match the observed drift — sidecar says
+    // expected=head_before_drift, post_commit_pin=head_before_drift (i.e.,
+    // pretend no Phase B happened). Observed: head_after_drift =
+    // expected + 1. Classification: UnexpectedAtP1 (post_commit_pin doesn't
+    // match observed). Decision: RollBack.
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H00000000000000000000RB",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-test",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key": "node:Person",
+                    "table_path": "{}",
+                    "expected_version": {},
+                    "post_commit_pin": {}
+                }}
+            ]
+        }}"#,
+        person_uri, head_before_drift, head_before_drift
+    );
+    write_sidecar_file(dir.path(), "01H00000000000000000000RB", &sidecar_json);
+
+    // Reopen. The sweep must classify Person as UnexpectedAtP1 (h=p+1 but
+    // sidecar.post_commit_pin != observed head), decide RollBack, and call
+    // restore_table_to_version(person_uri, head_before_drift). The
+    // fragment-set short-circuit may make this a no-op if the synthetic
+    // drift produced no fragment changes (delete_where with a never-matching
+    // predicate is one such case — Lance bumps version but fragments are
+    // unchanged). Either way the sweep must complete without error and
+    // delete the sidecar; the actual rollback HEAD-advance behavior is
+    // pinned by the Phase 2 unit test
+    // `restore_table_to_version_appends_one_commit`.
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    let post = Dataset::open(&person_uri).await.unwrap();
+    let _ = head_after_drift; // synthesized but no longer asserted on directly
+    assert!(
+        post.version().version >= head_after_drift,
+        "post-sweep Lance HEAD must not regress below the synthesized drift"
+    );
+
+    // Sidecar deleted as the final step — proves the sweep ran end to end.
+    let after = list_recovery_dir(dir.path());
+    assert!(
+        !after.contains(&"01H00000000000000000000RB.json".to_string()),
+        "sidecar must be deleted after successful sweep; remaining files: {:?}",
+        after,
+    );
+
+    // Idempotency: reopening should be a clean no-op (no error; no new sidecar).
+    let _db2 = Omnigraph::open(uri).await.unwrap();
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "second open must be a clean no-op"
+    );
+}
+
+// =====================================================================
+// Phase 4 — roll-forward path + audit row recording
+// =====================================================================
+
+/// Helper: count rows in `_graph_commit_recoveries.lance` at the given root.
+async fn count_recovery_audit_rows(repo_root: &Path) -> usize {
+    let recoveries_dir = repo_root.join("_graph_commit_recoveries.lance");
+    if !recoveries_dir.exists() {
+        return 0;
+    }
+    let ds = Dataset::open(recoveries_dir.to_str().unwrap())
+        .await
+        .expect("recoveries dataset opens");
+    use futures::TryStreamExt;
+    let batches: Vec<arrow_array::RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    batches.iter().map(|b| b.num_rows()).sum()
+}
+
+/// Helper: read the most recent recovery audit row's `recovery_kind`,
+/// `recovery_for_actor`, and `operation_id`. Returns `None` if no rows.
+async fn read_latest_recovery_audit(
+    repo_root: &Path,
+) -> Option<(String, Option<String>, String, String)> {
+    let recoveries_dir = repo_root.join("_graph_commit_recoveries.lance");
+    if !recoveries_dir.exists() {
+        return None;
+    }
+    let ds = Dataset::open(recoveries_dir.to_str().unwrap()).await.ok()?;
+    use arrow_array::{Array, StringArray};
+    use futures::TryStreamExt;
+    let batches: Vec<arrow_array::RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .ok()?
+        .try_collect()
+        .await
+        .ok()?;
+    let last_batch = batches.iter().filter(|b| b.num_rows() > 0).last()?;
+    let row = last_batch.num_rows() - 1;
+    let kinds = last_batch
+        .column_by_name("recovery_kind")?
+        .as_any()
+        .downcast_ref::<StringArray>()?;
+    let for_actors = last_batch
+        .column_by_name("recovery_for_actor")?
+        .as_any()
+        .downcast_ref::<StringArray>()?;
+    let ops = last_batch
+        .column_by_name("operation_id")?
+        .as_any()
+        .downcast_ref::<StringArray>()?;
+    let writers = last_batch
+        .column_by_name("sidecar_writer_kind")?
+        .as_any()
+        .downcast_ref::<StringArray>()?;
+    Some((
+        kinds.value(row).to_string(),
+        if for_actors.is_null(row) {
+            None
+        } else {
+            Some(for_actors.value(row).to_string())
+        },
+        ops.value(row).to_string(),
+        writers.value(row).to_string(),
+    ))
+}
+
+/// Helper: read every recovery audit row's `recovery_kind` value, in
+/// storage order (multiple batches concatenated). Used by the
+/// multi-sidecar fresh-snapshot test as a diagnostic alongside the
+/// post-recovery Lance HEAD assertion.
+async fn list_recovery_audit_kinds(repo_root: &Path) -> Vec<String> {
+    let recoveries_dir = repo_root.join("_graph_commit_recoveries.lance");
+    if !recoveries_dir.exists() {
+        return Vec::new();
+    }
+    let ds = Dataset::open(recoveries_dir.to_str().unwrap())
+        .await
+        .unwrap();
+    use arrow_array::{Array, StringArray};
+    use futures::TryStreamExt;
+    let batches: Vec<arrow_array::RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let mut out = Vec::new();
+    for batch in batches {
+        let kinds = batch
+            .column_by_name("recovery_kind")
+            .expect("recovery_kind column present")
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .expect("recovery_kind is Utf8");
+        for i in 0..kinds.len() {
+            out.push(kinds.value(i).to_string());
+        }
+    }
+    out
+}
+
+/// Helper: count `_graph_commits.lance` rows tagged with the recovery actor.
+async fn count_recovery_actor_commits(repo_root: &Path) -> usize {
+    let actors_dir = repo_root.join("_graph_commit_actors.lance");
+    if !actors_dir.exists() {
+        return 0;
+    }
+    let ds = Dataset::open(actors_dir.to_str().unwrap()).await.unwrap();
+    use arrow_array::{Array, StringArray};
+    use futures::TryStreamExt;
+    let batches: Vec<arrow_array::RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let mut count = 0;
+    for batch in &batches {
+        let actors = batch
+            .column_by_name("actor_id")
+            .unwrap()
+            .as_any()
+            .downcast_ref::<StringArray>()
+            .unwrap();
+        for i in 0..actors.len() {
+            if actors.value(i) == "omnigraph:recovery" {
+                count += 1;
+            }
+        }
+    }
+    count
+}
+
+#[tokio::test]
+async fn recovery_rolls_forward_after_phase_b_completes() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // Bootstrap: init + load 2 rows. Manifest pin and Lance HEAD both
+    // advance via the legitimate publisher path.
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Person","data":{"name":"bob","age":25}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    drop(db);
+
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = Dataset::open(&person_uri).await.unwrap();
+    let head_before = ds.version().version;
+
+    // Synthesize a successful Phase B: advance Lance HEAD by one
+    // (delete_where with no-match — no fragment changes, but version bumps).
+    let _ = store
+        .delete_where(&person_uri, &mut ds, "1 = 2")
+        .await
+        .unwrap();
+    let head_after = ds.version().version;
+    assert_eq!(head_after, head_before + 1);
+
+    // Drop a sidecar that MATCHES the synthesized state
+    // (expected=head_before, post_commit_pin=head_after) — classifier
+    // returns RolledPastExpected, decision is RollForward.
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H00000000000000000000RF",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-alice",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key": "node:Person",
+                    "table_path": "{}",
+                    "expected_version": {},
+                    "post_commit_pin": {}
+                }}
+            ]
+        }}"#,
+        person_uri, head_before, head_after
+    );
+    write_sidecar_file(dir.path(), "01H00000000000000000000RF", &sidecar_json);
+
+    // Reopen — sweep must roll forward, advancing the manifest pin to
+    // head_after via a single ManifestBatchPublisher::publish call.
+    let db = Omnigraph::open(uri).await.unwrap();
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        "01H00000000000000000000RF",
+        RecoveryExpectation::RolledForward {
+            tables: vec![TableExpectation::main("node:Person").expected_lance_head(head_after)],
+        },
+    )
+    .await
+    .unwrap();
+}
+
+/// A previous recovery's `roll_forward_all` succeeded (manifest pin
+/// already advanced past the sidecar's `expected_version`) but
+/// `record_audit` or sidecar deletion failed, leaving the sidecar to be
+/// re-discovered on a subsequent open. The naive RollBack arm would
+/// classify all tables as NoMovement and record a `RolledBack` audit row
+/// with empty outcomes — misleading because the actual outcome was a
+/// successful roll-forward. Recovery must detect this stale-after-
+/// success shape and record `RolledForward` instead.
+///
+/// Synthesizes the state by:
+/// 1. Letting init + load advance the manifest pin AND Lance HEAD
+///    legitimately to some version `v`.
+/// 2. Writing a sidecar whose `expected_version < v` and
+///    `post_commit_pin == v` — exactly the shape left over after a
+///    publisher succeeds but audit fails.
+///
+/// On reopen the classifier sees `lance_head == manifest_pinned == v`
+/// → all NoMovement → decide returns RollBack. The new audit-recovery
+/// branch must detect `manifest_pinned > expected_version` and record
+/// `RolledForward` with `from_version=expected_version`,
+/// `to_version=v`.
+#[tokio::test]
+async fn recovery_records_rolled_forward_for_stale_sidecar_after_successful_roll_forward() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Person","data":{"name":"bob","age":25}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+
+    // Capture the current manifest pin and Lance HEAD — these match
+    // because the load went through the publisher.
+    let person_entry = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Person")
+        .expect("Person entry exists post-load")
+        .clone();
+    let manifest_pin = person_entry.table_version;
+    drop(db);
+
+    let person_uri = node_table_uri(uri, "Person");
+    let head_now = Dataset::open(&person_uri).await.unwrap().version().version;
+    assert_eq!(
+        head_now, manifest_pin,
+        "Lance HEAD must equal manifest pin in steady state"
+    );
+    // Sidecar shape that simulates "publisher succeeded; audit/delete
+    // failed in a previous recovery pass". `expected_version` is less
+    // than the current manifest pin (the publish already ran) and
+    // `post_commit_pin` matches the current head.
+    let stale_expected = manifest_pin - 1;
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H00000000000000000000SF",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-original",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key": "node:Person",
+                    "table_path": "{}",
+                    "expected_version": {},
+                    "post_commit_pin": {}
+                }}
+            ]
+        }}"#,
+        person_uri, stale_expected, manifest_pin
+    );
+    write_sidecar_file(dir.path(), "01H00000000000000000000SF", &sidecar_json);
+
+    // Reopen — sweep must classify Person as NoMovement (head_now ==
+    // manifest_pinned) but recognize stale-after-success because
+    // manifest_pinned > stale_expected. Audit-recovery branch records
+    // RolledForward and deletes the sidecar.
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    // Sidecar deleted.
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "stale-after-success sidecar must be deleted after audit-recovery"
+    );
+
+    // Audit row says RolledForward (not RolledBack).
+    let audit = read_latest_recovery_audit(dir.path()).await;
+    assert_eq!(
+        audit,
+        Some((
+            "RolledForward".to_string(),
+            Some("act-original".to_string()),
+            "01H00000000000000000000SF".to_string(),
+            "Mutation".to_string(),
+        )),
+        "stale-after-success sidecar must record RolledForward, not RolledBack"
+    );
+    // Audit outcomes report from_version=stale_expected, to_version=manifest_pin.
+    use arrow_array::{Array, StringArray};
+    use futures::TryStreamExt;
+    let recoveries_dir = dir.path().join("_graph_commit_recoveries.lance");
+    let ds = Dataset::open(recoveries_dir.to_str().unwrap())
+        .await
+        .unwrap();
+    let batches: Vec<arrow_array::RecordBatch> = ds
+        .scan()
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let last = batches.iter().filter(|b| b.num_rows() > 0).last().unwrap();
+    let row = last.num_rows() - 1;
+    let outcomes_json = last
+        .column_by_name("per_table_outcomes_json")
+        .unwrap()
+        .as_any()
+        .downcast_ref::<StringArray>()
+        .unwrap()
+        .value(row);
+    let outcomes: serde_json::Value = serde_json::from_str(outcomes_json).unwrap();
+    let arr = outcomes.as_array().unwrap();
+    assert_eq!(arr.len(), 1, "outcomes must include the Person table");
+    let outcome = &arr[0];
+    assert_eq!(outcome["table_key"], "node:Person");
+    assert_eq!(outcome["from_version"].as_u64().unwrap(), stale_expected);
+    assert_eq!(outcome["to_version"].as_u64().unwrap(), manifest_pin);
+}
+
+#[tokio::test]
+async fn recovery_rolls_back_records_audit_row_with_recovery_actor() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    drop(db);
+
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = Dataset::open(&person_uri).await.unwrap();
+    let head_before = ds.version().version;
+    let _ = store
+        .delete_where(&person_uri, &mut ds, "1 = 2")
+        .await
+        .unwrap();
+    let head_after = ds.version().version;
+    let _ = head_after;
+
+    // Sidecar with MISMATCHED post_commit_pin → classifier returns
+    // UnexpectedAtP1 → decision is RollBack.
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H00000000000000000000AB",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-bob",
+            "writer_kind": "Load",
+            "tables": [
+                {{
+                    "table_key": "node:Person",
+                    "table_path": "{}",
+                    "expected_version": {},
+                    "post_commit_pin": {}
+                }}
+            ]
+        }}"#,
+        person_uri, head_before, head_before
+    );
+    write_sidecar_file(dir.path(), "01H00000000000000000000AB", &sidecar_json);
+
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    // Audit row recorded for RolledBack.
+    assert_eq!(count_recovery_audit_rows(dir.path()).await, 1);
+    assert_eq!(count_recovery_actor_commits(dir.path()).await, 1);
+    let audit = read_latest_recovery_audit(dir.path()).await;
+    assert_eq!(
+        audit,
+        Some((
+            "RolledBack".to_string(),
+            Some("act-bob".to_string()),
+            "01H00000000000000000000AB".to_string(),
+            "Load".to_string(),
+        )),
+    );
+}
+
+#[tokio::test]
+async fn recovery_rolls_forward_with_null_actor() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    drop(db);
+
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = Dataset::open(&person_uri).await.unwrap();
+    let head_before = ds.version().version;
+    let _ = store
+        .delete_where(&person_uri, &mut ds, "1 = 2")
+        .await
+        .unwrap();
+    let head_after = ds.version().version;
+
+    // Sidecar with no actor_id (CLI-driven mutation; common case).
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H00000000000000000000NA",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": null,
+            "writer_kind": "EnsureIndices",
+            "tables": [
+                {{
+                    "table_key": "node:Person",
+                    "table_path": "{}",
+                    "expected_version": {},
+                    "post_commit_pin": {}
+                }}
+            ]
+        }}"#,
+        person_uri, head_before, head_after
+    );
+    write_sidecar_file(dir.path(), "01H00000000000000000000NA", &sidecar_json);
+
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    let audit = read_latest_recovery_audit(dir.path()).await;
+    assert_eq!(
+        audit,
+        Some((
+            "RolledForward".to_string(),
+            None, // recovery_for_actor is None when sidecar.actor_id is None
+            "01H00000000000000000000NA".to_string(),
+            "EnsureIndices".to_string(),
+        )),
+    );
+}
+
+// =====================================================================
+// Multi-sidecar processing — integration tests
+// =====================================================================
+
+/// Multiple sidecars must be processed in deterministic ORDER and against
+/// FRESH manifest snapshots. Without sort + per-sidecar refresh, sidecar
+/// B can be classified against sidecar A's stale pre-publish snapshot
+/// and incorrectly roll back work that just landed.
+///
+/// This test drops two synthetic sidecars on independent tables and
+/// asserts the sweep processes both end-to-end (both deleted, both
+/// audited). The unit test `list_sidecars_returns_deterministic_order`
+/// pins the sort order; this integration test pins the multi-sidecar
+/// flow against a real engine state.
+#[tokio::test]
+async fn recovery_processes_multiple_sidecars_with_fresh_snapshot_per_iter() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // Bootstrap: load Person and Company so both have committed datasets.
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Company","data":{"name":"acme"}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    drop(db);
+
+    // Synthesize drift on both tables independently.
+    let person_uri = node_table_uri(uri, "Person");
+    let company_uri = node_table_uri(uri, "Company");
+    let store = TableStore::new(uri);
+    let mut person_ds = Dataset::open(&person_uri).await.unwrap();
+    let person_pre = person_ds.version().version;
+    let _ = store
+        .delete_where(&person_uri, &mut person_ds, "1 = 2")
+        .await
+        .unwrap();
+    let person_post = person_ds.version().version;
+
+    let mut company_ds = Dataset::open(&company_uri).await.unwrap();
+    let company_pre = company_ds.version().version;
+    let _ = store
+        .delete_where(&company_uri, &mut company_ds, "1 = 2")
+        .await
+        .unwrap();
+    let company_post = company_ds.version().version;
+
+    // Drop two sidecars; ULID prefix ensures sort order is A then B.
+    let sidecar_a = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000AAAA",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-a",
+            "writer_kind": "EnsureIndices",
+            "tables": [
+                {{"table_key":"node:Person","table_path":"{}","expected_version":{},"post_commit_pin":{}}}
+            ]
+        }}"#,
+        person_uri, person_pre, person_post
+    );
+    let sidecar_b = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000BBBB",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-b",
+            "writer_kind": "EnsureIndices",
+            "tables": [
+                {{"table_key":"node:Company","table_path":"{}","expected_version":{},"post_commit_pin":{}}}
+            ]
+        }}"#,
+        company_uri, company_pre, company_post
+    );
+    write_sidecar_file(dir.path(), "01H0000000000000000000AAAA", &sidecar_a);
+    write_sidecar_file(dir.path(), "01H0000000000000000000BBBB", &sidecar_b);
+
+    // Reopen — sweep must process both sidecars with fresh snapshots
+    // between iterations, deleting each as it completes.
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "both sidecars must be deleted after sweep"
+    );
+
+    // Both audit rows recorded.
+    assert_eq!(
+        count_recovery_audit_rows(dir.path()).await,
+        2,
+        "two sweeps must record two audit rows"
+    );
+}
+
+/// `ensure_indices_for_branch` must only pin tables that actually need
+/// new index work. If it pinned every catalog table and only one needed
+/// new indices, the others would classify as `NoMovement` on recovery,
+/// triggering the all-or-nothing decision rule to roll BACK the table
+/// that did get index work — destroying legitimate Phase B output.
+///
+/// Steady-state case: when nothing needs indexing, no sidecar should
+/// be written. The sibling test `recovery_ensure_indices_handles_empty_tables`
+/// covers the more nuanced empty-table case where the existing
+/// ensure_indices loop has `if row_count > 0 { build_indices(...) }` —
+/// empty tables produce zero commits and would otherwise force
+/// NoMovement → rollback.
+#[tokio::test]
+async fn recovery_ensure_indices_steady_state_no_sidecar() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    let test_data = r#"{"type":"Person","data":{"name":"alice","age":30}}
+{"type":"Company","data":{"name":"acme"}}
+"#;
+    load_jsonl(&mut db, test_data, LoadMode::Append)
+        .await
+        .unwrap();
+    db.ensure_indices().await.unwrap();
+    drop(db);
+
+    let mut db = Omnigraph::open(uri).await.unwrap();
+    db.ensure_indices().await.unwrap();
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "steady-state ensure_indices must not leave a sidecar (no tables need work)"
+    );
+}
+
+/// Empty tables (zero rows) bypass `build_indices_on_dataset` because
+/// `ensure_indices_for_branch` has `if row_count > 0 { build_indices(...) }`.
+/// The `needs_index_work_*` helpers must match this — pinning an empty
+/// table means recovery classifies it as `NoMovement` (no commits ever
+/// ran) and rolls back any sibling table's legitimate index work.
+///
+/// Integration verification: after a real init + ensure_indices on a
+/// repo where every table is empty, the recovery sweep must complete
+/// cleanly (no leftover sidecar) AND the next ensure_indices must also
+/// leave no sidecar — proving the empty-table-scoping behavior lets
+/// steady-state runs incur zero sidecar I/O. The
+/// `count_rows == 0 → return false` short-circuit in `needs_index_work_*`
+/// is what makes this work.
+///
+/// A stronger assertion that captured the sidecar mid-flight and verified
+/// the persisted JSON omits empty tables would require bypassing
+/// `load_jsonl` (which auto-builds indices via
+/// `prepare_updates_for_commit`); pinning that with a unit test on the
+/// helpers directly would require bootstrapping an engine plus raw Lance
+/// writes — left as a follow-up.
+#[tokio::test]
+async fn recovery_ensure_indices_handles_empty_tables() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    // Don't load any data — every table is empty.
+    db.ensure_indices().await.unwrap();
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "ensure_indices on an all-empty repo must not leave a sidecar"
+    );
+    // Reopen + ensure_indices — still steady state, still no sidecar.
+    drop(db);
+    let mut db = Omnigraph::open(uri).await.unwrap();
+    db.ensure_indices().await.unwrap();
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "second ensure_indices on an all-empty repo must also not leave a sidecar"
+    );
+}
+
+/// Multi-sidecar processing must refresh the manifest snapshot between
+/// sidecars: sidecar N's roll-forward writes manifest changes that
+/// sidecar N+1 must observe, otherwise N+1 classifies its tables
+/// against stale pins and may incorrectly run a Dataset::restore that
+/// would not have run under a fresh view.
+///
+/// Setup:
+/// - Sidecar A: kind=EnsureIndices (loose), refers to Person at
+///   expected=v1, post=v2.
+/// - Sidecar B: kind=EnsureIndices (loose), refers to Person at
+///   expected=v2, post=v3.
+/// - Lance HEAD for Person sits at v3, and v1, v2, v3 have DIFFERENT
+///   fragment-id sets (each version added a real row via append_batch).
+///   This means `restore_table_to_version`'s fragment-set short-circuit
+///   does NOT fire under the bug path — a real `Dataset::restore`
+///   actually runs there, producing HEAD=v4.
+///
+/// Outcome paths:
+///
+/// **Stale-snapshot bug** (no per-sidecar refresh):
+///   Sidecar A's classifier sees pre-recovery pin=v1, expected=v1
+///   matches → RolledPastExpected → RollForward to HEAD=v3. Manifest
+///   advances Person v1 → v3. Sidecar B's classifier still sees the
+///   STALE pin v1: lance_head=v3, manifest_pinned=v1, expected=v2.
+///   Loose-match predicate `expected == manifest_pinned` fails (v2 !=
+///   v1); `lance_head == manifest_pinned + 1` fails (v3 != v2) →
+///   UnexpectedMultistep → RollBack. Restore Person to expected=v2,
+///   creating Lance HEAD v4.
+///
+/// **Fresh-snapshot fix** (refresh per sidecar):
+///   Sidecar A: same as above; manifest pin advances to v3.
+///   Sidecar B refresh: classifier now sees pin=v3, lance_head=v3,
+///   expected=v2. lance_head == manifest_pinned → NoMovement → RollBack
+///   decision but the rollback loop has no eligible tables (only
+///   {RolledPastExpected, UnexpectedAtP1, UnexpectedMultistep} are
+///   restored), so it's a no-op rollback. Lance HEAD stays at v3.
+///
+/// **Differentiating assertion**: post-recovery Lance HEAD for Person
+/// must be == v3 (no restore happened). The stale-snapshot bug would
+/// have advanced HEAD to v4 via Dataset::restore.
+///
+/// Note: the audit row for sidecar B is "RolledBack" in the fix path
+/// because the all-or-nothing decision sees NoMovement. Overlapping-
+/// sidecar scenarios where one writer's HEAD-chained work absorbs the
+/// other's are rare in practice — per-(table, branch) writer
+/// serialization prevents them in steady state — but the recovery
+/// sweep handles them safely without forward-progress drift.
+#[tokio::test]
+async fn recovery_multi_sidecar_requires_fresh_snapshot_for_correctness() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // Bootstrap: load Person rows; manifest pin and Lance HEAD == some
+    // baseline N.
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+        LoadMode::Append,
+    )
+    .await
+    .unwrap();
+    drop(db);
+
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = Dataset::open(&person_uri).await.unwrap();
+    let v1 = ds.version().version;
+
+    // Advance Lance HEAD twice via raw append_batch to mimic two
+    // consecutive would-be-publishes that didn't land. Each append adds
+    // a new fragment, so v1, v2, v3 have DIFFERENT fragment-id sets —
+    // restore_table_to_version's fragment-set short-circuit will not
+    // fire when classifier dispatches to rollback (the
+    // differentiator we rely on).
+    //
+    // Bypassing __manifest is what `delete_where` and `append_batch`
+    // both do (direct on Lance); using append_batch (instead of no-op
+    // deletes) is what makes the fragment-set differ across versions.
+    store
+        .append_batch(
+            &person_uri,
+            &mut ds,
+            person_batch(&[("bob-id", "bob", Some(25))]),
+        )
+        .await
+        .unwrap();
+    let v2 = ds.version().version;
+    store
+        .append_batch(
+            &person_uri,
+            &mut ds,
+            person_batch(&[("carol-id", "carol", Some(40))]),
+        )
+        .await
+        .unwrap();
+    let v3 = ds.version().version;
+    assert_eq!(v2, v1 + 1);
+    assert_eq!(v3, v2 + 1);
+
+    // Sidecar A: writer A's intent was pin v1 → v2.
+    // Sidecar B: writer B's intent was pin v2 → v3 (depends on A landing).
+    // Both EnsureIndices kind so loose-match applies.
+    let sidecar_a = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000AAAA",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-a",
+            "writer_kind": "EnsureIndices",
+            "tables": [
+                {{"table_key":"node:Person","table_path":"{}","expected_version":{},"post_commit_pin":{}}}
+            ]
+        }}"#,
+        person_uri, v1, v2
+    );
+    let sidecar_b = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000BBBB",
+            "started_at": "0",
+            "branch": null,
+            "actor_id": "act-b",
+            "writer_kind": "EnsureIndices",
+            "tables": [
+                {{"table_key":"node:Person","table_path":"{}","expected_version":{},"post_commit_pin":{}}}
+            ]
+        }}"#,
+        person_uri, v2, v3
+    );
+    write_sidecar_file(dir.path(), "01H0000000000000000000AAAA", &sidecar_a);
+    write_sidecar_file(dir.path(), "01H0000000000000000000BBBB", &sidecar_b);
+
+    // Reopen — both sidecars must process to completion (sidecar B
+    // requires fresh snapshot to see sidecar A's manifest update).
+    let _db = Omnigraph::open(uri).await.unwrap();
+
+    assert!(
+        list_recovery_dir(dir.path()).is_empty(),
+        "both sidecars must process to completion (fresh snapshot per iteration)"
+    );
+    assert_eq!(
+        count_recovery_audit_rows(dir.path()).await,
+        2,
+        "two sidecars → two audit rows"
+    );
+
+    // The "sidecars deleted + audit rows present" assertions above are
+    // necessary but not sufficient — both pass even when sidecar B rolls
+    // back under a stale snapshot (the bug path), because the sidecar is
+    // still deleted and an audit row is still written. The differentiating
+    // signal is the post-recovery Lance HEAD for Person:
+    //   - Fresh-snapshot fix: sidecar B is no-op rollback (NoMovement);
+    //     no Dataset::restore runs; HEAD stays at v3.
+    //   - Stale-snapshot bug: sidecar B classifies as UnexpectedMultistep;
+    //     restore advances HEAD to v4.
+    let ds_after = Dataset::open(&person_uri).await.unwrap();
+    assert_eq!(
+        ds_after.version().version,
+        v3,
+        "Person Lance HEAD must remain v3 (no restore from stale-snapshot rollback); got {} \
+         — a higher value indicates sidecar B classified UnexpectedMultistep against the \
+         stale pre-recovery pin and ran a restore",
+        ds_after.version().version
+    );
+    // Sanity: the audit kinds are diagnostic — first sidecar rolls forward
+    // (RolledPastExpected → RollForward); second is no-op rollback in this
+    // overlapping-sidecar scenario.
+    let kinds = list_recovery_audit_kinds(dir.path()).await;
+    assert_eq!(kinds.len(), 2, "expected 2 audit rows, got {:?}", kinds);
+    assert!(
+        matches!(kinds[0].as_str(), "RolledForward"),
+        "first sidecar must roll forward; got {:?}",
+        kinds
+    );
+}
+
+/// A sidecar from a feature-branch writer must be classified against
+/// THAT FEATURE BRANCH's manifest pin and Lance HEAD — not main's.
+/// Otherwise:
+///   - `snapshot.entry(table_key)` returns main's entry (or None) and
+///     `manifest_pinned` is wrong.
+///   - `Dataset::open(path)` returns the default ref's HEAD (main),
+///     missing the feature branch's actual drift.
+/// Either way, the classifier sees NoMovement → RollBack as no-op →
+/// sidecar deleted while feature's drift remains. Subsequent feature
+/// writers surface ExpectedVersionMismatch.
+///
+/// Setup:
+/// - Load alice on main.
+/// - Create `feature` branch.
+/// - Mutate feature (insert bob) → feature's manifest pin AND Lance
+///   HEAD on the feature branch advance.
+/// - Capture feature's post-mutate manifest pin (v_pin) and Lance HEAD
+///   (v_head).
+/// - Synthesize a sidecar with `branch=Some("feature")`, pin Person at
+///   `expected=v_pin, post=v_pin+1`, `table_branch=Some("feature")`.
+/// - Drop the engine and append_batch on Person's feature branch to
+///   advance HEAD to v_pin+1 (bypass manifest).
+///
+/// On reopen, recovery must:
+///   - Open a per-branch coordinator at `feature` for snapshot
+///     classification.
+///   - Open Person's Lance dataset at the `feature` ref for HEAD read.
+///   - Classify as RolledPastExpected and roll forward.
+#[tokio::test]
+async fn recovery_classifies_feature_branch_sidecar_against_feature_branch() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+        LoadMode::Append,
+    )
+    .await
+    .unwrap();
+    db.branch_create("feature").await.unwrap();
+    db.mutate(
+        "feature",
+        helpers::MUTATION_QUERIES,
+        "insert_person",
+        &helpers::mixed_params(&[("$name", "bob")], &[("$age", 40)]),
+    )
+    .await
+    .unwrap();
+
+    // Capture feature-branch state.
+    let feature_snapshot = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("feature"))
+        .await
+        .unwrap();
+    let feature_entry = feature_snapshot
+        .entry("node:Person")
+        .expect("feature snapshot must have Person entry");
+    let v_pin = feature_entry.table_version;
+    let feature_branch_name = feature_entry.table_branch.clone();
+    let main_pin = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Person")
+        .expect("main snapshot must have Person entry")
+        .table_version;
+    drop(db);
+
+    // Bypass the manifest: append directly to Person's Lance HEAD on the
+    // feature branch ref to advance HEAD past v_pin.
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = store
+        .open_dataset_head(&person_uri, feature_branch_name.as_deref())
+        .await
+        .unwrap();
+    store
+        .append_batch(
+            &person_uri,
+            &mut ds,
+            person_batch(&[("carol-id", "carol", Some(40))]),
+        )
+        .await
+        .unwrap();
+    let v_head = ds.version().version;
+    assert_eq!(v_head, v_pin + 1, "append must advance HEAD by 1");
+
+    // Synthesize a sidecar saying the writer's intent was to publish
+    // feature's pin v_pin → v_pin+1. (Mutation kind = strict match.)
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000FEAT",
+            "started_at": "0",
+            "branch": "feature",
+            "actor_id": "act-feature",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key":"node:Person",
+                    "table_path":"{}",
+                    "expected_version":{},
+                    "post_commit_pin":{},
+                    "table_branch":{}
+                }}
+            ]
+        }}"#,
+        person_uri,
+        v_pin,
+        v_head,
+        match &feature_branch_name {
+            Some(b) => format!("\"{}\"", b),
+            None => "null".to_string(),
+        },
+    );
+    write_sidecar_file(dir.path(), "01H0000000000000000000FEAT", &sidecar_json);
+
+    // Reopen — recovery sweep must process the feature-branch sidecar
+    // against feature's snapshot, not main's. With the fix, feature's
+    // manifest pin advances v_pin → v_head.
+    let db = Omnigraph::open(uri).await.unwrap();
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        "01H0000000000000000000FEAT",
+        RecoveryExpectation::RolledForward {
+            tables: vec![
+                TableExpectation::branch("node:Person", "feature")
+                    .expected_lance_head(v_head)
+                    .expected_main_manifest_pin(main_pin),
+            ],
+        },
+    )
+    .await
+    .unwrap();
+}
+
+/// Companion to the roll-forward feature-branch test: branch-axis
+/// rollback. Synthesize a feature-branch sidecar that classifies as
+/// rollback-eligible (UnexpectedAtP1) and assert the recovery sweep
+/// processes the sidecar against the FEATURE branch (not main) and runs
+/// the rollback. Without branch-aware recovery, classification would
+/// happen against main's snapshot/HEAD — likely NoMovement → no-op
+/// rollback that doesn't touch the actually-drifted feature ref.
+#[tokio::test]
+async fn recovery_rolls_back_feature_branch_sidecar_against_feature_branch() {
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    use omnigraph::table_store::TableStore;
+
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Person","data":{"name":"alice","age":30}}
+"#,
+        LoadMode::Append,
+    )
+    .await
+    .unwrap();
+    db.branch_create("feature").await.unwrap();
+    db.mutate(
+        "feature",
+        helpers::MUTATION_QUERIES,
+        "insert_person",
+        &helpers::mixed_params(&[("$name", "bob")], &[("$age", 40)]),
+    )
+    .await
+    .unwrap();
+
+    let feature_snapshot = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("feature"))
+        .await
+        .unwrap();
+    let feature_entry = feature_snapshot
+        .entry("node:Person")
+        .expect("feature snapshot must have Person entry");
+    let v_pin = feature_entry.table_version;
+    let feature_branch_name = feature_entry.table_branch.clone();
+    let main_pin = db
+        .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Person")
+        .expect("main snapshot must have Person entry")
+        .table_version;
+    drop(db);
+
+    // Bypass the manifest: append on the feature ref to advance HEAD past
+    // the manifest pin.
+    let person_uri = node_table_uri(uri, "Person");
+    let store = TableStore::new(uri);
+    let mut ds = store
+        .open_dataset_head(&person_uri, feature_branch_name.as_deref())
+        .await
+        .unwrap();
+    store
+        .append_batch(
+            &person_uri,
+            &mut ds,
+            person_batch(&[("dave-id", "dave", Some(50))]),
+        )
+        .await
+        .unwrap();
+    let v_head = ds.version().version;
+    assert_eq!(v_head, v_pin + 1);
+
+    // Sidecar with bogus expected (mismatch) AND post_commit_pin == v_head.
+    // Strict Mutation classifier sees lance_head == manifest_pinned + 1
+    // but expected != manifest_pinned → UnexpectedAtP1 → RollBack.
+    let bogus_expected = v_pin.saturating_sub(1);
+    let sidecar_json = format!(
+        r#"{{
+            "schema_version": 1,
+            "operation_id": "01H0000000000000000000FRB1",
+            "started_at": "0",
+            "branch": "feature",
+            "actor_id": "act-feature-rb",
+            "writer_kind": "Mutation",
+            "tables": [
+                {{
+                    "table_key":"node:Person",
+                    "table_path":"{}",
+                    "expected_version":{},
+                    "post_commit_pin":{},
+                    "table_branch":{}
+                }}
+            ]
+        }}"#,
+        person_uri,
+        bogus_expected,
+        v_head,
+        match &feature_branch_name {
+            Some(b) => format!("\"{}\"", b),
+            None => "null".to_string(),
+        },
+    );
+    write_sidecar_file(dir.path(), "01H0000000000000000000FRB1", &sidecar_json);
+
+    // Reopen with full sweep — RollBack is allowed at open time.
+    let db = Omnigraph::open(uri).await.unwrap();
+    drop(db);
+
+    assert_post_recovery_invariants(
+        dir.path(),
+        "01H0000000000000000000FRB1",
+        RecoveryExpectation::RolledBack {
+            tables: vec![
+                TableExpectation::branch("node:Person", "feature")
+                    .expected_main_manifest_pin(main_pin),
+            ],
+        },
+    )
+    .await
+    .unwrap();
+
+    // Lance HEAD on the feature ref must have advanced (real restore ran).
+    let post = store
+        .open_dataset_head(&person_uri, feature_branch_name.as_deref())
+        .await
+        .unwrap();
+    assert!(
+        post.version().version > v_head,
+        "real restore must have appended a commit on feature; v_head={}, post={}",
+        v_head,
+        post.version().version,
+    );
+
+    let db = Omnigraph::open(uri).await.unwrap();
+    assert_eq!(
+        helpers::count_rows_branch(&db, "feature", "node:Person").await,
+        2,
+        "feature branch must still expose the manifest-pinned rows after rollback"
+    );
+    assert_eq!(
+        helpers::count_rows(&db, "node:Person").await,
+        1,
+        "feature rollback must not move main"
+    );
+}
+
+/// `OpenMode::ReadOnly` must NOT run `recover_schema_state_files`,
+/// which can delete or rename schema-staging files. Read-only consumers
+/// may run with read-only object-store credentials, and silent open-time
+/// mutations violate the contract.
+///
+/// This test drops a schema-staging file (which the recovery sweep
+/// would normally delete) then opens with ReadOnly mode. The staging
+/// file must remain untouched.
+#[tokio::test]
+async fn read_only_open_skips_schema_state_recovery() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    let _ = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+
+    // Drop a leftover schema-staging file. The schema-state recovery
+    // sweep would normally tidy this on open (either delete or rename
+    // depending on whether it matches the live schema). ReadOnly must
+    // skip that work.
+    let staging_path = dir.path().join("_schema.pg.staging");
+    std::fs::write(&staging_path, "node Person { name: String @key }\n").unwrap();
+    assert!(staging_path.exists());
+
+    let _db = Omnigraph::open_read_only(uri).await.unwrap();
+
+    // Staging file must be untouched.
+    assert!(
+        staging_path.exists(),
+        "ReadOnly open must not delete schema-staging files (no object-store mutations)"
+    );
+    let content = std::fs::read_to_string(&staging_path).unwrap();
+    assert_eq!(
+        content, "node Person { name: String @key }\n",
+        "staging file content must be unchanged"
+    );
+}
diff --git a/crates/omnigraph/tests/staged_writes.rs b/crates/omnigraph/tests/staged_writes.rs
index 225b215..83d5c30 100644
--- a/crates/omnigraph/tests/staged_writes.rs
+++ b/crates/omnigraph/tests/staged_writes.rs
@@ -397,8 +397,7 @@ async fn scan_with_staged_with_filter_silently_drops_staged_rows() {
          If you're here because this assertion failed: either (a) Lance \
          exposed a way to scan uncommitted fragments without stats-based \
          pruning (good — update to assert == [alice, carol, dave]), or \
-         (b) something changed in our scan_with_staged path. See PR #67 \
-         test fix discussion + .context/mr-794-step2-design.md §1.1."
+         (b) something changed in our scan_with_staged path."
     );
 
     // Without filter, staged data IS visible — confirms the issue is
@@ -493,7 +492,7 @@ async fn chained_stage_merge_insert_with_shared_key_documents_duplicate_behavior
     );
 }
 
-// ─── MR-793 Phase 2: stage_overwrite + scalar index staging ─────────────────
+// ─── stage_overwrite + scalar index staging ─────────────────
 
 /// `stage_overwrite` writes replacement fragments to object storage but
 /// does NOT advance Lance HEAD until `commit_staged` runs. Mirrors
@@ -663,11 +662,11 @@ async fn stage_create_inverted_index_does_not_advance_head_until_commit() {
 
 /// Pin the inline-commit behavior of `delete_where`. Lance 4.0.0 does
 /// NOT expose a public `DeleteJob::execute_uncommitted`
-/// (`pub(crate)` — see lance-format/lance#6658). MR-793 deliberately
+/// (`pub(crate)` — see lance-format/lance#6658). The trait deliberately
 /// does NOT introduce a `stage_delete` wrapper that would secretly
-/// inline-commit (a side-channel — see design doc §3.2). Instead, the
-/// trait keeps `delete_where` as the only delete entry point, named
-/// honestly.
+/// inline-commit (a side-channel between the staged and inline write
+/// paths). Instead, the trait keeps `delete_where` as the only delete
+/// entry point, named honestly.
 ///
 /// **When Lance #6658 lands**: this test will need to flip — replace
 /// the assertion with a `stage_delete` + `commit_staged` round-trip
@@ -704,9 +703,10 @@ async fn delete_where_advances_head_inline_documents_residual() {
 /// `create_vector_index`. Lance 4.0.0 vector indices take the
 /// "segment commit path" which calls `build_index_metadata_from_segments`
 /// (`pub(crate)` in lance-4.0.0 `src/index.rs:111`). Until upstream
-/// exposes that helper (companion ticket to #6658), MR-793's trait
-/// surface deliberately does NOT include `stage_create_vector_index` —
-/// see design doc Appendix A.3.
+/// exposes that helper (companion ticket to lance-format/lance#6658),
+/// the trait surface deliberately does NOT include
+/// `stage_create_vector_index` — same rationale as `stage_delete`'s
+/// absence (no side-channel between staged and inline write paths).
 #[tokio::test]
 async fn create_vector_index_advances_head_inline_documents_residual() {
     use arrow_array::FixedSizeListArray;
@@ -758,3 +758,191 @@ async fn create_vector_index_advances_head_inline_documents_residual() {
     );
     assert!(store.has_vector_index(&ds, "embedding").await.unwrap());
 }
+
+/// Empirical pin of `Dataset::restore` semantics for the recovery sweep.
+///
+/// The recovery sweep depends on the `restore` invariant: from HEAD =
+/// `h`, calling `Dataset::checkout_version(p).await?` then
+/// `Dataset::restore().await?` produces a NEW commit at HEAD = `h + 1`
+/// with content == content at version `p`.
+///
+/// The Lance source confirms this — `restore()` (no args) takes the
+/// currently-checked-out version's content and applies it via
+/// `apply_commit` against the latest manifest, advancing HEAD by one.
+/// See lance-4.0.0 `src/dataset.rs:1106` and the transaction-spec
+/// example at https://lance.org/format/table/transaction/.
+///
+/// If the lance bump (4.0.0 → 4.x) ever changes this delta or the call
+/// signature, the recovery sweep's rollback path breaks; this test
+/// surfaces the regression at compile/test time rather than under
+/// production drift recovery.
+#[tokio::test]
+async fn lance_restore_appends_one_commit_with_checked_out_content() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = format!("{}/people.lance", dir.path().to_str().unwrap());
+    let store = TableStore::new(dir.path().to_str().unwrap());
+
+    // Build version history: v1 = {alice}, v2 = {alice, bob}, v3 = {alice, bob, carol}.
+    let mut ds = TableStore::write_dataset(&uri, person_batch(&[("alice", Some(30))]))
+        .await
+        .unwrap();
+    assert_eq!(ds.version().version, 1);
+
+    store
+        .append_batch(&uri, &mut ds, person_batch(&[("bob", Some(25))]))
+        .await
+        .unwrap();
+    assert_eq!(ds.version().version, 2);
+
+    store
+        .append_batch(&uri, &mut ds, person_batch(&[("carol", Some(40))]))
+        .await
+        .unwrap();
+    assert_eq!(ds.version().version, 3);
+
+    let head_before = ds.version().version;
+
+    // Recovery's rollback shape: open + checkout(p) + restore().
+    let head_ds = Dataset::open(&uri).await.unwrap();
+    let mut to_restore = head_ds.checkout_version(1).await.unwrap();
+    assert_eq!(to_restore.manifest.version, 1);
+    to_restore.restore().await.unwrap();
+
+    // Verify against a fresh open — the previous handle's view doesn't
+    // tell us what other openers see.
+    let post = Dataset::open(&uri).await.unwrap();
+    assert_eq!(
+        post.version().version,
+        head_before + 1,
+        "Dataset::restore must append exactly one commit (HEAD + 1). If \
+         this assertion fires, lance changed restore semantics — re-read \
+         lance src/dataset.rs::restore and update the recovery sweep's \
+         rollback path before proceeding."
+    );
+
+    // Content equality: the restored HEAD must match version 1 (just alice).
+    let scanner = post.scan();
+    let batches: Vec<RecordBatch> = scanner
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let ids = collect_ids(&batches);
+    assert_eq!(
+        ids,
+        vec!["alice".to_string()],
+        "post-restore content must equal version 1's content; got {:?}",
+        ids,
+    );
+}
+
+/// Empirical pin of the `Dataset::restore` concurrency hazard that
+/// motivates the recovery sweep's open-time-only invocation strategy
+/// and any future continuous-recovery reconciler's queue-acquisition
+/// requirement.
+///
+/// `Dataset::restore`'s `check_restore_txn` (lance-4.0.0
+/// `src/io/commit/conflict_resolver.rs:986`) returns `Ok(())` against
+/// almost every other op (Append, Update, Delete, CreateIndex, Merge, …),
+/// so a Restore commits successfully even with concurrent commits in
+/// flight. The symmetric checks (lines 318, 473, 634, 787, 853, 947, 978,
+/// 1018, 1059, 1115, 1187, 1280) classify Restore as incompatible from
+/// the *other* op's POV — but the *other* op already committed before the
+/// Restore arrived, so it sees no conflict. Net: the Restore appends a
+/// rewind commit AFTER the legitimate concurrent Append, silently
+/// orphaning that Append's data from the active timeline.
+///
+/// The recovery sweep sidesteps this by running only at `Omnigraph::open`
+/// (before any other writers can race). A future continuous-recovery
+/// reconciler must acquire per-(table_key, branch) queues for sidecar
+/// tables before invoking restore — otherwise this hazard becomes
+/// reachable during in-flight tenant traffic.
+///
+/// This test is the load-bearing constraint any future reconciler must
+/// honor.
+#[tokio::test]
+async fn lance_restore_loses_to_concurrent_append_via_orphaning() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = format!("{}/people.lance", dir.path().to_str().unwrap());
+    let store = TableStore::new(dir.path().to_str().unwrap());
+
+    // v1: seed with alice.
+    let _ = TableStore::write_dataset(&uri, person_batch(&[("alice", Some(30))]))
+        .await
+        .unwrap();
+
+    // Recovery handle: opened at the latest, then checked out at v1 (the
+    // pin we'd "rollback" to in a real recovery scenario). This handle
+    // has NOT yet called restore.
+    let recovery_open = Dataset::open(&uri).await.unwrap();
+    let mut recovery_handle = recovery_open.checkout_version(1).await.unwrap();
+
+    // Concurrent legitimate writer: appends bob, advancing HEAD to v2.
+    // This simulates a per-table-queue model where another tenant wrote
+    // between recovery's open and recovery's restore call.
+    let mut writer_handle = Dataset::open(&uri).await.unwrap();
+    store
+        .append_batch(&uri, &mut writer_handle, person_batch(&[("bob", Some(25))]))
+        .await
+        .unwrap();
+    assert_eq!(writer_handle.version().version, 2);
+
+    // Recovery now restores. Because restore's `check_restore_txn` returns
+    // Ok against Append, this commits at v3 with content == v1 (just alice).
+    recovery_handle.restore().await.unwrap();
+
+    // Re-open and inspect: HEAD is v3, content is just alice. Bob is gone
+    // from the active timeline.
+    let post = Dataset::open(&uri).await.unwrap();
+    assert_eq!(
+        post.version().version,
+        3,
+        "Restore commits at HEAD+1 even when a concurrent commit landed \
+         between recovery's open and recovery's restore call. If this \
+         assertion fails, lance changed restore-vs-append conflict \
+         semantics — re-read check_restore_txn and update the recovery \
+         sweep's concurrency analysis."
+    );
+
+    let scanner = post.scan();
+    let batches: Vec<RecordBatch> = scanner
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let ids = collect_ids(&batches);
+    assert_eq!(
+        ids,
+        vec!["alice".to_string()],
+        "Concurrent Append's row 'bob' was silently orphaned by the \
+         Restore. Active-timeline contents == v1's contents. The recovery \
+         sweep sidesteps this hazard via open-time-only invocation; any \
+         future continuous-recovery reconciler must guard against it via \
+         per-(table, branch) queue acquisition. Got: {:?}",
+        ids,
+    );
+
+    // Sanity: bob's commit IS still readable via explicit checkout_version(2).
+    // The data isn't gone from disk — it's just unreachable from HEAD until
+    // cleanup_old_versions reclaims the orphan.
+    let v2 = Dataset::open(&uri)
+        .await
+        .unwrap()
+        .checkout_version(2)
+        .await
+        .unwrap();
+    let v2_batches: Vec<RecordBatch> = v2
+        .scan()
+        .try_into_stream()
+        .await
+        .unwrap()
+        .try_collect()
+        .await
+        .unwrap();
+    let v2_ids = collect_ids(&v2_batches);
+    assert_eq!(v2_ids, vec!["alice".to_string(), "bob".to_string()]);
+}
diff --git a/docs/branches-commits.md b/docs/branches-commits.md
index 4501822..de6c653 100644
--- a/docs/branches-commits.md
+++ b/docs/branches-commits.md
@@ -55,3 +55,9 @@ Filtered from `branch_list()` but visible to internals:
 
 - `__schema_apply_lock__` — serializes schema migrations.
 - `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate `is_internal_run_branch` is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).
+
+## L2 — Recovery audit trail
+
+The four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.
+
+Audit rows live in `_graph_commit_recoveries.lance` (sibling to `_graph_commits.lance`) and reference the commit graph by `graph_commit_id`. The linked recovery commit is identified by that same `graph_commit_id`, and `actor_id="omnigraph:recovery"` is stored in `_graph_commit_actors.lance` (joined by `graph_commit_id`) — `_graph_commits.lance` itself does not carry the `actor_id` column. To find recoveries for a specific original actor: `omnigraph commit list --filter actor=omnigraph:recovery`, then join to `_graph_commit_recoveries.lance` by `graph_commit_id` to read `recovery_for_actor`. Schema: see `crates/omnigraph/src/db/recovery_audit.rs`.
diff --git a/docs/invariants.md b/docs/invariants.md
index 73056e7..3c46b74 100644
--- a/docs/invariants.md
+++ b/docs/invariants.md
@@ -105,13 +105,13 @@ These are user-visible commitments. They state what the engine guarantees and wh
 Specific defaults (timeout values, memory caps, TTL windows) are *configuration*, not invariants — see [docs/constants.md](constants.md) and per-deployment configuration. The invariant is that bounds and contracts exist, not their numerical values.
 
 23. **Atomicity is per-query.** Every `.gq` query is atomic — multi-statement mutations are all-or-nothing via the substrate's atomic-commit primitive. No cross-query `BEGIN`/`COMMIT`; branches and merges fill that role for agent workflows.
-    *Status: upheld at the writer-trait surface for inserts / updates / scalar-index builds / merge_insert / overwrite after MR-793 PR #70 — the sealed `TableStorage` trait routes those through `stage_*` + `commit_staged`, so a Phase A failure (between writing fragments and committing) leaves no Lance-HEAD drift on touched tables. **Per-table commit_staged → manifest publish window remains** — a failure between commits across multiple touched tables can leave drift on the partially-committed tables. Lance has no multi-dataset atomic commit primitive; closing this requires the recovery-on-open reconciler tracked in MR-847. Additionally, two writer paths still inline-commit pending upstream Lance work: `delete_where` (lance-format/lance#6658) and `create_vector_index` (lance-format/lance#6666).*
+    *Status: upheld at the writer-trait surface, across process boundaries, AND in-process for the common case — the sealed `TableStorage` trait routes inserts / updates / scalar-index builds / merge_insert / overwrite through `stage_*` + `commit_staged` (Phase A is drift-free); the open-time recovery sweep in `db/manifest/recovery.rs` (sidecars at `__recovery/{ulid}.json` written by `MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) closes the per-table commit_staged → manifest publish residual on the next `Omnigraph::open`; and `Omnigraph::refresh` runs roll-forward-only recovery in-process so long-running servers close the common case (mutation/load finalize → publisher failure) without restart. The "Lance HEAD ahead of `__manifest`" drift class is unreachable for op-execution failures, recoverable across process boundaries for all writer kinds, and recoverable in-process for roll-forward-eligible sidecars. Sidecars that would require `Dataset::restore` are deferred to the next ReadWrite open (restore unsafe under concurrency); continuous in-process recovery for that case requires per-(table, branch) writer-queue acquisition and is the goal of a future background reconciler. Two writer paths still inline-commit pending upstream Lance work: `delete_where` (lance-format/lance#6658) and `create_vector_index` (lance-format/lance#6666).*
 
 24. **Schema integrity is strict at commit.** Type validation, required-field presence (auto-filled from `@default` if declared), uniqueness across batches and versions, and referential integrity — all enforced before commit succeeds. Per-write softening flags are opt-in, never default.
     *Status: aspirational — referential integrity at scale requires SIP-backed cross-table validation; not yet implemented. Cross-batch / cross-version uniqueness tracked in MR-714.*
 
 25. **Isolation: per-query snapshot; read-your-writes within and across queries in a session.** Each query reads from one consistent manifest version. Within a multi-statement mutation, the read subplan inside each write operator sees the writes from earlier statements. Across queries in a session, reads always resolve the latest manifest version — no reader pinning to older snapshots.
-    *Status: upheld for inserts/updates after MR-794 step 2+ — `MutationStaging`'s in-memory accumulator + `TableStore::scan_with_pending` (DataFusion `MemTable` union with the committed Lance scan, with merge-shadow semantics for chained updates) implements read-your-writes within a multi-statement mutation. Delete-touching mutations are limited to delete-only by parse-time D₂; closing the within-query RYW gap for deletes requires Lance's two-phase delete API (tracked: MR-793 / Lance-upstream lance-format/lance#6658). The "Lance HEAD ahead of `__manifest`" drift class is unreachable for op-execution failures (the partial-failure test pins this), but a narrowed residual remains at the finalize→publisher boundary because Lance has no multi-dataset commit primitive — see [docs/runs.md](runs.md) "Finalize → publisher residual".*
+    *Status: upheld for inserts/updates — `MutationStaging`'s in-memory accumulator + `TableStore::scan_with_pending` (DataFusion `MemTable` union with the committed Lance scan, with merge-shadow semantics for chained updates) implements read-your-writes within a multi-statement mutation. Delete-touching mutations are limited to delete-only by parse-time D₂; closing the within-query RYW gap for deletes requires Lance's two-phase delete API (Lance-upstream lance-format/lance#6658). The "Lance HEAD ahead of `__manifest`" drift class is unreachable for op-execution failures (the partial-failure test pins this), and the narrower finalize→publisher residual is closed across one open cycle by the open-time recovery sweep — see [docs/runs.md](runs.md) "Open-time recovery sweep".*
 
 26. **Durability before acknowledgement.** Commit returns only after the substrate has confirmed durable persistence. No "fast" or "fire-and-forget" durability levels.
 
diff --git a/docs/maintenance.md b/docs/maintenance.md
index aaab37f..08ae8da 100644
--- a/docs/maintenance.md
+++ b/docs/maintenance.md
@@ -16,6 +16,7 @@
 - `CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> }` — at least one is required.
 - Returns `[TableCleanupStats { table_key, bytes_removed, old_versions_removed }]`.
 - CLI guards with `--confirm`; without it, prints a preview line.
+- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
 
 ## Tombstones
 
diff --git a/docs/runs.md b/docs/runs.md
index 971801b..5c8dfd8 100644
--- a/docs/runs.md
+++ b/docs/runs.md
@@ -130,7 +130,7 @@ will replace it. Operator-driven (rare in agent workloads); document
 permanently until Lance exposes `Operation::Overwrite { fragments }` as
 a two-phase op.
 
-### Finalize → publisher residual
+### Open-time recovery sweep
 
 The staged-write rewire eliminates one drift class **by construction at
 the writer layer**: an op that fails before pushing to the in-memory
@@ -139,26 +139,85 @@ rejection) leaves Lance HEAD untouched on every staged table. This is
 the case the `partial_failure_leaves_target_queryable_and_unblocks_next_mutation`
 test pins.
 
-A second, narrower drift class remains. `MutationStaging::finalize`
-runs `stage_*` + `commit_staged` per touched table sequentially, then
-the publisher commits the manifest. Lance has no multi-dataset atomic
-commit, so the per-table `commit_staged` calls are independent
-operations: if commit_staged on table N+1 fails *after* commit_staged
-on tables 1..N succeeded, or if the publisher's CAS pre-check rejects
-*after* every commit_staged succeeded, tables 1..N are left at
-`Lance HEAD = manifest_pinned + 1`. The next mutation against those
-tables surfaces `ManifestConflictDetails::ExpectedVersionMismatch` —
-the same loud failure mode the rewire was designed to make rare, just
-no longer "unreachable."
+A second, narrower drift class — the **finalize → publisher window** —
+is closed across one open cycle by the open-time recovery sweep:
 
-Triggers: transient Lance write errors during finalize (object-store
-retry budget exhaustion, disk full); persistent publisher contention
-exceeding `PUBLISHER_RETRY_BUDGET = 5` retries. Closing this requires
-either a Lance multi-dataset atomic-commit primitive (filed upstream
-alongside the two-phase delete request) or a manifest-layer journal
-that replays staged commits on next open. Both are heavyweight; the
-v1 stance is "narrowed window, documented residual, surface the loud
-error when it fires."
+`MutationStaging::finalize` runs `stage_*` + `commit_staged` per touched
+table sequentially, then the publisher commits the manifest. Lance has
+no multi-dataset atomic commit, so the per-table `commit_staged` calls
+are independent operations: if commit_staged on table N+1 fails *after*
+commit_staged on tables 1..N succeeded, or if the publisher's CAS
+pre-check rejects *after* every commit_staged succeeded, tables 1..N
+are left at `Lance HEAD = manifest_pinned + 1`.
+
+**Recovery protocol** (lifecycle of every staged-write writer —
+`MutationStaging::finalize`, `schema_apply::apply_schema_with_lock`,
+`branch_merge_on_current_target`, `ensure_indices_for_branch`):
+
+1. **Phase A**: writer writes a sidecar JSON to
+   `__recovery/{ulid}.json` BEFORE its first `commit_staged`. The
+   sidecar names every `(table_key, table_path, expected_version,
+   post_commit_pin)` it intends to commit + the writer kind +
+   actor_id.
+2. **Phase B**: writer's per-table `commit_staged` loop runs.
+3. **Phase C**: publisher commits the manifest.
+4. **Phase D**: writer deletes the sidecar.
+
+> **Phase letter convention.** Throughout the recovery code, log
+> messages, failpoint names (e.g. `branch_merge.post_phase_b_pre_manifest_commit`),
+> and the per-writer integration tests, "Phase A/B/C/D" refers
+> exclusively to the four-step lifecycle above. The per-table
+> staged-write contract (`stage_*` then `commit_staged`, two steps)
+> is referred to by those API verbs — never by phase letters — so a
+> reader of `recovery.rs`, `failpoints.rs`, or this document only
+> encounters phase letters in the per-writer context.
+
+A failure between Phase A and Phase D leaves the sidecar on disk. The
+next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the
+recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`:
+
+- For each sidecar in `__recovery/`, compare every named table's
+  Lance HEAD to the manifest pin. Classify per the all-or-nothing
+  decision tree (RolledPastExpected / NoMovement / UnexpectedAtP1 /
+  UnexpectedMultistep / InvariantViolation).
+- If any table is `InvariantViolation` (Lance HEAD < manifest pinned —
+  should be impossible), **abort** with a loud error and leave the
+  sidecar on disk for operator review.
+- Otherwise, if every table is `RolledPastExpected`, **roll forward**:
+  a single `ManifestBatchPublisher::publish` call extends every pin
+  atomically. `SchemaApply` sidecars are eligible only when schema-state
+  recovery promoted the matching staging files in the same recovery pass;
+  otherwise full open-time recovery rolls them back and refresh-time
+  recovery leaves them for the next read-write open.
+- Otherwise **roll back**: per-table `Dataset::restore` to the
+  manifest-pinned table version for that branch. Rollback records the
+  actual restore target in the audit row's `to_version`.
+- After a successful roll-forward or roll-back, an audit row is
+  recorded — `_graph_commits.lance` carries
+  a commit tagged `actor_id = "omnigraph:recovery"`, and a sibling
+  `_graph_commit_recoveries.lance` row carries `recovery_kind`,
+  `recovery_for_actor` (the original sidecar's actor), `operation_id`,
+  per-table outcomes. Operators run `omnigraph commit list --filter
+  actor=omnigraph:recovery` to find recoveries.
+- Sidecar deleted as the final step.
+
+Triggers for the residual: transient Lance write errors during finalize
+(object-store retry budget exhaustion, disk full); persistent publisher
+contention exceeding `PUBLISHER_RETRY_BUDGET = 5` retries.
+
+**Long-running servers**: `Omnigraph::refresh` runs roll-forward-only
+recovery in-process — the common Phase B → Phase C residual closes
+without a restart. The next mutation on the same handle (after refresh)
+no longer surfaces `ExpectedVersionMismatch` for the failed table.
+Sidecars that would require a `Dataset::restore` (mixed / unexpected
+state) are deferred to the next `OpenMode::ReadWrite` open: restore is
+unsafe under concurrency because Lance's `check_restore_txn` accepts
+the restore against in-flight Append/Update/Delete commits and
+silently orphans them (pinned by
+`tests/staged_writes.rs::lance_restore_loses_to_concurrent_append_via_orphaning`).
+Continuous in-process recovery for the rollback path is the goal of a
+future background reconciler with per-(table, branch) writer-queue
+acquisition.
 
 The publisher-CAS contract is unchanged: a *concurrent writer* that
 advances any of our touched tables between snapshot capture and
diff --git a/docs/storage.md b/docs/storage.md
index 90806d0..825fbbe 100644
--- a/docs/storage.md
+++ b/docs/storage.md
@@ -62,13 +62,15 @@ flowchart TB
     manifest["__manifest/<br/>L2 catalog of sub-tables"]:::l2
     nodes["nodes/{fnv1a64-hex}/<br/>one dataset per node type"]:::l2
     edges["edges/{fnv1a64-hex}/<br/>one dataset per edge type"]:::l2
-    cgraph["_graph_commits.lance/<br/>_graph_commit_actors.lance/"]:::l2
+    cgraph["_graph_commits.lance/<br/>_graph_commit_actors.lance/<br/>_graph_commit_recoveries.lance/"]:::l2
+    recovery["__recovery/{ulid}.json<br/>recovery sidecars (transient)"]:::l2
     refs["_refs/branches/{name}.json<br/>graph-level branches"]:::l2
 
     repo --> manifest
     repo --> nodes
     repo --> edges
     repo --> cgraph
+    repo --> recovery
     repo --> refs
 
     subgraph dataset[Inside each Lance dataset — L1]
@@ -90,6 +92,8 @@ flowchart TB
 - **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
 - **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
 - **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 repos also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
+- **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
+- **`__recovery/{ulid}.json`** — transient sidecar files written by the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
 - **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
 - **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.
 
diff --git a/docs/testing.md b/docs/testing.md
index 4ba3cbc..015209b 100644
--- a/docs/testing.md
+++ b/docs/testing.md
@@ -32,7 +32,9 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 | `export.rs` | NDJSON streaming export filters |
 | `s3_storage.rs` | S3-backed repo (skipped unless `OMNIGRAPH_S3_TEST_BUCKET` is set) |
 | `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
-| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature) |
+| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the four per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`). |
+| `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
+| `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories). |
 
 ## Fixtures