mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-07-03 02:51:04 +02:00
Merge origin/main (v0.6.2, cluster Stage 2B) into ragnorc/scrutinize-rfc-002
Conflict resolutions: - cli/main.rs: keep the omnigraph_api_types import (branch extraction) and add main's omnigraph_cluster import; test-import list takes the branch's ResolvedCliGraph/is_remote_uri additions - Cargo manifests: union of deps — branch's config/queries/api-types crates plus main's cluster crate; all versions unified at 0.6.2 (new crates bumped from 0.6.1) - cli/Cargo.toml: omnigraph-server dep stays dropped (branch decision; CLI references it only in comments and the test harness binary spawn) - AGENTS.md: 0.6.2 + union crate list - cli-reference.md: branch's --graph/layered-config sections + main's cluster/repair rows - Cargo.lock regenerated cargo test --workspace --locked passes; scripts/check-agents-md.sh passes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
commit
48912167d0
91 changed files with 10889 additions and 736 deletions
15
AGENTS.md
15
AGENTS.md
|
|
@ -16,8 +16,8 @@ Tools that support `@`-imports (Claude Code) auto-include all three files via th
|
|||
|
||||
`CLAUDE.md` is a symlink to this file — there is exactly one source of truth. Edit `AGENTS.md`.
|
||||
|
||||
**Version surveyed:** 0.6.1
|
||||
**Workspace crates:** `omnigraph-compiler`, `omnigraph` (engine), `omnigraph-policy`, `omnigraph-config`, `omnigraph-queries`, `omnigraph-api-types`, `omnigraph-cli`, `omnigraph-server`
|
||||
**Version surveyed:** 0.6.2
|
||||
**Workspace crates:** `omnigraph-compiler`, `omnigraph` (engine), `omnigraph-policy`, `omnigraph-config`, `omnigraph-queries`, `omnigraph-api-types`, `omnigraph-cluster`, `omnigraph-cli`, `omnigraph-server`
|
||||
**Storage substrate:** Lance 6.x (columnar, versioned, branchable)
|
||||
**License:** MIT
|
||||
**Toolchain:** Rust stable, edition 2024
|
||||
|
|
@ -214,8 +214,12 @@ omnigraph schema apply --schema ./next.pg s3://my-bucket/graph.omni --json
|
|||
# Merge review branch back
|
||||
omnigraph branch merge review/2026-04-25 --into main s3://my-bucket/graph.omni
|
||||
|
||||
# Compact + GC (preview, then confirm)
|
||||
# Compact, preview any uncovered drift, then repair/GC after review
|
||||
omnigraph optimize s3://my-bucket/graph.omni
|
||||
omnigraph repair s3://my-bucket/graph.omni
|
||||
omnigraph repair --confirm s3://my-bucket/graph.omni
|
||||
# For suspicious/unverifiable drift only after deliberate review:
|
||||
# omnigraph repair --force --confirm s3://my-bucket/graph.omni
|
||||
omnigraph cleanup --keep 10 --older-than 7d s3://my-bucket/graph.omni
|
||||
omnigraph cleanup --keep 10 --older-than 7d --confirm s3://my-bucket/graph.omni
|
||||
|
||||
|
|
@ -236,8 +240,9 @@ omnigraph policy explain --actor act-alice --action change --branch main
|
|||
| Columnar storage on object store | ✅ Arrow/Lance | URI normalization, S3 env-var plumbing |
|
||||
| Per-dataset versioning + time travel | ✅ | `snapshot_at_version`, `entity_at`, snapshot-pinned reads across many tables |
|
||||
| Per-dataset branches | ✅ | **Graph-level** branches (atomic across all sub-tables), lazy fork, system branch filtering |
|
||||
| Atomic single-dataset commits | ✅ | **Multi-table publish via three layers**, NOT a single Lance primitive: (1) per-table Lance `commit_staged` for the data write, (2) `__manifest` row-level CAS via `ManifestBatchPublisher` for cross-table ordering, (3) the open-time recovery sweep for the residual gap between (1) and (2). All three layers ship; the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) write a `__recovery/{ulid}.json` sidecar before Phase B and delete it after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the sweep in `db/manifest/recovery.rs`: classify, decide all-or-nothing per sidecar, roll forward via single `ManifestBatchPublisher::publish` or roll back via `Dataset::restore`, and record an audit row in `_graph_commit_recoveries.lance` (queryable via `omnigraph commit list --filter actor=omnigraph:recovery`). Continuous in-process recovery (no restart needed between Phase B failure and recovery) is the goal of a future background reconciler. Engine writes route through a sealed `TableStorage` trait exposing `stage_*` + `commit_staged` as the canonical staged-write surface; documented inline-commit residuals (`delete_where`, `create_vector_index`, plus legacy `append_batch` / `merge_insert_batches` / `overwrite_batch` / `create_*_index`) remain on the trait until upstream Lance ships a public two-phase API ([#6658](https://github.com/lance-format/lance/issues/6658), [#6666](https://github.com/lance-format/lance/issues/6666)) and the migration of every call site completes. |
|
||||
| Compaction (`compact_files`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
|
||||
| Atomic single-dataset commits | ✅ | **Multi-table publish via three layers**, NOT a single Lance primitive: (1) per-table Lance `commit_staged` for the data write, (2) `__manifest` row-level CAS via `ManifestBatchPublisher` for cross-table ordering, (3) the open-time recovery sweep for the residual gap between (1) and (2). All three layers ship; the five migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`, `optimize_all_tables`) write a `__recovery/{ulid}.json` sidecar before Phase B and delete it after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the sweep in `db/manifest/recovery.rs`: classify, decide all-or-nothing per sidecar, roll forward via single `ManifestBatchPublisher::publish` or roll back via `Dataset::restore` followed by a manifest publish of the restored version (so both directions converge to `manifest == HEAD` — no residual drift), and record an audit row in `_graph_commit_recoveries.lance` (queryable via `omnigraph commit list --filter actor=omnigraph:recovery`). Continuous in-process recovery (no restart needed between Phase B failure and recovery) is the goal of a future background reconciler. Engine writes route through a sealed `TableStorage` trait exposing `stage_*` + `commit_staged` as the canonical staged-write surface; documented inline-commit residuals (`delete_where`, `create_vector_index`, plus legacy `append_batch` / `merge_insert_batches` / `overwrite_batch` / `create_*_index`) remain on the trait until upstream Lance ships a public two-phase API ([#6658](https://github.com/lance-format/lance/issues/6658), [#6666](https://github.com/lance-format/lance/issues/6666)) and the migration of every call site completes. |
|
||||
| Compaction (`compact_files`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; **publishes each compacted table's new version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe compaction and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage; **refuses on an unrecovered graph** (errors if a `__recovery` sidecar is pending); **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair` instead of interpreting it; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
|
||||
| Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. |
|
||||
| Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
|
||||
| BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them on every relevant column; idempotent; lazy across branches |
|
||||
| `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering |
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue