omnigraph/crates/omnigraph-cli/Cargo.toml

33 lines
993 B
TOML
Raw Normal View History

2026-04-10 20:49:41 +03:00
[package]
name = "omnigraph-cli"
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) * Parallel per-type load writes + omnigraph optimize/cleanup CLI ## MR-677.3 — parallel per-type load writes The load path already groups records into one RecordBatch per type and makes one Lance commit per table (loader::mod.rs:249-..), but those commits ran sequentially. Wrap node and edge write loops in `futures::stream::buffered(N)` against a new helper `write_batches_concurrently`. Concurrency tunable via `OMNIGRAPH_LOAD_CONCURRENCY` (default 8). ## MR-676 — `omnigraph optimize` and `omnigraph cleanup` New CLI subcommands that walk every node + edge table in the repo: - `omnigraph optimize <uri>` — runs Lance `compact_files` on each table to merge small fragments into fewer larger ones. - `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` — runs Lance `cleanup_old_versions` to prune historical manifests + unique fragments. Requires `--confirm` because it's destructive. Supports both count-based and time-based retention (or both AND'd together). Time uses chrono `DateTime<Utc>` (added as a workspace dep, default-features off). Both commands run their per-table loops in parallel (8-way bounded, `OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested against the 114-table prod graph: optimize went 7m15s sequential → 1m28s parallel. cleanup --keep 1 removed 137 historical versions across 114 tables in 1m57s without disrupting `/healthz` or query responses. Public API on `Omnigraph`: pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>> pub async fn cleanup(&mut self, opts: CleanupPolicyOptions) -> Result<Vec<TableCleanupStats>> All 10 existing loader tests still pass. Closes MR-676. Partially addresses MR-677 (the .3 — parallel by type — piece; MR-677.1 is for the `omnigraph embed` path, not load, since load doesn't call Gemini directly. .2 was already in place). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate openapi.json --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
version = "0.3.1"
2026-04-10 20:49:41 +03:00
edition = "2024"
description = "CLI for the Omnigraph graph database."
license = "MIT"
2026-04-14 20:13:00 +03:00
repository = "https://github.com/ModernRelay/omnigraph"
homepage = "https://github.com/ModernRelay/omnigraph"
documentation = "https://docs.rs/omnigraph-cli"
2026-04-10 20:49:41 +03:00
[[bin]]
name = "omnigraph"
path = "src/main.rs"
[dependencies]
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) * Parallel per-type load writes + omnigraph optimize/cleanup CLI ## MR-677.3 — parallel per-type load writes The load path already groups records into one RecordBatch per type and makes one Lance commit per table (loader::mod.rs:249-..), but those commits ran sequentially. Wrap node and edge write loops in `futures::stream::buffered(N)` against a new helper `write_batches_concurrently`. Concurrency tunable via `OMNIGRAPH_LOAD_CONCURRENCY` (default 8). ## MR-676 — `omnigraph optimize` and `omnigraph cleanup` New CLI subcommands that walk every node + edge table in the repo: - `omnigraph optimize <uri>` — runs Lance `compact_files` on each table to merge small fragments into fewer larger ones. - `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` — runs Lance `cleanup_old_versions` to prune historical manifests + unique fragments. Requires `--confirm` because it's destructive. Supports both count-based and time-based retention (or both AND'd together). Time uses chrono `DateTime<Utc>` (added as a workspace dep, default-features off). Both commands run their per-table loops in parallel (8-way bounded, `OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested against the 114-table prod graph: optimize went 7m15s sequential → 1m28s parallel. cleanup --keep 1 removed 137 historical versions across 114 tables in 1m57s without disrupting `/healthz` or query responses. Public API on `Omnigraph`: pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>> pub async fn cleanup(&mut self, opts: CleanupPolicyOptions) -> Result<Vec<TableCleanupStats>> All 10 existing loader tests still pass. Closes MR-676. Partially addresses MR-677 (the .3 — parallel by type — piece; MR-677.1 is for the `omnigraph embed` path, not load, since load doesn't call Gemini directly. .2 was already in place). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate openapi.json --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
omnigraph = { package = "omnigraph-engine", path = "../omnigraph", version = "0.3.1" }
omnigraph-compiler = { path = "../omnigraph-compiler", version = "0.3.1" }
omnigraph-server = { path = "../omnigraph-server", version = "0.3.1" }
2026-04-10 20:49:41 +03:00
clap = { workspace = true }
color-eyre = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
serde_yaml = { workspace = true }
tokio = { workspace = true }
reqwest = { workspace = true, features = ["blocking"] }
[dev-dependencies]
assert_cmd = "2"
predicates = "3"
serde_json = { workspace = true }
tempfile = { workspace = true }
lance-index = { workspace = true }