omnigraph/docs/user/reference/constants.md
Andrew Altshuler 77dffdae92
docs(user): de-dev polish — strip internal scaffolding from user docs (Phase 3a) (#226)
Remove developer-only scaffolding that leaked into the public user/operator
docs, while preserving every user-facing behavior, command, flag, endpoint,
constant, and env var. No behavior changes.

Removed across 18 files:
- internal ticket / sequencing refs (MR-NNN, RFC-NNN, "Phase N");
- source-code paths (crates/**/*.rs, *.pest) and internal struct/function
  dumps (e.g. the QueryIR / GraphCommit / SchemaMigrationPlan Rust types,
  internal fn names like fork_branch_from_state, optimize_all_tables);
- Lance-internal blocker prose (upstream issue numbers, blob-decode cause,
  sidecar Phase-B/C mechanics) — keeping the user-visible behavior (e.g.
  "optimize skips Blob-column tables; reads/writes unaffected");
- pre-v0.4.0 Run-state-machine archaeology.

Internal IR/lowering/recovery-internals sections were either trimmed to a
brief user-facing note (e.g. "Traversal execution", "interrupted writes
recover automatically; recovery commits are recorded under actor
omnigraph:recovery") or removed.

Kept: all language syntax, lint codes, Cedar actions/scopes, endpoints,
error taxonomy, every constant and env var (verified none dropped from the
constants cheat-sheet), and the operator-facing explanations of on-disk
artifacts. Residual "legacy" mentions are all user-facing (the deprecated
omnigraph.yaml, the legacy token chain, old command names).

Verified: zero internal-scaffolding leaks (MR/RFC/Phase/.rs/.pest = 0) across
docs/user; zero broken links; check-agents-md.sh green.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 14:39:25 +03:00

2.7 KiB
Raw Blame History

Constants & Tunables (cheat sheet)

Name Value Area
MANIFEST_DIR __manifest manifest layout
Commit graph dir _graph_commits.lance commit graph
Run registry dir (legacy, removed) _graph_runs.lance inert post-v0.4.0; bytes remain until a prefix-delete primitive lands
Run branch prefix (legacy, removed) __run__ swept off __manifest by the internal schema migration; no longer a reserved name
Schema apply lock __schema_apply_lock__ schema apply
Manifest publisher retry budget PUBLISHER_RETRY_BUDGET = 5 manifest publish
Internal manifest schema version INTERNAL_MANIFEST_SCHEMA_VERSION = 3 manifest migrations
Merge stage batch MERGE_STAGE_BATCH_ROWS = 8192 merge execution
Maintenance concurrency OMNIGRAPH_MAINTENANCE_CONCURRENCY=8 optimize/cleanup
Lance blob compaction support LANCE_SUPPORTS_BLOB_COMPACTION = false optimize
Graph index cache size 8 (LRU) runtime cache
Expand indexed-path frontier ceiling OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024 traversal
Expand indexed-path hop ceiling OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6 traversal
Expand CSR-build cost factor CSR_BUILD_FACTOR = 1.5 traversal
Expand mode override OMNIGRAPH_TRAVERSAL_MODE (indexed|csr; unset = cost-based auto) traversal
Default body limit 1 MB HTTP server
Ingest body limit 32 MB HTTP server
Engine embed model gemini-embedding-2-preview engine embedding
Compiler embed model text-embedding-3-small compiler embedding
Embed timeout 30 000 ms both clients
Embed retries 4 both clients
Embed retry backoff 200 ms both clients
LANCE memory pool default 1 GB (raised in v0.3.0) runtime

Expand traversal dispatch. With OMNIGRAPH_TRAVERSAL_MODE unset, the engine chooses the indexed (per-hop BTREE) vs CSR (whole-graph in-memory) path with a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus the index-coverage signal: the indexed path is preferred when its frontier-relative work beats building the CSR (≈ when hops × frontier is a small fraction of the source-vertex set), and CSR is preferred for dense/deep traversals or when the BTREE coverage is degraded and a full scan would be paid per hop. The two ceilings bound the initial dispatch frontier/hops (beyond them CSR is always used); they are not a hard per-hop bound — the cost model estimates total indexed work as ~hops × frontier × fanout, so dense fan-out is priced toward CSR rather than capped mid-traversal. The override flag forces a path (the auto result is identical either way; only the path differs).