mirror of https://github.com/ModernRelay/omnigraph.git, synced 2026-06-09 01:35:18 +02:00

docs: split user and developer docs (#93)

parent e8d49559c4
commit 60eee78465

39 changed files with 499 additions and 445 deletions
@@ -10,7 +10,7 @@ Three views, increasing zoom:
 2. **Layer view** — the eight-layer stack inside one OmniGraph process.
 3. **Component zoom-ins** — what's inside each layer.

-For runtime flows (read query, mutation), see [`docs/execution.md`](execution.md). For the on-disk layout of a repo, see [`docs/storage.md`](storage.md).
+For runtime flows (read query, mutation), see [`docs/dev/execution.md`](execution.md). For the on-disk layout of a repo, see [`docs/user/storage.md`](../user/storage.md).

 L1 (orange in the diagrams) is what we inherit from Lance; L2 (blue) is what OmniGraph adds. The L1/L2 framing is also called out in prose at the bottom of this doc.

@@ -86,7 +86,7 @@ flowchart TB
 lance_layer -- bytes --> object_store
 ```

-The `storage trait` row is partly aspirational. Today the engine calls `lance::Dataset` methods through `table_store`; a capability-bearing `Dataset` trait per [`docs/invariants.md`](invariants.md) §I.4 is on the roadmap (MR-737). The diagram shows the intended seam.
+The storage seam is partly aspirational. `TableStorage` exists as the sealed staged-write trait, but capability/stat surfaces and full call-site migration are still roadmap. The diagram shows the intended boundary.

 ## Component zoom-ins

@@ -174,7 +174,7 @@ Code paths:

 Inserts and updates inside `mutate_as` and the bulk loader's
 Append/Merge modes go through `MutationStaging`
-([`crates/omnigraph/src/exec/staging.rs`](../crates/omnigraph/src/exec/staging.rs)),
+([`crates/omnigraph/src/exec/staging.rs`](../../crates/omnigraph/src/exec/staging.rs)),
 a per-query in-memory accumulator. No Lance HEAD advance happens during
 op execution; one `stage_*` + `commit_staged` per touched table runs
 at end-of-query, then the publisher commits the manifest atomically.
@@ -204,11 +204,10 @@ contracts:
 the committed snapshot at the captured `expected_version` and unions
 with a DataFusion `MemTable` over the pending batches.

-This pattern realizes [docs/invariants.md §VI.25](invariants.md)
-(read-your-writes within a multi-statement mutation) and §VI.32
-(failure scope bounded) for inserts/updates by construction at the
-writer layer. See [docs/runs.md](runs.md) for the publisher CAS
-contract this builds on.
+This pattern realizes read-your-writes within a multi-statement mutation
+and keeps failure scope bounded for inserts/updates by construction at
+the writer layer. See [docs/dev/invariants.md](invariants.md) and
+[docs/dev/runs.md](runs.md) for the publisher CAS contract this builds on.

 ### Storage trait — today vs. roadmap

@@ -222,10 +221,10 @@ flowchart LR
 d2[storage.rs<br/>S3 / file URI plumbing]:::now
 end

-subgraph roadmap[Roadmap — invariants §I.4]
+subgraph roadmap[Roadmap - storage capabilities]
 t[trait Dataset<br/>schema · stats · placement<br/>capabilities · scan · write]:::future
 impl1[LanceStorage]:::future
-impl2[MemStorage for tests]:::future
+impl2[future test impl]:::future
 end

 today -.-> roadmap
@@ -233,7 +232,7 @@ flowchart LR
 t --> impl2
 ```

-The storage layer's trait surface is aspirational. Today the engine calls `lance::Dataset` methods directly. The roadmap (per [`docs/invariants.md`](invariants.md) §I.4 and MR-737) is a `Dataset` trait that surfaces capabilities and statistics so the planner can reason about pushdown opportunities.
+The staged-write trait exists today as `TableStorage`, implemented by `TableStore`. Full engine migration plus capability and statistics surfaces remain roadmap, so the planner cannot yet reason about all pushdown opportunities through a documented trait surface.

 ### Index lifecycle — today vs. roadmap

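
As a rough illustration of what "sealed staged-write trait" means in the hunk above, the sketch below uses the common Rust sealed-trait pattern. Only the names `TableStorage`, `TableStore`, `stage_append`, and `commit_staged` come from the docs; the `Sealed` supertrait, the method signatures, the `committed_len` helper, and the in-memory fields are assumptions made for the example, not the engine's actual definitions.

```rust
/// Private module: downstream crates cannot name `Sealed`,
/// so they cannot implement `TableStorage` for their own types.
mod sealed {
    pub trait Sealed {}
}

/// Illustrative staged-write surface (signatures are hypothetical).
pub trait TableStorage: sealed::Sealed {
    /// Buffer rows without making them visible to readers.
    fn stage_append(&mut self, rows: Vec<String>);
    /// Make all staged rows visible and return the new table version.
    fn commit_staged(&mut self) -> u64;
}

/// Toy in-memory stand-in for the real `TableStore`.
#[derive(Default)]
struct TableStore {
    version: u64,
    committed: Vec<String>,
    staged: Vec<String>,
}

impl TableStore {
    /// Number of rows visible to readers (hypothetical helper).
    fn committed_len(&self) -> usize {
        self.committed.len()
    }
}

impl sealed::Sealed for TableStore {}

impl TableStorage for TableStore {
    fn stage_append(&mut self, mut rows: Vec<String>) {
        self.staged.append(&mut rows);
    }

    fn commit_staged(&mut self) -> u64 {
        // Move staged rows into the committed set, then bump the version once.
        self.committed.append(&mut self.staged);
        self.version += 1;
        self.version
    }
}
```

Sealing keeps the set of implementors closed, which lets a trait grow new required methods without breaking outside implementations.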
@@ -247,7 +246,7 @@ flowchart LR
 manual[called manually<br/>or from optimize]:::now
 end

-subgraph roadmap[Roadmap — invariants §VII.38]
+subgraph roadmap[Roadmap - manifest reconciler]
 rec[Reconciler<br/>observes manifest]:::future
 diff[coverage diff<br/>fragments − fragment_bitmap]:::future
 wp[worker pool<br/>builds index segments]:::future
@@ -258,7 +257,7 @@ flowchart LR
 rec --> diff --> wp
 ```

-Today, indexes are built explicitly via `ensure_indices`. Reads degrade gracefully when index coverage is partial — Lance's scanner unions indexed and scan paths automatically. The roadmap reconciler (per [`docs/invariants.md`](invariants.md) §VII.38) observes manifest state and converges coverage in the background.
+Today, indexes are built explicitly via `ensure_indices`. Reads degrade gracefully when index coverage is partial — Lance's scanner unions indexed and scan paths automatically. The roadmap reconciler observes manifest state and converges coverage in the background.

 ### Server / CLI

@@ -279,7 +278,7 @@ flowchart LR
 eng --> wq
 ```

-The server applies Cedar policy at the HTTP boundary today (per [`docs/invariants.md`](invariants.md) §VII.45, the roadmap is to push policy into the planner as predicates). After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [server.md](server.md) "Per-actor admission control" and [runs.md](runs.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.
+The server applies Cedar policy at the HTTP boundary today. The roadmap, called out in [docs/dev/invariants.md](invariants.md) as a known gap, is to push policy into the planner as predicates. After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [docs/user/server.md](../user/server.md) "Per-actor admission control" and [docs/dev/runs.md](runs.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.

 Code paths:

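
The per-key serialization described in the hunk above (same `(table, branch)` serializes, disjoint keys run in parallel) can be pictured with a small synchronous sketch. This is an assumption-laden illustration, not the real `WriteQueueManager`: the type `WriteQueueSketch`, its methods, and the use of blocking `std::sync::Mutex` are all hypothetical, and the real engine may well use an async queue instead.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// (table, branch) key, as described in the docs.
type Key = (String, String);

/// Hypothetical stand-in: one mutex per (table, branch) key.
#[derive(Default)]
struct WriteQueueSketch {
    locks: Mutex<HashMap<Key, Arc<Mutex<()>>>>,
}

impl WriteQueueSketch {
    /// Return the shared lock for a key, creating it on first use.
    fn lock_for(&self, table: &str, branch: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry((table.to_string(), branch.to_string()))
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }

    /// Run `f` while holding only this key's lock; writers on other
    /// keys are not blocked because they hold different mutexes.
    fn with_write<T>(&self, table: &str, branch: &str, f: impl FnOnce() -> T) -> T {
        let lock = self.lock_for(table, branch);
        let _guard = lock.lock().unwrap();
        f()
    }
}
```

The outer map lock is held only long enough to look up or create the per-key mutex, so it does not serialize the mutations themselves.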
@@ -77,7 +77,7 @@ The branch-protection policy is the foundation. Future hardening adds:

 - **Required signed commits** (`required_signatures: true`) — once maintainers enroll GPG/SSH signing.
 - **Tag protection** for `v*` tags via `repos/.../tags/protection`.
-- **Required reviewers from specific teams** for high-leverage paths (e.g., `docs/invariants.md`) via CODEOWNERS tier expansion + the N-unique-approvers CI workaround.
+- **Required reviewers from specific teams** for high-leverage paths (e.g., `docs/dev/invariants.md`) via CODEOWNERS tier expansion + the N-unique-approvers CI workaround.
 - **More required CI checks**: `cargo deny`, `cargo audit`, `cargo fmt --check`, `cargo clippy -D warnings`, CodeQL, secret scanning, schema-lint (MR-946).

 See the hardening playbook for the full plan.
@@ -147,7 +147,7 @@ sequenceDiagram
 - End-of-query Lance commit: `TableStore::stage_append`, `stage_merge_insert`, `commit_staged` at `crates/omnigraph/src/table_store.rs`
 - Manifest commit primitive: `commit_updates_on_branch_with_expected` at `crates/omnigraph/src/db/omnigraph/table_ops.rs`

-Atomicity guarantee for multi-statement mutations: a mid-query failure leaves Lance HEAD untouched on staged tables (no inline commit happened during op execution), so the next mutation proceeds normally with no `ExpectedVersionMismatch`. The publisher CAS at the very end either succeeds (manifest advances atomically across all touched sub-tables) or fails with a typed `ManifestConflictDetails::ExpectedVersionMismatch` (no partial publish). See [docs/invariants.md §VI.25 / §VI.32](invariants.md) and [docs/runs.md](runs.md).
+Atomicity guarantee for multi-statement mutations: a mid-query failure leaves Lance HEAD untouched on staged tables (no inline commit happened during op execution), so the next mutation proceeds normally with no `ExpectedVersionMismatch`. The publisher CAS at the very end either succeeds (manifest advances atomically across all touched sub-tables) or fails with a typed `ManifestConflictDetails::ExpectedVersionMismatch` (no partial publish). See [docs/dev/invariants.md](invariants.md) and [docs/dev/runs.md](runs.md).

 ## Bulk loader (`loader/mod.rs`)

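
The all-or-nothing publisher CAS described in the atomicity paragraph above can be sketched as a two-phase check-then-apply over an in-memory manifest. This is a minimal model under stated assumptions: the `Manifest` and `Update` aliases, the `publish` function, and the `u64` field types are hypothetical; only the variant name and its `table_key` / `expected` / `actual` fields mirror the docs.

```rust
use std::collections::BTreeMap;

/// Mirrors the documented variant shape; field types are assumed.
#[derive(Debug, PartialEq)]
enum ManifestConflictDetails {
    ExpectedVersionMismatch { table_key: String, expected: u64, actual: u64 },
}

/// table_key -> currently published version (hypothetical model).
type Manifest = BTreeMap<String, u64>;

/// (table_key, expected_version, new_version)
type Update = (String, u64, u64);

/// Publish every update or none of them.
fn publish(manifest: &mut Manifest, updates: &[Update]) -> Result<(), ManifestConflictDetails> {
    // Phase 1: verify every expectation before mutating anything.
    for (key, expected, _) in updates {
        let actual = manifest.get(key).copied().unwrap_or(0);
        if actual != *expected {
            return Err(ManifestConflictDetails::ExpectedVersionMismatch {
                table_key: key.clone(),
                expected: *expected,
                actual,
            });
        }
    }
    // Phase 2: all checks passed, so advance every touched table together.
    for (key, _, new_version) in updates {
        manifest.insert(key.clone(), *new_version);
    }
    Ok(())
}
```

Because the check phase finishes before the first write, a mismatch on any one table leaves the whole manifest untouched, which is the "no partial publish" property. A real publisher would also need the check and apply to be atomic against concurrent writers; this single-threaded sketch does not model that.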
@@ -177,4 +177,4 @@ For Append/Merge, a mid-load failure (RI / cardinality violation, validation err

 ## Embeddings during load

-If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](embeddings.md).
+If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](../user/embeddings.md).
58
docs/dev/index.md
Normal file
@@ -0,0 +1,58 @@
# Developer Docs

**Audience:** contributors, maintainers, and coding agents

This is the contributor-facing entry point. These docs explain architecture,
invariants, implementation contracts, test ownership, and upstream Lance
constraints. User-facing behavior should still be documented through
[docs/user/index.md](../user/index.md) and the relevant public reference docs.

## Required For Every Non-Trivial Change

| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | [invariants.md](invariants.md) |
| Upstream Lance source-of-truth index | [lance.md](lance.md) |
| Existing test coverage and test placement | [testing.md](testing.md) |

## Architecture And Storage

| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | [architecture.md](architecture.md) |
| On-disk layout, manifest schema, URI behavior | [storage.md](../user/storage.md) |
| Direct-publish writes, D2, staged writes, recovery sidecars | [runs.md](runs.md) |
| Query execution, mutation execution, loader flow | [execution.md](execution.md) |
| Index lifecycle and graph topology indexes | [indexes.md](../user/indexes.md) |
| Branch and commit internals | [branches-commits.md](../user/branches-commits.md) |
| Three-way merge implementation and conflicts | [merge.md](merge.md) |
| Diff/change-feed implementation | [changes.md](../user/changes.md) |
| Branch protection policy | [branch-protection.md](branch-protection.md) |
| CODEOWNERS source of truth | [codeowners.md](codeowners.md) |

## Language, Runtime, And Boundaries

| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema-language.md) |
| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/query-language.md) |
| Embedding client and `@embed` integration | [embeddings.md](../user/embeddings.md) |
| Cedar policy surface and server gating | [policy.md](../user/policy.md) |
| Server auth, OpenAPI, endpoint handlers | [server.md](../user/server.md) |
| Error taxonomy and serialization | [errors.md](../user/errors.md) |
| Constants and tunables | [constants.md](../user/constants.md) |
| Transaction model public contract | [transactions.md](../user/transactions.md) |

## Project Operations

| Area | Read |
|---|---|
| CI and release workflows | [ci.md](ci.md) |
| Install and deployment packaging | [install.md](../user/install.md), [deployment.md](../user/deployment.md) |
| Release history | [releases/](../releases/) |

## Boundary

Developer docs may mention implementation details, stale gaps, upstream Lance
blockers, and review rules. User docs should not require that context unless
the detail changes the public contract.
206
docs/dev/invariants.md
Normal file
@@ -0,0 +1,206 @@
# Architectural Invariants

**Type:** standing review checklist
**Status:** living document
**Audience:** anyone proposing, reviewing, or implementing an OmniGraph change

This file is intentionally short. It records the rules that should be in
working memory for every non-trivial change. Detailed mechanics live in the
area docs linked below.

Use it this way:

- Review the change against **Hard Invariants** and the **Deny-list**.
- If code and docs disagree, either fix the code or add/update a **Known Gap**.
- Keep implementation ledgers, roadmap detail, and historical MR notes in the
  per-area docs. This file is the filter, not the encyclopedia.

## Hard Invariants

1. **Respect the substrate.** Lance owns columnar storage, per-dataset
   versioning, fragments, branches, compaction, cleanup, and index primitives.
   DataFusion should own relational execution where it fits. Do not add custom
   WALs, transaction managers, buffer pools, page formats, or local clones of
   substrate behavior. Read [lance.md](lance.md) before guessing.

2. **Graph visibility is manifest-atomic.** Lance commits are per dataset.
   OmniGraph's graph-level atomicity comes from publishing one manifest update
   for the whole graph, guarded by expected table versions and sidecar recovery.
   No write path may make a subset of touched node/edge tables visible as a
   graph commit.

3. **A query reads one snapshot.** Query execution captures a manifest snapshot
   for its lifetime. Do not re-read branch head mid-query to discover newer
   table versions.

4. **Mutations publish at one boundary.** A `mutate_as` or `load` operation
   accumulates constructive writes, commits each touched table at the end, then
   publishes one manifest update. Do not commit per statement. Delete-only
   queries are the documented inline residual; the parse-time D2 rule prevents
   mixing deletes with insert/update until Lance exposes two-phase delete.
   Read [runs.md](runs.md) and [execution.md](execution.md).

5. **Recovery is part of the commit protocol.** Writers that can advance Lance
   HEAD before manifest publish must write `__recovery/{ulid}.json` sidecars.
   `Omnigraph::open` in read-write mode runs the all-or-nothing sweep, and
   `refresh` runs roll-forward-only recovery for long-lived processes. Do not
   add a new writer kind without sidecar coverage or an explicit proof that no
   Lance HEAD can move before manifest publish.

6. **Strong consistency is the default.** Reads are snapshot-isolated, writes
   are durable before acknowledgement, and branch reads observe the current
   committed graph state. Any eventual-consistency mode must be explicit,
   read-only, auditable, and non-default.

7. **Indexes are derived state.** Reads must see the correct result for the
   branch they read even when index coverage is partial. Expensive index work
   should converge from manifest state instead of extending the critical write
   path. Scalar staged index builds and vector inline residuals are documented
   in [runs.md](runs.md) and [indexes.md](../user/indexes.md).

8. **Schema identity survives renames.** Accepted schema identity must remain
   stable across type and property renames. Rename support belongs in migration
   planning, not in "drop and recreate" behavior. See the known gap below.

9. **Schema/data integrity failures are loud.** Type errors, required-field
   misses, invalid edge endpoints, cardinality violations, and unsupported
   mixed mutation modes fail before a graph commit is published. The system must
   not invent placeholder nodes or silently weaken integrity.

10. **Query semantics are first-class IR concepts.** Search modes, mutations,
    polymorphism, traversal, retrieval scores, imports, and policy predicates
    belong in typed AST/IR/planner structures. Do not smuggle semantics through
    strings, side tables, global state, or transport-specific flags.

11. **Transport/auth stay at the boundary.** Kernel crates should not depend on
    HTTP, OpenAPI, bearer-token parsing, or future transport protocols. The
    server resolves bearer tokens to actors; clients cannot set actor identity
    directly.

12. **Bearer-token plaintext is not retained.** Server startup hashes bearer
    tokens, authentication uses constant-time comparison, and request handling
    carries only the resolved actor identity and hash-derived match state.

13. **Operational failures are bounded and observable.** Timeout, memory, OOM,
    partial result, recovery, and conflict paths must fail loudly or degrade in
    a documented way. If a metric affects plan choice or operator behavior, it
    must be exposed through the relevant trait or observability surface.

14. **Tests match the boundary being changed.** Prefer extending the existing
    test that owns the area. Planner changes need planner-level coverage,
    storage changes need storage/recovery coverage, and end-to-end tests are not
    a substitute for missing lower-level assertions. Read [testing.md](testing.md)
    before adding tests.

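
To make invariant 12 concrete, here is a minimal sketch of comparing token digests without an early exit on the first mismatching byte. Everything in it is an assumption for illustration: the function names, the `(actor, digest)` storage layout, and the digest bytes themselves (the standard library has no cryptographic hash, so hashing is left out). A production server would more likely rely on a vetted constant-time crate, since the compiler gives no timing guarantees for hand-written loops.

```rust
/// Compare two digests by OR-ing together all byte differences,
/// so the loop does the same work wherever a mismatch occurs.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        // Digest length is fixed and public, so this check leaks nothing secret.
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Resolve a presented token digest to an actor name (hypothetical layout).
/// Every stored digest is checked; there is no early return on a match.
fn resolve_actor<'a>(stored: &'a [(String, Vec<u8>)], presented: &[u8]) -> Option<&'a str> {
    let mut found = None;
    for (actor, digest) in stored {
        if constant_time_eq(digest, presented) {
            found = Some(actor.as_str());
        }
    }
    found
}
```

Only the resolved actor name leaves `resolve_actor`, which matches the rule that request handling carries the actor identity rather than token material.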
## Current Truth Matrix

| Area | Current state | Source |
|---|---|---|
| Multi-table commit | Manifest CAS plus recovery sidecars; not a single Lance primitive | [runs.md](runs.md), [architecture.md](architecture.md) |
| Constructive mutations | In-memory `MutationStaging`, one end-of-query table commit per touched table, then one manifest publish | [runs.md](runs.md), [execution.md](execution.md) |
| Deletes | Inline-commit residual; delete-only queries allowed, mixed insert/update/delete rejected by D2 | [query-language.md](../user/query-language.md), [runs.md](runs.md) |
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema-language.md), [execution.md](execution.md) |
| Unique constraints | Intra-batch and write-path checks exist; full cross-version uniqueness is still a gap | [schema-language.md](../user/schema-language.md) |
| Storage trait | `TableStorage` exists as the sealed staged-write surface; full call-site migration and capability/stat surfaces are incomplete | [runs.md](runs.md), [architecture.md](architecture.md) |
| Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/indexes.md), [maintenance.md](../user/maintenance.md) |
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/query-language.md) |
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/server.md), [policy.md](../user/policy.md) |
| Tests | Tempdir-backed Lance tests are the current substrate; there is no `MemStorage` test backend | [testing.md](testing.md) |

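
The "Constructive mutations" row above (in-memory staging, one commit per touched table, then one manifest publish) can be pictured with a small accumulator. This is a sketch under assumptions, not the real `MutationStaging`: the `StagingSketch` type, its methods, and the use of `Vec<String>` in place of Arrow record batches are all hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a pending row batch.
type Batch = Vec<String>;

/// Illustrative per-query accumulator keyed by table name.
#[derive(Default)]
struct StagingSketch {
    /// table -> (version captured on first touch, pending batches)
    pending: BTreeMap<String, (u64, Vec<Batch>)>,
}

impl StagingSketch {
    /// Record a write in memory only; no table HEAD moves here.
    /// The expected version is captured on the first touch and kept afterwards.
    fn stage(&mut self, table: &str, current_version: u64, batch: Batch) {
        let entry = self
            .pending
            .entry(table.to_string())
            .or_insert_with(|| (current_version, Vec::new()));
        entry.1.push(batch);
    }

    /// End of query: yield one (table, expected_version, total_rows) entry
    /// per touched table, which a publisher could then commit together.
    fn finish(self) -> Vec<(String, u64, usize)> {
        self.pending
            .into_iter()
            .map(|(table, (expected, batches))| {
                let rows: usize = batches.iter().map(Vec::len).sum();
                (table, expected, rows)
            })
            .collect()
    }
}
```

If the query fails before `finish`, dropping the accumulator discards every pending batch, so nothing was ever made visible.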
## Known Gaps

Do not hide these behind invariant wording. Either move them forward or keep
them explicit.

- **Rename-stable schema identity:** the invariant is that accepted IDs survive
  renames. The current compiler still derives type IDs from `kind:name`; this
  must be fixed before relying on renamed IDs across accepted schemas.
- **Storage abstraction:** `TableStorage` is present, sealed, and canonical for
  staged writes, but older inherent `TableStore` call sites and inline residuals
  remain. New write paths should use the staged shape unless a documented Lance
  blocker applies.
- **Deletes and vector indexes:** `delete_where` and vector index creation still
  advance Lance HEAD inline because the required public Lance APIs are missing.
  Keep D2 and recovery coverage in place until those residuals are removed.
- **Planner capability/stat surfaces:** cost-aware planning, complete
  capability advertisement, and explain-with-cost are roadmap. Do not describe
  them as implemented.
- **Traversal execution:** current multi-hop execution still uses `TypeIndex`,
  ad-hoc ID filtering, and eager materialization in places. Stable row IDs, SIP,
  and factorization are target patterns, not current fact.
- **Retrieval ranks:** hybrid search works, but rank/score are not yet carried
  everywhere as ordinary columns through the plan.
- **Policy pushdown and `Source`:** Cedar enforcement is at the HTTP boundary
  today, and imports are still loader-shaped. Planner predicates and a unified
  `Source` operator are roadmap.
- **Resource bounds:** some operations still lack enforced per-query memory or
  time budgets. New long-running work should add explicit bounds rather than
  widening the gap.

## Deny-list

If a proposal fits one of these, the burden is on the proposer to prove why the
case is exceptional.

- Custom WAL, transaction manager, buffer pool, page format, or storage engine.
- Per-table graph publishing outside the manifest publisher.
- Re-reading current branch head during a query instead of using the captured
  snapshot.
- New write paths that can advance Lance HEAD before manifest publish without a
  recovery sidecar.
- Cross-query `BEGIN`/`COMMIT` transactions in the OSS engine. Use branches and
  merges for multi-query workflows.
- Acknowledging writes before durable Lance and manifest persistence.
- Silent fallback to eventual consistency, partial results, or dropped rows.
- State that drifts from Lance or the manifest when it can be derived.
- Job queues for manifest-derivable state where a reconciler is the right shape.
- Synchronous inline vector/FTS index rebuilds on the query commit path, except
  for documented Lance API residuals.
- Side-channels for query semantics: hidden globals, magic strings, transport
  flags, or out-of-band metadata.
- Cost-blind plan choice when statistics are available or required.
- Hidden statistics for behavior that affects planning or operator choice.
- Hash-map iteration order in result ordering, plan choice, or migration output.
- String-flattened SQL/filter generation when a structured pushdown API is
  available.
- Eager multi-hop cross-product materialization when factorization fits.
- Ad-hoc `IN`-list filtering where SIP or another structured selectivity path
  fits.
- Discarding retrieval score/rank before fusion or projection decisions.
- Auto-creating placeholder nodes for orphan edges.
- Wire-protocol-specific code in compiler or engine crates.
- Cloud-only correctness fixes or forks of the OSS engine for correctness.
- Mutating immutable substrate state in place, including Lance fragments or
  index segments.
- Shipping observable behavior as if it were not part of the contract. Output
  ordering, error text, timestamp precision, defaults, and latency profiles all
  become dependencies once exposed.

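
The deny-list entry on hash-map iteration order is easy to violate by accident, so a short example may help. The sketch below shows one way to make output order independent of `HashMap` iteration by routing through a `BTreeMap`; the function name and the per-type count data are invented for the illustration and do not come from the engine.

```rust
use std::collections::{BTreeMap, HashMap};

/// Return (key, count) pairs in key order, regardless of how the
/// `HashMap` happens to iterate on this run.
fn ordered_counts(counts: &HashMap<String, u64>) -> Vec<(String, u64)> {
    // Collecting into a BTreeMap sorts by key; iteration is then deterministic.
    let sorted: BTreeMap<&String, &u64> = counts.iter().collect();
    sorted.into_iter().map(|(k, v)| (k.clone(), *v)).collect()
}
```

Iterating the `HashMap` directly could yield a different order between runs, because Rust's default hasher is randomly seeded per process.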
## Review Checklist

Use this as yes/no/NA for any non-trivial design or PR:

- Does it respect Lance/DataFusion instead of rebuilding them?
- Does it preserve manifest-atomic graph visibility?
- Does every query keep one snapshot for its lifetime?
- Do mutations publish once at the commit boundary?
- Can every Lance-HEAD-before-manifest gap recover all-or-nothing?
- Are schema and edge integrity checks strict by default?
- Are query semantics represented in AST/IR/planner structures?
- Are transport, auth, and policy boundaries preserved?
- Are failures bounded, typed, and observable?
- Are result ordering and plan choices deterministic within a snapshot?
- Are stats/capabilities exposed when behavior depends on them?
- Are existing known gaps left no worse and documented if touched?
- Does the test live at the same boundary as the change?
- Does the change avoid every deny-list pattern, or justify the exception?

## Maintenance Policy

Update this file when an invariant changes, a known gap opens or closes, or a
new review anti-pattern deserves deny-list treatment. Prefer stable headings
over numbered sections so other docs can link here without churn.

Removing or relaxing a hard invariant requires the same review process as code.
Adding a known gap is acceptable when it makes reality explicit; leaving stale
claims is not.
@@ -6,7 +6,7 @@ This file is the curated entry point. **When you hit a Lance-shaped problem, fin

 Base URL: `https://lance.org`. **Fetch the FULL page content, not summaries** — use `npx mdrip <url>` (or `npx mdrip --max-chars 200000 <url>` for very long pages). Tools that summarize pages (like Claude's `WebFetch`) routinely drop load-bearing details — defaults, `pub(crate)` blockers, sub-specs hidden behind navigation hubs. If `npx mdrip` is unavailable, fall back to `curl <url> | pandoc -f html -t markdown` or paste the rendered page text manually; **never act on a summarized fetch alone**. Keep this index curated to relevant material — the upstream sitemap has hundreds of URLs (notably the Namespace REST API model surface, Spark/Trino/Databricks integrations) that we don't use.

-> **Substrate boundary check.** Before fetching, recall [docs/invariants.md §I](invariants.md): if Lance already does the thing, we don't reimplement it. The most common reason to read these docs is to confirm a substrate behavior, not to learn what to clone.
+> **Substrate boundary check.** Before fetching, recall [docs/dev/invariants.md](invariants.md): if Lance already does the thing, we don't reimplement it. The most common reason to read these docs is to confirm a substrate behavior, not to learn what to clone.

 ## Quick-start (read these once per project)

@@ -129,7 +129,7 @@ Touching `omnigraph optimize` / `cleanup`, the underlying `compact_files` / `cle

 ### DataFusion integration

-The runtime substrate that may carry our query execution. See [docs/invariants.md §I.4](invariants.md): we don't rebuild relational machinery.
+The runtime substrate that may carry our query execution. See [docs/dev/invariants.md](invariants.md): we don't rebuild relational machinery.

 | Topic | URL |
 |---|---|
@@ -22,7 +22,7 @@ A `.gq` query with multiple ops (e.g. `insert Person … insert Knows …`)
 must observe earlier ops' writes when validating later ops (referential
 integrity, edge cardinality). After MR-794 step 2+ this is implemented
 via an in-memory `MutationStaging` accumulator in
-[`crates/omnigraph/src/exec/staging.rs`](../crates/omnigraph/src/exec/staging.rs),
+[`crates/omnigraph/src/exec/staging.rs`](../../crates/omnigraph/src/exec/staging.rs),
 shared by both `mutate_as` and the bulk loader:

 - On the first touch of each table, the pre-write manifest version is
@@ -48,9 +48,8 @@ shared by both `mutate_as` and the bulk loader:
 prevents inserts/updates from coexisting with deletes in one query,
 so the inline path is safe for delete-only mutations.

-This upholds [docs/invariants.md §VI.23](invariants.md) (atomicity per
-query) and §VI.25 (read-your-writes within a multi-statement mutation,
-upheld).
+This upholds the manifest-atomic mutation and read-your-writes invariants
+tracked in [docs/dev/invariants.md](invariants.md).

 ### D₂ — parse-time mixed-mode rejection

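
The mixed-mode rule mentioned in the hunk above (inserts/updates may not coexist with deletes in one query) reduces to a simple check over the parsed operations. The sketch below is an assumption-based illustration: the `Op` and `D2Error` enums and the `check_d2` function are hypothetical names, and the real parser presumably works on a richer AST and reports a more detailed error.

```rust
/// Hypothetical operation kinds in one parsed mutation query.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Insert,
    Update,
    Delete,
}

/// Hypothetical error for a query that mixes modes.
#[derive(Debug, PartialEq)]
enum D2Error {
    MixedModes,
}

/// Accept constructive-only or delete-only queries; reject any mix.
fn check_d2(ops: &[Op]) -> Result<(), D2Error> {
    let has_delete = ops.iter().any(|op| *op == Op::Delete);
    let has_constructive = ops.iter().any(|op| matches!(op, Op::Insert | Op::Update));
    if has_delete && has_constructive {
        Err(D2Error::MixedModes)
    } else {
        Ok(())
    }
}
```

Running the check at parse time means a mixed query is rejected before any staging or inline delete has started, so there is nothing to roll back.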
@@ -233,7 +232,7 @@ success and one failure. The losing writer's error is
 `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected,
 actual }`. The HTTP server maps this to **409 Conflict** with body
 `{"error": "...", "code": "conflict", "manifest_conflict": { "table_key":
-"...", "expected": N, "actual": M }}` — see [docs/server.md](server.md).
+"...", "expected": N, "actual": M }}` — see [docs/user/server.md](../user/server.md).

 ## Audit

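
The 409 mapping described in the hunk above can be sketched as a function from the conflict fields to a status code and JSON body. This is a hand-rolled illustration using only the standard library: `conflict_response` and `json_escape` are hypothetical helpers, the error message is passed in by the caller because the docs elide it, and a real server would more likely serialize a typed struct with a JSON library.

```rust
/// Minimal JSON string escaping for the sketch (quotes, backslashes, control chars).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the (status, body) pair for a manifest version conflict.
/// `message` is supplied by the caller; its real wording is not specified here.
fn conflict_response(message: &str, table_key: &str, expected: u64, actual: u64) -> (u16, String) {
    let body = format!(
        r#"{{"error": "{}", "code": "conflict", "manifest_conflict": {{"table_key": "{}", "expected": {}, "actual": {}}}}}"#,
        json_escape(message),
        json_escape(table_key),
        expected,
        actual
    );
    (409, body)
}
```

A client can branch on `code` and then read `expected` and `actual` to decide whether to refresh its snapshot and retry.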
@@ -1,6 +1,6 @@
 # Testing

-This file is the always-on map of the test surface. **Consult it before every task** so you know what tests already cover the area you're about to change, what helpers to reuse, and where a new test belongs. The architectural invariant *"tests at every boundary, not just end-to-end"* lives in [docs/invariants.md §VIII.47](invariants.md).
+This file is the always-on map of the test surface. **Consult it before every task** so you know what tests already cover the area you're about to change, what helpers to reuse, and where a new test belongs. The architectural invariant for boundary-matched tests lives in [docs/dev/invariants.md](invariants.md).

 ## Where tests live, per crate

@@ -49,7 +49,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 - **CLI** — `crates/omnigraph-cli/tests/support/mod.rs`: `Command`-style wrapper for invoking `omnigraph`, server-process spawning, fixture resolution, output assertion helpers.
 - **Server** — no shared helpers; server tests call the `Omnigraph` engine API directly and exercise endpoints over the wire.

-> Note: there is **no `MemStorage` or in-memory backend** today. Tests use `tempfile::tempdir()` for local FS. If you find yourself needing one for layer isolation, that's an architectural ask — see [docs/invariants.md §VIII.48](invariants.md) (reference impl + test impl per trait).
+> Note: there is **no `MemStorage` or in-memory backend** today. Tests use `tempfile::tempdir()` for local FS. If you find yourself needing one for layer isolation, that's an architectural ask — keep it explicit in [docs/dev/invariants.md](invariants.md) under known gaps.

 ## Failpoints (fault injection)

@@ -75,7 +75,7 @@ Locally, set `OMNIGRAPH_S3_TEST_BUCKET` (and the usual `AWS_*` vars including `A
 ## Examples & benches

 - `crates/omnigraph/examples/bench_expand.rs` — runnable example (not part of CI).
-- No `benches/` directories. The architectural rule [docs/invariants.md §VIII.50](invariants.md) requires benchmark motivation before optimization, so add `benches/` per crate when you ship a perf-driven change.
+- No `benches/` directories. Add `benches/` per crate when you ship a perf-driven change, and include the motivating workload with the optimization.

 ## Coverage tooling — what's missing

|
|
@@ -107,9 +107,9 @@ When you pick up any change, walk through this:
|
|||
2. **Run those tests locally before editing.** `cargo test --workspace --locked` for the broad pass; `-p <crate> --test <file>` for a focused loop. Confirm a clean baseline.
|
||||
3. **Decide extend-vs-new** explicitly. If you can extend an existing test (assertion, fixture row, parameterization), do that. Only add a new test fn or new file if no existing one owns the area.
|
||||
4. **Reuse the helpers.** `init_and_load()`, fixture files, the CLI `support` harness — re-use them. Don't bootstrap a fresh repo by hand if a helper exists.
|
||||
5. **Mind the boundary.** Per [docs/invariants.md §VIII.47](invariants.md), test at the layer the change lives at — planner-level changes deserve planner-level tests, not just end-to-end.
|
||||
5. **Mind the boundary.** Per [docs/dev/invariants.md](invariants.md), test at the layer the change lives at — planner-level changes deserve planner-level tests, not just end-to-end.
|
||||
6. **For substrate-touching changes** (Lance behavior), reach for `failpoints` or fixture-driven scenarios, not stubbed-out mocks.
|
||||
7. **For server / API changes**, confirm the OpenAPI regeneration happens in `openapi.rs` and that the diff lands in `openapi.json`.
|
||||
8. **Verify your change makes an existing test fail before it makes the new one pass.** If you can break the code without breaking a test, your coverage gap is the problem to fix first.
|
||||
|
||||
When in doubt, re-read [docs/invariants.md §VIII](invariants.md) — quality gates apply to every change.
|
||||
When in doubt, re-read [docs/dev/invariants.md](invariants.md) — quality gates apply to every change.
|
||||
|
|
@@ -1,305 +0,0 @@
|
|||
# Architectural Invariants & Patterns
|
||||
|
||||
**Type:** Reference / standing document
|
||||
**Status:** Living — updated as decisions accrue
|
||||
**Audience:** anyone proposing, reviewing, or implementing a change to any part of OmniGraph
|
||||
|
||||
This document captures two things:
|
||||
|
||||
- **Invariants** (Parts I–VI, VIII): load-bearing principles that hold across the architecture. Breaking one is rare and requires explicit justification.
|
||||
- **Current architectural patterns** (Part VII): how we realize the invariants today. These are committed conventions, not eternal facts; they may evolve as the engine matures, but until they do, they constrain new work.
|
||||
|
||||
These are not query-engine-specific. They apply to every layer.
|
||||
|
||||
## Status legend
|
||||
|
||||
- *Status: decided.* No annotation needed; this is the default.
|
||||
- *Status: open — see MR-X.* The principle is captured, but the concrete default or mechanism is still under discussion. Future work should follow the captured intent or update this document with the resolution.
|
||||
- *Status: aspirational.* The invariant describes the target state; current code may not yet uphold it. PRs that move toward upholding it are welcome; PRs that drift away need explicit justification.
|
||||
|
||||
Capturing aspirational invariants on purpose: we'd rather record what we want to be true and have current code be measured against it than not have the rule at all.
|
||||
|
||||
## How to use
|
||||
|
||||
- **Writing an RFC or design proposal:** walk through the relevant sections and state how the proposal upholds each invariant — or why a documented exception is justified.
|
||||
- **Reviewing a PR or design:** scan for invariants the change might violate. The deny-list (§IX) is the fastest first pass.
|
||||
- **Debating a tradeoff:** invoke the relevant invariant and check whether the tradeoff respects it.
|
||||
- **Updating this document:** add to the deny-list freely. Removing or relaxing an invariant requires the same review process as any other architectural decision.
|
||||
|
||||
---
|
||||
|
||||
## I. Substrate respect — delegate, don't rebuild
|
||||
|
||||
The first question for any new component: does the substrate already do this?
|
||||
|
||||
Current substrate is **Lance** for storage, indexes, and MVCC; **DataFusion** is the working assumption for relational machinery. These are committed choices (MR-737 §2.2, §5.11) but not eternal facts. The invariants below are about respecting *whatever* substrate we adopt.
|
||||
|
||||
1. **Don't rebuild what the substrate owns.** Storage format, durability (WAL, transaction journal), buffer pool, MVCC, index lifecycle — all delegated. Building parallel implementations turns the project into a different one and locks us out of substrate improvements.
|
||||
*Check:* Does this proposal introduce a parallel storage format, custom on-disk pages, custom serialization, custom WAL, custom buffer pool?
|
||||
|
||||
2. **Don't rebuild relational machinery** provided by the runtime substrate. Joins, aggregations, parallelism, spill — extension via the substrate's trait surfaces; never reimplementation.
|
||||
*Check:* Are we extending the substrate via traits, or reimplementing parts of it?
|
||||
|
||||
3. **Don't maintain state parallel to the substrate.** Observe substrate state and derive what we need. State that drifts from the substrate is a bug.
|
||||
*Check:* Does this proposal track index coverage, manifest versions, or fragment locations independently of the substrate?
|
||||
|
||||
## II. Layering — the seams hold
|
||||
|
||||
4. **The IR is the contract between frontend and backend.** Frontends emit IR; planner / executor consume it. No frontend logic leaks downward; no executor concerns leak upward.
|
||||
*Check:* Does the proposal add to the IR, or to a layer? If to a layer, does it cross another layer's concern?
|
||||
|
||||
5. **Capabilities and statistics flow upward; data flows downward.** Lower layers expose what they can do (capabilities) and what they know (statistics). Upper layers consume both. Methods alone are insufficient — methods without capability advertisement force one-size-fits-all plans.
|
||||
*Check:* When adding a method to a layer trait, did we also expose the capability so the planner can reason about it?
|
||||
|
||||
6. **One trait boundary per layer.** Crossing a layer means going through its trait. Direct calls to lower-layer concrete types from upper layers are forbidden.
|
||||
*Check:* Does this code call `lance::Dataset` directly outside engine-storage? Call planner internals from the executor?
|
||||
|
||||
7. **No god modules.** Single-module concerns: storage, IR, planner, executor, frontend, reconciler, schema, policy. Each crate has a reference test suite that runs without the others.
|
||||
*Check:* Does this PR add a concern to a crate that already owns a different one?
|
||||
|
||||
8. **Wire protocols are interchangeable; the IR is the contract.** The kernel produces `Stream<RecordBatch>` end-to-end; transports (HTTP/JSON, Arrow Flight, FlightSQL, future protocols) deliver them at the server boundary. No wire-protocol-specific code in kernel crates.
|
||||
*Status: aspirational — Flight not yet implemented; tracked in MR-765.*
|
||||
*Check:* Does this code import `arrow_flight` (or any transport crate) outside the server layer?
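As an illustration of invariant 5 (capabilities and statistics flow upward), the sketch below shows a planner choosing a scan strategy from what a storage layer advertises. It is a minimal sketch and not OmniGraph's actual API: `Capability`, `TableStats`, `Storage`, `ScanPlan`, `choose_scan`, `Indexed`, and `Plain` are all hypothetical names, and the row-count threshold is an assumed heuristic.

```rust
use std::collections::BTreeSet;

/// Hypothetical capability flags a storage layer might advertise.
/// `#[non_exhaustive]` keeps new variants additive for downstream crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[non_exhaustive]
pub enum Capability {
    ScalarIndexLookup,
    VectorSearch,
}

/// Hypothetical statistics exposed next to the capabilities.
#[derive(Debug, Clone, Copy)]
pub struct TableStats {
    pub row_count: u64,
}

/// Hypothetical layer trait: it exposes what it can do and what it knows.
pub trait Storage {
    fn capabilities(&self) -> BTreeSet<Capability>;
    fn stats(&self) -> TableStats;
}

#[derive(Debug, PartialEq, Eq)]
pub enum ScanPlan {
    IndexLookup,
    FullScan,
}

/// The planner consumes both signals instead of assuming one plan fits all.
pub fn choose_scan(storage: &dyn Storage, small_table_rows: u64) -> ScanPlan {
    let caps = storage.capabilities();
    let stats = storage.stats();
    if caps.contains(&Capability::ScalarIndexLookup) && stats.row_count > small_table_rows {
        ScanPlan::IndexLookup
    } else {
        ScanPlan::FullScan
    }
}

/// Test double that advertises a scalar index.
pub struct Indexed {
    pub rows: u64,
}

impl Storage for Indexed {
    fn capabilities(&self) -> BTreeSet<Capability> {
        BTreeSet::from([Capability::ScalarIndexLookup])
    }
    fn stats(&self) -> TableStats {
        TableStats { row_count: self.rows }
    }
}

/// Test double that advertises nothing.
pub struct Plain {
    pub rows: u64,
}

impl Storage for Plain {
    fn capabilities(&self) -> BTreeSet<Capability> {
        BTreeSet::new()
    }
    fn stats(&self) -> TableStats {
        TableStats { row_count: self.rows }
    }
}
```

With only a `scan_by_index` method and no advertisement, the planner would have no way to tell `Indexed` from `Plain`; that gap is what the invariant's *Check* is meant to catch.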
|
||||
|
||||
## III. Distributability — kernel stays remote-friendly
|
||||
|
||||
These are technical constraints, independent of whether we ship a distributed product. They preserve the architectural seam.
|
||||
|
||||
9. **The kernel admits parallel and remote implementations.** Trait surfaces are thread-safe; no in-process-only assumptions; remote dataset descriptors (URI, snapshot ref, fragment ID) are accepted without requiring an open in-process handle.
|
||||
|
||||
10. **IR is location-neutral.** No IR operator embeds an assumption about where data lives.
|
||||
|
||||
11. **Cost models accept new dimensions** (network, latency-tier) as additive extensions. No place hard-codes "all cost is local I/O."
|
||||
|
||||
12. **Background work admits alternate implementations.** In-process default; separable worker fleet for distributed deployment uses the same trait.
|
||||
*Status: aspirational — distributed deployment is out of scope today (MR-737 §2.2); these constraints preserve the seam.*
|
||||
|
||||
## IV. Evolution — additive over rewrite
|
||||
|
||||
13. **Additive over rewrite.** New IR variants and planner rules slot in. No "tear out and replace" PRs.
|
||||
|
||||
14. **Capabilities are additive enums.** New variants are additive. Existing implementations keep working.
|
||||
|
||||
15. **Feature-flag behavior changes.** Every change that alters runtime behavior ships behind a flag. Old code path stays until the new one is proven.
|
||||
|
||||
16. **No data drops without a migration.** When data needs to move (e.g., adopting stable row IDs), use in-place or dual-write windows. Never "drop and recreate."
|
||||
|
||||
17. **No breaking schema changes without a migration plan.** Schema-IR changes go through the migration planner with safety tier classification. See the MR-694 family.
|
||||
|
||||
## V. Honesty — what the system tells operators
|
||||
|
||||
18. **Estimate-vs-actual logging on every estimator.** Cost models drift; calibration is a continuous process, not a one-off.
|
||||
|
||||
19. **Operationally important state is observable.** Index coverage, reconciler lag, cost-model accuracy — surfaced through the storage trait's `capabilities()` and a unified observability API.
|
||||
|
||||
20. **Honest failure modes.** Cost-model misses degrade gracefully (spill, partial-result, bounded abort). No silent OOM.
|
||||
|
||||
21. **Per-query resource consumption is bounded and exposed.** Memory cap, wall-clock timeout, max-rows-scanned, max-fragments-scanned. Operators respect them; bounds exposed via explain.
|
||||
|
||||
22. **Plans are explainable.** Every executed query can be inspected as IR + physical plan + cost annotations. No "you'd have to read the source to know what this does." See MR-684.
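One way to read invariant 21 (bounded, exposed per-query resource consumption) is as a budget object that operators charge before doing work. The sketch below is an assumption-laden illustration: `QueryLimits`, `QueryBudget`, `BudgetError`, and `charge_fragment` are hypothetical names, and the memory cap is left out to keep the example short.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-query limits; the values are configuration, not invariants.
#[derive(Debug, Clone, Copy)]
pub struct QueryLimits {
    pub max_rows_scanned: u64,
    pub max_fragments_scanned: u64,
    pub timeout: Duration,
}

/// Errors name the bound that was hit, so the failure is loud and explainable.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetError {
    RowsExceeded { limit: u64, attempted: u64 },
    FragmentsExceeded { limit: u64, attempted: u64 },
    TimedOut,
}

pub struct QueryBudget {
    limits: QueryLimits,
    started: Instant,
    rows_scanned: u64,
    fragments_scanned: u64,
}

impl QueryBudget {
    pub fn new(limits: QueryLimits) -> Self {
        QueryBudget {
            limits,
            started: Instant::now(),
            rows_scanned: 0,
            fragments_scanned: 0,
        }
    }

    /// Charge one fragment of `rows` rows. Counters only advance on success,
    /// so a rejected charge leaves the budget unchanged.
    pub fn charge_fragment(&mut self, rows: u64) -> Result<(), BudgetError> {
        if self.started.elapsed() > self.limits.timeout {
            return Err(BudgetError::TimedOut);
        }
        let fragments = self.fragments_scanned + 1;
        if fragments > self.limits.max_fragments_scanned {
            return Err(BudgetError::FragmentsExceeded {
                limit: self.limits.max_fragments_scanned,
                attempted: fragments,
            });
        }
        let rows_total = self.rows_scanned.saturating_add(rows);
        if rows_total > self.limits.max_rows_scanned {
            return Err(BudgetError::RowsExceeded {
                limit: self.limits.max_rows_scanned,
                attempted: rows_total,
            });
        }
        self.fragments_scanned = fragments;
        self.rows_scanned = rows_total;
        Ok(())
    }

    /// Consumption so far, suitable for surfacing through explain output.
    pub fn rows_scanned(&self) -> u64 {
        self.rows_scanned
    }
}
```

Returning a structured error rather than panicking or silently truncating keeps this in line with invariant 20 (honest failure modes).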
|
||||
|
||||
## VI. Database guarantees — what OmniGraph promises as a system of record
|
||||
|
||||
These are user-visible commitments. They state what the engine guarantees and what it does not. For an "agent-native system of record," credibility lives here.
|
||||
|
||||
Specific defaults (timeout values, memory caps, TTL windows) are *configuration*, not invariants — see [docs/constants.md](constants.md) and per-deployment configuration. The invariant is that bounds and contracts exist, not their numerical values.
|
||||
|
||||
23. **Atomicity is per-query.** Every `.gq` query is atomic — multi-statement mutations are all-or-nothing via the substrate's atomic-commit primitive. No cross-query `BEGIN`/`COMMIT`; branches and merges fill that role for agent workflows.
|
||||
*Status: upheld at the writer-trait surface, across process boundaries, AND in-process for the common case under concurrent writers (PR 2 / MR-686) — the sealed `TableStorage` trait routes inserts / updates / scalar-index builds / merge_insert / overwrite through `stage_*` + `commit_staged` (Phase A is drift-free); the open-time recovery sweep in `db/manifest/recovery.rs` (sidecars at `__recovery/{ulid}.json` written by `MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) closes the per-table commit_staged → manifest publish residual on the next `Omnigraph::open`; `Omnigraph::refresh` runs roll-forward-only recovery in-process so long-running servers close the common case without restart; and the per-(table, branch) writer-queue (`db/write_queue.rs`) + revalidation under the queue (`MutationStaging::commit_all`) prevents concurrent writers on the same key from corrupting each other once the HTTP server's global `RwLock<Omnigraph>` is removed (PR 2 Step F). The "Lance HEAD ahead of `__manifest`" drift class is unreachable for op-execution failures, recoverable across process boundaries for all writer kinds, and recoverable in-process for roll-forward-eligible sidecars. Sidecars that would require `Dataset::restore` are deferred to the next ReadWrite open (restore unsafe under concurrency); continuous in-process rollback recovery is the goal of a future background reconciler (MR-870). Two writer paths still inline-commit pending upstream Lance work: `delete_where` (lance-format/lance#6658) and `create_vector_index` (lance-format/lance#6666).*
|
||||
|
||||
24. **Schema integrity is strict at commit.** Type validation, required-field presence (auto-filled from `@default` if declared), uniqueness across batches and versions, and referential integrity — all enforced before commit succeeds. Per-write softening flags are opt-in, never default.
|
||||
*Status: aspirational — referential integrity at scale requires SIP-backed cross-table validation; not yet implemented. Cross-batch / cross-version uniqueness tracked in MR-714.*
|
||||
|
||||
25. **Isolation: per-query snapshot; read-your-writes within and across queries in a session.** Each query reads from one consistent manifest version. Within a multi-statement mutation, the read subplan inside each write operator sees the writes from earlier statements. Across queries in a session, reads always resolve the latest manifest version — no reader pinning to older snapshots.
|
||||
*Status: upheld for inserts/updates — `MutationStaging`'s in-memory accumulator + `TableStore::scan_with_pending` (DataFusion `MemTable` union with the committed Lance scan, with merge-shadow semantics for chained updates) implements read-your-writes within a multi-statement mutation. Delete-touching mutations are limited to delete-only by parse-time D₂; closing the within-query RYW gap for deletes requires Lance's two-phase delete API (Lance-upstream lance-format/lance#6658). The "Lance HEAD ahead of `__manifest`" drift class is unreachable for op-execution failures (the partial-failure test pins this), and the narrower finalize→publisher residual is closed across one open cycle by the open-time recovery sweep — see [docs/runs.md](runs.md) "Open-time recovery sweep".*
|
||||
|
||||
26. **Durability before acknowledgement.** Commit returns only after the substrate has confirmed durable persistence. No "fast" or "fire-and-forget" durability levels.
|
||||
|
||||
27. **Causal consistency across sessions.** If session A commits and session B subsequently reads, session B sees A's write. Single-coordinator: trivially via single-source manifest. Multi-coordinator: enforced via leader-for-writes plus session-token replica reads. Never weakened.
|
||||
*Status: aspirational on the multi-coordinator side.*
|
||||
|
||||
28. **Determinism within a snapshot.** Same query + same snapshot + same parameters → order-stable results (deterministic tie-breaks). Plan choice is deterministic given identical statistics. Cross-version determinism is best-effort, not guaranteed (statistics change, plans change).
|
||||
*Status: aspirational — current code may rely on HashMap iteration in some paths.*
|
||||
|
||||
29. **Writes are idempotent under retry.** Insert / Update / Merge take an explicit `on_conflict` policy. Clients may provide an idempotency key on writes; the server deduplicates retries within a configurable TTL window. Schema migrations are idempotent under replay.
|
||||
*Status: open — `on_conflict` policy lands with mutation IR (MR-737 Phase 8); idempotency-key TTL default is undecided.*
|
||||
|
||||
30. **No silent data loss or corruption.** Substrate-level checksums are trusted for storage integrity. Semantic-invariant checks at every commit catch higher-level cases (orphan edges, type drift, broken uniqueness). Every operation succeeds, fails loudly with cause, or degrades observably with metrics.
|
||||
|
||||
31. **Every operation has a documented bound.** "May run forever" is forbidden as a default. Defaults are configurable; the invariant is that bounds exist, are documented, and are enforced.
|
||||
|
||||
32. **Failure scope is bounded.** A failing query, fragment-level corruption, or background-task crash does not cascade. Per-table fragment isolation at the storage tier; per-query memory and timeout in the executor.
|
||||
*Status: aspirational on the per-query side — per-query memory cap not yet enforced; planned with MR-737 Phase 7.*
|
||||
|
||||
33. **Crash recovery via the same code paths as steady-state.** No special "recovery mode." On restart, the engine reads the manifest, finds the latest committed state, and resumes. Substrate atomicity ensures no partial writes survive.
|
||||
|
||||
34. **Strong consistency by default; relaxation is per-query, never per-default.** Strong (read-your-writes, monotonic, snapshot) is the default for every query. Eventual consistency is opt-in per read query for analytical workloads where staleness is acceptable. Never available on writes; always logged for audit.
|
||||
*Status: aspirational — eventual-consistency opt-in flag tracked in MR-425.*
|
||||
|
||||
35. **Branches are the cross-query coordination primitive.** Branches are cheap to create, fully isolated, per-branch SI, with durable queryable metadata (creator, intent, parent, fork point). Agents use branches for any multi-step coordination that needs atomicity beyond a single query. Lifecycle policies (TTL, auto-cleanup) are deployment configuration; the invariant is that branches *exist* as first-class durable objects with full SI parity to main.
|
||||
*Status: upheld. Lance shallow-clone gives cheap creation; per-branch SI is the same code path as main; metadata in `_refs/branches/{name}.json` already supports a queryable `metadata` map.*
|
||||
|
||||
36. **Per-query isolation is adjustable per-query, never per-default.** Default is Snapshot Isolation (§VI.25). Queries can opt **up** to Serializable for cross-table-invariant safety (`USING SERIALIZABLE`) or **down** to eventual consistency for analytical reads (`USING EVENTUAL`). Stricter than Serializable (Strict Serial / linearizable-across-queries) is **not offered**; branches (§VI.35) replace that role for high-stakes coordination. Stronger and weaker are both per-query opt-ins, never per-default.
|
||||
*Status: SI default upheld. Serializable opt-in aspirational — predicate revalidation under MR-686's per-(table, branch) queue is the implementation seam. Eventual-read opt-in aspirational — tracked in MR-425. Subsumes §VI.34 (which only covers the downgrade direction); §VI.34 is preserved for now to keep its MR-425 pointer addressable.*
|
||||
|
||||
37. **Merges are type-aware and agent-resolvable.** Branch merge resolution combines two layers. **Structural** (row-level last-write-wins by deterministic tie-break) is exact for sets of independent rows. **Semantic** (per-type policies declared in schema) handles CRDT-shaped operations: grow-only set, monotonic counter, last-writer-wins-with-timestamp, multi-valued register, first-writer-wins. Conflicts no policy resolves pause the merge with structured `MergeConflictKind` rows; agents produce resolution rows and resume. Auto-resolution never silently picks a side when policies are ambiguous.
|
||||
*Status: structural merge upheld via `OrderedTableCursor` + `StagedTableWriter`. Type-declared semantic policies aspirational. Pausable merges aspirational — current code fails on conflict, doesn't pause.*
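The semantic layer of invariant 37 can be pictured as a per-field merge function that either resolves under a declared policy or reports a conflict. This is a hedged sketch only: `Value`, `MergePolicy`, `MergeOutcome`, and `merge_field` are hypothetical names, only two of the listed policies are shown, and resolving a monotonic counter by taking the maximum is an assumption about its semantics.

```rust
use std::collections::BTreeSet;

/// Hypothetical field values subject to merge.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
    Counter(u64),
    Set(BTreeSet<String>),
    Text(String),
}

/// Hypothetical per-type policies; `Unspecified` means the schema is silent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MergePolicy {
    GrowOnlySet,
    MonotonicCounter,
    Unspecified,
}

/// Either a resolved value or a structured conflict for an agent to resolve.
#[derive(Debug, PartialEq, Eq)]
pub enum MergeOutcome {
    Resolved(Value),
    Conflict { ours: Value, theirs: Value },
}

pub fn merge_field(policy: MergePolicy, ours: Value, theirs: Value) -> MergeOutcome {
    // Identical values never conflict, whatever the policy.
    if ours == theirs {
        return MergeOutcome::Resolved(ours);
    }
    match (policy, ours, theirs) {
        // Grow-only set: the union keeps every element either side added.
        (MergePolicy::GrowOnlySet, Value::Set(a), Value::Set(b)) => {
            MergeOutcome::Resolved(Value::Set(a.union(&b).cloned().collect()))
        }
        // Monotonic counter: assumed to resolve to the larger value.
        (MergePolicy::MonotonicCounter, Value::Counter(a), Value::Counter(b)) => {
            MergeOutcome::Resolved(Value::Counter(a.max(b)))
        }
        // No applicable policy: surface the conflict, never pick a side.
        (_, ours, theirs) => MergeOutcome::Conflict { ours, theirs },
    }
}
```

The final match arm is the important one for the invariant: a missing or mismatched policy yields a conflict row, not a silent winner.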
|
||||
|
||||
### Explicit non-commitments
|
||||
|
||||
These are *not* part of the OmniGraph contract. Listed so reviewers and downstream users see what is intentionally out of scope.
|
||||
|
||||
- **Strict Serializable across queries.** Branches (§VI.35) are the replacement for cross-query strict-serial coordination.
|
||||
- **Cross-process linearizable single-object writes** in multi-coordinator deployments without explicit external coordination (Postgres advisory, S3 sentinel, leader election). §VI.27 multi-coordinator stays aspirational with a clear cost model.
|
||||
- **Automatic semantic conflict resolution.** §VI.37 is explicit: ambiguous conflicts always pause for agent or human resolution; auto-resolution requires a per-type policy.
|
||||
|
||||
## VII. Current architectural patterns
|
||||
|
||||
These are *how* we realize the invariants today. They are committed conventions — until we explicitly revise them, new code follows them. They are not eternal: a future architecture review may replace any of these with a different mechanism that upholds the same invariants. The deny-list (§IX) protects them in the meantime.
|
||||
|
||||
38. **Reconciler pattern for derivable state.** Index coverage, statistics, anything derivable from manifest state — reconciled, not job-queued. *Realizes the "don't maintain state parallel to the substrate" invariant.* See MR-737 §5.16.
|
||||
*Status: partial after MR-793 PR #70 — scalar index builds (BTree, Inverted) now route through the staged primitives `stage_create_*_index` + `commit_staged` instead of inline `create_*_index`; this is the building block. The reconciler pattern itself (background `IndexReconciler` task driven by manifest commits, removing synchronous index work from the publish path) is tracked in MR-848. Vector indices remain inline-commit until lance-format/lance#6666 ships.*
|
||||
|
||||
39. **Polymorphism via Union, not per-feature lowering.** Interfaces / wildcards / alternation on nodes and edges share one IR (`Polymorphism<T>`) and one lowering (Union of per-type concrete plans). *Realizes "shared mechanism for shared shape."* See MR-737 §5.13.
|
||||
*Status: aspirational — node interfaces in MR-579; edge wildcards in MR-744.*
|
||||
|
||||
40. **Mutations wrap read subplans.** Insert / Update / Delete / Merge are operators that consume read-shaped subplans. Same planner, same cost model, same storage trait. *Realizes "writes share the planner with reads."* See MR-737 §5.12.
|
||||
*Status: aspirational — current mutation path is separate from reads.*
|
||||
|
||||
41. **SIP for cross-operator selectivity propagation.** Producers publish ID bitmaps; downstream scans consume them through structured pushdown. *Realizes "downstream operators prune via upstream selectivity."*
|
||||
*Status: aspirational — current code uses IN-list flattening in `Expand`.*
|
||||
|
||||
42. **Factorize multi-hop, flatten only at projection.** Lists carry multiplicity through intermediate operators. `Flatten` is inserted by the planner where required, not eagerly. *Realizes "intermediate state shouldn't materialize cross-products eagerly."*
|
||||
*Status: aspirational — current code materializes cross-products eagerly.*
|
||||
|
||||
43. **Stable row IDs as dense graph IDs.** Don't maintain parallel string→u32 maps. Lance's stable row IDs are the substrate's identity layer; we use them directly. *Realizes "use the substrate's identity layer."*
|
||||
*Status: aspirational — current code rebuilds `TypeIndex` per query.*
|
||||
|
||||
44. **Rank and score are columns.** Retrieval operators emit `_score`, `_rank`. Fusion operators consume rank-bearing batches. *Realizes "rank/score is data, not metadata."*
|
||||
*Status: aspirational — current RRF runs the pipeline twice and discards rank.*
|
||||
|
||||
45. **Policy as predicates.** Authorization decisions are filter expressions injected into the planner, not enforcement at the API boundary. *Realizes "authorization pushes down with other filters."*
|
||||
*Status: aspirational — Cedar enforcement currently at HTTP boundary only; tracked in MR-722 / MR-725.*
|
||||
|
||||
46. **Imports unify under `Source`; transport is interchangeable.** A single `Source` IR operator with provider variants (File, Flight, Lance, Stream) handles all imports. Lance-to-Lance is a fast-path that bypasses Arrow encode/decode. *Realizes "external data sources share one operator surface."*
|
||||
*Status: aspirational — current loader is JSONL-only; tracked in MR-765.*
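Pattern 44 (rank and score are columns) can be illustrated with reciprocal rank fusion, where each input list carries a rank column and the fusion step emits both a score and a new rank. The sketch assumes the standard RRF formula, score(d) = Σ 1 / (k + rank(d)) over the input lists, with 1-based ranks; `RankedRow`, `FusedRow`, and `rrf_fuse` are hypothetical names, and plain structs stand in for Arrow batches.

```rust
use std::collections::BTreeMap;

/// One row of a retrieval result; `rank` plays the role of a `_rank` column (1-based).
#[derive(Debug, Clone, PartialEq)]
pub struct RankedRow {
    pub id: String,
    pub rank: u32,
}

/// One row of the fused result, carrying both `_score` and `_rank`.
#[derive(Debug, Clone, PartialEq)]
pub struct FusedRow {
    pub id: String,
    pub score: f64,
    pub rank: u32,
}

/// Fuse several ranked lists in one pass using reciprocal rank fusion.
pub fn rrf_fuse(inputs: &[Vec<RankedRow>], k: f64) -> Vec<FusedRow> {
    let mut scores: BTreeMap<String, f64> = BTreeMap::new();
    for list in inputs {
        for row in list {
            *scores.entry(row.id.clone()).or_insert(0.0) += 1.0 / (k + f64::from(row.rank));
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is order-stable.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
        .into_iter()
        .enumerate()
        .map(|(i, (id, score))| FusedRow { id, score, rank: i as u32 + 1 })
        .collect()
}
```

Because the rank travels with each row, the fusion step needs only the retrieval outputs and does not have to re-run the retrieval pipeline to recover positions.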
|
||||
|
||||
## VIII. Quality gates — every change passes
|
||||
|
||||
47. **Tests at every boundary.** `MemStorage` for engine tests; planner-only tests; executor-only tests with a stub storage. No layer tested only via end-to-end.
|
||||
|
||||
48. **Reference implementation per trait.** Every trait has a primary impl (Lance for storage) and at least a test impl.
|
||||
*Status: partial after MR-793 PR #70 — `TableStorage` (the engine-internal staged-write trait, sealed) has its primary impl on `TableStore` (Lance-backed). The trait's signatures use opaque `SnapshotHandle` / `StagedHandle` types so a future test impl (e.g., `MemStorage`) can land without changing call sites. No test impl yet; `tempfile::tempdir()` + Lance is the de-facto test substrate today (see [docs/testing.md](testing.md)).*
|
||||
|
||||
49. **Documented capability surface.** New capabilities are documented with what they advertise, who consumes them, how the planner uses them.
|
||||
|
||||
50. **Benchmark before optimization.** New optimizations land with a benchmark that motivates them; if the motivating workload doesn't exist, the feature waits.
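Gates 47 and 48 together imply that a layer can be exercised against an in-memory test impl of the trait beneath it. The sketch below shows the shape of that idea under loud assumptions: `BlobStorage` is a hypothetical, deliberately tiny trait (not the sealed `TableStorage` trait), this `MemStorage` is an illustrative stand-in for an impl that does not exist today, and `count_fragments` stands in for upper-layer logic under test.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage trait, far smaller than any real storage surface.
pub trait BlobStorage {
    fn put(&mut self, key: &str, bytes: Vec<u8>);
    fn get(&self, key: &str) -> Option<&[u8]>;
    fn list(&self, prefix: &str) -> Vec<String>;
}

/// In-memory test impl; a BTreeMap keeps listing order deterministic.
#[derive(Default)]
pub struct MemStorage {
    blobs: BTreeMap<String, Vec<u8>>,
}

impl BlobStorage for MemStorage {
    fn put(&mut self, key: &str, bytes: Vec<u8>) {
        self.blobs.insert(key.to_string(), bytes);
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        self.blobs.get(key).map(|v| v.as_slice())
    }

    fn list(&self, prefix: &str) -> Vec<String> {
        self.blobs
            .keys()
            .filter(|k| k.starts_with(prefix))
            .cloned()
            .collect()
    }
}

/// Upper-layer logic written only against the trait, so it can be tested
/// without a filesystem or object store.
pub fn count_fragments<S: BlobStorage>(storage: &S, table: &str) -> usize {
    storage.list(&format!("{}/", table)).len()
}
```

A test at this boundary needs no `tempfile::tempdir()` and no Lance dataset, which is the isolation the gate asks for.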
|
||||
|
||||
## IX. Anti-patterns — deny-list
|
||||
|
||||
If a proposal fits one of these, the burden is on the proposer to justify why this case is the exception.
|
||||
|
||||
### Invariant violations (high bar to override)
|
||||
|
||||
- **Custom WAL / transaction manager / buffer pool.** Substrate owns these (§I.1).
|
||||
- **Wire-protocol-specific code in kernel crates.** Kernel produces `Stream<RecordBatch>`; transport adapters live at the server boundary only (§II.8).
|
||||
- **In-process-only `Dataset` impls.** Trait surfaces stay remote-friendly (§III.9).
|
||||
- **State that drifts from the substrate / manifest.** Derive from observable state (§I.3).
|
||||
- **Cross-query `BEGIN`/`COMMIT` transactions.** Branches replace them in OSS (§VI.23).
|
||||
- **Acks before durable persistence.** "Best-effort commit" is forbidden (§VI.26).
|
||||
- **Reads that see partial commits.** Atomicity is non-negotiable (§VI.23).
|
||||
- **Operations without time bounds.** Every operation has a documented timeout or backoff (§VI.31).
|
||||
- **"Recovery mode" code paths separate from steady-state.** Recovery uses the same code as ordinary reads (§VI.33).
|
||||
- **Eventual consistency as a default.** Strong is default; eventual is opt-in per query, never on writes (§VI.34).
|
||||
- **Schema migrations that are not idempotent under replay.** Idempotency is required for replay safety (§VI.29).
|
||||
- **Plan choice that varies given identical input statistics.** Determinism is required (§VI.28).
|
||||
- **HashMap iteration order in result ordering or plan choice.** Use deterministic tie-breaks (§VI.28).
|
||||
- **Cost-blind plan choice.** Lowering-order execution is not a planner.
|
||||
- **Hidden statistics.** If a metric matters for plan choice, it must be exposed through the trait surface (§II.5).
|
||||
- **Side-channels for query semantics.** Search modes, mutations, polymorphism, imports — all first-class IR concepts (§II.4).
|
||||
- **Hand-rolling something the substrate already does.** Check the spec first (§I.1).
|
||||
- **Mutating in place** state that should be immutable (Lance fragments, index segments). New segments instead.
|
||||
- **Silent failures.** OOM, timeout, partial result — all surfaced and bounded (§V.20).
|
||||
- **Shipping observable behavior as if it weren't part of the contract.** Output ordering, error-message text, timestamp precision, default-flag values, latency profile, query-result column order — every observable behavior gets depended on once shipped (Hyrum's Law). Don't expose what you don't want to commit to; treat changes to undocumented-but-observable behavior as breaking changes.
|
||||
- **Strict-serial coordination expressed as locks held across queries.** Branches are the agent-native primitive for that (§VI.35).
|
||||
- **Auto-resolving merge conflicts when the per-type policy is silent or absent.** Pause and surface the conflict; never silently pick a side (§VI.37).
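The deny-list entry on `HashMap` iteration order (§VI.28) is easy to violate by accident, because iterating a `HashMap` yields entries in an unspecified order. A minimal sketch of the remedy, using hypothetical names `Hit` and `ordered_hits`, sorts with an explicit tie-break before anything reaches the result set:

```rust
use std::collections::HashMap;

/// Hypothetical result row.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub score: f64,
}

/// Turn an unordered score map into an order-stable result list.
pub fn ordered_hits(scores: &HashMap<String, f64>) -> Vec<Hit> {
    let mut hits: Vec<Hit> = scores
        .iter()
        .map(|(id, score)| Hit { id: id.clone(), score: *score })
        .collect();
    // Primary key: score descending. Tie-break: id ascending, so equal scores
    // come out in the same order no matter how the map iterated.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score).then_with(|| a.id.cmp(&b.id)));
    hits
}
```

`f64::total_cmp` gives a total order over floats, so the comparator is well-defined even if a NaN score slips in.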
|
||||
|
||||
### Pattern violations (overridable with justification)
|
||||
|
||||
These protect the *current* architectural patterns (§VII). A future review may revise them.
|
||||
|
||||
- **Synchronous-inline index updates** for indexes expensive to build (vector ANN, FTS). Reconciler pattern instead (§VII.38).
|
||||
- **Job queue for state derivable from manifest.** Reconciler pattern instead (§VII.38).
|
||||
- **Per-feature lowering for shapes that share a structure** (interfaces, wildcards, alternation). Use one mechanism (§VII.39).
|
||||
- **Per-format import code paths** (one path for JSONL, another for Parquet, another for Flight). Use the `Source` IR operator (§VII.46).
|
||||
- **Eager materialization of cross-products** in multi-hop. Factorize (§VII.42).
|
||||
- **Ad-hoc `IN`-list filtering** when SIP fits (§VII.41).
|
||||
- **String-flattened SQL filter generation** when structured pushdown is available.
|
||||
- **Discarding rank in retrieval.** Score and rank propagate as columns (§VII.44).
|
||||
- **Auto-creating placeholder nodes for orphan edges** (silent invention of data). Reject by default; opt-in per write (§VI.24).
|
||||
- **Double-encoding data when both endpoints speak the same format** (e.g., Lance → Arrow → Lance when both are Lance). Use a fast-path (§VII.46).
|
||||
- **Per-write durability fast paths** until MemWAL is stable AND a use case justifies the latency vs. risk tradeoff.
|
||||
|
||||
## X. Review checklist (use against any non-trivial change)
|
||||
|
||||
Print this when reviewing an RFC or PR. Each line is **yes / no / N/A**.
|
||||
|
||||
- Does it respect the substrate? (§I)
|
||||
- Does it cross only one trait boundary per layer? (§II)
|
||||
- Are capabilities and stats exposed for any new behavior? (§II.5)
- If touching the wire / transport surface, does kernel code stay protocol-agnostic? (§II.8)
- Do trait surfaces stay remote-friendly? (§III)
- Additive, not rewrite? Feature-flagged where behavior changes? (§IV)
- Any new estimator has estimate-vs-actual logging? (§V.18)
- Coverage / lag / budget metrics surfaced? (§V.19–21)
- Failure modes graceful, bounded, observable? (§V.20)
- Atomicity scope respected per query? (§VI.23)
- Schema integrity enforced strict at commit unless explicit opt-out? (§VI.24)
- Isolation level matches default (per-query snapshot, read-your-writes)? (§VI.25)
- Durability ack only after manifest commit? (§VI.26)
- Determinism preserved (order-stable, plan-deterministic)? (§VI.28)
- Idempotency: explicit `on_conflict`; idempotency keys honored if used? (§VI.29)
- Bounded operations: explicit timeout / memory / concurrency limits? (§VI.31)
- If proposing cross-query strict-serial coordination, is it expressed via branches rather than long-held locks? (§VI.35)
- If touching merge resolution, are silent-pick paths explicitly absent? (§VI.37)
- If touching imports / external data, does it go through `Source`? (§VII.46)
- If implementing a graph / retrieval feature: reuses an existing pattern (reconciler, Union, mutation-wrap-read, SIP, factorize, Source) where applicable? (§VII)
- Tests at every boundary, not just end-to-end? (§VIII.47)
- Reference impl + test impl for any new trait? (§VIII.48)
- None of the deny-list patterns apply? (§IX)

## XI. Living document policy

This document is updated when:

- A new architectural decision establishes a new invariant — add it.
- An existing invariant is challenged and either reaffirmed (with the case sharpened) or revised (with explicit migration of any affected code).
- A new architectural pattern is adopted — add to §VII.
- A current pattern (§VII) is replaced — update or remove the entry; update the deny-list.
- A new anti-pattern surfaces in review and deserves a place on the deny-list — add it.
- An *aspirational* invariant becomes upheld — remove the status annotation.
- An *open* invariant is decided — record the decision and remove the status annotation.

Updates require the same review process as code. Adding to the deny-list (§IX) is cheap; removing or relaxing an invariant (§I–VI, VIII) requires explicit justification in the proposal. Replacing a pattern (§VII) requires a design discussion linking to the new pattern; until that lands, the existing pattern stays.

When an invariant is contested in the moment, the resolution path is: (a) state the case in the relevant RFC or PR; (b) link it from this document; (c) update this document if the resolution changes the rule.

## XII. Source / origin

These invariants and patterns were extracted from the architectural decisions in:

- **MR-737** — Query Engine v2 RFC (the kernel scope and seams)
- **MR-744** — Edge wildcards / alternation (one cell of the polymorphic-bindings matrix)
- **MR-765** — Arrow Flight transport (query, import, export)
- The schema migration program (**MR-694** family — additive evolution, safety tiers, idempotent replay)
- The policy program (**MR-722** / **MR-725** — predicate pushdown)
- The reconciler / index-lifecycle work (**MR-737 §5.16**, **MR-688**, **MR-679**, **MR-680**)
- The factorization and SIP work (**MR-737 §5.2**, **§5.3** — Kuzu / Ladybug inspiration)
- The polymorphic-bindings framing (**MR-737 §5.13** — one mechanism for eight cells)
- The Source-operator framing (**MR-737 §5.12** — one mechanism for all imports)
- The database-guarantees discussion (§VI): ACID dimensions, CAP-style consistency model, scale-system precedents (ClickHouse, Turbopuffer, LanceDB, Postgres). Each invariant in §VI corresponds to a specific named decision; see prior architecture discussions for the option space considered.
- **MR-686** — Per-table writer queues and per-actor admission. Source for §VI.35–37 and the explicit non-commitments subsection (MR-686's queue is the seam that makes Serializable opt-in implementable, and the reason §VI.27 multi-coordinator stays aspirational).

General precedent: Lance + LanceDB Enterprise architecture; ClickHouse merge subsystem; Kubernetes controllers; Postgres autovacuum; the FDAL stack (Flight + DataFusion + Arrow + Lance).

Adding a new invariant or pattern here means we've learned something — either from a hard call we made and want to preserve, or from a mistake we don't want to repeat. Both are worth recording.
@ -65,7 +65,7 @ manifest. The next mutation against that table fails with
`ExpectedVersionMismatch`. Most validation runs before any Lance write,
so single-statement mutations are unaffected; the narrow path is
multi-statement queries with late-op failures. Tracked as a follow-up;
see [docs/runs.md](../runs.md#known-limitation-mid-query-partial-failure-on-the-same-table)
see [docs/dev/runs.md](../dev/runs.md#known-limitation-mid-query-partial-failure-on-the-same-table)
for the workaround.

## Upgrade notes
@ -19,7 +19,7 @ mutation proceeds normally.
HEAD on every staged table is untouched and the next mutation
proceeds normally. A narrowed residual remains at the
finalize→publisher boundary (multi-table `commit_staged` is not
atomic with the manifest commit) — see [docs/runs.md](../runs.md)
atomic with the manifest commit) — see [docs/dev/runs.md](../dev/runs.md)
"Finalize → publisher residual" for details.
- **D₂ parse-time rule**: a single mutation query is either
insert/update-only or delete-only. Mixed → rejected with a clear
@ -40,8 +40,8 @ mutation proceeds normally.
`restore_coordinator` API and `CoordinatorRestoreGuard` are removed
from `mutation.rs`. (`merge.rs` keeps its own swap pattern; that's
a separate workflow.)
- **`docs/invariants.md` §VI.25** flips from `aspirational/open` to
`upheld for inserts/updates`. The within-query read-your-writes
- **`docs/dev/invariants.md` mutation atomicity / read-your-writes status**
flips from `aspirational/open` to `upheld for inserts/updates`. The within-query read-your-writes
guarantee is now load-bearing for the publisher CAS contract.

## Behavior changes
@ -105,29 +105,29 @ mutation proceeds normally.
- `Cargo.toml` (workspace) + `crates/omnigraph/Cargo.toml` — added
`datafusion = "52"` direct dep (transitively pulled by Lance
already; required for `MemTable`).
- `docs/runs.md` — removed "Known limitation" section; documented
- `docs/dev/runs.md` — removed "Known limitation" section; documented
the new accumulator + D₂ + LoadMode::Overwrite residual.
- `docs/invariants.md` — §VI.25 status flipped to `upheld for
inserts/updates`.
- `docs/architecture.md` — added "Mutation atomicity — in-memory
- `docs/dev/invariants.md` — mutation atomicity / read-your-writes status
flipped to `upheld for inserts/updates`.
- `docs/dev/architecture.md` — added "Mutation atomicity — in-memory
accumulator" subsection; refreshed the engine + state
diagrams to drop `RunRegistry` and add `MutationStaging`.
- `docs/execution.md` — rewrote the mutation flow sequence diagram
- `docs/dev/execution.md` — rewrote the mutation flow sequence diagram
for the staged-write path; updated the `LoadMode` table to call
out per-mode commit semantics; rewrote `load` vs `ingest`.
- `docs/query-language.md` — documented the D₂ parse-time rule.
- `docs/errors.md` — added the D₂ `BadRequest` rejection path.
- `docs/storage.md` — dropped the live `_graph_runs.lance` reference
- `docs/user/query-language.md` — documented the D₂ parse-time rule.
- `docs/user/errors.md` — added the D₂ `BadRequest` rejection path.
- `docs/user/storage.md` — dropped the live `_graph_runs.lance` reference
from the layout diagram and prose.
- `docs/branches-commits.md` — moved `__run__<id>` to a legacy note;
- `docs/user/branches-commits.md` — moved `__run__<id>` to a legacy note;
removed `publish_run` from the publish-trigger list.
- `docs/audit.md` — current `_as` API list refreshed; legacy
- `docs/user/audit.md` — current `_as` API list refreshed; legacy
`RunRecord.actor_id` moved to a historical note.
- `docs/constants.md` — marked the run registry / branch-prefix rows
- `docs/user/constants.md` — marked the run registry / branch-prefix rows
as legacy.
- `docs/cli.md` — replaced the legacy `omnigraph run *` quickstart
- `docs/user/cli.md` — replaced the legacy `omnigraph run *` quickstart
block with `omnigraph commit list/show`.
- `docs/testing.md` — extended the `runs.rs` row to cover the new
- `docs/dev/testing.md` — extended the `runs.rs` row to cover the new
staged-write contract tests; added the `staged_writes.rs` row.
- `AGENTS.md` (CLAUDE.md symlink) — updated the atomic-per-query
description and the L2 capability matrix row.
@ -9,7 +9,7 @
- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
- `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
- `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
- **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/query-language.md](query-language.md) for the rule and [docs/runs.md](runs.md) for the underlying staged-write rationale.
- **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](query-language.md) for the rule and [docs/dev/runs.md](../dev/runs.md) for the underlying staged-write rationale.
- `MergeConflicts(Vec<MergeConflict>)`

Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.
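
The `ExpectedVersionMismatch` entry above describes a compare-before-publish check. The following is a minimal sketch of that idea, not the engine's implementation: the struct is a simplified stand-in that only borrows the field names `table_key`, `expected`, and `actual` from the list above, and `check_expected_versions`, the map types, and the "missing table counts as version 0" rule are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the conflict detail described above.
#[derive(Debug, PartialEq, Eq)]
struct ExpectedVersionMismatch {
    table_key: String,
    expected: u64,
    actual: u64,
}

/// Compares the caller's expected versions with the manifest's current ones.
/// Returns the first mismatch in table-key order. A table missing from the
/// manifest is treated as version 0 (an assumption made for this sketch).
fn check_expected_versions(
    manifest: &BTreeMap<String, u64>,
    expected: &BTreeMap<String, u64>,
) -> Result<(), ExpectedVersionMismatch> {
    for (table_key, want) in expected {
        let actual = manifest.get(table_key).copied().unwrap_or(0);
        if actual != *want {
            return Err(ExpectedVersionMismatch {
                table_key: table_key.clone(),
                expected: *want,
                actual,
            });
        }
    }
    Ok(())
}
```

A caller that read a table at version 5 and then tries to publish after another writer advanced it to 6 would get `Err` carrying `expected: 5, actual: 6`, which is the shape of information the real error surfaces.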
52
docs/user/index.md
Normal file
@ -0,0 +1,52 @
# User Docs

**Audience:** users, CLI users, HTTP clients, and self-hosting operators

This is the public-facing entry point. These docs should describe behavior,
commands, configuration, and operational contracts without requiring knowledge
of MRs, internal recovery mechanics, or contributor-only invariants.

## Start Here

| Goal | Read |
|---|---|
| Install OmniGraph | [install.md](install.md) |
| Run the CLI locally | [cli.md](cli.md) |
| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
| Write schemas | [schema-language.md](schema-language.md) |
| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
| Write queries and mutations | [query-language.md](query-language.md) |
| Use embeddings | [embeddings.md](embeddings.md) |

## Operate A Repo

| Goal | Read |
|---|---|
| Understand repo layout and URI support | [storage.md](storage.md) |
| Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
| Coordinate multi-query workflows | [transactions.md](transactions.md) |
| Read diffs and change feeds | [changes.md](changes.md) |
| Build and use indexes | [indexes.md](indexes.md) |
| Compact and clean old versions | [maintenance.md](maintenance.md) |
| Interpret errors and output formats | [errors.md](errors.md) |

## Run The Server

| Goal | Read |
|---|---|
| Deploy the binary or container | [deployment.md](deployment.md) |
| Use HTTP endpoints | [server.md](server.md) |
| Configure Cedar authorization | [policy.md](policy.md) |
| Track actors and audit behavior | [audit.md](audit.md) |

## Releases

Release notes live in [releases/](../releases/). Use them for user-visible
changes between versions, not for contributor design history.

## Boundary

User docs should focus on stable behavior. If a paragraph needs to explain
internal sidecars, Lance API blockers, MR numbers, test strategy, or review
rules, it probably belongs in [docs/dev/index.md](../dev/index.md) or a developer-area document
instead.
@ -70,7 +70,7 @ A single mutation query must be **either insert/update-only or delete-only**. Mi

> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`

Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance 4.0.0 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until Lance exposes `DeleteJob::execute_uncommitted`, the parse-time rejection keeps both paths atomic and correct. See [docs/runs.md](runs.md) and [docs/invariants.md §VI.25](invariants.md).
Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance 4.0.0 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until Lance exposes `DeleteJob::execute_uncommitted`, the parse-time rejection keeps both paths atomic and correct. See [docs/dev/runs.md](../dev/runs.md) and [docs/dev/invariants.md](../dev/invariants.md).
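
The rule itself is simple enough to sketch. The snippet below is an illustrative model only, assuming a mutation query has already been parsed into a flat list of statement kinds; `MutationOp`, `MixedMutation`, and `check_d2` are hypothetical names, not identifiers from the OmniGraph compiler.

```rust
/// Kinds of statement a mutation query can contain (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MutationOp {
    Insert,
    Update,
    Delete,
}

/// Error returned when a query mixes constructive writes with deletes.
#[derive(Debug, PartialEq, Eq)]
struct MixedMutation {
    name: String,
}

/// Accepts insert/update-only or delete-only queries and rejects mixes.
/// Works on the parsed statement list only, so it needs no I/O.
fn check_d2(name: &str, ops: &[MutationOp]) -> Result<(), MixedMutation> {
    let has_delete = ops.iter().any(|op| *op == MutationOp::Delete);
    let has_constructive = ops
        .iter()
        .any(|op| matches!(op, MutationOp::Insert | MutationOp::Update));
    if has_delete && has_constructive {
        Err(MixedMutation { name: name.to_string() })
    } else {
        Ok(())
    }
}
```

Because the check inspects only statement kinds, it can run before any table is opened, which matches the "before any I/O" wording used for the rejection elsewhere in these docs.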

## IR (Intermediate Representation)
@ -60,7 +60,8 @ Edge bodies only allow `@unique` and `@index`.
## Schema IR & stable type IDs

- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
- Each interface/node/edge gets a `stable_type_id` (kind+name hashed) so renames can be tracked.
- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
- Serialized as JSON for diff/migration plans.
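
To see why a kind+name hash leaves a gap for renames, consider this small sketch. It is not the catalog's code: the hash function OmniGraph actually uses is not stated here, so 64-bit FNV-1a is swapped in purely because it is short and deterministic, and the `Kind` enum and `"kind:name"` input format are assumptions.

```rust
/// Schema element kinds that receive a type ID.
#[derive(Debug, Clone, Copy)]
enum Kind {
    Interface,
    Node,
    Edge,
}

/// 64-bit FNV-1a, used here only because it is tiny and deterministic.
/// The hash OmniGraph actually uses is not stated in this section.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical hash-on-name ID: depends only on kind and current name.
fn stable_type_id(kind: Kind, name: &str) -> u64 {
    let tag = match kind {
        Kind::Interface => "interface",
        Kind::Node => "node",
        Kind::Edge => "edge",
    };
    fnv1a64(format!("{tag}:{name}").as_bytes())
}
```

Under this scheme, renaming a node from `Person` to `Human` produces a different ID, so the rename looks like a drop plus an add unless the migration step carries the old ID forward, which is the gap the bullet above describes.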

## Schema migration planning
@ -2,9 +2,11 @@

OmniGraph does not have `BEGIN` / `COMMIT` / `ROLLBACK`. Branches do that job. This page explains the model, when to use which primitive, and shows worked examples for the patterns that come up most.

The architectural rule lives in [`docs/invariants.md`](invariants.md) §VI.23:
The architectural rule lives in [`docs/dev/invariants.md`](../dev/invariants.md):

> **Atomicity is per-query.** Every `.gq` query is atomic via the substrate's atomic-commit primitive. **No cross-query `BEGIN`/`COMMIT`; branches and merges fill that role for agent workflows.**
> **Mutations publish at one boundary.** A `mutate_as` or `load` operation
> accumulates constructive writes, commits each touched table at the end, then
> publishes one manifest update.

If you need to coordinate multiple queries atomically, you fork a branch, run mutations on it, and merge when you're satisfied. If something goes wrong, you delete the branch.
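
The fork, mutate, then merge-or-delete pattern can be pictured with a toy in-memory model. This is an illustration of the idea only and does not use OmniGraph's API: `Repo` and its methods are hypothetical, each "query" is a single appended row, and the merge is reduced to a fast-forward, whereas a real merge is three-way and can report conflicts.

```rust
use std::collections::HashMap;

/// Toy repo: each branch maps to an ordered list of committed rows.
/// This is an illustrative model, not OmniGraph's API.
struct Repo {
    branches: HashMap<String, Vec<String>>,
}

impl Repo {
    fn new() -> Self {
        let mut branches = HashMap::new();
        branches.insert("main".to_string(), Vec::new());
        Repo { branches }
    }

    /// Fork `from` into a new branch that starts with the same contents.
    fn fork(&mut self, from: &str, name: &str) {
        let snapshot = self.branches[from].clone();
        self.branches.insert(name.to_string(), snapshot);
    }

    /// One atomic "query": append a row to a single branch.
    fn mutate(&mut self, branch: &str, row: &str) {
        self.branches
            .get_mut(branch)
            .expect("unknown branch")
            .push(row.to_string());
    }

    /// "Commit": fast-forward `into` to the contents of `from`, then drop `from`.
    fn merge(&mut self, from: &str, into: &str) {
        let rows = self.branches.remove(from).expect("unknown branch");
        self.branches.insert(into.to_string(), rows);
    }

    /// "Rollback": discard the branch; other branches never saw its writes.
    fn delete_branch(&mut self, name: &str) {
        self.branches.remove(name);
    }
}
```

In this model, several `mutate` calls on a forked branch stay invisible to `main` until `merge` runs, and `delete_branch` abandons them without touching `main`, which is the role `COMMIT` and `ROLLBACK` would otherwise play.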
@ -159,8 +161,8 @@ This is the workflow MR-797 / agentic loops are designed around: **branches are

## See also

- [`docs/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
- [`docs/merge.md`](merge.md) — three-way merge details and conflict kinds.
- [`docs/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
- [`docs/runs.md`](runs.md) — the per-query commit pipeline that gives single-query atomicity.
- [`docs/invariants.md`](invariants.md) §VI.23 — the architectural rule.
- [`docs/user/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
- [`docs/dev/merge.md`](../dev/merge.md) — three-way merge details and conflict kinds.
- [`docs/user/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
- [`docs/dev/runs.md`](../dev/runs.md) — the per-query commit pipeline that gives single-query atomicity.
- [`docs/dev/invariants.md`](../dev/invariants.md) — the architectural rule.