mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-15 01:55:13 +02:00
Remove developer-only scaffolding that leaked into the public user/operator docs, while preserving every user-facing behavior, command, flag, endpoint, constant, and env var. No behavior changes.

Removed across 18 files:
- internal ticket / sequencing refs (MR-NNN, RFC-NNN, "Phase N");
- source-code paths (crates/**/*.rs, *.pest) and internal struct/function dumps (e.g. the QueryIR / GraphCommit / SchemaMigrationPlan Rust types, internal fn names like fork_branch_from_state, optimize_all_tables);
- Lance-internal blocker prose (upstream issue numbers, blob-decode cause, sidecar Phase-B/C mechanics), keeping the user-visible behavior (e.g. "optimize skips Blob-column tables; reads/writes unaffected");
- pre-v0.4.0 Run-state-machine archaeology.

Internal IR/lowering/recovery-internals sections were either trimmed to a brief user-facing note (e.g. "Traversal execution", "interrupted writes recover automatically; recovery commits are recorded under actor omnigraph:recovery") or removed.

Kept: all language syntax, lint codes, Cedar actions/scopes, endpoints, error taxonomy, every constant and env var (verified none dropped from the constants cheat-sheet), and the operator-facing explanations of on-disk artifacts. Residual "legacy" mentions are all user-facing (the deprecated omnigraph.yaml, the legacy token chain, old command names).

Verified: zero internal-scaffolding leaks (MR/RFC/Phase/.rs/.pest = 0) across docs/user; zero broken links; check-agents-md.sh green.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

# Schema Language (`.pg`)

## Top-level declarations

- `interface <Name> { property* }`: reusable property contracts.
- `node <Name> [implements <Iface>, ...] { property* | constraint* }`
- `edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }`
- Comments: line `//` and block `/* … */`.

## Property declarations

`<ident>: <TypeRef> [annotation*]`

## Built-in scalar types

| Scalar | Arrow type |
|---|---|
| `String` | Utf8 |
| `Blob` | LargeBinary |
| `Bool` | Boolean |
| `I32` / `I64` | Int32 / Int64 |
| `U32` / `U64` | UInt32 / UInt64 |
| `F32` / `F64` | Float32 / Float64 |
| `Date` | Date32 |
| `DateTime` | Date64 |
| `Vector(<dim>)` | FixedSizeList(Float32, dim), `1 ≤ dim ≤ i32::MAX` |
| `[<scalar>]` | List(scalar) |
| `enum(v1, v2, …)` | Utf8 with sorted/dedup'd set of allowed string values |
| `<scalar>?` | Same as scalar but `nullable: true` |

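The type table can be read as a small lookup. The following is a minimal Python sketch of that mapping, not the compiler's actual logic: the function name `resolve`, the returned dictionary shape, and the choice to accept a trailing `?` on any type form (the table only states it for scalars) are assumptions made for illustration.

```python
import re

# Scalar -> Arrow type names, copied from the table above.
SCALARS = {
    "String": "Utf8",
    "Blob": "LargeBinary",
    "Bool": "Boolean",
    "I32": "Int32", "I64": "Int64",
    "U32": "UInt32", "U64": "UInt64",
    "F32": "Float32", "F64": "Float64",
    "Date": "Date32",
    "DateTime": "Date64",
}
I32_MAX = 2**31 - 1  # upper bound for Vector dimensions


def resolve(type_ref: str) -> dict:
    """Hypothetical helper: describe a TypeRef as an Arrow type plus nullability."""
    text = type_ref.strip()
    nullable = text.endswith("?")  # assumption: `?` is accepted on any form
    if nullable:
        text = text[:-1].strip()

    if text in SCALARS:
        return {"arrow": SCALARS[text], "nullable": nullable}

    m = re.fullmatch(r"Vector\((\d+)\)", text)
    if m:
        dim = int(m.group(1))
        if not 1 <= dim <= I32_MAX:
            raise ValueError(f"Vector dim out of range: {dim}")
        return {"arrow": f"FixedSizeList(Float32, {dim})", "nullable": nullable}

    m = re.fullmatch(r"\[(\w+)\]", text)
    if m and m.group(1) in SCALARS:
        return {"arrow": f"List({SCALARS[m.group(1)]})", "nullable": nullable}

    m = re.fullmatch(r"enum\((.*)\)", text)
    if m:
        # Allowed values are stored sorted and de-duplicated.
        values = sorted({v.strip() for v in m.group(1).split(",") if v.strip()})
        return {"arrow": "Utf8", "nullable": nullable, "values": values}

    raise ValueError(f"unknown type: {type_ref!r}")
```

For example, `resolve("enum(open, closed, open)")["values"]` yields `["closed", "open"]`, which shows why the declaration order of enum values does not matter.
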
## Constraints (body level)

| Constraint | On | Effect |
|---|---|---|
| `@key(p, …)` | node | Primary key; implies index on key columns; `key_property()` returns the first key |
| `@unique(p, …)` | node, edge | Uniqueness across listed columns |
| `@index(p, …)` | node, edge | Build a scalar (BTREE) index on the columns |
| `@range(p, min..max)` | node | Numeric range validation (open ranges allowed) |
| `@check(p, "regex")` | node | Regex pattern validation |
| `@card(min..max?)` | edge | Edge multiplicity. Default `0..*`; other forms include `0..1`, `1..1`, `1..*`. |

Edge bodies only allow `@unique` and `@index`. `@card` is written in the edge header, as shown in the `edge` declaration syntax under "Top-level declarations".

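To make the `min..max` notation concrete, here is a small Python sketch of how `@range`, `@check`, and `@card` bounds could be interpreted. It is a model under stated assumptions, not the engine's validator: the helper names are hypothetical, bounds are assumed inclusive, a blank or `*` side is assumed to mean "open", and the regex is assumed to match the whole value.

```python
import re


def parse_bounds(text):
    """Parse 'min..max' into (lo, hi); None means that side is open."""
    lo, sep, hi = text.strip().partition("..")
    if not sep:
        raise ValueError(f"expected 'min..max', got {text!r}")

    def side(s):
        s = s.strip()
        return None if s in ("", "*") else float(s)

    return side(lo), side(hi)


def in_range(value, bounds):
    """@range sketch: assumes both ends are inclusive."""
    lo, hi = parse_bounds(bounds)
    return (lo is None or value >= lo) and (hi is None or value <= hi)


def matches(value, pattern):
    """@check sketch: assumes the pattern must match the entire value."""
    return re.fullmatch(pattern, value) is not None


def card_ok(edge_count, bounds="0..*"):
    """@card sketch: the default multiplicity 0..* accepts any count."""
    return in_range(edge_count, bounds)
```

With these helpers, `card_ok(2, "0..1")` is false while `card_ok(2)` is true, mirroring the difference between an at-most-one edge and the default multiplicity.
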
## Annotations

- `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
- Known annotations:
  - `@embed` on a Vector property names the *source* property whose text gets embedded into this vector at ingest.
  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.

## Table layout

- Each node type compiles to a table with an `id: Utf8` column plus all declared properties (blob columns are stored as `LargeBinary`); `implements` clauses expand the interface's properties into the node.
- Each edge type compiles to a table with `id: Utf8, src: Utf8, dst: Utf8` plus the edge's own properties. Edge endpoint types (`from`/`to`) must exist, and edge names are matched case-insensitively.

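The layout rules above can be sketched in a few lines of Python. This is an illustrative model only: the dictionary shapes, the function names, and the column ordering (interface properties before the node's own) are assumptions, not something the schema compiler is documented to guarantee.

```python
def node_columns(node, interfaces):
    """Columns of a node table: `id`, expanded interface properties, own properties."""
    cols = {"id": "Utf8"}
    for iface in node.get("implements", []):
        cols.update(interfaces[iface])  # `implements` expands the interface
    cols.update(node.get("properties", {}))
    return cols


def edge_columns(edge, node_names):
    """Columns of an edge table; both endpoint types must be declared nodes."""
    for end in ("from", "to"):
        if edge[end] not in node_names:
            raise ValueError(f"unknown {end} type: {edge[end]}")
    cols = {"id": "Utf8", "src": "Utf8", "dst": "Utf8"}
    cols.update(edge.get("properties", {}))
    return cols


def find_edge(edges, name):
    """Look up an edge by name, ignoring case."""
    wanted = name.lower()
    for edge_name, edge in edges.items():
        if edge_name.lower() == wanted:
            return edge
    raise KeyError(name)


# Hypothetical schema, already resolved to Arrow type names.
interfaces = {"Named": {"name": "Utf8"}}
person = {"implements": ["Named"], "properties": {"avatar": "LargeBinary"}}
edges = {"Knows": {"from": "Person", "to": "Person", "properties": {"since": "Date32"}}}
```

Here `node_columns(person, interfaces)` produces the columns `id`, `name`, and `avatar`, and `find_edge(edges, "knows")` resolves to the `Knows` edge despite the different casing.
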
## Schema migration planning

A migration plan compares the accepted schema against the desired one and reports whether the change is supported plus the ordered steps it requires:

- Add a type
- Rename a type
- Add a property
- Rename a property
- Add a constraint
- Update type or property metadata (annotations)
- Unsupported change (reports the entity and reason; forces the plan to unsupported)

Applying a plan reports whether it was supported, the steps applied, and the resulting manifest version. Concurrent schema applies serialize so they can't interleave.

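The key property of a plan is that a single unsupported step makes the whole plan unsupported. The Python sketch below models only that rule with a toy diff. The `Step` and `Plan` classes, the step kind strings, and the choice to treat a property type change as unsupported are all hypothetical; the real planner covers more step kinds (renames, constraints, metadata) and may classify changes differently.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    kind: str          # e.g. "add_type", "add_property", "unsupported"
    entity: str        # the type or `Type.property` the step refers to
    reason: str = ""   # filled in for unsupported steps


@dataclass
class Plan:
    steps: list = field(default_factory=list)

    @property
    def supported(self) -> bool:
        # Any unsupported step forces the whole plan to unsupported.
        return all(step.kind != "unsupported" for step in self.steps)


def plan(accepted: dict, desired: dict) -> Plan:
    """Toy diff: both arguments map type name -> {property name -> type string}."""
    steps = []
    for type_name, props in desired.items():
        if type_name not in accepted:
            steps.append(Step("add_type", type_name))
            continue
        for prop, prop_type in props.items():
            old = accepted[type_name].get(prop)
            if old is None:
                steps.append(Step("add_property", f"{type_name}.{prop}"))
            elif old != prop_type:
                # Assumption for illustration: type changes are not supported.
                steps.append(Step("unsupported", f"{type_name}.{prop}",
                                  f"type change {old} -> {prop_type}"))
    return Plan(steps)
```

Calling `plan({"Person": {"name": "String"}}, {"Person": {"name": "I64"}})` returns a plan whose `supported` flag is false and whose only step names `Person.name` with the reason attached.
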
## Destructive drops — `--allow-data-loss`

`DropProperty` and `DropType` steps default to `Soft` mode: the catalog tombstones the entry but the prior column / dataset remains time-travel-reachable via `snapshot_at_version(prev)` until `omnigraph cleanup` runs. Soft drops are reversible.

Pass `--allow-data-loss` (CLI) or `allow_data_loss: true` (HTTP `POST /schema/apply` body, SDK `SchemaApplyOptions`) to promote every drop in the plan to `Hard` mode. Hard drops run `cleanup_old_versions` on the affected dataset immediately after the manifest publish, making the prior column / dataset unreachable. **Irreversible.**

The flag is honored uniformly across transports: `omnigraph schema apply --allow-data-loss`, `POST /schema/apply { schema_source, allow_data_loss: true }`, and `apply_schema_with_options(.., SchemaApplyOptions { allow_data_loss: true })` produce identical plans and identical effects.

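As an illustration of the HTTP form, the following Python sketch builds (but does not send) a `POST /schema/apply` request. Only the path and the `schema_source` / `allow_data_loss` field names come from the text above; the base URL, the `Content-Type` header, the absence of authentication, the helper name, and the sample schema text are assumptions.

```python
import json
import urllib.request

# Hypothetical schema source; the layout of the body is assumed, not normative.
SCHEMA = """\
node Person {
    name: String
    age: I32?
    @key(name)
}
"""


def schema_apply_request(base_url, schema_source, allow_data_loss=False):
    """Build a POST /schema/apply request. Hard drops are opt-in."""
    body = json.dumps({
        "schema_source": schema_source,
        "allow_data_loss": allow_data_loss,  # False keeps drops in Soft mode
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/schema/apply",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local server address; replace with the real endpoint.
request = schema_apply_request("http://localhost:8080", SCHEMA, allow_data_loss=True)
```

Keeping `allow_data_loss` as an explicit keyword argument that defaults to `False` mirrors the CLI, where irreversible drops happen only when the flag is passed deliberately. Sending the request (for example with `urllib.request.urlopen(request)`) is left out because it would apply the schema.
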