The schema-lint chassis v1.2 (PR #100) shipped `--allow-data-loss` on the CLI, but `SchemaApplyRequest` had no equivalent field, so Hard-mode drops were CLI-only. This commit closes that feature gap and adds e2e test coverage for drop modes across HTTP + CLI, plus data preservation on additive apply, plus a CLI↔SDK plan-parity assertion.

Feature gap closed:
- `crates/omnigraph-server/src/api.rs`: added `allow_data_loss: bool` (default false via `#[serde(default)]`) to `SchemaApplyRequest`. Added `Default` derive so test usages can use `..Default::default()`.
- `crates/omnigraph-server/src/lib.rs`: `server_schema_apply` now constructs `SchemaApplyOptions { allow_data_loss: request.allow_data_loss }` and threads it through to `apply_schema_as`.
- `crates/omnigraph-cli/src/main.rs`: the remote-URI schema-apply path used to bail with "--allow-data-loss not yet supported on remote"; it now forwards the flag into the JSON payload so the CLI behaves identically against local and remote URIs.
- `openapi.json`: regenerated; the only diff is the new field on `SchemaApplyRequest`.

Tests added (8 new):
* `crates/omnigraph-server/tests/server.rs` (+5):
  - `schema_apply_route_soft_drops_property_via_http`: POST schema removing a nullable property, verify the catalog reflects the drop AND `snapshot_at_version(pre)` still has `age` in the field list (time-travel reachability is the Soft contract).
  - `schema_apply_route_soft_drops_node_type_via_http`: POST schema removing the `Company` node + cascading `WorksAt` edge.
  - `schema_apply_route_hard_drops_property_with_allow_data_loss`: POST with `allow_data_loss: true`, verify the plan step reports `mode: hard`.
  - `schema_apply_route_keeps_drops_soft_without_flag`: same schema without the flag, verify `mode: soft`. Pins default semantics against accidental Hard promotion.
  - `schema_apply_route_additive_property_preserves_existing_rows`: load fixture, POST adding a nullable property, verify row count preserved (the SDK suite covers data preservation on drops + renames; additive AddProperty wasn't pinned).
  - Plus helpers `schema_without_age` and `schema_without_company`.
* `crates/omnigraph-cli/tests/cli.rs` (+3):
  - `schema_apply_allow_data_loss_flag_promotes_drops_to_hard`: CLI `omnigraph schema apply --allow-data-loss --schema X.pg --json`, verify the plan step has `mode: hard`.
  - `schema_apply_without_allow_data_loss_keeps_soft_drops`: without the flag, verify Soft.
  - `schema_plan_parity_cli_and_sdk`: same `.pg` source through `Omnigraph::plan_schema` (SDK) and `omnigraph schema plan --json` (CLI), assert the steps array is byte-identical post-JSON. HTTP has no `/schema/plan` endpoint; apply-side parity is implicitly covered by the HTTP drop tests + CLI drop tests using identical fixtures.

Docs:
- `docs/user/schema-language.md`: new "Destructive drops" section documenting Soft vs Hard semantics and that `allow_data_loss` is now honored uniformly across CLI / HTTP / SDK.

Verification: every new test passes; full `cargo test --workspace --locked` green; `scripts/check-agents-md.sh` passes.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Schema Language (.pg)
Pest grammar at crates/omnigraph-compiler/src/schema/schema.pest. AST at schema/ast.rs. Catalog at catalog/mod.rs.
Top-level declarations
- `interface <Name> { property* }`: reusable property contracts.
- `node <Name> [implements <Iface>, ...] { property* | constraint* }`
- `edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }`
- Comments: line `//` and block `/* … */`.
Property declarations
<ident>: <TypeRef> [annotation*]
Built-in scalar types
| Scalar | Arrow type |
|---|---|
| `String` | `Utf8` |
| `Blob` | `LargeBinary` |
| `Bool` | `Boolean` |
| `I32` / `I64` | `Int32` / `Int64` |
| `U32` / `U64` | `UInt32` / `UInt64` |
| `F32` / `F64` | `Float32` / `Float64` |
| `Date` | `Date32` |
| `DateTime` | `Date64` |
| `Vector(<dim>)` | `FixedSizeList(Float32, dim)`, 1 ≤ dim ≤ `i32::MAX` |
| `[<scalar>]` | `List(scalar)` |
| `enum(v1, v2, …)` | `Utf8` with sorted/dedup'd set of allowed string values |
| `<scalar>?` | Same as scalar but `nullable: true` |
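The mapping in the table can be sketched as a plain lookup. This is only an illustration: `TypeRef`, `arrow_type_name`, and `normalize_enum` are hypothetical names, Arrow types are represented as strings rather than real Arrow `DataType` values, and nullability is left out.

```rust
/// Hypothetical mirror of the scalar table; not the compiler's actual AST.
#[derive(Debug, Clone, PartialEq)]
enum TypeRef {
    String,
    Blob,
    Bool,
    I32,
    I64,
    U32,
    U64,
    F32,
    F64,
    Date,
    DateTime,
    Vector(u32),
    List(Box<TypeRef>),
    Enum(Vec<String>),
}

/// Returns the Arrow type name from the table, or an error for an
/// out-of-range vector dimension (1 <= dim <= i32::MAX).
fn arrow_type_name(t: &TypeRef) -> Result<String, String> {
    let name = match t {
        TypeRef::String => "Utf8".to_string(),
        TypeRef::Blob => "LargeBinary".to_string(),
        TypeRef::Bool => "Boolean".to_string(),
        TypeRef::I32 => "Int32".to_string(),
        TypeRef::I64 => "Int64".to_string(),
        TypeRef::U32 => "UInt32".to_string(),
        TypeRef::U64 => "UInt64".to_string(),
        TypeRef::F32 => "Float32".to_string(),
        TypeRef::F64 => "Float64".to_string(),
        TypeRef::Date => "Date32".to_string(),
        TypeRef::DateTime => "Date64".to_string(),
        TypeRef::Vector(dim) => {
            if *dim == 0 || *dim > i32::MAX as u32 {
                return Err(format!("vector dimension {dim} out of range"));
            }
            format!("FixedSizeList(Float32, {dim})")
        }
        TypeRef::List(inner) => format!("List({})", arrow_type_name(inner)?),
        // Enums are stored as plain strings; the allowed set lives beside the column.
        TypeRef::Enum(_) => "Utf8".to_string(),
    };
    Ok(name)
}

/// Sorts and deduplicates enum values, as the table describes.
fn normalize_enum(values: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    out.sort();
    out.dedup();
    out
}
```

Under these assumptions, `arrow_type_name(&TypeRef::Vector(0))` is rejected, while `normalize_enum(&["b", "a", "b"])` yields the two values `a` and `b` in sorted order.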
Constraints (body level)
| Constraint | On | Effect |
|---|---|---|
| `@key(p, …)` | node | Primary key; implies index on key columns; `key_property()` returns the first key |
| `@unique(p, …)` | node, edge | Uniqueness across listed columns |
| `@index(p, …)` | node, edge | Build a scalar (BTREE) index on the columns |
| `@range(p, min..max)` | node | Numeric range validation (open ranges allowed) |
| `@check(p, "regex")` | node | Regex pattern validation |
| `@card(min..max?)` | edge | Edge multiplicity: default `0..*`; `0..1`, `1..1`, `1..*`, etc. |
Edge bodies only allow @unique and @index.
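As an illustration of the `@card(min..max?)` row, the sketch below parses a multiplicity range and checks an edge count against it. `Cardinality`, `parse_card`, and `allows` are hypothetical names, and treating both an empty upper bound and `*` as "unbounded" is an assumption, not the grammar's confirmed behavior.

```rust
/// Hypothetical in-memory form of `@card(min..max?)`; `max = None` means unbounded.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Cardinality {
    min: u32,
    max: Option<u32>,
}

impl Default for Cardinality {
    /// The documented default multiplicity, `0..*`.
    fn default() -> Self {
        Cardinality { min: 0, max: None }
    }
}

impl Cardinality {
    /// True when `count` edges satisfy the range.
    fn allows(&self, count: u32) -> bool {
        count >= self.min && self.max.map_or(true, |m| count <= m)
    }
}

/// Parses text such as `0..1`, `1..*`, or `1..` (assumed spelling of an open range).
fn parse_card(text: &str) -> Result<Cardinality, String> {
    let (lo, hi) = text
        .split_once("..")
        .ok_or_else(|| format!("expected `min..max`, got `{text}`"))?;
    let min: u32 = lo
        .trim()
        .parse()
        .map_err(|_| format!("invalid lower bound `{lo}`"))?;
    let max = match hi.trim() {
        "" | "*" => None,
        s => Some(
            s.parse::<u32>()
                .map_err(|_| format!("invalid upper bound `{s}`"))?,
        ),
    };
    if let Some(m) = max {
        if m < min {
            return Err(format!("upper bound {m} is below lower bound {min}"));
        }
    }
    Ok(Cardinality { min, max })
}
```

With this sketch, `parse_card("1..1")` accepts exactly one edge, and `parse_card("0..*")` equals `Cardinality::default()`.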
Annotations
- `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
- Known annotations:
  - `@embed` on a Vector property: names the source property whose text gets embedded into this vector at ingest (`embed_sources` map in `NodeType`).
  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.
Catalog construction
- Pass 0: collect interfaces.
- Pass 1: collect nodes, expand `implements`, build constraint and `@embed` mappings, build the Arrow schema for each node table (`id: Utf8` plus all properties; blob columns get `LargeBinary`).
- Pass 2: collect edges, validate that `from_type`/`to_type` exist, normalize edge names case-insensitively for lookup, validate constraints for edges. Edge Arrow schema: `id: Utf8, src: Utf8, dst: Utf8` plus edge properties.
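The three passes can be sketched roughly as follows. Every type and function here (`InterfaceDecl`, `NodeDecl`, `EdgeDecl`, `Catalog`, `build_catalog`, `col`) is hypothetical, columns are `(name, Arrow type name)` string pairs, and the sketch assumes that expanding `implements` means copying the interface's properties into the node. Constraint and `@embed` handling are omitted.

```rust
use std::collections::{BTreeMap, HashMap};

// (property name, Arrow type name); a stand-in for real Arrow fields.
type Column = (String, String);

struct InterfaceDecl { name: String, properties: Vec<Column> }
struct NodeDecl { name: String, implements: Vec<String>, properties: Vec<Column> }
struct EdgeDecl { name: String, from_type: String, to_type: String, properties: Vec<Column> }

#[derive(Debug, Default)]
struct Catalog {
    node_columns: BTreeMap<String, Vec<Column>>,
    edge_columns: BTreeMap<String, Vec<Column>>,
    // Lowercased edge name -> declared edge name, for case-insensitive lookup.
    edge_lookup: HashMap<String, String>,
}

fn col(name: &str, ty: &str) -> Column {
    (name.to_string(), ty.to_string())
}

fn build_catalog(
    interfaces: &[InterfaceDecl],
    nodes: &[NodeDecl],
    edges: &[EdgeDecl],
) -> Result<Catalog, String> {
    // Pass 0: collect interfaces by name.
    let ifaces: HashMap<&str, &InterfaceDecl> =
        interfaces.iter().map(|i| (i.name.as_str(), i)).collect();
    let mut catalog = Catalog::default();

    // Pass 1: nodes get `id: Utf8`, then interface properties, then their own.
    for node in nodes {
        let mut cols = vec![col("id", "Utf8")];
        for iface_name in &node.implements {
            let iface = ifaces
                .get(iface_name.as_str())
                .ok_or_else(|| format!("unknown interface `{iface_name}`"))?;
            cols.extend(iface.properties.iter().cloned());
        }
        cols.extend(node.properties.iter().cloned());
        catalog.node_columns.insert(node.name.clone(), cols);
    }

    // Pass 2: edges must point at known node types.
    for edge in edges {
        for endpoint in [&edge.from_type, &edge.to_type] {
            if !catalog.node_columns.contains_key(endpoint) {
                return Err(format!(
                    "edge `{}` references unknown node type `{endpoint}`",
                    edge.name
                ));
            }
        }
        let mut cols = vec![col("id", "Utf8"), col("src", "Utf8"), col("dst", "Utf8")];
        cols.extend(edge.properties.iter().cloned());
        catalog.edge_lookup.insert(edge.name.to_lowercase(), edge.name.clone());
        catalog.edge_columns.insert(edge.name.clone(), cols);
    }
    Ok(catalog)
}
```

Running nodes before edges is what lets Pass 2 reject an edge whose `from_type` or `to_type` was never declared, regardless of declaration order in the source file.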
Schema IR & stable type IDs
- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
- Serialized as JSON for diff/migration plans.
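To make the rename gap concrete, here is a minimal sketch of a kind+name hash and of one way an ID could be carried across a rename. The hash function actually used is not stated here; FNV-1a is swapped in only because it is deterministic and dependency-free. `TypeKind`, `stable_type_id`, and `resolve_type_id` are hypothetical, as is the idea of looking up the accepted ID by the old name.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TypeKind {
    Interface,
    Node,
    Edge,
}

/// 64-bit FNV-1a over raw bytes (stand-in for whatever hash the catalog uses).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// ID derived purely from kind + name, so a rename changes the ID.
fn stable_type_id(kind: TypeKind, name: &str) -> u64 {
    let tag = match kind {
        TypeKind::Interface => "interface",
        TypeKind::Node => "node",
        TypeKind::Edge => "edge",
    };
    fnv1a64(format!("{tag}:{name}").as_bytes())
}

/// Hypothetical fix: when a declaration names its previous identity, reuse the
/// accepted ID recorded for that old name; otherwise fall back to hashing.
fn resolve_type_id(
    accepted_ids: &HashMap<String, u64>,
    kind: TypeKind,
    name: &str,
    rename_from: Option<&str>,
) -> u64 {
    rename_from
        .and_then(|old| accepted_ids.get(old).copied())
        .unwrap_or_else(|| stable_type_id(kind, name))
}
```

In this sketch, hashing `Person` and `Human` gives different IDs, which is the gap described above; `resolve_type_id` with `rename_from = Some("Person")` returns the original ID instead.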
Schema migration planning
plan_schema_migration(accepted, desired) -> SchemaMigrationPlan { supported, steps[] } with step types:
- `AddType { type_kind, name }`
- `RenameType { type_kind, from, to }`
- `AddProperty { type_kind, type_name, property_name, property_type }`
- `RenameProperty { type_kind, type_name, from, to }`
- `AddConstraint { type_kind, type_name, constraint }`
- `UpdateTypeMetadata { … annotations }`
- `UpdatePropertyMetadata { … annotations }`
- `UnsupportedChange { entity, reason }` (forces `supported=false`)
apply_schema() returns SchemaApplyResult { supported, applied, manifest_version, steps } and is gated by an internal __schema_apply_lock__ system branch so concurrent schema applies serialize.
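The relationship between steps and the `supported` flag can be illustrated with a much-reduced planner. This is a sketch, not `plan_schema_migration` itself: `Schema`, `Step`, `Plan`, and `plan` are hypothetical, `type_kind` is omitted, only three step types are modeled, and treating a property type change as an `UnsupportedChange` is an assumption made for the example.

```rust
use std::collections::BTreeMap;

// type name -> (property name -> property type); a stand-in for the schema IR.
type Schema = BTreeMap<String, BTreeMap<String, String>>;

#[derive(Debug, Clone, PartialEq)]
enum Step {
    AddType { name: String },
    AddProperty { type_name: String, property_name: String, property_type: String },
    UnsupportedChange { entity: String, reason: String },
}

#[derive(Debug, PartialEq)]
struct Plan {
    supported: bool,
    steps: Vec<Step>,
}

/// Diffs `desired` against `accepted`, emitting additive steps where possible.
fn plan(accepted: &Schema, desired: &Schema) -> Plan {
    let mut steps = Vec::new();
    for (type_name, props) in desired {
        match accepted.get(type_name) {
            // A whole new type: one AddType step stands in for its properties too.
            None => steps.push(Step::AddType { name: type_name.clone() }),
            Some(old_props) => {
                for (prop, ty) in props {
                    match old_props.get(prop) {
                        None => steps.push(Step::AddProperty {
                            type_name: type_name.clone(),
                            property_name: prop.clone(),
                            property_type: ty.clone(),
                        }),
                        // Assumed for illustration: an in-place type change is unsupported.
                        Some(old_ty) if old_ty != ty => steps.push(Step::UnsupportedChange {
                            entity: format!("{type_name}.{prop}"),
                            reason: format!("type changed from {old_ty} to {ty}"),
                        }),
                        Some(_) => {}
                    }
                }
            }
        }
    }
    // Any UnsupportedChange step forces supported = false.
    let supported = !steps
        .iter()
        .any(|s| matches!(s, Step::UnsupportedChange { .. }));
    Plan { supported, steps }
}
```

Keeping unsupported changes as ordinary steps, rather than returning an error, lets a caller show the full plan alongside the reason it cannot be applied.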
Destructive drops — --allow-data-loss
DropProperty and DropType steps default to Soft mode: the catalog tombstones the entry but the prior column / dataset remains time-travel-reachable via snapshot_at_version(prev) until omnigraph cleanup runs. Soft drops are reversible.
Pass --allow-data-loss (CLI) or allow_data_loss: true (HTTP POST /schema/apply body, SDK SchemaApplyOptions) to promote every drop in the plan to Hard mode. Hard drops run cleanup_old_versions on the affected dataset immediately after the manifest publish, making the prior column / dataset unreachable. Irreversible.
The flag is honored uniformly across transports — omnigraph schema apply --allow-data-loss, POST /schema/apply { schema_source, allow_data_loss: true }, and apply_schema_with_options(.., SchemaApplyOptions { allow_data_loss: true }) produce identical plans and identical effects.
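The promotion rule can be sketched as a single pass over the drop steps. `SchemaApplyOptions { allow_data_loss }` comes from the text above; `DropMode`, `DropStep`, their fields, and `resolve_drop_modes` are hypothetical shapes chosen for illustration, and the sketch does not perform any cleanup itself.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DropMode {
    Soft,
    Hard,
}

/// Mirrors the documented option; `Default` gives `allow_data_loss: false`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct SchemaApplyOptions {
    allow_data_loss: bool,
}

/// Hypothetical step shapes; the real plan steps may carry more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
enum DropStep {
    DropProperty { type_name: String, property_name: String, mode: DropMode },
    DropType { name: String, mode: DropMode },
}

/// Sets every drop in the plan to Hard when the flag is set, Soft otherwise.
fn resolve_drop_modes(steps: Vec<DropStep>, options: SchemaApplyOptions) -> Vec<DropStep> {
    let mode = if options.allow_data_loss {
        DropMode::Hard
    } else {
        DropMode::Soft
    };
    steps
        .into_iter()
        .map(|step| match step {
            DropStep::DropProperty { type_name, property_name, .. } => {
                DropStep::DropProperty { type_name, property_name, mode }
            }
            DropStep::DropType { name, .. } => DropStep::DropType { name, mode },
        })
        .collect()
}
```

Because the decision depends only on the options value, a CLI flag, an HTTP body field, and an SDK struct that all populate the same `allow_data_loss` boolean would resolve to the same modes in this model.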