Add an "Enum evolution" section to schema-language.md covering the four supported shapes and their tiers, plus the unsupported cases (non-String scalar change, interface enums, in-place variant rename). Record the new ChangeEnumConstraint migration step. Add OG-MF-105 / OG-MF-107 to the schema-lint code table and clarify OG-MF-106 as a genuine scalar change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Schema Language (.pg)
Pest grammar at crates/omnigraph-compiler/src/schema/schema.pest. AST at schema/ast.rs. Catalog at catalog/mod.rs.
Top-level declarations
- `interface <Name> { property* }`: reusable property contracts.
- `node <Name> [implements <Iface>, ...] { property* | constraint* }`
- `edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }`
- Comments: line `//` and block `/* … */`.
Property declarations
<ident>: <TypeRef> [annotation*]
Built-in scalar types
| Scalar | Arrow type |
|---|---|
| `String` | `Utf8` |
| `Blob` | `LargeBinary` |
| `Bool` | `Boolean` |
| `I32` / `I64` | `Int32` / `Int64` |
| `U32` / `U64` | `UInt32` / `UInt64` |
| `F32` / `F64` | `Float32` / `Float64` |
| `Date` | `Date32` |
| `DateTime` | `Date64` |
| `Vector(<dim>)` | `FixedSizeList(Float32, dim)`, 1 ≤ dim ≤ `i32::MAX` |
| `[<scalar>]` | `List(scalar)` |
| `enum(v1, v2, …)` | `Utf8` with sorted/dedup'd set of allowed string values |
| `<scalar>?` | Same as scalar but `nullable: true` |
Constraints (body level)
| Constraint | On | Effect |
|---|---|---|
| `@key(p, …)` | node | Primary key; implies index on key columns; `key_property()` returns the first key |
| `@unique(p, …)` | node, edge | Uniqueness across listed columns |
| `@index(p, …)` | node, edge | Build a scalar (BTREE) index on the columns |
| `@range(p, min..max)` | node | Numeric range validation (open ranges allowed) |
| `@check(p, "regex")` | node | Regex pattern validation |
| `@card(min..max?)` | edge | Edge multiplicity; default `0..*`; other forms include `0..1`, `1..1`, `1..*`, etc. |
Edge bodies only allow @unique and @index.
Annotations
- `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
- Known annotations:
  - `@embed` on a Vector property: names the source property whose text gets embedded into this vector at ingest (`embed_sources` map in `NodeType`).
  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.
Catalog construction
- Pass 0: collect interfaces.
- Pass 1: collect nodes, expand `implements`, build constraint and `@embed` mappings, build the Arrow schema for each node table (`id: Utf8` plus all properties; blob columns get `LargeBinary`).
- Pass 2: collect edges, validate that `from_type`/`to_type` exist, normalize edge names case-insensitively for lookup, validate constraints for edges. Edge Arrow schema: `id: Utf8, src: Utf8, dst: Utf8` plus edge properties.
Schema IR & stable type IDs
- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
- Serialized as JSON for diff/migration plans.
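To make the rename gap concrete, here is a minimal sketch of an ID derived from a kind+name hash. Everything in it is an assumption for illustration: the `TypeKind` enum, the `u64` width, and the use of the standard library's `DefaultHasher` are stand-ins, not the hash or ID format the compiler actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the declaration kind.
#[derive(Hash, Clone, Copy, Debug, PartialEq, Eq)]
pub enum TypeKind {
    Interface,
    Node,
    Edge,
}

/// Illustrative only: derive an ID by hashing (kind, name).
/// `DefaultHasher` is used for convenience; its output is not guaranteed
/// to be stable across Rust releases, so a real ID would need a pinned hash.
pub fn stable_type_id(kind: TypeKind, name: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    kind.hash(&mut hasher);
    name.hash(&mut hasher);
    hasher.finish()
}
```

Under this scheme, `stable_type_id(TypeKind::Node, "Person")` and `stable_type_id(TypeKind::Node, "Human")` differ, so a rename looks like a brand-new type unless the migration explicitly carries the old ID forward.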
Schema migration planning
plan_schema_migration(accepted, desired) -> SchemaMigrationPlan { supported, steps[] } with step types:
- `AddType { type_kind, name }`
- `RenameType { type_kind, from, to }`
- `AddProperty { type_kind, type_name, property_name, property_type }`
- `RenameProperty { type_kind, type_name, from, to }`
- `AddConstraint { type_kind, type_name, constraint }`
- `UpdateTypeMetadata { … annotations }`
- `UpdatePropertyMetadata { … annotations }`
- `ChangeEnumConstraint { type_kind, type_name, property_name, to_property_type, code }`: evolve an enum-typed property's value-set (see below)
- `UnsupportedChange { entity, reason }` (forces `supported=false`)
apply_schema() returns SchemaApplyResult { supported, applied, manifest_version, steps } and is gated by an internal __schema_apply_lock__ system branch so concurrent schema applies serialize.
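The relationship between `UnsupportedChange` and the plan's `supported` flag can be sketched as follows. This is a simplified model, not the crate's definitions: the `Step` enum carries only three of the step types, its field types are guesses, and `build_plan` is a hypothetical helper.

```rust
/// Reduced, hypothetical model of a migration step (three variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    AddType { name: String },
    RenameType { from: String, to: String },
    UnsupportedChange { entity: String, reason: String },
}

/// Simplified stand-in for the plan returned by the planner.
#[derive(Debug, Clone, PartialEq)]
pub struct SchemaMigrationPlan {
    pub supported: bool,
    pub steps: Vec<Step>,
}

/// A plan is supported only if it contains no `UnsupportedChange` step.
pub fn build_plan(steps: Vec<Step>) -> SchemaMigrationPlan {
    let supported = !steps
        .iter()
        .any(|s| matches!(s, Step::UnsupportedChange { .. }));
    SchemaMigrationPlan { supported, steps }
}
```

A single unsupported step is enough to flip the whole plan, which lets callers check one flag before attempting an apply.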
Enum evolution
Enums are stored physically as Utf8; the allowed value-set lives only in the schema, not in the column. So enum migrations change catalog metadata, never the data — no table rewrite, and the manifest version does not advance. Four shapes are supported on node and edge properties (interface enum changes are not supported in v1):
| Change | Example | Tier | Behavior |
|---|---|---|---|
| Widen (add variants) | `enum(open, closed)` → `enum(open, closed, archived)` | Safe | Metadata-only; applies unconditionally. No existing row can be invalid. |
| enum → String (loosen) | `enum(open, closed)` → `String` | Safe | Metadata-only; every enum value is a valid String. |
| Narrow (remove variants) | `enum(open, closed, archived)` → `enum(open, closed)` | Validated (OG-MF-105) | Apply scans existing rows; if any holds a removed value it aborts before publish, naming the offending value. No data is dropped: fix or migrate the rows, then re-apply. |
| String → enum (constrain) | `String` → `enum(open, closed)` | Validated (OG-MF-107) | Apply scans existing rows; aborts on the first out-of-set value. |
Reordering variants is a no-op (the value-set is sorted + deduped, so enum(b, a) and enum(a, b) are identical). Changing an enum to a non-String scalar (e.g. enum(...) → I32), or changing nullability/list-ness alongside the value-set, is a genuine type change and stays UnsupportedChange (OG-MF-106). Renaming a variant in place (a value remap, e.g. closed → done) is not yet supported — model it as add-then-narrow with a data migration in between.
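The tiering rules above can be summarized as a small classification function. This is a sketch under stated assumptions, not the planner's code: `PropType`, `Tier`, `enum_of`, `classify`, and `first_out_of_set` are hypothetical names, nullability and list-ness are not modeled, and treating a simultaneous add-and-remove as a narrow (OG-MF-105) is a guess. Only the lint codes come from the text.

```rust
use std::collections::BTreeSet;

/// Hypothetical, reduced property type: nullability and lists are not modeled.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PropType {
    Str,
    Enum(BTreeSet<String>), // BTreeSet keeps the value-set sorted and deduped
    Other(String),          // any non-String scalar, e.g. "I32"
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tier {
    NoOp,
    Safe,
    Validated(&'static str),   // needs a row scan; carries the lint code
    Unsupported(&'static str), // becomes an UnsupportedChange
}

pub fn enum_of(variants: &[&str]) -> PropType {
    PropType::Enum(variants.iter().map(|v| v.to_string()).collect())
}

/// Returns None for changes that do not involve an enum (outside this sketch).
pub fn classify(from: &PropType, to: &PropType) -> Option<Tier> {
    use PropType::{Enum, Other, Str};
    match (from, to) {
        (Enum(a), Enum(b)) if a == b => Some(Tier::NoOp), // reorder or duplicate
        (Enum(a), Enum(b)) if a.is_subset(b) => Some(Tier::Safe), // widen
        // Any removal needs a scan; mixed add+remove is assumed to land here.
        (Enum(_), Enum(_)) => Some(Tier::Validated("OG-MF-105")),
        (Enum(_), Str) => Some(Tier::Safe), // loosen
        (Str, Enum(_)) => Some(Tier::Validated("OG-MF-107")), // constrain
        (Enum(_), Other(_)) => Some(Tier::Unsupported("OG-MF-106")),
        _ => None,
    }
}

/// The validation scan: the first stored value outside the new value-set, if any.
pub fn first_out_of_set<'a>(rows: &[&'a str], allowed: &BTreeSet<String>) -> Option<&'a str> {
    rows.iter().copied().find(|v| !allowed.contains(*v))
}
```

Because the value-set is a sorted set, `enum_of(&["b", "a"])` and `enum_of(&["a", "b"])` compare equal, which is why reordering classifies as a no-op without any special case.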
Destructive drops — --allow-data-loss
DropProperty and DropType steps default to Soft mode: the catalog tombstones the entry but the prior column / dataset remains time-travel-reachable via snapshot_at_version(prev) until omnigraph cleanup runs. Soft drops are reversible.
Pass --allow-data-loss (CLI) or allow_data_loss: true (HTTP POST /schema/apply body, SDK SchemaApplyOptions) to promote every drop in the plan to Hard mode. Hard drops run cleanup_old_versions on the affected dataset immediately after the manifest publish, making the prior column / dataset unreachable. Irreversible.
The flag is honored uniformly across transports — omnigraph schema apply --allow-data-loss, POST /schema/apply { schema_source, allow_data_loss: true }, and apply_schema_with_options(.., SchemaApplyOptions { allow_data_loss: true }) produce identical plans and identical effects.
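As a rough illustration of the promotion rule, the sketch below flips every drop step in a plan from Soft to Hard when the flag is set. The types are simplified stand-ins: the fields on `DropStep`, the `DropMode` enum, and the `resolve_drop_modes` helper are assumptions, and the `SchemaApplyOptions` struct here mirrors only the one field named in the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DropMode {
    Soft, // tombstoned; prior data stays reachable until cleanup
    Hard, // prior versions cleaned up right after publish
}

/// Stand-in carrying only the flag discussed here.
#[derive(Debug, Clone, Copy, Default)]
pub struct SchemaApplyOptions {
    pub allow_data_loss: bool,
}

/// Hypothetical shape of the two drop steps; field names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DropStep {
    DropProperty { type_name: String, property_name: String, mode: DropMode },
    DropType { name: String, mode: DropMode },
}

impl DropStep {
    pub fn mode(&self) -> DropMode {
        match self {
            DropStep::DropProperty { mode, .. } | DropStep::DropType { mode, .. } => *mode,
        }
    }
}

/// Promote every drop to Hard when data loss is allowed; otherwise leave as-is.
pub fn resolve_drop_modes(steps: &mut [DropStep], opts: SchemaApplyOptions) {
    if !opts.allow_data_loss {
        return;
    }
    for step in steps.iter_mut() {
        match step {
            DropStep::DropProperty { mode, .. } | DropStep::DropType { mode, .. } => {
                *mode = DropMode::Hard;
            }
        }
    }
}
```

Resolving the mode in one place, after planning and before apply, is one way to get the property that every transport produces the same plan and the same effects.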