openai-alias host (Cursor): OMNIGRAPH_EMBED_PROVIDER=openai now defaults its base URL to https://api.openai.com/v1 (model text-embedding-3-large), while openai-compatible/unset keep the OpenRouter gateway default. The default is derived from the alias rather than the Provider enum, so an operator's stated intent can no longer be silently routed to OpenRouter; an explicit OMNIGRAPH_EMBED_BASE_URL still wins. New test from_env_openai_alias_uses_openai_host_not_openrouter. single model source of truth (Cursor): remove the EmbedSpec.model field. The provider config is authoritative for the model, so a spec can no longer declare a model that is silently ignored while the API uses another (the wrong-space-vectors footgun); the embed summary reports the model actually resolved. Correct by construction rather than a truthful-echo patch. stale @embed docs (Codex): docs/user/schema/index.md and docs/dev/execution.md still claimed @embed embeds at ingest; corrected to the real contract (catalog annotation; vectors supplied or pre-filled by 'omnigraph embed'). Also documented the openai-vs-OpenRouter base default in embeddings.md. Greptile's RFC-status note is declined: the repo lifecycle keeps an RFC Status: Proposed while its PR is open and flips to Accepted on merge.
4 KiB
Schema Language (.pg)
Top-level declarations
interface <Name> { property* }— reusable property contracts.node <Name> [implements <Iface>, ...] { property* | constraint* }edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }- Comments: line
//and block/* … */.
Property declarations
<ident>: <TypeRef> [annotation*]
Built-in scalar types
| Scalar | Arrow type |
|---|---|
String |
Utf8 |
Blob |
LargeBinary |
Bool |
Boolean |
I32 / I64 |
Int32 / Int64 |
U32 / U64 |
UInt32 / UInt64 |
F32 / F64 |
Float32 / Float64 |
Date |
Date32 |
DateTime |
Date64 |
Vector(<dim>) |
FixedSizeList(Float32, dim), 1 ≤ dim ≤ i32::MAX |
[<scalar>] |
List(scalar) |
enum(v1, v2, …) |
Utf8 with sorted/dedup'd set of allowed string values |
<scalar>? |
Same as scalar but nullable: true |
Constraints (body level)
| Constraint | On | Effect |
|---|---|---|
@key(p, …) |
node | Primary key; implies index on key columns; key_property() returns the first key |
@unique(p, …) |
node, edge | Uniqueness across listed columns |
@index(p, …) |
node, edge | Build a scalar (BTREE) index on the columns |
@range(p, min..max) |
node | Numeric range validation (open ranges allowed) |
@check(p, "regex") |
node | Regex pattern validation |
@card(min..max?) |
edge | Edge multiplicity — default 0..*; 0..1, 1..1, 1..*, etc. |
Edge bodies only allow @unique and @index.
Annotations
@<ident>or@<ident>(<literal>)on any declaration or property.- Known annotations:
@embed("source_property")on a Vector property — records which String property is the embedding source for query-timenearest($v, "string")auto-embedding. It is a catalog annotation; it does not populate the vector at ingest (supply vectors in load data, or pre-fill via the offlineomnigraph embedpipeline).@description("…"),@instruction("…")on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.
Table layout
- Each node type compiles to a table with an
id: Utf8column plus all declared properties (blob columns are stored asLargeBinary);implementsclauses expand the interface's properties into the node. - Each edge type compiles to a table with
id: Utf8, src: Utf8, dst: Utf8plus the edge's own properties. Edge endpoint types (from/to) must exist, and edge names are matched case-insensitively.
Schema migration planning
A migration plan compares the accepted schema against the desired one and reports whether the change is supported plus the ordered steps it requires:
- Add a type
- Rename a type
- Add a property
- Rename a property
- Add a constraint
- Update type or property metadata (annotations)
- Unsupported change (reports the entity and reason; forces the plan to unsupported)
Applying a plan reports whether it was supported, the steps applied, and the resulting manifest version. Concurrent schema applies serialize so they can't interleave.
Destructive drops — --allow-data-loss
DropProperty and DropType steps default to Soft mode: the catalog tombstones the entry but the prior column / dataset remains time-travel-reachable via snapshot_at_version(prev) until omnigraph cleanup runs. Soft drops are reversible.
Pass --allow-data-loss (CLI) or allow_data_loss: true (HTTP POST /schema/apply body, SDK SchemaApplyOptions) to promote every drop in the plan to Hard mode. Hard drops run cleanup_old_versions on the affected dataset immediately after the manifest publish, making the prior column / dataset unreachable. Irreversible.
The flag is honored uniformly across transports — omnigraph schema apply --allow-data-loss, POST /schema/apply { schema_source, allow_data_loss: true }, and apply_schema_with_options(.., SchemaApplyOptions { allow_data_loss: true }) produce identical plans and identical effects.