# Developer Docs
**Audience:** contributors, maintainers, and coding agents

This is the contributor-facing entry point. These docs explain architecture,
invariants, implementation contracts, test ownership, and upstream Lance
constraints. User-facing behavior should still be documented through
[docs/user/index.md](../user/index.md) and the relevant public reference docs.
## Required For Every Non-Trivial Change
| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | [invariants.md](invariants.md) |
| Upstream Lance source-of-truth index | [lance.md](lance.md) |
| Existing test coverage and test placement | [testing.md](testing.md) |
## Architecture And Storage
| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | [architecture.md](architecture.md) |
| On-disk layout, manifest schema, URI behavior | [storage.md](../user/storage.md) |
| Direct-publish writes, D2, staged writes, recovery sidecars | [runs.md](runs.md) |
| Query execution, mutation execution, loader flow | [execution.md](execution.md) |
| Index lifecycle and graph topology indexes | [indexes.md](../user/indexes.md) |
| Branch and commit internals | [branches-commits.md](../user/branches-commits.md) |
| Three-way merge implementation and conflicts | [merge.md](merge.md) |
| Diff/change-feed implementation | [changes.md](../user/changes.md) |
| Branch protection policy | [branch-protection.md](branch-protection.md) |
| CODEOWNERS source of truth | [codeowners.md](codeowners.md) |
## Language, Runtime, And Boundaries
| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema-language.md) |
| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/query-language.md) |
| Embedding client and `@embed` integration | [embeddings.md](../user/embeddings.md) |
| Cedar policy surface and server gating | [policy.md](../user/policy.md) |
| Server auth, OpenAPI, endpoint handlers | [server.md](../user/server.md) |
| Error taxonomy and serialization | [errors.md](../user/errors.md) |
| Constants and tunables | [constants.md](../user/constants.md) |
| Transaction model public contract | [transactions.md](../user/transactions.md) |
## Project Operations
| Area | Read |
|---|---|
| CI and release workflows | [ci.md](ci.md) |
| Install and deployment packaging | [install.md](../user/install.md), [deployment.md](../user/deployment.md) |
| Release history | [releases/](../releases/) |
## Active Implementation Plans
Working documents for in-flight feature work. Each document is removed when its work lands.
| Area | Read |
|---|---|
| Schema-lint chassis v1 (MR-694) — `--allow-data-loss`, soft/hard drops | [schema-lint-v1-plan.md](schema-lint-v1-plan.md) |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | [rfc-001-queries-envelope-mcp.md](rfc-001-queries-envelope-mcp.md) |
## Boundary
Developer docs may mention implementation details, stale gaps, upstream Lance
blockers, and review rules. User docs should not require that context unless
the detail changes the public contract.