Commit graph

299 commits

Author SHA1 Message Date
Andrew Altshuler
58cee158d8
schema-lint v1 commit 4: emit + apply DropType { Soft } (#99)
Wire the second half of the dormant Drop* family. Per
docs/dev/schema-lint-v1-plan.md, commit #4 of the schema-lint chassis
v1 series (MR-694). Builds on commit #3 (PR #90, DropProperty Soft).

Planner (schema_plan.rs):
- plan_nodes leftover loop: emit DropType { Node, name, Soft }
  instead of UnsupportedChange (OG-DS-102) for node-type removals.
- plan_edges leftover loop: emit DropType { Edge, name, Soft }
  instead of UnsupportedChange (OG-DS-103) for edge-type removals.

Apply (schema_apply.rs):
- New dropped_tables: BTreeSet<String> accumulator alongside
  added_tables / renamed_tables / rewritten_tables.
- DropType arm in the metadata loop populates dropped_tables for
  Soft mode. Hard mode errors (lands in commit #5 with
  --allow-data-loss).
- New tombstone-emission loop after the rename sidecar build:
  for each dropped table, push to sidecar_tombstones AND populate
  table_tombstones with table_version + 1. The existing manifest
  publish path converts table_tombstones into ManifestChange::Tombstone
  operations — no new manifest plumbing needed.
- Soft DropType has no Phase B per-table write; the tombstone is the
  entire change. Lance dataset files are retained — prior __manifest
  versions still reference them, so time travel + branch-from-snapshot
  can read the dropped table until cleanup_old_versions runs.
- Rides on SidecarKind::SchemaApply per MR-847 (already established
  by commit #3).

Tests:
- Planner unit test plan_emits_soft_drop_for_removed_node_and_edge_types
  asserts both Node and Edge DropType { Soft } emission for the
  Company + WorksAt combined drop, plus no UnsupportedChange.
- Integration test apply_schema_drops_node_and_referencing_edge_softly
  (replaces apply_schema_rejects_dropping_a_node_type): asserts
  plan emission, apply success, current manifest entries absent,
  pre-drop manifest entries present (time-travel reversibility),
  reopen consistency.
- Integration test apply_schema_drops_an_edge_type_softly (replaces
  apply_schema_rejects_dropping_an_edge_type): single edge drop,
  asserts other tables untouched, time-travel reversibility.

Test results:
- cargo test -p omnigraph-compiler --lib: 239 passed (1 new + 238)
- cargo test -p omnigraph-engine --test schema_apply: 11 passed
  (2 converted + 9 unchanged)

Pending for v1 completion:
- Commit #5: --allow-data-loss CLI flag + Hard mode promotion in
  planner + immediate compact_files + cleanup_old_versions for
  both DropProperty and DropType.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 20:25:42 +03:00
Andrew Altshuler
e98347eb7b
schema-lint chassis v1.0: DropProperty Soft + code-tagged diagnostics (MR-694) (#90)
* schema-lint chassis v1 (WIP): tier surfacing + plan doc

First commit of the chassis v1 branch. Lands a small, foundational
slice without behavior change, plus a planning doc that lays out the
remaining 7 commits in sequence so the PR can be reviewed
incrementally.

This commit:

- Adds SchemaMigrationStep::diagnostic() returning the full
  &'static DiagnosticCode (family + tier + severity) for
  UnsupportedChange steps with codes. Renderers can now reach the
  tier without re-implementing the code → tier lookup.

- CLI `omnigraph schema plan` output now displays tier alongside
  code:

    unsupported change on node:Person.age [OG-DS-104, destructive]:
        removing property 'Person.age' is not supported in schema
        migration v1

  Operators see at-a-glance the kind of risk each rejection
  represents — not just the rule identifier.

- No behavior change. All 11 existing schema_apply tests still pass.

Planning doc at docs/schema-lint-v1-plan.md tracks the 7 remaining
commits to bring v1 to feature-complete:

  1. (this commit) Tier surfacing in plan output.
  2. Soft / Hard mode enum on drop steps.
  3. Tombstone fields on catalog IR.
  4. Planner emits DropProperty { Soft } by default.
  5. Apply path implements Soft mode.
  6. Convert PR #62 destructive-rejection tests.
  7. --allow-data-loss flag + Hard mode.
  8. (optional) Tombstone unhide / restore command.

Delete the planning doc when v1 lands. Intentionally checked in to
the WIP branch so the scope is reviewable; not intended as a
permanent doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* schema-lint v1 commit 2: DropMode + dormant Drop* variants

Second commit of the chassis v1 branch. Lands the type-level shape
of soft/hard drops without wiring them up. Variants are reachable
from emitters but the planner doesn't produce them yet; the apply
path returns an explicit not-yet-implemented error if one shows up
via deserialization.

Added:

- `DropMode { Soft, Hard }` — orthogonal to `SafetyTier`. Tier
  classifies the rule's risk class; mode is the operator's intent
  for data treatment.
    - `Soft` → catalog tombstone, data retained. Tier: safe.
    - `Hard` → Lance-level removal. Tier: destructive; will require
      --allow-data-loss to apply (commit 7).

- `SchemaMigrationStep::DropType { type_kind, name, mode }` and
  `SchemaMigrationStep::DropProperty { type_kind, type_name,
  property_name, mode }` variants.

- Re-export `DropMode` from `omnigraph_compiler::DropMode` so
  downstream crates don't reach into the catalog submodule.

- CLI `render_schema_plan_step` arms for both variants, surfacing
  the mode in plan output: `drop property 'Person.age' of node
  'Person' (soft mode)`.

- `apply_schema_with_lock` exhaustive match arm for the two new
  variants that returns `manifest_internal` with a clear
  not-yet-implemented message. If a SchemaIR JSON containing
  Drop{Type,Property} arrives (e.g. from a future tool or hand-
  written), the apply path fails explicitly rather than silently
  misclassifying.

- Two new in-source tests:
    - `drop_steps_round_trip_through_serde` — pins the wire shape
      for all four (variant × mode) combinations.
    - `drop_mode_serde_uses_snake_case` — pins external-tool-
      friendly serialization (`"soft"` / `"hard"`).

Build: clean, only pre-existing warnings.
Tests:
- omnigraph-compiler schema_plan: 6/6 (4 existing + 2 new).
- omnigraph-engine schema_apply: 11/11 (unchanged — planner still
  emits UnsupportedChange for removal paths).

Next commit (commit 3 per docs/schema-lint-v1-plan.md): add the
`tombstoned: bool` fields to NodeIR / EdgeIR / PropertyIR for the
catalog representation of soft-mode tombstones.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plan doc: reframe v1 around Lance native drop_columns

After a substrate audit of the Lance data-evolution guide on
2026-05-13, the v1 plan was simplified. Two key findings:

1. Lance's `drop_columns()` is already metadata-only and reversible
   via time travel until cleanup. No need for a parallel
   `tombstoned: bool` field in our catalog IR — Lance's version
   graph IS the tombstone.

2. The full schema_apply substrate migration (add_columns,
   drop_columns, alter_columns vs. stage_overwrite across all step
   types) is consolidated in MR-948 as a sibling issue. v1 only
   uses the relevant slice (drop_columns for OG-DS-1XX).

Net plan changes:

- Commit 3 (original): tombstone fields on catalog IR → dropped.
  No catalog IR change needed. The Lance drop_columns commit IS the
  tombstone.

- Commit 5 (original): apply path writes tombstoned: true → replaced
  with: apply path calls Dataset::drop_columns([name]).

- Commit 7 Hard mode: stage_overwrite removing the column → replaced
  with: drop_columns + compact_files + cleanup_old_versions. Same
  APIs omnigraph cleanup already uses.

- Commit 8 (original): omnigraph schema unhide → dropped. Time
  travel is the undo (omnigraph snapshot --at <commit>).

Net result: 8 commits → 5 commits. ~250 LoC less surface. More
substrate-aligned.

The chassis types from commit 2 (DropMode enum, DropType /
DropProperty variants) are kept exactly as designed; only the
implementation strategy changed.

The Lance docs quote is included in the doc so future readers see
the substrate behavior cited verbatim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* schema-lint v1 commit 3: emit + apply DropProperty { Soft }

Wire the dormant DropProperty variant end-to-end for the Soft case.
Per docs/schema-lint-v1-plan.md, commit #3 of the schema-lint chassis
v1 series (MR-694).

Planner (schema_plan.rs):
- plan_properties: emit DropProperty { type_kind, type_name,
  property_name, mode: Soft } instead of UnsupportedChange when a
  property exists in accepted but not in desired. Plan is now
  supported = true for drop-only changes.

Apply (schema_apply.rs):
- Route DropProperty { Soft } through rewritten_tables. The existing
  batch_for_schema_apply_rewrite path already iterates the *target*
  schema fields, so a property absent from desired_catalog is
  naturally projected away. The prior Lance version retains the
  dropped column for time-travel reversibility (until cleanup runs).
- DropType still errors (lands in commit #4 with different mechanics:
  __manifest entry removal instead of column projection).
- DropProperty { Hard } still errors (lands in commit #5 with
  --allow-data-loss CLI flag + immediate compact_files +
  cleanup_old_versions).

Tests:
- Planner unit test plan_emits_soft_drop_for_removed_nullable_property
  asserts the variant emission + supported = true + no UnsupportedChange.
- Integration test apply_schema_drops_a_nullable_property_softly_
  preserves_prior_version (replaces the former
  apply_schema_rejects_dropping_a_property_with_data) asserts:
  (a) plan contains DropProperty { Soft }
  (b) apply succeeds + manifest advances + row count unchanged
  (c) current dataset schema lacks the dropped column
  (d) snapshot_at_version(pre_drop) still has the dropped column
  (e) reopen consistency — drop preserved across engine restart

Recovery: rides on SidecarKind::SchemaApply per MR-847. No new
sidecar kind needed; the entire apply path is already sidecar-wrapped.

Substrate alignment: this commit uses the stage_overwrite full-rewrite
path (full_rewrite cost class) rather than Lance native drop_columns
(catalog_only cost class). MR-948 is the follow-up substrate-alignment
refactor that introduces a LanceColumnOp surface and switches the
metadata-only case onto drop_columns. Functional outcome is identical;
cost-class improvement deferred.

Test results:
- cargo test -p omnigraph-compiler --lib: 238 passed
- cargo test -p omnigraph-engine --test schema_apply: 11 passed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: move schema-lint-v1-plan into docs/dev/ + add to index

Post-rebase fixup for the docs split (#93). The plan doc was added
to docs/ at the top level before main reorganized to docs/{user,dev}/.
This moves it into docs/dev/ and adds an entry to docs/dev/index.md
under a new "Active Implementation Plans" section so the
check-agents-md.sh link check passes.

Per the original commit message (617a77d), the plan doc is intentionally
temporary — it will be deleted when v1 lands.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:30:03 +03:00
Ragnor Comerford
5c889f8e42
Update README.md 2026-05-15 18:06:25 -07:00
Ragnor Comerford
e046c193bc
Update README.md 2026-05-15 18:03:40 -07:00
Ragnor Comerford
6abe59bbaa
Update use cases in README with second brains link 2026-05-15 15:40:30 -07:00
Andrew Altshuler
0de5f69d86
docs: drop npx mdrip; use curl | pandoc for full-page fetches (#97)
The previous "fetch the full page" recommendation in AGENTS.md and
docs/dev/lance.md pointed at an unknown-author npm CLI that, on consent,
wrote agent-targeted content into AGENTS.md and modified .gitignore /
tsconfig.json. Source audit was clean of malicious code but the
self-perpetuating prompt-injection pattern combined with a single
maintainer and ~21 downloads/day made it not worth the risk. Switched
to the curl + pandoc command already documented as the no-tool option.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:06:24 +03:00
Andrew Altshuler
60eee78465
docs: split user and developer docs (#93) 2026-05-15 03:45:22 +03:00
Andrew Altshuler
e8d49559c4
branch-protection: allow admin bypass on main (#94)
Flip enforce_admins from true to false. Repo admins can now merge
their own PRs without waiting for code-owner review, by clicking
"Merge without waiting for requirements to be met" once CI is green.
The action is recorded in the audit log.

Non-admins still see full enforcement: code-owner review required,
1 approving review, required status checks must pass.

Rationale: as the solo owner of most CODEOWNERS scopes, the author
cannot satisfy GitHub's "non-self approver" rule on their own PRs,
which made every PR block on a second human. Admin bypass restores
the practical workflow while keeping the protection rules as the
default for everyone else.
2026-05-15 03:32:12 +03:00
Andrew Altshuler
6bad829ed0
branch-protection: declarative policy + apply script (#89)
Branch protection on main, declared as code rather than as opaque
GitHub UI state. Pairs with the CODEOWNERS chassis (#88): once this
PR lands and an admin runs the apply script, every PR to main must
satisfy code-owner review and the listed required checks.

Components:

- .github/branch-protection.json — the policy. Edit this to change
  required checks, review counts, etc. Includes a _comment field for
  human readers; the apply script strips it before PUT.
- scripts/apply-branch-protection.sh — idempotent apply via `gh api`.
  Reads back current state for verification. Supports DRY_RUN=1.
- docs/branch-protection.md — explains the policy, how to apply, how
  to change, why declared as code.
- AGENTS.md topic-index row.

Policy summary:

- Required status checks (strict): Classify Changes, Check AGENTS.md
  Links, Test Workspace, Test omnigraph-server --features aws,
  CODEOWNERS / drift, CODEOWNERS / noedit.
- Required approving reviews: 1, must be a code owner.
- Dismiss stale reviews on new commits.
- Required linear history (squash or rebase merges only).
- No force pushes, no deletions, no admin bypasses.
- Required conversation resolution.

What's NOT in this PR:

- Required signed commits — not yet; maintainers must enroll GPG/SSH
  signing first or merges will block.
- Tag protection for v* tags — separate PR.
- Additional required checks (cargo deny, audit, fmt, clippy, CodeQL,
  schema-lint MR-946) — separate PRs as each lands.
- The script is NOT run by CI. Branch-protection changes are admin
  actions; CI-driven auto-apply would defeat the purpose. Manual
  invocation is the audit point.

How to apply after merge:

  ./scripts/apply-branch-protection.sh

Requires gh-CLI auth with repo-admin permissions.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:38:20 +03:00
Andrew Altshuler
730712b73f
codeowners: yml source of truth + generator + drift CI (#88)
* codeowners: generator + drift CI + initial roles

Source-of-truth approach to CODEOWNERS: yml is hand-edited, CODEOWNERS
is generated and CI-enforced. Every role change is a reviewable PR
with a permanent in-repo audit trail. No GitHub UI clicks, no shadow
state.

Initial roles:

  engineering  @aaltshuler            owns crates/** + default (.github/,
                                       scripts/, Cargo.*, openapi.json,
                                       everything else not docs)

  docs         @aaltshuler @ragnorc   owns docs/**, README.md, AGENTS.md,
                                       CLAUDE.md, SECURITY.md

Per GitHub semantics, multiple owners on a CODEOWNERS line means "any
one satisfies the review" — for docs, either named member can approve.
Strict "N distinct approvers" would need a CI workaround (not wired
today; tracked for future hardening).

Components:

- .github/codeowners-roles.yml — source of truth. Edit this.
- .github/scripts/render-codeowners.py — generator (PyYAML; ~100 LoC).
- .github/CODEOWNERS — generated. CI rejects hand-edits.
- .github/workflows/codeowners.yml — two checks:
  * drift: re-render and assert CODEOWNERS matches.
  * noedit: reject PRs that edit CODEOWNERS without editing the yml.
- docs/codeowners.md — explains the source-of-truth pattern, how to
  change roles, how to add new roles.
- AGENTS.md topic-index row.

What's NOT in this PR:

- Branch protection on main (separate PR; needs `gh api` call against
  the org).
- Required-reviewer enforcement (depends on branch protection landing).
- Required CI status checks (depends on branch protection landing).
- Scheduled rotation (the schedule: block in the yml + a weekly
  workflow). Today's roles are stable; rotation isn't needed yet.
- Linear-as-source-of-truth integration (Approach 4 from the design
  discussion; deferred).

Verified:
- Generator output is deterministic (idempotent re-runs).
- scripts/check-agents-md.sh OK (28 links, 28 docs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* codeowners: fix catch-all ordering (Devin review #88)

Devin caught a real bug: GitHub CODEOWNERS uses "last match wins"
semantics, but the generator emitted the catch-all `*` AFTER specific
patterns. Net effect: `*` won for every file, silently nullifying the
docs role and never routing reviews to @ragnorc.

Fix is one-line — emit the default `*` line before iterating the
specific paths. Also:

- Added a regression assertion in the generator: after rendering, the
  first non-comment line must start with `*` if a default is
  configured. Generator exits non-zero otherwise. Catches the same
  class of mistake in any future refactor.
- Rewrote the yml header comment, which incorrectly stated "keep
  more-specific paths after broader patterns" (correct for GitHub
  semantics but the generator was doing the opposite — so the comment
  read as a description of behavior when it was actually a contradicted
  intention).

Verified by re-rendering: `*` is now line 12, `crates/**` is line 14,
`docs/**` is line 15, etc. README.md matches both `*` and `README.md`;
`README.md` is later → wins → @aaltshuler + @ragnorc both assigned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:26:06 +03:00
Andrew Altshuler
c142dafdf3
schema-lint chassis v0: code-tagged diagnostics (MR-694) (#87)
First slice of the schema-lint chassis. Adds stable `OG-XXX-NNN`
codes to schema-migration rejections so operators can suppress, look
up, and filter on identifiers rather than free-text prose. Atlas-style
chassis adapted to omnigraph's typed-IR substrate (no SQL injection
vector, no per-engine locks, native edge/vector/embedding types).

What's in v0:

- New `omnigraph-compiler/src/lint/` module with:
  - `diagnostic.rs` — Family / SafetyTier / Severity enums covering ten
    families (DS, MF, CD, BC, NM, OW, NL, VE, ED, LK). Only DS and MF
    are populated in this PR.
  - `codes.rs` — 8 DiagnosticCode constants (OG-DS-101..105,
    OG-MF-103, OG-MF-104, OG-MF-106). Five of the eight are wired to
    real emission sites; the other three are reserved.
  - Unit tests for catalog invariants: codes unique, prefix matches
    family, suffixes are 3-digit, destructive defaults to error,
    lookup() works, EMITTED_IN_V0 codes exist in ALL_CODES.

- `SchemaMigrationStep::UnsupportedChange` gains an optional
  `code: Option<String>` field. New `unsupported_error_message()`
  helper prefixes the message with `[code]` when present.

- 5 of 17 existing rejection paths now carry codes:
  - `removing node type` → OG-DS-102
  - `removing edge type` → OG-DS-103
  - `removing property` → OG-DS-104
  - `adding required property without backfill` → OG-MF-103
  - `changing property type` → OG-MF-106
  Remaining 12 paths carry `code: None` and are tagged as future work.

- `schema_apply` surfaces the formatted error (with `[code]` prefix);
  CLI `omnigraph schema plan` renders the code on the
  `unsupported change on <entity>` line.

- PR #62 destructive-rejection tests in `tests/schema_apply.rs` now
  assert on the stable code (`msg.contains("OG-DS-104")`) instead of
  the error-message substring. 11/11 tests pass.

- New `docs/schema-lint.md` documents the v0 catalog + the 10 families
  + Atlas prior art. AGENTS.md index updated.

What's explicitly NOT in v0 (subsequent PRs):

- No severity config in `omnigraph.yaml` (MR-694 §2).
- No `@allow(OG-XXX-NNN, "rationale")` suppression directive (§3).
- No `--allow-data-loss` flag or destructive-tier enforcement.
- No new `SchemaMigrationStep` variants (soft/hard drops, default,
  widen/narrow). MR-700, MR-697 land those.
- No pre-migration checks (MR-941).
- No CD / VE / LK / NM family rules (MR-942..945).
- No CI integration (MR-946).

Tests: 235 compiler tests, 11 schema_apply integration tests, 14
lint module tests, 55 CLI tests — all green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:08:18 +03:00
Ragnor Comerford
f28f644bf2
Merge pull request #83 from ModernRelay/devin/1778623807-remove-orphan-loader-files
Remove orphaned loader/{constraints,embeddings,jsonl}.rs files
2026-05-12 21:03:36 -07:00
Ragnor Comerford
53d41a30b4
Merge pull request #85 from ModernRelay/ragnorc/survey-state
engine: pin stable-row-id preservation through stage_overwrite
2026-05-12 17:24:55 -07:00
Ragnor Comerford
3cc5c6a9a2
chore: gitignore the mdrip/ markdown snapshot cache
npx mdrip writes fetched-page snapshots under mdrip/. The cache is a
local-only working artifact (docs/lance.md is the curated index of
upstream Lance pages we fetch on demand). Keep the cache out of the
tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:02:14 -07:00
Ragnor Comerford
a30d1cc0dc
engine: stage_overwrite sets enable_stable_row_ids explicitly
Defensive — Lance 4.0.0 preserves the source dataset's flag through
Operation::Overwrite even when WriteParams omits it (pinned by the
prior commit's test), but setting it explicitly matches the public
overwrite_dataset path at line 454 and documents the dependency at
the call site so a future refactor doesn't accidentally drop it.

Setting it on a dataset created without stable row IDs is a no-op
per Lance's row-id-lineage spec, so this stays correct for legacy
datasets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 16:57:05 -07:00
Ragnor Comerford
549060f297
tests: pin stable-row-id preservation across stage_overwrite
stage_overwrite is used by schema_apply to rewrite tables when an
additive migration touches data. If Lance Operation::Overwrite ever
stopped preserving the source dataset's enable_stable_row_ids flag,
every schema_apply that triggers a rewrite would silently disable
stable row IDs on the affected tables and downstream readers that
depend on _rowid stability (change-feed validators, index
reconcilers) would observe silent corruption.

Empirically Lance 4.0.0 does preserve the flag through Overwrite
even when WriteParams omits it — but the preservation isn't
documented at the Lance spec level, so pin it here. Any future
behaviour change surfaces as a test failure rather than silent
corruption.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 16:56:58 -07:00
Ragnor Comerford
2121d9f6c3
docs: storage stable-row-ids reflects every dataset
The L1 capability list claimed the flag was enabled "for the
commit-graph and run-registry datasets" — stale. Every Lance
dataset OmniGraph creates has enable_stable_row_ids: true; the
run-registry datasets are gone since MR-771. Replace with a single
paragraph capturing the invariant, the consequences (row-version
columns available, CreateIndex × Rewrite not retryable, Lance reader
version required), the legacy-dataset constraint (one-way at create,
dump-and-reload to migrate), and a pointer to the regression test in
staged_writes.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 16:56:51 -07:00
Ragnor Comerford
8427e705dd
Merge pull request #84 from ModernRelay/ragnorc/read-docs
docs: lead AGENTS.md first principle with integrated-over-time framing
2026-05-12 16:38:52 -07:00
Ragnor Comerford
24c0558180
docs: lead AGENTS.md first principle with integrated-over-time framing
Reframes the first-principle section to lead with Winters' "engineering
is programming integrated over time" as the lens, keeping "minimize
ongoing liability" as the operative directive and folding in "complexity
should be earned." Adds a new Tiebreakers subsection with two rules
that the prior section lacked clean appeals for:

- correctness > simplicity > performance (lexicographic)
- reversibility shapes evidence demand (reversible → prod metrics over
  napkin math over RFCs; irreversible → RFC up-front)

Adds a Hyrum's-Law deny-list entry in both AGENTS.md and
docs/invariants.md §IX: shipping observable behavior is shipping a
contract, even when undocumented.

Net always-on context cost: ~7 lines. No renumbering of §I–VIII
invariants; Hyrum's Law lands in the deny-list to avoid breaking
back-references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 16:27:24 -07:00
Devin AI
327eb821b5 Remove orphaned loader/{constraints,embeddings,jsonl}.rs files
These three files in crates/omnigraph/src/loader/ have no `mod`
declaration anywhere in the workspace and no `#[path = "…"]`
reference. They are not compiled — `touch`-ing them does not trigger
`cargo check` to recompile anything.

Their imports (`crate::catalog::schema_ir`, `crate::error::NanoError`,
`crate::store::manifest::hash_string`, `crate::types::ScalarType`,
`super::super::graph::DatasetAccumulator`) reference modules that no
longer exist in the engine crate, so they could not even be wired in
without further work. They are vestigial code from an earlier
monolithic crate layout. The live functionality is independently
implemented inside crates/omnigraph/src/loader/mod.rs.

These files have been orphaned since the initial public commit.

`cargo check --workspace --all-targets` and
`cargo test --workspace --no-run` both pass with no new warnings.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-12 22:57:20 +00:00
Claude
a9c4423b82 Strengthen cleanup-then-optimize sequencing test with postconditions
Reviewer feedback on PR #62: the original
`cleanup_then_optimize_succeed_in_sequence` only unwrapped both calls
and asserted nothing, so it didn't validate the claimed sequencing
behavior. The concern that motivates the test is that cleanup destroys
version history and optimize on a freshly-cleaned table could trip on
dropped fragment refs or stale manifests.

Rename to `cleanup_then_optimize_preserves_rows_and_table_remains_writable`
and add three concrete postconditions: row counts in both Person and
Company tables survive the sequence; the head remains readable; and a
subsequent merge load still succeeds.
2026-05-12 23:36:01 +03:00
Claude
57a62756c5 Exercise actual type rename in schema-apply rename test
The previous version of `apply_schema_renames_node_type_via_rename_from_and_preserves_rows`
kept the node name as `Person` (`@rename_from("Person")`) and only renamed
a property. The planner only emits a `RenameType` step when the new name
differs from the accepted one, so the test name overstated what it
covered: a regression in `RenameType` step emission or in the
coordinator's table-key remap during type rename could pass while the
test still went green.

Rename the desired node from `Person` to `Human` (with
`@rename_from("Person")`), update the dependent edge endpoints to point
at `Human`, and assert both the `RenameType` step and that the manifest
table key has moved from `node:Person` to `node:Human`.
2026-05-12 23:36:01 +03:00
Claude
e22d468e27 Add maintenance + destructive-migration test coverage
The audit of test coverage flagged three holes:

- `omnigraph optimize` and `omnigraph cleanup` had no integration tests
  (no `maintenance.rs`). Add one covering empty/idempotent edges, the
  policy-validation contract on `cleanup`, and head preservation under
  aggressive policies.
- `apply_schema` only covered I32 -> I64 type-change rejection. Add the
  symmetric narrowing case plus rejections for the other destructive
  shapes (drop property with data, drop node type, drop edge type, add
  required property without backfill) and assert the manifest version
  doesn't advance. Add a positive `@rename_from` case to pin the
  stable-type-id contract preserves rows through a rename.
- `docs/testing.md` was missing `validators.rs` and the new
  `maintenance.rs` from its file table; bump the count and add rows.
2026-05-12 23:36:01 +03:00
devin-ai-integration[bot]
6914e0256e
MR-786: merge-pair truth table with exhaustive op-variant matrix (#81)
* MR-786: merge-pair truth table with exhaustive op-variant matrix

Add crates/omnigraph/tests/merge_truth_table.rs that enumerates every
(left_op, right_op) cell from the operation vocabulary named in the
ticket — {noop, addNode, removeNode, addEdge, removeEdge, setProperty,
dropProperty, addLabel, removeLabel} — and asserts the deterministic
outcome of Omnigraph::branch_merge against a structured oracle.

The matrix is built in a 9x9 match in build_case, so adding a new
OpVariant is a compile-time, fail-on-omission task. Today's mutation
grammar only exposes insert | update set | delete (see
docs/query-language.md), so the 36 cells over the first six ops are
executable and the 45 cells involving dropProperty/addLabel/removeLabel
are recorded as Expected::Unsupported with a note. Each executable cell
spins up a fresh tempdir, applies one mutation per branch, calls
branch_merge, and asserts either:

  * MergeOutcome (AlreadyUpToDate / FastForward / Merged) plus a
    GraphAssert on the affected entities, or
  * an OmniError::MergeConflicts whose entries match the expected
    table_key + MergeConflictKind (row_id is optional because edge
    ULIDs are generated at runtime).

branch_merge is directional, so the (L, R) and (R, L) cells live in
separate entries in the matrix and are run independently — the
op-pair symmetry encoded in build_case serves as the commutativity
oracle without doubling the runtime. End-to-end the suite runs in
~10s on a fresh build, well under the 30s budget asserted at the
bottom of the test.

Also adds a row to docs/testing.md so the test-coverage map points
future agents at this file.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* Use one Omnigraph handle for both branches

Self-review caught that the runner was opening two Omnigraph handles
on the same temp dataset (one for main, a second via Omnigraph::open
for feature). tests/branching.rs uses one handle and passes the branch
name to mutate_branch — same pattern works here and avoids any
cache-coherency surprises between the two handles. Also drops the
post-merge reopen, which only existed to give the second handle a
fresh snapshot.

Runtime drops ~10s -> ~9s.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* Assert exact conflict count, not subset inclusion

cubic and Devin Review both flagged that check_outcome's
Expected::Conflicts arm only enforces want ⊆ got, so a regression that
produces a spurious extra conflict (e.g. emitting both OrphanEdge and
a stray DivergentInsert) would silently pass the truth-table cell.

For a deterministic oracle that's the wrong direction — the cell pins
the exact conflict-artifact set, not a lower bound. Add an
assert_eq!(got.len(), want.len()) before the existence loop. All 36
executable cells still pass; runtime unchanged.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* Subsume 4 conflict tests in branching.rs into truth table

The four `branch_merge_reports_*_conflict` tests
(DivergentUpdate / DivergentInsert / DeleteVsUpdate / OrphanEdge)
were redundant with the deterministic-oracle cells in the new
`merge_truth_table.rs` and only added drift risk.

To preserve the post-conflict invariant that lived in
`branch_merge_reports_divergent_update_conflict` (target unchanged
after a failed merge), the truth-table runner now generalizes it:
on every `Conflicts` cell, main's state is asserted against
`state_after_apply_only(right_op)`. That gives strictly more
coverage than the deleted tests carried, since the invariant now
applies to *all* seven conflict cells, not just one.

The `UniqueViolation` and `CardinalityViolation` cases stay in
`branching.rs` — they're combinatorial (require >1 op per side
with a non-default schema) and out of scope for the pair-wise
truth table.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

* Fix misleading 'Total edges: 0' comment in (AddEdge, RemoveEdge) cell

Devin Review flagged that the comment said 'Total edges: 0' while the
parenthetical math evaluates to 1 (matching `GraphAssert::base()`).
The assertion is correct; only the leading number in the comment was
wrong. Reworded to 'Net edges: … = 1 (matches base)' so the prose
agrees with both the math and the assertion.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>

---------

Co-authored-by: Ragnor <ragnor@modernrelay.com>
Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-12 22:36:01 +03:00
Ragnor Comerford
3bd072c917
docs: add docs/transactions.md — branch-as-transaction explainer (#69)
The architectural rule "no cross-query BEGIN/COMMIT; branches fill that
role" lives in docs/invariants.md §VI.23 but is not surfaced anywhere
user-facing. New users coming from Postgres/MySQL hit the gap when they
realize multiple queries on main are independently atomic, not jointly
atomic.

This page explains the model with worked examples:
* Single-query multi-statement (atomic by default)
* Two separate queries on main (NOT atomic — common surprise)
* Many queries via a branch (atomic at merge)
* Coordinating multiple agents via branch-per-agent

Plus a comparison table to BEGIN/COMMIT, failure-mode rundown, and
"when to use what" decision matrix.

Linked from AGENTS.md "Where to find each topic" between
branches-commits.md and runs.md.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 22:35:57 +03:00
Ragnor Comerford
c9c7c0672e
Update README.md 2026-05-12 08:17:31 -07:00
Ragnor Comerford
c2e3a9e5c3
Add use cases for unified company brain and context graphs 2026-05-12 08:08:08 -07:00
Ragnor Comerford
676c9eab05
Merge pull request #78 from ModernRelay/devin/1778363660-mr-901-blob-branch-merge
Fix branch merge with blob columns
2026-05-12 07:31:04 -07:00
Ragnor Comerford
d6d2763609
Merge pull request #80 from ModernRelay/devin/1778524905-mr-923-merge-restore-refresh
Fix MR-923: refresh restored coordinator on merge Err path
2026-05-11 15:55:43 -07:00
Devin AI
725d41205e Drop redundant server-level regression test
The matrix cell d:merge×change:into-target already exercises this
race: pre-fix it flakes ~20% on shared-CPU hardware (sentinel 409s);
post-fix it passes 100% regardless of which side of the racing pair
returns first. That flake-to-stable transition is the regression
signal.

The replacement test (concurrent_merge_clean_409_does_not_poison_next_
change_on_target) tried to sharpen this by looping until the clean-
409 path fired and then strictly requiring it. On fast CI hardware
the race window never opens in 50 iterations, which made the strict
variant fail in CI despite passing 10/10 locally. The bug genuinely
needs a real concurrent writer to advance on-disk manifest during
the swap window — a deterministic failpoint can't substitute because
forcing the merge body to Err without a real concurrent writer leaves
no cache-vs-disk drift to validate.

Reverting to the matrix cell as the sole regression coverage. Updated
the comment in merge.rs accordingly.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 21:57:47 +00:00
Devin AI
a6c7e5fab5 Use if-let shape for refresh outcome handling
Switch from match-on-Result to if-let-Err so the refresh outcome and
merge_result outcome are checked independently, making the intent
clearer: 'attempt refresh; on Ok-merge-with-refresh-error propagate;
on Err-merge-with-refresh-error log and surface the original merge
error'. No semantic change — both shapes were valid (wildcard patterns
don't move the scrutinee) — but the if-let form sidesteps a
needs-second-reading question raised in code review.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 21:50:26 +00:00
Devin AI
7d1a40102c Address review feedback
merge.rs: best-effort refresh on the Err path so a refresh-time
storage error doesn't replace the merge body's structured error
(typically the manifest_conflict that the HTTP layer maps to a 409
with a structured payload) with a less informative one. Ok-path
behavior is unchanged — there a refresh failure is propagated so the
caller knows the coord's cache is unsynced.

server.rs: bump MAX_ITERATIONS to 50 and assert at the end that the
named clean-409 path actually fired at least once. With ~20% per-iter
rate on shared-CPU CI (per the original MR-923 repro), P(no hit in
50) is < 0.002%. Without this assertion the test silently degraded
to exercising only the 200-merge path — covered already by the
matrix cell.

Both changes per Devin Review + cubic comments on PR #80.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 21:35:18 +00:00
Devin AI
b7353e1dc7 Use refresh_coordinator_only to avoid racing branch_merge's sidecar
The previous fix used `self.refresh()` to sync the restored
coordinator's cache after the swap-restore window. `refresh` runs the
`RollForwardOnly` recovery sweep — which, on the merge Err path with a
phase-B failure (sidecar written, per-table HEAD advanced, manifest
publish skipped), would observe the merge's own in-flight sidecar and
close it here.

That violates the contract documented on `Omnigraph::refresh`:

> Engine-internal callers that already hold an in-flight sidecar
> (e.g. `schema_apply` mid-write) MUST use `refresh_coordinator_only`
> to avoid the recovery sweep racing their own sidecar.

The post-restore step's purpose is to sync the coord cache with disk,
not to run recovery, so `refresh_coordinator_only` is the right
primitive on both paths. CI surfaced this via
`branch_merge_phase_b_failure_recovered_on_next_open` in
`crates/omnigraph/tests/failpoints.rs`, which asserts the sidecar
persists after the failpoint fires.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 21:09:44 +00:00
Devin AI
e91d5615c6 Fix MR-923: refresh restored coordinator on merge Err path
branch_merge_impl swaps the coordinator for the merge target, runs the
merge body, then restores the original coordinator. A concurrent /change
on the same target during this window publishes against the swapped
coord, advancing on-disk manifest state that the restored coord doesn't
see.

The post-restore refresh was previously gated on merge_result.is_ok(),
so the clean-409 path (merge body's post_queue_snapshot drift check
returning a recoverable conflict) left the restored coord's cached
snapshot stale relative to disk. The next sequential /change seeded its
publisher expected_versions from that stale cache and 409'd with
ExpectedVersionMismatch — a non-retryable conflict surfaced to a caller
with no concurrent writer of their own.

Refresh on both Ok and Err paths so cached state cannot diverge from
the manifest across the swap-restore window.

Add a focused regression test
(concurrent_merge_clean_409_does_not_poison_next_change_on_target) that
loops the cell-d scenario until the clean-409 branch fires and asserts
the follow-up sentinel succeeds in that branch specifically.

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 20:31:18 +00:00
Devin AI
fca2b74dee Materialize external blob URIs during branch merge
Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-11 12:54:04 +00:00
Devin AI
da89e18e62 Merge main into blob merge fix
Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-10 21:55:02 +00:00
Ragnor Comerford
19e9292ec0
Merge pull request #75 from ModernRelay/ragnorc/mr-686-lance
Per-table writer queues + per-actor admission + op-kind-aware version check
2026-05-10 23:50:56 +02:00
Devin AI
7a338a8223 agents: keep guide short for context 2026-05-10 14:41:02 +00:00
Devin AI
4eb865b340 docs: expand 0.4.2 release notes 2026-05-10 14:37:58 +00:00
Devin AI
e44a4704eb docs: fix admission gating description 2026-05-10 14:16:26 +00:00
Devin AI
a42d178119 release: prepare omnigraph 0.4.2 2026-05-10 14:02:28 +00:00
Devin AI
31b8ffe7b5 engine: inline-delete sidecar covers version-mismatch check 2026-05-10 10:37:46 +00:00
Devin AI
01660faa26 Tighten blob descriptor validation
Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-10 09:28:44 +00:00
Devin AI
16ac166059 Fix branch merge with blob columns
Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
2026-05-09 22:33:29 +00:00
Devin AI
6a3f0677ae server: drop unwired try_admit_rewrite / 503 admission surface 2026-05-09 20:58:17 +00:00
Devin AI
4bb7964af9 tests: matrix cell k asserts post-reopen row count 2026-05-09 20:16:44 +00:00
Devin AI
708e170dc5 engine: branch-merge revalidates target snapshot under queue 2026-05-09 20:16:12 +00:00
Devin AI
a6d244e648 engine: strict drift check uses read-time pin 2026-05-09 20:06:25 +00:00
Ragnor Comerford
f9a0f31f80
server: drop 503 from OpenAPI on admission-gated endpoints (unreachable)
Cursor Bugbot LOW on commit 3ad359d: try_admit_rewrite is defined and
tested but no HTTP handler calls it; the six handler OpenAPI
annotations declared status = 503 (added in 8e1a8e7) but try_admit
(the only path handlers invoke) returns 429 only. 503 was unreachable.

Fix: remove (status = 503, ...) from the six handler OpenAPI
annotations and regenerate openapi.json. Kept as forward-looking
infrastructure: try_admit_rewrite, global rewrite semaphore,
RejectReason::GlobalRewriteExhausted, ApiError::ServiceUnavailable,
the 503 branch in IntoResponse, --global-rewrite-cap, and
OMNIGRAPH_GLOBAL_REWRITE_MAX. When a future commit wires
try_admit_rewrite into a handler, the 503 OpenAPI annotation lands
alongside that wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:54:24 +02:00
Ragnor Comerford
3ad359db8b
tests: admission test uses new_with_workload, drops env mutation + #[serial]
Migrates `ingest_per_actor_admission_cap_returns_429` from env-var
override to direct `WorkloadController::new(1, ...)` construction via
`AppState::new_with_workload`. Removes the `EnvGuard` and the
`#[serial]` annotation that paired with it.

Why correct by design (AGENTS.md rule 9): the previous round's matrix
fix (commit 8bd9a5f) shielded the matrix from this test's env
mutation, but the broader bug class — "test A's process-wide env
mutation can leak into any test B that calls
`AppState::open` / `WorkloadController::from_env()`" — was still
reachable by any future test that didn't think to opt out. Closing
the class at the source: this test no longer mutates global state at
all, so no other test needs to defend against it.

Net effect:
- This test no longer needs `#[serial]` (was the only reason it was
  marked) — runs in parallel with the rest of the suite.
- The matrix's defensive `with_defaults()` construction (commit
  8bd9a5f) remains correct but is no longer required for correctness;
  it's now a "belt and suspenders" guard against any FUTURE
  env-mutating test.

Verified locally: both tests pass when run together; full server
suite (44 tests) green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:35:41 +02:00