Commit graph

9 commits

Author SHA1 Message Date
Andrew Altshuler
cb80fa40f1
exec/query: structured Expr pushdown via Scanner::filter_expr (unblocks CompOp::Contains) (#113)
* exec/query: pushdown IR filters via DataFusion Expr (Scanner::filter_expr)

Switches `execute_node_scan` from string-flattened Lance SQL pushdown
(`build_lance_filter` + `scanner.filter(&str)`) to structured DataFusion
Expr pushdown (`build_lance_filter_expr` + `scanner.filter_expr(Expr)`).

## What this enables

1. **`CompOp::Contains` now pushes down.** `ir_filter_to_sql` returned
   `None` for list-contains (the comment said *"Can't pushdown list
   contains"*) because string SQL can't easily express it. With Expr,
   it lowers to DataFusion's `array_has(col, value)` builtin via the
   `nested_expressions` feature, and pushes down to Lance's scan layer
   the same way Eq/Lt/etc. do. Pinned by the new regression test
   `end_to_end::ir_filter_with_list_contains_pushes_down`.

2. **DataFusion 53's optimizer rules now reach our predicates.** Once
   the Expr lands at the Lance scanner, DF's planner runs:
   - `IN`-list vectorized eq kernel (DF #20528)
   - `PhysicalExprSimplifier` (DF #20111)
   - CASE WHEN x THEN y ELSE NULL shortcut (DF #20097)
   - Push limit into hash join (DF #20228)
   None of these were applicable before because the string SQL path
   short-circuited the optimizer.

## Scope

This is one of three string-flattened pushdown sites; the other two
(`hydrate_nodes`/Expand pushdown at query.rs:771-796 and the mutation
delete path in `exec/mutation.rs::predicate_to_sql`) stay on the SQL
string path for now:

- The Expand pushdown still serializes through `hydrate_nodes`'s
  `extra_filter_sql: Option<&str>` parameter. Migrating it changes the
  `TableStorage` trait surface (`scan_stream(filter: Option<&str>)` →
  `Option<Expr>`) and the cascading call sites — out of scope for this
  MR.
- The mutation delete predicate still goes through `Dataset::delete(&str)`
  in Lance 6.0.1. MR-A (delete two-phase via Lance #6658, gated on the
  Lance v7 bump per issue #112) will migrate that path to
  `DeleteBuilder::execute_uncommitted` taking an Expr.

The existing `ir_filter_to_sql` / `ir_expr_to_sql` / `literal_to_sql`
helpers stay in place to serve the remaining string-SQL consumers
(mutation predicates). They get retired when the other call sites
migrate.

## Cargo

Enables the `nested_expressions` feature on the `datafusion` workspace
dep. Lance already pulls in `datafusion-functions-nested` transitively
(it's listed in their feature set), so this just exposes the
`datafusion::functions_nested::expr_fn::array_has` re-export. No
transitive dep change (Cargo.lock unchanged).

## Tests

- New: `ir_filter_with_list_contains_pushes_down` — pins the case that
  was previously impossible (`ir_filter_to_sql` returning `None`).
- 906/906 workspace tests still pass.
- 417/417 engine integration tests pass (was 416 + the new one).
- 19/19 failpoints (recovery canary).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: pin rustfs/rustfs to 1.0.0-beta.3 (last known-good before creds-policy break)

The RustFS S3 Integration job started failing 2026-05-23 with all 3
tests panicking on the first PUT:

  HTTP error: error sending request

The "Dump RustFS logs on failure" step revealed the container was
dying at startup:

  [FATAL] Server encountered an error and is shutting down:
  Default root credentials are not allowed on non-loopback listeners;
  set RUSTFS_ACCESS_KEY and RUSTFS_SECRET_KEY to non-default values,
  bind to loopback, or set RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true
  for local development only

`rustfs/rustfs:latest` was updated 2026-05-21 (1.0.0-beta.4) with a
credentials-policy check that rejects `rustfsadmin`/`rustfsadmin` as
"default" values. PR #111 passed yesterday because it ran against
beta.3; today's runs against beta.4 fail at container startup.

This is unrelated to PR #113's Expr-pushdown refactor — the bump
just happened to hit the same week.

Pin to 1.0.0-beta.3 (2026-05-14, last tag before the change). The
right long-term fix is one of:
  - Rotate the CI creds to less-default values (less coupling to
    RustFS's "default" set definition)
  - Set `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` per the
    error message
  - Use a workflow service container with controlled lifecycle

Deferred — pinning is the minimal restore. Also incidentally
documents *which* version we tested against, which `:latest` never
did.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:47:33 +01:00
Andrew Altshuler
3551e0d40e
chore(lance): bump 4.0.0 → 6.0.1 (DataFusion 52→53, Arrow 57→58) (#111)
* tests: add lance_surface_guards pre-flight pins for the v6 bump

Land 8 named guards in a new test file that pin Lance API surfaces
OmniGraph relies on. Each guard turns a silent-break risk (variant
rename, struct restructure, async-flip) into a red CI bar instead of
runtime drift.

Guards (mapped to the silent-break inventory from the v6 migration plan):

  Runtime (#[tokio::test]):
  1. lance_error_too_much_write_contention_variant_exists — pins the
     variant referenced by db/manifest/publisher.rs::map_lance_publish_error.
  2. manifest_location_field_shape — pins .path/.size/.e_tag/.naming_scheme
     types and ManifestLocation accessor returning &Self (the access
     pattern at db/manifest/metadata.rs:84-88).
  6. write_params_default_does_not_set_storage_version — confirms our
     explicit V2_2 pin remains load-bearing (blob v2 requirement).

  Compile-only async fns (#[allow(...)] + unimplemented!() placeholders;
  never run, but cargo build --tests enforces the API shape):
  3. checkout_version + restore chain — pins the recovery rollback hammer
     at db/manifest/recovery.rs:505-522.
  4. DatasetBuilder::from_namespace().with_branch().with_version().load()
     — pins the namespace builder chain at db/manifest/namespace.rs:162-174.
  5. MergeInsertBuilder fluent chain — pins the manifest CAS at
     db/manifest/publisher.rs:370-391, including the return shape
     (Arc<Dataset>, MergeStats).
  7. compact_files(&mut ds, CompactionOptions, None) — pins
     db/omnigraph/optimize.rs:107.
  8. DeleteResult { new_dataset, num_deleted_rows } — pins the inline
     delete result shape (MR-A will repurpose this guard to the staged
     two-phase variant once Lance #6658 migration lands).

This is commit 1 of the chore/lance-6.0.1 migration. Cargo bump
follows in commit 2 (will trigger the guards under v6 if any surface
drifted).

Per the migration plan at ~/.claude/plans/shimmering-percolating-duckling.md
(written this session). Two guards from the plan deferred to follow-up:
  - manifest_cas_returns_row_level_contention_variant (full publisher
    race integration test — needs harness scaffolding)
  - table_version_metadata_byte_compatible_with_v4 (TableVersionMetadata
    is pub(crate); requires test reach extension).

Verified on v4: cargo test -p omnigraph-engine --test lance_surface_guards
passes 3/3 runtime tests; cargo build -p omnigraph-engine --tests
compiles all 5 compile-only guards clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(deps): bump Lance 4.0.0 → 6.0.1, DataFusion 52 → 53, Arrow 57 → 58

The Cargo bump itself. Source is intentionally untouched — this commit
will not compile. The compile errors are the work-list for subsequent
commits on this branch.

Lance updates: lance + 7 sub-crates 4.0.0 → 6.0.1. Transitive churn:
  + lance-tokenizer v6.0.1 (vendored tokenizer per Lance PR #6512)
  + object_store 0.13.x (Lance 6 brings it transitively; our explicit
    pin stays at 0.12.5 for now — revisit in stages if diamond bites)
  - tantivy* crates (replaced by lance-tokenizer)

Compile error landscape on this commit (11 errors):
  • 1× E0432: `lance_index::DatasetIndexExt` import (Lance PR #6280
    moved it to lance::index). Sites: table_store.rs:20,
    db/manifest.rs:37 (the second site was missed by the pre-flight
    inventory).
  • 8× E0599: `create_index_builder` / `load_indices` missing on
    `lance::Dataset` — all downstream of the DatasetIndexExt move.
    Once the import is corrected on table_store.rs and db/manifest.rs,
    these resolve automatically.
  • 2× E0063: missing field `is_only_declared` in `DescribeTableResponse`
    initializer at db/manifest/namespace.rs:221, 364. New Lance
    namespace field per the v5 namespace restructure (PR #6186).

Surface guards (lance_surface_guards.rs, commit d571fa8) all still
compile + the 3 runtime ones pass on v6 — none of the silent-break
surfaces drifted. That's the load-bearing observation: the publisher
CAS chain, ManifestLocation field shape, checkout_version/restore,
DatasetBuilder fluent chain, MergeInsertBuilder return shape,
WriteParams::default, compact_files signature, and DeleteResult
fields are all v6-stable.

Next commits address the 11 errors per the migration plan stages
3-8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* imports: move DatasetIndexExt to lance::index (Lance PR #6280)

Lance 5.0 (PR #6280) moved `DatasetIndexExt` out of `lance-index` into
`lance::index`. `is_system_index` and `IndexType` stayed in `lance-index`.

Mechanical update of 6 import sites:
  crates/omnigraph/src/table_store.rs:20 — split into two `use` lines
  crates/omnigraph-server/tests/server.rs:10 — was traits::DatasetIndexExt
  crates/omnigraph/tests/search.rs:6
  crates/omnigraph/tests/branching.rs:7
  crates/omnigraph/tests/failpoints.rs:467
  crates/omnigraph-cli/tests/cli.rs:3 — was traits::DatasetIndexExt

All 9 E0599 cascading errors on .create_index_builder / .load_indices
resolve once the trait is back in scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* namespace: add is_only_declared field to DescribeTableResponse

Lance namespace 6.0.0 added `is_only_declared: Option<bool>` to
`DescribeTableResponse` (lance-namespace-reqwest-client 0.7+ via the
v5.0 namespace API restructure, Lance PR #6186). Set to `Some(false)`
because every table BranchManifestNamespace returns from describe_table
is materialized — the manifest snapshot only includes entries for
tables we've already opened via Dataset::open.

Two sites in db/manifest/namespace.rs (BranchManifestNamespace +
StagedTableNamespace impls of LanceNamespace::describe_table).

Closes the last two compile errors from the v6 bump in the engine lib.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cargo: add lance to omnigraph-cli + omnigraph-server dev-deps

Stage 3 moved DatasetIndexExt imports from `lance-index` to `lance::index`
in the cli and server test crates. Both crates only had `lance-index`
in their dev-dependencies; add `lance` alongside so the new path
resolves.

This is the last compile-error fix from the v6 bump — `cargo build
--workspace --tests` is now green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: refresh Lance alignment audit for v6.0.1; bump surveyed version

Per CLAUDE.md maintenance rule 2 (same-PR docs):

- docs/dev/lance.md: replace the v4.0.1 alignment audit stanza with
  the v6.0.1 audit. Captures every v5/v6 finding from this PR (the
  DatasetIndexExt move, DescribeTableResponse.is_only_declared,
  MergeInsertBuilder return shape, ManifestLocation field shape,
  LanceFileVersion::default flip, file-reader async, tokenizer
  vendor, Lance #6658/#6666/#6877 status). Cross-references each
  guard in tests/lance_surface_guards.rs.

- AGENTS.md: bump "Storage substrate: Lance 4.x" → "Lance 6.x".
  Note: surveyed crate version stays at 0.4.2 — substrate version
  bumps are independent of OmniGraph's release version.

- crates/omnigraph/src/storage_layer.rs: update the trait module-level
  doc-comment to reflect that Lance #6658 closed 2026-05-14 and
  delete_where two-phase migration is MR-A (the next follow-up).
  #6666 stays open; create_vector_index inline residual stays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* tests: silence clippy::diverging_sub_expression on compile-only guards

The five `_compile_*` async fns in lance_surface_guards.rs use
`let ds: Dataset = unimplemented!()` as a placeholder so type inference
can chase the method chain we want to pin, without ever running the
function. Clippy's `diverging_sub_expression` lint flags this pattern
because the RHS diverges; that's the entire point. Added to the
per-fn `#[allow(...)]` list, alongside dead_code / unreachable_code /
unused_variables / unused_mut already there.

No behavior change. cargo test -p omnigraph-engine --test
lance_surface_guards still 3/3 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: correct #6658 status — closed but API ships in Lance v7.x, not v6.0.1

The audit stanza in docs/dev/lance.md and the storage_layer.rs trait
doc-comment both implied the public DeleteBuilder::execute_uncommitted
API shipped with Lance 6.0.1. It did not. Issue #6658 closed
2026-05-14, but binary search across the release stream confirms:

  v6.0.1             no pub async fn execute_uncommitted on DeleteBuilder
  v6.1.0-rc.1       
  v7.0.0-beta.5     
  v7.0.0-beta.10     first appearance
  v7.0.0-rc.1       

So MR-A (delete two-phase migration) is gated on the Lance v7.x bump,
not on this PR. v7.0.0-rc.1 dropped 2026-05-21; GA likely within a
week.

No behavior change. Doc-only correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(lib): bump recursion_limit to 256 — Lance 6 trait depth on Linux

Lance 6's heavier trait surface around futures/streams in storage_layer.rs's
staged-write API pushes the rustc trait-resolution recursion limit past
the default 128 on Linux builds. CI on PR #111 surfaced this in both
`Test Workspace` and `Test omnigraph-server --features aws`:

  error: queries overflow the depth limit!
    = help: consider increasing the recursion limit by adding a
      `#![recursion_limit = "256"]` attribute to your crate (`omnigraph`)
    = note: query depth increased by 130 when computing layout of
      `{async block@crates/omnigraph/src/storage_layer.rs:697:5: 697:10}`

(The async block is `stage_create_btree_index`'s body — its return type
is several layers of `impl Future<Output=Result<StagedHandle>>` deep on
top of Lance's own builder return types.)

Local macOS builds happened to short-circuit before tripping the limit,
which is why this didn't surface during the v6 bump sequence. The fix
rustc itself suggests is one line at the crate root.

No behavior change. Revisit if a future Lance bump stops needing it.

Verified: `cargo build --locked -p omnigraph-server --features aws`
compiles clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 00:42:29 +01:00
Andrew Altshuler
9973683261
policy: chassis core — omnigraph-policy crate + Omnigraph::enforce() (MR-722) (#102)
PR #2 of the policy chassis series (PR #1 = MR-731, merged in #101).
The structural fix that moves Cedar enforcement from HTTP-only to
engine-wide. apply_schema is the proof-of-concept writer; PR #3 fans
the enforce() call out to the remaining six (mutate_as, load,
ingest_as, branch_create_from, branch_delete, branch_merge).

## What lands

### New crate: omnigraph-policy

The 844-line policy.rs moves from `omnigraph-server` into a new
`omnigraph-policy` workspace crate so both engine and server can
depend on it. Cedar dependency moves with it. The server's policy.rs
becomes a re-export shim (`pub use omnigraph_policy::*`) so existing
`omnigraph_server::PolicyAction` etc. paths keep working — CLI and
test consumers don't have to migrate in one go.

### New trait: PolicyChecker

```rust
pub trait PolicyChecker: Send + Sync {
    fn check(&self, action: PolicyAction, scope: &ResourceScope,
             actor: &str) -> Result<(), PolicyError>;
}
```

`PolicyEngine` (Cedar-backed) implements it. `Omnigraph::with_policy()`
takes `Arc<dyn PolicyChecker>`. Engine tests mock the trait without
spinning up Cedar. MR-725 will extend the trait with `predicate_for()`
for query-layer pushdown — additive, no call-site changes.

### New enum: ResourceScope

Four variants — Graph, Branch, TargetBranch, BranchTransition —
mapping cleanly to today's `(branch, target_branch)` shape on
PolicyRequest via `to_branch_pair()`. Each engine writer picks the
variant that matches the existing HTTP-layer convention so engine
and HTTP evaluate the same Cedar decision.

**Invariant**: ResourceScope stays at branch granularity. Per-type
and per-row scope are MR-725's territory, not engine-layer's.
Adding Type/Row variants here creates two places per-type policy
can be evaluated, which can drift. See chassis design refinements
comment on MR-722 (2026-05-17).

### Omnigraph::with_policy() + enforce()

* New `policy: Option<Arc<dyn PolicyChecker>>` field on Omnigraph,
  None by default (preserves embedded/dev no-enforcement mode).
* `with_policy(self, checker)` setter — builder-style, consumes self.
* `enforce(action, scope, actor)` — the gate. When policy is None,
  no-op. When policy is Some AND actor is None, hard error — silent
  bypass via "I forgot the actor" is exactly the footgun this gate
  is here to prevent.

### apply_schema_as: first writer wired

* New public method `apply_schema_as(source, options, actor)` that
  calls `enforce(SchemaApply, TargetBranch("main"), actor)` before
  acquiring the schema-apply lock or doing any other work.
* Existing `apply_schema(source)` and `apply_schema_with_options(...)`
  delegate to it with actor=None (no-actor variants).
* HTTP handler `server_schema_apply` updated to call apply_schema_as
  with the resolved actor. AppState construction injects the
  PolicyEngine into Omnigraph via `with_policy`. HTTP-layer
  authorize_request still fires first; the engine gate is the
  redundant-but-correct backstop and the only path that protects SDK
  / embedded callers. PR #3 removes the HTTP redundancy.

### OmniError::Policy

New error variant for engine-layer policy denial / evaluation
failure. ApiError::from_omni maps it to 403.

### MR-724 Admin action — Option A reservation

PolicyAction::Admin kept in the enum with a load-bearing doc
comment naming its future consumers (hot reload, audit log query,
approvals list per MR-726 / MR-732 / MR-734). No enforce(Admin, ...)
call site exists yet — the variant is reserved so the action
vocabulary is complete from chassis day one. MR-724 closes when
the first consumer surface ships.

### New SDK-side integration test

`crates/omnigraph/tests/policy_engine_chassis.rs` — four tests
covering:
* Policy denies for unauthorized actor → OmniError::Policy
* Policy permits for authorized actor → apply succeeds
* Policy installed + no actor → hard error (forget-the-actor footgun)
* No policy → no-op (embedded/dev default still works)

These exercise the engine path directly — no HTTP layer involved.

## Test results

- cargo test --workspace --locked --no-fail-fast: 851 passed, 0 failed
  * 45 server tests (existing) pass
  * 14 schema_apply tests (existing) pass
  * 4 new chassis tests pass
  * 60 OpenAPI tests pass (no HTTP API surface changes)
  * No regressions across the workspace

## Architectural decisions baked in

Per MR-722 chassis design refinements comment (2026-05-17):

1. PolicyChecker is a trait, not just a concrete. Engine and server
   consume the trait. MR-725 adds predicate_for() additively.
2. ResourceScope stays at branch granularity. No Type/Row variants.
3. Coarse-vs-fine framing pinned: engine-layer is action gate;
   query-layer (MR-725) is predicate gate. Both backed by same Cedar
   engine; non-overlapping responsibilities.
4. Admin action reserved for policy-management surfaces (MR-724
   Option A).

## Pending follow-ups (PR #3+)

- Fan-out enforce() to mutate_as, load, ingest_as, branch_create_from,
  branch_delete, branch_merge (PR #3).
- Remove HTTP-layer authorize_request redundancy once engine gate
  covers all writers (PR #3).
- CLI policy injection into Omnigraph for non-`policy validate|test|explain`
  subcommands (PR #3 or follow-up).
- MR-723 default-deny 3-state matrix (PR #4).
- MR-736 severity warn/deny (PR #5).
- AGENTS.md scope-of-enforcement rewrite once chassis fully lands.
- Coarse-vs-fine framing in docs/user/policy.md.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 00:36:36 +03:00
Ragnor Comerford
cd780e2d37
deps: add arc-swap to workspace for PR 2 catalog/schema_source wrapping
PR 2 wraps the Omnigraph engine's catalog and schema_source fields in
ArcSwap so reads stay zero-cost while apply_schema can swap atomically
without &mut self. arc-swap lands as an unused workspace dep here so the
follow-up commits that wrap fields can land in isolation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:25:22 +02:00
Ragnor Comerford
cdfbccbfdc
MR-794 step 2: scaffold MutationStaging accumulator + scan_with_pending
Add the scaffolding for the in-memory staged-write rewire — no behavior
change yet:

* New crates/omnigraph/src/exec/staging.rs with MutationStaging,
  PendingTable, PendingMode, StagedTablePath, plus the end-of-query
  finalize() that issues one stage_* + commit_staged per pending
  table (Merge mode dedupes by id, last-write-wins).
* TableStore::scan_with_pending and count_rows_with_pending helpers —
  Lance scan committed + DataFusion MemTable scan pending, concat.
  Sidesteps the Scanner::with_fragments filter-pushdown limitation
  documented on scan_with_staged.
* Add datafusion = "52" to workspace + omnigraph-engine deps for
  MemTable (transitively pulled by Lance already).

Engine code still uses the legacy MutationStaging shape; the rewire
lands in subsequent commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:42:21 +02:00
Andrew Altshuler
74eb5a5380
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46)
* Parallel per-type load writes + omnigraph optimize/cleanup CLI

## MR-677.3 — parallel per-type load writes

The load path already groups records into one RecordBatch per type and
makes one Lance commit per table (loader::mod.rs:249-..), but those
commits ran sequentially. Wrap node and edge write loops in
`futures::stream::buffered(N)` against a new helper
`write_batches_concurrently`. Concurrency tunable via
`OMNIGRAPH_LOAD_CONCURRENCY` (default 8).

## MR-676 — `omnigraph optimize` and `omnigraph cleanup`

New CLI subcommands that walk every node + edge table in the repo:

- `omnigraph optimize <uri>` — runs Lance `compact_files` on each
  table to merge small fragments into fewer larger ones.
- `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` —
  runs Lance `cleanup_old_versions` to prune historical manifests +
  unique fragments. Requires `--confirm` because it's destructive.
  Supports both count-based and time-based retention (or both AND'd
  together). Time uses chrono `DateTime<Utc>` (added as a workspace
  dep, default-features off).

Both commands run their per-table loops in parallel (8-way bounded,
`OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested
against the 114-table prod graph: optimize went 7m15s sequential
→ 1m28s parallel. cleanup --keep 1 removed 137 historical versions
across 114 tables in 1m57s without disrupting `/healthz` or query
responses.

Public API on `Omnigraph`:

  pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>>
  pub async fn cleanup(&mut self, opts: CleanupPolicyOptions)
      -> Result<Vec<TableCleanupStats>>

All 10 existing loader tests still pass.

Closes MR-676.
Partially addresses MR-677 (the .3 — parallel by type — piece;
MR-677.1 is for the `omnigraph embed` path, not load, since load
doesn't call Gemini directly. .2 was already in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
andrew
c338e80180 Harden bearer auth: constant-time compare, hashed at rest, authoritative actor_id
Fixes two live authz bugs in omnigraph-server:

- Bearer-token lookup previously used HashMap::get, which compares keys with
  Eq and short-circuits on the first differing byte — a network-observable
  timing oracle for brute-forcing tokens. Tokens are now stored as SHA-256
  digests and compared with subtle::ConstantTimeEq, iterating every entry
  unconditionally so total work is independent of which slot matches. Raw
  token bytes no longer live in server memory after startup.

- authorize_request now overwrites PolicyRequest.actor_id from the
  authenticated session instead of trusting the handler-supplied field,
  which previously defaulted to "" via unwrap_or_default(). The empty
  string can no longer reach Cedar as a policy subject even if a future
  refactor drops the None check.

External API of AppState constructors is unchanged — tokens still enter as
Vec<(String, String)> and are hashed on the way in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:41:02 +03:00
Claude
859ec9faa8
Add OpenAPI spec generation via utoipa with /openapi.json endpoint
Integrate utoipa 5 to auto-generate an OpenAPI 3.1 spec from the existing
Axum handlers and serde types. All 16 endpoints are annotated with path
metadata, request/response schemas, security requirements, and tags. A
public /openapi.json endpoint serves the spec without requiring auth.

Includes 59 tests covering path completeness, HTTP methods, schema fields,
enum variants, security scheme, path/query parameters, request bodies,
response references, and endpoint integration.

https://claude.ai/code/session_01NfoPVx21rZUQned1f7WpXY
2026-04-12 11:03:23 +00:00
andrew
338289656a Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00