Authorization analysis logic improvements (#61)

2026-07-24 21:41:02 +02:00 · 2026-05-02 16:44:49 -04:00 · 2026-05-02 16:44:49 -04:00 · 40995e45e7
commit 40995e45e7
parent 3c89bddbf2
55 changed files with 4193 additions and 134 deletions
--- a/docs/SUMMARY.md
+++ b/docs/SUMMARY.md
@@ -27,3 +27,8 @@
  - [CFG](detectors/cfg.md)
  - [State](detectors/state.md)
  - [Taint](detectors/taint.md)
+
+# Project
+
+- [Roadmap](roadmap.md)
+- [Changelog](changelog.md)
--- a/docs/advanced-analysis.md
+++ b/docs/advanced-analysis.md
@@ -96,8 +96,24 @@ hash per-argument `Cap` bits but not source-origin identity, so two
 callers with identical caps but different origins share cached
 origin-attribution.

-**Source**: [`src/taint/ssa_transfer.rs`](https://github.com/elicpeter/nyx/blob/master/src/taint/ssa_transfer.rs)
-(`ArgTaintSig`, `InlineCache`, `inline_analyse_callee`).
+**Helper-validator propagation.** SSA summaries carry a
+`validated_params_to_return` field listing parameter indices whose
+taint flow to the return value is fully validated by a dominating
+predicate (regex allowlist, type check, validation call) on every
+return path. At call sites, each tainted argument passed to a
+validated position — and the call's own return value — are marked
+`validated_must` / `validated_may` in the caller's SSA taint state,
+the same way an inline `if (!regex.test(x)) throw …` would validate
+the surviving branch. This is sound because the summary is recorded only
+when the parameter's name is in `validated_must` at *every* return block,
+so a call that returns normally proves the validating arm was taken. JS/TS
+object-pattern formals (`({ column, operator, value }) => …`) seed
+every destructured sibling in the per-parameter probe, so flow through
+any of them counts toward the slot being validated.
+
+**Source**: [`src/taint/ssa_transfer/`](https://github.com/elicpeter/nyx/tree/master/src/taint/ssa_transfer/)
+(`ArgTaintSig`, `InlineCache`, `inline_analyse_callee`,
+`propagate_validated_params_to_return`).
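
Reviewer sketch (not part of the commit): the call-site step described in the helper-validator paragraph can be modelled roughly as below. Only the names `validated_params_to_return`, `validated_must`, and `validated_may` come from the text; `CalleeSummary`, `TaintState`, and `apply_validated_params` are hypothetical, and marking the return value only when at least one tainted argument sat in a validated slot is an assumption about the rule.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an SSA summary; only the field name comes from the docs.
struct CalleeSummary {
    validated_params_to_return: Vec<usize>,
}

/// Hypothetical caller-side taint state, keyed by variable name.
#[derive(Default)]
struct TaintState {
    tainted: HashSet<String>,
    validated_must: HashSet<String>,
    validated_may: HashSet<String>,
}

/// Assumed call-site rule: once the call has returned normally, every tainted
/// argument in a validated slot, plus the call's result, is marked validated.
fn apply_validated_params(
    state: &mut TaintState,
    summary: &CalleeSummary,
    args: &[&str],
    ret: &str,
) {
    let mut any_validated = false;
    for &idx in &summary.validated_params_to_return {
        // Ignore indices beyond the actual argument list.
        if let Some(&arg) = args.get(idx) {
            if state.tainted.contains(arg) {
                state.validated_must.insert(arg.to_string());
                state.validated_may.insert(arg.to_string());
                any_validated = true;
            }
        }
    }
    if any_validated {
        state.validated_must.insert(ret.to_string());
        state.validated_may.insert(ret.to_string());
    }
}
```

Arguments outside the validated slots are left untouched, so they stay unvalidated in the caller.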

 ---

--- a/docs/auth.md
+++ b/docs/auth.md
@@ -6,14 +6,31 @@

 The Rust rule is `rs.auth.missing_ownership_check`. It fires when a request handler reaches a privileged operation that takes a scoped identifier (`*_id`, row reference, scoped resource) without a preceding ownership or membership check.

-Concretely, it looks for five patterns of authorization in the function body and flags the call when none are present:
+Concretely, it looks for these patterns of authorization in the function body and flags the call when none are present:

 - A call to a recognised authorization helper. Defaults: `check_ownership`, `has_ownership`, `require_ownership`, `ensure_ownership`, `is_owner`, `authorize`, `verify_access`, `has_permission`, `can_access`, `can_manage`, plus `*_membership` and `require_{group,org,workspace,tenant,team}_member` variants. Extend in `[analysis.languages.rust]`.
 - An ownership-equality check on a row reference: `if owner_id != user.id { return 403 }` or any `field_id != self_actor` shape. The check writes `AuthCheck` evidence back to the row-fetch arguments via `AnalysisUnit.row_field_vars`.
 - A self-actor reference: `let user = require_auth(...).await?` followed by use of `user.id`, `user.user_id`, `user.uid`. The actor is recognised from typed extractor params (`Extension<Session>`, `CurrentUser`, etc.) and from typed helper bindings.
+- A typed extractor wrapper that proves route-level capability/policy enforcement: meilisearch-style `GuardedData<ActionPolicy<X>, _>`. Recognised by outer wrapper name (last segment, case-insensitive `starts_with`) so `GuardedData<ActionPolicy<X>, Data<AuthController>>` is classified by the outer `GuardedData`, not by whether an inner generic arg substring-matches `auth`. Configured via `policy_guard_names` (Rust default: `["Guarded"]`). Distinct from authentication-only wrappers so the pattern doesn't pollute regular call recognition.
 - A SQL query that joins through an ACL table or filters by `user_id` predicate. Detected without a SQL parser via [`sql_semantics.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/sql_semantics.rs); the authorized result variable propagates through `let row = ...prepare(LIT)...`, `for row in result`, `let id = row.get(...)`.
 - A helper-summary lift: handler calls `validate_target(db, widget_id, user.id)` whose body contains a `require_*_member` call. Cross-function summaries are merged at fixed-point (capped at 4 iterations).

+Handlers registered through attribute macros (`#[get("/path")]`, `#[routes::path(…)]`) or external service-config builders are also walked for typed-extractor guards, complementing the `.route(...)` registration path.
+
+## Caller-scope-entity exemption
+
+`<entity>.id` / `<entity>.pk` is not flagged when `<entity>` is a unit parameter named after a multi-tenant scope primitive: `organization` / `org`, `project`, `team`, `workspace`, `tenant`, `account`, `community`, `group`, `repository` / `repo`, `company`. The argument represents the caller's scope, not a user-controlled target, so internal helpers like `def get_environments(request, organization): Environment.objects.filter(organization_id=organization.id, …)` inherit the caller's authorization. Other field names (`.name`, `.slug`) are still flagged, and `user` / `member` / `actor` are deliberately excluded because the actor-context recogniser handles them.
+
+## Project-level web-framework gate (Rust)
+
+In Rust, the `context_inputs` and param-name arms of the user-input heuristic are gated by a project-level web-framework signal. The signal is three-valued:
+
+- `Some(true)` — the project's `Cargo.toml` names `axum`, `actix-web`, or `rocket`, OR the file directly imports one (`axum::`, `actix_web::`, `rocket::`, `axum_extra::`). Heuristics stay on.
+- `Some(false)` — `Cargo.toml` was inspected and named no web framework, AND the file does not directly import one. Heuristics off; only `RouteHandler` classification (concrete route-registration evidence) survives.
+- `None` — no detection ran (single-file scan with no project root). Heuristics on; behavior unchanged.
+
+This avoids a class of false positives in non-web Rust crates, where a debug-session handle named `session` would otherwise trip the heuristic on desktop-app code such as `session.update(cx, …)`. Other languages keep their prior behavior; the gate is currently Rust-only.
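
Reviewer sketch (not part of the commit): the three-valued signal above could be computed along these lines. The function names are hypothetical, the substring checks are a deliberate simplification (a real detector would presumably parse the manifest), and returning `None` whenever no `Cargo.toml` is available, even if the file imports a framework, is an assumption. Either way the heuristics stay on in that case.

```rust
/// Crate names the docs list as web-framework evidence in `Cargo.toml`.
const FRAMEWORK_CRATES: [&str; 3] = ["axum", "actix-web", "rocket"];
/// Import prefixes the docs list as per-file evidence.
const FRAMEWORK_IMPORTS: [&str; 4] = ["axum::", "actix_web::", "rocket::", "axum_extra::"];

/// Hypothetical detector. `cargo_toml` is `None` when no project root was found.
fn web_framework_signal(cargo_toml: Option<&str>, file_source: &str) -> Option<bool> {
    // No manifest inspected means no detection ran.
    let manifest = cargo_toml?;
    let in_manifest = FRAMEWORK_CRATES.iter().any(|c| manifest.contains(*c));
    let in_file = FRAMEWORK_IMPORTS.iter().any(|p| file_source.contains(*p));
    Some(in_manifest || in_file)
}

/// The heuristics are switched off only on a definite negative.
fn heuristics_enabled(signal: Option<bool>) -> bool {
    signal != Some(false)
}
```

Keeping `None` distinct from `Some(false)` is what lets single-file scans behave exactly as they did before the gate existed.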
+
 ## Sink classification

 The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag:
@@ -62,6 +79,15 @@ cap      = "unauthorized_id"

 The same rule recognised in the standalone analyser also strips `Cap::UNAUTHORIZED_ID` for the taint-based variant.

+### Add a project-specific typed-extractor policy wrapper
+
+```toml
+[analysis.languages.rust.auth]
+policy_guard_names = ["MyAppGuarded", "PolicyExtractor"]
+```
+
+Matched as last-segment + case-insensitive `starts_with` (so a single entry `"Guarded"` covers `Guarded`, `GuardedData`, `GuardedRoute`). Distinct from `login_guard_names` and `admin_guard_names`.
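
Reviewer sketch (not part of the commit): the matching rule for `policy_guard_names` could look roughly like this. `outer_wrapper_name` and `is_policy_guard` are hypothetical helpers, and treating the type as plain text is an assumption; the actual analyser may work on a parsed type instead.

```rust
/// Returns the last path segment of the outer type, without generic arguments,
/// e.g. `a::b::GuardedData<X>` yields `GuardedData`.
fn outer_wrapper_name(type_text: &str) -> &str {
    // Everything before the first `<` is the outer type path.
    let outer = type_text.split('<').next().unwrap_or(type_text).trim();
    outer.rsplit("::").next().unwrap_or(outer)
}

/// Hypothetical matcher: case-insensitive `starts_with` on the outer wrapper
/// name only, so inner generic arguments never influence the result.
fn is_policy_guard(type_text: &str, policy_guard_names: &[&str]) -> bool {
    let name = outer_wrapper_name(type_text).to_lowercase();
    policy_guard_names
        .iter()
        .any(|guard| name.starts_with(guard.to_lowercase().as_str()))
}
```

With the single entry `"Guarded"`, this sketch accepts `GuardedData<ActionPolicy<X>, Data<AuthController>>` and rejects `Data<AuthController>`, since only the outer name is inspected.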
+
 ### Recognised actor names

 Recognised by default: `user.id`, `user.user_id`, `user.uid`, `session.user_id`, `current_user.id`, plus typed extractor parameters with `CurrentUser`, `SessionUser`, `AuthUser`, `Extension<...>` shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic is in [`src/auth_analysis/checks.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/checks.rs) under `extract_validation_target` and friends.
@@ -88,4 +114,4 @@ Auth findings render alongside taint findings in the [browser UI](serve.md). The

 ## Benchmark corpus

-The Rust auth corpus at [`tests/benchmark/corpus/rust/auth/`](https://github.com/elicpeter/nyx/tree/master/tests/benchmark/corpus/rust/auth/) is 10 fixtures covering the five FP patterns plus a true-positive control. Per-row metrics live under the Rust auth row in `tests/benchmark/RESULTS.md`.
+The Rust auth corpus at [`tests/benchmark/corpus/rust/auth/`](https://github.com/elicpeter/nyx/tree/master/tests/benchmark/corpus/rust/auth/) covers the recognised authorization patterns, true-positive controls, typed-extractor guard injection, and the project-level web-framework gate (full-Cargo.toml fixtures under `safe_non_web_rust_project/` and `unsafe_actix_web_project_no_check/`). Per-row metrics live under the Rust auth row in `tests/benchmark/RESULTS.md`.