Prerelease cleanup (#46)

* feat: Add const_bound_vars tracking to prevent false positives in ownership checks

* feat: Introduce field interner and typed bounded vars for enhanced type tracking

* feat: Add typed_call_receivers and typed_bounded_dto_fields for enhanced type tracking

* feat: Centralize method name extraction with bare_method_name helper

* feat: Implement Phase-6 hierarchy fan-out for runtime virtual dispatch

* feat: Enhance C++ taint tracking with additional container operations and inline method resolution

* feat: Introduce field-sensitive points-to analysis for enhanced resource tracking

* feat: Implement Pointer-Phase 6 subscript handling for enhanced container analysis

* test: Add comprehensive tests for JavaScript control flow constructs and lattice operations

* docs: Update advanced analysis documentation with field-sensitive points-to and hierarchy fan-out details

* test: Add comprehensive tests for lattice algebra laws and SSA edge cases

* feat: Add destructured session user handling and safe user ID access patterns

* feat: Implement row-population reverse-walk for enhanced authorization checks

* feat: Enhance authorization checks with local alias chain for self-actor types

* feat: Introduce ActiveRecord query safety checks and enhance snippet extraction

* feat: Implement chained method call inner-gate rebinding for SSRF prevention

* feat: Add observability and error modules, enhance debug functionality, and implement theme context

* feat: Remove Auth Analysis page and update navigation to redirect to Explorer

* feat: Optimize SSA lowering by sharing results between taint engine and artifact extractor

* feat: Reset path-safe-suppressed spans before lowering to maintain analysis integrity

* fix(ssa): ungate debug_assert_bfs_ordering for release-tests build

The helper at src/ssa/lower.rs was gated `#[cfg(debug_assertions)]` while
the unit test at the bottom of the file was gated only `#[cfg(test)]`.
Since `cfg(test)` is set in release builds with `--tests` but
`cfg(debug_assertions)` is not, `cargo build --release --tests` failed
with E0425. Removing the gate fixes the build; the body is `debug_assert!`
only, so the helper is free in release. Also drop the gate at the call
site to avoid a `dead_code` warning when the lib is built without
`--tests`.
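
The gating mismatch described above can be illustrated in a few lines. This is a minimal sketch, not the code from src/ssa/lower.rs: the helper's real signature and body are not shown here, so the slice parameter, the ordering check, and the `lower` caller are hypothetical stand-ins.

```rust
// Hypothetical stand-in for the helper; the real signature and check differ.
// It carries no `#[cfg(debug_assertions)]` attribute, so the symbol exists in
// every profile, including `cargo build --release --tests`.
fn debug_assert_bfs_ordering(order: &[usize]) {
    // `debug_assert!` is only evaluated when debug assertions are enabled,
    // so an ungated helper costs nothing in a release build.
    debug_assert!(
        order.windows(2).all(|pair| pair[0] <= pair[1]),
        "block ids are expected to be non-decreasing"
    );
}

// Hypothetical caller standing in for the lowering entry point. The call site
// is ungated too, so the helper is used even when the lib is built without tests.
fn lower(order: &[usize]) -> usize {
    debug_assert_bfs_ordering(order);
    order.len()
}
```

With this shape, a unit test gated only by `#[cfg(test)]` can call `debug_assert_bfs_ordering` in any profile, because the function is no longer compiled out when `debug_assertions` is off.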

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(closure-capture): flip JS/TS fixtures to required-finding

The JS and TS closure-capture fixtures pinned the old broken behaviour
via `forbidden_findings: [{ "id_prefix": "taint-" }]`. The engine now
correctly traces taint through the closure boundary (env source captured
by an arrow function, sunk via `child_process.exec` inside the body), so
the formerly-forbidden finding is a true positive.

Match the Python sibling's shape — `required_findings` with
`id_prefix` + `min_count` plus a small `noise_budget` — and rewrite the
companion READMEs and the phase8_fragility_tests doc-comments from
"known gap" to "regression guard".

Verified:
- cargo test --release --test phase8_fragility_tests → 8/8 pass
- cargo test --release --lib bfs_assertion → pass
- corpus benchmark F1 = 0.9976 (TP=205, FP=1, FN=0) — unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: Add OWASP mapping and baseline mutation hooks for enhanced security analysis

* feat: Introduce health module and enhance health score computation with calibration tests

* feat: Add expectations configuration and cleanup .gitignore for log files

* feat: Implement theme selection and enhance settings panel for triage sync

* feat: Suppress false positives for strcpy calls with literal sources in AST

* feat: Update analyse_function_ssa to return body CFG for accurate analysis

* feat: Add bug report and feature request templates for improved issue tracking

* feat: removed dev scripts

* feat: update README.md for clarity and consistency in fixture descriptions

* feat: removed dev docs

* feat: clean up error handling and UI elements for improved user experience

* feat: adjust button sizes in HeaderBar for better UI consistency

* feat: enhance taint analysis with additional context for sanitizer and taint findings

* cargo fmt

* prettier

* refactor: simplify conditional checks and improve code readability in AST and screenshot capture scripts

* feat: add script to frame PNG screenshots with brand gradient

* feat: add fuzzing support with new targets and CI workflows

* refactor: streamline match expressions and improve formatting in CLI and output handling

* feat: enhance configuration display with detailed output options

* feat: stage demo configuration for improved CLI screenshot output

* feat: expose merge_configs function for user-configurable settings

* refactor: simplify code structure and improve readability in config handling

* refactor: improve descriptions for vulnerability patterns in various languages

* feat: update MIT License section with additional usage details and copyright information

* feat: update screenshots

* refactor: update build process and paths for frontend assets

* feat: add cross-file taint fuzzing target and supporting dictionary

* refactor: clean up formatting and comments in fuzz configuration and example files

* refactor: remove outdated comments and clean up CI configuration files

* chore: update changelog dates and improve formatting in documentation

* refactor: update Cargo.toml and CI configuration for improved packaging and build process

* refactor: enhance quote-stripping logic to prevent panics and add regression tests

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eli Peter 2026-04-29 00:58:38 -04:00 committed by GitHub
parent 79c29b394d
commit 82f18184b1
GPG key ID: B5690EEEBB952194
348 changed files with 48731 additions and 2925 deletions

@ -1,7 +1,7 @@
use super::config::AuthAnalysisRules;
use super::model::{
-AnalysisUnit, AuthCheck, AuthCheckKind, AuthorizationModel, OperationKind, SensitiveOperation,
-ValueRef, ValueSourceKind,
+AnalysisUnit, AnalysisUnitKind, AuthCheck, AuthCheckKind, AuthorizationModel, OperationKind,
+SensitiveOperation, ValueRef, ValueSourceKind,
};
use crate::patterns::Severity;
@ -67,6 +67,9 @@ fn check_ownership_gaps(model: &AuthorizationModel, rules: &AuthAnalysisRules) -
let mut findings = Vec::new();
for unit in &model.units {
if !unit_has_user_input_evidence(unit) {
continue;
}
for op in &unit.operations {
if op.kind == OperationKind::TokenLookup {
continue;
@ -116,6 +119,9 @@ fn check_partial_batch_authorization(
let mut findings = Vec::new();
for unit in &model.units {
if !unit_has_user_input_evidence(unit) {
continue;
}
for op in &unit.operations {
// In-memory bookkeeping is never a batch sink.
if op.sink_class.is_some_and(|c| !c.is_auth_relevant()) {
@ -167,6 +173,9 @@ fn check_stale_authorization(
let mut findings = Vec::new();
for unit in &model.units {
if !unit_has_user_input_evidence(unit) {
continue;
}
for op in unit.operations.iter().filter(|operation| {
operation.kind == OperationKind::Mutation
&& operation.sink_class.is_none_or(|c| c.is_auth_relevant())
@ -211,6 +220,18 @@ fn check_token_override_without_validation(
let mut findings = Vec::new();
for unit in &model.units {
// The rule reasons about "Token acceptance flow" — by
// construction, that is a user-facing handler that receives a
// token from the client and writes through token-bound state.
// Internal helpers, Celery / cron tasks, Django migrations,
// pytest fixtures, and seed-data utilities have no user reach
// and cannot host a token-acceptance flow even when their
// call shape happens to look token-y (`account.token = …;
// account.save()`). Gate on positive user-input evidence so
// these pure backend units are never claimed as a token flow.
if !unit_has_user_input_evidence(unit) {
continue;
}
let Some(token_lookup) = unit
.operations
.iter()
@ -293,6 +314,10 @@ fn has_prior_subject_auth(
op: &SensitiveOperation,
subjects: &[&ValueRef],
) -> bool {
if has_row_fetch_exemption(unit, op) {
return true;
}
let relevant_checks = unit.auth_checks.iter().filter(|check| {
check.line <= op.line
&& !matches!(
@ -310,6 +335,70 @@ fn has_prior_subject_auth(
})
}
/// Phase A4 row-fetch exemption.
///
/// Recognises the canonical "fetch-then-authorize" idiom in row-level
/// authz code: a route handler fetches a row by id (`let community =
/// Community::read(pool, data.community_id)?`), then calls a named
/// authorization function on the fetched row (`check_community_user_action(
/// &user, &community, ...)`). The authorization check appears
/// textually after the fetch, so the existing `check.line <= op.line`
/// rule cannot cover the fetch.
///
/// The exemption fires only when:
/// 1. `op` is the row-fetch operation itself (line == row let-line).
/// 2. SOME auth check in the unit names the resulting row variable as
/// a subject (directly or via `check.subjects[i].base`).
///
/// Coverage is intentionally narrow: only the row-fetch operation is
/// exempted. Any sink that runs *between* the fetch and the check
/// (e.g. `delete(community)` before `check_*`) still flags, because
/// its subject is `community` itself — not a fetch arg — and we
/// require the operation to be a row-fetch site to apply the
/// exemption.
fn has_row_fetch_exemption(unit: &AnalysisUnit, op: &SensitiveOperation) -> bool {
// Find the row var (if any) declared at this op's line.
let row_var: Option<&str> = unit
.row_population_data
.iter()
.find_map(|(var, (line, _))| {
if *line == op.line {
Some(var.as_str())
} else {
None
}
});
let Some(row_var) = row_var else {
return false;
};
// Look for any non-login auth check whose subjects mention the row.
// Match against the *root* of the subject's chain (`a.b.c` → `a`)
// so an auth check on a row's nested field — e.g.
// `is_mod_or_admin(pool, &user, comment_view.community.id)` —
// still names the row var.
unit.auth_checks.iter().any(|check| {
if matches!(
check.kind,
AuthCheckKind::LoginGuard | AuthCheckKind::TokenExpiry | AuthCheckKind::TokenRecipient
) {
return false;
}
check
.subjects
.iter()
.any(|subj| chain_root(subj) == row_var)
})
}
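
The two conditions in the doc comment above can be sketched with trimmed-down types. Everything here is an assumption made for illustration: `Unit`, `Check`, `CheckKind`, and `row_fetch_exempt` are hypothetical stand-ins, subjects are reduced to their chain roots, and only one excluded check kind is modelled.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced set of check kinds (the real `AuthCheckKind` has more).
#[derive(PartialEq)]
enum CheckKind {
    LoginGuard,
    Membership,
}

/// Hypothetical stand-in for `AuthCheck`; subjects are pre-reduced to chain roots.
struct Check {
    kind: CheckKind,
    subject_roots: Vec<String>,
}

/// Hypothetical stand-in for `AnalysisUnit`.
struct Unit {
    /// Row variable name mapped to the line of its `let` binding.
    row_fetch_lines: HashMap<String, usize>,
    checks: Vec<Check>,
}

/// Exempt an operation only if it sits on a row-fetch line and some
/// non-login check names the fetched row.
fn row_fetch_exempt(unit: &Unit, op_line: usize) -> bool {
    // Condition 1: the operation is the row fetch itself.
    let Some(row_var) = unit
        .row_fetch_lines
        .iter()
        .find_map(|(var, line)| (*line == op_line).then_some(var.as_str()))
    else {
        return false;
    };
    // Condition 2: an authorization check (not a login guard) names the row.
    unit.checks.iter().any(|check| {
        check.kind != CheckKind::LoginGuard
            && check.subject_roots.iter().any(|root| root.as_str() == row_var)
    })
}
```

In this sketch the position of the check is deliberately ignored, which reflects the point of the exemption: the check may appear textually after the fetch.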
/// Root segment of a subject's chain. Subjects produced from
/// `a.b.c` carry `name = "a.b.c"` and `base = Some("a.b")`; the root
/// is `a`. Bare identifiers carry `base = None` and use `name`.
fn chain_root(subj: &ValueRef) -> &str {
let raw = subj.base.as_deref().unwrap_or(subj.name.as_str());
raw.split('.').next().unwrap_or(raw)
}
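
A self-contained version of the root extraction, for experimenting outside the crate. `SubjectRef` is a hypothetical, reduced stand-in for `ValueRef` that keeps only the two fields `chain_root` reads; the function body is copied from the diff.

```rust
/// Hypothetical, trimmed-down stand-in for `ValueRef`.
struct SubjectRef {
    name: String,
    base: Option<String>,
}

/// Root segment of a subject's chain: prefer `base` when present,
/// fall back to `name`, then keep everything before the first dot.
fn chain_root(subj: &SubjectRef) -> &str {
    let raw = subj.base.as_deref().unwrap_or(subj.name.as_str());
    raw.split('.').next().unwrap_or(raw)
}
```

A subject built from `a.b.c` (with `base = Some("a.b")`) resolves to `a`, and a bare identifier such as `community` resolves to itself.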
fn has_prior_collection_auth(
unit: &AnalysisUnit,
op: &SensitiveOperation,
@ -351,6 +440,56 @@ fn auth_check_covers_subject(check: &AuthCheck, subject: &ValueRef, unit: &Analy
.iter()
.any(|name| unit.authorized_sql_vars.contains(name));
// **Row-population reverse-walk** (lemmy fetch-then-check pattern).
//
// `row_population_data[R]` records the value-refs of every arg
// passed to a `let R = CALL(args)` row fetch. When a later auth
// check authorizes the resulting row (e.g. `check_community_user_action(
// &user, &community, ..)` after `let community = Community::read(
// pool, data.community_id)`), the check materially covers
// `data.community_id` too — it gated access to the row that was
// fetched using that id, so any subsequent operation re-using the
// same id (read of a related view, mutation on the row itself) is
// within the scope of that authorization.
//
// Match by canonical subject name so `data.community_id`,
// `community_id`, `data.comment_id`, etc. all resolve uniformly
// regardless of whether the route handler aliased the request
// field into a local before passing it on.
//
// **Local-alias chain.** When the subject is a plain identifier
// (no base/field), also consult `unit.var_alias_chain`: a sink
// that uses `community_id` after `let community_id =
// req.community_id` should also match population args recorded
// as `req.community_id`, not just args carrying the bare name.
let subject_alias_chain: Option<&str> = if subject.base.is_none() && subject.field.is_none() {
unit.var_alias_chain.get(&subject.name).map(|s| s.as_str())
} else {
None
};
let subject_populates: Vec<&str> = unit
.row_population_data
.iter()
.filter_map(|(row_var, (_line, args))| {
let matches_arg = args.iter().any(|arg| {
if canonical_subject_name(arg) == subject_key {
return true;
}
if let Some(chain) = subject_alias_chain
&& arg.name == chain
{
return true;
}
false
});
if matches_arg {
Some(row_var.as_str())
} else {
None
}
})
.collect();
check.subjects.iter().any(|check_subject| {
let check_key = canonical_subject_name(check_subject);
let check_related_base = related_subject_base(check_subject);
@ -366,6 +505,14 @@ fn auth_check_covers_subject(check: &AuthCheck, subject: &ValueRef, unit: &Analy
return true;
}
}
// Row-population reverse-walk: subject was passed to a row
// fetch, and the check covers that row (chain root match on
// the row var).
for row in &subject_populates {
if chain_root(check_subject) == *row {
return true;
}
}
// B3: SQL synth checks name the auth-gated row var directly.
// If our subject's row chain leads into the same authorized
// var family this check anchors to, accept the coverage.
@ -426,7 +573,54 @@ fn related_subject_base(subject: &ValueRef) -> Option<String> {
}
fn is_relevant_target_subject(subject: &ValueRef, unit: &AnalysisUnit) -> bool {
-is_id_like(subject) && !is_actor_context_subject(subject, unit)
+is_id_like(subject)
+&& !is_actor_context_subject(subject, unit)
+&& !is_const_bound_subject(subject, unit)
+&& !is_typed_bounded_subject(subject, unit)
}
/// True iff `subject` is a plain identifier whose declaration binds
/// it to a literal constant (`id := "id"`, `let userId = 1`, etc.).
/// Such bindings cannot be user-controlled and so must not be
/// classified as scoped-identifier subjects. Only matches plain
/// `Identifier`-kind subjects (no base/field) — member chains like
/// `req.params.id` still pass through to the regular checks.
fn is_const_bound_subject(subject: &ValueRef, unit: &AnalysisUnit) -> bool {
if subject.base.is_some() || subject.field.is_some() {
return false;
}
unit.const_bound_vars.contains(&subject.name)
}
/// True iff `subject` is a plain identifier that resolves to a
/// function parameter whose static type is a payload-incompatible
/// scalar (numeric or boolean — see [`super::apply_typed_bounded_params`]).
/// Spring `@PathVariable Long userId`, Axum `Path<i64>`, NestJS
/// `@Param('id') id: number`, and FastAPI `user_id: int` all qualify.
///
/// Phase 6: also matches member-access subjects like `dto.userId`
/// when `dto` is a typed-extractor parameter recognised by a Phase
/// 1-2 matcher AND the field's declared TypeKind is Int/Bool.
fn is_typed_bounded_subject(subject: &ValueRef, unit: &AnalysisUnit) -> bool {
if subject.base.is_none() && subject.field.is_none() {
return unit.typed_bounded_vars.contains(&subject.name);
}
// Phase 6: member-access shape `base.field` whose `base` is a
// typed-extractor parameter and whose field is declared as an
// Int/Bool in the same-file DTO definition. Per Hard Rule 3,
// only fires when the base param itself was recognised by a
// Phase 1-2 matcher — bare `dto.age` without a framework gate
// never lifts.
let Some(base) = subject.base.as_deref() else {
return false;
};
let Some(field) = subject.field.as_deref() else {
return false;
};
let root = base.split('.').next().unwrap_or(base);
unit.typed_bounded_dto_fields
.get(root)
.is_some_and(|fields| fields.iter().any(|f| f == field))
}
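
The member-access branch above amounts to a map lookup keyed by the root of the base chain. The sketch below isolates that lookup as a free function; `is_bounded_dto_field` is a hypothetical name, and the `HashMap<String, Vec<String>>` shape is an assumption inferred from how the field is used, not confirmed by the diff.

```rust
use std::collections::HashMap;

/// Hypothetical free-function version of the `base.field` branch.
/// `dto_fields` maps a typed-extractor parameter name to the fields that
/// were declared with a bounded scalar type (assumed shape).
fn is_bounded_dto_field(
    dto_fields: &HashMap<String, Vec<String>>,
    base: &str,
    field: &str,
) -> bool {
    // Only the first segment of the base chain identifies the parameter.
    let root = base.split('.').next().unwrap_or(base);
    dto_fields
        .get(root)
        .is_some_and(|fields| fields.iter().any(|f| f == field))
}
```

Because the map is only populated for recognised parameters, a base that is absent from the map never matches, regardless of the field name.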
fn is_actor_context_subject(subject: &ValueRef, unit: &AnalysisUnit) -> bool {
@ -434,6 +628,20 @@ fn is_actor_context_subject(subject: &ValueRef, unit: &AnalysisUnit) -> bool {
return true;
}
// Per-unit dynamic session-base set (TRPC `Options { ctx: { user:
// TrpcSessionUser } }` populates `<localCtx>.user` via the
// typed-extractor pre-pass). The static `is_self_scoped_session_base`
// list deliberately omits bare `ctx.user` because `ctx` is generic
// and a blanket addition over-suppresses in non-TRPC code; this
// branch fires only when the param's static type literally
// references `TrpcSessionUser` (or a known TRPC alias).
if let Some(base) = subject.base.as_deref()
&& unit.self_scoped_session_bases.contains(base)
&& subject.field.as_deref().is_some_and(is_self_actor_id_field)
{
return true;
}
// A3: `V.id`-shape subjects where `V` is bound from a login-guard /
// auth-check call (or from a typed self-actor extractor parameter)
// are the caller's own id. `V.group_id` / `V.workspace_id` stay
@ -563,7 +771,7 @@ fn is_delegated_read_with_actor_context(
op: &SensitiveOperation,
relevant_subjects: &[&ValueRef],
) -> bool {
-unit.kind == super::model::AnalysisUnitKind::RouteHandler
+unit.kind == AnalysisUnitKind::RouteHandler
&& op.kind == OperationKind::Read
&& op.callee.to_ascii_lowercase().contains("service")
&& op.subjects.iter().any(is_self_scoped_session_subject)
@ -583,7 +791,15 @@ fn is_id_like(subject: &ValueRef) -> bool {
.as_deref()
.or(subject.base.as_deref())
.unwrap_or(&subject.name);
-let lower = field.to_ascii_lowercase();
+is_id_like_name(field)
+}
/// String-level analogue of `is_id_like` for working with parameter
/// names (which carry no `ValueRef` structure). Mirrors the same
/// suffix vocabulary so a parameter `doc_id` / `groupId` / `userIds`
/// is recognised as an id-bearing input.
fn is_id_like_name(name: &str) -> bool {
let lower = name.to_ascii_lowercase();
lower == "id"
|| lower.ends_with("id")
|| lower.ends_with("_id")
@ -593,6 +809,86 @@ fn is_id_like(subject: &ValueRef) -> bool {
|| lower.contains("noteid")
}
/// True when the analysis unit shows positive evidence of receiving
/// user-controlled input — the precondition for any auth rule that
/// reasons about "scoped identifier" or "token-acceptance flow"
/// shapes.
///
/// A unit qualifies if any of the following hold:
/// * It is a recognised framework route handler (`RouteHandler` —
/// the strongest signal: registered with a router).
/// * It accesses a request-shaped value (`request.body`, `req.params`,
/// `c.Query(..)`, etc.) — populated as `context_inputs`.
/// * It declares at least one parameter whose name signals an
/// externally-supplied value (id-like, token-like, request-like).
/// Internal helpers that take only typed objects
/// (`promotion: Promotion`, `apps`, `schema_editor`, `config`,
/// `items`) are excluded.
///
/// Migrations, Celery tasks, pytest fixtures, conftest hooks, and
/// pure utility helpers fail all three conditions and are skipped —
/// they cannot, by construction, be the entry point of an
/// authentication-bearing flow.
fn unit_has_user_input_evidence(unit: &AnalysisUnit) -> bool {
if unit.kind == AnalysisUnitKind::RouteHandler {
return true;
}
if !unit.context_inputs.is_empty() {
return true;
}
unit.params.iter().any(|p| is_external_input_param_name(p))
}
/// Parameter-name heuristic: does this name carry external/user input
/// as part of its calling contract? Captures three classes of name:
/// * id-like (`*_id`, `*Id`, `id`, `*Ids`),
/// * token-like (`token`, `*_token`, `accessToken`),
/// * framework-request objects (`request`, `req`, `ctx` — the
/// standard names used by Express/Django/Flask/Gin/Axum/NestJS
/// handlers as the parameter that carries the HTTP request).
///
/// Used by `unit_has_user_input_evidence` to recognise helper
/// functions that, while not registered as route handlers, are
/// clearly invoked with caller-supplied identifiers or request data.
fn is_external_input_param_name(name: &str) -> bool {
if is_id_like_name(name) {
return true;
}
let lower = name.to_ascii_lowercase();
// Token-shaped: bare `token` or any `*_token` / `*Token` /
// `accessToken` / `refreshToken`-style suffix. Conservative —
// only fires on explicit token-naming, not on incidental
// substrings.
if lower == "token" || lower.ends_with("_token") || lower.ends_with("token") {
return true;
}
// Standard framework request-parameter names. These cover the
// cross-language convention for the parameter holding the HTTP
// request object (`req` / `request` / `ctx` / `context` / `info`)
// **and** the typed-extractor parameter naming used by
// Axum/Actix/NestJS handlers (`path`, `payload`, `body`, `dto`,
// `form`, `query`). In `web::Path<String>` / `web::Json<T>` /
// `@Body() dto: ...` the parameter name itself is the standard
// convention used by every example in the framework docs, so
// matching on the name is a reliable proxy for the typed
// extractor binding. Bare `c` is too common (incidental local
// variable) to include without an additional type signal.
matches!(
lower.as_str(),
"req"
| "request"
| "ctx"
| "context"
| "info"
| "path"
| "payload"
| "body"
| "dto"
| "form"
| "query"
)
}
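
For quick experimentation, the token-like and request-like classes can be isolated from the id-like class. This is a sketch under stated assumptions: `is_token_or_request_name` is a hypothetical helper that does not exist in the diff, and it deliberately omits the `is_id_like_name` branch because that function is only partly visible in this hunk.

```rust
/// Hypothetical helper covering only the token-like and request-like
/// classes; id-like names are out of scope for this sketch.
fn is_token_or_request_name(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    // A suffix test on "token" also accepts bare `token` and `*_token`.
    if lower.ends_with("token") {
        return true;
    }
    // Conventional names for the request object or a typed extractor.
    matches!(
        lower.as_str(),
        "req" | "request"
            | "ctx"
            | "context"
            | "info"
            | "path"
            | "payload"
            | "body"
            | "dto"
            | "form"
            | "query"
    )
}
```

Note that the suffix test accepts any name ending in `token` after lowercasing, so it is broader than an exact-name match; the request-like names, by contrast, must match exactly.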
fn is_batch_collection(subject: &ValueRef) -> bool {
subject.source_kind == ValueSourceKind::Identifier
&& subject.name.to_ascii_lowercase().ends_with("ids")
@ -600,7 +896,10 @@ fn is_batch_collection(subject: &ValueRef) -> bool {
#[cfg(test)]
mod tests {
-use super::{is_actor_context_subject, is_relevant_target_subject};
+use super::{
+auth_check_covers_subject, is_actor_context_subject, is_external_input_param_name,
+is_relevant_target_subject, unit_has_user_input_evidence,
+};
use crate::auth_analysis::model::{AnalysisUnit, AnalysisUnitKind, ValueRef, ValueSourceKind};
use std::collections::{HashMap, HashSet};
@ -618,9 +917,15 @@ mod tests {
condition_texts: Vec::new(),
line: 1,
row_field_vars: HashMap::new(),
var_alias_chain: HashMap::new(),
row_population_data: HashMap::new(),
self_actor_vars: HashSet::new(),
self_actor_id_vars: HashSet::new(),
authorized_sql_vars: HashSet::new(),
const_bound_vars: HashSet::new(),
typed_bounded_vars: HashSet::new(),
typed_bounded_dto_fields: HashMap::new(),
self_scoped_session_bases: HashSet::new(),
}
}
@ -716,4 +1021,395 @@ mod tests {
// Foreign-user fields still flag.
assert!(!is_actor_context_subject(&member("target", "email"), &unit));
}
/// Real-repo regression (gin/context_test.go): `id := "id";
/// c.AddParam(id, value)` previously fired the rule because `id`
/// matched is_id_like but had no actor-context exemption. After
/// the const-binding tracker, `id` (a plain Local with no base /
/// field) bound to a literal is excluded from relevant subjects.
#[test]
fn const_bound_plain_subjects_are_not_relevant() {
let mut unit = empty_unit();
unit.const_bound_vars.insert("id".into());
// `id` matches is_id_like (name=="id") but is constant-bound.
assert!(!is_relevant_target_subject(&plain("id"), &unit));
// Plain `id` NOT in the const-bound set still flags as
// relevant — regression guard for the user-controlled case.
let unit2 = empty_unit();
assert!(is_relevant_target_subject(&plain("id"), &unit2));
// Member access `req.id` is unaffected by const-bound check
// (different ValueRef shape).
unit.const_bound_vars.insert("req".into());
assert!(is_relevant_target_subject(&member("req", "id"), &unit));
}
/// Phase 5 typed-bounded subject exclusion: a parameter whose
/// static type was recovered as `Int`/`Bool` (Spring `Long userId`,
/// Axum `Path<i64>`, FastAPI `user_id: int`) has its name added to
/// `unit.typed_bounded_vars` by `apply_typed_bounded_params`. The
/// subject `userId` then must not be classified as a scoped
/// identifier — the framework guarantees the value is numeric and
/// cannot drive ownership-bypass.
#[test]
fn typed_bounded_plain_subjects_are_not_relevant() {
let mut unit = empty_unit();
unit.typed_bounded_vars.insert("user_id".into());
// `user_id` matches is_id_like but is bounded by static type.
assert!(!is_relevant_target_subject(&plain("user_id"), &unit));
// Plain `user_id` NOT in the typed-bounded set still flags.
let unit2 = empty_unit();
assert!(is_relevant_target_subject(&plain("user_id"), &unit2));
// Member access `req.user_id` is unaffected (only plain
// identifiers are exempted — fields/base remain regular
// subjects so DTO-shape leaks still flag).
unit.typed_bounded_vars.insert("req".into());
assert!(is_relevant_target_subject(&member("req", "user_id"), &unit));
}
/// Real-repo regression: pure-backend units (Django migrations,
/// Celery tasks with no params, pytest fixtures) must fail the
/// user-input precondition so token-override / ownership rules
/// don't fire. Conversely, helpers with id-like / token-like /
/// request-named parameters do count as user-input-bearing.
#[test]
fn unit_user_input_evidence_recognises_external_inputs() {
// Function with no params and no context_inputs (Celery task
// shape) — must NOT count as user-input-bearing.
let mut unit = empty_unit();
assert!(!unit_has_user_input_evidence(&unit));
// Adding internal-typed params (apps, schema_editor — Django
// migration RunPython callback shape) keeps the gate closed.
unit.params.push("apps".into());
unit.params.push("schema_editor".into());
assert!(!unit_has_user_input_evidence(&unit));
// pytest hook shape: (config, items) — gate stays closed.
let mut unit = empty_unit();
unit.params.push("config".into());
unit.params.push("items".into());
assert!(!unit_has_user_input_evidence(&unit));
// Adding an id-like param flips the gate open.
unit.params.push("doc_id".into());
assert!(unit_has_user_input_evidence(&unit));
// Token-named param flips the gate open (Express helper
// `acceptInvitation(token, currentUser, roleOverride)`).
let mut unit = empty_unit();
unit.params.push("token".into());
unit.params.push("currentUser".into());
unit.params.push("roleOverride".into());
assert!(unit_has_user_input_evidence(&unit));
// Framework request-name param flips the gate open
// (Django/Flask `def view(request, project_id):`).
let mut unit = empty_unit();
unit.params.push("request".into());
assert!(unit_has_user_input_evidence(&unit));
// Axum/Actix typed-extractor convention name flips it open.
let mut unit = empty_unit();
unit.params.push("path".into());
assert!(unit_has_user_input_evidence(&unit));
// RouteHandler kind always wins, regardless of params.
let mut unit = empty_unit();
unit.kind = AnalysisUnitKind::RouteHandler;
assert!(unit_has_user_input_evidence(&unit));
}
/// `is_external_input_param_name` covers id-, token-, and
/// framework-request shapes; bare internal-typed names are
/// rejected so internal helpers stay outside the gate.
#[test]
fn external_input_param_name_classification() {
// ID-shaped names.
assert!(is_external_input_param_name("id"));
assert!(is_external_input_param_name("doc_id"));
assert!(is_external_input_param_name("groupId"));
assert!(is_external_input_param_name("voucher_code_ids"));
// Token-shaped names.
assert!(is_external_input_param_name("token"));
assert!(is_external_input_param_name("access_token"));
assert!(is_external_input_param_name("refreshToken"));
// Framework request / extractor names.
assert!(is_external_input_param_name("request"));
assert!(is_external_input_param_name("req"));
assert!(is_external_input_param_name("ctx"));
assert!(is_external_input_param_name("path"));
assert!(is_external_input_param_name("payload"));
assert!(is_external_input_param_name("dto"));
assert!(is_external_input_param_name("query"));
// Internal-typed names that internal helpers / migrations
// commonly use must NOT match.
assert!(!is_external_input_param_name("apps"));
assert!(!is_external_input_param_name("schema_editor"));
assert!(!is_external_input_param_name("config"));
assert!(!is_external_input_param_name("items"));
assert!(!is_external_input_param_name("promotion"));
assert!(!is_external_input_param_name("update_rule_variants"));
assert!(!is_external_input_param_name("manager"));
// `c` alone is too common as a local variable to count.
assert!(!is_external_input_param_name("c"));
}
/// Phase A4 row-fetch exemption.
///
/// Row var declared at line 10; auth check naming the row appears
/// at line 20. An operation at line 10 (the fetch) is exempted
/// because the auth check authorises the resulting row. Coverage
/// is intentionally narrow — operations between fetch (10) and
/// check (20) that are NOT row-fetch sites must still flag.
#[test]
fn row_fetch_exemption_covers_fetch_when_check_names_row() {
use super::has_row_fetch_exemption;
use crate::auth_analysis::model::{
AuthCheck, AuthCheckKind, OperationKind, SensitiveOperation,
};
let mut unit = empty_unit();
// `let community = Community::read(pool, data.community_id)?;` at line 10
unit.row_population_data.insert(
"community".to_string(),
(10, vec![member("data", "community_id")]),
);
// Auth check at line 20 with `community` as a subject base.
unit.auth_checks.push(AuthCheck {
kind: AuthCheckKind::Membership,
callee: "check_community_user_action".into(),
subjects: vec![member("community", "id")],
span: (0, 0),
line: 20,
args: Vec::new(),
condition_text: None,
});
let fetch_op = SensitiveOperation {
kind: OperationKind::Read,
sink_class: None,
callee: "Community.read".into(),
subjects: vec![member("data", "community_id")],
span: (0, 0),
line: 10,
text: String::new(),
};
assert!(has_row_fetch_exemption(&unit, &fetch_op));
// Operation at a different line (between fetch and check) is
// NOT a row-fetch site — exemption does not apply.
let mid_op = SensitiveOperation {
kind: OperationKind::Mutation,
sink_class: None,
callee: "delete_post".into(),
subjects: vec![member("data", "post_id")],
span: (0, 0),
line: 15,
text: String::new(),
};
assert!(!has_row_fetch_exemption(&unit, &mid_op));
}
#[test]
fn row_fetch_exemption_skips_when_no_check_names_row() {
use super::has_row_fetch_exemption;
use crate::auth_analysis::model::{OperationKind, SensitiveOperation};
let mut unit = empty_unit();
unit.row_population_data.insert(
"community".to_string(),
(10, vec![member("data", "community_id")]),
);
// No auth check pushed — exemption must NOT apply.
let fetch_op = SensitiveOperation {
kind: OperationKind::Read,
sink_class: None,
callee: "Community.read".into(),
subjects: vec![member("data", "community_id")],
span: (0, 0),
line: 10,
text: String::new(),
};
assert!(!has_row_fetch_exemption(&unit, &fetch_op));
}
#[test]
fn row_fetch_exemption_ignores_login_token_checks() {
use super::has_row_fetch_exemption;
use crate::auth_analysis::model::{
AuthCheck, AuthCheckKind, OperationKind, SensitiveOperation,
};
let mut unit = empty_unit();
unit.row_population_data.insert(
"community".to_string(),
(10, vec![member("data", "community_id")]),
);
// Login-only check on the row should NOT exempt the row-fetch
// — login proves identity, not authorization.
unit.auth_checks.push(AuthCheck {
kind: AuthCheckKind::LoginGuard,
callee: "require_login".into(),
subjects: vec![member("community", "id")],
span: (0, 0),
line: 20,
args: Vec::new(),
condition_text: None,
});
let fetch_op = SensitiveOperation {
kind: OperationKind::Read,
sink_class: None,
callee: "Community.read".into(),
subjects: vec![member("data", "community_id")],
span: (0, 0),
line: 10,
text: String::new(),
};
assert!(!has_row_fetch_exemption(&unit, &fetch_op));
}
/// Row-population reverse-walk (lemmy fetch-then-check pattern).
///
/// `let community = Community::read(pool, data.community_id)` at
/// line 10 records `community → [data.community_id]`. An auth
/// check on `community` at line 20 must materially cover any
/// downstream operation that re-uses `data.community_id` (e.g. a
/// later `delete_mods_for_community(pool, community_id)`),
/// because the check authorised access to the row that was
/// fetched using that id.
#[test]
fn auth_check_covers_subject_via_row_population_reverse_walk() {
use crate::auth_analysis::model::{AuthCheck, AuthCheckKind};
let mut unit = empty_unit();
unit.row_population_data.insert(
"community".to_string(),
(10, vec![member("data", "community_id")]),
);
let check = AuthCheck {
kind: AuthCheckKind::Membership,
callee: "check_community_user_action".into(),
subjects: vec![member("community", "id")],
span: (0, 0),
line: 20,
args: Vec::new(),
condition_text: None,
};
// Direct member subject `data.community_id` (the original
// request field) — covered via reverse-walk.
assert!(auth_check_covers_subject(
&check,
&member("data", "community_id"),
&unit
));
// The assertion above models a later op that re-passes the
// *same* id-bearing argument used by
// `Community::read(pool, data.community_id)`: it is covered
// even though the check's subject names the row, not the id.
// Before the fix, this fired as
// `rs.auth.missing_ownership_check` on lemmy
// `community/transfer.rs:88` and similar.
// Negative: an unrelated id (different request field that
// never populated this row) must NOT be covered.
assert!(!auth_check_covers_subject(
&check,
&member("data", "post_id"),
&unit
));
}
/// Subject as plain identifier copied from the request
/// (`let community_id = data.community_id; let community =
/// Community::read(pool, community_id);`) must also benefit from
/// the reverse-walk — `row_population_data["community"]` then
/// records `[community_id]` (a plain identifier, not the
/// member-access shape).
#[test]
fn auth_check_covers_subject_via_row_population_reverse_walk_plain_arg() {
use crate::auth_analysis::model::{AuthCheck, AuthCheckKind};
let mut unit = empty_unit();
unit.row_population_data
.insert("community".to_string(), (10, vec![plain("community_id")]));
let check = AuthCheck {
kind: AuthCheckKind::Membership,
callee: "check_community_mod_action".into(),
subjects: vec![member("community", "id")],
span: (0, 0),
line: 20,
args: Vec::new(),
condition_text: None,
};
assert!(auth_check_covers_subject(
&check,
&plain("community_id"),
&unit
));
// Different plain id is not covered.
assert!(!auth_check_covers_subject(&check, &plain("post_id"), &unit));
}
/// Local-alias chain coverage (lemmy `community/transfer.rs` shape).
///
/// `let community = Community::read(pool, req.community_id)` at
/// line 10 records `community → [req.community_id]`. After the
/// auth check on the row, the handler aliases the request field
/// into a local: `let community_id = req.community_id;` then
/// reuses the bare `community_id` in a downstream sink.
/// `var_alias_chain["community_id"] = "req.community_id"` lets
/// the reverse-walk match the population args (which still
/// contain the original member chain) against the plain subject.
#[test]
fn auth_check_covers_subject_via_row_population_alias_chain() {
use crate::auth_analysis::model::{AuthCheck, AuthCheckKind};
let mut unit = empty_unit();
unit.row_population_data.insert(
"community".to_string(),
(10, vec![member("req", "community_id")]),
);
unit.var_alias_chain
.insert("community_id".to_string(), "req.community_id".to_string());
let check = AuthCheck {
kind: AuthCheckKind::Membership,
callee: "check_community_user_action".into(),
subjects: vec![member("community", "id")],
span: (0, 0),
line: 20,
args: Vec::new(),
condition_text: None,
};
// Sink subject is the bare alias — covered via the chain.
assert!(auth_check_covers_subject(
&check,
&plain("community_id"),
&unit
));
// The original member-access subject is still covered (no
// regression in the existing reverse-walk path).
assert!(auth_check_covers_subject(
&check,
&member("req", "community_id"),
&unit
));
// Plain identifier with no alias entry must NOT be covered.
assert!(!auth_check_covers_subject(&check, &plain("post_id"), &unit));
}
}
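
The three tests above (member-access arg, plain arg, alias chain) pin the behaviour of the reverse-walk. A minimal sketch of the matching logic they exercise follows. It is not the code in `checks.rs`: `Subject`, `Unit`, and `check_covers` are hypothetical stand-ins, and the declaration line stored alongside the population args in `row_population_data` is dropped for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the analyser's value reference.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Subject {
    Plain(String),
    Member { base: String, field: String },
}

impl Subject {
    /// Dotted text form: `community_id` or `req.community_id`.
    fn dotted(&self) -> String {
        match self {
            Subject::Plain(name) => name.clone(),
            Subject::Member { base, field } => format!("{base}.{field}"),
        }
    }

    /// Root variable the subject hangs off (`community` for `community.id`).
    fn root(&self) -> &str {
        match self {
            Subject::Plain(name) => name,
            Subject::Member { base, .. } => base,
        }
    }
}

fn plain(name: &str) -> Subject {
    Subject::Plain(name.to_string())
}

fn member(base: &str, field: &str) -> Subject {
    Subject::Member { base: base.to_string(), field: field.to_string() }
}

/// Simplified unit: row variable -> args of the call that populated it,
/// plus local alias -> member chain it was bound from.
struct Unit {
    row_population: HashMap<String, Vec<Subject>>,
    var_alias_chain: HashMap<String, String>,
}

/// True when a check on `check_subjects` covers `sink`, either directly or
/// because the checked row was fetched using `sink` (or the chain it aliases).
fn check_covers(check_subjects: &[Subject], sink: &Subject, unit: &Unit) -> bool {
    if check_subjects.contains(sink) {
        return true;
    }
    // Textual forms the sink may take inside the population args.
    let mut forms = vec![sink.dotted()];
    if let Subject::Plain(name) = sink {
        if let Some(chain) = unit.var_alias_chain.get(name) {
            forms.push(chain.clone());
        }
    }
    // Reverse walk: checked row -> args that populated it -> sink.
    check_subjects.iter().any(|checked| {
        unit.row_population
            .get(checked.root())
            .map_or(false, |args| args.iter().any(|arg| forms.contains(&arg.dotted())))
    })
}
```

With `community` populated from `req.community_id` and `community_id` aliased to `req.community_id`, a check on `community.id` covers both `req.community_id` and the bare `community_id`, while an unrelated `post_id` stays uncovered.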

View file

@ -1,4 +1,5 @@
use crate::auth_analysis::model::SinkClass;
use crate::labels::bare_method_name;
use crate::utils::config::Config;
#[derive(Debug, Clone)]
@ -175,7 +176,7 @@ impl AuthAnalysisRules {
/// receiver — `someElement.addEventListener` is just as
/// categorically client-side as `document.addEventListener`.
pub fn callee_has_non_sink_method(&self, callee: &str) -> bool {
let last = callee.rsplit('.').next().unwrap_or(callee);
let last = bare_method_name(callee);
let last = last.rsplit("::").next().unwrap_or(last);
if last.is_empty() {
return false;
@ -244,11 +245,29 @@ impl AuthAnalysisRules {
if self.receiver_matches_any_prefix(first, &self.cache_receiver_prefixes) {
return Some(SinkClass::CacheCrossTenant);
}
if self.is_mutation(callee) {
return Some(SinkClass::DbMutation);
}
if self.is_read(callee) {
return Some(SinkClass::DbCrossTenantRead);
// Verb-name fallback (`is_mutation` / `is_read`) is the loosest
// dispatch: it prefix-matches the bare method name against
// generic verbs (`Get`, `Save`, `Find`, …) regardless of the
// receiver. When the receiver chain itself contains a call
// expression (`w.Header().Get(..)`, `r.URL.Query().Get(..)`,
// `db.Tx(..).Query(..)`), the receiver is the *return value of
// another call* — its type is opaque to the auth analyser and
// the bare verb match is too speculative to assume a data-layer
// sink. The realtime/outbound/cache prefix dispatches above
// already match by the chain root; if none of them claimed the
// receiver, dropping the verb-name fallback for chained-call
// shapes prevents the entire `w.Header().Get` /
// `r.URL.Query().Get` cluster from masquerading as a
// `DbCrossTenantRead`. A canonical data-layer call still has a
// bare-identifier receiver (`repo.Find(id)`, `db.Query(..)`)
// and is unaffected.
if !receiver_is_chained_call(callee) {
if self.is_mutation(callee) {
return Some(SinkClass::DbMutation);
}
if self.is_read(callee) {
return Some(SinkClass::DbCrossTenantRead);
}
}
None
}
@ -596,6 +615,38 @@ pub fn build_auth_rules(config: &Config, lang_slug: &str) -> AuthAnalysisRules {
"verify_access!".into(),
"can_access?".into(),
"can?".into(),
// Rails per-record permission predicates — the canonical
// "load by id, then check on the loaded record" idiom
// (see redmine `app/controllers/issues_controller.rb`,
// mastodon controllers, diaspora ApplicationController).
// Combined with `row_population_data` reverse-walk, this
// recognises the post-fetch ownership check that is
// textually after the find call.
"visible?".into(),
"editable?".into(),
"editable_by?".into(),
"deletable?".into(),
"deletable_by?".into(),
"destroyable?".into(),
"destroyable_by?".into(),
"commentable?".into(),
"commentable_by?".into(),
"permitted?".into(),
"accessible?".into(),
"accessible_by?".into(),
"authorized?".into(),
"allowed_to?".into(),
"allowed?".into(),
"viewable?".into(),
"viewable_by?".into(),
"writable?".into(),
"writable_by?".into(),
"readable?".into(),
"readable_by?".into(),
"manageable?".into(),
"manageable_by?".into(),
"owned_by?".into(),
"belongs_to?".into(),
],
mutation_indicator_names: vec![
"update".into(),
@ -1294,13 +1345,32 @@ pub fn first_receiver_segment(callee: &str) -> &str {
callee.split('.').next().unwrap_or(callee)
}
/// True when the callee's receiver chain contains a call expression —
/// i.e. the LAST segment is being invoked on the *return value* of an
/// earlier call (`w.Header().Get`, `r.URL.Query().Get`,
/// `db.Tx(opts).Query`). Detected as: the substring before the last
/// `.` contains a `(`.
///
/// `classify_sink_class` consults this to suppress the loose verb-name
/// fallback (`is_read` / `is_mutation`) for chained-call shapes whose
/// receiver type is opaque to the analyser.
pub fn receiver_is_chained_call(callee: &str) -> bool {
let Some((receiver, _method)) = callee.rsplit_once('.') else {
return false;
};
receiver.contains('(')
}
/// Recognise `require_<resource>_<role>` / `ensure_<resource>_<role>`
/// shapes where `<role>` is a closed-vocabulary authorization noun
/// (`member`, `owner`, `admin`, `access`, `permission`, `manager`,
/// `editor`, `viewer`). The resource segment is project-specific
/// (`trip`, `doc`, `project`, `workspace`, …) and cannot be enumerated
/// in the static defaults — but the prefix+role pattern is unambiguous
/// enough that recognising it as an authorization check is safe.
/// `editor`, `viewer`, `user`, `mod`). The resource segment is
/// project-specific (`trip`, `doc`, `project`, `community`, …) and
/// cannot be enumerated in the static defaults — but the
/// prefix+role pattern is unambiguous enough that recognising it as
/// an authorization check is safe. Also accepts `is_<role>` /
/// `is_<role>_(or|and)_<role>...` predicate forms (`is_admin`,
/// `is_mod_or_admin`).
///
/// Strips path-namespace and method prefixes before matching:
/// `authz::require_trip_member` → `require_trip_member`;
@ -1309,23 +1379,60 @@ fn is_require_resource_role_call(name: &str) -> bool {
let last = name.rsplit("::").next().unwrap_or(name);
let last = last.rsplit('.').next().unwrap_or(last);
let lower = last.to_ascii_lowercase();
let after_prefix = if let Some(rest) = lower.strip_prefix("require_") {
rest
} else if let Some(rest) = lower.strip_prefix("ensure_") {
rest
} else {
return false;
};
let Some(last_underscore) = after_prefix.rfind('_') else {
return false;
};
// Must have at least one resource char before the role and a
// non-empty role after. Rejects degenerate `require__member`,
// `require_member` (no resource).
if last_underscore == 0 || last_underscore == after_prefix.len() - 1 {
return false;
// Pattern 1: `<verb>_<resource>_<role>[_<context>]?` where
// <verb> ∈ {require, ensure, check, assert, verify} and
// <context> ∈ {action, allowed, valid} (a small closed suffix
// set that wraps the role, e.g. `check_community_mod_action`).
if let Some(after_prefix) = strip_auth_verb_prefix(&lower) {
let core = strip_role_context_suffix(after_prefix);
if let Some(last_underscore) = core.rfind('_')
&& last_underscore > 0
&& last_underscore < core.len() - 1
{
let role = &core[last_underscore + 1..];
if is_known_auth_role(role) {
return true;
}
}
}
let role = &after_prefix[last_underscore + 1..];
// Pattern 2: `is_<role>` and `is_<role>_(or|and)_<role>...`.
// Conservative role list — excludes `user` / `staff` to avoid
// matching ambiguous predicates like `is_user`.
if let Some(rest) = lower.strip_prefix("is_")
&& !rest.is_empty()
&& all_tokens_are_predicate_roles(rest)
{
return true;
}
false
}
fn strip_auth_verb_prefix(lower: &str) -> Option<&str> {
for verb in ["require_", "ensure_", "check_", "assert_", "verify_"] {
if let Some(rest) = lower.strip_prefix(verb) {
return Some(rest);
}
}
None
}
/// Strip a single trailing `_<context>` suffix where <context> wraps
/// a role word with extra noise (`_action` / `_allowed` / `_valid`).
/// Does NOT strip `_access` / `_permission` because those are
/// themselves valid role suffixes (`require_doc_access`).
fn strip_role_context_suffix(s: &str) -> &str {
for suffix in ["_action", "_allowed", "_valid"] {
if let Some(stripped) = s.strip_suffix(suffix) {
return stripped;
}
}
s
}
fn is_known_auth_role(role: &str) -> bool {
matches!(
role,
"member"
@ -1344,9 +1451,55 @@ fn is_require_resource_role_call(name: &str) -> bool {
| "viewer"
| "viewers"
| "role"
| "user"
| "mod"
| "mods"
| "moderator"
| "moderators"
)
}
/// `is_<role>` predicate role set. Tighter than the
/// `<verb>_<resource>_<role>` set because predicates lack the
/// resource segment that disambiguates ambiguous role nouns
/// (`is_user` could be a typeof check, not an authorization check).
fn is_predicate_auth_role(role: &str) -> bool {
matches!(
role,
"admin"
| "admins"
| "owner"
| "owners"
| "member"
| "members"
| "manager"
| "managers"
| "moderator"
| "moderators"
| "mod"
| "mods"
| "editor"
| "editors"
)
}
/// Returns `true` iff every `_or_` / `_and_`-separated token in `rest`
/// is a known predicate auth role. E.g. `mod_or_admin` → true,
/// `mod_or_owner_and_admin` → true, `mod_or_logged_in` → false.
fn all_tokens_are_predicate_roles(rest: &str) -> bool {
let mut tokens: Vec<&str> = vec![rest];
for sep in &["_or_", "_and_"] {
let mut next: Vec<&str> = Vec::new();
for t in &tokens {
for piece in t.split(sep) {
next.push(piece);
}
}
tokens = next;
}
!tokens.is_empty() && tokens.iter().all(|t| is_predicate_auth_role(t))
}
pub fn matches_name(name: &str, pattern: &str) -> bool {
let name_last = name.rsplit('.').next().unwrap_or(name);
let pattern_last = pattern.rsplit('.').next().unwrap_or(pattern);
@ -1521,6 +1674,51 @@ mod tests {
);
}
#[test]
fn receiver_is_chained_call_detects_intermediate_calls() {
use super::receiver_is_chained_call;
// Chained-call shape: receiver chain contains a `(`.
assert!(receiver_is_chained_call("w.Header().Get"));
assert!(receiver_is_chained_call("r.URL.Query().Get"));
assert!(receiver_is_chained_call("db.Tx(opts).Query"));
assert!(receiver_is_chained_call("client.WithToken(t).Get"));
// Pure field/identifier chain — no `(` anywhere.
assert!(!receiver_is_chained_call("repo.Find"));
assert!(!receiver_is_chained_call("c.Fs.Create"));
assert!(!receiver_is_chained_call("globalBatchJobsMetrics.save"));
assert!(!receiver_is_chained_call("self.cache.insert"));
// Bare callee with no receiver.
assert!(!receiver_is_chained_call("Get"));
assert!(!receiver_is_chained_call("HashMap::new"));
}
#[test]
fn classify_sink_class_suppresses_chained_call_verb_fallback() {
use crate::auth_analysis::model::SinkClass;
use std::collections::HashSet;
let cfg = Config::default();
let rules = build_auth_rules(&cfg, "go");
let empty: HashSet<String> = HashSet::new();
// Chained-call receiver: verb-name fallback is suppressed.
// The minio `w.Header().Get(constName)` cluster — `Get` would
// match the `Get` read indicator on a bare receiver but the
// chained-call shape masks the receiver type.
assert_eq!(rules.classify_sink_class("w.Header().Get", &empty), None);
assert_eq!(rules.classify_sink_class("r.URL.Query().Get", &empty), None);
// Bare-identifier receiver: verb-name fallback still fires.
// Pin the regression guard so this fix doesn't over-suppress
// canonical data-layer shapes.
assert_eq!(
rules.classify_sink_class("repo.Find", &empty),
Some(SinkClass::DbCrossTenantRead)
);
assert_eq!(
rules.classify_sink_class("repo.Save", &empty),
Some(SinkClass::DbMutation)
);
}
#[test]
fn sink_class_is_auth_relevant_only_for_non_local_classes() {
use crate::auth_analysis::model::SinkClass;
@ -1614,4 +1812,50 @@ mod tests {
assert!(!rules.is_authorization_check("require_member"));
assert!(!rules.is_authorization_check("require_owner"));
}
/// Phase A4: broader verb / role / context-suffix shapes seen in
/// real-world Rust apps. `check_<resource>_<role>_action` is the
/// canonical lemmy idiom; the `is_<role>` predicate recogniser
/// additionally covers `is_mod_or_admin`-style checks.
#[test]
fn is_authorization_check_recognises_check_action_and_predicate_shapes() {
let cfg = Config::default();
let rules = build_auth_rules(&cfg, "rust");
// `check_<resource>_<role>_action` (lemmy `check_community_*_action`)
assert!(rules.is_authorization_check("check_community_user_action"));
assert!(rules.is_authorization_check("check_community_mod_action"));
assert!(rules.is_authorization_check("check_community_admin_action"));
assert!(rules.is_authorization_check("check_post_owner_action"));
// Verb variants
assert!(rules.is_authorization_check("assert_post_owner"));
assert!(rules.is_authorization_check("verify_doc_editor"));
// `_allowed` / `_valid` context suffix wrapping the role
assert!(rules.is_authorization_check("require_trip_member_allowed"));
assert!(rules.is_authorization_check("ensure_doc_owner_valid"));
// Path-namespaced
assert!(rules.is_authorization_check("authz::check_community_user_action"));
assert!(rules.is_authorization_check("self.check_community_mod_action"));
// `is_<role>` and `is_<role>_(or|and)_<role>` predicates.
assert!(rules.is_authorization_check("is_admin"));
assert!(rules.is_authorization_check("is_owner"));
assert!(rules.is_authorization_check("is_member"));
assert!(rules.is_authorization_check("is_moderator"));
assert!(rules.is_authorization_check("is_mod_or_admin"));
assert!(rules.is_authorization_check("is_owner_or_admin"));
assert!(rules.is_authorization_check("is_admin_or_moderator"));
assert!(rules.is_authorization_check("is_member_and_owner"));
// Negatives — predicates whose tokens are NOT known auth roles.
assert!(!rules.is_authorization_check("is_user"));
assert!(!rules.is_authorization_check("is_logged_in"));
assert!(!rules.is_authorization_check("is_active"));
assert!(!rules.is_authorization_check("is_visible"));
assert!(!rules.is_authorization_check("is_admin_or_logged_in"));
// An `_action` / `_allowed` / `_valid` suffix without a
// preceding role is still rejected.
assert!(!rules.is_authorization_check("check_db_action"));
assert!(!rules.is_authorization_check("check_session_valid"));
}
}
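
To illustrate how the chained-call gate interacts with the verb-name fallback, here is a small standalone sketch. `receiver_is_chained_call` is copied from the hunk above; `SinkGuess`, `verb_fallback`, and the two verb lists are hypothetical simplifications and do not reflect the real `is_mutation` / `is_read` indicator sets.

```rust
/// Copied from the diff: true when the text before the last `.` contains `(`.
fn receiver_is_chained_call(callee: &str) -> bool {
    let Some((receiver, _method)) = callee.rsplit_once('.') else {
        return false;
    };
    receiver.contains('(')
}

/// Hypothetical, reduced version of the data-layer sink classes.
#[derive(Debug, PartialEq)]
enum SinkGuess {
    Mutation,
    Read,
}

/// Sketch of the loosest dispatch: prefix-match the last segment against
/// generic verbs, but only when the receiver is not the result of a call.
/// The verb lists are illustrative assumptions.
fn verb_fallback(callee: &str) -> Option<SinkGuess> {
    if receiver_is_chained_call(callee) {
        // Receiver type is opaque; a bare verb match is too speculative.
        return None;
    }
    let method = callee
        .rsplit('.')
        .next()
        .unwrap_or(callee)
        .to_ascii_lowercase();
    if ["save", "update", "delete"].iter().any(|v| method.starts_with(v)) {
        return Some(SinkGuess::Mutation);
    }
    if ["get", "find", "query"].iter().any(|v| method.starts_with(v)) {
        return Some(SinkGuess::Read);
    }
    None
}
```

Because the split happens at the last `.`, a dot inside an intermediate call's arguments (`db.Tx(o.opts).Query`) still leaves the `(` on the receiver side, so the shape is detected as chained.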

View file

@ -378,6 +378,19 @@ fn classify_rocket_param(
}
}
/// Classify a route-handler parameter type as a route-level auth
/// guard. Used to tag the route as gated by a login or admin check
/// when one of its parameters is a typed auth extractor.
///
/// **Looser than [`super::common::is_self_actor_type_text`] by
/// design.** This recogniser runs only on the type of a route-bound
/// parameter — appearing in a route handler signature is itself a
/// strong signal — and a false positive here just over-credits the
/// route with a login guard, which is conservative w.r.t. flagging.
/// `is_self_actor_type_text` runs on every parameter, including in
/// non-route functions, and a false positive there suppresses
/// downstream `V.id` flagging entirely; that path uses a structural
/// recogniser keyed on the `<PREFIX>User<SUFFIX>?` shape.
fn classify_guard_type(type_text: &str) -> Option<AuthCheckKind> {
let lower = type_text.to_ascii_lowercase();
if is_extractor_wrapper(&lower) {

File diff suppressed because it is too large

View file

@ -9,6 +9,7 @@ use crate::auth_analysis::extract::common::{attach_route_handler, collect_top_le
use crate::auth_analysis::model::{
AnalysisUnitKind, AuthorizationModel, CallSite, Framework, HttpMethod,
};
use crate::labels::bare_method_name;
use crate::utils::project::{DetectedFramework, FrameworkContext};
use std::path::Path;
use tree_sitter::{Node, Tree};
@ -55,7 +56,7 @@ fn maybe_collect_django_path(
return;
};
let callee = text(function, bytes);
let target = callee.rsplit('.').next().unwrap_or(&callee);
let target = bare_method_name(&callee);
if !matches!(target, "path" | "re_path") {
return;
}

View file

@ -6,6 +6,7 @@ use super::common::{
use crate::auth_analysis::config::{AuthAnalysisRules, matches_name};
use crate::auth_analysis::extract::common::{collect_top_level_units, decorated_definition_child};
use crate::auth_analysis::model::{AuthorizationModel, CallSite, Framework, HttpMethod};
use crate::labels::bare_method_name;
use crate::utils::project::{DetectedFramework, FrameworkContext};
use std::path::Path;
use tree_sitter::{Node, Tree};
@ -117,7 +118,7 @@ fn parse_flask_route_decorator(
};
let callee = text(function, bytes);
let method_name = callee.rsplit('.').next().unwrap_or(&callee);
let method_name = bare_method_name(&callee);
let arguments = decorator_expr.child_by_field_name("arguments")?;
let args = named_children(arguments);

View file

@ -7,6 +7,7 @@ use crate::auth_analysis::config::{AuthAnalysisRules, matches_name, strip_quotes
use crate::auth_analysis::model::{
AnalysisUnitKind, AuthorizationModel, CallSite, Framework, HttpMethod, RouteRegistration,
};
use crate::labels::bare_method_name;
use crate::utils::project::{DetectedFramework, FrameworkContext};
use std::path::Path;
use tree_sitter::{Node, Tree};
@ -175,7 +176,7 @@ fn class_filter_directives(body: Node<'_>, bytes: &[u8]) -> Vec<FilterDirective>
continue;
}
let callee = call_name(child, bytes);
let directive_name = callee.rsplit('.').next().unwrap_or(&callee);
let directive_name = bare_method_name(&callee);
if !matches_name(directive_name, "before_action")
&& !matches_name(directive_name, "prepend_before_action")
&& !matches_name(directive_name, "skip_before_action")

View file

@ -7,6 +7,7 @@ use crate::auth_analysis::config::{AuthAnalysisRules, matches_name};
use crate::auth_analysis::model::{
AnalysisUnitKind, AuthorizationModel, CallSite, Framework, HttpMethod, RouteRegistration,
};
use crate::labels::bare_method_name;
use crate::utils::project::{DetectedFramework, FrameworkContext};
use std::path::Path;
use tree_sitter::{Node, Tree};
@ -43,7 +44,7 @@ fn collect_before_filters(root: Node<'_>, bytes: &[u8]) -> Vec<CallSite> {
continue;
}
let callee = call_name(child, bytes);
let target = callee.rsplit('.').next().unwrap_or(&callee);
let target = bare_method_name(&callee);
if !matches_name(target, "before") {
continue;
}
@ -79,7 +80,7 @@ fn maybe_collect_route(
model: &mut AuthorizationModel,
) {
let callee = call_name(node, bytes);
let route_name = callee.rsplit('.').next().unwrap_or(&callee);
let route_name = bare_method_name(&callee);
let method = match route_name.to_ascii_lowercase().as_str() {
"get" => HttpMethod::Get,
"post" => HttpMethod::Post,

View file

@ -59,6 +59,7 @@ pub fn run_auth_analysis(
// (skipped for slug-lookup / unit-test call sites).
if let Some(types) = var_types {
apply_var_types_to_model(&mut model, &rules, types);
apply_typed_bounded_params(&mut model, types);
}
// Lift per-function auth-check summaries and synthesise call-site
@ -220,6 +221,47 @@ fn apply_var_types_to_model(
}
}
/// Populate each [`model::AnalysisUnit::typed_bounded_vars`] with the
/// names of formal parameters whose SSA-inferred [`TypeKind`] is a
/// payload-incompatible scalar ([`TypeKind::Int`] or
/// [`TypeKind::Bool`]). Only parameter-rooted entries are considered;
/// function-local bindings stay outside this set so a downstream
/// reassignment from user input (`let id = req.params.id`) never gets
/// suppressed by accident.
///
/// Phase 6: when a parameter's type is a [`TypeKind::Dto`], lift each
/// of its `Int`/`Bool` fields as `typed_bounded_dto_fields[<param>]`
/// so member-access subjects like `dto.age` are recognised as
/// payload-incompatible. Only fires when the base param itself was
/// recognised as a typed extractor by a Phase 1-2 matcher — bare
/// parameters with no framework gate never lift their fields.
fn apply_typed_bounded_params(model: &mut model::AuthorizationModel, var_types: &VarTypes) {
for unit in &mut model.units {
for name in &unit.params {
let Some(ty) = var_types.get(name) else {
continue;
};
match ty {
TypeKind::Int | TypeKind::Bool => {
unit.typed_bounded_vars.insert(name.clone());
}
TypeKind::Dto(dto) => {
let mut bounded = Vec::new();
for (field_name, field_kind) in &dto.fields {
if matches!(field_kind, TypeKind::Int | TypeKind::Bool) {
bounded.push(field_name.clone());
}
}
if !bounded.is_empty() {
unit.typed_bounded_dto_fields.insert(name.clone(), bounded);
}
}
_ => {}
}
}
}
}
/// First segment of a callee's receiver chain (`map.insert` → `"map"`,
/// `self.cache.set` → `"self"`). Returns `None` when the callee has no
/// receiver (e.g. a free function call).
@ -676,9 +718,15 @@ mod tests {
condition_texts: Vec::new(),
line: 1,
row_field_vars: HashMap::new(),
var_alias_chain: HashMap::new(),
row_population_data: HashMap::new(),
self_actor_vars: HashSet::new(),
self_actor_id_vars: HashSet::new(),
authorized_sql_vars: HashSet::new(),
const_bound_vars: HashSet::new(),
typed_bounded_vars: HashSet::new(),
typed_bounded_dto_fields: HashMap::new(),
self_scoped_session_bases: HashSet::new(),
}
}
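
A self-contained sketch of the lifting performed by `apply_typed_bounded_params`, for readers who want to try it outside the crate. `TypeKind`, `Unit`, `lift_bounded_params`, and `is_bounded` here are hypothetical reductions: the real `TypeKind::Dto` carries a richer payload, and the real lookup lives in `is_typed_bounded_subject`, whose signature is not shown in this diff.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, reduced type lattice; the DTO variant is just a field list.
#[derive(Debug, Clone)]
enum TypeKind {
    Int,
    Bool,
    Str,
    Dto(Vec<(String, TypeKind)>),
}

/// Reduced analysis unit holding only the fields this sketch needs.
#[derive(Default)]
struct Unit {
    params: Vec<String>,
    typed_bounded_vars: HashSet<String>,
    typed_bounded_dto_fields: HashMap<String, Vec<String>>,
}

/// Record scalar params and scalar DTO fields as payload-incompatible.
/// Only names listed in `unit.params` are considered, so locals never lift.
fn lift_bounded_params(unit: &mut Unit, var_types: &HashMap<String, TypeKind>) {
    for name in &unit.params {
        let Some(ty) = var_types.get(name) else {
            continue;
        };
        match ty {
            TypeKind::Int | TypeKind::Bool => {
                unit.typed_bounded_vars.insert(name.clone());
            }
            TypeKind::Dto(fields) => {
                let bounded: Vec<String> = fields
                    .iter()
                    .filter(|(_, kind)| matches!(kind, TypeKind::Int | TypeKind::Bool))
                    .map(|(field, _)| field.clone())
                    .collect();
                if !bounded.is_empty() {
                    unit.typed_bounded_dto_fields.insert(name.clone(), bounded);
                }
            }
            _ => {}
        }
    }
}

/// Lookup for a subject `base` or `base.field` (hypothetical helper).
fn is_bounded(unit: &Unit, base: &str, field: Option<&str>) -> bool {
    match field {
        None => unit.typed_bounded_vars.contains(base),
        Some(f) => unit
            .typed_bounded_dto_fields
            .get(base)
            .map_or(false, |fields| fields.iter().any(|x| x == f)),
    }
}
```

In this sketch a typed `Int` entry for a name that is not in `params` is ignored, which mirrors the doc comment's point that function-local bindings stay outside the bounded set.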

View file

@ -168,6 +168,26 @@ pub struct AnalysisUnit {
/// row-level ownership-equality check on the row implicitly covers
/// downstream uses of fields read from the same row.
pub row_field_vars: HashMap<String, String>,
/// Map from local variable name to the full member-chain expression
/// it was bound from (`let community_id = req.community_id` →
/// `community_id → "req.community_id"`). Distinct from
/// `row_field_vars`, which records only the receiver (loses the
/// field name). Powers the row-population reverse-walk's local-
/// alias case: when a sink subject is a plain identifier, the
/// reverse walk consults this map to also accept rows whose
/// population args contain the aliased chain.
pub var_alias_chain: HashMap<String, String>,
/// Per row-binding metadata: the `let ROW = CALL(..)` declaration
/// line and the value-refs appearing in the call's arguments.
/// Populated for every `let V = call(..)` shape. Powers the
/// "fetch-then-authorize" exemption in `checks.rs`: if a row-fetch
/// operation produces variable `V` and SOME auth check elsewhere
/// in the unit names `V`, the row-fetch operation is considered
/// authorized — even though the check appears textually after the
/// fetch. This is the standard idiom in row-level authz code:
/// fetch the row first to extract the resource id, then call
/// `check_<resource>_<role>(&user, &row, ...)` to authorize it.
pub row_population_data: HashMap<String, (usize, Vec<ValueRef>)>,
/// Variables bound to an authenticated-user value. Populated from
/// `let V = require_auth(..).await?` (or any call matching the
/// configured login-guard / authorization-check names) and from
@ -196,6 +216,46 @@ pub struct AnalysisUnit {
/// and treats a subject as covered when the chain terminates in
/// one of these names.
pub authorized_sql_vars: HashSet<String>,
/// Local variables bound (by `let`, `:=`, `var`, `const`) to a
/// pure literal — string, integer, float, or boolean. These are
/// developer-chosen constants and cannot be user-controlled, so
/// they must never trip `<lang>.auth.missing_ownership_check`
/// even when the variable name passes `is_id_like`. Closes the
/// gin/context_test.go FP where `id := "id"` triggered the rule.
pub const_bound_vars: HashSet<String>,
/// Function parameter names whose static type maps to a
/// payload-incompatible scalar ([`crate::ssa::type_facts::TypeKind::Int`]
/// or [`crate::ssa::type_facts::TypeKind::Bool`]). Populated
/// per-file by [`super::apply_typed_bounded_params`] using the
/// SSA-derived `VarTypes` map. Consulted by
/// `is_typed_bounded_subject` so parameters like Spring `Long
/// userId`, Axum `Path<i64>`, or FastAPI `user_id: int` are not
/// classified as scoped-identifier subjects even when their name
/// passes `is_id_like` — the framework guarantees the value is a
/// number that cannot carry a SQL/file/shell payload.
pub typed_bounded_vars: HashSet<String>,
/// Phase 6: per-DTO-extractor parameter, the field names whose
/// declared type is a payload-incompatible scalar. Map key is the
/// parameter name (e.g. `dto`), value is the list of field names
/// (e.g. `["age", "count"]`). Populated by
/// [`super::apply_typed_bounded_params`] only when the parameter
/// itself was recognised as a typed extractor by a Phase 1-2
/// matcher — bare parameters with no framework gate never lift
/// their fields.
pub typed_bounded_dto_fields: HashMap<String, Vec<String>>,
/// Per-unit dynamic session-base text set, supplementing the
/// hard-coded list in `is_self_scoped_session_base`. Populated by
/// the extractor when a parameter's static type signals a known
/// auth-context shape — e.g. TRPC's `Options { ctx: { user:
/// NonNullable<TrpcSessionUser> } }` adds `<localCtx>.user` so
/// downstream `ctx.user.id` accesses count as actor context. Each
/// entry is the dotted base text (e.g. `"ctx.user"`,
/// `"opts.ctx.user"`) that should match a subject's `base` when
/// the subject's `field` is an id-like field name. Distinct from
/// `self_actor_vars` (single-segment locals) because TRPC
/// destructures route through a base chain, not a top-level
/// binding.
pub self_scoped_session_bases: HashSet<String>,
}
/// Per-function summary of which positional parameters are