Prerelease cleanup (#46)

* feat: Add const_bound_vars tracking to prevent false positives in ownership checks

* feat: Introduce field interner and typed bounded vars for enhanced type tracking

* feat: Add typed_call_receivers and typed_bounded_dto_fields for enhanced type tracking

* feat: Centralize method name extraction with bare_method_name helper

* feat: Implement Phase-6 hierarchy fan-out for runtime virtual dispatch

* feat: Enhance C++ taint tracking with additional container operations and inline method resolution

* feat: Introduce field-sensitive points-to analysis for enhanced resource tracking

* feat: Implement Pointer-Phase 6 subscript handling for enhanced container analysis

* test: Add comprehensive tests for JavaScript control flow constructs and lattice operations

* docs: Update advanced analysis documentation with field-sensitive points-to and hierarchy fan-out details

* test: Add comprehensive tests for lattice algebra laws and SSA edge cases

* feat: Add destructured session user handling and safe user ID access patterns

* feat: Implement row-population reverse-walk for enhanced authorization checks

* feat: Enhance authorization checks with local alias chain for self-actor types

* feat: Introduce ActiveRecord query safety checks and enhance snippet extraction

* feat: Implement chained method call inner-gate rebinding for SSRF prevention

* feat: Add observability and error modules, enhance debug functionality, and implement theme context

* feat: Remove Auth Analysis page and update navigation to redirect to Explorer

* feat: Optimize SSA lowering by sharing results between taint engine and artifact extractor

* feat: Reset path-safe-suppressed spans before lowering to maintain analysis integrity

* fix(ssa): ungate debug_assert_bfs_ordering for release-tests build

The helper at src/ssa/lower.rs was gated `#[cfg(debug_assertions)]` while
the unit test at the bottom of the file was gated only `#[cfg(test)]`.
Since `cfg(test)` is set in release builds with `--tests` but
`cfg(debug_assertions)` is not, `cargo build --release --tests` failed
with E0425. Removing the gate fixes the build; the body is `debug_assert!`
only, so the helper is free in release. Also drop the gate at the call
site to avoid a `dead_code` warning when the lib is built without
`--tests`.
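
The gate mismatch can be reproduced in miniature. The sketch below is illustrative only: `debug_assert_bfs_ordering` keeps the name used in this commit, but its signature, its body, and the `lower` call site are assumptions, not the code in src/ssa/lower.rs.

```rust
// Hypothetical stand-in for the helper in src/ssa/lower.rs; the real
// signature and body are not shown in this commit message.
//
// Before the fix the helper carried `#[cfg(debug_assertions)]`, so it did
// not exist in a `--release --tests` build, while the `#[cfg(test)]` test
// below still referred to it (E0425: unresolved name).
//
// After the fix the helper is always compiled. Its body is only a
// `debug_assert!`, whose check is not evaluated when debug assertions
// are disabled.
pub fn debug_assert_bfs_ordering(block_ids: &[u32]) {
    debug_assert!(
        block_ids.windows(2).all(|w| w[0] <= w[1]),
        "blocks are not in BFS order"
    );
}

/// Call site (assumed shape): also ungated, so the helper is never dead code.
pub fn lower(block_ids: &[u32]) -> usize {
    debug_assert_bfs_ordering(block_ids);
    block_ids.len()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn bfs_assertion_accepts_sorted_ids() {
        // Compiles in both debug and release test builds.
        debug_assert_bfs_ordering(&[0, 1, 2]);
    }
}
```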

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(closure-capture): flip JS/TS fixtures to required-finding

The JS and TS closure-capture fixtures pinned the old broken behaviour
via `forbidden_findings: [{ "id_prefix": "taint-" }]`. The engine now
correctly traces taint through the closure boundary (env source captured
by an arrow function, sunk via `child_process.exec` inside the body), so
the formerly-forbidden finding is a true positive.

Match the Python sibling's shape — `required_findings` with
`id_prefix` + `min_count` plus a small `noise_budget` — and rewrite the
companion READMEs and the phase8_fragility_tests doc-comments from
"known gap" to "regression guard".

Verified:
- cargo test --release --test phase8_fragility_tests → 8/8 pass
- cargo test --release --lib bfs_assertion → pass
- corpus benchmark F1 = 0.9976 (TP=205, FP=1, FN=0) — unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: Add OWASP mapping and baseline mutation hooks for enhanced security analysis

* feat: Introduce health module and enhance health score computation with calibration tests

* feat: Add expectations configuration and cleanup .gitignore for log files

* feat: Implement theme selection and enhance settings panel for triage sync

* feat: Suppress false positives for strcpy calls with literal sources in AST

* feat: Update analyse_function_ssa to return body CFG for accurate analysis

* feat: Add bug report and feature request templates for improved issue tracking

* feat: removed dev scripts

* feat: update README.md for clarity and consistency in fixture descriptions

* feat: removed dev docs

* feat: clean up error handling and UI elements for improved user experience

* feat: adjust button sizes in HeaderBar for better UI consistency

* feat: enhance taint analysis with additional context for sanitizer and taint findings

* cargo fmt

* prettier

* refactor: simplify conditional checks and improve code readability in AST and screenshot capture scripts

* feat: add script to frame PNG screenshots with brand gradient

* feat: add fuzzing support with new targets and CI workflows

* refactor: streamline match expressions and improve formatting in CLI and output handling

* feat: enhance configuration display with detailed output options

* feat: stage demo configuration for improved CLI screenshot output

* feat: expose merge_configs function for user-configurable settings

* refactor: simplify code structure and improve readability in config handling

* refactor: improve descriptions for vulnerability patterns in various languages

* feat: update MIT License section with additional usage details and copyright information

* feat: update screenshots

* refactor: update build process and paths for frontend assets

* feat: add cross-file taint fuzzing target and supporting dictionary

* refactor: clean up formatting and comments in fuzz configuration and example files

* refactor: remove outdated comments and clean up CI configuration files

* chore: update changelog dates and improve formatting in documentation

* refactor: update Cargo.toml and CI configuration for improved packaging and build process

* refactor: enhance quote-stripping logic to prevent panics and add regression tests

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eli Peter 2026-04-29 00:58:38 -04:00 committed by GitHub
parent 79c29b394d
commit 82f18184b1
GPG key ID: B5690EEEBB952194
348 changed files with 48731 additions and 2925 deletions


@ -112,32 +112,7 @@ impl<'a> SinkSiteLocator<'a> {
}
}
/// Extract the source line containing `byte_offset`, trimmed and capped at
/// 120 chars. Returns `None` when the offset is out of range or the line
/// is entirely blank after trimming.
pub(crate) fn line_snippet(src: &[u8], byte_offset: usize) -> Option<String> {
if byte_offset >= src.len() {
return None;
}
let line_start = src[..byte_offset]
.iter()
.rposition(|&b| b == b'\n')
.map_or(0, |p| p + 1);
let line_end = src[byte_offset..]
.iter()
.position(|&b| b == b'\n')
.map_or(src.len(), |p| byte_offset + p);
let line = std::str::from_utf8(&src[line_start..line_end]).ok()?;
let trimmed = line.trim();
if trimmed.is_empty() {
return None;
}
if trimmed.len() > 120 {
Some(format!("{}...", &trimmed[..120]))
} else {
Some(trimmed.to_string())
}
}
pub(crate) use crate::utils::snippet::line_snippet;
/// Union two `SmallVec<[SinkSite; 1]>` lists with `(file_rel, line, col,
/// cap)` dedup. Preserves insertion order of `existing` then appends any
@ -403,6 +378,31 @@ pub struct FuncSummary {
/// alias.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub rust_wildcards: Option<Vec<String>>,
/// Per-file class / trait / interface hierarchy edges captured at
/// CFG-construction time. Each entry is
/// `(sub_container, super_container)` after language-specific
/// normalisation:
///
/// * Java `class X extends Y` → `(X, Y)`; `implements I, J` → `(X, I)`, `(X, J)`
/// * Rust `impl Trait for Type` → `(Type, Trait)`
/// * TypeScript `class X extends Y implements I` → `(X, Y)`, `(X, I)`
/// * Python `class X(A, B)` → `(X, A)`, `(X, B)`
/// * PHP `class X extends Y implements I` → `(X, Y)`, `(X, I)`
/// * Ruby `class X < Y` → `(X, Y)`
/// * C++ `class X : public Y` → `(X, Y)`
///
/// Empty for files with no declared inheritance / impl
/// relationships and for Go (which uses implicit interface
/// satisfaction — Phase 6 does not try to compute it).
///
/// **Per-file duplication.** Every `FuncSummary` produced from a
/// given file carries the **same** `hierarchy_edges` vector so the
/// information survives summary-by-summary persistence to SQLite.
/// `merge_summaries` deduplicates downstream when building
/// [`crate::callgraph::TypeHierarchyIndex`].
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub hierarchy_edges: Vec<(String, String)>,
}
// ── Cap conversion helpers ──────────────────────────────────────────────
@ -562,6 +562,20 @@ pub struct GlobalSummaries {
/// pass 1 and consumed by
/// [`crate::auth_analysis::run_auth_analysis`] during pass 2.
auth_by_key: HashMap<FuncKey, crate::auth_analysis::model::AuthCheckSummary>,
/// Phase 6 type hierarchy index for runtime virtual-dispatch fan-out.
///
/// Installed by [`Self::install_hierarchy`] after pass 1 from the
/// merged `FuncSummary::hierarchy_edges` vectors. Consumed by
/// [`Self::resolve_callee_widened`] during pass 2 so the taint
/// engine sees every concrete implementer of a method when the
/// receiver is statically typed as a super-class / trait /
/// interface — recovering the dispatch precision that today's
/// single-result [`Self::resolve_callee`] discards.
///
/// `None` until installed: every consumer treats `None` as
/// "fall through to today's bare resolution", so the index is
/// strictly additive.
hierarchy: Option<crate::callgraph::TypeHierarchyIndex>,
}
impl GlobalSummaries {
@ -858,6 +872,13 @@ impl GlobalSummaries {
for (key, auth_sum) in other.auth_by_key {
self.auth_by_key.insert(key, auth_sum);
}
// Hierarchy index: invalidate after a merge so the next consumer
// sees a freshly-built view that includes `other`'s edges. The
// alternative — point-merging two indexes — is racy when the
// same `(lang, super)` key carries different sub-orderings in
// each input; rebuild is O(n) over `by_key.iter()` and is the
// single source of truth.
self.hierarchy = None;
}
/// Insert an SSA summary.
@ -873,8 +894,74 @@ impl GlobalSummaries {
/// functions — we synthesize a disambig so both are kept. Silent
/// replacement in that case would drop one function's cross-file
/// taint signal entirely, which the caller cannot recover.
///
/// A summary may reference a parameter index at or above
/// `key.arity`. Such indices come from synthetic SSA `Param` ops
/// emitted by scoped lowering for **external captures** (free
/// identifiers like `this`, module imports, or unresolved method
/// names). They are not trimmed: when no entry exists at `key`, or
/// the existing entry also has out-of-range references, the summary
/// is stored untrimmed at the original key and key reconciliation is
/// skipped. Otherwise `ssa_summary_fits_arity` would reject the
/// summary and `reconcile_ssa_summary_key` would synthesise a
/// disambig that uncouples the SSA FuncKey from the matching
/// FuncSummary FuncKey (audit gap A.2.1.G1,
/// `project_typed_callgraph_audit_gap_ssa_disambig.md`). When the
/// existing entry fits its arity and the incoming summary does not,
/// the incoming summary is still re-keyed under a synthesised disambig.
pub fn insert_ssa(&mut self, key: FuncKey, summary: SsaFuncSummary) {
let key = self.reconcile_ssa_summary_key(key, &summary);
// The summary may reference a parameter index ≥ `key.arity` when
// scoped SSA lowering synthesised `Param` ops for **external
// captures** (free identifiers like `this`, module imports,
// unresolved method names) — see audit gap A.2.1.G1
// (`project_typed_callgraph_audit_gap_ssa_disambig.md`). These
// synthetic refs are useful inside the file they were extracted
// in (the caller's implicit-uses argument group at the same
// index aligns with the synthetic Param) and stay useful when
// resolved cross-file by name from this map (the same
// implicit-uses alignment applies). But they would trip
// [`ssa_summary_fits_arity`] inside [`reconcile_ssa_summary_key`],
// forcing a synthetic disambig that uncouples the SSA FuncKey
// from the matching FuncSummary FuncKey — and Phase 3's
// `summaries.get_ssa(caller_key)` lookup (consuming
// `typed_call_receivers` at the FuncSummary-aligned key) would
// miss.
//
// Resolution rule (applies only when `summary` does not fit
// arity):
//
// * **No existing entry, or existing entry also has out-of-range
// refs** — keep the (untrimmed) summary at the original key,
// bypassing the disambig synthesis. Phase 3 finds the entry
// under the FuncSummary's own disambig; cross-file resolvers
// find the same entry with its full per-param signal
// (closures, lambdas, captured-var sinks). The "existing also
// has out-of-range refs" branch covers the iterative-rescan
// case where round 2's incoming summary lands on top of round
// 1's already-installed copy of the same function.
//
// * **Existing entry fits arity (legit) but new doesn't** — fall
// back to the disambig synthesis. This preserves the
// `insert_ssa_arity_overflow_rekeys` invariant: a structurally
// incompatible incoming summary (different function sharing
// name + container + arity, with param refs at indices that
// don't even exist in the legitimate function) cannot
// dethrone the existing entry by silent overwrite. Both
// summaries survive — the existing one at the original key,
// the new one at the synthesised disambig.
let key = if key.arity.is_some() && !ssa_summary_fits_arity(&summary, key.arity) {
let existing_also_overflows = self
.ssa_by_key
.get(&key)
.is_some_and(|existing| !ssa_summary_fits_arity(existing, key.arity));
let existing_present = self.ssa_by_key.contains_key(&key);
if !existing_present || existing_also_overflows {
key
} else {
self.reconcile_ssa_summary_key(key, &summary)
}
} else {
self.reconcile_ssa_summary_key(key, &summary)
};
self.ssa_by_key.insert(key, summary);
}
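
The resolution rule in the comment above can be read as a small decision table. The following is a minimal sketch under stated assumptions: `Key`, `Summary`, `Store`, and `rekey` are simplified stand-ins invented for illustration, and the non-overflow path simply keeps the key, whereas the real code still calls `reconcile_ssa_summary_key` there.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `FuncKey` (assumption: the real key has more fields).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Key {
    pub name: String,
    pub arity: Option<usize>,
    pub disambig: Option<u32>,
}

/// Simplified stand-in for `SsaFuncSummary`: only the referenced param indices.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Summary {
    pub param_refs: Vec<usize>,
}

/// True when every referenced parameter index is below the arity.
fn fits_arity(s: &Summary, arity: Option<usize>) -> bool {
    match arity {
        Some(a) => s.param_refs.iter().all(|&i| i < a),
        None => true,
    }
}

#[derive(Default)]
pub struct Store {
    by_key: HashMap<Key, Summary>,
    next_disambig: u32,
}

impl Store {
    /// Hypothetical re-keying: give the incoming summary a fresh disambig.
    fn rekey(&mut self, mut key: Key) -> Key {
        key.disambig = Some(self.next_disambig);
        self.next_disambig += 1;
        key
    }

    /// Insert `summary` and return the key it was stored under.
    pub fn insert(&mut self, key: Key, summary: Summary) -> Key {
        let overflows = key.arity.is_some() && !fits_arity(&summary, key.arity);
        // None = no entry; Some(false) = existing entry overflows too.
        let existing_fits = self.by_key.get(&key).map(|old| fits_arity(old, key.arity));
        let key = match (overflows, existing_fits) {
            // A well-formed entry exists: the newcomer must not overwrite it.
            (true, Some(true)) => self.rekey(key),
            // Fresh insert, rescan of the same function, or no overflow.
            _ => key,
        };
        self.by_key.insert(key.clone(), summary);
        key
    }

    pub fn get(&self, key: &Key) -> Option<&Summary> {
        self.by_key.get(key)
    }
}
```

In this sketch an overflowing summary keeps its original key on a fresh insert and on a repeat insert, and is moved to a new key only when it would displace an entry that fits its arity.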
@ -1363,6 +1450,148 @@ impl GlobalSummaries {
_ => CalleeResolution::Ambiguous(same_ns.into_iter().cloned().collect()),
}
}
/// Install / refresh the type-hierarchy index from the currently
/// loaded summaries. Idempotent — calling twice rebuilds.
///
/// Call this once after pass-1 merge (and again whenever
/// summary state changes in a way that could affect virtual
/// dispatch — typically: after the call-graph is rebuilt mid-fixed-point).
/// `merge()` automatically invalidates so a forgotten reinstall
/// degrades to today's behaviour rather than a stale lookup.
pub fn install_hierarchy(&mut self) {
let h = crate::callgraph::TypeHierarchyIndex::build(self);
self.hierarchy = Some(h);
}
/// Borrow the installed hierarchy index, if any.
pub fn hierarchy(&self) -> Option<&crate::callgraph::TypeHierarchyIndex> {
self.hierarchy.as_ref()
}
/// Hard cap on hierarchy fan-out from a single call site — see
/// [`Self::resolve_callee_widened`] for rationale. Public for tests
/// that need to assert cap behaviour without hard-coding the value.
pub const MAX_HIERARCHY_FANOUT: usize = 8;
/// Resolve a call site to *every* candidate FuncKey reachable
/// through type-hierarchy fan-out. This is the runtime counterpart
/// of the [`crate::callgraph::TypeHierarchyIndex::resolve_with_hierarchy`]
/// step that the call-graph builder applies at edge-construction time.
///
/// Behaviour:
///
/// * `receiver_type = None` → falls through to
/// [`Self::resolve_callee`]; returns `[k]` on `Resolved`, `[]`
/// otherwise.
/// * `receiver_type = Some(rt)` and either no hierarchy is installed
/// or `rt` has no recorded sub-types → identical fall-through;
/// the hierarchy lookup is a no-op.
/// * `receiver_type = Some(rt)` with sub-types `s1, s2, …` →
/// union of `lookup_qualified` for `(rt, s1, s2, …)` after arity
/// filtering. Result is dedup'd in insertion order
/// (direct-receiver match first, then each sub-type's match).
///
/// Hard cap: at most [`Self::MAX_HIERARCHY_FANOUT`] keys are
/// returned. When the cap fires, the cap-hit is logged at `debug`
/// and the tail impls are silently dropped — over-fanning is a
/// precision-tax knob, not a soundness one.
///
/// Empty result + non-empty `subs` triggers a
/// secondary fall-through to [`Self::resolve_callee`] so a
/// type-fact misclassification (receiver typed as a super-class
/// that has no method by this name on any sub) does not silently
/// regress to "no resolution at all" — the leaf-name path can still
/// pick up a match. This preserves the
/// "subset of today's targets, never a superset" rule under
/// hierarchy-aware resolution failure.
pub fn resolve_callee_widened(&self, q: &CalleeQuery<'_>) -> Vec<FuncKey> {
let arity_matches = |k: &FuncKey| match q.arity {
Some(a) => k.arity == Some(a),
None => true,
};
let single_fallback = || -> Vec<FuncKey> {
match self.resolve_callee(q) {
CalleeResolution::Resolved(k) => vec![k],
_ => Vec::new(),
}
};
// Hierarchy fan-out only fires when the call has an
// authoritative receiver type AND the index is installed AND
// the type has recorded sub-types. Every other case collapses
// to today's resolver.
let Some(rt) = q.receiver_type.filter(|s| !s.is_empty()) else {
return single_fallback();
};
let Some(h) = self.hierarchy.as_ref() else {
return single_fallback();
};
let subs = h.subs_of(q.caller_lang, rt);
if subs.is_empty() {
return single_fallback();
}
// Union direct + sub-type matches in insertion order. Dedup is
// O(n²) over the cap (n ≤ 8) so a HashSet would be wasted
// overhead; linear scan is faster and order-preserving.
let mut out: Vec<FuncKey> = Vec::new();
let push_unique = |out: &mut Vec<FuncKey>, k: FuncKey| -> bool {
if !out.iter().any(|e| e == &k) {
out.push(k);
true
} else {
false
}
};
let qualified_lookup = |container: &str| -> Vec<FuncKey> {
let qual = format!("{container}::{}", q.name);
self.lookup_qualified(q.caller_lang, &qual)
.into_iter()
.map(|(k, _)| k.clone())
.filter(|k| arity_matches(k))
.collect()
};
for k in qualified_lookup(rt) {
push_unique(&mut out, k);
if out.len() >= Self::MAX_HIERARCHY_FANOUT {
tracing::debug!(
receiver = rt,
method = q.name,
cap = Self::MAX_HIERARCHY_FANOUT,
"hierarchy fan-out cap reached on direct receiver match"
);
return out;
}
}
for sub in subs {
for k in qualified_lookup(sub.as_str()) {
push_unique(&mut out, k);
if out.len() >= Self::MAX_HIERARCHY_FANOUT {
tracing::debug!(
receiver = rt,
method = q.name,
cap = Self::MAX_HIERARCHY_FANOUT,
"hierarchy fan-out cap reached; tail impls dropped"
);
return out;
}
}
}
if out.is_empty() {
// Hierarchy widening produced nothing (e.g., none of the
// recorded sub-types declare this method). Fall back to
// today's qualified-first resolver so the misclassified-
// type case still finds a leaf match — the same
// "preserve today's behaviour on miss" rule the call-graph
// builder applies.
return single_fallback();
}
out
}
}
impl std::fmt::Debug for GlobalSummaries {
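
As an illustration of the fan-out rules documented on `resolve_callee_widened`, here is a self-contained sketch of the same shape: direct receiver first, then each sub-type, deduplicated in insertion order, capped at eight. `ToyIndex`, its fields, and `resolve_widened` are hypothetical names, and the fall-through to the single-result resolver is left out.

```rust
use std::collections::HashMap;

/// Cap mirrored from `MAX_HIERARCHY_FANOUT` in the diff.
pub const MAX_FANOUT: usize = 8;

/// Toy index (assumption): methods per container and sub-types per super-type.
#[derive(Default)]
pub struct ToyIndex {
    pub methods: HashMap<String, Vec<String>>,
    pub subs: HashMap<String, Vec<String>>,
}

impl ToyIndex {
    /// Qualified name of `method` on `container`, if the container declares it.
    fn lookup(&self, container: &str, method: &str) -> Option<String> {
        let names = self.methods.get(container)?;
        names
            .iter()
            .any(|m| m == method)
            .then(|| format!("{container}::{method}"))
    }

    /// Direct receiver match first, then each recorded sub-type.
    /// Duplicates are skipped with a linear scan, which preserves order.
    pub fn resolve_widened(&self, receiver: &str, method: &str) -> Vec<String> {
        let empty: &[String] = &[];
        let subs = self.subs.get(receiver).map_or(empty, |v| v.as_slice());
        let candidates = std::iter::once(receiver).chain(subs.iter().map(String::as_str));

        let mut out: Vec<String> = Vec::new();
        for container in candidates {
            if let Some(target) = self.lookup(container, method) {
                if !out.contains(&target) {
                    out.push(target);
                }
                if out.len() >= MAX_FANOUT {
                    break; // remaining implementers are dropped
                }
            }
        }
        out
    }
}
```

A linear `contains` scan is used for the same reason the diff gives: with at most eight entries, a hash set would cost more than it saves and would lose insertion order.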


@ -336,3 +336,208 @@ mod tests {
assert_eq!(s, back);
}
}
// ── Pointer-Phase 5: field-granularity points-to summary ──────────────
/// Maximum field names retained per parameter in [`FieldPointsToSummary`].
///
/// Mirror of [`MAX_ALIAS_EDGES`]. Bounds on-disk + cross-file work
/// while leaving room for typical helpers (a handful of fields each).
pub const MAX_FIELDS_PER_PARAM: usize = 8;
/// Pointer-Phase 5: field-granularity per-parameter points-to summary.
///
/// Records, for each positional parameter index, the set of field
/// **names** read from and written to inside the callee body. Names
/// (not [`crate::ssa::ir::FieldId`]) are persisted because field IDs
/// are body-local — the per-body [`crate::ssa::ir::FieldInterner`]
/// reassigns IDs across files. Callers re-intern through their own
/// body's interner before consulting `field_taint` cells.
///
/// The receiver (`self` / `this`) uses sentinel index [`usize::MAX`]
/// in the outer `Vec` so positional params and the receiver share the
/// same indexing convention as `SsaFuncSummary::receiver_to_*`
/// (separate channel).
///
/// Empty by default — functions that don't read or write any field on
/// their parameters carry no entries and cost nothing on disk.
#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
pub struct FieldPointsToSummary {
/// `(param_index, field_names_read)` — the callee projected each
/// listed field on a value derived from `param_index` somewhere
/// in its body. Sorted, deduped per-entry.
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub param_field_reads: Vec<(u32, SmallVec<[String; 2]>)>,
/// `(param_index, field_names_written)` — the callee assigned to
/// each listed field on a value derived from `param_index`.
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub param_field_writes: Vec<(u32, SmallVec<[String; 2]>)>,
/// Set when the read/write graph hit
/// [`MAX_FIELDS_PER_PARAM`] for any parameter. Callers seeing
/// `overflow=true` treat each parameter as reading/writing every
/// field on every other parameter — the conservative greatest
/// lower bound that preserves soundness.
#[serde(default, skip_serializing_if = "core::ops::Not::not")]
pub overflow: bool,
}
impl FieldPointsToSummary {
pub fn empty() -> Self {
Self::default()
}
pub fn is_empty(&self) -> bool {
self.param_field_reads.is_empty() && self.param_field_writes.is_empty() && !self.overflow
}
fn insert_into(
list: &mut Vec<(u32, SmallVec<[String; 2]>)>,
param: u32,
field: &str,
overflow: &mut bool,
) {
let entry = match list.iter_mut().find(|(p, _)| *p == param) {
Some(e) => &mut e.1,
None => {
list.push((param, SmallVec::new()));
&mut list.last_mut().unwrap().1
}
};
if entry.iter().any(|s| s == field) {
return;
}
if entry.len() >= MAX_FIELDS_PER_PARAM {
*overflow = true;
return;
}
entry.push(field.to_string());
entry.sort();
}
/// Record a field READ on parameter `param`. Bounded by
/// [`MAX_FIELDS_PER_PARAM`] per parameter; over-cap inserts trip
/// `overflow`.
pub fn add_read(&mut self, param: u32, field: &str) {
if self.overflow {
return;
}
let mut overflow = false;
Self::insert_into(&mut self.param_field_reads, param, field, &mut overflow);
if overflow {
self.overflow = true;
}
}
/// Record a field WRITE on parameter `param`. Mirror of [`Self::add_read`].
pub fn add_write(&mut self, param: u32, field: &str) {
if self.overflow {
return;
}
let mut overflow = false;
Self::insert_into(&mut self.param_field_writes, param, field, &mut overflow);
if overflow {
self.overflow = true;
}
}
/// Union with `other`. Overflow propagates per
/// [`PointsToSummary::merge`]'s semantics — once a callee is
/// "any field on any parameter", merging cannot recover precision.
pub fn merge(&mut self, other: &Self) {
if other.overflow {
self.overflow = true;
return;
}
for (p, fields) in &other.param_field_reads {
for f in fields {
self.add_read(*p, f);
}
}
for (p, fields) in &other.param_field_writes {
for f in fields {
self.add_write(*p, f);
}
}
}
}
#[cfg(test)]
mod field_summary_tests {
use super::*;
#[test]
fn empty_summary_round_trips() {
let s = FieldPointsToSummary::empty();
assert!(s.is_empty());
let json = serde_json::to_string(&s).unwrap();
let back: FieldPointsToSummary = serde_json::from_str(&json).unwrap();
assert_eq!(s, back);
}
#[test]
fn add_read_dedupes_and_sorts() {
let mut s = FieldPointsToSummary::empty();
s.add_read(0, "name");
s.add_read(0, "id");
s.add_read(0, "name"); // duplicate
let entry = s.param_field_reads.iter().find(|(p, _)| *p == 0).unwrap();
assert_eq!(entry.1.as_slice(), &["id".to_string(), "name".to_string()]);
}
#[test]
fn distinct_params_get_distinct_entries() {
let mut s = FieldPointsToSummary::empty();
s.add_write(0, "cache");
s.add_write(1, "log");
assert_eq!(s.param_field_writes.len(), 2);
}
#[test]
fn overflow_trips_at_cap() {
let mut s = FieldPointsToSummary::empty();
for i in 0..(MAX_FIELDS_PER_PARAM + 4) {
s.add_read(0, &format!("field{i}"));
}
assert!(s.overflow);
}
#[test]
fn merge_unions_disjoint_keys() {
let mut a = FieldPointsToSummary::empty();
let mut b = FieldPointsToSummary::empty();
a.add_read(0, "alpha");
b.add_read(1, "beta");
a.merge(&b);
assert!(a.param_field_reads.iter().any(|(p, _)| *p == 0));
assert!(a.param_field_reads.iter().any(|(p, _)| *p == 1));
}
#[test]
fn merge_propagates_overflow() {
let mut a = FieldPointsToSummary::empty();
let mut b = FieldPointsToSummary::empty();
b.overflow = true;
a.merge(&b);
assert!(a.overflow);
}
#[test]
fn round_trip_preserves_entries() {
let mut s = FieldPointsToSummary::empty();
s.add_read(0, "name");
s.add_write(1, "cache");
s.add_write(1, "log");
let json = serde_json::to_string(&s).unwrap();
let back: FieldPointsToSummary = serde_json::from_str(&json).unwrap();
assert_eq!(s, back);
}
#[test]
fn empty_serializes_as_empty_object() {
let s = FieldPointsToSummary::empty();
let json = serde_json::to_string(&s).unwrap();
assert_eq!(json, "{}");
let back: FieldPointsToSummary = serde_json::from_str("{}").unwrap();
assert!(back.is_empty());
}
}
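
The doc comment on `FieldPointsToSummary` explains that field names, not IDs, are persisted because IDs are body-local. A minimal sketch of why, assuming a hypothetical `ToyInterner` (the real `FieldInterner` API is not shown in this diff):

```rust
use std::collections::HashMap;

/// Hypothetical per-body interner: IDs are handed out in first-seen order,
/// so the same name can map to different IDs in different bodies.
#[derive(Default)]
pub struct ToyInterner {
    ids: HashMap<String, u32>,
}

impl ToyInterner {
    /// Return the ID for `name`, allocating the next free one on first use.
    pub fn intern(&mut self, name: &str) -> u32 {
        let next = self.ids.len() as u32;
        *self.ids.entry(name.to_string()).or_insert(next)
    }
}

/// Translate field names from a callee summary into the caller's ID space.
pub fn reintern(names: &[&str], caller: &mut ToyInterner) -> Vec<u32> {
    names.iter().map(|n| caller.intern(n)).collect()
}
```

If the callee interned `name` before `cache` and the caller saw `cache` first, the two bodies disagree on every ID, so only the names are safe to carry across files.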


@ -2,7 +2,7 @@ use crate::abstract_interp::{AbstractTransfer, AbstractValue, PathFact};
use crate::labels::Cap;
use crate::ssa::type_facts::TypeKind;
use crate::summary::SinkSite;
use crate::summary::points_to::PointsToSummary;
use crate::summary::points_to::{FieldPointsToSummary, PointsToSummary};
use serde::{Deserialize, Serialize};
use smallvec::SmallVec;
@ -268,6 +268,20 @@ pub struct SsaFuncSummary {
/// each other or the return value.
#[serde(default, skip_serializing_if = "PointsToSummary::is_empty")]
pub points_to: PointsToSummary,
/// Pointer-Phase 5: field-granularity per-parameter points-to
/// summary. Records which fields the callee reads from / writes
/// to on each parameter, so cross-file resolution can spread
/// taint through field-level mutations the callee performs on
/// caller-supplied objects.
///
/// Default-empty (most functions don't field-mutate their params)
/// and elided from serialised output via `skip_serializing_if` so
/// pre-Phase-5 summaries deserialise cleanly without migration.
/// Built by extraction in `summary_extract.rs` when the per-body
/// [`crate::pointer::PointsToFacts`] are available
/// (`NYX_POINTER_ANALYSIS=1`); empty otherwise.
#[serde(default, skip_serializing_if = "FieldPointsToSummary::is_empty")]
pub field_points_to: FieldPointsToSummary,
/// Per-return-path abstract [`PathFact`] decomposition.
///
/// When non-empty, supplies per-predicate-gate facts finer than the
@ -285,6 +299,25 @@ pub struct SsaFuncSummary {
/// behaviour.
#[serde(default, skip_serializing_if = "SmallVec::is_empty")]
pub return_path_facts: SmallVec<[PathFactReturnEntry; 2]>,
/// Per-call-site receiver-type info: `(call_ordinal, container_name)`.
///
/// Populated during SSA lowering (`lower_all_functions_from_bodies`)
/// when type-fact analysis can resolve a method call's receiver SSA
/// value to a concrete [`crate::ssa::type_facts::TypeKind`] with a
/// non-empty [`crate::ssa::type_facts::TypeKind::container_name`].
///
/// Consumed by [`crate::callgraph::build_call_graph`] to feed
/// `CalleeQuery.receiver_type` for the matching ordinal — letting
/// the call graph narrow indirect method-call edges to only those
/// targets whose defining container matches the inferred type.
/// Strictly additive: an empty map means today's name-only
/// resolution applies unchanged.
///
/// Ordinal here is the per-function `CallMeta.call_ordinal` shared
/// with [`crate::summary::CalleeSite::ordinal`] so the two tables
/// can be joined by ordinal at call-graph build time.
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub typed_call_receivers: Vec<(u32, String)>,
}
/// A per-return-path [`PathFact`] entry.
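
The ordinal join described for `typed_call_receivers` might look roughly like this. `Site` and `join_by_ordinal` are hypothetical simplifications; the real `CalleeSite` and call-graph builder are not shown in this diff.

```rust
use std::collections::HashMap;

/// Simplified call-site record (assumption: the real `CalleeSite` has more fields).
pub struct Site {
    pub ordinal: u32,
    pub name: String,
}

/// Pair every call site with its receiver type when one was recorded.
/// Sites without an entry get `None`, i.e. name-only resolution.
pub fn join_by_ordinal<'a>(
    sites: &'a [Site],
    typed_call_receivers: &'a [(u32, String)],
) -> Vec<(&'a str, Option<&'a str>)> {
    // Index the receiver table by ordinal once, then probe per site.
    let by_ordinal: HashMap<u32, &str> = typed_call_receivers
        .iter()
        .map(|(ord, ty)| (*ord, ty.as_str()))
        .collect();
    sites
        .iter()
        .map(|s| (s.name.as_str(), by_ordinal.get(&s.ordinal).copied()))
        .collect()
}
```

An empty `typed_call_receivers` yields `None` for every site, which matches the "strictly additive" behaviour the doc comment describes.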


@ -438,7 +438,9 @@ fn ssa_summary_serde_round_trip_identity() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -468,7 +470,9 @@ fn ssa_summary_serde_round_trip_strip_bits() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -495,7 +499,9 @@ fn ssa_summary_serde_round_trip_add_bits() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -529,7 +535,9 @@ fn ssa_summary_serde_round_trip_all_variants() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -565,7 +573,9 @@ fn global_summaries_insert_ssa_exact_key_replacement() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
gs.insert_ssa(key.clone(), v1.clone());
assert_eq!(gs.get_ssa(&key), Some(&v1));
@ -589,7 +599,9 @@ fn global_summaries_insert_ssa_exact_key_replacement() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
gs.insert_ssa(key.clone(), v2.clone());
assert_eq!(gs.get_ssa(&key), Some(&v2));
@ -633,7 +645,9 @@ fn global_summaries_merge_with_ssa_entries() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let sum_b = SsaFuncSummary {
param_to_return: vec![],
@ -653,7 +667,9 @@ fn global_summaries_merge_with_ssa_entries() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
gs1.insert_ssa(key_a.clone(), sum_a.clone());
@ -697,7 +713,9 @@ fn global_summaries_is_empty_considers_ssa() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
},
);
@ -724,7 +742,9 @@ fn ssa_summary_serde_round_trip_param_to_sink_param() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -766,7 +786,9 @@ fn ssa_summary_serde_round_trip_container_fields() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -818,7 +840,9 @@ fn ssa_summary_serde_round_trip_return_abstract() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -890,6 +914,8 @@ fn make_callee_body(
value_defs,
cfg_node_map: std::collections::HashMap::new(),
exception_edges: vec![],
field_interner: crate::ssa::ir::FieldInterner::default(),
field_writes: std::collections::HashMap::new(),
},
opt: crate::ssa::OptimizeResult {
const_values: std::collections::HashMap::new(),
@ -1046,6 +1072,7 @@ fn callee_body_serde_with_all_ssa_op_variants() {
value: SsaValue(7),
op: SsaOp::Call {
callee: "foo".into(),
callee_text: None,
args: vec![smallvec![SsaValue(0)], smallvec![SsaValue(1)]],
receiver: Some(SsaValue(2)),
},
@ -1077,6 +1104,7 @@ fn callee_body_serde_with_all_ssa_op_variants() {
callee,
args,
receiver,
..
} => {
assert_eq!(callee, "foo");
assert_eq!(args.len(), 2);
@ -1330,7 +1358,9 @@ fn global_summaries_resolve_body_requires_body_present() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
},
);
// Don't insert body
@ -3169,6 +3199,95 @@ fn insert_ssa_arity_overflow_rekeys() {
assert!(kept.param_to_sink.is_empty());
}
/// Audit gap A.2.1.G1 reproducer: a summary whose only param-index
/// references come from synthetic SSA `Param` ops for external
/// captures (free identifiers, module imports, unresolved method
/// names) lands at the original key when no existing entry occupies
/// it.
///
/// This is the case `lower_to_ssa` produces for Java instance/static
/// methods that reference free identifiers (e.g. `f.close()` where
/// `close` is treated as an external capture — the synthetic Param 0
/// then leaks into `param_to_return`/`param_to_sink`). Without the
/// audit-gap fix, `reconcile_ssa_summary_key` would synthesise a
/// disambig and Phase 3's `summaries.get_ssa(caller_key)` lookup
/// (consuming `typed_call_receivers` at the FuncSummary-aligned key)
/// would miss.
#[test]
fn insert_ssa_arity_overflow_keeps_original_key_when_no_collision() {
// Single-file fresh insert: no prior entry at `key` to protect, so
// the synthetic-Param overflow is treated as the function's own
// signal and lands at the original FuncKey.
let mut gs = GlobalSummaries::new();
let key = FuncKey {
lang: Lang::Java,
namespace: "Reader.java".into(),
container: "Reader".into(),
name: "read".into(),
arity: Some(0),
..Default::default()
};
let summary = SsaFuncSummary {
// Synthetic Param-0 for the external `close` identifier inside
// the static `read()` body — `param_count == 0` per the source-
// level signature.
param_to_return: vec![(0, TaintTransform::Identity)],
typed_call_receivers: vec![(1, "FileHandle".to_string())],
..Default::default()
};
gs.insert_ssa(key.clone(), summary.clone());
let kept = gs
.get_ssa(&key)
.expect("Reader::read SSA must be reachable at the FuncSummary-aligned key");
assert_eq!(kept.typed_call_receivers, summary.typed_call_receivers);
// The synthetic Param-0 reference is preserved verbatim — pass-2
// analysis still aligns it with the caller's implicit-uses
// argument group at the same index.
assert_eq!(kept.param_to_return, summary.param_to_return);
}
/// Companion to `insert_ssa_arity_overflow_keeps_original_key_when_no_collision`:
/// when both rounds of an iterative scan produce summaries whose
/// param-index references overflow the FuncKey arity (the same
/// synthetic-Param signal each round), the second-round insert must
/// land at the original key (last-writer-wins for the same function),
/// not split off into a synthetic disambig.
#[test]
fn insert_ssa_arity_overflow_iterative_rescan_stays_at_original_key() {
let mut gs = GlobalSummaries::new();
let key = FuncKey {
lang: Lang::Java,
namespace: "Reader.java".into(),
container: "Reader".into(),
name: "read".into(),
arity: Some(0),
..Default::default()
};
let round1 = SsaFuncSummary {
param_to_return: vec![(0, TaintTransform::Identity)],
typed_call_receivers: vec![(1, "FileHandle".to_string())],
..Default::default()
};
gs.insert_ssa(key.clone(), round1);
// Iteration 2 of the scan loop produces the same shape with
// refined typed_call_receivers (e.g. a new constructor type
// discovered cross-file).
let round2 = SsaFuncSummary {
param_to_return: vec![(0, TaintTransform::Identity)],
typed_call_receivers: vec![(1, "FileHandle".to_string()), (2, "Cache".to_string())],
..Default::default()
};
gs.insert_ssa(key.clone(), round2.clone());
let kept = gs
.get_ssa(&key)
.expect("iterative-rescan summary must stay at the original key");
assert_eq!(kept.typed_call_receivers, round2.typed_call_receivers);
assert_eq!(kept.param_to_return, round2.param_to_return);
}
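// Illustrative sketch only (not the project's `insert_ssa`): the keying rule
// the two tests above pin down, modelled on toy types. `ToyKey`,
// `ToySummary` and `ToyStore` are hypothetical names. The collision rule is
// an ASSUMPTION made for illustration: an arity-overflowing summary is
// rekeyed only when the existing entry at the key fits the arity. The real
// reconciliation logic may decide collisions differently.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `FuncKey`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ToyKey {
    pub name: String,
    pub arity: usize,
    /// `Some(n)` marks a synthesised, disambiguated key.
    pub disambig: Option<u32>,
}

/// Hypothetical, simplified stand-in for `SsaFuncSummary`.
#[derive(Clone, Debug, PartialEq, Eq, Default)]
pub struct ToySummary {
    /// Parameter indices referenced anywhere in the summary.
    pub param_refs: Vec<usize>,
}

impl ToySummary {
    /// True when every referenced index is below `arity`.
    fn fits(&self, arity: usize) -> bool {
        self.param_refs.iter().all(|&i| i < arity)
    }
}

#[derive(Default)]
pub struct ToyStore {
    entries: HashMap<ToyKey, ToySummary>,
    next_disambig: u32,
}

impl ToyStore {
    /// Stores `summary` and returns the key it actually landed at.
    pub fn insert(&mut self, key: ToyKey, summary: ToySummary) -> ToyKey {
        let overflow = !summary.fits(key.arity);
        // ASSUMPTION: only an existing entry that fits the arity is
        // "protected" from being overwritten by an overflowing summary.
        let protected = self
            .entries
            .get(&key)
            .map_or(false, |existing| existing.fits(key.arity));
        let target = if overflow && protected {
            let d = self.next_disambig;
            self.next_disambig += 1;
            ToyKey { disambig: Some(d), ..key }
        } else {
            // Fresh insert or rescan of the same shape: last writer wins.
            key
        };
        self.entries.insert(target.clone(), summary);
        target
    }

    pub fn get(&self, key: &ToyKey) -> Option<&ToySummary> {
        self.entries.get(key)
    }
}
```

// Under these assumptions, inserting an overflowing summary twice at an
// empty key keeps both rounds at that key, with the second round replacing
// the first.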
// ── Primary sink-location attribution — SinkSite round-trips ────────────
#[test]
@ -3382,7 +3501,9 @@ fn cf4_return_path_transform_serde_round_trip() {
],
)],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
};
let json = serde_json::to_string(&summary).unwrap();
let back: SsaFuncSummary = serde_json::from_str(&json).unwrap();
@ -3503,8 +3624,15 @@ fn cf4_union_param_return_paths_by_index() {
}
#[test]
fn cf4_ssa_summary_fits_arity_rejects_out_of_range_path_idx() {
// A path whose param index exceeds the key's arity is incompatible.
fn cf4_ssa_summary_fits_arity_keeps_out_of_range_path_idx_at_original_key() {
// A path whose param index exceeds the key's arity is treated as a
// synthetic external-capture artefact (audit gap A.2.1.G1 — see
// `project_typed_callgraph_audit_gap_ssa_disambig.md`). When no
// existing entry sits at the key, `insert_ssa` keeps the (untrimmed)
// summary at the original key so the SSA FuncKey stays aligned with
// the matching FuncSummary FuncKey — Phase 3's
// `summaries.get_ssa(caller_key)` lookup (consuming
// `typed_call_receivers`) depends on this alignment.
let bad = SsaFuncSummary {
param_return_paths: vec![(5, smallvec![rpt(TaintTransform::Identity, 1, 0, 0)])],
..Default::default()
@ -3513,14 +3641,16 @@ fn cf4_ssa_summary_fits_arity_rejects_out_of_range_path_idx() {
lang: Lang::Rust,
namespace: "test.rs".into(),
name: "helper".into(),
arity: Some(2), // too small for idx 5
arity: Some(2), // too small for idx 5 — synthetic-Param marker
..Default::default()
};
let mut gs = GlobalSummaries::new();
gs.insert_ssa(key.clone(), bad);
// Reconciliation synthesises a disambig to keep the bad entry under a
// different key; the original key stays empty.
assert!(gs.get_ssa(&key).is_none());
let kept = gs
.get_ssa(&key)
.expect("synthetic-Param summary inserted at original key");
assert_eq!(kept.param_return_paths.len(), 1);
assert_eq!(kept.param_return_paths[0].0, 5);
}
// ── Parameter-granularity points-to summary ─────────────────────────────
@ -3568,10 +3698,14 @@ fn cf6_ssa_summary_legacy_json_without_points_to_deserialises() {
}
#[test]
fn cf6_ssa_summary_fits_arity_rejects_out_of_range_points_to_idx() {
fn cf6_ssa_summary_fits_arity_keeps_out_of_range_points_to_idx_at_original_key() {
// Same arity-overflow handling as `cf4_ssa_summary_fits_arity_*`
// for the points-to channel: when the summary references a
// synthetic-Param index beyond `key.arity` and no existing entry
// occupies the key, `insert_ssa` preserves the FuncKey-aligned
// identity by inserting at the original key (audit gap A.2.1.G1).
use crate::summary::points_to::{AliasKind, AliasPosition, PointsToSummary};
let mut pts = PointsToSummary::empty();
// Index 7 exceeds arity 2 below.
pts.insert(
AliasPosition::Param(7),
AliasPosition::Return,
@ -3590,6 +3724,499 @@ fn cf6_ssa_summary_fits_arity_rejects_out_of_range_points_to_idx() {
};
let mut gs = GlobalSummaries::new();
gs.insert_ssa(key.clone(), bad);
// Reconciliation rekeys the bad entry; the original key is empty.
assert!(gs.get_ssa(&key).is_none());
let kept = gs
.get_ssa(&key)
.expect("synthetic-Param points_to summary inserted at original key");
assert_eq!(kept.points_to.max_param_index(), Some(7));
}
/// Phase 4 (typed call-graph devirtualisation): two `findById`
/// definitions on different containers must remain structurally
/// disjoint after [`merge_summaries`] — no cap union may leak
/// across them. The FuncKey identity model already keys on
/// `(lang, namespace, container, name, arity, ...)` so this is
/// supposed to be true today; the test pins it down so a future
/// refactor can't silently widen the merge granularity.
///
/// Concretely: `Repository::findById` is parameterised (no
/// `SQL_QUERY` sink cap), `UnsafeCache::findById` runs a string-
/// concatenated query (carries `Cap::SQL_QUERY`). After merge,
/// each FuncKey must own only its own caps — Repository must NOT
/// inherit UnsafeCache's `SQL_QUERY` bit.
#[test]
fn cross_file_devirt_does_not_union_unrelated_findbyids() {
use crate::labels::Cap;
use crate::symbol::FuncKey;
fn method_summary(name: &str, container: &str, file: &str, sink_caps: u16) -> FuncSummary {
FuncSummary {
name: name.into(),
file_path: file.into(),
lang: "rust".into(),
param_count: 1,
param_names: vec!["id".into()],
source_caps: 0,
sanitizer_caps: 0,
sink_caps,
propagating_params: vec![],
propagates_taint: false,
tainted_sink_params: if sink_caps != 0 { vec![0] } else { vec![] },
callees: vec![],
container: container.into(),
..Default::default()
}
}
let safe_repo = method_summary("findById", "Repository", "src/repo.rs", 0);
let unsafe_cache = method_summary(
"findById",
"UnsafeCache",
"src/cache.rs",
Cap::SQL_QUERY.bits(),
);
let gs = merge_summaries(vec![safe_repo, unsafe_cache], None);
// Two distinct keys must coexist — no merge collision.
let repo_key = FuncKey {
lang: Lang::Rust,
namespace: "src/repo.rs".into(),
container: "Repository".into(),
name: "findById".into(),
arity: Some(1),
..Default::default()
};
let cache_key = FuncKey {
lang: Lang::Rust,
namespace: "src/cache.rs".into(),
container: "UnsafeCache".into(),
name: "findById".into(),
arity: Some(1),
..Default::default()
};
let repo_sum = gs.get(&repo_key).expect("Repository::findById missing");
let cache_sum = gs.get(&cache_key).expect("UnsafeCache::findById missing");
// Sink caps stay on their own owner — the whole point of
// devirtualisation. Repository must not have inherited the
// SQL_QUERY bit from UnsafeCache.
assert_eq!(
repo_sum.sink_caps, 0,
"Repository::findById inherited a sink cap from UnsafeCache::findById — \
the per-FuncKey identity model has been broken (sink_caps bits = {:#x})",
repo_sum.sink_caps,
);
assert_eq!(
cache_sum.sink_caps,
Cap::SQL_QUERY.bits(),
"UnsafeCache::findById lost its own sink cap during merge"
);
// Same invariant on tainted_sink_params — must not bleed across.
assert!(
repo_sum.tainted_sink_params.is_empty(),
"Repository::findById inherited tainted_sink_params from UnsafeCache: {:?}",
repo_sum.tainted_sink_params,
);
assert_eq!(cache_sum.tainted_sink_params, vec![0]);
}
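// Illustrative sketch only: why keying on the container keeps the two
// `findById` definitions apart. `MethodKey`, `merge_caps` and the
// `SQL_QUERY` bit value are hypothetical stand-ins, not the project's
// `FuncKey`, `merge_summaries` or `Cap`. Caps are unioned only between
// entries that share the full key.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced method identity: container + name + arity.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct MethodKey {
    pub container: String,
    pub name: String,
    pub arity: usize,
}

/// Hypothetical capability bit, chosen only for this example.
pub const SQL_QUERY: u16 = 0x0001;

/// Unions sink caps per key. Entries with different containers never
/// share a map slot, so their bits cannot leak into each other.
pub fn merge_caps(entries: Vec<(MethodKey, u16)>) -> HashMap<MethodKey, u16> {
    let mut merged = HashMap::new();
    for (key, caps) in entries {
        *merged.entry(key).or_insert(0) |= caps;
    }
    merged
}
```

// Two entries named `findById` on `Repository` and `UnsafeCache` end up in
// separate slots, so the repository's cap set stays at zero.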
// ── Phase 6 hierarchy fan-out at runtime resolution ────────────────────
//
// `GlobalSummaries::resolve_callee_widened` is the runtime counterpart of
// the call-graph builder's `TypeHierarchyIndex::resolve_with_hierarchy`.
// These tests pin the contract that *every* concrete implementer is
// reachable when the receiver type is statically a super-class / trait /
// interface, with the explicit fall-throughs that preserve today's
// behaviour when no fan-out applies.
mod hierarchy_widened_tests {
use super::*;
/// Build a minimal `(FuncKey, FuncSummary)` for a method on the
/// given container with optional `hierarchy_edges` carried through.
fn java_method(
namespace: &str,
container: &str,
name: &str,
arity: usize,
sink_bits: u16,
hierarchy_edges: Vec<(String, String)>,
) -> (FuncKey, FuncSummary) {
let (key, mut summary) = fs_with(
namespace,
container,
name,
arity,
FuncKind::Method,
Some((namespace.len() + container.len() + name.len()) as u32),
sink_bits,
);
summary.hierarchy_edges = hierarchy_edges;
(key, summary)
}
/// A1 — no hierarchy installed. Widening collapses to today's
/// single-result behaviour: one key in / one key out.
#[test]
fn widened_without_hierarchy_returns_single_resolved() {
let mut gs = GlobalSummaries::new();
let (k, s) = java_method("src/http.java", "HttpClient", "send", 1, 0x01, vec![]);
gs.insert(k.clone(), s);
// Hierarchy is intentionally NOT installed.
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "send",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("HttpClient"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(1),
});
assert_eq!(widened, vec![k]);
}
/// A2 — hierarchy installed but the receiver type has no recorded
/// sub-types. Falls through to today's single-result behaviour.
#[test]
fn widened_no_subtypes_returns_single() {
let mut gs = GlobalSummaries::new();
let (k, s) = java_method("src/http.java", "HttpClient", "send", 1, 0x01, vec![]);
gs.insert(k.clone(), s);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "send",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("HttpClient"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(1),
});
assert_eq!(widened, vec![k]);
}
/// A3 — hierarchy with one sub-type implementer. Widening returns
/// both the direct receiver match and the sub-type's match.
#[test]
fn widened_one_subtype_returns_two_keys() {
let mut gs = GlobalSummaries::new();
// Carrier: ILogger -> ConsoleLogger edge.
let (k_iface, s_iface) = java_method(
"src/logger.java",
"ILogger",
"log",
1,
0x00,
vec![("ConsoleLogger".to_string(), "ILogger".to_string())],
);
let (k_impl, s_impl) =
java_method("src/logger.java", "ConsoleLogger", "log", 1, 0x01, vec![]);
gs.insert(k_iface.clone(), s_iface);
gs.insert(k_impl.clone(), s_impl);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "log",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("ILogger"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(1),
});
assert_eq!(
widened.len(),
2,
"expected ILogger + ConsoleLogger fan-out, got {widened:?}"
);
assert!(widened.contains(&k_iface));
assert!(widened.contains(&k_impl));
}
/// A4 — hierarchy with multiple sub-types: every implementer's
/// matching method is in the result, deduplicated.
#[test]
fn widened_multiple_subtypes_returns_all() {
let mut gs = GlobalSummaries::new();
// Three impls + one interface. The interface itself has no
// body so we omit a method on it (that is the more common
// shape — a pure interface plus concrete classes).
let edges = vec![
("FileLogger".to_string(), "ILogger".to_string()),
("NetLogger".to_string(), "ILogger".to_string()),
("StdLogger".to_string(), "ILogger".to_string()),
];
let (k_file, s_file) = java_method(
"src/file_logger.java",
"FileLogger",
"log",
1,
0x01,
edges.clone(),
);
let (k_net, s_net) =
java_method("src/net_logger.java", "NetLogger", "log", 1, 0x02, vec![]);
let (k_std, s_std) =
java_method("src/std_logger.java", "StdLogger", "log", 1, 0x04, vec![]);
gs.insert(k_file.clone(), s_file);
gs.insert(k_net.clone(), s_net);
gs.insert(k_std.clone(), s_std);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "log",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("ILogger"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(1),
});
assert_eq!(widened.len(), 3, "expected three impls, got {widened:?}");
assert!(widened.contains(&k_file));
assert!(widened.contains(&k_net));
assert!(widened.contains(&k_std));
}
/// A5 — the arity filter must apply across the whole fan-out, not
/// just the direct-receiver leg. An implementer with a different
/// arity must not leak into the result.
#[test]
fn widened_arity_filter_applies_across_fanout() {
let mut gs = GlobalSummaries::new();
let edges = vec![
("OneArg".to_string(), "IBase".to_string()),
("TwoArg".to_string(), "IBase".to_string()),
];
let (k_one, s_one) = java_method("src/one.java", "OneArg", "do_it", 1, 0x01, edges.clone());
let (k_two, s_two) = java_method("src/two.java", "TwoArg", "do_it", 2, 0x02, vec![]);
gs.insert(k_one.clone(), s_one);
gs.insert(k_two.clone(), s_two);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "do_it",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("IBase"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(1),
});
assert_eq!(widened, vec![k_one], "arity-2 impl must be filtered out");
}
/// A6 — fan-out is bounded at `MAX_HIERARCHY_FANOUT`. Build a
/// hierarchy with more impls than the cap allows and assert the
/// result has exactly `cap` entries. The cap is meant to drop the
/// *tail*, not the head, but only the length is asserted here.
#[test]
fn widened_caps_at_max_hierarchy_fanout() {
let cap = GlobalSummaries::MAX_HIERARCHY_FANOUT;
let mut gs = GlobalSummaries::new();
// Build cap+3 impls so the fan-out overflows the cap and the
// tail has to be truncated.
let extra = 3;
let total = cap + extra;
let edges: Vec<(String, String)> = (0..total)
.map(|i| (format!("Impl{i:02}"), "IBase".to_string()))
.collect();
// Carrier — first impl carries every edge so the index is
// populated in one shot.
let (k0, s0) = java_method("src/impl00.java", "Impl00", "run", 0, 0x01, edges);
gs.insert(k0, s0);
for i in 1..total {
let (k, s) = java_method(
&format!("src/impl{i:02}.java"),
&format!("Impl{i:02}"),
"run",
0,
0x01,
vec![],
);
gs.insert(k, s);
}
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "run",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("IBase"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(
widened.len(),
cap,
"fan-out must cap at MAX_HIERARCHY_FANOUT={cap}, got {}",
widened.len()
);
}
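// Illustrative sketch only: the fan-out contract exercised by A3 to A6,
// modelled on a toy index. `ToyIndex`, `MethodKey` and `MAX_FANOUT` are
// hypothetical; the real `resolve_callee_widened` works on `FuncKey` and
// `CalleeQuery`. ASSUMPTIONS: only direct sub-types are considered, and
// candidates are visited in sorted order so the cap drops the tail.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical cap, standing in for `MAX_HIERARCHY_FANOUT`.
pub const MAX_FANOUT: usize = 4;

/// (container, method name, arity)
pub type MethodKey = (String, String, usize);

#[derive(Default)]
pub struct ToyIndex {
    /// super-type -> direct sub-types (sorted for determinism)
    subs: BTreeMap<String, BTreeSet<String>>,
    /// every method definition known to the index
    methods: BTreeSet<MethodKey>,
}

impl ToyIndex {
    pub fn add_edge(&mut self, sub: &str, sup: &str) {
        self.subs
            .entry(sup.to_string())
            .or_default()
            .insert(sub.to_string());
    }

    pub fn add_method(&mut self, container: &str, name: &str, arity: usize) {
        self.methods
            .insert((container.to_string(), name.to_string(), arity));
    }

    /// Returns the receiver's own match (if any) followed by matches on
    /// its direct sub-types, filtered by arity and capped at `MAX_FANOUT`.
    pub fn resolve_widened(&self, receiver: &str, name: &str, arity: usize) -> Vec<MethodKey> {
        let mut containers = vec![receiver.to_string()];
        if let Some(subs) = self.subs.get(receiver) {
            containers.extend(subs.iter().cloned());
        }
        containers
            .into_iter()
            .map(|c| (c, name.to_string(), arity))
            // The arity filter applies to every leg of the fan-out.
            .filter(|key| self.methods.contains(key))
            .take(MAX_FANOUT)
            .collect()
    }
}
```

// With one interface and one implementer this yields two keys; with more
// implementers than `MAX_FANOUT` the result is truncated to the cap.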
/// A7 — when hierarchy widening produces no candidates AND the
/// receiver_type lookup is authoritative (Step 1), the secondary
/// fall-through goes through `resolve_callee` which returns
/// Ambiguous/NotFound rather than silently picking an unrelated
/// leaf — exactly the "subset of today's targets, never a
/// superset" rule. Test asserts the empty result is preserved.
#[test]
fn widened_empty_does_not_silently_pick_unrelated_leaf() {
let mut gs = GlobalSummaries::new();
// Edge: IUnused has a sub Used, but neither declares
// `something`. An unrelated free function `something` exists
// in the same namespace — under today's authoritative
// receiver_type rules, that function MUST NOT be picked when
// the call is annotated with receiver_type "IUnused".
let edges = vec![("Used".to_string(), "IUnused".to_string())];
let (k_carrier, s_carrier) =
java_method("src/util.java", "Used", "carrier", 0, 0x00, edges);
let (k_free, s_free) = free_summary("src/app.java", "something", 0, 0x01);
gs.insert(k_carrier, s_carrier);
gs.insert(k_free, s_free);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "something",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("IUnused"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert!(
widened.is_empty(),
"receiver_type IUnused with no matching method must NOT silently \
pick an unrelated free function, got {widened:?}"
);
}
/// A7b — when hierarchy widening produces nothing AND today's
/// `resolve_callee` *does* resolve (no receiver_type, just bare
/// leaf or qualifier hint), the fallback returns the single key.
/// This pins the secondary-fallback contract on the path where it
/// actually matters (no authoritative receiver_type).
#[test]
fn widened_falls_through_when_resolve_callee_resolves() {
let mut gs = GlobalSummaries::new();
let (k_free, s_free) = free_summary("src/app.java", "helper", 0, 0x01);
gs.insert(k_free.clone(), s_free);
gs.install_hierarchy();
// No receiver_type → first branch of `resolve_callee_widened`
// is the single-result fallback path.
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "helper",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: None,
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(widened, vec![k_free]);
}
/// A8 — receiver_type is None → no widening; behaves identically
/// to `resolve_callee` (single-result wrap).
#[test]
fn widened_no_receiver_type_collapses_to_resolve_callee() {
let mut gs = GlobalSummaries::new();
let (k_free, s_free) = free_summary("src/app.java", "helper", 0, 0x01);
gs.insert(k_free.clone(), s_free);
gs.install_hierarchy();
let widened = gs.resolve_callee_widened(&CalleeQuery {
name: "helper",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: None,
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(widened, vec![k_free]);
}
/// A9 — `merge()` must invalidate the cached hierarchy index so a
/// post-merge call to `resolve_callee_widened` doesn't look up a
/// stale view. Since `install_hierarchy` is required after merges,
/// the test asserts: post-merge, before reinstall, fan-out must
/// fall through to single-result behaviour.
#[test]
fn merge_invalidates_hierarchy_cache() {
let mut gs_a = GlobalSummaries::new();
let edges = vec![("Sub".to_string(), "Super".to_string())];
let (k_super, s_super) = java_method("src/super.java", "Super", "m", 0, 0x00, edges);
let (k_sub, s_sub) = java_method("src/sub.java", "Sub", "m", 0, 0x01, vec![]);
gs_a.insert(k_super.clone(), s_super);
gs_a.insert(k_sub.clone(), s_sub);
gs_a.install_hierarchy();
// Before merge: fan-out works.
let pre_merge = gs_a.resolve_callee_widened(&CalleeQuery {
name: "m",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("Super"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(pre_merge.len(), 2);
// Merge in an empty `gs_b` — should invalidate the cached
// hierarchy.
gs_a.merge(GlobalSummaries::new());
assert!(
gs_a.hierarchy().is_none(),
"merge() must clear the cached hierarchy"
);
// After merge, before reinstall: the resolver must fall back
// to single-result behaviour (no fan-out).
let post_merge_no_install = gs_a.resolve_callee_widened(&CalleeQuery {
name: "m",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("Super"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(post_merge_no_install.len(), 1);
assert_eq!(post_merge_no_install[0], k_super);
// After reinstall: fan-out is restored.
gs_a.install_hierarchy();
let post_merge_reinstalled = gs_a.resolve_callee_widened(&CalleeQuery {
name: "m",
caller_lang: Lang::Java,
caller_namespace: "src/app.java",
caller_container: None,
receiver_type: Some("Super"),
namespace_qualifier: None,
receiver_var: None,
arity: Some(0),
});
assert_eq!(post_merge_reinstalled.len(), 2);
assert!(post_merge_reinstalled.contains(&k_super));
assert!(post_merge_reinstalled.contains(&k_sub));
}
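// Illustrative sketch only: the invalidate-on-merge pattern that A9 pins
// down, on a toy container. `ToySummaries` and its methods are
// hypothetical and much simpler than `GlobalSummaries`. The derived index
// lives in an `Option`, `merge` resets it to `None`, and lookups without an
// installed index report no sub-types.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Default)]
pub struct ToySummaries {
    /// Raw (sub-type, super-type) edges carried by the summaries.
    edges: Vec<(String, String)>,
    /// Derived index, rebuilt on demand by `install_hierarchy`.
    hierarchy: Option<BTreeMap<String, BTreeSet<String>>>,
}

impl ToySummaries {
    pub fn add_edge(&mut self, sub: &str, sup: &str) {
        self.edges.push((sub.to_string(), sup.to_string()));
    }

    /// Builds the super-type -> sub-types index from the raw edges.
    pub fn install_hierarchy(&mut self) {
        let mut index: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
        for (sub, sup) in &self.edges {
            index.entry(sup.clone()).or_default().insert(sub.clone());
        }
        self.hierarchy = Some(index);
    }

    /// Absorbs `other` and drops the cached index, which may now be stale.
    pub fn merge(&mut self, other: ToySummaries) {
        self.edges.extend(other.edges);
        self.hierarchy = None;
    }

    pub fn has_hierarchy(&self) -> bool {
        self.hierarchy.is_some()
    }

    /// Sub-types of `sup`; empty when no index is installed.
    pub fn subtypes(&self, sup: &str) -> Vec<String> {
        match &self.hierarchy {
            Some(index) => index
                .get(sup)
                .map(|subs| subs.iter().cloned().collect())
                .unwrap_or_default(),
            None => Vec::new(),
        }
    }
}
```

// Clearing the cache even for an empty merge keeps the rule simple: any
// merge requires a fresh `install_hierarchy` before fan-out is available.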
}