Added Cap::DATA_EXFIL and taint fp and fn fixes on real repos (#59)

* feat: Enhance data exfiltration detection with source sensitivity gating for cookies and headers

* feat: Implement cross-file data exfiltration detection with parameter-specific gate filters

* feat: Add calibration tests and refine DATA_EXFIL severity scoring logic

* feat: Introduce per-detector configuration for data exfiltration suppression

* feat: Enhance DATA_EXFIL findings with destination field tracking in diagnostics and SARIF output

* feat: Add tainted body and URL handling for data exfiltration detection

* feat: Add integration tests and fixtures for DATA_EXFIL and SSRF detection in Go

* feat: Add Java integration tests and fixtures for DATA_EXFIL detection across multiple HTTP clients

* feat: Add synthetic externals handling for closure-captured variables in SSA

* feat: Implement closure-based suppression for resource leak findings

* feat: Add regression guards for shell-injection and taint propagation in for-of destructure patterns

* feat: Implement constructor cap narrowing for data exfiltration detection in HTTP request builders

* feat: Add gated sinks for data exfiltration detection in C and C++ using curl_easy_setopt

* feat: Implement DATA_EXFIL cap parity for backwards analysis and add integration tests

* feat: Add data exfiltration sinks for various languages and enhance documentation

* refactor: Simplify formatting and improve readability in various files

* refactor: Improve readability by simplifying conditional statements and adding clippy linting

* docs: Update CHANGELOG and comments for data exfiltration features and configuration

* docs: Clarify configuration instructions for data exfiltration trusted destinations

* docs: Enhance comments for evidence routing logic in data exfiltration
Eli Peter 2026-05-01 10:59:52 -04:00 committed by GitHub
parent a438886217
commit 58f1794a4e
189 changed files with 8421 additions and 383 deletions
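
A minimal sketch of how a caller might consume the `param_to_gate_filters` summary entries described in the hunk below, assuming the `forward(url, body)` wrapper from the doc comment. The `Cap` type here is a hypothetical stand-in for the crate's real cap type, and `caps_for_tainted_args` is an illustrative helper; neither is taken from this commit.

```rust
/// Hypothetical stand-in for the crate's `Cap` type; the real definition
/// is not shown in this diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Cap(u16);

impl Cap {
    const SSRF: Cap = Cap(1 << 0);
    const DATA_EXFIL: Cap = Cap(1 << 1);
}

/// Illustrative helper (an assumption, not the commit's code): for each
/// tainted argument at a call site, return the narrowed caps recorded for
/// the callee parameter at the same index.
fn caps_for_tainted_args(
    param_to_gate_filters: &[(usize, Cap)],
    tainted_args: &[usize],
) -> Vec<(usize, Cap)> {
    let mut findings = Vec::new();
    for &arg in tainted_args {
        for &(param_idx, caps) in param_to_gate_filters {
            if param_idx == arg {
                findings.push((arg, caps));
            }
        }
    }
    findings
}

fn main() {
    // Summary recorded for `fn forward(url, body) { fetch(url, {body}) }`.
    let forward_summary = [(0, Cap::SSRF), (1, Cap::DATA_EXFIL)];

    // Only the `body` argument (index 1) is tainted at this call site,
    // so only the DATA_EXFIL entry is attributed to it.
    let findings = caps_for_tainted_args(&forward_summary, &[1]);
    println!("{:?}", findings);
}
```

Keeping one `(param_idx, caps)` pair per gated parameter is what lets a URL-tainted call and a body-tainted call through the same wrapper yield different finding classes, rather than both caps being attached to every parameter.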

@ -158,6 +158,39 @@ pub struct SsaFuncSummary {
/// (caller_param_index, sink_arg_position, sink_caps).
#[serde(default)]
pub param_to_sink_param: Vec<(usize, usize, Cap)>,
/// Per-parameter gate-filter cap masks lifted from inner multi-gate
/// sink call sites.
///
/// When a function body contains a callee whose
/// [`crate::cfg::CallMeta::gate_filters`] carries more than one entry
/// (e.g. `fetch` is both an `SSRF` gate on the URL arg and a
/// `DATA_EXFIL` gate on the body arg), the multi-gate dispatch in
/// [`super::super::collect_block_events`] cap-narrows the event's
/// `sink_caps` to the specific gate's `label_caps`. Each
/// `(param_idx, label_caps)` entry records that this function's
/// parameter `param_idx` flowed into a gated sink whose narrowed
/// caps were `label_caps`.
///
/// Cross-file callers consume this list to preserve per-position cap
/// attribution through wrapper functions: a wrapper
/// `fn forward(url, body) { fetch(url, {body}) }` records
/// `[(0, SSRF), (1, DATA_EXFIL)]` so a caller of `forward` splits
/// URL-tainted SSRF findings from body-tainted DATA_EXFIL findings
/// instead of conflating both caps onto every parameter.
///
/// `Vec<(param_idx, label_caps)>` is sufficient at cross-file
/// granularity: the corresponding `payload_args` and
/// `destination_uses` are intra-file context that does not survive
/// the function-summary boundary (field idents reference SSA
/// values from the callee body).
///
/// Empty (the default) for callees whose internal sinks carry zero
/// or one gate filter; the existing
/// [`Self::param_to_sink`] /
/// [`Self::param_to_sink_param`] machinery already records those
/// cases without per-position cap conflict.
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub param_to_gate_filters: Vec<(usize, Cap)>,
/// Parameter indices whose container identity flows to the return value
/// (e.g., function returns the same container it received as input).
///