new capacity bits (#67)

2026-07-24 21:41:02 +02:00 · 2026-05-07 01:29:31 -04:00
commit 7d0e7320e2
parent afaffc0df6
261 changed files with 10591 additions and 231 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -14,3 +14,5 @@
 .pitboss
 .node_modules-target
 node_modules
+__pycache__/
+*.pyc
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,15 @@ All notable changes to Nyx are documented here. The format is based on [Keep a C

 A round of cross-file FastAPI auth, two new sink/validator classes, a ~957-FP Go DAO helper precision pass, four CVE corpus pairs, a local web UI visual refresh, and a performance pass on the auth extractor pipeline plus SCCP and the global summaries hash map.

+This branch also adds seven new vulnerability classes (LDAP injection, XPath injection, header / CRLF injection, open redirect, server-side template injection, XXE, prototype pollution), a `nyx rules` CLI subcommand, two SSA configuration sidecars (XML parser hardening, XPath variable resolver), two new path-state predicates for inline open-redirect sanitisers, and a flow-sensitive `Object.create(null)` recogniser for prototype-pollution suppression.
+
+### Detector classes
+
+- New `Cap` bits and canonical rule ids: `Cap::LDAP_INJECTION` / `taint-ldap-injection`, `Cap::XPATH_INJECTION` / `taint-xpath-injection`, `Cap::HEADER_INJECTION` / `taint-header-injection`, `Cap::OPEN_REDIRECT` / `taint-open-redirect`, `Cap::SSTI` / `taint-template-injection`, `Cap::XXE` / `taint-xxe`, `Cap::PROTOTYPE_POLLUTION` / `taint-prototype-pollution`. Each ships with per-language sink, sanitizer, and (where applicable) gated-sink rules across JS/TS, Python, Java, PHP, Go, Ruby, Rust, and C/C++. Severity, OWASP 2021 mapping, and human-readable description live in a single `CAP_RULE_REGISTRY` table in `src/labels/mod.rs`; `cap_rule_meta()` and `rule_id_for_caps()` are the public lookups.
+- `Cap` widened from `u16` to `u32` to fit the new bits. `Evidence.sink_caps` is now `u32`; `RuleInfo.cap_bits` is also `u32`. The serde decoder accepts any unsigned integer width so caches written before the bump still load. SQLite schema bumped 3 to 4 to force a rescan, since older `source_caps` / `sanitizer_caps` / `sink_caps` blobs were emitted before any of the new bits could appear.
+- `owasp_bucket_for` consults `CAP_RULE_REGISTRY` first so adding a new cap class does not require a second-table edit. The match requires an exact rule id or a recognised separator (` `, `(`, `.`) so a future `taint-ssrf-allowlist-violation` can no longer silently inherit `taint-ssrf`'s OWASP bucket. The legacy family-token table now also routes `xpath`, `header`, and `xxe` to A03 / A05.
+- `issue_category_label` (dashboard badge) routes the seven new rule-id prefixes to dedicated labels: LDAP Injection, XPath Injection, Header Injection, Open Redirect, Template Injection, XXE, Prototype Pollution.
+
 ### Changed

 - Refreshed the local web UI visual system around the mint-cyan Nyx brand: warmer light surfaces, deep green accents, updated severity/confidence colors, tighter typography, smaller radii, denser cards, table, badge, button, header, and sidebar styling, and matched graph/code-viewer colors.
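
Note on the `owasp_bucket_for` entry in the hunk above: the separator rule it describes (exact rule id, or the canonical id followed by ` `, `(`, or `.`) can be sketched as below. This is a minimal illustration, not the code in `src/labels/mod.rs`; the function name `matches_canonical_rule_id` is hypothetical.

```rust
/// Hypothetical helper (not the real `owasp_bucket_for`): returns true when
/// `rule_id` is exactly `canonical`, or `canonical` followed by one of the
/// separators named in the changelog entry (space, `(`, or `.`).
fn matches_canonical_rule_id(rule_id: &str, canonical: &str) -> bool {
    match rule_id.strip_prefix(canonical) {
        // Exact match: nothing left after the canonical id.
        Some("") => true,
        // Otherwise the very next character must be a recognised separator.
        Some(rest) => rest.starts_with(|c: char| matches!(c, ' ' | '(' | '.')),
        // `rule_id` does not start with the canonical id at all.
        None => false,
    }
}
```

Under this rule a longer id such as `taint-ssrf-allowlist-violation` does not match `taint-ssrf`, because `-` is not in the separator set.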
@@ -16,6 +25,32 @@ A round of cross-file FastAPI auth, two new sink/validator classes, a ~957-FP Go

 ### Added

+- `nyx rules list` CLI subcommand. Surfaces the same registry the dashboard's `/api/rules` page reads from: built-in cap-class entries (one per `Cap` with a canonical rule id), per-language label rules (sink / source / sanitizer), gated sinks, and any custom rules from config. Filters: `--lang <slug>`, `--kind <class|source|sink|sanitizer>`, `--class-only` for registry entries only, `--no-class` for per-language rules only. `--json` for machine output. Cap-class entries carry `language = "all"` so a language filter still surfaces them unless `--no-class` is set.
+- `RuleInfo.is_class` / `RuleInfo.emission_active` flags. Cap-class entries carry `is_class = true` so dashboards can group them separately from per-language label rules. `emission_active = false` marks legacy classes (SQL_QUERY, SSRF, FILE_IO, FMT_STRING, DESERIALIZE, CODE_EXEC, CRYPTO) whose findings still surface under the catch-all `taint-unsanitised-flow` rule id; the seven new classes plus `unauthorized_id` and `data_exfil` are `emission_active = true`. The active set is pinned in `cap_rule_registry_emission_active_set_is_pinned` so a future migration of a legacy cap to its specific rule id can't drift silently.
+- XML-parser configuration tracking. New `src/ssa/xml_config.rs` runs alongside type-fact analysis and carries per-receiver `secure_processing` / `disallow_doctype` / `external_entities` flags forward through copy assignments and phi joins (meet for safe flags, sticky union for the unsafe `external_entities` polarity). `xxe_safe()` queries the result at the type-qualified `XmlParser.parse` sink and strips `Cap::XXE` when the parser was provably hardened (JAXP `setFeature(FEATURE_SECURE_PROCESSING, true)`, lxml `XMLParser(resolve_entities=False, no_network=True)`, fast-xml-parser `processEntities: false`). Persisted to `OptimizeResult.xml_parser_config`.
+- XPath-receiver configuration tracking. New `src/ssa/xpath_config.rs` mirrors the XML sidecar for Java's `XPath` instances: `setXPathVariableResolver(...)` flips the receiver's `has_resolver` flag, copy assignments union, phi joins meet. `xpath_safe()` strips `Cap::XPATH_INJECTION` at `xpath.evaluate(expr, ...)` / `xpath.compile(expr)` sinks when the receiver was provably bound to a resolver (parameterised XPath shape). Persisted to `OptimizeResult.xpath_config`.
+- Five new `TypeKind` variants: `LdapClient` (JNDI `InitialDirContext` / `InitialLdapContext`, Spring `LdapTemplate`, ldapjs `createClient`, python-ldap `initialize`, ldap3 `Connection`), `XPathClient` (JAXP `newXPath`, lxml `etree.XPath`, npm `xpath`), `XmlParser` (JAXP factory products: `newDocumentBuilder`, `newSAXParser`, `getXMLReader`), `Template` (Apache FreeMarker `new Template(...)` / `Configuration.getTemplate`), and `NullPrototypeObject` for JS/TS values produced by `Object.create(null)`. Each is wired into `constructor_type` for return-type inference and into `TypeKind::label_prefix()` for type-qualified callee resolution. `XPathClient` is kept distinct from `DatabaseConnection` so a generic `pdo->query` SQL_QUERY sink does not collide with `xpath.query`.
+- `GateActivation::LiteralOnly`. Strict literal-value activation: the gate fires only when the activation argument is a literal that matches `dangerous_values` / `dangerous_prefixes`. Unknown or dynamic activation argument suppresses (no conservative `ALL_ARGS_PAYLOAD` push). Used for ambiguously named matchers where the dangerous shape is identifiable only by an explicit literal flag, e.g. bare `extend` where `jQuery.extend(true, target, src)` is the deep-merge prototype-pollution form but Backbone's `Model.extend({proto})` shares the suffix.
+- Two new `PredicateKind` variants in `src/taint/path_state.rs` for inline open-redirect sanitisers. `RelativeUrlValidated` covers `x.startsWith("/")`, `x.starts_with("/")`, `x.startswith("/")`, PHP `strpos($x, "/") === 0`, and direct `x[0] === "/"`. `HostAllowlistValidated` covers `new URL(x).host === ALLOWED`, `urlparse(x).netloc == ALLOWED`, multi-statement `parsed.host_str() == "..."` for Rust, and `parsed.Host == "..."` / `parsed.Hostname() == "..."` for Go. Both are cap-aware: they clear `Cap::OPEN_REDIRECT` only on the validated branch, leaving any non-redirect taint downstream to fire on its own caps. The Go form gates on case-sensitive capital `H` so a lowercase `u.host == X` field comparison falls through to the generic `Comparison` predicate.
+- `Object.create(null)` recogniser. New `is_object_create_null_call` in `cfg/literals.rs` matches `Object.create(null)` (and parenthesised, awaited, or TS type-cast wrappers) and tags `CallMeta.produces_null_proto = true` for JS/TS calls. Type-fact analysis lifts the flag to `TypeKind::NullPrototypeObject` on the returned SSA value so the synthetic `__index_set__` sink is suppressed flow-sensitively. Phi joins drop the tag back to `Unknown` so a partial null-proto receiver still fires on the unsafe path.
+- CFG-layer prototype-pollution suppression on the synthetic `__index_set__` sink (JS/TS only, recognised by the existing `try_lower_subscript_write` lowering). Three flow-insensitive shapes elide the `Sink(PROTOTYPE_POLLUTION)` label before SSA sees the node: constant-key fold (literal key not in `__proto__` / `constructor` / `prototype`); reject pattern (an enclosing-block sibling `if (idx === "__proto__" || ...) return / throw / break;`); allowlist pattern (an ancestor `if (idx === "name" || idx === "id") { obj[idx] = v }`). Walks stop at the enclosing function so closure-captured guards in an outer scope can't silently authorise inner assignments.
+- Spring MVC `return "redirect:" + tainted` open-redirect recogniser (Java only). New `try_lower_spring_redirect_return` in `cfg/mod.rs` matches the leftmost `+`-chain whose root is a `redirect:` string literal and emits a synthetic `__spring_redirect__` Call sink with `Sink(Cap::OPEN_REDIRECT)` between the predecessors and the Return node. Concatenated identifiers from anywhere in the right-hand chain feed the synthetic node's `arg_uses[0]`, so the existing taint pipeline carries any tainted suffix through OPEN_REDIRECT.
+- Subscript-set form classification for header sinks. `response.headers["X-Foo"] = bar` / `headers["X-Foo"] = bar` (Ruby `element_reference`, JS/TS `subscript_expression`, Python `subscript`) had no `property` field on the LHS, so the existing classification path skipped it. `push_node` now walks into the subscript's `object` and classifies its member-expression text (`response.headers`, `res.headers`, `self.response.headers`), so `Cap::HEADER_INJECTION` fires on the bare bracket form alongside `setHeader` / `res.set` / `headers_mut.insert`.
+- PHP literal extraction extended in `cfg/literals.rs`. `extract_const_string_arg` now folds: PHP `encapsed_string` (double-quoted) when every child is a pure-literal segment; boolean literals (`true` / `false`) so jQuery's `extend(true, target, src)` deep-merge marker activates the `LiteralOnly` gate; leading-string `binary_expression` concat (PHP `"Location: " . $url`, JS/TS `"Location: " + url`) so `dangerous_prefixes` matching activates on partially dynamic concatenations.
+- PHP receiver-text strip for chain construction. `helpers::root_receiver_text` now drops the leading `$` from `variable_name` nodes so `$smarty->fetch(...)` / `$twig->createTemplate(...)` reconstruct as `Smarty.fetch` / `Environment.createTemplate` for suffix-matcher gates instead of carrying a `$smarty.fetch` form that fails the boundary rule.
+- Gate-callee resolution hardening for member-source rewrites. When `first_member_label` rewrites a call's `text` to a Source like `req.body` (because the wrapper carries a member-source argument), the gate matcher now reads the call's `function` / `method` / `name` field instead, so `setValue(target, req.body, ...)` matches the `setValue` proto-pollution gate instead of the rewritten `req.body` text. Whitespace stripped from the function field so multi-line chains still match flat gate matchers.
+- Ruby option-constant lookup in gate activation. Bare `scope_resolution` / `constant` nodes (`Nokogiri::XML::ParseOptions::NOENT`) now fall back to the macro-arg extractor used by C/C++/PHP, so Nokogiri XXE gates activate on idiomatic option-flag arguments rather than firing conservatively on every positional arg.
+- Per-language label rules expanded to cover the seven new caps:
+  - JavaScript / TypeScript: ldapjs `LdapClient.search`, `escapeXpath` / `xpathEscape`, `document.evaluate` / npm `xpath.select`, `setHeader` / `res.set` / `res.append` / `res.headers[]=`, `stripCRLF` / `escapeHeader`, lodash / dot-prop / object-path deep-merge prototype-pollution gates, Handlebars / EJS / Mustache template sinks, fast-xml-parser / xml2js with `processEntities`-aware activation, `redirect` / `Location` open-redirect sinks.
+  - Python: python-ldap `LDAPObject.search_s`, ldap3 `Connection.search`, lxml `etree.XPath` / `lxml.etree.parse` with parser-config awareness, Flask `response.headers[]=` / `make_response`, Jinja2 `Template(...)` and Mako `Template(...)` SSTI sinks, `flask.redirect` / `aiohttp HTTPFound` open-redirect.
+  - Java / Kotlin: `DirContext.search`, `XPath.evaluate` / `XPath.compile`, JAXP `DocumentBuilder.parse` / `SAXParser.parse` / `XMLReader.parse`, FreeMarker `Template.process`, Spring `redirect:` view-name synthetic sink, `HttpServletResponse.setHeader` / `addHeader`.
+  - PHP: `ldap_search` / `ldap_list` / `ldap_read`, `DOMXPath::query` / `DOMXPath::evaluate`, `header()` with leading-prefix activation, Smarty `fetch` / Twig `createTemplate` / Blade compile + `eval` template forms, `loadXML` / `simplexml_load_string` with `LIBXML_NOENT` activation.
+  - Go: `go-ldap conn.Search`, `etree.Path` / `xmlpath.Compile`, `http.Header.Set` / `Response.Header().Set`, `html/template` and `text/template` `Parse(...)`, `encoding/xml.Unmarshal` / `Decoder.Decode`, `http.Redirect` with relative-URL / host-allowlist gating.
+  - Ruby: `Net::LDAP#search`, `Nokogiri::XML::Document#xpath`, `response.headers[]=`, `ERB.new` SSTI, `Nokogiri::XML.parse` with `NOENT` / `DTDLOAD` activation, `redirect_to` with relative-URL gate.
+  - C / C++: libldap `ldap_search_ext_s`, libxml2 `xmlXPathEval`, `curl_easy_setopt` with header-list activation, libxml2 `xmlReadFile` / `xmlReadMemory` with `XML_PARSE_NOENT` activation.
+  - Rust: actix-web `HeaderMap.insert` / `HeaderValue::from_str` header-injection gates. `Redirect::to` retagged from `Cap::SSRF` to `Cap::OPEN_REDIRECT` so the open-redirect rule fires distinctly from the SSRF rule.
+- `NYX_PYTHON_PROTO_POLLUTION` env var flag. Python `dict.update` / `__dict__.update` proto-pollution gates are opt-in: bare `update` overlaps too broadly with `Counter.update` and ordinary state-mutation patterns to ship as a default sink. When the var is set to `1` / `true` / `yes` / `on` the merged slice is leaked into a `'static` reference so the registry's lifetime invariant holds.
+- New per-cap integration suites: `tests/{xpath_injection,xxe,ssti,prototype_pollution,header_injection,open_redirect,ldap_injection}_tests.rs`, plus `python_proto_pollution_tests.rs` for the env-gated Python form. Per-cap fixture trees under `tests/fixtures/<class>/<lang>/` cover safe, unsafe, and irrelevant-baseline shapes for every supported language.
 - FastAPI cross-file `include_router` dependency tracking. New `auth_analysis/router_facts.rs` captures per-file router declarations (`<router> = X(deps=[…])`) and `<parent>.include_router(<child_module>.<child_var>)` edges in pass 1, persists them into `GlobalSummaries::router_facts_by_module`, and resolves them into the active file's `AuthorizationModel::cross_file_router_deps` at pass 2 entry. Transitive lifts (`grandparent → parent → child`) handled by iterative index walk. Module identity is the file basename without `.py` (approximate, but sufficient for airflow-style `task_instances.router` naming). Closes the airflow execution-API shape where a child router lives in `routes/task_instances.py` and its auth is declared on the parent in `routes/__init__.py`.
 - FastAPI router-level `dependencies=[...]` propagation. Module-level `router = APIRouter(dependencies=[Security(...)])` declarations are pre-walked once per file, then merged onto every `@<router>.<verb>(...)` route attached in the same file. Closes airflow's execution-API routes that re-use a single `ti_id_router` declared once at module scope.
 - FastAPI `Security(callable, scopes=[...])` recognised distinctly from `Depends(callable)`. Scoped Security promotes the synthetic `AuthCheck` to `AuthCheckKind::Other` (route-level scope-checked authorization), not just Login. New scope-tracking boolean threaded through `expand_decorator_calls` and `extract_fastapi_dependencies`.
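
Note on the XML-parser configuration entry in the hunk above: the phi-join behaviour it describes (meet for the safe flags, sticky union for the unsafe `external_entities` flag) might look roughly like the sketch below. The struct name, the `join` method, and in particular the exact condition inside `xxe_safe` are assumptions for illustration; the changelog does not spell out the real rule in `src/ssa/xml_config.rs`.

```rust
/// Hypothetical per-receiver fact record; field names follow the changelog.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct XmlParserFacts {
    secure_processing: bool,
    disallow_doctype: bool,
    external_entities: bool,
}

impl XmlParserFacts {
    /// Join at a phi node: a safe flag survives only if it holds on both
    /// incoming paths (meet); the unsafe flag sticks if either path set it.
    fn join(self, other: Self) -> Self {
        Self {
            secure_processing: self.secure_processing && other.secure_processing,
            disallow_doctype: self.disallow_doctype && other.disallow_doctype,
            external_entities: self.external_entities || other.external_entities,
        }
    }

    /// Assumed safety rule: at least one hardening flag is proven and
    /// external entities were never enabled on any path.
    fn xxe_safe(self) -> bool {
        (self.secure_processing || self.disallow_doctype) && !self.external_entities
    }
}
```

With this shape, a parser hardened on only one branch of an `if` is not treated as safe after the join, which matches the conservative direction the entry describes.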
@@ -55,6 +90,15 @@ A round of cross-file FastAPI auth, two new sink/validator classes, a ~957-FP Go

 ### Fixed (false positives)

+- `Object.create(null)` receivers no longer fire prototype-pollution at the synthetic `__index_set__` sink. Suppression is flow-sensitive via `TypeKind::NullPrototypeObject` so a phi join that only sometimes resolves to a null-proto receiver still fires on the unsafe path.
+- `cfg-unguarded-sink` over-fires on JS/TS object-literal property writes guarded by an explicit `__proto__` / `constructor` / `prototype` reject `if` (early `return` / `throw` / `break`) or by an allowlist `if` whose true arm contains the assignment. Resolved at the CFG layer before the SSA sink scan.
+- Spring MVC `return "redirect:" + url` flagged generic `taint-unsanitised-flow` even when the redirect destination was the load-bearing taint. Now routed through the synthetic `__spring_redirect__` sink so the finding emerges as `taint-open-redirect`.
+- `$smarty->fetch(...)` / `$twig->createTemplate(...)` no longer drop their SSTI gate match on idiomatic PHP receiver shapes. Receiver text strip in `helpers::root_receiver_text` rebuilds the chain text with `.` separators.
+- `setValue(target, req.body, ...)` and similar wrappers no longer gate-match on the rewritten Source `req.body` text. Gate matcher now reads the call's `function` / `method` / `name` field when a Source label override has clobbered the call text.
+- Nokogiri / lxml / fast-xml-parser parser bodies hardened with `setFeature` / `processEntities: false` / `XMLParser(resolve_entities=False)` no longer fire `taint-xxe`. Suppression runs through the new `xml_parser_config` sidecar.
+- `XPath` instances bound to `setXPathVariableResolver(...)` no longer fire `taint-xpath-injection` on subsequent `xpath.evaluate(expr, ...)` sinks. Suppression runs through the new `xpath_config` sidecar.
+- Inline `if (!url.startsWith("/")) reject` and `if (new URL(url).host !== ALLOWED) reject` open-redirect sanitisers now narrow the `Cap::OPEN_REDIRECT` bit on the validated branch instead of falling through to the generic `Comparison` predicate. Cap-aware: other taint downstream still fires on its own caps.
+- Rust `Redirect::to` no longer fires `taint-ssrf` for what is structurally an open redirect. Retagged to `Cap::OPEN_REDIRECT` so the report classifies the issue under the correct cap.
 - ~957 gitea backend DAO `go.auth.missing_ownership_check` findings (id-scalar precision pass, see Added).
 - 169 of 216 openmrs `cfg-unguarded-sink` findings (JpaCriteriaQuery type, see Added). Equivalent reductions on xwiki / keycloak Hibernate DAO clusters.
 - joomla and drupal `php.deser.unserialize` flagged inside `Serializable::unserialize($input)` magic-method bodies (passthrough recognition, see Added).
@@ -74,6 +118,8 @@ A round of cross-file FastAPI auth, two new sink/validator classes, a ~957-FP Go
 - New `cfg/cfg_tests.rs` covers ternary-branch CFG lowering shapes.
 - New `summary/tests.rs` covers cross-file `include_router` summary persistence and resolution.
 - Refactor passes across `auth_analysis`, `ssa/const_prop`, `ssa/type_facts`, `summary`, and the per-framework auth extractors (cleaner conditional checks, simpler function signatures, deduplicated assertions). No behaviour change.
+- `parse_cap` and `CapName::FromStr` accept the new short names (`ldap_injection` / `ldapi`, `xpath_injection` / `xpathi`, `header_injection` / `crlf` / `response_splitting`, `open_redirect` / `redirect`, `ssti` / `template_injection`, `xxe`, `prototype_pollution` / `proto_pollution`, plus the existing `data_exfil` alias). The `nyx config add-rule --cap` flag and `[analysis.languages.*.rules]` entries take any of these.
+- Frontend `RuleListItem` carries the new `is_class` flag so the dashboard's Rules page can group cap-class entries separately. `RuleDetailView` adds the same field.

 ## [0.6.1] - 2026-05-03

--- a/README.md
+++ b/README.md
@@ -186,7 +186,7 @@ Two passes over the filesystem, with an optional SQLite index to skip unchanged
 3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries.
 4. **Rank, dedupe, emit**: findings are scored by severity × evidence strength × source-kind exploitability, then emitted to console, JSON, or SARIF.

-Detector families: taint (cross-file source→sink), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://elicpeter.github.io/nyx/detectors.html).
+Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://elicpeter.github.io/nyx/detectors.html).

 ---

@@ -211,7 +211,7 @@ kind     = "sanitizer"
 cap      = "html_escape"
 ```

-Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `all`. Full schema: [Configuration](https://elicpeter.github.io/nyx/configuration.html).
+Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://elicpeter.github.io/nyx/configuration.html). Run `nyx rules list` to browse the registry from the terminal.

 ---

--- a/docs/cli.md
+++ b/docs/cli.md
@@ -275,7 +275,7 @@ Add a custom taint rule. Written to `nyx.local`.
 | `--lang` | `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby` |
 | `--matcher` | Function or property name to match |
 | `--kind` | `source`, `sanitizer`, `sink` |
-| `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all` |
+| `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `data_exfil`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all` |

 ### `nyx config add-terminator`

@@ -287,6 +287,41 @@ Add a terminator function (e.g. `process.exit`). Written to `nyx.local`.

 ---

+## `nyx rules`
+
+Browse the built-in rule registry from the terminal. Same dataset the dashboard's Rules page reads from: cap-class entries (one per `Cap` with a canonical rule id), per-language label rules (sink / source / sanitizer), gated sinks, and any custom rules from your config.
+
+### `nyx rules list`
+
+```
+nyx rules list [--lang <SLUG>] [--kind <KIND>] [--class-only|--no-class] [--json]
+```
+
+| Flag | Values |
+|------|--------|
+| `--lang` | Language slug (`javascript`, `typescript`, `python`, `java`, `php`, `go`, `ruby`, `rust`, `c`, `cpp`). Cap-class entries (`language = "all"`) still surface alongside any language filter unless `--no-class` is set. |
+| `--kind` | `class` (cap-class entry), `source`, `sink`, `sanitizer` |
+| `--class-only` | Show only the cap-class registry entries, suppressing per-language label rules and gated sinks. |
+| `--no-class` | Suppress cap-class registry entries, show only per-language label rules and gated sinks. Conflicts with `--class-only`. |
+| `--json` | Emit JSON instead of the human-readable table. Schema matches the `/api/rules` response. |
+
+Examples:
+
+```bash
+# Browse the seven new vulnerability classes
+nyx rules list --class-only
+
+# All Java sinks
+nyx rules list --lang java --kind sink
+
+# JSON output for scripted filtering
+nyx rules list --json | jq '.[] | select(.cap == "ldap_injection")'
+```
+
+The `enabled` column reflects the `analysis.disabled_rules` overlay from your config, so a rule disabled in `nyx.local` shows up here too. Custom rules added via `nyx config add-rule` appear at the end with `is_custom: true`.
+
+---
+
 ## Exit codes

 See [output.md](output.md#exit-codes). Summary: `0` on success (including findings without `--fail-on`), `1` when `--fail-on` trips, non-zero on scan errors.
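
Note on the `nyx rules list` flag table in the hunk above: the interaction between `--lang`, `--class-only`, and `--no-class` could be modelled as the small predicate below. This is only a sketch of the documented behaviour; `RuleRow`, `Filter`, and `keep` are hypothetical names, and the real CLI may structure its filtering differently.

```rust
/// Hypothetical row in the rule listing.
struct RuleRow {
    language: &'static str, // cap-class entries carry "all"
    is_class: bool,
}

/// Hypothetical parsed flags for `nyx rules list`.
struct Filter<'a> {
    lang: Option<&'a str>,
    class_only: bool,
    no_class: bool,
}

/// Decide whether a row is shown under the given flags.
fn keep(row: &RuleRow, f: &Filter<'_>) -> bool {
    if row.is_class {
        // Cap-class entries ignore the language filter and are hidden
        // only when `--no-class` is set.
        return !f.no_class;
    }
    if f.class_only {
        // `--class-only` suppresses every per-language rule.
        return false;
    }
    // Per-language rules must match `--lang` when it is given.
    f.lang.map_or(true, |l| row.language == l)
}
```

The `--kind` filter is left out of the sketch to keep it short.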
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -253,9 +253,14 @@ cap = "html_escape"       # "env_var" | "html_escape" | "shell_escape" |
                          # "url_encode" | "json_parse" | "file_io" |
                          # "fmt_string" | "sql_query" | "deserialize" |
                          # "ssrf" | "data_exfil" | "code_exec" | "crypto" |
-                          # "unauthorized_id" | "all"
+                          # "unauthorized_id" | "ldap_injection" |
+                          # "xpath_injection" | "header_injection" |
+                          # "open_redirect" | "ssti" | "xxe" |
+                          # "prototype_pollution" | "all"
 ```

+Aliases accepted by `parse_cap` and `[..rules].cap`: `data_exfiltration` for `data_exfil`, `ldapi` for `ldap_injection`, `xpathi` for `xpath_injection`, `crlf` and `response_splitting` for `header_injection`, `redirect` for `open_redirect`, `template_injection` for `ssti`, `proto_pollution` for `prototype_pollution`.
+
 ---

 ## Example Configurations
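
Note on the alias paragraph added in the hunk above: the alias-to-canonical mapping it lists can be written as a single `match`. The sketch below covers only the names mentioned in that paragraph, assumes exact lowercase input, and uses a hypothetical function name; it is not the project's `parse_cap`.

```rust
/// Hypothetical normaliser: maps a cap name or one of its documented
/// aliases to the canonical short name. Older caps (`sql_query`, `ssrf`,
/// and so on) are deliberately omitted from this sketch.
fn canonical_cap_name(name: &str) -> Option<&'static str> {
    Some(match name {
        "data_exfil" | "data_exfiltration" => "data_exfil",
        "ldap_injection" | "ldapi" => "ldap_injection",
        "xpath_injection" | "xpathi" => "xpath_injection",
        "header_injection" | "crlf" | "response_splitting" => "header_injection",
        "open_redirect" | "redirect" => "open_redirect",
        "ssti" | "template_injection" => "ssti",
        "xxe" => "xxe",
        "prototype_pollution" | "proto_pollution" => "prototype_pollution",
        // Anything else is not covered by this sketch.
        _ => return None,
    })
}
```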
--- a/docs/detectors.md
+++ b/docs/detectors.md
@@ -13,11 +13,20 @@ The taint family is split into cap-specific rule classes when a sink callee carr

 | Rule id | Cap | Surface |
 |---|---|---|
-| `taint-unsanitised-flow` | every cap except `data_exfil` and `unauthorized_id` | Default taint flow class |
+| `taint-unsanitised-flow` | `sql_query`, `ssrf`, `code_exec`, `file_io`, `fmt_string`, `deserialize`, `crypto` | Catch-all class for the legacy caps that have not migrated to a dedicated rule id yet. |
+| `taint-ldap-injection` | `ldap_injection` | Attacker-controlled data concatenated into an LDAP filter or DN without RFC 4515 escaping. Receivers typed as `LdapClient` (JNDI `DirContext`, Spring `LdapTemplate`, ldapjs `Client`, python-ldap `LDAPObject`, ldap3 `Connection`) and chained `.search` / `.searchByEntity` / `.search_s` form the sink set. |
+| `taint-xpath-injection` | `xpath_injection` | Attacker-controlled string passed as the XPath expression to `xpath.evaluate` / `xpath.compile` / `document.evaluate` / `DOMXPath::query` / `etree.XPath`. Suppressed when the receiver was bound to an `XPathVariableResolver` (parameterised XPath shape). |
+| `taint-header-injection` | `header_injection` | Attacker-controlled bytes landing in an HTTP response header without `\r\n` stripping (response splitting, cache poisoning). Covers `setHeader` / `res.set` / `res.append` / `headers["X-Foo"] = bar` / `Header().Set` / `add_header` / `setcookie` / `http.Header.Set`. |
+| `taint-open-redirect` | `open_redirect` | Attacker-controlled URL driving a redirect / `Location` header without an allowlist or relative-URL check. Includes the Spring MVC `return "redirect:" + url` view-name shape via the `__spring_redirect__` synthetic sink. Suppressed by `RelativeUrlValidated` (`startsWith("/")` family) and `HostAllowlistValidated` (`new URL(x).host === ALLOWED`, `urlparse(x).netloc == ...`) inline predicates. |
+| `taint-template-injection` | `ssti` | Attacker controls the *template source string* fed to a server-side renderer (Jinja2 / Mako / FreeMarker / Twig / Handlebars / EJS / Mustache / ERB / `text/template` / `html/template` / Smarty / Blade `Template(...)` / `compile(...)`), distinct from rendering a trusted template with tainted variables. |
+| `taint-xxe` | `xxe` | Attacker-controlled XML reaching a parser that resolves external entities. Covers JAXP `DocumentBuilder.parse` / `SAXParser.parse` / `XMLReader.parse`, lxml `etree.parse`, Nokogiri, fast-xml-parser, xml2js, libxml2 `xmlReadFile` / `xmlReadMemory`. Suppressed when the receiver carries a hardening fact in `xml_parser_config` (`secure_processing`, `disallow_doctype`, `processEntities: false`, `LIBXML_NOENT` not set). |
+| `taint-prototype-pollution` | `prototype_pollution` | Attacker-controlled key reaching an object property assignment that can mutate `Object.prototype`. JS/TS only. Covers `obj[tainted] = v` (synthetic `__index_set__` sink), library-mediated deep-merge / set helpers (`_.merge`, `_.set`, `dotProp.set`, `objectPath.set`, `setValue`), and jQuery's `extend(true, target, src)` deep-merge form via the `LiteralOnly` activation gate. Suppressed by constant-key fold (`__proto__` / `constructor` / `prototype` filtering), reject / allowlist guards on the key, and `Object.create(null)` receivers (flow-sensitive `NullPrototypeObject` type). Python equivalent (`dict.update`) is opt-in via `NYX_PYTHON_PROTO_POLLUTION=1`. |
 | `taint-data-exfiltration` | `data_exfil` | Sensitive data flowing into the payload of an outbound network request (body / headers / json on `fetch`, body on `XMLHttpRequest.send`). Distinct from SSRF: the destination is fixed but attacker-influenced bytes leave the process. |
 | `rs.auth.missing_ownership_check.taint` | `unauthorized_id` | Rust auth subsystem fold-in; see [auth.md](auth.md). |

-A single call site can fire several of these at once when it carries multiple gates — `fetch(taintedUrl, {body: tainted})` produces both an SSRF finding (URL flow) and a `taint-data-exfiltration` finding (body flow), each with its own cap mask rather than a conflated union.
+A single call site can fire several of these at once when it carries multiple gates. `fetch(taintedUrl, {body: tainted})` produces both an SSRF finding (URL flow) and a `taint-data-exfiltration` finding (body flow), each with its own cap mask rather than a conflated union.
+
+Each cap-class entry is registered in `CAP_RULE_REGISTRY` (`src/labels/mod.rs`) with its title, severity, OWASP 2021 code, and description. Browse the registry from the CLI with `nyx rules list --class-only`, or `nyx rules list --kind class --json` for machine output.

 For Rust auth-specific rules (`rs.auth.*`), see [auth.md](auth.md).

--- a/docs/detectors/taint.md
+++ b/docs/detectors/taint.md
@@ -135,10 +135,17 @@ Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer onl
 | `sql_query` | | parameterized query binders | `cursor.execute`, `db.query` with concatenation |
 | `deserialize` | | | `pickle.loads`, `yaml.load`, `Marshal.load` |
 | `ssrf` | | URL-prefix locks | `requests.get`, `fetch` URL arg, outbound HTTP destination |
-| `data_exfil` | cookies, headers, env, db rows, file reads (Sensitive-tier sources only) | | `fetch` body / headers / json, `XMLHttpRequest.send` body |
 | `code_exec` | | | `eval`, `exec`, `Function` |
 | `crypto` | | | weak-algorithm constructors |
 | `unauthorized_id` | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write |
+| `ldap_injection` | | `ldap-escape` filter / dn helpers, project-local `escapeLdapFilter` | `DirContext.search`, `LdapClient.search`, `ldap_search`, `Net::LDAP#search`, `ldap_search_ext_s` |
+| `xpath_injection` | | bound `XPathVariableResolver`, `escapeXpath` / `xpathEscape` helpers | `XPath.evaluate`, `DOMXPath::query`, `document.evaluate`, `xpath.select`, `etree.XPath` |
+| `header_injection` | | `stripCRLF` / `escapeHeader` / `sanitizeHeader` | `setHeader`, `res.set`, `res.append`, `headers["X-Foo"] = bar`, `Header().Set`, `header()`, `setcookie` |
+| `open_redirect` | | leading-slash check (`startsWith("/")`), URL-parse + host allowlist (`new URL(x).host === ALLOWED`) | `Redirect::to`, Spring `redirect:` view name, `flask.redirect`, `http.Redirect`, `redirect_to` |
+| `ssti` | | | template constructors fed by tainted source: `Jinja2 Template(...)`, `freemarker.Template`, `Twig::createTemplate`, Handlebars `compile`, `ERB.new`, Mako `Template(...)` |
+| `xxe` | | hardened parser config (`secure_processing`, `disallow-doctype-decl`, `processEntities: false`, `LIBXML_NOENT` not set) | `DocumentBuilder.parse`, `SAXParser.parse`, `xml2js`, `fast-xml-parser`, `lxml.etree.parse`, `xmlReadFile` |
+| `prototype_pollution` | | constant-key fold, reject / allowlist guards on the key, `Object.create(null)` receivers | `obj[tainted] = v` synthetic `__index_set__`, `_.merge`, `_.set`, `dotProp.set`, `objectPath.set`, jQuery `extend(true, ...)` |
+| `data_exfil` | cookies, headers, env, db rows, file reads (Sensitive-tier sources only) | | `fetch` body / headers / json, `XMLHttpRequest.send` body |
 | `all` | Sources typically use `all` so they match any sink | | |

 Sources typically use `cap = "all"` so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
--- a/docs/rules.md
+++ b/docs/rules.md
@ -24,13 +24,22 @@ Language prefixes: `rs`, `c`, `cpp`, `go`, `java`, `js`, `ts`, `py`, `php`, `rb`

 ### Taint

-One rule covers every source-to-sink flow. The parenthetical identifies the source location.
+The taint family is split into cap-specific rule classes. The `taint-unsanitised-flow` id is the catch-all for the legacy caps that have not yet migrated to a dedicated rule id (`sql_query`, `ssrf`, `code_exec`, `file_io`, `fmt_string`, `deserialize`, `crypto`). The seven new vulnerability classes, plus auth and data-exfil, are each reported under their own rule id. The parenthetical identifies the source location.

-| Rule ID | Severity |
-|---|---|
-| `taint-unsanitised-flow (source L:C)` | Varies by source kind and sink capability |
+| Rule ID | Cap | Severity |
+|---|---|---|
+| `taint-unsanitised-flow (source L:C)` | `sql_query` / `ssrf` / `code_exec` / `file_io` / `fmt_string` / `deserialize` / `crypto` | Varies |
+| `taint-ldap-injection` | `ldap_injection` | High |
+| `taint-xpath-injection` | `xpath_injection` | High |
+| `taint-header-injection` | `header_injection` | High |
+| `taint-open-redirect` | `open_redirect` | Medium |
+| `taint-template-injection` | `ssti` | High |
+| `taint-xxe` | `xxe` | High |
+| `taint-prototype-pollution` | `prototype_pollution` | High |
+| `taint-data-exfiltration` | `data_exfil` | High / Medium |
+| `rs.auth.missing_ownership_check.taint` | `unauthorized_id` | High |

-The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in `src/labels/<lang>.rs`. [Language maturity](language-maturity.md) gives per-language counts and what's covered.
+Each cap-class entry is registered in `CAP_RULE_REGISTRY` (`src/labels/mod.rs`). Browse the registry from the CLI with `nyx rules list --class-only`, or via the dashboard's Rules page. The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in `src/labels/<lang>.rs`. [Language maturity](language-maturity.md) gives per-language counts and what's covered.

 ### CFG structural

@ -257,6 +266,8 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]

 `nyx config add-rule --cap <name>` and `[analysis.languages.*.rules]` in config accept:

-`env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all`
+`env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `data_exfil`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`

-Source for both the enum and the `to_cap` mapping: [`src/labels/mod.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/mod.rs) (`Cap`) and [`src/utils/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/utils/config.rs) (`CapName`).
+Aliases: `data_exfiltration` for `data_exfil`, `ldapi` for `ldap_injection`, `xpathi` for `xpath_injection`, `crlf` and `response_splitting` for `header_injection`, `redirect` for `open_redirect`, `template_injection` for `ssti`, `proto_pollution` for `prototype_pollution`.
+
+Source for both the enum and the `to_cap` mapping: [`src/labels/mod.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/mod.rs) (`Cap` and `CAP_RULE_REGISTRY`) and [`src/utils/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/utils/config.rs) (`CapName`).
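The alias list in the `docs/rules.md` hunk above can be read as a two-step lookup: try the canonical name first, then fall back to the alias table. The sketch below is illustrative only. `CANONICAL_CAPS`, `CAP_ALIASES`, and `canonical_cap_name` are hypothetical names, exact case-sensitive matching is an assumption, and the real mapping is the `CapName` type in `src/utils/config.rs`, which may be structured differently.

```rust
/// Canonical cap names accepted by `--cap`, transcribed from the docs hunk above.
const CANONICAL_CAPS: &[&str] = &[
    "env_var", "html_escape", "shell_escape", "url_encode", "json_parse",
    "file_io", "fmt_string", "sql_query", "deserialize", "ssrf", "code_exec",
    "crypto", "unauthorized_id", "data_exfil", "ldap_injection",
    "xpath_injection", "header_injection", "open_redirect", "ssti", "xxe",
    "prototype_pollution", "all",
];

/// Hypothetical alias table: (alias, canonical name), from the "Aliases" line.
const CAP_ALIASES: &[(&str, &str)] = &[
    ("data_exfiltration", "data_exfil"),
    ("ldapi", "ldap_injection"),
    ("xpathi", "xpath_injection"),
    ("crlf", "header_injection"),
    ("response_splitting", "header_injection"),
    ("redirect", "open_redirect"),
    ("template_injection", "ssti"),
    ("proto_pollution", "prototype_pollution"),
];

/// Resolve a user-supplied cap name to its canonical form.
/// Returns `None` for names that are neither canonical nor a known alias.
/// Assumption: matching is exact and case-sensitive.
pub fn canonical_cap_name(input: &str) -> Option<&'static str> {
    if let Some(name) = CANONICAL_CAPS.iter().copied().find(|c| *c == input) {
        return Some(name);
    }
    CAP_ALIASES
        .iter()
        .find(|(alias, _)| *alias == input)
        .map(|(_, canonical)| *canonical)
}
```

Under these assumptions, `canonical_cap_name("crlf")` and `canonical_cap_name("response_splitting")` both resolve to `header_injection`, while an unknown name yields `None`.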
--- a/frontend/src/api/types.ts
+++ b/frontend/src/api/types.ts
@ -355,6 +355,7 @@ export interface RuleListItem {
  enabled: boolean;
  is_custom: boolean;
  is_gated: boolean;
+  is_class: boolean;
  case_sensitive: boolean;
  finding_count: number;
  suppression_rate: number;
--- a/src/ast.rs
+++ b/src/ast.rs
@ -377,7 +377,7 @@ fn build_taint_diag(
    // Resolved sink capability bits, used by deduplication to distinguish
    // sinks with different cap types on the same source line (e.g.
    // `sink_sql(x); sink_shell(x);`).
-    let sink_caps_bits: u16 = cfg_graph[finding.sink]
+    let sink_caps_bits: u32 = cfg_graph[finding.sink]
        .taint
        .labels
        .iter()
@ -385,7 +385,7 @@ fn build_taint_diag(
            crate::labels::DataLabel::Sink(c) => Some(c.bits()),
            _ => None,
        })
-        .fold(0u16, |acc, b| acc | b);
+        .fold(0u32, |acc, b| acc | b);

    // Cap-specific rule-id routing.
    //
@ -508,6 +508,14 @@ fn build_taint_diag(
            || (finding.source_kind.sensitivity() >= crate::labels::Sensitivity::Sensitive
                && (flow_has_body_bind || source_is_credential_bearing)));

+    // Cap-specific rule routing.  Auth-as-taint and data-exfil keep their
+    // pre-existing branches so the routing rules they encode (auth-finding
+    // namespace alignment; body-bind / source-sensitivity gate) stay
+    // exactly as before.  New cap classes (LDAP / XPath / Header / Open
+    // redirect / SSTI / XXE / Prototype pollution) route through
+    // `cap_rule_meta()` so the canonical rule ids in the registry are the
+    // single source of truth.  Legacy generic taint findings continue to
+    // emit `taint-unsanitised-flow`.
    let diag_id = if effective_caps.contains(crate::labels::Cap::UNAUTHORIZED_ID) {
        "rs.auth.missing_ownership_check.taint".to_string()
    } else if is_data_exfil_rule {
@ -516,6 +524,25 @@ fn build_taint_diag(
            source_point.row + 1,
            source_point.column + 1
        )
+    } else if let Some(meta) = [
+        crate::labels::Cap::LDAP_INJECTION,
+        crate::labels::Cap::XPATH_INJECTION,
+        crate::labels::Cap::HEADER_INJECTION,
+        crate::labels::Cap::OPEN_REDIRECT,
+        crate::labels::Cap::SSTI,
+        crate::labels::Cap::XXE,
+        crate::labels::Cap::PROTOTYPE_POLLUTION,
+    ]
+    .iter()
+    .find(|c| effective_caps.contains(**c))
+    .and_then(|c| crate::labels::cap_rule_meta(*c))
+    {
+        format!(
+            "{} (source {}:{})",
+            meta.rule_id,
+            source_point.row + 1,
+            source_point.column + 1
+        )
    } else {
        format!(
            "taint-unsanitised-flow (source {}:{})",
@ -576,6 +603,23 @@ fn build_taint_diag(
            }
            _ => crate::patterns::Severity::Medium,
        }
+    } else if let Some(meta) = [
+        crate::labels::Cap::LDAP_INJECTION,
+        crate::labels::Cap::XPATH_INJECTION,
+        crate::labels::Cap::HEADER_INJECTION,
+        crate::labels::Cap::OPEN_REDIRECT,
+        crate::labels::Cap::SSTI,
+        crate::labels::Cap::XXE,
+        crate::labels::Cap::PROTOTYPE_POLLUTION,
+    ]
+    .iter()
+    .find(|c| effective_caps.contains(**c))
+    .and_then(|c| crate::labels::cap_rule_meta(*c))
+    {
+        // New cap classes draw severity from the rule registry so a single
+        // edit to `CAP_RULE_REGISTRY` cascades through SARIF, the dashboard,
+        // and the integration suite without per-language source-kind nudges.
+        meta.severity
    } else {
        severity_for_source_kind(finding.source_kind)
    };
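The two `else if let Some(meta) = [...]` branches in the `src/ast.rs` hunks above share one idea: pick the first class cap present in `effective_caps`, then read the rule id and severity from the registry. A minimal standalone sketch of that lookup follows. It uses plain `u32` masks instead of the real `Cap` type, the bit positions and the `CapRuleMeta` layout are hypothetical, and the auth and data-exfil branches are omitted. The rule ids and severities are taken from the `docs/rules.md` table in this commit.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Severity {
    Medium,
    High,
}

// Hypothetical bit positions; the real `Cap` bits live in `src/labels/mod.rs`.
pub const SQL_QUERY: u32 = 1 << 0; // stands in for any legacy cap
pub const LDAP_INJECTION: u32 = 1 << 1;
pub const XPATH_INJECTION: u32 = 1 << 2;
pub const HEADER_INJECTION: u32 = 1 << 3;
pub const OPEN_REDIRECT: u32 = 1 << 4;
pub const SSTI: u32 = 1 << 5;
pub const XXE: u32 = 1 << 6;
pub const PROTOTYPE_POLLUTION: u32 = 1 << 7;

/// Hypothetical registry row: one class cap, its rule id, its severity.
pub struct CapRuleMeta {
    pub cap: u32,
    pub rule_id: &'static str,
    pub severity: Severity,
}

/// Order matters: the first matching row wins when several class caps are set.
pub const CAP_RULE_REGISTRY: &[CapRuleMeta] = &[
    CapRuleMeta { cap: LDAP_INJECTION, rule_id: "taint-ldap-injection", severity: Severity::High },
    CapRuleMeta { cap: XPATH_INJECTION, rule_id: "taint-xpath-injection", severity: Severity::High },
    CapRuleMeta { cap: HEADER_INJECTION, rule_id: "taint-header-injection", severity: Severity::High },
    CapRuleMeta { cap: OPEN_REDIRECT, rule_id: "taint-open-redirect", severity: Severity::Medium },
    CapRuleMeta { cap: SSTI, rule_id: "taint-template-injection", severity: Severity::High },
    CapRuleMeta { cap: XXE, rule_id: "taint-xxe", severity: Severity::High },
    CapRuleMeta { cap: PROTOTYPE_POLLUTION, rule_id: "taint-prototype-pollution", severity: Severity::High },
];

/// First registry row whose cap bit is present in `effective_caps`.
pub fn first_class_meta(effective_caps: u32) -> Option<&'static CapRuleMeta> {
    CAP_RULE_REGISTRY
        .iter()
        .find(|m| (effective_caps & m.cap) != 0)
}

/// Build the diagnostic id; `row` / `col` are 0-based, as in the diff above.
pub fn diag_id(effective_caps: u32, row: usize, col: usize) -> String {
    let rule_id = first_class_meta(effective_caps)
        .map_or("taint-unsanitised-flow", |m| m.rule_id);
    format!("{} (source {}:{})", rule_id, row + 1, col + 1)
}
```

Routing both the id and the severity through one helper such as `first_class_meta` would keep the two call sites from drifting apart if a class is added later. Whether that refactor suits the real `build_taint_diag` is a judgment for the authors.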
--- a/src/auth_analysis/mod.rs
+++ b/src/auth_analysis/mod.rs
@ -206,8 +206,8 @@ pub fn run_auth_analysis_with_model(
    // (when provided) for cross-file helpers that live in other files.
    apply_helper_lifting(&mut model, lang, file_path, scan_root, global_summaries);

-    // Phase 1 caller-scope IPA: propagate route-handler-level auth
-    // checks DOWN to callee helper units within the same file.  See
+    // Caller-scope IPA: propagate route-handler-level auth checks DOWN
+    // to callee helper units within the same file.  See
    // [`apply_caller_scope_propagation`] for the propagation rule.
    apply_caller_scope_propagation(&mut model);

@ -547,8 +547,8 @@ fn apply_helper_lifting(
    }
 }

-/// Phase 1 caller-scope IPA: propagate route-handler-level auth checks
-/// DOWN to callee helper units within the same file.
+/// Caller-scope IPA: propagate route-handler-level auth checks DOWN to
+/// callee helper units within the same file.
 ///
 /// `apply_helper_lifting` walks UPWARD: a helper that internally
 /// proves ownership / membership / etc. has its summary lifted onto
--- a/src/cfg/cfg_tests.rs
+++ b/src/cfg/cfg_tests.rs
@ -1190,6 +1190,7 @@ fn clone_preserves_all_sub_structs() {
            destination_uses: None,
            gate_filters: Vec::new(),
            is_constructor: false,
+            produces_null_proto: false,
        },
        taint: TaintMeta {
            labels: {
@ -1841,9 +1842,12 @@ def outer(cmd):
    assert_eq!(kwargs[1].0, "check");
 }

-/// Languages without keyword-argument grammar should leave `kwargs` empty.
+/// JS object-literal positional args lift their `pair` children into
+/// `kwargs` so consumers like xml_config's `processEntities` /
+/// `resolve_entities` opt-in detector can read them without re-walking
+/// the tree-sitter AST.
 #[test]
-fn call_node_kwargs_empty_for_javascript() {
+fn call_node_kwargs_lifts_javascript_object_literal_pairs() {
    let src = br"
            function outer(cmd) {
                child_process.exec(cmd, { shell: true });
@ -1861,9 +1865,12 @@ fn call_node_kwargs_empty_for_javascript() {
                    .is_some_and(|c| c.ends_with("exec"))
        })
        .expect("child_process.exec call node should exist");
+    let kwargs = &call_node.call.kwargs;
    assert!(
-        call_node.call.kwargs.is_empty(),
-        "JS object-literal arg is not a keyword_argument — kwargs should stay empty"
+        kwargs
+            .iter()
+            .any(|(k, vs)| k == "shell" && vs.iter().any(|v| v == "true")),
+        "JS object-literal `{{ shell: true }}` should surface as kwarg, got {kwargs:?}"
    );
 }

--- a/src/cfg/dto.rs
+++ b/src/cfg/dto.rs
@ -7,7 +7,7 @@
 //! Strictly additive: classes whose fields cannot be classified produce
 //! a `DtoFields` with an empty `fields` map, the caller must decide
 //! whether to use that as a "Dto with no inferred fields" or fall back
-//! to the pre-Phase-6 Object/Unknown classification.
+//! to the generic Object/Unknown classification.

 use std::collections::{HashMap, HashSet};

--- a/src/cfg/helpers.rs
+++ b/src/cfg/helpers.rs
@ -35,6 +35,16 @@ pub(crate) fn root_receiver_text(n: Node, lang: &str, code: &[u8]) -> Option<Str
                None => text_of(n, code),
            }
        }
+        // PHP `variable_name` text carries a leading `$` (`$smarty`, `$twig`).
+        // Strip it so chain text built downstream (`{recv}.{method}`) presents
+        // a `.`-only delimiter sequence — required by the suffix-matcher
+        // boundary rule, which only accepts `.`/`:` as chain separators.
+        // Without this strip, gate matchers like `Smarty.fetch` /
+        // `Environment.createTemplate` never fire on idiomatic
+        // `$smarty->fetch(...)` / `$twig->createTemplate(...)` shapes.
+        _ if lang == "php" && n.kind() == "variable_name" => {
+            text_of(n, code).map(|s| s.trim_start_matches('$').to_string())
+        }
        _ => text_of(n, code),
    }
 }
--- a/src/cfg/literals.rs
+++ b/src/cfg/literals.rs
@ -195,6 +195,56 @@ pub(super) fn extract_destination_kwarg_pairs(

+/// True when `call_node` is `Object.create(null)`; the `null` argument may
+/// be wrapped in parentheses, `await`, or a TS type assertion.  Strict
+/// literal-`null` single-argument match, no aliasing through intermediate
+/// variables.  Caller restricts to JS/TS.
+pub(super) fn is_object_create_null_call(call_node: Node, code: &[u8]) -> bool {
+    if !matches!(call_node.kind(), "call_expression") {
+        return false;
+    }
+    let callee = call_node
+        .child_by_field_name("function")
+        .and_then(|f| text_of(f, code))
+        .unwrap_or_default();
+    if callee != "Object.create" {
+        return false;
+    }
+    let Some(args) = call_node.child_by_field_name("arguments") else {
+        return false;
+    };
+    let mut cursor = args.walk();
+    let named: Vec<Node> = args.named_children(&mut cursor).collect();
+    if named.len() != 1 {
+        return false;
+    }
+    let mut arg = named[0];
+    // Unwrap parens / await / TS type-assertions.
+    for _ in 0..4 {
+        match arg.kind() {
+            "parenthesized_expression" => {
+                if let Some(inner) = arg.named_child(0) {
+                    arg = inner;
+                    continue;
+                }
+            }
+            "await_expression" => {
+                if let Some(inner) = arg.child_by_field_name("argument") {
+                    arg = inner;
+                    continue;
+                }
+            }
+            "as_expression" | "type_assertion" => {
+                if let Some(inner) = arg.named_child(0) {
+                    arg = inner;
+                    continue;
+                }
+            }
+            _ => break,
+        }
+    }
+    arg.kind() == "null" || text_of(arg, code).as_deref() == Some("null")
+}
+
 /// Extract the string-literal content at argument position `index` (0-based).
 /// Returns `None` if the argument is not a string literal or the index is out of range.
 pub(super) fn extract_const_string_arg(
    call_node: Node,
    index: usize,
@ -222,6 +272,37 @@ pub(super) fn extract_const_string_arg(
                None
            }
        }
+        // Boolean literals — JS/TS `true`/`false` are their own node kinds; some
+        // grammars wrap them as identifiers carrying the keyword text.  Returned
+        // verbatim so `dangerous_values` matching can detect deep-flag forms
+        // like `extend(true, target, src)`.
+        "true" | "false" => Some(arg.kind().to_string()),
+        // PHP double-quoted strings parse as `encapsed_string` whose body is
+        // a sequence of `string_content` / `escape_sequence` / interpolation
+        // nodes.  Treat the string as constant only when every child is a
+        // pure-literal segment (no `variable_name` / `subscript_expression`
+        // interpolations); the returned value is the concatenation of the
+        // literal segments verbatim.
+        "encapsed_string" => {
+            let mut c = arg.walk();
+            let mut buf = String::new();
+            for ch in arg.named_children(&mut c) {
+                match ch.kind() {
+                    "string_content" => {
+                        if let Some(s) = text_of(ch, code) {
+                            buf.push_str(&s);
+                        }
+                    }
+                    "escape_sequence" => {
+                        if let Some(s) = text_of(ch, code) {
+                            buf.push_str(&s);
+                        }
+                    }
+                    _ => return None,
+                }
+            }
+            Some(buf)
+        }
        "template_string" => {
            // Only treat as constant if no interpolation (no template_substitution children)
            let mut c = arg.walk();
@ -238,6 +319,44 @@ pub(super) fn extract_const_string_arg(
                None
            }
        }
+        // Concat-style binary expression with a leading string literal, e.g.
+        // PHP `"Location: " . $url`, JS/TS `"Location: " + url`.  Returns the
+        // left-most literal so prefix-driven gates (`dangerous_prefixes`) can
+        // activate on partially-dynamic concatenations; falls through to
+        // `None` when the leading segment is not a string literal so
+        // exact-`dangerous_values` matching keeps its strict semantics.
+        "binary_expression" => {
+            let left = arg.child_by_field_name("left")?;
+            match left.kind() {
+                "string"
+                | "string_literal"
+                | "interpreted_string_literal"
+                | "raw_string_literal" => {
+                    let raw = text_of(left, code)?;
+                    if raw.len() >= 2 {
+                        Some(raw[1..raw.len() - 1].to_string())
+                    } else {
+                        None
+                    }
+                }
+                "encapsed_string" => {
+                    let mut c = left.walk();
+                    let mut buf = String::new();
+                    for ch in left.named_children(&mut c) {
+                        match ch.kind() {
+                            "string_content" | "escape_sequence" => {
+                                if let Some(s) = text_of(ch, code) {
+                                    buf.push_str(&s);
+                                }
+                            }
+                            _ => return None,
+                        }
+                    }
+                    Some(buf)
+                }
+                _ => None,
+            }
+        }
        _ => None,
    }
 }
@ -271,6 +390,27 @@ pub(super) fn extract_const_macro_arg(
        "identifier" | "name" | "qualified_name" | "scoped_identifier" => {
            text_of(arg, code).map(|s| s.to_string())
        }
+        // Ruby bare constant (`NOENT`) — leaf form.
+        "constant" => text_of(arg, code).map(|s| s.to_string()),
+        // Ruby scope-qualified constant (`Nokogiri::XML::ParseOptions::NOENT`).
+        // Return only the rightmost `name` segment so the gate's
+        // `dangerous_values` list can stay identifier-bare instead of
+        // enumerating every possible namespacing.  Falls back to the full
+        // text if the `name` field is missing for any reason.
+        "scope_resolution" => arg
+            .child_by_field_name("name")
+            .and_then(|n| text_of(n, code))
+            .map(|s| s.to_string())
+            .or_else(|| text_of(arg, code).map(|s| s.to_string())),
+        // Integer literals at the activation arg position.  PHP / C / C++
+        // commonly use plain `0` to opt into the safe-default option set
+        // (e.g. `simplexml_load_string($xml, "SimpleXMLElement", 0)`).  The
+        // gate's `dangerous_values` list is identifier-only, so returning
+        // the literal text lets the comparison fail against `LIBXML_NOENT`
+        // and suppresses the conservative-fire branch.
+        "integer" | "integer_literal" | "number_literal" | "decimal_integer_literal" => {
+            text_of(arg, code).map(|s| s.to_string())
+        }
        _ => None,
    }
 }
@ -728,35 +868,72 @@ pub(super) fn find_chained_inner_call<'a>(
        return Some((function, inner_text));
    }
    // The function/method field for a chained call is a member_expression
-    // (JS/TS) or attribute (Python) etc.; its `object` field is the
-    // receiver expression.  Only proceed when that receiver is itself a
-    // call.
-    let object = function.child_by_field_name("object")?;
+    // (JS/TS), attribute (Python), or field_expression (Rust); its
+    // receiver is the `object` field (JS/TS/Python) or `value` field
+    // (Rust).  Only proceed when that receiver is itself a call.
+    let object = function
+        .child_by_field_name("object")
+        .or_else(|| function.child_by_field_name("value"))?;
    if !matches!(lookup(lang, object.kind()), Kind::CallFn | Kind::CallMethod) {
        return None;
    }
-    // Recurse: the inner call may itself be chained
-    // (`axios.get(u).then(h).catch(h)`, innermost is `axios.get`).
-    if let Some(inner) = find_chained_inner_call(object, lang, code) {
-        return Some(inner);
-    }
-    // `object` is the innermost call_expression in the chain.  Extract
-    // its callee identifier the same way `first_call_ident_with_span`
-    // does for a CallFn (member_expression text → "http.get").
-    let inner_func = object
+    // Decide whether `object` is itself a chained method call (its
+    // function/method field is a member-style expression). When yes,
+    // recurse one more level so deeper chains resolve to their innermost
+    // method (e.g. `axios.get(u).then(h).catch(h)` → `axios.get`).
+    // When no — the receiver is a plain function/constructor call like
+    // Rust's `HttpResponse::Found()` — descending one more level would
+    // strand us on the non-method leaf whose text would not match any
+    // gate matcher. Stop here and return the current `outer` level,
+    // which IS the innermost method call.
+    let object_function = object
        .child_by_field_name("function")
-        .or_else(|| object.child_by_field_name("method"))
-        .or_else(|| object.child_by_field_name("name"))?;
-    // Multi-line dotted member expressions (`http\n  .get`) include
-    // formatting whitespace in the source-text slice. The labels map
-    // keys are literal `"http.get"` etc., strip whitespace so the
-    // chained-call inner-gate rebinding fires for both single-line and
-    // multi-line chain styles. Also strips `\r` for CRLF sources.
-    // Motivated by upstream Parse Server CVE-2025-64430 which uses the
-    // multi-line `http\n  .get(uri, ...)\n  .on(...)` form.
-    let raw = text_of(inner_func, code)?;
+        .or_else(|| object.child_by_field_name("method"));
+    let object_is_chained_method = object_function
+        .map(|f| {
+            matches!(
+                f.kind(),
+                "member_expression"
+                    | "attribute"
+                    | "field_expression"
+                    | "scoped_identifier"
+                    | "scope_resolution"
+            ) && f
+                .child_by_field_name("object")
+                .or_else(|| f.child_by_field_name("value"))
+                .is_some()
+        })
+        .unwrap_or(false);
+    if object_is_chained_method {
+        // Recurse: the inner call may itself be chained.
+        if let Some(inner) = find_chained_inner_call(object, lang, code) {
+            return Some(inner);
+        }
+        // `object` is the innermost call_expression in the chain.  Extract
+        // its callee identifier the same way `first_call_ident_with_span`
+        // does for a CallFn (member_expression text → "http.get").
+        let inner_func = object
+            .child_by_field_name("function")
+            .or_else(|| object.child_by_field_name("method"))
+            .or_else(|| object.child_by_field_name("name"))?;
+        // Multi-line dotted member expressions (`http\n  .get`) include
+        // formatting whitespace in the source-text slice. The labels map
+        // keys are literal `"http.get"` etc., strip whitespace so the
+        // chained-call inner-gate rebinding fires for both single-line and
+        // multi-line chain styles. Also strips `\r` for CRLF sources.
+        // Motivated by upstream Parse Server CVE-2025-64430 which uses the
+        // multi-line `http\n  .get(uri, ...)\n  .on(...)` form.
+        let raw = text_of(inner_func, code)?;
+        let inner_text: String = raw.chars().filter(|c| !c.is_whitespace()).collect();
+        return Some((object, inner_text));
+    }
+    // Receiver is a non-chained call (Rust constructor `Foo::new()` /
+    // `HttpResponse::Found()`, JS bare `f()`).  Outer level IS the
+    // innermost method call — return its own function text so gate
+    // matching sees the method name.
+    let raw = text_of(function, code)?;
    let inner_text: String = raw.chars().filter(|c| !c.is_whitespace()).collect();
-    Some((object, inner_text))
+    Some((outer, inner_text))
 }

 /// Recursively walk the receiver chain of `outer` (a CallFn / CallMethod
@ -1389,6 +1566,47 @@ pub(super) fn extract_kwargs(call_node: Node, code: &[u8]) -> Vec<(String, Vec<S
    let mut cursor = args_node.walk();
    for child in args_node.named_children(&mut cursor) {
        let kind = child.kind();
+        // JS/TS object-literal positional arg: `f(x, { a: true, b: 'str' })`.
+        // The pairs inside the object are not tree-sitter
+        // `keyword_argument` nodes (those are Python/Ruby), but
+        // downstream consumers (xml_config's
+        // `lookup_kwargs(inst.cfg_node)` JS branch checking
+        // `processEntities`) expect these fields in the kwargs vector.
+        // Lift each `pair` (and `shorthand_property_identifier`) into
+        // the kwargs list using the property name as kwarg name and the
+        // raw text of the value expression as the single value.
+        // Boolean / numeric / string / identifier values all surface as
+        // their textual form, which is what xml_config's kwarg-value
+        // matchers (e.g. `v == "true"`) compare against.
+        if kind == "object" {
+            let mut oc = child.walk();
+            for pair in child.named_children(&mut oc) {
+                let pk = pair.kind();
+                if pk == "pair" {
+                    let Some(kn) = pair.child_by_field_name("key") else {
+                        continue;
+                    };
+                    let Some(vn) = pair.child_by_field_name("value") else {
+                        continue;
+                    };
+                    let Some(raw_name) = text_of(kn, code) else {
+                        continue;
+                    };
+                    let name = raw_name
+                        .trim_start_matches(['"', '\''])
+                        .trim_end_matches(['"', '\''])
+                        .to_string();
+                    if let Some(val_text) = text_of(vn, code) {
+                        out.push((name, vec![val_text.to_string()]));
+                    }
+                } else if pk == "shorthand_property_identifier" {
+                    if let Some(name) = text_of(pair, code) {
+                        out.push((name.to_string(), vec![name.to_string()]));
+                    }
+                }
+            }
+            continue;
+        }
        if kind != "keyword_argument" && kind != "named_argument" {
            continue;
        }
@ -1413,6 +1631,32 @@ pub(super) fn extract_kwargs(call_node: Node, code: &[u8]) -> Vec<(String, Vec<S
        collect_idents_with_paths(vn, code, &mut idents, &mut paths);
        let mut combined = paths;
        combined.extend(idents);
+        // Boolean / numeric literal kwarg values (Python `True`/`False`,
+        // Ruby `true`/`false`/integer/float, JS `true`/`false`/number)
+        // do not surface through `collect_idents_with_paths` — the value
+        // node's kind is `true`/`false`/`integer`/`float`/`number`, not
+        // an identifier kind.  Capture the raw text so consumers like
+        // `xml_config::classify_call` (which checks
+        // `values.iter().any(|v| v == "True" || v == "true")` for the
+        // lxml `resolve_entities=True` opt-in) can match.
+        if combined.is_empty() {
+            if matches!(
+                vn.kind(),
+                "true"
+                    | "false"
+                    | "integer"
+                    | "float"
+                    | "number"
+                    | "string"
+                    | "string_literal"
+                    | "true_constant"
+                    | "false_constant"
+            ) {
+                if let Some(txt) = text_of(vn, code) {
+                    combined.push(txt.trim_matches(['"', '\'']).to_string());
+                }
+            }
+        }
        out.push((name, combined));
    }
    out
@ -1718,6 +1962,29 @@ pub(super) fn extract_arg_string_literals(call_node: Node, code: &[u8]) -> Vec<O
                let raw = text_of(target, code);
                raw.and_then(|s| strip_literal_quotes(&s, target, code))
            }
+            // Boolean / null / numeric literal tokens — capture verbatim so
+            // downstream pattern-aware analysis (e.g. the XXE config-fact
+            // pass that needs to read the boolean polarity arg of
+            // `setFeature(NAME, true)`) can recover the literal text without
+            // re-walking the AST.  Existing string-only consumers (URL
+            // prefix matching, etc.) are unaffected: a "true" / "false"
+            // token never satisfies their matching predicates.
+            "true"
+            | "false"
+            | "null"
+            | "null_literal"
+            | "nil"
+            | "nil_literal"
+            | "none"
+            | "boolean_literal"
+            | "true_literal"
+            | "false_literal"
+            | "decimal_integer_literal"
+            | "integer_literal"
+            | "integer"
+            | "number"
+            | "number_literal"
+            | "decimal_literal" => text_of(target, code).map(|s| s.to_string()),
            _ => None,
        };
        result.push(literal);
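The `binary_expression` arm added to `extract_const_string_arg` above returns only the left-most literal of a concatenation, so prefix-driven gates can still activate on partially dynamic arguments. The sketch below models that behaviour on a toy argument type rather than tree-sitter nodes. `Arg`, `leading_literal`, and `gate_activates` are hypothetical names, and the exact-prefix comparison is an assumption about how `dangerous_prefixes` is applied.

```rust
/// Toy argument shape standing in for a tree-sitter node (hypothetical).
pub enum Arg {
    /// A string literal with its quotes already stripped.
    Str(String),
    /// Any non-literal expression, e.g. a variable.
    Ident(String),
    /// `left + right` (JS/TS) or `left . right` (PHP).
    Concat(Box<Arg>, Box<Arg>),
}

/// Constant text usable for gate matching: the whole literal, or the literal
/// that directly leads a concatenation.  Anything else is `None`, including a
/// concatenation whose left operand is itself a concatenation.
pub fn leading_literal(arg: &Arg) -> Option<&str> {
    match arg {
        Arg::Str(s) => Some(s.as_str()),
        Arg::Concat(left, _) => match left.as_ref() {
            Arg::Str(s) => Some(s.as_str()),
            _ => None,
        },
        Arg::Ident(_) => None,
    }
}

/// A prefix gate fires when the recovered literal starts with any listed prefix.
pub fn gate_activates(arg: &Arg, dangerous_prefixes: &[&str]) -> bool {
    leading_literal(arg)
        .is_some_and(|lit| dangerous_prefixes.iter().any(|p| lit.starts_with(*p)))
}
```

In this model `"Location: " . $url` yields the prefix `Location: ` and activates a `Location:` gate, while `$name . ": x"` yields `None` and the gate stays inactive.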
--- a/src/cfg/mod.rs
+++ b/src/cfg/mod.rs
@ -70,8 +70,8 @@ use literals::{
    extract_destination_field_pairs, extract_destination_kwarg_pairs, extract_kwargs,
    extract_literal_rhs, extract_object_arg_property, extract_shell_array_payload_idents,
    find_call_node, find_call_node_deep, find_chained_inner_call, has_keyword_arg,
-    has_object_arg_property, has_only_literal_args, is_parameterized_query_call,
-    java_chain_arg0_kind_for_method, js_chain_arg0_kind_for_method,
+    has_object_arg_property, has_only_literal_args, is_object_create_null_call,
+    is_parameterized_query_call, java_chain_arg0_kind_for_method, js_chain_arg0_kind_for_method,
    js_chain_outer_method_for_inner, ruby_chain_arg0_for_method, walk_chain_inner_call_args,
 };
 use params::{
@ -359,6 +359,14 @@ pub struct CallMeta {
    /// must not survive into the constructed object.
    #[serde(default)]
    pub is_constructor: bool,
+    /// True when this call is a literal `Object.create(null)`. The returned
+    /// value has no prototype chain.  Consumed by TypeFacts to tag the
+    /// SsaValue with [`crate::ssa::type_facts::TypeKind::NullPrototypeObject`]
+    /// so PROTOTYPE_POLLUTION suppression can fire flow-sensitively at the
+    /// synthetic `__index_set__` sink.  Set during CFG node construction so
+    /// SSA does not need to re-walk the AST.
+    #[serde(default)]
+    pub produces_null_proto: bool,
 }

 /// One gate's contribution at a call site whose callee matches multiple
@ -601,8 +609,7 @@ pub struct BodyMeta {
    /// decorators / annotations / static type text at CFG construction
    /// time.  Same length as `params`; positions with no recoverable
    /// type info are `None`.  Strictly additive, when every entry is
-    /// `None`, downstream behaviour is identical to the pre-Phase-1
-    /// engine.
+    /// `None`, downstream behaviour is identical to the type-unaware path.
    pub param_types: Vec<Option<crate::ssa::type_facts::TypeKind>>,
    /// Per-parameter destructured-binding sibling names.  Same length
    /// as `params`; entry `i` lists field names bound by the same
@ -1811,6 +1818,31 @@ pub(super) fn push_node<'a>(
                    labels.push(l);
                }
            }
+            // Subscript-set form: `response.headers["X-Foo"] = bar`
+            // (Ruby `element_reference`, JS/TS `subscript_expression`,
+            // Python `subscript`).  The LHS has no `property` field, so
+            // walk into the subscript's `object` (or `value`) and classify its
+            // member-expression text (e.g. `response.headers`).  This
+            // lets header-injection sinks fire on the bare bracket form
+            // alongside the `set_header` / `headers_mut.insert` method
+            // shapes already covered above.
+            if labels.is_empty()
+                && matches!(
+                    lhs.kind(),
+                    "subscript_expression" | "subscript" | "element_reference"
+                )
+            {
+                let obj = lhs
+                    .child_by_field_name("object")
+                    .or_else(|| lhs.child_by_field_name("value"))
+                    .or_else(|| lhs.child(0));
+                if let Some(obj_node) = obj
+                    && let Some(obj_text) = member_expr_text(obj_node, code)
+                    && let Some(l) = classify(lang, &obj_text, extra)
+                {
+                    labels.push(l);
+                }
+            }
        }
    }

@ -1933,18 +1965,45 @@ pub(super) fn push_node<'a>(
    {
        let gate_call = call_ast.or_else(|| find_call_node_deep(ast, lang, 4));
        if let Some(cn) = gate_call {
-            let gate_callee_text = if call_ast.is_some() {
+            // Pick the gate's callee text: `text` by default, or the call's
+            // `function`/`method`/`name` field when `text` is unreliable.
+            //
+            // The default is `text`, which by this point reflects the
+            // qualified callee for method calls (`Velocity.evaluate`,
+            // `$smarty->fetch`) reconstructed in the `Kind::CallMethod`
+            // arm.  When `first_member_label` rewrites `text` to a member
+            // Source like `req.body` (because the wrapper carries one as
+            // an argument), the rewrite is correct for source attribution
+            // but defeats gate matching against a bare callee
+            // (`setValue(target, req.body, …)` would gate-match
+            // `req.body` instead of `setValue`).
+            //
+            // Detect that case structurally: a Source label is present AND
+            // the call's function-field text differs from `text`.  The
+            // function field carries the actual callee identifier; when it
+            // disagrees with `text`, `text` was clobbered by a member-source
+            // override and the function field is the right gate target.
+            // Whitespace is stripped to mirror `find_chained_inner_call`
+            // so multi-line chains (`http\n  .get(...)`) still match flat
+            // gate matchers like `http.get`.
+            let function_field_text: Option<String> = cn
+                .child_by_field_name("function")
+                .or_else(|| cn.child_by_field_name("method"))
+                .or_else(|| cn.child_by_field_name("name"))
+                .and_then(|f| text_of(f, code))
+                .map(|t| t.chars().filter(|c| !c.is_whitespace()).collect::<String>());
+            let has_source_label = labels
+                .iter()
+                .any(|l| matches!(l, crate::labels::DataLabel::Source(_)));
+            let gate_callee_text = if let Some(ff) = function_field_text.as_deref()
+                && has_source_label
+                && ff != text.as_str()
+            {
+                ff.to_string()
+            } else if call_ast.is_some() {
                text.clone()
            } else {
-                // Inner call reached via wrapper, use the call-expression's
-                // function name directly. Falls back to `text` so non-call-
-                // expression kinds (method calls, Ruby `call` nodes, macros)
-                // still have a usable callee string.
-                cn.child_by_field_name("function")
-                    .or_else(|| cn.child_by_field_name("method"))
-                    .or_else(|| cn.child_by_field_name("name"))
-                    .and_then(|f| text_of(f, code))
-                    .unwrap_or_else(|| text.clone())
+                function_field_text.unwrap_or_else(|| text.clone())
            };
            let matches = classify_gated_sink(
                lang,
@ -1953,12 +2012,15 @@ pub(super) fn push_node<'a>(
                    extract_const_string_arg(cn, idx, code).or_else(|| {
                        // C/C++ preprocessor macros and PHP `define`d constants
                        // surface as identifier nodes, not string literals.
-                        // Falling back to the macro-arg extractor for those
-                        // languages lets gates like `curl_easy_setopt` /
-                        // `curl_setopt` activate on a `CURLOPT_POSTFIELDS`
-                        // ident match instead of firing conservatively on
-                        // every positional arg.
-                        if matches!(lang, "c" | "cpp" | "c++" | "php") {
+                        // Ruby option constants (e.g.
+                        // `Nokogiri::XML::ParseOptions::NOENT`) surface as
+                        // `scope_resolution` / `constant` nodes.  Falling back
+                        // to the macro-arg extractor for those languages lets
+                        // gates like `curl_easy_setopt` / `curl_setopt` /
+                        // `Nokogiri::XML` activate on a bare-leaf identifier
+                        // match instead of firing conservatively on every
+                        // positional arg.
+                        if matches!(lang, "c" | "cpp" | "c++" | "php" | "ruby" | "rb") {
                            extract_const_macro_arg(cn, idx, code)
                        } else {
                            None
@ -2656,6 +2718,13 @@ pub(super) fn push_node<'a>(
        || call_ast
            .is_some_and(|cn| matches!(cn.kind(), "new_expression" | "object_creation_expression"));

+    // Detect `Object.create(null)` so TypeFacts can tag the returned
+    // SsaValue with `NullPrototypeObject` for flow-sensitive
+    // prototype-pollution suppression.  Restricted to JS/TS where
+    // `Object.create` is the idiomatic null-prototype constructor.
+    let produces_null_proto = matches!(lang, "javascript" | "typescript")
+        && call_ast.is_some_and(|cn| is_object_create_null_call(cn, code));
+
    let idx = g.add_node(NodeInfo {
        kind,
        call: CallMeta {
@ -2672,6 +2741,7 @@ pub(super) fn push_node<'a>(
            destination_uses,
            gate_filters,
            is_constructor,
+            produces_null_proto,
        },
        taint: TaintMeta {
            labels,
@ -2860,6 +2930,31 @@ fn try_lower_subscript_write(
    *call_ordinal += 1;
    let mut uses_all: Vec<String> = vec![arr_text.clone(), idx_text.clone()];
    uses_all.extend(rhs_uses.iter().cloned());
+
+    // Prototype pollution sink classification on the synthetic
+    // `__index_set__` node for JS/TS.  Tainted *key* in `obj[key] = val`
+    // is the pollution channel (a `__proto__` / `constructor` literal flowing
+    // through `key` mutates `Object.prototype` globally), so the gate's
+    // payload arg list is `[0]` (the key only; the value at index 1 is
+    // benign on its own).  Sanitizer recognition here is structural (no
+    // taint engine plumbing) and runs before label attachment, so
+    // suppressed shapes never enter the SSA sink scan:
+    //   * constant string key whose literal value is not in the dangerous
+    //     set (`__proto__` / `constructor` / `prototype`),
+    //   * the assignment is dominated by an `if` whose condition rejects
+    //     dangerous keys with an early `return` / `throw` / `break` /
+    //     `continue`, or that allowlists the key against safe constants on
+    //     its true arm.
+    // `Object.create(null)` receivers are suppressed later, in the SSA layer.
+    let mut pp_labels: smallvec::SmallVec<[DataLabel; 2]> = smallvec::SmallVec::new();
+    let mut pp_payload_args: Option<Vec<usize>> = None;
+    if matches!(lang, "javascript" | "typescript" | "js" | "ts")
+        && !pp_should_suppress_index_set(assign_ast, subscript_node, &arr_text, &idx_text, code)
+    {
+        pp_labels.push(DataLabel::Sink(Cap::PROTOTYPE_POLLUTION));
+        pp_payload_args = Some(vec![0]);
+    }
+
    let n = g.add_node(NodeInfo {
        kind: StmtKind::Call,
        call: CallMeta {
@ -2867,9 +2962,11 @@ fn try_lower_subscript_write(
            receiver: Some(arr_text.clone()),
            arg_uses: vec![vec![idx_text.clone()], rhs_uses.clone()],
            call_ordinal: ord,
+            sink_payload_args: pp_payload_args,
            ..Default::default()
        },
        taint: TaintMeta {
+            labels: pp_labels,
            uses: uses_all,
            ..Default::default()
        },
@ -2883,6 +2980,477 @@ fn try_lower_subscript_write(
    Some(n)
 }

+/// Spring MVC controller-return open-redirect recogniser.  Detects the
+/// shape `return "redirect:" + tainted` (Java string concatenation) and
+/// emits a synthetic `__spring_redirect__` Call sink with
+/// `Sink(OPEN_REDIRECT)` so the existing taint pipeline propagates the
+/// concatenated suffix through the OPEN_REDIRECT cap.  The synthetic
+/// node is sequenced between `preds` and the eventual Return node.
+///
+/// Returns `Some(synthetic_idx)` when matched, otherwise `None`.
+/// Java only — Spring's `redirect:` view-name convention has no
+/// counterpart in the other supported languages, and matching the
+/// literal across non-Spring code would over-fire.
+fn try_lower_spring_redirect_return(
+    ast: Node,
+    preds: &[NodeIndex],
+    g: &mut Cfg,
+    lang: &str,
+    code: &[u8],
+    enclosing_func: Option<&str>,
+    call_ordinal: &mut u32,
+) -> Option<NodeIndex> {
+    if lang != "java" {
+        return None;
+    }
+    // `return EXPR ;` — find the returned expression.  tree-sitter-java
+    // wraps the value in a `return_statement` whose first named child
+    // is the expression.
+    let expr = ast.named_child(0)?;
+    // Strip parentheses.
+    let mut cur = expr;
+    while cur.kind() == "parenthesized_expression" {
+        cur = cur.named_child(0)?;
+    }
+    if cur.kind() != "binary_expression" {
+        return None;
+    }
+    let op = cur.child_by_field_name("operator")?;
+    let op_text = text_of(op, code)?;
+    if op_text != "+" {
+        return None;
+    }
+    // Descend the left spine of left-associated `+` chains so that
+    // `"redirect:" + a + b` still matches (the AST nests as
+    // `(("redirect:" + a) + b)`).
+    let mut leftmost = cur;
+    loop {
+        let left = leftmost.child_by_field_name("left")?;
+        let mut left_inner = left;
+        while left_inner.kind() == "parenthesized_expression" {
+            left_inner = left_inner.named_child(0)?;
+        }
+        if left_inner.kind() == "binary_expression" {
+            let op_l = left_inner.child_by_field_name("operator")?;
+            if text_of(op_l, code).as_deref() == Some("+") {
+                leftmost = left_inner;
+                continue;
+            }
+        }
+        // `left_inner` is the leftmost atom — must be a string literal
+        // whose constant value starts with `redirect:`.
+        if !matches!(left_inner.kind(), "string_literal" | "string") {
+            return None;
+        }
+        let lit = text_of(left_inner, code)?;
+        if lit.len() < 2 {
+            return None;
+        }
+        let inner = &lit[1..lit.len() - 1];
+        if !inner.starts_with("redirect:") {
+            return None;
+        }
+        break;
+    }
+
+    // Collect identifiers referenced anywhere in the original concat
+    // expression — the tainted URL piece is one of them.  Receiver-style
+    // method calls (`view.toString()`) are intentionally captured via
+    // the bare identifier; precision improvements are deferred to the
+    // SSA / abstract-string layer.
+    let mut concat_uses: Vec<String> = Vec::new();
+    collect_idents(cur, code, &mut concat_uses);
+    if concat_uses.is_empty() {
+        return None;
+    }
+
+    let span = (ast.start_byte(), ast.end_byte());
+    let ord = *call_ordinal;
+    *call_ordinal += 1;
+
+    let mut labels: smallvec::SmallVec<[DataLabel; 2]> = smallvec::SmallVec::new();
+    labels.push(DataLabel::Sink(Cap::OPEN_REDIRECT));
+
+    let n = g.add_node(NodeInfo {
+        kind: StmtKind::Call,
+        call: CallMeta {
+            callee: Some("__spring_redirect__".to_string()),
+            arg_uses: vec![concat_uses.clone()],
+            call_ordinal: ord,
+            sink_payload_args: Some(vec![0]),
+            ..Default::default()
+        },
+        taint: TaintMeta {
+            labels,
+            uses: concat_uses,
+            ..Default::default()
+        },
+        ast: AstMeta {
+            span,
+            enclosing_func: enclosing_func.map(|s| s.to_string()),
+        },
+        ..Default::default()
+    });
+    connect_all(g, preds, n, EdgeKind::Seq);
+    Some(n)
+}
+
+/// Prototype-pollution suppression decisions for the synthetic
+/// `__index_set__` node emitted by `try_lower_subscript_write`.
+///
+/// Returns `true` when the assignment is provably safe and the
+/// `Cap::PROTOTYPE_POLLUTION` sink label should be elided.  The three
+/// CFG-layer recognised shapes are flow-insensitive AST patterns:
+///
+/// 1. Constant string key whose value is not one of the dangerous
+///    keys (`__proto__`, `constructor`, `prototype`).  A literal-keyed
+///    write cannot pollute even if the value is tainted.
+/// 2. Reject pattern `if (idx === "__proto__" || idx === "constructor"
+///    || idx === "prototype") <return/throw/break/continue>` preceding
+///    the assignment.  The dangerous-key path terminates before reaching
+///    the synthesised store.
+/// 3. Allowlist pattern `if (idx === "name" || idx === "id") { obj[idx]
+///    = v }`.  The assignment only executes when `idx` is one of a
+///    small set of known-safe constants.
+///
+/// The null-prototype receiver suppression (`Object.create(null)`) is
+/// handled flow-sensitively in the SSA taint engine via
+/// `TypeKind::NullPrototypeObject`, since AST scans cannot honour
+/// branch-local re-bindings or phi joins.
+///
+/// Conservative: any unrecognised shape returns `false` so the sink
+/// label is attached and the SSA layer decides on taint reachability.
+fn pp_should_suppress_index_set(
+    assign_ast: Node,
+    subscript_node: Node,
+    _arr_text: &str,
+    idx_text: &str,
+    code: &[u8],
+) -> bool {
+    // 1. Constant-key fold.
+    if let Some(idx_node) = subscript_node
+        .child_by_field_name("index")
+        .or_else(|| subscript_node.child_by_field_name("subscript"))
+        .or_else(|| {
+            let mut cur = subscript_node.walk();
+            subscript_node.named_children(&mut cur).nth(1)
+        })
+    {
+        if let Some(literal) = pp_string_literal_value(idx_node, code) {
+            return !pp_is_dangerous_proto_key(&literal);
+        }
+    }
+
+    // 2 + 3. Dominator-style guard ancestors (reject + allowlist).
+    if pp_is_guarded_by_proto_check(assign_ast, idx_text, code) {
+        return true;
+    }
+
+    false
+}
+
+/// Dangerous prototype-pollution key strings.  Matches the property
+/// names through which a keyed write can reach a shared prototype object.
+fn pp_is_dangerous_proto_key(s: &str) -> bool {
+    matches!(s, "__proto__" | "constructor" | "prototype")
+}
+
+/// Extract the value of a JS/TS string literal node, stripping the
+/// outer quote bytes (single, double, or backtick).  Returns `None`
+/// for non-literal nodes, template literals containing interpolation,
+/// or anything that doesn't resemble a single-segment string.
+fn pp_string_literal_value(n: Node, code: &[u8]) -> Option<String> {
+    let kind = n.kind();
+    if !matches!(kind, "string" | "string_literal" | "template_string") {
+        return None;
+    }
+    let raw = std::str::from_utf8(&code[n.start_byte()..n.end_byte()]).ok()?;
+    if raw.len() < 2 {
+        return None;
+    }
+    let bytes = raw.as_bytes();
+    let first = bytes[0];
+    let last = bytes[bytes.len() - 1];
+    if !matches!(first, b'"' | b'\'' | b'`') || first != last {
+        return None;
+    }
+    let inner = &raw[1..raw.len() - 1];
+    // Reject template literals carrying `${...}` interpolation — we
+    // can't fold those to a single concrete value.
+    if first == b'`' && inner.contains("${") {
+        return None;
+    }
+    Some(inner.to_string())
+}
+
+/// Walk up from the assignment node looking for two structural guard
+/// shapes:
+///
+/// * **Reject pattern** — a *previous sibling* `if_statement` in any
+///   enclosing block whose condition is `idx === DANGEROUS [|| …]` and
+///   whose consequence terminates control flow (`return` / `throw` /
+///   `break` / `continue`).  The dangerous-key path never reaches the
+///   subsequent assignment.
+/// * **Allowlist pattern** — an *ancestor* `if_statement` whose
+///   condition is `idx === SAFE [|| …]` and through whose consequence
+///   the descendant flows.  Only the safe-key arm reaches the
+///   assignment.
+///
+/// Both shapes must compare against the same key variable as the
+/// synthetic `__index_set__` node.  Stops at the enclosing function so
+/// guards in an outer scope around a closure passed elsewhere don't
+/// accidentally suppress inner assignments.
+fn pp_is_guarded_by_proto_check(from: Node, idx_text: &str, code: &[u8]) -> bool {
+    let mut cur = from;
+    while let Some(parent) = cur.parent() {
+        match parent.kind() {
+            "function_declaration"
+            | "function"
+            | "function_expression"
+            | "arrow_function"
+            | "method_definition"
+            | "generator_function_declaration"
+            | "program"
+            | "source_file" => return false,
+            "if_statement" => {
+                if let Some(cond) = parent.child_by_field_name("condition") {
+                    let consequence = parent.child_by_field_name("consequence");
+                    if let Some(verdict) =
+                        pp_classify_proto_guard(cond, consequence, cur, idx_text, code)
+                    {
+                        return verdict;
+                    }
+                }
+            }
+            _ => {}
+        }
+
+        // Reject pattern: scan previous siblings in the parent block
+        // for `if (idx === DANGEROUS [|| …]) { return; }` shapes that
+        // dominate the assignment via early-return.
+        let mut sibling_cursor = parent.walk();
+        for sibling in parent.named_children(&mut sibling_cursor) {
+            if sibling.start_byte() >= cur.start_byte() {
+                break;
+            }
+            if sibling.kind() != "if_statement" {
+                continue;
+            }
+            if pp_is_reject_pattern(sibling, idx_text, code) {
+                return true;
+            }
+        }
+
+        cur = parent;
+    }
+    false
+}
+
+/// True when `if_node` is `if (idx === DANGEROUS [|| idx === DANGEROUS]
+/// …) { return; / throw …; / break; / continue; }` shaped: every disjunct
+/// compares the named key variable to a dangerous prototype key, and
+/// the consequence terminates control flow.
+fn pp_is_reject_pattern(if_node: Node, idx_text: &str, code: &[u8]) -> bool {
+    let Some(cond) = if_node.child_by_field_name("condition") else {
+        return false;
+    };
+    let consequence = if_node.child_by_field_name("consequence");
+    let clauses = pp_split_or_clauses(cond);
+    if clauses.is_empty() {
+        return false;
+    }
+    for clause in &clauses {
+        let Some((var, lit)) = pp_extract_eq_compare(*clause, code) else {
+            return false;
+        };
+        if var != idx_text || !pp_is_dangerous_proto_key(&lit) {
+            return false;
+        }
+    }
+    consequence.map(pp_block_terminates).unwrap_or(false)
+}
+
+/// Decide whether an enclosing `if` clause around an `__index_set__`
+/// statement constitutes a prototype-pollution guard.
+///
+/// `cond` is the if's condition expression, `consequence` is the
+/// optional consequence block, and `descendant` is the node on the
+/// path from the if-statement down to the assignment (used to
+/// distinguish "assignment lives inside the consequence" from
+/// "assignment lives after the if").  `idx_text` is the textual key
+/// variable used by the synthetic `__index_set__`.
+///
+/// Returns `Some(true)` to suppress, and `None` when the if-statement
+/// is not a recognised guard so the walker continues outward.  No path
+/// currently returns `Some(false)`; it is reserved for keeping the gate.
+fn pp_classify_proto_guard(
+    cond: Node,
+    consequence: Option<Node>,
+    descendant: Node,
+    idx_text: &str,
+    code: &[u8],
+) -> Option<bool> {
+    let cond_clauses = pp_split_or_clauses(cond);
+    if cond_clauses.is_empty() {
+        return None;
+    }
+
+    let mut all_against_idx = true;
+    let mut all_dangerous = true;
+    let mut all_safe = true;
+    for clause in &cond_clauses {
+        let (var, lit) = pp_extract_eq_compare(*clause, code)?;
+        if var != idx_text {
+            all_against_idx = false;
+            break;
+        }
+        let dangerous = pp_is_dangerous_proto_key(&lit);
+        if dangerous {
+            all_safe = false;
+        } else {
+            all_dangerous = false;
+        }
+    }
+    if !all_against_idx {
+        return None;
+    }
+
+    let consequence_contains_descendant = consequence
+        .map(|c| pp_subtree_contains(c, descendant))
+        .unwrap_or(false);
+
+    // Allowlist pattern: every clause is `idx === SAFE` and the
+    // assignment lives inside the consequence (true arm).
+    if all_safe && consequence_contains_descendant {
+        return Some(true);
+    }
+
+    // Reject pattern: every clause is `idx === DANGEROUS` and the
+    // consequence terminates control flow before reaching the
+    // assignment.  Only suppress when the assignment is *outside* the
+    // consequence (i.e., follows the if).
+    if all_dangerous
+        && !consequence_contains_descendant
+        && consequence.map(pp_block_terminates).unwrap_or(false)
+    {
+        return Some(true);
+    }
+
+    None
+}
+
+/// True when `descendant` is identical to or transitively a child of
+/// `root`.  Approximated by byte-range inclusion: `descendant`'s span
+/// must lie within `root`'s span.
+fn pp_subtree_contains(root: Node, descendant: Node) -> bool {
+    let dr = (descendant.start_byte(), descendant.end_byte());
+    let rr = (root.start_byte(), root.end_byte());
+    dr.0 >= rr.0 && dr.1 <= rr.1
+}
+
+/// True when `block` (typically an `if` consequence) terminates
+/// control flow on every path: the last meaningful statement is a
+/// return / throw / break / continue.  Conservative — falls back to
+/// `false` for empty blocks or anything non-trivial.
+fn pp_block_terminates(block: Node) -> bool {
+    // Bare statement consequence (no braces): the if's consequence is
+    // the terminator itself.
+    if pp_is_terminator(block) {
+        return true;
+    }
+    if !matches!(block.kind(), "statement_block" | "block") {
+        return false;
+    }
+    let mut cursor = block.walk();
+    let last_stmt = block.named_children(&mut cursor).last();
+    match last_stmt {
+        Some(s) => pp_is_terminator(s),
+        None => false,
+    }
+}
+
+/// True when `n` is a control-flow-ending statement: return / throw /
+/// break / continue.
+fn pp_is_terminator(n: Node) -> bool {
+    matches!(
+        n.kind(),
+        "return_statement" | "throw_statement" | "break_statement" | "continue_statement"
+    )
+}
+
+/// Split an expression by top-level `||` operators.  Returns the
+/// individual disjunct sub-expressions.  Single (non-OR) expressions
+/// yield a one-element vector.  Walks `binary_expression` nodes whose
+/// `operator` field is `||` and recurses into both sides.
+fn pp_split_or_clauses<'a>(expr: Node<'a>) -> Vec<Node<'a>> {
+    let mut out = Vec::new();
+    pp_collect_or_clauses(expr, &mut out);
+    out
+}
+
+fn pp_collect_or_clauses<'a>(expr: Node<'a>, out: &mut Vec<Node<'a>>) {
+    let stripped = pp_unwrap_paren(expr);
+    if matches!(stripped.kind(), "binary_expression") {
+        let op = stripped
+            .child_by_field_name("operator")
+            .map(|o| o.kind())
+            .unwrap_or("");
+        if op == "||" {
+            if let Some(l) = stripped.child_by_field_name("left") {
+                pp_collect_or_clauses(l, out);
+            }
+            if let Some(r) = stripped.child_by_field_name("right") {
+                pp_collect_or_clauses(r, out);
+            }
+            return;
+        }
+    }
+    out.push(stripped);
+}
+
+fn pp_unwrap_paren(n: Node) -> Node {
+    let mut cur = n;
+    while matches!(cur.kind(), "parenthesized_expression") {
+        match cur.named_child(0) {
+            Some(inner) => cur = inner,
+            None => break,
+        }
+    }
+    cur
+}
+
+/// Extract `(var_text, literal_value)` from an equality comparison
+/// `var === "literal"` / `var == "literal"` (and reversed forms).
+/// Returns `None` for any other shape.
+fn pp_extract_eq_compare(expr: Node, code: &[u8]) -> Option<(String, String)> {
+    let stripped = pp_unwrap_paren(expr);
+    if !matches!(stripped.kind(), "binary_expression") {
+        return None;
+    }
+    let op = stripped
+        .child_by_field_name("operator")
+        .map(|o| o.kind())
+        .unwrap_or("");
+    if !matches!(op, "===" | "==") {
+        return None;
+    }
+    let left = stripped.child_by_field_name("left")?;
+    let right = stripped.child_by_field_name("right")?;
+    let left = pp_unwrap_paren(left);
+    let right = pp_unwrap_paren(right);
+    if let (Some(lv), Some(rs)) = (text_of(left, code), pp_string_literal_value(right, code)) {
+        if matches!(left.kind(), "identifier" | "shorthand_property_identifier") {
+            return Some((lv, rs));
+        }
+    }
+    if let (Some(rv), Some(ls)) = (text_of(right, code), pp_string_literal_value(left, code)) {
+        if matches!(right.kind(), "identifier" | "shorthand_property_identifier") {
+            return Some((rv, ls));
+        }
+    }
+    None
+}
+
 /// Step 1 (`pre_emit_arg_source_nodes`): scan the AST, create Source nodes,
 /// wire them to `preds`, and return (effective_preds, synth_bindings,
 /// uses_only_synth_names).
@ -3682,6 +4250,21 @@ pub(super) fn build_sub<'a>(

                Vec::new()
            } else {
+                // Spring MVC `return "redirect:" + url` open-redirect
+                // synthetic-sink emission.  When matched the synthetic
+                // call is sequenced between `preds` and the Return node.
+                let mut effective_preds: Vec<NodeIndex> = preds.to_vec();
+                if let Some(synth) = try_lower_spring_redirect_return(
+                    ast,
+                    &effective_preds,
+                    g,
+                    lang,
+                    code,
+                    enclosing_func,
+                    call_ordinal,
+                ) {
+                    effective_preds = vec![synth];
+                }
                let ret = push_node(
                    g,
                    StmtKind::Return,
@ -3692,7 +4275,7 @@ pub(super) fn build_sub<'a>(
                    0,
                    analysis_rules,
                );
-                connect_all(g, preds, ret, EdgeKind::Seq);
+                connect_all(g, &effective_preds, ret, EdgeKind::Seq);
                Vec::new() // terminates this path
            }
        }
--- a/src/cfg/params.rs
+++ b/src/cfg/params.rs
@ -13,7 +13,7 @@ use tree_sitter::Node;
 /// of `build_cfg`.  Returns the [`TypeKind::Dto`] carrying the
 /// per-field type map when the class is declared in the same file;
 /// returns `None` otherwise so callers can fall through to the
-/// pre-Phase-6 behaviour (Object / Unknown).
+/// generic Object / Unknown classification.
 fn lookup_dto_class(class_name: &str) -> Option<TypeKind> {
    DTO_CLASSES.with(|cell| cell.borrow().get(class_name).cloned().map(TypeKind::Dto))
 }
@ -27,7 +27,7 @@ fn lookup_dto_class(class_name: &str) -> Option<TypeKind> {
 /// for the JS/TS object-pattern formal `({ a, b, c })`, the entry is
 /// `("a", None, ["b", "c"])`.  Strictly additive: when the param is
 /// not a destructured pattern (or the language has no destructure
-/// concept), behaviour is identical to the pre-Phase-5 names-only path.
+/// concept), behaviour is identical to the names-only path.
 ///
 /// Closes the residual gap behind CVE-2026-25544 (PayloadCMS Drizzle
 /// SQL injection): a per-parameter taint probe that seeds only the
--- a/src/cli.rs
+++ b/src/cli.rs
@ -49,6 +49,7 @@ impl Commands {
        match self {
            Commands::Scan { explain_engine, .. } => *explain_engine,
            Commands::List { .. } => true,
+            Commands::Rules { .. } => true,
            Commands::Config { action } => {
                matches!(action, ConfigAction::Show { .. } | ConfigAction::Path)
            }
@ -459,6 +460,12 @@ pub enum Commands {
        action: ConfigAction,
    },

+    /// Browse the built-in rule registry (cap classes + per-language label rules)
+    Rules {
+        #[command(subcommand)]
+        action: RulesAction,
+    },
+
    /// Start the local web UI for browsing scan results
    Serve {
        /// Path to scan root (defaults to current directory)
@ -525,6 +532,36 @@ pub enum ConfigAction {
    },
 }

+#[derive(Subcommand)]
+pub enum RulesAction {
+    /// List built-in rules
+    List {
+        /// Filter by language slug (e.g. javascript, java, python). Cap-class
+        /// entries (`language = "all"`) are always shown unless `--no-class`
+        /// is set.
+        #[arg(long)]
+        lang: Option<String>,
+
+        /// Filter by rule kind (`class`, `source`, `sink`, `sanitizer`).
+        #[arg(long)]
+        kind: Option<String>,
+
+        /// Show only the cap-class registry entries (one per vulnerability
+        /// class), suppressing per-language label rules.
+        #[arg(long, conflicts_with = "no_class")]
+        class_only: bool,
+
+        /// Suppress cap-class registry entries (show only per-language label
+        /// rules and gated sinks).
+        #[arg(long)]
+        no_class: bool,
+
+        /// Emit JSON instead of the human-readable table.
+        #[arg(long)]
+        json: bool,
+    },
+}
+
 #[derive(Subcommand)]
 pub enum IndexAction {
    /// Build or update index for current project
--- a/src/commands/mod.rs
+++ b/src/commands/mod.rs
@ -10,6 +10,7 @@ pub mod clean;
 pub mod config;
 pub mod index;
 pub mod list;
+pub mod rules;
 pub mod scan;
 #[cfg(feature = "serve")]
 pub mod serve;
@ -352,6 +353,9 @@ pub fn handle_command(
                }
            }
        }
+        Commands::Rules { action } => {
+            self::rules::handle(action, config)?;
+        }
        Commands::Serve {
            path,
            port,
--- a/src/commands/rules.rs
+++ b/src/commands/rules.rs
@ -0,0 +1,248 @@
+//! `nyx rules` subcommand.
+//!
+//! Surfaces the rule registry from the terminal so users can enumerate
+//! the same content that the dashboard's `/api/rules` endpoint and the
+//! browser's Rules page show.  The output composes built-in cap-class
+//! entries (one per `Cap` with a canonical rule id), per-language label
+//! rules (sink/source/sanitizer), gated sinks, and any custom rules
+//! defined in the user's config.
+
+use crate::cli::RulesAction;
+use crate::errors::NyxResult;
+use crate::labels::{self, RuleInfo};
+use crate::utils::config::{Config, RuleKind};
+use console::style;
+
+pub fn handle(action: RulesAction, config: &Config) -> NyxResult<()> {
+    match action {
+        RulesAction::List {
+            lang,
+            kind,
+            class_only,
+            no_class,
+            json: as_json,
+        } => list(
+            config,
+            lang.as_deref(),
+            kind.as_deref(),
+            class_only,
+            no_class,
+            as_json,
+        ),
+    }
+}
+
+fn list(
+    config: &Config,
+    lang_filter: Option<&str>,
+    kind_filter: Option<&str>,
+    class_only: bool,
+    no_class: bool,
+    as_json: bool,
+) -> NyxResult<()> {
+    let mut rules = labels::enumerate_builtin_rules();
+
+    // Apply disabled-rules overlay so the CLI matches the dashboard view.
+    for rule in &mut rules {
+        if config.analysis.disabled_rules.contains(&rule.id) {
+            rule.enabled = false;
+        }
+    }
+
+    // Append custom rules from config.  Mirrors the projection in
+    // `src/server/routes/rules.rs::build_rule_list`.
+    for (cfg_lang, lang_cfg) in &config.analysis.languages {
+        let canonical = labels::canonical_lang(cfg_lang);
+        for cr in &lang_cfg.rules {
+            let kind_str = match cr.kind {
+                RuleKind::Source => "source",
+                RuleKind::Sanitizer => "sanitizer",
+                RuleKind::Sink => "sink",
+            };
+            let id = labels::custom_rule_id(canonical, kind_str, &cr.matchers);
+            let first = cr.matchers.first().map(|s| s.as_str()).unwrap_or("?");
+            let title = format!("{} (custom {})", first, kind_str);
+            let cap = cr.cap.to_cap();
+            let enabled = !config.analysis.disabled_rules.contains(&id);
+            rules.push(RuleInfo {
+                id,
+                title,
+                language: canonical.to_string(),
+                kind: kind_str.to_string(),
+                cap: labels::cap_to_name(cap).to_string(),
+                cap_bits: cap.bits(),
+                matchers: cr.matchers.clone(),
+                case_sensitive: cr.case_sensitive,
+                is_custom: true,
+                is_gated: false,
+                is_class: false,
+                emission_active: true,
+                enabled,
+            });
+        }
+    }
+
+    // Filter.
+    let lang_filter_canonical = lang_filter.map(labels::canonical_lang);
+    rules.retain(|r| {
+        if class_only && !r.is_class {
+            return false;
+        }
+        if no_class && r.is_class {
+            return false;
+        }
+        if let Some(want) = lang_filter_canonical {
+            // Cap-class entries (`language == "all"`) are language-agnostic;
+            // surface them alongside any language filter unless explicitly
+            // suppressed via `--no-class`.
+            if r.language != want && r.language != "all" {
+                return false;
+            }
+        }
+        if let Some(want) = kind_filter
+            && !r.kind.eq_ignore_ascii_case(want)
+        {
+            return false;
+        }
+        true
+    });
+
+    if as_json {
+        let body = serde_json::to_string_pretty(&rules)
+            .map_err(|e| crate::errors::NyxError::Msg(format!("rules JSON serialise: {e}")))?;
+        println!("{body}");
+        return Ok(());
+    }
+
+    if rules.is_empty() {
+        println!("{}", style("(no rules match the supplied filters)").dim());
+        return Ok(());
+    }
+
+    // Header.
+    println!(
+        "{}",
+        style("Rules (built-in registry, per-language labels, and custom rules from config)")
+            .bold()
+    );
+    println!();
+
+    // Cap-class section first, distinct from per-language entries.
+    let class_rules: Vec<&RuleInfo> = rules.iter().filter(|r| r.is_class).collect();
+    if !class_rules.is_empty() {
+        println!("  {}", style("Vulnerability classes").cyan().bold());
+        for r in &class_rules {
+            print_class_row(r);
+        }
+        println!();
+    }
+
+    let builtin_label_rules: Vec<&RuleInfo> = rules
+        .iter()
+        .filter(|r| !r.is_class && !r.is_custom)
+        .collect();
+    if !builtin_label_rules.is_empty() {
+        println!("  {}", style("Built-in label rules").cyan().bold());
+        for r in &builtin_label_rules {
+            print_label_row(r);
+        }
+        println!();
+    }
+
+    let custom_rules: Vec<&RuleInfo> = rules.iter().filter(|r| r.is_custom).collect();
+    if !custom_rules.is_empty() {
+        println!("  {}", style("Custom rules (from config)").cyan().bold());
+        for r in &custom_rules {
+            print_label_row(r);
+        }
+        println!();
+    }
+
+    println!(
+        "{}",
+        style(format!(
+            "{} class · {} built-in label · {} custom · {} total",
+            class_rules.len(),
+            builtin_label_rules.len(),
+            custom_rules.len(),
+            rules.len()
+        ))
+        .dim()
+    );
+
+    Ok(())
+}
+
+fn print_class_row(r: &RuleInfo) {
+    let status = if r.enabled {
+        style("on ").green().to_string()
+    } else {
+        style("off").red().dim().to_string()
+    };
+    // Forward-declared classes (registered but not yet wired through
+    // `ast.rs::diag_for_finding`) carry a tag so users don't expect
+    // findings under the class id; live findings still surface under
+    // the legacy `taint-unsanitised-flow` rule id.
+    let tag = if r.emission_active {
+        String::new()
+    } else {
+        format!(" {}", style("(forward-declared)").yellow())
+    };
+    println!(
+        "    {} {:<32} {} {}{}",
+        status,
+        style(&r.id).white().bold(),
+        style(format!("[{}]", r.cap)).dim(),
+        style(&r.title).dim(),
+        tag,
+    );
+}
+
+fn print_label_row(r: &RuleInfo) {
+    let status = if r.enabled {
+        style("on ").green().to_string()
+    } else {
+        style("off").red().dim().to_string()
+    };
+    let tag = if r.is_custom {
+        style(" custom").yellow().to_string()
+    } else if r.is_gated {
+        style(" gated").magenta().to_string()
+    } else {
+        String::new()
+    };
+    let matchers = if r.matchers.is_empty() {
+        String::new()
+    } else {
+        let joined = r.matchers.join(", ");
+        format!(" — {joined}")
+    };
+    println!(
+        "    {} {:<10} {:<10} {:<14}{}{}",
+        status,
+        style(&r.language).cyan(),
+        style(&r.kind).white(),
+        style(&r.cap).dim(),
+        tag,
+        style(matchers).dim(),
+    );
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::utils::config::Config;
+
+    #[test]
+    fn list_runs_without_panic_default_config() {
+        let cfg = Config::default();
+        // Plain list, no filters.
+        list(&cfg, None, None, false, false, false).unwrap();
+        // Class-only.
+        list(&cfg, None, None, true, false, false).unwrap();
+        // JSON output.
+        list(&cfg, None, None, false, false, true).unwrap();
+        // Lang + kind filters.
+        list(&cfg, Some("javascript"), Some("sink"), false, true, false).unwrap();
+    }
+}
--- a/src/commands/scan.rs
+++ b/src/commands/scan.rs
@ -544,14 +544,14 @@ pub(crate) fn deduplicate_taint_flows(diags: &mut Vec<Diag>) {
        id.starts_with(TAINT_BASE)
    }

-    fn sink_cap_bits(d: &Diag) -> u16 {
+    fn sink_cap_bits(d: &Diag) -> u32 {
        d.evidence.as_ref().map(|e| e.sink_caps).unwrap_or(0)
    }

    // Group candidates by (path, line, severity, sink_cap_bits). Only
    // `taint-unsanitised-flow` rule IDs participate; findings with other
    // bases (e.g. `js.code_exec.eval`) are left untouched per guardrails.
-    let mut groups: HashMap<(String, usize, Severity, u16), Vec<usize>> = HashMap::new();
+    let mut groups: HashMap<(String, usize, Severity, u32), Vec<usize>> = HashMap::new();
    for (i, d) in diags.iter().enumerate() {
        if is_taint_flow(&d.id) {
            groups
@ -690,8 +690,8 @@ pub const SCC_UNCONVERGED_CROSS_FILE_NOTE_PREFIX: &str = "scc_unconverged:cross-
 /// file set.  Semantics match [`diff_cap_snapshots`], a key that
 /// appears or disappears counts as changed.
 fn changed_cap_keys_of(
-    before: &HashMap<crate::symbol::FuncKey, (u16, u16, u16, Vec<usize>)>,
-    after: &HashMap<crate::symbol::FuncKey, (u16, u16, u16, Vec<usize>)>,
+    before: &HashMap<crate::symbol::FuncKey, (u32, u32, u32, Vec<usize>)>,
+    after: &HashMap<crate::symbol::FuncKey, (u32, u32, u32, Vec<usize>)>,
 ) -> HashSet<crate::symbol::FuncKey> {
    let mut changed = HashSet::new();
    for (k, v_after) in after {
@ -971,10 +971,10 @@ fn run_topo_batches(
            // with a 64-iter budget; the classifier only needs the tail.
            let mut delta_trajectory: smallvec::SmallVec<[u32; 4]> = smallvec::SmallVec::new();

-            // Phase-B worklist: files to re-analyse in this iteration.
+            // SCC fixpoint worklist: files to re-analyse in this iteration.
            // Initialised to the full batch so iteration 0 behaves like
-            // the pre-Phase-B implementation; subsequent iterations
-            // prune to files containing a caller of a changed summary.
+            // the unconditional re-analysis; subsequent iterations prune
+            // to files containing a caller of a changed summary.
            //
            // Storing `PathBuf` clones (matching how the rest of the
            // SCC loop identifies files) so membership tests are cheap
--- a/src/constraint/domain.rs
+++ b/src/constraint/domain.rs
@ -113,22 +113,22 @@ impl ConstValue {

 // ── TypeSet ─────────────────────────────────────────────────────────────

-/// Bitset over [`TypeKind`] variants (12 bits used of u16).
+/// Bitset over [`TypeKind`] variants (19 bits used of u32).
 #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
-pub struct TypeSet(u16);
+pub struct TypeSet(u32);

 impl TypeSet {
-    /// All 12 type bits set, no type constraint (Top).
-    pub const TOP: Self = Self(0x0FFF);
+    /// All 19 type bits set, no type constraint (Top).
+    pub const TOP: Self = Self(0x0007_FFFF);
    /// No type bits, unsatisfiable (Bottom).
    pub const BOTTOM: Self = Self(0);

    pub fn singleton(kind: &TypeKind) -> Self {
-        Self(1u16 << type_kind_index(kind))
+        Self(1u32 << type_kind_index(kind))
    }

    pub fn contains(&self, kind: &TypeKind) -> bool {
-        self.0 & (1u16 << type_kind_index(kind)) != 0
+        self.0 & (1u32 << type_kind_index(kind)) != 0
    }

    /// Meet (intersection): refine type knowledge.
@ -156,7 +156,7 @@ impl TypeSet {

    /// Check if this set contains exactly one type matching the given kind.
    pub fn is_singleton_of(&self, kind: &TypeKind) -> bool {
-        self.0 != 0 && self.0 == (1u16 << type_kind_index(kind))
+        self.0 != 0 && self.0 == (1u32 << type_kind_index(kind))
    }

    /// Return the TypeKind if this is a singleton set (exactly one type).
@ -186,12 +186,21 @@ fn type_kind_index(kind: &TypeKind) -> u32 {
        TypeKind::LocalCollection => 12,
        TypeKind::RequestBuilder => 13,
        TypeKind::JpaCriteriaQuery => 14,
+        TypeKind::LdapClient => 15,
+        TypeKind::XPathClient => 16,
+        TypeKind::XmlParser => 17,
+        TypeKind::Template => 18,
        // the analysis DTO types carry per-field structural info that the
        // bitset domain can't represent.  Collapse to Unknown so callers
        // still see "any type possible" rather than crashing on an
        // unhandled variant.  Same-file/cross-file Dto-aware paths read
        // the structured TypeKind directly, not via this index.
        TypeKind::Dto(_) => 6,
+        // NullPrototypeObject is a JS-only sub-kind of Object used for
+        // flow-sensitive prototype-pollution suppression.  The bitset
+        // domain has no dedicated slot, so it shares the Object index and
+        // singleton recovery still maps to a meaningful TypeKind.
+        TypeKind::NullPrototypeObject => 3,
    }
 }

@ -212,6 +221,10 @@ fn type_kind_from_index(idx: u32) -> Option<TypeKind> {
        12 => Some(TypeKind::LocalCollection),
        13 => Some(TypeKind::RequestBuilder),
        14 => Some(TypeKind::JpaCriteriaQuery),
+        15 => Some(TypeKind::LdapClient),
+        16 => Some(TypeKind::XPathClient),
+        17 => Some(TypeKind::XmlParser),
+        18 => Some(TypeKind::Template),
        _ => None,
    }
 }
@ -801,7 +814,7 @@ pub struct PathEnv {
    /// Per-key meet count for widening decisions.
    meet_counts: SmallVec<[(SsaValue, u8); 8]>,
    /// Refinement counter (bounded per block).
-    refine_count: u16,
+    refine_count: u32,
 }

 impl PathEnv {
@ -837,7 +850,7 @@ impl PathEnv {
        if self.unsat {
            return;
        }
-        if self.refine_count >= MAX_REFINE_PER_BLOCK as u16 {
+        if self.refine_count >= MAX_REFINE_PER_BLOCK as u32 {
            return; // bounded
        }
        let canonical = self.uf.find_immutable(v);
@ -860,7 +873,7 @@ impl PathEnv {
        // but `refine_single` is also invoked directly from `assume_eq`,
        // `assume_neq`, and a few internal sites.  Large generated inputs
        // (thousands of short statements on one line) can drive millions
-        // of calls and overflow a plain u16 `refine_count`.  Saturate to
+        // of calls; even as a u32, `refine_count` must not wrap.  Saturate to
        // stay within bounds, the refinement pipeline is already
        // idempotent past the cap, so saturation is semantically a no-op.
        self.refine_count = self.refine_count.saturating_add(1);
--- a/src/constraint/solver.rs
+++ b/src/constraint/solver.rs
@ -250,6 +250,31 @@ pub fn class_name_to_type_kind(name: &str) -> Option<TypeKind> {
        // Java I/O supertypes (enables hierarchy fallback for subtypes)
        | "InputStream" | "OutputStream" | "Reader" | "Writer" | "PrintWriter"
        | "BufferedInputStream" | "BufferedOutputStream" => Some(TypeKind::FileHandle),
+        // JNDI / Spring LDAP directory-service types.  Field- and method-typed
+        // declarations (`DirContext ctx = ...`, `LdapTemplate ldapTemplate;`)
+        // attach this fact to the receiver SSA value so type-qualified
+        // resolution rewrites `ctx.search(...)` → `LdapClient.search`.
+        "DirContext" | "LdapContext" | "InitialDirContext" | "InitialLdapContext"
+        | "LdapTemplate" => Some(TypeKind::LdapClient),
+        // JAXP XML parser instances.  Field/local declarations like
+        // `DocumentBuilder builder = factory.newDocumentBuilder();` route
+        // through this map so the receiver SSA value carries
+        // `TypeKind::XmlParser` and the type-qualified
+        // `XmlParser.parse` rule fires on `builder.parse(...)`.
+        "DocumentBuilder" | "SAXParser" | "XMLReader" | "SAXBuilder" => {
+            Some(TypeKind::XmlParser)
+        }
+        // JAXP XPath instances.  `XPath xpath = factory.newXPath();`
+        // routes through this map so the receiver carries
+        // `TypeKind::XPathClient`, enabling the type-qualified
+        // `XPathClient.evaluate` resolution and the resolver-binding
+        // suppression sidecar.
+        "XPath" | "XPathExpression" => Some(TypeKind::XPathClient),
+        // Apache FreeMarker `Template` declared receiver type.  Routes
+        // `Template tpl = ...; tpl.process(model, out)` through
+        // type-qualified resolution to `Template.process`, the SSTI
+        // sink defined in `labels/java.rs`.
+        "Template" => Some(TypeKind::Template),
        // Python qualified type names.
        // Only covers raw lowered names from isinstance(). The lowering in lower.rs
        // extracts the literal type text: isinstance(x, requests.Session) produces
--- a/src/database.rs
+++ b/src/database.rs
@ -225,7 +225,17 @@ pub mod index {
    /// * `"3"`, `ssa_function_bodies.body` changed from JSON TEXT to
    ///   bincode BLOB.  Old JSON payloads cannot be deserialised by the
    ///   new engine, so they are silently rebuilt on open.
-    pub const SCHEMA_VERSION: &str = "3";
+    /// * `"4"`, `Cap` widened from u16 to u32 to accommodate cap bits
+    ///   ≥ 14 (LDAP_INJECTION, XPATH_INJECTION, HEADER_INJECTION,
+    ///   OPEN_REDIRECT, SSTI, XXE, PROTOTYPE_POLLUTION).  The `Cap`
+    ///   deserialiser accepts both u16- and u32-width JSON values, so
+    ///   pre-bump caches load without crashing, but the cached
+    ///   `source_caps` / `sanitizer_caps` / `sink_caps` blobs were
+    ///   produced before any of these caps could appear and would
+    ///   underreport rules that emit them.  Bumping forces a rescan so
+    ///   newly-emitted gates and sinks land in the cache with the wider
+    ///   footprint.
+    pub const SCHEMA_VERSION: &str = "4";

    // TODO: ADD CLEANS FOR EACH TABLE BASED ON PROJECT WHICH RUNS ON CLEAN
    // TODO: ADD DROP AND GIVE A CLI PARAMETER FOR DROP
@ -2899,6 +2909,8 @@ fn make_test_callee_body(
            type_facts: crate::ssa::type_facts::TypeFactResult {
                facts: std::collections::HashMap::new(),
            },
+            xml_parser_config: crate::ssa::xml_config::XmlParserConfigResult::default(),
+            xpath_config: crate::ssa::xpath_config::XPathConfigResult::default(),
            alias_result: crate::ssa::alias::BaseAliasResult::empty(),
            points_to: crate::ssa::heap::PointsToResult::empty(),
            module_aliases: std::collections::HashMap::new(),
@ -3765,7 +3777,7 @@ fn metadata_table_survives_clear() {
 /// receiver sentinel (`u32::MAX`), the container-element marker
 /// (`<elem>`), and the `overflow` flag across serialise → store →
 /// load → deserialise.  This is the strict-additive contract for
-/// pre-Phase-5 blobs (default-empty deserialises cleanly) and the
+/// older blobs without field_points_to (default-empty deserialises cleanly) and the
 /// completeness check for the W3 cross-call resolver.
 #[test]
 fn ssa_summaries_round_trip_preserves_field_points_to() {
@ -3840,15 +3852,15 @@ fn ssa_summaries_round_trip_preserves_field_points_to() {
    assert!(!sum.field_points_to.overflow);
 }

-/// Pre-Phase-5 blob compatibility: a summary serialised without
+/// Older blob compatibility: a summary serialised without
 /// `field_points_to` deserialises with the empty default, no
 /// migration needed because the field is `#[serde(default)]`.
 #[test]
-fn ssa_summaries_pre_phase5_blob_decodes_with_empty_field_points_to() {
+fn ssa_summaries_legacy_blob_decodes_with_empty_field_points_to() {
    use crate::summary::ssa_summary::SsaFuncSummary;

    // Hand-craft JSON without the `field_points_to` key.
-    let pre_phase5_json = r#"{
+    let legacy_json = r#"{
        "param_to_return": [],
        "param_to_sink": [],
        "source_caps": 0,
@ -3865,7 +3877,7 @@ fn ssa_summaries_pre_phase5_blob_decodes_with_empty_field_points_to() {
        "return_path_facts": [],
        "typed_call_receivers": []
    }"#;
-    let sum: SsaFuncSummary = serde_json::from_str(pre_phase5_json).unwrap();
+    let sum: SsaFuncSummary = serde_json::from_str(legacy_json).unwrap();
    assert!(
        sum.field_points_to.is_empty(),
        "missing field_points_to must default to empty",
--- a/src/evidence.rs
+++ b/src/evidence.rs
@ -217,15 +217,15 @@ pub struct Evidence {
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub symbolic: Option<SymbolicVerdict>,

-    /// Resolved sink capability bits (u16 from `Cap::bits()`).
+    /// Resolved sink capability bits (u32 from `Cap::bits()`).
    ///
    /// Used by deduplication to distinguish findings that share a
    /// `(path, line, severity)` key but target different sinks (e.g.
    /// `sink_sql(x); sink_shell(x);` on the same line). 0 when the sink
    /// caps could not be resolved at the CFG node (e.g. pure summary
    /// resolution where the caller's sink node carries no label).
-    #[serde(default, skip_serializing_if = "is_zero_u16")]
-    pub sink_caps: u16,
+    #[serde(default, skip_serializing_if = "is_zero_cap_bits")]
+    pub sink_caps: u32,

    /// Engine provenance notes attached to this finding (e.g. "worklist
    /// iteration budget was hit before convergence"), propagated from
@ -243,7 +243,7 @@ pub struct Evidence {
    pub data_exfil_field: Option<String>,
 }

-fn is_zero_u16(v: &u16) -> bool {
+fn is_zero_cap_bits(v: &u32) -> bool {
    *v == 0
 }

--- a/src/labels/c.rs
+++ b/src/labels/c.rs
@ -67,6 +67,30 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::SSRF),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // OpenLDAP / libldap surface: `ldap_search_s(ld, base, scope, filter, ...)`
+    // and the extended synchronous variant `ldap_search_ext_s(ld, base, scope, filter,
+    // attrs, attrsonly, serverctrls, clientctrls, timeout, sizelimit, *res)`.
+    // The filter argument (position 3) is the LDAP-injection vector.  No
+    // standard libldap escape helper exists in the C surface; sanitisation is
+    // typically caller-implemented (`sanitize_*` covers the developer-named
+    // case via the existing prefix rule above).
+    LabelRule {
+        matchers: &["ldap_search_s", "ldap_search_ext_s"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // libxml2 evaluation entry points: `xmlXPathEvalExpression(expr, ctx)`,
+    // `xmlXPathEval(expr, ctx)`, `xmlXPathCompile(expr)`.  The expression
+    // string is arg 0 and is the canonical XPath-injection vector.
+    LabelRule {
+        matchers: &["xmlXPathEvalExpression", "xmlXPathEval", "xmlXPathCompile"],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
 ];

 /// Gated sinks for C.
--- a/src/labels/cpp.rs
+++ b/src/labels/cpp.rs
@ -89,6 +89,24 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::SSRF),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // OpenLDAP / libldap C interface (also used from C++ wrappers): the filter
+    // argument can carry attacker-controlled data unless explicitly escaped.
+    LabelRule {
+        matchers: &["ldap_search_s", "ldap_search_ext_s"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // libxml2 (the dominant C++ XML parser surface): `xmlXPathEvalExpression`,
+    // `xmlXPathEval`, `xmlXPathCompile` accept the expression string as arg 0.
+    LabelRule {
+        matchers: &["xmlXPathEvalExpression", "xmlXPathEval", "xmlXPathCompile"],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
 ];

 /// Gated sinks for C++.
--- a/src/labels/go.rs
+++ b/src/labels/go.rs
@ -148,6 +148,97 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::CRYPTO),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // go-ldap (`github.com/go-ldap/ldap/v3`): `conn, _ := ldap.DialURL(url);
+    // req := ldap.NewSearchRequest(base, scope, deref, sizeLimit, timeLimit,
+    // typesOnly, filter, attrs, controls)`.  The filter argument (position 6)
+    // is the LDAP-injection vector; passing the request to `conn.Search(req)`
+    // executes the filter.  Type-qualified resolution rewrites `conn.Search`
+    // → `LdapClient.Search` when the receiver was returned by
+    // `ldap.DialURL` / `ldap.Dial` / `ldap.DialTLS` (see
+    // [`crate::ssa::type_facts::constructor_type`]).  We also tag
+    // `ldap.NewSearchRequest` directly so taint reaching the filter argument
+    // surfaces at the construction call (matches the typical FP-free shape
+    // where the request is built once and passed straight to `Search`).
+    LabelRule {
+        matchers: &[
+            "LdapClient.Search",
+            "LdapClient.SearchWithPaging",
+            "ldap.NewSearchRequest",
+        ],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizer ───
+    //
+    // go-ldap exposes `ldap.EscapeFilter(s string) string` (RFC 4515 metachar
+    // escaping).  Treat any call as clearing the LDAP_INJECTION cap.
+    LabelRule {
+        matchers: &["ldap.EscapeFilter"],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // `net/http` `ResponseWriter.Header()` returns a `Header` map; calls to
+    // `Set(name, val)` / `Add(name, val)` write a single header value.
+    // After paren-group stripping the chain text becomes
+    // `w.Header.Set` / `w.Header.Add`, so suffix matchers on `Header.Set` /
+    // `Header.Add` cover both the bound-receiver form (`w.Header().Set(...)`)
+    // and the documentation-style class-qualified form (`Header.Set`).
+    // Tainted strings without `\r\n` stripping enable response splitting.
+    LabelRule {
+        matchers: &["Header.Set", "Header.Add"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── Header / CRLF sanitizers ───
+    //
+    // Project-local `stripCRLF` / `escapeHeader` helpers that strip `\r` and
+    // `\n` from a value before it is written to a response header.
+    LabelRule {
+        matchers: &["stripCRLF", "stripCrlf", "escapeHeader", "sanitizeHeader"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Open redirect sinks ───
+    //
+    // `net/http` `http.Redirect(w, r, url, code)` writes a `Location` header
+    // and a 3xx status from the supplied URL.  Without an allowlist check,
+    // a tainted `url` is the canonical Go open-redirect vector.
+    LabelRule {
+        matchers: &["http.Redirect"],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &[
+            "validateRedirectUrl",
+            "isSafeRedirect",
+            "stripScheme",
+            "ensureRelativeUrl",
+            "assertRelativePath",
+            "isRelativeUrl",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // `text/template` and `html/template` parse a template source string via
+    // `template.New(name).Parse(src)`.  After paren-group stripping the chain
+    // text becomes `template.New.Parse`, so the suffix matcher catches both
+    // packages (`text/template`, `html/template`) regardless of import alias.
+    // `template.ParseFiles` / `ParseGlob` take file paths (path-traversal,
+    // not SSTI) and are intentionally excluded.  `html/template`'s auto-
+    // escaping applies during `Execute`, not `Parse`, so a tainted source
+    // string still yields SSTI.
+    LabelRule {
+        matchers: &["template.New.Parse"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+    },
 ];

 /// Argument-role-aware Go sinks.  Two classes coexist on the outbound HTTP
--- a/src/labels/java.rs
+++ b/src/labels/java.rs
@ -1,4 +1,6 @@
-use crate::labels::{Cap, DataLabel, Kind, LabelRule, ParamConfig, RuntimeLabelRule};
+use crate::labels::{
+    Cap, DataLabel, GateActivation, Kind, LabelRule, ParamConfig, RuntimeLabelRule, SinkGate,
+};
 use crate::utils::project::{DetectedFramework, FrameworkContext};
 use phf::{Map, phf_map};

@ -265,6 +267,223 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::CODE_EXEC),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // JNDI / Spring LDAP search APIs accept an attacker-influenceable filter
+    // expression as the second positional argument (`DirContext.search(name,
+    // filter, controls)` / `LdapTemplate.search(base, filter, mapper)`).  Without
+    // RFC 4515 escaping the filter can be rewritten to bypass authentication or
+    // exfiltrate directory entries.  Type-qualified resolution rewrites
+    // `ctx.search(...)` → `LdapClient.search` when the receiver carries a
+    // `TypeKind::LdapClient` fact (set by `class_name_to_type_kind` for the
+    // declared types `DirContext`, `InitialDirContext`, `LdapContext`,
+    // `LdapTemplate`, or by `constructor_type` for `new InitialDirContext(...)`
+    // / `new InitialLdapContext(...)`).  Direct flat matchers cover the
+    // documentation-style class-qualified call forms that bypass receiver
+    // typing.
+    LabelRule {
+        matchers: &[
+            "LdapClient.search",
+            "LdapClient.searchByEntity",
+            "LdapClient.searchForObject",
+            "LdapClient.searchForContext",
+            "DirContext.search",
+            "LdapTemplate.search",
+            "LdapTemplate.searchByEntity",
+            "LdapTemplate.searchForObject",
+            "LdapTemplate.searchForContext",
+            "ctx.search",
+        ],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizers ───
+    //
+    // Spring LDAP's `LdapEncoder.filterEncode(s)` applies RFC 4515 escaping to
+    // metacharacters (`\`, `*`, `(`, `)`, `NUL`).  `nameEncode` performs the
+    // companion DN-component escaping.  Both fully clear the LDAP_INJECTION
+    // cap; downstream sinks see a sanitised value.
+    LabelRule {
+        matchers: &["LdapEncoder.filterEncode", "LdapEncoder.nameEncode"],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // `javax.xml.xpath.XPath.evaluate(expr, source, ...)` and the matching
+    // `XPathExpression.evaluate(source)` accept an attacker-influenceable
+    // expression string.  Without parameterisation via
+    // `XPathVariableResolver` the expression can be rewritten to bypass
+    // authentication or exfiltrate document subtrees.  `XPath.compile(expr)`
+    // is the equivalent pre-compile entry point.  Direct flat matchers cover
+    // the documentation-style class-qualified call forms.
+    LabelRule {
+        matchers: &[
+            "XPath.evaluate",
+            "XPath.compile",
+            "XPathExpression.evaluate",
+            "xpath.evaluate",
+            "xpath.compile",
+        ],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath escape sanitizers ───
+    //
+    // OWASP ESAPI's `Encoder.encodeForXPath(s)` escapes the XPath
+    // metacharacters (`'`, `"`, `[`, `]`, `(`, `)`, `,`, `=`, `<`, `>`,
+    // `*`).  Project-local `xpathEscape` / `escapeXpath` are the common
+    // developer-named equivalents.
+    LabelRule {
+        matchers: &["Encoder.encodeForXPath", "xpathEscape", "escapeXpath"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // Suppression for parameterised XPath (`XPath.setXPathVariableResolver(resolver)`)
+    // is implemented as a receiver-config sidecar in
+    // [`crate::ssa::xpath_config::XPathConfigResult`]: a
+    // `setXPathVariableResolver` call on a receiver carrying
+    // `TypeKind::XPathClient` flips the receiver's `has_resolver` flag,
+    // and the SSA sink-emission site strips `Cap::XPATH_INJECTION` from
+    // any later `xpath.evaluate(taintedExpr, ...)` whose receiver is
+    // provably bound.  No flat sanitizer rule is needed (and a
+    // name-only rule would clear the wrong call site).
+    // ─── Header / CRLF injection sinks ───
+    //
+    // `HttpServletResponse.setHeader(name, val)` / `addHeader(name, val)`
+    // accept a single header value; tainted strings without `\r\n` stripping
+    // let an attacker inject extra headers (response splitting).
+    // `addCookie(c)` carries a `Cookie` whose constructor takes a value
+    // string, so it is matched alongside the setHeader / addHeader entry points.
+    LabelRule {
+        matchers: &["setHeader", "addHeader", "addCookie"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF sanitizers ───
+    LabelRule {
+        matchers: &["stripCRLF", "stripCrlf", "escapeHeader", "sanitizeHeader"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Open redirect sinks ───
+    //
+    // Servlet API: `HttpServletResponse.sendRedirect(url)`.  Spring MVC
+    // controllers can also return a `"redirect:"` prefixed string but that
+    // sink shape is not modelled here.
+    LabelRule {
+        matchers: &["sendRedirect"],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &[
+            "validateRedirectUrl",
+            "isSafeRedirect",
+            "stripScheme",
+            "ensureRelativeUrl",
+            "assertRelativePath",
+            "isRelativeUrl",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // Apache FreeMarker `Template.process(model, writer)` renders an
+    // already-parsed template; the SSTI vector is when the template source
+    // is attacker-influenced (e.g. `new Template(name, new StringReader(src), cfg)`).
+    // The flat matcher fires only when the receiver chain text resolves to
+    // `Template.process` — typically through a `Template`-typed declared
+    // receiver routed via type-qualified resolution.  Without a `Template`
+    // TypeKind, idiomatic `Template tpl = new Template(...); tpl.process(...)`
+    // shapes are not recognised; tracked under deferred phases.
+    //
+    // Apache Velocity `Velocity.evaluate(ctx, writer, tag, src)` is modelled
+    // as a gated sink in `GATED_SINKS` below so only the template-source
+    // arg (index 3) activates SSTI; tainted variables in the `ctx` arg
+    // (data) stay clean.
+    LabelRule {
+        matchers: &["Template.process"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+    },
+    // ─── XXE sinks ───
+    //
+    // Java's stock XML parsers (JAXP) are XXE-vulnerable by default: the
+    // factories ship with external-entity / DTD resolution enabled and only
+    // become safe after `setFeature(FEATURE_SECURE_PROCESSING, true)` /
+    // disabling `external-general-entities` / `external-parameter-entities`.
+    // Tainted XML reaching any of these parser entry points is treated as
+    // an XXE flow; a config-check sanitizer pass (Phase XXE Layer 2) is
+    // out of scope for this rule and is the follow-up listed in
+    // `.pitboss/play/deferred.md`.
+    //
+    // Class-qualified suffix matching covers both the documentation-style
+    // `javax.xml.parsers.DocumentBuilder.parse(...)` form and the bound-
+    // receiver `XmlParser.parse(...)` form (when the receiver's TypeKind
+    // resolves to `XmlParser`).  Bare `parse` is intentionally avoided to
+    // prevent collisions with `Integer.parseInt`, `LocalDate.parse`,
+    // generic JSON parsers, etc.
+    LabelRule {
+        matchers: &[
+            "DocumentBuilder.parse",
+            "SAXParser.parse",
+            "XMLReader.parse",
+            "SAXBuilder.build",
+            "XmlParser.parse",
+            "XmlParser.build",
+        ],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+    },
+    // ─── XXE config-setter sanitizers ───
+    //
+    // Phase 07: a JAXP `setFeature(...)` / `setExpandEntityReferences(...)`
+    // call is itself a label-level Sanitizer for `Cap::XXE` so that the
+    // *call's return value* (rare but exists for fluent factory APIs)
+    // does not carry XXE through it.  The real load-bearing suppression
+    // is the receiver-fact path in
+    // [`crate::ssa::xml_config::XmlParserConfigResult`], which the SSA
+    // sink emission consults at every parse-class sink site.  This rule
+    // is conservative noise reduction for downstream sinks that consume
+    // the setter call's value.
+    LabelRule {
+        matchers: &[
+            "setFeature",
+            "setExpandEntityReferences",
+            "setXIncludeAware",
+            "setValidating",
+        ],
+        label: DataLabel::Sanitizer(Cap::XXE),
+        case_sensitive: true,
+    },
+];
+
+/// Java gated sinks.  Argument-position-aware classification for callees
+/// where the SSTI activation is restricted to the template-source arg
+/// rather than every positional argument.
+pub static GATED_SINKS: &[SinkGate] = &[
+    // Apache Velocity static API: `Velocity.evaluate(ctx, writer, logTag, src)`.
+    // Arg 3 carries the inline template source; tainted text at that
+    // position is SSTI.  Tainted data in the context (arg 0) is rendered
+    // through Velocity's escape policy, not parsed as template source, so
+    // those flows must not activate SSTI.  Activation is unconditional;
+    // payload_args narrows the cap to the template-source position.
+    SinkGate {
+        callee_matcher: "Velocity.evaluate",
+        arg_index: 3,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+        payload_args: &[3],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
 ];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
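The Java sink and sanitizer rules above pair up by capability bit: a sanitizer such as `LdapEncoder.filterEncode` clears `Cap::LDAP_INJECTION`, so a later LDAP sink no longer matches. A minimal sketch of that bit arithmetic, assuming plain `u32` constants in place of the real `Cap` bitflags type; `apply_sanitizer` and `sink_fires` are hypothetical helper names, not functions from this codebase:

```rust
// Hypothetical stand-ins for two `Cap` bits; the shift values mirror the diff.
const LDAP_INJECTION: u32 = 1 << 14;
const XPATH_INJECTION: u32 = 1 << 15;

/// A sanitizer removes its capability bits from the value's taint set
/// and leaves every other bit untouched.
fn apply_sanitizer(value_caps: u32, sanitizer_caps: u32) -> u32 {
    value_caps & !sanitizer_caps
}

/// A finding fires when the tainted value and the sink share at least one bit.
fn sink_fires(value_caps: u32, sink_caps: u32) -> bool {
    (value_caps & sink_caps) != 0
}
```

Under these assumptions, a value carrying both bits that passes through an LDAP sanitizer stops firing at an LDAP sink but still fires at an XPath sink, which is why each class ships its own sanitizer rule.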
--- a/src/labels/javascript.rs
+++ b/src/labels/javascript.rs
@ -310,6 +310,178 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::SQL_QUERY),
        case_sensitive: true,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // `ldapjs`: both the bound-variable idiom
+    // `const client = ldap.createClient({...}); client.search(...)` and the
+    // chained idiom `ldap.createClient({...}).search(...)` are covered by
+    // type-qualified receiver resolution.  The receiver of the inner call is
+    // typed `TypeKind::LdapClient` via `ssa::type_facts::constructor_type`,
+    // and (for the bound-variable form) closure-captured types are forwarded
+    // into the per-function type-fact result by
+    // [`crate::taint::inject_external_type_facts`], so the qualified callee
+    // text resolves to `LdapClient.search` in both shapes.
+    LabelRule {
+        matchers: &["LdapClient.search"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizers ───
+    //
+    // The `ldap-escape` package exports `filter` and `dn` tagged-template
+    // helpers, called as filter`(uid=${input})`.  After tree-sitter lifts the
+    // template-tag identifier, the callee text is the function name; suffix
+    // matching on `ldapEscape` / `ldapescape` covers `const ldapEscape =
+    // require('ldap-escape')` plus default-import shapes.
+    LabelRule {
+        matchers: &[
+            "ldapEscape",
+            "ldap-escape",
+            "ldapescape.filter",
+            "ldapescape.dn",
+        ],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // `document.evaluate(expr, contextNode, ...)` (DOM) and the npm `xpath`
+    // package's `xpath.select(expr, doc)` / `xpath.evaluate(expr, doc, ...)`
+    // accept the expression string as arg 0; concatenated user input there
+    // is the canonical XPath-injection vector.
+    LabelRule {
+        matchers: &[
+            "document.evaluate",
+            "xpath.select",
+            "xpath.evaluate",
+            "xpath.select1",
+        ],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath escape sanitizers ───
+    //
+    // No standard library helper escapes XPath metacharacters; project-local
+    // `escapeXpath` / `xpathEscape` are the developer-named equivalents.
+    LabelRule {
+        matchers: &["escapeXpath", "xpathEscape", "escape_xpath"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // Express/Fastify/Node `http` response APIs that write a single header
+    // value: `res.setHeader(name, val)` (case-insensitive verb), `res.set`,
+    // `res.header`, `res.append`.  Tainted strings here without `\r\n`
+    // stripping let an attacker inject extra headers (response splitting).
+    LabelRule {
+        matchers: &["setHeader", "res.set", "res.header", "res.append"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // Subscript-set form: `res.headers["X-Foo"] = bar` /
+    // `response.headers["X-Foo"] = bar`.  The LHS-subscript classification
+    // path in `cfg/mod.rs::push_node` walks into the subscript's `object`
+    // and classifies its member-expression text, so the bare bracket form
+    // fires alongside `setHeader` / `res.set` / `res.header` / `res.append`.
+    LabelRule {
+        matchers: &["res.headers", "response.headers", "self.response.headers"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF sanitizers ───
+    //
+    // Project-local `stripCRLF` / `escapeHeader` helpers that strip `\r` and
+    // `\n` from a value before it is written to a response header.
+    LabelRule {
+        matchers: &["stripCRLF", "stripCrlf", "escapeHeader", "sanitizeHeader"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Prototype pollution sinks (library-mediated) ───
+    //
+    // Recursive merge / deep-assign helpers from lodash / common bundles.
+    // Argument-role gating (target vs src) is enforced via Destination
+    // activation in `GATED_SINKS` below: only taint flowing into the
+    // source-object arguments (positions 1+) activates; tainted-target-
+    // only is benign because writes to a tainted target object don't
+    // pollute `Object.prototype`.  Flat rules here are intentionally
+    // empty for the merge family; see GATED_SINKS for the per-call
+    // gating.  `_.template` is excluded — it is handled separately as
+    // a gated CODE_EXEC sink (Strapi CVE-2023-22621 evaluate:false
+    // suppression).
+    // ─── Open redirect sinks ───
+    //
+    // Express response redirect: `res.redirect(url)`.  Browser-side
+    // navigation: `location.replace` / `location.assign` fire as direct
+    // calls; `window.location = url` / `window.location.href = url` /
+    // `location.href = url` fire as assignment-LHS sinks via the
+    // `member_expr_text` classification path in `cfg::push_node`.
+    // `router.navigate` covers the Angular Router (`Router.navigate`,
+    // `Router.navigateByUrl`) and the React-Router `useNavigate`-returned
+    // `navigate` function; suffix matching catches both the bound-receiver
+    // and direct-call shapes.
+    LabelRule {
+        matchers: &[
+            "res.redirect",
+            "location.replace",
+            "location.assign",
+            "router.navigate",
+            "router.navigateByUrl",
+            "window.location",
+            "window.location.href",
+            "location.href",
+        ],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── Open-redirect URL allowlist sanitizers ───
+    //
+    // Project-local helpers that allowlist hosts or enforce relative-only
+    // URLs.  `validateRedirectUrl` / `isSafeRedirect` are the canonical
+    // developer-named allowlist helpers; `stripScheme` clears any absolute
+    // scheme and degrades the URL to a relative path.  `ensureRelativeUrl`
+    // / `assertRelativePath` cover the leading-slash / no-scheme idiom.
+    LabelRule {
+        matchers: &[
+            "validateRedirectUrl",
+            "isSafeRedirect",
+            "stripScheme",
+            "ensureRelativeUrl",
+            "assertRelativePath",
+            "isRelativeUrl",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // Template-engine entry points that accept the template *source string*
+    // as the first argument: tainted arg 0 lets the attacker drive
+    // arbitrary template execution.  `_.template` is excluded — it has
+    // its own gated CODE_EXEC classifier (Strapi CVE-2023-22621) that
+    // respects the `evaluate:false` opt-out.  `nunjucks.renderString` is
+    // also excluded — see GATED_SINKS below for arg-0-only payload
+    // gating (suppresses tainted-`ctx`-only flows).
+    LabelRule {
+        matchers: &["Handlebars.compile"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+    },
+    // ─── XXE sinks ───
+    //
+    // libxmljs `parseXmlString` / `parseXml` resolve external entities only
+    // when called with `{ noent: true }` or `{ replaceEntities: true }`;
+    // the library's own default ignores entities.  The flat rule treats
+    // any call as a sink regardless of options, so it is conservative
+    // here.  xml2js is gated below in GATED_SINKS to suppress the
+    // safe-default case; fast-xml-parser is noted there as deferred.
+    LabelRule {
+        matchers: &["libxmljs.parseXmlString", "libxmljs.parseXml"],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+    },
 ];

 /// Callee patterns that must never be classified as source/sanitizer/sink.
@ -420,6 +592,33 @@ pub static GATED_SINKS: &[SinkGate] = &[
        dangerous_kwargs: &[],
        activation: GateActivation::ValueMatch,
    },
+    // ── XML XXE gates ─────────────────────────────────────────────────────
+    //
+    // `xml2js.parseString(xml, opts, cb)` is XXE-safe by default; opts
+    // `{ explicitChildren: true, charkey: '__cdata' }` are benign, but
+    // resolving entities at the underlying sax-js layer requires user
+    // intent.  The gate fires only when the option object literal carries
+    // an entity-resolution kwarg with a truthy value (or is dynamic).  Only
+    // the XML payload (arg 0) is the protected position.
+    SinkGate {
+        callee_matcher: "xml2js.parseString",
+        arg_index: 1,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[
+            ("processEntities", &["true"]),
+            ("explicitEntities", &["true"]),
+            ("strict", &["false"]),
+        ],
+        activation: GateActivation::ValueMatch,
+    },
+    // Note: `fast-xml-parser` (`new XMLParser({...}).parse(xml)`) is XXE-safe
+    // by default; flagging it would require constructor-option tracking via
+    // TypeFacts (XmlParser type with config carry).  Deferred to Layer 2.
    // ── Outbound HTTP clients (SSRF) ──────────────────────────────────────
    //
    // Policy: SSRF fires only when taint reaches the destination-bearing
@ -797,6 +996,282 @@ pub static GATED_SINKS: &[SinkGate] = &[
            object_destination_fields: &[],
        },
    },
+    // `nunjucks.renderString(src, ctx)` — Nunjucks SSTI sink.  Only the
+    // template *source* (arg 0) lets an attacker drive template execution;
+    // the `ctx` data object (arg 1) is rendered via the template's escape
+    // policy and is not itself a code-injection vector.  Gate via
+    // Destination-style activation with `payload_args: &[0]` so taint
+    // flowing only into `ctx` is suppressed.
+    SinkGate {
+        callee_matcher: "nunjucks.renderString",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // ── Prototype pollution gates ────────────────────────────────────────
+    //
+    // Library-mediated recursive merge / deep-assign helpers.  Argument-
+    // role gating: `(target, src1, src2, ...)` — only taint reaching a
+    // *source* position (index 1+) can pollute `Object.prototype` via
+    // `__proto__` / `constructor` keys on attacker-controlled input.
+    // Tainted target alone is benign (it just mutates that object).
+    // `payload_args: &[1, 2, 3, 4, 5]` covers the canonical 1-target +
+    // up-to-5-source signatures used by lodash / Object.assign / jQuery
+    // extend; arity beyond 5 is rare in practice and would over-suppress
+    // only at the long tail.
+    SinkGate {
+        callee_matcher: "_.merge",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.mergeWith",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.defaultsDeep",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // `_.set(obj, path, value)` — both `path` (arg 1) and `value` (arg 2)
+    // can drive prototype pollution: a tainted path of `__proto__.foo`
+    // mutates `Object.prototype`, and a tainted value into `obj.__proto__`
+    // does the same.  Object (arg 0) is the canonical target.
+    SinkGate {
+        callee_matcher: "_.set",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.setWith",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // Generic project-local deep-merge helpers.  Suffix-matched so any
+    // `*.deepMerge` / `*.defaultsDeep` qualified call also resolves.
+    SinkGate {
+        callee_matcher: "deepMerge",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "defaultsDeep",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // `Object.assign(target, ...sources)` is safe with constant-literal
+    // sources (`{a: 1, b: 2}`) but dangerous with attacker-controlled
+    // input (`req.body`).  Gate target out of payload_args so tainted-
+    // target alone does not fire.
+    SinkGate {
+        callee_matcher: "Object.assign",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // jQuery / Zepto `$.extend(target, ...sources)` and `jQuery.extend`.
+    // Arg 0 may be a deep-flag boolean (`true`) when the deep-merge form
+    // is in use, in which case arg 1 is the target and sources start at
+    // arg 2.  Cover both shapes by listing args 1, 2, 3, 4 in
+    // `payload_args`: a literal `true` at arg 0 never carries taint, and
+    // for the shallow `$.extend(target, src)` form, src at arg 1 still
+    // fires.  In the deep form this also lets a tainted target at arg 1
+    // fire, which over-approximates.
+    SinkGate {
+        callee_matcher: "$.extend",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "jQuery.extend",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // Bare `extend` (suffix-matched) for jQuery's deep form imported as a
+    // bound name: `const { extend } = require('jquery'); extend(true, t, s)`.
+    // Suffix `extend` would over-fire on Backbone's `Model.extend(proto)` /
+    // `View.extend({...})` class-extension idiom, so this gate uses
+    // `LiteralOnly` activation: it fires only when arg 0 is the literal
+    // boolean `true` (the deep-flag form, never used by Backbone subclassing).
+    // Sources start at arg 2 because arg 0 is the flag and arg 1 is the
+    // target; tainting the target alone is benign.
+    SinkGate {
+        callee_matcher: "extend",
+        arg_index: 0,
+        dangerous_values: &["true"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::LiteralOnly,
+    },
+    // `set-value` standalone helper: `setValue(obj, key, val)`.  A
+    // recursive set-by-path helper that historically did not block
+    // `__proto__` keys: CVE-2019-10747 (set-value <2.0.1) and
+    // CVE-2021-23440 (set-value <4.0.1).  Suffix-matched so qualified
+    // imports (`require('set-value')`) bound to `setValue` still resolve.
+    SinkGate {
+        callee_matcher: "setValue",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // `dot-prop` standalone helper: `dotProp.set(obj, path, val)` —
+    // CVE-2020-8116.  Path is a dotted-string with prototype-key support;
+    // a tainted `path` of `__proto__.x` mutates Object.prototype.
+    SinkGate {
+        callee_matcher: "dotProp.set",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // `JSONPath` / `jsonpath-plus` `JSONPath({path: p, json: o, callback: fn})`
+    // historically supported a `resultType: 'value'` mode that, combined with
+    // `parent`/`parentProperty` writes inside the callback, can mutate the
+    // prototype chain.  Recognise the `jp.set(obj, path, value)` family
+    // (jsonpath, jsonpath-plus) on the same shape as `_.set`.
+    SinkGate {
+        callee_matcher: "jp.set",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "jsonpath.set",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
 ];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
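The prototype-pollution gates above rely on two ideas: `payload_args` restricts which argument positions count as tainted payload, and `LiteralOnly` activation fires only when the activation argument is a matching literal. A minimal sketch of that decision, assuming a heavily simplified model; `Activation`, `Arg`, `Gate`, `lit`, `dynamic` and `gate_fires` are hypothetical names for illustration, and the real `SinkGate` evaluation has more fields and cases than shown here:

```rust
/// Hypothetical, simplified mirror of two `GateActivation` variants.
enum Activation {
    /// No literal inspection; only the payload positions matter.
    Destination,
    /// Fires only if the activation arg is a literal in `dangerous_values`.
    LiteralOnly,
}

/// Hypothetical call-site argument: its literal text (if any) and taint state.
struct Arg {
    literal: Option<&'static str>,
    tainted: bool,
}

/// Hypothetical subset of the `SinkGate` fields used by this sketch.
struct Gate {
    arg_index: usize,
    dangerous_values: &'static [&'static str],
    payload_args: &'static [usize],
    activation: Activation,
}

/// Builds an untainted literal argument such as `true`.
fn lit(text: &'static str) -> Arg {
    Arg { literal: Some(text), tainted: false }
}

/// Builds a non-literal argument with the given taint state.
fn dynamic(tainted: bool) -> Arg {
    Arg { literal: None, tainted }
}

/// Returns true when the gate is activated and a payload position is tainted.
fn gate_fires(gate: &Gate, args: &[Arg]) -> bool {
    let activated = match gate.activation {
        Activation::Destination => true,
        // A dynamic (non-literal) activation arg suppresses the gate.
        Activation::LiteralOnly => args
            .get(gate.arg_index)
            .and_then(|a| a.literal)
            .map_or(false, |l| gate.dangerous_values.contains(&l)),
    };
    let payload_tainted = gate
        .payload_args
        .iter()
        .any(|&i| args.get(i).map_or(false, |a| a.tainted));
    activated && payload_tainted
}
```

In this model a `_.merge`-style gate with `payload_args: &[1, 2, 3, 4, 5]` ignores a tainted target at index 0, and a bare `extend` gate with `dangerous_values: &["true"]` stays silent for Backbone-style calls whose first argument is not the literal `true`.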
--- a/src/labels/mod.rs
+++ b/src/labels/mod.rs
@ -66,6 +66,17 @@ pub enum GateActivation {
    /// selects which attribute is being set) and `parseFromString` (activation
    /// arg selects the MIME type).
    ValueMatch,
+    /// Strict literal-value activation.  The gate fires only when the
+    /// activation arg is a literal that matches `dangerous_values` /
+    /// `dangerous_prefixes`.  Unknown/dynamic activation arg suppresses
+    /// (no conservative ALL_ARGS_PAYLOAD push).
+    ///
+    /// Used for ambiguously-named matchers where the dangerous shape is
+    /// only identifiable by an explicit literal flag — e.g. bare `extend`
+    /// where the deep-merge form is `extend(true, target, src)` but
+    /// Backbone's `Model.extend({proto})` shares the suffix.  Conservative
+    /// fallback would over-fire on the class-extension form.
+    LiteralOnly,
    /// Destination-bearing flow activation.  The gate fires when taint reaches
    /// a declared destination location at the call site, no literal
    /// inspection, no prefix heuristic.
@ -156,53 +167,83 @@ bitflags! {
    /// In practice: a finding fires when a tainted value reaches a sink and
    /// `(value_caps & sink_caps) != 0`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
-    pub struct Cap: u16 {
+    pub struct Cap: u32 {
        /// Taint that originated from an environment variable read.
        /// Used as a source-origin marker for env-injection rules.
-        const ENV_VAR         = 0b0000_0000_0000_0001;  // bit 0
+        const ENV_VAR              = 1 << 0;
        /// Sanitizer: the value has passed through HTML entity escaping.
        /// Strips XSS risk from values that reach HTML output sinks.
-        const HTML_ESCAPE     = 0b0000_0000_0000_0010;  // bit 1
+        const HTML_ESCAPE          = 1 << 1;
        /// Sanitizer: the value has been shell-argument escaped.
        /// Strips command-injection risk before shell sinks.
-        const SHELL_ESCAPE    = 0b0000_0000_0000_0100;  // bit 2
+        const SHELL_ESCAPE         = 1 << 2;
        /// Sanitizer: the value has been percent-encoded for use in a URL.
-        const URL_ENCODE      = 0b0000_0000_0000_1000;  // bit 3
+        const URL_ENCODE           = 1 << 3;
        /// Sanitizer: the value was parsed through a structured JSON decoder
        /// (as opposed to `eval`-based or regex parsing).
-        const JSON_PARSE      = 0b0000_0000_0001_0000;  // bit 4
+        const JSON_PARSE           = 1 << 4;
        /// Sink: file system read or write operation (path traversal, arbitrary
        /// file read/write).
-        const FILE_IO         = 0b0000_0000_0010_0000;  // bit 5
+        const FILE_IO              = 1 << 5;
        /// Sink: format string injection (e.g. `printf`-family, `String.format`).
-        const FMT_STRING      = 0b0000_0000_0100_0000;  // bit 6
+        const FMT_STRING           = 1 << 6;
        /// Sink: SQL query construction. Fires for string-concatenated queries
        /// and parameterized-query builders where the query text itself is tainted.
-        const SQL_QUERY       = 0b0000_0000_1000_0000;  // bit 7
+        const SQL_QUERY            = 1 << 7;
        /// Sink: unsafe object deserialization (Java `ObjectInputStream`,
        /// Python `pickle`, Ruby `Marshal`, PHP `unserialize`, etc.).
-        const DESERIALIZE     = 0b0000_0001_0000_0000;  // bit 8
+        const DESERIALIZE          = 1 << 8;
        /// Sink: server-side request forgery. Fires when attacker-controlled
        /// data reaches the destination URL of an outbound HTTP request.
-        const SSRF            = 0b0000_0010_0000_0000;  // bit 9
+        const SSRF                 = 1 << 9;
        /// Sink: code or command execution (shell injection, `eval`, `exec`,
        /// dynamic `require`/`import`, template injection).
-        const CODE_EXEC       = 0b0000_0100_0000_0000;  // bit 10
+        const CODE_EXEC            = 1 << 10;
        /// Sink: cryptographic operation with a tainted algorithm name or seed
        /// (weak-crypto / predictable-randomness patterns).
-        const CRYPTO          = 0b0000_1000_0000_0000;  // bit 11
+        const CRYPTO               = 1 << 11;
        /// Request-bound, caller-supplied identifier that has not yet been
        /// validated against an ownership/membership check.  Used as the
        /// carrier cap for folding `auth_analysis` into the SSA/taint
        /// engine.
-        const UNAUTHORIZED_ID = 0b0001_0000_0000_0000;  // bit 12
+        const UNAUTHORIZED_ID      = 1 << 12;
        /// Cross-boundary data-exfiltration: tainted sensitive data flowing
        /// into outbound request bodies, headers, or other payload-bearing
        /// fields of network egress APIs.  Distinct from `SSRF` (attacker
        /// control over the destination URL), `DATA_EXFIL` fires when the
        /// destination is fixed but attacker-influenced data leaves the
        /// process via the request payload.
-        const DATA_EXFIL      = 0b0010_0000_0000_0000;  // bit 13
+        const DATA_EXFIL           = 1 << 13;
+        /// Sink: LDAP search/query construction. Fires when attacker-controlled
+        /// data reaches a directory-service filter or DN argument without
+        /// LDAP-filter escaping.
+        const LDAP_INJECTION       = 1 << 14;
+        /// Sink: XPath expression construction. Fires when attacker-controlled
+        /// data is concatenated into an XPath query rather than passed via
+        /// XPath variable bindings.
+        const XPATH_INJECTION      = 1 << 15;
+        /// Sink: HTTP response header value (or any CRLF-sensitive output).
+        /// Fires when attacker-controlled data lands in a `Set-Header` /
+        /// header-add call without `\r\n` stripping (response splitting).
+        const HEADER_INJECTION     = 1 << 16;
+        /// Sink: redirect / `Location` header destination. Fires when an
+        /// attacker-controlled URL reaches a redirect call without an
+        /// allowlist or relative-URL check.
+        const OPEN_REDIRECT        = 1 << 17;
+        /// Sink: server-side template injection. Fires when the **template
+        /// source string** itself is attacker-controlled (e.g.
+        /// `Template(user_input).render()`), distinct from rendering a
+        /// trusted template with tainted variables.
+        const SSTI                 = 1 << 18;
+        /// Sink: XML external entity resolution. Fires when attacker-controlled
+        /// XML reaches a parser configured to resolve external entities (or
+        /// missing the secure-processing feature).
+        const XXE                  = 1 << 19;
+        /// Sink: prototype pollution. Fires when an attacker-controlled key
+        /// reaches an object property assignment that can mutate
+        /// `Object.prototype` (`__proto__`, `constructor.prototype`, deep-merge
+        /// helpers).
+        const PROTOTYPE_POLLUTION  = 1 << 20;
    }
 }

@ -214,14 +255,18 @@ impl Default for Cap {

 impl serde::Serialize for Cap {
    fn serialize<S: serde::Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
-        s.serialize_u16(self.bits())
+        s.serialize_u32(self.bits())
    }
 }

 impl<'de> serde::Deserialize<'de> for Cap {
    fn deserialize<D: serde::Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
-        let bits = u16::deserialize(d)?;
-        Ok(Cap::from_bits_truncate(bits))
+        // Accept any unsigned integer width so JSON written with the old u16
+        // representation still loads. This relies on a self-describing format:
+        // serde_json routes unsigned integers through `deserialize_u64`, and
+        // the truncating cast preserves all currently-defined cap bits.
+        let bits = u64::deserialize(d)?;
+        Ok(Cap::from_bits_truncate(bits as u32))
    }
 }

@ -370,16 +415,46 @@ static GATED_REGISTRY: Lazy<HashMap<&'static str, &'static [SinkGate]>> = Lazy::
    m.insert("js", javascript::GATED_SINKS);
    m.insert("typescript", typescript::GATED_SINKS);
    m.insert("ts", typescript::GATED_SINKS);
-    m.insert("python", python::GATED_SINKS);
-    m.insert("py", python::GATED_SINKS);
+
+    // Python prototype-pollution gates are opt-in: `dict.update(target,
+    // src)` overlaps too broadly with non-pollution use of `update`
+    // (Counter, namespaced state mutation) to ship as a default sink.
+    // The `NYX_PYTHON_PROTO_POLLUTION` env var enables them; when set
+    // the merged slice is leaked into a `'static` reference so the
+    // registry's lifetime invariant holds.
+    let python_gates: &'static [SinkGate] = if env_python_proto_pollution() {
+        let mut combined: Vec<SinkGate> = python::GATED_SINKS.to_vec();
+        combined.extend_from_slice(python::PROTO_POLLUTION_GATES);
+        Box::leak(combined.into_boxed_slice())
+    } else {
+        python::GATED_SINKS
+    };
+    m.insert("python", python_gates);
+    m.insert("py", python_gates);
+
    m.insert("go", go::GATED_SINKS);
    m.insert("php", php::GATED_SINKS);
    m.insert("c", c::GATED_SINKS);
    m.insert("cpp", cpp::GATED_SINKS);
    m.insert("c++", cpp::GATED_SINKS);
+    m.insert("ruby", ruby::GATED_SINKS);
+    m.insert("rb", ruby::GATED_SINKS);
+    m.insert("java", java::GATED_SINKS);
+    m.insert("rust", rust::GATED_SINKS);
+    m.insert("rs", rust::GATED_SINKS);
    m
 });

+/// Feature flag for the Python prototype-pollution gates.  Disabled by
+/// default; set `NYX_PYTHON_PROTO_POLLUTION` to `1`, `true`, `TRUE`, `yes`,
+/// or `on` to enable `dict.update` / `__dict__.update` detection.
+fn env_python_proto_pollution() -> bool {
+    matches!(
+        std::env::var("NYX_PYTHON_PROTO_POLLUTION").ok().as_deref(),
+        Some("1") | Some("true") | Some("TRUE") | Some("yes") | Some("on")
+    )
+}
+
 /// Per-language exclusion patterns: callee text that must never be classified.
 static EXCLUDES: Lazy<HashMap<&'static str, &'static [&'static str]>> = Lazy::new(|| {
    let mut m = HashMap::new();
@ -725,6 +800,13 @@ pub fn parse_cap(s: &str) -> Option<Cap> {
        "crypto" => Some(Cap::CRYPTO),
        "unauthorized_id" => Some(Cap::UNAUTHORIZED_ID),
        "data_exfil" | "data_exfiltration" => Some(Cap::DATA_EXFIL),
+        "ldap_injection" | "ldapi" => Some(Cap::LDAP_INJECTION),
+        "xpath_injection" | "xpathi" => Some(Cap::XPATH_INJECTION),
+        "header_injection" | "crlf" | "response_splitting" => Some(Cap::HEADER_INJECTION),
+        "open_redirect" | "redirect" => Some(Cap::OPEN_REDIRECT),
+        "ssti" | "template_injection" => Some(Cap::SSTI),
+        "xxe" => Some(Cap::XXE),
+        "prototype_pollution" | "proto_pollution" => Some(Cap::PROTOTYPE_POLLUTION),
        "all" => Some(Cap::all()),
        _ => None,
    }
@ -1274,7 +1356,15 @@ pub fn classify_gated_sink(
            // where `userAttr` is user-controlled) is itself a vulnerability
            // path. Return ALL_ARGS_PAYLOAD so downstream sink scanning
            // considers every positional argument.
+            //
+            // `LiteralOnly` opts out of this conservative branch: the gate
+            // requires positive literal evidence to fire, so unknown
+            // activation suppresses entirely (avoids false positives on
+            // ambiguously-named suffix matchers like bare `extend`).
            None => {
+                if matches!(gate.activation, GateActivation::LiteralOnly) {
+                    continue;
+                }
                out.push(GateMatch {
                    label: gate.label,
                    payload_args: ALL_ARGS_PAYLOAD,
@ -1396,10 +1486,283 @@ pub fn cap_to_name(cap: Cap) -> &'static str {
        Cap::CODE_EXEC => "code_exec",
        Cap::CRYPTO => "crypto",
        Cap::UNAUTHORIZED_ID => "unauthorized_id",
+        Cap::DATA_EXFIL => "data_exfil",
+        Cap::LDAP_INJECTION => "ldap_injection",
+        Cap::XPATH_INJECTION => "xpath_injection",
+        Cap::HEADER_INJECTION => "header_injection",
+        Cap::OPEN_REDIRECT => "open_redirect",
+        Cap::SSTI => "ssti",
+        Cap::XXE => "xxe",
+        Cap::PROTOTYPE_POLLUTION => "prototype_pollution",
        _ => "unknown",
    }
 }

+// ── Cap rule registry ────────────────────────────────────────────────────
+//
+// Static, single-source-of-truth metadata table keyed by [`Cap`].  Every
+// vulnerability class with its own canonical rule id appears here; the
+// per-language `RULES` arrays only carry the language-specific match shapes.
+// Sink-cap fields on a finding (or `Cap::DATA_EXFIL` carried alongside) feed
+// `cap_rule_meta()` to pick the rule id surfaced to SARIF, the dashboard,
+// and `enumerate_builtin_rules()` for `nyx rules list`.
+
+/// Static metadata for one cap-defined vulnerability class.
+#[derive(Debug, Clone, Copy)]
+pub struct CapRuleMeta {
+    pub cap: Cap,
+    /// Canonical rule id surfaced by finding emission (no source-suffix).
+    pub rule_id: &'static str,
+    /// Display title for `nyx rules list` and dashboard.
+    pub title: &'static str,
+    pub severity: crate::patterns::Severity,
+    /// OWASP 2021 code (e.g. `"A03"`).
+    pub owasp_code: &'static str,
+    /// OWASP 2021 long label (e.g. `"Injection"`).
+    pub owasp_label: &'static str,
+    pub description: &'static str,
+    /// `false` only for caps gated behind a config flag (e.g.
+    /// `Cap::UNAUTHORIZED_ID`, which still defers to the standalone
+    /// `auth_analysis` subsystem unless `enable_auth_as_taint` is on).
+    pub default_enabled: bool,
+    /// Whether the diag-id emission path in `ast.rs` actually surfaces
+    /// findings under [`Self::rule_id`].  When `false`, sink findings
+    /// for this cap currently surface under the legacy
+    /// `taint-unsanitised-flow` id (the per-language family-token
+    /// dispatch in [`crate::server::owasp::owasp_bucket_for`] still
+    /// buckets them correctly).  Dashboards and `nyx rules list` consume
+    /// this flag to decide whether to surface the synthetic class entry
+    /// alongside live findings or hide it as forward-declared.
+    ///
+    /// Migrating a cap from `false` → `true` requires adding it to the
+    /// cap-specific routing list in `ast.rs::diag_for_finding`; tests
+    /// that pin the legacy `taint-unsanitised-flow` rule id for that
+    /// cap must be updated to the cap-specific id.
+    pub emission_active: bool,
+}
+
+/// Registry of cap-class metadata.  Ordered by cap-bit position so additions
+/// stay clustered with their bitflag declarations.
+pub static CAP_RULE_REGISTRY: &[CapRuleMeta] = &[
+    CapRuleMeta {
+        cap: Cap::FILE_IO,
+        rule_id: "taint-path-traversal",
+        title: "Path Traversal / Arbitrary File Access",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A01",
+        owasp_label: "Broken Access Control",
+        description: "Attacker-controlled data flows into a filesystem path without canonicalisation \
+             or root-confinement, allowing reads or writes outside the intended directory.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::FMT_STRING,
+        rule_id: "taint-format-string",
+        title: "Format String Injection",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data is used as a format string argument (printf-family, \
+             String.format) and can leak memory or crash the process.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::SQL_QUERY,
+        rule_id: "taint-sql-injection",
+        title: "SQL Injection",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data is concatenated into a SQL query string instead of \
+             being bound through a parameterised statement.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::DESERIALIZE,
+        rule_id: "taint-deserialization",
+        title: "Unsafe Deserialization",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A08",
+        owasp_label: "Software and Data Integrity Failures",
+        description: "Attacker-controlled bytes are fed to an unsafe object deserialiser \
+             (pickle, ObjectInputStream, Marshal, unserialize) enabling arbitrary code \
+             execution via crafted payloads.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::SSRF,
+        rule_id: "taint-ssrf",
+        title: "Server-Side Request Forgery",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A10",
+        owasp_label: "Server-Side Request Forgery",
+        description: "Attacker-controlled URL reaches the destination of an outbound HTTP request \
+             without an allowlist or scheme/host restriction.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::CODE_EXEC,
+        rule_id: "taint-code-execution",
+        title: "Code / Command Execution",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data reaches an `eval`/`exec`/shell sink, dynamic \
+             require/import, or other arbitrary-code construct.",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::CRYPTO,
+        rule_id: "taint-crypto-misuse",
+        title: "Tainted Cryptographic Parameter",
+        severity: crate::patterns::Severity::Medium,
+        owasp_code: "A02",
+        owasp_label: "Cryptographic Failures",
+        description: "Attacker-controlled data drives the algorithm name, key, or seed of a \
+             cryptographic primitive (weak-crypto / predictable-randomness).",
+        default_enabled: true,
+        emission_active: false,
+    },
+    CapRuleMeta {
+        cap: Cap::UNAUTHORIZED_ID,
+        rule_id: "rs.auth.missing_ownership_check.taint",
+        title: "Missing Ownership Check (taint variant)",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A01",
+        owasp_label: "Broken Access Control",
+        description: "Request-bound identifier reaches a privileged sink without an intervening \
+             ownership/membership check.  Companion to the standalone `auth_analysis` \
+             rule; gated by `scanner.enable_auth_as_taint`.",
+        default_enabled: false,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::DATA_EXFIL,
+        rule_id: "taint-data-exfiltration",
+        title: "Sensitive Data Exfiltration",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A04",
+        owasp_label: "Insecure Design",
+        description: "Sensitive data (cookies, headers, env, db rows, files) flows into the body, \
+             headers, or other payload field of an outbound network request to a fixed \
+             destination.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    // ── Cap-specific rule ids ────────────────────────────────────────────
+    CapRuleMeta {
+        cap: Cap::LDAP_INJECTION,
+        rule_id: "taint-ldap-injection",
+        title: "LDAP Injection",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data is concatenated into an LDAP filter or DN without \
+             RFC 4515 escaping, letting the attacker rewrite the directory query.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::XPATH_INJECTION,
+        rule_id: "taint-xpath-injection",
+        title: "XPath Injection",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data is concatenated into an XPath expression instead of \
+             passed through XPath variable bindings, letting the attacker rewrite the \
+             query.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::HEADER_INJECTION,
+        rule_id: "taint-header-injection",
+        title: "HTTP Header / Response Splitting",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker-controlled data lands in an HTTP response header without `\\r\\n` \
+             stripping, enabling response splitting and cache-poisoning attacks.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::OPEN_REDIRECT,
+        rule_id: "taint-open-redirect",
+        title: "Open Redirect",
+        severity: crate::patterns::Severity::Medium,
+        owasp_code: "A01",
+        owasp_label: "Broken Access Control",
+        description: "Attacker-controlled URL drives a redirect / `Location` header without an \
+             allowlist or relative-URL check, enabling phishing pivots.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::SSTI,
+        rule_id: "taint-template-injection",
+        title: "Server-Side Template Injection",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A03",
+        owasp_label: "Injection",
+        description: "Attacker controls the template *source string* (not just template variables) \
+             passed to a server-side renderer (Jinja2, Twig, Handlebars, ERB), enabling \
+             arbitrary expression evaluation.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::XXE,
+        rule_id: "taint-xxe",
+        title: "XML External Entity Resolution",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A05",
+        owasp_label: "Security Misconfiguration",
+        description: "Attacker-controlled XML reaches a parser configured to resolve external \
+             entities (or missing the secure-processing feature), enabling SSRF, file \
+             read, and DoS.",
+        default_enabled: true,
+        emission_active: true,
+    },
+    CapRuleMeta {
+        cap: Cap::PROTOTYPE_POLLUTION,
+        rule_id: "taint-prototype-pollution",
+        title: "Prototype Pollution",
+        severity: crate::patterns::Severity::High,
+        owasp_code: "A05",
+        owasp_label: "Security Misconfiguration",
+        description: "Attacker-controlled key reaches an object property assignment that can mutate \
+             `Object.prototype` (deep-merge / `__proto__` / dynamic subscript).",
+        default_enabled: true,
+        emission_active: true,
+    },
+];
+
+/// Resolve a cap to its canonical rule metadata.  Returns `None` for caps
+/// without a rule-emission role (origin / sanitizer markers like
+/// [`Cap::ENV_VAR`], [`Cap::HTML_ESCAPE`]).
+pub fn cap_rule_meta(cap: Cap) -> Option<&'static CapRuleMeta> {
+    CAP_RULE_REGISTRY.iter().find(|m| m.cap == cap)
+}
+
+/// Resolve a set of caps (`effective_caps`) to a single rule id.  When
+/// multiple bits are set, picks the first registry entry whose cap is in the
+/// set; registry order follows bit position, so the lowest registered bit
+/// wins.  Returns `None` when no bit in the set has a registered rule id.
+pub fn rule_id_for_caps(effective_caps: Cap) -> Option<&'static str> {
+    CAP_RULE_REGISTRY
+        .iter()
+        .find(|m| effective_caps.contains(m.cap))
+        .map(|m| m.rule_id)
+}
+
 /// Generate a stable rule ID from language, kind, and matchers.
 pub fn rule_id(lang: &str, kind: &str, matchers: &[&str]) -> String {
    let mut sorted: Vec<&str> = matchers.to_vec();
@ -1418,11 +1781,25 @@ pub struct RuleInfo {
    pub language: String,
    pub kind: String,
    pub cap: String,
-    pub cap_bits: u16,
+    pub cap_bits: u32,
    pub matchers: Vec<String>,
    pub case_sensitive: bool,
    pub is_custom: bool,
    pub is_gated: bool,
+    /// Cap-class registry entry (one per `Cap` with a canonical rule id),
+    /// distinct from per-language sink/source/sanitizer match rules.  The
+    /// dashboard groups these separately so the rules surface does not mix
+    /// "the LDAP injection class exists" with "Java's `DirContext.search`
+    /// is a sink for that class".
+    pub is_class: bool,
+    /// For class entries (`is_class == true`), whether the diag-id
+    /// emission path in `ast.rs` actually surfaces findings under
+    /// [`Self::id`].  When `false`, the class is registered but live
+    /// findings still emerge under the legacy `taint-unsanitised-flow`
+    /// rule id; dashboards can use this flag to suppress the synthetic
+    /// entry until the cap is migrated to its specific rule id.
+    /// Always `true` for non-class label rules.
+    pub emission_active: bool,
    pub enabled: bool,
 }

@ -1430,6 +1807,27 @@ pub struct RuleInfo {
 pub fn enumerate_builtin_rules() -> Vec<RuleInfo> {
    let mut out = Vec::new();

+    // Cap-class entries (one per registered vulnerability class). Kind
+    // `class` so dashboards can distinguish them from per-language
+    // sink/source/sanitizer entries.
+    for meta in CAP_RULE_REGISTRY {
+        out.push(RuleInfo {
+            id: meta.rule_id.to_string(),
+            title: meta.title.to_string(),
+            language: "all".to_string(),
+            kind: "class".to_string(),
+            cap: cap_to_name(meta.cap).to_string(),
+            cap_bits: meta.cap.bits(),
+            matchers: Vec::new(),
+            case_sensitive: false,
+            is_custom: false,
+            is_gated: false,
+            is_class: true,
+            emission_active: meta.emission_active,
+            enabled: meta.default_enabled,
+        });
+    }
+
    for &lang in CANONICAL_LANGS {
        if let Some(rules) = REGISTRY.get(lang) {
            for rule in *rules {
@ -1453,6 +1851,8 @@ pub fn enumerate_builtin_rules() -> Vec<RuleInfo> {
                    case_sensitive: rule.case_sensitive,
                    is_custom: false,
                    is_gated: false,
+                    is_class: false,
+                    emission_active: true,
                    enabled: true,
                });
            }
@ -1479,6 +1879,8 @@ pub fn enumerate_builtin_rules() -> Vec<RuleInfo> {
                    case_sensitive: gate.case_sensitive,
                    is_custom: false,
                    is_gated: true,
+                    is_class: false,
+                    emission_active: true,
                    enabled: true,
                });
            }
@ -1498,6 +1900,65 @@ pub fn custom_rule_id(lang: &str, kind: &str, matchers: &[String]) -> String {
 mod tests {
    use super::*;

+    /// Pin the current set of caps whose `rule_id` is reachable via the
+    /// diag-id routing in `ast.rs::diag_for_finding`.  When migrating a
+    /// legacy cap (e.g. SQL_QUERY → `taint-sql-injection`), update both
+    /// `ast.rs` (add the cap to the cap-specific routing list) and the
+    /// `emission_active: true` flag in `CAP_RULE_REGISTRY`, then update
+    /// this assertion.  The split exists because legacy taint findings
+    /// historically all surfaced under the generic `taint-unsanitised-flow`
+    /// rule id; the seven cap-specific routes (LDAP / XPath / header /
+    /// open redirect / SSTI / XXE / prototype pollution) plus
+    /// `unauthorized_id` and `data_exfil` are the only ones wired through.
+    #[test]
+    fn cap_rule_registry_emission_active_set_is_pinned() {
+        let active: Vec<Cap> = CAP_RULE_REGISTRY
+            .iter()
+            .filter(|m| m.emission_active)
+            .map(|m| m.cap)
+            .collect();
+        let expected = [
+            Cap::UNAUTHORIZED_ID,
+            Cap::DATA_EXFIL,
+            Cap::LDAP_INJECTION,
+            Cap::XPATH_INJECTION,
+            Cap::HEADER_INJECTION,
+            Cap::OPEN_REDIRECT,
+            Cap::SSTI,
+            Cap::XXE,
+            Cap::PROTOTYPE_POLLUTION,
+        ];
+        for c in expected {
+            assert!(
+                active.contains(&c),
+                "cap {:?} expected to be emission_active in CAP_RULE_REGISTRY",
+                c
+            );
+        }
+        let inactive: Vec<Cap> = CAP_RULE_REGISTRY
+            .iter()
+            .filter(|m| !m.emission_active)
+            .map(|m| m.cap)
+            .collect();
+        let expected_inactive = [
+            Cap::FILE_IO,
+            Cap::FMT_STRING,
+            Cap::SQL_QUERY,
+            Cap::DESERIALIZE,
+            Cap::SSRF,
+            Cap::CODE_EXEC,
+            Cap::CRYPTO,
+        ];
+        for c in expected_inactive {
+            assert!(
+                inactive.contains(&c),
+                "cap {:?} expected to be emission_inactive in CAP_RULE_REGISTRY (legacy \
+                 finding still emits as taint-unsanitised-flow)",
+                c
+            );
+        }
+    }
+
    #[test]
    fn receiver_validator_python_relative_to() {
        // Bare method name fires.
@ -1781,6 +2242,33 @@ mod tests {
    // from `File.open` / `IO.open` / `URI.open`, each of which has its
    // own non-piping semantics.  Without the sigil, the suffix-with-
    // boundary matcher would over-fire on every `X.open` call.
+    #[test]
+    fn classify_javascript_set_value_is_proto_pollution_gate() {
+        let no_kw = |_: &str| None;
+        let no_kw_present = |_: &str| false;
+        let result = classify_gated_sink("javascript", "setValue", |_| None, no_kw, no_kw_present);
+        assert!(
+            result
+                .iter()
+                .any(|m| m.label == DataLabel::Sink(Cap::PROTOTYPE_POLLUTION)),
+            "expected PROTOTYPE_POLLUTION gate match for bare `setValue`, got {result:?}"
+        );
+    }
+
+    #[test]
+    fn classify_javascript_dot_prop_set_is_proto_pollution_gate() {
+        let no_kw = |_: &str| None;
+        let no_kw_present = |_: &str| false;
+        let result =
+            classify_gated_sink("javascript", "dotProp.set", |_| None, no_kw, no_kw_present);
+        assert!(
+            result
+                .iter()
+                .any(|m| m.label == DataLabel::Sink(Cap::PROTOTYPE_POLLUTION)),
+            "expected PROTOTYPE_POLLUTION gate match for `dotProp.set`, got {result:?}"
+        );
+    }
+
    #[test]
    fn classify_ruby_bare_open_is_shell_escape_sink() {
        let result = classify("ruby", "open", None);
@ -2419,7 +2907,7 @@ mod tests {
        );
        assert_eq!(
            classify("rust", "Redirect::to(next)", Some(&extras)),
-            Some(DataLabel::Sink(Cap::SSRF)),
+            Some(DataLabel::Sink(Cap::OPEN_REDIRECT)),
        );

        let empty = rust::framework_rules(&FrameworkContext::default());
@ -2470,7 +2958,7 @@ mod tests {
        );
        assert_eq!(
            classify("rust", "Redirect::to(next)", Some(&extras)),
-            Some(DataLabel::Sink(Cap::SSRF)),
+            Some(DataLabel::Sink(Cap::OPEN_REDIRECT)),
        );
    }
 }
--- a/src/labels/php.rs
+++ b/src/labels/php.rs
@ -178,6 +178,143 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::DATA_EXFIL),
        case_sensitive: true,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // PHP's procedural LDAP API: `ldap_search($ds, $base, $filter)`,
+    // `ldap_list($ds, $base, $filter)`, `ldap_read($ds, $base, $filter)`.
+    // The filter argument is the LDAP-injection vector when concatenated
+    // with attacker-controlled input.
+    LabelRule {
+        matchers: &["ldap_search", "ldap_list", "ldap_read"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── LDAP-filter sanitizer ───
+    //
+    // `ldap_escape($value, $ignore, LDAP_ESCAPE_FILTER)` applies RFC 4515
+    // escaping; treat any `ldap_escape` call as clearing the LDAP_INJECTION
+    // cap (the no-flag default also escapes filter metacharacters
+    // conservatively).
+    LabelRule {
+        matchers: &["ldap_escape"],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // `DOMXPath::query($expr, $ctx)` and `DOMXPath::evaluate($expr, $ctx)`
+    // accept the expression string as arg 0; concatenated user input there
+    // is the canonical PHP XPath-injection vector.  `SimpleXMLElement::xpath`
+    // takes the same shape.  Direct flat matchers cover the
+    // class-qualified call forms.
+    // Type-qualified rewrites: `$xp = new DOMXPath($doc)` tags `$xp` as
+    // `TypeKind::XPathClient`, so `$xp->query(...)` / `$xp->evaluate(...)`
+    // resolve to `XPathClient.query` / `XPathClient.evaluate`.  Without
+    // the distinct TypeKind, bare `query` would match the SQL_QUERY sink.
+    LabelRule {
+        matchers: &[
+            "XPathClient.query",
+            "XPathClient.evaluate",
+            "DOMXPath::query",
+            "DOMXPath::evaluate",
+            "SimpleXMLElement::xpath",
+        ],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // Bare `xpath` method: SimpleXMLElement instances expose `->xpath($expr)`
+    // and Symfony / DOMCrawler wrappers do the same.  Suffix matching on
+    // `xpath` covers `$xml->xpath(...)` and similar bound-receiver shapes
+    // where the receiver type is not statically known.  Case-sensitive to
+    // avoid collisions with the `XPath` capitalisation used by qualified
+    // names.
+    LabelRule {
+        matchers: &["xpath"],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── XPath escape sanitizers ───
+    //
+    // No PHP standard library helper escapes XPath metacharacters; project-
+    // local `escape_xpath` / `xpath_escape` are the developer-named
+    // equivalents.
+    LabelRule {
+        matchers: &["escape_xpath", "xpath_escape"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // PHP's `header($line)` writes a raw header line.  Tainted strings
+    // without `\r\n` stripping let an attacker inject extra headers
+    // (response splitting); see GATED_SINKS for the corresponding
+    // OPEN_REDIRECT co-tag on `Location: ...` forms.
+    //
+    // The HEADER_INJECTION sink is intentionally implemented as a gate
+    // (not a flat rule) so the multi-gate SSA dispatch can co-emit it
+    // alongside the OPEN_REDIRECT gate on the same call site, producing
+    // separate findings for each cap with their canonical rule ids.
+    // ─── Header / CRLF sanitizers ───
+    LabelRule {
+        matchers: &["strip_crlf", "escape_header", "sanitize_header"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Open-redirect URL allowlist sanitizers ───
+    //
+    // Mirrors the JS/TS rule.  Developer-named functions that allowlist
+    // / scheme-strip a redirect URL clear OPEN_REDIRECT taint before it
+    // reaches `header("Location: …")`.  PHP also commonly uses
+    // `snake_case` variants.
+    LabelRule {
+        matchers: &[
+            "validateRedirectUrl",
+            "isSafeRedirect",
+            "stripScheme",
+            "validate_redirect_url",
+            "is_safe_redirect",
+            "strip_scheme",
+            "ensure_relative_url",
+            "ensureRelativeUrl",
+            "assert_relative_path",
+            "assertRelativePath",
+            "is_relative_url",
+            "isRelativeUrl",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // Twig `\Twig\Environment::createTemplate(string $template)` parses an
+    // arbitrary template source string at runtime; a tainted source yields
+    // SSTI when the resulting template is rendered.  `Environment::render`
+    // / `Environment::load` take a *template name* (file lookup, not source)
+    // and are intentionally excluded.  After PHP scope-resolution stripping
+    // the chain text covers both `$twig->createTemplate($src)` and
+    // `Twig\Environment::createTemplate(...)` shapes.
+    LabelRule {
+        matchers: &["Environment.createTemplate", "Twig.createTemplate"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+    },
+    // ─── XXE sanitizers ───
+    //
+    // `libxml_disable_entity_loader(true)` (PHP <8) / `libxml_set_external_entity_loader($cb)`
+    // disable external-entity expansion process-wide.  Treat their return
+    // value as XXE-cleared so config-style fixtures (`libxml_disable_entity_loader(true);
+    // simplexml_load_string($xml, ...)`) suppress the gate when the call is
+    // present in the same SSA scope.  The flat-rule sanitizer is a coarse
+    // approximation; the real config-check pattern would track parser-instance
+    // hardening (deferred Layer 2).
+    LabelRule {
+        matchers: &[
+            "libxml_disable_entity_loader",
+            "libxml_set_external_entity_loader",
+        ],
+        label: DataLabel::Sanitizer(Cap::XXE),
+        case_sensitive: false,
+    },
 ];

 /// Gated sinks for PHP.
@ -193,18 +330,157 @@ pub static RULES: &[LabelRule] = &[
 ///
 /// Identifier-based activation is enabled via the macro-arg fallback in
 /// `cfg::mod::classify_gated_sink` for `lang == "php"`.
-pub static GATED_SINKS: &[SinkGate] = &[SinkGate {
-    callee_matcher: "curl_setopt",
-    arg_index: 1,
-    dangerous_values: &["CURLOPT_POSTFIELDS", "CURLOPT_COPYPOSTFIELDS"],
-    dangerous_prefixes: &[],
-    label: DataLabel::Sink(Cap::DATA_EXFIL),
-    case_sensitive: true,
-    payload_args: &[2],
-    keyword_name: None,
-    dangerous_kwargs: &[],
-    activation: GateActivation::ValueMatch,
-}];
+pub static GATED_SINKS: &[SinkGate] = &[
+    SinkGate {
+        callee_matcher: "curl_setopt",
+        arg_index: 1,
+        dangerous_values: &["CURLOPT_POSTFIELDS", "CURLOPT_COPYPOSTFIELDS"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::DATA_EXFIL),
+        case_sensitive: true,
+        payload_args: &[2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // PHP `header($line)` HEADER_INJECTION sink.  Modelled as a gate so
+    // it can coexist with the OPEN_REDIRECT gate below: the multi-gate
+    // SSA dispatch needs each capability declared on its own gate filter
+    // to emit one finding per cap.  Always activates (Destination), with
+    // payload arg 0 only (`header()` takes the header line as arg 0;
+    // args 1 and 2 are `replace` / `response_code`, not line content).
+    SinkGate {
+        callee_matcher: "=header",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // PHP `simplexml_load_string($xml, $class, $options)` —
+    // XXE sink gated on the `LIBXML_NOENT` flag (or `LIBXML_DTDLOAD`,
+    // `LIBXML_DTDATTR`).  libxml2, which PHP links, is XXE-safe by default since 2.9.0;
+    // the gate fires only when the `$options` literal includes one of the
+    // dangerous flags.  Identifier-based activation works via the macro-arg
+    // fallback in `cfg::mod::classify_gated_sink` for `lang == "php"`.
+    SinkGate {
+        callee_matcher: "simplexml_load_string",
+        arg_index: 2,
+        dangerous_values: &["LIBXML_NOENT", "LIBXML_DTDLOAD", "LIBXML_DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    SinkGate {
+        callee_matcher: "simplexml_load_file",
+        arg_index: 2,
+        dangerous_values: &["LIBXML_NOENT", "LIBXML_DTDLOAD", "LIBXML_DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // DOMDocument::loadXML($xml, $options) — same gating as
+    // simplexml_load_string.  The chain-normalised callee text for
+    // `$dom->loadXML(...)` is `dom.loadXML`; suffix matching on
+    // `loadXML` covers the bound-receiver form.
+    SinkGate {
+        callee_matcher: "loadXML",
+        arg_index: 1,
+        dangerous_values: &["LIBXML_NOENT", "LIBXML_DTDLOAD", "LIBXML_DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // PHP `header($line)` co-tag for OPEN_REDIRECT.
+    //
+    // The HEADER_INJECTION gate (`=header`) above already fires for
+    // any `header(...)` call regardless of the line content.  This gate
+    // adds the OPEN_REDIRECT co-tag specifically when the first argument
+    // is a `Location: ...` header, so the dashboard / OWASP bucket
+    // correctly classifies redirect-class flows independently of CRLF.
+    //
+    // Activation: arg 0 prefix `Location:` (case-insensitive).  When arg
+    // 0 is a constant string starting with `Location:` the gate fires and
+    // checks payload arg 0 for taint; constants like `Content-Type: ...`
+    // are suppressed by the safe-literal branch.  When arg 0 is a binary
+    // expression (`"Location: " . $url`) or otherwise dynamic, the
+    // value-extraction returns `None` and the gate fires conservatively
+    // — matching the existing convention in `setAttribute`/`parseFromString`.
+    SinkGate {
+        callee_matcher: "=header",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &["Location:"],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // Smarty `$smarty->fetch($name)` — only the `string:` resource prefix
+    // accepts an inline template *source*; the bare form (`page.tpl`) is a
+    // file lookup (not SSTI).  Gate activates only when arg 0's leading
+    // literal segment is the `string:` prefix; the constant-string suffix
+    // and concat (`"string:" . $src`) shapes both reach `extract_const_string_arg`'s
+    // leading-literal path and trigger activation.  Payload is arg 0
+    // itself — taint reaching the template source string is the SSTI flow.
+    // Suffix matching catches both `Smarty.fetch` and the bound-receiver
+    // `$smarty->fetch(...)` forms.
+    SinkGate {
+        callee_matcher: "Smarty.fetch",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &["string:"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // Twig `\Twig\Environment::createTemplate(string $template)` —
+    // gated SSTI sink.  Activation is unconditional (no value gate);
+    // payload arg 0 is the template source string.  Bare suffix
+    // `createTemplate` matches the idiomatic instance shape
+    // `$twig->createTemplate($src)` (chain text `twig.createTemplate`)
+    // as well as the static `Environment::createTemplate(...)` form;
+    // `createTemplate` is Twig-specific terminology so over-fire risk
+    // is low.  The matching flat rule remains for documentation-style
+    // class-qualified call shapes.
+    SinkGate {
+        callee_matcher: "createTemplate",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
    // control-flow
--- a/src/labels/python.rs
+++ b/src/labels/python.rs
@ -61,7 +61,7 @@ pub static RULES: &[LabelRule] = &[
    // pattern that follows `from flask import session`.  The `=session`
    // exact-match form fires only when the call is the bare top-level
    // `session(...)` so accidental field projections like
-    // `obj.client.session` (Phase 2 chained-receiver lowering) don't get
+    // `obj.client.session` (chained-receiver lowering) don't get
    // mis-labelled as sources.
    LabelRule {
        matchers: &[
@ -284,6 +284,212 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::DESERIALIZE),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // python-ldap exposes module-level `ldap.search_s` / `ldap.search_ext_s`
+    // and method-style `conn.search_s(base, scope, filter)` after `conn =
+    // ldap.initialize(url)`.  Suffix matching on the method names catches both
+    // the qualified form (`ldap.search_s`, matched as a literal) and the
+    // bound-receiver form (`conn.search_s` ends with `search_s`).  ldap3 uses
+    // `Connection(server, ...)` whose `.search(...)` accepts a filter kwarg /
+    // positional; receiver typing tags the connection as `TypeKind::LdapClient`
+    // so type-qualified resolution rewrites `conn.search` → `LdapClient.search`.
+    LabelRule {
+        matchers: &[
+            "ldap.search_s",
+            "ldap.search_ext_s",
+            "search_s",
+            "search_ext_s",
+            "LdapClient.search",
+            "ldap3.Connection.search",
+        ],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizers ───
+    //
+    // python-ldap: `ldap.filter.escape_filter_chars(s)` and ldap3's
+    // `ldap3.utils.conv.escape_filter_chars(s)` both apply RFC 4515 escaping
+    // to filter metacharacters.  Suffix matching on `escape_filter_chars`
+    // covers both the fully-qualified import and the bare-name destructured
+    // import (`from ldap.filter import escape_filter_chars`).
+    LabelRule {
+        matchers: &[
+            "escape_filter_chars",
+            "ldap.filter.escape_filter_chars",
+            "ldap3.utils.conv.escape_filter_chars",
+        ],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // lxml: `tree.xpath(expr)` / `etree.XPath(expr)` accept an
+    // attacker-influenceable expression string.  ElementTree's
+    // `find` / `findall` / `findtext` accept a limited XPath subset
+    // and admit injection when the path is built by string concatenation.
+    // Suffix matching on the bare method names catches both
+    // `lxml.etree._Element.xpath(...)` and `tree.xpath(...)` shapes.
+    LabelRule {
+        matchers: &[
+            "xpath",
+            "lxml.etree.XPath",
+            "etree.XPath",
+            "ElementTree.find",
+            "ElementTree.findall",
+            "ElementTree.findtext",
+        ],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── XPath escape sanitizers ───
+    //
+    // No standard library helper escapes XPath metacharacters; project-local
+    // `escape_xpath` / `xpath_escape` are the developer-named equivalents.
+    LabelRule {
+        matchers: &["escape_xpath", "xpath_escape"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // Flask / Werkzeug response APIs that write a single header value:
+    // `response.headers.add(name, val)`, `response.set_cookie(name, val)`,
+    // and the bare subscript-set form `response.headers[name] = val`.
+    // The subscript-set form is picked up via the LHS-subscript
+    // classification path in `cfg/mod.rs::push_node`: the LHS object's
+    // member-expression text matches `response.headers` /
+    // `self.response.headers` and tags the assignment as a HEADER_INJECTION
+    // sink.
+    LabelRule {
+        matchers: &["headers.add", "headers.set", "set_cookie"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &["response.headers", "self.response.headers", "resp.headers"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF sanitizers ───
+    LabelRule {
+        matchers: &["strip_crlf", "escape_header", "sanitize_header"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Open redirect sinks ───
+    //
+    // Flask `redirect(url)`, Django `HttpResponseRedirect(url)`, FastAPI /
+    // Starlette `RedirectResponse(url=...)`.  A tainted URL flowing to any of
+    // these without an allowlist check is an open-redirect vector.
+    LabelRule {
+        matchers: &[
+            "redirect",
+            "flask.redirect",
+            "django.shortcuts.redirect",
+            "HttpResponseRedirect",
+            "RedirectResponse",
+        ],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: true,
+    },
+    LabelRule {
+        matchers: &[
+            "validate_redirect_url",
+            "is_safe_redirect",
+            "strip_scheme",
+            "ensure_relative_url",
+            "assert_relative_path",
+            "is_relative_url",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // Template-engine constructors / `from_string` factories that accept the
+    // template *source string* as arg 0.  `flask.render_template` takes a
+    // file PATH (not source) so does NOT match here — the safe API stays
+    // clean by name.
+    LabelRule {
+        matchers: &[
+            "=Template",
+            "jinja2.Template",
+            "jinja2.Environment.from_string",
+            "Environment.from_string",
+            // `compile_expression` is jinja2-specific terminology (it returns a
+            // callable from an inline expression source).  Bare suffix lets the
+            // rule fire on idiomatic instance shapes (`env.compile_expression(s)`)
+            // without a `jinja2.Environment` TypeKind.
+            "compile_expression",
+            "mako.template.Template",
+            "Template.render",
+        ],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+    },
+    // Template-loader paths: a tainted `name` lets the attacker swap the
+    // resolved template behind the renderer.  Mako's `TemplateLookup.get_template`
+    // and Jinja2's `Environment.get_template` / `select_template` /
+    // `loader.get_source` all take a template name (path-like) as arg 0.
+    // Modeling these as SSTI sinks captures the loader-path attack — the
+    // file resolver itself becomes the gadget when the name is attacker-controlled.
+    LabelRule {
+        matchers: &[
+            "TemplateLookup.get_template",
+            "Environment.get_template",
+            "Environment.select_template",
+            "loader.get_source",
+            // Bare-suffix forms for the idiomatic instance shapes
+            // (`env.get_template(name)`, `lookup.get_template(name)`).
+            "get_template",
+            "select_template",
+        ],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+    },
+    // ─── XXE sinks ───
+    //
+    // Python's stock `xml.sax.parseString` / `xml.sax.parse` parsers are
+    // XXE-vulnerable by default; `xml.dom.minidom.parseString` /
+    // `xml.dom.minidom.parse` likewise resolve external entities through
+    // the underlying expat parser unless the entity-loader is hardened.
+    // Each entry is the dotted-module suffix; bare `parseString` / `parse`
+    // are intentionally avoided to prevent collisions with JSON parsers
+    // (`json.loads`).  `lxml.etree.fromstring` is excluded: modern lxml
+    // disables external entities by default, so a rule would over-fire here.
+    LabelRule {
+        matchers: &[
+            "xml.sax.parseString",
+            "xml.sax.parse",
+            "xml.dom.minidom.parseString",
+            "xml.dom.minidom.parse",
+            "xml.dom.pulldom.parseString",
+            "xml.dom.pulldom.parse",
+        ],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+    },
+    // `defusedxml.*` is the canonical hardened drop-in: every parser in
+    // the package strips external-entity / DTD resolution and raises on
+    // the patterns that would otherwise enable XXE.  Treat any defusedxml
+    // call as an XXE sanitizer.
+    LabelRule {
+        matchers: &[
+            "defusedxml.ElementTree.fromstring",
+            "defusedxml.ElementTree.parse",
+            "defusedxml.minidom.parseString",
+            "defusedxml.minidom.parse",
+            "defusedxml.sax.parseString",
+            "defusedxml.sax.parse",
+            "defusedxml.pulldom.parseString",
+            "defusedxml.pulldom.parse",
+            "defusedxml.lxml.fromstring",
+            "defusedxml.lxml.parse",
+        ],
+        label: DataLabel::Sanitizer(Cap::XXE),
+        case_sensitive: true,
+    },
 ];

 /// Method-call validators that strip caps from their *receiver* (and
@ -1041,6 +1247,55 @@ pub static GATED_SINKS: &[SinkGate] = &[
    },
 ];

+/// Prototype-pollution-style gates for Python.  Opt-in via the
+/// `NYX_PYTHON_PROTO_POLLUTION` env var (see
+/// `super::env_python_proto_pollution`); when enabled they are merged
+/// into the language's `GATED_REGISTRY` slice at startup.
+///
+/// Coverage is deliberately narrow: the `dict.update(target, src)`
+/// class-method form (where the first arg is the target and the second
+/// is the source) is the canonical attack shape for `__class__` /
+/// `__dict__` pollution in Python frameworks that thread user input
+/// through configuration objects.  The bound-method form
+/// (`config.update(req_data)`) is handled by the suffix-matched
+/// `dict.update` callee text only when the receiver text literally
+/// equals `dict`, keeping the gate from over-firing on every `update`
+/// method in the codebase.
+pub static PROTO_POLLUTION_GATES: &[SinkGate] = &[
+    // `dict.update(target, src)` — class-method form.  Argument-role
+    // gating: only `src` (arg 1) taint activates; tainted target alone
+    // is benign.
+    SinkGate {
+        callee_matcher: "dict.update",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // `obj.__dict__.update(src)` — instance-attribute pollution shape.
+    SinkGate {
+        callee_matcher: "__dict__.update",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+];
+
 pub static KINDS: Map<&'static str, Kind> = phf_map! {
    // control-flow
    "if_statement"          => Kind::If,
--- a/src/labels/ruby.rs
+++ b/src/labels/ruby.rs
@ -1,4 +1,6 @@
-use crate::labels::{Cap, DataLabel, Kind, LabelRule, ParamConfig, RuntimeLabelRule};
+use crate::labels::{
+    Cap, DataLabel, GateActivation, Kind, LabelRule, ParamConfig, RuntimeLabelRule, SinkGate,
+};
 use crate::utils::project::{DetectedFramework, FrameworkContext};
 use phf::{Map, phf_map};

@ -226,10 +228,30 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::SQL_QUERY),
        case_sensitive: true,
    },
-    // Open redirect: redirect_to with user-controlled destination.
+    // Open redirect: redirect_to (Rails) / redirect (Sinatra) with
+    // user-controlled destination.  `redirect` is a top-level Sinatra
+    // helper; case-sensitive matching keeps it from over-firing on
+    // unrelated identifiers.  `redirect_to` is the canonical Rails helper.
    LabelRule {
        matchers: &["redirect_to"],
-        label: DataLabel::Sink(Cap::SSRF),
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &["redirect"],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: true,
+    },
+    LabelRule {
+        matchers: &[
+            "validate_redirect_url",
+            "is_safe_redirect",
+            "strip_scheme",
+            "ensure_relative_url",
+            "assert_relative_path",
+            "is_relative_url",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
        case_sensitive: false,
    },
    // Path traversal: file serving with user-controlled path.
@ -244,6 +266,173 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::HTML_ESCAPE),
        case_sensitive: false,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // `Net::LDAP.new(host:, ...).search(base:, filter:, ...)` is the canonical
+    // net-ldap shape.  Type-qualified resolution rewrites `ldap.search` →
+    // `LdapClient.search` when the receiver was constructed via `Net::LDAP.new`
+    // / `Net::LDAP.open` (see [`crate::ssa::type_facts::constructor_type`]).
+    // The chained literal form `Net::LDAP.new(...).search(...)` is also caught
+    // by the suffix matcher `Net::LDAP.search` after `()` stripping (the
+    // post-strip text is `Net::LDAP.new.search`, which ends in `.search`; the
+    // explicit `LDAP.search` keyword form `Net::LDAP.search(filter)` matches
+    // the same matcher directly).
+    LabelRule {
+        matchers: &["LdapClient.search", "Net::LDAP.search"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizer ───
+    //
+    // `Net::LDAP::Filter.escape(value)` applies RFC 4515 escaping; treat any
+    // call as clearing the LDAP_INJECTION cap.
+    LabelRule {
+        matchers: &["Net::LDAP::Filter.escape"],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── XPath injection sinks ───
+    //
+    // `Nokogiri::XML::Node#xpath(expr)`, `at_xpath(expr)`, and `search(expr)`
+    // accept the expression string as arg 0; concatenated user input there is
+    // the canonical Nokogiri XPath-injection vector.  Suffix matching on the
+    // bare method names catches the bound-receiver form (`doc.xpath(expr)`).
+    LabelRule {
+        matchers: &["xpath", "at_xpath"],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── XPath escape sanitizers ───
+    //
+    // No Nokogiri / stdlib helper escapes XPath metacharacters; project-local
+    // `escape_xpath` / `xpath_escape` are the developer-named equivalents.
+    LabelRule {
+        matchers: &["escape_xpath", "xpath_escape"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // Rack `Response#set_header(name, value)` / `add_header(name, value)`
+    // and `ActionDispatch::Response#headers[]=` write a single header value.
+    // The subscript-set form `response.headers["X-Foo"] = bar` is picked up
+    // via the LHS-subscript classification path in `cfg/mod.rs`: when the
+    // LHS object's member-expression text matches `response.headers` (or a
+    // synonym), the assignment is tagged as a HEADER_INJECTION sink.
+    // Tainted strings without `\r\n` stripping enable response splitting.
+    LabelRule {
+        matchers: &["set_header", "add_header"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &["response.headers", "res.headers", "self.response.headers"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &["strip_crlf", "escape_header", "sanitize_header"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───
+    //
+    // `ERB.new(template_source)` and `Liquid::Template.parse(source)` accept
+    // the template *source string* as arg 0; tainted source there yields
+    // arbitrary template execution at the corresponding `result(binding)` /
+    // `render` step.  `=ERB.new` exact-matcher syntax limits the rule to the
+    // direct call (the leading `=` is the same convention used elsewhere in
+    // this file for Kernel-style globals like `=open`).
+    LabelRule {
+        matchers: &["=ERB.new", "Liquid::Template.parse"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: true,
+    },
+    // ─── XXE sinks ───
+    //
+    // `REXML::Document.new(xml)` instantiates the (legacy, default-vulnerable)
+    // pure-Ruby XML parser; an attacker-controlled `xml` is XXE.
+    //
+    // Nokogiri (`Nokogiri::XML(xml)` / `Nokogiri::XML::Document.parse(xml)`)
+    // is XXE-safe by default since 1.10; resolving external entities
+    // requires explicitly opting in via `Nokogiri::XML::ParseOptions::NOENT`
+    // (or `DTDLOAD` / `DTDATTR`).  Option-flagged detection lives in
+    // `GATED_SINKS` below.
+    LabelRule {
+        matchers: &["REXML::Document.new"],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+    },
+];
+
+/// Ruby gated sinks.  Argument-role-aware classification for callees that
+/// are XXE-safe by default but become unsafe when the caller passes an
+/// option flag that re-enables external-entity resolution.
+///
+/// Activation uses the bare-leaf comparison: scope-qualified constants like
+/// `Nokogiri::XML::ParseOptions::NOENT` are reduced to the rightmost
+/// `name` segment by the `scope_resolution` branch in
+/// `cfg::literals::extract_const_macro_arg`, so the
+/// `dangerous_values` list stays identifier-bare.
+///
+/// Default-arg semantics: Ruby `Nokogiri::XML(xml)` with no options arg
+/// reaches the gate's `None` activation branch (the activation arg
+/// position simply doesn't exist), which falls through to a conservative
+/// fire.  Callers wishing to suppress the gate explicitly should pass a
+/// safe options literal at the activation position (e.g.
+/// `Nokogiri::XML::ParseOptions::DEFAULT_XML`); any non-dangerous
+/// scope-qualified constant disables the gate.
+pub static GATED_SINKS: &[SinkGate] = &[
+    // `Nokogiri::XML(xml, url=nil, encoding=nil, options=NIL)` — top-level
+    // module method.  arg 3 carries the parse-option flag literal.
+    //
+    // tree-sitter-ruby parses `Nokogiri::XML(args)` as a `call` whose
+    // `receiver` field is the `Nokogiri` constant and `method` field is
+    // the `XML` constant (with `::` as the call operator).  `push_node`'s
+    // `CallMethod` path joins these as `{receiver}.{method}` → matchable
+    // suffix `Nokogiri.XML`.
+    SinkGate {
+        callee_matcher: "Nokogiri.XML",
+        arg_index: 3,
+        dangerous_values: &["NOENT", "DTDLOAD", "DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // `Nokogiri::XML::Document.parse(xml, url=nil, encoding=nil, options=NIL)`
+    // — receiver is the scope_resolution `Nokogiri::XML::Document` (text of
+    // the whole receiver is preserved verbatim) and method is `parse`, so
+    // the constructed callee text is `Nokogiri::XML::Document.parse`.
+    SinkGate {
+        callee_matcher: "Nokogiri::XML::Document.parse",
+        arg_index: 3,
+        dangerous_values: &["NOENT", "DTDLOAD", "DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
+    // `Nokogiri::HTML(html, ..., options)` shares the same option flags as
+    // the XML helper.  Same callee normalization as `Nokogiri.XML`.
+    SinkGate {
+        callee_matcher: "Nokogiri.HTML",
+        arg_index: 3,
+        dangerous_values: &["NOENT", "DTDLOAD", "DTDATTR"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
 ];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
--- a/src/labels/rust.rs
+++ b/src/labels/rust.rs
@ -1,4 +1,6 @@
-use crate::labels::{Cap, DataLabel, Kind, LabelRule, ParamConfig, RuntimeLabelRule};
+use crate::labels::{
+    Cap, DataLabel, GateActivation, Kind, LabelRule, ParamConfig, RuntimeLabelRule, SinkGate,
+};
 use crate::utils::project::{DetectedFramework, FrameworkContext};
 use phf::{Map, phf_map};

@ -245,6 +247,89 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::DESERIALIZE),
        case_sensitive: false,
    },
+    // ─── Header / CRLF injection sinks ───
+    //
+    // `http::HeaderMap::insert(name, val)` / `append(...)` write a single
+    // header value.  The canonical idiom is `response.headers_mut().insert(...)`
+    // (axum, actix-web `HttpResponse.headers_mut`, hyper `Response::headers_mut`).
+    // After paren-group stripping the chain text becomes
+    // `response.headers_mut.insert`, so suffix matchers on
+    // `headers_mut.insert` / `headers_mut.append` cover the bound-receiver
+    // form regardless of the response builder's concrete type.  Tainted
+    // strings without CRLF stripping enable response splitting.
+    LabelRule {
+        matchers: &["headers_mut.insert", "headers_mut.append"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &["strip_crlf", "escape_header", "sanitize_header"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Open redirect sinks ───
+    //
+    // axum / rocket `Redirect::to(url)` / `Redirect::permanent(url)` /
+    // `Redirect::temporary(url)` build a 3xx response with the URL in the
+    // `Location` header.  Without an allowlist check, a tainted `url` is
+    // the canonical Rust open-redirect vector.  Listed unconditionally (not
+    // gated on framework detection) so non-framework helpers / re-exports
+    // still surface; the framework-conditional rules below
+    // intentionally do not duplicate this label.  Actix
+    // `HttpResponse::Found().header("Location", x)` is covered by the
+    // existing `header` HEADER_INJECTION sink and any Location-line
+    // co-tagging is deferred to the abstract-string-domain pattern hook.
+    LabelRule {
+        matchers: &["Redirect::to", "Redirect::permanent", "Redirect::temporary"],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: true,
+    },
+    LabelRule {
+        matchers: &[
+            "validate_redirect_url",
+            "is_safe_redirect",
+            "strip_scheme",
+            "ensure_relative_url",
+            "assert_relative_path",
+            "is_relative_url",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+];
+
+/// Rust gated sinks.  Argument-position-aware classification for callees
+/// where activation depends on a literal arg value rather than the bare
+/// callee name.
+pub static GATED_SINKS: &[SinkGate] = &[
+    // actix-web `HttpResponse::Found().header("Location", url)` (and other
+    // builder variants like `Ok().header(...)`, `MovedPermanently().header(...)`).
+    // After chain normalisation the callee text is e.g.
+    // `HttpResponse.Found.header`; suffix matching on `header` covers every
+    // builder variant.
+    //
+    // Activation: arg 0 case-insensitive equality with `"Location"`.  When
+    // arg 0 is a constant string equal to `Location` the gate fires and
+    // checks payload arg 1 for taint; constants like `"Content-Type"` are
+    // suppressed by the safe-literal branch.  When arg 0 is dynamic the
+    // gate fires conservatively (per the existing `setAttribute` /
+    // `parseFromString` convention).
+    //
+    // Mirrors PHP's `=header` Location gate; the Rust analog is split
+    // across two args (`name`, `value`) instead of PHP's single `Location: ...`
+    // line.
+    SinkGate {
+        callee_matcher: "header",
+        arg_index: 0,
+        dangerous_values: &["Location"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: true,
+        payload_args: &[1],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::ValueMatch,
+    },
 ];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
@ -337,11 +422,8 @@ pub fn framework_rules(ctx: &FrameworkContext) -> Vec<RuntimeLabelRule> {
            label: DataLabel::Sink(Cap::HTML_ESCAPE),
            case_sensitive: true,
        });
-        rules.push(RuntimeLabelRule {
-            matchers: vec!["Redirect::to".into()],
-            label: DataLabel::Sink(Cap::SSRF),
-            case_sensitive: true,
-        });
+        // `Redirect::to` is declared unconditionally as Sink(OPEN_REDIRECT)
+        // in `RULES` above; no framework-conditional duplicate needed.
    }

    if ctx.has(DetectedFramework::ActixWeb) {
@ -395,11 +477,8 @@ pub fn framework_rules(ctx: &FrameworkContext) -> Vec<RuntimeLabelRule> {
            label: DataLabel::Sink(Cap::HTML_ESCAPE),
            case_sensitive: true,
        });
-        rules.push(RuntimeLabelRule {
-            matchers: vec!["Redirect::to".into()],
-            label: DataLabel::Sink(Cap::SSRF),
-            case_sensitive: true,
-        });
+        // `Redirect::to` is declared unconditionally as Sink(OPEN_REDIRECT)
+        // in `RULES` above; no framework-conditional duplicate needed.
    }

    rules
--- a/src/labels/typescript.rs
+++ b/src/labels/typescript.rs
@ -255,6 +255,113 @@ pub static RULES: &[LabelRule] = &[
        label: DataLabel::Sink(Cap::SQL_QUERY),
        case_sensitive: true,
    },
+    // ─── LDAP injection sinks ───
+    //
+    // Mirror of `labels/javascript.rs`; ldapjs / ts-ldapjs has the same
+    // `client.search(...)` shape.  Type-qualified resolution covers both
+    // `const client = ldap.createClient({...}); client.search(...)` (bound
+    // variable, type forwarded from the parent body via
+    // [`crate::taint::inject_external_type_facts`]) and the chained
+    // `ldap.createClient({...}).search(...)` form.
+    LabelRule {
+        matchers: &["LdapClient.search"],
+        label: DataLabel::Sink(Cap::LDAP_INJECTION),
+        case_sensitive: true,
+    },
+    // ─── LDAP-filter sanitizers ───
+    LabelRule {
+        matchers: &[
+            "ldapEscape",
+            "ldap-escape",
+            "ldapescape.filter",
+            "ldapescape.dn",
+        ],
+        label: DataLabel::Sanitizer(Cap::LDAP_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath injection sinks ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &[
+            "document.evaluate",
+            "xpath.select",
+            "xpath.evaluate",
+            "xpath.select1",
+        ],
+        label: DataLabel::Sink(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── XPath escape sanitizers ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &["escapeXpath", "xpathEscape", "escape_xpath"],
+        label: DataLabel::Sanitizer(Cap::XPATH_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF injection sinks ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &["setHeader", "res.set", "res.header", "res.append"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // Subscript-set form (mirrors `labels/javascript.rs`).
+    LabelRule {
+        matchers: &["res.headers", "response.headers", "self.response.headers"],
+        label: DataLabel::Sink(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Header / CRLF sanitizers ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &["stripCRLF", "stripCrlf", "escapeHeader", "sanitizeHeader"],
+        label: DataLabel::Sanitizer(Cap::HEADER_INJECTION),
+        case_sensitive: false,
+    },
+    // ─── Prototype pollution sinks ───  (mirrors `labels/javascript.rs`)
+    //
+    // Argument-role gating is enforced via Destination activation in
+    // `GATED_SINKS` below: only taint flowing into source-object
+    // arguments (positions 1+) activates; tainted-target alone is
+    // benign.  Flat rules here are intentionally empty for the merge
+    // family.
+    // ─── Open redirect sinks ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &[
+            "res.redirect",
+            "location.replace",
+            "location.assign",
+            "router.navigate",
+            "router.navigateByUrl",
+            "window.location",
+            "window.location.href",
+            "location.href",
+        ],
+        label: DataLabel::Sink(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    LabelRule {
+        matchers: &[
+            "validateRedirectUrl",
+            "isSafeRedirect",
+            "stripScheme",
+            "ensureRelativeUrl",
+            "assertRelativePath",
+            "isRelativeUrl",
+        ],
+        label: DataLabel::Sanitizer(Cap::OPEN_REDIRECT),
+        case_sensitive: false,
+    },
+    // ─── SSTI sinks ───  (mirrors `labels/javascript.rs`; `_.template`
+    // and `nunjucks.renderString` excluded — gated classifiers in
+    // GATED_SINKS)
+    LabelRule {
+        matchers: &["Handlebars.compile"],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+    },
+    // ─── XXE sinks ───  (mirrors `labels/javascript.rs`)
+    LabelRule {
+        matchers: &["libxmljs.parseXmlString", "libxmljs.parseXml"],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+    },
 ];

 /// Callee patterns that must never be classified as source/sanitizer/sink.
@ -309,6 +416,23 @@ pub static GATED_SINKS: &[SinkGate] = &[
        dangerous_kwargs: &[],
        activation: GateActivation::ValueMatch,
    },
+    // ── XML XXE gates, mirrors `labels/javascript.rs` ────────────────────
+    SinkGate {
+        callee_matcher: "xml2js.parseString",
+        arg_index: 1,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::XXE),
+        case_sensitive: true,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[
+            ("processEntities", &["true"]),
+            ("explicitEntities", &["true"]),
+            ("strict", &["false"]),
+        ],
+        activation: GateActivation::ValueMatch,
+    },
    // ── Outbound HTTP clients (SSRF), see javascript.rs for rationale ────
    SinkGate {
        callee_matcher: "fetch",
@ -603,6 +727,189 @@ pub static GATED_SINKS: &[SinkGate] = &[
            object_destination_fields: &[],
        },
    },
+    // `nunjucks.renderString(src, ctx)` — Nunjucks SSTI sink.  Only the
+    // template *source* (arg 0) lets an attacker drive template
+    // execution; the `ctx` data object (arg 1) is rendered via the
+    // template's escape policy and is not itself a code-injection
+    // vector.  Gate via Destination-style activation with
+    // `payload_args: &[0]` so taint flowing only into `ctx` is
+    // suppressed.  Mirrors `labels/javascript.rs`.
+    SinkGate {
+        callee_matcher: "nunjucks.renderString",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::SSTI),
+        case_sensitive: false,
+        payload_args: &[0],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // ── Prototype pollution gates ────────────────────────────────────────
+    //
+    // Mirrors `labels/javascript.rs` GATED_SINKS proto-pollution block.
+    // Argument-role gating: `(target, src1, src2, ...)`, only source
+    // positions trigger.  See the JS module for the rationale and the
+    // `payload_args` width choice.
+    SinkGate {
+        callee_matcher: "_.merge",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.mergeWith",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.defaultsDeep",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.set",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "_.setWith",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "deepMerge",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "defaultsDeep",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: false,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "Object.assign",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "$.extend",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    SinkGate {
+        callee_matcher: "jQuery.extend",
+        arg_index: 0,
+        dangerous_values: &[],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[1, 2, 3, 4],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::Destination {
+            object_destination_fields: &[],
+        },
+    },
+    // Bare `extend` (suffix-matched) — see labels/javascript.rs for full
+    // rationale.  `LiteralOnly` activation requires arg 0 to be literal `true`
+    // so Backbone's `Model.extend({proto})` class-extension form does not
+    // fire (its arg 0 is an object literal, not a boolean).
+    SinkGate {
+        callee_matcher: "extend",
+        arg_index: 0,
+        dangerous_values: &["true"],
+        dangerous_prefixes: &[],
+        label: DataLabel::Sink(Cap::PROTOTYPE_POLLUTION),
+        case_sensitive: true,
+        payload_args: &[2, 3, 4, 5],
+        keyword_name: None,
+        dangerous_kwargs: &[],
+        activation: GateActivation::LiteralOnly,
+    },
 ];

 pub static KINDS: Map<&'static str, Kind> = phf_map! {
--- a/src/server/debug.rs
+++ b/src/server/debug.rs
@ -1181,7 +1181,12 @@ fn type_kind_tag(k: &TypeKind) -> String {
        TypeKind::LocalCollection => "LocalCollection".into(),
        TypeKind::RequestBuilder => "RequestBuilder".into(),
        TypeKind::JpaCriteriaQuery => "JpaCriteriaQuery".into(),
+        TypeKind::LdapClient => "LdapClient".into(),
+        TypeKind::XPathClient => "XPathClient".into(),
+        TypeKind::XmlParser => "XmlParser".into(),
+        TypeKind::Template => "Template".into(),
        TypeKind::Dto(_) => "Dto".into(),
+        TypeKind::NullPrototypeObject => "NullPrototypeObject".into(),
    }
 }

@ -1538,6 +1543,8 @@ pub fn analyse_function_taint(
        receiver_seed: None,
        const_values: Some(&opt.const_values),
        type_facts: Some(&opt.type_facts),
+        xml_parser_config: Some(&opt.xml_parser_config),
+        xpath_config: Some(&opt.xpath_config),
        ssa_summaries: None,
        extra_labels: None,
        callee_bodies: None,
--- a/src/server/models.rs
+++ b/src/server/models.rs
@ -138,6 +138,7 @@ pub struct RuleListItem {
    pub enabled: bool,
    pub is_custom: bool,
    pub is_gated: bool,
+    pub is_class: bool,
    pub case_sensitive: bool,
    pub finding_count: usize,
    pub suppression_rate: f64,
@ -156,6 +157,7 @@ pub struct RuleDetailView {
    pub enabled: bool,
    pub is_custom: bool,
    pub is_gated: bool,
+    pub is_class: bool,
    pub finding_count: usize,
    pub suppression_rate: f64,
    pub example_findings: Vec<RelatedFindingView>,
--- a/src/server/owasp.rs
+++ b/src/server/owasp.rs
@ -25,6 +25,20 @@ fn extract_family(rule_id: &str) -> &str {
    rule_id
 }

+/// True when `rule_id` either equals `prefix` or starts with `prefix`
+/// followed by one of the recognised separator characters used by the
+/// finding-id emitter.  Prevents `taint-ssrf-allowlist-violation`
+/// from silently inheriting `taint-ssrf`'s OWASP bucket.
+fn matches_cap_rule_id(rule_id: &str, prefix: &str) -> bool {
+    if !rule_id.starts_with(prefix) {
+        return false;
+    }
+    matches!(
+        rule_id.as_bytes().get(prefix.len()),
+        None | Some(b' ') | Some(b'(') | Some(b'.')
+    )
+}
+
 /// Return the OWASP 2021 (code, label) pair for a given rule id, or `None` if unmapped.
 pub fn owasp_bucket_for(rule_id: &str) -> Option<(&'static str, &'static str)> {
    let family = extract_family(rule_id);
@ -32,6 +46,27 @@ pub fn owasp_bucket_for(rule_id: &str) -> Option<(&'static str, &'static str)> {
        return None;
    }

+    // Cap-class rule ids carry their canonical OWASP code in
+    // `CAP_RULE_REGISTRY`; consult that first so adding a new cap class
+    // does not require updating two tables.  The legacy family-token
+    // dispatch below covers per-language tree-sitter pattern rules
+    // (`js.xss.outer_html` style) that have no cap entry.
+    //
+    // Match shape: exact equality, or registry id followed by a separator
+    // that the emitter actually uses (` ` for ` (source 1:1)` suffixes,
+    // `(` for `(source 1:1)` style without a leading space, `.` for
+    // dotted variants like `rs.auth.missing_ownership_check.taint`).
+    // Plain `starts_with` would silently bucket a future
+    // `taint-ssrf-allowlist-violation` under the SSRF entry; the
+    // separator gate keeps unrelated suffixes from inheriting a parent
+    // bucket.
+    if let Some(meta) = crate::labels::CAP_RULE_REGISTRY
+        .iter()
+        .find(|m| matches_cap_rule_id(rule_id, m.rule_id))
+    {
+        return Some((meta.owasp_code, meta.owasp_label));
+    }
+
    Some(match family {
        // A01, Broken Access Control
        "auth" | "csrf" | "mass_assign" | "path" | "redirect" => ("A01", "Broken Access Control"),
@ -39,10 +74,10 @@ pub fn owasp_bucket_for(rule_id: &str) -> Option<(&'static str, &'static str)> {
        "crypto" | "secrets" => ("A02", "Cryptographic Failures"),
        // A03, Injection (covers SQLi, XSS, command, code-eval, template, NoSQL, LDAP, reflection,
        // and engine-level taint findings without a more specific family tag).
-        "sqli" | "xss" | "cmdi" | "code_exec" | "template" | "nosql" | "ldap" | "reflection"
-        | "taint" => ("A03", "Injection"),
-        // A05, Security Misconfiguration (TLS verify off, cookie flags, prototype pollution)
-        "config" | "transport" | "prototype" => ("A05", "Security Misconfiguration"),
+        "sqli" | "xss" | "cmdi" | "code_exec" | "template" | "nosql" | "ldap" | "xpath"
+        | "header" | "reflection" | "taint" => ("A03", "Injection"),
+        // A05, Security Misconfiguration (TLS verify off, cookie flags, prototype pollution, XXE)
+        "config" | "transport" | "prototype" | "xxe" => ("A05", "Security Misconfiguration"),
        // A08, Software and Data Integrity Failures
        "deser" => ("A08", "Software and Data Integrity Failures"),
        // A09, Logging & Monitoring Failures
@ -112,6 +147,30 @@ fn issue_category_label(rule_id: &str) -> &'static str {
    if rule_id.starts_with("taint-data-exfiltration") {
        return "Data Exfiltration";
    }
+    // Cap-class rule ids share the `taint` family token but each represents
+    // a distinct vulnerability class.  Match them before falling through
+    // to family-based dispatch so the dashboard surfaces the right badge.
+    if rule_id.starts_with("taint-ldap-injection") {
+        return "LDAP Injection";
+    }
+    if rule_id.starts_with("taint-xpath-injection") {
+        return "XPath Injection";
+    }
+    if rule_id.starts_with("taint-header-injection") {
+        return "Header Injection";
+    }
+    if rule_id.starts_with("taint-open-redirect") {
+        return "Open Redirect";
+    }
+    if rule_id.starts_with("taint-template-injection") {
+        return "Template Injection";
+    }
+    if rule_id.starts_with("taint-xxe") {
+        return "XXE";
+    }
+    if rule_id.starts_with("taint-prototype-pollution") {
+        return "Prototype Pollution";
+    }
    match extract_family(rule_id) {
        "sqli" => "SQL Injection",
        "xss" => "Cross-Site Scripting",
@ -229,6 +288,40 @@ mod tests {
        assert_eq!(out[2].count, 2);
    }

+    #[test]
+    fn cap_rule_id_match_requires_separator() {
+        // Exact match → bucketed.
+        assert_eq!(
+            owasp_bucket_for("taint-ssrf"),
+            Some(("A10", "Server-Side Request Forgery"))
+        );
+        // Suffix after recognised separators is bucketed.
+        assert_eq!(
+            owasp_bucket_for("taint-ssrf (source 1:1)"),
+            Some(("A10", "Server-Side Request Forgery"))
+        );
+        assert_eq!(
+            owasp_bucket_for("taint-ssrf(source 1:1)"),
+            Some(("A10", "Server-Side Request Forgery"))
+        );
+        // Dotted suffix (used by `rs.auth.missing_ownership_check.taint`).
+        assert_eq!(
+            owasp_bucket_for("rs.auth.missing_ownership_check.taint"),
+            Some(("A01", "Broken Access Control"))
+        );
+        // A hyphen is not a recognised separator, so a hyphenated suffix
+        // must NOT silently inherit the parent bucket; the id falls through
+        // to the family-token table instead.  `ssrf` still resolves to A10
+        // there, so this check uses a hypothetical LDAP sibling.
+        assert_eq!(
+            owasp_bucket_for("taint-ldap-injection-allowlist"),
+            // Family token "taint" → A03; without separator gating this
+            // would have inherited the LDAP entry's A03 anyway, but the
+            // important property is that the registry match was rejected.
+            Some(("A03", "Injection"))
+        );
+    }
+
    #[test]
    fn issue_category_label_routes_data_exfil_to_dedicated_bucket() {
        // `taint-data-exfiltration` shares the `taint` family token with
--- a/src/server/routes/rules.rs
+++ b/src/server/routes/rules.rs
@ -53,6 +53,8 @@ fn build_rule_list(state: &AppState) -> Vec<RuleInfo> {
                case_sensitive: cr.case_sensitive,
                is_custom: true,
                is_gated: false,
+                is_class: false,
+                emission_active: true,
                enabled,
            });
        }
@ -89,6 +91,7 @@ async fn list_rules(State(state): State<AppState>) -> Json<Vec<RuleListItem>> {
                enabled: r.enabled,
                is_custom: r.is_custom,
                is_gated: r.is_gated,
+                is_class: r.is_class,
                case_sensitive: r.case_sensitive,
                finding_count: count,
                suppression_rate: rate,
@ -134,6 +137,7 @@ async fn get_rule(
        enabled: rule.enabled,
        is_custom: rule.is_custom,
        is_gated: rule.is_gated,
+        is_class: rule.is_class,
        finding_count: total,
        suppression_rate: rate,
        example_findings: examples,
--- a/src/ssa/mod.rs
+++ b/src/ssa/mod.rs
@ -31,6 +31,8 @@ pub mod param_points_to;
 pub mod pointsto;
 pub mod static_map;
 pub mod type_facts;
+pub mod xml_config;
+pub mod xpath_config;

 #[allow(unused_imports)]
 pub use ir::*;
@ -51,6 +53,20 @@ pub struct OptimizeResult {
    pub const_values: HashMap<SsaValue, const_prop::ConstLattice>,
    /// Type fact analysis results.
    pub type_facts: type_facts::TypeFactResult,
+    /// XML-parser configuration facts: for each receiver SSA value, the
+    /// `secure_processing` / `disallow_doctype` / `external_entities`
+    /// flags carried forward from setter calls and constructor kwargs.
+    /// Consumed by the SSA taint engine to suppress XXE on parse-class
+    /// sinks whose receiver was provably hardened.
+    #[serde(default)]
+    pub xml_parser_config: xml_config::XmlParserConfigResult,
+    /// XPath-receiver configuration facts: for each receiver SSA value,
+    /// the `has_resolver` flag set by `setXPathVariableResolver` calls.
+    /// Consumed by the SSA taint engine to suppress XPATH_INJECTION on
+    /// `evaluate` / `compile` sinks whose receiver was provably bound
+    /// to a variable resolver (parameterised XPath shape).
+    #[serde(default)]
+    pub xpath_config: xpath_config::XPathConfigResult,
    /// Base-variable alias groups from copy propagation.
    pub alias_result: alias::BaseAliasResult,
    /// Points-to analysis: per-SSA-value abstract heap object sets.
@ -100,6 +116,17 @@ pub fn optimize_ssa_with_param_types(
    let type_facts =
        type_facts::analyze_types_with_param_types(body, cfg, &cp.values, lang, param_types);

+    // 5b. XML-parser config analysis.  Tracks per-receiver hardening
+    // flags so XXE sinks can be suppressed when the parser was provably
+    // configured for secure processing.
+    let xml_parser_config = xml_config::analyze_xml_parser_config(body, cfg, &cp.values, lang);
+
+    // 5c. XPath-receiver config analysis.  Tracks the per-receiver
+    // `has_resolver` flag so `XPath.evaluate(taintedExpr, ...)` sinks
+    // can be suppressed when the receiver was bound to an
+    // `XPathVariableResolver` (parameterised-XPath shape).
+    let xpath_config = xpath_config::analyze_xpath_config(body, cfg, lang);
+
    // 6. Points-to analysis (uses allocation site detection + SSA def-use)
    let points_to = heap::analyze_points_to(body, cfg, lang);

@ -113,6 +140,8 @@ pub fn optimize_ssa_with_param_types(
    OptimizeResult {
        const_values: cp.values,
        type_facts,
+        xml_parser_config,
+        xpath_config,
        alias_result,
        points_to,
        module_aliases,
--- a/src/ssa/type_facts.rs
+++ b/src/ssa/type_facts.rs
@ -52,12 +52,55 @@ pub enum TypeKind {
    /// where openmrs / xwiki / keycloak Hibernate DAOs build queries
    /// via `cb.createQuery(Foo.class)` + `Root` / `Predicate` API.
    JpaCriteriaQuery,
+    /// An LDAP directory-service client / connection (`DirContext`,
+    /// `LdapTemplate`, `Net::LDAP`, `ldap3.Connection`, `ldap.createClient`,
+    /// `ldap.DialURL`, etc.).  Distinct from `DatabaseConnection` so the
+    /// type-qualified `LdapClient.search` rule fires only on directory
+    /// search APIs rather than every DB receiver with a `search` method.
+    LdapClient,
+    /// An XPath query / evaluation client (`DOMXPath`, `XPath`,
+    /// `XPathExpression`, `lxml.etree.XPath`, etc.).  Distinct from
+    /// `DatabaseConnection` so the type-qualified `XPathClient.query` /
+    /// `XPathClient.evaluate` rules fire only on XPath APIs rather than
+    /// every receiver with a generic `query` / `evaluate` method (avoids
+    /// collision with PHP `$pdo->query` SQL_QUERY sink).
+    XPathClient,
+    /// A pre-parsed template object whose `process` / `merge` /
+    /// `render` method renders bound data through an already-compiled
+    /// template body.  The SSTI vector is when the template *source*
+    /// fed to the constructor / factory was attacker-influenced; the
+    /// render-time call site is the sink.  Currently populated by
+    /// `new freemarker.template.Template(...)`; the type-qualified
+    /// resolver rewrites `tpl.process(...)` → `Template.process` so
+    /// the existing flat SSTI rule fires on idiomatic
+    /// `Template tpl = new Template(...); tpl.process(model, out)`
+    /// shapes.
+    Template,
+    /// An XML parser instance produced by a JAXP factory call
+    /// (`DocumentBuilderFactory.newDocumentBuilder()`,
+    /// `SAXParserFactory.newSAXParser()`, `XMLReaderFactory.createXMLReader()`).
+    /// `DOMXPath` and friends keep their own `XPathClient` tag.  Used so
+    /// the type-qualified `XmlParser.parse` rule fires on instance-style
+    /// calls (`builder.parse(input)`) without needing a flat-rule
+    /// matcher per concrete subclass.  Also gates the XXE config-fact
+    /// suppression: only XmlParser-typed receivers consult the
+    /// [`crate::ssa::xml_config::XmlParserConfigResult`] sidecar.
+    XmlParser,
    /// A framework-injected DTO body whose field types are known.
    /// Populated when a parameter is recognised as a typed extractor and
    /// the DTO class / struct / Pydantic model is resolvable in scope.
    /// Strictly additive, without a DTO definition, callers fall back
    /// to name-only resolution.
    Dto(DtoFields),
+    /// An object created with `Object.create(null)` — has no prototype
+    /// chain, so subscript-write keys cannot pollute `Object.prototype`.
+    /// Populated for JS/TS values whose constructor call is
+    /// `Object.create(null)`. The PROTOTYPE_POLLUTION suppression at the
+    /// synthetic `__index_set__` sink consults this fact (via SSA receiver
+    /// value) so the suppression is flow-sensitive: if a phi join leaves
+    /// the receiver only sometimes null-prototyped, the fact widens to
+    /// `Unknown` and the sink fires on the unsafe path.
+    NullPrototypeObject,
 }

 /// structural carrier for a recognised DTO type.  Maps
@ -99,6 +142,10 @@ impl TypeKind {
            Self::Url => Some("URL"),
            Self::RequestBuilder => Some("RequestBuilder"),
            Self::JpaCriteriaQuery => Some("JpaCriteriaQuery"),
+            Self::LdapClient => Some("LdapClient"),
+            Self::XPathClient => Some("XPathClient"),
+            Self::XmlParser => Some("XmlParser"),
+            Self::Template => Some("Template"),
            _ => None,
        }
    }
@ -288,9 +335,11 @@ pub fn is_safe_query_object_arg(
 /// authoritative, and consumers see Unknown instead of a wrong
 /// type tag.
 ///
-/// `_args` and `_consts` are kept on the signature so we can later
-/// add arg-shape narrowing when class-literal lowering captures
-/// `Foo.class` as an arg-use.
+/// `_args` and `_consts` allow arg-shape narrowing when an arg's
+/// constant value distinguishes overloads.  Reserved for a future Java
+/// `createQuery(Foo.class)` shape (the `Object.create(null)` case is
+/// driven by the `produces_null_proto` CFG flag instead, since a
+/// literal `null` arg leaves no SSA value to inspect).
 fn arg_aware_call_type(
    lang: Lang,
    callee: &str,
@ -392,6 +441,40 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
            "createCriteriaUpdate" | "createCriteriaDelete" | "createTupleQuery" | "subquery" => {
                Some(TypeKind::JpaCriteriaQuery)
            }
+            // LDAP directory-service clients.  `new InitialDirContext(env)` /
+            // `new InitialLdapContext(env, ctls)` instantiate the JNDI LDAP
+            // provider; `new LdapTemplate(...)` / `LdapTemplate.<init>` is the
+            // Spring LDAP wrapper.  Both expose `search` / `searchByEntity` /
+            // `searchForObject` overloads where filter/DN strings are LDAP
+            // injection sinks.
+            "InitialDirContext" | "InitialLdapContext" | "LdapTemplate" => {
+                Some(TypeKind::LdapClient)
+            }
+            // JAXP factory-produced XML parser instances.  Each is
+            // XXE-vulnerable by default until hardened with
+            // `setFeature(FEATURE_SECURE_PROCESSING, true)` (or
+            // disallow-doctype-decl, etc.). The
+            // [`crate::ssa::xml_config::XmlParserConfigResult`] sidecar
+            // suppresses the XXE bit at the type-qualified `XmlParser.parse`
+            // sink when the receiver carries a hardening fact.
+            "newDocumentBuilder" | "newSAXParser" | "getXMLReader" | "newXMLReader"
+            | "createXMLReader" => Some(TypeKind::XmlParser),
+            // `XPathFactory.newXPath()` returns a JAXP `XPath` instance.
+            // Mapping it to `XPathClient` lets the type-qualified resolver
+            // pick up `xpath.evaluate(...)` against the existing
+            // `XPathClient.evaluate` rule and lets the
+            // [`crate::ssa::xpath_config::XPathConfigResult`] sidecar
+            // suppress XPATH_INJECTION when the receiver was bound to an
+            // `XPathVariableResolver`.
+            "newXPath" => Some(TypeKind::XPathClient),
+            // Apache FreeMarker `new Template(name, reader, cfg)` /
+            // `cfg.getTemplate(name)`.  The `Template` instance's
+            // `.process(model, out)` is an SSTI sink when the
+            // constructor source / template body came from tainted
+            // input.  Type-qualified resolution rewrites
+            // `tpl.process(...)` → `Template.process` against the
+            // existing flat rule in `labels/java.rs`.
+            "Template" | "getTemplate" => Some(TypeKind::Template),
            _ => None,
        },
        Lang::JavaScript | Lang::TypeScript => match suffix {
@ -409,6 +492,12 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
            // `elementsMap.get(id)`, `origIdToDuplicateId.get(...)`,
            // `groupIdMapForOperation.set(...)` shapes).
            "Map" | "Set" | "WeakMap" | "WeakSet" | "Array" => Some(TypeKind::LocalCollection),
+            // ldapjs client factory: `ldap.createClient({ url: '…' })` returns
+            // a Client whose `search(base, opts, cb)` is an LDAP injection
+            // sink.  Match the qualified callee text rather than the bare
+            // `createClient` suffix to avoid widening to unrelated factories
+            // with the same verb name.
+            "createClient" if callee.contains("ldap") => Some(TypeKind::LdapClient),
            _ => None,
        },
        Lang::Python => {
@ -429,6 +518,15 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
            } else if suffix == "open" && !callee.contains('.') {
                // Bare `open()` is file I/O in Python
                Some(TypeKind::FileHandle)
+            } else if callee == "ldap.initialize"
+                || callee == "ldap3.Connection"
+                || callee.ends_with(".initialize") && callee.contains("ldap")
+            {
+                // python-ldap: `conn = ldap.initialize(url)` returns an
+                // LDAPObject whose `search_s` / `search_ext_s` methods are
+                // LDAP-injection sinks.  ldap3: `Connection(server, ...)`
+                // returns a Connection with a `search()` method.
+                Some(TypeKind::LdapClient)
            } else {
                None
            }
@ -442,6 +540,10 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
                Some(TypeKind::FileHandle)
            } else if callee.contains("url.") && suffix == "Parse" {
                Some(TypeKind::Url)
+            } else if callee.contains("ldap.") && matches!(suffix, "Dial" | "DialURL" | "DialTLS") {
+                // go-ldap (`github.com/go-ldap/ldap/v3`): `conn, _ := ldap.DialURL(url)`
+                // returns `*ldap.Conn` whose `Search(req)` is an LDAP-injection sink.
+                Some(TypeKind::LdapClient)
            } else {
                None
            }
@ -451,6 +553,10 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
            "curl_init" => Some(TypeKind::HttpClient),
            "fopen" => Some(TypeKind::FileHandle),
            "SplFileObject" => Some(TypeKind::FileHandle),
+            // DOMXPath: `$xp = new DOMXPath($doc)`.  `$xp->query($expr)` /
+            // `$xp->evaluate($expr)` are XPath-injection sinks; without a
+            // distinct TypeKind they collide with the bare `query` SQL sink.
+            "DOMXPath" => Some(TypeKind::XPathClient),
            _ => None,
        },
        Lang::C => match suffix {
@ -524,6 +630,11 @@ pub(crate) fn constructor_type(lang: Lang, callee: &str) -> Option<TypeKind> {
                Some(TypeKind::DatabaseConnection)
            } else if after_colons.starts_with("File.") && matches!(suffix, "open" | "new") {
                Some(TypeKind::FileHandle)
+            } else if callee.contains("Net::LDAP") && matches!(suffix, "new" | "open") {
+                // net-ldap gem: `Net::LDAP.new(host: ...)` / `Net::LDAP.open`
+                // returns a connection whose `search(base:, filter:)` accepts
+                // an attacker-influenceable filter expression.
+                Some(TypeKind::LdapClient)
            } else {
                None
            }
@ -768,8 +879,7 @@ pub fn analyze_types(
 /// Same as [`analyze_types`] but seeds [`SsaOp::Param`] values with
 /// per-position [`TypeKind`] facts from `param_types` (parallel-vec to
 /// the function's BodyMeta.params).  An entry of `None` (or an out-of-
-/// range index) leaves the value at the default Param fact (Unknown),
-/// preserving the pre-Phase-3 behaviour.
+/// range index) leaves the value at the default Param fact (Unknown).
 pub fn analyze_types_with_param_types(
    body: &SsaBody,
    cfg: &Cfg,
@ -810,8 +920,7 @@ pub fn analyze_types_with_param_types(
                SsaOp::Param { index } => {
                    // Seed from the function's BodyMeta.param_types when
                    // a TypeKind was recovered at CFG construction time.
-                    // Out-of-range / None entries fall back to Unknown,
-                    // matching the pre-Phase-3 behaviour.
+                    // Out-of-range / None entries fall back to Unknown.
                    match param_types.get(*index).and_then(|t| t.clone()) {
                        Some(tk) => TypeFact::from_kind(tk),
                        None => TypeFact::unknown(),
@ -820,7 +929,19 @@ pub fn analyze_types_with_param_types(
                SsaOp::SelfParam => TypeFact::from_kind(TypeKind::Object),
                SsaOp::CatchParam => TypeFact::from_kind(TypeKind::Object),
                SsaOp::Call { callee, args, .. } => {
-                    if let Some(ty) = lang.and_then(|l| constructor_type(l, callee)) {
+                    // CFG marks `Object.create(null)` (and future
+                    // null-prototype constructors) at lowering time.
+                    // Honour it ahead of generic constructor / arg-aware
+                    // dispatch so the returned SsaValue carries
+                    // `NullPrototypeObject` for prototype-pollution
+                    // suppression.
+                    let null_proto = cfg
+                        .node_weight(inst.cfg_node)
+                        .map(|ni| ni.call.produces_null_proto)
+                        .unwrap_or(false);
+                    if null_proto {
+                        TypeFact::from_kind(TypeKind::NullPrototypeObject)
+                    } else if let Some(ty) = lang.and_then(|l| constructor_type(l, callee)) {
                        TypeFact::from_kind(ty)
                    } else if let Some(ty) =
                        lang.and_then(|l| arg_aware_call_type(l, callee, args, consts))
@ -1667,7 +1788,7 @@ mod tests {

    /// Param values seeded from `param_types` must surface
    /// the right TypeKind for downstream sink suppression.  An out-of-
-    /// range index falls back to Unknown (the pre-Phase-3 default).
+    /// range index falls back to Unknown.
    #[test]
    fn param_types_seed_param_value_facts() {
        use crate::cfg::Cfg;
@ -1728,7 +1849,7 @@ mod tests {
        // Index 99 is out of range → falls back to Unknown.
        assert_eq!(result.get_type(SsaValue(1)), Some(&TypeKind::Unknown));

-        // Empty slice = pre-Phase-3 behaviour.
+        // Empty slice = type-unaware fallback (analyze_types path).
        let result2 = analyze_types(&body, &cfg, &consts, Some(Lang::Java));
        assert_eq!(result2.get_type(SsaValue(0)), Some(&TypeKind::Unknown));
    }
@ -2364,7 +2485,7 @@ mod tests {
        ));
    }

-    // ── JPA Criteria query suppression (Phase: real-repo openmrs FP) ───
+    // ── JPA Criteria query suppression (real-repo openmrs FP) ─────────
    //
    // These tests pin the `TypeKind::JpaCriteriaQuery` variant + the
    // `is_safe_query_object_arg` predicate + the
--- a/src/ssa/xml_config.rs
+++ b/src/ssa/xml_config.rs
@ -0,0 +1,614 @@
+//! Per-SSA-value XML-parser configuration tracking.
+//!
+//! Tracks "is this XML parser configured to disable external entities / DTD
+//! resolution" facts on parser-receiver SSA values. When a parse-class sink
+//! is reached and the receiver is provably configured for secure processing,
+//! the XXE bit is stripped from the sink's cap mask.
+//!
+//! The pass is intentionally a small forward dataflow run alongside type-fact
+//! analysis. It does NOT flow through the SSA taint engine's worklist. Phi
+//! nodes propagate the meet of operand configs (a safe flag is "set" only when
+//! all reaching operands set it; the unsafe flag when any does), and copies
+//! propagate the source's config. Recognised setter calls update the
+//! receiver's config in place; identity-style transformer calls that produce
+//! a child parser (e.g. `factory.newDocumentBuilder()`) inherit the
+//! receiver's config into the result value.
+
+use std::collections::HashMap;
+
+use super::const_prop::ConstLattice;
+use super::ir::*;
+use crate::cfg::Cfg;
+use crate::symbol::Lang;
+use serde::{Deserialize, Serialize};
+
+/// Receiver-instance config carried forward from setter calls.
+///
+/// All flags default to `false` (parser may be unsafe).  A `true` flag
+/// means: we have proven this parser was hardened along this control-flow
+/// path.  The XXE-suppression check is `(secure_processing ||
+/// disallow_doctype) && !external_entities`: either gate neutralises
+/// external entities in JAXP / lxml / xml2js unless the opt-in is set.
+///
+/// `external_entities` is the *unsafe* polarity: when set to `true`, the
+/// parser was explicitly opted into external-entity resolution (e.g.
+/// `XMLParser(resolve_entities=True)`).  A parse call with this flag
+/// retains XXE even if the language default would otherwise be safe.
+#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Serialize, Deserialize)]
+pub struct XmlParserConfig {
+    pub secure_processing: bool,
+    pub disallow_doctype: bool,
+    pub external_entities: bool,
+}
+
+impl XmlParserConfig {
+    /// True when the parser is provably hardened against XXE.
+    pub fn is_secure(&self) -> bool {
+        (self.secure_processing || self.disallow_doctype) && !self.external_entities
+    }
+
+    /// Phi-meet: a flag survives only when *both* operands set it.  Used
+    /// when the parser variable was reassigned across branches.
+    fn meet(&self, other: &Self) -> Self {
+        XmlParserConfig {
+            secure_processing: self.secure_processing && other.secure_processing,
+            disallow_doctype: self.disallow_doctype && other.disallow_doctype,
+            // Unsafe polarity: ANY branch enabling external entities
+            // contaminates the join.  Conservative w.r.t. XXE.
+            external_entities: self.external_entities || other.external_entities,
+        }
+    }
+
+    /// Union: caller updates the same receiver across multiple setter
+    /// calls.  All known-safe flags accumulate; unsafe is sticky.
+    fn union(&self, other: &Self) -> Self {
+        XmlParserConfig {
+            secure_processing: self.secure_processing || other.secure_processing,
+            disallow_doctype: self.disallow_doctype || other.disallow_doctype,
+            external_entities: self.external_entities || other.external_entities,
+        }
+    }
+}
+
+/// Result of XML-parser config analysis.
+#[derive(Clone, Debug, Default, Serialize, Deserialize)]
+pub struct XmlParserConfigResult {
+    pub configs: HashMap<SsaValue, XmlParserConfig>,
+}
+
+impl XmlParserConfigResult {
+    /// True when the value carries a config fact proving secure processing.
+    pub fn is_secure(&self, v: SsaValue) -> bool {
+        self.configs.get(&v).is_some_and(|c| c.is_secure())
+    }
+
+    /// True when the value was explicitly opted into external-entity
+    /// resolution (e.g. lxml `resolve_entities=True`).
+    pub fn is_unsafe_explicit(&self, v: SsaValue) -> bool {
+        self.configs.get(&v).is_some_and(|c| c.external_entities)
+    }
+}
+
+/// Suppress the `Cap::XXE` bit when the receiver of an XXE-class sink
+/// was provably hardened.  Returns `true` when XXE should be stripped
+/// from the sink's cap mask.
+///
+/// Conservative defaults:
+/// * No receiver SSA value (free function) → returns `false` (cannot
+///   prove safety, fall through to existing classification).
+/// * Receiver carries no config fact → returns `false`.
+/// * `external_entities` flag is set → returns `false` even if a safe
+///   flag is also set, since the unsafe opt-in dominates.
+pub fn xxe_safe(receiver: Option<SsaValue>, xml_config: &XmlParserConfigResult) -> bool {
+    let Some(rv) = receiver else {
+        return false;
+    };
+    xml_config.is_secure(rv)
+}
+
+/// Per-call analysis result: how this call mutates the parser-config
+/// universe.
+#[allow(dead_code)] // SeedResult reserved for future constructor-driven seeding
+enum ConfigEffect {
+    /// No effect on parser configuration.
+    None,
+    /// Update the call's receiver in place by OR-ing the supplied config
+    /// into its current config.  Used for setter calls
+    /// (`factory.setFeature(FEATURE_SECURE_PROCESSING, true)`).
+    UpdateReceiver(XmlParserConfig),
+    /// Inherit the receiver's config into the call's result value.
+    /// Used for identity-style transformer calls
+    /// (`factory.newDocumentBuilder()` returns a builder that shares
+    /// the factory's hardening state).
+    InheritFromReceiver,
+    /// Initialise the call's result value with the supplied config.
+    /// Used for constructor calls whose options reveal the unsafe-explicit
+    /// opt-in (`new XMLParser({ processEntities: true })`,
+    /// `lxml.etree.XMLParser(resolve_entities=True)`).
+    SeedResult(XmlParserConfig),
+}
+
+/// Classify a Call instruction's effect on the parser-config universe.
+///
+/// `consts` maps SSA values to const-lattice facts; the local `arg_const`
+/// helper reads it per arg position (`None` if the position is out of
+/// range or the value is not a known constant).  Setter detection
+/// consults arg-0 (the feature name) and arg-1 (the boolean flag).
+///
+/// `arg_idents` is the matching CFG-level [`info.call.arg_uses`] vector
+/// (per-position identifier text from the source AST).  Used to recover
+/// non-literal feature names like `XMLConstants.FEATURE_SECURE_PROCESSING`
+/// or bare identifiers (`FEATURE_SECURE_PROCESSING`, `Boolean.TRUE`)
+/// that const-propagation cannot fold to a literal.
+///
+/// `arg_literals` is the matching CFG-level
+/// [`info.call.arg_string_literals`] vector (per-position literal text;
+/// strings, booleans, and null/nil/None tokens).  Used to recover the
+/// boolean polarity of `setFeature(NAME, true)` since SSA lowering does
+/// not bind boolean arg literals to any SSA value (`arg_uses` skips them
+/// because they are not identifiers).
+fn classify_call(
+    lang: Lang,
+    callee: &str,
+    args: &[smallvec::SmallVec<[SsaValue; 2]>],
+    receiver: Option<SsaValue>,
+    consts: &HashMap<SsaValue, ConstLattice>,
+    arg_idents: &[Vec<String>],
+    arg_literals: &[Option<String>],
+) -> ConfigEffect {
+    let suffix = callee.rsplit(['.', ':']).next().unwrap_or(callee);
+
+    // Helper: lookup the const lattice for arg N's first SSA value.
+    let arg_const = |n: usize| -> Option<&ConstLattice> {
+        args.get(n)
+            .and_then(|vals| vals.first())
+            .and_then(|v| consts.get(v))
+    };
+    // Helper: text of the const lattice (for string/identifier comparison).
+    let arg_text = |n: usize| -> Option<String> {
+        match arg_const(n)? {
+            ConstLattice::Str(s) => Some(s.clone()),
+            ConstLattice::Bool(b) => Some(b.to_string()),
+            ConstLattice::Int(i) => Some(i.to_string()),
+            _ => None,
+        }
+    };
+    // Helper: textual identifier(s) at arg N from the CFG node.  Non-literal
+    // feature names (`XMLConstants.FEATURE_SECURE_PROCESSING`, bare
+    // `FEATURE_SECURE_PROCESSING`, etc.) surface here.
+    let arg_ident_text = |n: usize| -> Vec<&str> {
+        arg_idents
+            .get(n)
+            .map(|v| v.iter().map(|s| s.as_str()).collect())
+            .unwrap_or_default()
+    };
+    let arg_bool = |n: usize| -> Option<bool> {
+        if let Some(b) = arg_const(n).and_then(|c| match c {
+            ConstLattice::Bool(b) => Some(*b),
+            ConstLattice::Str(s) => match s.as_str() {
+                "True" | "true" => Some(true),
+                "False" | "false" => Some(false),
+                _ => None,
+            },
+            _ => None,
+        }) {
+            return Some(b);
+        }
+        // Fallback: tree-sitter classifies `true` / `false` as bare
+        // identifiers in some grammars.  Inspect the arg's use list.
+        for tok in arg_ident_text(n) {
+            match tok {
+                "true" | "True" | "Boolean.TRUE" => return Some(true),
+                "false" | "False" | "Boolean.FALSE" => return Some(false),
+                _ => {}
+            }
+        }
+        // Fallback: literal tokens lifted by `extract_arg_string_literals`
+        // (booleans / null / numeric tokens).  Java `setFeature(NAME, true)`
+        // does not bind the `true` token to any SSA value, but the literal
+        // surfaces here so the polarity can still be read.
+        if let Some(Some(lit)) = arg_literals.get(n) {
+            match lit.as_str() {
+                "true" | "True" | "Boolean.TRUE" => return Some(true),
+                "false" | "False" | "Boolean.FALSE" => return Some(false),
+                _ => {}
+            }
+        }
+        None
+    };
+
+    match lang {
+        Lang::Java => match suffix {
+            // `factory.setFeature(NAME, BOOL)` — the canonical JAXP
+            // hardening switch.  The feature names that matter:
+            //   * `FEATURE_SECURE_PROCESSING` (XMLConstants.FEATURE_SECURE_PROCESSING)
+            //   * `http://apache.org/xml/features/disallow-doctype-decl`
+            //   * `http://xml.org/sax/features/external-general-entities`
+            //   * `http://xml.org/sax/features/external-parameter-entities`
+            // The first two harden when SET TRUE; the entity ones (and
+            // `load-external-dtd`, also matched below) when SET FALSE.
+            "setFeature" => {
+                if receiver.is_none() {
+                    return ConfigEffect::None;
+                }
+                let name_lit = arg_text(0).unwrap_or_default();
+                let name_idents = arg_ident_text(0);
+                let value = arg_bool(1);
+                let any_ident = |needle: &str| name_idents.iter().any(|s| s.contains(needle));
+                let mut cfg = XmlParserConfig::default();
+                if name_lit == "FEATURE_SECURE_PROCESSING"
+                    || name_lit.contains("XMLConstants.FEATURE_SECURE_PROCESSING")
+                    || name_lit.contains("javax.xml.XMLConstants/feature/secure-processing")
+                    || any_ident("FEATURE_SECURE_PROCESSING")
+                {
+                    if value == Some(true) {
+                        cfg.secure_processing = true;
+                    }
+                } else if name_lit.contains("disallow-doctype-decl")
+                    || any_ident("disallow-doctype-decl")
+                {
+                    if value == Some(true) {
+                        cfg.disallow_doctype = true;
+                    }
+                } else if (name_lit.contains("external-general-entities")
+                    || name_lit.contains("external-parameter-entities")
+                    || name_lit.contains("load-external-dtd")
+                    || any_ident("external-general-entities")
+                    || any_ident("external-parameter-entities")
+                    || any_ident("load-external-dtd"))
+                    && value == Some(false)
+                {
+                    cfg.disallow_doctype = true;
+                }
+                if cfg == XmlParserConfig::default() {
+                    ConfigEffect::None
+                } else {
+                    ConfigEffect::UpdateReceiver(cfg)
+                }
+            }
+            // `factory.setExpandEntityReferences(false)` —
+            // DocumentBuilderFactory legacy hardening switch.
+            "setExpandEntityReferences" => {
+                if receiver.is_none() {
+                    return ConfigEffect::None;
+                }
+                if arg_bool(0) == Some(false) {
+                    ConfigEffect::UpdateReceiver(XmlParserConfig {
+                        disallow_doctype: true,
+                        ..Default::default()
+                    })
+                } else {
+                    ConfigEffect::None
+                }
+            }
+            // `factory.newDocumentBuilder()` / `factory.newSAXParser()` /
+            // `parser.getXMLReader()` propagate the hardening state from
+            // the factory (receiver) onto the produced parser instance
+            // (return value).  Without this propagation, a hardened
+            // factory's child builder would parse with no config.
+            "newDocumentBuilder" | "newSAXParser" | "getXMLReader" | "newXMLReader" => {
+                if receiver.is_some() {
+                    ConfigEffect::InheritFromReceiver
+                } else {
+                    ConfigEffect::None
+                }
+            }
+            _ => ConfigEffect::None,
+        },
+        Lang::Python => {
+            // `lxml.etree.XMLParser(resolve_entities=False)` — the lxml
+            // parser default resolves entities; the keyword argument
+            // changes that.  Const-propagation will not generally see the
+            // kwarg value here (kwargs land in `info.call.kwargs`, not
+            // positional args), so we treat the constructor as a
+            // best-effort initialiser keyed off the keyword's literal
+            // text via the static-map.  When neither keyword surfaces,
+            // the parser keeps the default-empty config.
+            if callee.ends_with("etree.XMLParser") || suffix == "XMLParser" {
+                // Kwargs are not visible in the positional args here; the
+                // per-callsite pass in `analyze_xml_parser_config` reads
+                // them from the CFG node.  Return None at this layer.
+                ConfigEffect::None
+            } else {
+                ConfigEffect::None
+            }
+        }
+        _ => ConfigEffect::None,
+    }
+}
+
+/// Run the XML-parser config analysis on an SSA body.
+pub fn analyze_xml_parser_config(
+    body: &SsaBody,
+    cfg: &Cfg,
+    consts: &HashMap<SsaValue, ConstLattice>,
+    lang: Option<Lang>,
+) -> XmlParserConfigResult {
+    let Some(lang) = lang else {
+        return XmlParserConfigResult::default();
+    };
+
+    let mut configs: HashMap<SsaValue, XmlParserConfig> = HashMap::new();
+
+    // Helper: read the kwargs attached to the original CFG node for the
+    // call instruction at hand.  Used for languages where parser
+    // hardening flags arrive as keyword arguments (Python lxml).
+    let lookup_kwargs = |node_idx: petgraph::graph::NodeIndex| -> Vec<(String, Vec<String>)> {
+        cfg.node_weight(node_idx)
+            .map(|ni| ni.call.kwargs.clone())
+            .unwrap_or_default()
+    };
+    // Helper: read the positional arg-use identifier vectors (e.g.
+    // `XMLConstants.FEATURE_SECURE_PROCESSING` surfaces as a dotted path
+    // here even when const-prop folds it to nothing).
+    let lookup_arg_idents = |node_idx: petgraph::graph::NodeIndex| -> Vec<Vec<String>> {
+        cfg.node_weight(node_idx)
+            .map(|ni| ni.call.arg_uses.clone())
+            .unwrap_or_default()
+    };
+    // Helper: read the per-position literal-token vector
+    // (`arg_string_literals` lifts strings, booleans, null tokens, and
+    // numeric tokens — see `extract_arg_string_literals`).
+    let lookup_arg_literals = |node_idx: petgraph::graph::NodeIndex| -> Vec<Option<String>> {
+        cfg.node_weight(node_idx)
+            .map(|ni| ni.call.arg_string_literals.clone())
+            .unwrap_or_default()
+    };
+
+    // Pass 1 — direct effects from Call instructions in source order.
+    // Setter updates and constructor seeds are effectively monotone
+    // (we OR safe flags onto the receiver / value), so a single pass is
+    // sufficient when phi nodes only appear after the setter.  Pass 2
+    // below handles phi/copy propagation.
+    for block in &body.blocks {
+        for inst in block.body.iter() {
+            if let SsaOp::Call {
+                callee,
+                args,
+                receiver,
+                ..
+            } = &inst.op
+            {
+                // Python lxml.etree.XMLParser(resolve_entities=...): the
+                // kwarg lives on the CFG node's `kwargs` list, not in
+                // the SSA Call args.  Inspect it directly.
+                if matches!(lang, Lang::Python)
+                    && (callee.ends_with("etree.XMLParser")
+                        || callee.rsplit(['.', ':']).next() == Some("XMLParser"))
+                {
+                    let kwargs = lookup_kwargs(inst.cfg_node);
+                    for (name, values) in &kwargs {
+                        if name == "resolve_entities" {
+                            // Look up the literal text on the matching
+                            // argument; tree-sitter-python keywords surface
+                            // the value identifier in the `values` slot.
+                            if values.iter().any(|v| v == "True" || v == "true") {
+                                let entry = configs.entry(inst.value).or_default();
+                                entry.external_entities = true;
+                            } else if values.iter().any(|v| v == "False" || v == "false") {
+                                let entry = configs.entry(inst.value).or_default();
+                                entry.disallow_doctype = true;
+                            }
+                        }
+                        if name == "no_network" && values.iter().any(|v| v == "True" || v == "true")
+                        {
+                            let entry = configs.entry(inst.value).or_default();
+                            entry.disallow_doctype = true;
+                        }
+                    }
+                    continue;
+                }
+
+                // JS/TS: `new XMLParser({ processEntities: true, ... })`.
+                // The fast-xml-parser constructor's option-object fields
+                // are not exposed via const-prop, but the CFG layer
+                // captures string-literal kwargs in the call's
+                // `arg_string_literals` for object-literal positions.
+                // For now, mark the result as unsafe-explicit only when
+                // the node's `kwargs` list carries `processEntities=true`.
+                if matches!(lang, Lang::JavaScript | Lang::TypeScript)
+                    && (callee.ends_with("XMLParser") || callee.ends_with(".XMLParser"))
+                {
+                    let kwargs = lookup_kwargs(inst.cfg_node);
+                    for (name, values) in &kwargs {
+                        if name == "processEntities" && values.iter().any(|v| v == "true") {
+                            let entry = configs.entry(inst.value).or_default();
+                            entry.external_entities = true;
+                        }
+                    }
+                    continue;
+                }
+
+                let arg_idents = lookup_arg_idents(inst.cfg_node);
+                let arg_literals = lookup_arg_literals(inst.cfg_node);
+                match classify_call(
+                    lang,
+                    callee,
+                    args,
+                    *receiver,
+                    consts,
+                    &arg_idents,
+                    &arg_literals,
+                ) {
+                    ConfigEffect::None => {}
+                    ConfigEffect::UpdateReceiver(delta) => {
+                        if let Some(rv) = *receiver {
+                            let entry = configs.entry(rv).or_default();
+                            *entry = entry.union(&delta);
+                        }
+                    }
+                    ConfigEffect::InheritFromReceiver => {
+                        if let Some(rv) = *receiver
+                            && let Some(parent) = configs.get(&rv).copied()
+                        {
+                            let entry = configs.entry(inst.value).or_default();
+                            *entry = entry.union(&parent);
+                        }
+                    }
+                    ConfigEffect::SeedResult(seed) => {
+                        let entry = configs.entry(inst.value).or_default();
+                        *entry = entry.union(&seed);
+                    }
+                }
+            }
+        }
+    }
+
+    // Pass 2 — fixed-point propagation through copy assignments and phi
+    // joins.  The iteration count is capped at 6; in practice 2-3 rounds
+    // suffice on intra-procedural shapes.
+    for _ in 0..6 {
+        let mut changed = false;
+        for block in &body.blocks {
+            for inst in &block.phis {
+                if let SsaOp::Phi(operands) = &inst.op {
+                    let mut acc: Option<XmlParserConfig> = None;
+                    for (_, val) in operands {
+                        let cfg_val = configs.get(val).copied().unwrap_or_default();
+                        acc = Some(match acc {
+                            None => cfg_val,
+                            Some(prev) => prev.meet(&cfg_val),
+                        });
+                    }
+                    if let Some(joined) = acc
+                        && joined != XmlParserConfig::default()
+                    {
+                        let prev = configs.get(&inst.value).copied();
+                        if prev != Some(joined) {
+                            configs.insert(inst.value, joined);
+                            changed = true;
+                        }
+                    }
+                }
+            }
+            for inst in &block.body {
+                if let SsaOp::Assign(uses) = &inst.op
+                    && uses.len() == 1
+                    && let Some(src_cfg) = configs.get(&uses[0]).copied()
+                    && src_cfg != XmlParserConfig::default()
+                {
+                    let prev = configs.get(&inst.value).copied().unwrap_or_default();
+                    let new_cfg = prev.union(&src_cfg);
+                    if Some(new_cfg) != configs.get(&inst.value).copied() {
+                        configs.insert(inst.value, new_cfg);
+                        changed = true;
+                    }
+                }
+                // InheritFromReceiver may need a re-pass when the
+                // receiver's config was set after the call itself was
+                // visited (e.g. the call appears in a later block whose
+                // dominator chain only resolves on the second iteration).
+                if let SsaOp::Call {
+                    callee,
+                    receiver: Some(rv),
+                    ..
+                } = &inst.op
+                {
+                    let suffix = callee.rsplit(['.', ':']).next().unwrap_or(callee);
+                    let inherit = matches!(lang, Lang::Java)
+                        && matches!(
+                            suffix,
+                            "newDocumentBuilder" | "newSAXParser" | "getXMLReader" | "newXMLReader"
+                        );
+                    if inherit && let Some(parent) = configs.get(rv).copied() {
+                        let prev = configs.get(&inst.value).copied().unwrap_or_default();
+                        let new_cfg = prev.union(&parent);
+                        if Some(new_cfg) != configs.get(&inst.value).copied()
+                            && new_cfg != XmlParserConfig::default()
+                        {
+                            configs.insert(inst.value, new_cfg);
+                            changed = true;
+                        }
+                    }
+                }
+            }
+        }
+        if !changed {
+            break;
+        }
+    }
+
+    XmlParserConfigResult { configs }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn default_config_is_unsafe() {
+        let c = XmlParserConfig::default();
+        assert!(!c.is_secure());
+    }
+
+    #[test]
+    fn secure_processing_alone_is_safe() {
+        let c = XmlParserConfig {
+            secure_processing: true,
+            ..Default::default()
+        };
+        assert!(c.is_secure());
+    }
+
+    #[test]
+    fn external_entities_overrides_safe_flag() {
+        let c = XmlParserConfig {
+            secure_processing: true,
+            external_entities: true,
+            ..Default::default()
+        };
+        assert!(!c.is_secure());
+    }
+
+    #[test]
+    fn meet_keeps_only_intersection_of_safe_flags() {
+        let a = XmlParserConfig {
+            secure_processing: true,
+            disallow_doctype: true,
+            ..Default::default()
+        };
+        let b = XmlParserConfig {
+            secure_processing: true,
+            ..Default::default()
+        };
+        let m = a.meet(&b);
+        assert!(m.secure_processing);
+        assert!(!m.disallow_doctype);
+    }
+
+    #[test]
+    fn meet_propagates_unsafe_flag() {
+        let a = XmlParserConfig {
+            secure_processing: true,
+            ..Default::default()
+        };
+        let b = XmlParserConfig {
+            external_entities: true,
+            ..Default::default()
+        };
+        let m = a.meet(&b);
+        // Unsafe flags are sticky: the join is not secure even though one branch was.
+        assert!(!m.is_secure());
+    }
+
+    #[test]
+    fn xxe_safe_returns_false_without_receiver() {
+        let result = XmlParserConfigResult::default();
+        assert!(!xxe_safe(None, &result));
+    }
+
+    #[test]
+    fn xxe_safe_uses_receiver_config() {
+        let mut configs = HashMap::new();
+        configs.insert(
+            SsaValue(7),
+            XmlParserConfig {
+                secure_processing: true,
+                ..Default::default()
+            },
+        );
+        let result = XmlParserConfigResult { configs };
+        assert!(xxe_safe(Some(SsaValue(7)), &result));
+        assert!(!xxe_safe(Some(SsaValue(8)), &result));
+    }
+}
--- a/src/ssa/xpath_config.rs
+++ b/src/ssa/xpath_config.rs
@ -0,0 +1,235 @@
+//! Per-SSA-value XPath-receiver configuration tracking.
+//!
+//! Mirrors [`crate::ssa::xml_config`] but for `XPath` instances rather
+//! than JAXP parser instances.  Tracks "is this XPath receiver bound to
+//! an `XPathVariableResolver`" along the control-flow path: when a
+//! resolver has been bound, subsequent `xpath.evaluate(expr, ...)` calls
+//! are treated as parameterised and the `XPATH_INJECTION` bit is
+//! stripped from the sink's cap mask.
+//!
+//! Same engine shape as [`crate::ssa::xml_config::XmlParserConfigResult`]:
+//! a small forward dataflow run alongside type-fact analysis. Phi nodes
+//! propagate the meet of operand configs (a flag is "set" only when all
+//! reaching operands set it), copy assignments propagate the receiver's
+//! config, and `setXPathVariableResolver` calls update the receiver's
+//! config in place.
+
+use std::collections::HashMap;
+
+use super::ir::*;
+use crate::cfg::Cfg;
+use crate::symbol::Lang;
+use serde::{Deserialize, Serialize};
+
+/// Receiver-instance config carried forward from `setXPathVariableResolver`
+/// calls.  All flags default to `false` (resolver not bound).  A `true`
+/// flag means: we have proven this XPath receiver was configured for
+/// parameterised evaluation along this control-flow path.
+#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Serialize, Deserialize)]
+pub struct XPathReceiverConfig {
+    /// True when `xpath.setXPathVariableResolver(...)` has been called
+    /// on this receiver.  Set by Pass 1 on the receiver SSA value;
+    /// propagated through phi joins (meet) and copy assignments (union).
+    pub has_resolver: bool,
+}
+
+impl XPathReceiverConfig {
+    /// True when the receiver is provably bound to a variable resolver.
+    pub fn is_parameterised(&self) -> bool {
+        self.has_resolver
+    }
+
+    /// Phi-meet: a flag survives only when *both* operands set it.  Used
+    /// when the XPath variable was reassigned across branches and only
+    /// some branches bound a resolver.
+    fn meet(&self, other: &Self) -> Self {
+        XPathReceiverConfig {
+            has_resolver: self.has_resolver && other.has_resolver,
+        }
+    }
+
+    /// Union: a flag is set when *either* side sets it.  Used for copy
+    /// propagation, which preserves the source value's flags on top of
+    /// anything already recorded for the destination.
+    fn union(&self, other: &Self) -> Self {
+        XPathReceiverConfig {
+            has_resolver: self.has_resolver || other.has_resolver,
+        }
+    }
+}
+
+/// Result of XPath-receiver config analysis.
+#[derive(Clone, Debug, Default, Serialize, Deserialize)]
+pub struct XPathConfigResult {
+    pub configs: HashMap<SsaValue, XPathReceiverConfig>,
+}
+
+impl XPathConfigResult {
+    /// True when the value carries a config fact proving resolver
+    /// binding.
+    pub fn is_parameterised(&self, v: SsaValue) -> bool {
+        self.configs.get(&v).is_some_and(|c| c.is_parameterised())
+    }
+}
+
+/// Decide whether the `Cap::XPATH_INJECTION` bit can be suppressed: the
+/// receiver of an XPath `evaluate` / `compile` sink must be provably bound
+/// to a variable resolver.  Returns `true` when `XPATH_INJECTION` should
+/// be stripped from the sink's cap mask; the caller does the stripping.
+///
+/// Conservative defaults:
+/// * No receiver SSA value (free function) → returns `false` (cannot
+///   prove safety, fall through to existing classification).
+/// * Receiver carries no config fact → returns `false`.
+pub fn xpath_safe(receiver: Option<SsaValue>, xpath_config: &XPathConfigResult) -> bool {
+    let Some(rv) = receiver else {
+        return false;
+    };
+    xpath_config.is_parameterised(rv)
+}
+
+/// Run the XPath-receiver config analysis on an SSA body.
+///
+/// Currently models Java's `setXPathVariableResolver` only — the only
+/// language-level resolver-binding API for XPath in the existing
+/// detection corpus.  PHP's `DOMXPath::registerPhpFunctions()` is a
+/// different mechanism (PHP function registration) and not modelled
+/// here.
+pub fn analyze_xpath_config(body: &SsaBody, cfg: &Cfg, lang: Option<Lang>) -> XPathConfigResult {
+    let Some(lang) = lang else {
+        return XPathConfigResult::default();
+    };
+    if !matches!(lang, Lang::Java) {
+        return XPathConfigResult::default();
+    }
+
+    let mut configs: HashMap<SsaValue, XPathReceiverConfig> = HashMap::new();
+
+    // Pass 1 — direct effects from Call instructions in source order.
+    // `setXPathVariableResolver` updates the call's receiver in place;
+    // every call is treated as a resolver binding.  Checking the argument
+    // for null would require a const-prop fact, so this pass assumes the
+    // bound value is non-null (matching the XML parser-config setter
+    // semantics).
+    for block in &body.blocks {
+        for inst in block.body.iter() {
+            if let SsaOp::Call {
+                callee, receiver, ..
+            } = &inst.op
+            {
+                let suffix = callee.rsplit(['.', ':']).next().unwrap_or(callee);
+                if suffix == "setXPathVariableResolver"
+                    && let Some(rv) = receiver
+                {
+                    let entry = configs.entry(*rv).or_default();
+                    entry.has_resolver = true;
+                }
+            }
+        }
+    }
+
+    if configs.is_empty() {
+        return XPathConfigResult::default();
+    }
+
+    // Pass 2 — fixed-point propagation through copy assignments and
+    // phi joins.  The iteration count is capped at 6; in practice 2-3
+    // rounds suffice on intra-procedural shapes.
+    let _ = cfg; // CFG retained for parity with `xml_config`; reserved for
+    // future kwarg-driven seeds (e.g. constructor options).
+    for _ in 0..6 {
+        let mut changed = false;
+        for block in &body.blocks {
+            for inst in &block.phis {
+                if let SsaOp::Phi(operands) = &inst.op {
+                    let mut acc: Option<XPathReceiverConfig> = None;
+                    for (_, val) in operands {
+                        let cfg_val = configs.get(val).copied().unwrap_or_default();
+                        acc = Some(match acc {
+                            None => cfg_val,
+                            Some(prev) => prev.meet(&cfg_val),
+                        });
+                    }
+                    if let Some(joined) = acc
+                        && joined != XPathReceiverConfig::default()
+                    {
+                        let prev = configs.get(&inst.value).copied();
+                        if prev != Some(joined) {
+                            configs.insert(inst.value, joined);
+                            changed = true;
+                        }
+                    }
+                }
+            }
+            for inst in &block.body {
+                if let SsaOp::Assign(uses) = &inst.op
+                    && uses.len() == 1
+                    && let Some(src_cfg) = configs.get(&uses[0]).copied()
+                    && src_cfg != XPathReceiverConfig::default()
+                {
+                    let prev = configs.get(&inst.value).copied().unwrap_or_default();
+                    let new_cfg = prev.union(&src_cfg);
+                    if Some(new_cfg) != configs.get(&inst.value).copied() {
+                        configs.insert(inst.value, new_cfg);
+                        changed = true;
+                    }
+                }
+            }
+        }
+        if !changed {
+            break;
+        }
+    }
+
+    XPathConfigResult { configs }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn default_config_is_unparameterised() {
+        let c = XPathReceiverConfig::default();
+        assert!(!c.is_parameterised());
+    }
+
+    #[test]
+    fn has_resolver_marks_parameterised() {
+        let c = XPathReceiverConfig { has_resolver: true };
+        assert!(c.is_parameterised());
+    }
+
+    #[test]
+    fn meet_keeps_intersection() {
+        let a = XPathReceiverConfig { has_resolver: true };
+        let b = XPathReceiverConfig {
+            has_resolver: false,
+        };
+        let m = a.meet(&b);
+        assert!(!m.has_resolver);
+    }
+
+    #[test]
+    fn meet_both_set_keeps_set() {
+        let a = XPathReceiverConfig { has_resolver: true };
+        let b = XPathReceiverConfig { has_resolver: true };
+        let m = a.meet(&b);
+        assert!(m.has_resolver);
+    }
+
+    #[test]
+    fn xpath_safe_returns_false_without_receiver() {
+        let result = XPathConfigResult::default();
+        assert!(!xpath_safe(None, &result));
+    }
+
+    #[test]
+    fn xpath_safe_uses_receiver_config() {
+        let mut configs = HashMap::new();
+        configs.insert(SsaValue(7), XPathReceiverConfig { has_resolver: true });
+        let result = XPathConfigResult { configs };
+        assert!(xpath_safe(Some(SsaValue(7)), &result));
+        assert!(!xpath_safe(Some(SsaValue(8)), &result));
+    }
+}
--- a/src/summary/mod.rs
+++ b/src/summary/mod.rs
@ -49,7 +49,7 @@ pub struct SinkSite {
 impl SinkSite {
    /// Dedup key: two sites with the same `(file_rel, line, col, cap)`
    /// describe the same consumption and collapse on merge.
-    pub(crate) fn dedup_key(&self) -> (&str, u32, u32, u16) {
+    pub(crate) fn dedup_key(&self) -> (&str, u32, u32, u32) {
        (self.file_rel.as_str(), self.line, self.col, self.cap.bits())
    }

@ -277,18 +277,18 @@ pub struct FuncSummary {
    pub param_names: Vec<String>,

    // ── Taint behaviour ──────────────────────────────────────────────────
-    // Stored as raw `u16` so serde doesn't need to know about `bitflags`.
+    // Stored as raw `u32` so serde doesn't need to know about `bitflags`.
    /// Caps this function **introduces**, i.e. the return value carries
    /// freshly‑tainted data even if no argument was tainted.
-    pub source_caps: u16,
+    pub source_caps: u32,

    /// Caps this function **cleans**, passing tainted data through this
    /// function strips the corresponding bits.
-    pub sanitizer_caps: u16,
+    pub sanitizer_caps: u32,

    /// Caps this function **consumes unsafely**, calling it with tainted
    /// arguments that still carry these bits is a finding.
-    pub sink_caps: u16,
+    pub sink_caps: u32,

    /// Which parameter indices (0‑based) flow through to the return value.
    #[serde(default)]
@ -1163,7 +1163,7 @@ impl GlobalSummaries {
    /// Returns `(source_caps, sanitizer_caps, sink_caps, propagating_params)`
    /// per key.  Used by the SCC fixed-point loop to detect when an iteration
    /// has not changed any summary, i.e. convergence.
-    pub fn snapshot_caps(&self) -> HashMap<FuncKey, (u16, u16, u16, Vec<usize>)> {
+    pub fn snapshot_caps(&self) -> HashMap<FuncKey, (u32, u32, u32, Vec<usize>)> {
        self.by_key
            .iter()
            .map(|(k, s)| {
--- a/src/summary/ssa_summary.rs
+++ b/src/summary/ssa_summary.rs
@ -283,7 +283,7 @@ pub struct SsaFuncSummary {
    ///
    /// Default-empty (most functions don't field-mutate their params)
    /// and elided from serialised output via `skip_serializing_if` so
-    /// pre-Phase-5 summaries deserialise cleanly without migration.
+    /// older summaries without this field deserialise cleanly without migration.
    /// Built by extraction in `summary_extract.rs` when the per-body
    /// [`crate::pointer::PointsToFacts`] are available
    /// (`NYX_POINTER_ANALYSIS=1`); empty otherwise.
--- a/src/summary/tests.rs
+++ b/src/summary/tests.rs
@ -9,7 +9,7 @@ fn cap_sites(cap: Cap) -> SmallVec<[SinkSite; 1]> {
    smallvec![SinkSite::cap_only(cap)]
 }

-fn make(name: &str, src: u16, san: u16, sink: u16) -> FuncSummary {
+fn make(name: &str, src: u32, san: u32, sink: u32) -> FuncSummary {
    FuncSummary {
        name: name.into(),
        file_path: "test.rs".into(),
@ -263,7 +263,7 @@ fn lookup_same_lang_returns_all_matches() {
 }

 #[test]
-fn u16_caps_round_trip_serde() {
+fn cap_bits_round_trip_serde() {
    let summary = FuncSummary {
        name: "dangerous".into(),
        file_path: "test.rs".into(),
@ -292,9 +292,96 @@ fn u16_caps_round_trip_serde() {
    assert!(!json.contains("propagates_taint"));
 }

+/// Every new cap class persists across the serde JSON round-trip used
+/// for SQLite blob storage and the `/debug` endpoint.  Catches a
+/// width-mismatch (cap bits truncated to u16) as a hard fail rather than
+/// silent zeroing of the upper bits.
+#[test]
+fn new_cap_classes_round_trip_serde() {
+    let new_caps = Cap::LDAP_INJECTION
+        | Cap::XPATH_INJECTION
+        | Cap::HEADER_INJECTION
+        | Cap::OPEN_REDIRECT
+        | Cap::SSTI
+        | Cap::XXE
+        | Cap::PROTOTYPE_POLLUTION;
+
+    // Sanity: bit-width must accommodate every new cap.
+    assert_ne!(
+        new_caps.bits(),
+        0,
+        "every new cap must carry a non-zero bit"
+    );
+    assert_eq!(
+        new_caps.bits().count_ones(),
+        7,
+        "exactly seven bits must be set across the new caps"
+    );
+
+    // Bit collisions with existing caps would mask a finding.
+    let existing = Cap::ENV_VAR
+        | Cap::HTML_ESCAPE
+        | Cap::SHELL_ESCAPE
+        | Cap::URL_ENCODE
+        | Cap::JSON_PARSE
+        | Cap::FILE_IO
+        | Cap::FMT_STRING
+        | Cap::SQL_QUERY
+        | Cap::DESERIALIZE
+        | Cap::SSRF
+        | Cap::CODE_EXEC
+        | Cap::CRYPTO
+        | Cap::UNAUTHORIZED_ID
+        | Cap::DATA_EXFIL;
+    assert!(
+        (existing & new_caps).is_empty(),
+        "new caps must not collide"
+    );
+
+    let summary = FuncSummary {
+        name: "all_new_classes".into(),
+        file_path: "fixture.rs".into(),
+        lang: "rust".into(),
+        param_count: 0,
+        param_names: vec![],
+        source_caps: 0,
+        sanitizer_caps: 0,
+        sink_caps: new_caps.bits(),
+        propagating_params: vec![],
+        propagates_taint: false,
+        tainted_sink_params: vec![],
+        callees: vec![],
+        ..Default::default()
+    };
+
+    // serde JSON round-trip (the on-disk SQLite format).
+    let json = serde_json::to_string(&summary).unwrap();
+    let back: FuncSummary = serde_json::from_str(&json).unwrap();
+    assert_eq!(back.sink_caps, new_caps.bits());
+    assert!(back.sink_caps().contains(Cap::LDAP_INJECTION));
+    assert!(back.sink_caps().contains(Cap::PROTOTYPE_POLLUTION));
+
+    // Cap registry must surface a rule id for each new cap.
+    for cap in [
+        Cap::LDAP_INJECTION,
+        Cap::XPATH_INJECTION,
+        Cap::HEADER_INJECTION,
+        Cap::OPEN_REDIRECT,
+        Cap::SSTI,
+        Cap::XXE,
+        Cap::PROTOTYPE_POLLUTION,
+    ] {
+        let meta = crate::labels::cap_rule_meta(cap)
+            .unwrap_or_else(|| panic!("missing CAP_RULE_REGISTRY entry for {cap:?}"));
+        assert!(meta.rule_id.starts_with("taint-"));
+        assert!(!meta.title.is_empty());
+        assert!(!meta.description.is_empty());
+    }
+}
+
 #[test]
 fn backward_compat_u8_json_deserializes() {
-    // Old u8-range values still deserialize correctly into u16 fields
+    // Old u8-range values still deserialize correctly into u32 fields
    let json = r#"{
        "name": "old_func",
        "file_path": "legacy.py",
@ -948,6 +1035,8 @@ fn make_callee_body(
            type_facts: crate::ssa::type_facts::TypeFactResult {
                facts: std::collections::HashMap::new(),
            },
+            xml_parser_config: crate::ssa::xml_config::XmlParserConfigResult::default(),
+            xpath_config: crate::ssa::xpath_config::XPathConfigResult::default(),
            alias_result: crate::ssa::alias::BaseAliasResult::empty(),
            points_to: crate::ssa::heap::PointsToResult::empty(),
            module_aliases: std::collections::HashMap::new(),
@ -1413,7 +1502,7 @@ fn fs_with(
    arity: usize,
    kind: FuncKind,
    disambig: Option<u32>,
-    sink_bits: u16,
+    sink_bits: u32,
 ) -> (FuncKey, FuncSummary) {
    let key = FuncKey {
        lang: Lang::Java,
@ -1611,7 +1700,7 @@ fn interop_lookup_returns_none_when_disambig_none_matches_many() {
    // and only disambig distinguishes them, the relaxed interop lookup must
    // return None rather than picking arbitrarily.
    let mut gs = GlobalSummaries::new();
-    let mk = |disambig: u32, bits: u16| {
+    let mk = |disambig: u32, bits: u32| {
        let k = FuncKey {
            lang: Lang::Go,
            namespace: "lib.go".into(),
@ -2102,7 +2191,7 @@ fn method_summary(
    container: &str,
    name: &str,
    arity: usize,
-    sink_bits: u16,
+    sink_bits: u32,
 ) -> (FuncKey, FuncSummary) {
    fs_with(
        namespace,
@ -2119,7 +2208,7 @@ fn free_summary(
    namespace: &str,
    name: &str,
    arity: usize,
-    sink_bits: u16,
+    sink_bits: u32,
 ) -> (FuncKey, FuncSummary) {
    fs_with(
        namespace,
@ -2912,7 +3001,7 @@ fn legacy_summary(
    param_names: Vec<String>,
    kind: FuncKind,
    container: &str,
-    sink: u16,
+    sink: u32,
 ) -> FuncSummary {
    FuncSummary {
        name: name.into(),
@ -3778,7 +3867,7 @@ fn cross_file_devirt_does_not_union_unrelated_findbyids() {
    use crate::labels::Cap;
    use crate::symbol::FuncKey;

-    fn method_summary(name: &str, container: &str, file: &str, sink_caps: u16) -> FuncSummary {
+    fn method_summary(name: &str, container: &str, file: &str, sink_caps: u32) -> FuncSummary {
        FuncSummary {
            name: name.into(),
            file_path: file.into(),
@ -3989,7 +4078,7 @@ mod hierarchy_widened_tests {
        container: &str,
        name: &str,
        arity: usize,
-        sink_bits: u16,
+        sink_bits: u32,
        hierarchy_edges: Vec<(String, String)>,
    ) -> (FuncKey, FuncSummary) {
        let (key, mut summary) = fs_with(
--- a/src/taint/mod.rs
+++ b/src/taint/mod.rs
@ -580,9 +580,19 @@ pub(crate) fn analyse_file_with_lowered(
            f.source.index(),
            !f.path_validated,
            f.path_hash,
+            f.effective_sink_caps.bits(),
+        )
+    });
+    all_findings.dedup_by_key(|f| {
+        (
+            f.body_id,
+            f.sink,
+            f.source,
+            f.path_validated,
+            f.path_hash,
+            f.effective_sink_caps.bits(),
        )
    });
-    all_findings.dedup_by_key(|f| (f.body_id, f.sink, f.source, f.path_validated, f.path_hash));

    // 5. Assign stable finding IDs now that `body_id` has been set and
    //    the dedup has picked the final set of distinct flows.  The ID
@ -679,9 +689,118 @@ fn containment_order(bodies: &[BodyCfg]) -> Vec<usize> {
    order
 }

+/// Build a `var_name → TypeKind` map from a body's optimised SSA + type-fact
+/// result.  Used by [`analyse_multi_body`] to forward closure-captured types
+/// from a parent body into its children, so that bound-variable receiver
+/// idioms (`const c = ldap.createClient(...); function f() { c.search(...) }`)
+/// pick up `TypeKind::LdapClient` on the inner reference via the
+/// [`ssa_transfer::resolve_type_qualified_labels`] receiver scan.
+///
+/// Conflict policy: if the same `var_name` reaches multiple SSA values with
+/// distinct `TypeKind`s the entry is dropped — propagating an ambiguous type
+/// into a child body would fabricate facts, while dropping it just falls back
+/// to the existing structural resolution paths.
+fn extract_named_type_facts(
+    ssa: &crate::ssa::SsaBody,
+    type_facts: &crate::ssa::type_facts::TypeFactResult,
+) -> HashMap<String, crate::ssa::type_facts::TypeKind> {
+    use crate::ssa::type_facts::TypeKind;
+    let mut acc: HashMap<String, TypeKind> = HashMap::new();
+    let mut conflicts: HashSet<String> = HashSet::new();
+    for block in &ssa.blocks {
+        for inst in block.phis.iter().chain(block.body.iter()) {
+            let Some(name) = inst.var_name.as_deref() else {
+                continue;
+            };
+            if conflicts.contains(name) {
+                continue;
+            }
+            let Some(kind) = type_facts.get_type(inst.value) else {
+                continue;
+            };
+            if matches!(kind, TypeKind::Unknown) {
+                continue;
+            }
+            match acc.get(name) {
+                Some(existing) if existing != kind => {
+                    acc.remove(name);
+                    conflicts.insert(name.to_string());
+                }
+                Some(_) => {}
+                None => {
+                    acc.insert(name.to_string(), kind.clone());
+                }
+            }
+        }
+    }
+    acc
+}
+
+/// Inject parent-known closure-capture types into a per-body
+/// [`crate::ssa::type_facts::TypeFactResult`].
+///
+/// Scoped lowering ([`crate::ssa::lower_to_ssa_with_params`]) injects a
+/// `SsaOp::Param` (or `SsaOp::SelfParam`) at the entry block for every
+/// free / closure-captured variable read by the body.  The per-body type
+/// analysis can only seed declared formal-parameter types (via
+/// `BodyMeta.param_types`); free variables are left as `TypeKind::Unknown`
+/// because their definition lives in an enclosing body whose SSA is not
+/// in scope.
+///
+/// This pass walks the entry block's synthetic prologue and, for each
+/// external Param whose name resolves in `parent_var_types`, inserts the
+/// matching [`crate::ssa::type_facts::TypeFact`] into `type_facts.facts`.
+/// Additive for concrete facts: an existing non-`Unknown` fact (e.g. one
+/// produced by `BodyMeta.param_types` seeding for a real formal that shares
+/// a name) is never overwritten; only `Unknown` placeholders are replaced.
+fn inject_external_type_facts(
+    ssa: &crate::ssa::SsaBody,
+    type_facts: &mut crate::ssa::type_facts::TypeFactResult,
+    parent_var_types: &HashMap<String, crate::ssa::type_facts::TypeKind>,
+) {
+    use crate::ssa::ir::SsaOp;
+    use crate::ssa::type_facts::TypeFact;
+    if parent_var_types.is_empty() || ssa.blocks.is_empty() {
+        return;
+    }
+    for inst in ssa.blocks[0].body.iter() {
+        if !matches!(inst.op, SsaOp::Param { .. } | SsaOp::SelfParam) {
+            continue;
+        }
+        if type_facts.facts.contains_key(&inst.value) {
+            // `analyze_types_with_param_types` may have already typed this
+            // value via a non-Unknown entry from BodyMeta.param_types; in
+            // that case the formal-parameter declaration wins.  Note: the
+            // analysis seeds an Unknown placeholder for unparameterised
+            // Param ops, so we still need to override Unknown entries.
+            if !matches!(
+                type_facts.facts.get(&inst.value).map(|f| &f.kind),
+                Some(crate::ssa::type_facts::TypeKind::Unknown)
+            ) {
+                continue;
+            }
+        }
+        let Some(name) = inst.var_name.as_deref() else {
+            continue;
+        };
+        let Some(kind) = parent_var_types.get(name) else {
+            continue;
+        };
+        let nullable = matches!(kind, crate::ssa::type_facts::TypeKind::Null);
+        type_facts.facts.insert(
+            inst.value,
+            TypeFact {
+                kind: kind.clone(),
+                nullable,
+            },
+        );
+    }
+}
+
 /// Analyse a single body with an optional parent seed.
 ///
 /// Shared logic extracted from `analyse_multi_body` to avoid deep nesting.
+#[allow(clippy::type_complexity)]
 fn analyse_body_with_seed(
    body: &BodyCfg,
    lang: Lang,
@ -698,9 +817,11 @@ fn analyse_body_with_seed(
    seed: Option<&HashMap<ssa_transfer::BindingKey, crate::taint::domain::VarTaint>>,
    import_bindings: Option<&crate::cfg::ImportBindings>,
    cross_file_bodies: Option<&std::collections::HashMap<FuncKey, ssa_transfer::CalleeSsaBody>>,
+    parent_var_types: Option<&HashMap<String, crate::ssa::type_facts::TypeKind>>,
 ) -> (
    Vec<Finding>,
    Option<HashMap<ssa_transfer::BindingKey, crate::taint::domain::VarTaint>>,
+    Option<HashMap<String, crate::ssa::type_facts::TypeKind>>,
 ) {
    let cfg = &body.graph;
    let entry = body.entry;
@ -757,12 +878,21 @@ fn analyse_body_with_seed(

    match ssa_result {
        Ok(mut ssa_body) => {
-            let opt = crate::ssa::optimize_ssa_with_param_types(
+            let mut opt = crate::ssa::optimize_ssa_with_param_types(
                &mut ssa_body,
                cfg,
                Some(lang),
                &body.meta.param_types,
            );
+            // Forward parent-body type facts onto closure-captured Param ops
+            // before any consumer reads `opt.type_facts`.  This is the lever
+            // that makes bound-variable receiver idioms work in scoped bodies
+            // (`let c = ldap.createClient(...); function f() { c.search(...) }`)
+            // — without it the inner `c` SSA value stays Unknown because the
+            // per-body type-fact pass cannot see the enclosing definition.
+            if let Some(pvt) = parent_var_types {
+                inject_external_type_facts(&ssa_body, &mut opt.type_facts, pvt);
+            }
            if tracing::enabled!(tracing::Level::TRACE) {
                tracing::trace!(
                    func = body.meta.name.as_deref().unwrap_or("<anon>"),
@ -811,6 +941,8 @@ fn analyse_body_with_seed(
                receiver_seed: None,
                const_values: Some(&opt.const_values),
                type_facts: Some(&opt.type_facts),
+                xml_parser_config: Some(&opt.xml_parser_config),
+                xpath_config: Some(&opt.xpath_config),
                ssa_summaries,
                extra_labels,
                base_aliases: Some(&opt.alias_result),
@ -909,7 +1041,16 @@ fn analyse_body_with_seed(
                &transfer,
                body_id,
            );
-            (findings, Some(exit_state))
+            // Snapshot named TypeKinds so child bodies can pick up
+            // closure-captured types (e.g. an outer `LdapClient` flowing
+            // into an inner function via free-variable read).
+            let named_types = extract_named_type_facts(&ssa_body, &opt.type_facts);
+            let named_types = if named_types.is_empty() {
+                None
+            } else {
+                Some(named_types)
+            };
+            (findings, Some(exit_state), named_types)
        }
        Err(e) => {
            // SSA lowering produced no analyzable body.  We still surface
@ -929,7 +1070,7 @@ fn analyse_body_with_seed(
            // Drain the collector so the note does not bleed into the
            // next body (which will call reset on entry, but be explicit).
            let _ = ssa_transfer::take_body_engine_notes();
-            (Vec::new(), None)
+            (Vec::new(), None, None)
        }
    }
 }
@ -967,6 +1108,14 @@ fn analyse_multi_body(
        HashMap<ssa_transfer::BindingKey, crate::taint::domain::VarTaint>,
    > = HashMap::new();

+    // Per-body `var_name → TypeKind` snapshots, used to forward closure-
+    // captured types from parent bodies into their children's type-fact
+    // results.  Only populated when a body produces a non-empty set of
+    // typed named values, i.e. it has at least one named SSA value with
+    // a concrete `TypeKind` after optimisation.
+    let mut body_var_types: HashMap<BodyId, HashMap<String, crate::ssa::type_facts::TypeKind>> =
+        HashMap::new();
+
    // ── Pass 1: lexical containment propagation ──────────────────────
    for &idx in &order {
        let body = &file_cfg.bodies[idx];
@ -975,8 +1124,12 @@ fn analyse_multi_body(
            .meta
            .parent_body_id
            .and_then(|pid| body_exit_states.get(&pid));
+        let parent_var_types = body
+            .meta
+            .parent_body_id
+            .and_then(|pid| body_var_types.get(&pid));

-        let (findings, exit_state) = analyse_body_with_seed(
+        let (findings, exit_state, var_types) = analyse_body_with_seed(
            body,
            lang,
            namespace,
@ -990,6 +1143,7 @@ fn analyse_multi_body(
            parent_seed,
            import_bindings,
            cross_file_bodies,
+            parent_var_types,
        );
        tracing::debug!(
            body_id = body.meta.id.0,
@ -1003,6 +1157,9 @@ fn analyse_multi_body(
        if let Some(es) = exit_state {
            body_exit_states.insert(body.meta.id, es);
        }
+        if let Some(vt) = var_types {
+            body_var_types.insert(body.meta.id, vt);
+        }
    }

    // ── Pass 2: JS/TS iterative convergence ──────────────────────────
@ -1163,8 +1320,12 @@ fn analyse_multi_body(
                    .meta
                    .parent_body_id
                    .and_then(|pid| body_exit_states.get(&pid));
+                let parent_var_types = body
+                    .meta
+                    .parent_body_id
+                    .and_then(|pid| body_var_types.get(&pid));

-                let (findings, exit_state) = analyse_body_with_seed(
+                let (findings, exit_state, var_types) = analyse_body_with_seed(
                    body,
                    lang,
                    namespace,
@ -1178,11 +1339,15 @@ fn analyse_multi_body(
                    parent_seed,
                    import_bindings,
                    cross_file_bodies,
+                    parent_var_types,
                );
                // Phase-B: replace (not append) this body's findings
                // in the cache.  Previous rounds' findings for this
                // body are superseded by the new round's output.
                findings_by_body.insert(body.meta.id, findings);
+                if let Some(vt) = var_types {
+                    body_var_types.insert(body.meta.id, vt);
+                }
                if let Some(es) = exit_state {
                    // Phase-C Gauss-Seidel: immediately publish this
                    // body's filtered exit into `current_seed` and
@ -2073,6 +2238,8 @@ fn augment_summaries_with_child_sinks(
                receiver_seed: None,
                const_values: None,
                type_facts: None,
+                xml_parser_config: None,
+                xpath_config: None,
                ssa_summaries: Some(summaries),
                extra_labels: None,
                base_aliases: None,
@ -2135,6 +2302,8 @@ fn augment_summaries_with_child_sinks(
                    receiver_seed: None,
                    const_values: None,
                    type_facts: None,
+                    xml_parser_config: None,
+                    xpath_config: None,
                    ssa_summaries: Some(summaries),
                    extra_labels: None,
                    base_aliases: None,
--- a/src/taint/path_state.rs
+++ b/src/taint/path_state.rs
@ -30,6 +30,26 @@ pub enum PredicateKind {
    /// and the **false branch is the validated path**.  Use inverted polarity
    /// when applying branch predicates.
    ShellMetaValidated,
+    /// Inline relative-URL validation: `x.startsWith("/")` / `x.starts_with("/")`
+    /// / `x.startswith("/")` / `strpos(x, "/") === 0`.  The TRUE branch
+    /// constrains `x` to a relative path (no scheme, no `//host`), which is
+    /// the standard inline form of an open-redirect sanitiser when the
+    /// developer didn't extract a named helper.  Cap-aware: clears
+    /// [`crate::labels::Cap::OPEN_REDIRECT`] only on the validated branch
+    /// so non-redirect sinks downstream still fire on the residual taint.
+    /// Mirrors [`ShellMetaValidated`](Self::ShellMetaValidated) but with
+    /// non-inverted polarity (true branch is the validated path).
+    RelativeUrlValidated,
+    /// Inline URL-parse + host-allowlist validation:
+    /// `new URL(x).host === ALLOWED` (JS/TS),
+    /// `urlparse(x).netloc == ALLOWED` (Python),
+    /// `urlparse(x).hostname == ALLOWED` (Python).
+    /// The TRUE branch constrains the parsed URL's host to a developer-chosen
+    /// allowlist value, the canonical multi-statement open-redirect sanitiser
+    /// for absolute URLs.  Cap-aware: clears
+    /// [`crate::labels::Cap::OPEN_REDIRECT`] only on the validated branch so
+    /// non-redirect sinks downstream still fire on residual taint.
+    HostAllowlistValidated,
    /// Bounded-length rejection: `x.len() > N` / `x.length < N` with N >= 2.
    ///
    /// Commonly paired with `ShellMetaValidated` in OR-chain rejection
@ -178,6 +198,324 @@ fn is_metachar_regex_class(text: &str) -> bool {
    false
 }

+/// Check whether `text` is an inline relative-URL validation: a leading-
+/// slash check on a string variable.  Recognised shapes:
+///
+/// * `<X>.startsWith("/")` — JS/TS/Java/Kotlin
+/// * `<X>.starts_with("/")` — Rust
+/// * `<X>.startswith("/")` — Python
+/// * `strpos($X, "/") === 0` / `mb_strpos(...)` — PHP
+/// * `<X>[0] === "/"` / `<X>[0] == '/'` — JS/TS direct index
+///
+/// Negation prefixes (`!`, `not`) are NOT stripped; the caller's
+/// classification path handles those uniformly via the predicate
+/// polarity inversion machinery.
+fn is_leading_slash_check(text: &str) -> bool {
+    let lower = text.to_ascii_lowercase();
+    // Method-call form: `.startswith("/")` covers JS/TS/Java (`startsWith`
+    // lower-cases to `startswith`), Python (`startswith`), Rust
+    // (`starts_with` → `starts_with` after lower).  Keep the variants
+    // explicit so we don't miss the underscore form.
+    for method in [".startswith(", ".starts_with("] {
+        if let Some(idx) = lower.find(method) {
+            let args_start = idx + method.len();
+            if let Some(needle) = extract_first_string_arg(&lower[args_start..]) {
+                if needle == "/" {
+                    return true;
+                }
+            }
+        }
+    }
+    // PHP `strpos($x, "/") === 0` / `mb_strpos($x, "/") === 0` — leading-
+    // slash detection via offset-zero substring match.  Both equality
+    // forms (`===`, `==`) accepted; the `0` literal is the load-bearing
+    // bit.  Conservative: requires a trailing `=== 0` or `== 0`; a bare
+    // `strpos(...)` (truthy check) is not recognised.
+    for prefix in ["strpos(", "mb_strpos("] {
+        if let Some(start) = lower.find(prefix) {
+            let after = &lower[start + prefix.len()..];
+            // Find the closing paren of the strpos call.
+            let mut depth = 1usize;
+            let bytes = after.as_bytes();
+            let mut close = None;
+            let mut i = 0;
+            while i < bytes.len() {
+                match bytes[i] {
+                    b'(' => depth += 1,
+                    b')' => {
+                        depth -= 1;
+                        if depth == 0 {
+                            close = Some(i);
+                            break;
+                        }
+                    }
+                    _ => {}
+                }
+                i += 1;
+            }
+            let Some(close) = close else { continue };
+            let args = &after[..close];
+            // Need at least one comma so we have two args.
+            let mut depth = 0i32;
+            let mut comma = None;
+            for (j, ch) in args.char_indices() {
+                match ch {
+                    '(' | '[' | '{' => depth += 1,
+                    ')' | ']' | '}' => depth -= 1,
+                    ',' if depth == 0 => {
+                        comma = Some(j);
+                        break;
+                    }
+                    _ => {}
+                }
+            }
+            let Some(comma) = comma else { continue };
+            let second = args[comma + 1..].trim();
+            // Strip optional surrounding quotes (single or double).
+            let needle = second.trim_matches(|c: char| c == '"' || c == '\'');
+            if needle != "/" {
+                continue;
+            }
+            // Tail after the strpos `)` should compare against 0 with
+            // `===` / `==`.  Allow whitespace.
+            let tail = after[close + 1..].trim_start();
+            if let Some(rest) = tail.strip_prefix("===").or_else(|| tail.strip_prefix("==")) {
+                if rest.trim() == "0" {
+                    return true;
+                }
+            }
+        }
+    }
+    // Direct subscript form: `<X>[0] === '/'` / `<X>[0] == "/"`.
+    // Conservative: the literal `[0]` immediately followed by an
+    // equality op and a single-char `/` literal.
+    for op in ["===", "=="] {
+        let probe = format!("[0] {}", op);
+        if let Some(idx) = lower.find(&probe) {
+            let after = lower[idx + probe.len()..].trim_start();
+            if after.starts_with("'/'") || after.starts_with("\"/\"") {
+                return true;
+            }
+        }
+        // Without spaces around the operator: `[0]==='/'`.
+        let probe_tight = format!("[0]{}", op);
+        if let Some(idx) = lower.find(&probe_tight) {
+            let after = lower[idx + probe_tight.len()..].trim_start();
+            if after.starts_with("'/'") || after.starts_with("\"/\"") {
+                return true;
+            }
+        }
+    }
+    false
+}
+
+/// Check whether `text` is an inline URL-parse + host-allowlist validation.
+///
+/// Recognises the canonical multi-statement open-redirect sanitiser shapes:
+///
+/// * `new URL(<X>).host === ALLOWED` / `new URL(<X>).hostname === ALLOWED`
+///   / `new URL(<X>).origin === ALLOWED` (JS/TS) — accepts `==` and `===`.
+/// * `urlparse(<X>).netloc == ALLOWED` / `urlparse(<X>).hostname == ALLOWED`
+///   (Python `urllib.parse.urlparse` and the `urlparse.urlparse` legacy alias)
+///   — accepts `==`.
+/// * `urllib.parse.urlparse(<X>).netloc == ALLOWED` (qualified Python form).
+/// * `<parsed>.host_str() == ALLOWED` (Rust `url::Url::host_str()`).
+/// * `<parsed>.Host == ALLOWED` / `<parsed>.Hostname() == ALLOWED`
+///   (Go `*url.URL` — case-sensitive capital `H`).
+///
+/// The Rust/Go forms intentionally do not look for the parse call in the
+/// condition text — those parse on a separate line (`let parsed = Url::parse(x)?`,
+/// `parsed, err := url.Parse(x)`) and the validated branch then references
+/// `parsed` directly as the redirect target.  Distinctive accessor names
+/// (`.host_str()`, capital-`H` `.Host`/`.Hostname()`) gate the match so a bare
+/// `u.host == X` (lowercase, ambiguous) still falls through to `Comparison`.
+///
+/// The right-hand side may be a string literal or a bare identifier
+/// (`ALLOWED_HOST` / `cfg.allowed_origin`) — what matters is that the
+/// validation pins the parsed host to one fixed value, locking off the
+/// scheme/authority that would otherwise let the redirect leave the trusted
+/// origin.  The membership form
+/// `ALLOWED_HOSTS.includes(new URL(<X>).host)` / `urlparse(<X>).host in ALLOWED`
+/// is intentionally NOT recognised here; those forms fall through to
+/// `AllowlistCheck`, whose generic validated-must mechanic already clears
+/// every cap for the matched receiver / member token.
+///
+/// Negation prefixes are not stripped; the caller's polarity-inversion
+/// machinery handles `!`-wrapped forms uniformly.
+fn is_host_allowlist_check(text: &str) -> bool {
+    let lower = text.to_ascii_lowercase();
+    // Need an `==` / `!=` comparison (this also covers `===` / `!==`) so we know
+    // the host is compared to a specific value (not e.g. assigned, indexed, or used as a key).
+    if !(lower.contains("==") || lower.contains("!=")) {
+        return false;
+    }
+    let has_parse_call = lower.contains("new url(")
+        || lower.contains("urlparse(")
+        || lower.contains("url.parse(")
+        || lower.contains("urllib.parse.urlparse(");
+    if has_parse_call {
+        // Need a host-style accessor on the parse result.
+        return lower.contains(".host")
+            || lower.contains(".hostname")
+            || lower.contains(".netloc")
+            || lower.contains(".origin");
+    }
+    // Multi-statement form: parse happened on a prior line.  Match
+    // distinctive Rust/Go accessor names so we don't misclassify a
+    // generic `obj.host == X` field comparison.
+    //
+    //   Rust: `parsed.host_str() == Some("x")`
+    //   Go:   `parsed.Host == "x"` / `parsed.Hostname() == "x"`
+    //
+    // `.host_str()` is Rust-specific (lowercase-stable identifier).
+    // `.Host`/`.Hostname()` use case-sensitive capital `H` to avoid
+    // matching lowercase `u.host` (which the `host_allowlist_requires_parse_call`
+    // test explicitly excludes).
+    if lower.contains(".host_str(") {
+        return true;
+    }
+    if has_capital_host_accessor(text) {
+        return true;
+    }
+    false
+}
+
+/// Test whether `text` contains a Go-style capital-`H` URL host accessor:
+/// `.Host` (followed by optional whitespace and then `==`/`!=`) or `.Hostname(`.
+fn has_capital_host_accessor(text: &str) -> bool {
+    if text.contains(".Hostname(") {
+        return true;
+    }
+    let mut rest = text;
+    while let Some(pos) = rest.find(".Host") {
+        let after = &rest[pos + ".Host".len()..];
+        // Reject `.Hostname` (handled above) and any continuation that
+        // would make `.Host` part of a longer identifier (`.Hostess` etc.).
+        let next = after.chars().next();
+        let is_terminator = match next {
+            None => true,
+            Some(c) => !c.is_ascii_alphanumeric() && c != '_',
+        };
+        if is_terminator {
+            // Require an equality op directly after the accessor (leading
+            // whitespace allowed) so it's a comparison, not e.g. an assignment target.
+            let trimmed = after.trim_start();
+            if trimmed.starts_with("==") || trimmed.starts_with("!=") {
+                return true;
+            }
+        }
+        rest = after;
+    }
+    false
+}
+
+/// Extract the parse-call argument from a host-allowlist condition.
+///
+/// Inline form (single-statement parse + check, JS/TS/Python):
+/// recognises `new URL(<X>)`, `urlparse(<X>)`, `URL.parse(<X>)`,
+/// `urllib.parse.urlparse(<X>)`.  Returns `Some("X")` when the argument is a
+/// bare identifier (with optional `&` or PHP `$` sigil stripped).
+///
+/// Multi-statement form (Rust/Go): recognises the receiver of `.host_str()`
+/// or of the case-sensitive `.Host`/`.Hostname()` and returns that receiver
+/// identifier (the parsed-URL var), which is what downstream code redirects on.
+///
+/// Returns `None` for nested expressions / multi-arg calls so branch
+/// narrowing doesn't widen to a non-existent var.  Mirrors the conservative
+/// target shape used by [`extract_validation_target`].
+fn extract_host_allowlist_target(text: &str) -> Option<String> {
+    let lower = text.to_ascii_lowercase();
+    for probe in [
+        "new url(",
+        "urllib.parse.urlparse(",
+        "urlparse(",
+        "url.parse(",
+    ] {
+        if let Some(idx) = lower.find(probe) {
+            let args_start = idx + probe.len();
+            if args_start <= text.len() {
+                if let Some(first_arg) = first_call_arg(&text[args_start..]) {
+                    let first_arg = first_arg.strip_prefix('&').unwrap_or(first_arg).trim();
+                    let first_arg = first_arg.strip_prefix('$').unwrap_or(first_arg);
+                    if !first_arg.is_empty() && is_identifier(first_arg) {
+                        return Some(first_arg.to_string());
+                    }
+                }
+            }
+        }
+    }
+    // Multi-statement form: receiver of the host accessor is the
+    // parsed-URL var.  Walk the original text (case-sensitive for Go).
+    extract_host_accessor_receiver(text)
+}
+
+/// Walk `text` for `<receiver>.host_str(` (Rust), `<receiver>.Host` followed
+/// by `==`/`!=` (Go), or `<receiver>.Hostname(` (Go).  Returns `Some(receiver)`
+/// when the receiver is a bare identifier (optionally with a leading `&`
+/// borrow skipped, e.g. Rust `&parsed.host_str()`); `None` otherwise.
+fn extract_host_accessor_receiver(text: &str) -> Option<String> {
+    let probes: &[(&str, bool)] = &[
+        (".host_str(", false), // Rust, case-stable
+        (".Hostname(", false), // Go
+        (".Host", true),       // Go, requires `==`/`!=` after
+    ];
+    for (probe, requires_eq) in probes {
+        if let Some(idx) = text.find(probe) {
+            if *requires_eq {
+                let after = &text[idx + probe.len()..];
+                // Reject `.Hostname` (handled by its own probe) and any
+                // longer-identifier continuation.
+                if let Some(c) = after.chars().next()
+                    && (c.is_ascii_alphanumeric() || c == '_')
+                {
+                    continue;
+                }
+                let trimmed = after.trim_start();
+                if !(trimmed.starts_with("==") || trimmed.starts_with("!=")) {
+                    continue;
+                }
+            }
+            let before = &text[..idx];
+            // Receiver = trailing identifier of `before`, optionally
+            // preceded by `&` (Rust borrow).  `trailing_identifier` only
+            // returns `[A-Za-z0-9_]` runs, so `parsed.foo.host_str()` yields
+            // `foo`; the `.` / `::` check below never fires today and is kept as a defensive guard.
+            let recv = trailing_identifier(before)?;
+            if recv.contains('.') || recv.contains(':') {
+                return None;
+            }
+            return Some(recv);
+        }
+    }
+    None
+}
+
+/// Walk back from the end of `s` and return the trailing identifier token.
+///
+/// `&parsed` → `Some("parsed")`, `foo.bar` → `Some("bar")`,
+/// `()` → `None`.  Used by [`extract_host_accessor_receiver`] to pull the
+/// parsed-URL var out of `parsed.host_str() == ...`.
+fn trailing_identifier(s: &str) -> Option<String> {
+    let bytes = s.as_bytes();
+    let mut end = bytes.len();
+    while end > 0 {
+        let c = bytes[end - 1];
+        if c.is_ascii_alphanumeric() || c == b'_' {
+            end -= 1;
+        } else {
+            break;
+        }
+    }
+    if end == bytes.len() {
+        return None;
+    }
+    let ident = &s[end..];
+    if ident.is_empty() || ident.as_bytes()[0].is_ascii_digit() {
+        return None;
+    }
+    Some(ident.to_string())
+}
+
 /// Check whether `text` looks like a bounded-length rejection:
 /// `x.len() > N`, `x.len() < N`, `x.length >= N`, etc. where `N` is an
 /// integer literal >= 2.  Excludes `> 0` / `>= 1` / `< 1`, those are
@ -330,6 +668,28 @@ pub fn classify_condition(text: &str) -> PredicateKind {
        return PredicateKind::ShellMetaValidated;
    }

+    // ── Inline relative-URL validation ──────────────────────────────────
+    //
+    // `x.startsWith("/")` (JS/TS/Java/Kotlin), `x.starts_with("/")` (Rust),
+    // `x.startswith("/")` (Python), `strpos($x, "/") === 0` (PHP).
+    // The TRUE branch constrains `x` to a leading-slash relative path —
+    // the canonical inline open-redirect sanitiser.  Matched BEFORE
+    // AllowlistCheck (which would otherwise capture `.starts_with(`).
+    if is_leading_slash_check(text) {
+        return PredicateKind::RelativeUrlValidated;
+    }
+
+    // ── Host-allowlist URL-parse validation ─────────────────────────────
+    //
+    // `new URL(x).host === ALLOWED` (JS/TS), `urlparse(x).netloc == ALLOWED`
+    // (Python), etc.  Matched BEFORE AllowlistCheck and BEFORE the generic
+    // Comparison branch so the equality operator doesn't classify
+    // generically.  The membership form `ALLOWED.includes(new URL(x).host)`
+    // has no equality operator, so it is not matched here and reaches AllowlistCheck.
+    if is_host_allowlist_check(text) {
+        return PredicateKind::HostAllowlistValidated;
+    }
+
    // ── Allowlist / membership checks ────────────────────────────────────
    if lower.contains(".includes(")
        || lower.contains(".include?(")
@ -552,6 +912,19 @@ pub fn classify_condition_with_target(text: &str) -> (PredicateKind, Option<Stri
            let target = extract_validation_target(text);
            (kind, target)
        }
+        PredicateKind::RelativeUrlValidated => {
+            // Receiver of `.startsWith("/")` / `.startswith("/")` /
+            // `.starts_with("/")`, or first arg of `strpos($x, "/")`.
+            // Same machinery as ShellMetaValidated.
+            let target = extract_validation_target(text);
+            (kind, target)
+        }
+        PredicateKind::HostAllowlistValidated => {
+            // Argument of the parse call: `new URL(x).host` → `x`,
+            // `urlparse(x).netloc` → `x`.
+            let target = extract_host_allowlist_target(text);
+            (kind, target)
+        }
        PredicateKind::Comparison => {
            // `x === '/login'`, `x == 5`, `null != obj`, when exactly one
            // side is a literal, extract the identifier side as the target.
@ -1731,6 +2104,150 @@ mod tests {
        assert!(is_bounded_length_check("x.len() > 2"));
        assert!(is_bounded_length_check("x.len() <= 256"));
    }
+
+    // ── HostAllowlistValidated ────────────────────────────────────────────
+
+    #[test]
+    fn classify_host_allowlist_js_strict_eq() {
+        assert_eq!(
+            classify_condition("new URL(target).host === ALLOWED_HOST"),
+            PredicateKind::HostAllowlistValidated
+        );
+        assert_eq!(
+            classify_condition("new URL(target).hostname === \"trusted.example.com\""),
+            PredicateKind::HostAllowlistValidated
+        );
+        assert_eq!(
+            classify_condition("new URL(target).origin === ALLOWED_ORIGIN"),
+            PredicateKind::HostAllowlistValidated
+        );
+    }
+
+    #[test]
+    fn classify_host_allowlist_python_urlparse() {
+        assert_eq!(
+            classify_condition("urlparse(target).netloc == ALLOWED_HOST"),
+            PredicateKind::HostAllowlistValidated
+        );
+        assert_eq!(
+            classify_condition("urllib.parse.urlparse(target).hostname == \"trusted.example.com\""),
+            PredicateKind::HostAllowlistValidated
+        );
+    }
+
+    #[test]
+    fn target_host_allowlist_extracts_parse_arg_js() {
+        let (kind, target) =
+            classify_condition_with_target("new URL(target).host === ALLOWED_HOST");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("target"));
+    }
+
+    #[test]
+    fn target_host_allowlist_extracts_parse_arg_python() {
+        let (kind, target) =
+            classify_condition_with_target("urlparse(target).netloc == ALLOWED_HOST");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("target"));
+    }
+
+    #[test]
+    fn host_allowlist_requires_parse_call() {
+        // Bare `.host == X` without a parse call is not host-allowlist.
+        let kind = classify_condition("u.host == ALLOWED_HOST");
+        assert_ne!(kind, PredicateKind::HostAllowlistValidated);
+    }
+
+    #[test]
+    fn host_allowlist_requires_equality_op() {
+        // `new URL(x)` without an equality op is not host-allowlist.
+        let kind = classify_condition("new URL(target).host");
+        assert_ne!(kind, PredicateKind::HostAllowlistValidated);
+    }
+
+    // ── Multi-statement form: Rust `.host_str()` ──────────────────────────
+
+    #[test]
+    fn classify_host_allowlist_rust_host_str() {
+        assert_eq!(
+            classify_condition("parsed.host_str() == Some(\"trusted.example.com\")"),
+            PredicateKind::HostAllowlistValidated
+        );
+    }
+
+    #[test]
+    fn target_host_allowlist_rust_host_str_extracts_receiver() {
+        let (kind, target) =
+            classify_condition_with_target("parsed.host_str() == Some(\"trusted.example.com\")");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("parsed"));
+    }
+
+    #[test]
+    fn target_host_allowlist_rust_host_str_strips_amp_deref() {
+        // `&parsed.host_str()` is not idiomatic but we still pull out the
+        // receiver via the trailing-identifier walk.
+        let (kind, target) =
+            classify_condition_with_target("&parsed.host_str() == Some(\"trusted.com\")");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("parsed"));
+    }
+
+    // ── Multi-statement form: Go `.Host` / `.Hostname()` ──────────────────
+
+    #[test]
+    fn classify_host_allowlist_go_capital_host() {
+        assert_eq!(
+            classify_condition("parsed.Host == \"trusted.example.com\""),
+            PredicateKind::HostAllowlistValidated
+        );
+    }
+
+    #[test]
+    fn classify_host_allowlist_go_hostname_method() {
+        assert_eq!(
+            classify_condition("parsed.Hostname() == \"trusted.example.com\""),
+            PredicateKind::HostAllowlistValidated
+        );
+    }
+
+    #[test]
+    fn target_host_allowlist_go_extracts_receiver() {
+        let (kind, target) =
+            classify_condition_with_target("parsed.Host == \"trusted.example.com\"");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("parsed"));
+    }
+
+    #[test]
+    fn target_host_allowlist_go_hostname_extracts_receiver() {
+        let (kind, target) =
+            classify_condition_with_target("parsed.Hostname() == \"trusted.example.com\"");
+        assert_eq!(kind, PredicateKind::HostAllowlistValidated);
+        assert_eq!(target.as_deref(), Some("parsed"));
+    }
+
+    #[test]
+    fn host_allowlist_rejects_lowercase_host_field() {
+        // `.host` (lowercase) without a parse call must NOT match — that
+        // shape is too generic (could be any struct field named `host`).
+        let kind = classify_condition("u.host == ALLOWED_HOST");
+        assert_ne!(kind, PredicateKind::HostAllowlistValidated);
+    }
+
+    #[test]
+    fn host_allowlist_rejects_capital_host_without_eq() {
+        // `parsed.Host` used as a side-effect call argument, not a guard.
+        let kind = classify_condition("log(parsed.Host)");
+        assert_ne!(kind, PredicateKind::HostAllowlistValidated);
+    }
+
+    #[test]
+    fn host_allowlist_rejects_capital_host_substring_in_identifier() {
+        // `.Hostess` is NOT `.Host` — must not match.
+        let kind = classify_condition("party.Hostess == \"alice\"");
+        assert_ne!(kind, PredicateKind::HostAllowlistValidated);
+    }
 }

 #[cfg(test)]
--- a/src/taint/ssa_transfer/events.rs
+++ b/src/taint/ssa_transfer/events.rs
@ -277,7 +277,14 @@ pub fn ssa_events_to_findings(
    ssa: &SsaBody,
    cfg: &Cfg,
 ) -> Vec<crate::taint::Finding> {
-    type FindingDedupKey = (usize, usize, Option<(String, u32, u32)>);
+    // The dedup key includes `cap_bits` so the multi-gate dispatch can
+    // co-emit separate findings for distinct capabilities at the same
+    // (origin, sink) pair (e.g. PHP `header("Location: " . $url)` fires
+    // both HEADER_INJECTION and OPEN_REDIRECT, attributed by the gate
+    // filters' per-cap masks).  Single-cap call sites are unaffected:
+    // every event in that case carries the same `sink_caps`, so the key
+    // collapses identically with or without the extra component.
+    type FindingDedupKey = (usize, usize, Option<(String, u32, u32)>, u32);
    let mut findings = Vec::new();
    let mut seen: HashSet<FindingDedupKey> = HashSet::new();

@ -345,12 +352,14 @@ pub fn ssa_events_to_findings(
            .as_ref()
            .map(|l| (l.file_rel.clone(), l.line, l.col));
        for (val, caps, origins) in &event.tainted_values {
-            let cap_specificity = (*caps & event.sink_caps).bits().count_ones() as u8;
+            let effective_caps = event.sink_caps & *caps;
+            let cap_specificity = effective_caps.bits().count_ones() as u8;
            for origin in origins {
                if seen.insert((
                    origin.node.index(),
                    event.sink_node.index(),
                    loc_key.clone(),
+                    effective_caps.bits(),
                )) {
                    let hop_count = block_distance(ssa, origin.node, event.sink_node);
                    let flow_steps = reconstruct_flow_path(*val, origin, event.sink_node, ssa, cfg);
--- a/src/taint/ssa_transfer/inline.rs
+++ b/src/taint/ssa_transfer/inline.rs
@ -21,7 +21,7 @@ pub(super) const MAX_INLINE_BLOCKS: usize = 500;
 /// Compact cache key: per-arg-position cap bits (sorted, non-empty
 /// only). Origin identity is not part of the key.
 #[derive(Clone, Debug, PartialEq, Eq, Hash)]
-pub(crate) struct ArgTaintSig(pub(super) SmallVec<[(usize, u16); 4]>);
+pub(crate) struct ArgTaintSig(pub(super) SmallVec<[(usize, u32); 4]>);

 /// Call-site-adapted result of inline-analyzing a callee. Built fresh
 /// per call site so origins point to the current caller's chain.
@ -79,7 +79,7 @@ pub(crate) struct ReturnShape {
 impl CachedInlineShape {
    /// Cap bits of the return value, or zero if this shape records "no
    /// return taint".  Used by [`inline_cache_fingerprint`].
-    fn return_caps_bits(&self) -> u16 {
+    fn return_caps_bits(&self) -> u32 {
        self.0.as_ref().map(|s| s.caps.bits()).unwrap_or(0)
    }
 }
@ -101,7 +101,7 @@ pub(crate) fn inline_cache_clear_epoch(cache: &mut InlineCache) {
 #[allow(dead_code)]
 pub(crate) fn inline_cache_fingerprint(
    cache: &InlineCache,
-) -> HashMap<(FuncKey, ArgTaintSig), u16> {
+) -> HashMap<(FuncKey, ArgTaintSig), u32> {
    cache
        .iter()
        .map(|(k, v)| (k.clone(), v.return_caps_bits()))
--- a/src/taint/ssa_transfer/mod.rs
+++ b/src/taint/ssa_transfer/mod.rs
@ -105,6 +105,18 @@ pub struct SsaTaintTransfer<'a> {
    /// Type facts from type analysis.
    /// Used for type-aware sink filtering (e.g., suppress SQL injection for int-typed values).
    pub type_facts: Option<&'a crate::ssa::type_facts::TypeFactResult>,
+    /// XML-parser config facts. Used to suppress XXE bits at parse-class
+    /// sinks whose receiver was provably hardened
+    /// (`setFeature(FEATURE_SECURE_PROCESSING, true)`, etc.).  Strictly
+    /// additive: `None` falls back to the existing flat / gated XXE
+    /// classification.
+    pub xml_parser_config: Option<&'a crate::ssa::xml_config::XmlParserConfigResult>,
+    /// XPath-receiver config facts.  Used to suppress XPATH_INJECTION at
+    /// `evaluate` / `compile` sinks whose receiver was provably bound to
+    /// an `XPathVariableResolver` (parameterised-XPath shape).  Strictly
+    /// additive: `None` falls back to the existing flat / gated XPATH
+    /// classification.
+    pub xpath_config: Option<&'a crate::ssa::xpath_config::XPathConfigResult>,
    /// Precise per-function SSA summaries for intra-file callee resolution.
    /// Checked before legacy FuncSummary resolution.
    ///
@ -1207,6 +1219,85 @@ fn apply_branch_predicates(
        }
    }

+    // RelativeUrlValidated: TRUE branch is the validated path
+    // (`x.startsWith("/")` succeeded → `x` cannot redirect off-host).
+    // Cap-aware: clear `Cap::OPEN_REDIRECT` only; non-redirect sinks
+    // (XSS / SQLi / FILE_IO) downstream still fire on residual taint.
+    if kind == PredicateKind::RelativeUrlValidated && polarity {
+        for var in condition_vars {
+            let mut to_clear: SmallVec<[SsaValue; 4]> = SmallVec::new();
+            for (val, _) in state.values.iter() {
+                if let Some(name) = ssa
+                    .value_defs
+                    .get(val.0 as usize)
+                    .and_then(|vd| vd.var_name.as_deref())
+                {
+                    if name == var {
+                        to_clear.push(*val);
+                    }
+                }
+            }
+            for val in to_clear {
+                if let Some(taint) = state.get(val).cloned() {
+                    let new_caps = taint.caps & !Cap::OPEN_REDIRECT;
+                    if new_caps.is_empty() {
+                        state.remove(val);
+                    } else {
+                        state.set(
+                            val,
+                            VarTaint {
+                                caps: new_caps,
+                                origins: taint.origins,
+                                uses_summary: taint.uses_summary,
+                            },
+                        );
+                    }
+                }
+            }
+        }
+    }
+
+    // HostAllowlistValidated: TRUE branch is the validated path
+    // (`new URL(x).host === ALLOWED` succeeded → `x` cannot redirect off-host).
+    // Cap-aware: clear `Cap::OPEN_REDIRECT` only; non-redirect sinks downstream
+    // still fire on the residual taint caps.  Mirrors the
+    // `RelativeUrlValidated` handler exactly; the only difference is the
+    // recogniser shape (multi-statement parse + host comparison instead of
+    // inline leading-slash check).
+    if kind == PredicateKind::HostAllowlistValidated && polarity {
+        for var in condition_vars {
+            let mut to_clear: SmallVec<[SsaValue; 4]> = SmallVec::new();
+            for (val, _) in state.values.iter() {
+                if let Some(name) = ssa
+                    .value_defs
+                    .get(val.0 as usize)
+                    .and_then(|vd| vd.var_name.as_deref())
+                {
+                    if name == var {
+                        to_clear.push(*val);
+                    }
+                }
+            }
+            for val in to_clear {
+                if let Some(taint) = state.get(val).cloned() {
+                    let new_caps = taint.caps & !Cap::OPEN_REDIRECT;
+                    if new_caps.is_empty() {
+                        state.remove(val);
+                    } else {
+                        state.set(
+                            val,
+                            VarTaint {
+                                caps: new_caps,
+                                origins: taint.origins,
+                                uses_summary: taint.uses_summary,
+                            },
+                        );
+                    }
+                }
+            }
+        }
+    }
+
    // ShellMetaValidated: inverted polarity, the FALSE branch (no metachar
    // found) is the validated path; the TRUE branch is the rejection path.
    //
@ -2203,6 +2294,8 @@ fn inline_analyse_callee(
        receiver_seed: receiver_seed.as_ref(),
        const_values: Some(&callee_body.opt.const_values),
        type_facts: Some(&callee_body.opt.type_facts),
+        xml_parser_config: Some(&callee_body.opt.xml_parser_config),
+        xpath_config: Some(&callee_body.opt.xpath_config),
        ssa_summaries: transfer.ssa_summaries,
        extra_labels: transfer.extra_labels,
        base_aliases: Some(&callee_body.opt.alias_result),
@ -5891,6 +5984,34 @@ fn collect_block_events(
            sink_caps &= !Cap::DATA_EXFIL;
        }

+        // Receiver-type-incompatibility stripping.  When the receiver's type
+        // proves a structurally-attached cap cannot apply (e.g. an
+        // `LdapClient` receiver carrying an `HTML_ESCAPE` Sink label that was
+        // attached to the CFG node by a `*.send`/`*.json`-style suffix
+        // matcher), drop the offending bits *before* the type-qualified-
+        // resolution branch below, so that branch is reachable on the
+        // remaining empty `sink_caps` and can re-anchor a precise sink class
+        // (`LdapClient.search` → `Cap::LDAP_INJECTION`).  Both the
+        // flow-sensitive type from `path_env` and the static type from
+        // `type_facts` are consulted; the static path is what enables
+        // closure-captured receivers (parent body → child body via
+        // [`crate::taint::inject_external_type_facts`]) to participate.
+        if let SsaOp::Call {
+            receiver: Some(rv), ..
+        } = &inst.op
+        {
+            if let Some(ref env) = state.path_env {
+                if let Some(kind) = env.get(*rv).types.as_singleton() {
+                    sink_caps &= !receiver_incompatible_sink_caps(&kind, sink_caps);
+                }
+            }
+            if let Some(tf) = transfer.type_facts {
+                if let Some(kind) = tf.get_type(*rv) {
+                    sink_caps &= !receiver_incompatible_sink_caps(kind, sink_caps);
+                }
+            }
+        }
+
        // Type-qualified sink resolution: when normal sink resolution found nothing,
        // try using the receiver's inferred type to construct a qualified callee name.
        if sink_caps.is_empty() {
@ -5954,6 +6075,39 @@ fn collect_block_events(
            }
        }

+        // Add XXE on opt-in.  When the receiver was constructed
+        // with an explicit external-entity opt-in
+        // (`new XMLParser({ processEntities: true })`,
+        // `lxml.etree.XMLParser(resolve_entities=True)`), the subsequent
+        // `parser.parse(xml)` is an XXE flow even though the callee
+        // carries no flat XXE rule (fast-xml-parser and lxml are
+        // XXE-safe by default).  Runs BEFORE the empty check below so a
+        // previously-empty sink_caps becomes non-empty and downstream
+        // emission proceeds.  The complementary `xxe_safe` suppress path
+        // still runs after this; a call where the receiver was both
+        // opt-in AND later hardened by a setter results in net-zero
+        // (suppress strips what we added).
+        if let SsaOp::Call {
+            receiver: Some(rv),
+            callee: callee_str,
+            ..
+        } = &inst.op
+        {
+            if let Some(xc) = transfer.xml_parser_config {
+                if xc.is_unsafe_explicit(*rv) {
+                    let suffix = callee_str
+                        .rsplit(['.', ':'])
+                        .next()
+                        .unwrap_or(callee_str.as_str());
+                    // `feed` covers Python lxml incremental parsing
+                    // (`parser.feed(body); parser.close()`).
+                    if matches!(suffix, "parse" | "parseString" | "parseFromString" | "feed") {
+                        sink_caps |= Cap::XXE;
+                    }
+                }
+            }
+        }
+
        if sink_caps.is_empty() {
            // Callback pattern: check if callee has source_to_callback and the
            // actual callback argument has a matching param_to_sink.
@ -6055,17 +6209,89 @@ fn collect_block_events(
            continue;
        }

-        // Receiver type incompatibility check.
-        // If the receiver's flow-sensitive type proves it cannot be the kind
-        // of object the sink expects (e.g., Int receiver → not an HTTP response
-        // sink), strip those sink caps.
-        if let Some(ref env) = state.path_env {
+        if sink_caps.is_empty() {
+            continue;
+        }
+
+        // XXE config-fact suppression.  A parse-class sink whose receiver
+        // was provably hardened (`setFeature(FEATURE_SECURE_PROCESSING,
+        // true)`, `setExpandEntityReferences(false)`, etc.) is not an XXE
+        // flow. Drop the bit before downstream sink emission.  Runs after
+        // type-qualified resolution / module alias resolution so the XXE
+        // bit added by `XmlParser.parse` resolution is visible here.
+        if sink_caps.intersects(Cap::XXE) {
            if let SsaOp::Call {
                receiver: Some(rv), ..
            } = &inst.op
            {
-                if let Some(kind) = env.get(*rv).types.as_singleton() {
-                    sink_caps &= !receiver_incompatible_sink_caps(&kind, sink_caps);
+                if let Some(xc) = transfer.xml_parser_config {
+                    if crate::ssa::xml_config::xxe_safe(Some(*rv), xc) {
+                        sink_caps &= !Cap::XXE;
+                    }
+                }
+            }
+        }
+        if sink_caps.is_empty() {
+            continue;
+        }
+
+        // XPath resolver-binding suppression.  An XPath `evaluate` /
+        // `compile` sink whose receiver was provably bound to an
+        // `XPathVariableResolver` is treated as parameterised and the
+        // XPATH_INJECTION bit is stripped.  Mirrors the XXE config-fact
+        // shape above.  Only fires when the receiver also carries
+        // `TypeKind::XPathClient` (gates the suppression behind
+        // type-fact disambiguation so a generic `obj.evaluate(...)`
+        // matched as XPATH_INJECTION via name-only labelling does not
+        // accidentally clear the bit).
+        if sink_caps.intersects(Cap::XPATH_INJECTION) {
+            if let SsaOp::Call {
+                receiver: Some(rv), ..
+            } = &inst.op
+            {
+                if let Some(xpc) = transfer.xpath_config {
+                    let receiver_is_xpath = transfer
+                        .type_facts
+                        .and_then(|tf| tf.get_type(*rv))
+                        .map(|kind| matches!(kind, crate::ssa::type_facts::TypeKind::XPathClient))
+                        .unwrap_or(false);
+                    if receiver_is_xpath && crate::ssa::xpath_config::xpath_safe(Some(*rv), xpc) {
+                        sink_caps &= !Cap::XPATH_INJECTION;
+                    }
+                }
+            }
+        }
+        if sink_caps.is_empty() {
+            continue;
+        }
+
+        // Prototype-pollution suppression (flow-sensitive).
+        // `Object.create(null)` produces a `NullPrototypeObject`-typed
+        // value; subscript writes to such an object cannot pollute
+        // `Object.prototype` because there is no prototype chain.
+        // Receiver SsaValue is read off the synthetic `__index_set__`
+        // Call op; phi joins downgrade to `Unknown` via `TypeFact::meet`
+        // so an if/else where only one branch initialises with
+        // `Object.create(null)` keeps the PROTOTYPE_POLLUTION bit on
+        // the unsafe path.
+        if sink_caps.intersects(Cap::PROTOTYPE_POLLUTION) {
+            if let SsaOp::Call {
+                callee,
+                receiver: Some(rv),
+                ..
+            } = &inst.op
+            {
+                if callee == "__index_set__" {
+                    let receiver_is_null_proto = transfer
+                        .type_facts
+                        .and_then(|tf| tf.get_type(*rv))
+                        .map(|kind| {
+                            matches!(kind, crate::ssa::type_facts::TypeKind::NullPrototypeObject)
+                        })
+                        .unwrap_or(false);
+                    if receiver_is_null_proto {
+                        sink_caps &= !Cap::PROTOTYPE_POLLUTION;
+                    }
                }
            }
        }
@ -6436,7 +6662,7 @@ fn pick_primary_sink_sites(
        return Vec::new();
    };
    let mut out: Vec<SinkSite> = Vec::new();
-    let mut seen: HashSet<(String, u32, u32, u16)> = HashSet::new();
+    let mut seen: HashSet<(String, u32, u32, u32)> = HashSet::new();
    for (param_idx, sites) in param_to_sink_sites {
        let Some(arg_vals) = args.get(*param_idx) else {
            continue;
@ -6475,7 +6701,7 @@ fn pick_primary_sink_sites_from_resolved(
        return Vec::new();
    }
    let mut out: Vec<SinkSite> = Vec::new();
-    let mut seen: HashSet<(String, u32, u32, u16)> = HashSet::new();
+    let mut seen: HashSet<(String, u32, u32, u32)> = HashSet::new();
    for (_, sites) in param_to_sink_sites {
        for site in sites {
            if site.line == 0 {
@ -8127,13 +8353,36 @@ fn type_safe_for_taint_sink(kind: &crate::ssa::type_facts::TypeKind, cap: Cap) -
 fn receiver_incompatible_sink_caps(kind: &crate::ssa::type_facts::TypeKind, sink_caps: Cap) -> Cap {
    use crate::ssa::type_facts::TypeKind;
    let mut remove = Cap::empty();
-    // HTML_ESCAPE requires HTTP response-like receiver
-    if sink_caps.intersects(Cap::HTML_ESCAPE) {
+    // HTML_ESCAPE / OPEN_REDIRECT / HEADER_INJECTION all require an HTTP
+    // response-like receiver: each is a write-side rule that fires when
+    // attacker data is rendered into / written onto the response stream
+    // (`*.send` / `*.redirect` / `*.setHeader` / etc.).  Receivers proven
+    // to be a different class — directory-service connections (LDAP),
+    // database connections, file handles, in-memory collections, query-
+    // builder objects, URL values, HTTP clients (request-side), and so on
+    // — cannot host these sinks even when a same-named matcher
+    // (`*.send`, `*.set`, `*.append`) attaches the label by suffix.
+    let response_like_caps = Cap::HTML_ESCAPE | Cap::OPEN_REDIRECT | Cap::HEADER_INJECTION;
+    if sink_caps.intersects(response_like_caps) {
        match kind {
            TypeKind::HttpResponse => {}               // compatible
            TypeKind::Unknown | TypeKind::Object => {} // could be response
            _ => {
-                remove |= Cap::HTML_ESCAPE;
+                remove |= sink_caps & response_like_caps;
+            }
+        }
+    }
+    // LDAP_INJECTION strictly requires a directory-service receiver.
+    // Non-LdapClient receivers carrying the cap by accident (e.g. a
+    // generic `*.search` suffix matcher firing on a Vec/HashMap) get the
+    // bit stripped.  Unknown/Object stay untouched so type-fact gaps
+    // don't silently drop real sinks.
+    if sink_caps.intersects(Cap::LDAP_INJECTION) {
+        match kind {
+            TypeKind::LdapClient => {}                 // compatible
+            TypeKind::Unknown | TypeKind::Object => {} // could be ldap
+            _ => {
+                remove |= Cap::LDAP_INJECTION;
            }
        }
    }
@ -9364,7 +9613,7 @@ fn resolve_callee_full(
    }

    // 0.5) Cross-file SSA summaries (GlobalSummaries.ssa_by_key) with
-    // optional Phase-6 hierarchy fan-out.
+    // optional class-hierarchy fan-out.
    //
    // When the call has an authoritative receiver type AND
    // `GlobalSummaries::install_hierarchy` has been called AND the
@ -9468,7 +9717,7 @@ fn resolve_callee_full(
        }
    }

-    // 2) Global same-language (FuncSummary path) with Phase-6 hierarchy
+    // 2) Global same-language (FuncSummary path) with class-hierarchy
    // fan-out.  Same semantics as step 0.5 but on coarse FuncSummary
    // entries, the SSA path missed because no implementer had an SSA
    // summary, so we widen the FuncSummary lookup symmetrically.
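Note on the cap-aware clearing used by the `RelativeUrlValidated` and `HostAllowlistValidated` handlers in this file: both remove only the `OPEN_REDIRECT` bit from a variable's taint and drop the entry entirely when no bits remain. The following is a minimal sketch of that idea, not the project's code. It assumes a plain `u32` bitmask and a `HashMap` keyed by variable name in place of `Cap`, `SsaValue`, and the real taint state; the bit positions and the `clear_cap` helper are hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical bit positions, chosen only for this sketch.
const HTML_ESCAPE: u32 = 1 << 1;
const OPEN_REDIRECT: u32 = 1 << 17;

/// Clear one capability bit for `var`. If no bits remain, the variable is
/// no longer tainted for any sink class, so its entry is removed.
fn clear_cap(state: &mut HashMap<String, u32>, var: &str, cap: u32) {
    if let Some(caps) = state.get(var).copied() {
        let remaining = caps & !cap;
        if remaining == 0 {
            state.remove(var);
        } else {
            // Residual bits stay, so non-redirect sinks still fire downstream.
            state.insert(var.to_string(), remaining);
        }
    }
}
```

Under these assumptions, a value tainted for both HTML output and redirects keeps its `HTML_ESCAPE` bit after validation, while a value tainted only for redirects disappears from the state.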
--- a/src/taint/ssa_transfer/summary_extract.rs
+++ b/src/taint/ssa_transfer/summary_extract.rs
@ -246,6 +246,8 @@ pub fn extract_ssa_func_summary_full(
            receiver_seed: None,
            const_values: None,
            type_facts: local_type_facts_ref,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries,
            extra_labels: None,
            base_aliases: None,
@ -792,6 +794,8 @@ pub fn extract_ssa_func_summary_full(
            receiver_seed: None,
            const_values: None,
            type_facts: local_type_facts_ref,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries,
            extra_labels: None,
            base_aliases: None,
--- a/src/taint/ssa_transfer/tests.rs
+++ b/src/taint/ssa_transfer/tests.rs
@ -93,6 +93,8 @@ mod cross_file_tests {
                type_facts: crate::ssa::type_facts::TypeFactResult {
                    facts: std::collections::HashMap::new(),
                },
+                xml_parser_config: crate::ssa::xml_config::XmlParserConfigResult::default(),
+                xpath_config: crate::ssa::xpath_config::XPathConfigResult::default(),
                alias_result: crate::ssa::alias::BaseAliasResult::empty(),
                points_to: crate::ssa::heap::PointsToResult::empty(),
                module_aliases: std::collections::HashMap::new(),
@ -251,7 +253,7 @@ mod inline_cache_epoch_tests {
        ArgTaintSig(SmallVec::new())
    }

-    fn shape(caps_bits: u16) -> CachedInlineShape {
+    fn shape(caps_bits: u32) -> CachedInlineShape {
        CachedInlineShape(Some(ReturnShape {
            caps: Cap::from_bits_retain(caps_bits),
            internal_origins: SmallVec::new(),
@ -448,7 +450,7 @@ mod binding_key_tests {

    // ── seed_lookup ────────────────────────────────────────────────────

-    fn taint(caps: u16) -> VarTaint {
+    fn taint(caps: u32) -> VarTaint {
        VarTaint {
            caps: Cap::from_bits_truncate(caps),
            origins: smallvec![],
@ -989,6 +991,8 @@ mod goto_succ_propagation_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -1079,6 +1083,8 @@ mod goto_succ_propagation_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -1516,10 +1522,10 @@ mod receiver_candidates_field_proj_tests {

    #[test]
    fn field_proj_receiver_walks_to_typed_root_in_go() {
-        // Go is not Rust, so pre-Phase-4 the candidate walk would have
-        // returned ONLY the immediate receiver (v2 = FieldProj). With
-        // We walk through FieldProj.receiver to recover v0 (the
-        // typed root `c`).
+        // Go is not Rust, so before the FieldProj walk fix the candidate
+        // walk would have returned ONLY the immediate receiver
+        // (v2 = FieldProj). We now walk through FieldProj.receiver to
+        // recover v0 (the typed root `c`).
        let body = body_with_field_proj_chain();
        let cands =
            super::super::receiver_candidates_for_type_lookup(SsaValue(2), Some(&body), Lang::Go);
@ -1709,7 +1715,7 @@ mod fanout_merge_tests {
        ];

        let m = merge_resolved_summaries_fanout(a, b);
-        let mut sorted: Vec<(usize, u16)> = m
+        let mut sorted: Vec<(usize, u32)> = m
            .param_to_sink
            .iter()
            .map(|(i, c)| (*i, c.bits()))
@ -2032,6 +2038,8 @@ mod field_write_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2114,6 +2122,8 @@ mod field_write_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2180,6 +2190,8 @@ mod field_write_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2324,6 +2336,8 @@ mod field_write_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2420,6 +2434,8 @@ mod container_elem_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2697,6 +2713,8 @@ mod container_elem_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -2833,6 +2851,8 @@ mod container_elem_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -3387,6 +3407,8 @@ mod field_taint_origin_cap_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
@ -3673,6 +3695,8 @@ mod pointer_lattice_worklist_tests {
            receiver_seed: None,
            const_values: None,
            type_facts: None,
+            xml_parser_config: None,
+            xpath_config: None,
            ssa_summaries: None,
            extra_labels: None,
            base_aliases: None,
--- a/src/taint/tests.rs
+++ b/src/taint/tests.rs
@ -45,6 +45,8 @@ fn ssa_analyse_rust(src: &[u8]) -> Vec<Finding> {
        receiver_seed: None,
        const_values: None,
        type_facts: None,
+        xml_parser_config: None,
+        xpath_config: None,
        ssa_summaries: None,
        extra_labels: None,
        base_aliases: None,
@ -1669,10 +1671,10 @@ fn cpp_builder_chain_const_host_silent() {

 /// inline member-function bodies inside a
 /// `class_specifier` must be extracted as separate functions and
-/// intra-file calls must resolve to their bodies. Pre-Phase-4, the
-/// `class_specifier` AST kind was unmapped in cpp KINDS, so the CFG
-/// walker treated the entire class as a leaf `Seq` node and never
-/// descended into inline methods.
+/// intra-file calls must resolve to their bodies. Before the cpp KINDS
+/// fix the `class_specifier` AST kind was unmapped, so the CFG walker
+/// treated the entire class as a leaf `Seq` node and never descended
+/// into inline methods.
 #[test]
 fn cpp_inline_class_method_resolves() {
    let src = b"#include <cstdlib>\nclass Inner {\npublic:\n  void run(const char* arg) { std::system(arg); }\n};\nint main() {\n  char* input = std::getenv(\"X\");\n  Inner inner;\n  inner.run(input);\n  return 0;\n}\n";
@ -3768,6 +3770,8 @@ fn assert_ssa_integration(src: &[u8]) {
        receiver_seed: None,
        const_values: None,
        type_facts: None,
+        xml_parser_config: None,
+        xpath_config: None,
        ssa_summaries: None,
        extra_labels: None,
        base_aliases: None,
@ -3904,6 +3908,8 @@ fn integ_php_echo_simple_var() {
        receiver_seed: None,
        const_values: None,
        type_facts: None,
+        xml_parser_config: None,
+        xpath_config: None,
        ssa_summaries: None,
        extra_labels: None,
        base_aliases: None,
@ -3972,6 +3978,8 @@ fn integ_c_curl_handle_ssrf() {
        receiver_seed: None,
        const_values: None,
        type_facts: None,
+        xml_parser_config: None,
+        xpath_config: None,
        ssa_summaries: None,
        extra_labels: None,
        base_aliases: None,
--- a/src/utils/config.rs
+++ b/src/utils/config.rs
@ -74,6 +74,14 @@ pub enum CapName {
    Crypto,
    /// Request-bound identifier not yet ownership-checked.
    UnauthorizedId,
+    DataExfil,
+    LdapInjection,
+    XpathInjection,
+    HeaderInjection,
+    OpenRedirect,
+    Ssti,
+    Xxe,
+    PrototypePollution,
    All,
 }

@ -94,6 +102,14 @@ impl CapName {
            Self::CodeExec => Cap::CODE_EXEC,
            Self::Crypto => Cap::CRYPTO,
            Self::UnauthorizedId => Cap::UNAUTHORIZED_ID,
+            Self::DataExfil => Cap::DATA_EXFIL,
+            Self::LdapInjection => Cap::LDAP_INJECTION,
+            Self::XpathInjection => Cap::XPATH_INJECTION,
+            Self::HeaderInjection => Cap::HEADER_INJECTION,
+            Self::OpenRedirect => Cap::OPEN_REDIRECT,
+            Self::Ssti => Cap::SSTI,
+            Self::Xxe => Cap::XXE,
+            Self::PrototypePollution => Cap::PROTOTYPE_POLLUTION,
            Self::All => Cap::all(),
        }
    }
@ -115,6 +131,14 @@ impl fmt::Display for CapName {
            Self::CodeExec => write!(f, "code_exec"),
            Self::Crypto => write!(f, "crypto"),
            Self::UnauthorizedId => write!(f, "unauthorized_id"),
+            Self::DataExfil => write!(f, "data_exfil"),
+            Self::LdapInjection => write!(f, "ldap_injection"),
+            Self::XpathInjection => write!(f, "xpath_injection"),
+            Self::HeaderInjection => write!(f, "header_injection"),
+            Self::OpenRedirect => write!(f, "open_redirect"),
+            Self::Ssti => write!(f, "ssti"),
+            Self::Xxe => write!(f, "xxe"),
+            Self::PrototypePollution => write!(f, "prototype_pollution"),
            Self::All => write!(f, "all"),
        }
    }
@ -137,11 +161,21 @@ impl FromStr for CapName {
            "code_exec" => Ok(Self::CodeExec),
            "crypto" => Ok(Self::Crypto),
            "unauthorized_id" => Ok(Self::UnauthorizedId),
+            "data_exfil" | "data_exfiltration" => Ok(Self::DataExfil),
+            "ldap_injection" | "ldapi" => Ok(Self::LdapInjection),
+            "xpath_injection" | "xpathi" => Ok(Self::XpathInjection),
+            "header_injection" | "crlf" | "response_splitting" => Ok(Self::HeaderInjection),
+            "open_redirect" | "redirect" => Ok(Self::OpenRedirect),
+            "ssti" | "template_injection" => Ok(Self::Ssti),
+            "xxe" => Ok(Self::Xxe),
+            "prototype_pollution" | "proto_pollution" => Ok(Self::PrototypePollution),
            "all" => Ok(Self::All),
            _ => Err(format!(
                "invalid cap name: {s:?} (expected env_var, html_escape, shell_escape, \
                 url_encode, json_parse, file_io, fmt_string, sql_query, deserialize, \
-                 ssrf, code_exec, crypto, unauthorized_id, all)"
+                 ssrf, code_exec, crypto, unauthorized_id, data_exfil, ldap_injection, \
+                 xpath_injection, header_injection, open_redirect, ssti, xxe, \
+                 prototype_pollution, all)"
            )),
        }
    }
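Note on the alias handling added to `FromStr for CapName`: several spellings map to the same variant (for example `crlf` and `response_splitting` both select `HeaderInjection`). The sketch below reproduces that pattern on a reduced, hypothetical `CapNameSketch` enum with three variants. It matches exact strings only; whether the real implementation normalises case or whitespace first is not visible in this hunk, so none is assumed here.

```rust
use std::str::FromStr;

// Reduced stand-in for `CapName`; only three of the new variants are shown.
#[derive(Debug, PartialEq)]
enum CapNameSketch {
    HeaderInjection,
    OpenRedirect,
    Xxe,
}

impl FromStr for CapNameSketch {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            // Aliases copied from the hunk above; each arm lists every
            // accepted spelling for one variant.
            "header_injection" | "crlf" | "response_splitting" => Ok(Self::HeaderInjection),
            "open_redirect" | "redirect" => Ok(Self::OpenRedirect),
            "xxe" => Ok(Self::Xxe),
            _ => Err(format!("invalid cap name: {s:?}")),
        }
    }
}
```

With this shape, `"crlf".parse::<CapNameSketch>()` and `"header_injection".parse::<CapNameSketch>()` yield the same variant, and an unknown name returns an `Err` carrying the offending string.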
--- a/tests/fixtures/cross_file_js_redirect/expectations.json
+++ b/tests/fixtures/cross_file_js_redirect/expectations.json
@ -1,6 +1,6 @@
 {
  "required_findings": [
-    { "id_prefix": "taint-unsanitised-flow", "min_count": 1 }
+    { "id_prefix": "taint-open-redirect", "min_count": 1 }
  ],
  "forbidden_findings": [],
  "noise_budget": {
--- a/tests/fixtures/header_injection/go/safe_set_header.go
+++ b/tests/fixtures/header_injection/go/safe_set_header.go
@ -0,0 +1,18 @@
+// Safe: query value routed through the project-local `stripCRLF` helper
+// before being written to the response header.
+package main
+
+import (
+	"net/http"
+	"strings"
+)
+
+func stripCRLF(raw string) string {
+	return strings.ReplaceAll(strings.ReplaceAll(raw, "\r", ""), "\n", "")
+}
+
+func handler(w http.ResponseWriter, r *http.Request) {
+	lang := r.URL.Query().Get("lang")
+	safe := stripCRLF(lang)
+	w.Header().Set("X-Lang", safe)
+}
--- a/tests/fixtures/header_injection/go/unsafe_set_header.go
+++ b/tests/fixtures/header_injection/go/unsafe_set_header.go
@ -0,0 +1,12 @@
+// Unsafe: net/http `ResponseWriter.Header().Set` receives a value built from
+// `r.URL.Query().Get`.  HEADER_INJECTION fires on the value argument.
+package main
+
+import (
+	"net/http"
+)
+
+func handler(w http.ResponseWriter, r *http.Request) {
+	lang := r.URL.Query().Get("lang")
+	w.Header().Set("X-Lang", lang)
+}
--- a/tests/fixtures/header_injection/java/SafeSetHeader.java
+++ b/tests/fixtures/header_injection/java/SafeSetHeader.java
@ -0,0 +1,16 @@
+// Safe: request parameter routed through the project-local `stripCRLF`
+// helper before being written to the response header.
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+
+public class SafeSetHeader {
+    public static String stripCRLF(String raw) {
+        return raw.replace("\r", "").replace("\n", "");
+    }
+
+    public void handle(HttpServletRequest req, HttpServletResponse res) {
+        String lang = req.getParameter("lang");
+        String safe = stripCRLF(lang);
+        res.setHeader("X-Lang", safe);
+    }
+}
--- a/tests/fixtures/header_injection/java/UnsafeSetHeader.java
+++ b/tests/fixtures/header_injection/java/UnsafeSetHeader.java
@ -0,0 +1,11 @@
+// Unsafe: HttpServletResponse.setHeader receives a value built from a
+// request parameter.  HEADER_INJECTION fires on the value argument.
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+
+public class UnsafeSetHeader {
+    public void handle(HttpServletRequest req, HttpServletResponse res) {
+        String lang = req.getParameter("lang");
+        res.setHeader("X-Lang", lang);
+    }
+}
--- a/tests/fixtures/header_injection/javascript/safe_set_header.js
+++ b/tests/fixtures/header_injection/javascript/safe_set_header.js
@ -0,0 +1,14 @@
+// Safe: req.query.lang routed through the project-local `stripCRLF` helper
+// before being written to the response header.
+function stripCRLF(raw) {
+    return raw.replace(/[\r\n]/g, '');
+}
+
+function handler(req, res) {
+    const lang = req.query.lang;
+    const safe = stripCRLF(lang);
+    res.setHeader('X-Lang', safe);
+    res.end();
+}
+
+module.exports = handler;
--- a/tests/fixtures/header_injection/javascript/safe_subscript_set.js
+++ b/tests/fixtures/header_injection/javascript/safe_subscript_set.js
@ -0,0 +1,14 @@
+// Safe: req.query.lang routed through the project-local `stripCRLF` helper
+// (a registered HEADER_INJECTION sanitizer) before the subscript-set, so
+// taint-header-injection stays clean.
+function stripCRLF(raw) {
+    return raw.replace(/[\r\n]/g, '');
+}
+
+function handler(req, res) {
+    const lang = req.query.lang;
+    res.headers["X-Forwarded-By"] = stripCRLF(lang);
+    res.end();
+}
+
+module.exports = handler;
--- a/tests/fixtures/header_injection/javascript/unsafe_set_header.js
+++ b/tests/fixtures/header_injection/javascript/unsafe_set_header.js
@ -0,0 +1,9 @@
+// Unsafe: Express `res.setHeader` receives a value built from req.query.
+// HEADER_INJECTION fires on the value argument.
+function handler(req, res) {
+    const lang = req.query.lang;
+    res.setHeader('X-Lang', lang);
+    res.end();
+}
+
+module.exports = handler;
--- a/tests/fixtures/header_injection/javascript/unsafe_subscript_set.js
+++ b/tests/fixtures/header_injection/javascript/unsafe_subscript_set.js
@ -0,0 +1,11 @@
+// Unsafe: tainted req.query value flows into the bare-subscript header set
+// `res.headers["X-Forwarded-By"] = lang`.  The LHS-subscript classification
+// path matches `res.headers` as a HEADER_INJECTION sink so this form fires
+// alongside the explicit `setHeader` / `res.set` method-call shapes.
+function handler(req, res) {
+    const lang = req.query.lang;
+    res.headers["X-Forwarded-By"] = lang;
+    res.end();
+}
+
+module.exports = handler;
--- a/tests/fixtures/header_injection/php/safe_set_header.php
+++ b/tests/fixtures/header_injection/php/safe_set_header.php
@ -0,0 +1,10 @@
+<?php
+// Safe: $_GET['lang'] routed through the project-local `strip_crlf` helper
+// before concatenation.
+function strip_crlf($raw) {
+    return str_replace(["\r", "\n"], ["", ""], $raw);
+}
+
+$lang = $_GET['lang'];
+$safe = strip_crlf($lang);
+header("X-Lang: " . $safe);
--- a/tests/fixtures/header_injection/php/unsafe_set_header.php
+++ b/tests/fixtures/header_injection/php/unsafe_set_header.php
@ -0,0 +1,6 @@
+<?php
+// Unsafe: $_GET['lang'] concatenated into a `header()` line.  The bare
+// `header` matcher (exact-match sigil) fires on the call.  Tainted input
+// without `\r\n` stripping permits response splitting.
+$lang = $_GET['lang'];
+header("X-Lang: " . $lang);
--- a/tests/fixtures/header_injection/python/safe_set_header.py
+++ b/tests/fixtures/header_injection/python/safe_set_header.py
@ -0,0 +1,15 @@
+# Safe: request arg routed through `strip_crlf` before being added to the
+# response headers.
+from flask import request, make_response
+
+
+def strip_crlf(raw):
+    return raw.replace("\r", "").replace("\n", "")
+
+
+def handler():
+    lang = request.args.get("lang")
+    safe = strip_crlf(lang)
+    resp = make_response("ok")
+    resp.headers.add("X-Lang", safe)
+    return resp
--- a/tests/fixtures/header_injection/python/safe_subscript_set.py
+++ b/tests/fixtures/header_injection/python/safe_subscript_set.py
@ -0,0 +1,15 @@
+# Safe: request arg routed through `strip_crlf` (a registered
+# HEADER_INJECTION sanitizer) before the subscript-set, so
+# taint-header-injection stays clean.
+from flask import request, make_response
+
+
+def strip_crlf(raw):
+    return raw.replace("\r", "").replace("\n", "")
+
+
+def handler():
+    lang = request.args.get("lang")
+    response = make_response("ok")
+    response.headers["X-Forwarded-By"] = strip_crlf(lang)
+    return response
--- a/tests/fixtures/header_injection/python/unsafe_set_header.py
+++ b/tests/fixtures/header_injection/python/unsafe_set_header.py
@ -0,0 +1,10 @@
+# Unsafe: Flask response.headers.add receives a value built from request
+# args.  HEADER_INJECTION fires on the value argument.
+from flask import request, make_response
+
+
+def handler():
+    lang = request.args.get("lang")
+    resp = make_response("ok")
+    resp.headers.add("X-Lang", lang)
+    return resp
--- a/tests/fixtures/header_injection/python/unsafe_subscript_set.py
+++ b/tests/fixtures/header_injection/python/unsafe_subscript_set.py
@ -0,0 +1,13 @@
+# Unsafe: tainted request value flows into the bare-subscript header set
+# `response.headers["X-Forwarded-By"] = lang`.  The LHS-subscript
+# classification path matches `response.headers` / `resp.headers` as a
+# HEADER_INJECTION sink so this form fires alongside the explicit
+# `headers.add` / `set_cookie` method-call shapes.
+from flask import request, make_response
+
+
+def handler():
+    lang = request.args.get("lang")
+    response = make_response("ok")
+    response.headers["X-Forwarded-By"] = lang
+    return response
--- a/tests/fixtures/header_injection/ruby/safe_subscript_set.rb
+++ b/tests/fixtures/header_injection/ruby/safe_subscript_set.rb
@ -0,0 +1,7 @@
+# Safe: tainted request value routed through `strip_crlf` (a registered
+# HEADER_INJECTION sanitizer) before the subscript-set, so taint-header-injection
+# stays clean.
+def handle(params, response)
+  lang = params["lang"]
+  response.headers["X-Forwarded-By"] = strip_crlf(lang)
+end
--- a/tests/fixtures/header_injection/ruby/unsafe_subscript_set.rb
+++ b/tests/fixtures/header_injection/ruby/unsafe_subscript_set.rb
@ -0,0 +1,9 @@
+# Unsafe: tainted request value flows into the bare-subscript header set
+# `response.headers["X-Forwarded-By"] = lang`.  The LHS-subscript
+# classification path matches `response.headers` as a HEADER_INJECTION
+# sink so this form fires alongside the explicit `set_header` /
+# `add_header` method-call shapes.
+def handle(params, response)
+  lang = params["lang"]
+  response.headers["X-Forwarded-By"] = lang
+end
--- a/tests/fixtures/header_injection/rust/safe_set_header.rs
+++ b/tests/fixtures/header_injection/rust/safe_set_header.rs
@ -0,0 +1,14 @@
+// Safe: env value routed through the project-local `strip_crlf` helper
+// before being written to the response header.
+use std::env;
+
+fn strip_crlf(raw: &str) -> String {
+    raw.replace('\r', "").replace('\n', "")
+}
+
+fn handler(response: &mut http::Response<()>) {
+    let lang = env::var("LANG").unwrap_or_default();
+    let safe = strip_crlf(&lang);
+    let value = http::HeaderValue::from_str(&safe).unwrap();
+    response.headers_mut().insert("X-Lang", value);
+}
--- a/tests/fixtures/header_injection/rust/unsafe_set_header.rs
+++ b/tests/fixtures/header_injection/rust/unsafe_set_header.rs
@ -0,0 +1,9 @@
+// Unsafe: tainted env value flows into `response.headers_mut().insert`.
+// HEADER_INJECTION fires on the value argument.
+use std::env;
+
+fn handler(response: &mut http::Response<()>) {
+    let lang = env::var("LANG").unwrap_or_default();
+    let value = http::HeaderValue::from_str(&lang).unwrap();
+    response.headers_mut().insert("X-Lang", value);
+}
--- a/tests/fixtures/header_injection/typescript/safe_set_header.ts
+++ b/tests/fixtures/header_injection/typescript/safe_set_header.ts
@ -0,0 +1,12 @@
+// Safe: req.query.lang routed through `stripCRLF` before being written to
+// the response header.
+function stripCRLF(raw: string): string {
+    return raw.replace(/[\r\n]/g, '');
+}
+
+export function handler(req: any, res: any): void {
+    const lang: string = req.query.lang;
+    const safe: string = stripCRLF(lang);
+    res.setHeader('X-Lang', safe);
+    res.end();
+}
--- a/tests/fixtures/header_injection/typescript/safe_subscript_set.ts
+++ b/tests/fixtures/header_injection/typescript/safe_subscript_set.ts
@ -0,0 +1,12 @@
+// Safe: req.query.lang routed through the project-local `stripCRLF` helper
+// (a registered HEADER_INJECTION sanitizer) before the subscript-set, so
+// taint-header-injection stays clean.
+function stripCRLF(raw: string): string {
+    return raw.replace(/[\r\n]/g, '');
+}
+
+export function handler(req: any, res: any): void {
+    const lang: string = req.query.lang;
+    res.headers["X-Forwarded-By"] = stripCRLF(lang);
+    res.end();
+}
--- a/tests/fixtures/header_injection/typescript/unsafe_set_header.ts
+++ b/tests/fixtures/header_injection/typescript/unsafe_set_header.ts
@ -0,0 +1,7 @@
+// Unsafe: Express `res.setHeader` receives a value built from req.query.
+// HEADER_INJECTION fires on the value argument.
+export function handler(req: any, res: any): void {
+    const lang: string = req.query.lang;
+    res.setHeader('X-Lang', lang);
+    res.end();
+}
--- a/tests/fixtures/header_injection/typescript/unsafe_subscript_set.ts
+++ b/tests/fixtures/header_injection/typescript/unsafe_subscript_set.ts
@ -0,0 +1,9 @@
+// Unsafe: tainted req.query value flows into the bare-subscript header set
+// `res.headers["X-Forwarded-By"] = lang`.  The LHS-subscript classification
+// path matches `res.headers` as a HEADER_INJECTION sink so this form fires
+// alongside the explicit `setHeader` / `res.set` method-call shapes.
+export function handler(req: any, res: any): void {
+    const lang: string = req.query.lang;
+    res.headers["X-Forwarded-By"] = lang;
+    res.end();
+}
--- a/tests/fixtures/internal_redirect_taint/expectations.json
+++ b/tests/fixtures/internal_redirect_taint/expectations.json
@ -1,6 +1,6 @@
 {
  "required_findings": [
-    { "id_prefix": "taint-unsanitised-flow", "min_count": 1 }
+    { "id_prefix": "taint-open-redirect", "min_count": 1 }
  ],
  "forbidden_findings": [],
  "noise_budget": {
--- a/tests/fixtures/ldap_injection/c/baseline_constant_ldap.c
+++ b/tests/fixtures/ldap_injection/c/baseline_constant_ldap.c
@ -0,0 +1,12 @@
+/* Baseline: filter is a string literal, no LDAP_INJECTION finding. */
+#include <ldap.h>
+
+int do_lookup(LDAP *ld) {
+    LDAPMessage *res = NULL;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        "(objectClass=person)",
+        NULL, 0, NULL, NULL, NULL, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/c/safe_ldap_search.c
+++ b/tests/fixtures/ldap_injection/c/safe_ldap_search.c
@ -0,0 +1,19 @@
+/* Safe: project-local sanitize_ldap_filter (matches the developer-named
+ * `sanitize_*` Sanitizer rule) clears caps on the user value before it
+ * reaches ldap_search_ext_s. */
+#include <ldap.h>
+#include <stdlib.h>
+
+extern char *sanitize_ldap_filter(const char *raw);
+
+int do_lookup(LDAP *ld) {
+    char *user_filter = getenv("USER_FILTER");
+    char *safe = sanitize_ldap_filter(user_filter);
+    LDAPMessage *res = NULL;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        safe,
+        NULL, 0, NULL, NULL, NULL, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/c/unsafe_ldap_search.c
+++ b/tests/fixtures/ldap_injection/c/unsafe_ldap_search.c
@ -0,0 +1,15 @@
+/* Unsafe: tainted env-string passed straight as the LDAP filter argument
+ * to ldap_search_ext_s.  LDAP_INJECTION fires on the filter (0-based arg 3). */
+#include <ldap.h>
+#include <stdlib.h>
+
+int do_lookup(LDAP *ld) {
+    char *user_filter = getenv("USER_FILTER");
+    LDAPMessage *res = NULL;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        user_filter,
+        NULL, 0, NULL, NULL, NULL, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/cpp/baseline_constant_ldap.cpp
+++ b/tests/fixtures/ldap_injection/cpp/baseline_constant_ldap.cpp
@ -0,0 +1,12 @@
+// Baseline: literal filter, no taint reaches the sink.
+#include <ldap.h>
+
+int do_lookup(LDAP* ld) {
+    LDAPMessage* res = nullptr;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        "(objectClass=person)",
+        nullptr, 0, nullptr, nullptr, nullptr, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/cpp/safe_ldap_search.cpp
+++ b/tests/fixtures/ldap_injection/cpp/safe_ldap_search.cpp
@ -0,0 +1,18 @@
+// Safe: developer-named sanitize_* helper clears caps on the user value
+// before it reaches ldap_search_ext_s.
+#include <cstdlib>
+#include <ldap.h>
+
+extern const char* sanitize_ldap_filter(const char* raw);
+
+int do_lookup(LDAP* ld) {
+    const char* user_filter = std::getenv("USER_FILTER");
+    const char* safe = sanitize_ldap_filter(user_filter);
+    LDAPMessage* res = nullptr;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        safe,
+        nullptr, 0, nullptr, nullptr, nullptr, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/cpp/unsafe_ldap_search.cpp
+++ b/tests/fixtures/ldap_injection/cpp/unsafe_ldap_search.cpp
@ -0,0 +1,15 @@
+// Unsafe: tainted env value passed straight as the LDAP filter argument to
+// ldap_search_ext_s.  LDAP_INJECTION fires on the filter argument (0-based position 3).
+#include <cstdlib>
+#include <ldap.h>
+
+int do_lookup(LDAP* ld) {
+    const char* user_filter = std::getenv("USER_FILTER");
+    LDAPMessage* res = nullptr;
+    return ldap_search_ext_s(
+        ld,
+        "ou=people,dc=example,dc=com",
+        LDAP_SCOPE_SUBTREE,
+        user_filter,
+        nullptr, 0, nullptr, nullptr, nullptr, 0, &res);
+}
--- a/tests/fixtures/ldap_injection/go/baseline_constant_ldap.go
+++ b/tests/fixtures/ldap_injection/go/baseline_constant_ldap.go
@ -0,0 +1,20 @@
+// Baseline: filter is a literal string, no taint reaches NewSearchRequest.
+package ldap_baseline
+
+import (
+	"github.com/go-ldap/ldap/v3"
+)
+
+func Lookup() {
+	conn, _ := ldap.DialURL("ldap://example.com")
+	req := ldap.NewSearchRequest(
+		"ou=people,dc=example,dc=com",
+		ldap.ScopeWholeSubtree,
+		ldap.NeverDerefAliases,
+		0, 0, false,
+		"(objectClass=person)",
+		[]string{"cn"},
+		nil,
+	)
+	conn.Search(req)
+}
--- a/tests/fixtures/ldap_injection/go/safe_ldap_search.go
+++ b/tests/fixtures/ldap_injection/go/safe_ldap_search.go
@ -0,0 +1,27 @@
+// Safe: ldap.EscapeFilter applies RFC 4515 escaping before the user value
+// is interpolated into the filter.  Sanitizer(LDAP_INJECTION) clears the cap.
+package ldap_safe
+
+import (
+	"fmt"
+	"net/http"
+
+	"github.com/go-ldap/ldap/v3"
+)
+
+func Lookup(w http.ResponseWriter, r *http.Request) {
+	conn, _ := ldap.DialURL("ldap://example.com")
+	user := r.FormValue("user")
+	safe := ldap.EscapeFilter(user)
+	filter := fmt.Sprintf("(uid=%s)", safe)
+	req := ldap.NewSearchRequest(
+		"ou=people,dc=example,dc=com",
+		ldap.ScopeWholeSubtree,
+		ldap.NeverDerefAliases,
+		0, 0, false,
+		filter,
+		[]string{"cn"},
+		nil,
+	)
+	conn.Search(req)
+}
--- a/tests/fixtures/ldap_injection/go/unsafe_ldap_search.go
+++ b/tests/fixtures/ldap_injection/go/unsafe_ldap_search.go
@ -0,0 +1,28 @@
+// Unsafe: form value interpolated (via fmt.Sprintf) into an LDAP filter passed to
+// ldap.NewSearchRequest, then executed via conn.Search.  The construction
+// call is tagged Cap::LDAP_INJECTION on the filter argument so the finding
+// fires here regardless of the eventual conn.Search execution site.
+package ldap_unsafe
+
+import (
+	"fmt"
+	"net/http"
+
+	"github.com/go-ldap/ldap/v3"
+)
+
+func Lookup(w http.ResponseWriter, r *http.Request) {
+	conn, _ := ldap.DialURL("ldap://example.com")
+	user := r.FormValue("user")
+	filter := fmt.Sprintf("(uid=%s)", user)
+	req := ldap.NewSearchRequest(
+		"ou=people,dc=example,dc=com",
+		ldap.ScopeWholeSubtree,
+		ldap.NeverDerefAliases,
+		0, 0, false,
+		filter,
+		[]string{"cn"},
+		nil,
+	)
+	conn.Search(req)
+}
--- a/tests/fixtures/ldap_injection/java/BaselineConstantLdap.java
+++ b/tests/fixtures/ldap_injection/java/BaselineConstantLdap.java
@ -0,0 +1,14 @@
+// Baseline: the filter is a compile-time constant; no taint reaches the sink
+// and no LDAP_INJECTION finding fires.  Guards the rule against firing on
+// safe-by-construction call sites that simply happen to hit a search API.
+import javax.naming.directory.DirContext;
+import javax.naming.directory.SearchControls;
+
+public class BaselineConstantLdap {
+    private DirContext ctx;
+
+    public Object lookup() throws Exception {
+        String filter = "(objectClass=person)";
+        return ctx.search("ou=people,dc=example,dc=com", filter, new SearchControls());
+    }
+}
--- a/tests/fixtures/ldap_injection/java/SafeLdapSearch.java
+++ b/tests/fixtures/ldap_injection/java/SafeLdapSearch.java
@ -0,0 +1,19 @@
+// Safe: the user-supplied substring is run through Spring LDAP's
+// LdapEncoder.filterEncode (RFC 4515 escape) before being assembled into the
+// filter.  The Sanitizer(LDAP_INJECTION) clears the cap and the sink does not
+// fire.
+import javax.naming.directory.DirContext;
+import javax.naming.directory.SearchControls;
+import javax.servlet.http.HttpServletRequest;
+import org.springframework.ldap.support.LdapEncoder;
+
+public class SafeLdapSearch {
+    private DirContext ctx;
+
+    public Object lookup(HttpServletRequest req) throws Exception {
+        String user = req.getParameter("user");
+        String safe = LdapEncoder.filterEncode(user);
+        String filter = "(uid=" + safe + ")";
+        return ctx.search("ou=people,dc=example,dc=com", filter, new SearchControls());
+    }
+}
--- a/tests/fixtures/ldap_injection/java/UnsafeLdapSearch.java
+++ b/tests/fixtures/ldap_injection/java/UnsafeLdapSearch.java
@ -0,0 +1,17 @@
+// Unsafe: attacker-controlled username concatenated into an LDAP filter passed
+// to DirContext.search.  The receiver `ctx` carries TypeKind::LdapClient via
+// the declared `DirContext` type so type-qualified resolution rewrites the
+// callee to `LdapClient.search` and the LDAP_INJECTION sink fires.
+import javax.naming.directory.DirContext;
+import javax.naming.directory.SearchControls;
+import javax.servlet.http.HttpServletRequest;
+
+public class UnsafeLdapSearch {
+    private DirContext ctx;
+
+    public Object lookup(HttpServletRequest req) throws Exception {
+        String user = req.getParameter("user");
+        String filter = "(uid=" + user + ")";
+        return ctx.search("ou=people,dc=example,dc=com", filter, new SearchControls());
+    }
+}
--- a/tests/fixtures/ldap_injection/javascript/baseline_constant_ldap.js
+++ b/tests/fixtures/ldap_injection/javascript/baseline_constant_ldap.js
@ -0,0 +1,11 @@
+// Baseline: filter is a literal constant; no taint reaches the search call.
+const ldap = require('ldapjs');
+
+const client = ldap.createClient({ url: 'ldap://example.com' });
+
+function lookup(_req, res) {
+    const filter = '(objectClass=person)';
+    client.search('ou=people,dc=example,dc=com', { filter: filter }, (err) => { res.json({ ok: !err }); });
+}
+
+module.exports = lookup;
--- a/tests/fixtures/ldap_injection/javascript/safe_ldap_search.js
+++ b/tests/fixtures/ldap_injection/javascript/safe_ldap_search.js
@ -0,0 +1,16 @@
+// Safe: ldap-escape's `filter` helper escapes the user-controlled substring
+// before it lands in the filter expression.  Mirrors the unsafe sibling's
+// bound-variable shape so only the sanitiser introduction differs.
+const ldap = require('ldapjs');
+const ldapEscape = require('ldap-escape');
+
+const client = ldap.createClient({ url: 'ldap://example.com' });
+
+function lookup(req, res) {
+    const user = req.query.user;
+    const safe = ldapEscape(user);
+    const filter = '(uid=' + safe + ')';
+    client.search('ou=people,dc=example,dc=com', { filter: filter }, (err) => { res.json({ ok: !err }); });
+}
+
+module.exports = lookup;
--- a/tests/fixtures/ldap_injection/javascript/unsafe_ldap_search.js
+++ b/tests/fixtures/ldap_injection/javascript/unsafe_ldap_search.js
@ -0,0 +1,16 @@
+// Unsafe: ldapjs `client.search` receives a filter assembled from req.query.
+// Bound-variable idiom: the closure-captured `client` carries
+// `TypeKind::LdapClient` (forwarded from the top-level body to the function
+// body by `taint::inject_external_type_facts`), so type-qualified receiver
+// resolution rewrites `client.search` → `LdapClient.search`.
+const ldap = require('ldapjs');
+
+const client = ldap.createClient({ url: 'ldap://example.com' });
+
+function lookup(req, res) {
+    const user = req.query.user;
+    const filter = '(uid=' + user + ')';
+    client.search('ou=people,dc=example,dc=com', { filter: filter }, (err) => { res.json({ ok: !err }); });
+}
+
+module.exports = lookup;
--- a/tests/fixtures/ldap_injection/php/baseline_constant_ldap.php
+++ b/tests/fixtures/ldap_injection/php/baseline_constant_ldap.php
@ -0,0 +1,4 @@
+<?php
+// Baseline: filter is a literal string, no taint reaches the sink.
+$ds = ldap_connect("ldap://example.com");
+$result = ldap_search($ds, "ou=people,dc=example,dc=com", "(objectClass=person)");
--- a/tests/fixtures/ldap_injection/php/safe_ldap_search.php
+++ b/tests/fixtures/ldap_injection/php/safe_ldap_search.php
@ -0,0 +1,9 @@
+<?php
+// Safe: ldap_escape() with LDAP_ESCAPE_FILTER (or default) sanitises the user
+// substring before it lands in the filter.  Sanitizer(LDAP_INJECTION) clears
+// the cap so the sink does not fire.
+$ds = ldap_connect("ldap://example.com");
+$user = $_GET['user'];
+$safe = ldap_escape($user, "", LDAP_ESCAPE_FILTER);
+$filter = "(uid=" . $safe . ")";
+$result = ldap_search($ds, "ou=people,dc=example,dc=com", $filter);
--- a/tests/fixtures/ldap_injection/php/unsafe_ldap_search.php
+++ b/tests/fixtures/ldap_injection/php/unsafe_ldap_search.php
@ -0,0 +1,7 @@
+<?php
+// Unsafe: $_GET['user'] concatenated into an LDAP filter and passed straight
+// to ldap_search.  LDAP_INJECTION fires on the filter argument.
+$ds = ldap_connect("ldap://example.com");
+$user = $_GET['user'];
+$filter = "(uid=" . $user . ")";
+$result = ldap_search($ds, "ou=people,dc=example,dc=com", $filter);
--- a/tests/fixtures/ldap_injection/python/baseline_constant_ldap.py
+++ b/tests/fixtures/ldap_injection/python/baseline_constant_ldap.py
@ -0,0 +1,10 @@
+# Baseline: filter is a compile-time constant.  No taint reaches `search_s` so
+# no LDAP_INJECTION finding fires.
+import ldap
+
+
+def lookup():
+    conn = ldap.initialize("ldap://example.com")
+    return conn.search_s(
+        "ou=people,dc=example,dc=com", ldap.SCOPE_SUBTREE, "(objectClass=person)"
+    )
--- a/tests/fixtures/ldap_injection/python/safe_ldap_search.py
+++ b/tests/fixtures/ldap_injection/python/safe_ldap_search.py
@ -0,0 +1,14 @@
+# Safe: user-supplied substring run through `escape_filter_chars` (RFC 4515)
+# before being concatenated into the filter.  The sanitizer clears the
+# LDAP_INJECTION cap so the sink does not fire.
+import ldap
+from ldap.filter import escape_filter_chars
+from flask import request
+
+
+def lookup():
+    conn = ldap.initialize("ldap://example.com")
+    user = request.form["user"]
+    safe = escape_filter_chars(user)
+    flt = "(uid=" + safe + ")"
+    return conn.search_s("ou=people,dc=example,dc=com", ldap.SCOPE_SUBTREE, flt)