refactor(dynamic): replace reflective invocation with route replay logic for Micronaut and Quarkus, remove annotation stubs, and enhance runtime path binding

elipeter 2026-05-26 11:38:12 -05:00
parent 61bfc0cf96
commit 41c7b73575
26 changed files with 1256 additions and 224 deletions

View file

@ -1,14 +1,14 @@
# Benchmark Results
Current baseline (2026-05-02):
Current baseline (2026-05-26):
| Metric | File-level | Rule-level | CI floor |
|-----------|------------|------------|----------|
| Precision | 1.000 | 1.000 | 0.861 |
| Recall | 1.000 | 1.000 | 0.944 |
| F1 | 1.000 | 1.000 | 0.901 |
| Recall | 0.996 | 0.996 | 0.944 |
| F1 | 0.998 | 0.998 | 0.901 |
Corpus: 507 cases across 10 languages, 504 evaluated (3 disabled). Per-run JSON lands in `tests/benchmark/results/` (`latest.json` plus dated snapshots). See `README.md` for what the scoring modes mean and how to run a subset.
Corpus: 565 cases across 10 languages, 564 evaluated (1 disabled). Per-run JSON lands in `tests/benchmark/results/` (`latest.json` plus dated snapshots). See `README.md` for what the scoring modes mean and how to run a subset.
The corpus is mostly synthetic 8-20 line fixtures, one vulnerability or one safe pattern per file. A smaller real-CVE replay set under `cve_corpus/` covers 30 published advisories across all 10 languages. Both contribute to the headline numbers.
@ -53,14 +53,14 @@ Real disclosed CVEs reduced to minimal reproducers, vulnerable + patched pair pe
| CVE-2024-32884 | Rust | gitoxide | Apache-2.0 OR MIT | CMDI | detected |
| CVE-2025-53549 | Rust | matrix-rust-sdk | Apache-2.0 | SQL Injection | detected |
| CVE-2016-3714 | C | ImageMagick (ImageTragick) | ImageMagick License | CMDI | detected |
| CVE-2017-1000117 | C | git (ssh:// argv injection)| GPL-2.0 | cmdi (argv-inj) | deferred |
| CVE-2017-1000117 | C | git (ssh:// argv injection)| GPL-2.0 | cmdi (argv-inj) | detected |
| CVE-2019-18634 | C | sudo (pwfeedback) | ISC | memory_safety | detected |
| CVE-2019-13132 | C++ | ZeroMQ libzmq | MPL-2.0 | memory_safety | detected |
| CVE-2022-1941 | C++ | Protocol Buffers | BSD-3-Clause | memory_safety | detected |
| CVE-2026-25544 | TypeScript | Payload (Drizzle adapter) | MIT | sql_injection | detected |
| CVE-2026-42353 | JavaScript | i18next-http-middleware | MIT | path_traversal | detected |
Deferred entries are real bugs Nyx can't yet detect. The fixture stays committed with `disabled: true` in ground truth so the gap remains visible.
No real-CVE entries are currently deferred. If a future real-CVE fixture exposes a detector gap, keep it committed with `disabled: true` in ground truth so the gap remains visible.
### How CVEs get picked
@ -83,7 +83,8 @@ Most recent first. Metrics are rule-level on the corpus size at that point.
| Date | Change | Corpus | P | R | F1 |
|------------|------------------------------------------------------------------------------|--------|-------|-------|-------|
| 2026-05-26 | Benchmark docs corrected for CVE-2026-25544: the Payload Drizzle SQL injection fixture is enabled and detected in `ground_truth.json`; only CVE-2017-1000117 remains deferred in the real-CVE table | 565 | 1.000 | 1.000 | 1.000 |
| 2026-05-26 | C argv-injection taint now propagates through execvp argv arrays while recognising the upstream `ssh_host[0] == '-'` dash-prefix rejection and ignoring env-derived executable-path argv elements; CVE-2017-1000117 re-enabled and detected, patched counterpart stays clean | 565 | 1.000 | 0.996 | 0.998 |
| 2026-05-26 | Benchmark docs corrected for CVE-2026-25544: the Payload Drizzle SQL injection fixture is enabled and detected in `ground_truth.json` | 565 | 1.000 | 1.000 | 1.000 |
| 2026-05-04 | C cvehunt session-0014: CVE-2017-1000117 (git ssh:// hostname-as-argv injection) added in corpus disabled — three-layer C engine gap: (a) array-element taint propagation through `args[i] = ssh_host;` writes, (b) missing `c.cmdi.exec*` AST patterns in `src/patterns/c.rs`, (c) sanitizer recognition of the upstream `if (ssh_host[0] == '-') die(...)` dash-prefix guard | 565 | 1.000 | 1.000 | 1.000 |
| 2026-05-04 | JS/TS array-method validator-callback narrowing (`try_array_method_validator_callback_narrowing` in `src/taint/ssa_transfer/mod.rs`) — `<arr>.filter(<isSafeXxx>)` / `.find` / `.findLast` strips `Cap::all()` from the call result when the callback resolves to a `BooleanTrueIsValid` validator; CVE-2026-42353 (i18next-http-middleware path traversal) re-enabled in ground truth, deferred queue cleared | 563 | 1.000 | 1.000 | 1.000 |
| 2026-05-04 | JS/TS ternary-RHS source-classification fix in `src/cfg/conditions.rs::lower_ternary_branch` (segment-strip first_member_label on the branch AST) — `let arr = cond ? req.query.lng : "";` now propagates taint through the diamond's join phi instead of lowering both branches to labelless Assign-with-empty-uses; CVE-2026-42353 (i18next-http-middleware path traversal / SSRF) added in corpus disabled — needs Array.prototype.filter(known_validator_callback) precision bridge | 561 | 1.000 | 1.000 | 1.000 |
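The dash-prefix guard named in the CVE-2017-1000117 rows can be illustrated outside C. What follows is a minimal Python analogue, not the fixture itself: `build_ssh_argv` and its `patched` flag are hypothetical names, the argv layout is simplified, and the sketch only assembles a list without executing anything.

```python
def build_ssh_argv(host: str, repo_path: str, patched: bool = True) -> list[str]:
    """Assemble an argv list for ssh (hypothetical helper; nothing is executed)."""
    # Patched behaviour: reject a host that ssh would parse as an option flag.
    if patched and host.startswith("-"):
        raise ValueError(f"strange hostname {host!r} blocked")
    # Unpatched behaviour falls through: the host lands in argv unchecked.
    return ["ssh", host, "git-upload-pack", repo_path]


# Host portion of a URL shaped like ssh://-oProxyCommand=...@host/repo
malicious = "-oProxyCommand=touch /tmp/pwned"

# Without the guard, the tainted host becomes an option-like argv element.
unsafe = build_ssh_argv(malicious, "/repo", patched=False)

# With the guard, the same input is rejected before argv assembly.
try:
    build_ssh_argv(malicious, "/repo")
    blocked = False
except ValueError:
    blocked = True
```

The vulnerable/patched fixture pair exercises the same distinction: the finding should fire when the tainted host reaches the argv array unchecked and stay silent when the leading-dash check dominates the call.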

View file

@ -5359,7 +5359,8 @@
"taint-unsanitised-flow"
],
"allowed_alternative_rule_ids": [
"c.cmdi.execvp"
"c.cmdi.execvp",
"cfg-unguarded-sink"
],
"forbidden_rule_ids": [],
"expected_severity": "HIGH",
@ -6078,7 +6079,8 @@
"taint-unsanitised-flow"
],
"allowed_alternative_rule_ids": [
"cpp.cmdi.execvp"
"cpp.cmdi.execvp",
"cfg-unguarded-sink"
],
"forbidden_rule_ids": [],
"expected_severity": "HIGH",
@ -11829,14 +11831,14 @@
"expected_category": "Security",
"expected_sink_lines": [
[
87,
87
95,
95
]
],
"expected_source_lines": [
[
92,
92
95,
95
]
],
"tags": [
@ -11845,8 +11847,7 @@
"argv-injection",
"cmdi"
],
"disabled": true,
"disabled_reason": "C taint engine does not propagate taint through C array-element writes (`args[i] = ssh_host;`) and has no `c.cmdi.exec*` AST pattern; even if such a pattern were added it would also fire on the patched fixture (precision miss) because the CVE is sanitised by a pre-call dash-prefix guard the engine does not classify as a validator. Three-layer deep fix tracked in CVE_DEFERRED.md.",
"disabled": false,
"notes": "CVE-2017-1000117 (git ssh:// argv injection): pre-2.7.6 git accepted `ssh://-oProxyCommand=...@host/repo` URLs and pushed the URL host as an argv element to ssh, where a leading dash was treated as an option flag. GPL-2.0"
},
{
@ -11877,8 +11878,7 @@
"patched",
"negative"
],
"disabled": true,
"disabled_reason": "Paired with cve-c-2017-1000117-vulnerable; precision side requires sanitizer recognition of the upstream `if (ssh_host[0] == '-') die(...)` guard so that adding any `c.cmdi.execvp` AST pattern would not also fire on the patched fixture.",
"disabled": false,
"notes": "CVE-2017-1000117 patched counterpart: dash-prefix gate added before argv assembly; regression guard that Nyx does not refire on the fix once the deferral lands"
},
{
@ -17800,4 +17800,4 @@
"notes": "Patched form of `sanitizeValue` from `@payloadcms/drizzle@v3.73.0` (MIT). Enabled after validated-flow propagation landed."
}
]
}
}

View file

@ -1,6 +1,6 @@
{
"benchmark_version": "1.0",
"timestamp": "2026-05-11T15:19:43Z",
"timestamp": "2026-05-26T16:09:13Z",
"scanner_version": "0.7.0",
"scanner_config": {
"analysis_mode": "Full",
@ -9,10 +9,10 @@
"state_analysis_enabled": true,
"worker_threads": 1
},
"ground_truth_hash": "sha256:00a4629e50841ab26c7ba947adfdab43b909d72d7a0885d604e702cc56552eb4",
"ground_truth_hash": "sha256:4ec1e5ec0d72129f458db49b8aab8579a03e704ed6fe6e67ef45038924868420",
"corpus_size": 565,
"cases_run": 562,
"cases_skipped": 3,
"cases_run": 564,
"cases_skipped": 1,
"outcomes": [
{
"case_id": "c-buf-001",
@ -151,11 +151,11 @@
"outcome_rule_level": "TP",
"outcome_location_level": "TP",
"matched_rule_ids": [
"taint-unsanitised-flow (source 5:18)"
"cfg-unguarded-sink"
],
"unexpected_rule_ids": [],
"all_finding_ids": [
"taint-unsanitised-flow (source 5:18)"
"cfg-unguarded-sink"
],
"security_finding_count": 1,
"non_security_finding_count": 0
@ -680,11 +680,11 @@
"outcome_rule_level": "TP",
"outcome_location_level": "TP",
"matched_rule_ids": [
"taint-unsanitised-flow (source 5:18)"
"cfg-unguarded-sink"
],
"unexpected_rule_ids": [],
"all_finding_ids": [
"taint-unsanitised-flow (source 5:18)"
"cfg-unguarded-sink"
],
"security_finding_count": 1,
"non_security_finding_count": 0
@ -1126,6 +1126,40 @@
"security_finding_count": 1,
"non_security_finding_count": 0
},
{
"case_id": "cve-c-2017-1000117-patched",
"file": "cve_corpus/c/CVE-2017-1000117/patched.c",
"language": "c",
"vuln_class": "safe",
"is_vulnerable": false,
"outcome_file_level": "TN",
"outcome_rule_level": "TN",
"outcome_location_level": null,
"matched_rule_ids": [],
"unexpected_rule_ids": [],
"all_finding_ids": [],
"security_finding_count": 0,
"non_security_finding_count": 0
},
{
"case_id": "cve-c-2017-1000117-vulnerable",
"file": "cve_corpus/c/CVE-2017-1000117/vulnerable.c",
"language": "c",
"vuln_class": "cmdi",
"is_vulnerable": true,
"outcome_file_level": "TP",
"outcome_rule_level": "TP",
"outcome_location_level": "TP",
"matched_rule_ids": [
"taint-unsanitised-flow (source 95:12)"
],
"unexpected_rule_ids": [],
"all_finding_ids": [
"taint-unsanitised-flow (source 95:12)"
],
"security_finding_count": 1,
"non_security_finding_count": 0
},
{
"case_id": "cve-c-2019-18634-patched",
"file": "cve_corpus/c/CVE-2019-18634/patched.c",
@ -10041,29 +10075,29 @@
}
],
"aggregate_file_level": {
"tp": 274,
"tp": 275,
"fp": 0,
"fn_": 1,
"tn": 287,
"tn": 288,
"precision": 1.0,
"recall": 0.9963636363636363,
"f1": 0.9981785063752276
"recall": 0.9963768115942029,
"f1": 0.9981851179673321
},
"aggregate_rule_level": {
"tp": 274,
"tp": 275,
"fp": 0,
"fn_": 1,
"tn": 287,
"tn": 288,
"precision": 1.0,
"recall": 0.9963636363636363,
"f1": 0.9981785063752276
"recall": 0.9963768115942029,
"f1": 0.9981851179673321
},
"by_language": {
"c": {
"tp": 17,
"tp": 18,
"fp": 0,
"fn_": 0,
"tn": 17,
"tn": 18,
"precision": 1.0,
"recall": 1.0,
"f1": 1.0
@ -10170,7 +10204,7 @@
"f1": 1.0
},
"cmdi": {
"tp": 58,
"tp": 59,
"fp": 0,
"fn_": 0,
"tn": 0,
@ -10290,7 +10324,7 @@
"tp": 0,
"fp": 0,
"fn_": 0,
"tn": 284,
"tn": 285,
"precision": 1.0,
"recall": 1.0,
"f1": 1.0
@ -10343,31 +10377,31 @@
},
"by_confidence": {
">=High": {
"tp": 85,
"fp": 114,
"fn_": 190,
"tn": 173,
"precision": 0.4271356783919598,
"recall": 0.3090909090909091,
"f1": 0.3586497890295359
"tp": 81,
"fp": 118,
"fn_": 195,
"tn": 170,
"precision": 0.40703517587939697,
"recall": 0.29347826086956524,
"f1": 0.3410526315789474
},
">=Low": {
"tp": 85,
"fp": 142,
"fn_": 190,
"tn": 145,
"precision": 0.3744493392070485,
"recall": 0.3090909090909091,
"f1": 0.33864541832669326
"tp": 81,
"fp": 147,
"fn_": 195,
"tn": 141,
"precision": 0.35526315789473684,
"recall": 0.29347826086956524,
"f1": 0.3214285714285714
},
">=Medium": {
"tp": 85,
"fp": 133,
"fn_": 190,
"tn": 154,
"precision": 0.38990825688073394,
"recall": 0.3090909090909091,
"f1": 0.3448275862068966
"tp": 81,
"fp": 139,
"fn_": 195,
"tn": 149,
"precision": 0.36818181818181817,
"recall": 0.29347826086956524,
"f1": 0.3266129032258065
}
}
}
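The aggregate figures in this results file can be cross-checked from the raw counts. This is a minimal sketch, assuming the standard definitions of precision, recall and F1. The helper name `prf` is hypothetical and not part of the benchmark harness, and the "empty bucket scores 1.0" convention is inferred from the TN-only entry above rather than confirmed from the scorer source.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from confusion-matrix counts.

    Assumption: a bucket with no positives at all scores 1.0, matching
    the entry above that has tp = fp = fn = 0 and reports 1.0 throughout.
    """
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    # F1 written directly in terms of counts; equivalent to 2PR / (P + R).
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    return precision, recall, f1


# Updated aggregate counts from this run: tp=275, fp=0, fn_=1.
p, r, f1 = prf(275, 0, 1)

# ">=High" confidence bucket from this run: tp=81, fp=118, fn_=195.
p_hi, r_hi, f1_hi = prf(81, 118, 195)
```

Rounded to three decimals, the aggregate values give the 1.000 / 0.996 / 0.998 row shown in the benchmark README table.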

View file

@ -1,4 +1,4 @@
// Phase 14 Micronaut `@Controller`, benign.
// Micronaut `@Controller`, benign.
//
// Same shape as the vuln but echoes a constant string instead of
// concatenating the path variable into a shell command.

View file

@ -1,17 +0,0 @@
// Phase 14 fixture stub minimal Micronaut `@Controller`.
// Lives in `io.micronaut.http.annotation` so the fixture's
// `import io.micronaut.http.annotation.Controller;` compiles under
// plain javac (no Micronaut Maven dep required).
package io.micronaut.http.annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface Controller {
String value() default "";
}

View file

@ -1,14 +0,0 @@
// Phase 14 fixture stub minimal Micronaut `@Get`.
package io.micronaut.http.annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface Get {
String value() default "";
}

View file

@ -1,8 +1,9 @@
// Phase 14 Micronaut `@Controller`, vulnerable.
// Micronaut `@Controller`, vulnerable.
//
// `@Controller("/run")` on the class + `@Get("/{id}")` on the handler
// matches the Phase 14 [`JavaShape::MicronautRoute`]. The harness
// invokes `show(payload)` via reflection.
// matches `JavaShape::MicronautRoute`. The harness keeps the real
// Micronaut annotations on the classpath and replays the route through
// those annotations.
import io.micronaut.http.annotation.Controller;
import io.micronaut.http.annotation.Get;

View file

@ -14,5 +14,10 @@
<artifactId>micronaut-http</artifactId>
<version>4.4.0</version>
</dependency>
<dependency>
<groupId>io.micronaut</groupId>
<artifactId>micronaut-core</artifactId>
<version>4.4.0</version>
</dependency>
</dependencies>
</project>

View file

@ -1,6 +1,8 @@
// Phase 14 Quarkus reactive route, benign.
// Quarkus reactive route, benign.
// import io.quarkus.runtime.Quarkus;
import io.quarkus.runtime.Quarkus;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import java.io.BufferedReader;
import java.io.InputStreamReader;

View file

@ -1,11 +0,0 @@
// Phase 14 fixture stub minimal `@GET` Jakarta REST annotation.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface GET {
}

View file

@ -1,15 +0,0 @@
// Phase 14 fixture stub minimal `@Path` annotation (Jakarta REST).
// Lives in the default package; the fixture imports the symbol as
// plain `@Path` so javac is happy without a Quarkus / Jakarta REST
// Maven dep.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
public @interface Path {
String value() default "";
}

View file

@ -1,10 +1,10 @@
// Phase 14 Quarkus reactive route, vulnerable.
//
// `@Path("/run")` on the type + `@GET` on the handler matches the
// Phase 14 [`JavaShape::detect`] for Quarkus. The harness invokes
// `run(payload)` via reflection.
// Quarkus reactive route, vulnerable. The harness keeps the real
// Jakarta REST annotations on the classpath and replays the route
// through those annotations.
// import io.quarkus.runtime.Quarkus;
import io.quarkus.runtime.Quarkus;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import java.io.BufferedReader;
import java.io.InputStreamReader;

View file

@ -14,5 +14,10 @@
<artifactId>quarkus-resteasy-reactive</artifactId>
<version>3.8.3</version>
</dependency>
<dependency>
<groupId>jakarta.ws.rs</groupId>
<artifactId>jakarta.ws.rs-api</artifactId>
<version>3.1.0</version>
</dependency>
</dependencies>
</project>

View file

@ -767,6 +767,40 @@ mod phase14_shape_tests {
assert_not_confirmed("quarkus_route", &r);
}
// ── micronaut_route ──────────────────────────────────────────────────────
#[test]
fn micronaut_route_vuln_is_confirmed() {
let Some(r) = run(
"micronaut_route",
"Vuln.java",
"show",
Cap::CODE_EXEC,
21,
EntryKind::HttpRoute,
PayloadSlot::Param(0),
) else {
return;
};
assert_confirmed("micronaut_route", &r);
}
#[test]
fn micronaut_route_benign_not_confirmed() {
let Some(r) = run(
"micronaut_route",
"Benign.java",
"show",
Cap::CODE_EXEC,
18,
EntryKind::HttpRoute,
PayloadSlot::Param(0),
) else {
return;
};
assert_not_confirmed("micronaut_route", &r);
}
// ── Phase 09 staging assertion (Spring transitive dep pick-up) ──────────
/// Verify the Phase 09 staging path identifies Spring when the