# Language Maturity Matrix

Nyx supports ten languages, but support depth is not uniform. This page gives an
honest per-language picture so you can calibrate expectations before depending
on Nyx for a given stack.

The classifications here are grounded in three concrete signals:

1. **Rule depth**: how many distinct source / sanitizer / sink matchers exist
   for the language in `src/labels/<lang>.rs`, and how many vulnerability
   classes (Cap bits) those matchers cover.
2. **Benchmark results**: rule-level precision / recall / F1 on the 433-case
   corpus in
   [`tests/benchmark/RESULTS.md`](https://github.com/elicpeter/nyx/blob/master/tests/benchmark/RESULTS.md),
   last measured 2026-04-29 with scanner version 0.5.0.
3. **Known weak spots**: FPs and FNs the maintainers have deliberately left
   in the benchmark rather than suppressed, plus structural engine
   limitations the corpus does not stress, documented release-by-release in
   [`RESULTS.md`](https://github.com/elicpeter/nyx/blob/master/tests/benchmark/RESULTS.md).

As of 2026-04-29 the synthetic corpus has effectively saturated: nine of ten
languages report rule-level F1 = 100.0% and Go reports 94.1% (two FPs on safe
fixtures and one FN on a real-CVE SSRF case, `cve-go-2023-3188-vulnerable`).
Aggregate rule-level P=0.991, R=0.995, F1=0.993. That means F1 alone no longer
differentiates tiers, so the differentiators are **rule depth**,
**gated-sink coverage**, and **structural idioms the corpus does not fully
stress** (deep pointer aliasing in C/C++, framework-specific context). All
parser integrations use tree-sitter and are stable; parsing is not a
differentiator.

---

## Tier Summary

| Tier | Languages | F1 | What to expect |
|------|-----------|----|----------------|
| **Stable** | Python, JavaScript, TypeScript | 100% | Deep rule sets, gated sinks (argument-role-aware), framework detection, extensive fixtures, and the bulk of advanced-analysis (SSA two-level solve, context-sensitivity, symbolic execution, abstract interpretation) coverage. Safe to depend on in CI gates. |
| **Beta** | Go, Java, PHP, Ruby, Rust | 94.1% to 100% | Solid mid-depth rule sets with narrower cap coverage and **no gated sinks**. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates. |
| **Preview** | C, C++ | 100% on synthetic corpus | Recent work taught the engine to follow taint through `std::vector` / `std::string` / map containers (including `c_str()`), through fluent builder chains like `Socket::builder().host(h).connect()`, and through inline class member functions. Function pointers and deeper pointer aliasing through `*p` / `p->field` are still not tracked. Rule-level scores against a corpus of obvious unsafe-API uses look perfect, but that is not the same as a clean audit on a real codebase. Pair with clang-tidy, Clang Static Analyzer, or Infer. |

---

## Per-Language Detail

### Stable tier

#### Python: 100% P / 100% R / 100% F1 *(46-case corpus)*

- **Rule depth**: 5 source families, 7 sanitizer families, 21 sink matchers
  spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- **Framework context**: Flask, Django, argparse source matchers; `flask_request`
  import-alias support.
- **Advanced analysis**: gated sinks (`Popen`, `subprocess.run/call` with
  activation-arg awareness); most SSA-equivalence and symbolic-execution
  fixtures target Python.
- **Fixtures**: 125 under `tests/fixtures/` plus 42 benchmark cases.
- **Blind spots**: f-string interpolation is not explicitly modeled as a
  distinct taint-producing construct; string-formatting flows are caught by
  the general concatenation path.

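To make the f-string blind spot concrete, the sketch below uses Python's standard `ast` module (not Nyx's tree-sitter pipeline) to show that an f-string and a `+` concatenation that build the same command are syntactically different constructs. The helper name `node_kind` and the variable `user` are hypothetical and exist only for this illustration.

```python
import ast


def node_kind(expr: str) -> str:
    """Return the AST node type name of a single Python expression."""
    return type(ast.parse(expr, mode="eval").body).__name__


# Two spellings that build the same command string from `user`.
fstring_kind = node_kind('f"ls {user}"')
concat_kind = node_kind('"ls " + user')

print(fstring_kind, concat_kind)  # prints: JoinedStr BinOp
```

Per the blind-spot note, Nyx does not treat the first form as its own taint-producing construct; string-formatting flows are caught by the general concatenation path instead.
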
#### JavaScript: 100% P / 100% R / 100% F1 *(42-case corpus)*

- **Rule depth**: 3 source families, 10 sanitizer families, 24 sink matchers
  spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O.
- **Advanced analysis**: gated sinks (`setAttribute`, `parseFromString`),
  two-level SSA solve for top-level + per-function scopes
  (`analyse_ssa_js_two_level`), prefix-locked SSRF suppression via
  StringFact, abstract-interpretation interval tracking.
- **Framework context**: Express, Koa, Fastify (via in-file import scan when
  `package.json` is absent).
- **Fixtures**: 238 under `tests/fixtures/`; the largest fixture set of any
  language.
- **Blind spots**: template literals are lowered through concatenation rather
  than modeled as a first-class taint operator; dynamic property access
  (`obj[user]`) is conservatively treated.

#### TypeScript: 100% P / 100% R / 100% F1 *(47-case corpus)*

- **Rule depth**: Shares the JS ruleset (3 sources, 10 sanitizers, 24 sinks)
  plus TS-specific grammar handling.
- **Advanced analysis**: TSX and JSX grammars wired;
  discriminated-union narrowing, generic erasure, decorator flow, and
  interface dispatch are all validated against adversarial type-system
  stressors.
- **Framework context**: Fastify detection via `detect_in_file_frameworks`
  (import-driven, no `package.json` required).
- **Fixtures**: 39 test fixtures plus 42 benchmark cases.
- **Blind spots**: `as any` casts and `any`-typed flows are handled
  conservatively (treated as tainted).

### Beta tier

#### Go: 92.3% P / 96.0% R / 94.1% F1 *(53-case corpus, 2 FPs, 1 FN)*

- **Rule depth**: 4 source families, 4 sanitizer families, 9 sink matchers
  covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O.
- **Framework context**: Gin, Echo source matchers.
- **Open weak spots**: `cve-go-2023-3188-vulnerable` (owncast SSRF) goes
  undetected, and two safe Go fixtures (`go-safe-007`, `go-safe-009`) draw
  spurious SQLi and CMDi findings respectively. Go is the only language
  with an imperfect score in the current corpus.
- **Known gaps**: no gated sinks, no deserialization class. `fmt.Sprintf`
  is deliberately not a sink. Cap coverage is narrower than the Stable
  tier and argument-role-aware sink modeling is not yet implemented for Go,
  so production CI gates may surface additional FPs the corpus does not
  exercise.

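The Go headline figures can be reproduced from raw counts. The sketch below is only arithmetic: the 2 FPs and 1 FN come from this page, while `tp=24` is an assumption, inferred as the true-positive count that yields the reported percentages rather than taken from `RESULTS.md`. The function name `prf1` is hypothetical.

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall, which simplifies to:
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


# ASSUMPTION: tp=24 is inferred from the reported percentages, not read from RESULTS.md.
p, r, f1 = prf1(tp=24, fp=2, fn=1)
print(f"P={p:.1%} R={r:.1%} F1={f1:.1%}")  # prints: P=92.3% R=96.0% F1=94.1%
```

With so few findings in play, each FP or FN moves the score by roughly two percentage points, which is why one missed CVE case and two noisy safe fixtures are enough to set Go apart from the other nine languages.
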
#### Java: 100% P / 100% R / 100% F1 *(35-case corpus)*

- **Rule depth**: 3 source families, 8 sanitizer families, 10 sink matchers
  covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization.
- **Framework context**: Spring, JPA, Hibernate ORM rules; JNDI injection
  sinks.
- **Known gaps**: no gated sinks. Variable-receiver method calls
  (`client.send(...)` vs `HttpClient.send(...)`) rely on type-qualified
  resolution from receiver-type inference; where the receiver type cannot
  be inferred, typically on unusual builder chains, flows are
  conservatively over-tainted.

#### PHP: 100% P / 100% R / 100% F1 *(37-case corpus)*

- **Rule depth**: 3 source families (`$_GET`, `$_POST`, `$_REQUEST`
  superglobals), 7 sanitizer families, 10 sink matchers covering HTML, URL,
  Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- **Known gaps**: no gated sinks. Limited framework context (Laravel raw
  methods only). `echo` language-construct detection is wired but its
  inner-argument propagation is narrower than function-call sinks.

#### Ruby: 100% P / 100% R / 100% F1 *(39-case corpus)*

- **Rule depth**: 3 source families, 7 sanitizer families, 15 sink matchers
  covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- **Framework context**: Rails helpers (`sanitize_sql`, `permit`, `require`).
- **Known gaps**: string interpolation inside shell and SQL strings is
  recognized structurally but not modeled as a distinct operator.
  `begin/rescue/ensure` exception-edge wiring is documented as deferred
  (structurally incompatible with `build_try()`). The previously open
  `rb-interproc-001` FN was closed in the 2026-04-28 baseline after the
  Ruby `Kernel#open` CMDI sink and exact-match sigil work landed.

#### Rust: 100% P / 100% R / 100% F1 *(70-case adversarial corpus)*

Rust holds the largest per-language adversarial corpus and was promoted
from Experimental to Beta in the 2026-04-25 measurement after the PathFact
landings closed every previously-open `rs-safe-*` regression.

- **Rule depth**: 6 source families, **2** sanitizer families (prefix and
  type-coercion), 11 sink matchers covering HTML, Shell, SQL, SSRF,
  Deserialization, and File I/O. Extensive framework source coverage
  (Axum, Actix, Rocket), the most of any language on the source side. The
  narrow sanitizer count is the primary reason Rust is not in the Stable
  tier. Engine-side path/typed sanitizer recognition (PathFact) compensates,
  but the ruleset itself is shallow.
- **Recent additions**: SQL class (`rusqlite`, `sqlx`, `diesel`,
  `postgres`), Deserialization class (`serde_yaml`, `bincode`,
  `rmp_serde`, `ciborium`, `ron`, `toml`), expanded file I/O
  (`fs::remove_file/dir/rename/copy`), `reqwest` SSRF builder chain.
- **Closed by recent PathFact landings**
  (`src/abstract_interp/path_domain.rs` + per-return-path PathFact entries
  on `SsaFuncSummary`): `rs-safe-007` (`.replace("..","")` sanitizer),
  `rs-safe-008` (negative-validation return), `rs-safe-009` (match-arm
  guards via condition lifting), `rs-safe-010` (static-map lookup),
  `rs-safe-012` (`.contains("..")` + `.starts_with('/')` rejection),
  `rs-safe-014` (Option-returning user sanitizer), `rs-safe-015`
  (`Path::new(p).is_absolute()` typed rejection), `rs-safe-016`
  (cross-function `.contains("..")` rejection), and CVE patches
  `CVE-2018-20997`, `CVE-2022-36113`, `CVE-2024-24576`.
- **Not yet covered**: unsafe FFI / `std::mem::transmute` (no rules), Tokio
  `process::Command` async variants (not distinguished from sync),
  `hyper` / `surf` / `ureq` SSRF clients (reqwest family only).

### Preview tier

C and C++ remain **Preview** despite reporting 100% rule-level F1 on the
synthetic corpus. A run of additions in late April taught the engine to
follow taint through several constructs that used to be hard cutoffs (STL
containers, builder chains, inline member functions, the wider `std::sto*`
family), so the gap between "passes the synthetic corpus" and "would catch
the same flow on a real codebase" is narrower than it used to be. It is not
zero. The biggest remaining gaps are deep pointer aliasing and function
pointers, both of which are pervasive in real C/C++ code. Treat a clean
report as a starting point, not an audit. Pair Nyx with clang-tidy, the
Clang Static Analyzer, or Infer for production use.

**What now works** (added in late April):

- STL container flow. `vec.push_back(tainted)` followed by
  `vec.front().c_str()` carries taint into a downstream `system()` sink.
  `std::map::insert_or_assign`, `find`, `count`, `at`, and `data` all
  participate in the container store/load model.
- Inline class member functions. `class C { void run(...) { ... } };`
  bodies are now extracted as their own functions, so an intra-file call
  like `inner.run(input)` resolves to the body summary. Same fix covers
  `struct_specifier`, `union_specifier`, `enum_specifier`,
  `template_declaration`, and `extern "C"` blocks.
- Lambda passthrough. `auto echo = [](const char* s) { return s; };` carries
  argument taint into the result via the engine's default call-argument
  propagation.
- Builder chains. `Socket::builder().host(user).port(8080).connect()`
  resolves the chained returns and fires on `.connect()` when `user` is
  tainted; the safe variant with a hardcoded host stays quiet.
- Wider numeric sanitizer family. The full `std::sto*` set (including
  `stoll`, `stoull`, `stold`) and the C-stdlib forms (`atoi`, `atof`,
  `strtol`, etc.) clear all caps when they're called.
- More header / source extensions. `.cc`, `.cxx`, `.hpp`, `.hxx`, `.hh`,
  and `.h++` are recognized as C++ on top of `.cpp` and `.c++`. `.h` is
  intentionally still routed to C since it's ambiguous without a build
  system.

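The extension routing in the last bullet can be pictured as a small lookup. This is a minimal sketch, not Nyx's actual dispatch code: the names `route_language`, `CPP_EXTENSIONS`, and `C_EXTENSIONS` are hypothetical, and treating `.c` as C is an assumption the text implies but does not state.

```python
from pathlib import Path
from typing import Optional

# Extensions this page lists as recognized C++.
CPP_EXTENSIONS = {".cpp", ".c++", ".cc", ".cxx", ".hpp", ".hxx", ".hh", ".h++"}
# `.h` is routed to C per this page; `.c` is an ASSUMPTION added for completeness.
C_EXTENSIONS = {".h", ".c"}


def route_language(filename: str) -> Optional[str]:
    """Pick an analysis language from the file extension (case-sensitive)."""
    suffix = Path(filename).suffix
    if suffix in CPP_EXTENSIONS:
        return "cpp"
    if suffix in C_EXTENSIONS:
        return "c"
    return None  # not a C or C++ file as far as this sketch knows


print(route_language("net/socket.hh"), route_language("util.h"))  # prints: cpp c
```

The practical consequence of the `.h` rule is that a header containing C++-only constructs is still analyzed with the C ruleset unless it uses one of the unambiguous C++ extensions.
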
**Still not modeled** (common to both C and C++):

- Deep pointer aliasing. Taint is not tracked through `*p`, `p->field`,
  pointer arithmetic, or arbitrary aliased writes.
  Field-sensitive points-to (see [Advanced analysis](advanced-analysis.md))
  handles the "lock on a sub-field" case but is not a general escape
  analysis.
- Function pointers and callback dispatch. An indirect call through
  `void (*fn)(char *)` resolves to no callee, so cross-pointer flows are
  invisible.
- Array-element taint by index. Writes to `buf[i]` do not always propagate
  taint to `buf` as a whole; the recent subscript-handling work helps the
  general case but doesn't make `buf` an alias for every element.
- Nested classes beyond one level (C++ only).

#### C: 100% P / 100% R / 100% F1 *(30-case corpus)*

- **Rule depth**: 3 source families, **2** sanitizer families (the
  `sanitize_*` prefix and numeric-parse functions), 5 sink matchers spanning
  Shell, File, SSRF, and Format-String.
- **Known gaps**: no framework rules, no gated sinks. The structural
  limitations listed above are the dominant concern; rule additions alone
  will not lift this language out of the Preview tier.

#### C++: 100% P / 100% R / 100% F1 *(33-case corpus, plus 6 new fixtures for STL / builder / inline-method flows)*

- **Rule depth**: Builds on the C ruleset with `std::cin` / `std::getline`
  sources and a wider numeric-sanitizer set covering the full `std::sto*`
  family (3 sources, 3 sanitizer families, 5 sinks).
- **Known gaps**: still no framework rules and no gated sinks. The
  structural blind spots are now narrower than they were a release ago
  (see "What now works" above), but function pointers and the harder
  pointer-aliasing patterns still produce false negatives.

---

## How the tiers were assigned

Because rule-level F1 has saturated for nine of ten languages, the tier
boundaries are drawn primarily on **rule depth** and **engine coverage of
real-world idioms** rather than on benchmark scores alone.

A language lands in **Stable** when all three hold:

- Rule set covers ≥ 8 vulnerability classes with both source and sink
  matchers, and at least one class has argument-role-aware **gated-sink**
  modeling (e.g. `setAttribute("href", url)` only flags href-like attrs).
- Benchmark F1 ≥ 95% on a corpus of ≥ 25 cases.
- Advanced analysis (SSA lowering, context-sensitivity, symbolic execution,
  abstract interpretation) is exercised by fixtures for the language.

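As a rough illustration of what "argument-role-aware" means for the `setAttribute` example, the toy predicate below only treats the call as a sink when the first argument names a URL-like attribute. It is a sketch, not Nyx's matcher: the function name `set_attribute_is_sink` is hypothetical, and the contents of `URL_LIKE_ATTRS` beyond `href` are an assumption.

```python
# ASSUMPTION: illustrative attribute list; only "href" is named on this page.
URL_LIKE_ATTRS = {"href", "src", "action", "formaction"}


def set_attribute_is_sink(attr_name: str, value_is_tainted: bool) -> bool:
    """Gate the sink on the role of the first argument, not just the callee name."""
    return value_is_tainted and attr_name.lower() in URL_LIKE_ATTRS


# A tainted URL written to href is flagged; a tainted tooltip is not.
print(set_attribute_is_sink("href", True))   # prints: True
print(set_attribute_is_sink("title", True))  # prints: False
```

An ungated matcher would flag both calls, which is the kind of false positive the Beta-tier languages are more exposed to.
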
A language lands in **Beta** when benchmark F1 is in the mid-90s or higher
on a meaningful corpus but at least one Stable criterion fails. Typical
gaps: absence of gated sinks, or sanitizer rule depth narrow enough that
the engine compensates structurally rather than via the ruleset.

A language lands in **Preview** when the engine has documented structural
blind spots for constructs that are pervasive in typical codebases for that
language. For C and C++ that means deep pointer aliasing, function
pointers, and array-element taint; STL container flow and builder chains
have moved out of the blind-spot list. Synthetic-corpus F1 is not a
reliable signal for Preview-tier languages: a clean report can coexist
with structural gaps.

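The three tier rules can be read as a small decision procedure. The sketch below is one possible encoding under stated assumptions, not how the maintainers actually assign tiers: `LanguageProfile` and `assign_tier` are hypothetical names, the 0.94 cutoff for "mid-90s or higher" is an assumption chosen only so that Go's 94.1% qualifies, and the `advanced_fixtures` value for Go is a placeholder this page does not state.

```python
from dataclasses import dataclass


@dataclass
class LanguageProfile:
    vuln_classes: int            # classes with both source and sink matchers
    has_gated_sink: bool         # at least one argument-role-aware sink
    f1: float                    # rule-level benchmark F1, 0.0 to 1.0
    corpus_cases: int
    advanced_fixtures: bool      # advanced analysis exercised by fixtures
    pervasive_blind_spots: bool  # documented structural gaps for common idioms


def assign_tier(p: LanguageProfile) -> str:
    # Preview is decided first: F1 is not a reliable signal for these languages.
    if p.pervasive_blind_spots:
        return "Preview"
    stable = (
        p.vuln_classes >= 8
        and p.has_gated_sink
        and p.f1 >= 0.95
        and p.corpus_cases >= 25
        and p.advanced_fixtures
    )
    if stable:
        return "Stable"
    # ASSUMPTION: 0.94 stands in for "mid-90s or higher".
    if p.f1 >= 0.94:
        return "Beta"
    return "Unclassified"


python = LanguageProfile(8, True, 1.0, 46, True, False)
go = LanguageProfile(7, False, 0.941, 53, False, False)  # advanced_fixtures: placeholder
c = LanguageProfile(4, False, 1.0, 30, False, True)
print(assign_tier(python), assign_tier(go), assign_tier(c))  # prints: Stable Beta Preview
```

Checking the blind-spot flag before anything else mirrors the point above: C scores 100% F1 and still lands in Preview.
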
(The previous **Experimental** tier was retired in the 2026-04-25
measurement when Rust's adversarial corpus reached 100% F1; no language
currently sits in that tier.)

---

## What this means for you

- **CI gates**: safe to set strict `--fail-on HIGH` gates on Stable-tier
  languages. On Beta-tier, expect occasional FP triage on production code
  (the synthetic corpus does not cover every framework idiom); the
  weak-spot lists above tell you what to skim for. On Preview-tier, treat
  Nyx findings as a starting point for manual review rather than
  authoritative. STL container flow and builder chains are tracked now,
  but deep pointer aliasing and function pointers are not, so a clean
  report does not tell you what the engine could not see.
- **Rule contributions**: the shortest path to raising a language's tier is
  contributing sink matchers and gated-sink registrations. Label files live
  at `src/labels/<lang>.rs`; benchmark cases live at
  `tests/benchmark/corpus/<lang>/`.
- **Scope planning**: if your primary stack is C or C++, Nyx will surface
  real findings on obvious unsafe-API uses, but budget for review time and
  combine Nyx with `clang-tidy` or the Clang Static Analyzer. Rust is now
  Beta-tier and suitable as a CI gate; pair with `cargo-audit` for
  dependency CVEs.

The benchmark thresholds in `tests/benchmark_test.rs` are deliberately set
~5 pp below current baselines, so a drop of more than about 5 pp in a
language's F1 fails CI. Tier promotions require sustained benchmark
performance, not just rule additions.

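Read literally, a threshold about 5 pp below baseline means a small dip stays green and a larger regression fails. The sketch below shows that reading with Go's 94.1% baseline. It is an illustration only: the real checks live in `tests/benchmark_test.rs`, and the names `SLACK_PP`, `threshold_for`, and `passes`, along with the inclusive comparison, are assumptions.

```python
SLACK_PP = 5.0  # "~5 pp" from the text; the exact per-language margin may differ


def threshold_for(baseline_f1_pct: float) -> float:
    """CI threshold derived from a language's current baseline F1 (in percent)."""
    return baseline_f1_pct - SLACK_PP


def passes(baseline_f1_pct: float, measured_f1_pct: float) -> bool:
    # ASSUMPTION: inclusive comparison; the real test may use a strict one.
    return measured_f1_pct >= threshold_for(baseline_f1_pct)


go_baseline = 94.1
print(passes(go_baseline, 93.0))  # small dip, still above the threshold: True
print(passes(go_baseline, 88.0))  # regression beyond the slack: False
```
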