mirror of
https://github.com/elicpeter/nyx.git
synced 2026-07-03 20:41:00 +02:00
Dynamic (#77)
This commit is contained in:
parent
55247b7fcd
commit
991c84a1eb
1464 changed files with 225448 additions and 1985 deletions
|
|
@ -6,6 +6,23 @@ If you're going to act on a finding, it helps to know how the scanner got there.
|
|||
|
||||
A scan runs in two passes over the file tree, with an optional SQLite index that lets the second scan skip files whose content hash hasn't changed.
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
Walk["Walk file tree"] --> Pass1["Pass 1 per file<br/>tree-sitter parse, CFG, SSA"]
|
||||
Pass1 --> Summaries["Per-function summaries<br/>sources, sinks, sanitizers, returns, points-to"]
|
||||
Pass1 --> Hierarchy["Type hierarchy index<br/>extends, implements, impl-for, includes"]
|
||||
Summaries --> Global["GlobalSummaries map<br/>plus optional SQLite cache"]
|
||||
Hierarchy --> Global
|
||||
Global --> Pass2["Pass 2 per file<br/>cross-file context"]
|
||||
Pass2 --> Taint["Forward SSA taint worklist<br/>finite lattice, guaranteed convergence"]
|
||||
Pass2 --> Calls["Call precision<br/>k=1 inline, summaries, SCC fixed-point"]
|
||||
Taint --> Findings["Findings with evidence<br/>source, path, sink, engine notes"]
|
||||
Calls --> Findings
|
||||
Findings --> Rank["Rank and dedupe<br/>severity, confidence, score"]
|
||||
Rank --> Verify["Dynamic verification<br/>sandboxed harnesses, verdicts"]
|
||||
Verify --> Emit["Emit<br/>console, JSON, SARIF, UI"]
|
||||
```
|
||||
|
||||
**Pass 1, per file.** Tree-sitter parses the file. Nyx builds an intra-procedural control-flow graph, lowers it to SSA, and extracts a summary per function describing what that function does at the boundary: which arguments flow to sinks, which sources it reads from, which sinks it calls, what taint it strips, what it returns. Summaries are persisted to SQLite ([`src/summary/`](https://github.com/elicpeter/nyx/tree/master/src/summary/), [`src/database.rs`](https://github.com/elicpeter/nyx/blob/master/src/database.rs)).
|
||||
|
||||
**Summary merge.** All per-file summaries get unioned into a global map keyed by qualified function name.
|
||||
|
|
@ -18,6 +35,8 @@ When a method call has a receiver typed as a super-class, trait, or interface, *
|
|||
|
||||
A separate **field-sensitive points-to** pass tracks abstract locations down to the field level, so `c.mu.Lock()` is a lock on `Field(c, mu)` rather than on `c` as a whole. That distinction is what lets the resource-lifecycle and taint passes tell `obj.field = tainted; sink(obj.other_field)` apart from the conservative whole-variable approximation. Subscript reads and writes (`arr[i]`, `map[k] = v`) lower to synthetic `__index_get__` / `__index_set__` calls so the same container model handles them. Set `NYX_POINTER_ANALYSIS=0` to fall back to the pre-pointer-pass behaviour for baseline comparison.
|
||||
|
||||
**Dynamic verification.** After ranking and dedupe, default builds verify Medium and High confidence findings unless `--no-verify` or `scanner.verify = false` is set. The verifier derives a small harness from the finding, runs it in a sandbox against curated payloads, and stores the result on `evidence.dynamic_verdict`. `Confirmed` means a vulnerable payload fired and its benign control stayed clean. `NotConfirmed` means the harness ran but did not fire, not that the finding is closed.
|
||||
|
||||
## Optional analyses on top
|
||||
|
||||
These run on top of the forward taint pass. They're independently switchable via `[analysis.engine]` config or matching CLI flags. See [advanced-analysis.md](advanced-analysis.md) for the full description and tradeoffs.
|
||||
|
|
@ -47,6 +66,6 @@ Findings whose engine notes indicate a bound was hit can be filtered with `--req
|
|||
|
||||
## What you get out
|
||||
|
||||
Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.
|
||||
Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, dynamic verdict when one was attempted, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.
|
||||
|
||||
For the JSON shape and SARIF mapping, see [output.md](output.md).
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue