nyx/docs/detectors.md
Eli Peter 1bbe4b1cfb
Phase 1 (#33)
* chore: Exclude CLAUDE.md from Cargo.toml

* feat: add callgraph module and integrate into main analysis flow

* feat: enhance CLI with new severity filtering and analysis modes

* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling

* feat: implement state-model dataflow analysis for resource lifecycle and auth state

* feat: enhance diagnostic output formatting and add evidence structure

* feat: implement attack surface ranking for diagnostics with scoring and sorting

* feat: add comprehensive documentation for installation, usage, and rules reference

* feat: add multiple language support for command execution and evaluation endpoints

* feat: implement inline suppression for findings using `nyx:ignore` comments

* feat: add confidence levels to AST patterns and update output structure

* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets

* feat: bump version to 0.4.0 and update changelog with new features and improvements

* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
2026-02-25 21:16:36 -05:00

4.1 KiB

Detector Overview

Nyx uses four independent detector families. Each targets different vulnerability classes and operates at a different level of analysis depth. Findings from all active detectors are merged, deduplicated, ranked, and presented in a single result set.

The Four Detector Families

Family Rule prefix Analysis depth What it finds
Taint Analysis taint-* Cross-file dataflow Unsanitized data flowing from sources to sinks
CFG Structural cfg-* Intra-procedural CFG Auth gaps, unguarded sinks, resource leaks, error fallthrough
State Model state-* Intra-procedural lattice Use-after-close, double-close, resource leaks, unauthenticated access
AST Patterns <lang>.*.* Structural (no flow) Dangerous function calls, banned APIs, weak crypto

How They Combine

In --mode full (default), all four families run. Findings are deduplicated:

  1. Taint supersedes AST: If a taint finding and an AST pattern both fire at the same location (e.g. both flag eval(userInput)), both are kept with distinct rule IDs. The taint finding ranks higher due to the analysis-kind bonus.

  2. State supersedes CFG: If a state-model finding (e.g. state-resource-leak) fires at the same location as a CFG finding (e.g. cfg-resource-leak), the CFG finding is suppressed.

  3. Location-level dedup: Exact duplicates (same line, column, rule ID, severity) are removed.

Analysis Modes

Mode CLI flag Active detectors
Full --mode full All four
AST-only --mode ast AST patterns only
CFG/Taint --mode cfg Taint + CFG + State

Attack-Surface Ranking

Every finding receives a deterministic attack-surface score estimating exploitability. Findings are sorted by descending score.

Scoring Formula

score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty
Component Values Purpose
Severity base High=60, Medium=30, Low=10 Primary signal
Analysis kind taint=+10, state=+8, cfg(with evidence)=+5, cfg(no evidence)=+3, ast=+0 Confidence of analysis
Evidence strength +1 per evidence item (max 4), +2-6 for source kind Specificity of finding
State bonus use-after-close/unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 State rule severity
Validation penalty -5 if path-validated Guard reduces exploitability

Source-kind priority

Source type Bonus Examples
User input +6 req.body, argv, stdin, form, query, params
Environment +5 env::var, getenv, process.env
Unknown +4 Conservative default
File system +3 fs::read_to_string, fgets
Database +2 Query results

Score ranges (approximate)

Finding type Score range
High taint + user input ~76-80
High state (use-after-close) ~74
High CFG structural ~63-68
Medium taint + env source ~45-50
Medium state (resource leak) ~40
Low AST-only pattern ~10

Ranking is enabled by default. Disable with --no-rank or output.attack_surface_ranking = false.

Two-Pass Architecture

Nyx's taint analysis requires cross-file context, achieved via two passes:

  1. Pass 1 — Summary extraction: Each file is parsed, a CFG is built, and a FuncSummary is extracted per function. Summaries capture source/sanitizer/sink capabilities (bitflags), taint propagation behavior, and callee lists. Summaries are persisted to SQLite.

  2. Pass 2 — Analysis: All summaries are merged into a global map. Files are re-parsed and analyzed with full cross-file context. The taint engine resolves callees against local summaries (more precise) first, then falls back to global summaries.

With indexing enabled, Pass 1 skips files whose content hash hasn't changed since the last scan.