Phase 1 (#33)

* chore: Exclude CLAUDE.md from Cargo.toml
* feat: add callgraph module and integrate into main analysis flow
* feat: enhance CLI with new severity filtering and analysis modes
* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling
* feat: implement state-model dataflow analysis for resource lifecycle and auth state
* feat: enhance diagnostic output formatting and add evidence structure
* feat: implement attack surface ranking for diagnostics with scoring and sorting
* feat: add comprehensive documentation for installation, usage, and rules reference
* feat: add multiple language support for command execution and evaluation endpoints
* feat: implement inline suppression for findings using `nyx:ignore` comments
* feat: add confidence levels to AST patterns and update output structure
* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets
* feat: bump version to 0.4.0 and update changelog with new features and improvements
* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
2026-07-21 21:31:03 +02:00 · 2026-02-25 21:16:36 -05:00 · 2026-02-25 21:16:36 -05:00 · 1bbe4b1cfb
commit 1bbe4b1cfb
parent 19b578c5c4
456 changed files with 25628 additions and 1228 deletions
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -0,0 +1,234 @@
+# CLI Reference
+
+## Global
+
+```
+nyx [COMMAND]
+nyx --version
+nyx --help
+```
+
+---
+
+## `nyx scan`
+
+Run a security scan on a directory.
+
+```
+nyx scan [PATH] [OPTIONS]
+```
+
+**PATH** defaults to `.` (current directory).
+
+### Analysis Mode
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--mode <MODE>` | `full` | Analysis mode: `full`, `ast`, `cfg`, or `taint` |
+
+| Mode | What runs |
+|------|-----------|
+| `full` | AST patterns + CFG structural analysis + taint analysis |
+| `ast` | AST patterns only (fastest, no CFG or taint) |
+| `cfg` / `taint` | CFG + taint analysis only (no AST patterns) |
+
+**Deprecated aliases**: `--ast-only` (use `--mode ast`), `--cfg-only` (use `--mode cfg`), `--all-targets` (use `--mode full`).
+
+### Index Control
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--index <MODE>` | `auto` | Index behavior: `auto`, `off`, or `rebuild` |
+
+| Index Mode | Behavior |
+|------------|----------|
+| `auto` | Use existing index if available; build if missing |
+| `off` | Skip indexing, scan filesystem directly |
+| `rebuild` | Force rebuild index before scanning |
+
+**Deprecated aliases**: `--no-index` (use `--index off`), `--rebuild-index` (use `--index rebuild`).
+
+### Output
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `-f, --format <FMT>` | `console` | Output format: `console`, `json`, or `sarif` |
+| `--quiet` | off | Suppress status messages (stderr); stdout stays clean |
+| `--no-rank` | off | Disable attack-surface ranking |
+
+### Filtering
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--severity <EXPR>` | *(none)* | Filter findings by severity |
+| `--min-score <N>` | *(none)* | Drop findings with rank score below N |
+| `--min-confidence <LEVEL>` | *(none)* | Drop findings below this confidence level (`low`, `medium`, `high`) |
+| `--fail-on <SEV>` | *(none)* | Exit code 1 if any finding >= this severity |
+| `--show-suppressed` | off | Show inline-suppressed findings (dimmed, tagged `[SUPPRESSED]`) |
+| `--keep-nonprod-severity` | off | Don't downgrade severity for test/vendor paths |
+| `--all` | off | Disable category filtering, rollups, and LOW budgets — show everything |
+| `--include-quality` | off | Include Quality-category findings (hidden by default) |
+| `--max-low <N>` | `20` | Maximum total LOW findings to show |
+| `--max-low-per-file <N>` | `1` | Maximum LOW findings per file |
+| `--max-low-per-rule <N>` | `10` | Maximum LOW findings per rule |
+| `--rollup-examples <N>` | `5` | Number of example locations in rollup findings |
+| `--show-instances <RULE>` | *(none)* | Expand all instances of a specific rule (bypass rollup) |
+
+**Severity expression formats**:
+
+```bash
+--severity HIGH              # Only high
+--severity "HIGH,MEDIUM"     # High or medium
+--severity ">=MEDIUM"        # Medium and above (high + medium)
+--severity ">= low"         # All severities (case-insensitive)
+```
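A minimal sketch of how the four expression forms above could be interpreted. It is an illustration only, not Nyx's parser: the function name `parse_severity_expr` and the error behaviour for unknown levels are assumptions.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH"]  # ascending order


def parse_severity_expr(expr: str) -> set[str]:
    """Return the severities selected by a --severity style expression."""
    text = expr.strip().upper()  # matching is case-insensitive
    if text.startswith(">="):
        # Threshold form: ">=MEDIUM" or ">= low" (whitespace allowed).
        floor = text[2:].strip()
        if floor not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {floor!r}")
        return set(SEVERITY_ORDER[SEVERITY_ORDER.index(floor):])
    # List form: a single level or a comma-separated list of levels.
    levels = {part.strip() for part in text.split(",")}
    unknown = levels - set(SEVERITY_ORDER)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)!r}")
    return levels
```

For example, `parse_severity_expr(">= low")` selects all three levels, while `parse_severity_expr("HIGH,MEDIUM")` selects exactly those two.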
+
+**Deprecated aliases**: `--high-only` (use `--severity HIGH`), `--include-nonprod` (use `--keep-nonprod-severity`).
+
+### Examples
+
+```bash
+# Basic scan
+nyx scan
+
+# Scan specific path, JSON output
+nyx scan ./server --format json
+
+# CI gate: fail on medium+, SARIF output
+nyx scan . --format sarif --fail-on medium > results.sarif
+
+# Fast AST-only scan, no index
+nyx scan . --mode ast --index off
+
+# High-severity only, quiet mode
+nyx scan . --severity HIGH --quiet
+
+# Only findings scoring 50 or above
+nyx scan . --min-score 50
+
+# Only medium+ confidence findings
+nyx scan . --min-confidence medium
+
+# Show everything (no filtering, no rollups)
+nyx scan . --all
+
+# Include quality findings but keep rollups and budgets
+nyx scan . --include-quality
+
+# See all unwrap findings expanded
+nyx scan . --include-quality --show-instances rs.quality.unwrap
+
+# Allow more LOW findings
+nyx scan . --max-low 50 --max-low-per-file 5
+```
+
+---
+
+## `nyx index`
+
+Manage the SQLite file index.
+
+### `nyx index build`
+
+```
+nyx index build [PATH] [--force]
+```
+
+Build or update the index for the given path (default: `.`).
+
+| Flag | Description |
+|------|-------------|
+| `-f, --force` | Force full rebuild, ignoring cached file hashes |
+
+### `nyx index status`
+
+```
+nyx index status [PATH]
+```
+
+Display index statistics (file count, size, last modified) for the given path.
+
+---
+
+## `nyx list`
+
+```
+nyx list [-v]
+```
+
+List all indexed projects.
+
+| Flag | Description |
+|------|-------------|
+| `-v, --verbose` | Show detailed information per project |
+
+---
+
+## `nyx clean`
+
+```
+nyx clean [PROJECT] [--all]
+```
+
+Remove index data.
+
+| Argument/Flag | Description |
+|---------------|-------------|
+| `PROJECT` | Project name or path to clean |
+| `--all` | Clean all indexed projects |
+
+---
+
+## `nyx config`
+
+Manage configuration.
+
+### `nyx config show`
+
+Print the effective merged configuration as TOML.
+
+### `nyx config path`
+
+Print the configuration directory path.
+
+### `nyx config add-rule`
+
+```
+nyx config add-rule --lang <LANG> --matcher <MATCHER> --kind <KIND> --cap <CAP>
+```
+
+Add a custom taint rule. Written to `nyx.local`.
+
+| Flag | Values |
+|------|--------|
+| `--lang` | `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby` |
+| `--matcher` | Function or property name to match |
+| `--kind` | `source`, `sanitizer`, `sink` |
+| `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `all` |
+
+### `nyx config add-terminator`
+
+```
+nyx config add-terminator --lang <LANG> --name <NAME>
+```
+
+Add a terminator function (e.g. `process.exit`). Written to `nyx.local`.
+
+---
+
+## Exit Codes
+
+| Code | Meaning |
+|------|---------|
+| `0` | Scan completed; no findings matched `--fail-on` threshold (or no `--fail-on` specified) |
+| `1` | Scan completed but at least one finding met or exceeded the `--fail-on` severity |
+| Non-zero | Error during scan (I/O error, config parse error, database error, etc.) |
+
+---
+
+## Environment Variables
+
+| Variable | Description |
+|----------|-------------|
+| `RUST_LOG` | Set tracing verbosity (e.g. `RUST_LOG=debug nyx scan .`) |
+| `NO_COLOR` | Disable ANSI color output |
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -0,0 +1,183 @@
+# Configuration
+
+Nyx uses TOML configuration files. A default config is auto-generated on first run.
+
+## File Locations
+
+| Platform | Directory |
+|----------|-----------|
+| Linux | `~/.config/nyx/` |
+| macOS | `~/Library/Application Support/nyx/` |
+| Windows | `%APPDATA%\elicpeter\nyx\config\` |
+
+Run `nyx config path` to see the exact directory on your system.
+
+## File Precedence
+
+1. **`nyx.conf`** — Default config (auto-created from built-in template on first run)
+2. **`nyx.local`** — User overrides (loaded on top of defaults)
+
+Both files are optional. CLI flags take precedence over both.
+
+## Merge Strategy
+
+| Type | Behavior |
+|------|----------|
+| Scalars (`mode`, `min_severity`, booleans) | User value wins |
+| Arrays (`excluded_extensions`, `excluded_directories`) | Union + deduplicate |
+| Analysis rules | Per-language union with deduplication |
+
+Example:
+```toml
+# nyx.conf (default):
+excluded_extensions = ["jpg", "png", "exe"]
+
+# nyx.local (user):
+excluded_extensions = ["foo", "jpg"]
+
+# Effective result:
+# ["exe", "foo", "jpg", "png"]  — sorted, deduped union
+```
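The merge rules in the table can be sketched as a small recursive function. This is a hedged illustration, assuming the parsed TOML is available as nested dictionaries; `merge_config` is a hypothetical name, and the sketch only handles arrays of strings (rule objects would need their own deduplication key).

```python
def merge_config(defaults: dict, overrides: dict) -> dict:
    """Merge user overrides on top of defaults, following the table above."""
    merged = dict(defaults)
    for key, value in overrides.items():
        base = merged.get(key)
        if isinstance(base, dict) and isinstance(value, dict):
            # Tables such as [scanner] are merged field by field.
            merged[key] = merge_config(base, value)
        elif isinstance(base, list) and isinstance(value, list):
            # Arrays: sorted, deduplicated union of both sides.
            merged[key] = sorted(set(base) | set(value))
        else:
            # Scalars (and keys missing from the defaults): user value wins.
            merged[key] = value
    return merged


defaults = {"scanner": {"mode": "full", "excluded_extensions": ["jpg", "png", "exe"]}}
overrides = {"scanner": {"mode": "ast", "excluded_extensions": ["foo", "jpg"]}}
effective = merge_config(defaults, overrides)
```

With these inputs, `effective["scanner"]["excluded_extensions"]` is `["exe", "foo", "jpg", "png"]`, matching the documented result, and `mode` takes the user value `"ast"`.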
+
+---
+
+## Full Schema
+
+### `[scanner]`
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `mode` | `"full"` \| `"ast"` \| `"cfg"` \| `"taint"` | `"full"` | Analysis mode |
+| `min_severity` | `"Low"` \| `"Medium"` \| `"High"` | `"Low"` | Minimum severity to report |
+| `max_file_size_mb` | int \| null | null | Max file size in MiB; null = unlimited |
+| `excluded_extensions` | [string] | `["jpg", "png", "gif", "mp4", ...]` | File extensions to skip |
+| `excluded_directories` | [string] | `["node_modules", ".git", "target", ...]` | Directories to skip |
+| `excluded_files` | [string] | `[]` | Specific files to skip |
+| `read_global_ignore` | bool | `false` | Honor global ignore file |
+| `read_vcsignore` | bool | `true` | Honor `.gitignore` / `.hgignore` |
+| `require_git_to_read_vcsignore` | bool | `true` | Require `.git` dir to apply gitignore |
+| `one_file_system` | bool | `false` | Don't cross filesystem boundaries |
+| `follow_symlinks` | bool | `false` | Follow symbolic links |
+| `scan_hidden_files` | bool | `false` | Scan dot-files |
+| `include_nonprod` | bool | `false` | Keep original severity for test/vendor paths |
+| `enable_state_analysis` | bool | `false` | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires `mode = "full"` or `mode = "cfg"`. |
+
+### `[database]`
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `path` | string | `""` | Custom SQLite DB path; empty = platform default |
+
+### `[output]`
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `default_format` | `"console"` \| `"json"` \| `"sarif"` | `"console"` | Default output format |
+| `quiet` | bool | `false` | Suppress status messages |
+| `max_results` | int \| null | null | Cap number of findings; null = unlimited |
+| `attack_surface_ranking` | bool | `true` | Enable attack-surface ranking |
+| `min_score` | int \| null | null | Minimum rank score to include; null = no minimum |
+| `min_confidence` | string \| null | null | Minimum confidence level (`"low"`, `"medium"`, `"high"`); null = no minimum |
+| `include_quality` | bool | `false` | Include Quality-category findings (hidden by default) |
+| `show_all` | bool | `false` | Disable category filtering, rollups, and LOW budgets |
+| `max_low` | int | `20` | Maximum total LOW findings to show (rollups count as 1) |
+| `max_low_per_file` | int | `1` | Maximum LOW findings per file (rollups count as 1) |
+| `max_low_per_rule` | int | `10` | Maximum LOW findings per rule (rollups count as 1) |
+| `rollup_examples` | int | `5` | Number of example locations stored in rollup findings |
+
+### `[performance]`
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `worker_threads` | int \| null | null | Worker thread count; null/0 = auto-detect |
+| `batch_size` | int | `100` | Files per index batch |
+| `channel_multiplier` | int | `4` | Channel capacity = threads x multiplier |
+| `rayon_thread_stack_size` | int | `8388608` | Rayon thread stack size in bytes (8 MiB) |
+| `prune` | bool | `false` | Stop traversing into matching directories |
+
+### `[analysis.languages.<slug>]`
+
+Per-language custom rules. `<slug>` is one of: `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby`.
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `rules` | array of rule objects | Custom label rules |
+| `terminators` | [string] | Functions that terminate execution |
+| `event_handlers` | [string] | Event handler function names |
+
+**Rule object**:
+
+```toml
+[[analysis.languages.javascript.rules]]
+matchers = ["escapeHtml"]
+kind = "sanitizer"        # "source" | "sanitizer" | "sink"
+cap = "html_escape"       # "env_var" | "html_escape" | "shell_escape" |
+                          # "url_encode" | "json_parse" | "file_io" | "all"
+```
+
+---
+
+## Example Configurations
+
+### Minimal override (`nyx.local`)
+
+```toml
+[scanner]
+min_severity = "Medium"
+
+[output]
+default_format = "json"
+max_results = 100
+```
+
+### CI-optimized
+
+```toml
+[scanner]
+mode = "full"
+min_severity = "Medium"
+excluded_directories = ["node_modules", ".git", "target", "vendor", "dist"]
+
+[output]
+quiet = true
+default_format = "sarif"
+
+[performance]
+worker_threads = 4
+```
+
+### Custom rules for a Node.js project
+
+```toml
+[analysis.languages.javascript]
+terminators = ["process.exit", "abort"]
+event_handlers = ["addEventListener"]
+
+[[analysis.languages.javascript.rules]]
+matchers = ["escapeHtml", "sanitizeInput"]
+kind = "sanitizer"
+cap = "html_escape"
+
+[[analysis.languages.javascript.rules]]
+matchers = ["dangerouslySetInnerHTML"]
+kind = "sink"
+cap = "html_escape"
+
+[[analysis.languages.javascript.rules]]
+matchers = ["getRequestBody", "readUserInput"]
+kind = "source"
+cap = "all"
+```
+
+### Adding rules via CLI
+
+```bash
+# Add a sanitizer
+nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape
+
+# Add a terminator
+nyx config add-terminator --lang javascript --name process.exit
+
+# Verify
+nyx config show
+```
--- a/docs/detectors.md
+++ b/docs/detectors.md
@@ -0,0 +1,81 @@
+# Detector Overview
+
+Nyx uses four independent detector families. Each targets different vulnerability classes and operates at a different level of analysis depth. Findings from all active detectors are merged, deduplicated, ranked, and presented in a single result set.
+
+## The Four Detector Families
+
+| Family | Rule prefix | Analysis depth | What it finds |
+|--------|------------|----------------|---------------|
+| [**Taint Analysis**](detectors/taint.md) | `taint-*` | Cross-file dataflow | Unsanitized data flowing from sources to sinks |
+| [**CFG Structural**](detectors/cfg.md) | `cfg-*` | Intra-procedural CFG | Auth gaps, unguarded sinks, resource leaks, error fallthrough |
+| [**State Model**](detectors/state.md) | `state-*` | Intra-procedural lattice | Use-after-close, double-close, resource leaks, unauthenticated access |
+| [**AST Patterns**](detectors/patterns.md) | `<lang>.*.*` | Structural (no flow) | Dangerous function calls, banned APIs, weak crypto |
+
+## How They Combine
+
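+In `--mode full` (default), all four families run; the State Model family is active only when `scanner.enable_state_analysis = true`. Overlapping findings are then handled as follows: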
+
+1. **Taint outranks AST**: If a taint finding and an AST pattern both fire at the same location (e.g. both flag `eval(userInput)`), both are kept with distinct rule IDs. The taint finding ranks higher due to the analysis-kind bonus.
+
+2. **State supersedes CFG**: If a state-model finding (e.g. `state-resource-leak`) fires at the same location as a CFG finding (e.g. `cfg-resource-leak`), the CFG finding is suppressed.
+
+3. **Location-level dedup**: Exact duplicates (same line, column, rule ID, severity) are removed.
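The three rules above can be sketched as a single pass over the merged findings. This is a rough illustration, not Nyx's implementation: findings are modelled as plain dictionaries, "same location" is assumed to mean same file and line, and `taint-example` is a hypothetical rule ID.

```python
def deduplicate(findings: list[dict]) -> list[dict]:
    """Apply the state-over-CFG rule and exact-duplicate removal."""
    # Locations where a state-model finding fired.
    state_locations = {
        (f["file"], f["line"]) for f in findings if f["rule"].startswith("state-")
    }
    seen = set()
    kept = []
    for f in findings:
        # Rule 2: a CFG finding is suppressed where a state finding exists.
        if f["rule"].startswith("cfg-") and (f["file"], f["line"]) in state_locations:
            continue
        # Rule 3: drop exact duplicates (file is included in the key here).
        key = (f["file"], f["line"], f["col"], f["rule"], f["severity"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(f)  # Rule 1: taint and AST findings both survive.
    return kept


findings = [
    {"file": "a.c", "line": 3, "col": 5, "rule": "state-resource-leak", "severity": "medium"},
    {"file": "a.c", "line": 3, "col": 5, "rule": "cfg-resource-leak", "severity": "medium"},
    {"file": "b.js", "line": 9, "col": 1, "rule": "taint-example", "severity": "high"},
    {"file": "b.js", "line": 9, "col": 1, "rule": "js.code_exec.eval", "severity": "high"},
    {"file": "b.js", "line": 9, "col": 1, "rule": "js.code_exec.eval", "severity": "high"},
]
kept_rules = [f["rule"] for f in deduplicate(findings)]
```

Here `kept_rules` ends up as `["state-resource-leak", "taint-example", "js.code_exec.eval"]`: the CFG leak is suppressed, the taint and AST findings coexist, and the repeated AST finding is collapsed.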
+
+## Analysis Modes
+
+| Mode | CLI flag | Active detectors |
+|------|----------|-----------------|
+| Full | `--mode full` | All four (State only when `scanner.enable_state_analysis = true`) |
+| AST-only | `--mode ast` | AST patterns only |
+| CFG/Taint | `--mode cfg` or `--mode taint` | Taint + CFG, plus State when enabled |
+
+## Attack-Surface Ranking
+
+Every finding receives a deterministic **attack-surface score** estimating exploitability. Findings are sorted by descending score.
+
+### Scoring Formula
+
+```
+score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty
+```
+
+| Component | Values | Purpose |
+|-----------|--------|---------|
+| **Severity base** | High=60, Medium=30, Low=10 | Primary signal |
+| **Analysis kind** | taint=+10, state=+8, cfg(with evidence)=+5, cfg(no evidence)=+3, ast=+0 | Confidence of analysis |
+| **Evidence strength** | +1 per evidence item (max 4), +2-6 for source kind | Specificity of finding |
+| **State bonus** | use-after-close/unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 | State rule severity |
+| **Validation penalty** | -5 if path-validated | Guard reduces exploitability |
+
+### Source-kind priority
+
+| Source type | Bonus | Examples |
+|-------------|-------|---------|
+| User input | +6 | `req.body`, `argv`, `stdin`, `form`, `query`, `params` |
+| Environment | +5 | `env::var`, `getenv`, `process.env` |
+| Unknown | +4 | Conservative default |
+| File system | +3 | `fs::read_to_string`, `fgets` |
+| Database | +2 | Query results |
+
+### Score ranges (approximate)
+
+| Finding type | Score range |
+|-------------|------------|
+| High taint + user input | ~76-80 |
+| High state (use-after-close) | ~74 |
+| High CFG structural | ~63-68 |
+| Medium taint + env source | ~45-50 |
+| Medium state (resource leak) | ~40 |
+| Low AST-only pattern | ~10 |
+
+Ranking is enabled by default. Disable with `--no-rank` or `output.attack_surface_ranking = false`.
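As a worked example of the scoring formula and its component tables, the sketch below recomputes a few of the approximate score ranges. It is an illustration under assumptions: the function name, the string keys for source kinds, and the mapping of "must-leak" and "may-leak" to `state-resource-leak` and `state-resource-leak-possible` are not taken from the Nyx source.

```python
from typing import Optional

SEVERITY_BASE = {"high": 60, "medium": 30, "low": 10}
SOURCE_KIND_BONUS = {  # hypothetical keys for the source-kind table
    "user_input": 6, "environment": 5, "unknown": 4, "file_system": 3, "database": 2,
}
STATE_BONUS = {  # assumed mapping from state rule IDs to the state bonus
    "state-use-after-close": 6, "state-unauthed-access": 6,
    "state-double-close": 3, "state-resource-leak": 2,
    "state-resource-leak-possible": 1,
}


def attack_surface_score(severity: str, kind: str, evidence_items: int = 0,
                         source_kind: Optional[str] = None,
                         rule_id: Optional[str] = None,
                         path_validated: bool = False) -> int:
    score = SEVERITY_BASE[severity]
    if kind == "taint":
        score += 10
    elif kind == "state":
        score += 8
    elif kind == "cfg":
        score += 5 if evidence_items > 0 else 3
    # kind == "ast" adds nothing
    score += min(evidence_items, 4)           # +1 per evidence item, capped at 4
    if source_kind is not None:
        score += SOURCE_KIND_BONUS[source_kind]
    score += STATE_BONUS.get(rule_id, 0)
    if path_validated:
        score -= 5                            # validation penalty
    return score
```

Under these assumptions, a High taint finding from user input scores 76 with no evidence items and 80 with four, and a High `state-use-after-close` finding scores 74, which lines up with the ranges listed above.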
+
+## Two-Pass Architecture
+
+Nyx's taint analysis requires cross-file context, achieved via two passes:
+
+1. **Pass 1 — Summary extraction**: Each file is parsed, a CFG is built, and a `FuncSummary` is extracted per function. Summaries capture source/sanitizer/sink capabilities (bitflags), taint propagation behavior, and callee lists. Summaries are persisted to SQLite.
+
+2. **Pass 2 — Analysis**: All summaries are merged into a global map. Files are re-parsed and analyzed with full cross-file context. The taint engine resolves callees against local summaries (more precise) first, then falls back to global summaries.
+
+With indexing enabled, Pass 1 skips files whose content hash hasn't changed since the last scan.
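The hash-based skip in Pass 1 and the local-then-global lookup in Pass 2 might look roughly like this. It is a sketch under stated assumptions: a plain dictionary stands in for the SQLite index, SHA-256 stands in for whatever content hash Nyx uses, and `extract_summary` is a hypothetical placeholder for real summary extraction.

```python
import hashlib
from typing import Optional


def extract_summary(source: bytes) -> dict:
    """Hypothetical stand-in for parsing a file and building its summaries."""
    return {"size": len(source)}


def pass1(files: dict[str, bytes], index: dict[str, dict]) -> list[str]:
    """Refresh summaries in `index`; return the paths that were re-extracted."""
    reextracted = []
    for path, source in files.items():
        digest = hashlib.sha256(source).hexdigest()
        entry = index.get(path)
        if entry is not None and entry["hash"] == digest:
            continue  # unchanged since the last scan: reuse the stored summary
        index[path] = {"hash": digest, "summary": extract_summary(source)}
        reextracted.append(path)
    return reextracted


def resolve_callee(name: str, local: dict, global_map: dict) -> Optional[dict]:
    """Pass 2 lookup: prefer the same-file summary, then the global map."""
    if name in local:
        return local[name]
    return global_map.get(name)
```

Running `pass1` twice over unchanged files re-extracts nothing the second time; editing one file causes only that file to be processed again.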
--- a/docs/detectors/cfg.md
+++ b/docs/detectors/cfg.md
@@ -0,0 +1,161 @@
+# CFG Structural Analysis
+
+## Summary
+
+Nyx builds an intra-procedural control-flow graph (CFG) for each function and analyzes structural properties: whether sinks are guarded by sanitizers or validators, whether web handlers check authentication, whether resources are released on all exit paths, and whether error-handling code terminates properly.
+
+These detectors use **dominator analysis** — they check whether a guard node dominates (must execute before) a sink node on the CFG.
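To make the dominator check concrete, here is a small sketch using the classic iterative dominator algorithm on a CFG given as an adjacency map. It illustrates the idea only; the node names and the helper `is_guarded` are hypothetical, and Nyx's own CFG representation is not shown in this document.

```python
def dominators(cfg: dict[str, list[str]], entry: str) -> dict[str, set[str]]:
    """Compute dominator sets; every node must appear as a key in `cfg`."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)
    dom = {n: set(nodes) for n in nodes}  # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def is_guarded(cfg: dict[str, list[str]], entry: str, guard: str, sink: str) -> bool:
    """True if every path from entry to sink passes through guard."""
    return guard in dominators(cfg, entry)[sink]


guarded = {"entry": ["validate"], "validate": ["sink"], "sink": ["exit"], "exit": []}
bypass = {"entry": ["validate", "sink"], "validate": ["sink"], "sink": ["exit"], "exit": []}
```

In `guarded`, the `validate` node dominates `sink`. In `bypass`, the extra edge from `entry` straight to `sink` means the guard no longer dominates it, which is the situation `cfg-unguarded-sink` describes.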
+
+## Rule IDs
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `cfg-unguarded-sink` | High/Medium | Sink reachable without a dominating guard or sanitizer |
+| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth check |
+| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
+| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
+| `cfg-unreachable-source` | Low | Source in unreachable code |
+| `cfg-error-fallthrough` | High/Medium | Error check doesn't terminate; dangerous code follows |
+| `cfg-resource-leak` | Medium | Resource acquired but not released on all exit paths |
+| `cfg-lock-not-released` | Medium | Lock acquired but not released on all exit paths |
+
+## What It Detects
+
+### Unguarded sinks (`cfg-unguarded-sink`)
+A sink call (e.g. `system()`, `eval()`, `Command::new()`) is reachable from the function entry without passing through a guard or sanitizer that matches the sink's capability.
+
+### Auth gaps (`cfg-auth-gap`)
+A function identified as a web handler (by parameter naming conventions like `req`, `res`, `ctx`, `request`) reaches a privileged sink (shell execution, file I/O) without a prior call to an authentication function (`is_authenticated`, `require_auth`, `check_permission`, etc.).
+
+### Unreachable security code (`cfg-unreachable-*`)
+Sinks, sanitizers, or sources in dead code branches. This often indicates a refactoring error where security-critical code was accidentally made unreachable.
+
+### Error fallthrough (`cfg-error-fallthrough`)
+An error check (null check, error return check) does not terminate the function or loop back. Execution continues to a dangerous operation on the error path.
+
+### Resource leaks (`cfg-resource-leak`, `cfg-lock-not-released`)
+A resource acquisition call (e.g. `File::open`, `fopen`, `socket`, `Lock`) is not matched by a release call (e.g. `close`, `fclose`, `unlock`) on all exit paths from the function.
+
+## What It Cannot Detect
+
+- **Inter-procedural guards**: If authentication is checked in a middleware function that calls this handler, the CFG detector cannot see it. It only analyzes one function at a time.
+- **Dynamic dispatch**: Virtual method calls, function pointers, and closures are opaque to the CFG.
+- **Complex guard patterns**: Only recognized guard function names are checked. Custom validation logic (e.g. `if password == expected`) is not recognized as a guard.
+- **Correct sanitization**: The detector checks that *some* guard dominates the sink, not that the guard is *correct*. A guard that always passes would suppress the finding.
+- **Cross-function resource flows**: If a file handle is opened in one function and closed in another, the detector will report a leak in the first function.
+
+## Common False Positives
+
+| Scenario | Why it fires | Mitigation |
+|----------|-------------|------------|
+| Framework-level auth middleware | Handler doesn't call auth directly | Document as expected; suppress with severity filter |
+| Resource closed via RAII/defer | Implicit cleanup not visible to CFG | No mitigation yet; implicit cleanup is not recognized (known limitation) |
+| Custom guard function name | Function not in the recognized guard list | Add the function name as a sanitizer in config |
+| Test handlers | Intentionally skip auth in tests | Default non-prod downgrade reduces severity; or exclude test dirs |
+
+## Common False Negatives
+
+| Scenario | Why it's missed |
+|----------|----------------|
+| Auth in called function | Cross-function guards not tracked |
+| Guard via type system | Type-level guarantees (e.g. Rust's `AuthenticatedUser` wrapper) not analyzed |
+| Resource closed in finally/defer | Some cleanup patterns not recognized |
+
+## Confidence Signals
+
+| Signal | Meaning |
+|--------|---------|
+| **Evidence lists guard nodes** | Shows which guards were checked and found missing |
+| **Sink has high capability** | Shell execution or file I/O sinks are higher risk |
+| **Handler detection matched** | Web handler identification is based on conventional parameter names |
+
+## Tuning and Noise Controls
+
+### Add custom guards/sanitizers
+
+```toml
+[[analysis.languages.python.rules]]
+matchers = ["validate_request", "check_csrf"]
+kind = "sanitizer"
+cap = "all"
+```
+
+### Add auth rules
+
+Auth checks are recognized by function name. If your codebase uses non-standard names:
+
+```toml
+[[analysis.languages.javascript.rules]]
+matchers = ["ensureLoggedIn", "requirePermission"]
+kind = "sanitizer"
+cap = "all"
+```
+
+### Filter results
+
+```bash
+# Skip low-severity unreachable findings
+nyx scan . --severity ">=MEDIUM"
+```
+
+### Disable CFG analysis
+
+```bash
+nyx scan . --mode ast   # AST patterns only
+```
+
+## Examples
+
+### Unguarded sink
+
+```go
+func handler(w http.ResponseWriter, r *http.Request) {
+    cmd := r.URL.Query().Get("cmd")
+    exec.Command("sh", "-c", cmd).Run()  // cfg-unguarded-sink: no guard dominates
+}
+```
+
+### Auth gap
+
+```javascript
+app.get('/admin/delete', (req, res) => {
+    // No is_authenticated() call
+    db.execute("DELETE FROM users WHERE id = " + req.params.id);
+    // cfg-auth-gap: web handler reaches privileged sink without auth
+});
+```
+
+### Resource leak
+
+```c
+void process() {
+    FILE *f = fopen("data.txt", "r");  // acquire
+    if (error) {
+        return;  // cfg-resource-leak: f not closed on this path
+    }
+    fclose(f);
+}
+```
+
+## Guard Rules
+
+Nyx recognizes these function name patterns as guards:
+
+| Pattern | Applies to |
+|---------|-----------|
+| `validate*`, `sanitize*` | All sinks |
+| `check_*`, `verify_*`, `assert_*` | All sinks |
+| `shell_escape` | Shell execution sinks |
+| `html_escape` | HTML/XSS sinks |
+| `url_encode` | URL sinks |
+| `which` | Shell execution (binary lookup) |
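One way to read the guard table is as a list of glob patterns, each tied to the sinks it covers. The sketch below uses Python's `fnmatch` for the glob matching; the sink-kind labels (`"shell"`, `"html"`, `"url"`) and the case-sensitive matching are assumptions made for illustration, not documented Nyx behaviour.

```python
from fnmatch import fnmatchcase

# (pattern, sink kind it applies to); "all" covers every sink kind.
GUARD_PATTERNS = [
    ("validate*", "all"),
    ("sanitize*", "all"),
    ("check_*", "all"),
    ("verify_*", "all"),
    ("assert_*", "all"),
    ("shell_escape", "shell"),
    ("html_escape", "html"),
    ("url_encode", "url"),
    ("which", "shell"),
]


def guards_sink(func_name: str, sink_kind: str) -> bool:
    """True if a call to func_name counts as a guard for this sink kind."""
    for pattern, applies_to in GUARD_PATTERNS:
        if fnmatchcase(func_name, pattern) and applies_to in ("all", sink_kind):
            return True
    return False
```

Under this reading, `validate_input` guards any sink, `html_escape` guards only HTML sinks, and a name like `checkout` does not match `check_*` because the underscore is part of the pattern.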
+
+### Auth rules
+
+| Pattern | Category |
+|---------|----------|
+| `is_authenticated`, `require_auth`, `check_permission` | Common |
+| `authorize`, `authenticate`, `require_login` | Common |
+| `check_auth`, `verify_token`, `validate_token` | Common |
+| `middleware.auth`, `auth.required` | Go |
+| `isAuthenticated`, `checkPermission`, `hasAuthority`, `hasRole` | Java |
--- a/docs/detectors/patterns.md
+++ b/docs/detectors/patterns.md
@@ -0,0 +1,149 @@
+# AST Pattern Matching
+
+## Summary
+
+AST patterns are tree-sitter queries that match specific structural code constructs. They are the simplest and fastest detector family — no dataflow, no CFG, just structural presence. A match means the dangerous construct exists in the code; it does not prove the code is exploitable.
+
+AST patterns run in all analysis modes, including `--mode ast` (where they are the only active detector).
+
+## Rule IDs
+
+Pattern rule IDs follow the format `<lang>.<category>.<specific>`:
+
+```
+rs.memory.transmute
+js.code_exec.eval
+py.deser.pickle_loads
+c.memory.gets
+java.sqli.execute_concat
+```
+
+See the [Rule Reference](../rules/index.md) for a complete listing per language.
+
+## Pattern Tiers
+
+| Tier | Meaning | Examples |
+|------|---------|---------|
+| **A** | Structural presence alone is high-signal | `gets()`, `eval()`, `pickle.loads()`, `mem::transmute` |
+| **B** | Query includes a heuristic guard | SQL `execute` with concatenated arg, `printf(var)` with non-literal format |
+
+Tier B patterns use additional tree-sitter predicates to reduce false positives. For example, `java.sqli.execute_concat` only fires when `executeQuery()` receives a `binary_expression` (string concatenation) as its argument, not when it receives a literal or parameter placeholder.
+
+## What It Detects
+
+### By category
+
+| Category | What it matches | Example languages |
+|----------|----------------|-------------------|
+| **CommandExec** | Shell command execution functions | C (`system`), Python (`os.system`), Ruby (backticks) |
+| **CodeExec** | Dynamic code evaluation | JS (`eval`, `new Function()`), Python (`exec`), PHP (`eval`) |
+| **Deserialization** | Unsafe object deserialization | Java (`readObject`), Python (`pickle.loads`), Ruby (`Marshal.load`) |
+| **SqlInjection** | SQL with string concatenation | Java, Go, Python, PHP (Tier B heuristic) |
+| **PathTraversal** | File inclusion with variable path | PHP (`include $var`) |
+| **Xss** | XSS sink functions | JS (`document.write`, `outerHTML`), Java (`getWriter().print`) |
+| **Crypto** | Weak cryptographic algorithms | All languages (`md5`, `sha1`, `Math.random()`) |
+| **Secrets** | Hardcoded credentials | Go (variable name matching) |
+| **InsecureTransport** | Unencrypted communication | Go (`InsecureSkipVerify`), JS (`fetch("http://")`) |
+| **Reflection** | Dynamic class/method dispatch | Java (`Class.forName`, `Method.invoke`), Ruby (`send`, `constantize`) |
+| **MemorySafety** | Memory safety violations | Rust (`transmute`, `unsafe`), C (`gets`, `strcpy`, `sprintf`) |
+| **Prototype** | Prototype pollution | JS/TS (`__proto__` assignment) |
+| **CodeQuality** | Panic/abort/type-safety issues | Rust (`unwrap`, `panic!`), TS (`as any`) |
+
+## What It Cannot Detect
+
+- **Dataflow**: Patterns don't track whether the dangerous function receives tainted input. `eval("hello")` (safe) and `eval(userInput)` (dangerous) both match `js.code_exec.eval`.
+- **Context**: Patterns don't understand whether the code is reachable, guarded, or inside a test.
+- **Semantics**: `strcpy(dst, src)` always matches — it cannot determine buffer sizes.
+- **Indirect calls**: Function pointers, dynamic dispatch, and aliased references are invisible.
+
+## Common False Positives
+
+| Scenario | Why it fires | Mitigation |
+|----------|-------------|------------|
+| `eval()` with a hardcoded string literal | Pattern matches structural presence | Taint analysis won't flag this — use `--mode cfg` for fewer false positives |
+| `unsafe` block in Rust with sound justification | All unsafe blocks match | Filter with `--severity HIGH` (unsafe_block is Medium, so `">=MEDIUM"` still includes it) |
+| `.unwrap()` in test code | Acceptable in tests | Default non-prod downgrade reduces severity |
+| `md5()` used for checksums (not security) | Pattern doesn't know usage intent | Filter Low severity or add to exclusions |
+| SQL concatenation with trusted data | Tier B heuristic can't verify data source | Taint analysis is more precise here |
+
+## Common False Negatives
+
+| Scenario | Why it's missed |
+|----------|----------------|
+| `eval` called via alias (`let e = eval; e(input)`) | Pattern matches the identifier `eval`, not the resolved function |
+| Dangerous function in a macro expansion | Tree-sitter parses the macro call, not the expansion |
+| SQL injection via ORM query builder | No pattern for ORM-specific query building |
+| Imported function under different name | `from os import system as s; s(cmd)` — pattern looks for `system` |
+
+## Confidence Signals
+
+| Signal | Meaning |
+|--------|---------|
+| **Tier A** | High confidence — the function itself is dangerous |
+| **Tier B** | Moderate confidence — heuristic guard reduces false positives |
+| **High severity** | Critical vulnerability class (command exec, deserialization) |
+| **Low severity** | Informational (weak crypto, code quality) |
+| **Non-prod path** | Finding in test/vendor code — downgraded by default |
+
+## Tuning and Noise Controls
+
+### Severity filtering
+
+```bash
+# Skip code-quality and weak-crypto findings
+nyx scan . --severity ">=MEDIUM"
+
+# Only critical findings
+nyx scan . --severity HIGH
+```
+
+### Use taint for precision
+
+```bash
+# CFG + taint mode: skip AST patterns and report only flow-based findings
+nyx scan . --mode cfg
+```
+
+### Exclude directories
+
+```toml
+[scanner]
+excluded_directories = ["node_modules", "vendor", "generated"]
+```
+
+## Examples
+
+### Tier A — structural presence
+
+**C: Banned function**
+```c
+char buf[64];
+gets(buf);  // c.memory.gets — always dangerous, no safe usage
+```
+
+**Python: Unsafe deserialization**
+```python
+import pickle
+data = pickle.loads(user_input)  # py.deser.pickle_loads
+```
+
+### Tier B — heuristic-guarded
+
+**Java: SQL concatenation**
+```java
+// Fires: concatenated argument
+stmt.executeQuery("SELECT * FROM users WHERE id=" + userId);
+// java.sqli.execute_concat
+
+// Does NOT fire: parameterized query
+stmt.executeQuery(preparedSql);
+```
+
+**C: Format string**
+```c
+// Fires: variable as first argument
+printf(user_input);  // c.memory.printf_no_fmt
+
+// Does NOT fire: literal format string
+printf("%s", user_input);
+```
--- a/docs/detectors/state.md
+++ b/docs/detectors/state.md
@@ -0,0 +1,204 @@
+# State Model Analysis
+
+## Summary
+
+Nyx's state model analysis tracks **resource lifecycle** and **authentication state** through a function using monotone dataflow over bounded lattices. It detects use-after-close bugs, double-close bugs, resource leaks, and unauthenticated access to privileged operations.
+
+State analysis is **opt-in** — enable it with `scanner.enable_state_analysis = true` in config. It requires `mode = "full"` or `mode = "cfg"`.
+
+## Rule IDs
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `state-use-after-close` | High | Variable used after being closed/released |
+| `state-double-close` | Medium | Resource closed twice |
+| `state-resource-leak` | Medium | Resource opened but never closed (definite) |
+| `state-resource-leak-possible` | Low | Resource may not be closed on all paths |
+| `state-unauthed-access` | High | Privileged operation reached without authentication |
+
+## What It Detects
+
+### Use-after-close (`state-use-after-close`)
+
+A resource transitions to the CLOSED state (via `close()`, `fclose()`, `disconnect()`, etc.), then a use operation (`read`, `write`, `send`, `recv`, `query`, etc.) is performed on it.
+
+```c
+FILE *f = fopen("data.txt", "r");
+fclose(f);
+fread(buf, 1, 100, f);  // state-use-after-close
+```
+
+### Double-close (`state-double-close`)
+
+A resource is closed twice. This can cause crashes or undefined behavior.
+
+```python
+f = open("data.txt")
+f.close()
+f.close()  # state-double-close
+```
+
+### Resource leak (`state-resource-leak`)
+
+A resource is opened but never closed on any path through the function. This is a definite leak.
+
+```java
+FileInputStream fis = new FileInputStream("data.txt");
+process(fis);
+// function exits without fis.close() — state-resource-leak
+```
+
+### Possible resource leak (`state-resource-leak-possible`)
+
+A resource is closed on some paths but not others.
+
+```go
+f, err := os.Open("data.txt")
+if err != nil {
+    return err  // open failed, nothing to close
+}
+if skip {
+    return nil  // f not closed here
+}
+f.Close()  // closed here
+// state-resource-leak-possible on the early-return path
+```
+
+### Unauthenticated access (`state-unauthed-access`)
+
+A function identified as a web handler reaches a privileged sink (shell execution, file I/O) without any authentication check on the path.
+
+A function is identified as a web handler if:
+1. Its name starts with `handle_`, `route_`, or `api_` (strong match — sufficient on its own), OR
+2. Its name starts with `serve_` or `process_` AND any function in the file has web-like parameter names (`request`, `req`, `ctx`, `res`, `response`, `w`, `writer`, etc., varying by language).
+
+The function name `main` is explicitly excluded.
+
+```javascript
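+// The name starts with `handle_`, so the function is treated as a web handler
+function handle_admin_exec(req, res) {
+    // No auth check
+    exec(req.body.command);  // state-unauthed-access
+}
+app.post('/admin/exec', handle_admin_exec);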
+```
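The naming heuristic described above can be sketched as a small predicate. This is an illustration under stated assumptions, not Nyx's actual implementation: the function name `is_web_handler` and the `file_has_web_params` flag are hypothetical, and the prefix lists are copied from the rules listed in this section.

```rust
/// Hypothetical sketch of the handler-detection heuristic.
/// `file_has_web_params` stands in for "some function in the file has
/// web-like parameter names" (request, req, ctx, ...).
pub fn is_web_handler(name: &str, file_has_web_params: bool) -> bool {
    const STRONG: [&str; 3] = ["handle_", "route_", "api_"];
    const WEAK: [&str; 2] = ["serve_", "process_"];

    // `main` is explicitly excluded.
    if name == "main" {
        return false;
    }
    // Strong prefixes are sufficient on their own.
    if STRONG.iter().any(|p| name.starts_with(*p)) {
        return true;
    }
    // Weak prefixes need corroborating evidence from the file.
    WEAK.iter().any(|p| name.starts_with(*p)) && file_has_web_params
}
```

Under these assumptions, `process_batch` in a file with no web-like parameters is not treated as a handler, while `handle_login` always is.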
+
+## What It Cannot Detect
+
+- **Cross-function resource management**: Resources opened in one function and closed in another are not tracked. This is the most common source of false positives for leak detection.
+- **RAII / defer / try-with-resources**: Implicit cleanup via language-level constructs (Rust's `Drop`, Go's `defer`, Java's try-with-resources, Python's `with`) is not recognized. These patterns will produce false-positive leak findings.
+- **Dynamic dispatch**: If `close()` is called through a trait object or interface, it may not be recognized.
+- **Authentication via type system**: Rust's type-state pattern (e.g. `AuthenticatedRequest<T>`) is not recognized as an auth check.
+- **Complex authorization logic**: Only recognized function name patterns are checked.
+
+## Common False Positives
+
+| Scenario | Why it fires | Mitigation |
+|----------|-------------|------------|
+| RAII / Drop / defer cleanup | Implicit cleanup not visible | Known limitation; filter by severity |
+| Resource returned to caller | Ownership transferred, not leaked | Known limitation |
+| Framework-managed resources | Web framework manages connection lifecycle | Exclude framework-generated handlers |
+| Try-with-resources (Java) | Language construct not parsed | Known limitation |
+| Context manager (Python `with`) | Block construct not tracked | Known limitation |
+
+## Common False Negatives
+
+| Scenario | Why it's missed |
+|----------|----------------|
+| Resource closed in helper function | Cross-function tracking not implemented |
+| Auth in middleware | Auth check happens before handler is called |
+| Double-close via aliased reference | Alias analysis not performed |
+
+## Confidence Signals
+
+| Signal | Meaning |
+|--------|---------|
+| **Definite leak (state-resource-leak)** | Resource is never closed on any path — high confidence |
+| **Use-after-close** | Read/write operation after explicit close — high confidence |
+| **Web handler detected** | Entry point matched by function-name prefix, with web-like parameter names as supporting evidence for weaker prefixes |
+| **Possible leak (state-resource-leak-possible)** | Resource closed on some but not all paths — lower confidence |
+
+## Tuning and Noise Controls
+
+### Enable state analysis
+
+```toml
+[scanner]
+enable_state_analysis = true
+```
+
+### Severity filtering
+
+```bash
+# Skip possible-leak findings (Low severity)
+nyx scan . --severity ">=MEDIUM"
+```
+
+### Exclude test files
+
+```toml
+[scanner]
+excluded_directories = ["tests", "test", "spec"]
+```
+
+## Resource Pairs
+
+The state engine recognizes these acquire/release pairs per language:
+
+### C/C++
+| Acquire | Release | Resource |
+|---------|---------|----------|
+| `fopen` | `fclose` | File handle |
+| `open` | `close` | File descriptor |
+| `socket` | `close` | Socket |
+| `malloc`, `calloc`, `realloc` | `free` | Heap memory |
+| `pthread_mutex_lock` | `pthread_mutex_unlock` | Mutex |
+
+### Rust
+| Acquire | Release | Resource |
+|---------|---------|----------|
+| `File::open`, `File::create` | `drop`, `close` | File handle |
+| `TcpStream::connect` | `shutdown` | TCP connection |
+| `lock`, `read`, `write` (on Mutex/RwLock) | `drop` | Lock guard |
+
+### Java
+| Acquire | Release | Resource |
+|---------|---------|----------|
+| `new FileInputStream` | `close` | File stream |
+| `getConnection` | `close` | DB connection |
+| `new Socket` | `close` | Socket |
+
+### Go, Python, JavaScript, Ruby, PHP
+Similar patterns with language-specific function names.
+
+## Use Patterns (Trigger use-after-close)
+
+The following operations on a closed resource trigger `state-use-after-close`:
+
+```
+read, write, send, recv, fread, fwrite, fgets, fputs, fprintf, fscanf,
+fflush, fseek, ftell, rewind, feof, ferror, fgetc, fputc, getc, putc,
+ungetc, query, execute, fetch, sendto, recvfrom, ioctl, fcntl,
+strcpy, strncpy, strcat, strncat, memcpy, memmove, memset, memcmp,
+strcmp, strncmp, strlen, sprintf, snprintf
+```
+
+## Technical Details
+
+### Resource Lifecycle Lattice
+
+```
+UNINIT → OPEN → CLOSED
+              → MOVED
+```
+
+States are tracked as bitflags, allowing the lattice to represent uncertainty (e.g. OPEN|CLOSED means the resource is open on some paths and closed on others).
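A bitflag encoding of this lattice might look like the sketch below. The type name `ResourceState`, the bit values, and the `classify_leak` helper are assumptions made for illustration; only the state names and the "union at merge" idea come from the text above.

```rust
/// Hypothetical bitflag encoding of the resource lifecycle states.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ResourceState(u8);

impl ResourceState {
    pub const UNINIT: Self = Self(0b0001);
    pub const OPEN: Self = Self(0b0010);
    pub const CLOSED: Self = Self(0b0100);
    pub const MOVED: Self = Self(0b1000);

    /// Join at a control-flow merge: the union of states seen on any path.
    pub fn join(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    /// True if the resource may be in `state` on at least one path.
    pub fn may_be(self, state: Self) -> bool {
        (self.0 & state.0) != 0
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Leak {
    None,
    Possible,
    Definite,
}

/// Classify the merged state observed at a function exit.
pub fn classify_leak(at_exit: ResourceState) -> Leak {
    if at_exit == ResourceState::OPEN {
        Leak::Definite // open on every path
    } else if at_exit.may_be(ResourceState::OPEN) {
        Leak::Possible // open on some paths only
    } else {
        Leak::None
    }
}
```

Because the join is a bitwise OR, merging two branches can only add bits, which is what keeps the dataflow monotone.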
+
+### Leak Detection Scope
+
+Resource leaks are checked at the file-level exit node and the **synthesized** function exit node (a single Return node that all early returns feed into). Early-return nodes are **not** checked individually — only the merged state at the function's synthesized exit is inspected. This prevents duplicate findings where an early-return path reports a definite leak while the merged exit correctly reports a possible leak.
+
+This per-function exit inspection ensures that a variable leaked inside one function is not masked by a same-named variable that is properly closed in a subsequent function.
+
+### Auth Level Lattice
+
+```
+Unauthed < Authed < Admin
+```
+
+Join semantics: take the minimum (conservative). If any path is unauthenticated, the result is unauthenticated.
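The "take the minimum" join can be sketched with an ordered enum. The names `AuthLevel` and `join` are illustrative assumptions; the ordering and the join rule are taken from the lattice above.

```rust
/// Auth levels in lattice order. `derive(Ord)` follows declaration order,
/// so Unauthed < Authed < Admin.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum AuthLevel {
    Unauthed,
    Authed,
    Admin,
}

/// Join at a control-flow merge: keep the weakest level seen on any path.
pub fn join(a: AuthLevel, b: AuthLevel) -> AuthLevel {
    a.min(b)
}
```

With this join, a branch that skips the auth check pulls the merged state down to `Unauthed`, even if the other branch reached `Admin`.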
--- a/docs/detectors/taint.md
+++ b/docs/detectors/taint.md
@ -0,0 +1,202 @@
+# Taint Analysis
+
+## Summary
+
+Nyx's taint analysis tracks the flow of untrusted data from **sources** (where data enters the program) through **assignments and function calls** to **sinks** (where dangerous operations happen). If the data reaches a sink without passing through a **sanitizer** with matching capabilities, a finding is emitted.
+
+The engine uses a monotone forward dataflow analysis over a finite lattice with guaranteed termination. Analysis is **intra-procedural with cross-file function summaries** — it does not follow calls into other functions but uses pre-computed summaries of their behavior.
+
+## Rule ID
+
+```
+taint-unsanitised-flow (source <line>:<col>)
+```
+
+One rule ID covers all taint findings. The parenthetical identifies the specific source location.
+
+## What It Detects
+
+- Environment variables flowing to shell execution (`env::var` → `Command::new`)
+- User input flowing to code evaluation (`req.body` → `eval()`)
+- File contents flowing to SQL queries (`fs::read_to_string` → `db.execute()`)
+- Request parameters flowing to HTML output (`req.query` → `innerHTML`)
+- Any source-to-sink flow where the sink's required capability is not stripped by a sanitizer
+
+## What It Cannot Detect
+
+- **Inter-procedural flows without summaries**: If a function isn't summarized (e.g. a third-party library without source), the taint engine cannot track data through it. Unknown callees are treated as opaque: they neither propagate nor sanitize taint, so flows that pass through them are missed.
+- **Flows through data structures**: Taint is tracked per-variable, not per-field. `obj.field = tainted; sink(obj.other_field)` may produce a false positive because taint attaches to `obj` as a whole.
+- **Aliasing**: `let y = &x; sink(*y)` — the engine tracks `y` as a fresh variable, not an alias of `x`. This can cause false negatives.
+- **Complex control flow**: The analysis is flow-sensitive (respects control flow within a function) but does not track taint through arbitrary loops with complex exit conditions.
+- **Implicit flows**: Taint only follows explicit data flow, not information flow through branching (e.g. `if (secret) { x = 1 } else { x = 0 }` does not taint `x`).
+
+## Common False Positives
+
+| Scenario | Why it happens | Mitigation |
+|----------|---------------|------------|
+| Custom sanitizer not recognized | Nyx only knows built-in and configured sanitizers | Add a custom sanitizer rule in config |
+| Taint through struct fields | Variable-level (not field-level) tracking | No current mitigation; field sensitivity is planned |
+| Dead code paths | The engine is path-insensitive within a function (it considers all paths) | Contradiction pruning catches some cases; path-validated findings score lower |
+| Library wrappers | A wrapper around a dangerous function may re-introduce taint that was sanitized by the wrapper | Summarize the wrapper function or add it as a sanitizer |
+
+## Common False Negatives
+
+| Scenario | Why it's missed |
+|----------|----------------|
+| Third-party library calls | No summary available; callee treated as opaque |
+| Taint through global/static variables | Not tracked across function boundaries |
+| Taint through closures/callbacks in some languages | Closure capture analysis is limited (JS/TS/Ruby/Go anonymous functions ARE analyzed) |
+| Flows spanning more than two files | Summary approximation loses precision at depth |
+
+## Confidence Signals
+
+These signals in the output indicate higher-confidence findings:
+
+| Signal | What it means |
+|--------|--------------|
+| **Evidence: Source + Sink** | Both endpoints identified with specific function names and locations |
+| **Source kind = user input** | Source is directly controllable by an attacker (req.body, argv, etc.) |
+| **path_validated = false** | No validation guard on the path — higher exploitability |
+| **No guard_kind** | No dominating predicate check (null check, error check, etc.) |
+| **High rank_score** | Multiple confidence signals combined |
+
+Lower-confidence:
+
+| Signal | What it means |
+|--------|--------------|
+| **path_validated = true** | A validation predicate guards the path — may not be exploitable |
+| **guard_kind = "ValidationCall"** | An explicit validation function was called before the sink |
+| **Source kind = database** | Data from DB — may already be validated at insertion time |
+
+## Tuning and Noise Controls
+
+### Add custom sanitizers
+
+If your codebase has a custom sanitizer that Nyx doesn't recognize:
+
+```toml
+# nyx.local
+[[analysis.languages.javascript.rules]]
+matchers = ["escapeHtml", "sanitizeInput"]
+kind = "sanitizer"
+cap = "html_escape"
+```
+
+Or via CLI:
+```bash
+nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape
+```
+
+### Filter by severity
+
+```bash
+nyx scan . --severity HIGH          # Only high-severity taint findings
+nyx scan . --severity ">=MEDIUM"    # Skip low-severity
+```
+
+### Skip non-production code
+
+By default, findings in non-production paths such as `tests/`, `vendor/`, and `build/` are downgraded by one severity tier. To exclude such paths entirely, add them to the config:
+
+```toml
+[scanner]
+excluded_directories = ["tests", "vendor", "build", "examples"]
+```
+
+### Disable taint (AST-only mode)
+
+```bash
+nyx scan . --mode ast
+```
+
+## Example
+
+**Vulnerable code** (Rust):
+```rust
+use std::env;
+use std::process::Command;
+
+fn main() {
+    let cmd = env::var("USER_CMD").unwrap();          // line 5: source
+    Command::new("sh").arg("-c").arg(&cmd).output();   // line 6: sink
+}
+```
+
+**Finding**:
+```
+[HIGH]   taint-unsanitised-flow (source 5:15)  src/main.rs:6:5
+         Source: env::var("USER_CMD") at 5:15
+         Sink: Command::new("sh").arg("-c")
+         Score: 76
+```
+
+**Safe alternative**:
+```rust
+use std::env;
+use std::process::Command;
+
+fn main() {
+    let cmd = env::var("USER_CMD").unwrap();
+    // Run a fixed program and pass the value as a single argument (no shell parses it)
+    Command::new("echo").arg(&cmd).output();
+    // Or validate against an allowlist
+}
+```
+
+## Technical Details
+
+### Capability System
+
+Taint uses a bitflag capability system to match sources with appropriate sanitizers and sinks:
+
+| Capability | Bit | Sources | Sanitizers | Sinks |
+|-----------|-----|---------|------------|-------|
+| `ENV_VAR` | 0x01 | `env::var`, `getenv` | — | — |
+| `HTML_ESCAPE` | 0x02 | — | `html_escape`, `DOMPurify.sanitize` | `innerHTML`, `document.write` |
+| `SHELL_ESCAPE` | 0x04 | — | `shell_escape` | `Command::new`, `system()`, `eval()` |
+| `URL_ENCODE` | 0x08 | — | `encodeURIComponent` | `location.href` |
+| `JSON_PARSE` | 0x10 | — | `JSON.parse` | — |
+| `FILE_IO` | 0x20 | — | `filepath.Clean`, `basename`, `os.path.realpath` | `fopen`, `open`, `send_file`, `fs::read_to_string` |
+| `FMT_STRING` | 0x40 | — | — | `printf(var)` |
+
+Sources typically use `Cap::all()` to match any sink. A sanitizer strips specific capability bits. A finding fires when a tainted variable reaches a sink and the taint still has the matching capability bit set.
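The strip-and-check logic described above can be sketched with plain integer bitmasks. The bit values are copied from the capability table; the constant `ALL` and the helper names `sanitize` and `sink_fires` are assumptions for illustration and may not match Nyx's internal `Cap` type.

```rust
// Bit values copied from the capability table.
pub const ENV_VAR: u8 = 0x01;
pub const HTML_ESCAPE: u8 = 0x02;
pub const SHELL_ESCAPE: u8 = 0x04;
pub const URL_ENCODE: u8 = 0x08;
pub const JSON_PARSE: u8 = 0x10;
pub const FILE_IO: u8 = 0x20;
pub const FMT_STRING: u8 = 0x40;
/// Stand-in for `Cap::all()`: every bit listed above.
pub const ALL: u8 = 0x7f;

/// A sanitizer clears the capability bits it handles.
pub fn sanitize(taint: u8, sanitizer_caps: u8) -> u8 {
    taint & !sanitizer_caps
}

/// A finding fires when the taint still carries a bit the sink requires.
pub fn sink_fires(taint: u8, sink_caps: u8) -> bool {
    (taint & sink_caps) != 0
}
```

In this model, HTML-escaping a value clears only `HTML_ESCAPE`, so the same value still triggers a finding if it later reaches a shell sink.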
+
+### Nested Function Analysis
+
+The CFG builder recursively discovers function expressions nested inside call arguments:
+
+- **JavaScript/TypeScript**: `function_expression`, `arrow_function` inside call arguments (e.g., Express route handlers)
+- **Ruby**: `do_block` and `block` nodes (e.g., Sinatra `get '/path' do...end`)
+- **Go**: `func_literal` (anonymous function literals)
+
+Each nested function is walked as a separate scope and receives a unique identifier (`<anon@{byte_offset}>`) to prevent collisions when multiple anonymous functions exist in the same file.
+
+### Chained Call Classification
+
+Method chains like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments between `.` separators. The classifier matches against both the original text and the normalized form, enabling rules like `r.URL` to match within `r.URL.Query.Get`.
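One way to picture the normalization is the sketch below. Both function names are hypothetical, substring matching is an assumption about how rules are compared, and parentheses inside string literals are not special-cased.

```rust
/// Hypothetical helper: drop every parenthesized segment from a call chain,
/// so `a.b().c("x")` becomes `a.b.c`.
pub fn normalize_chain(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut depth = 0usize;
    for ch in text.chars() {
        match ch {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            _ if depth == 0 => out.push(ch),
            _ => {} // inside an argument list: skip
        }
    }
    out
}

/// A rule matches if it occurs in the original text or in the normalized form.
pub fn rule_matches(rule: &str, call_text: &str) -> bool {
    call_text.contains(rule) || normalize_chain(call_text).contains(rule)
}
```

Checking both forms means a rule written without call parentheses can still match a chain that contains intermediate calls.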
+
+### Nested Call Fallback
+
+When the outermost call in an expression doesn't classify as a source/sink, the engine tries all nested inner calls. This handles patterns like `str(eval(expr))` where `str` is not a sink but the inner `eval` is.
+
+### Rust `if let` / `while let` Pattern Bindings
+
+The CFG builder recognizes Rust `let_condition` nodes inside `if` and `while` expressions. The value expression is classified for source/sink labels, and the pattern binding is extracted as a variable definition:
+
+```rust
+if let Ok(cmd) = env::var("CMD") {
+    // cmd is tainted — env::var is a source, cmd is the binding
+    Command::new("sh").arg("-c").arg(&cmd).output();  // taint-unsanitised-flow
+}
+```
+
+This also works for `while let` patterns.
+
+### JS/TS Two-Level Solve
+
+For JavaScript and TypeScript, taint analysis uses a two-level approach:
+
+1. **Level 1**: Solve top-level code (module scope)
+2. **Level 2**: Solve each function seeded with the converged top-level state
+
+This prevents false positives from cross-function taint leakage while preserving global-to-function flows.
--- a/docs/index.md
+++ b/docs/index.md
@ -0,0 +1,32 @@
+# Nyx Documentation
+
+Welcome to the Nyx documentation. Nyx is a multi-language static vulnerability scanner built in Rust.
+
+## User Guide
+
+- [Installation](installation.md) — Install via cargo, prebuilt binaries, or from source
+- [Quick Start](quickstart.md) — Your first scan in 60 seconds
+- [CLI Reference](cli.md) — Every flag, subcommand, and option
+- [Configuration](configuration.md) — Config file schema, precedence, custom rules
+- [Output Formats](output.md) — Console, JSON, SARIF; exit codes; evidence fields
+
+## Detector Reference
+
+- [Detector Overview](detectors.md) — How the four detector families work together
+- [Taint Analysis](detectors/taint.md) — Cross-file source-to-sink dataflow tracking
+- [CFG Structural Analysis](detectors/cfg.md) — Auth gaps, unguarded sinks, resource leaks
+- [State Model Analysis](detectors/state.md) — Resource lifecycle and authentication state
+- [AST Patterns](detectors/patterns.md) — Tree-sitter structural pattern matching
+
+## Rule Reference
+
+- [Rule Index](rules/index.md) — How rules are organized
+- [Rust](rules/rust.md) | [C](rules/c.md) | [C++](rules/cpp.md) | [Java](rules/java.md) | [Go](rules/go.md)
+- [JavaScript](rules/javascript.md) | [TypeScript](rules/typescript.md) | [Python](rules/python.md)
+- [PHP](rules/php.md) | [Ruby](rules/ruby.md)
+
+## Contributing
+
+- [Contributing Guide](../CONTRIBUTING.md) — Development setup, adding rules, PR guidelines
+- [Security Policy](../SECURITY.md) — Responsible disclosure
+- [Code of Conduct](../CODE_OF_CONDUCT.md)
--- a/docs/installation.md
+++ b/docs/installation.md
@ -0,0 +1,76 @@
+# Installation
+
+## Install from crates.io
+
+```bash
+cargo install nyx-scanner
+```
+
+This installs the `nyx` binary into `~/.cargo/bin/`.
+
+## Install from GitHub releases
+
+1. Go to the [Releases](https://github.com/elicpeter/nyx/releases) page.
+2. Download the binary for your platform:
+
+   | Platform | Archive |
+   |----------|---------|
+   | Linux x86_64 | `nyx-x86_64-unknown-linux-gnu.zip` |
+   | macOS Intel | `nyx-x86_64-apple-darwin.zip` |
+   | macOS Apple Silicon | `nyx-aarch64-apple-darwin.zip` |
+   | Windows x86_64 | `nyx-x86_64-pc-windows-msvc.zip` |
+
+3. Extract and install:
+
+   ```bash
+   # Linux / macOS
+   unzip nyx-*.zip
+   chmod +x nyx
+   sudo mv nyx /usr/local/bin/
+
+   # Windows (PowerShell)
+   Expand-Archive -Path nyx-*.zip -DestinationPath .
+   Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\"
+   ```
+
+4. Verify:
+   ```bash
+   nyx --version
+   ```
+
+## Build from source
+
+```bash
+git clone https://github.com/elicpeter/nyx.git
+cd nyx
+cargo build --release
+cargo install --path .
+```
+
+Requires **Rust 1.85+** (edition 2024).
+
+## CI Integration
+
+### GitHub Actions
+
+```yaml
+- name: Install Nyx
+  run: cargo install nyx-scanner
+
+- name: Run security scan
+  run: nyx scan . --format sarif --fail-on medium > results.sarif
+
+- name: Upload SARIF
+  if: always()  # run even when --fail-on makes the scan step exit with code 1
+  uses: github/codeql-action/upload-sarif@v3
+  with:
+    sarif_file: results.sarif
+```
+
+### Generic CI
+
+```bash
+# Fail the build if any High or Medium finding is detected
+nyx scan . --severity ">=MEDIUM" --fail-on medium --quiet --format json
+```
+
+The `--fail-on` flag causes Nyx to exit with code **1** if any finding meets or exceeds the given severity. Exit code **0** means no findings matched.
--- a/docs/output.md
+++ b/docs/output.md
@ -0,0 +1,315 @@
+# Output Formats
+
+Nyx supports three output formats, selected with `--format` or `output.default_format` in config.
+
+## Console (default)
+
+Human-readable, color-coded output to stdout. Status messages go to stderr.
+
+```
+[HIGH]   taint-unsanitised-flow (source 5:11)  src/handler.rs:12:5 (Score: 76, Confidence: High)
+         Source: env::var("CMD") → Command::new("sh").arg("-c")
+
+[MEDIUM] cfg-unguarded-sink                    src/handler.rs:12:5 (Score: 35, Confidence: Medium)
+
+[LOW]    rs.quality.unwrap                     src/lib.rs:88:5 (Score: 10, Confidence: High)
+```
+
+### Severity indicators
+
+| Tag | Color | Meaning |
+|-----|-------|---------|
+| `[HIGH]` | Red, bold | Critical — likely exploitable |
+| `[MEDIUM]` | Orange, bold | Important — may be exploitable |
+| `[LOW]` | Muted blue-gray | Informational — code quality or weak signal |
+
+### Evidence fields
+
+Taint and state findings include structured evidence:
+
+| Label | Meaning |
+|-------|---------|
+| **Source** | Where tainted data originated (function name + location) |
+| **Sink** | Where the dangerous operation happens |
+| **Path guard** | Type of validation predicate protecting the path |
+
+### Score
+
+When attack-surface ranking is enabled (default), each finding shows a `Score` value. Higher scores indicate greater exploitability. See [Detector Overview](detectors.md) for the scoring formula.
+
+### Rollup findings
+
+High-frequency LOW Quality findings (e.g. `rs.quality.unwrap`) are grouped into rollup findings by `(file, rule)`:
+
+```
+  21:10  ● [LOW]   rs.quality.unwrap
+      rs.quality.unwrap (38 occurrences)
+      Examples: 21:10, 50:10, 79:10, 105:10, 134:10
+      Run: nyx scan --show-instances rs.quality.unwrap
+```
+
+Rollups count as **one finding** for LOW budget enforcement. Use `--show-instances <RULE>` to expand a specific rule or `--all` to disable rollups entirely.
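Grouping by `(file, rule)` can be sketched as follows. The `Finding` and `Rollup` structs and the `roll_up` function are hypothetical stand-ins, and `max_examples` plays the role of a configurable example limit; none of these names are confirmed by the Nyx source.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a finding; field names are illustrative.
pub struct Finding {
    pub file: String,
    pub rule: String,
    pub line: u32,
    pub col: u32,
}

pub struct Rollup {
    pub file: String,
    pub rule: String,
    pub count: usize,
    pub examples: Vec<(u32, u32)>,
}

/// Group findings by `(file, rule)` and keep the first `max_examples` locations.
pub fn roll_up(findings: &[Finding], max_examples: usize) -> Vec<Rollup> {
    let mut groups: BTreeMap<(&str, &str), Vec<(u32, u32)>> = BTreeMap::new();
    for f in findings {
        groups
            .entry((f.file.as_str(), f.rule.as_str()))
            .or_default()
            .push((f.line, f.col));
    }
    groups
        .into_iter()
        .map(|((file, rule), locs)| Rollup {
            file: file.to_string(),
            rule: rule.to_string(),
            count: locs.len(),
            examples: locs.into_iter().take(max_examples).collect(),
        })
        .collect()
}
```

Each resulting `Rollup` would then count as a single entry when a LOW budget is applied.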
+
+### Suppression footer
+
+When findings are suppressed by the prioritization pipeline, a footer is shown:
+
+```
+Suppressed 195 LOW/Quality findings.
+Active filters:
+  include_quality = false
+  max_low = 20
+  max_low_per_file = 1
+  max_low_per_rule = 10
+
+Use --include-quality, --max-low, or --all to adjust.
+```
+
+---
+
+## JSON
+
+Machine-readable JSON array. Each finding is an object:
+
+```json
+[
+  {
+    "path": "src/handler.rs",
+    "line": 12,
+    "col": 5,
+    "severity": "High",
+    "id": "taint-unsanitised-flow (source 5:11)",
+    "category": "Security",
+    "labels": [
+      ["Source", "env::var(\"CMD\") at 5:11"],
+      ["Sink", "Command::new(\"sh\").arg(\"-c\")"]
+    ],
+    "confidence": "High",
+    "evidence": {
+      "source": {
+        "path": "src/handler.rs",
+        "line": 5,
+        "col": 11,
+        "kind": "source",
+        "snippet": "env::var(\"CMD\")"
+      },
+      "sink": {
+        "path": "src/handler.rs",
+        "line": 12,
+        "col": 5,
+        "kind": "sink",
+        "snippet": "Command::new(\"sh\")"
+      },
+      "notes": ["source_kind:EnvironmentConfig"]
+    },
+    "rank_score": 76.0,
+    "rank_reason": [
+      ["severity_base", "60"],
+      ["analysis_kind", "10"],
+      ["source_kind", "5"],
+      ["evidence_count", "1"]
+    ]
+  }
+]
+```
+
+### Field descriptions
+
+| Field | Type | Always present | Description |
+|-------|------|----------------|-------------|
+| `path` | string | yes | File path relative to scan root |
+| `line` | int | yes | 1-indexed line number |
+| `col` | int | yes | 1-indexed column number |
+| `severity` | string | yes | `"High"`, `"Medium"`, or `"Low"` |
+| `id` | string | yes | Rule ID |
+| `category` | string | yes | Finding category: `"Security"`, `"Reliability"`, or `"Quality"` |
+| `path_validated` | bool | no | True if guarded by validation predicate |
+| `guard_kind` | string | no | Predicate type (e.g. `"NullCheck"`, `"ValidationCall"`) |
+| `message` | string | no | Human-readable context (state analysis findings) |
+| `labels` | array | no | Array of `[label, value]` pairs for console display |
+| `confidence` | string | no | Confidence level: `"Low"`, `"Medium"`, or `"High"` |
+| `evidence` | object | no | Structured evidence (source/sink spans, state, notes) |
+| `rank_score` | float | no | Attack-surface score (omitted when ranking disabled) |
+| `rank_reason` | array | no | Score breakdown (omitted when ranking disabled) |
+| `rollup` | object | no | Rollup data when findings are grouped (see below) |
+
+Fields marked "no" are omitted when empty/null/false to keep output compact.
+
+### Confidence levels
+
+| Level | Meaning |
+|-------|---------|
+| `High` | Strong signal — taint-confirmed flow, definite state violation |
+| `Medium` | Moderate signal — resource leak, path-validated taint, CFG structural |
+| `Low` | Weak signal — AST pattern match, possible resource leak, degraded analysis |
+
+### Evidence object
+
+The `evidence` field provides structured provenance data:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `source` | object | Source span (path, line, col, kind, snippet) |
+| `sink` | object | Sink span (path, line, col, kind, snippet) |
+| `guards` | array | Validation guard spans |
+| `sanitizers` | array | Sanitizer spans |
+| `state` | object | State-machine evidence (machine, subject, from_state, to_state) |
+| `notes` | array | Free-form notes (e.g. `"source_kind:UserInput"`, `"path_validated"`) |
+
+All fields are omitted when empty/null.
+
+### Rollup object
+
+When a finding is a rollup (grouped from multiple occurrences), the `rollup` field is present:
+
+```json
+{
+  "rollup": {
+    "count": 38,
+    "occurrences": [
+      { "line": 21, "col": 10 },
+      { "line": 50, "col": 10 },
+      { "line": 79, "col": 10 }
+    ]
+  }
+}
+```
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `count` | int | Total number of occurrences |
+| `occurrences` | array | First N example locations (controlled by `rollup_examples`) |
+
+---
+
+## SARIF (Static Analysis Results Interchange Format)
+
+SARIF 2.1.0 JSON, suitable for GitHub Code Scanning and other SARIF-compatible tools.
+
+```bash
+nyx scan . --format sarif > results.sarif
+```
+
+The SARIF output includes:
+
+- **Tool metadata** — Nyx name and version
+- **Rules** — Rule ID, description, severity mapping
+- **Results** — One result per finding with location, message, and properties
+- **Properties** — Each result includes `category` and optionally `confidence` and `rollup.count`
+- **Related locations** — Rollup findings include example locations in `relatedLocations`
+- **Artifacts** — File paths referenced by findings
+
+### GitHub Code Scanning integration
+
+```yaml
+- name: Run Nyx
+  run: nyx scan . --format sarif > results.sarif
+
+- name: Upload SARIF
+  uses: github/codeql-action/upload-sarif@v3
+  with:
+    sarif_file: results.sarif
+```
+
+---
+
+## Exit Codes
+
+| Code | Meaning |
+|------|---------|
+| `0` | Scan completed successfully; no findings matched `--fail-on` threshold |
+| `1` | `--fail-on` threshold breached (at least one finding meets or exceeds the specified severity) |
+| Non-zero | Error (I/O, config, database, parse error) |
+
+Without `--fail-on`, Nyx always exits `0` on a successful scan regardless of findings count.
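The exit-code rule for a scan that completes without errors can be sketched as below. The `Severity` enum and the `exit_code` function are illustrative assumptions, and the input is assumed to already exclude suppressed findings.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Exit code for a successful scan: 1 only if a `--fail-on` threshold is set
/// and at least one finding meets or exceeds it.
pub fn exit_code(fail_on: Option<Severity>, findings: &[Severity]) -> i32 {
    match fail_on {
        Some(threshold) if findings.iter().any(|s| *s >= threshold) => 1,
        _ => 0,
    }
}
```

Error conditions (I/O, config, parse failures) are outside this sketch.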
+
+---
+
+## Severity Levels
+
+| Level | Description | Typical rules |
+|-------|-------------|---------------|
+| **High** | Critical vulnerabilities — likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources |
+| **Medium** | Important issues — may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks |
+| **Low** | Informational — code quality or weak signals | Weak crypto algorithms, insecure randomness, `unwrap()`/`panic!()`, type-safety escapes |
+
+### Non-production severity downgrade
+
+By default, findings in paths matching common non-production patterns (`tests/`, `test/`, `vendor/`, `build/`, `examples/`, `benchmarks/`) are downgraded by one tier:
+
+- High → Medium
+- Medium → Low
+- Low → Low (unchanged)
+
+Use `--keep-nonprod-severity` to disable this behavior.
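The downgrade rule can be sketched as a pure function. The directory list is copied from the paragraph above; the path-matching strategy (any `/`-separated component) and the names `is_non_prod` and `effective_severity` are assumptions for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Path components treated as non-production.
const NON_PROD_DIRS: [&str; 6] = ["tests", "test", "vendor", "build", "examples", "benchmarks"];

/// Hypothetical check: any `/`-separated component of the path is in the list.
pub fn is_non_prod(path: &str) -> bool {
    path.split('/').any(|part| NON_PROD_DIRS.contains(&part))
}

/// Apply the one-tier downgrade unless the caller opted out.
pub fn effective_severity(sev: Severity, path: &str, keep_nonprod_severity: bool) -> Severity {
    if keep_nonprod_severity || !is_non_prod(path) {
        return sev;
    }
    match sev {
        Severity::High => Severity::Medium,
        Severity::Medium | Severity::Low => Severity::Low,
    }
}
```

Passing `keep_nonprod_severity = true` corresponds to the `--keep-nonprod-severity` flag in this model.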
+
+---
+
+## Inline Suppressions
+
+Suppress specific findings directly in source code using `nyx:ignore` comments. Suppressed findings are excluded from output, severity counts, and `--fail-on` checks by default.
+
+### Comment syntax
+
+| Language | Comment styles |
+|----------|---------------|
+| Rust, C, C++, Java, Go, JS, TS | `// nyx:ignore ...` or `/* nyx:ignore ... */` |
+| Python, Ruby | `# nyx:ignore ...` |
+| PHP | `// nyx:ignore ...`, `# nyx:ignore ...`, or `/* nyx:ignore ... */` |
+
+### Directive forms
+
+```python
+x = dangerous()  # nyx:ignore taint-unsanitised-flow
+# (the directive above suppresses the finding on its own line)
+
+# nyx:ignore-next-line taint-unsanitised-flow
+x = dangerous()  # suppressed by the directive on the previous line
+```
+
+- `nyx:ignore <RULE_ID>` — suppresses findings on the **same line** as the comment.
+- `nyx:ignore-next-line <RULE_ID>` — suppresses findings on the **next line**.
+- For taint findings, the primary line is the **sink line** (the `line` field in output).
+
+### Rule ID matching
+
+- **Case-sensitive**, exact match after canonicalization.
+- Comma-separated: `nyx:ignore rule-a, rule-b`
+- Wildcard suffix: `nyx:ignore rs.quality.*` matches any ID starting with `rs.quality.`
+- Taint IDs are canonicalized: `nyx:ignore taint-unsanitised-flow` matches `taint-unsanitised-flow (source 5:1)` (parenthetical suffix stripped).
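The matching rules above can be sketched in a few lines. Both function names are hypothetical, and the sketch assumes `directive_args` is the text that follows `nyx:ignore` in the comment.

```rust
/// Strip a trailing parenthetical such as " (source 5:1)" from a rule ID.
pub fn canonical_id(id: &str) -> &str {
    match id.find(" (") {
        Some(i) => &id[..i],
        None => id,
    }
}

/// Hypothetical matcher: comma-separated patterns, optional `*` suffix,
/// case-sensitive comparison against the canonical finding ID.
pub fn is_suppressed(directive_args: &str, finding_id: &str) -> bool {
    let id = canonical_id(finding_id);
    directive_args
        .split(',')
        .map(str::trim)
        .any(|pat| match pat.strip_suffix('*') {
            Some(prefix) => id.starts_with(prefix), // wildcard suffix
            None => pat == id,                      // exact match
        })
}
```

Because the comparison is case-sensitive, a directive such as `RS.QUALITY.*` would not suppress `rs.quality.unwrap` in this sketch.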
+
+### Console behavior
+
+- **Default**: suppressed findings are hidden entirely.
+- **`--show-suppressed`**: suppressed findings appear dimmed with `[SUPPRESSED]` tag. Summary shows `"N issues (M suppressed)"`.
+
+### JSON / SARIF behavior
+
+- **Default**: suppressed findings are excluded from JSON/SARIF output.
+- **`--show-suppressed`**: suppressed findings are included with additional fields:
+
+```json
+{
+  "suppressed": true,
+  "suppression": {
+    "kind": "SameLine",
+    "matched_pattern": "taint-unsanitised-flow",
+    "directive_line": 42
+  }
+}
+```
+
+### Exit code
+
+Suppressed findings do **not** trigger `--fail-on`. A scan with only suppressed findings exits `0`.
+
+---
+
+## Rule ID Format
+
+| Prefix | Detector | Example |
+|--------|----------|---------|
+| `taint-*` | Taint analysis | `taint-unsanitised-flow (source 5:11)` |
+| `cfg-*` | CFG structural | `cfg-unguarded-sink`, `cfg-auth-gap` |
+| `state-*` | State model | `state-use-after-close`, `state-resource-leak` |
+| `<lang>.*.*` | AST patterns | `rs.memory.transmute`, `js.code_exec.eval` |
+
+See the [Rule Reference](rules/index.md) for a complete listing.
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@ -0,0 +1,103 @@
+# Quick Start
+
+## Your first scan
+
+```bash
+# Scan the current directory
+nyx scan
+
+# Scan a specific path
+nyx scan ./my-project
+```
+
+Nyx automatically creates an SQLite index on first run. Subsequent scans skip unchanged files.
+
+## Understanding the output
+
+A typical console output looks like:
+
+```
+[HIGH]   taint-unsanitised-flow (source 5:11)  src/handler.rs:12:5
+         Source: env::var("CMD") at 5:11
+         Sink: Command::new("sh").arg("-c")
+         Score: 76
+
+[MEDIUM] cfg-unguarded-sink                    src/handler.rs:12:5
+         Score: 35
+
+[MEDIUM] rs.quality.unsafe_block               src/lib.rs:44:5
+         Score: 30
+```
+
+Each finding shows:
+
+| Field | Meaning |
+|-------|---------|
+| **Severity tag** | `[HIGH]`, `[MEDIUM]`, or `[LOW]` |
+| **Rule ID** | Identifies the detector and specific rule |
+| **Location** | `file:line:col` |
+| **Evidence** | Source, Sink, and guard details (taint findings only) |
+| **Score** | Attack-surface ranking score (higher = more exploitable) |
+
+## Common workflows
+
+### CI gate — fail on high-severity findings
+
+```bash
+nyx scan . --fail-on high --quiet
+# Exit code 1 if any HIGH finding exists, 0 otherwise
+```
+
+### Export for tooling
+
+```bash
+# JSON for scripting
+nyx scan . --format json > findings.json
+
+# SARIF for GitHub Code Scanning
+nyx scan . --format sarif > results.sarif
+```
+
+### Fast structural scan (no dataflow)
+
+```bash
+nyx scan . --mode ast
+```
+
+AST-only mode runs tree-sitter pattern queries without building CFGs or running taint analysis. It is much faster but misses dataflow vulnerabilities.
+
+### Filter by severity
+
+```bash
+# Only high-severity
+nyx scan . --severity HIGH
+
+# High and medium
+nyx scan . --severity ">=MEDIUM"
+
+# Specific set
+nyx scan . --severity "HIGH,MEDIUM"
+```
+
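The three filter forms shown above (a single level, a `>=` floor, and a comma-separated set) can be read as selecting a set of severities. The Python sketch below illustrates that reading only; it is not Nyx's parser, and the case-insensitive matching and error handling are assumptions.

```python
# Ordered from least to most severe; these are the three tiers Nyx prints.
LEVELS = ["LOW", "MEDIUM", "HIGH"]

def parse_severity_filter(expr):
    """Return the set of severities selected by a --severity style expression."""
    expr = expr.strip().upper()
    if expr.startswith(">="):
        # A floor selects the named level and everything above it.
        floor = LEVELS.index(expr[2:].strip())  # ValueError on unknown names
        return set(LEVELS[floor:])
    selected = {part.strip() for part in expr.split(",")}
    unknown = selected - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown severity: {sorted(unknown)}")
    return selected

for example in ["HIGH", ">=MEDIUM", "HIGH,MEDIUM"]:
    print(example, "->", sorted(parse_severity_filter(example)))
```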
+### Skip the index
+
+```bash
+nyx scan . --index off
+```
+
+Useful for one-off scans or when you don't want to write to disk.
+
+### Keep original severity in non-production paths
+
+By default, findings in test/vendor/build paths are downgraded by one severity tier to reduce noise. To keep the original severity:
+
+```bash
+nyx scan . --keep-nonprod-severity
+```
+
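The default downgrade can be pictured as a one-step shift along the severity tiers. The Python sketch below is illustrative only: the directory names in `NONPROD_PARTS`, the rule that `LOW` stays `LOW`, and the function name are all assumptions, since this document does not spell out the path heuristics Nyx actually uses.

```python
from pathlib import PurePosixPath

TIERS = ["HIGH", "MEDIUM", "LOW"]
# Assumed markers for non-production paths; Nyx's real heuristics may differ.
NONPROD_PARTS = {"test", "tests", "vendor", "build"}

def effective_severity(path, severity, keep_nonprod_severity=False):
    """Downgrade by one tier for non-production paths unless the flag is set."""
    if keep_nonprod_severity:
        return severity
    if NONPROD_PARTS.isdisjoint(PurePosixPath(path).parts):
        return severity  # production path: unchanged
    index = TIERS.index(severity)
    return TIERS[min(index + 1, len(TIERS) - 1)]  # LOW stays LOW (assumption)

print(effective_severity("vendor/lib/parse.c", "HIGH"))
print(effective_severity("vendor/lib/parse.c", "HIGH", keep_nonprod_severity=True))
```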
+## Next steps
+
+- [CLI Reference](cli.md) — All flags and options
+- [Configuration](configuration.md) — Customize rules, exclusions, and behavior
+- [Detector Overview](detectors.md) — How the analysis engines work
+- [Rule Reference](rules/index.md) — Browse all rules by language
--- a/docs/rules/c.md
+++ b/docs/rules/c.md
@ -0,0 +1,89 @@
+# C Rules
+
+Nyx detects C vulnerabilities through AST patterns (banned functions, format strings) and taint analysis (user input → shell execution, buffer overflow sinks).
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `getenv` | `all` | EnvironmentConfig |
+| `fgets`, `scanf`, `fscanf`, `gets`, `read` | `all` | UserInput |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `system`, `popen`, `exec*` family | `SHELL_ESCAPE` |
+| `sprintf`, `strcpy`, `strcat` | `HTML_ESCAPE` |
+| `printf`, `fprintf` | `FMT_STRING` |
+| `fopen`, `open` | `FILE_IO` |
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety (Banned Functions)
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `c.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
+| `c.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination buffer |
+| `c.memory.strcat` | High | A | `strcat()` — no bounds checking on destination buffer |
+| `c.memory.sprintf` | High | A | `sprintf()` — no length limit on output buffer |
+| `c.memory.scanf_percent_s` | High | A | `scanf("%s")` — unbounded string read |
+| `c.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability (non-literal first arg) |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `c.cmdi.system` | High | A | `system()` — shell command execution |
+| `c.cmdi.popen` | Medium | A | `popen()` — shell command execution with pipe |
+
+---
+
+## Examples
+
+### `c.memory.gets` — Banned function
+
+**Vulnerable:**
+```c
+char buf[64];
+gets(buf);  // No bounds checking — buffer overflow
+```
+
+**Safe alternative:**
+```c
+char buf[64];
+fgets(buf, sizeof(buf), stdin);
+```
+
+### `c.memory.printf_no_fmt` — Format string
+
+**Vulnerable:**
+```c
+char *user_input = get_input();
+printf(user_input);  // Format string vulnerability
+```
+
+**Safe alternative:**
+```c
+char *user_input = get_input();
+printf("%s", user_input);
+```
+
+### `c.cmdi.system` — Shell execution
+
+**Vulnerable:**
+```c
+char cmd[256];
+snprintf(cmd, sizeof(cmd), "ls %s", user_dir);
+system(cmd);  // Command injection if user_dir contains shell metacharacters
+```
+
+**Safe alternative:**
+```c
+// Use execvp with explicit argument array
+char *args[] = {"ls", user_dir, NULL};
+execvp("ls", args);
+```
--- a/docs/rules/cpp.md
+++ b/docs/rules/cpp.md
@ -0,0 +1,66 @@
+# C++ Rules
+
+C++ rules inherit C banned-function concerns and add C++-specific patterns like dangerous casts.
+
+## Taint Labels
+
+C++ shares taint labels with C. See [C Rules](c.md) for the full source/sink/sanitizer listing.
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `cpp.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
+| `cpp.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination |
+| `cpp.memory.strcat` | High | A | `strcat()` — no bounds checking on destination |
+| `cpp.memory.sprintf` | High | A | `sprintf()` — no length limit on output |
+| `cpp.memory.reinterpret_cast` | Medium | A | `reinterpret_cast` — type-punning cast |
+| `cpp.memory.const_cast` | Medium | A | `const_cast` — removes const/volatile qualifier |
+| `cpp.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `cpp.cmdi.system` | High | A | `system()` — shell command execution |
+| `cpp.cmdi.popen` | High | A | `popen()` — shell command execution |
+
+---
+
+## Examples
+
+### `cpp.memory.reinterpret_cast` — Type-punning cast
+
+**Flagged:**
+```cpp
+int x = 42;
+float* fp = reinterpret_cast<float*>(&x);  // Type-punning, may violate strict aliasing
+```
+
+**Safe alternative:**
+```cpp
+#include <cstring>  // for std::memcpy
+
+int x = 42;
+float f;
+std::memcpy(&f, &x, sizeof(f));  // Well-defined type punning
+```
+
+### `cpp.memory.const_cast` — Removing const
+
+**Flagged:**
+```cpp
+void process(const std::string& s) {
+    char* p = const_cast<char*>(s.c_str());  // Removes const
+    p[0] = 'X';  // Undefined behavior
+}
+```
+
+**Safe alternative:**
+```cpp
+void process(std::string s) {  // Take by value
+    s[0] = 'X';
+}
+```
--- a/docs/rules/go.md
+++ b/docs/rules/go.md
@ -0,0 +1,148 @@
+# Go Rules
+
+Nyx detects Go vulnerabilities through AST patterns and taint analysis, covering command execution, unsafe pointer usage, TLS misconfiguration, weak crypto, SQL injection, hardcoded secrets, and deserialization.
+
+## Taint Labels
+
+Go has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/go.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `os.Getenv` | all |
+| `http.Request`, `r.FormValue`, `r.URL`, `r.Body`, `r.Header` | all |
+| `r.URL.Query`, `r.URL.Query.Get`, `Request.FormValue`, `Request.URL` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `html.EscapeString`, `template.HTMLEscapeString` | HTML_ESCAPE |
+| `url.QueryEscape`, `url.PathEscape` | URL_ENCODE |
+| `filepath.Clean`, `filepath.Base` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `exec.Command` | SHELL_ESCAPE |
+| `db.Query`, `db.Exec`, `db.QueryRow`, `db.Prepare` | SHELL_ESCAPE |
+| `fmt.Fprintf`, `fmt.Sprintf`, `fmt.Printf` | FMT_STRING |
+| `os.Open`, `os.OpenFile`, `os.Create`, `ioutil.ReadFile`, `os.ReadFile` | FILE_IO |
+| `template.HTML` | HTML_ESCAPE |
+
+> **Note:** Chained calls like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments before matching, so `r.URL.Query.Get` matches the source rule.
+
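The normalization described in the note can be illustrated on plain strings. This Python sketch is an approximation under stated assumptions: it works on source text with a regular expression, whereas the real matcher is not shown in this document, and it would mis-handle parentheses inside string literals. `SOURCES` holds a few matchers from the table above; `normalize_callee` is a hypothetical name.

```python
import re

# One innermost parenthesized group, e.g. `()` or `("host")`.
_CALL_ARGS = re.compile(r"\([^()]*\)")

def normalize_callee(expr):
    """Strip call-argument groups so a chained call collapses to a dotted path."""
    previous = None
    while previous != expr:  # repeat so nested groups are removed inside-out
        previous, expr = expr, _CALL_ARGS.sub("", expr)
    return expr

SOURCES = {"r.URL.Query.Get", "r.FormValue", "os.Getenv"}

callee = normalize_callee('r.URL.Query().Get("host")')
print(callee, callee in SOURCES)
```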
+---
+
+## AST Pattern Rules
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.cmdi.exec_command` | High | A | `exec.Command()` — arbitrary process execution |
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.memory.unsafe_pointer` | Medium | A | `unsafe.Pointer` — bypasses Go type system |
+
+### Insecure Transport
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.transport.insecure_skip_verify` | High | A | `InsecureSkipVerify: true` — disables TLS certificate validation |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.crypto.md5` | Low | A | `md5.New()` / `md5.Sum()` — weak hash algorithm |
+| `go.crypto.sha1` | Low | A | `sha1.New()` / `sha1.Sum()` — weak hash algorithm |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.sqli.query_concat` | Medium | B | `db.Query`/`Exec`/`QueryRow` with concatenated string |
+
+### Secrets
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.secrets.hardcoded_key` | Medium | A | Variable with secret-like name assigned a string literal |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.deser.gob_decode` | Medium | A | `gob.NewDecoder` — Go binary deserialization |
+
+---
+
+## Examples
+
+### `go.transport.insecure_skip_verify` — TLS misconfiguration
+
+**Vulnerable:**
+```go
+tr := &http.Transport{
+    TLSClientConfig: &tls.Config{
+        InsecureSkipVerify: true,  // Disables certificate verification
+    },
+}
+```
+
+**Safe alternative:**
+```go
+tr := &http.Transport{
+    TLSClientConfig: &tls.Config{
+        // Use proper CA certificates
+        RootCAs: certPool,
+    },
+}
+```
+
+### `go.sqli.query_concat` — SQL concatenation
+
+**Vulnerable:**
+```go
+rows, err := db.Query("SELECT * FROM users WHERE id=" + userID)
+```
+
+**Safe alternative:**
+```go
+rows, err := db.Query("SELECT * FROM users WHERE id=$1", userID)
+```
+
+### `go.secrets.hardcoded_key` — Hardcoded secret
+
+**Flagged:**
+```go
+apiKey := "sk-1234567890abcdef"
+password := "hunter2"
+```
+
+**Safe alternative:**
+```go
+apiKey := os.Getenv("API_KEY")
+password := os.Getenv("DB_PASSWORD")
+```
+
+### `go.cmdi.exec_command` — Command execution
+
+**Vulnerable:**
+```go
+cmd := exec.Command("sh", "-c", userInput)
+cmd.Run()
+```
+
+**Safe alternative:**
+```go
+// Use explicit command and arguments, not shell
+cmd := exec.Command("ls", "-la", safeDir)
+cmd.Run()
+```
--- a/docs/rules/index.md
+++ b/docs/rules/index.md
@ -0,0 +1,79 @@
+# Rule Reference
+
+This section lists every detection rule in Nyx, organized by language.
+
+## Rule ID Format
+
+| Prefix | Detector Family | Example |
+|--------|----------------|---------|
+| `taint-*` | [Taint analysis](../detectors/taint.md) | `taint-unsanitised-flow (source 5:11)` |
+| `cfg-*` | [CFG structural](../detectors/cfg.md) | `cfg-unguarded-sink`, `cfg-auth-gap` |
+| `state-*` | [State model](../detectors/state.md) | `state-use-after-close`, `state-resource-leak` |
+| `<lang>.*.*` | [AST patterns](../detectors/patterns.md) | `rs.memory.transmute`, `js.code_exec.eval` |
+
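The prefix table maps directly onto a small classifier, which can be handy when grouping findings by detector. The Python sketch below follows the prefixes in the table; the function name and the exact shape of the AST pattern regular expression (three dotted lowercase segments) are assumptions drawn from the examples.

```python
import re

# Language-prefixed AST pattern IDs such as `rs.memory.transmute`.
_AST_ID = re.compile(r"^[a-z]+\.[a-z0-9_]+\.[a-z0-9_]+$")

def detector_family(rule_id):
    """Map a rule ID to its detector family based on the documented prefixes."""
    if rule_id.startswith("taint-"):
        return "taint"
    if rule_id.startswith("cfg-"):
        return "cfg"
    if rule_id.startswith("state-"):
        return "state"
    if _AST_ID.match(rule_id):
        return "ast"
    return "unknown"

for rule in ["taint-unsanitised-flow (source 5:11)", "cfg-auth-gap",
             "state-use-after-close", "js.code_exec.eval"]:
    print(rule, "->", detector_family(rule))
```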
+## Cross-Language Rules
+
+These rules apply to all supported languages:
+
+### Taint Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `taint-unsanitised-flow (source L:C)` | Varies by source kind | Unsanitized data flows from source to sink |
+
+### CFG Structural Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `cfg-unguarded-sink` | High/Medium | Sink without dominating guard |
+| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth |
+| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
+| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
+| `cfg-unreachable-source` | Low | Source in unreachable code |
+| `cfg-error-fallthrough` | High/Medium | Error path doesn't terminate before dangerous code |
+| `cfg-resource-leak` | Medium | Resource not released on all exit paths |
+| `cfg-lock-not-released` | Medium | Lock not released on all exit paths |
+
+### State Model Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `state-use-after-close` | High | Variable used after being closed |
+| `state-double-close` | Medium | Resource closed twice |
+| `state-resource-leak` | Medium | Resource never closed (definite) |
+| `state-resource-leak-possible` | Low | Resource may not close on all paths |
+| `state-unauthed-access` | High | Privileged operation without authentication |
+
+## Per-Language AST Pattern Rules
+
+Each language page lists all AST pattern rules with examples:
+
+- [Rust](rust.md) — 12 rules (memory safety, code quality)
+- [C](c.md) — 8 rules (banned functions, command execution, format strings)
+- [C++](cpp.md) — 9 rules (banned functions, dangerous casts, command execution)
+- [Java](java.md) — 8 rules (deserialization, command execution, reflection, SQL, crypto, XSS)
+- [Go](go.md) — 8 rules (command execution, unsafe pointer, TLS, crypto, SQL, secrets, deserialization)
+- [JavaScript](javascript.md) — 13 rules (code execution, XSS, prototype pollution, crypto, transport)
+- [TypeScript](typescript.md) — 10 rules (mirrors JS + type-safety escapes)
+- [Python](python.md) — 13 rules (code execution, command execution, deserialization, SQL, crypto, XSS)
+- [PHP](php.md) — 11 rules (code execution, command execution, deserialization, SQL, path traversal, crypto)
+- [Ruby](ruby.md) — 11 rules (code execution, command execution, deserialization, reflection, SSRF, crypto)
+
+## Taint Label Coverage
+
+Taint analysis uses language-specific source/sink/sanitizer labels. Coverage varies by language:
+
+| Language | Sources | Sinks | Sanitizers | Coverage |
+|----------|---------|-------|------------|----------|
+| Rust | Complete | Complete | Complete | Full |
+| JavaScript | Complete | Complete | Partial | Full |
+| TypeScript | Partial | Partial | Partial | Moderate |
+| Python | Partial | Complete | Partial | Moderate |
+| C | Partial | Complete | Minimal | Moderate |
+| C++ | Partial | Complete | Minimal | Moderate |
+| Java | Partial | Partial | Partial | Moderate |
+| Go | Complete | Complete | Partial | Full |
+| PHP | Complete | Complete | Partial | Full |
+| Ruby | Partial | Partial | Partial | Moderate |
+
+"Moderate" coverage means basic rules exist but many common library functions are not yet labeled. Contributions are welcome.
--- a/docs/rules/java.md
+++ b/docs/rules/java.md
@ -0,0 +1,135 @@
+# Java Rules
+
+Nyx detects Java vulnerabilities through AST patterns and taint analysis, covering deserialization, command execution, reflection, SQL injection, weak crypto, and XSS.
+
+## Taint Labels
+
+Java has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/java.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `System.getenv` | all |
+| `getParameter`, `getInputStream`, `getHeader`, `getCookies`, `getReader`, `getQueryString`, `getPathInfo` | all |
+| `readObject`, `readLine` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `HtmlUtils.htmlEscape`, `StringEscapeUtils.escapeHtml4` | HTML_ESCAPE |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `Runtime.exec`, `ProcessBuilder` | SHELL_ESCAPE |
+| `executeQuery`, `executeUpdate`, `prepareStatement` | SHELL_ESCAPE |
+| `Class.forName` | SHELL_ESCAPE |
+| `println`, `print`, `write` | HTML_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.deser.readobject` | High | A | `ObjectInputStream.readObject()` — unsafe deserialization |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.cmdi.runtime_exec` | High | A | `Runtime.getRuntime().exec()` — shell command execution |
+
+### Reflection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.reflection.class_forname` | Medium | A | `Class.forName()` — dynamic class loading |
+| `java.reflection.method_invoke` | Medium | A | `Method.invoke()` — reflective method invocation |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.sqli.execute_concat` | Medium | B | SQL `execute*()` with concatenated string argument |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.crypto.insecure_random` | Low | A | `new Random()` — `java.util.Random` is not cryptographically secure |
+| `java.crypto.weak_digest` | Low | A | `MessageDigest.getInstance("MD5"/"SHA1")` |
+
+### XSS
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.xss.getwriter_print` | Medium | A | `response.getWriter().print/println/write` — direct output |
+
+---
+
+## Examples
+
+### `java.deser.readobject` — Unsafe deserialization
+
+**Vulnerable:**
+```java
+ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
+Object obj = ois.readObject();  // Arbitrary object instantiation
+```
+
+**Safe alternative:**
+```java
+// Use a safe format like JSON
+ObjectMapper mapper = new ObjectMapper();
+MyType obj = mapper.readValue(request.getInputStream(), MyType.class);
+```
+
+### `java.sqli.execute_concat` — SQL concatenation
+
+**Vulnerable:**
+```java
+String query = "SELECT * FROM users WHERE id=" + userId;
+stmt.executeQuery(query);  // SQL injection
+```
+
+**Safe alternative:**
+```java
+PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE id=?");
+ps.setString(1, userId);
+ResultSet rs = ps.executeQuery();
+```
+
+### `java.cmdi.runtime_exec` — Command execution
+
+**Vulnerable:**
+```java
+Runtime.getRuntime().exec("cmd /c " + userCommand);
+```
+
+**Safe alternative:**
+```java
+ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
+// Use explicit argument list, never concatenate user input
+```
+
+### `java.reflection.class_forname` — Dynamic class loading
+
+**Flagged:**
+```java
+Class<?> cls = Class.forName(className);
+Object obj = cls.getDeclaredConstructor().newInstance();
+```
+
+**Safe alternative:**
+```java
+// Use an allowlist of permitted class names
+Map<String, Class<?>> allowed = Map.of("User", User.class, "Order", Order.class);
+Class<?> cls = allowed.get(className);
+if (cls != null) { /* ... */ }
+```
--- a/docs/rules/javascript.md
+++ b/docs/rules/javascript.md
@ -0,0 +1,138 @@
+# JavaScript Rules
+
+JavaScript has the most complete taint label coverage alongside Rust. Nyx detects code execution, XSS, prototype pollution, command injection, and weak crypto.
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `document.location`, `window.location` | `all` | UserInput |
+| `req.body`, `req.query`, `req.params` | `all` | UserInput |
+| `req.headers`, `req.cookies` | `all` | UserInput |
+| `process.env` | `all` | EnvironmentConfig |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `eval` | `SHELL_ESCAPE` |
+| `innerHTML` | `HTML_ESCAPE` |
+| `location.href`, `window.location.href` | `URL_ENCODE` |
+| `child_process.exec`, `child_process.execSync` | `SHELL_ESCAPE` |
+| `child_process.spawn` | `SHELL_ESCAPE` |
+
+## Taint Sanitizers
+
+| Function | Strips Capability |
+|----------|------------------|
+| `JSON.parse` | `JSON_PARSE` |
+| `encodeURIComponent`, `encodeURI` | `URL_ENCODE` |
+| `DOMPurify.sanitize` | `HTML_ESCAPE` |
+
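One plausible reading of the source, sink, and sanitizer tables is a capability bitmask: a source taints a value with every capability (`all`), a sanitizer strips one capability, and a sink reports a finding when the capability it requires is still present. The Python sketch below illustrates that reading only. It is an assumption about the model, not Nyx's implementation, and the helper names are hypothetical.

```python
from enum import Flag, auto

class Cap(Flag):
    HTML_ESCAPE = auto()
    SHELL_ESCAPE = auto()
    URL_ENCODE = auto()
    JSON_PARSE = auto()

# A source marked `all` carries every capability.
ALL = Cap.HTML_ESCAPE | Cap.SHELL_ESCAPE | Cap.URL_ENCODE | Cap.JSON_PARSE

# A subset of the labels listed in the tables above.
SANITIZERS = {"encodeURIComponent": Cap.URL_ENCODE, "DOMPurify.sanitize": Cap.HTML_ESCAPE}
SINKS = {"innerHTML": Cap.HTML_ESCAPE, "eval": Cap.SHELL_ESCAPE}

def sanitize(taint, function_name):
    """Remove the capability that the named sanitizer strips."""
    return taint & ~SANITIZERS[function_name]

def reaches(taint, sink_name):
    """True if the value still carries the capability the sink requires."""
    return bool(taint & SINKS[sink_name])

tainted = ALL                                    # e.g. a value read from req.query
cleaned = sanitize(tainted, "DOMPurify.sanitize")
print(reaches(cleaned, "innerHTML"), reaches(cleaned, "eval"))
```

Under this reading, HTML sanitization clears the `innerHTML` flow but leaves the `eval` flow reportable, because a different capability is required there.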
+> **Note:** Anonymous function expressions and arrow functions passed as callback arguments (e.g., Express `app.get('/path', function(req, res) { ... })`) are automatically walked as separate function scopes for taint analysis. Each anonymous function gets a unique scope identifier to prevent cross-function taint leakage.
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `js.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
+| `js.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
+
+### XSS Sinks
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
+| `js.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
+| `js.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
+| `js.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` — open redirect |
+| `js.xss.cookie_write` | Medium | A | Write to `document.cookie` |
+
+### Prototype Pollution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
+| `js.prototype.extend_object` | Medium | A | Assignment to `Object.prototype.*` |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.crypto.weak_hash` | Low | A | `crypto.createHash("md5"/"sha1")` |
+| `js.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
+
+### Insecure Transport
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.transport.fetch_http` | Low | A | `fetch("http://...")` — plaintext HTTP |
+
+---
+
+## Examples
+
+### `js.code_exec.eval` — Dynamic code execution
+
+**Vulnerable:**
+```javascript
+const code = req.query.code;
+eval(code);  // Remote code execution
+```
+
+**Safe alternative:**
+```javascript
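+// Avoid eval: dispatch to an allowlist; Object.hasOwn rejects inherited keys like "constructor"
+const allowed = { add: (a, b) => a + b };
+const op = req.query.operation;
+const result = Object.hasOwn(allowed, op)
+    ? allowed[op](Number(req.query.a), Number(req.query.b))
+    : undefined;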
+```
+
+### `js.xss.document_write` — XSS sink
+
+**Vulnerable:**
+```javascript
+document.write("<h1>" + userName + "</h1>");
+```
+
+**Safe alternative:**
+```javascript
+const el = document.createElement("h1");
+el.textContent = userName;
+document.body.appendChild(el);
+```
+
+### `js.prototype.proto_assignment` — Prototype pollution
+
+**Vulnerable:**
+```javascript
+function merge(target, source) {
+    for (let key in source) {
+        target[key] = source[key];  // If key is "__proto__", pollutes prototype
+    }
+}
+```
+
+**Safe alternative:**
+```javascript
+function merge(target, source) {
+    for (let key in source) {
+        if (key === "__proto__" || key === "constructor") continue;
+        target[key] = source[key];
+    }
+}
+```
+
+### Taint: `req.body` → `eval()`
+
+**Finding:**
+```
+[HIGH]   taint-unsanitised-flow (source 2:18)  src/handler.js:3:5
+         Source: req.body at 2:18
+         Sink: eval()
+         Score: 78
+```
--- a/docs/rules/php.md
+++ b/docs/rules/php.md
@ -0,0 +1,138 @@
+# PHP Rules
+
+Nyx detects PHP vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, path traversal, and weak crypto.
+
+## Taint Labels
+
+PHP has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/php.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `$_GET` / `_GET`, `$_POST` / `_POST`, `$_REQUEST` / `_REQUEST`, `$_COOKIE` / `_COOKIE`, `$_FILES` / `_FILES`, `$_SERVER` / `_SERVER`, `$_ENV` / `_ENV` | all |
+| `file_get_contents`, `fread` | all |
+
+> **Note:** PHP superglobal names are matched both with and without the `$` prefix because the CFG's `collect_idents` strips the leading `$` from variable names. Subscript access like `$_GET['cmd']` is handled via `element_reference` / `subscript_expression` node detection.
+
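The effect of matching superglobals with and without the `$` prefix can be sketched as a normalize-then-lookup step. This Python fragment is illustrative only; `is_superglobal_source` is a hypothetical helper and does not reflect how `collect_idents` is written.

```python
SUPERGLOBALS = {"_GET", "_POST", "_REQUEST", "_COOKIE", "_FILES", "_SERVER", "_ENV"}

def is_superglobal_source(identifier):
    """Accept both `$_GET` and `_GET` by dropping at most one leading `$`."""
    name = identifier[1:] if identifier.startswith("$") else identifier
    return name in SUPERGLOBALS

print(is_superglobal_source("$_GET"), is_superglobal_source("_GET"),
      is_superglobal_source("$page"))
```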
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `htmlspecialchars`, `htmlentities` | HTML_ESCAPE |
+| `escapeshellarg`, `escapeshellcmd` | SHELL_ESCAPE |
+| `basename` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `system`, `exec`, `passthru`, `shell_exec`, `proc_open`, `popen` | SHELL_ESCAPE |
+| `eval`, `assert` | SHELL_ESCAPE |
+| `include`, `include_once`, `require`, `require_once` | FILE_IO |
+| `unserialize` | SHELL_ESCAPE |
+| `move_uploaded_file`, `copy`, `file_put_contents`, `fwrite` | FILE_IO |
+| `echo`, `print` | HTML_ESCAPE |
+| `mysqli_query`, `pg_query`, `query` | SHELL_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `php.code_exec.create_function` | High | A | `create_function()` — deprecated eval-like constructor |
+| `php.code_exec.preg_replace_e` | High | A | `preg_replace` with `/e` modifier — code execution via regex |
+| `php.code_exec.assert_string` | High | A | `assert()` with string argument — evaluates PHP code |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.cmdi.system` | High | A | `system`/`shell_exec`/`exec`/`passthru`/`proc_open`/`popen` |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.deser.unserialize` | High | A | `unserialize()` — PHP object injection |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.sqli.query_concat` | Medium | B | `mysql_query`/`mysqli_query` with concatenated SQL |
+
+### Path Traversal
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.path.include_variable` | High | B | `include`/`require` with variable path — file inclusion |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.crypto.md5` | Low | A | `md5()` — weak hash function |
+| `php.crypto.sha1` | Low | A | `sha1()` — weak hash function |
+| `php.crypto.rand` | Low | A | `rand()`/`mt_rand()` — not cryptographically secure |
+
+---
+
+## Examples
+
+### `php.code_exec.eval` — Dynamic code execution
+
+**Vulnerable:**
+```php
+eval($_GET['code']);
+```
+
+**Safe alternative:**
+```php
+// Never use eval with user input
+// Use a template engine or allowlisted operations
+```
+
+### `php.deser.unserialize` — Object injection
+
+**Vulnerable:**
+```php
+$obj = unserialize($_COOKIE['data']);
+```
+
+**Safe alternative:**
+```php
+$data = json_decode($_COOKIE['data'], true);
+```
+
+### `php.path.include_variable` — File inclusion
+
+**Vulnerable:**
+```php
+include($_GET['page']);  // Local/remote file inclusion
+```
+
+**Safe alternative:**
+```php
+$allowed = ['home', 'about', 'contact'];
+$page = in_array($_GET['page'] ?? '', $allowed, true) ? $_GET['page'] : 'home';
+include("pages/{$page}.php");
+```
+
+### `php.sqli.query_concat` — SQL concatenation
+
+**Vulnerable:**
+```php
+mysqli_query($conn, "SELECT * FROM users WHERE id=" . $_GET['id']);
+```
+
+**Safe alternative:**
+```php
+$stmt = $conn->prepare("SELECT * FROM users WHERE id=?");
+$stmt->bind_param("i", $_GET['id']);
+$stmt->execute();
+```
--- a/docs/rules/python.md
+++ b/docs/rules/python.md
@ -0,0 +1,142 @@
+# Python Rules
+
+Nyx detects Python vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, and weak crypto.
+
+## Taint Labels
+
+Python has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/python.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `os.getenv`, `os.environ` | all |
+| `request.args`, `request.form`, `request.json`, `request.headers`, `request.cookies`, `input` | all |
+| `sys.argv` | all |
+| `argparse.parse_args`, `urllib.request.urlopen`, `requests.get`, `requests.post` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `html.escape` | HTML_ESCAPE |
+| `shlex.quote` | SHELL_ESCAPE |
+| `os.path.realpath` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `eval`, `exec` | SHELL_ESCAPE |
+| `os.system`, `os.popen`, `subprocess.call`, `subprocess.run`, `subprocess.Popen`, `subprocess.check_output`, `subprocess.check_call` | SHELL_ESCAPE |
+| `cursor.execute`, `cursor.executemany` | SHELL_ESCAPE |
+| `send_file`, `send_from_directory` | FILE_IO |
+| `open` | FILE_IO |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `py.code_exec.exec` | High | A | `exec()` — dynamic code execution |
+| `py.code_exec.compile` | Medium | A | `compile()` with exec/eval mode |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.cmdi.os_system` | High | A | `os.system()` — shell command execution |
+| `py.cmdi.os_popen` | High | A | `os.popen()` — shell command execution |
+| `py.cmdi.subprocess_shell` | High | B | `subprocess.*` with `shell=True` |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.deser.pickle_loads` | High | A | `pickle.loads()` / `pickle.load()` — arbitrary object deserialization |
+| `py.deser.yaml_load` | High | A | `yaml.load()` without SafeLoader |
+| `py.deser.shelve_open` | Medium | A | `shelve.open()` — pickle-backed deserialization |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.sqli.execute_format` | Medium | B | `cursor.execute()` with string concatenation |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.crypto.md5` | Low | A | `hashlib.md5()` — weak hash algorithm |
+| `py.crypto.sha1` | Low | A | `hashlib.sha1()` — weak hash algorithm |
+
+### Template Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.xss.jinja_from_string` | Medium | A | `jinja2.Template.from_string()` — template injection |
+
+---
+
+## Examples
+
+### `py.deser.pickle_loads` — Unsafe deserialization
+
+**Vulnerable:**
+```python
+import pickle
+data = pickle.loads(request.body)  # Arbitrary code execution
+```
+
+**Safe alternative:**
+```python
+import json
+data = json.loads(request.body)  # JSON is safe
+```
+
+### `py.cmdi.subprocess_shell` — Shell execution
+
+**Vulnerable:**
+```python
+import subprocess
+subprocess.call(user_input, shell=True)  # Command injection
+```
+
+**Safe alternative:**
+```python
+import subprocess
+# Run a fixed program with an explicit argument list; no shell is spawned.
+# "--" ends option parsing so user_dir cannot be interpreted as a flag.
+subprocess.call(["ls", "-la", "--", user_dir])
+```
+
+### `py.deser.yaml_load` — Unsafe YAML
+
+**Vulnerable:**
+```python
+import yaml
+config = yaml.load(user_data, Loader=yaml.Loader)  # Can instantiate arbitrary objects
+```
+
+**Safe alternative:**
+```python
+import yaml
+config = yaml.safe_load(user_data)  # Only basic Python types
+```
+
+### `py.sqli.execute_format` — SQL concatenation
+
+**Vulnerable:**
+```python
+cursor.execute("SELECT * FROM users WHERE id=" + user_id)
+```
+
+**Safe alternative:**
+```python
+cursor.execute("SELECT * FROM users WHERE id=?", (user_id,))
+```
--- a/docs/rules/ruby.md
+++ b/docs/rules/ruby.md
@ -0,0 +1,132 @@
+# Ruby Rules
+
+Nyx detects Ruby vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, reflection, SSRF, and weak crypto.
+
+## Taint Labels
+
+Ruby has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/ruby.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `ENV`, `gets` | all |
+| `params` | all |
+
+> **Note:** Ruby's `params[:cmd]` subscript access is detected via `element_reference` node handling in the CFG. Sinatra/Rails `do...end` blocks are walked as function scopes.
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `CGI.escapeHTML`, `ERB::Util.html_escape` | HTML_ESCAPE |
+| `Shellwords.escape`, `Shellwords.shellescape` | SHELL_ESCAPE |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `system`, `exec` | SHELL_ESCAPE |
+| `eval` | SHELL_ESCAPE |
+| `puts`, `print` | HTML_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.code_exec.eval` | High | A | `Kernel#eval` — dynamic code execution |
+| `rb.code_exec.instance_eval` | High | A | `instance_eval` — evaluates string in object context |
+| `rb.code_exec.class_eval` | High | A | `class_eval` / `module_eval` — evaluates string in class context |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.cmdi.backtick` | High | A | Backtick shell execution (`` `cmd` ``) |
+| `rb.cmdi.system_interp` | High | A | `system`/`exec` call — command execution risk |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.deser.yaml_load` | High | A | `YAML.load` — arbitrary object deserialization |
+| `rb.deser.marshal_load` | High | A | `Marshal.load` — arbitrary Ruby object deserialization |
+
+### Reflection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.reflection.send_dynamic` | Medium | B | `send()` with non-symbol argument — arbitrary method dispatch |
+| `rb.reflection.constantize` | Medium | A | `constantize` / `safe_constantize` — dynamic class resolution |
+
+### SSRF
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.ssrf.open_uri` | Medium | A | `Kernel#open` with HTTP URL — SSRF via open-uri |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.crypto.md5` | Low | A | `Digest::MD5` — weak hash algorithm |
+
+---
+
+## Examples
+
+### `rb.deser.yaml_load` — Unsafe YAML deserialization
+
+**Vulnerable:**
+```ruby
+data = YAML.load(params[:config])  # Arbitrary object instantiation on Psych < 4 (Psych 4+ moved this behavior to YAML.unsafe_load)
+```
+
+**Safe alternative:**
+```ruby
+data = YAML.safe_load(params[:config])  # Only basic Ruby types
+```
+
+### `rb.cmdi.backtick` — Backtick shell execution
+
+**Vulnerable:**
+```ruby
+output = `ls #{user_dir}`  # Command injection via interpolation
+```
+
+**Safe alternative:**
+```ruby
+require 'open3'
+output, status = Open3.capture2('ls', '--', user_dir)  # No shell; '--' stops option parsing
+```
+
+### `rb.reflection.send_dynamic` — Dynamic method dispatch
+
+**Vulnerable:**
+```ruby
+obj.send(params[:method], params[:arg])  # Arbitrary method invocation
+```
+
+**Safe alternative:**
+```ruby
+allowed = %w[name email phone]
+if allowed.include?(params[:method])
+  obj.public_send(params[:method])  # Allowlisted name, public methods only
+end
+```
+
+### `rb.deser.marshal_load` — Marshal deserialization
+
+**Vulnerable:**
+```ruby
+obj = Marshal.load(request.body.read)
+```
+
+**Safe alternative:**
+```ruby
+data = JSON.parse(request.body.read)  # Plain data only; needs require 'json'
+```
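One way to read the taint label tables in `docs/rules/ruby.md` above is as a capability check: a source taints a value for every capability, a sanitizer clears one capability, and a sink reports when the capability it is sensitive to is still set. The sketch below models that reading in Rust, since the labels are said to live in `src/labels/ruby.rs`. Every name in it (`Label`, `label_for`, `reaches_sink`, the bit values) is hypothetical and chosen for illustration; it is a simplified model of the tables, not Nyx's actual implementation.

```rust
// Hypothetical model of the documented Ruby taint tables; not Nyx's real types.
const HTML_ESCAPE: u8 = 0b01;
const SHELL_ESCAPE: u8 = 0b10;
const ALL: u8 = HTML_ESCAPE | SHELL_ESCAPE;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Label {
    Source(u8),    // capabilities the source taints
    Sanitizer(u8), // capability the sanitizer clears
    Sink(u8),      // capability the sink is sensitive to
}

/// Look up a call name in a table that copies the documented rows.
fn label_for(name: &str) -> Option<Label> {
    match name {
        "ENV" | "gets" | "params" => Some(Label::Source(ALL)),
        "CGI.escapeHTML" | "ERB::Util.html_escape" => Some(Label::Sanitizer(HTML_ESCAPE)),
        "Shellwords.escape" | "Shellwords.shellescape" => Some(Label::Sanitizer(SHELL_ESCAPE)),
        "system" | "exec" | "eval" => Some(Label::Sink(SHELL_ESCAPE)),
        "puts" | "print" => Some(Label::Sink(HTML_ESCAPE)),
        _ => None,
    }
}

/// Walk a straight-line chain of calls applied to one value and report
/// whether any sink is reached while its capability is still tainted.
fn reaches_sink(chain: &[&str]) -> bool {
    let mut taint: u8 = 0;
    for name in chain {
        match label_for(name) {
            Some(Label::Source(caps)) => taint |= caps,
            Some(Label::Sanitizer(cap)) => taint &= !cap,
            Some(Label::Sink(cap)) if (taint & cap) != 0 => return true,
            _ => {}
        }
    }
    false
}
```

Under this model, `reaches_sink(&["params", "system"])` reports a finding, while inserting `Shellwords.escape` between the two calls does not. The same shell-escaped value still counts as tainted for `puts`, because only the `SHELL_ESCAPE` bit was cleared.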
--- a/docs/rules/rust.md
+++ b/docs/rules/rust.md
@ -0,0 +1,105 @@
+# Rust Rules
+
+Nyx detects Rust vulnerabilities through AST patterns (memory safety, code quality) and taint analysis (command injection via `env::var` → `Command::new`, path traversal via `env::var` → `fs::read_to_string`).
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `std::env::var`, `env::var` | `all` | EnvironmentConfig |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `Command::new`, `Command::arg`, `Command::args` | `SHELL_ESCAPE` |
+| `Command::status`, `Command::output` | `SHELL_ESCAPE` |
+| `fs::read_to_string`, `fs::write`, `fs::read`, `File::open`, `File::create` | `FILE_IO` |
+
+## Taint Sanitizers
+
+| Function | Strips Capability |
+|----------|------------------|
+| `html_escape::encode_safe`, `sanitize_html` | `HTML_ESCAPE` |
+| `shell_escape::unix::escape`, `sanitize_shell` | `SHELL_ESCAPE` |
+
+> **Note:** `fs::read_to_string` is classified as a sink rather than a source (it was previously a source), so that path traversal flows such as `env::var` → `fs::read_to_string` are reported.
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rs.memory.transmute` | High | A | `std::mem::transmute` — unchecked type reinterpretation |
+| `rs.memory.copy_nonoverlapping` | High | A | `ptr::copy_nonoverlapping` — raw pointer memcpy |
+| `rs.memory.get_unchecked` | High | A | `get_unchecked` / `get_unchecked_mut` — unchecked indexing |
+| `rs.memory.mem_zeroed` | High | A | `std::mem::zeroed` — UB for types where all-zero bytes are not a valid value (references, `NonZero*`, function pointers) |
+| `rs.memory.ptr_read` | High | A | `ptr::read` / `ptr::read_volatile` — raw pointer dereference |
+| `rs.memory.narrow_cast` | Low | A | `as u8`/`i8`/`u16`/`i16` — possible truncation |
+| `rs.memory.mem_forget` | Low | A | `std::mem::forget` — may leak resources |
+
+### Code Quality
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rs.quality.unsafe_block` | Medium | A | `unsafe { }` block — manual memory safety obligation |
+| `rs.quality.unsafe_fn` | Medium | A | `unsafe fn` declaration |
+| `rs.quality.unwrap` | Low | A | `.unwrap()` — panics on `None`/`Err` |
+| `rs.quality.expect` | Low | A | `.expect()` — panics on `None`/`Err` |
+| `rs.quality.panic_macro` | Low | A | `panic!()` macro invocation |
+| `rs.quality.todo` | Low | A | `todo!()` / `unimplemented!()` placeholder |
+
+---
+
+## Examples
+
+### `rs.memory.transmute` — Unchecked type reinterpretation
+
+**Vulnerable:**
+```rust
+let x: u32 = 42;
+let y: f32 = unsafe { std::mem::transmute(x) };
+```
+
+**Safe alternative:**
+```rust
+let x: u32 = 42;
+let y: f32 = f32::from_bits(x);
+```
+
+### `rs.quality.unsafe_block` — Unsafe block
+
+**Flagged:**
+```rust
+unsafe {
+    let ptr = &x as *const i32;
+    println!("{}", *ptr);
+}
+```
+
+**Safe alternative:**
+```rust
+// Use safe abstractions when possible
+println!("{}", x);
+```
+
+### Taint: `env::var` → `Command::new`
+
+**Vulnerable:**
+```rust
+let cmd = std::env::var("USER_CMD").unwrap();
+std::process::Command::new("sh").arg("-c").arg(&cmd).output()?;
+```
+
+**Safe alternative:**
+```rust
+let cmd = std::env::var("USER_CMD").unwrap();
+// Validate against allowlist
+let allowed = ["ls", "whoami", "date"];
+if allowed.contains(&cmd.as_str()) {
+    std::process::Command::new(&cmd).output()?;
+}
+```
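The Examples section of `docs/rules/rust.md` above does not cover `rs.memory.narrow_cast` or `rs.quality.unwrap`. The sketch below shows one possible safe alternative for each, using only the standard library. The function names are hypothetical, and whether Nyx stops flagging the rewritten code depends on its rule patterns, which this sketch does not model.

```rust
use std::convert::TryFrom; // Already in the prelude from the 2021 edition onward
use std::num::ParseIntError;

/// The flagged pattern: `as` silently truncates, so 300 becomes 44.
fn narrow_with_as(n: u32) -> u8 {
    n as u8
}

/// `u8::try_from` reports out-of-range values instead of truncating.
fn narrow_checked(n: u32) -> Option<u8> {
    u8::try_from(n).ok()
}

/// Propagate the parse error with `?` rather than panicking via `.unwrap()`.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    let port: u16 = text.trim().parse()?;
    Ok(port)
}
```

With these definitions, `narrow_checked(300)` returns `None` where `narrow_with_as(300)` returns a wrapped value, and `parse_port("99999")` returns an `Err` that the caller can handle instead of a panic.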
--- a/docs/rules/typescript.md
+++ b/docs/rules/typescript.md
@ -0,0 +1,81 @@
+# TypeScript Rules
+
+TypeScript rules mirror the JavaScript patterns and add detectors for TypeScript-specific type-safety escapes. Taint labels are shared with JavaScript (see [JavaScript Rules](javascript.md)).
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `ts.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
+| `ts.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
+
+### XSS Sinks
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
+| `ts.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
+| `ts.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
+| `ts.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` |
+| `ts.xss.cookie_write` | Low | A | Write to `document.cookie` |
+
+### Prototype Pollution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
+
+### Code Quality (TypeScript-specific)
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.quality.any_annotation` | Low | A | Type annotation of `any` — disables type checking |
+| `ts.quality.as_any` | Low | A | Type assertion `as any` — type-safety escape hatch |
+
+---
+
+## Examples
+
+### `ts.quality.any_annotation` — `any` type
+
+**Flagged:**
+```typescript
+function process(data: any) {  // ts.quality.any_annotation
+    data.whatever();  // No type checking
+}
+```
+
+**Safe alternative:**
+```typescript
+interface UserData { name: string; email: string; }
+function process(data: UserData) {
+    console.log(data.name);
+}
+```
+
+### `ts.quality.as_any` — Type assertion escape
+
+**Flagged:**
+```typescript
+const result = someValue as any;  // ts.quality.as_any
+result.nonexistentMethod();
+```
+
+**Safe alternative:**
+```typescript
+// isValidType is assumed to be a user-defined type guard: (v: unknown) => v is KnownType
+if (isValidType(someValue)) {
+    someValue.knownMethod();  // Narrowed by the guard; no assertion needed
+}
+```