mirror of
https://github.com/elicpeter/nyx.git
synced 2026-06-06 19:35:13 +02:00
Phase 1 (#33)
* chore: Exclude CLAUDE.md from Cargo.toml
* feat: add callgraph module and integrate into main analysis flow
* feat: enhance CLI with new severity filtering and analysis modes
* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling
* feat: implement state-model dataflow analysis for resource lifecycle and auth state
* feat: enhance diagnostic output formatting and add evidence structure
* feat: implement attack surface ranking for diagnostics with scoring and sorting
* feat: add comprehensive documentation for installation, usage, and rules reference
* feat: add multiple language support for command execution and evaluation endpoints
* feat: implement inline suppression for findings using `nyx:ignore` comments
* feat: add confidence levels to AST patterns and update output structure
* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets
* feat: bump version to 0.4.0 and update changelog with new features and improvements
* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
parent 19b578c5c4
commit 1bbe4b1cfb
456 changed files with 25628 additions and 1228 deletions
234
docs/cli.md
Normal file
@ -0,0 +1,234 @@
# CLI Reference

## Global

```
nyx [COMMAND]
nyx --version
nyx --help
```

---

## `nyx scan`

Run a security scan on a directory.

```
nyx scan [PATH] [OPTIONS]
```

**PATH** defaults to `.` (current directory).

### Analysis Mode

| Flag | Default | Description |
|------|---------|-------------|
| `--mode <MODE>` | `full` | Analysis mode: `full`, `ast`, `cfg`, or `taint` |

| Mode | What runs |
|------|-----------|
| `full` | AST patterns + CFG structural analysis + taint analysis |
| `ast` | AST patterns only (fastest, no CFG or taint) |
| `cfg` / `taint` | CFG + taint analysis only (no AST patterns) |

**Deprecated aliases**: `--ast-only` (use `--mode ast`), `--cfg-only` (use `--mode cfg`), `--all-targets` (use `--mode full`).

### Index Control

| Flag | Default | Description |
|------|---------|-------------|
| `--index <MODE>` | `auto` | Index behavior: `auto`, `off`, or `rebuild` |

| Index Mode | Behavior |
|------------|----------|
| `auto` | Use existing index if available; build if missing |
| `off` | Skip indexing, scan filesystem directly |
| `rebuild` | Force rebuild index before scanning |

**Deprecated aliases**: `--no-index` (use `--index off`), `--rebuild-index` (use `--index rebuild`).

### Output

| Flag | Default | Description |
|------|---------|-------------|
| `-f, --format <FMT>` | `console` | Output format: `console`, `json`, or `sarif` |
| `--quiet` | off | Suppress status messages (stderr); stdout stays clean |
| `--no-rank` | off | Disable attack-surface ranking |

### Filtering

| Flag | Default | Description |
|------|---------|-------------|
| `--severity <EXPR>` | *(none)* | Filter findings by severity |
| `--min-score <N>` | *(none)* | Drop findings with rank score below N |
| `--min-confidence <LEVEL>` | *(none)* | Drop findings below this confidence level (`low`, `medium`, `high`) |
| `--fail-on <SEV>` | *(none)* | Exit code 1 if any finding >= this severity |
| `--show-suppressed` | off | Show inline-suppressed findings (dimmed, tagged `[SUPPRESSED]`) |
| `--keep-nonprod-severity` | off | Don't downgrade severity for test/vendor paths |
| `--all` | off | Disable category filtering, rollups, and LOW budgets; show everything |
| `--include-quality` | off | Include Quality-category findings (hidden by default) |
| `--max-low <N>` | `20` | Maximum total LOW findings to show |
| `--max-low-per-file <N>` | `1` | Maximum LOW findings per file |
| `--max-low-per-rule <N>` | `10` | Maximum LOW findings per rule |
| `--rollup-examples <N>` | `5` | Number of example locations in rollup findings |
| `--show-instances <RULE>` | *(none)* | Expand all instances of a specific rule (bypass rollup) |

**Severity expression formats**:

```bash
--severity HIGH            # Only high
--severity "HIGH,MEDIUM"   # High or medium
--severity ">=MEDIUM"      # Medium and above (high + medium)
--severity ">= low"        # All severities (case-insensitive)
```
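
The expression forms listed here are either an explicit comma-separated set or a `>=` threshold. As a rough illustration of those semantics, the Python sketch below turns an expression into the set of severities it selects. It is not Nyx's parser: the function name `parse_severity_expr` is hypothetical, and treating every form as case-insensitive is an assumption drawn from the `">= low"` example.

```python
# Illustrative only: models the documented expression forms, not Nyx's real parser.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH"]  # ascending order


def parse_severity_expr(expr):
    """Return the set of severities selected by a --severity expression."""
    text = expr.strip().upper()  # assumption: all forms are case-insensitive
    if text.startswith(">="):
        floor = text[2:].strip()
        if floor not in SEVERITY_ORDER:
            raise ValueError("unknown severity: %r" % floor)
        # Threshold form: the named level and everything above it.
        return set(SEVERITY_ORDER[SEVERITY_ORDER.index(floor):])
    # List form: one or more comma-separated levels.
    selected = {part.strip() for part in text.split(",") if part.strip()}
    if not selected or selected - set(SEVERITY_ORDER):
        raise ValueError("unknown severity in %r" % expr)
    return selected
```

Under these assumptions, `parse_severity_expr(">=MEDIUM")` selects Medium and High, and `parse_severity_expr(">= low")` selects every level.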

**Deprecated aliases**: `--high-only` (use `--severity HIGH`), `--include-nonprod` (use `--keep-nonprod-severity`).

### Examples

```bash
# Basic scan
nyx scan

# Scan specific path, JSON output
nyx scan ./server --format json

# CI gate: fail on medium+, SARIF output
nyx scan . --format sarif --fail-on medium > results.sarif

# Fast AST-only scan, no index
nyx scan . --mode ast --index off

# High-severity only, quiet mode
nyx scan . --severity HIGH --quiet

# Only findings scoring 50 or above
nyx scan . --min-score 50

# Only medium+ confidence findings
nyx scan . --min-confidence medium

# Show everything (no filtering, no rollups)
nyx scan . --all

# Include quality findings but keep rollups and budgets
nyx scan . --include-quality

# See all unwrap findings expanded
nyx scan . --include-quality --show-instances rs.quality.unwrap

# Allow more LOW findings
nyx scan . --max-low 50 --max-low-per-file 5
```

---

## `nyx index`

Manage the SQLite file index.

### `nyx index build`

```
nyx index build [PATH] [--force]
```

Build or update the index for the given path (default: `.`).

| Flag | Description |
|------|-------------|
| `-f, --force` | Force full rebuild, ignoring cached file hashes |

### `nyx index status`

```
nyx index status [PATH]
```

Display index statistics (file count, size, last modified) for the given path.

---

## `nyx list`

```
nyx list [-v]
```

List all indexed projects.

| Flag | Description |
|------|-------------|
| `-v, --verbose` | Show detailed information per project |

---

## `nyx clean`

```
nyx clean [PROJECT] [--all]
```

Remove index data.

| Argument/Flag | Description |
|---------------|-------------|
| `PROJECT` | Project name or path to clean |
| `--all` | Clean all indexed projects |

---

## `nyx config`

Manage configuration.

### `nyx config show`

Print the effective merged configuration as TOML.

### `nyx config path`

Print the configuration directory path.

### `nyx config add-rule`

```
nyx config add-rule --lang <LANG> --matcher <MATCHER> --kind <KIND> --cap <CAP>
```

Add a custom taint rule. Written to `nyx.local`.

| Flag | Values |
|------|--------|
| `--lang` | `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby` |
| `--matcher` | Function or property name to match |
| `--kind` | `source`, `sanitizer`, `sink` |
| `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `all` |

### `nyx config add-terminator`

```
nyx config add-terminator --lang <LANG> --name <NAME>
```

Add a terminator function (e.g. `process.exit`). Written to `nyx.local`.

---

## Exit Codes

| Code | Meaning |
|------|---------|
| `0` | Scan completed; no findings matched `--fail-on` threshold (or no `--fail-on` specified) |
| `1` | Scan completed but at least one finding met or exceeded the `--fail-on` severity |
| Non-zero | Error during scan (I/O error, config parse error, database error, etc.) |
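
For a scan that completes without errors, the first two rows of this table reduce to a threshold comparison. The Python sketch below models that rule under stated assumptions. The function name `scan_exit_code` is hypothetical, and the table does not say whether findings hidden by other filters still count toward `--fail-on`, so the sketch takes the list of findings as given.

```python
# Illustrative model of the exit-code table; not taken from Nyx's source.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def scan_exit_code(finding_severities, fail_on=None):
    """Return 0 or 1 for a scan that completed without an error."""
    if fail_on is None:
        return 0  # no --fail-on given: findings never change the exit code
    threshold = SEVERITY_RANK[fail_on.lower()]
    # Exit 1 as soon as any finding meets or exceeds the threshold.
    hit = any(SEVERITY_RANK[s.lower()] >= threshold for s in finding_severities)
    return 1 if hit else 0
```

A CI job that runs with `--fail-on medium` would therefore pass when only Low findings are present and fail once a Medium or High finding appears.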

---

## Environment Variables

| Variable | Description |
|----------|-------------|
| `RUST_LOG` | Set tracing verbosity (e.g. `RUST_LOG=debug nyx scan .`) |
| `NO_COLOR` | Disable ANSI color output |

183
docs/configuration.md
Normal file
@ -0,0 +1,183 @@
# Configuration

Nyx uses TOML configuration files. A default config is auto-generated on first run.

## File Locations

| Platform | Directory |
|----------|-----------|
| Linux | `~/.config/nyx/` |
| macOS | `~/Library/Application Support/nyx/` |
| Windows | `%APPDATA%\elicpeter\nyx\config\` |

Run `nyx config path` to see the exact directory on your system.

## File Precedence

1. **`nyx.conf`**: Default config (auto-created from built-in template on first run)
2. **`nyx.local`**: User overrides (loaded on top of defaults)

Both files are optional. CLI flags take precedence over both.

## Merge Strategy

| Type | Behavior |
|------|----------|
| Scalars (`mode`, `min_severity`, booleans) | User value wins |
| Arrays (`excluded_extensions`, `excluded_directories`) | Union + deduplicate |
| Analysis rules | Per-language union with deduplication |

Example:
```toml
# nyx.conf (default):
excluded_extensions = ["jpg", "png", "exe"]

# nyx.local (user):
excluded_extensions = ["foo", "jpg"]

# Effective result:
# ["exe", "foo", "jpg", "png"] (sorted, deduped union)
```
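
The scalar and array rows of the merge table can be sketched in a few lines of Python. This is a minimal model, not Nyx's merge code: `merge_config` is a hypothetical name, configs are represented as plain nested dicts, and only scalars, string arrays, and nested tables are handled (arrays of rule objects, which the table says are unioned per language, are out of scope here).

```python
# Illustrative sketch of the documented merge strategy for scalars and string arrays.
def merge_config(defaults, user):
    """Merge a user config (nyx.local) on top of defaults (nyx.conf)."""
    merged = dict(defaults)  # shallow copy so the defaults are left untouched
    for key, value in user.items():
        base = merged.get(key)
        if isinstance(value, dict) and isinstance(base, dict):
            merged[key] = merge_config(base, value)        # nested table: recurse
        elif isinstance(value, list) and isinstance(base, list):
            merged[key] = sorted(set(base) | set(value))   # arrays: sorted, deduped union
        else:
            merged[key] = value                            # scalars: user value wins
    return merged
```

Applied to the example in this section, merging `["foo", "jpg"]` over `["jpg", "png", "exe"]` gives `["exe", "foo", "jpg", "png"]`.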

---

## Full Schema

### `[scanner]`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `mode` | `"full"` \| `"ast"` \| `"cfg"` \| `"taint"` | `"full"` | Analysis mode |
| `min_severity` | `"Low"` \| `"Medium"` \| `"High"` | `"Low"` | Minimum severity to report |
| `max_file_size_mb` | int \| null | null | Max file size in MiB; null = unlimited |
| `excluded_extensions` | [string] | `["jpg", "png", "gif", "mp4", ...]` | File extensions to skip |
| `excluded_directories` | [string] | `["node_modules", ".git", "target", ...]` | Directories to skip |
| `excluded_files` | [string] | `[]` | Specific files to skip |
| `read_global_ignore` | bool | `false` | Honor global ignore file |
| `read_vcsignore` | bool | `true` | Honor `.gitignore` / `.hgignore` |
| `require_git_to_read_vcsignore` | bool | `true` | Require `.git` dir to apply gitignore |
| `one_file_system` | bool | `false` | Don't cross filesystem boundaries |
| `follow_symlinks` | bool | `false` | Follow symbolic links |
| `scan_hidden_files` | bool | `false` | Scan dot-files |
| `include_nonprod` | bool | `false` | Keep original severity for test/vendor paths |
| `enable_state_analysis` | bool | `false` | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires `mode = "full"` or `mode = "cfg"`. |

### `[database]`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `path` | string | `""` | Custom SQLite DB path; empty = platform default |

### `[output]`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `default_format` | `"console"` \| `"json"` \| `"sarif"` | `"console"` | Default output format |
| `quiet` | bool | `false` | Suppress status messages |
| `max_results` | int \| null | null | Cap number of findings; null = unlimited |
| `attack_surface_ranking` | bool | `true` | Enable attack-surface ranking |
| `min_score` | int \| null | null | Minimum rank score to include; null = no minimum |
| `min_confidence` | string \| null | null | Minimum confidence level (`"low"`, `"medium"`, `"high"`); null = no minimum |
| `include_quality` | bool | `false` | Include Quality-category findings (hidden by default) |
| `show_all` | bool | `false` | Disable category filtering, rollups, and LOW budgets |
| `max_low` | int | `20` | Maximum total LOW findings to show (rollups count as 1) |
| `max_low_per_file` | int | `1` | Maximum LOW findings per file (rollups count as 1) |
| `max_low_per_rule` | int | `10` | Maximum LOW findings per rule (rollups count as 1) |
| `rollup_examples` | int | `5` | Number of example locations stored in rollup findings |

### `[performance]`

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `worker_threads` | int \| null | null | Worker thread count; null/0 = auto-detect |
| `batch_size` | int | `100` | Files per index batch |
| `channel_multiplier` | int | `4` | Channel capacity = threads x multiplier |
| `rayon_thread_stack_size` | int | `8388608` | Rayon thread stack size in bytes (8 MiB) |
| `prune` | bool | `false` | Stop traversing into matching directories |

### `[analysis.languages.<slug>]`

Per-language custom rules. `<slug>` is one of: `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby`.

| Field | Type | Description |
|-------|------|-------------|
| `rules` | array of rule objects | Custom label rules |
| `terminators` | [string] | Functions that terminate execution |
| `event_handlers` | [string] | Event handler function names |

**Rule object**:

```toml
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"    # "source" | "sanitizer" | "sink"
cap = "html_escape"   # "env_var" | "html_escape" | "shell_escape" |
                      # "url_encode" | "json_parse" | "file_io" | "all"
```

---

## Example Configurations

### Minimal override (`nyx.local`)

```toml
[scanner]
min_severity = "Medium"

[output]
default_format = "json"
max_results = 100
```

### CI-optimized

```toml
[scanner]
mode = "full"
min_severity = "Medium"
excluded_directories = ["node_modules", ".git", "target", "vendor", "dist"]

[output]
quiet = true
default_format = "sarif"

[performance]
worker_threads = 4
```

### Custom rules for a Node.js project

```toml
[analysis.languages.javascript]
terminators = ["process.exit", "abort"]
event_handlers = ["addEventListener"]

[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetInnerHTML"]
kind = "sink"
cap = "html_escape"

[[analysis.languages.javascript.rules]]
matchers = ["getRequestBody", "readUserInput"]
kind = "source"
cap = "all"
```

### Adding rules via CLI

```bash
# Add a sanitizer
nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape

# Add a terminator
nyx config add-terminator --lang javascript --name process.exit

# Verify
nyx config show
```

81
docs/detectors.md
Normal file
@ -0,0 +1,81 @@
# Detector Overview

Nyx uses four independent detector families. Each targets different vulnerability classes and operates at a different level of analysis depth. Findings from all active detectors are merged, deduplicated, ranked, and presented in a single result set.

## The Four Detector Families

| Family | Rule prefix | Analysis depth | What it finds |
|--------|------------|----------------|---------------|
| [**Taint Analysis**](detectors/taint.md) | `taint-*` | Cross-file dataflow | Unsanitized data flowing from sources to sinks |
| [**CFG Structural**](detectors/cfg.md) | `cfg-*` | Intra-procedural CFG | Auth gaps, unguarded sinks, resource leaks, error fallthrough |
| [**State Model**](detectors/state.md) | `state-*` | Intra-procedural lattice | Use-after-close, double-close, resource leaks, unauthenticated access |
| [**AST Patterns**](detectors/patterns.md) | `<lang>.*.*` | Structural (no flow) | Dangerous function calls, banned APIs, weak crypto |

## How They Combine

In `--mode full` (the default), all four families run; note that the State Model family also depends on `scanner.enable_state_analysis`, which defaults to `false` (see the `[scanner]` table in the configuration reference). Overlapping findings are then reconciled as follows:

1. **Taint outranks AST**: If a taint finding and an AST pattern both fire at the same location (e.g. both flag `eval(userInput)`), both are kept with distinct rule IDs. The taint finding ranks higher due to the analysis-kind bonus.

2. **State supersedes CFG**: If a state-model finding (e.g. `state-resource-leak`) fires at the same location as a CFG finding (e.g. `cfg-resource-leak`), the CFG finding is suppressed.

3. **Location-level dedup**: Exact duplicates (same line, column, rule ID, severity) are removed.
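
The three rules in this list can be modelled with a short Python sketch. It is a loose illustration rather than Nyx's implementation: the `Finding` tuple and `dedupe` function are hypothetical, the file path is included in the location as an assumption, and rule 2 is read as "any `state-*` finding suppresses any `cfg-*` finding at the same location", which the text does not spell out.

```python
from collections import namedtuple

# Hypothetical finding record; Nyx's real structure is not shown in this document.
Finding = namedtuple("Finding", "path line col rule_id severity")


def dedupe(findings):
    """Apply rules 2 and 3; rule 1 keeps both findings, so it needs no filtering."""
    state_locations = {
        (f.path, f.line, f.col) for f in findings if f.rule_id.startswith("state-")
    }
    seen = set()
    kept = []
    for f in findings:
        if f.rule_id.startswith("cfg-") and (f.path, f.line, f.col) in state_locations:
            continue  # rule 2: a state finding at this location supersedes the CFG one
        if f in seen:
            continue  # rule 3: exact duplicate (same location, rule ID, severity)
        seen.add(f)
        kept.append(f)
    return kept
```

Input order is preserved, so ranking can be applied to the surviving findings afterwards.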

## Analysis Modes

| Mode | CLI flag | Active detectors |
|------|----------|-----------------|
| Full | `--mode full` | All four |
| AST-only | `--mode ast` | AST patterns only |
| CFG/Taint | `--mode cfg` or `--mode taint` | Taint + CFG + State |

## Attack-Surface Ranking

Every finding receives a deterministic **attack-surface score** estimating exploitability. Findings are sorted by descending score.

### Scoring Formula

```
score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty
```

| Component | Values | Purpose |
|-----------|--------|---------|
| **Severity base** | High=60, Medium=30, Low=10 | Primary signal |
| **Analysis kind** | taint=+10, state=+8, cfg(with evidence)=+5, cfg(no evidence)=+3, ast=+0 | Confidence of analysis |
| **Evidence strength** | +1 per evidence item (max 4), +2 to +6 for source kind | Specificity of finding |
| **State bonus** | use-after-close/unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 | State rule severity |
| **Validation penalty** | -5 if path-validated | Guard reduces exploitability |

### Source-kind priority

| Source type | Bonus | Examples |
|-------------|-------|---------|
| User input | +6 | `req.body`, `argv`, `stdin`, `form`, `query`, `params` |
| Environment | +5 | `env::var`, `getenv`, `process.env` |
| Unknown | +4 | Conservative default |
| File system | +3 | `fs::read_to_string`, `fgets` |
| Database | +2 | Query results |

### Score ranges (approximate)

| Finding type | Score range |
|-------------|------------|
| High taint + user input | ~76-80 |
| High state (use-after-close) | ~74 |
| High CFG structural | ~63-68 |
| Medium taint + env source | ~45-50 |
| Medium state (resource leak) | ~40 |
| Low AST-only pattern | ~10 |
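
To make the arithmetic concrete, the formula and component tables can be written out as a small Python function. The weights are copied from the tables in this section; everything else is an assumption for illustration, including the function name `attack_surface_score`, the dictionary keys, and the choice to pass each component as an explicit argument.

```python
# Weights copied from the scoring tables; names and keys are hypothetical.
SEVERITY_BASE = {"high": 60, "medium": 30, "low": 10}
ANALYSIS_KIND = {"taint": 10, "state": 8, "cfg_evidence": 5, "cfg": 3, "ast": 0}
SOURCE_KIND = {"user_input": 6, "environment": 5, "unknown": 4,
               "file_system": 3, "database": 2}
STATE_BONUS = {"use_after_close": 6, "unauthenticated": 6,
               "double_close": 3, "must_leak": 2, "may_leak": 1}


def attack_surface_score(severity, kind, evidence_items=0, source_kind=None,
                         state_rule=None, path_validated=False):
    """score = severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty"""
    score = SEVERITY_BASE[severity] + ANALYSIS_KIND[kind]
    score += min(evidence_items, 4)            # +1 per evidence item, capped at 4
    if source_kind is not None:
        score += SOURCE_KIND[source_kind]      # +2 to +6 depending on source type
    if state_rule is not None:
        score += STATE_BONUS[state_rule]
    if path_validated:
        score -= 5                             # validation penalty
    return score
```

For instance, a High taint finding fed by user input scores 60 + 10 + 6 = 76 with no evidence items and 80 with four, matching the first row of the ranges table; a High use-after-close state finding scores 60 + 8 + 6 = 74.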

Ranking is enabled by default. Disable with `--no-rank` or `output.attack_surface_ranking = false`.

## Two-Pass Architecture

Nyx's taint analysis requires cross-file context, achieved via two passes:

1. **Pass 1 (summary extraction)**: Each file is parsed, a CFG is built, and a `FuncSummary` is extracted per function. Summaries capture source/sanitizer/sink capabilities (bitflags), taint propagation behavior, and callee lists. Summaries are persisted to SQLite.

2. **Pass 2 (analysis)**: All summaries are merged into a global map. Files are re-parsed and analyzed with full cross-file context. The taint engine resolves callees against local summaries (more precise) first, then falls back to global summaries.

With indexing enabled, Pass 1 skips files whose content hash hasn't changed since the last scan.
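
The control flow of the two passes, including the hash-based skip, can be sketched as follows. This is a simplified model and not Nyx's code: an in-memory dict stands in for the SQLite index, plain dicts stand in for `FuncSummary`, SHA-256 is an assumed hash function, and the names `pass1_extract`, `build_global`, and `resolve_callee` are hypothetical.

```python
import hashlib


def pass1_extract(files, cache, extract):
    """Pass 1. files: {path: source}; cache: {path: (digest, summaries)};
    extract: callable mapping source text to {function_name: summary}.
    Returns the paths that were actually re-parsed."""
    reparsed = []
    for path, source in files.items():
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = cache.get(path)
        if cached is not None and cached[0] == digest:
            continue  # content hash unchanged: keep the stored summaries
        cache[path] = (digest, extract(source))
        reparsed.append(path)
    return reparsed


def build_global(cache):
    """Start of pass 2: merge every file's summaries into one global map."""
    merged = {}
    for _digest, summaries in cache.values():
        merged.update(summaries)
    return merged


def resolve_callee(name, local_summaries, global_summaries):
    """Pass 2 lookup: same-file summaries first, then the global map."""
    if name in local_summaries:
        return local_summaries[name]
    return global_summaries.get(name)
```

Running `pass1_extract` twice over unchanged files re-parses nothing on the second call, which is the saving that indexing provides.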

161
docs/detectors/cfg.md
Normal file
@ -0,0 +1,161 @@
# CFG Structural Analysis

## Summary

Nyx builds an intra-procedural control-flow graph (CFG) for each function and analyzes structural properties: whether sinks are guarded by sanitizers or validators, whether web handlers check authentication, whether resources are released on all exit paths, and whether error-handling code terminates properly.

These detectors use **dominator analysis**: they check whether a guard node dominates a sink node on the CFG, meaning that every path from the function entry to the sink must pass through the guard.
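
As a rough illustration of the dominance check, the Python sketch below computes dominator sets with the textbook iterative algorithm and then tests whether a guard dominates a sink. It is a generic sketch, not Nyx's implementation: the CFG is a plain adjacency dict, and the names `dominators` and `is_guarded` are hypothetical.

```python
def dominators(cfg, entry):
    """cfg: {node: [successors]}. Returns {node: set of nodes that dominate it}."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def is_guarded(cfg, entry, guard, sink):
    """True if every path from entry to sink passes through guard."""
    return guard in dominators(cfg, entry)[sink]
```

With the graph `entry -> validate -> sink` the sink is guarded; adding a direct `entry -> sink` edge creates a path that bypasses `validate`, so the check fails.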

## Rule IDs

| Rule ID | Severity | Description |
|---------|----------|-------------|
| `cfg-unguarded-sink` | High/Medium | Sink reachable without a dominating guard or sanitizer |
| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth check |
| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
| `cfg-unreachable-source` | Low | Source in unreachable code |
| `cfg-error-fallthrough` | High/Medium | Error check doesn't terminate; dangerous code follows |
| `cfg-resource-leak` | Medium | Resource acquired but not released on all exit paths |
| `cfg-lock-not-released` | Medium | Lock acquired but not released on all exit paths |

## What It Detects

### Unguarded sinks (`cfg-unguarded-sink`)
A sink call (e.g. `system()`, `eval()`, `Command::new()`) is reachable from the function entry without passing through a guard or sanitizer that matches the sink's capability.

### Auth gaps (`cfg-auth-gap`)
A function identified as a web handler (by parameter naming conventions like `req`, `res`, `ctx`, `request`) reaches a privileged sink (shell execution, file I/O) without a prior call to an authentication function (`is_authenticated`, `require_auth`, `check_permission`, etc.).

### Unreachable security code (`cfg-unreachable-*`)
Sinks, sanitizers, or sources in dead code branches. This often indicates a refactoring error where security-critical code was accidentally made unreachable.

### Error fallthrough (`cfg-error-fallthrough`)
An error check (null check, error return check) does not terminate the function or loop back. Execution continues to a dangerous operation on the error path.

### Resource leaks (`cfg-resource-leak`, `cfg-lock-not-released`)
A resource acquisition call (e.g. `File::open`, `fopen`, `socket`, `Lock`) is not matched by a release call (e.g. `close`, `fclose`, `unlock`) on all exit paths from the function.
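
The "released on all exit paths" condition amounts to a reachability question: starting from the acquisition, can execution reach a function exit without passing a release? The Python sketch below shows one simple way to answer it with a depth-first search. It is a generic sketch under assumed names (`leaks_on_some_path`, an adjacency-dict CFG) and does not describe how Nyx itself performs the check.

```python
def leaks_on_some_path(cfg, acquire, releases, exits):
    """True if some path from `acquire` reaches an exit node without
    passing through any node in `releases`.

    cfg: {node: [successors]}; releases and exits are sets of node names.
    """
    stack = list(cfg.get(acquire, []))
    visited = set()
    while stack:
        node = stack.pop()
        if node in visited or node in releases:
            continue  # a release node ends the search along this path
        if node in exits:
            return True  # reached an exit with the resource still held
        visited.add(node)
        stack.extend(cfg.get(node, []))
    return False
```

An early `return` between the acquire and the release shows up as an exit node reachable without crossing a release, so the function reports a leak for that shape of graph.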

## What It Cannot Detect

- **Inter-procedural guards**: If authentication is checked in a middleware function that calls this handler, the CFG detector cannot see it. It only analyzes one function at a time.
- **Dynamic dispatch**: Virtual method calls, function pointers, and closures are opaque to the CFG.
- **Complex guard patterns**: Only recognized guard function names are checked. Custom validation logic (e.g. `if password == expected`) is not recognized as a guard.
- **Correct sanitization**: The detector checks that *some* guard dominates the sink, not that the guard is *correct*. A guard that always passes would suppress the finding.
- **Cross-function resource flows**: If a file handle is opened in one function and closed in another, the detector will report a leak in the first function.

## Common False Positives

| Scenario | Why it fires | Mitigation |
|----------|-------------|------------|
| Framework-level auth middleware | Handler doesn't call auth directly | Document as expected; suppress with severity filter |
| Resource closed via RAII/defer | Implicit cleanup not visible to CFG | Currently not detected; known limitation |
| Custom guard function name | Function not in the recognized guard list | Add the function name as a sanitizer in config |
| Test handlers | Intentionally skip auth in tests | Default non-prod downgrade reduces severity; or exclude test dirs |

## Common False Negatives

| Scenario | Why it's missed |
|----------|----------------|
| Auth in called function | Cross-function guards not tracked |
| Guard via type system | Type-level guarantees (e.g. Rust's `AuthenticatedUser` wrapper) not analyzed |
| Resource closed in finally/defer | Some cleanup patterns not recognized |

## Confidence Signals

| Signal | Meaning |
|--------|---------|
| **Evidence lists guard nodes** | Shows which guards were checked and found missing |
| **Sink has high capability** | Shell execution or file I/O sinks are higher risk |
| **Handler detection matched** | Web handler identification is based on conventional parameter names |

## Tuning and Noise Controls

### Add custom guards/sanitizers

```toml
[[analysis.languages.python.rules]]
matchers = ["validate_request", "check_csrf"]
kind = "sanitizer"
cap = "all"
```

### Add auth rules

Auth checks are recognized by function name. If your codebase uses non-standard names:

```toml
[[analysis.languages.javascript.rules]]
matchers = ["ensureLoggedIn", "requirePermission"]
kind = "sanitizer"
cap = "all"
```

### Filter results

```bash
# Skip low-severity unreachable findings
nyx scan . --severity ">=MEDIUM"
```

### Disable CFG analysis

```bash
nyx scan . --mode ast # AST patterns only
```

## Examples

### Unguarded sink

```go
func handler(w http.ResponseWriter, r *http.Request) {
	cmd := r.URL.Query().Get("cmd")
	exec.Command("sh", "-c", cmd).Run() // cfg-unguarded-sink: no guard dominates
}
```

### Auth gap

```javascript
app.get('/admin/delete', (req, res) => {
  // No is_authenticated() call
  db.execute("DELETE FROM users WHERE id = " + req.params.id);
  // cfg-auth-gap: web handler reaches privileged sink without auth
});
```

### Resource leak

```c
#include <stdio.h>

void process(int error) {
    FILE *f = fopen("data.txt", "r"); // acquire
    if (error) {
        return; // cfg-resource-leak: f not closed on this path
    }
    fclose(f);
}
```

## Guard Rules

Nyx recognizes these function name patterns as guards:

| Pattern | Applies to |
|---------|-----------|
| `validate*`, `sanitize*` | All sinks |
| `check_*`, `verify_*`, `assert_*` | All sinks |
| `shell_escape` | Shell execution sinks |
| `html_escape` | HTML/XSS sinks |
| `url_encode` | URL sinks |
| `which` | Shell execution (binary lookup) |
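
One way to read this table is as a set of glob patterns, each tied to the sink kinds it covers. The Python sketch below encodes that reading using the standard `fnmatch` module. Treat it as an interpretation: the document does not say that Nyx uses glob matching or that matching is case-sensitive, and the sink-kind labels (`"shell"`, `"html"`, `"url"`) and the name `guard_covers` are made up for the example.

```python
from fnmatch import fnmatchcase

# Patterns from the guard table; the sink-kind labels are hypothetical.
GUARD_PATTERNS = {
    "validate*": "all", "sanitize*": "all",
    "check_*": "all", "verify_*": "all", "assert_*": "all",
    "shell_escape": "shell", "html_escape": "html",
    "url_encode": "url", "which": "shell",
}


def guard_covers(function_name, sink_kind):
    """True if function_name matches a guard pattern that applies to sink_kind."""
    for pattern, applies_to in GUARD_PATTERNS.items():
        if fnmatchcase(function_name, pattern) and applies_to in ("all", sink_kind):
            return True
    return False
```

Under this reading a helper named `escapeHtml` matches none of the built-in patterns, which is why a custom guard has to be registered as a sanitizer in the configuration.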

### Auth rules

| Pattern | Category |
|---------|----------|
| `is_authenticated`, `require_auth`, `check_permission` | Common |
| `authorize`, `authenticate`, `require_login` | Common |
| `check_auth`, `verify_token`, `validate_token` | Common |
| `middleware.auth`, `auth.required` | Go |
| `isAuthenticated`, `checkPermission`, `hasAuthority`, `hasRole` | Java |

149
docs/detectors/patterns.md
Normal file
@ -0,0 +1,149 @@
# AST Pattern Matching

## Summary

AST patterns are tree-sitter queries that match specific structural code constructs. They are the simplest and fastest detector family: no dataflow, no CFG, just structural presence. A match means the dangerous construct exists in the code; it does not prove the code is exploitable.

AST patterns run in all analysis modes, including `--mode ast` (where they are the only active detector).

## Rule IDs

Pattern rule IDs follow the format `<lang>.<category>.<specific>`:

```
rs.memory.transmute
js.code_exec.eval
py.deser.pickle_loads
c.memory.gets
java.sqli.execute_concat
```

See the [Rule Reference](../rules/index.md) for a complete listing per language.

## Pattern Tiers

| Tier | Meaning | Examples |
|------|---------|---------|
| **A** | Structural presence alone is high-signal | `gets()`, `eval()`, `pickle.loads()`, `mem::transmute` |
| **B** | Query includes a heuristic guard | SQL `execute` with concatenated arg, `printf(var)` with non-literal format |

Tier B patterns use additional tree-sitter predicates to reduce false positives. For example, `java.sqli.execute_concat` only fires when `executeQuery()` receives a `binary_expression` (string concatenation) as its argument, not when it receives a literal or parameter placeholder.
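
The idea behind a Tier B heuristic can be shown with an analogy built on Python's standard `ast` module rather than tree-sitter. The sketch below flags calls to a method named `execute` only when the first argument is a `+` expression, which mirrors the "concatenated argument" condition described for `java.sqli.execute_concat`. It is an analogy only: Nyx's actual rules are tree-sitter queries that are not reproduced in this document, and the function name `find_concat_execute` is hypothetical.

```python
import ast


def find_concat_execute(source):
    """Return line numbers of `<obj>.execute(<a + b>, ...)` calls in Python source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"          # Tier A part: the call exists
            and node.args
            and isinstance(node.args[0], ast.BinOp)  # Tier B part: argument is built
            and isinstance(node.args[0].op, ast.Add) # with `+` (concatenation)
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

A call with a literal query and a parameter tuple is skipped, while any `+` expression is flagged even when both operands are trusted, which is the kind of residual false positive a structural heuristic cannot rule out.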
|
||||
|
## What It Detects

### By category

| Category | What it matches | Example languages |
|----------|----------------|-------------------|
| **CommandExec** | Shell command execution functions | C (`system`), Python (`os.system`), Ruby (backticks) |
| **CodeExec** | Dynamic code evaluation | JS (`eval`, `new Function()`), Python (`exec`), PHP (`eval`) |
| **Deserialization** | Unsafe object deserialization | Java (`readObject`), Python (`pickle.loads`), Ruby (`Marshal.load`) |
| **SqlInjection** | SQL with string concatenation | Java, Go, Python, PHP (Tier B heuristic) |
| **PathTraversal** | File inclusion with variable path | PHP (`include $var`) |
| **Xss** | XSS sink functions | JS (`document.write`, `outerHTML`), Java (`getWriter().print`) |
| **Crypto** | Weak cryptographic algorithms | All languages (`md5`, `sha1`, `Math.random()`) |
| **Secrets** | Hardcoded credentials | Go (variable name matching) |
| **InsecureTransport** | Unencrypted communication | Go (`InsecureSkipVerify`), JS (`fetch("http://")`) |
| **Reflection** | Dynamic class/method dispatch | Java (`Class.forName`, `Method.invoke`), Ruby (`send`, `constantize`) |
| **MemorySafety** | Memory safety violations | Rust (`transmute`, `unsafe`), C (`gets`, `strcpy`, `sprintf`) |
| **Prototype** | Prototype pollution | JS/TS (`__proto__` assignment) |
| **CodeQuality** | Panic/abort/type-safety issues | Rust (`unwrap`, `panic!`), TS (`as any`) |

## What It Cannot Detect

- **Dataflow**: Patterns don't track whether the dangerous function receives tainted input. `eval("hello")` (safe) and `eval(userInput)` (dangerous) both match `js.code_exec.eval`.
- **Context**: Patterns don't understand whether the code is reachable, guarded, or inside a test.
- **Semantics**: `strcpy(dst, src)` always matches — it cannot determine buffer sizes.
- **Indirect calls**: Function pointers, dynamic dispatch, and aliased references are invisible.

## Common False Positives

| Scenario | Why it fires | Mitigation |
|----------|-------------|------------|
| `eval()` with a hardcoded string literal | Pattern matches structural presence | Taint analysis won't flag this — use `--mode cfg` for fewer false positives |
| `unsafe` block in Rust with sound justification | All unsafe blocks match | Filter with `--severity HIGH` (unsafe_block is Medium, so it is excluded) |
| `.unwrap()` in test code | Acceptable in tests | Default non-prod downgrade reduces severity |
| `md5()` used for checksums (not security) | Pattern doesn't know usage intent | Filter Low severity or add to exclusions |
| SQL concatenation with trusted data | Tier B heuristic can't verify data source | Taint analysis is more precise here |

## Common False Negatives

| Scenario | Why it's missed |
|----------|----------------|
| `eval` called via alias (`let e = eval; e(input)`) | Pattern matches the identifier `eval`, not the resolved function |
| Dangerous function in a macro expansion | Tree-sitter parses the macro call, not the expansion |
| SQL injection via ORM query builder | No pattern for ORM-specific query building |
| Imported function under different name | `from os import system as s; s(cmd)` — pattern looks for `system` |

## Confidence Signals

| Signal | Meaning |
|--------|---------|
| **Tier A** | High confidence — the function itself is dangerous |
| **Tier B** | Moderate confidence — heuristic guard reduces false positives |
| **High severity** | Critical vulnerability class (command exec, deserialization) |
| **Low severity** | Informational (weak crypto, code quality) |
| **Non-prod path** | Finding in test/vendor code — downgraded by default |

## Tuning and Noise Controls

### Severity filtering

```bash
# Skip code-quality and weak-crypto findings
nyx scan . --severity ">=MEDIUM"

# Only critical findings
nyx scan . --severity HIGH
```

### Use taint for precision

```bash
# Taint-only mode: only report findings with confirmed dataflow
nyx scan . --mode cfg
```

### Exclude directories

```toml
[scanner]
excluded_directories = ["node_modules", "vendor", "generated"]
```

## Examples

### Tier A — structural presence

**C: Banned function**
```c
char buf[64];
gets(buf); // c.memory.gets — always dangerous, no safe usage
```

**Python: Unsafe deserialization**
```python
import pickle
data = pickle.loads(user_input) # py.deser.pickle_loads
```

### Tier B — heuristic-guarded

**Java: SQL concatenation**
```java
// Fires: concatenated argument
stmt.executeQuery("SELECT * FROM users WHERE id=" + userId);
// java.sqli.execute_concat

// Does NOT fire: parameterized query
stmt.executeQuery(preparedSql);
```

**C: Format string**
```c
// Fires: variable as first argument
printf(user_input); // c.memory.printf_no_fmt

// Does NOT fire: literal format string
printf("%s", user_input);
```
204
docs/detectors/state.md
Normal file
@ -0,0 +1,204 @@
# State Model Analysis

## Summary

Nyx's state model analysis tracks **resource lifecycle** and **authentication state** through a function using monotone dataflow over bounded lattices. It detects use-after-close bugs, double-close bugs, resource leaks, and unauthenticated access to privileged operations.

State analysis is **opt-in** — enable it with `scanner.enable_state_analysis = true` in config. It requires `mode = "full"` or `mode = "cfg"`.

## Rule IDs

| Rule ID | Severity | Description |
|---------|----------|-------------|
| `state-use-after-close` | High | Variable used after being closed/released |
| `state-double-close` | Medium | Resource closed twice |
| `state-resource-leak` | Medium | Resource opened but never closed (definite) |
| `state-resource-leak-possible` | Low | Resource may not be closed on all paths |
| `state-unauthed-access` | High | Privileged operation reached without authentication |

## What It Detects

### Use-after-close (`state-use-after-close`)

A resource transitions to the CLOSED state (via `close()`, `fclose()`, `disconnect()`, etc.), then a use operation (`read`, `write`, `send`, `recv`, `query`, etc.) is performed on it.

```c
FILE *f = fopen("data.txt", "r");
fclose(f);
fread(buf, 1, 100, f); // state-use-after-close
```

### Double-close (`state-double-close`)

A resource is closed twice. This can cause crashes or undefined behavior.

```python
f = open("data.txt")
f.close()
f.close() # state-double-close
```

### Resource leak (`state-resource-leak`)

A resource is opened but never closed on any path through the function. This is a definite leak.

```java
FileInputStream fis = new FileInputStream("data.txt");
process(fis);
// function exits without fis.close() — state-resource-leak
```

### Possible resource leak (`state-resource-leak-possible`)

A resource is closed on some paths but not others.

```go
f, err := os.Open("data.txt")
if err != nil {
	return err // open failed, nothing to close
}
if _, err := f.Stat(); err != nil {
	return err // f not closed here
}
f.Close() // closed here
// state-resource-leak-possible on the early-return path
```

### Unauthenticated access (`state-unauthed-access`)

A function identified as a web handler reaches a privileged sink (shell execution, file I/O) without any authentication check on the path.

A function is identified as a web handler if:
1. Its name starts with `handle_`, `route_`, or `api_` (strong match — sufficient on its own), OR
2. Its name starts with `serve_` or `process_` AND any function in the file has web-like parameter names (`request`, `req`, `ctx`, `res`, `response`, `w`, `writer`, etc., varying by language).

The function name `main` is explicitly excluded.

```javascript
app.post('/admin/exec', (req, res) => {
  // No auth check
  exec(req.body.command); // state-unauthed-access
});
```

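The handler-identification rules above can be sketched in a few lines of Python. This is an illustrative reading of the two rules, not Nyx's implementation: the function names, the `functions` mapping, and the shortened parameter-name set are assumptions, and the real list of web-like parameter names varies by language.

```python
STRONG_PREFIXES = ("handle_", "route_", "api_")
WEAK_PREFIXES = ("serve_", "process_")
# Subset of the web-like parameter names listed above (assumed, not exhaustive).
WEB_PARAM_NAMES = {"request", "req", "ctx", "res", "response", "w", "writer"}


def file_has_web_params(functions: dict) -> bool:
    """True if any function in the file has a web-like parameter name."""
    return any(WEB_PARAM_NAMES & set(params) for params in functions.values())


def is_web_handler(name: str, functions: dict) -> bool:
    """`functions` maps each function name in the file to its parameter names."""
    if name == "main":
        return False  # explicitly excluded
    if name.startswith(STRONG_PREFIXES):
        return True  # strong match, sufficient on its own
    if name.startswith(WEAK_PREFIXES):
        # weak match: needs file-level evidence of web-like parameters
        return file_has_web_params(functions)
    return False
```

For example, `is_web_handler("process_order", {"process_order": ["order"], "render": ["req", "res"]})` is true because another function in the same file takes `req` and `res`, while the same name in a file without web-like parameters is not treated as a handler.
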
## What It Cannot Detect

- **Cross-function resource management**: Resources opened in one function and closed in another are not tracked. This is the most common source of false positives for leak detection.
- **RAII / defer / try-with-resources**: Implicit cleanup via language-level constructs (Rust's `Drop`, Go's `defer`, Java's try-with-resources, Python's `with`) is not recognized. These patterns will produce false-positive leak findings.
- **Dynamic dispatch**: If `close()` is called through a trait object or interface, it may not be recognized.
- **Authentication via type system**: Rust's type-state pattern (e.g. `AuthenticatedRequest<T>`) is not recognized as an auth check.
- **Complex authorization logic**: Only recognized function name patterns are checked.

## Common False Positives

| Scenario | Why it fires | Mitigation |
|----------|-------------|------------|
| RAII / Drop / defer cleanup | Implicit cleanup not visible | Known limitation; filter by severity |
| Resource returned to caller | Ownership transferred, not leaked | Known limitation |
| Framework-managed resources | Web framework manages connection lifecycle | Exclude framework-generated handlers |
| Try-with-resources (Java) | Language construct not parsed | Known limitation |
| Context manager (Python `with`) | Block construct not tracked | Known limitation |

## Common False Negatives

| Scenario | Why it's missed |
|----------|----------------|
| Resource closed in helper function | Cross-function tracking not implemented |
| Auth in middleware | Auth check happens before handler is called |
| Double-close via aliased reference | Alias analysis not performed |

## Confidence Signals

| Signal | Meaning |
|--------|---------|
| **Definite leak (state-resource-leak)** | Resource is never closed on any path — high confidence |
| **Use-after-close** | Read/write operation after explicit close — high confidence |
| **Web handler detected** | Entry point matched by function-name prefix, backed by web-like parameter names for weaker prefixes |
| **Possible leak (state-resource-leak-possible)** | Resource closed on some but not all paths — lower confidence |

## Tuning and Noise Controls

### Enable state analysis

```toml
[scanner]
enable_state_analysis = true
```

### Severity filtering

```bash
# Skip possible-leak findings (Low severity)
nyx scan . --severity ">=MEDIUM"
```

### Exclude test files

```toml
[scanner]
excluded_directories = ["tests", "test", "spec"]
```

## Resource Pairs

The state engine recognizes these acquire/release pairs per language:

### C/C++
| Acquire | Release | Resource |
|---------|---------|----------|
| `fopen` | `fclose` | File handle |
| `open` | `close` | File descriptor |
| `socket` | `close` | Socket |
| `malloc`, `calloc`, `realloc` | `free` | Heap memory |
| `pthread_mutex_lock` | `pthread_mutex_unlock` | Mutex |

### Rust
| Acquire | Release | Resource |
|---------|---------|----------|
| `File::open`, `File::create` | `drop`, `close` | File handle |
| `TcpStream::connect` | `shutdown` | TCP connection |
| `lock`, `read`, `write` (on Mutex/RwLock) | `drop` | Lock guard |

### Java
| Acquire | Release | Resource |
|---------|---------|----------|
| `new FileInputStream` | `close` | File stream |
| `getConnection` | `close` | DB connection |
| `new Socket` | `close` | Socket |

### Go, Python, JavaScript, Ruby, PHP
Similar patterns with language-specific function names.

## Use Patterns (Trigger use-after-close)

The following operations on a closed resource trigger `state-use-after-close`:

```
read, write, send, recv, fread, fwrite, fgets, fputs, fprintf, fscanf,
fflush, fseek, ftell, rewind, feof, ferror, fgetc, fputc, getc, putc,
ungetc, query, execute, fetch, sendto, recvfrom, ioctl, fcntl,
strcpy, strncpy, strcat, strncat, memcpy, memmove, memset, memcmp,
strcmp, strncmp, strlen, sprintf, snprintf
```

## Technical Details

### Resource Lifecycle Lattice

```
UNINIT → OPEN → CLOSED
              → MOVED
```

States are tracked as bitflags, allowing the lattice to represent uncertainty (e.g. OPEN|CLOSED means the resource is open on some paths and closed on others).

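A bitflag lattice of this kind can be sketched in Python as follows. This is a minimal illustration, not Nyx's code: the bit values, the function names, and the choice to report on "may be closed" states are all assumptions, and the mapping from exit state to rule ID follows the Rule IDs table rather than any documented algorithm.

```python
from enum import IntFlag


class State(IntFlag):
    UNINIT = 1
    OPEN = 2
    CLOSED = 4
    MOVED = 8


def join(a: State, b: State) -> State:
    # Merging two paths keeps every state seen on either path.
    return a | b


def on_close(state: State):
    """Return (new_state, finding) for a release call."""
    finding = "state-double-close" if State.CLOSED in state else None
    return State.CLOSED, finding


def on_use(state: State):
    """Return a finding if the resource may already be closed."""
    return "state-use-after-close" if State.CLOSED in state else None


def at_exit(state: State):
    """Classify leaks from the merged state at function exit."""
    if state == State.OPEN:
        return "state-resource-leak"           # open on every path
    if State.OPEN in state:
        return "state-resource-leak-possible"  # open on some paths only
    return None
```

Because `join` is a bitwise OR over a fixed set of four bits, states can only grow, which is what makes a fixed-point iteration over such a lattice terminate.
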
### Leak Detection Scope

Resource leaks are checked at the file-level exit node and the **synthesized** function exit node (a single Return node that all early returns feed into). Early-return nodes are **not** checked individually — only the merged state at the function's synthesized exit is inspected. This prevents duplicate findings where an early-return path reports a definite leak while the merged exit correctly reports a possible leak.

This per-function exit inspection ensures that a variable leaked inside one function is not masked by a same-named variable that is properly closed in a subsequent function.

### Auth Level Lattice

```
Unauthed < Authed < Admin
```

Join semantics: take the minimum (conservative). If any path is unauthenticated, the result is unauthenticated.

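The minimum-based join can be written out as a short Python sketch. The enum and function names are assumptions chosen for illustration; only the ordering and the "take the minimum" rule come from the text above.

```python
from enum import IntEnum
from functools import reduce


class AuthLevel(IntEnum):
    UNAUTHED = 0
    AUTHED = 1
    ADMIN = 2


def join(a: AuthLevel, b: AuthLevel) -> AuthLevel:
    # Conservative merge: the weakest level on any incoming path wins.
    return min(a, b)


def level_at_merge(paths: list) -> AuthLevel:
    """Fold `join` over the auth level reached on each incoming path."""
    return reduce(join, paths)
```

Merging an `ADMIN` path with an `UNAUTHED` path therefore yields `UNAUTHED`, so a sink after the merge point is treated as reachable without authentication.
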
202
docs/detectors/taint.md
Normal file
@ -0,0 +1,202 @@
# Taint Analysis

## Summary

Nyx's taint analysis tracks the flow of untrusted data from **sources** (where data enters the program) through **assignments and function calls** to **sinks** (where dangerous operations happen). If the data reaches a sink without passing through a **sanitizer** with matching capabilities, a finding is emitted.

The engine uses a monotone forward dataflow analysis over a finite lattice with guaranteed termination. Analysis is **intra-procedural with cross-file function summaries** — it does not follow calls into other functions but uses pre-computed summaries of their behavior.

## Rule ID

```
taint-unsanitised-flow (source <line>:<col>)
```

One rule ID covers all taint findings. The parenthetical identifies the specific source location.

## What It Detects

- Environment variables flowing to shell execution (`env::var` → `Command::new`)
- User input flowing to code evaluation (`req.body` → `eval()`)
- File contents flowing to SQL queries (`fs::read_to_string` → `db.execute()`)
- Request parameters flowing to HTML output (`req.query` → `innerHTML`)
- Any source-to-sink flow where the sink's required capability is not stripped by a sanitizer

## What It Cannot Detect

- **Inter-procedural flows without summaries**: If a function isn't summarized (e.g. from a third-party library without source), the taint engine cannot track data through it. It conservatively treats unknown callees as neither propagating nor sanitizing.
- **Flows through data structures**: Taint is tracked per-variable, not per-field. `obj.field = tainted; sink(obj.other_field)` may produce a false positive because taint attaches to `obj` as a whole.
- **Aliasing**: `let y = &x; sink(*y)` — the engine tracks `y` as a fresh variable, not an alias of `x`. This can cause false negatives.
- **Complex control flow**: The analysis is flow-sensitive (respects control flow within a function) but does not track taint through arbitrary loops with complex exit conditions.
- **Implicit flows**: Taint only follows explicit data flow, not information flow through branching (e.g. `if (secret) { x = 1 } else { x = 0 }` does not taint `x`).

## Common False Positives

| Scenario | Why it happens | Mitigation |
|----------|---------------|------------|
| Custom sanitizer not recognized | Nyx only knows built-in and configured sanitizers | Add a custom sanitizer rule in config |
| Taint through struct fields | Variable-level (not field-level) tracking | No current mitigation; field sensitivity is planned |
| Dead code paths | The engine is path-insensitive within a function (it considers all paths) | Contradiction pruning catches some cases; path-validated findings score lower |
| Library wrappers | A wrapper around a dangerous function may re-introduce taint that was sanitized by the wrapper | Summarize the wrapper function or add it as a sanitizer |

## Common False Negatives

| Scenario | Why it's missed |
|----------|----------------|
| Third-party library calls | No summary available; callee treated as opaque |
| Taint through global/static variables | Not tracked across function boundaries |
| Taint through closures/callbacks in some languages | Closure capture analysis is limited (JS/TS/Ruby/Go anonymous functions ARE analyzed) |
| Flows spanning more than two files | Summary approximation loses precision at depth |

## Confidence Signals

These signals in the output indicate higher-confidence findings:

| Signal | What it means |
|--------|--------------|
| **Evidence: Source + Sink** | Both endpoints identified with specific function names and locations |
| **Source kind = user input** | Source is directly controllable by an attacker (req.body, argv, etc.) |
| **path_validated = false** | No validation guard on the path — higher exploitability |
| **No guard_kind** | No dominating predicate check (null check, error check, etc.) |
| **High rank_score** | Multiple confidence signals combined |

Lower-confidence:

| Signal | What it means |
|--------|--------------|
| **path_validated = true** | A validation predicate guards the path — may not be exploitable |
| **guard_kind = "ValidationCall"** | An explicit validation function was called before the sink |
| **Source kind = database** | Data from DB — may already be validated at insertion time |

## Tuning and Noise Controls

### Add custom sanitizers

If your codebase has a custom sanitizer that Nyx doesn't recognize:

```toml
# nyx.local
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml", "sanitizeInput"]
kind = "sanitizer"
cap = "html_escape"
```

Or via CLI:
```bash
nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape
```

### Filter by severity

```bash
nyx scan . --severity HIGH # Only high-severity taint findings
nyx scan . --severity ">=MEDIUM" # Skip low-severity
```

### Skip non-production code

By default, findings in `tests/`, `vendor/`, `build/` paths are downgraded one severity tier. To exclude them entirely, add to config:

```toml
[scanner]
excluded_directories = ["tests", "vendor", "build", "examples"]
```

### Disable taint (AST-only mode)

```bash
nyx scan . --mode ast
```

## Example

**Vulnerable code** (Rust):
```rust
use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap(); // line 5: source
    Command::new("sh").arg("-c").arg(&cmd).output(); // line 6: sink
}
```

**Finding**:
```
[HIGH] taint-unsanitised-flow (source 5:15) src/main.rs:6:5
Source: env::var("USER_CMD") at 5:15
Sink: Command::new("sh").arg("-c")
Score: 76
```

**Safe alternative**:
```rust
use std::env;
use std::process::Command;

fn main() {
    let cmd = env::var("USER_CMD").unwrap();
    // Use the value as a direct argument, not a shell command
    Command::new(&cmd).output();
    // Or validate against an allowlist
}
```

## Technical Details

### Capability System

Taint uses a bitflag capability system to match sources with appropriate sanitizers and sinks:

| Capability | Bit | Sources | Sanitizers | Sinks |
|-----------|-----|---------|------------|-------|
| `ENV_VAR` | 0x01 | `env::var`, `getenv` | — | — |
| `HTML_ESCAPE` | 0x02 | — | `html_escape`, `DOMPurify.sanitize` | `innerHTML`, `document.write` |
| `SHELL_ESCAPE` | 0x04 | — | `shell_escape` | `Command::new`, `system()`, `eval()` |
| `URL_ENCODE` | 0x08 | — | `encodeURIComponent` | `location.href` |
| `JSON_PARSE` | 0x10 | — | `JSON.parse` | — |
| `FILE_IO` | 0x20 | — | `filepath.Clean`, `basename`, `os.path.realpath` | `fopen`, `open`, `send_file`, `fs::read_to_string` |
| `FMT_STRING` | 0x40 | — | — | `printf(var)` |

Sources typically use `Cap::all()` to match any sink. A sanitizer strips specific capability bits. A finding fires when a tainted variable reaches a sink and the taint still has the matching capability bit set.

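The strip-and-check logic described above can be sketched in Python. The bit values are taken from the table; everything else (the `Cap` enum as a Python `IntFlag`, `ALL_CAPS` as a stand-in for `Cap::all()`, and the two helper functions) is an assumption for illustration and not Nyx's actual Rust API.

```python
from enum import IntFlag


class Cap(IntFlag):
    ENV_VAR = 0x01
    HTML_ESCAPE = 0x02
    SHELL_ESCAPE = 0x04
    URL_ENCODE = 0x08
    JSON_PARSE = 0x10
    FILE_IO = 0x20
    FMT_STRING = 0x40


ALL_CAPS = Cap(0x7F)  # stands in for `Cap::all()`: every bit set


def sanitize(taint: Cap, stripped: Cap) -> Cap:
    # A sanitizer clears only the capability bits it covers.
    return Cap(int(taint) & ~int(stripped))


def sink_fires(taint: Cap, required: Cap) -> bool:
    # A finding fires if the taint still carries the sink's capability bit.
    return bool(taint & required)


tainted = ALL_CAPS                            # value fresh from a source
escaped = sanitize(tainted, Cap.HTML_ESCAPE)  # e.g. after an HTML escaper
```

In this sketch `escaped` no longer triggers an `HTML_ESCAPE` sink such as `innerHTML`, but it still triggers a `SHELL_ESCAPE` sink, which is why a sanitizer only counts when its capability matches the sink.
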
### Nested Function Analysis

The CFG builder recursively discovers function expressions nested inside call arguments:

- **JavaScript/TypeScript**: `function_expression`, `arrow_function` inside call arguments (e.g., Express route handlers)
- **Ruby**: `do_block` and `block` nodes (e.g., Sinatra `get '/path' do...end`)
- **Go**: `func_literal` (anonymous function literals)

Each nested function is walked as a separate scope and receives a unique identifier (`<anon@{byte_offset}>`) to prevent collisions when multiple anonymous functions exist in the same file.

### Chained Call Classification

Method chains like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments between `.` separators. The classifier matches against both the original text and the normalized form, enabling rules like `r.URL` to match within `r.URL.Query.Get`.

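A rough Python sketch of this normalization is shown below. It is an illustration under stated assumptions, not the classifier itself: both function names are hypothetical, the sketch drops every balanced `(...)` group (including the trailing argument list, as the `r.URL.Query.Get` example suggests), it matches rules on whole dotted segments, and it does not handle parentheses inside string literals.

```python
def normalize_chain(text: str) -> str:
    """Drop every balanced `(...)` group, keeping only the dotted path."""
    out = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def rule_matches(rule: str, call_text: str) -> bool:
    """Match `rule` as a run of whole segments in the original or normalized text."""
    rule_segs = rule.split(".")
    for candidate in (call_text, normalize_chain(call_text)):
        segs = candidate.split(".")
        for i in range(len(segs) - len(rule_segs) + 1):
            if segs[i:i + len(rule_segs)] == rule_segs:
                return True
    return False
```

Here `normalize_chain('r.URL.Query().Get("host")')` yields `r.URL.Query.Get`, so a rule such as `Query.Get` matches only through the normalized form, while `r.URL` already matches the original text.
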
### Nested Call Fallback

When the outermost call in an expression doesn't classify as a source/sink, the engine tries all nested inner calls. This handles patterns like `str(eval(expr))` where `str` is not a sink but the inner `eval` is.

### Rust `if let` / `while let` Pattern Bindings

The CFG builder recognizes Rust `let_condition` nodes inside `if` and `while` expressions. The value expression is classified for source/sink labels, and the pattern binding is extracted as a variable definition:

```rust
if let Ok(cmd) = env::var("CMD") {
    // cmd is tainted — env::var is a source, cmd is the binding
    Command::new("sh").arg("-c").arg(&cmd).output(); // taint-unsanitised-flow
}
```

This also works for `while let` patterns.

### JS/TS Two-Level Solve

For JavaScript and TypeScript, taint analysis uses a two-level approach:

1. **Level 1**: Solve top-level code (module scope)
2. **Level 2**: Solve each function seeded with the converged top-level state

This prevents false positives from cross-function taint leakage while preserving global-to-function flows.
32
docs/index.md
Normal file
@ -0,0 +1,32 @@
# Nyx Documentation

Welcome to the Nyx documentation. Nyx is a multi-language static vulnerability scanner built in Rust.

## User Guide

- [Installation](installation.md) — Install via cargo, prebuilt binaries, or from source
- [Quick Start](quickstart.md) — Your first scan in 60 seconds
- [CLI Reference](cli.md) — Every flag, subcommand, and option
- [Configuration](configuration.md) — Config file schema, precedence, custom rules
- [Output Formats](output.md) — Console, JSON, SARIF; exit codes; evidence fields

## Detector Reference

- [Detector Overview](detectors.md) — How the four detector families work together
- [Taint Analysis](detectors/taint.md) — Cross-file source-to-sink dataflow tracking
- [CFG Structural Analysis](detectors/cfg.md) — Auth gaps, unguarded sinks, resource leaks
- [State Model Analysis](detectors/state.md) — Resource lifecycle and authentication state
- [AST Patterns](detectors/patterns.md) — Tree-sitter structural pattern matching

## Rule Reference

- [Rule Index](rules/index.md) — How rules are organized
- [Rust](rules/rust.md) | [C](rules/c.md) | [C++](rules/cpp.md) | [Java](rules/java.md) | [Go](rules/go.md)
- [JavaScript](rules/javascript.md) | [TypeScript](rules/typescript.md) | [Python](rules/python.md)
- [PHP](rules/php.md) | [Ruby](rules/ruby.md)

## Contributing

- [Contributing Guide](../CONTRIBUTING.md) — Development setup, adding rules, PR guidelines
- [Security Policy](../SECURITY.md) — Responsible disclosure
- [Code of Conduct](../CODE_OF_CONDUCT.md)
76
docs/installation.md
Normal file
@ -0,0 +1,76 @@
# Installation

## Install from crates.io

```bash
cargo install nyx-scanner
```

This installs the `nyx` binary into `~/.cargo/bin/`.

## Install from GitHub releases

1. Go to the [Releases](https://github.com/elicpeter/nyx/releases) page.
2. Download the binary for your platform:

   | Platform | Archive |
   |----------|---------|
   | Linux x86_64 | `nyx-x86_64-unknown-linux-gnu.zip` |
   | macOS Intel | `nyx-x86_64-apple-darwin.zip` |
   | macOS Apple Silicon | `nyx-aarch64-apple-darwin.zip` |
   | Windows x86_64 | `nyx-x86_64-pc-windows-msvc.zip` |

3. Extract and install:

   ```bash
   # Linux / macOS
   unzip nyx-*.zip
   chmod +x nyx
   sudo mv nyx /usr/local/bin/

   # Windows (PowerShell)
   Expand-Archive -Path nyx-*.zip -DestinationPath .
   Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\"
   ```

4. Verify:
   ```bash
   nyx --version
   ```

## Build from source

```bash
git clone https://github.com/elicpeter/nyx.git
cd nyx
cargo build --release
cargo install --path .
```

Requires **Rust 1.85+** (edition 2024).

## CI Integration

### GitHub Actions

```yaml
- name: Install Nyx
  run: cargo install nyx-scanner

- name: Run security scan
  run: nyx scan . --format sarif --fail-on medium > results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```

### Generic CI

```bash
# Fail the build if any High or Medium finding is detected
nyx scan . --severity ">=MEDIUM" --fail-on medium --quiet --format json
```

The `--fail-on` flag causes Nyx to exit with code **1** if any finding meets or exceeds the given severity. Exit code **0** means no findings matched.
315
docs/output.md
Normal file
@ -0,0 +1,315 @@
# Output Formats

Nyx supports three output formats, selected with `--format` or `output.default_format` in config.

## Console (default)

Human-readable, color-coded output to stdout. Status messages go to stderr.

```
[HIGH] taint-unsanitised-flow (source 5:11) src/handler.rs:12:5 (Score: 76, Confidence: High)
Source: env::var("CMD") → Command::new("sh").arg("-c")

[MEDIUM] cfg-unguarded-sink src/handler.rs:12:5 (Score: 35, Confidence: Medium)

[LOW] rs.quality.unwrap src/lib.rs:88:5 (Score: 10, Confidence: High)
```

### Severity indicators

| Tag | Color | Meaning |
|-----|-------|---------|
| `[HIGH]` | Red, bold | Critical — likely exploitable |
| `[MEDIUM]` | Orange, bold | Important — may be exploitable |
| `[LOW]` | Muted blue-gray | Informational — code quality or weak signal |

### Evidence fields

Taint and state findings include structured evidence:

| Label | Meaning |
|-------|---------|
| **Source** | Where tainted data originated (function name + location) |
| **Sink** | Where the dangerous operation happens |
| **Path guard** | Type of validation predicate protecting the path |

### Score

When attack-surface ranking is enabled (default), each finding shows a `Score` value. Higher scores indicate greater exploitability. See [Detector Overview](detectors.md) for the scoring formula.

### Rollup findings

High-frequency LOW Quality findings (e.g. `rs.quality.unwrap`) are grouped into rollup findings by `(file, rule)`:

```
21:10 ● [LOW] rs.quality.unwrap
rs.quality.unwrap (38 occurrences)
Examples: 21:10, 50:10, 79:10, 105:10, 134:10
Run: nyx scan --show-instances rs.quality.unwrap
```

Rollups count as **one finding** for LOW budget enforcement. Use `--show-instances <RULE>` to expand a specific rule or `--all` to disable rollups entirely.

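Grouping by `(file, rule)` can be sketched in Python as below. This is an illustrative approximation only: the `roll_up` function, the `examples` key, and the five-example cap (mirroring the console output above) are assumptions, and the sketch skips the severity, category, and frequency conditions that decide which findings are eligible for rollup.

```python
from collections import defaultdict


def roll_up(findings: list, max_examples: int = 5) -> list:
    """Group findings by (path, id) and summarize each group."""
    groups = defaultdict(list)
    for finding in findings:
        groups[(finding["path"], finding["id"])].append(finding)

    rollups = []
    for (path, rule), items in groups.items():
        # Order occurrences by position so the examples are the earliest ones.
        items.sort(key=lambda item: (item["line"], item["col"]))
        rollups.append({
            "path": path,
            "id": rule,
            "count": len(items),
            "examples": [f'{item["line"]}:{item["col"]}' for item in items[:max_examples]],
        })
    return rollups
```

Each returned entry corresponds to one rollup line in the console output: a rule, an occurrence count, and a short list of `line:col` examples.
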
### Suppression footer

When findings are suppressed by the prioritization pipeline, a footer is shown:

```
Suppressed 195 LOW/Quality findings.
Active filters:
include_quality = false
max_low = 20
max_low_per_file = 1
max_low_per_rule = 10

Use --include-quality, --max-low, or --all to adjust.
```

---

## JSON

Machine-readable JSON array. Each finding is an object:

```json
[
  {
    "path": "src/handler.rs",
    "line": 12,
    "col": 5,
    "severity": "High",
    "id": "taint-unsanitised-flow (source 5:11)",
    "path_validated": false,
    "labels": [
      ["Source", "env::var(\"CMD\") at 5:11"],
      ["Sink", "Command::new(\"sh\").arg(\"-c\")"]
    ],
    "confidence": "High",
    "evidence": {
      "source": {
        "path": "src/handler.rs",
        "line": 5,
        "col": 11,
        "kind": "source",
        "snippet": "env::var(\"CMD\")"
      },
      "sink": {
        "path": "src/handler.rs",
        "line": 12,
        "col": 5,
        "kind": "sink",
        "snippet": "Command::new(\"sh\")"
      },
      "notes": ["source_kind:EnvironmentConfig"]
    },
    "rank_score": 76.0,
    "rank_reason": [
      ["severity_base", "60"],
      ["analysis_kind", "10"],
      ["source_kind", "5"],
      ["evidence_count", "1"]
    ]
  }
]
```

||||
### Field descriptions
|
||||
|
||||
| Field | Type | Always present | Description |
|-------|------|----------------|-------------|
| `path` | string | yes | File path relative to scan root |
| `line` | int | yes | 1-indexed line number |
| `col` | int | yes | 1-indexed column number |
| `severity` | string | yes | `"High"`, `"Medium"`, or `"Low"` |
| `id` | string | yes | Rule ID |
| `category` | string | yes | Finding category: `"Security"`, `"Reliability"`, or `"Quality"` |
| `path_validated` | bool | no | True if guarded by validation predicate |
| `guard_kind` | string | no | Predicate type (e.g. `"NullCheck"`, `"ValidationCall"`) |
| `message` | string | no | Human-readable context (state analysis findings) |
| `labels` | array | no | Array of `[label, value]` pairs for console display |
| `confidence` | string | no | Confidence level: `"Low"`, `"Medium"`, or `"High"` |
| `evidence` | object | no | Structured evidence (source/sink spans, state, notes) |
| `rank_score` | float | no | Attack-surface score (omitted when ranking disabled) |
| `rank_reason` | array | no | Score breakdown (omitted when ranking disabled) |
| `rollup` | object | no | Rollup data when findings are grouped (see below) |
|
||||
|
||||
Fields marked "no" are omitted when empty/null/false to keep output compact.
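
Because optional keys are dropped rather than emitted as `null`, a consumer should read them with defaults instead of indexing directly. The following is a minimal Python sketch of that pattern; the helper name `summarize` and the output layout are hypothetical and not part of Nyx.

```python
import json


def summarize(finding: dict) -> str:
    """Render one finding as a single line, tolerating omitted optional fields."""
    # Fields documented as always present can be indexed directly.
    head = f'[{finding["severity"].upper()}] {finding["id"]} {finding["path"]}:{finding["line"]}:{finding["col"]}'
    # Optional fields: a missing key means empty/null/false, so supply defaults.
    confidence = finding.get("confidence", "n/a")
    validated = finding.get("path_validated", False)
    parts = [head, f"confidence={confidence}", f"path_validated={validated}"]
    score = finding.get("rank_score")
    if score is not None:
        parts.append(f"score={score:g}")
    return " ".join(parts)


# Hypothetical input shaped like the JSON output described above.
sample = json.loads(
    '[{"path": "src/lib.rs", "line": 44, "col": 5,'
    ' "severity": "Medium", "id": "rs.quality.unsafe_block"}]'
)
print(summarize(sample[0]))
```

Using `dict.get` with a default keeps the script working whether or not ranking, confidence, or evidence data is present in a given run.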
|
||||
|
||||
### Confidence levels
|
||||
|
||||
| Level | Meaning |
|
||||
|-------|---------|
|
||||
| `High` | Strong signal — taint-confirmed flow, definite state violation |
|
||||
| `Medium` | Moderate signal — resource leak, path-validated taint, CFG structural |
|
||||
| `Low` | Weak signal — AST pattern match, possible resource leak, degraded analysis |
|
||||
|
||||
### Evidence object
|
||||
|
||||
The `evidence` field provides structured provenance data:
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `source` | object | Source span (path, line, col, kind, snippet) |
|
||||
| `sink` | object | Sink span (path, line, col, kind, snippet) |
|
||||
| `guards` | array | Validation guard spans |
|
||||
| `sanitizers` | array | Sanitizer spans |
|
||||
| `state` | object | State-machine evidence (machine, subject, from_state, to_state) |
|
||||
| `notes` | array | Free-form notes (e.g. `"source_kind:UserInput"`, `"path_validated"`) |
|
||||
|
||||
All fields are omitted when empty/null.
|
||||
|
||||
### Rollup object
|
||||
|
||||
When a finding is a rollup (grouped from multiple occurrences), the `rollup` field is present:
|
||||
|
||||
```json
{
  "rollup": {
    "count": 38,
    "occurrences": [
      { "line": 21, "col": 10 },
      { "line": 50, "col": 10 },
      { "line": 79, "col": 10 }
    ]
  }
}
```
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `count` | int | Total number of occurrences |
|
||||
| `occurrences` | array | First N example locations (controlled by `rollup_examples`) |
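
Since `occurrences` holds only the first few example locations, scripts that need a true total should rely on `count`. A small Python sketch of that bookkeeping follows; the function name `total_occurrences` is hypothetical.

```python
def total_occurrences(findings: list) -> int:
    """Count underlying occurrences, expanding rollup findings by their `count`."""
    total = 0
    for finding in findings:
        rollup = finding.get("rollup")
        # A finding without a rollup object represents exactly one occurrence.
        total += rollup["count"] if rollup else 1
    return total


# Hypothetical data: one plain finding and one rollup of 38 occurrences.
findings = [
    {"id": "cfg-unguarded-sink"},
    {"id": "rs.quality.unwrap", "rollup": {"count": 38, "occurrences": [{"line": 21, "col": 10}]}},
]
print(total_occurrences(findings))
```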
|
||||
|
||||
---
|
||||
|
||||
## SARIF (Static Analysis Results Interchange Format)
|
||||
|
||||
SARIF 2.1.0 JSON, suitable for GitHub Code Scanning and other SARIF-compatible tools.
|
||||
|
||||
```bash
|
||||
nyx scan . --format sarif > results.sarif
|
||||
```
|
||||
|
||||
The SARIF output includes:
|
||||
|
||||
- **Tool metadata** — Nyx name and version
|
||||
- **Rules** — Rule ID, description, severity mapping
|
||||
- **Results** — One result per finding with location, message, and properties
|
||||
- **Properties** — Each result includes `category` and optionally `confidence` and `rollup.count`
|
||||
- **Related locations** — Rollup findings include example locations in `relatedLocations`
|
||||
- **Artifacts** — File paths referenced by findings
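
For post-processing outside GitHub, the file can be read with any JSON parser. The Python sketch below counts results per rule using only the standard SARIF 2.1.0 layout (`runs[].results[].ruleId`); the sample document and the `results_per_rule` helper are illustrative assumptions, and the exact `ruleId` strings Nyx emits are not confirmed here.

```python
from collections import Counter


def results_per_rule(sarif: dict) -> Counter:
    """Count SARIF results grouped by ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


# Hypothetical, heavily trimmed SARIF document for demonstration only.
sample_sarif = {
    "version": "2.1.0",
    "runs": [
        {
            "results": [
                {"ruleId": "cfg-unguarded-sink"},
                {"ruleId": "cfg-unguarded-sink"},
                {"ruleId": "state-resource-leak"},
            ]
        }
    ],
}
print(results_per_rule(sample_sarif).most_common())
```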
|
||||
|
||||
### GitHub Code Scanning integration
|
||||
|
||||
```yaml
- name: Run Nyx
  run: nyx scan . --format sarif > results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```
|
||||
|
||||
---
|
||||
|
||||
## Exit Codes
|
||||
|
||||
| Code | Meaning |
|------|---------|
| `0` | Scan completed successfully; no findings matched `--fail-on` threshold |
| `1` | `--fail-on` threshold breached (at least one finding meets or exceeds the specified severity) |
| Non-zero | Error (I/O, config, database, parse error) |
|
||||
|
||||
Without `--fail-on`, Nyx always exits `0` on a successful scan regardless of findings count.
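
The rules above can be restated as a small decision function. This Python sketch models a successful scan only (error exits are out of scope) and also skips suppressed findings, as described under Inline Suppressions; the function name and the case-insensitive handling of the threshold are assumptions.

```python
SEVERITY_ORDER = {"Low": 0, "Medium": 1, "High": 2}


def expected_exit_code(findings: list, fail_on: str = None) -> int:
    """Exit code for a scan that completed without errors."""
    if fail_on is None:
        # Without --fail-on, a successful scan always exits 0.
        return 0
    threshold = SEVERITY_ORDER[fail_on.capitalize()]
    for finding in findings:
        if finding.get("suppressed", False):
            continue  # suppressed findings never trigger --fail-on
        if SEVERITY_ORDER[finding["severity"]] >= threshold:
            return 1
    return 0


findings = [{"severity": "Medium"}, {"severity": "High", "suppressed": True}]
print(expected_exit_code(findings, "high"), expected_exit_code(findings, "medium"))
```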
|
||||
|
||||
---
|
||||
|
||||
## Severity Levels
|
||||
|
||||
| Level | Description | Typical rules |
|
||||
|-------|-------------|---------------|
|
||||
| **High** | Critical vulnerabilities — likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources |
|
||||
| **Medium** | Important issues — may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks |
|
||||
| **Low** | Informational — code quality or weak signals | Weak crypto algorithms, insecure randomness, `unwrap()`/`panic!()`, type-safety escapes |
|
||||
|
||||
### Non-production severity downgrade
|
||||
|
||||
By default, findings in paths matching common non-production patterns (`tests/`, `test/`, `vendor/`, `build/`, `examples/`, `benchmarks/`) are downgraded by one tier:
|
||||
|
||||
- High → Medium
|
||||
- Medium → Low
|
||||
- Low → Low (unchanged)
|
||||
|
||||
Use `--keep-nonprod-severity` to disable this behavior.
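
As an illustration, the downgrade can be modeled as a lookup table applied when a path contains one of the listed directories. The Python sketch below assumes a match on any directory component; Nyx's actual path-matching logic may differ, and all names are hypothetical.

```python
from pathlib import PurePosixPath

NONPROD_DIRS = {"tests", "test", "vendor", "build", "examples", "benchmarks"}
DOWNGRADE = {"High": "Medium", "Medium": "Low", "Low": "Low"}


def effective_severity(path: str, severity: str, keep_nonprod_severity: bool = False) -> str:
    """Apply the one-tier downgrade to findings in non-production paths."""
    if keep_nonprod_severity:
        return severity
    # Assumption: any directory component matching the list marks the path as non-production.
    directories = PurePosixPath(path).parts[:-1]
    if any(part in NONPROD_DIRS for part in directories):
        return DOWNGRADE[severity]
    return severity


print(effective_severity("tests/handler.rs", "High"))
print(effective_severity("src/handler.rs", "High"))
```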
|
||||
|
||||
---
|
||||
|
||||
## Inline Suppressions
|
||||
|
||||
Suppress specific findings directly in source code using `nyx:ignore` comments. Suppressed findings are excluded from output, severity counts, and `--fail-on` checks by default.
|
||||
|
||||
### Comment syntax
|
||||
|
||||
| Language | Comment styles |
|
||||
|----------|---------------|
|
||||
| Rust, C, C++, Java, Go, JS, TS | `// nyx:ignore ...` or `/* nyx:ignore ... */` |
|
||||
| Python, Ruby | `# nyx:ignore ...` |
|
||||
| PHP | `// nyx:ignore ...`, `# nyx:ignore ...`, or `/* nyx:ignore ... */` |
|
||||
|
||||
### Directive forms
|
||||
|
||||
```python
x = dangerous()  # nyx:ignore taint-unsanitised-flow    ← suppresses this line
# nyx:ignore-next-line taint-unsanitised-flow
x = dangerous()  # ← suppressed by the directive on the previous line
```
|
||||
|
||||
- `nyx:ignore <RULE_ID>` — suppresses findings on the **same line** as the comment.
|
||||
- `nyx:ignore-next-line <RULE_ID>` — suppresses findings on the **next line**.
|
||||
- For taint findings, the primary line is the **sink line** (the `line` field in output).
|
||||
|
||||
### Rule ID matching
|
||||
|
||||
- **Case-sensitive**, exact match after canonicalization.
|
||||
- Comma-separated: `nyx:ignore rule-a, rule-b`
|
||||
- Wildcard suffix: `nyx:ignore rs.quality.*` matches any ID starting with `rs.quality.`
|
||||
- Taint IDs are canonicalized: `nyx:ignore taint-unsanitised-flow` matches `taint-unsanitised-flow (source 5:1)` (parenthetical suffix stripped).
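
The matching rules above can be sketched in a few lines of Python. This is an illustrative reimplementation under the stated rules, not Nyx's code; the function names and the regular expression used for canonicalization are assumptions.

```python
import re


def canonicalize(rule_id: str) -> str:
    """Strip a trailing parenthetical suffix such as ' (source 5:1)'."""
    return re.sub(r"\s*\(.*\)\s*$", "", rule_id)


def directive_matches(patterns: str, rule_id: str) -> bool:
    """Check a comma-separated directive pattern list against a finding's rule ID."""
    canonical = canonicalize(rule_id)
    for pattern in (p.strip() for p in patterns.split(",")):
        if pattern.endswith("*"):
            # Wildcard suffix: case-sensitive prefix match on everything before '*'.
            if canonical.startswith(pattern[:-1]):
                return True
        elif pattern == canonical:
            return True
    return False


print(directive_matches("taint-unsanitised-flow", "taint-unsanitised-flow (source 5:1)"))
print(directive_matches("rs.quality.*", "rs.quality.unsafe_block"))
```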
|
||||
|
||||
### Console behavior
|
||||
|
||||
- **Default**: suppressed findings are hidden entirely.
|
||||
- **`--show-suppressed`**: suppressed findings appear dimmed with a `[SUPPRESSED]` tag. The summary shows `"N issues (M suppressed)"`.
|
||||
|
||||
### JSON / SARIF behavior
|
||||
|
||||
- **Default**: suppressed findings are excluded from JSON/SARIF output.
|
||||
- **`--show-suppressed`**: suppressed findings are included with additional fields:
|
||||
|
||||
```json
|
||||
{
|
||||
"suppressed": true,
|
||||
"suppression": {
|
||||
"kind": "SameLine",
|
||||
"matched_pattern": "taint-unsanitised-flow",
|
||||
"directive_line": 42
|
||||
}
|
||||
}
|
||||
```
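
When consuming output produced with `--show-suppressed`, a script will usually want to separate active findings from suppressed ones. A minimal Python sketch, assuming the fields shown above; the helper name `split_suppressed` and the sample data are hypothetical.

```python
def split_suppressed(findings: list) -> tuple:
    """Partition findings into (active, suppressed) using the `suppressed` flag."""
    active = [f for f in findings if not f.get("suppressed", False)]
    suppressed = [f for f in findings if f.get("suppressed", False)]
    return active, suppressed


# Hypothetical findings: the second one carries the suppression metadata.
findings = [
    {"path": "src/a.py", "line": 10, "id": "py.code_exec.eval"},
    {
        "path": "src/b.py",
        "line": 43,
        "id": "taint-unsanitised-flow (source 5:1)",
        "suppressed": True,
        "suppression": {
            "kind": "SameLine",
            "matched_pattern": "taint-unsanitised-flow",
            "directive_line": 43,
        },
    },
]
active, suppressed = split_suppressed(findings)
for f in suppressed:
    info = f["suppression"]
    print(f'{f["path"]}:{info["directive_line"]} suppresses {info["matched_pattern"]}')
```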
|
||||
|
||||
### Exit code
|
||||
|
||||
Suppressed findings do **not** trigger `--fail-on`. A scan with only suppressed findings exits `0`.
|
||||
|
||||
---
|
||||
|
||||
## Rule ID Format
|
||||
|
||||
| Prefix | Detector | Example |
|
||||
|--------|----------|---------|
|
||||
| `taint-*` | Taint analysis | `taint-unsanitised-flow (source 5:11)` |
|
||||
| `cfg-*` | CFG structural | `cfg-unguarded-sink`, `cfg-auth-gap` |
|
||||
| `state-*` | State model | `state-use-after-close`, `state-resource-leak` |
|
||||
| `<lang>.*.*` | AST patterns | `rs.memory.transmute`, `js.code_exec.eval` |
|
||||
|
||||
See the [Rule Reference](rules/index.md) for a complete listing.
|
||||
103
docs/quickstart.md
Normal file
|
|
@ -0,0 +1,103 @@
|
|||
# Quick Start
|
||||
|
||||
## Your first scan
|
||||
|
||||
```bash
|
||||
# Scan the current directory
|
||||
nyx scan
|
||||
|
||||
# Scan a specific path
|
||||
nyx scan ./my-project
|
||||
```
|
||||
|
||||
Nyx automatically creates an SQLite index on first run. Subsequent scans skip unchanged files.
|
||||
|
||||
## Understanding the output
|
||||
|
||||
A typical console output looks like:
|
||||
|
||||
```
|
||||
[HIGH] taint-unsanitised-flow (source 5:11) src/handler.rs:12:5
|
||||
Source: env::var("CMD") at 5:11
|
||||
Sink: Command::new("sh").arg("-c")
|
||||
Score: 76
|
||||
|
||||
[MEDIUM] cfg-unguarded-sink src/handler.rs:12:5
|
||||
Score: 35
|
||||
|
||||
[MEDIUM] rs.quality.unsafe_block src/lib.rs:44:5
|
||||
Score: 30
|
||||
```
|
||||
|
||||
Each finding shows:
|
||||
|
||||
| Field | Meaning |
|
||||
|-------|---------|
|
||||
| **Severity tag** | `[HIGH]`, `[MEDIUM]`, or `[LOW]` |
|
||||
| **Rule ID** | Identifies the detector and specific rule |
|
||||
| **Location** | `file:line:col` |
|
||||
| **Evidence** | Source, Sink, and guard details (taint findings only) |
|
||||
| **Score** | Attack-surface ranking score (higher = more exploitable) |
|
||||
|
||||
## Common workflows
|
||||
|
||||
### CI gate — fail on high-severity findings
|
||||
|
||||
```bash
|
||||
nyx scan . --fail-on high --quiet
|
||||
# Exit code 1 if any HIGH finding exists, 0 otherwise
|
||||
```
|
||||
|
||||
### Export for tooling
|
||||
|
||||
```bash
|
||||
# JSON for scripting
|
||||
nyx scan . --format json > findings.json
|
||||
|
||||
# SARIF for GitHub Code Scanning
|
||||
nyx scan . --format sarif > results.sarif
|
||||
```
|
||||
|
||||
### Fast structural scan (no dataflow)
|
||||
|
||||
```bash
|
||||
nyx scan . --mode ast
|
||||
```
|
||||
|
||||
AST-only mode runs tree-sitter pattern queries without building CFGs or running taint analysis. This is much faster, but it misses dataflow vulnerabilities.
|
||||
|
||||
### Filter by severity
|
||||
|
||||
```bash
|
||||
# Only high-severity
|
||||
nyx scan . --severity HIGH
|
||||
|
||||
# High and medium
|
||||
nyx scan . --severity ">=MEDIUM"
|
||||
|
||||
# Specific set
|
||||
nyx scan . --severity "HIGH,MEDIUM"
|
||||
```
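
The three filter forms shown above can be read as set expressions over the severity levels. The Python sketch below illustrates that reading only; it covers just the forms listed here, treats input case-insensitively as an assumption, and is not Nyx's parser.

```python
ORDER = ["LOW", "MEDIUM", "HIGH"]


def parse_severity_filter(expr: str) -> set:
    """Turn 'HIGH', '>=MEDIUM', or 'HIGH,MEDIUM' into the set of accepted levels."""
    expr = expr.strip().upper()
    if expr.startswith(">="):
        # Threshold form: the named level and everything above it.
        floor = ORDER.index(expr[2:].strip())
        return set(ORDER[floor:])
    # Single level or explicit comma-separated set.
    return {part.strip() for part in expr.split(",")}


print(sorted(parse_severity_filter(">=MEDIUM")))
```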
|
||||
|
||||
### Skip the index
|
||||
|
||||
```bash
|
||||
nyx scan . --index off
|
||||
```
|
||||
|
||||
Useful for one-off scans or when you don't want to write to disk.
|
||||
|
||||
### Keep original severity in non-production paths
|
||||
|
||||
By default, findings in test/vendor/build paths are downgraded one severity tier. To keep original severity:
|
||||
|
||||
```bash
|
||||
nyx scan . --keep-nonprod-severity
|
||||
```
|
||||
|
||||
## Next steps
|
||||
|
||||
- [CLI Reference](cli.md) — All flags and options
|
||||
- [Configuration](configuration.md) — Customize rules, exclusions, and behavior
|
||||
- [Detector Overview](detectors.md) — How the analysis engines work
|
||||
- [Rule Reference](rules/index.md) — Browse all rules by language
|
||||
89
docs/rules/c.md
Normal file
|
|
@ -0,0 +1,89 @@
|
|||
# C Rules
|
||||
|
||||
Nyx detects C vulnerabilities through AST patterns (banned functions, format strings) and taint analysis (user input → shell execution, buffer overflow sinks).
|
||||
|
||||
## Taint Sources
|
||||
|
||||
| Function | Capability | Source Kind |
|
||||
|----------|-----------|-------------|
|
||||
| `getenv` | `all` | EnvironmentConfig |
|
||||
| `fgets`, `scanf`, `fscanf`, `gets`, `read` | `all` | UserInput |
|
||||
|
||||
## Taint Sinks
|
||||
|
||||
| Function | Required Capability |
|
||||
|----------|-------------------|
|
||||
| `system`, `popen`, `exec*` family | `SHELL_ESCAPE` |
|
||||
| `sprintf`, `strcpy`, `strcat` | `HTML_ESCAPE` |
|
||||
| `printf`, `fprintf` | `FMT_STRING` |
|
||||
| `fopen`, `open` | `FILE_IO` |
|
||||
|
||||
---
|
||||
|
||||
## AST Pattern Rules
|
||||
|
||||
### Memory Safety (Banned Functions)
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `c.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
|
||||
| `c.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination buffer |
|
||||
| `c.memory.strcat` | High | A | `strcat()` — no bounds checking on destination buffer |
|
||||
| `c.memory.sprintf` | High | A | `sprintf()` — no length limit on output buffer |
|
||||
| `c.memory.scanf_percent_s` | High | A | `scanf("%s")` — unbounded string read |
|
||||
| `c.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability (non-literal first arg) |
|
||||
|
||||
### Command Execution
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `c.cmdi.system` | High | A | `system()` — shell command execution |
|
||||
| `c.cmdi.popen` | Medium | A | `popen()` — shell command execution with pipe |
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### `c.memory.gets` — Banned function
|
||||
|
||||
**Vulnerable:**
|
||||
```c
|
||||
char buf[64];
|
||||
gets(buf); // No bounds checking — buffer overflow
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```c
|
||||
char buf[64];
|
||||
fgets(buf, sizeof(buf), stdin);
|
||||
```
|
||||
|
||||
### `c.memory.printf_no_fmt` — Format string
|
||||
|
||||
**Vulnerable:**
|
||||
```c
|
||||
char *user_input = get_input();
|
||||
printf(user_input); // Format string vulnerability
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```c
|
||||
char *user_input = get_input();
|
||||
printf("%s", user_input);
|
||||
```
|
||||
|
||||
### `c.cmdi.system` — Shell execution
|
||||
|
||||
**Vulnerable:**
|
||||
```c
|
||||
char cmd[256];
|
||||
snprintf(cmd, sizeof(cmd), "ls %s", user_dir);
|
||||
system(cmd); // Command injection if user_dir contains shell metacharacters
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```c
|
||||
// Use execvp with explicit argument array
|
||||
char *args[] = {"ls", user_dir, NULL};
|
||||
execvp("ls", args);
|
||||
```
|
||||
66
docs/rules/cpp.md
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
# C++ Rules
|
||||
|
||||
C++ rules inherit C banned-function concerns and add C++-specific patterns like dangerous casts.
|
||||
|
||||
## Taint Labels
|
||||
|
||||
C++ shares taint labels with C. See [C Rules](c.md) for the full source/sink/sanitizer listing.
|
||||
|
||||
---
|
||||
|
||||
## AST Pattern Rules
|
||||
|
||||
### Memory Safety
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `cpp.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
|
||||
| `cpp.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination |
|
||||
| `cpp.memory.strcat` | High | A | `strcat()` — no bounds checking on destination |
|
||||
| `cpp.memory.sprintf` | High | A | `sprintf()` — no length limit on output |
|
||||
| `cpp.memory.reinterpret_cast` | Medium | A | `reinterpret_cast` — type-punning cast |
|
||||
| `cpp.memory.const_cast` | Medium | A | `const_cast` — removes const/volatile qualifier |
|
||||
| `cpp.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability |
|
||||
|
||||
### Command Execution
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `cpp.cmdi.system` | High | A | `system()` — shell command execution |
|
||||
| `cpp.cmdi.popen` | High | A | `popen()` — shell command execution |
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### `cpp.memory.reinterpret_cast` — Type-punning cast
|
||||
|
||||
**Flagged:**
|
||||
```cpp
|
||||
int x = 42;
|
||||
float* fp = reinterpret_cast<float*>(&x); // Type-punning, may violate strict aliasing
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```cpp
|
||||
int x = 42;
|
||||
float f;
|
||||
std::memcpy(&f, &x, sizeof(f)); // Well-defined type punning
|
||||
```
|
||||
|
||||
### `cpp.memory.const_cast` — Removing const
|
||||
|
||||
**Flagged:**
|
||||
```cpp
|
||||
void process(const std::string& s) {
|
||||
char* p = const_cast<char*>(s.c_str()); // Removes const
|
||||
p[0] = 'X'; // Undefined behavior
|
||||
}
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```cpp
|
||||
void process(std::string s) { // Take by value
|
||||
s[0] = 'X';
|
||||
}
|
||||
```
|
||||
148
docs/rules/go.md
Normal file
|
|
@ -0,0 +1,148 @@
|
|||
# Go Rules
|
||||
|
||||
Nyx detects Go vulnerabilities through AST patterns and taint analysis, covering command execution, unsafe pointer usage, TLS misconfiguration, weak crypto, SQL injection, hardcoded secrets, and deserialization.
|
||||
|
||||
## Taint Labels
|
||||
|
||||
Go has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/go.rs`.
|
||||
|
||||
### Sources
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `os.Getenv` | all |
|
||||
| `http.Request`, `r.FormValue`, `r.URL`, `r.Body`, `r.Header` | all |
|
||||
| `r.URL.Query`, `r.URL.Query.Get`, `Request.FormValue`, `Request.URL` | all |
|
||||
|
||||
### Sanitizers
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `html.EscapeString`, `template.HTMLEscapeString` | HTML_ESCAPE |
|
||||
| `url.QueryEscape`, `url.PathEscape` | URL_ENCODE |
|
||||
| `filepath.Clean`, `filepath.Base` | FILE_IO |
|
||||
|
||||
### Sinks
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `exec.Command` | SHELL_ESCAPE |
|
||||
| `db.Query`, `db.Exec`, `db.QueryRow`, `db.Prepare` | SHELL_ESCAPE |
|
||||
| `fmt.Fprintf`, `fmt.Sprintf`, `fmt.Printf` | FMT_STRING |
|
||||
| `os.Open`, `os.OpenFile`, `os.Create`, `ioutil.ReadFile`, `os.ReadFile` | FILE_IO |
|
||||
| `template.HTML` | HTML_ESCAPE |
|
||||
|
||||
> **Note:** Chained calls like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments before matching, so `r.URL.Query.Get` matches the source rule.
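
To make the normalization in the note concrete, here is a rough Python sketch of the idea: drop the final argument list, then remove any call parentheses that sit in the middle of the chain. The regular expressions and the name `normalize_callee` are assumptions for illustration; the real implementation works on the syntax tree and may behave differently, and nested parentheses inside arguments are not handled here.

```python
import re


def normalize_callee(expr: str) -> str:
    """Reduce a chained call expression to a dotted name for label matching."""
    # Drop the trailing argument list of the outermost call, e.g. ("host").
    expr = re.sub(r"\([^()]*\)\s*$", "", expr)
    # Strip internal call segments such as the "()" in Query().Get.
    return re.sub(r"\([^()]*\)(?=\.)", "", expr)


print(normalize_callee('r.URL.Query().Get("host")'))
```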
|
||||
|
||||
---
|
||||
|
||||
## AST Pattern Rules
|
||||
|
||||
### Command Execution
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.cmdi.exec_command` | High | A | `exec.Command()` — arbitrary process execution |
|
||||
|
||||
### Memory Safety
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.memory.unsafe_pointer` | Medium | A | `unsafe.Pointer` — bypasses Go type system |
|
||||
|
||||
### Insecure Transport
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.transport.insecure_skip_verify` | High | A | `InsecureSkipVerify: true` — disables TLS certificate validation |
|
||||
|
||||
### Weak Crypto
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.crypto.md5` | Low | A | `md5.New()` / `md5.Sum()` — weak hash algorithm |
|
||||
| `go.crypto.sha1` | Low | A | `sha1.New()` / `sha1.Sum()` — weak hash algorithm |
|
||||
|
||||
### SQL Injection
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.sqli.query_concat` | Medium | B | `db.Query`/`Exec`/`QueryRow` with concatenated string |
|
||||
|
||||
### Secrets
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.secrets.hardcoded_key` | Medium | A | Variable with secret-like name assigned a string literal |
|
||||
|
||||
### Deserialization
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `go.deser.gob_decode` | Medium | A | `gob.NewDecoder` — Go binary deserialization |
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### `go.transport.insecure_skip_verify` — TLS misconfiguration
|
||||
|
||||
**Vulnerable:**
|
||||
```go
|
||||
tr := &http.Transport{
|
||||
TLSClientConfig: &tls.Config{
|
||||
InsecureSkipVerify: true, // Disables certificate verification
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```go
|
||||
tr := &http.Transport{
|
||||
TLSClientConfig: &tls.Config{
|
||||
// Use proper CA certificates
|
||||
RootCAs: certPool,
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
### `go.sqli.query_concat` — SQL concatenation
|
||||
|
||||
**Vulnerable:**
|
||||
```go
|
||||
rows, err := db.Query("SELECT * FROM users WHERE id=" + userID)
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```go
|
||||
rows, err := db.Query("SELECT * FROM users WHERE id=$1", userID)
|
||||
```
|
||||
|
||||
### `go.secrets.hardcoded_key` — Hardcoded secret
|
||||
|
||||
**Flagged:**
|
||||
```go
|
||||
apiKey := "sk-1234567890abcdef"
|
||||
password := "hunter2"
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```go
|
||||
apiKey := os.Getenv("API_KEY")
|
||||
password := os.Getenv("DB_PASSWORD")
|
||||
```
|
||||
|
||||
### `go.cmdi.exec_command` — Command execution
|
||||
|
||||
**Vulnerable:**
|
||||
```go
|
||||
cmd := exec.Command("sh", "-c", userInput)
|
||||
cmd.Run()
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```go
|
||||
// Use explicit command and arguments, not shell
|
||||
cmd := exec.Command("ls", "-la", safeDir)
|
||||
cmd.Run()
|
||||
```
|
||||
79
docs/rules/index.md
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
# Rule Reference
|
||||
|
||||
This section lists every detection rule in Nyx, organized by language.
|
||||
|
||||
## Rule ID Format
|
||||
|
||||
| Prefix | Detector Family | Example |
|
||||
|--------|----------------|---------|
|
||||
| `taint-*` | [Taint analysis](../detectors/taint.md) | `taint-unsanitised-flow (source 5:11)` |
|
||||
| `cfg-*` | [CFG structural](../detectors/cfg.md) | `cfg-unguarded-sink`, `cfg-auth-gap` |
|
||||
| `state-*` | [State model](../detectors/state.md) | `state-use-after-close`, `state-resource-leak` |
|
||||
| `<lang>.*.*` | [AST patterns](../detectors/patterns.md) | `rs.memory.transmute`, `js.code_exec.eval` |
|
||||
|
||||
## Cross-Language Rules
|
||||
|
||||
These rules apply to all supported languages:
|
||||
|
||||
### Taint Rules
|
||||
|
||||
| Rule ID | Severity | Description |
|
||||
|---------|----------|-------------|
|
||||
| `taint-unsanitised-flow (source L:C)` | Varies by source kind | Unsanitized data flows from source to sink |
|
||||
|
||||
### CFG Structural Rules
|
||||
|
||||
| Rule ID | Severity | Description |
|
||||
|---------|----------|-------------|
|
||||
| `cfg-unguarded-sink` | High/Medium | Sink without dominating guard |
|
||||
| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth |
|
||||
| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
|
||||
| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
|
||||
| `cfg-unreachable-source` | Low | Source in unreachable code |
|
||||
| `cfg-error-fallthrough` | High/Medium | Error path doesn't terminate before dangerous code |
|
||||
| `cfg-resource-leak` | Medium | Resource not released on all exit paths |
|
||||
| `cfg-lock-not-released` | Medium | Lock not released on all exit paths |
|
||||
|
||||
### State Model Rules
|
||||
|
||||
| Rule ID | Severity | Description |
|
||||
|---------|----------|-------------|
|
||||
| `state-use-after-close` | High | Variable used after being closed |
|
||||
| `state-double-close` | Medium | Resource closed twice |
|
||||
| `state-resource-leak` | Medium | Resource never closed (definite) |
|
||||
| `state-resource-leak-possible` | Low | Resource may not close on all paths |
|
||||
| `state-unauthed-access` | High | Privileged operation without authentication |
|
||||
|
||||
## Per-Language AST Pattern Rules
|
||||
|
||||
Each language page lists all AST pattern rules with examples:
|
||||
|
||||
- [Rust](rust.md) — 12 rules (memory safety, code quality)
|
||||
- [C](c.md) — 8 rules (banned functions, command execution, format strings)
|
||||
- [C++](cpp.md) — 9 rules (banned functions, dangerous casts, command execution)
|
||||
- [Java](java.md) — 8 rules (deserialization, command execution, reflection, SQL, crypto, XSS)
|
||||
- [Go](go.md) — 8 rules (command execution, unsafe pointer, TLS, crypto, SQL, secrets, deserialization)
|
||||
- [JavaScript](javascript.md) — 12 rules (code execution, XSS, prototype pollution, crypto, transport)
|
||||
- [TypeScript](typescript.md) — 10 rules (mirrors JS + type-safety escapes)
|
||||
- [Python](python.md) — 12 rules (code execution, command execution, deserialization, SQL, crypto, XSS)
|
||||
- [PHP](php.md) — 11 rules (code execution, command execution, deserialization, SQL, path traversal, crypto)
|
||||
- [Ruby](ruby.md) — 10 rules (code execution, command execution, deserialization, reflection, SSRF, crypto)
|
||||
|
||||
## Taint Label Coverage
|
||||
|
||||
Taint analysis uses language-specific source/sink/sanitizer labels. Coverage varies by language:
|
||||
|
||||
| Language | Sources | Sinks | Sanitizers | Coverage |
|
||||
|----------|---------|-------|------------|----------|
|
||||
| Rust | Complete | Complete | Complete | Full |
|
||||
| JavaScript | Complete | Complete | Partial | Full |
|
||||
| TypeScript | Partial | Partial | Partial | Moderate |
|
||||
| Python | Partial | Complete | Partial | Moderate |
|
||||
| C | Partial | Complete | Minimal | Moderate |
|
||||
| C++ | Partial | Complete | Minimal | Moderate |
|
||||
| Java | Partial | Partial | Partial | Moderate |
|
||||
| Go | Complete | Complete | Partial | Full |
|
||||
| PHP | Complete | Complete | Partial | Full |
|
||||
| Ruby | Partial | Partial | Partial | Moderate |
|
||||
|
||||
"Moderate" coverage means basic rules exist but many common library functions are not yet labeled. Contributions are welcome.
|
||||
135
docs/rules/java.md
Normal file
|
|
@ -0,0 +1,135 @@
|
|||
# Java Rules
|
||||
|
||||
Nyx detects Java vulnerabilities through AST patterns and taint analysis, covering deserialization, command execution, reflection, SQL injection, weak crypto, and XSS.
|
||||
|
||||
## Taint Labels
|
||||
|
||||
Java has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/java.rs`.
|
||||
|
||||
### Sources
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `System.getenv` | all |
|
||||
| `getParameter`, `getInputStream`, `getHeader`, `getCookies`, `getReader`, `getQueryString`, `getPathInfo` | all |
|
||||
| `readObject`, `readLine` | all |
|
||||
|
||||
### Sanitizers
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `HtmlUtils.htmlEscape`, `StringEscapeUtils.escapeHtml4` | HTML_ESCAPE |
|
||||
|
||||
### Sinks
|
||||
|
||||
| Matcher | Cap |
|
||||
|---------|-----|
|
||||
| `Runtime.exec`, `ProcessBuilder` | SHELL_ESCAPE |
|
||||
| `executeQuery`, `executeUpdate`, `prepareStatement` | SHELL_ESCAPE |
|
||||
| `Class.forName` | SHELL_ESCAPE |
|
||||
| `println`, `print`, `write` | HTML_ESCAPE |
|
||||
|
||||
---
|
||||
|
||||
## AST Pattern Rules
|
||||
|
||||
### Deserialization
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.deser.readobject` | High | A | `ObjectInputStream.readObject()` — unsafe deserialization |
|
||||
|
||||
### Command Execution
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.cmdi.runtime_exec` | High | A | `Runtime.getRuntime().exec()` — shell command execution |
|
||||
|
||||
### Reflection
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.reflection.class_forname` | Medium | A | `Class.forName()` — dynamic class loading |
|
||||
| `java.reflection.method_invoke` | Medium | A | `Method.invoke()` — reflective method invocation |
|
||||
|
||||
### SQL Injection
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.sqli.execute_concat` | Medium | B | SQL `execute*()` with concatenated string argument |
|
||||
|
||||
### Weak Crypto
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.crypto.insecure_random` | Low | A | `new Random()` — `java.util.Random` is not cryptographically secure |
|
||||
| `java.crypto.weak_digest` | Low | A | `MessageDigest.getInstance("MD5"/"SHA1")` |
|
||||
|
||||
### XSS
|
||||
|
||||
| Rule ID | Severity | Tier | Description |
|
||||
|---------|----------|------|-------------|
|
||||
| `java.xss.getwriter_print` | Medium | A | `response.getWriter().print/println/write` — direct output |
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### `java.deser.readobject` — Unsafe deserialization
|
||||
|
||||
**Vulnerable:**
|
||||
```java
|
||||
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
|
||||
Object obj = ois.readObject(); // Arbitrary object instantiation
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```java
|
||||
// Use a safe format like JSON
|
||||
ObjectMapper mapper = new ObjectMapper();
|
||||
MyType obj = mapper.readValue(request.getInputStream(), MyType.class);
|
||||
```
|
||||
|
||||
### `java.sqli.execute_concat` — SQL concatenation
|
||||
|
||||
**Vulnerable:**
|
||||
```java
|
||||
String query = "SELECT * FROM users WHERE id=" + userId;
|
||||
stmt.executeQuery(query); // SQL injection
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```java
|
||||
PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE id=?");
|
||||
ps.setString(1, userId);
|
||||
ResultSet rs = ps.executeQuery();
|
||||
```
|
||||
|
||||
### `java.cmdi.runtime_exec` — Command execution
|
||||
|
||||
**Vulnerable:**
|
||||
```java
|
||||
Runtime.getRuntime().exec("cmd /c " + userCommand);
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```java
|
||||
ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
|
||||
// Use explicit argument list, never concatenate user input
|
||||
```
|
||||
|
||||
### `java.reflection.class_forname` — Dynamic class loading
|
||||
|
||||
**Flagged:**
|
||||
```java
|
||||
Class<?> cls = Class.forName(className);
|
||||
Object obj = cls.getDeclaredConstructor().newInstance();
|
||||
```
|
||||
|
||||
**Safe alternative:**
|
||||
```java
|
||||
// Use an allowlist of permitted class names
|
||||
Map<String, Class<?>> allowed = Map.of("User", User.class, "Order", Order.class);
|
||||
Class<?> cls = allowed.get(className);
|
||||
if (cls != null) { /* ... */ }
|
||||
```
|
||||
138
docs/rules/javascript.md
Normal file
|
|
@ -0,0 +1,138 @@
|
|||
# JavaScript Rules

Alongside Rust, JavaScript has the most complete taint label coverage. Nyx detects code execution, XSS, prototype pollution, command injection, weak crypto, and insecure transport.

## Taint Sources

| Function | Capability | Source Kind |
|----------|-----------|-------------|
| `document.location`, `window.location` | `all` | UserInput |
| `req.body`, `req.query`, `req.params` | `all` | UserInput |
| `req.headers`, `req.cookies` | `all` | UserInput |
| `process.env` | `all` | EnvironmentConfig |

## Taint Sinks

| Function | Required Capability |
|----------|-------------------|
| `eval` | `SHELL_ESCAPE` |
| `innerHTML` | `HTML_ESCAPE` |
| `location.href`, `window.location.href` | `URL_ENCODE` |
| `child_process.exec`, `child_process.execSync` | `SHELL_ESCAPE` |
| `child_process.spawn` | `SHELL_ESCAPE` |

## Taint Sanitizers

| Function | Strips Capability |
|----------|------------------|
| `JSON.parse` | `JSON_PARSE` |
| `encodeURIComponent`, `encodeURI` | `URL_ENCODE` |
| `DOMPurify.sanitize` | `HTML_ESCAPE` |

> **Note:** Anonymous function expressions and arrow functions passed as callback arguments (e.g., Express `app.get('/path', function(req, res) { ... })`) are automatically walked as separate function scopes for taint analysis. Each anonymous function gets a unique scope identifier to prevent cross-function taint leakage.

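
The scope identifiers mentioned in the note can be pictured as part of the key under which taint is stored. The following is a minimal sketch in Rust, assuming taint is tracked per `(scope id, variable name)` pair; `TaintState`, `mark`, and `is_tainted` are hypothetical names for illustration, not Nyx's actual data structures.

```rust
use std::collections::HashSet;

/// Hypothetical taint store: variables are keyed by (scope id, name),
/// so the same name in two callbacks never shares taint.
#[derive(Default)]
struct TaintState {
    tainted: HashSet<(u32, String)>,
}

impl TaintState {
    /// Record that `var` is tainted inside the scope `scope`.
    fn mark(&mut self, scope: u32, var: &str) {
        self.tainted.insert((scope, var.to_string()));
    }

    /// Check taint for `var` in exactly this scope.
    fn is_tainted(&self, scope: u32, var: &str) -> bool {
        self.tainted.contains(&(scope, var.to_string()))
    }
}
```

With this keying, a tainted `code` variable in one Express handler says nothing about a variable named `code` in a different handler, which is the leakage the note describes avoiding.
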
---

## AST Pattern Rules

### Code Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.code_exec.eval` | High | A | `eval()`: dynamic code execution |
| `js.code_exec.new_function` | High | A | `new Function()`: eval equivalent |
| `js.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |

### XSS Sinks

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
| `js.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
| `js.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
| `js.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` (open redirect) |
| `js.xss.cookie_write` | Medium | A | Write to `document.cookie` |

### Prototype Pollution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
| `js.prototype.extend_object` | Medium | A | Assignment to `Object.prototype.*` |

### Weak Crypto

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.crypto.weak_hash` | Low | A | `crypto.createHash("md5"/"sha1")` |
| `js.crypto.math_random` | Low | A | `Math.random()`: not cryptographically secure |

### Insecure Transport

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.transport.fetch_http` | Low | A | `fetch("http://...")`: plaintext HTTP |

---

## Examples

### `js.code_exec.eval`: Dynamic code execution

**Vulnerable:**
```javascript
const code = req.query.code;
eval(code);  // Remote code execution
```

**Safe alternative:**
```javascript
// Use a sandboxed interpreter or avoid eval entirely
const allowed = { add: (a, b) => a + b };
// Look up own properties only, so inherited names such as "constructor" are rejected
const op = Object.hasOwn(allowed, req.query.operation) ? allowed[req.query.operation] : undefined;
const result = op?.(Number(req.query.a), Number(req.query.b));
```

### `js.xss.document_write`: XSS sink

**Vulnerable:**
```javascript
document.write("<h1>" + userName + "</h1>");
```

**Safe alternative:**
```javascript
const el = document.createElement("h1");
el.textContent = userName;
document.body.appendChild(el);
```

### `js.prototype.proto_assignment`: Prototype pollution

**Vulnerable:**
```javascript
function merge(target, source) {
  for (let key in source) {
    target[key] = source[key];  // If key is "__proto__", pollutes prototype
  }
}
```

**Safe alternative:**
```javascript
function merge(target, source) {
  for (let key in source) {
    if (key === "__proto__" || key === "constructor" || key === "prototype") continue;
    target[key] = source[key];
  }
}
```

### Taint: `req.body` → `eval()`

**Finding:**
```
[HIGH] taint-unsanitised-flow (source 2:18) src/handler.js:3:5
Source: req.body at 2:18
Sink: eval()
Score: 78
```

138
docs/rules/php.md
Normal file
@ -0,0 +1,138 @@
# PHP Rules

Nyx detects PHP vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, path traversal, and weak crypto.

## Taint Labels

PHP has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/php.rs`.

### Sources

| Matcher | Cap |
|---------|-----|
| `$_GET` / `_GET`, `$_POST` / `_POST`, `$_REQUEST` / `_REQUEST`, `$_COOKIE` / `_COOKIE`, `$_FILES` / `_FILES`, `$_SERVER` / `_SERVER`, `$_ENV` / `_ENV` | all |
| `file_get_contents`, `fread` | all |

> **Note:** PHP superglobal names are matched both with and without the `$` prefix because the CFG's `collect_idents` strips the leading `$` from variable names. Subscript access like `$_GET['cmd']` is handled via `element_reference` / `subscript_expression` node detection.

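
The effect of matching both spellings can be sketched as normalizing the identifier before comparing it. This is an illustration only: `normalize_ident` and `is_superglobal_source` are hypothetical helpers, not the functions in `src/labels/php.rs`, and the real matcher may work differently.

```rust
/// Superglobal names from the Sources table, stored without the `$` prefix.
const SUPERGLOBALS: [&str; 7] = [
    "_GET", "_POST", "_REQUEST", "_COOKIE", "_FILES", "_SERVER", "_ENV",
];

/// Hypothetical helper: drop one leading `$`, so `$_GET` and `_GET` compare equal.
fn normalize_ident(name: &str) -> &str {
    name.strip_prefix('$').unwrap_or(name)
}

/// Hypothetical helper: true when the identifier names a superglobal in either spelling.
fn is_superglobal_source(name: &str) -> bool {
    let bare = normalize_ident(name);
    SUPERGLOBALS.iter().any(|s| *s == bare)
}
```

Normalizing once at the comparison point keeps the label table to a single spelling per name, whichever form the CFG hands over.
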
### Sanitizers

| Matcher | Cap |
|---------|-----|
| `htmlspecialchars`, `htmlentities` | HTML_ESCAPE |
| `escapeshellarg`, `escapeshellcmd` | SHELL_ESCAPE |
| `basename` | FILE_IO |

### Sinks

| Matcher | Cap |
|---------|-----|
| `system`, `exec`, `passthru`, `shell_exec`, `proc_open`, `popen` | SHELL_ESCAPE |
| `eval`, `assert` | SHELL_ESCAPE |
| `include`, `include_once`, `require`, `require_once` | FILE_IO |
| `unserialize` | SHELL_ESCAPE |
| `move_uploaded_file`, `copy`, `file_put_contents`, `fwrite` | FILE_IO |
| `echo`, `print` | HTML_ESCAPE |
| `mysqli_query`, `pg_query`, `query` | SHELL_ESCAPE |

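
The source, sanitizer, and sink tables can be read together as one capability model: a source attaches all capabilities, a sanitizer strips the one it neutralizes, and a sink reports when the capability it requires is still attached. Below is a minimal sketch in Rust, assuming capabilities are bit flags; the bit values and the helpers `sanitize` and `sink_fires` are hypothetical and not taken from the Nyx source.

```rust
// Hypothetical bit-flag encoding of the capability names used in the tables.
const HTML_ESCAPE: u8 = 1 << 0;
const SHELL_ESCAPE: u8 = 1 << 1;
const FILE_IO: u8 = 1 << 2;
/// What a source with cap `all` would attach under this encoding.
const ALL: u8 = HTML_ESCAPE | SHELL_ESCAPE | FILE_IO;

/// A sanitizer clears the capability it strips and leaves the rest intact.
fn sanitize(taint: u8, stripped: u8) -> u8 {
    taint & !stripped
}

/// A sink fires when the capability it requires is still present.
fn sink_fires(taint: u8, required: u8) -> bool {
    (taint & required) != 0
}
```

Under this reading, `htmlspecialchars($_GET['cmd'])` passed to `system` would still be reported, because stripping `HTML_ESCAPE` leaves `SHELL_ESCAPE` set, while the same value passed to `echo` would not.
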
---

## AST Pattern Rules

### Code Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.code_exec.eval` | High | A | `eval()`: dynamic code execution |
| `php.code_exec.create_function` | High | A | `create_function()`: deprecated eval-like constructor |
| `php.code_exec.preg_replace_e` | High | A | `preg_replace` with `/e` modifier: code execution via regex |
| `php.code_exec.assert_string` | High | A | `assert()` with string argument: evaluates PHP code |

### Command Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.cmdi.system` | High | A | `system`/`shell_exec`/`exec`/`passthru`/`proc_open`/`popen` |

### Deserialization

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.deser.unserialize` | High | A | `unserialize()`: PHP object injection |

### SQL Injection

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.sqli.query_concat` | Medium | B | `mysql_query`/`mysqli_query` with concatenated SQL |

### Path Traversal

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.path.include_variable` | High | B | `include`/`require` with variable path: file inclusion |

### Weak Crypto

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.crypto.md5` | Low | A | `md5()`: weak hash function |
| `php.crypto.sha1` | Low | A | `sha1()`: weak hash function |
| `php.crypto.rand` | Low | A | `rand()`/`mt_rand()`: not cryptographically secure |

---

## Examples

### `php.code_exec.eval`: Dynamic code execution

**Vulnerable:**
```php
eval($_GET['code']);
```

**Safe alternative:**
```php
// Never use eval with user input
// Use a template engine or allowlisted operations
```

### `php.deser.unserialize`: Object injection

**Vulnerable:**
```php
$obj = unserialize($_COOKIE['data']);
```

**Safe alternative:**
```php
$data = json_decode($_COOKIE['data'], true);
```

### `php.path.include_variable`: File inclusion

**Vulnerable:**
```php
include($_GET['page']);  // Local/remote file inclusion
```

**Safe alternative:**
```php
$allowed = ['home', 'about', 'contact'];
// Strict comparison (third argument) avoids loose type juggling in the allowlist check
$page = in_array($_GET['page'] ?? '', $allowed, true) ? $_GET['page'] : 'home';
include("pages/{$page}.php");
```

### `php.sqli.query_concat`: SQL concatenation

**Vulnerable:**
```php
mysqli_query($conn, "SELECT * FROM users WHERE id=" . $_GET['id']);
```

**Safe alternative:**
```php
$stmt = $conn->prepare("SELECT * FROM users WHERE id=?");
$stmt->bind_param("i", $_GET['id']);
$stmt->execute();
```

142
docs/rules/python.md
Normal file
@ -0,0 +1,142 @@
# Python Rules

Nyx detects Python vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, weak crypto, and template injection.

## Taint Labels

Python has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/python.rs`.

### Sources

| Matcher | Cap |
|---------|-----|
| `os.getenv`, `os.environ` | all |
| `request.args`, `request.form`, `request.json`, `request.headers`, `request.cookies`, `input` | all |
| `sys.argv` | all |
| `argparse.parse_args`, `urllib.request.urlopen`, `requests.get`, `requests.post` | all |

### Sanitizers

| Matcher | Cap |
|---------|-----|
| `html.escape` | HTML_ESCAPE |
| `shlex.quote` | SHELL_ESCAPE |
| `os.path.realpath` | FILE_IO |

### Sinks

| Matcher | Cap |
|---------|-----|
| `eval`, `exec` | SHELL_ESCAPE |
| `os.system`, `os.popen`, `subprocess.call`, `subprocess.run`, `subprocess.Popen`, `subprocess.check_output`, `subprocess.check_call` | SHELL_ESCAPE |
| `cursor.execute`, `cursor.executemany` | SHELL_ESCAPE |
| `send_file`, `send_from_directory` | FILE_IO |
| `open` | FILE_IO |

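
One way to picture the three label tables is as a single lookup from callee name to label. The sketch below, in Rust, copies a handful of rows from the tables above; the `Label` enum and `classify` function are hypothetical and are not the contents of `src/labels/python.rs`.

```rust
/// Hypothetical label type: sources carry every capability, while
/// sanitizers and sinks name the one capability they strip or require.
#[derive(Debug, PartialEq)]
enum Label {
    Source,
    Sanitizer(&'static str),
    Sink(&'static str),
}

/// Classify a callee name using a few rows copied from the tables above.
fn classify(callee: &str) -> Option<Label> {
    match callee {
        "os.getenv" | "request.args" | "sys.argv" | "input" => Some(Label::Source),
        "html.escape" => Some(Label::Sanitizer("HTML_ESCAPE")),
        "shlex.quote" => Some(Label::Sanitizer("SHELL_ESCAPE")),
        "os.path.realpath" => Some(Label::Sanitizer("FILE_IO")),
        "eval" | "exec" | "os.system" | "cursor.execute" => Some(Label::Sink("SHELL_ESCAPE")),
        "open" | "send_file" => Some(Label::Sink("FILE_IO")),
        _ => None, // unlabeled calls neither add nor remove taint in this sketch
    }
}
```

Read this way, `shlex.quote` neutralizes a flow into `os.system` because both rows name `SHELL_ESCAPE`, but it does nothing for a flow into `open`, which requires `FILE_IO`.
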
---

## AST Pattern Rules

### Code Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.code_exec.eval` | High | A | `eval()`: dynamic code execution |
| `py.code_exec.exec` | High | A | `exec()`: dynamic code execution |
| `py.code_exec.compile` | Medium | A | `compile()` with exec/eval mode |

### Command Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.cmdi.os_system` | High | A | `os.system()`: shell command execution |
| `py.cmdi.os_popen` | High | A | `os.popen()`: shell command execution |
| `py.cmdi.subprocess_shell` | High | B | `subprocess.*` with `shell=True` |

### Deserialization

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.deser.pickle_loads` | High | A | `pickle.loads()` / `pickle.load()`: arbitrary object deserialization |
| `py.deser.yaml_load` | High | A | `yaml.load()` without SafeLoader |
| `py.deser.shelve_open` | Medium | A | `shelve.open()`: pickle-backed deserialization |

### SQL Injection

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.sqli.execute_format` | Medium | B | `cursor.execute()` with string concatenation |

### Weak Crypto

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.crypto.md5` | Low | A | `hashlib.md5()`: weak hash algorithm |
| `py.crypto.sha1` | Low | A | `hashlib.sha1()`: weak hash algorithm |

### Template Injection

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.xss.jinja_from_string` | Medium | A | `jinja2.Template.from_string()`: template injection |

---

## Examples

### `py.deser.pickle_loads`: Unsafe deserialization

**Vulnerable:**
```python
import pickle
data = pickle.loads(request.body)  # Arbitrary code execution
```

**Safe alternative:**
```python
import json
data = json.loads(request.body)  # JSON decodes to plain data types only
```

### `py.cmdi.subprocess_shell`: Shell execution

**Vulnerable:**
```python
import subprocess
subprocess.call(user_input, shell=True)  # Command injection
```

**Safe alternative:**
```python
import subprocess
import shlex
# No shell is involved, but the user still chooses the program and its arguments
subprocess.call(shlex.split(user_input), shell=False)
# Or better: use an explicit command list
subprocess.call(["ls", "-la", "--", user_dir])
```

### `py.deser.yaml_load`: Unsafe YAML

**Vulnerable:**
```python
import yaml
config = yaml.load(user_data, Loader=yaml.Loader)  # Can instantiate arbitrary objects
```

**Safe alternative:**
```python
import yaml
config = yaml.safe_load(user_data)  # Only basic Python types
```

### `py.sqli.execute_format`: SQL concatenation

**Vulnerable:**
```python
cursor.execute("SELECT * FROM users WHERE id=" + user_id)
```

**Safe alternative:**
```python
# Placeholder style depends on the driver (sqlite3 uses ?, psycopg2 uses %s)
cursor.execute("SELECT * FROM users WHERE id=?", (user_id,))
```

132
docs/rules/ruby.md
Normal file
@ -0,0 +1,132 @@
# Ruby Rules

Nyx detects Ruby vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, reflection, SSRF, and weak crypto.

## Taint Labels

Ruby has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/ruby.rs`.

### Sources

| Matcher | Cap |
|---------|-----|
| `ENV`, `gets` | all |
| `params` | all |

> **Note:** Ruby's `params[:cmd]` subscript access is detected via `element_reference` node handling in the CFG. Sinatra/Rails `do...end` blocks are walked as function scopes.

### Sanitizers

| Matcher | Cap |
|---------|-----|
| `CGI.escapeHTML`, `ERB::Util.html_escape` | HTML_ESCAPE |
| `Shellwords.escape`, `Shellwords.shellescape` | SHELL_ESCAPE |

### Sinks

| Matcher | Cap |
|---------|-----|
| `system`, `exec` | SHELL_ESCAPE |
| `eval` | SHELL_ESCAPE |
| `puts`, `print` | HTML_ESCAPE |

---

## AST Pattern Rules

### Code Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.code_exec.eval` | High | A | `Kernel#eval`: dynamic code execution |
| `rb.code_exec.instance_eval` | High | A | `instance_eval`: evaluates string in object context |
| `rb.code_exec.class_eval` | High | A | `class_eval` / `module_eval`: evaluates string in class context |

### Command Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.cmdi.backtick` | High | A | Backtick shell execution (`` `cmd` ``) |
| `rb.cmdi.system_interp` | High | A | `system`/`exec` call: command execution risk |

### Deserialization

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.deser.yaml_load` | High | A | `YAML.load`: arbitrary object deserialization |
| `rb.deser.marshal_load` | High | A | `Marshal.load`: arbitrary Ruby object deserialization |

### Reflection

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.reflection.send_dynamic` | Medium | B | `send()` with non-symbol argument: arbitrary method dispatch |
| `rb.reflection.constantize` | Medium | A | `constantize` / `safe_constantize`: dynamic class resolution |

### SSRF

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.ssrf.open_uri` | Medium | A | `Kernel#open` with HTTP URL: SSRF via open-uri |

### Weak Crypto

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.crypto.md5` | Low | A | `Digest::MD5`: weak hash algorithm |

---

## Examples

### `rb.deser.yaml_load`: Unsafe YAML deserialization

**Vulnerable:**
```ruby
data = YAML.load(params[:config])  # Arbitrary object instantiation
```

**Safe alternative:**
```ruby
data = YAML.safe_load(params[:config])  # Only basic Ruby types
```

### `rb.cmdi.backtick`: Backtick shell execution

**Vulnerable:**
```ruby
output = `ls #{user_dir}`  # Command injection via interpolation
```

**Safe alternative:**
```ruby
require 'open3'
output, status = Open3.capture2('ls', user_dir)
```

### `rb.reflection.send_dynamic`: Dynamic method dispatch

**Vulnerable:**
```ruby
obj.send(params[:method], params[:arg])  # Arbitrary method invocation
```

**Safe alternative:**
```ruby
allowed = %w[name email phone]
if allowed.include?(params[:method])
  # public_send refuses private methods, unlike send
  obj.public_send(params[:method])
end
```

### `rb.deser.marshal_load`: Marshal deserialization

**Vulnerable:**
```ruby
obj = Marshal.load(request.body.read)
```

**Safe alternative:**
```ruby
data = JSON.parse(request.body.read)
```

105
docs/rules/rust.md
Normal file
@ -0,0 +1,105 @@
# Rust Rules

Nyx detects Rust vulnerabilities through AST patterns (memory safety, code quality) and taint analysis (command injection via `env::var` → `Command::new`).

## Taint Sources

| Function | Capability | Source Kind |
|----------|-----------|-------------|
| `std::env::var`, `env::var` | `all` | EnvironmentConfig |

## Taint Sinks

| Function | Required Capability |
|----------|-------------------|
| `Command::new`, `Command::arg`, `Command::args` | `SHELL_ESCAPE` |
| `Command::status`, `Command::output` | `SHELL_ESCAPE` |
| `fs::read_to_string`, `fs::write`, `fs::read`, `File::open`, `File::create` | `FILE_IO` |

## Taint Sanitizers

| Function | Strips Capability |
|----------|------------------|
| `html_escape::encode_safe`, `sanitize_html` | `HTML_ESCAPE` |
| `shell_escape::unix::escape`, `sanitize_shell` | `SHELL_ESCAPE` |

> **Note:** `fs::read_to_string` was moved from taint sources to sinks to support path traversal detection (`env::var` → `fs::read_to_string`).

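
For the `env::var` → `fs::read_to_string` flow mentioned in the note, one possible mitigation is to accept only relative paths made of plain components before joining them onto a fixed base directory. The sketch below assumes that approach; `safe_join` is a hypothetical helper, and since the sanitizer table above lists no `FILE_IO` sanitizer for Rust, Nyx may still report the flow.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: join `user` onto `base` only if it is a non-empty
/// relative path with no `..`, `.`, root, or prefix components.
fn safe_join(base: &Path, user: &str) -> Option<PathBuf> {
    let rel = Path::new(user);
    // Component::Normal covers ordinary file and directory names only.
    let plain = rel.components().all(|c| matches!(c, Component::Normal(_)));
    if user.is_empty() || !plain {
        return None;
    }
    Some(base.join(rel))
}
```

The check is purely lexical, so it does not follow symlinks; a stricter variant could canonicalize the joined path and confirm that it still starts with the base directory.
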
---

## AST Pattern Rules

### Memory Safety

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rs.memory.transmute` | High | A | `std::mem::transmute`: unchecked type reinterpretation |
| `rs.memory.copy_nonoverlapping` | High | A | `ptr::copy_nonoverlapping`: raw pointer memcpy |
| `rs.memory.get_unchecked` | High | A | `get_unchecked` / `get_unchecked_mut`: unchecked indexing |
| `rs.memory.mem_zeroed` | High | A | `std::mem::zeroed`: may be UB for non-POD types |
| `rs.memory.ptr_read` | High | A | `ptr::read` / `ptr::read_volatile`: raw pointer dereference |
| `rs.memory.narrow_cast` | Low | A | `as u8`/`i8`/`u16`/`i16`: possible truncation |
| `rs.memory.mem_forget` | Low | A | `std::mem::forget`: may leak resources |

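
The truncation that `rs.memory.narrow_cast` warns about can be shown with a small comparison between an `as` cast and a checked conversion. The function names below are hypothetical and exist only for this illustration.

```rust
use std::convert::TryFrom;

/// Checked conversion: returns `None` instead of silently truncating.
fn to_u8_checked(n: u32) -> Option<u8> {
    u8::try_from(n).ok()
}

/// What an `as` cast does: keeps only the low 8 bits of the value.
fn to_u8_truncating(n: u32) -> u8 {
    n as u8
}
```

For an input of 300, the `as` cast yields 44 (300 modulo 256) while the checked version yields `None`, so the caller has to decide what an out-of-range value means.
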
### Code Quality

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rs.quality.unsafe_block` | Medium | A | `unsafe { }` block: manual memory safety obligation |
| `rs.quality.unsafe_fn` | Medium | A | `unsafe fn` declaration |
| `rs.quality.unwrap` | Low | A | `.unwrap()`: panics on `None`/`Err` |
| `rs.quality.expect` | Low | A | `.expect()`: panics on `None`/`Err` |
| `rs.quality.panic_macro` | Low | A | `panic!()` macro invocation |
| `rs.quality.todo` | Low | A | `todo!()` / `unimplemented!()` placeholder |

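
For the `rs.quality.unwrap` and `rs.quality.expect` rows, a common alternative is to propagate the error with `?` or to fall back to a default. This is a minimal sketch; `parse_port`, `port_or_default`, and the default value 8080 are made up for the example.

```rust
use std::num::ParseIntError;

/// Propagate the parse error with `?` instead of panicking via `.unwrap()`.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let port = s.trim().parse::<u16>()?;
    Ok(port)
}

/// Fall back to a default when the input is not a valid port number.
fn port_or_default(s: &str) -> u16 {
    parse_port(s).unwrap_or(8080)
}
```

Neither function can panic on bad input: the first hands the error to its caller, and the second replaces it with a known value.
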
---

## Examples

### `rs.memory.transmute`: Unchecked type reinterpretation

**Vulnerable:**
```rust
let x: u32 = 42;
let y: f32 = unsafe { std::mem::transmute(x) };
```

**Safe alternative:**
```rust
let x: u32 = 42;
let y: f32 = f32::from_bits(x);
```

### `rs.quality.unsafe_block`: Unsafe block

**Flagged:**
```rust
unsafe {
    let ptr = &x as *const i32;
    println!("{}", *ptr);
}
```

**Safe alternative:**
```rust
// Use safe abstractions when possible
println!("{}", x);
```

### Taint: `env::var` → `Command::new`

**Vulnerable:**
```rust
let cmd = std::env::var("USER_CMD").unwrap();
Command::new("sh").arg("-c").arg(&cmd).output()?;
```

**Safe alternative:**
```rust
let cmd = std::env::var("USER_CMD").unwrap();
// Validate against allowlist
let allowed = ["ls", "whoami", "date"];
if allowed.contains(&cmd.as_str()) {
    Command::new(&cmd).output()?;
}
```

81
docs/rules/typescript.md
Normal file
@ -0,0 +1,81 @@
# TypeScript Rules

TypeScript rules mirror JavaScript patterns plus TypeScript-specific type-safety escape detectors. Taint labels are shared with JavaScript (see [JavaScript Rules](javascript.md)).

---

## AST Pattern Rules

### Code Execution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.code_exec.eval` | High | A | `eval()`: dynamic code execution |
| `ts.code_exec.new_function` | High | A | `new Function()`: eval equivalent |
| `ts.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |

### XSS Sinks

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
| `ts.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
| `ts.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
| `ts.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` |
| `ts.xss.cookie_write` | Low | A | Write to `document.cookie` |

### Prototype Pollution

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |

### Weak Crypto

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.crypto.math_random` | Low | A | `Math.random()`: not cryptographically secure |

### Code Quality (TypeScript-specific)

| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.quality.any_annotation` | Low | A | Type annotation of `any`: disables type checking |
| `ts.quality.as_any` | Low | A | Type assertion `as any`: type-safety escape hatch |

---

## Examples

### `ts.quality.any_annotation`: `any` type

**Flagged:**
```typescript
function process(data: any) {  // ts.quality.any_annotation
  data.whatever();  // No type checking
}
```

**Safe alternative:**
```typescript
interface UserData { name: string; email: string; }
function process(data: UserData) {
  console.log(data.name);
}
```

### `ts.quality.as_any`: Type assertion escape

**Flagged:**
```typescript
const result = someValue as any;  // ts.quality.as_any
result.nonexistentMethod();
```

**Safe alternative:**
```typescript
if (isValidType(someValue)) {
  const result = someValue as KnownType;
  result.knownMethod();
}
```