Feat/configurable sanitizers and js precision (#32)

* chore: Exclude CLAUDE.md from Cargo.toml

* feat: Add configurable analysis rules and CLI commands for custom sanitizers and terminators

* feat: Enhance resource management and analysis efficiency

- Implemented parallel summary merging in `scan_filesystem` using rayon for improved performance.
- Introduced `GlobalSummaries::merge()` for efficient merging of summaries.
- Optimized file reading and hashing to eliminate redundant I/O operations.
- Added `should_scan_with_hash()` and `upsert_file_with_hash()` methods to streamline file processing.
- Enhanced taint analysis with in-place mutations to reduce memory allocations.
- Updated resource acquisition patterns to exclude false positives for `freopen` and wrapper functions.

* feat: Implement severity downgrade for findings in non-production paths and add source kind inference

* feat: Update versioning information in SECURITY.md for new stable line

* feat: Update categories in Cargo.toml to include parser-implementations and text-processing

* feat: Update dependencies in Cargo.lock for improved compatibility and performance

* feat: Update dependencies in Cargo.lock and Cargo.toml for improved compatibility
Eli Peter 2026-02-25 04:02:11 -05:00 committed by GitHub
parent f96a89e7c1
commit 19b578c5c4
GPG key ID: B5690EEEBB952194
37 changed files with 3775 additions and 432 deletions


@@ -29,8 +29,10 @@
| Two-pass architecture | Pass 1 extracts function summaries; Pass 2 runs taint with full cross-file context |
| Incremental indexing | SQLite database stores file hashes, summaries, and findings to skip unchanged files |
| Parallel execution | File walking and analysis run concurrently via Rayon; scales with available CPU cores |
| Configurable analysis rules | Define custom sources, sanitizers, sinks, terminators, and event handlers per language via TOML config or CLI |
| Configurable scan parameters | Exclude directories, set maximum file size, tune worker threads, limit output, and more |
| Multiple output formats | Human-readable console view (default) and machine-readable JSON |
| Multiple output formats | Console (default), JSON, and SARIF 2.1.0 for CI integration |
| Progress reporting | Real-time progress bars for file discovery and analysis passes |
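
As a rough illustration of the incremental-indexing row above, the sketch below shows hash-based skipping. It is not Nyx's implementation: an in-memory `HashMap` stands in for the SQLite index, the standard library's `DefaultHasher` stands in for blake3, and `Index`, `content_hash`, `should_scan`, and `record` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory stand-in for the on-disk index:
/// maps a file path to the hash of its last scanned content.
struct Index {
    hashes: HashMap<String, u64>,
}

/// Hash file contents. DefaultHasher is used only to keep the sketch
/// dependency-free; it is NOT blake3.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

impl Index {
    fn new() -> Self {
        Index { hashes: HashMap::new() }
    }

    /// A file needs scanning if it is new or its content hash changed.
    fn should_scan(&self, path: &str, hash: u64) -> bool {
        self.hashes.get(path) != Some(&hash)
    }

    /// Store the hash after a successful scan so the next run can skip the file.
    fn record(&mut self, path: &str, hash: u64) {
        self.hashes.insert(path.to_string(), hash);
    }
}
```

A caller would read each file once, compute `content_hash` on those bytes, and pass the same value to both `should_scan` and `record`, so the file is neither read nor hashed twice.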
---
@@ -105,6 +107,9 @@ $ nyx scan
# Scan a specific path and emit JSON
$ nyx scan ./server --format json
# Emit SARIF 2.1.0 for CI integration (GitHub Code Scanning, etc.)
$ nyx scan --format sarif > results.sarif
# Perform an ad-hoc scan without touching the index
$ nyx scan --no-index
@@ -116,6 +121,10 @@ $ nyx scan --ast-only
# CFG + taint analysis only (skip AST pattern rules)
$ nyx scan --cfg-only
# Include test/vendor/benchmark paths at original severity
# (by default these are downgraded one tier)
$ nyx scan --include-nonprod
```
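
The one-tier downgrade described for `--include-nonprod` can be pictured with the sketch below. Everything in it is an assumption made for illustration: the three-level `Severity` enum, the list of directory names treated as non-production, and the function names are hypothetical and may not match how Nyx classifies paths or ranks findings.

```rust
use std::path::Path;

/// Hypothetical severity tiers, ordered from lowest to highest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Low,
    Medium,
    High,
}

impl Severity {
    /// Move one tier down; the lowest tier stays where it is.
    fn downgraded(self) -> Severity {
        match self {
            Severity::High => Severity::Medium,
            Severity::Medium | Severity::Low => Severity::Low,
        }
    }
}

/// Assumed heuristic: a path is non-production if any component is
/// named like a test, vendor, or benchmark directory.
fn is_nonprod_path(path: &Path) -> bool {
    const MARKERS: [&str; 5] = ["test", "tests", "vendor", "bench", "benches"];
    path.components().any(|component| {
        component
            .as_os_str()
            .to_str()
            .map_or(false, |name| MARKERS.contains(&name))
    })
}

/// Downgrade findings in non-production paths unless the caller opted
/// in to original severities (the role of `--include-nonprod`).
fn effective_severity(original: Severity, path: &Path, include_nonprod: bool) -> Severity {
    if !include_nonprod && is_nonprod_path(path) {
        original.downgraded()
    } else {
        original
    }
}
```

Under these assumptions, a `High` finding in `vendor/lib/a.js` is reported as `Medium` by default and stays `High` when `include_nonprod` is true.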
### Index Management
@@ -135,6 +144,22 @@ $ nyx clean <PROJECT_NAME>
$ nyx clean --all
```
### Configuration Management
```bash
# Print the effective merged configuration
$ nyx config show
# Print the config directory path
$ nyx config path
# Add a custom sanitizer rule (written to nyx.local)
$ nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape
# Add a terminator function
$ nyx config add-terminator --lang javascript --name process.exit
```
---
## Analysis Modes
@@ -173,11 +198,11 @@ All 10 languages have full AST pattern matching and CFG/taint analysis. Resource
| C++ | Yes | Yes | Yes |
| Java | Yes | Yes | Yes |
| Go | Yes | Yes | Yes |
| PHP | Yes | Yes | |
| Python | Yes | Yes | |
| Ruby | Yes | Yes | |
| TypeScript | Yes | Yes | |
| JavaScript | Yes | Yes | |
| PHP | Yes | Yes | Yes |
| Python | Yes | Yes | Yes |
| Ruby | Yes | Yes | Yes |
| TypeScript | Yes | Yes | Yes |
| JavaScript | Yes | Yes | Yes |
---
@@ -203,6 +228,7 @@ excluded_extensions = ["mp3", "mp4"]
[output]
default_format = "json"
max_results = 200
quiet = true # suppress status messages
[performance]
worker_threads = 8 # 0 = auto-detect
@@ -210,6 +236,29 @@ batch_size = 200
channel_multiplier = 2
```
### Custom Analysis Rules
You can define custom sources, sanitizers, sinks, terminators, and event handlers per language. These take priority over built-in rules, letting you teach Nyx about project-specific functions.
```toml
[analysis.languages.javascript]
terminators = ["process.exit"]
event_handlers = ["addEventListener"]
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer" # "source" | "sanitizer" | "sink"
cap = "html_escape" # "env_var" | "html_escape" | "shell_escape" |
# "url_encode" | "json_parse" | "file_io" | "all"
[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetHTML"]
kind = "sink"
cap = "html_escape"
```
Rules can also be added interactively via `nyx config add-rule` and `nyx config add-terminator`.
A fully documented `nyx.conf` is generated automatically on first run.
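
One way to picture "custom rules take priority over built-in rules" is a lookup that consults the custom list first and falls back to the built-ins. The sketch below assumes exact-name matching and uses hypothetical `Rule`, `Kind`, and `classify` names; Nyx's real matcher semantics and data structures may differ.

```rust
/// Mirrors the `kind` values shown in the TOML example above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Source,
    Sanitizer,
    Sink,
}

/// Hypothetical in-memory form of one rule entry.
#[derive(Debug, PartialEq, Eq)]
struct Rule {
    matcher: &'static str,
    kind: Kind,
    cap: &'static str,
}

/// Return the first rule whose matcher equals the callee name.
/// Custom rules are chained ahead of built-ins, so a custom entry
/// shadows a built-in entry with the same matcher.
fn classify<'a>(callee: &str, custom: &'a [Rule], builtin: &'a [Rule]) -> Option<&'a Rule> {
    custom
        .iter()
        .chain(builtin.iter())
        .find(|rule| rule.matcher == callee)
}
```

With this ordering, a project that lists a function as a `sanitizer` overrides a built-in entry that treats the same name as a `sink`, and names absent from both lists yield `None`.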
---
@@ -258,10 +307,10 @@ With indexing enabled, Pass 1 skips files whose blake3 content hash is unchanged
| Area | Details |
|---|---|
| Output formats | SARIF 2.1.0, JUnit XML, HTML report generator |
| Language coverage | Expanded taint rules per language, resource leak pairs for Python/Ruby/PHP/JS/TS |
| Output formats | JUnit XML, HTML report generator |
| Language coverage | Expanded taint rules per language |
| Rule updates | Remote rule feed with signature verification |
| UX | Progress bar, smart file-watch re-scan |
| UX | Smart file-watch re-scan |
Community feedback shapes priorities -- please [open an issue](https://github.com/ecpeter23/nyx/issues) to discuss proposed changes.