nyx/README.md
Eli Peter f96a89e7c1
2026-02-24 23:44:07 -05:00

<div align="center">
<img src="assets/logo.png" alt="nyx logo" width="300"/>
**Fast, cross-language CLI vulnerability scanner.**
[![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Rust 1.85+](https://img.shields.io/badge/rust-1.85%2B-orange)](https://www.rust-lang.org)
[![CI](https://img.shields.io/github/actions/workflow/status/ecpeter23/nyx/ci.yml?branch=master)](https://github.com/ecpeter23/nyx/actions)
</div>
---
## What is Nyx?
**Nyx** is a lightweight, lightning-fast, Rust-native command-line tool that detects security vulnerabilities across 10 programming languages. It combines [`tree-sitter`](https://tree-sitter.github.io/) parsing, intra-procedural control-flow graphs, and cross-file taint analysis with an optional SQLite-backed index to deliver deep, repeatable scans on projects of any size.
---
## Key Capabilities
| Capability | Description |
|---|---|
| Multi-language support | Rust, C, C++, Java, Go, PHP, Python, Ruby, TypeScript, JavaScript |
| AST-level pattern matching | Language-specific queries written against precise parse trees |
| Control-flow graph analysis | Auth gaps, unguarded sinks, unreachable security code, resource leaks, error fallthrough |
| Cross-file taint tracking | BFS taint propagation from sources through sanitizers to sinks with function summaries |
| Cross-language interop | Taint flows across language boundaries via explicit interop edges |
| Two-pass architecture | Pass 1 extracts function summaries; Pass 2 runs taint with full cross-file context |
| Incremental indexing | SQLite database stores file hashes, summaries, and findings to skip unchanged files |
| Parallel execution | File walking and analysis run concurrently via Rayon; scales with available CPU cores |
| Configurable scan parameters | Exclude directories, set maximum file size, tune worker threads, limit output, and more |
| Multiple output formats | Human-readable console view (default) and machine-readable JSON |
---
## Why choose Nyx?
| Advantage | What it means for you |
|---|---|
| **Pure-Rust, single binary** | No JVM, Python, or server to install; drop the `nyx` executable into a directory on your `$PATH` and go. |
| **Massively parallel** | Uses Rayon and a thread-pool walker; scales to all CPU cores. Scanning the entire **rust-lang/rust** codebase (~53,000 files) on an M2 MacBook Pro takes **~1 s**. |
| **Deep analysis** | Real CFG construction and taint propagation, not just regex matching. Cross-file function summaries, capability-based sanitizer tracking, and scored findings. |
| **Index-aware** | An optional SQLite index stores file hashes and findings; subsequent scans touch *only* changed files, slashing CI times. |
| **Offline & privacy-friendly** | Requires no login, cloud account, or telemetry. Perfect for air-gapped environments and strict compliance policies. |
| **Tree-sitter precision** | Parses real language grammars, not regexes, giving far fewer false positives than line-based scanners. |
| **Extensible** | Add new patterns with concise `tree-sitter` queries; no SaaS lock-in. |
---
## Installation
### Install from crates.io
```bash
$ cargo install nyx-scanner
```
### Install from a GitHub release
1. Navigate to the [Releases](https://github.com/ecpeter23/nyx/releases) page of the repository.
2. Download the appropriate binary for your system:
   - `nyx-x86_64-unknown-linux-gnu.zip` for Linux
   - `nyx-x86_64-pc-windows-msvc.zip` for Windows
   - `nyx-x86_64-apple-darwin.zip` (Intel) or `nyx-aarch64-apple-darwin.zip` (Apple Silicon) for macOS
3. Unzip the file and move the executable to a directory in your system PATH:
```bash
# Example for Unix systems
unzip nyx-x86_64-unknown-linux-gnu.zip
chmod +x nyx
sudo mv nyx /usr/local/bin/
```
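```powershell
# Example for Windows in PowerShell (an elevated prompt may be required for Program Files)
Expand-Archive -Path nyx-x86_64-pc-windows-msvc.zip -DestinationPath .
# Create the target directory first; Move-Item does not create it
New-Item -ItemType Directory -Force -Path "C:\Program Files\Nyx" | Out-Null
Move-Item -Path .\nyx.exe -Destination "C:\Program Files\Nyx\" # Add this directory to PATH manually if needed
```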
4. Verify the installation:
```bash
nyx --version
```
### Build from source
```bash
$ git clone https://github.com/ecpeter23/nyx.git
$ cd nyx
$ cargo build --release
# Optional: build and install the binary into Cargo's bin directory (usually ~/.cargo/bin)
$ cargo install --path .
```
Nyx targets **stable Rust 1.85 or later**.
---
## Quick Start
```bash
# Scan the current directory (creates/uses an index automatically)
$ nyx scan
# Scan a specific path and emit JSON
$ nyx scan ./server --format json
# Perform an ad-hoc scan without touching the index
$ nyx scan --no-index
# Restrict results to high-severity findings
$ nyx scan --high-only
# AST pattern matching only (fastest, no CFG/taint)
$ nyx scan --ast-only
# CFG + taint analysis only (skip AST pattern rules)
$ nyx scan --cfg-only
```
### Index Management
```bash
# Create or rebuild an index
$ nyx index build [PATH] [--force]
# Display index metadata (size, modified date, etc.)
$ nyx index status [PATH]
# List all indexed projects (add -v for detailed view)
$ nyx list [-v]
# Remove a single project or purge all indexes
$ nyx clean <PROJECT_NAME>
$ nyx clean --all
```
---
## Analysis Modes
Nyx supports three analysis modes, selectable via the `scanner.mode` config option or CLI flags:
| Mode | CLI flag | What runs |
|---|---|---|
| **Full** (default) | — | AST pattern matching + CFG construction + taint analysis |
| **AST-only** | `--ast-only` | AST pattern matching only; skips CFG and taint entirely |
| **Taint-only** | `--cfg-only` | CFG + taint analysis only; filters out AST pattern findings |
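As a rough illustration of the table above, the three modes can be modeled as an enum that decides which kinds of findings are reported. This is a minimal sketch, not Nyx's actual code: the `AnalysisMode` type and its methods are hypothetical, and only the config values `full`, `ast`, and `taint` are taken from the configuration example in this README.

```rust
/// Hypothetical model of the three analysis modes (not Nyx's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnalysisMode {
    Full,
    Ast,
    Taint,
}

impl AnalysisMode {
    /// Parse the `scanner.mode` config value; unknown values yield `None`.
    fn from_config(value: &str) -> Option<Self> {
        match value {
            "full" => Some(Self::Full),
            "ast" => Some(Self::Ast),
            "taint" => Some(Self::Taint),
            _ => None,
        }
    }

    /// AST pattern findings are reported in Full and AST-only modes.
    fn reports_ast_findings(self) -> bool {
        matches!(self, Self::Full | Self::Ast)
    }

    /// CFG construction and taint analysis run in Full and Taint-only modes.
    fn runs_cfg_and_taint(self) -> bool {
        matches!(self, Self::Full | Self::Taint)
    }
}
```

Under this model, `--ast-only` would correspond to `AnalysisMode::Ast` and `--cfg-only` to `AnalysisMode::Taint`.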
### What the CFG + taint engine detects
| Finding | Rule ID | Description |
|---|---|---|
| Tainted data flow | `taint-*` | Untrusted data (env vars, user input, file reads) flowing to dangerous sinks (shell exec, SQL, file write) without matching sanitization |
| Unguarded sink | `cfg-unguarded-sink` | Sink calls not dominated by a guard or sanitizer on the control-flow path |
| Auth gap | `cfg-auth-gap` | Web handler functions that reach privileged sinks without an auth check |
| Unreachable security code | `cfg-unreachable-*` | Sanitizers, guards, or sinks in dead code branches |
| Error fallthrough | `cfg-error-fallthrough` | Error-handling branches that don't terminate, allowing execution to fall through to dangerous operations |
| Resource leak | `cfg-resource-leak` | Resources acquired but not released on all exit paths (malloc/free, fopen/fclose, Lock/Unlock) |
Findings are scored and ranked by severity, proximity to entry point, path complexity, and taint confirmation.
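To make the "tainted data flow" row above more concrete, here is a minimal sketch of breadth-first taint propagation over a small graph. It is an illustration under simplifying assumptions, not Nyx's engine: `NodeKind` and `tainted_sinks` are hypothetical names, every sanitizer is treated as fully cleansing, and capability matching between sanitizers and sinks is ignored.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical node classification for the sketch.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum NodeKind {
    Source,
    Sanitizer,
    Sink,
    Plain,
}

/// Returns the indices of sink nodes reachable from any source
/// without passing through a sanitizer. `edges[n]` lists the successors of node `n`.
fn tainted_sinks(kinds: &[NodeKind], edges: &[Vec<usize>]) -> Vec<usize> {
    let mut queue: VecDeque<usize> = VecDeque::new();
    let mut seen: HashSet<usize> = HashSet::new();

    // Seed the BFS with every source node.
    for (i, kind) in kinds.iter().enumerate() {
        if *kind == NodeKind::Source {
            seen.insert(i);
            queue.push_back(i);
        }
    }

    let mut hits = Vec::new();
    while let Some(node) = queue.pop_front() {
        match kinds[node] {
            // Taint stops here: do not propagate past a sanitizer.
            NodeKind::Sanitizer => continue,
            NodeKind::Sink => hits.push(node),
            NodeKind::Source | NodeKind::Plain => {}
        }
        for &next in &edges[node] {
            // `insert` returns true only the first time a node is seen.
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    hits.sort_unstable();
    hits
}
```

In a graph where a source reaches one sink directly and another only through a sanitizer, this sketch reports just the first sink.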
---
## Supported Languages
All 10 languages have full AST pattern matching and CFG/taint analysis. Resource leak detection is available where language-specific acquire/release pairs are defined.
| Language | AST Patterns | CFG + Taint | Resource Leaks |
|---|---|---|---|
| Rust | Yes | Yes | Yes |
| C | Yes | Yes | Yes |
| C++ | Yes | Yes | Yes |
| Java | Yes | Yes | Yes |
| Go | Yes | Yes | Yes |
| PHP | Yes | Yes | — |
| Python | Yes | Yes | — |
| Ruby | Yes | Yes | — |
| TypeScript | Yes | Yes | — |
| JavaScript | Yes | Yes | — |
---
## Configuration Overview
Nyx merges a default configuration file (`nyx.conf`) with user overrides (`nyx.local`). Both live in the platform-specific configuration directory shown below.
| Platform | Directory |
|---|---|
| Linux | `~/.config/nyx/` |
| macOS | `~/Library/Application Support/dev.ecpeter23.nyx/` |
| Windows | `%APPDATA%\ecpeter23\nyx\config\` |
Minimal example (`nyx.local`):
```toml
[scanner]
mode = "full" # full | ast | taint
min_severity = "Medium"
follow_symlinks = true
excluded_extensions = ["mp3", "mp4"]
[output]
default_format = "json"
max_results = 200
[performance]
worker_threads = 8 # 0 = auto-detect
batch_size = 200
channel_multiplier = 2
```
A fully documented `nyx.conf` is generated automatically on first run.
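The merge of `nyx.conf` defaults with `nyx.local` overrides can be pictured as a field-by-field fallback. The sketch below assumes that behavior; it is not taken from Nyx's source. The struct names and the `merge_config` function are hypothetical, the field names mirror the `[scanner]` keys in the example above, and any default values passed in are placeholders rather than Nyx's real defaults.

```rust
/// Hypothetical, fully resolved scanner settings.
#[derive(Debug, Clone, PartialEq)]
struct ScannerConfig {
    mode: String,
    min_severity: String,
    follow_symlinks: bool,
}

/// Hypothetical user overrides: `None` means "keep the default".
#[derive(Debug, Default)]
struct ScannerOverrides {
    mode: Option<String>,
    min_severity: Option<String>,
    follow_symlinks: Option<bool>,
}

/// Apply each override if present, otherwise fall back to the default value.
fn merge_config(defaults: ScannerConfig, user: ScannerOverrides) -> ScannerConfig {
    ScannerConfig {
        mode: user.mode.unwrap_or(defaults.mode),
        min_severity: user.min_severity.unwrap_or(defaults.min_severity),
        follow_symlinks: user.follow_symlinks.unwrap_or(defaults.follow_symlinks),
    }
}
```

Modeling overrides as `Option` fields keeps "not set by the user" distinct from "explicitly set to the default value".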
---
## Architecture in Brief
Nyx uses a **two-pass architecture** to enable cross-file analysis without sacrificing parallelism:
1. **File enumeration** -- A parallel walker (Rayon + `ignore` crate) applies gitignore rules, size limits, and user exclusions.
2. **Pass 1 -- Summary extraction** -- Each file is parsed via tree-sitter, an intra-procedural CFG is built (petgraph), and a `FuncSummary` is exported per function capturing source/sanitizer/sink capabilities (bitflags), taint propagation behavior, and callee lists. Summaries are persisted to SQLite.
3. **Summary merge** -- All per-file summaries are merged into a `GlobalSummaries` map with conservative conflict resolution (union caps, OR booleans).
4. **Pass 2 -- Analysis** -- Files are re-parsed and analyzed with the full cross-file context: BFS taint propagation resolves callees against local and global summaries, CFG analysis checks for auth gaps, unguarded sinks, resource leaks, and more.
5. **Reporting** -- Findings are scored, ranked, deduplicated, and emitted to the console or serialized as JSON.
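Step 3 above (summary merge with "union caps, OR booleans") can be sketched roughly as follows. `FuncSummary` and `GlobalSummaries` are names from this README, but the fields, the capability constants, and `merge_summaries` are assumptions made for illustration; the real types are likely richer, and keying by bare function name is a simplification.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Illustrative capability bits; names and values are assumptions.
const CAP_SOURCE: u8 = 0b001;
const CAP_SANITIZER: u8 = 0b010;
const CAP_SINK: u8 = 0b100;

/// Hypothetical, simplified per-function summary.
#[derive(Debug, Clone, PartialEq)]
struct FuncSummary {
    caps: u8,
    propagates_taint: bool,
    callees: Vec<String>,
}

/// Function name -> merged summary (simplified key for this sketch).
type GlobalSummaries = HashMap<String, FuncSummary>;

/// Merge per-file summaries conservatively: union capability bits,
/// OR the boolean flags, and union the callee lists.
fn merge_summaries(per_file: Vec<Vec<(String, FuncSummary)>>) -> GlobalSummaries {
    let mut global = GlobalSummaries::new();
    for file in per_file {
        for (name, summary) in file {
            match global.entry(name) {
                Entry::Occupied(mut slot) => {
                    let existing = slot.get_mut();
                    existing.caps |= summary.caps;
                    existing.propagates_taint |= summary.propagates_taint;
                    for callee in summary.callees {
                        if !existing.callees.contains(&callee) {
                            existing.callees.push(callee);
                        }
                    }
                }
                Entry::Vacant(slot) => {
                    slot.insert(summary);
                }
            }
        }
    }
    global
}
```

The conservative choice means a name collision can only add capabilities, never remove them, which errs toward reporting rather than missing a flow.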
With indexing enabled, Pass 1 skips files whose blake3 content hash is unchanged, and cached findings are served directly for AST-only results.
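The skip-if-unchanged check can be illustrated with a small in-memory index. This is a sketch only: Nyx stores hashes in SQLite and uses blake3, whereas the code below swaps in the standard library's `DefaultHasher` and a `HashMap` so that it stays dependency-free. The `Index` type and `needs_scan` method are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Stand-in content hash (std's DefaultHasher, NOT blake3).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Hypothetical in-memory replacement for the SQLite-backed index.
struct Index {
    hashes: HashMap<String, u64>,
}

impl Index {
    fn new() -> Self {
        Index { hashes: HashMap::new() }
    }

    /// Records the current hash and returns true when the file is new
    /// or its contents changed since the last recorded scan.
    fn needs_scan(&mut self, path: &str, contents: &[u8]) -> bool {
        let new_hash = content_hash(contents);
        match self.hashes.insert(path.to_string(), new_hash) {
            Some(old_hash) => old_hash != new_hash,
            None => true,
        }
    }
}
```

A first call for a path returns `true`, a repeat call with identical contents returns `false`, and any edit to the contents makes it return `true` again.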
---
## Roadmap
### Phase 1 -- Deep Static Engine
| Feature | Description |
|---|---|
| Interprocedural call graph | Precise symbol resolution via `FuncKey`, language-scoped namespaces, cross-module linking. No name-collision merging -- full call graph with topological analysis. |
| Path-sensitive analysis | Track path predicates and conditional constraints. Detect infeasible paths and validation-only-in-one-branch patterns. Dramatically reduces false positives. |
| Dataflow & state modeling | Resource state machines (init -> use -> close), auth state transitions, privilege level tracking. Semantic analysis beyond pattern matching. |
| Attack surface ranking | Score entry points by distance-to-sink, guard strength, path complexity, and privilege escalation potential. Deterministic attack surface scoring. |
### Phase 2 -- Dynamic Capability
| Feature | Description |
|---|---|
| Controlled dynamic execution | Local sandbox: identify entry points, spin up test harnesses, inject payloads, detect runtime crashes and command execution. Deterministic automated exploit validation -- static finds `exec(user_input)`, dynamic confirms it with `; id`. |
| Fuzzing integration | libFuzzer (C/C++), cargo-fuzz (Rust), go-fuzz, HTTP fuzzing harness. Static engine identifies interesting functions, fuzzer targets only those. |
### Phase 3 -- Intelligent Reasoning Layer
| Feature | Description |
|---|---|
| Semantic similarity | Embeddings for finding similar vulnerability patterns across codebases. |
| LLM reasoning | AI-assisted detection of non-obvious logic bugs. |
| Exploit refinement | Automated loops to refine and validate exploit chains. |
### Other planned improvements
| Area | Details |
|---|---|
| Output formats | SARIF 2.1.0, JUnit XML, HTML report generator |
| Language coverage | Expanded taint rules per language, resource leak pairs for Python/Ruby/PHP/JS/TS |
| Rule updates | Remote rule feed with signature verification |
| UX | Progress bar, smart file-watch re-scan |
Community feedback shapes priorities -- please [open an issue](https://github.com/ecpeter23/nyx/issues) to discuss proposed changes.
---
## Contributing
Pull requests are welcome. To contribute:
1. Fork the repository and create a feature branch.
2. Adhere to `rustfmt` and ensure `cargo clippy --all -- -D warnings` passes.
3. Add unit and/or integration tests where applicable (`cargo test` should remain green).
4. Submit a concise, well-documented pull request.
Please open an issue for any crash, panic, or suspicious result -- attach the minimal code snippet and mention the Nyx version.
See `CONTRIBUTING.md` for full guidelines.
---
## License
Nyx is licensed under the **GNU General Public License v3.0 (GPL-3.0)**.
This ensures that all distributed modified versions of the scanner remain free and open-source, protecting the integrity and transparency of security tools.
See [LICENSE](./LICENSE) for full details.