Compare commits

38 commits

Author SHA1 Message Date
Eli Peter
c9776a5caf
Introduce repro cli subcommand
Some checks failed
CI / docs-fresh (push) Has been cancelled
CI / rustdoc (push) Has been cancelled
CI / rust-beta-build (push) Has been cancelled
CI / msrv (push) Has been cancelled
CI / rust-stable-test / linux-without-docker (push) Has been cancelled
CI / rust-stable-test / linux-with-docker (push) Has been cancelled
CI / escape-positive-control (push) Has been cancelled
CI / cross-platform-smoke (push) Has been cancelled
CI / cross-platform-smoke-1 (push) Has been cancelled
CI / rust-beta-test (push) Has been cancelled
CI / cargo-package (push) Has been cancelled
CI / benchmark-gate (push) Has been cancelled
CI / corpus-marker-audit (push) Has been cancelled
CodeQL Advanced / Analyze (actions) (push) Has been cancelled
CodeQL Advanced / Analyze (javascript-typescript) (push) Has been cancelled
CodeQL Advanced / Analyze (rust) (push) Has been cancelled
docs / build-deploy (push) Has been cancelled
dynamic / dynamic / linux-process-only (push) Has been cancelled
dynamic / dynamic / linux-with-docker (push) Has been cancelled
dynamic / dynamic / macos (push) Has been cancelled
eval / eval / owasp-benchmark-v1.2 (push) Has been cancelled
eval / eval / juiceshop (push) Has been cancelled
eval / eval / nodegoat (push) Has been cancelled
eval / eval / dvpwa (push) Has been cancelled
eval / eval / dvwa (push) Has been cancelled
eval / eval / gosec (push) Has been cancelled
eval / eval / railsgoat (push) Has been cancelled
eval / eval / rustsec (push) Has been cancelled
repro-bare / repro-bare / tests/repro_fixtures/python-3.11/repro (push) Has been cancelled
OSSF Scorecard / scorecard (push) Has been cancelled
2026-06-05 13:34:07 -05:00
elipeter
a2d1a1583f updated CHANGELOG.md 2026-06-05 13:13:42 -05:00
elipeter
8a7d2b8010 added repro subcommand 2026-06-05 13:10:58 -05:00
Eli Peter
c1fa6a87cf
ui-fixes 2026-06-05 12:39:39 -05:00
elipeter
f52b3bed1e changed sizes 2026-06-05 12:39:13 -05:00
elipeter
214bf91b63 bumped dep 2026-06-05 12:27:16 -05:00
elipeter
49fa174607 added svg for confirmed verdict badge 2026-06-05 12:04:09 -05:00
elipeter
291fe5d7be updated CHANGELOG.md 2026-06-05 11:36:52 -05:00
Eli Peter
25863d222a
Merge pull request #86 from nyx-sec/triage-works-in-cli
fix(cli): apply repository triage file during scans
2026-06-05 10:59:40 -05:00
elipeter
d09a97008e updated CHANGELOG.md 2026-06-05 10:53:09 -05:00
elipeter
1148e65f36 fix(cli): apply repository triage file during scans 2026-06-05 10:50:25 -05:00
Eli Peter
991c84a1eb
Dynamic (#77) 2026-06-05 10:16:30 -05:00
Eli Peter
55247b7fcd
Critical bug fixes and recall improvements (#68) 2026-05-11 12:42:39 -04:00
Eli Peter
7d0e7320e2
new capacity bits (#67) 2026-05-07 01:29:31 -04:00
elipeter
afaffc0df6 updated third party licenses 2026-05-06 05:03:00 -04:00
elipeter
c6f4c3e1cf chore: Update CHANGELOG with recent UI refresh, layout improvements, and screenshot enhancements 2026-05-06 05:01:43 -04:00
elipeter
6c607634da style: Improve code formatting for better readability in CSS and JSX files 2026-05-06 04:49:13 -04:00
elipeter
b51ae4f89d feat: Increase screenshot resolution to 1600x992 for improved quality 2026-05-06 04:45:50 -04:00
elipeter
77be7f10d9 refactor: Update UI components for consistency and improve layout 2026-05-06 04:38:04 -04:00
elipeter
da619171cf chore: Update package versions in Cargo.lock and package.json 2026-05-05 19:53:40 -04:00
elipeter
e8f1c64dc9 feat: Add asset mirroring for nyxscan.dev landing site and update favicon 2026-05-05 19:21:11 -04:00
elipeter
e830fd0a7e fix: Correct image paths in documentation for consistency 2026-05-05 19:08:51 -04:00
elipeter
c6baa4d5dc feat: Update brand color to mint-cyan across screenshots and UI elements 2026-05-05 19:02:47 -04:00
elipeter
bbf6f91c56 feat: Enhance CLI screenshot capture with raw file saving and GIF generation 2026-05-05 18:17:53 -04:00
Eli Peter
fb698d2c27
Performance and precision pass (#64) 2026-05-04 19:58:04 -04:00
Eli Peter
c7c5e0f3a1
Precision pass on auth and resource analysis (#63) 2026-05-03 13:51:46 -04:00
elipeter
064801a3a4 feat: Simplify inner-call release detection logic in resource filtering 2026-05-02 21:49:01 -04:00
elipeter
ebe4a15a72 feat: Enhance resource leak detection by recognizing inner-call release patterns and err-companion guards 2026-05-02 21:47:03 -04:00
elipeter
48bc43e1a6 feat: Add SSA summaries support for validated parameter propagation and enhance loop body error handling 2026-05-02 21:02:47 -04:00
elipeter
92aaa36ed6 chore: Update version placeholders and changelog for release 0.6.0 2026-05-02 18:06:50 -04:00
elipeter
215dd02eff docs: Update CVE list in README to include recent vulnerabilities and their details 2026-05-02 17:51:42 -04:00
Eli Peter
1f2bfe76c1
docs: Enhance module documentation across various files for clarity a… (#62)
* docs: Enhance module documentation across various files for clarity and completeness

* fix: Remove unnecessary blank line in build.rs for cleaner code

* docs: Update documentation to improve clarity and consistency in code comments
2026-05-02 17:46:45 -04:00
Eli Peter
40995e45e7
Authorization analysis logic improvements (#61) 2026-05-02 16:44:49 -04:00
Eli Peter
3c89bddbf2
Improved path traversal detection and enhanced sink classification logic 2026-05-02 03:36:14 -04:00
Eli Peter
58f1794a4e
Added Cap::DATA_EXFIL and taint fp and fn fixes on real repos (#59)
* feat: Enhance data exfiltration detection with source sensitivity gating for cookies and headers

* feat: Implement cross-file data exfiltration detection with parameter-specific gate filters

* feat: Add calibration tests and refine DATA_EXFIL severity scoring logic

* feat: Introduce per-detector configuration for data exfiltration suppression

* feat: Enhance DATA_EXFIL findings with destination field tracking in diagnostics and SARIF output

* feat: Add tainted body and URL handling for data exfiltration detection

* feat: Add integration tests and fixtures for DATA_EXFIL and SSRF detection in Go

* feat: Add Java integration tests and fixtures for DATA_EXFIL detection across multiple HTTP clients

* feat: Add synthetic externals handling for closure-captured variables in SSA

* feat: Implement closure-based suppression for resource leak findings

* feat: Add regression guards for shell-injection and taint propagation in for-of destructure patterns

* feat: Implement constructor cap narrowing for data exfiltration detection in HTTP request builders

* feat: Add gated sinks for data exfiltration detection in C and C++ using curl_easy_setopt

* feat: Implement DATA_EXFIL cap parity for backwards analysis and add integration tests

* feat: Add data exfiltration sinks for various languages and enhance documentation

* refactor: Simplify formatting and improve readability in various files

* refactor: Improve readability by simplifying conditional statements and adding clippy linting

* docs: Update CHANGELOG and comments for data exfiltration features and configuration

* docs: Clarify configuration instructions for data exfiltration trusted destinations

* docs: Enhance comments for evidence routing logic in data exfiltration
2026-05-01 10:59:52 -04:00
Eli Peter
a438886217
Python fp and docs updtes (#58)
* refactor: Update comments for clarity and add expectations.json files for performance metrics

* feat: Implement FP guard for JS/TS local-collection receivers to suppress missing ownership checks

* feat: Enhance Rust parameter handling to classify local collections and prevent false ownership checks

* refactor: Simplify code formatting for better readability in multiple files

* refactor: Improve UTF-8 sequence length handling and enhance clarity in loop iteration

* feat: Update Java and Python patterns to include new security rules

* refactor: Improve comment clarity and consistency across multiple Rust files

* refactor: Simplify code formatting for improved readability in integration tests and module files

* refactor: Improve comment formatting and enhance clarity in assertions across multiple files
2026-04-29 19:53:34 -04:00
elipeter
4db0805de6 ci: Enhance release workflow to support manual tag input and ensure consistent artifact naming 2026-04-29 11:59:50 -04:00
elipeter
65add619a0 ci: Update cosign signing commands to use bundle output format 2026-04-29 11:53:55 -04:00
2420 changed files with 342682 additions and 8865 deletions

19
.config/nextest.toml Normal file

@@ -0,0 +1,19 @@
# nextest configuration
#
# See https://nexte.st/docs/configuration/ for the full schema.
# ── Test groups ──────────────────────────────────────────────────────────────
#
# `hostile-input-timing` serialises the two timing-bounded
# `hostile_input_tests` cases that pass under nextest in isolation but fail
# under the full-suite parallel run on darwin (resource contention from the
# other ~4000 tests pushes them past their internal budget). Pinning them to
# a single thread within their own group keeps their wall-clock predictable
# without slowing the rest of the suite.
[test-groups]
hostile-input-timing = { max-threads = 1 }
[[profile.default.overrides]]
filter = 'binary(hostile_input_tests) and (test(very_long_single_line_parses) or test(many_small_functions_do_not_explode))'
test-group = 'hostile-input-timing'
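
The comment in this file explains that pinning the two timing-bounded tests to a group with `max-threads = 1` serialises them while the rest of the suite stays parallel. A minimal Python sketch of that idea follows. It is only a model of the scheduling behaviour, not how nextest is implemented; `GROUP_LIMITS` and `run_suite` are hypothetical names introduced for illustration.

```python
import threading
import time
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the [test-groups] table: group name -> max threads.
GROUP_LIMITS = {"hostile-input-timing": 1}


def run_suite(tests, workers=4):
    """Run (test_name, group_or_None) pairs; return peak concurrency per group."""
    semaphores = {g: threading.Semaphore(n) for g, n in GROUP_LIMITS.items()}
    lock = threading.Lock()
    running = defaultdict(int)
    peak = defaultdict(int)

    def run_one(test):
        _name, group = test
        sem = semaphores.get(group)  # None for ungrouped tests
        if sem is not None:
            sem.acquire()  # wait for a free slot in the group
        try:
            with lock:
                running[group] += 1
                peak[group] = max(peak[group], running[group])
            time.sleep(0.01)  # stand-in for the test body
            with lock:
                running[group] -= 1
        finally:
            if sem is not None:
                sem.release()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(run_one, tests))  # list() re-raises worker errors
    return dict(peak)
```

In this model the grouped tests can never overlap each other, while ungrouped tests still use every worker, which is the trade-off the comment describes.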

@@ -41,7 +41,7 @@ body:
  attributes:
  label: Nyx version
  description: Output of `nyx --version`.
- placeholder: "nyx 0.5.0"
+ placeholder: "nyx 0.7.0"
  validations:
  required: true
  - type: input

@@ -8,6 +8,7 @@ on:
  branches: ["master"]
  pull_request:
  branches: ["master"]
+ workflow_dispatch:
  concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@@ -153,6 +154,22 @@ jobs:
  exit 1
  fi
+ rustdoc:
+ name: rustdoc
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v6
+ - uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ cache: true
+ - name: Check rustdoc links
+ env:
+ RUSTDOCFLAGS: "-D warnings"
+ run: cargo doc --workspace --no-deps --all-features
  rust-beta-build:
  name: rust-beta-build
  runs-on: ubuntu-latest
@@ -181,8 +198,8 @@ jobs:
  - name: Compile check at MSRV
  run: cargo check --all-features --tests
- rust-stable-test:
- name: rust-stable-test
+ rust-stable-test-linux-without-docker:
+ name: rust-stable-test / linux-without-docker
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v6
@@ -194,8 +211,59 @@ jobs:
  - uses: taiki-e/install-action@nextest
- - name: Rust tests (stable)
- run: cargo nextest run --all-features
+ - name: Rust tests (stable, no docker)
+ run: cargo nextest run --no-fail-fast --all-features
+ rust-stable-test-linux-with-docker:
+ name: rust-stable-test / linux-with-docker
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v6
+ - uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ cache: true
+ - uses: taiki-e/install-action@nextest
+ - name: Pull language images for sandbox tests
+ run: |
+ docker pull python:3-slim
+ docker pull node:20-slim
+ docker pull eclipse-temurin:21-jre-jammy
+ docker pull php:8-cli
+ - name: Smoke-test interpreter availability
+ run: |
+ docker run --rm python:3-slim python3 --version
+ docker run --rm node:20-slim node --version
+ docker run --rm eclipse-temurin:21-jre-jammy java -version
+ docker run --rm php:8-cli php --version
+ - name: Rust tests with docker (sandbox escape gate)
+ run: cargo nextest run --no-fail-fast --all-features --test dynamic_sandbox_escape --test dynamic_parity
+ escape-positive-control:
+ name: escape-positive-control
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v6
+ - uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ cache: true
+ - uses: taiki-e/install-action@nextest
+ - name: Pull python image
+ run: docker pull python:3-slim
+ - name: Escape positive control (gate wiring check)
+ run: |
+ cargo nextest run --no-fail-fast --all-features --test dynamic_sandbox_escape \
+ -- --include-ignored positive_control_cap_sys_admin
  cross-platform-smoke:
  name: cross-platform-smoke
@@ -218,7 +286,7 @@ jobs:
  run: cargo build --release --all-features
  - name: Smoke tests
- run: cargo nextest run --all-features --test integration_tests --test pattern_tests --test cli_validation_tests
+ run: cargo nextest run --no-fail-fast --all-features --test integration_tests --test pattern_tests --test cli_validation_tests
  rust-beta-test:
  name: rust-beta-test
@@ -234,7 +302,7 @@ jobs:
  - uses: taiki-e/install-action@nextest
  - name: Rust tests (beta)
- run: cargo nextest run --all-features
+ run: cargo nextest run --no-fail-fast --all-features
  cargo-package:
  name: cargo-package
@@ -283,16 +351,18 @@ jobs:
  cache: true
  cache-key: benchmark-gate-release
+ - uses: taiki-e/install-action@nextest
  - name: Build benchmark + perf test binaries
- run: cargo test --release --all-features --test benchmark_test --test perf_tests --no-run
+ run: cargo nextest run --release --all-features --test benchmark_test --test perf_tests --no-run
  - name: Accuracy regression gate (P/R/F1)
- run: cargo test --release --all-features --test benchmark_test -- --ignored --nocapture benchmark_evaluation
+ run: cargo nextest run --no-fail-fast --release --all-features --test benchmark_test --run-ignored only --no-capture benchmark_evaluation
  - name: Performance regression gate
  env:
  NYX_CI_BENCH: "1"
- run: cargo test --release --all-features --test perf_tests -- --nocapture
+ run: cargo nextest run --no-fail-fast --release --all-features --test perf_tests --no-capture
  - name: Upload benchmark results
  if: always()
@@ -301,3 +371,34 @@ jobs:
  name: benchmark-results
  path: tests/benchmark/results/latest.json
  if-no-files-found: warn
+ corpus-marker-audit:
+ name: corpus-marker-audit
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v6
+ - uses: actions/setup-python@v6
+ with:
+ python-version: "3.12"
+ - name: Marker collision audit (§16.3)
+ run: python3 scripts/corpus_dashboard.py
+ # Exits non-zero if any oracle marker from one cap appears in another
+ # cap's payload bytes. This catches cross-cap oracle collisions that
+ # would cause false-positive confirmed verdicts.
+ - uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ cache: true
+ - uses: taiki-e/install-action@nextest
+ - name: Corpus unit tests (no_marker_collisions, all_payloads_have_fixture_paths)
+ run: cargo nextest run --no-fail-fast --lib -p nyx-scanner dynamic::corpus
+ env:
+ RUST_LOG: error
+ - name: Corpus dashboard sync check (Python/Rust payload table parity)
+ run: python3 scripts/check_corpus_sync.py
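
The `corpus-marker-audit` job's comment says the audit exits non-zero if any oracle marker from one cap appears in another cap's payload bytes. The real check lives in `scripts/corpus_dashboard.py`, which this diff does not show, so the following is only a sketch of the stated rule. The function name, the dictionary layout, and the example markers are all assumptions made for illustration.

```python
from itertools import permutations


def find_marker_collisions(corpus):
    """Return sorted (marker_cap, payload_cap) pairs that collide.

    `corpus` is a hypothetical layout, not the project's real table:
        {cap_name: {"marker": bytes, "payloads": [bytes, ...]}}
    A collision means marker_cap's oracle marker occurs inside at least
    one payload that belongs to a different cap, payload_cap.
    """
    collisions = []
    for owner, other in permutations(corpus, 2):
        marker = corpus[owner]["marker"]
        if any(marker in payload for payload in corpus[other]["payloads"]):
            collisions.append((owner, other))
    return sorted(collisions)


def audit_exit_code(corpus):
    """Mirror the documented behaviour: non-zero when any collision exists."""
    return 1 if find_marker_collisions(corpus) else 0
```

A marker that leaks into another cap's payload would let that payload trip the wrong oracle, which is the false-positive confirmed verdict the comment warns about.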

167
.github/workflows/corpus_promote.yml vendored Normal file

@@ -0,0 +1,167 @@
name: Corpus Promote
# Weekly automated promotion-PR template.
#
# Scans fuzz-discovered/ for candidates not yet in src/dynamic/corpus.rs
# and opens a PR proposing them for human review (§16.4 — no auto-merge).
#
# Also runs the marker-collision audit as a hard gate: if any collision is
# found the workflow fails rather than proposing the promotion.
on:
schedule:
# Sundays at 09:00 UTC — offset from the fuzz run (06:00 UTC) so
# discovered candidates are ready before the promotion job runs.
- cron: "0 9 * * 0"
workflow_dispatch:
inputs:
dry_run:
description: "Dry run (print PR body but do not open)"
required: false
default: "false"
permissions:
contents: write
pull-requests: write
concurrency:
group: corpus-promote
cancel-in-progress: true
jobs:
promote:
name: Propose corpus promotions
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
# ── Marker collision audit ──────────────────────────────────────────────
- name: Marker collision audit
run: |
set -euo pipefail
cargo build --features dynamic -p nyx-scanner 2>/dev/null || true
cd fuzz/dynamic_corpus
cargo run -- audit-markers
env:
RUST_LOG: error
# ── Discover candidates ─────────────────────────────────────────────────
- name: Find promotion candidates
id: candidates
run: |
set -euo pipefail
count=0
files=""
if [ -d fuzz-discovered ]; then
while IFS= read -r f; do
# Skip .gitkeep, sidecar JSONs, and files already listed in corpus.rs.
[[ "$f" == *".gitkeep" ]] && continue
[[ "$f" == *".json" ]] && continue
bytes=$(xxd -p "$f" | tr -d '\n')
if ! grep -q "$bytes" src/dynamic/corpus.rs 2>/dev/null; then
count=$((count + 1))
files="$files $f"
fi
done < <(find fuzz-discovered -type f | sort)
fi
echo "count=$count" >> "$GITHUB_OUTPUT"
echo "files=$files" >> "$GITHUB_OUTPUT"
- name: Skip if no new candidates
if: steps.candidates.outputs.count == '0'
run: |
echo "No new candidates found in fuzz-discovered/. Nothing to promote."
# ── Open promotion PR ───────────────────────────────────────────────────
- name: Open promotion PR
if: >
steps.candidates.outputs.count != '0' &&
github.event.inputs.dry_run != 'true'
env:
GH_TOKEN: ${{ github.token }}
CANDIDATE_COUNT: ${{ steps.candidates.outputs.count }}
CANDIDATE_FILES: ${{ steps.candidates.outputs.files }}
run: |
set -euo pipefail
branch="corpus-promote-$(date +%Y%m%d)"
git checkout -b "$branch"
# Stage candidate files into fuzz-discovered (already there).
# The PR body provides the reviewer with everything they need.
# Build PR body into a temp file to avoid shell re-interpolation of
# sidecar JSON content (which may contain backticks or $(...) sequences).
body_file=$(mktemp)
cat > "$body_file" <<'PREAMBLE'
## Corpus Promotion Proposal
This PR was generated automatically by the weekly corpus-promote workflow.
It does **not** auto-merge — a human reviewer must approve each candidate
before it can land in `src/dynamic/corpus.rs` (§16.4).
### Candidates
The following payloads were discovered by the internal mutation fuzzer and
confirmed via `sink_hit && oracle_fired` against instrumented fixtures:
PREAMBLE
for f in $CANDIDATE_FILES; do
sidecar="${f}.json"
printf -- '- `%s`\n' "$f" >> "$body_file"
if [ -f "$sidecar" ]; then
printf ' ```json\n' >> "$body_file"
cat "$sidecar" >> "$body_file"
printf '\n ```\n' >> "$body_file"
fi
done
cat >> "$body_file" <<'CHECKLIST'
### Review checklist
- [ ] Bytes are a genuine attack vector, not a fixture artifact
- [ ] Oracle marker is unique (no collision with other caps)
- [ ] `fixture_paths` updated in `src/dynamic/corpus.rs`
- [ ] `since_corpus_version` set to next version
- [ ] `CORPUS_VERSION` bumped and bump history updated
_Generated by corpus_promote.yml — do not auto-merge._
CHECKLIST
git add fuzz-discovered/ || true
git diff --cached --quiet || git commit -m "chore: add ${CANDIDATE_COUNT} fuzzer-discovered corpus candidates"
git push origin "$branch"
gh pr create \
--title "chore(corpus): promote ${CANDIDATE_COUNT} fuzzer-discovered payload(s)" \
--body "$(cat "$body_file")" \
--base master \
--label "corpus-promotion" || true
rm -f "$body_file"
- name: Dry run summary
if: github.event.inputs.dry_run == 'true'
run: |
echo "Dry run: would promote ${{ steps.candidates.outputs.count }} candidate(s)."
echo "Files: ${{ steps.candidates.outputs.files }}"
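
The "Find promotion candidates" step hex-encodes each file under `fuzz-discovered/` with `xxd -p` and greps for that string in `src/dynamic/corpus.rs`, skipping `.gitkeep` files and sidecar JSONs. The same logic can be sketched in Python as below; `find_candidates` is a hypothetical helper, not part of the repository. Note that, as in the shell version, a file counts as already promoted only if `corpus.rs` contains its bytes as one contiguous lowercase hex string.

```python
from pathlib import Path


def find_candidates(discovered_dir, corpus_source):
    """List files whose hex-encoded bytes do not appear in the corpus source.

    Mirrors the workflow's shell loop: a missing directory yields no
    candidates, and a missing corpus file makes every payload a candidate.
    """
    root = Path(discovered_dir)
    if not root.is_dir():
        return []

    corpus_path = Path(corpus_source)
    corpus_text = ""
    if corpus_path.is_file():
        corpus_text = corpus_path.read_text(encoding="utf-8", errors="replace")

    candidates = []
    files = (p for p in root.rglob("*") if p.is_file())
    for path in sorted(files, key=str):
        # Skip placeholders and sidecar metadata, as the shell loop does.
        if path.name.endswith(".gitkeep") or path.name.endswith(".json"):
            continue
        hex_bytes = path.read_bytes().hex()  # lowercase, like `xxd -p`
        if hex_bytes not in corpus_text:
            candidates.append(path)
    return candidates
```

An empty file hex-encodes to the empty string, which always matches, so it is never proposed; the shell loop behaves the same way whenever `corpus.rs` is non-empty.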

@@ -25,6 +25,11 @@ jobs:
  steps:
  - uses: actions/checkout@v6
+ - uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ cache: true
  - name: Cache mdbook
  id: cache-mdbook
  uses: actions/cache@v5

146
.github/workflows/dynamic.yml vendored Normal file

@@ -0,0 +1,146 @@
# Phase 29 (Track I): dedicated dynamic-verification matrix.
#
# Three rows exercise the dynamic harness pipeline (`cargo nextest run
# --features dynamic`) under the host configurations the Phase 17-28
# tracks documented as supported:
#
# linux-process-only — Ubuntu host, no docker daemon. Forces the
# process backend and exercises the Phase 17
# Linux hardening primitives (chroot, seccomp,
# unshare, no_new_privs). `libc6-dev` is
# installed so the hardening probe + escape
# suite can `cc -static`; without it the
# chroot-leg of the escape suite skips silently
# (Phase 20 follow-up #4 in deferred.md).
#
# linux-with-docker — Ubuntu host with the runner Docker daemon. Exercises
# the docker backend (Phase 19) and the
# differential-confirmation parity tests.
#
# macos — macOS-latest, no docker. Exercises the
# Phase-18 `sandbox-exec` primitives plus the
# process backend on Darwin. Track-I acceptance
# literal: "cargo nextest run --features dynamic
# is green on macOS without docker."
name: dynamic
permissions:
contents: read
on:
push:
branches: ["master"]
pull_request:
branches: ["master"]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
linux-process-only:
name: dynamic / linux-process-only
runs-on: ubuntu-latest
env:
# Force the process backend even when callers default to Auto so
# docker-unavailable paths cannot accidentally hide a regression.
NYX_SANDBOX_BACKEND: process
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# Phase 17 / Phase 20 follow-up: the hardening probe + escape
# suite chroot leg need static glibc. Without these packages the
# `cc -static probe.c` step in tests/sandbox_hardening_linux.rs +
# tests/sandbox_escape_suite.rs falls back to dynamic linking and
# the chroot leg silently skips.
- name: Install fixture prerequisites (static libc)
run: |
sudo apt-get update -y
sudo apt-get install -y --no-install-recommends libc6-dev libc-dev-bin
- name: Smoke-test interpreter availability
run: |
python3 --version
node --version || sudo apt-get install -y --no-install-recommends nodejs
ruby --version || true
php --version || true
- name: Dynamic suite (process backend only)
run: cargo nextest run --no-fail-fast --features dynamic
linux-with-docker:
name: dynamic / linux-with-docker
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Install fixture prerequisites (static libc)
run: |
sudo apt-get update -y
sudo apt-get install -y --no-install-recommends libc6-dev libc-dev-bin
- name: Pull language images for sandbox tests
run: |
docker pull python:3-slim
docker pull node:20-slim
docker pull eclipse-temurin:21-jre-jammy
docker pull php:8-cli
- name: Smoke-test docker interpreter availability
run: |
docker run --rm python:3-slim python3 --version
docker run --rm node:20-slim node --version
docker run --rm eclipse-temurin:21-jre-jammy java -version
docker run --rm php:8-cli php --version
- name: Dynamic suite (process + docker backends)
run: cargo nextest run --no-fail-fast --features dynamic
macos:
name: dynamic / macos
runs-on: macos-latest
env:
# macOS runners ship without docker; force process backend so the
# `Auto` resolver in src/dynamic/sandbox.rs cannot accidentally
# pick up a stray Lima/Colima daemon and confuse the matrix.
NYX_SANDBOX_BACKEND: process
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Smoke-test sandbox-exec availability
run: |
/usr/bin/sandbox-exec -p '(version 1)(allow default)' /bin/echo ok
- name: Smoke-test interpreter availability
run: |
python3 --version
node --version
ruby --version
# Phase 29 acceptance literal: "cargo nextest run --features
# dynamic is green on macOS without docker (process-only row)."
- name: Dynamic suite (macOS, process backend)
run: cargo nextest run --no-fail-fast --features dynamic
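
Two of the three rows set `NYX_SANDBOX_BACKEND: process` so that an `Auto` resolver cannot silently pick a different backend. The actual resolver is in `src/dynamic/sandbox.rs` and is not part of this diff, so the sketch below only illustrates the general pattern of an explicit override beating auto-detection. The accepted values `docker` and `auto`, the `resolve_backend` name, and the detection via `shutil.which` are all assumptions; only the variable name and the `process` value come from the workflow.

```python
import os
import shutil


def resolve_backend(env=None, docker_available=None):
    """Pick a sandbox backend: an explicit override wins over detection.

    Hypothetical model only. `env` defaults to the process environment and
    `docker_available` defaults to whether a `docker` binary is on PATH.
    """
    env = os.environ if env is None else env
    forced = env.get("NYX_SANDBOX_BACKEND", "").strip().lower()

    if forced in ("process", "docker"):
        return forced  # explicit choice, no detection at all
    if forced not in ("", "auto"):
        raise ValueError(f"unknown sandbox backend: {forced!r}")

    if docker_available is None:
        docker_available = shutil.which("docker") is not None
    return "docker" if docker_available else "process"
```

Under this model a stray docker daemon on a process-only row changes nothing once the override is set, which matches the reason the workflow comments give for forcing it.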

348
.github/workflows/eval.yml vendored Normal file

@@ -0,0 +1,348 @@
# Real-corpus acceptance (Track R).
#
# * owasp (Phase 27 / Track R.0): Gate 6 vs a real OWASP BenchmarkJava
# checkout (Java).
# * jsts (Phase 28 / Track R.1): Gate 7 vs OWASP NodeGoat (Express, .js)
# and OWASP Juice Shop (TypeScript, .ts), one matrix row per corpus.
# * polyglot (Phase 29 / Track R.2): Gate 8 vs OWASP RailsGoat (Rails, .rb),
# DVWA (PHP), DVPWA (aiohttp, .py), gosec (Go) and the RustSec advisory-db
# (Rust negative control), one matrix row per corpus.
#
# Runs on every PR that touches the dynamic verifier (src/dynamic/), the
# eval-corpus harness (tests/eval_corpus/), or the gate script itself.
#
# Each gate enforces, against the committed ground truth:
# * verify wall-clock <= 15 min (CI budget; the dev reference is 10 min),
# * the per-(cap,lang) budget in tests/eval_corpus/budget.toml,
# * per-cap confirmed-rate / precision / recall — hard-gated only for caps
# in NYX_*_FLOOR_CAPS (empty by default → published report-only until a
# cap Confirms end to end), with destinations >= 40% / >= 0.85 / >= 0.40.
#
# No corpus is vendored. Each is cloned at a pinned ref and cached so reruns
# skip the clone. Before the gate runs, the committed ground truth is
# regenerated from its source against the fresh clone and asserted in sync,
# and the converter hard-errors on any labelled path missing from the corpus,
# so a corpus bump that drifts the labels fails the job loudly.
name: eval
permissions:
contents: read
on:
push:
branches: ["master"]
paths:
- "src/dynamic/**"
- "tests/eval_corpus/**"
- "scripts/m7_ship_gate.sh"
- ".github/workflows/eval.yml"
pull_request:
branches: ["master"]
paths:
- "src/dynamic/**"
- "tests/eval_corpus/**"
- "scripts/m7_ship_gate.sh"
- ".github/workflows/eval.yml"
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
owasp:
name: eval / owasp-benchmark-v1.2
runs-on: ubuntu-latest
env:
# Gate 6 self-skips unless this points at a real checkout.
NYX_OWASP_CORPUS: ${{ github.workspace }}/.eval-corpus/owasp_benchmark_v1.2
# CI wall-clock budget: 20 min. The 2740-file OWASP scan+verify lands
# right at the old 15-min ceiling on the hosted runners (observed 900.2s),
# so the gate tripped on CI variance alone; 1200s restores headroom. The
# dev reference stays 10 min — override locally to tighten.
NYX_OWASP_WALLCLOCK_BUDGET_SECONDS: "1200"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The Phase 22 Java compile pool drives `com.sun.tools.javac` out of a
# warm JDK; temurin 21 ships the compiler module the pool loads.
- name: Set up JDK 21
uses: actions/setup-java@v5
with:
distribution: temurin
java-version: "21"
- name: Cache OWASP BenchmarkJava (1.2beta)
id: cache-owasp
uses: actions/cache@v5
with:
path: .eval-corpus/owasp_benchmark_v1.2
key: owasp-benchmark-1.2beta
- name: Clone OWASP BenchmarkJava (1.2beta tag)
if: steps.cache-owasp.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch 1.2beta \
https://github.com/OWASP-Benchmark/BenchmarkJava \
.eval-corpus/owasp_benchmark_v1.2
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the pinned CSV produces. Catches GT drift (a
# corpus bump, a hand-edit) before the gate runs on stale labels.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/owasp_gt_convert.py \
--corpus-dir .eval-corpus/owasp_benchmark_v1.2 \
--output /tmp/owasp_gt_regen.json
python3 - <<'PY'
import json, sys
committed = json.load(open("tests/eval_corpus/ground_truth/owasp_benchmark_v1.2.json"))
regen = json.load(open("/tmp/owasp_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the 1.2beta CSV; regenerate with owasp_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 6 — OWASP Benchmark v1.2 acceptance
run: scripts/m7_ship_gate.sh --sets owasp
jsts:
name: eval / ${{ matrix.corpus.name }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
corpus:
- name: nodegoat
repo: https://github.com/OWASP/NodeGoat
# NodeGoat ships no release tags; pin the default branch and let
# the cache key hold it stable. The manifest's path layout
# (app/, config/) has been constant for years.
ref: master
env: NYX_NODEGOAT_CORPUS
manifest: nodegoat.manifest.toml
ground_truth: nodegoat.json
- name: juiceshop
repo: https://github.com/juice-shop/juice-shop
ref: v15.0.0
env: NYX_JUICESHOP_CORPUS
manifest: juiceshop.manifest.toml
ground_truth: juiceshop.json
env:
# CI wall-clock budget: 15 min. Override locally to tighten.
NYX_JSTS_WALLCLOCK_BUDGET_SECONDS: "900"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The dynamic verifier's Node build pool (Phase 23) compiles its
# harnesses with a real node/npm toolchain.
- name: Set up Node 20
uses: actions/setup-node@v6
with:
node-version: "20"
- name: Cache ${{ matrix.corpus.name }}
id: cache-corpus
uses: actions/cache@v5
with:
path: .eval-corpus/${{ matrix.corpus.name }}
key: jsts-${{ matrix.corpus.name }}-${{ matrix.corpus.ref }}
- name: Clone ${{ matrix.corpus.name }} (${{ matrix.corpus.ref }})
if: steps.cache-corpus.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch ${{ matrix.corpus.ref }} \
${{ matrix.corpus.repo }} \
.eval-corpus/${{ matrix.corpus.name }}
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the curated manifest produces *against this
# corpus*. manifest_gt_convert.py hard-errors on any labelled path that
# no longer exists in the clone (corpus drift / typo), and the diff
# below catches a stale committed JSON.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/manifest_gt_convert.py \
--manifest tests/eval_corpus/ground_truth/${{ matrix.corpus.manifest }} \
--corpus-dir .eval-corpus/${{ matrix.corpus.name }} \
--output /tmp/${{ matrix.corpus.name }}_gt_regen.json
python3 - <<'PY'
import json, sys
name = "${{ matrix.corpus.ground_truth }}"
committed = json.load(open(f"tests/eval_corpus/ground_truth/{name}"))
regen = json.load(open("/tmp/${{ matrix.corpus.name }}_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the manifest against the pinned corpus; regenerate with "
"manifest_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 7 — ${{ matrix.corpus.name }} acceptance
run: |
export ${{ matrix.corpus.env }}="${{ github.workspace }}/.eval-corpus/${{ matrix.corpus.name }}"
scripts/m7_ship_gate.sh --sets ${{ matrix.corpus.name }}
polyglot:
name: eval / ${{ matrix.corpus.name }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
corpus:
- name: railsgoat
repo: https://github.com/OWASP/railsgoat
ref: rails.5.0.0
lang: ruby
env: NYX_RAILSGOAT_CORPUS
manifest: railsgoat.manifest.toml
ground_truth: railsgoat.json
- name: dvwa
repo: https://github.com/digininja/DVWA
ref: "2.5"
lang: php
env: NYX_DVWA_CORPUS
manifest: dvwa.manifest.toml
ground_truth: dvwa.json
- name: dvpwa
repo: https://github.com/anxolerd/dvpwa
# DVPWA ships no release tags; pin the default branch and let the
# cache key hold it stable.
ref: master
lang: python
env: NYX_DVPWA_CORPUS
manifest: dvpwa.manifest.toml
ground_truth: dvpwa.json
- name: gosec
repo: https://github.com/securego/gosec
ref: v2.26.1
lang: go
env: NYX_GOSEC_CORPUS
manifest: gosec.manifest.toml
ground_truth: gosec.json
- name: rustsec
repo: https://github.com/rustsec/advisory-db
# advisory-db ships no release tags; pin the default branch. This
# is the Rust NEGATIVE CONTROL (advisory metadata, no scannable
# source) — its committed ground truth is empty by construction.
ref: main
lang: rust
env: NYX_RUSTSEC_CORPUS
manifest: rustsec.manifest.toml
ground_truth: rustsec.json
env:
# CI wall-clock budget: 15 min. Override locally to tighten.
NYX_POLYGLOT_WALLCLOCK_BUDGET_SECONDS: "900"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The dynamic verifier's per-language build pool (Phase 22/23) compiles
# its harnesses with a real toolchain. Each matrix row sets up only the
# toolchain for its corpus's target language; the Rust row needs no extra
# step (the rust toolchain above covers it, and advisory-db has no
# buildable source anyway).
- name: Set up Ruby
if: matrix.corpus.lang == 'ruby'
uses: ruby/setup-ruby@v1
with:
ruby-version: "3.3"
- name: Set up PHP
if: matrix.corpus.lang == 'php'
uses: shivammathur/setup-php@v2
with:
php-version: "8.3"
- name: Set up Python
if: matrix.corpus.lang == 'python'
uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: Set up Go
if: matrix.corpus.lang == 'go'
uses: actions/setup-go@v6
with:
go-version: "1.22"
- name: Cache ${{ matrix.corpus.name }}
id: cache-corpus
uses: actions/cache@v5
with:
path: .eval-corpus/${{ matrix.corpus.name }}
key: polyglot-${{ matrix.corpus.name }}-${{ matrix.corpus.ref }}
- name: Clone ${{ matrix.corpus.name }} (${{ matrix.corpus.ref }})
if: steps.cache-corpus.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch ${{ matrix.corpus.ref }} \
${{ matrix.corpus.repo }} \
.eval-corpus/${{ matrix.corpus.name }}
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the curated manifest produces *against this corpus*.
# manifest_gt_convert.py hard-errors on any labelled path that no longer
# exists in the clone (corpus drift / typo); the diff below catches a
# stale committed JSON. For the RustSec negative control the manifest
# carries `negative_control = true` and zero entries, so the converter
# emits an empty `[]` — still validated against the real clone.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/manifest_gt_convert.py \
--manifest tests/eval_corpus/ground_truth/${{ matrix.corpus.manifest }} \
--corpus-dir .eval-corpus/${{ matrix.corpus.name }} \
--output /tmp/${{ matrix.corpus.name }}_gt_regen.json
python3 - <<'PY'
import json, sys
name = "${{ matrix.corpus.ground_truth }}"
committed = json.load(open(f"tests/eval_corpus/ground_truth/{name}"))
regen = json.load(open("/tmp/${{ matrix.corpus.name }}_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the manifest against the pinned corpus; regenerate with "
"manifest_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 8 — ${{ matrix.corpus.name }} acceptance
run: |
export ${{ matrix.corpus.env }}="${{ github.workspace }}/.eval-corpus/${{ matrix.corpus.name }}"
scripts/m7_ship_gate.sh --sets ${{ matrix.corpus.name }}


@@ -136,6 +136,7 @@ jobs:
 -max_total_time=${{ steps.budget.outputs.seconds }} \
 -max_len=65536 \
 -timeout=60 \
+-rss_limit_mb=8192 \
 -dict=fuzz/dict/all.dict
 - name: Upload crash artifacts
@@ -146,3 +147,71 @@ jobs:
 path: fuzz/artifacts/${{ matrix.target }}/
 if-no-files-found: ignore
 retention-days: 14
harness-fuzz:
name: harness-fuzz-${{ matrix.cap }}
runs-on: ubuntu-latest
# Run only on schedule and manual dispatch — 50 k iterations per cap is
# too slow for PR checks but is the right cadence for weekly corpus growth.
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
strategy:
fail-fast: false
matrix:
include:
- cap: sql_query
harness: tests/dynamic_fixtures/python/sqli_positive.py
- cap: code_exec
harness: tests/dynamic_fixtures/python/cmdi_positive.py
- cap: file_io
harness: tests/dynamic_fixtures/python/fileio_positive.py
- cap: ssrf
harness: tests/dynamic_fixtures/python/ssrf_positive.py
- cap: html_escape
harness: tests/dynamic_fixtures/python/xss_positive.py
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
cache: true
cache-workspaces: |
.
fuzz/dynamic_corpus
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
- name: Build nyx-dynamic-corpus
working-directory: fuzz/dynamic_corpus
run: cargo build
- uses: actions/setup-python@v6
with:
python-version: "3.x"
- name: Run harness fuzzer — ${{ matrix.cap }}
run: |
fuzz/dynamic_corpus/target/debug/nyx-dynamic-corpus run \
--cap ${{ matrix.cap }} \
--spec-hash "ci-${{ matrix.cap }}" \
--harness-cmd "python3 ${{ matrix.harness }}" \
--iterations 50000 \
--output fuzz-discovered
- name: Upload discovered candidates
if: always()
uses: actions/upload-artifact@v7
with:
name: harness-fuzz-${{ matrix.cap }}-${{ github.run_id }}
path: fuzz-discovered/
if-no-files-found: ignore
retention-days: 30

68
.github/workflows/image-builder.yml vendored Normal file

@@ -0,0 +1,68 @@
name: image-builder
# Phase 19 (Track E.3): daily drift PR.
#
# Runs `nyx-image-builder build --all` on a Linux runner that has docker
# available, captures the rewritten `tools/image-builder/images.toml`, and
# opens a PR when any pinned digest changed. The PR is reviewed manually
# before merge so a hostile upstream image cannot silently land in
# `IMAGE_DIGESTS`.
permissions:
contents: write
pull-requests: write
on:
schedule:
# 04:23 UTC daily — off-peak for the major upstream registries so
# transient pull errors are rare.
- cron: "23 4 * * *"
workflow_dispatch:
concurrency:
group: image-builder
cancel-in-progress: false
jobs:
refresh-digests:
name: refresh image digests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- name: Verify docker is reachable
run: docker info
- name: Build pinned-digest catalogue
run: |
cargo run -F image-builder --bin nyx-image-builder -- build --all
- name: Verify catalogue against local pulls
run: |
cargo run -F image-builder --bin nyx-image-builder -- verify
- name: Open PR on drift
uses: peter-evans/create-pull-request@v8
with:
token: ${{ secrets.GITHUB_TOKEN }}
commit-message: "image-builder: refresh pinned digests"
title: "image-builder: refresh pinned digests"
body: |
Automated digest refresh by `nyx-image-builder build --all`.
The CI job pulled every base image in
`tools/image-builder/images.toml`, captured the resolved
`sha256:` digest, and wrote it back into the file. Review
the diff before merging — a hostile upstream image would
show up here as an unexpected digest change.
branch: image-builder/refresh-digests
base: master
delete-branch: true
labels: |
image-builder
automation


@@ -3,12 +3,19 @@ name: Release build & publish
 on:
 release:
 types: [created]
+workflow_dispatch:
+inputs:
+tag:
+description: "Existing release tag to (re)build and publish (e.g. v0.5.0)"
+required: true
+type: string
 permissions:
 contents: write
 env:
 BIN_NAME: nyx
+RELEASE_TAG: ${{ github.event.release.tag_name || inputs.tag }}
 jobs:
 frontend:
@@ -17,6 +24,8 @@ jobs:
 steps:
 - name: Check out sources
 uses: actions/checkout@v6
+with:
+ref: ${{ env.RELEASE_TAG }}
 - uses: actions/setup-node@v6
 with:
@@ -60,6 +69,8 @@ jobs:
 steps:
 - name: Check out sources
 uses: actions/checkout@v6
+with:
+ref: ${{ env.RELEASE_TAG }}
 - name: Download prebuilt frontend dist
 uses: actions/download-artifact@v8
@@ -99,7 +110,12 @@ jobs:
 BIN_PATH=target/$TARGET/release/$BIN$EXT
 mkdir -p dist
 ARCHIVE=$BIN-$TARGET.zip
-zip -9 "dist/$ARCHIVE" "$BIN_PATH" THIRDPARTY-LICENSES.html LICENSE* COPYING*
+files=("$BIN_PATH" THIRDPARTY-LICENSES.html)
+shopt -s nullglob
+license_files=(LICENSE* COPYING*)
+shopt -u nullglob
+files+=("${license_files[@]}")
+zip -9 "dist/$ARCHIVE" "${files[@]}"
 echo "ASSET=$ARCHIVE" >> "$GITHUB_ENV"
 - name: Package (Windows)
@@ -112,9 +128,11 @@ jobs:
 $BinPath = "target/$Target/release/$Bin$Ext"
 New-Item -ItemType Directory -Path dist -Force | Out-Null
 $Archive = "$Bin-$Target.zip"
+$LicenseFiles = @(Get-ChildItem -Path 'LICENSE*', 'COPYING*' -File -ErrorAction SilentlyContinue | ForEach-Object { $_.FullName })
+$Files = @($BinPath, 'THIRDPARTY-LICENSES.html') + $LicenseFiles
 Compress-Archive `
--Path $BinPath, 'THIRDPARTY-LICENSES.html', 'LICENSE*', 'COPYING*' `
+-Path $Files `
 -DestinationPath "dist/$Archive" `
 -CompressionLevel Optimal
@@ -136,6 +154,8 @@ jobs:
 steps:
 - name: Check out sources
 uses: actions/checkout@v6
+with:
+ref: ${{ env.RELEASE_TAG }}
 - name: Download prebuilt frontend dist
 uses: actions/download-artifact@v8
@@ -192,13 +212,15 @@ jobs:
 steps:
 - name: Check out sources
 uses: actions/checkout@v6
+with:
+ref: ${{ env.RELEASE_TAG }}
 - name: Generate CycloneDX SBOM
 uses: anchore/sbom-action@v0
 with:
 path: .
 format: cyclonedx-json
-output-file: nyx-${{ github.event.release.tag_name }}.cdx.json
+output-file: nyx-${{ env.RELEASE_TAG }}.cdx.json
 upload-artifact: false
 upload-release-assets: false
@@ -218,31 +240,28 @@ jobs:
 cat SHA256SUMS
 # Sigstore keyless signing. Verify with:
-# cosign verify-blob --certificate <file>.pem \
-# --signature <file>.sig \
+# cosign verify-blob --bundle <file>.bundle \
 # --certificate-identity-regexp 'https://github.com/elicpeter/nyx/.*' \
 # --certificate-oidc-issuer https://token.actions.githubusercontent.com \
 # <file>
 - name: Install cosign
-uses: sigstore/cosign-installer@v4.1.1
+uses: sigstore/cosign-installer@v4.1.2
 - name: Cosign keyless sign release artifacts
 shell: bash
 run: |
 set -euo pipefail
-SBOM="nyx-${{ github.event.release.tag_name }}.cdx.json"
+SBOM="nyx-${{ env.RELEASE_TAG }}.cdx.json"
 (
 cd release-artifacts
 for f in *.zip SHA256SUMS; do
 cosign sign-blob --yes \
---output-signature "$f.sig" \
---output-certificate "$f.pem" \
+--bundle "$f.bundle" \
 "$f"
 done
 )
 cosign sign-blob --yes \
---output-signature "$SBOM.sig" \
---output-certificate "$SBOM.pem" \
+--bundle "$SBOM.bundle" \
 "$SBOM"
 # SLSA v1 provenance. Verify with `gh attestation verify <file> --repo <repo>`.
@@ -252,20 +271,18 @@ jobs:
 subject-path: |
 release-artifacts/*.zip
 release-artifacts/SHA256SUMS
-nyx-${{ github.event.release.tag_name }}.cdx.json
+nyx-${{ env.RELEASE_TAG }}.cdx.json
 - name: Upload to the release
 uses: softprops/action-gh-release@v3
 with:
+tag_name: ${{ env.RELEASE_TAG }}
 files: |
 release-artifacts/*.zip
-release-artifacts/*.zip.sig
-release-artifacts/*.zip.pem
+release-artifacts/*.zip.bundle
 release-artifacts/SHA256SUMS
-release-artifacts/SHA256SUMS.sig
-release-artifacts/SHA256SUMS.pem
-nyx-${{ github.event.release.tag_name }}.cdx.json
-nyx-${{ github.event.release.tag_name }}.cdx.json.sig
-nyx-${{ github.event.release.tag_name }}.cdx.json.pem
+release-artifacts/SHA256SUMS.bundle
+nyx-${{ env.RELEASE_TAG }}.cdx.json
+nyx-${{ env.RELEASE_TAG }}.cdx.json.bundle
 env:
 GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

104
.github/workflows/repro-bare.yml vendored Normal file

@@ -0,0 +1,104 @@
# Replay every tree-committed dynamic repro bundle with host language
# toolchains blocked so we catch regressions where a bundle silently
# depends on an interpreter the operator does not have.
#
# The setup step prepends deny-list wrappers for python3, node, ruby,
# php, and Java so the only toolchain the bundle can use is the docker
# daemon. reproduce.sh in --docker mode pulls the pinned base image
# (via docker_pull.sh) and runs the harness inside the container; if the
# bundle accidentally relied on a host interpreter the run falls over
# before the sentinel check.
#
# Adding a new fixture: extend the `matrix.fixture` list with the new
# `tests/repro_fixtures/<toolchain_id>/<spec_hash>` path. The bundle
# must already exist on disk; see tests/repro_fixture_bundles.rs for
# the regeneration recipe.
name: repro-bare
permissions:
contents: read
on:
push:
branches: ["master"]
pull_request:
branches: ["master"]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
bare-image-replay:
name: repro-bare / ${{ matrix.fixture }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
fixture:
- tests/repro_fixtures/python-3.11/repro
steps:
- uses: actions/checkout@v6
- name: Block host language toolchains
run: |
set -euo pipefail
# Do not mutate the hosted runner image. ubuntu-latest carries
# preinstalled and cached language runtimes, and apt package
# relationships can shift underneath us as the image is updated.
# A PATH-level deny layer gives this job the bare-host semantics it
# needs without depending on apt being able to uninstall core bits.
deny_dir="${RUNNER_TEMP}/nyx-deny-toolchains"
mkdir -p "$deny_dir"
for exe in \
python python3 python3.10 python3.11 python3.12 python3.13 python3.14 \
node npm npx corepack \
ruby gem bundle \
php \
java javac jar
do
{
printf '%s\n' '#!/bin/sh'
printf '%s\n' 'echo "error: host language toolchain is disabled in repro-bare; use the Docker replay path" >&2'
printf '%s\n' 'exit 127'
} > "${deny_dir}/${exe}"
chmod +x "${deny_dir}/${exe}"
done
export PATH="${deny_dir}:${PATH}"
echo "${deny_dir}" >> "${GITHUB_PATH}"
hash -r 2>/dev/null || true
# Confirm the deny layer is active — surface the failure here
# rather than inside reproduce.sh where it would look like a
# bundle bug.
for exe in python3 node ruby php java; do
resolved="$(command -v "${exe}" || true)"
if [ "${resolved}" != "${deny_dir}/${exe}" ]; then
echo "error: ${exe} deny wrapper is not first on PATH (got ${resolved:-not found})" >&2
exit 1
fi
if "${exe}" --version >/dev/null 2>&1; then
echo "error: ${exe} still runs after host-toolchain block" >&2
exit 1
fi
done
if ! command -v docker >/dev/null 2>&1; then
echo "error: docker is no longer reachable after host-toolchain block" >&2
exit 1
fi
- name: Verify docker is reachable
run: docker info
- name: Pre-pull pinned image
working-directory: ${{ matrix.fixture }}
run: ./docker_pull.sh
- name: Replay bundle via docker
working-directory: ${{ matrix.fixture }}
run: ./reproduce.sh --docker

9
.gitignore vendored

@@ -1,13 +1,22 @@
 /target
 /fuzz/target
 /fuzz/corpus
+/fuzz/dynamic_corpus/target
 /fuzz/artifacts
 /.idea
 /frontend/node_modules
 /src/server/assets/dist
+/marketing
 /.nyx
+/.nyx-build-cache
 /logs
 /book
 .DS_Store
 .z3-trace
+.pitboss
+.eval-corpus
 .node_modules-target
+node_modules
+__pycache__/
+*.pyc
+tools/sb-trace/*.trace.raw


@@ -2,9 +2,327 @@
 All notable changes to Nyx are documented here. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). For where Nyx is going, see the [Roadmap](ROADMAP.md).
-## [Unreleased]
-_No changes yet._
+## [0.8.0] - 2026-06-06
+The dynamic-verification release. An attack-surface map, a sandboxed dynamic verifier, a framework adapter registry that grounds both, the per-language build infrastructure that makes per-finding verification affordable at corpus scale, and the first real-corpus acceptance gates.
The attack-surface map and chain composer turn the flat finding list into a route-to-sink graph. The dynamic verifier re-runs every Medium-or-higher finding against a payload corpus and stamps a Confirmed / PartiallyConfirmed / NotConfirmed / Inconclusive / Unsupported verdict on each. The adapter registry (130+ entries across 8 languages) covers HTTP, message-broker, scheduled-job, GraphQL, WebSocket, middleware, and migration entry points. Per-language build pools and copy-on-write workdirs hold the with-verify wall-clock to within 1.5x of a static-only scan.
### Attack-surface map
- **`nyx surface` subcommand.** Prints the project's entry points, datastores, external services, and dangerous local sinks as text, JSON, Graphviz `dot`, or rendered SVG. Loads the persisted `SurfaceMap` from the most recent indexed scan when available, or rebuilds inline from source. `--build` forces a full pass-1 + call-graph walk so DataStore / ExternalService / DangerousLocal nodes populate on an unscanned project.
- **Surface page in `nyx serve`.** New `SurfacePage` renders the same graph in the browser UI, with ELK layout, sidebar navigation, and a wide-canvas SVG viewer. Persists alongside the index so the frontend reloads without a rescan.
- **Chain findings.** `ChainFinding` records connect a route entry point to a downstream sink via the call graph + surface map. The composer scores `(impact × evidence)` per chain, queues the top-N for composite reverification, and wires the result into `findings.json` / SARIF / the dashboard. Chains rank above isolated findings.
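
The chain-ranking step above can be sketched in a few lines. This is an illustration only: the `Chain` record, its `impact` and `evidence` fields, and the `top_chains` helper are hypothetical names rather than Nyx's actual types. Only the `(impact × evidence)` score and the top-N selection come from the entry above.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    """Hypothetical stand-in for a ChainFinding record."""
    entry: str       # route entry point, e.g. "POST /login"
    sink: str        # downstream sink the chain reaches
    impact: float    # assumed to be normalised to [0, 1]
    evidence: float  # assumed to be normalised to [0, 1]

    @property
    def score(self) -> float:
        # The changelog describes the score as (impact x evidence).
        return self.impact * self.evidence


def top_chains(chains: list[Chain], n: int) -> list[Chain]:
    """Return the n highest-scoring chains, best first."""
    return heapq.nlargest(n, chains, key=lambda c: c.score)
```

A caller would pass every composed chain and queue only the returned slice for composite reverification.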
### Framework adapter registry
`src/dynamic/framework/` ships a `FrameworkAdapter` trait with concrete adapters across 8 languages (116 entries today, growing per release). Each adapter binds a route / handler / consumer pattern to a `FrameworkBinding` so the surface map and dynamic verifier can locate entry points without re-walking the AST.
- **HTTP routers.** Flask, Django, FastAPI, Starlette (Python); Express, Koa, NestJS, Fastify (JS/TS); Spring, Quarkus, Micronaut, Jakarta Servlet (Java); Gin, Echo, Fiber, Chi (Go); Axum, Actix, Rocket, Warp (Rust); Rails, Sinatra, Hanami (Ruby); Laravel, Symfony, CodeIgniter (PHP).
- **New `EntryKind` variants.** `ClassMethod`, `MessageHandler`, `ScheduledJob`, `GraphQLResolver`, `WebSocket`, `Middleware`, `Migration` join the existing `RouteHandler` / `Function` set so the surface map shows non-HTTP entry surfaces.
- **Message broker handlers.** Kafka, AWS SQS, Google Pub/Sub, NATS, and RabbitMQ consumers across Python, Node, Java, and Go.
- **Scheduled jobs.** Celery (Python), Sidekiq (Ruby), Quartz (Java), plain cron expression recognition.
- **GraphQL resolvers.** Apollo, Relay, gqlgen, Juniper, Graphene.
- **WebSocket handlers.** ws, Socket.IO, ActionCable, Django Channels.
- **Middleware + migrations.** Express, Laravel, Spring, Django, Rails middleware; Django, Flask, Laravel, Rails, Prisma, Sequelize migration scripts.
- **Sanitizer-aware adapter strengthening.** Every XXE, header-injection, open-redirect, SSTI, LDAP, XPath, deserialization, crypto, and data-exfiltration adapter rejects bindings when the surrounding source visibly hardens the parser (`disallow-doctype-decl`, `resolve_entities=False`, `libxml_disable_entity_loader`), routes the value through a known encoder (`LdapEncoder.filterEncode`, `escape_filter_chars`, `ldap_escape`), swaps a weak primitive for a CSPRNG (`secrets.token_bytes`, `crypto.randomBytes`, `SecureRandom`), or validates the destination host through an allowlist. Cuts adapter FPs without losing the genuinely dangerous calls.
### Dynamic verification
- **`nyx scan --verify`.** Every finding with `Confidence >= Medium` is re-executed inside a sandboxed harness against a curated payload corpus. The verdict (`Confirmed` / `PartiallyConfirmed` / `NotConfirmed` / `Inconclusive` / `Unsupported`) lands on `Evidence.dynamic_verdict` and shows up in console output, JSON, SARIF, and the dashboard via a new `VerdictBadge` component on the finding detail page.
- **Backends.** In-process on Linux with `Standard` / `Strict` hardening (namespace unshare, chroot, RLIMIT cap, seccomp filter), in-process on macOS via `sandbox-exec` with a profile-per-policy wrap, Docker with a published image-builder catalogue, and a Firecracker trait stub for future microVM execution. The Docker backend ships native binary support for Rust and Go so harnesses no longer need to drag a toolchain into every image.
- **Language coverage.** Per-language harness emitters for Python, JS/TS, Go, Java, PHP, Ruby, Rust, C, and C++. Stub harness intercepts SQL, HTTP, Redis, and filesystem boundaries so the verdict reflects the sink, not the network. The `JSON_PARSE`, `UNAUTHORIZED_ID`, and `DATA_EXFIL` cap dispatchers are wired into every emitter that ships these caps (Python, JS, TS, Go, Java, PHP, Ruby, Rust), so the verdict pipeline closes the loop on each cap end-to-end rather than per-language piecemeal.
- **Abstract-interpretation and symex sanitizer suppression.** Symbolic execution and the interval/string abstract domain are now consulted at verdict time, so a payload that the static engine would call dangerous but symex can prove never reaches the sink lands as NotConfirmed.
- **Guard-aware verdicts.** When a known input-validation or output-sanitization middleware sits in front of a Confirmed sink (Spring `@PreAuthorize`, Express `helmet`, Nest `@UseGuards`, Django `@permission_classes`, and the per-language registry in `src/dynamic/framework/auth_markers.rs`), the verdict demotes to `ConfirmedWithKnownGuard` and the guard names land on `differential.known_guards`. Authentication-only filters do not trigger the demotion since they do not mitigate injection.
- **Repro bundles.** Every verified finding writes a hermetic bundle to `~/.cache/nyx/dynamic/repro/<spec_hash>/` with `reproduce.sh`, `expected/{verdict.json,outcome.json,trace.jsonl}`, and a `docker_pull.sh` when the toolchain is pinned in `tools/image-builder/images.toml`. `--verbose` flushes the per-step `VerifyTrace` to stderr for live triage.
- **Real-engine harness paths.** LDAP injection routes through an embedded LDAPv3 BER server, exercised from Java via JNDI `InitialDirContext` and from Python and PHP via pure-stdlib BER clients. XPath injection runs against the live parser in each language: Java `javax.xml.xpath`, PHP `DOMXPath`, JS `xpath` npm, Python `lxml`. `Cap::CRYPTO` lands a `WeakKey` probe across Python, Go, Java, PHP, and Rust that flags sub-2^16 keys produced by non-CSPRNG sources. A new `HeaderSmuggledInWire` oracle predicate catches CRLF smuggling on hand-rolled raw-socket HTTP servers (Python `http.server`, Node `net`, Rust `std::net::TcpListener`) where framework-level CRLF strip cannot intervene.
- **Differential rule v2 and partial confirmations.** A finding confirms when *any* vulnerable payload in the set fires and *every* paired benign control stays clean, replacing the strict pair-wise rule so a single missing control no longer downgrades a confirmable finding. A new `PartiallyConfirmed` verdict marks findings where the sink is reached but the exploit chain does not complete (no marker written, no callback observed), so engine work can ratchet without the tool overstating what it proved.
- **Spec derivation v2.** Every derivation strategy now runs and is scored on flow-step depth, framework binding, cross-file source resolution via `GlobalSummaries`, and payload availability; the highest-scoring candidate wins and the runner-up ranking lands in the trace so engine gaps stay visible. Cross-file seeding walks the call graph (max depth 5) until a `Source` step or framework binding is found. New `EntryKind` adapters auto-recover the entry surface from framework decorators and annotations.
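
The differential rule v2 described above reduces to an any/all check. The sketch below is a minimal reading of that sentence, not the verifier's code: `Run`, `Verdict`, and `differential_v2` are hypothetical names, and the `PartiallyConfirmed`, `Inconclusive`, and `Unsupported` outcomes are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "Confirmed"
    NOT_CONFIRMED = "NotConfirmed"


@dataclass(frozen=True)
class Run:
    """One harness execution (hypothetical shape)."""
    payload_id: str
    is_control: bool  # True for a benign control, False for a vulnerable payload
    fired: bool       # True when the oracle observed its marker


def differential_v2(runs: list[Run]) -> Verdict:
    """Confirm when any vulnerable payload fires and every benign control stays clean."""
    any_vulnerable_fired = any(r.fired for r in runs if not r.is_control)
    all_controls_clean = all(not r.fired for r in runs if r.is_control)
    if any_vulnerable_fired and all_controls_clean:
        return Verdict.CONFIRMED
    return Verdict.NOT_CONFIRMED
```

Under this reading a payload with no paired control does not block confirmation, which matches the stated goal that a single missing control no longer downgrades a confirmable finding.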
### Performance
- **Per-language build pools.** A warm `javac` daemon compiles batched harness sources in one long-lived JVM (Track O headline, Phase 22); Node, PHP, Ruby, Go, Rust, C, and C++ reuse shared module / package / object caches; Python layers a read-only venv per `requirements_hash` with a warmed bytecode cache. Target per-finding harness build: P50 ≤ 200ms hot, ≤ 1.5s cold. Pools self-skip when a toolchain is absent so toolchain-less CI rows stay green.
- **Copy-on-write workdirs.** Per-finding workdir setup uses `clonefile` on macOS and `reflink` / `copy_file_range` on Linux instead of copying every harness file, cutting setup cost to single-digit milliseconds.
- **Cap-routed concurrency lanes.** The verifier worker pool splits into per-cap lanes (`SSRF: 8`, `DESERIALIZE: 2`, `CRYPTO: 1`, and so on) so a slow harness for one cap cannot head-of-line block fast ones.
- **Ship-gate budgets.** Gate 3 holds the with-verify / static-only wall-clock ratio at ≤ 1.5x on `benches/fixtures/`; Gate 6 holds the Java OWASP Benchmark `--verify` run at ≤ 15 min on CI / ≤ 10 min on the dev reference machine.
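
One way to picture the cap-routed lanes above is a separate bounded worker pool per cap. The sketch uses Python's standard `ThreadPoolExecutor`. The `LanePool` class and the default width of 4 are assumptions made for illustration, and only the `SSRF: 8`, `DESERIALIZE: 2`, and `CRYPTO: 1` widths come from the entry above.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Widths quoted in the changelog entry; the default is a made-up fallback.
LANE_WIDTHS = {"SSRF": 8, "DESERIALIZE": 2, "CRYPTO": 1}
DEFAULT_WIDTH = 4


class LanePool:
    """Routes each job to a per-cap executor so lanes cannot starve each other."""

    def __init__(self, widths: dict[str, int], default_width: int = DEFAULT_WIDTH):
        self._widths = dict(widths)
        self._default = default_width
        self._lanes: dict[str, ThreadPoolExecutor] = {}

    def _lane(self, cap: str) -> ThreadPoolExecutor:
        # Create the lane lazily the first time a cap is seen.
        if cap not in self._lanes:
            width = self._widths.get(cap, self._default)
            self._lanes[cap] = ThreadPoolExecutor(max_workers=width)
        return self._lanes[cap]

    def submit(self, cap: str, fn, *args) -> Future:
        return self._lane(cap).submit(fn, *args)

    def shutdown(self) -> None:
        for lane in self._lanes.values():
            lane.shutdown(wait=True)
```

Because each cap owns its own queue, a backlog of slow jobs in one lane only delays other jobs for that same cap.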
### Determinism, policy, telemetry
- **YAML policy deny list.** `src/policy.rs` is consulted before harness build. Network egress, filesystem writes outside the sandbox root, and process spawns can be denied per-rule; deny decisions land in the trace, redacted via the shared scrubber.
- **Seeded RNG.** `dynamic::rand::SpecRng` is seeded from each `HarnessSpec` hash so two runs of the same spec produce identical payloads. `scripts/check_no_unseeded_rand.sh` audits the tree for unseeded `rand` usage on every CI run.
- **`VerifyTrace` observability.** Every per-step decision (probe selection, payload mutation, oracle check, deny verdict) writes to the trace stream and the repro bundle.
- **Schema-versioned telemetry.** `events.jsonl` carries `schema_version`, `nyx_version`, `corpus_version`, `kind`, and `ts` on every envelope. PII and secret scrubbing runs on every persisted artefact via `src/utils/redact.rs`.
- **`NYX_NO_TELEMETRY=1`** disables event persistence outright.
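
The seeded-RNG entry above boils down to deriving the generator seed from the spec hash. A rough Python analogue follows. `spec_rng` and `mutate_payload` are hypothetical helpers, and SHA-256 is used here only to turn the hash string into an integer seed, not because Nyx is known to do so.

```python
import hashlib
import random


def spec_rng(spec_hash: str) -> random.Random:
    """Build a private generator whose seed depends only on the spec hash."""
    digest = hashlib.sha256(spec_hash.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    return random.Random(seed)


def mutate_payload(rng: random.Random, base: str, alphabet: str = "'<>;|&") -> str:
    """Append four characters drawn from the seeded generator."""
    suffix = "".join(rng.choice(alphabet) for _ in range(4))
    return base + suffix
```

Two generators built from the same spec hash yield the same mutation sequence, which is the property the audit script is protecting.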
### CVE corpus and ground truth
- **New `Cap` corpora.** Vulnerable + patched fixtures landed for the seven new cap classes (LDAP injection, XPath injection, header injection, open redirect, SSTI, XXE, prototype pollution) plus deserialization, crypto, JSON parsing, unauthorized-id, and data exfiltration. Every cap now carries at least one positive / negative / adversarial / unsupported fixture quad per supported language.
- **OWASP Benchmark v1.2 importer.** `tests/eval_corpus/owasp_gt_convert.py` converts the OWASP Java Benchmark expected-results manifest into Nyx ground truth and lands a 16k-line `owasp_benchmark_v1.2.json` for evaluation.
- **NIST SARD importer.** `tests/eval_corpus/sard_gt_convert.py` converts SARD test cases into the same format so cross-dataset recall numbers stay comparable.
- **Evaluation corpus tooling.** `tests/eval_corpus/run_full.sh` runs the Nyx benchmark, OWASP Benchmark, and NIST SARD evaluation sets and writes `tests/eval_corpus/results.json`. `tests/eval_corpus/report.py` and `tabulate.py` produce the per-cap and per-language summary used to track coverage and accuracy.
- **Real-corpus acceptance gates.** `scripts/m7_ship_gate.sh` adds Gate 6 (Java OWASP Benchmark v1.2), Gate 7 (NodeGoat + Juice Shop), and Gate 8 (RailsGoat, DVWA, DVPWA, gosec, RustSec). Each row enforces the per-`(cap, lang)` budget in `tests/eval_corpus/budget.toml` and publishes per-cap precision / recall / confirmed-rate against a committed ground truth. The corpora are not vendored; each row self-skips unless its `NYX_<NAME>_CORPUS` points at a checkout.
- **Per-spec cryptographic canary.** Every oracle marker is now derived from `BLAKE3(spec_hash || run_nonce)` rather than a fixed literal, so markers are unique per finding, collision-resistant against ambient harness output, and never leak to the host. A compile-time audit rejects any new ad-hoc canary.
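
The per-spec canary derivation above can be illustrated as a keyless hash over the concatenated inputs. BLAKE3 is not in the Python standard library, so this sketch swaps in `hashlib.blake2b` as a stand-in. The `NYX-` prefix, the 16-byte digest size, and both function names are likewise assumptions.

```python
import hashlib
import secrets


def fresh_nonce() -> bytes:
    """Generate a random per-run nonce (16 bytes is an arbitrary choice)."""
    return secrets.token_bytes(16)


def derive_canary(spec_hash: bytes, run_nonce: bytes) -> str:
    """Derive a marker from spec_hash || run_nonce.

    Stand-in: BLAKE2b replaces the BLAKE3 named in the changelog.
    """
    digest = hashlib.blake2b(spec_hash + run_nonce, digest_size=16).hexdigest()
    return f"NYX-{digest}"
```

The marker is reproducible for a given spec and nonce, and changes whenever either input changes.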
### Engine
- **DB fast-fail preflight.** `Indexer::init` reads the first 16 bytes of any candidate SQLite file and rejects anything without the standard `SQLite format 3\0` magic. Stops a misnamed JSON / text file from corrupting the index path with a SQLite error halfway through migration.
- **Symbolic-execution coverage.** Symex now recognises a wider set of string operations (`substr`, `replace`, `to_lower`, `to_upper`, `trim`, `strlen`) per the value/transfer pipeline, and the abstract-interpretation framework reasons about interval and prefix/suffix string facts during the dynamic verdict pass.
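
The DB fast-fail preflight above is a 16-byte header comparison. The sketch below shows the same check in Python. `looks_like_sqlite` is a hypothetical name, and how `Indexer::init` treats empty or not-yet-created files is not covered here.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # the 16-byte header of every SQLite 3 file


def looks_like_sqlite(path: str) -> bool:
    """Return True when the file starts with the standard SQLite magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

A JSON or text file renamed to `.db` fails this check immediately, before any migration code touches it.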
### CLI
- **`nyx scan --verify`** (enabled by default in standard builds) and `--backend {auto,process,docker}` select the dynamic-verification harness. `--no-verify` skips verification for a single run without changing config.
- **`nyx scan --harden {standard,strict}`** picks the process-backend hardening profile. `standard` is no-new-privs plus a memory rlimit on Linux. `strict` layers namespace unshare, chroot to the workdir, and a default-deny seccomp filter on Linux, or wraps the harness with `sandbox-exec` on macOS.
- **Patch-validation CI mode.** `--baseline FILE` reads a previous scan's JSON (or a stripped `.nyx/baseline.json` written by `--baseline-write`) and diffs it against the current scan on `stable_hash`, emitting `New` / `Resolved` / `FlippedConfirmed` / `FlippedNotConfirmed` transitions. `--gate {no-new-confirmed,resolve-all-confirmed}` exits non-zero when the diff violates the policy so CI fails the build instead of merging an unreviewed regression. The stripped baseline carries only `stable_hash`, `dynamic_verdict`, `severity`, `path`, and `rule_id`, so persisting it between scans does not leak source.
- **Repository triage in CI.** `nyx scan` now reads the same `.nyx/triage.json` file written by `nyx serve`. Terminal triage states (`false_positive`, `accepted_risk`, `suppressed`, `fixed`) are hidden from CLI output and excluded from `--fail-on` by default, while `--show-suppressed` includes them with `triage_state` / `triage_note` metadata for JSON, SARIF, and console output.
- **`nyx scan --verify-all-confidence`** drops the Medium cutoff and re-verifies everything.
- **`nyx scan --unsafe-sandbox`** disables hardening (development only, never for CI).
- **`nyx verify-feedback <finding_id> --wrong <reason> | --right`** records a correction or confirmation for a finding's verdict in the local telemetry log.
- **`nyx scan --explain-engine`** prints the effective engine configuration and exits without scanning.
- **`nyx surface`** (described above) with `--format {text,json,dot,svg}` and `--build`.
- **`nyx repro` subcommand.** Replays dynamic repro bundles by finding id, spec hash, or explicit bundle path, with `--docker`, `--print-path`, and `--list` helpers. The CLI now matches the browser UI's reproduced command and uses bundle manifests to bridge stable finding ids to spec-hash cache directories.
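
The patch-validation CI mode described above diffs two scans on `stable_hash`. The sketch below shows one way such a diff could be computed. It is illustrative only: the `Snapshot` shape (hash mapped to a "confirmed" boolean), the function names, and the exact reading of the `no-new-confirmed` gate are assumptions, not the shipped implementation.

```rust
use std::collections::BTreeMap;

/// Transition kinds named in the changelog entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transition {
    New,
    Resolved,
    FlippedConfirmed,
    FlippedNotConfirmed,
}

/// Hypothetical input shape: `stable_hash` -> "was the finding dynamically confirmed?".
pub type Snapshot = BTreeMap<String, bool>;

/// Diffs two snapshots keyed on `stable_hash`. Unchanged findings produce no entry.
pub fn diff(baseline: &Snapshot, current: &Snapshot) -> BTreeMap<String, Transition> {
    let mut out = BTreeMap::new();
    for (hash, &confirmed) in current {
        match baseline.get(hash) {
            None => {
                out.insert(hash.clone(), Transition::New);
            }
            Some(&was) if !was && confirmed => {
                out.insert(hash.clone(), Transition::FlippedConfirmed);
            }
            Some(&was) if was && !confirmed => {
                out.insert(hash.clone(), Transition::FlippedNotConfirmed);
            }
            Some(_) => {}
        }
    }
    for hash in baseline.keys() {
        if !current.contains_key(hash) {
            out.insert(hash.clone(), Transition::Resolved);
        }
    }
    out
}

/// Assumed reading of `no-new-confirmed`: fail when a confirmed finding
/// appears that the baseline did not already carry as confirmed.
pub fn violates_no_new_confirmed(
    current: &Snapshot,
    transitions: &BTreeMap<String, Transition>,
) -> bool {
    transitions.iter().any(|(hash, t)| match t {
        Transition::FlippedConfirmed => true,
        Transition::New => current.get(hash).copied().unwrap_or(false),
        _ => false,
    })
}
```

Keying both sides on the hash alone is what lets a stripped baseline (no source text) drive the comparison.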
### Frontend
- **Project target selector in `nyx serve`.** The sidebar now remembers scan roots, lets you switch the active target, and accepts a new project path without restarting the server. `/api/targets` backs the selector, scans can opt into a different `scan_root`, and `nyx scan` / `nyx index build` register the projects they touch so `nyx serve` can pick them up later.
- **Surface page** with ELK auto-layout and the shared node-style palette.
- **Verdict badge** on finding detail, plus a dynamic-verdict section that surfaces the verdict, the payload that triggered it, and a link to the repro bundle.
- **Scan compare** gains a dynamic-verdict diff column so two scans can be compared on what was confirmed versus what was downgraded.
### License
- **Internal license grants documentation** at `LICENSE-GRANTS.md`. Grant 1 covers Nyctos derived works. The repo stays GPL-3.0-or-later; the grants document the scope of internal product licensing.
## [0.7.0] - 2026-05-11
A focused release that adds seven new vulnerability classes, ships two SSA sidecars for XML and XPath parser hardening, deepens cross-file authorization for FastAPI, trims roughly a thousand auth false positives on Go DAO helpers along with the dominant Hibernate Criteria SQL cluster, and runs a performance pass on the auth extractor, SCCP, and the global summaries map. A `nyx rules list` CLI surfaces the rule registry, the web UI gets a brand-aligned visual refresh, and the CVE corpus grows across Python, PHP, JavaScript, and C.
### Highlights
- New caps for LDAP injection, XPath injection, header / CRLF injection, open redirect, server-side template injection, XXE, and prototype pollution, with per-language label rules across all eight supported languages.
- Cross-file FastAPI authorization: `include_router` chains and module-level `APIRouter(dependencies=[…])` now lift onto every attached route, with `Security(..., scopes=[...])` recognised distinctly from `Depends(...)`.
- Type-tracked XML and XPath hardening through two new SSA sidecars: parser bodies that set `secure_processing` / `processEntities: false` / `resolve_entities=False`, and `XPath` instances bound to `setXPathVariableResolver(...)`, are recognised as safe.
- ~957 `go.auth.missing_ownership_check` findings closed on gitea-shaped DAO helpers (id-scalar precision pass), 169 of 216 openmrs `cfg-unguarded-sink` findings closed on Hibernate Criteria-API receivers, joomla and drupal `php.deser.unserialize` closed on `Serializable::unserialize($input)` magic-method bodies.
- `nyx rules list` CLI subcommand, brand-aligned `nyx serve` visual refresh, and regenerated README / docs screenshots and GIFs.
### Detector classes
- New `Cap` bits and canonical rule ids: `Cap::LDAP_INJECTION` / `taint-ldap-injection`, `Cap::XPATH_INJECTION` / `taint-xpath-injection`, `Cap::HEADER_INJECTION` / `taint-header-injection`, `Cap::OPEN_REDIRECT` / `taint-open-redirect`, `Cap::SSTI` / `taint-template-injection`, `Cap::XXE` / `taint-xxe`, `Cap::PROTOTYPE_POLLUTION` / `taint-prototype-pollution`. Each ships per-language sink, sanitizer, and gated-sink rules across JS/TS, Python, Java, PHP, Go, Ruby, Rust, and C/C++. Severity, OWASP 2021 mapping, and human-readable description live in `CAP_RULE_REGISTRY` in `src/labels/mod.rs`; `cap_rule_meta()` and `rule_id_for_caps()` are the public lookups.
- `Cap` widened from `u16` to `u32` to fit the new bits. `Evidence.sink_caps` and `RuleInfo.cap_bits` follow. The serde decoder accepts any unsigned integer width so caches written before the bump still load. SQLite schema bumped from 3 to 4 to force a rescan, since older `source_caps` / `sanitizer_caps` / `sink_caps` blobs were emitted before any of the new bits could appear.
- `owasp_bucket_for` consults `CAP_RULE_REGISTRY` first so adding a cap class no longer requires a second-table edit. The match requires an exact rule id or a recognised separator (` `, `(`, `.`) so a future `taint-ssrf-allowlist-violation` cannot silently inherit `taint-ssrf`'s bucket. The legacy family-token table now also routes `xpath`, `header`, and `xxe` to A03 / A05.
- `issue_category_label` (dashboard badge) routes the seven new rule-id prefixes to dedicated labels: LDAP Injection, XPath Injection, Header Injection, Open Redirect, Template Injection, XXE, Prototype Pollution.
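
The separator rule that `owasp_bucket_for` applies to rule ids can be illustrated with a small matcher. This is a sketch under assumptions: the function name `rule_id_matches` is hypothetical, and only the three separators listed above are modelled.

```rust
/// Separators the changelog lists as valid boundaries after a registry rule id.
const SEPARATORS: [char; 3] = [' ', '(', '.'];

/// Hypothetical helper: true when `candidate` is exactly `registry_id`,
/// or `registry_id` immediately followed by a recognised separator.
pub fn rule_id_matches(registry_id: &str, candidate: &str) -> bool {
    match candidate.strip_prefix(registry_id) {
        Some("") => true,
        Some(rest) => rest.starts_with(&SEPARATORS[..]),
        None => false,
    }
}
```

Under this rule `taint-ssrf-allowlist-violation` does not match `taint-ssrf`, because `-` is not a separator, which is the inheritance hazard the entry describes.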
### Engine
- **XML-parser configuration tracking.** `src/ssa/xml_config.rs` runs alongside type-fact analysis and carries per-receiver `secure_processing` / `disallow_doctype` / `external_entities` flags forward through copy assignments and phi joins (meet for safe flags, sticky union for the unsafe `external_entities` polarity). `xxe_safe()` queries the result at the type-qualified `XmlParser.parse` sink and strips `Cap::XXE` when the parser was provably hardened (JAXP `setFeature(FEATURE_SECURE_PROCESSING, true)`, lxml `XMLParser(resolve_entities=False, no_network=True)`, fast-xml-parser `processEntities: false`). Persisted to `OptimizeResult.xml_parser_config`.
- **XPath-receiver configuration tracking.** `src/ssa/xpath_config.rs` mirrors the XML sidecar for Java's `XPath` instances: `setXPathVariableResolver(...)` flips the receiver's `has_resolver` flag, copy assignments union, phi joins meet. `xpath_safe()` strips `Cap::XPATH_INJECTION` at `xpath.evaluate(expr, ...)` / `xpath.compile(expr)` sinks when the receiver was provably bound to a resolver. Persisted to `OptimizeResult.xpath_config`.
- **Five new `TypeKind` variants.** `LdapClient` (JNDI `InitialDirContext` / `InitialLdapContext`, Spring `LdapTemplate`, ldapjs `createClient`, python-ldap `initialize`, ldap3 `Connection`), `XPathClient` (JAXP `newXPath`, lxml `etree.XPath`, npm `xpath`), `XmlParser` (JAXP factory products: `newDocumentBuilder`, `newSAXParser`, `getXMLReader`), `Template` (FreeMarker `new Template(...)` / `Configuration.getTemplate`), and `NullPrototypeObject` for JS/TS values produced by `Object.create(null)`. Wired into `constructor_type` for return-type inference and `TypeKind::label_prefix()` for type-qualified callee resolution. `XPathClient` is kept distinct from `DatabaseConnection` so a generic `pdo->query` SQL_QUERY sink does not collide with `xpath.query`.
- **`GateActivation::LiteralOnly`.** Strict literal-value activation: the gate fires only when the activation argument is a literal that matches `dangerous_values` / `dangerous_prefixes`. An unknown or dynamic activation argument suppresses the gate (no conservative `ALL_ARGS_PAYLOAD` push). Used where the dangerous shape is identifiable only by an explicit literal flag, e.g. `jQuery.extend(true, target, src)` deep-merge against Backbone's `Model.extend({proto})`.
- **Two new path-state predicates for inline open-redirect sanitisers.** `RelativeUrlValidated` covers `x.startsWith("/")`, `x.starts_with("/")`, `x.startswith("/")`, PHP `strpos($x, "/") === 0`, and direct `x[0] === "/"`. `HostAllowlistValidated` covers `new URL(x).host === ALLOWED`, `urlparse(x).netloc == ALLOWED`, multi-statement `parsed.host_str() == "..."` for Rust, and `parsed.Host == "..."` / `parsed.Hostname() == "..."` for Go. Both clear `Cap::OPEN_REDIRECT` only on the validated branch, leaving any non-redirect taint downstream to fire on its own caps. The Go form gates on case-sensitive capital `H` so a lowercase `u.host == X` field comparison falls through to the generic `Comparison` predicate.
- **`Object.create(null)` recogniser.** `is_object_create_null_call` in `cfg/literals.rs` matches `Object.create(null)` (and parenthesised, awaited, or TS type-cast wrappers) and tags `CallMeta.produces_null_proto = true`. Type-fact analysis lifts the flag to `TypeKind::NullPrototypeObject` on the returned SSA value so the synthetic `__index_set__` sink is suppressed flow-sensitively. Phi joins drop the tag back to `Unknown` so a partial null-proto receiver still fires on the unsafe path.
- **CFG-layer prototype-pollution suppression** at the synthetic `__index_set__` sink (JS/TS, recognised by the existing `try_lower_subscript_write` lowering). Three flow-insensitive shapes elide the `Sink(PROTOTYPE_POLLUTION)` label before SSA sees the node: constant-key fold (literal key not in `__proto__` / `constructor` / `prototype`), reject pattern (sibling `if (idx === "__proto__" || ...) return / throw / break;`), and allowlist pattern (ancestor `if (idx === "name" || idx === "id") { obj[idx] = v }`). Walks stop at the enclosing function so closure-captured guards in an outer scope cannot silently authorise inner assignments.
- **Spring MVC `return "redirect:" + tainted` recogniser** (Java). `try_lower_spring_redirect_return` in `cfg/mod.rs` matches the leftmost `+`-chain whose root is a `redirect:` string literal and emits a synthetic `__spring_redirect__` Call sink with `Sink(Cap::OPEN_REDIRECT)` between the predecessors and the Return node. Concatenated identifiers from anywhere in the right-hand chain feed the synthetic node's `arg_uses[0]`, so the taint pipeline carries any tainted suffix through OPEN_REDIRECT.
- **Subscript-set form classification for header sinks.** `response.headers["X-Foo"] = bar` / `headers["X-Foo"] = bar` (Ruby `element_reference`, JS/TS `subscript_expression`, Python `subscript`) had no `property` field on the LHS. `push_node` now walks into the subscript's `object` and classifies its member-expression text, so `Cap::HEADER_INJECTION` fires on the bare bracket form alongside `setHeader` / `res.set` / `headers_mut.insert`.
- **PHP literal extraction** extended in `cfg/literals.rs`: PHP `encapsed_string` (double-quoted) when every child is a pure-literal segment; boolean literals (`true` / `false`) for the jQuery `extend(true, ...)` `LiteralOnly` gate; leading-string `binary_expression` concat (`"Location: " . $url`, JS/TS `"Location: " + url`) so `dangerous_prefixes` matching activates on partially dynamic concatenations.
- **PHP receiver-text strip** in `helpers::root_receiver_text` drops the leading `$` from `variable_name` nodes so `$smarty->fetch(...)` / `$twig->createTemplate(...)` reconstruct as `Smarty.fetch` / `Environment.createTemplate` for suffix-matcher gates.
- **Gate-callee resolution hardening for member-source rewrites.** When `first_member_label` rewrites a call's `text` to a Source like `req.body`, the gate matcher now reads the call's `function` / `method` / `name` field instead, so `setValue(target, req.body, ...)` matches the `setValue` proto-pollution gate. Whitespace is stripped from the function field so multi-line chains still match flat gate matchers.
- **Ruby option-constant lookup in gate activation.** Bare `scope_resolution` / `constant` nodes (`Nokogiri::XML::ParseOptions::NOENT`) now fall back to the macro-arg extractor used by C/C++/PHP, so Nokogiri XXE gates activate on idiomatic option-flag arguments.
- **PHP `unary_op_expression` negation recognition.** tree-sitter-php emits `unary_op_expression` for unary `!`; CFG `detect_negation` and condition-chain decomposition now match it, so `if (!validate($x))` no longer carries `condition_negated=false`, and its true arm is treated as the rejection arm rather than the validated one.
- **PHP container kinds.** `declaration_list`, `interface_declaration`, `trait_declaration`, `enum_declaration`, `enum_declaration_list` mapped to `Kind::Block` so methods inside them participate in CFG construction.
- **Go variadic `parameter_declaration` named-field handling** for `collect_param_names`. The `name` and `type` named fields are now read directly so type-segment identifiers no longer pollute the param-name set (`info *PackageInfo` no longer contributes `PackageInfo`).
- **Empty-formals SSA lowering signal.** Per-parameter summary probing now seeds via `BodyMeta.param_destructured_fields`; JS/TS arrow `() => {…}` lowers with `with_params=true` so it is treated as "explicitly zero formals" rather than "no formals info".
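
The phi-join behaviour described for the XML-parser sidecar above (meet for safe flags, sticky union for the unsafe `external_entities` polarity) can be sketched as follows. The struct layout and the `xxe_safe` predicate shown here are assumptions for illustration; the real `src/ssa/xml_config.rs` may combine the flags differently.

```rust
/// Per-receiver flags named in the changelog entry (layout is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct XmlParserConfig {
    pub secure_processing: bool,
    pub disallow_doctype: bool,
    /// Unsafe polarity: true means external entities may be resolved.
    pub external_entities: bool,
}

impl XmlParserConfig {
    /// Phi join: a safe flag survives only if it holds on both paths (meet / AND),
    /// while the unsafe flag sticks if either path set it (union / OR).
    pub fn join(self, other: Self) -> Self {
        Self {
            secure_processing: self.secure_processing && other.secure_processing,
            disallow_doctype: self.disallow_doctype && other.disallow_doctype,
            external_entities: self.external_entities || other.external_entities,
        }
    }

    /// Assumed safety predicate: at least one hardening flag holds on every
    /// path and no path enabled external entities.
    pub fn xxe_safe(self) -> bool {
        (self.secure_processing || self.disallow_doctype) && !self.external_entities
    }
}
```

The asymmetry is deliberate in this sketch: joining a hardened parser with an unhardened one must lose the "safe" verdict, so the finding still fires on the unsafe path.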
### Authorization
- **FastAPI cross-file `include_router` dependency tracking.** `auth_analysis/router_facts.rs` captures per-file router declarations (`<router> = X(deps=[…])`) and `<parent>.include_router(<child_module>.<child_var>)` edges in pass 1, persists them into `GlobalSummaries::router_facts_by_module`, and resolves them into the active file's `AuthorizationModel::cross_file_router_deps` at pass 2 entry. Transitive lifts (grandparent to parent to child) are handled by an iterative index walk. Module identity is the file basename without `.py`. Closes the airflow execution-API shape where a child router lives in `routes/task_instances.py` and its auth is declared on the parent in `routes/__init__.py`.
- **FastAPI router-level `dependencies=[...]` propagation.** Module-level `router = APIRouter(dependencies=[Security(...)])` is pre-walked once per file and merged onto every `@<router>.<verb>(...)` route attached in the same file. Closes airflow execution-API routes that re-use a single `ti_id_router` declared once at module scope.
- **FastAPI `Security(callable, scopes=[...])` recognised distinctly from `Depends(callable)`.** Scoped Security promotes the synthetic `AuthCheck` to `AuthCheckKind::Other` (route-level scope-checked authorization), not Login. A new scope-tracking boolean is threaded through `expand_decorator_calls` and `extract_fastapi_dependencies`.
- **Caller-scope IPA: same-file route-handler-to-helper auth lift.** `apply_caller_scope_propagation` walks every non-route helper unit; if its in-file callers are non-empty AND every caller is itself an authorized route handler (route-level non-Login auth check) or already authorized via this same propagation, the caller's checks lift onto the helper as synthetic `is_route_level=true` `AuthCheck`s. Iterated to a small fixpoint so transitive helper chains (route to mid_helper to leaf_helper) are covered. Refuses to authorize helpers with no in-file caller, helpers called from a mix of authorized and unauthorized callers, and helpers called only from un-lifted helpers. Cross-file lifting is not implemented. Closes the dominant FastAPI / Django / Flask "route authenticates via decorator/dependency, then delegates to a private helper that performs the sink" FP shape on sentry / saleor / airflow.
- **Go DAO-helper id-scalar precision pass.** For non-route Go units, a parameter whose declared type is a bounded primitive scalar (`int64`, `uint32`, `string`, `bool`, `byte`, `rune`, `float64`, …) and whose name is id-shaped (`id`, `*Id`, `*_id`, `*ids`) is dropped from `unit.params` before ownership-check evaluation. Real Go HTTP handlers always carry a framework-request-typed param (`*http.Request`, `*gin.Context`, `echo.Context`, `*fiber.Ctx`); per-framework route extractors set `include_id_like_typed=true` so id-shaped path params survive on real routes. Mirrors the existing Python `is_python_id_like_typed_param` filter. Closes ~957 `go.auth.missing_ownership_check` findings on gitea backend DAO helpers (`func GetRunByRepoAndID(ctx, repoID, runID int64)`, `func DeleteRunner(ctx, id int64)`, the entire `models/...` layer where the ownership check sits in the calling route handler) and equivalent shapes in minio / Go ORM codebases.
- **Bare-callee verb-name fallback gate.** `list(...)`, `filter(...)`, `update(...)`, `create_audit_entry(...)`, `update_coding_agent_state(...)` (no receiver dot at all) no longer classify as `DbMutation` / `DbCrossTenantRead` via the loose verb-name fallback. Real ORM/DB calls carry a receiver (`User.find(id)`, `Model.objects.filter`, `repo.save(x)`); a bare `list(events)` is the Python builtin and `filter(fn, xs)` is `Iterable.filter`. New helper `receiver_is_simple_chain(callee)` requires a non-chained receiver dot. The realtime / outbound / cache prefix dispatches still match by chain root.
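
The caller-scope lift described above is a small fixpoint: a helper becomes authorized only when it has at least one in-file caller and every caller is an authorized route or an already-lifted helper. The sketch below models just that rule. The input shapes and the name `lift_caller_scope` are hypothetical, and it tracks helper names only, not the lifted `AuthCheck` values.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the helpers that become authorized under the caller-scope rule.
/// `callers` maps each non-route helper to its in-file callers (hypothetical shape);
/// `authorized_routes` holds route handlers with a route-level non-Login auth check.
pub fn lift_caller_scope<'a>(
    callers: &BTreeMap<&'a str, Vec<&'a str>>,
    authorized_routes: &BTreeSet<&'a str>,
) -> BTreeSet<String> {
    let mut lifted: BTreeSet<String> = BTreeSet::new();
    loop {
        let mut changed = false;
        for (helper, its_callers) in callers {
            // Skip helpers already lifted, and refuse helpers with no in-file caller.
            if lifted.contains(*helper) || its_callers.is_empty() {
                continue;
            }
            // Every caller must be an authorized route or a previously lifted helper.
            let all_authorized = its_callers
                .iter()
                .all(|c| authorized_routes.contains(c) || lifted.contains(*c));
            if all_authorized {
                lifted.insert((*helper).to_string());
                changed = true;
            }
        }
        // Stop once a full pass adds nothing new.
        if !changed {
            return lifted;
        }
    }
}
```

Because the set only grows and is bounded by the number of helpers, the loop terminates; chains such as route to mid_helper to leaf_helper resolve in successive passes.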
### Type-aware sinks and validators
- **Java JPA / Hibernate Criteria API as structural SQL.** `TypeKind::JpaCriteriaQuery` covers `CriteriaQuery<T>`, `CriteriaUpdate<T>`, `CriteriaDelete<T>`, `Subquery<T>`, `TypedQuery<T>`. `sink_args_jpa_criteria_query_safe` clears `cfg-unguarded-sink` SQL_QUERY when any positional argument to the sink call is JpaCriteriaQuery-typed (receiver excluded; receiver of `session.createQuery(cq)` is the Session/EntityManager channel, never the SQL payload). `cb.createQuery(...)`, `em.getCriteriaBuilder()`, and the JpaCriteriaQuery type chain are inferred via constructor / factory return-type hints in `type_facts.rs`. Closes the dominant FP cluster on openmrs (169 of 216 cfg-unguarded-sink), xwiki, and keycloak Hibernate DAO methods.
- **Receiver-side validator registry.** `labels::lookup_receiver_validator(lang, callee)` clears `Cap` from the receiver value (and call equivalents) on success, distinct from `Sanitizer` which clears caps from the return value. Python registers `relative_to => Cap::FILE_IO` so `path.relative_to(base)` drops the file-IO cap on the path. Closes the CVE-2024-23334 patched aiohttp `static_root_path.joinpath(filename).resolve().relative_to(static_root_path)` shape.
- **JS/TS Array-method validator-callback narrowing.** `arr.filter(isSafeIdentifier)`, `arr.find(isValidId)`, `arr.findLast(...)` with a `BooleanTrueIsValid` callback (`isValid…`, `isSafe…`, `hasValid…` and snake-case variants) propagate `validated_must` through the call's return value. Resolves callback name from `info.arg_callees` (call-shape arguments) and SSA `value_defs[v].var_name` (bare-identifier callbacks, the dominant patched-CVE form). Strict-additive: anonymous arrows / opaque identifiers leave existing propagation untouched. `findIndex` / `every` / `some` excluded (scalar return shape). Motivated by CVE-2026-42353.
- **JS/TS ternary-branch source classification.** `let arr = cond ? req.query.lng : "";` previously lowered each branch to a labelless Assign with empty uses; the join phi saw no taint. `lower_ternary_branch` now runs `first_member_label` on the branch AST when no `Source` label is already attached.
- **PHP `fopen` modeled as `Sink(Cap::SSRF)`** (same dual SSRF / LFI shape as `file_get_contents`; fires only on tainted argument). Closes CVE-2026-33486 (roadiz/documents `DownloadedFile::fromUrl` wrapping `fopen($url, 'r')`).
- **PHP `Serializable::unserialize($input)` magic-method passthrough recognition.** The legacy `Serializable` interface contract (deprecated since PHP 8.1) requires the implementation to call `\unserialize($input)` on the formal parameter inside `public function unserialize($x) { ... }`. PHP itself invokes the method when restoring an instance, so the body's call cannot be removed without breaking the interface. `php.deser.unserialize` now suppresses inside this exact shape (method named `unserialize`, single formal, bare-parameter argument). Class-level `Serializable` implementation is the actionable signal (fix is migration to `__serialize` / `__unserialize`). Closes joomla / drupal Serializable-implementing class FPs.
- **SQLAlchemy query-builder chained-call recognition.** `select(X).filter_by(...)`, `query(X).filter(...)`, `select().join().where()` chains now anchor through the chain root primitive when the chain receiver type is opaque. New `db_query_builder_roots` config (Python defaults: `select`, `query`). Closes airflow `session.scalar(select(C).filter_by(conn_id=user_input))` shapes that previously dropped under the chained-call suppression in `classify_sink_class`.
- **Python non-sink container constructor recognition.** Bare-callee `set()` / `dict()` / `list()` / `tuple()` / `frozenset()` / `defaultdict(...)` is treated as a non-sink constructor, so `verified_ids = set(); verified_ids.update(myteams)` does not classify the `.update` call as `DbMutation`. Type-annotation hint form `set[int]` / `dict[str, int]` recognised via PEP 585 generic suffix strip alongside the existing angle-bracket strip.
- **Python `request.match_info` source label** (aiohttp path-parameter source).
- **New Python pattern `py.xss.make_response_format` (Tier B).** Flask `make_response(<f-string-or-concat>)` reflection. Recognises both bare `make_response(...)` and `flask.make_response(...)`. Closes CVE-2023-6568 (mlflow auth `create_user` reflecting attacker-controlled `Content-Type` header into the response body).
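
The validator-callback narrowing above depends on recognising callback names of the `isValid…` / `isSafe…` / `hasValid…` family. A minimal sketch of such a name check follows. The function name, the prefix-only matching, and the snake-case spellings are assumptions; the changelog does not spell out the exact matching rules.

```rust
/// Hypothetical helper: true when `callee` looks like a boolean
/// "true means valid" validator by name.
pub fn is_boolean_true_validator(callee: &str) -> bool {
    // Judge only the final path segment, so `utils.isSafeIdentifier`
    // is classified by `isSafeIdentifier`.
    let name = callee.rsplit('.').next().unwrap_or(callee);
    // Prefixes taken from the changelog entry, plus assumed snake-case forms.
    const CAMEL: [&str; 3] = ["isValid", "isSafe", "hasValid"];
    const SNAKE: [&str; 3] = ["is_valid", "is_safe", "has_valid"];
    CAMEL
        .iter()
        .chain(SNAKE.iter())
        .any(|prefix| name.starts_with(*prefix))
}
```

A name-based check like this is strictly additive: callbacks it does not recognise fall through to whatever propagation already applied.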
### Language coverage
Per-language label rules expanded for the seven new caps.
- **JavaScript / TypeScript:** ldapjs `LdapClient.search`, `escapeXpath` / `xpathEscape`, `document.evaluate` / npm `xpath.select`, `setHeader` / `res.set` / `res.append` / `res.headers[]=`, `stripCRLF` / `escapeHeader`, lodash / dot-prop / object-path deep-merge prototype-pollution gates, Handlebars / EJS / Mustache template sinks, fast-xml-parser / xml2js with `processEntities`-aware activation, `redirect` / `Location` open-redirect sinks.
- **Python:** python-ldap `LDAPObject.search_s`, ldap3 `Connection.search`, lxml `etree.XPath` / `lxml.etree.parse` with parser-config awareness, Flask `response.headers[]=` / `make_response`, Jinja2 `Template(...)` and Mako `Template(...)` SSTI sinks, `flask.redirect` / `aiohttp HTTPFound` open-redirect.
- **Java / Kotlin:** `DirContext.search`, `XPath.evaluate` / `XPath.compile`, JAXP `DocumentBuilder.parse` / `SAXParser.parse` / `XMLReader.parse`, FreeMarker `Template.process`, Spring `redirect:` view-name synthetic sink, `HttpServletResponse.setHeader` / `addHeader`.
- **PHP:** `ldap_search` / `ldap_list` / `ldap_read`, `DOMXPath::query` / `DOMXPath::evaluate`, `header()` with leading-prefix activation, Smarty `fetch` / Twig `createTemplate` / Blade compile + `eval` template forms, `loadXML` / `simplexml_load_string` with `LIBXML_NOENT` activation.
- **Go:** `go-ldap conn.Search`, `etree.Path` / `xmlpath.Compile`, `http.Header.Set` / `Response.Header().Set`, `html/template` and `text/template` `Parse(...)`, `encoding/xml.Unmarshal` / `Decoder.Decode`, `http.Redirect` with relative-URL / host-allowlist gating.
- **Ruby:** `Net::LDAP#search`, `Nokogiri::XML::Document#xpath`, `response.headers[]=`, `ERB.new` SSTI, `Nokogiri::XML.parse` with `NOENT` / `DTDLOAD` activation, `redirect_to` with relative-URL gate.
- **C / C++:** libldap `ldap_search_ext_s`, libxml2 `xmlXPathEval`, `curl_easy_setopt` with header-list activation, libxml2 `xmlReadFile` / `xmlReadMemory` with `XML_PARSE_NOENT` activation.
- **Rust:** actix-web `HeaderMap.insert` / `HeaderValue::from_str` header-injection gates. `Redirect::to` retagged from `Cap::SSRF` to `Cap::OPEN_REDIRECT` so the open-redirect rule fires distinctly from the SSRF rule.
`NYX_PYTHON_PROTO_POLLUTION` opt-in flag: Python `dict.update` / `__dict__.update` proto-pollution gates are off by default because bare `update` overlaps too broadly with `Counter.update` and ordinary state-mutation patterns to ship as a default sink.
### CVE corpus
- **C.** CVE-2017-1000117 (git argv injection via `ssh://-oProxyCommand=…`) vulnerable + patched fixtures under `tests/benchmark/cve_corpus/c/CVE-2017-1000117/`. Known remaining gaps: array-element taint propagation, `c.cmdi.exec*` AST patterns, and dash-prefix-byte sanitizer recognition.
- **Python.** CVE-2023-6568 (mlflow reflected XSS), CVE-2024-21513 (langchain SQL / Jinja), CVE-2024-23334 (aiohttp static-file path traversal) vulnerable + patched fixtures.
- **PHP.** CVE-2026-33486 (roadiz/documents SSRF) vulnerable + patched fixtures.
- **JavaScript.** CVE-2026-42353 (i18next-http-middleware path traversal) vulnerable + patched fixtures.
### CLI
- **`nyx rules list`** subcommand. Surfaces the same registry the dashboard's `/api/rules` page reads from: built-in cap-class entries (one per `Cap` with a canonical rule id), per-language label rules (sink / source / sanitizer), gated sinks, and any custom rules from config. Filters: `--lang <slug>`, `--kind <class|source|sink|sanitizer>`, `--class-only` for registry entries only, `--no-class` for per-language rules only. `--json` for machine output. Cap-class entries carry `language = "all"` so a language filter still surfaces them unless `--no-class` is set.
- **`RuleInfo.is_class` / `RuleInfo.emission_active` flags.** Cap-class entries carry `is_class = true` so dashboards can group them separately. `emission_active = false` marks legacy classes (SQL_QUERY, SSRF, FILE_IO, FMT_STRING, DESERIALIZE, CODE_EXEC, CRYPTO) whose findings still surface under the catch-all `taint-unsanitised-flow` rule id; the seven new classes plus `unauthorized_id` and `data_exfil` are `emission_active = true`. The active set is pinned in `cap_rule_registry_emission_active_set_is_pinned` so a future migration of a legacy cap cannot drift silently.
- **`parse_cap` and `CapName::FromStr`** accept the new short names: `ldap_injection` / `ldapi`, `xpath_injection` / `xpathi`, `header_injection` / `crlf` / `response_splitting`, `open_redirect` / `redirect`, `ssti` / `template_injection`, `xxe`, `prototype_pollution` / `proto_pollution`, plus the existing `data_exfil` alias. The `nyx config add-rule --cap` flag and `[analysis.languages.*.rules]` entries take any of these.
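
The short names and aliases listed above pair up with the canonical rule ids from the Detector classes section. The lookup below is a sketch of that pairing only. The function name is hypothetical, and the real `parse_cap` resolves to a `Cap` value rather than a rule-id string.

```rust
/// Hypothetical helper: maps a short cap name or alias from the changelog
/// to the canonical rule id of the matching cap class.
pub fn rule_id_for_cap_name(input: &str) -> Option<&'static str> {
    let rule_id = match input {
        "ldap_injection" | "ldapi" => "taint-ldap-injection",
        "xpath_injection" | "xpathi" => "taint-xpath-injection",
        "header_injection" | "crlf" | "response_splitting" => "taint-header-injection",
        "open_redirect" | "redirect" => "taint-open-redirect",
        "ssti" | "template_injection" => "taint-template-injection",
        "xxe" => "taint-xxe",
        "prototype_pollution" | "proto_pollution" => "taint-prototype-pollution",
        _ => return None,
    };
    Some(rule_id)
}
```

Keeping the aliases in a single `match` means an unknown name fails loudly with `None` instead of silently mapping to a neighbouring class.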
### Frontend
- **Refreshed local web UI visual system** around the mint-cyan Nyx brand: warmer light surfaces, deep green accents, updated severity / confidence colors, tighter typography, smaller radii, denser cards, table, badge, button, header, and sidebar styling, and matched graph / code-viewer colors.
- **Reworked `nyx serve` surfaces** for a more operational layout. Overview uses the refreshed health-score card and chart grid; Scans has a fixed compact table with capped language badges; Scan Detail places summary and timing data side by side; Triage, Rules, Config, Explorer, Finding Detail, Scan Compare, and Debug pages received focused spacing, overflow, and density fixes.
- **Branded asset set** shared between the SPA and the embedded server bundle: PNG favicons, Apple touch icon, sidebar logo image, refreshed SVG favicon, and Rust static handlers for the new `/logo.png` and favicon files.
- **Frontend `RuleListItem` and `RuleDetailView`** carry the new `is_class` flag so the dashboard's Rules page can group cap-class entries separately.
- **Regenerated README and docs screenshots and GIFs** against the new UI at 1600x992, saving raw originals before framing and adding CLI GIF plus combined CLI-to-serve demo GIF capture support. Extended the screenshot capture workflow with mint-led framing copy, optional `nyxscan.dev` asset mirroring, WebP regeneration for mirrored PNGs, and raw `_raw` image / GIF outputs for downstream reuse.
### Performance
- **Hoisted `collect_top_level_units` out of the per-extractor loop** in `extract_authorization_model`. Multi-extractor languages (Go gin+echo, JS/TS express+koa+fastify, Python flask+django, Rust axum+actix_web+rocket, Ruby sinatra) had been re-walking the entire AST and rebuilding the `Function`-kind unit set per extractor, then deduping by span. New `AuthExtractor::requires_top_level_units()` opt-out for Spring / Rails which build their own. Was 46% of `extract_authorization_model` wall-clock on the mattermost/server/channels/app subtree.
- **Single `AuthorizationModel` build per file in fused mode.** The diag path and the per-file summary path each ran their own `extract_authorization_model`, duplicating the hoisted unit pass and every framework extractor's AST walk. Auth summaries now extract from the base model (pre var-types, pre helper-lifting) so the persisted per-file summary matches the legacy `extract_auth_summaries_by_key` path bit-for-bit.
- **O(N) shallow value-ref emission in `collect_unit_state`.** The previous per-node `extract_value_refs(node, bytes)` walked the entire subtree on every recursion level (O(N²) per body) even though the recursion below already visits every descendant once. New `append_shallow_value_ref` emits the node's own ref and lets recursion handle the descent. Public callers of `extract_value_refs` (`collect_call`, `collect_condition`, assignment-side extraction) keep the deep walk. Was ~17% + 15% + 11% of wall-clock split across `build_function_unit_with_meta`, `collect_unit_state`, and `extract_value_refs` on mattermost.
- **Per-`ParsedFile` `body_const_facts_cache: OnceCell`.** SSA + const-prop + type-fact build was running 2-3× per body across `run_cfg_analyses_with_lowered`, `run_auth_analyses`, and `collect_file_var_types`. Single-pass cache; gin profile dropped from 13.6% to ~4.5%.
- **SCCP switched from `HashMap<SsaValue, _>` and `HashSet<(BlockId, BlockId)>`** to dense `Vec` per-value lattice and per-destination predecessor `SmallVec<[BlockId; 2]>`. The inner fixed-point loop no longer SipHashes a 64-bit pair for every operand of every phi. Public `ConstPropResult` shape unchanged (one final O(num_values) HashMap conversion).
- **`GlobalSummaries.by_key` switched to `FxHashMap`** (rustc-hash 2.1) from stdlib SipHash. `FuncKey` carries 3 String fields, so any HashMap operation hashes at least 30 bytes; FxHash is ~5× faster on this workload. Seed is fixed (no DoS hardening), fine for an in-process index keyed by program-derived names.
- `large_go_module.go` perf fixture (1493 lines) added to `benches/perf_fixtures/`; `benches/scan_bench.rs` extended with auth-extractor, SCCP, and summary-resolution rows.
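
The SCCP change above replaces hashed per-value storage with a dense `Vec` indexed by value id. The sketch below shows the general idea with a three-level constant lattice. The `Lattice`, `SsaValue`, and `DenseLattice` types here are simplified stand-ins invented for illustration, not the types in the codebase.

```rust
/// Three-level constant lattice of the kind used by SCCP-style analyses.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lattice {
    Top,        // nothing known yet
    Const(i64), // proven to be this constant
    Bottom,     // varying
}

impl Lattice {
    /// Meet: Top is the identity, equal constants stay, anything else is Bottom.
    pub fn meet(self, other: Self) -> Self {
        match (self, other) {
            (Lattice::Top, x) | (x, Lattice::Top) => x,
            (Lattice::Const(a), Lattice::Const(b)) if a == b => Lattice::Const(a),
            _ => Lattice::Bottom,
        }
    }
}

/// Hypothetical dense SSA value id: values are numbered 0..num_values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SsaValue(pub u32);

/// Dense per-value store: a plain index replaces a hash lookup on the hot path.
pub struct DenseLattice {
    cells: Vec<Lattice>,
}

impl DenseLattice {
    pub fn new(num_values: usize) -> Self {
        Self { cells: vec![Lattice::Top; num_values] }
    }

    pub fn get(&self, v: SsaValue) -> Lattice {
        self.cells[v.0 as usize]
    }

    /// Lowers the cell by meeting it with `incoming`; returns true if it changed.
    pub fn lower(&mut self, v: SsaValue, incoming: Lattice) -> bool {
        let cell = &mut self.cells[v.0 as usize];
        let next = cell.meet(incoming);
        let changed = next != *cell;
        *cell = next;
        changed
    }
}
```

The boolean returned by `lower` is what a worklist loop would use to decide whether dependants need revisiting.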
### Fixed (false positives)
- `Object.create(null)` receivers no longer fire prototype-pollution at the synthetic `__index_set__` sink. Suppression is flow-sensitive via `TypeKind::NullPrototypeObject` so a phi join that only sometimes resolves to a null-proto receiver still fires on the unsafe path.
- `cfg-unguarded-sink` over-fires on JS/TS object-literal property writes guarded by an explicit `__proto__` / `constructor` / `prototype` reject `if` (early `return` / `throw` / `break`) or by an allowlist `if` whose true arm contains the assignment. Resolved at the CFG layer before the SSA sink scan.
- Spring MVC `return "redirect:" + url` flagged generic `taint-unsanitised-flow` even when the redirect destination was the load-bearing taint. Now routed through the synthetic `__spring_redirect__` sink so the finding emerges as `taint-open-redirect`.
- `$smarty->fetch(...)` / `$twig->createTemplate(...)` no longer drop their SSTI gate match on idiomatic PHP receiver shapes.
- `setValue(target, req.body, ...)` and similar wrappers no longer gate-match on the rewritten Source `req.body` text.
- Nokogiri / lxml / fast-xml-parser parser bodies hardened with `setFeature` / `processEntities: false` / `XMLParser(resolve_entities=False)` no longer fire `taint-xxe`.
- `XPath` instances bound to `setXPathVariableResolver(...)` no longer fire `taint-xpath-injection` on subsequent `xpath.evaluate(expr, ...)` sinks.
- Inline `if (!url.startsWith("/")) reject` and `if (new URL(url).host !== ALLOWED) reject` open-redirect sanitisers narrow `Cap::OPEN_REDIRECT` on the validated branch instead of falling through to the generic `Comparison` predicate. Other taint downstream still fires on its own caps.
- Rust `Redirect::to` no longer fires `taint-ssrf` for what is structurally an open redirect; retagged to `Cap::OPEN_REDIRECT`.
- ~957 gitea backend DAO `go.auth.missing_ownership_check` findings (id-scalar precision pass).
- 169 of 216 openmrs `cfg-unguarded-sink` findings (JpaCriteriaQuery type). Equivalent reductions on xwiki / keycloak Hibernate DAO clusters.
- joomla and drupal `php.deser.unserialize` flagged inside `Serializable::unserialize($input)` magic-method bodies.
- airflow execution-API routes flagged `missing_ownership_check` despite being authorized via cross-file `include_router` chains and module-level `APIRouter(dependencies=[…])` declarations.
- sentry `verified_ids = set(); verified_ids.update(myteams)` flagged as `DbMutation`.
- aiohttp `path.relative_to(static_root_path)` not recognised as a path-traversal validator.
- i18next-http-middleware `arr.filter(utils.isSafeIdentifier)` not narrowing taint on the result.
- `cond ? req.query.lng : ""` ternary lost `Source` label on the truthy branch.
- `if (!validate($x))` rejection-arm narrowing flipped on PHP unary `!`.
- mlflow `make_response(f"Invalid content type: '{content_type}'")` (Tier B pattern).
- Bare-callee verb-name dispatch on Python builtins / locally-defined helpers (`list`, `filter`, `update`, `create_audit_entry`, `update_coding_agent_state`).
- FastAPI `Depends(...)` / `Security(...)` deps declared on a module-level `APIRouter` no longer dropped on every attached route.
- FastAPI `Security(callable, scopes=[...])` no longer downgraded to a Login-only check.
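The inline open-redirect sanitiser shape mentioned above (`if (!url.startsWith("/")) reject`) can be illustrated with a minimal Rust sketch. The function name `safe_redirect_target` is hypothetical and not part of the project. The second check, against scheme-relative targets, is an extra assumption and is not described in the changelog entry.

```rust
/// Hypothetical helper: returns the redirect target only when it is a
/// site-relative path. The early `return None` plays the role of the
/// rejection arm; the code after it is the validated branch.
fn safe_redirect_target(url: &str) -> Option<&str> {
    // Rejection arm: anything not rooted at "/" is refused.
    if !url.starts_with('/') {
        return None;
    }
    // Assumption beyond the changelog text: "//host" and "/\host" are
    // treated by browsers as scheme-relative URLs, so refuse them too.
    if url.starts_with("//") || url.starts_with("/\\") {
        return None;
    }
    // Surviving branch: `url` is a plain site-relative path.
    Some(url)
}
```

Only the value returned on the surviving branch should be handed to a redirect call; other uses of the same input (for example in a query) are unaffected by this check.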
### Tests
- New per-cap integration suites: `tests/{xpath_injection,xxe,ssti,prototype_pollution,header_injection,open_redirect,ldap_injection}_tests.rs`, plus `python_proto_pollution_tests.rs` for the env-gated Python form. Per-cap fixture trees under `tests/fixtures/<class>/<lang>/` cover safe, unsafe, and irrelevant-baseline shapes for every supported language.
- Cross-file FastAPI integration test `tests/fastapi_cross_file_include_router_tests.rs` with airflow-shaped fixture tree under `tests/fixtures/auth_cross_file/airflow_execution_api_includes/`.
- New `cfg/cfg_tests.rs` covers ternary-branch CFG lowering shapes.
- New `summary/tests.rs` covers cross-file `include_router` summary persistence and resolution.
- Per-language safe / vuln auth and detector fixtures across Python, Java, Go, PHP, JS, TS.
### Other
- Refactor passes across `auth_analysis`, `ssa/const_prop`, `ssa/type_facts`, `summary`, and the per-framework auth extractors (cleaner conditional checks, simpler function signatures, deduplicated assertions). No behaviour change.
- README links to a Simplified Chinese translation (`README.zh-CN.md`).
## [0.6.1] - 2026-05-03
A precision pass on auth and resource analysis, three fresh CVE corpus pairs, and a fix for a UTF-8 slice panic in the path abstract domain. Closes ~1900 Go auth FPs on gitea-shaped helpers, the mastodon/diaspora private-callback Ruby controller pattern, and a phantom-taint outbreak from JS/TS / Java lambda shorthand in jest-style nested test callbacks.
### Added
- Java JDBC raw-SQL sinks. `Statement.execute`, `Statement.executeBatch`, and `Statement.executeLargeUpdate` modeled as `SQL_QUERY` sinks, classified via type-qualified resolution (`DatabaseConnection.execute`) so bare `execute` (Runnable, Executor, HttpClient) does not over-fire. `conn.createStatement()` and `conn.prepareCall()` now infer return type `DatabaseConnection`, so the JDBC chain `Statement s = conn.createStatement(); s.execute(q)` types `s` correctly. Closes GHSA-h8cj-hpmg-636v (Appsmith FilterDataServiceCE.dropTable). Vulnerable + patched Java fixtures added.
- Java/Kotlin `Pattern.matcher(value).matches()` chain recognised as a `ValidationCall` allowlist. Receiver of `.matcher(` must contain `regex` or `pattern`. Validation target is the `.matcher()` argument, not the bare `.matches()` receiver. Branch narrowing applies the `validated_must` to the input variable on the surviving branch. Same GHSA as above (`FILTER_TEMP_TABLE_NAME_PATTERN.matcher(tableName).matches()`).
- Per-parameter SSA summary probe now receives `BodyMeta.param_types`, so `extract_ssa_func_summary` runs a local `analyze_types_with_param_types` pass before extraction. Helper bodies whose sinks resolve only via type-qualified callees (e.g. `DatabaseConnection.execute` for JDBC `Statement.execute`) no longer drop the sink during cross-function summary extraction. Fixes the Appsmith helper `executeDbQuery(query)` that routed SQL through `statement.execute(query)`.
- Short-circuit branch condition CFG nodes now mirror `condition_vars` into `taint.uses`, so `apply_branch_predicates` interns the variable for short-circuit-decomposed validators (`if (x == null || !regex.matcher(x).matches()) throw`). Without this, the per-disjunct cond nodes built via `build_condition_chain` silently no-opped and `x` never reached `validated_must` on the surviving branch.
- Go `goqu.L(s)` and `goqu.Lit(s)` raw-SQL literal builders modeled as `SQL_QUERY` sinks. Safe siblings (`goqu.I` identifier, `goqu.C` column, `goqu.T` table, `goqu.V` parameterised value, `goqu.SUM`, `goqu.COUNT`, …) stay unlabeled. Gin source list extended with the array-returning siblings of the existing scalar helpers: `c.QueryArray`, `c.GetQueryArray`, `c.PostFormArray`, `c.GetPostFormArray`. Closes CVE-2026-41422 (daptin: `c.QueryArray("column")` → `goqu.L(project)` with the loop variable lifted through `for _, project := range columns`). Vulnerable + patched Go corpus pair under `tests/benchmark/cve_corpus/go/CVE-2026-41422/`.
- Go `for ident := range iter` def-use lifting. The `range_clause` child of `for_statement` is now consulted when `left`/`right` aren't direct fields of the `for` node, so taint from the iterable reaches the loop binding. Required for the daptin CVE shape above.
- Java `enhanced_for_statement`, PHP `foreach`, and Ruby `for` def-use lifting, completing the loop forms the Go `range_clause` fix above started. The `Kind::For` def-use arm only knew the JS/Python `left`/`right` pair and Go's `range_clause`; Java carries the binding on `name` and the iterable on `value`, Ruby's `for` on `pattern`/`value`, and PHP's `foreach` keeps both as unnamed children split by the `as` keyword, so none recorded the loop variable as a define and taint on the iterable never reached the binding (`for (Cookie c : req.getCookies()) { … c.getValue() … }` lost the flow at `c`). Each form now folds onto the shared define/use path. Lifts Java OWASP Benchmark recall: path_traversal 0.21 → 0.32, sqli 0.16 → 0.28, cmdi 0.04 → 0.08.
- Iterable-expression classification for the loop forms above. The loop node is classified against its iterable text, so a source-returning iterable (`req.getCookies()`, `req.getParameterValues("v")`, `$_GET['list']`) lands a `Source` on the loop node and the binding inherits its taint, the same rewrite JS/Python `for … of` / `for … in` already had. Subscript iterables (`$_GET['x']`, `params[:list]`) classify on their base object since sources key on the base name, not the index.
- Java iterable-returning request accessors modeled as sources: `getParameterValues`, `getParameterMap`, `getParameterNames`, `getHeaders`, `getHeaderNames`. The `getParameter` / `getHeader` matchers are word-boundary suffix matches and never covered the plural collection variants that feed for-each loops (`for (String s : req.getParameterValues("v"))`). The dominant OWASP Benchmark vulnerable-source shape.
- Rust format-string named-argument lifting (`format!("...{x}...")`, stable since 1.58). Identifiers captured by `{name}` / `{name:fmt-spec}` are pulled into the call's `uses` for known format-style macros: `format`, `print`/`println`, `eprint`/`eprintln`, `write`/`writeln`, `panic`, `format_args`, `assert`/`debug_assert`, `todo`, `unimplemented`, `unreachable`, plus log-crate severity macros (`info`, `warn`, `error`, `debug`, `trace`). Recursive descent through one or two layers of expression wrapping (`format!("{x}").to_owned()`, RHS chained method calls). Without this, taint stopped at the macro boundary. `let q = format!("...{x}...")` carried no `x` because the identifier lives in format-string bytes rather than as a separate AST argument node. Mirrors the Python f-string lifter.
- Rust CVE corpus extended. CVE-2023-42456, CVE-2024-32884, CVE-2025-53549 vulnerable + patched fixtures under `tests/benchmark/cve_corpus/rust/`.
- Java lambda shorthand recognised by `extract_param_meta`. `lambda_expression`'s `parameters` field as a bare `identifier` (`cmd -> …`) or as an `inferred_parameters` wrapper around identifiers (`(a, b) -> …`) was not matching the formal_parameter / spread_parameter kinds in `PARAM_CONFIG`, so the lambda appeared parameterless and the SSA pipeline treated its formals as closure captures. Mirrors the JS/TS arrow shorthand path.
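The Rust format-string named-argument lifting described above can be sketched as a small scanner over the format-string text. This is an illustrative approximation only: `captured_idents` is a hypothetical name, not the project's lifter, and it ignores rarer format-spec corner cases such as a `}` used as a fill character.

```rust
/// Hypothetical sketch: collect identifiers captured inline by a format
/// string, e.g. `{x}` or `{x:>8}`. Positional (`{}`, `{0}`) and escaped
/// (`{{`) braces contribute nothing.
fn captured_idents(fmt: &str) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    let mut chars = fmt.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        if c != '{' {
            continue;
        }
        // `{{` is an escaped literal brace, not a placeholder.
        if matches!(chars.peek(), Some(&(_, '{'))) {
            chars.next();
            continue;
        }
        // `{` is one byte wide, so `i + 1` is a valid char boundary.
        let rest = &fmt[i + 1..];
        let Some(end) = rest.find('}') else { break };
        // The name is everything before an optional `:fmt-spec`.
        let name = rest[..end].split(':').next().unwrap_or("").trim();
        let starts_ok = name
            .chars()
            .next()
            .map_or(false, |ch| ch.is_alphabetic() || ch == '_');
        let body_ok = name.chars().all(|ch| ch.is_alphanumeric() || ch == '_');
        if starts_ok && body_ok && !out.iter().any(|n| n == name) {
            out.push(name.to_string());
        }
        // Skip everything up to and including the closing brace.
        let close = i + 1 + end;
        while matches!(chars.peek(), Some(&(j, _)) if j <= close) {
            chars.next();
        }
    }
    out
}
```

Each returned name would then be treated as a use of the macro call, which is what lets taint on `x` reach `q` in `let q = format!("...{x}...")`.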
### Fixed
- Panic on non-ASCII input to `has_first_char_absolute_check` in the path abstract domain. The 32-byte search window around `[0]` was sliced as `&clause[lo..hi]` (str), which panicked when `hi` landed inside a multi-byte UTF-8 char (e.g. the em dash `—`, bytes 34..37). Switched to `&bytes[lo..hi]` with `windows()` byte-pattern checks; all needles are ASCII so the searches are equivalent. Surfaced by `cargo fuzz` (`scan_bytes` target, `.c` extension path, embedded `—` in a comment near `s[0] == '/'`). Regression test added.
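The failure mode above is easy to reproduce in isolation: slicing a `str` at an index inside a multi-byte character panics, while slicing the underlying bytes does not. The sketch below assumes a hypothetical helper `window_contains`; it is not the project's `has_first_char_absolute_check`, only the same byte-window technique.

```rust
/// Hypothetical helper: does the ASCII `needle` occur within `radius`
/// bytes on either side of byte offset `center`? Works on bytes, so the
/// window may start or end inside a multi-byte UTF-8 character.
fn window_contains(clause: &str, center: usize, radius: usize, needle: &[u8]) -> bool {
    let bytes = clause.as_bytes();
    let lo = center.saturating_sub(radius);
    let hi = center.saturating_add(radius).min(bytes.len());
    // `windows(0)` panics, and `lo > hi` would be an invalid range.
    if needle.is_empty() || lo >= hi {
        return false;
    }
    // Byte-slice indexing has no char-boundary requirement.
    bytes[lo..hi].windows(needle.len()).any(|w| w == needle)
}
```

For an input such as `"// note \u{2014} s[0] == '/'"`, a window around `[0]` can begin inside the three-byte em dash. `clause.get(lo..hi)` returns `None` there (and `&clause[lo..hi]` would panic), whereas the byte search still finds the needle. Because the needles are ASCII, a byte match cannot start or end in the middle of a multi-byte character.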
### Fixed (false positives)
- `cfg-unguarded-sink` parameter-only trace no longer clears a sink argument whose reaching definition is a loop binding. Once the loop variable resolves to its iterable (the def-use lifting above), a `foreach ($param as $v) { sink($v) }` element looked like a bare `sink($p)` wrapper pass-through and the structural finding was dropped. A loop element over a parameter collection is not wrapper plumbing, so the finding survives for loop-bound sink arguments; literal-keyed arrays stay suppressed through `sink_arg_uses_safe_foreach_key`. Keeps the negative case in `fp_guard_php_foreach_safe_literal_keys` firing.
- `unit_has_user_input_evidence` framework-request-name allow-list narrowed for Go. `ctx`, `context`, `info`, `body`, `path`, `payload`, `dto`, `form`, `query` are no longer treated as user-input indicators in Go, where they are `context.Context` (cancellation/value-bag from the stdlib) or struct-pointer payload params (`info *PackageInfo`, `opts *FooOptions`), not request bindings. Go HTTP frameworks bind the request to per-framework typed params (`r *http.Request`, `c *gin.Context`, `c echo.Context`, `c *fiber.Ctx`); these arrive at the gate via `RouteHandler` kind or the type-aware param filter below. Stdlib `req` / `request` (the `*http.Request` convention) preserved. Other languages keep the broader allow-list.
- Go param collection drops `ctx context.Context` and `ctx context.CancelFunc` parameters entirely rather than seeding their names into `unit.params`. Tree-sitter-go's `parameter_declaration` exposes `name` and `type` as named fields; descend only into `name` so type-segment identifiers don't pollute the param-name set (`info *PackageInfo` no longer contributes `PackageInfo`). Together with the allow-list narrowing above, closes ~1900 `go.auth.missing_ownership_check` findings on gitea backend helpers whose only "user-input evidence" was the ubiquitous `ctx context.Context` first param.
- Ruby controller method visibility + filter-callback gate. Methods marked `private` (bare `private` directive, targeted `private :foo, :bar`, or `protected`) and Rails filter callback targets (`before_action`, `after_action`, `around_action`, their `prepend_*` / `append_*` / `skip_*` siblings, and the legacy `*_filter` aliases) are no longer emitted as `Function` units. Visibility tracking is class-body source-order with two directive forms (bare toggles default visibility, targeted explicitly marks named methods). Block-form filters (`before_action do … end`) carry no symbol arg and are correctly ignored. Closes mastodon / diaspora `rb.auth.missing_ownership_check` flood on `set_X` row-fetch helpers used as `before_action` callbacks.
- Field-LHS resource acquires no longer counted as local resource leaks at the `apply_assignment` site. `e->name = (char *)e + sizeof(*e)` (sub-buffer alias inside a returned struct) and `mem->buf = ptr` (local-into-field ownership transfer) now mark the RHS local `MOVED` and stop tracking the field as a separately OPEN resource. The parent struct owns the field's lifecycle. Cross-language (distinct from the Go-only `apply_call` field-LHS gate, which is restricted because JS/TS class-field acquires `this.fd = fs.openSync(...)` are the documented expected leak pattern in that path). Closes curl `entry_new` and equivalent C/C++ shapes in openssl / postgres.
- Empty-formals SSA lowering signal. `lower_to_ssa_with_params` now sets `with_params=true` even when `formal_params` is empty, so an arrow `() => {…}` is treated as "explicitly zero formals" rather than "no formals info". External vars in a zero-formal arrow are now correctly tagged as synthetic closure captures, so the JS/TS / Java auto-seed pass cannot mistake a bubbled-up free var (e.g. `userId` lifted from a nested jest test callback) for a real handler formal. Closes 934 phantom taint findings on the outline test suite (`describe("…", () => { test("…", () => { server.post(…) }) })`-shaped fixtures).
- Rust integer-typed values now suppress `Cap::FILE_IO` at the abstract-domain leaf gate (previously HTML_ESCAPE only). An integer's decimal representation is digits with optional leading `-`, never path metacharacters (`/`, `\`, `.`); magnitude is irrelevant. Closes the sudo-rs RUSTSEC-2023-0069 patched FP `let uid: u32 = user.parse()?; path.push(uid.to_string())`.
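The integer-typed suppression in the last entry rests on a simple property, shown here as a minimal sketch. The function `uid_dir` and the `/run/user` base are illustrative assumptions and are not taken from sudo-rs.

```rust
use std::num::ParseIntError;
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds `<base>/<uid>` from untrusted text.
/// Once the input has been parsed into `u32`, its decimal rendering is
/// digits only, so it cannot contain `/`, `\`, or `.`.
fn uid_dir(base: &Path, raw_uid: &str) -> Result<PathBuf, ParseIntError> {
    // Anything that is not a valid u32 is rejected here.
    let uid: u32 = raw_uid.parse()?;
    let mut path = base.to_path_buf();
    // Only the integer's string form reaches the path.
    path.push(uid.to_string());
    Ok(path)
}
```

The traversal payload never survives the `parse()` step, which is why a path sink fed only by `uid.to_string()` is safe to treat as non-traversing regardless of the integer's magnitude.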
## [0.6.0] - 2026-05-02
A focused release that splits data-exfiltration off from SSRF and ships sinks for outbound HTTP request bodies across all 10 languages, with calibration tuned so plain user input echoed back upstream does not fire.
### Added
- New `taint-data-exfiltration` rule, separate from SSRF. Fires when a Sensitive-tier source (cookie, header, env, file, database, caught exception) reaches the body, headers, or json payload of an outbound HTTP call. Plain user input gets suppressed at emission time so a gateway echoing `req.body` back upstream is not flagged.
- Sinks ship for `fetch` body, `XMLHttpRequest.send`, Python `requests.post` and `httpx.AsyncClient.post`, Java JDK `HttpClient.send` with `BodyPublishers`, OkHttp builder chains, Apache HttpClient `execute`, RestTemplate, WebClient, Go `http.Post` and `http.NewRequest` + `Do`, Rust `reqwest`/`ureq`/`surf`/`hyper` body/json/form/multipart chains, Ruby `Net::HTTP.post` and RestClient, C and C++ `curl_easy_setopt(CURLOPT_POSTFIELDS, ...)` gated by the macro arg.
- Three suppression knobs:
- Sanitizer convention. `logEvent`, `forwardPayload`, `tracker.send`, `analytics.track`, `metrics.report`, `serializeForUpstream` are treated as `Sanitizer(data_exfil)` by default. Add your own with the standard custom-rule path.
- Trusted destination allowlist in `detectors.data_exfil.trusted_destinations`. Matched against the abstract-string domain prefix; a literal or template prefix that begins with one of these entries drops the cap.
- Detector toggle `detectors.data_exfil.enabled = false` strips the cap before emission. Other taint classes are unaffected.
- Calibration. Severity is High for cookie or env sources, Medium for header, file, database, or caught-exception sources. Confidence stays at Medium even with strong corroboration, drops to Low without abstract or symbolic backing, and drops one tier on path-validated flows. SARIF output carries a `properties.data_exfil_field` entry on data-exfil findings, set to the destination object-literal field the leak reached (`body`, `headers`, or `json`).
- Benchmark coverage. 13 vulnerable fixtures across 8 languages under `tests/benchmark/corpus/{lang}/data_exfil/` and 6 paired safe fixtures for the sensitivity gate and sanitizer convention. New `data_exfil` row in the per-class breakdown. Per-class CI floor at P, R, F1 ≥ 0.85 (current baseline is 1.000).
- Backwards taint walk recognises `Cap::DATA_EXFIL` and emits the same rule ID.
- Ruby SSRF coverage. `OpenURI.open_uri` now classified as an SSRF sink (the low-level fetcher that `URI.open` delegates to). Closes the CarrierWave CVE-2021-21288 download path and equivalent gem shapes that route through `OpenURI` directly.
- Ruby chained-call wrapper classification. Statement-level wrappers like `YAML.safe_load(File.read(filename))` and `Marshal.load(File.read(p))` now classify the inner sink for cross-function summary extraction. Without this, the outer call became a non-sink node and the inner sink was lost when the helper was summarised.
- Ruby CVE corpus. Vulnerable + patched fixtures added for CVE-2021-21288 (CarrierWave SSRF) and CVE-2023-38337 (rswag path traversal).
- Lodash `_.template` modeled as a gated `Cap::CODE_EXEC` sink. Activates on the template-string argument; suppresses when arg-1 carries a literal `{ evaluate: false }`. Closes Strapi CVE-2023-22621 (server-side template injection → RCE via `<% … %>` evaluate blocks). Vulnerable + patched fixtures added under `tests/benchmark/cve_corpus/javascript/CVE-2023-22621/`.
- JS/TS gated-sink kwarg extractor falls back to inspecting arg-1 object literals (`fn(x, { evaluate: false })`) when the language has no `keyword_argument` node. Required so the lodash gate can read its options object.
- Lodash double-call form (`_.template(t)(data)`) routes through `find_chained_inner_call` so the outer call's gated-sink rebinding fires.
- Cross-function helper-validation propagation. New `SsaFuncSummary.validated_params_to_return` field records parameter indices whose taint flow to the return value is fully validated by a dominating predicate (regex allowlist, type check, validation call) on every return path. At call sites, each tainted argument passed to a validated position, and the call's own return value, are marked `validated_must` / `validated_may` in the caller's SSA taint state, the same way an inline `if (!regex.test(x)) throw` would. Closes the helper-validator gap behind PayloadCMS CVE-2026-25544 (Drizzle SQL injection in `sanitizeValue`). Vulnerable + patched TypeScript fixtures added.
- Destructured-arg sibling expansion in per-parameter taint summary probing. JS/TS object-pattern formals (`({ column, operator, value }) => …`) now seed every binding sharing the slot, and any sibling reaching `validated_must` counts as the slot being validated. New `BodyMeta.param_destructured_fields` carries sibling lists alongside `params` and `param_types`. JS `PARAM_CONFIG` accepts `assignment_pattern` (default-value formals) and `object_pattern` (destructured formals).
- Regex-allowlist branch narrowing. `<X>.test(value)` / `<X>.match(value)` / `<X>.matches(value)` where the receiver name contains `regex` or `pattern` classifies as a `ValidationCall` and narrows the call's first argument, not the regex receiver. The same handling extends to `extract_validation_target`, so the surviving branch validates `value`, not the regex object. Motivated by Payload CVE-2026-25544 (`if (!SAFE_STRING_REGEX.test(value)) throw …`).
- TypeScript template-substring (`${fn(arg)}`) call-resolution arity-hint fallback. When CFG lowering drops `arg_uses` but `args` is non-empty, the resolver passes `None` so the unique-name fallback can still pick up the lone candidate.
- Caller-scope-entity exemption in `rs.auth.missing_ownership_check`. `<entity>.id` / `<entity>.pk` no longer fires when `<entity>` is a unit parameter named after a multi-tenant scope primitive: `organization` / `org`, `project`, `team`, `workspace`, `tenant`, `account`, `community`, `group`, `repository` / `repo`, `company`. Other field names (`.name`, `.slug`) still flag, and `user` / `member` / `actor` are deliberately excluded (handled by `is_actor_context_subject`). Closes a flood of FPs in Sentry / Saleor / Discourse / Mastodon-shaped multi-tenant helpers (`get_environments(request, organization)`, `_filter_releases_by_query(qs, organization, …)`).
- Auth value-ref walker recurses into the `value` child of `keyword_argument` / `keyword_arg` / `named_argument` nodes. `Model.objects.filter(organization_id=org.id)` no longer surfaces the kwarg key (`organization_id`) as a bare-identifier user-input subject. The schema column name is fixed at call time.
- Test-decorator denylist for Flask route extraction. `mock.patch`, `mock.patch.object` / `.dict` / `.multiple`, `unittest.mock.*`, `monkeypatch.setattr` / `setenv` / `delattr` / `delenv`, and `pytest.mark.parametrize` no longer collide with `<app>.patch` route registration. Stops every `@mock.patch("…")`-decorated test method from being attached as a Flask PATCH handler and flagged as `missing_ownership_check`.
- Typed-extractor route-level guard injection for axum and actix-web. Handlers registered via attribute macros (`#[get("/path")]`, `#[routes::path(…)]`) or via external service-config builders previously never had their typed-extractor guards seeded. New `apply_typed_extractor_guards_to_units` walks every `Function`-kind unit and injects guard checks from typed-extractor params, complementing the route-walk path that already covered `.route(...)` registration.
- New auth config key `policy_guard_names`. Typed-extractor wrappers that prove route-level capability/policy enforcement (e.g. meilisearch's `GuardedData<ActionPolicy<X>, _>`) are recognised distinctly from authentication-only wrappers. Matched as last-segment + case-insensitive `starts_with`. Rust default: `["Guarded"]`. Distinct from `login_guard_names` so the pattern doesn't pollute regular call recognition (a function like `guarded_load(..)` is not a login guard).
- Outer-wrapper-aware classification of typed extractors. `GuardedData<ActionPolicy<X>, Data<AuthController>>` is classified by the outer `GuardedData` (policy-bearing → `AuthCheckKind::Other`), not by whether an inner generic arg substring-matches `auth`. Bare data-only extractors (`Path<u64>`, `Query<X>`, `Json<X>`, `Form<X>`, `State<X>`, `Extension<X>`, `Data<X>`) outer-name-match early-return to `None` regardless of inner type tokens. Reference-marker (`&`, `&mut`, `&'a`) and module-path (`std::collections::`) prefixes stripped before matching.
- Project-level web-framework signal in Rust auth analysis. New `FrameworkContext::lang_has_web_framework(lang)` is three-valued: `Some(true)` when manifest names a framework, `Some(false)` when the manifest was inspected and named none, `None` when no manifest was inspected. New `rust_file_imports_web_framework` does a per-file `axum::` / `actix_web::` / `rocket::` / `axum_extra::` import probe (8 KB head). When the project's Cargo.toml is inspected and lists no Rust web framework AND the file does not directly import one, the `context_inputs` and param-name-heuristic arms of `unit_has_user_input_evidence` are suppressed. `RouteHandler` classification (concrete route-registration evidence) still bypasses the gate. Closes a flood of `missing_ownership_check` FPs in non-web Rust crates such as zed-style desktop / GUI codebases where a debug-session handle named `session` would trip `matches_session_context` on `session.update(cx, …)`. Currently Rust-only; other languages keep prior behavior (`None`).
- Rust auth corpus extended with `safe_actix_guarded_data_extractor.rs` and `unsafe_actix_no_guarded_data_extractor.rs` (typed-extractor guard injection); `safe_non_web_rust_project/` and `unsafe_actix_web_project_no_check/` (full Cargo.toml + src/lib.rs project shapes for the framework-signal gate).
- Python auth corpus extended with `vuln_user_id_param_no_auth.py`, `safe_django_orm_caller_scoped_entity.py` (caller-scope-entity exemption), `safe_mock_patch_test_method.py` (test-decorator denylist).
- Go safe corpus extended with `safe_inner_call_close_in_arg.go` (`require.NoError(t, f.Close())` shape), `safe_struct_field_resource_owned_by_struct.go` (field-LHS ownership transfer), and a `vuln_resource_leak_no_close.go` regression guard.
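The outer-wrapper-aware extractor classification and the `policy_guard_names` matching rule described above (last segment, case-insensitive `starts_with`) can be approximated as follows. This is a simplified sketch over type text: `ExtractorClass`, `outer_name`, and `classify` are hypothetical names, and the real analysis returns `AuthCheckKind` values rather than this enum.

```rust
#[derive(Debug, PartialEq)]
enum ExtractorClass {
    Policy,   // outer wrapper proves policy enforcement
    DataOnly, // bare data extractor, never a guard
    Unknown,  // left to other heuristics
}

// Data-only extractor names listed in the changelog entry.
const DATA_ONLY: [&str; 7] = ["Path", "Query", "Json", "Form", "State", "Extension", "Data"];

/// Strip `&`, `&mut`, `&'a`, generic arguments, and the module path,
/// leaving only the outermost type name.
fn outer_name(ty: &str) -> &str {
    let mut t = ty.trim();
    if let Some(rest) = t.strip_prefix('&') {
        t = rest.trim_start();
        if let Some(rest) = t.strip_prefix('\'') {
            // Skip the lifetime identifier.
            t = rest
                .trim_start_matches(|c: char| c.is_alphanumeric() || c == '_')
                .trim_start();
        }
        if let Some(rest) = t.strip_prefix("mut ") {
            t = rest.trim_start();
        }
    }
    // Drop generic arguments first so inner tokens cannot influence the result.
    let head = t.split('<').next().unwrap_or(t);
    head.rsplit("::").next().unwrap_or(head).trim()
}

fn classify(ty: &str, policy_guard_names: &[&str]) -> ExtractorClass {
    let outer = outer_name(ty);
    // Data-only extractors return early regardless of inner type tokens.
    if DATA_ONLY.iter().any(|d| *d == outer) {
        return ExtractorClass::DataOnly;
    }
    let lower = outer.to_ascii_lowercase();
    if policy_guard_names
        .iter()
        .any(|p| lower.starts_with(p.to_ascii_lowercase().as_str()))
    {
        return ExtractorClass::Policy;
    }
    ExtractorClass::Unknown
}
```

With the Rust default `["Guarded"]`, `GuardedData<ActionPolicy<X>, Data<AuthController>>` is classified by its outer `GuardedData`, while `Data<AuthController>` on its own stays data-only even though its inner argument mentions `Auth`.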
### Fixed (false positives)
- C++ `cpp.memory.reinterpret_cast` no longer fires when the target type is well-defined by C++ aliasing rules. Suppressed targets: byte-pointer family (`char*`, `unsigned char*`, `signed char*`, `wchar_t*`, `uint8_t*`, `int8_t*`, `std::byte*`, `byte*`), `void*`, integer round-trip (`uintptr_t`, `intptr_t`, and `std::` variants, no pointer required), and the BSD socket address family (`sockaddr*`, `struct sockaddr*`, `sockaddr_in*`, `sockaddr_in6*`, `sockaddr_un*`, `sockaddr_storage*`). User-defined struct or class pointer targets keep firing. Closes ~70% over-fire on serialization, hashing, IPC, and socket-API code where the cast is the standard-blessed idiom.
- PHP `php.crypto.md5` and `php.crypto.sha1` suppress when the call's consuming context yields a non-cryptographic identifier name. Recognised contexts: assignment LHS (variable, `$obj->property`, `$arr['key']`), array element keys, subscript indices, return statements (resolved to enclosing method or function name with `get` prefix stripped), and method-call arguments where the method is a key/cache/lookup verb (`get`, `set`, `has`, `delete`, `fetch`, `store`, `find`, `getItem`, `setItem`). Names containing a crypto keyword (`password`, `secret`, `token`, `signature`, `hmac`, `digest`, `salt`, `key`) keep firing. Closes ETag generation, cache-key hashing, dedup fingerprint, and `getCacheKey()`-style false positives in real PHP repos (phpmyadmin, nextcloud).
- JS and TS `secrets.fallback_secret` no longer fire on empty-string fallbacks (`process.env.X || ""`). Developers write `|| ""` to satisfy non-undefined string types without committing a real secret. Non-empty literal fallbacks still fire.
- Path-traversal sink suppression accepts canonicalised-and-rooted shapes. New `PathFact::is_path_traversal_safe` predicate clears `Cap::FILE_IO` when the path is dotdot-free and either non-absolute or carries a verified prefix-lock. New `OPAQUE_PREFIX_LOCK` marker records the structural invariant ("rooted under SOME prefix") when the `starts_with`-style guard's argument is a method call, field access, or configured root rather than a string literal. Closes the Ruby `File.expand_path + start_with?(root)` shape (rswag CVE-2023-38337 patched counterpart), the Python `os.path.realpath + .startswith(root)` shape, and the JS `path.resolve + .startsWith(root)` shape. `classify_path_assertion` extended to JS `.startsWith(...)`, Python `.startswith(...)`, Ruby `.start_with?(...)` (paren and paren-less), and Go `strings.HasPrefix(...)`.
- Branch narrowing now flips prefix-lock attachment under condition negation. For `if !target.startsWith(ROOT) { return; }` the lock attaches to the surviving block, not the rejection arm. Rejection-axis narrowing is unchanged because the rejection classifier is text-level and already accounts for leading `!`.
- Go field-LHS resource acquires no longer counted as local resource leaks. `b.cpuprof = os.Create(...)` transfers ownership to the containing struct; closure responsibility belongs to a paired `Stop()` / `Release()` method on the struct's lifecycle. Gated in both `state/transfer.rs::apply_call` and `cfg_analysis/resources.rs::run`. Restricted to Go (`Lang::Go` check). JS/TS class-field acquires (`this.fd = fs.openSync(...)`) keep being tracked because the leak fixtures rely on it. Production trigger: prometheus `cmd/promtool/tsdb.go::startProfiling` cluster (`b.cpuprof`, `b.memprof`, `b.blockprof`, `b.mtxprof`).
- Go inner-call release in argument position. `require.NoError(t, f.Close())`, `errs = append(errs, f.Close())`, JUnit `assertEquals(0, in.read())`: releases that live in argument position now mark the receiver `CLOSED`. Bare-receiver inner calls only (chained-receiver releases stay owned by `chain_proxies`); marks `CLOSED` only with no `DoubleClose` attribution; respects `in_defer` for symmetry.
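The canonicalised-and-rooted shape from the path-traversal entries above, including the negated guard `if !target.startsWith(ROOT) { return; }`, looks roughly like this in Rust. The sketch is lexical only and uses a hypothetical `resolve_under_root`; production code would usually canonicalise against the filesystem first, as the `realpath` / `expand_path` shapes in the entry do.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: join `user_path` onto `root` and return it only
/// if the result is dotdot-free and still rooted under `root`.
fn resolve_under_root(root: &Path, user_path: &str) -> Option<PathBuf> {
    // Note: joining an absolute `user_path` replaces `root` entirely.
    let candidate = root.join(user_path);
    // Dotdot-free requirement: refuse any `..` component.
    if candidate
        .components()
        .any(|c| matches!(c, Component::ParentDir))
    {
        return None;
    }
    // Negated prefix guard: this is the rejection arm, so the
    // prefix-lock belongs to the code after the `if`.
    if !candidate.starts_with(root) {
        return None;
    }
    Some(candidate)
}
```

`Path::starts_with` compares whole components, so a root of `/srv/data` does not accept `/srv/database/x`; a plain string-prefix check would.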
### Other
- Action download script warning for the mutable `latest` tag now references `v0.6.0` instead of `v0.5.0`.
## [0.5.0] - 2026-04-29
@ -35,7 +353,7 @@ The biggest release since launch. The taint engine was rebuilt on top of an SSA
- Direction-aware engine notes (`UnderReport`, `OverReport`, `Bail`) flow into confidence scoring, ranking, and the new `--require-converged` strict mode.
- Synthetic field-write inheritance: `u.Path = "/foo"` no longer drops taint carried by other fields of `u`. Fixes Owncast CVE-2023-3188 (SSRF).
- Phantom-Param-aware field suppression skips method/function references that share a base name with a tainted variable.
- Validation err-check narrowing for the two-statement Go idiom `_, err := strconv.Atoi(input); if err != nil { return }` `input` is marked validated on the surviving `err == nil` branch. - Validation err-check narrowing for the two-statement Go idiom `_, err := strconv.Atoi(input); if err != nil { return }`: `input` is marked validated on the surviving `err == nil` branch.
- Go: `strings.Replace` / `strings.ReplaceAll` recognised as a sanitizer when the OLD literal contains a known-dangerous payload (shell metachars, path-traversal, HTML, SQL) and the NEW literal does not reintroduce one.
- Go: literal-strip cap detection extended to shell metachars (`;`, `|`, `&`, `$`, backtick) and SQL metachars (`'`, `"`, `--`).
- Go: `interpreted_string_literal` / `raw_string_literal` handled in tree-sitter so const-string arg extraction works for Go's double-quoted and backtick forms.
@ -98,7 +416,7 @@ The biggest release since launch. The taint engine was rebuilt on top of an SSA
- Replaced the legacy `app.js` with a React + Vite + TypeScript SPA.
- Interactive graph workspace for CFG and call-graph views (Graphology + ELK + Sigma) with neighborhood reduction and a full-page inspector.
- Triage UI with database-backed decisions (true positive, false positive, deferred, suppressed) and `.nyx/triage.json` round-trip. - Triage UI with database-backed decisions (true positive, false positive, accepted risk, suppressed) and `.nyx/triage.json` round-trip.
- Scan history, rules management, and finding detail panels with evidence and flow visualization.
- Vitest browser-side test suite wired into CI.
- Bumped to React 19, Vite 8, TypeScript 6.0, ESLint 10, `@vitejs/plugin-react` 6, with aligned `@types/react*`.

@ -29,6 +29,8 @@ Please read our [Code of Conduct](CODE_OF_CONDUCT.md) before participating.
- **Rust 1.88+** (edition 2024)
- Git
- **Node 20+** — only if you touch the browser UI under `frontend/` (the
`nyx serve` web app). Pure-Rust changes do not need it.
### Building ### Building
@ -43,13 +45,29 @@ cargo install --path . # Install as `nyx` binary
### Running Quality Checks ### Running Quality Checks
The fastest way to reproduce CI locally is the bundled script — it runs the same
commands CI runs (fmt, Clippy, tests, and the frontend checks):
```bash ```bash
cargo test --bin nyx # Unit tests (inline in modules) ./scripts/check.sh # Mirror CI: fmt + clippy + tests (+ frontend)
cargo clippy --all -- -D warnings # Lint, treats warnings as errors ./scripts/check.sh --rust-only # Skip the frontend checks
cargo fmt # Format code ./scripts/fix.sh # Auto-fix: cargo fmt + clippy --fix + prettier/eslint
cargo fmt -- --check # Check formatting without modifying
``` ```
Or run the steps individually:
```bash
cargo test --all-features # Tests, incl. tests/ integration suite
cargo clippy --all-targets --all-features -- -D warnings # Lint, warnings = errors
cargo fmt # Format code
cargo fmt -- --check # Check formatting without modifying
```
> **Match CI exactly.** CI lints and tests with `--all-targets --all-features`.
> The older `cargo test --bin nyx` / `cargo clippy --all` commands skip the
> `tests/` integration suite and feature-gated code, so they can pass locally
> while CI fails. Prefer `./scripts/check.sh`.
> **Note**: The first build downloads and compiles tree-sitter grammars for all 10 languages. Subsequent builds are faster. > **Note**: The first build downloads and compiles tree-sitter grammars for all 10 languages. Subsequent builds are faster.
### Benchmarks ### Benchmarks
@ -64,6 +82,12 @@ Benchmark fixtures live in `benches/fixtures/`. Criterion produces HTML reports
## Project Layout ## Project Layout
> **New here?** [`docs/how-it-works.md`](docs/how-it-works.md) walks the analysis
> pipeline end to end (with a diagram), and [`docs/detectors/taint.md`](docs/detectors/taint.md)
> covers the taint engine. The easiest first contribution is usually a new AST
> pattern (see [below](#how-to-add-a-new-ast-pattern)) — small, self-contained,
> and well templated.
``` ```
src/ src/
main.rs CLI entry point main.rs CLI entry point
@ -260,12 +284,13 @@ Adding a new language requires changes across several modules. Use an existing l
## Testing ## Testing
### Unit Tests ### Tests
All tests are inline `#[test]` blocks inside source modules. Run them with: Unit tests are inline `#[test]` blocks inside source modules; integration tests
live under `tests/`. Run everything the way CI does:
```bash ```bash
cargo test --bin nyx cargo test --all-features
``` ```
### What to Test ### What to Test
@ -280,7 +305,7 @@ cargo test --bin nyx
CI runs Clippy with strict settings. Before submitting: CI runs Clippy with strict settings. Before submitting:
```bash ```bash
cargo clippy --all -- -D warnings cargo clippy --all-targets --all-features -- -D warnings
``` ```
--- ---
@ -293,10 +318,10 @@ First-time contributors are welcome. If you are unsure where to start, open an i
2. **Keep PRs focused**. One logical change per PR. 2. **Keep PRs focused**. One logical change per PR.
3. **Ensure CI passes**: 3. **Ensure CI passes** — run `./scripts/check.sh` (mirrors CI), or the steps individually:
```bash ```bash
cargo test --bin nyx cargo test --all-features
cargo clippy --all -- -D warnings cargo clippy --all-targets --all-features -- -D warnings
cargo fmt -- --check cargo fmt -- --check
``` ```
@ -340,7 +365,7 @@ We welcome well-motivated feature proposals. Please describe:
1. Update version in `Cargo.toml`. 1. Update version in `Cargo.toml`.
2. Update `CHANGELOG.md` with the new version section. 2. Update `CHANGELOG.md` with the new version section.
3. Run full test suite: `cargo test --bin nyx && cargo clippy --all -- -D warnings`. 3. Run full checks: `./scripts/check.sh` (or `cargo test --all-features && cargo clippy --all-targets --all-features -- -D warnings`).
4. Create a git tag: `git tag v0.x.y`. 4. Create a git tag: `git tag v0.x.y`.
5. Push tag: `git push origin v0.x.y`. 5. Push tag: `git push origin v0.x.y`.
6. CI builds release binaries and publishes to crates.io. 6. CI builds release binaries and publishes to crates.io.
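The contributing-guide changes above describe `./scripts/check.sh` as mirroring CI (fmt, Clippy, tests, then the frontend checks, with `--rust-only` skipping the last part). This diff does not include the script itself, so the following is only a minimal sketch of how such a wrapper might be organized, assuming the three cargo commands quoted in the guide. The function names `ci_steps` and `run_checks`, the `RUN` variable, and the frontend placeholder are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a CI-mirroring check script. This is NOT the
# contents of scripts/check.sh, which this diff does not show.
set -euo pipefail

# Print the check commands in the order they would run, one per line.
# Pass --rust-only to leave out the frontend placeholder step.
ci_steps() {
  local mode="${1:-}"
  echo "cargo fmt -- --check"
  echo "cargo clippy --all-targets --all-features -- -D warnings"
  echo "cargo test --all-features"
  if [ "$mode" != "--rust-only" ]; then
    # Placeholder: the real frontend commands are not listed in the source.
    echo "echo 'frontend checks would run here'"
  fi
}

# Run every step in order; set -e aborts the script at the first failure.
run_checks() {
  local step
  while IFS= read -r step; do
    echo "==> $step"
    bash -c "$step" </dev/null
  done < <(ci_steps "${1:-}")
}

# Dry run by default so the sketch is safe to paste; set RUN=1 to execute.
if [ "${RUN:-0}" = "1" ]; then
  run_checks "${1:-}"
else
  ci_steps "${1:-}"
fi
```

Keeping the command list in one function means the dry run and the real run cannot drift apart, which is the same property the guide wants between local checks and CI.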

247
Cargo.lock generated

@ -111,9 +111,9 @@ checksum = "7c02d123df017efcdfbd739ef81735b36c5ba83ec3c59c80a9d7ecc718f92e50"
[[package]] [[package]]
name = "assert_cmd" name = "assert_cmd"
version = "2.2.1" version = "2.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "39bae1d3fa576f7c6519514180a72559268dd7d1fe104070956cb687bc6673bd" checksum = "2aa3a22042e45de04255c7bf3626e239f450200fd0493c1e382263544b20aea6"
dependencies = [ dependencies = [
"anstyle", "anstyle",
"bstr", "bstr",
@ -126,9 +126,9 @@ dependencies = [
[[package]] [[package]]
name = "async-compression" name = "async-compression"
version = "0.4.41" version = "0.4.42"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d0f9ee0f6e02ffd7ad5816e9464499fba7b3effd01123b515c41d1697c43dad1" checksum = "e79b3f8a79cccc2898f31920fc69f304859b3bd567490f75ebf51ae1c792a9ac"
dependencies = [ dependencies = [
"compression-codecs", "compression-codecs",
"compression-core", "compression-core",
@ -144,9 +144,9 @@ checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0"
[[package]] [[package]]
name = "autocfg" name = "autocfg"
version = "1.5.0" version = "1.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8" checksum = "f2032f911046de80f0a198e0901378627c33f59ea0ac00e363d481118bd70a53"
[[package]] [[package]]
name = "axum" name = "axum"
@ -202,9 +202,9 @@ dependencies = [
[[package]] [[package]]
name = "bitflags" name = "bitflags"
version = "2.11.1" version = "2.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4512299f36f043ab09a583e57bceb5a5aab7a73db1805848e8fef3c9e8c78b3" checksum = "84d7ced0ae9557296835c32bf1b1e02b44c746701f898460fb000d7eaa84f00a"
[[package]] [[package]]
name = "blake3" name = "blake3"
@ -233,9 +233,9 @@ dependencies = [
[[package]] [[package]]
name = "bumpalo" name = "bumpalo"
version = "3.20.2" version = "3.20.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d20789868f4b01b2f2caec9f5c4e0213b41e3e5702a50157d699ae31ced2fcb" checksum = "72f5acc6cb2ba439de613abc23857ec3d78374d8ed5ac84e9d11336e87da8649"
[[package]] [[package]]
name = "bytes" name = "bytes"
@ -257,9 +257,9 @@ checksum = "37b2a672a2cb129a2e41c10b1224bb368f9f37a2b16b612598138befd7b37eb5"
[[package]] [[package]]
name = "cc" name = "cc"
version = "1.2.60" version = "1.2.63"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "43c5703da9466b66a946814e1adf53ea2c90f10063b86290cc9eb67ce3478a20" checksum = "556e016178bb5662a08681bbe0f00f8e17631781a4dfc8c45e466e4b185ec27f"
dependencies = [ dependencies = [
"find-msvc-tools", "find-msvc-tools",
"shlex", "shlex",
@ -284,9 +284,9 @@ dependencies = [
[[package]] [[package]]
name = "chrono" name = "chrono"
version = "0.4.44" version = "0.4.45"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c673075a2e0e5f4a1dde27ce9dee1ea4558c7ffe648f576438a20ca1d2acc4b0" checksum = "1aa79e62e7697b8e29b513a68abacf485adcd1fe8284a4316c5ae868e6633327"
dependencies = [ dependencies = [
"iana-time-zone", "iana-time-zone",
"num-traits", "num-traits",
@ -378,9 +378,9 @@ checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570"
[[package]] [[package]]
name = "compression-codecs" name = "compression-codecs"
version = "0.4.37" version = "0.4.38"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eb7b51a7d9c967fc26773061ba86150f19c50c0d65c887cb1fbe295fd16619b7" checksum = "ce2548391e9c1929c21bf6aa2680af86fe4c1b33e6cea9ac1cfeec0bd11218cf"
dependencies = [ dependencies = [
"compression-core", "compression-core",
"flate2", "flate2",
@ -389,9 +389,9 @@ dependencies = [
[[package]] [[package]]
name = "compression-core" name = "compression-core"
version = "0.4.31" version = "0.4.32"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75984efb6ed102a0d42db99afb6c1948f0380d1d91808d5529916e6c08b49d8d" checksum = "cc14f565cf027a105f7a44ccf9e5b424348421a1d8952a8fc9d499d313107789"
[[package]] [[package]]
name = "console" name = "console"
@ -447,7 +447,7 @@ dependencies = [
"ciborium", "ciborium",
"clap", "clap",
"criterion-plot", "criterion-plot",
"itertools", "itertools 0.13.0",
"num-traits", "num-traits",
"oorandom", "oorandom",
"page_size", "page_size",
@ -467,7 +467,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8d80a2f4f5b554395e47b5d8305bc3d27813bacb73493eb1001e8f76dae29ea" checksum = "d8d80a2f4f5b554395e47b5d8305bc3d27813bacb73493eb1001e8f76dae29ea"
dependencies = [ dependencies = [
"cast", "cast",
"itertools", "itertools 0.13.0",
] ]
[[package]] [[package]]
@ -512,9 +512,9 @@ checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
[[package]] [[package]]
name = "dashmap" name = "dashmap"
version = "6.1.0" version = "6.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5041cc499144891f3790297212f32a74fb938e5136a14943f338ef9e0ae276cf" checksum = "e6361d5c062261c78a176addb82d4c821ae42bed6089de0e12603cd25de2059c"
dependencies = [ dependencies = [
"cfg-if", "cfg-if",
"crossbeam-utils", "crossbeam-utils",
@ -562,9 +562,9 @@ dependencies = [
[[package]] [[package]]
name = "either" name = "either"
version = "1.15.0" version = "1.16.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719" checksum = "91622ff5e7162018101f2fea40d6ebf4a78bbe5a49736a2020649edf9693679e"
[[package]] [[package]]
name = "encode_unicode" name = "encode_unicode"
@ -637,6 +637,12 @@ dependencies = [
"num-traits", "num-traits",
] ]
[[package]]
name = "fnv"
version = "1.0.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3f9eec918d3f24069decb9af1554cad7c880e2da24a9afd88aca000531ab82c1"
[[package]] [[package]]
name = "foldhash" name = "foldhash"
version = "0.1.5" version = "0.1.5"
@ -741,6 +747,25 @@ dependencies = [
"regex-syntax", "regex-syntax",
] ]
[[package]]
name = "h2"
version = "0.4.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "171fefbc92fe4a4de27e0698d6a5b392d6a0e333506bc49133760b3bcf948733"
dependencies = [
"atomic-waker",
"bytes",
"fnv",
"futures-core",
"futures-sink",
"http",
"indexmap",
"slab",
"tokio",
"tokio-util",
"tracing",
]
[[package]] [[package]]
name = "half" name = "half"
version = "2.7.1" version = "2.7.1"
@ -778,15 +803,15 @@ dependencies = [
[[package]] [[package]]
name = "hashbrown" name = "hashbrown"
version = "0.17.0" version = "0.17.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4f467dd6dccf739c208452f8014c75c18bb8301b050ad1cfb27153803edb0f51" checksum = "ed5909b6e89a2db4456e54cd5f673791d7eca6732202bbf2a9cc504fe2f9b84a"
[[package]] [[package]]
name = "hashlink" name = "hashlink"
version = "0.11.0" version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ea0b22561a9c04a7cb1a302c013e0259cd3b4bb619f145b32f72b8b4bcbed230" checksum = "824e001ac4f3012dd16a264bec811403a67ca9deb6c102fc5049b32c4574b35f"
dependencies = [ dependencies = [
"hashbrown 0.16.1", "hashbrown 0.16.1",
] ]
@ -805,9 +830,9 @@ checksum = "fc0fef456e4baa96da950455cd02c081ca953b141298e41db3fc7e36b1da849c"
[[package]] [[package]]
name = "http" name = "http"
version = "1.4.0" version = "1.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e3ba2a386d7f85a81f119ad7498ebe444d2e22c2af0b86b069416ace48b3311a" checksum = "8be7462df143984c4598a256ef469b251d7d7f9e271135073e78fc535414f3d0"
dependencies = [ dependencies = [
"bytes", "bytes",
"itoa", "itoa",
@ -850,9 +875,9 @@ checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"
[[package]] [[package]]
name = "hyper" name = "hyper"
version = "1.9.0" version = "1.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6299f016b246a94207e63da54dbe807655bf9e00044f73ded42c3ac5305fbcca" checksum = "55281c53a1894c864990125767da440a4e630446785086f52523b20033b74498"
dependencies = [ dependencies = [
"atomic-waker", "atomic-waker",
"bytes", "bytes",
@ -915,9 +940,9 @@ checksum = "3d3067d79b975e8844ca9eb072e16b31c3c1c36928edf9c6789548c524d0d954"
[[package]] [[package]]
name = "ignore" name = "ignore"
version = "0.4.25" version = "0.4.26"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d3d782a365a015e0f5c04902246139249abf769125006fbe7649e2ee88169b4a" checksum = "b915661dd01db3f05050265b2477bcc6527b3792388e2749b41623cc592be67d"
dependencies = [ dependencies = [
"crossbeam-deque", "crossbeam-deque",
"globset", "globset",
@ -936,7 +961,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d466e9454f08e4a911e14806c24e16fba1b4c121d1ea474396f396069cf949d9" checksum = "d466e9454f08e4a911e14806c24e16fba1b4c121d1ea474396f396069cf949d9"
dependencies = [ dependencies = [
"equivalent", "equivalent",
"hashbrown 0.17.0", "hashbrown 0.17.1",
"serde", "serde",
"serde_core", "serde_core",
] ]
@ -969,6 +994,15 @@ dependencies = [
"either", "either",
] ]
[[package]]
name = "itertools"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b192c782037fadd9cfa75548310488aabdbf3d2da73885b31bd0abd03351285"
dependencies = [
"either",
]
[[package]] [[package]]
name = "itoa" name = "itoa"
version = "1.0.18" version = "1.0.18"
@ -977,10 +1011,12 @@ checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
[[package]] [[package]]
name = "js-sys" name = "js-sys"
version = "0.3.95" version = "0.3.99"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2964e92d1d9dc3364cae4d718d93f227e3abb088e747d92e0395bfdedf1c12ca" checksum = "142bc4740e452c1e57ade0cbc129f139c9093e354346f0872ef985f4f5cf5f11"
dependencies = [ dependencies = [
"cfg-if",
"futures-util",
"once_cell", "once_cell",
"wasm-bindgen", "wasm-bindgen",
] ]
@ -999,15 +1035,15 @@ checksum = "09edd9e8b54e49e587e4f6295a7d29c3ea94d469cb40ab8ca70b288248a81db2"
[[package]] [[package]]
name = "libc" name = "libc"
version = "0.2.185" version = "0.2.186"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52ff2c0fe9bc6cb6b14a0592c2ff4fa9ceb83eea9db979b0487cd054946a2b8f" checksum = "68ab91017fe16c622486840e4c83c9a37afeff978bd239b5293d61ece587de66"
[[package]] [[package]]
name = "libredox" name = "libredox"
version = "0.1.16" version = "0.1.17"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e02f3bb43d335493c96bf3fd3a321600bf6bd07ed34bc64118e9293bdffea46c" checksum = "f02ab6bace2054fb888a3c16f990117b579d14a3088e472d63c6011fa185c9d3"
dependencies = [ dependencies = [
"libc", "libc",
] ]
@ -1040,9 +1076,9 @@ dependencies = [
[[package]] [[package]]
name = "log" name = "log"
version = "0.4.29" version = "0.4.32"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897" checksum = "953f07c43838f8e6f9758cab68bf5bed85465e7587ebe0b823f1bcd81978ad3a"
[[package]] [[package]]
name = "matchers" name = "matchers"
@ -1061,9 +1097,9 @@ checksum = "47e1ffaa40ddd1f3ed91f717a33c8c0ee23fff369e3aa8772b9605cc1d22f4c3"
[[package]] [[package]]
name = "memchr" name = "memchr"
version = "2.8.0" version = "2.8.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" checksum = "6b947ae49db0d222b1dbc6b113ce7248a3fc3a6ca21b696717bfc000ba4484d8"
[[package]] [[package]]
name = "mime" name = "mime"
@ -1083,9 +1119,9 @@ dependencies = [
[[package]] [[package]]
name = "mio" name = "mio"
version = "1.2.0" version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "50b7e5b27aa02a74bac8c3f23f448f8d87ff11f92d3aac1a6ed369ee08cc56c1" checksum = "02bd0af71c67b473010cbbc60715ee815645a4dc942899111f494b4b737d6fda"
dependencies = [ dependencies = [
"libc", "libc",
"wasi", "wasi",
@ -1109,9 +1145,9 @@ dependencies = [
[[package]] [[package]]
name = "num-conv" name = "num-conv"
version = "0.2.1" version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c6673768db2d862beb9b39a78fdcb1a69439615d5794a1be50caa9bc92c81967" checksum = "521739c6d2bac4aa25192232afe6841231376b2b26d4d9fae5ecf8ca5772e441"
[[package]] [[package]]
name = "num-traits" name = "num-traits"
@ -1134,12 +1170,13 @@ dependencies = [
[[package]] [[package]]
name = "nyx-scanner" name = "nyx-scanner"
version = "0.5.0" version = "0.8.0"
dependencies = [ dependencies = [
"assert_cmd", "assert_cmd",
"axum", "axum",
"bitflags", "bitflags",
"blake3", "blake3",
"bytes",
"bytesize", "bytesize",
"chrono", "chrono",
"clap", "clap",
@ -1149,6 +1186,8 @@ dependencies = [
"dashmap", "dashmap",
"directories", "directories",
"glob", "glob",
"h2",
"http",
"ignore", "ignore",
"indicatif", "indicatif",
"num_cpus", "num_cpus",
@ -1157,11 +1196,13 @@ dependencies = [
"petgraph", "petgraph",
"phf", "phf",
"predicates", "predicates",
"prost",
"r2d2", "r2d2",
"r2d2_sqlite", "r2d2_sqlite",
"rayon", "rayon",
"rmp-serde", "rmp-serde",
"rusqlite", "rusqlite",
"rustc-hash",
"serde", "serde",
"serde_json", "serde_json",
"smallvec", "smallvec",
@ -1410,6 +1451,29 @@ dependencies = [
"unicode-ident", "unicode-ident",
] ]
[[package]]
name = "prost"
version = "0.14.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d2ea70524a2f82d518bce41317d0fae74151505651af45faf1ffbd6fd33f0568"
dependencies = [
"bytes",
"prost-derive",
]
[[package]]
name = "prost-derive"
version = "0.14.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "27c6023962132f4b30eb4c172c91ce92d933da334c59c23cddee82358ddafb0b"
dependencies = [
"anyhow",
"itertools 0.14.0",
"proc-macro2",
"quote",
"syn",
]
[[package]] [[package]]
name = "quote" name = "quote"
version = "1.0.45" version = "1.0.45"
@ -1438,9 +1502,9 @@ dependencies = [
[[package]] [[package]]
name = "r2d2_sqlite" name = "r2d2_sqlite"
version = "0.33.0" version = "0.34.0"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5576df16239e4e422c4835c8ed00be806d4491855c7847dba60b7aa8408b469b" checksum = "f9a289c0a3bf56505c470efa2366e76010f1d892e2492a2f96b223386d63b7e2"
dependencies = [ dependencies = [
"r2d2", "r2d2",
"rusqlite", "rusqlite",
@ -1554,9 +1618,9 @@ dependencies = [
[[package]] [[package]]
name = "rsqlite-vfs" name = "rsqlite-vfs"
version = "0.1.0" version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a8a1f2315036ef6b1fbacd1972e8ee7688030b0a2121edfc2a6550febd41574d" checksum = "c51c9ae4df8a7fba42103df5c621fa3c37eccf3a3c650879e90fc48b11cc192c"
dependencies = [ dependencies = [
"hashbrown 0.16.1", "hashbrown 0.16.1",
"thiserror", "thiserror",
@ -1577,6 +1641,12 @@ dependencies = [
"sqlite-wasm-rs", "sqlite-wasm-rs",
] ]
[[package]]
name = "rustc-hash"
version = "2.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94300abf3f1ae2e2b8ffb7b58043de3d399c73fa6f4b73826402a5c457614dbe"
[[package]] [[package]]
name = "rustix" name = "rustix"
version = "1.1.4" version = "1.1.4"
@ -1664,9 +1734,9 @@ dependencies = [
[[package]] [[package]]
name = "serde_json" name = "serde_json"
version = "1.0.149" version = "1.0.150"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86" checksum = "e8014e44b4736ed0538adeecded0fce2a272f22dc9578a7eb6b2d9993c74cfb9"
dependencies = [ dependencies = [
"indexmap", "indexmap",
"itoa", "itoa",
@ -1719,9 +1789,9 @@ dependencies = [
[[package]] [[package]]
name = "shlex" name = "shlex"
version = "1.3.0" version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64" checksum = "f8fadd59c855ef2080decdef8ff161eb6661b86933c9d82e5ba29dc602a55aba"
[[package]] [[package]]
name = "signal-hook-registry" name = "signal-hook-registry"
@ -1741,9 +1811,9 @@ checksum = "703d5c7ef118737c72f1af64ad2f6f8c5e1921f818cdcb97b8fe6fc69bf66214"
[[package]] [[package]]
name = "siphasher" name = "siphasher"
version = "1.0.2" version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b2aa850e253778c88a04c3d7323b043aeda9d3e30d5971937c1855769763678e" checksum = "8ee5873ec9cce0195efcb7a4e9507a04cd49aec9c83d0389df45b1ef7ba2e649"
[[package]] [[package]]
name = "slab" name = "slab"
@ -1762,9 +1832,9 @@ dependencies = [
[[package]] [[package]]
name = "socket2" name = "socket2"
version = "0.6.3" version = "0.6.4"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3a766e1110788c36f4fa1c2b71b387a7815aa65f88ce0229841826633d93723e" checksum = "52d1cfed4120b4d927bf7c0f86d2087a4a7d6027c906d9f9d525a80573b9be51"
dependencies = [ dependencies = [
"libc", "libc",
"windows-sys", "windows-sys",
@ -1772,9 +1842,9 @@ dependencies = [
[[package]] [[package]]
name = "sqlite-wasm-rs" name = "sqlite-wasm-rs"
version = "0.5.3" version = "0.5.5"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b2c760607300407ddeaee518acf28c795661b7108c75421303dbefb237d3a36" checksum = "dc3efc0da82635d7e1ced0053bbbfa8c7ab9645d0bf36ceb4f7127bb85315d75"
dependencies = [ dependencies = [
"cc", "cc",
"js-sys", "js-sys",
@ -1912,10 +1982,11 @@ dependencies = [
[[package]] [[package]]
name = "tokio" name = "tokio"
version = "1.52.1" version = "1.52.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b67dee974fe86fd92cc45b7a95fdd2f99a36a6d7b0d431a231178d3d670bbcc6" checksum = "8fc7f01b389ac15039e4dc9531aa973a135d7a4135281b12d7c1bc79fd57fffe"
dependencies = [ dependencies = [
"bytes",
"libc", "libc",
"mio", "mio",
"pin-project-lite", "pin-project-lite",
@ -2018,9 +2089,9 @@ dependencies = [
[[package]] [[package]]
name = "tower-http" name = "tower-http"
version = "0.6.8" version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d4e6559d53cc268e5031cd8429d05415bc4cb4aefc4aa5d6cc35fbf5b924a1f8" checksum = "4cfcf7e2740e6fc6d4d688b4ef00650406bb94adf4731e43c096c3a19fe40840"
dependencies = [ dependencies = [
"async-compression", "async-compression",
"bitflags", "bitflags",
@ -2127,9 +2198,9 @@ dependencies = [
[[package]] [[package]]
name = "tree-sitter" name = "tree-sitter"
version = "0.26.8" version = "0.26.9"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "887bd495d0582c5e3e0d8ece2233666169fa56a9644d172fc22ad179ab2d0538" checksum = "4dab76d0b724ba557954125188cf0633a1ca43199ced82d95c7b9c32cc3de1f3"
dependencies = [ dependencies = [
"cc", "cc",
"regex", "regex",
@ -2277,9 +2348,9 @@ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]] [[package]]
name = "uuid" name = "uuid"
version = "1.23.1" version = "1.23.2"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ddd74a9687298c6858e9b88ec8935ec45d22e8fd5e6394fa1bd4e99a87789c76" checksum = "d258b83ceec21034727ecee8c382cfa6c3e133699b0742c64571814fb420c9f7"
dependencies = [ dependencies = [
"getrandom 0.4.2", "getrandom 0.4.2",
"js-sys", "js-sys",
@ -2344,9 +2415,9 @@ dependencies = [
[[package]] [[package]]
name = "wasm-bindgen" name = "wasm-bindgen"
version = "0.2.118" version = "0.2.122"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0bf938a0bacb0469e83c1e148908bd7d5a6010354cf4fb73279b7447422e3a89" checksum = "3ed04576f974d2b2fba0f38c51dbc5518011e38c36bf1143164be765528fd409"
dependencies = [ dependencies = [
"cfg-if", "cfg-if",
"once_cell", "once_cell",
@ -2357,9 +2428,9 @@ dependencies = [
[[package]] [[package]]
name = "wasm-bindgen-macro" name = "wasm-bindgen-macro"
version = "0.2.118" version = "0.2.122"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eeff24f84126c0ec2db7a449f0c2ec963c6a49efe0698c4242929da037ca28ed" checksum = "916151b09da36bd82f6615cbf3a419e2f0ba23a03c6160e8e92eb6bd4aa1dec6"
dependencies = [ dependencies = [
"quote", "quote",
"wasm-bindgen-macro-support", "wasm-bindgen-macro-support",
@ -2367,9 +2438,9 @@ dependencies = [
[[package]] [[package]]
name = "wasm-bindgen-macro-support" name = "wasm-bindgen-macro-support"
version = "0.2.118" version = "0.2.122"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d08065faf983b2b80a79fd87d8254c409281cf7de75fc4b773019824196c904" checksum = "299047362ccbfce148b67ab7e73349f77748e00c8296f9542adfad2ad82c5c5e"
dependencies = [ dependencies = [
"bumpalo", "bumpalo",
"proc-macro2", "proc-macro2",
@ -2380,9 +2451,9 @@ dependencies = [
[[package]] [[package]]
name = "wasm-bindgen-shared" name = "wasm-bindgen-shared"
version = "0.2.118" version = "0.2.122"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5fd04d9e306f1907bd13c6361b5c6bfc7b3b3c095ed3f8a9246390f8dbdee129" checksum = "9a929b2c61f11ba3e9bc35b50c1f25cb38e0e892c0c231ae2b8cf78d5dad4437"
dependencies = [ dependencies = [
"unicode-ident", "unicode-ident",
] ]
@ -2423,9 +2494,9 @@ dependencies = [
[[package]] [[package]]
name = "web-sys" name = "web-sys"
version = "0.3.95" version = "0.3.99"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4f2dfbb17949fa2088e5d39408c48368947b86f7834484e87b73de55bc14d97d" checksum = "6d621441cfc37b84979402712047321980c178f299193a3589d05b99e8763436"
dependencies = [ dependencies = [
"js-sys", "js-sys",
"wasm-bindgen", "wasm-bindgen",
@ -2542,9 +2613,9 @@ dependencies = [
[[package]] [[package]]
name = "winnow" name = "winnow"
version = "1.0.2" version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2ee1708bef14716a11bae175f579062d4554d95be2c6829f518df847b7b3fdd0" checksum = "0592e1c9d151f854e6fd382574c3a0855250e1d9b2f99d9281c6e6391af352f1"
[[package]] [[package]]
name = "wit-bindgen" name = "wit-bindgen"
@ -2671,18 +2742,18 @@ dependencies = [
[[package]] [[package]]
name = "zerocopy" name = "zerocopy"
version = "0.8.48" version = "0.8.50"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eed437bf9d6692032087e337407a86f04cd8d6a16a37199ed57949d415bd68e9" checksum = "3b065d4f0e55f82fae73202e189638116a87c55ab6b8e6c2721e13dd9d854ad1"
dependencies = [ dependencies = [
"zerocopy-derive", "zerocopy-derive",
] ]
[[package]] [[package]]
name = "zerocopy-derive" name = "zerocopy-derive"
version = "0.8.48" version = "0.8.50"
source = "registry+https://github.com/rust-lang/crates.io-index" source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "70e3cd084b1788766f53af483dd21f93881ff30d7320490ec3ef7526d203bad4" checksum = "0b631b19d36a892ab55420c92dbc83ccd79274f25be714855d3074aa71cab639"
dependencies = [ dependencies = [
"proc-macro2", "proc-macro2",
"quote", "quote",


@ -1,14 +1,14 @@
[package] [package]
name = "nyx-scanner" name = "nyx-scanner"
version = "0.5.0" version = "0.8.0"
edition = "2024" edition = "2024"
rust-version = "1.88" rust-version = "1.88"
description = "A multi-language static analysis tool for detecting security vulnerabilities" description = "A multi-language static analysis tool for detecting security vulnerabilities"
license = "GPL-3.0-or-later" license = "GPL-3.0-or-later"
authors = ["Eli Peter <elicpeter@example.com>"] authors = ["Eli Peter <elicpeter@example.com>"]
homepage = "https://github.com/elicpeter/nyx" homepage = "https://nyxsec.dev/scanner"
repository = "https://github.com/elicpeter/nyx" repository = "https://github.com/elicpeter/nyx"
documentation = "https://elicpeter.github.io/nyx/" documentation = "https://nyxsec.dev/docs/nyx/"
keywords = ["security", "vulnerability", "scanner", "static-analysis", "cli"] keywords = ["security", "vulnerability", "scanner", "static-analysis", "cli"]
categories = ["security", "command-line-utilities", "development-tools", "parser-implementations", "text-processing"] categories = ["security", "command-line-utilities", "development-tools", "parser-implementations", "text-processing"]
readme = "README.md" readme = "README.md"
@ -33,12 +33,34 @@ pkg-url = "{ repo }/releases/download/v{ version }/nyx-{ target }{ archive-suffi
pkg-fmt = "zip" pkg-fmt = "zip"
bin-dir = "target/{ target }/release/{ bin }{ binary-ext }" bin-dir = "target/{ target }/release/{ bin }{ binary-ext }"
# docs.rs builds the `serve` feature (default) so the server module renders.
# `smt` is left off — bundled Z3 takes too long on docs.rs builders, and
# `smt-system-z3` needs a system library that isn't available there.
[package.metadata.docs.rs]
features = ["serve"]
rustdoc-args = ["--cfg", "docsrs"]
[features] [features]
default = ["serve"] default = ["serve", "dynamic"]
serve = ["dep:axum", "dep:tokio", "dep:tokio-stream", "dep:tower-http"] serve = ["dep:axum", "dep:tokio", "dep:tokio-stream", "dep:tower-http"]
smt = ["dep:z3", "z3/bundled"] smt = ["dep:z3", "z3/bundled"]
smt-system-z3 = ["dep:z3"] smt-system-z3 = ["dep:z3"]
docgen = [] docgen = []
# Dynamic verification layer: builds harnesses from findings, runs them in a
# sandbox, reports back whether the sink fires.
dynamic = ["dep:bytes", "dep:h2", "dep:http", "dep:prost", "dep:tempfile", "dep:tokio"]
# Phase 19 (Track E.3): the `nyx-image-builder` helper binary that builds
# and pins per-toolchain Docker images. Gated so that the default `nyx` build
# does not carry the extra TOML-writing logic, which only CI operators need.
image-builder = []
# Phase 20 (Track E.4): the firecracker VM backend. Off by default so
# the standard build pulls in zero Firecracker-related code; turning it
# on adds the `firecracker.rs` backend module and exposes
# `SandboxBackend::Firecracker` to callers. When the feature is on but
# the `firecracker` binary is absent on PATH, the backend returns
# `SandboxError::BackendUnavailable(SandboxBackend::Firecracker)` so the
# verifier can route around it cleanly.
firecracker = ["dynamic"]
[lib] [lib]
name = "nyx_scanner" name = "nyx_scanner"
@ -53,34 +75,44 @@ name = "nyx-docgen"
path = "tools/docgen/main.rs" path = "tools/docgen/main.rs"
required-features = ["docgen"] required-features = ["docgen"]
[[bin]]
name = "nyx-image-builder"
path = "tools/image-builder/main.rs"
required-features = ["image-builder"]
[[bench]] [[bench]]
name = "scan_bench" name = "scan_bench"
harness = false harness = false
[[bench]]
name = "dynamic_bench"
harness = false
required-features = []
[dev-dependencies] [dev-dependencies]
tempfile = "3.26.0" tempfile = "3.27.0"
criterion = { version = "0.8", features = ["html_reports"] } criterion = { version = "0.8.2", features = ["html_reports"] }
assert_cmd = "2" assert_cmd = "2.2.2"
predicates = "3" predicates = "3.1.4"
glob = "0.3" glob = "0.3.3"
tower = { version = "0.5", features = ["util"] } tower = { version = "0.5.3", features = ["util"] }
[dependencies] [dependencies]
directories = "6.0.0" directories = "6.0.0"
clap = { version = "4.5.60", features = ["derive"] } clap = { version = "4.6.1", features = ["derive"] }
serde = { version = "1.0.228", features = ["derive"] } serde = { version = "1.0.228", features = ["derive"] }
serde_json = "1.0" serde_json = "1.0.150"
rmp-serde = "1.3" rmp-serde = "1.3.1"
toml = "1.0.3" toml = "1.1.2"
tracing-subscriber = { version = "0.3.22", features = ["env-filter", "json", "ansi","time"] } tracing-subscriber = { version = "0.3.23", features = ["env-filter", "json", "ansi","time"] }
tracing = "0.1.44" tracing = "0.1.44"
num_cpus = "1.17.0" num_cpus = "1.17.0"
rusqlite = { version = "0.39.0", features = ["bundled"] } rusqlite = { version = "0.39.0", features = ["bundled"] }
r2d2_sqlite = { version = "0.33.0", features = ["bundled"] } r2d2_sqlite = { version = "0.34.0", features = ["bundled"] }
ignore = "0.4.25" ignore = "0.4.26"
tree-sitter = "0.26.6" tree-sitter = "0.26.9"
tree-sitter-rust = "0.24.0" tree-sitter-rust = "0.24.2"
tree-sitter-c = "0.24.1" tree-sitter-c = "0.24.2"
tree-sitter-cpp = "0.23.4" tree-sitter-cpp = "0.23.4"
tree-sitter-java = "0.23.5" tree-sitter-java = "0.23.5"
tree-sitter-typescript = "0.23.2" tree-sitter-typescript = "0.23.2"
@ -91,27 +123,42 @@ tree-sitter-python = "0.25.0"
tree-sitter-ruby = "0.23.1" tree-sitter-ruby = "0.23.1"
crossbeam-channel = "0.5.15" crossbeam-channel = "0.5.15"
blake3 = "1.8.5" blake3 = "1.8.5"
once_cell = "1.21.3" once_cell = "1.21.4"
console = "0.16.2" console = "0.16.3"
terminal_size = "0.4" terminal_size = "0.4.4"
rayon = "1.11.0" rayon = "1.12.0"
r2d2 = "0.8.10" r2d2 = "0.8.10"
bytesize = "2.3.1" bytesize = "2.3.1"
chrono = { version = "0.4.44", default-features = false, features = ["std", "clock", "serde"] } chrono = { version = "0.4.45", default-features = false, features = ["std", "clock", "serde"] }
thiserror = "2.0.18" thiserror = "2.0.18"
dashmap = "6.1.0" dashmap = "6.2.1"
parking_lot = "0.12" parking_lot = "0.12.5"
petgraph = { version = "0.8.3", features = ["serde-1"] } petgraph = { version = "0.8.3", features = ["serde-1"] }
bitflags = "2.11.0" bitflags = "2.12.1"
phf = { version = "0.13.1", features = ["macros"] } phf = { version = "0.13.1", features = ["macros"] }
indicatif = "0.18.4" indicatif = "0.18.4"
smallvec = { version = "1.15", features = ["serde"] } smallvec = { version = "1.15.1", features = ["serde"] }
uuid = { version = "1", features = ["v4"] } rustc-hash = "2.1.2"
axum = { version = "0.8", optional = true } uuid = { version = "1.23.2", features = ["v4"] }
tokio = { version = "1", features = ["rt-multi-thread", "macros", "signal", "sync"], optional = true } axum = { version = "0.8.9", optional = true }
tokio-stream = { version = "0.1", features = ["sync"], optional = true } bytes = { version = "1.11.1", optional = true }
tower-http = { version = "0.6", features = ["cors", "compression-gzip", "trace", "set-header", "limit"], optional = true } h2 = { version = "0.4.14", optional = true }
http = { version = "1.4.1", optional = true }
prost = { version = "0.14.3", optional = true }
tokio = { version = "1.52.3", features = ["rt-multi-thread", "macros", "signal", "sync", "net", "io-util"], optional = true }
tokio-stream = { version = "0.1.18", features = ["sync"], optional = true }
tower-http = { version = "0.6.11", features = ["cors", "compression-gzip", "trace", "set-header", "limit"], optional = true }
z3 = { version = "0.20.0", optional = true} z3 = { version = "0.20.0", optional = true}
tempfile = { version = "3.27.0", optional = true }
[lints.clippy]
# Allowed project-wide instead of per-file. The vast majority of
# `collapsible_if` hits are `if let Some(x) = .. { if cond { .. } }` patterns
# whose only "fix" is to collapse into a let-chain, which hurts readability on
# the complex extractor expressions throughout the engine. Keeping the decision
# here means the rationale lives in one place and new files inherit it
# automatically rather than re-declaring `#![allow(clippy::collapsible_if)]`.
collapsible_if = "allow"
[profile.release] [profile.release]
lto = true lto = true

LICENSE-GRANTS.md Normal file

@ -0,0 +1,89 @@
# Internal License Grants
This file records dual-licensing grants the copyright holder of Nyx has issued
beyond the public GPL-3.0-or-later release.
Nyx ships publicly under GPL-3.0-or-later. That license continues to apply to
every public release on GitHub, crates.io, and any other channel. The grants
recorded here are separate, private licenses from the copyright holder to
specific projects. They do not modify the public GPL terms and they are not
transferable to third parties.
The right to issue these grants is preserved in `CLA.md` Section 4
(Relicensing Right):
> [The contributor] grants the Project and any entity that maintains or
> succeeds it the right to relicense Your Contribution, in whole or in part,
> under terms other than the Project's current license (currently
> GPL-3.0-or-later), where necessary to support the long-term sustainability,
> distribution, and evolution of the Project.
The copyright holder is the sole author of every Contribution to Nyx
(verifiable via `git log`). The CLA covers any future external Contributions.
The copyright holder may therefore grant any party, including projects owned
by the same copyright holder, a license to use Nyx under terms other than
GPL-3.0-or-later, without affecting the public GPL release.
## How forks are affected
A third-party fork of nyx-agent that obtains the nyx-agent source under PolyForm
Small Business 1.0.0 (or any successor source-available license) does not
acquire any rights to Nyx beyond the public GPL-3.0-or-later terms. The
internal grant below is project-to-project and non-transferable. Anyone
redistributing a binary that statically or dynamically links the `nyx` crate
must comply with the GPL on the `nyx` portion of the work. The GPL's copyleft
obligations attach on distribution. Only the copyright holder may issue further
dual-licensing grants.
---
## Grant Register
### Grant 1: nyx-agent
| Field | Value |
|---|---|
| Grantor | Eli Peter, sole copyright holder of Nyx as of the effective date |
| Grantee | The nyx-agent project (`nyx-agent` daemon, web UI, and accompanying tooling). Repository: `nyx-agent` |
| Effective date | 2026-05-17 |
| Scope | All Nyx source code, documentation, fixtures, build artefacts, and binaries (the "Licensed Material") in any version released as of the effective date or thereafter, plus any future modifications the Grantor authors or accepts under the CLA |
| Permitted uses | (a) static or dynamic linking of the Licensed Material into the nyx-agent daemon; (b) modification of the Licensed Material as required for nyx-agent integration; (c) redistribution of the Licensed Material as part of the nyx-agent distribution; (d) sublicensing the Licensed Material to end users of nyx-agent solely under whatever license terms nyx-agent itself is distributed under (currently PolyForm Small Business 1.0.0, or a separately negotiated commercial license) |
| Restrictions | (a) this grant does not modify, supersede, or revoke the public GPL-3.0-or-later release of Nyx; (b) this grant is non-transferable; only the nyx-agent project, owned by the Grantor, may exercise it; (c) any third-party fork of nyx-agent must obtain Nyx under the public GPL terms unless it negotiates a separate grant from the Grantor; (d) attribution of Nyx authorship must be preserved in any redistribution per the CLA's moral-rights waiver |
| Duration | Perpetual and irrevocable, subject only to the Grantee remaining under the ownership or control of the Grantor. If the nyx-agent project is sold, assigned, or otherwise transferred to a third party, this grant terminates and the new owner must negotiate a separate license |
| Sublicensing of the grant itself | Not permitted. The Grantee may distribute Nyx as part of nyx-agent to end users under nyx-agent's outward terms, but the Grantee may not grant any other project the right to use Nyx outside the public GPL terms |
| Governing law | Same as Nyx CLA |
---
## Adding future grants
New grants follow the same format as Grant 1. Append a new section
(`### Grant N: <recipient name>`) below the existing entries and commit to
the Nyx repository. Grants are append-only. Revisions land as superseding
entries with their own date, not as edits to the original.
Grants the Grantor anticipates issuing in the future include:
- Commercial-license SKU grants to individual customers of nyx-agent that
exceed the PolyForm Small Business threshold. These will be issued
per-customer under a separate Nyx Commercial License contract.
- Stewardship-transition grants if the project is ever handed off (for
example, to a foundation). These would be a single grant to the receiving
entity.
The Grantor reserves the right to refuse to issue any grant.
---
## What this file is NOT
- It is not a redistribution license. Third parties cannot rely on it to use
Nyx outside the public GPL terms.
- It is not a Contributor License Agreement. `CLA.md` covers contribution
terms separately.
- It is not a public-facing license file. The canonical public license for
Nyx is `LICENSE` (GPL-3.0-or-later).
---
Copyright (c) 2026 Eli Peter. All rights reserved.

README.md

@ -1,13 +1,15 @@
<div align="center"> <div align="center">
<img src="assets/nyx-wordmark.svg" alt="nyx" height="110"/> <img src="assets/nyx-readme-header.png" alt="NYX" width="640"/>
**A local-first security scanner with a browser UI. Scan your repo and triage in your browser, with no cloud and no account.** **A local-first security scanner with sandboxed dynamic verification and a browser UI. Scan your repo and triage in your browser, with no cloud and no account.**
[![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner) [![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org) [![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org)
[![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions) [![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions)
[![Docs](https://img.shields.io/badge/docs-elicpeter.github.io%2Fnyx-blue)](https://elicpeter.github.io/nyx/) [![Docs](https://img.shields.io/badge/docs-nyxscan.dev%2Fdocs-blue)](https://nyxscan.dev/docs/)
English · [简体中文](./README.zh-CN.md)
</div> </div>
<p align="center"><img src="assets/screenshots/demo.gif" alt="Nyx UI walkthrough: empty Welcome state, kicking off a scan, the populated overview with Health Score, drilling into a HIGH finding's flow visualizer, then the triage flow" width="900"/></p> <p align="center"><img src="assets/screenshots/demo.gif" alt="Nyx UI walkthrough: empty Welcome state, kicking off a scan, the populated overview with Health Score, drilling into a HIGH finding's flow visualizer, then the triage flow" width="900"/></p>
@ -16,7 +18,7 @@
## Scan locally, browse locally ## Scan locally, browse locally
Nyx runs a cross-language taint analysis on your repository, then serves the results to a React UI bound to `127.0.0.1`. You get a finding list with severity, evidence, and a step-by-step **flow visualiser** that walks the dataflow from source → sanitizer → sink. Triage decisions persist to `.nyx/triage.json`, which commits alongside your code so the team shares one triage state. Nyx runs cross-language taint analysis on your repository, then verifies Medium or higher confidence findings by running small sandboxed harnesses against the real code. Results are served to a React UI bound to `127.0.0.1`. You get severity, static evidence, dynamic verdicts, and a step-by-step **flow visualiser** that walks the dataflow from source → sanitizer → sink. Triage decisions persist to `.nyx/triage.json`, which commits alongside your code so the team shares one triage state.
```bash ```bash
cargo install nyx-scanner cargo install nyx-scanner
@ -24,7 +26,7 @@ nyx scan # runs the analyzer, caches findings in .nyx/
nyx serve # opens http://localhost:9700 in your browser nyx serve # opens http://localhost:9700 in your browser
``` ```
Everything stays on your machine: loopback-only bind, host-header enforcement, CSRF on every mutation, no telemetry, no login. Everything stays on your machine: loopback-only bind, host-header enforcement, CSRF on every mutation, no remote telemetry, no login.
<p align="center"><img src="assets/screenshots/overview.png" alt="Overview dashboard for a small JS app: Health Score C 78 with the five-component breakdown (Severity pressure, Confidence quality, Trend, Triage coverage, Regression resistance), 3 findings detected, OWASP A03 and A02 buckets, confidence distribution and issue category bars, top affected files" width="900"/></p> <p align="center"><img src="assets/screenshots/overview.png" alt="Overview dashboard for a small JS app: Health Score C 78 with the five-component breakdown (Severity pressure, Confidence quality, Trend, Triage coverage, Regression resistance), 3 findings detected, OWASP A03 and A02 buckets, confidence distribution and issue category bars, top affected files" width="900"/></p>
@ -36,7 +38,7 @@ Everything stays on your machine: loopback-only bind, host-header enforcement, C
|---|---| |---|---|
| **Overview** | Dashboard: finding counts by severity, top offenders, engine profile summary | | **Overview** | Dashboard: finding counts by severity, top offenders, engine profile summary |
| **Findings** | Browsable list with severity badges, triage status, rule filter, language filter | | **Findings** | Browsable list with severity badges, triage status, rule filter, language filter |
| **Finding detail** | Flow-path visualiser with numbered steps (source → sanitizer → sink), code snippets, evidence, cross-file markers, triage dropdown | | **Finding detail** | Flow-path visualiser with numbered steps (source → sanitizer → sink), dynamic verdicts, code snippets, evidence, cross-file markers, triage dropdown |
| **Triage** | Bulk update states (open, investigating, fixed, false_positive, accepted_risk, suppressed), audit trail, import/export JSON | | **Triage** | Bulk update states (open, investigating, fixed, false_positive, accepted_risk, suppressed), audit trail, import/export JSON |
| **Explorer** | File tree with per-file symbol list and finding overlay | | **Explorer** | File tree with per-file symbol list and finding overlay |
| **Scans** | Run history, metrics, diff two scans to see what changed | | **Scans** | Run history, metrics, diff two scans to see what changed |
@ -44,7 +46,7 @@ Everything stays on your machine: loopback-only bind, host-header enforcement, C
| **Config** | Live config editor; reload without restart | | **Config** | Live config editor; reload without restart |
`nyx serve` flags: `--port <N>` (default `9700`), `--host <addr>` (loopback only: `127.0.0.1`, `localhost`, or `::1`), `--no-browser`. See `[server]` in `nyx.conf` for persistent settings, and the [Browser UI guide](https://elicpeter.github.io/nyx/serve.html) for the page-by-page UI tour and security model. `nyx serve` flags: `--port <N>` (default `9700`), `--host <addr>` (loopback only: `127.0.0.1`, `localhost`, or `::1`), `--no-browser`. See `[server]` in `nyx.conf` for persistent settings, and the [Browser UI guide](https://nyxscan.dev/docs/serve.html) for the page-by-page UI tour and security model.
--- ---
@ -52,7 +54,7 @@ Everything stays on your machine: loopback-only bind, host-header enforcement, C
The same engine runs headless for CI pipelines. SARIF output uploads directly to GitHub Code Scanning. The same engine runs headless for CI pipelines. SARIF output uploads directly to GitHub Code Scanning.
<p align="center"><img src="assets/screenshots/cli-scan.png" alt="nyx scan console output: HIGH taint findings across a JS and Python file with source → sink arrows" width="820"/></p> <p align="center"><img src="assets/screenshots/cli-scan.gif" alt="nyx scan console output: HIGH taint findings across a JS and Python file with source → sink arrows" width="820"/></p>
```bash ```bash
# Fail the job on medium or higher, emit SARIF # Fail the job on medium or higher, emit SARIF
@ -69,12 +71,12 @@ nyx scan --mode ast
nyx scan --engine-profile deep nyx scan --engine-profile deep
``` ```
Forward cross-file taint runs in every profile. Symex and the demand-driven backwards walk are opt-in. Turn them on either via `--engine-profile deep`, or individually (`--symex`, `--backwards-analysis`). See the [CLI reference](https://elicpeter.github.io/nyx/cli.html#engine-depth-profile) for the full toggle matrix. Forward cross-file taint runs in every profile. Symex and the demand-driven backwards walk are opt-in. Turn them on either via `--engine-profile deep`, or individually (`--symex`, `--backwards-analysis`). See the [CLI reference](https://nyxscan.dev/docs/cli.html#engine-depth-profile) for the full toggle matrix.
### GitHub Action ### GitHub Action
```yaml ```yaml
- uses: elicpeter/nyx@v0.5.0 - uses: elicpeter/nyx@v0.8.0
with: with:
format: sarif format: sarif
fail-on: MEDIUM fail-on: MEDIUM
@ -115,15 +117,15 @@ Requires stable Rust 1.88+. The frontend is compiled and embedded in the binary
## Languages ## Languages
All 10 languages parse via tree-sitter and run through the full pipeline, but rule depth and engine coverage are uneven. Benchmark F1 on the 433-case corpus at [`tests/benchmark/ground_truth.json`](tests/benchmark/ground_truth.json) is 100% for nine of ten languages and 94.1% for Go, so F1 alone no longer separates the tiers. Tiering reflects rule depth, gated-sink coverage, and structural idioms the synthetic corpus does not fully stress: All 10 languages parse via tree-sitter and run through the full pipeline, but rule depth and engine coverage are uneven. Benchmark F1 on the synthetic corpus at [`tests/benchmark/ground_truth.json`](tests/benchmark/ground_truth.json) is 100% across all ten languages at the last measured baseline (see [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md)), so F1 alone no longer separates the tiers. Tiering reflects rule depth, gated-sink coverage, and structural idioms the synthetic corpus does not fully stress:
| Tier | Languages | F1 | Use as a CI gate? | | Tier | Languages | F1 | Use as a CI gate? |
|---|---|---|---| |---|---|---|---|
| **Stable** | Python, JavaScript, TypeScript | 100% | Yes | | **Stable** | Python, JavaScript, TypeScript | 100% | Yes |
| **Beta** | Java, PHP, Ruby, Rust, Go | 94.1% to 100% | Yes, with light FP triage | | **Beta** | Java, PHP, Ruby, Rust, Go | 100% | Yes, with light FP triage |
| **Preview** | C, C++ | 100% on synthetic corpus | No. STL container flow, builder chains, and inline class member functions are tracked, but deep pointer aliasing and function pointers are not. Pair with clang-tidy or Clang Static Analyzer | | **Preview** | C, C++ | 100% on synthetic corpus | No. STL container flow, builder chains, and inline class member functions are tracked, but deep pointer aliasing and function pointers are not. Pair with clang-tidy or Clang Static Analyzer |
Aggregate rule-level F1: 99.8% (P=0.995, R=1.000). All real-CVE fixtures fire; the single open FP is `go-safe-009`. Per-dimension detail and known blind spots live on the [Language maturity page](https://elicpeter.github.io/nyx/language-maturity.html). All real-CVE fixtures fire and the corpus carries zero open FPs at the recorded baseline (P=R=F1=1.000). Per-dimension detail and known blind spots live on the [Language maturity page](https://nyxscan.dev/docs/language-maturity.html).
### Validated against real CVEs ### Validated against real CVEs
@ -134,36 +136,92 @@ The corpus also holds a small set of vulnerable/patched pairs extracted from pub
| [CVE-2023-48022](https://nvd.nist.gov/vuln/detail/CVE-2023-48022) | Ray | Python | Command injection | | [CVE-2023-48022](https://nvd.nist.gov/vuln/detail/CVE-2023-48022) | Ray | Python | Command injection |
| [CVE-2017-18342](https://nvd.nist.gov/vuln/detail/CVE-2017-18342) | PyYAML | Python | Deserialization | | [CVE-2017-18342](https://nvd.nist.gov/vuln/detail/CVE-2017-18342) | PyYAML | Python | Deserialization |
| [CVE-2019-14939](https://nvd.nist.gov/vuln/detail/CVE-2019-14939) | mongo-express | JavaScript | Code execution (`eval`) | | [CVE-2019-14939](https://nvd.nist.gov/vuln/detail/CVE-2019-14939) | mongo-express | JavaScript | Code execution (`eval`) |
| [CVE-2023-22621](https://nvd.nist.gov/vuln/detail/CVE-2023-22621) | Strapi | JavaScript | Code execution (SSTI) |
| [CVE-2025-64430](https://nvd.nist.gov/vuln/detail/CVE-2025-64430) | Parse Server | JavaScript | SSRF | | [CVE-2025-64430](https://nvd.nist.gov/vuln/detail/CVE-2025-64430) | Parse Server | JavaScript | SSRF |
| [CVE-2023-26159](https://nvd.nist.gov/vuln/detail/CVE-2023-26159) | follow-redirects | TypeScript | SSRF | | [CVE-2023-26159](https://nvd.nist.gov/vuln/detail/CVE-2023-26159) | follow-redirects | TypeScript | SSRF |
| [GHSA-4x48-cgf9-q33f](https://github.com/advisories/GHSA-4x48-cgf9-q33f) | Novu | TypeScript | SSRF |
| [CVE-2026-25544](https://nvd.nist.gov/vuln/detail/CVE-2026-25544) | Payload CMS | TypeScript | SQL injection |
| [CVE-2022-30323](https://nvd.nist.gov/vuln/detail/CVE-2022-30323) | hashicorp/go-getter | Go | Command injection | | [CVE-2022-30323](https://nvd.nist.gov/vuln/detail/CVE-2022-30323) | hashicorp/go-getter | Go | Command injection |
| [CVE-2024-31450](https://nvd.nist.gov/vuln/detail/CVE-2024-31450) | owncast | Go | Path traversal | | [CVE-2024-31450](https://nvd.nist.gov/vuln/detail/CVE-2024-31450) | owncast | Go | Path traversal |
| [CVE-2023-3188](https://nvd.nist.gov/vuln/detail/CVE-2023-3188) | owncast | Go | SSRF | | [CVE-2023-3188](https://nvd.nist.gov/vuln/detail/CVE-2023-3188) | owncast | Go | SSRF |
| [CVE-2026-41422](https://github.com/daptin/daptin/security/advisories/GHSA-rw2c-8rfq-gwfv) | daptin | Go | SQL injection |
| [CVE-2015-7501](https://nvd.nist.gov/vuln/detail/CVE-2015-7501) | Apache Commons Collections | Java | Deserialization | | [CVE-2015-7501](https://nvd.nist.gov/vuln/detail/CVE-2015-7501) | Apache Commons Collections | Java | Deserialization |
| [CVE-2017-12629](https://nvd.nist.gov/vuln/detail/CVE-2017-12629) | Apache Solr | Java | Command injection | | [CVE-2017-12629](https://nvd.nist.gov/vuln/detail/CVE-2017-12629) | Apache Solr | Java | Command injection |
| [CVE-2022-1471](https://nvd.nist.gov/vuln/detail/CVE-2022-1471) | SnakeYAML | Java | Deserialization |
| [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) | Apache Commons Text | Java | Code execution |
| [GHSA-h8cj-hpmg-636v](https://github.com/advisories/GHSA-h8cj-hpmg-636v) | Appsmith | Java | SQL injection |
| [CVE-2013-0156](https://nvd.nist.gov/vuln/detail/CVE-2013-0156) | Ruby on Rails | Ruby | Deserialization | | [CVE-2013-0156](https://nvd.nist.gov/vuln/detail/CVE-2013-0156) | Ruby on Rails | Ruby | Deserialization |
| [CVE-2020-8130](https://nvd.nist.gov/vuln/detail/CVE-2020-8130) | Rake | Ruby | Command injection | | [CVE-2020-8130](https://nvd.nist.gov/vuln/detail/CVE-2020-8130) | Rake | Ruby | Command injection |
| [CVE-2021-21288](https://nvd.nist.gov/vuln/detail/CVE-2021-21288) | CarrierWave | Ruby | SSRF |
| [CVE-2023-38337](https://nvd.nist.gov/vuln/detail/CVE-2023-38337) | rswag-api | Ruby | Path traversal |
| [CVE-2017-9841](https://nvd.nist.gov/vuln/detail/CVE-2017-9841) | PHPUnit | PHP | Code execution (`eval`) | | [CVE-2017-9841](https://nvd.nist.gov/vuln/detail/CVE-2017-9841) | PHPUnit | PHP | Code execution (`eval`) |
| [CVE-2018-15133](https://nvd.nist.gov/vuln/detail/CVE-2018-15133) | Laravel | PHP | Deserialization | | [CVE-2018-15133](https://nvd.nist.gov/vuln/detail/CVE-2018-15133) | Laravel | PHP | Deserialization |
| [CVE-2018-20997](https://nvd.nist.gov/vuln/detail/CVE-2018-20997) | tar-rs | Rust | Path traversal |
| [CVE-2022-36113](https://nvd.nist.gov/vuln/detail/CVE-2022-36113) | cargo | Rust | Path traversal |
| [CVE-2024-24576](https://nvd.nist.gov/vuln/detail/CVE-2024-24576) | Rust stdlib | Rust | Command injection |
| [CVE-2023-42456](https://rustsec.org/advisories/RUSTSEC-2023-0069.html) | sudo-rs | Rust | Path traversal |
| [CVE-2024-32884](https://rustsec.org/advisories/RUSTSEC-2024-0335.html) | gitoxide | Rust | Command injection |
| [CVE-2025-53549](https://rustsec.org/advisories/RUSTSEC-2025-0043.html) | matrix-rust-sdk | Rust | SQL injection |
| [CVE-2016-3714](https://nvd.nist.gov/vuln/detail/CVE-2016-3714) | ImageMagick (ImageTragick) | C | Command injection | | [CVE-2016-3714](https://nvd.nist.gov/vuln/detail/CVE-2016-3714) | ImageMagick (ImageTragick) | C | Command injection |
| [CVE-2019-18634](https://nvd.nist.gov/vuln/detail/CVE-2019-18634) | sudo (pwfeedback) | C | Memory safety | | [CVE-2019-18634](https://nvd.nist.gov/vuln/detail/CVE-2019-18634) | sudo (pwfeedback) | C | Memory safety |
| [CVE-2019-13132](https://nvd.nist.gov/vuln/detail/CVE-2019-13132) | ZeroMQ libzmq | C++ | Memory safety | | [CVE-2019-13132](https://nvd.nist.gov/vuln/detail/CVE-2019-13132) | ZeroMQ libzmq | C++ | Memory safety |
| [CVE-2022-1941](https://nvd.nist.gov/vuln/detail/CVE-2022-1941) | Protocol Buffers | C++ | Memory safety | | [CVE-2022-1941](https://nvd.nist.gov/vuln/detail/CVE-2022-1941) | Protocol Buffers | C++ | Memory safety |
| [CVE-2025-69662](https://nvd.nist.gov/vuln/detail/CVE-2025-69662) | geopandas | Python | SQL injection |
| [CVE-2026-33626](https://nvd.nist.gov/vuln/detail/CVE-2026-33626) | LMDeploy | Python | SSRF |
Fixtures live under [`tests/benchmark/cve_corpus/`](tests/benchmark/cve_corpus/) with upstream attribution headers. Fixtures live under [`tests/benchmark/cve_corpus/`](tests/benchmark/cve_corpus/) with upstream attribution headers.
<!--
### Real-world findings
- **Nextcloud server**, [PR #59979](https://github.com/nextcloud/server/pull/59979), merged. The runtime decoder for this column already restricted `allowed_classes`, but the repair routine called `unserialize()` without it, so magic methods on referenced classes could still run. Fix matches the runtime path.
-->
--- ---
## How it works ## How it works
Two passes over the filesystem, with an optional SQLite index to skip unchanged files: Two passes over the filesystem, with an optional SQLite index to skip unchanged files:
```mermaid
flowchart LR
Repo["Repository files"] --> Pass1["Pass 1 per file<br/>tree-sitter, CFG, SSA"]
Pass1 --> Summaries["Function summaries<br/>sources, sinks, sanitizers, points-to"]
Summaries --> Index["SQLite index<br/>optional incremental cache"]
Index --> Pass2["Pass 2 cross-file<br/>global summaries, k=1 inline, SCC fixpoint"]
Pass2 --> Rank["Rank and dedupe<br/>severity, evidence, exploitability"]
Rank --> Verify["Dynamic verification<br/>sandboxed harnesses, verdicts"]
Verify --> Output["Console, JSON, SARIF<br/>and browser UI"]
```
1. **Pass 1**: parse each file via tree-sitter, build an intra-procedural CFG (petgraph), lower to pruned SSA (Cytron phi insertion over dominance frontiers), and export per-function summaries (source/sanitizer/sink caps, taint transforms, points-to, callees). 1. **Pass 1**: parse each file via tree-sitter, build an intra-procedural CFG (petgraph), lower to pruned SSA (Cytron phi insertion over dominance frontiers), and export per-function summaries (source/sanitizer/sink caps, taint transforms, points-to, callees).
2. **Summary merge**: union all per-file summaries into a `GlobalSummaries` map. 2. **Summary merge**: union all per-file summaries into a `GlobalSummaries` map.
3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries. 3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries.
4. **Rank, dedupe, emit**: findings are scored by severity × evidence strength × source-kind exploitability, then emitted to console, JSON, or SARIF. 4. **Rank, dedupe, verify, emit**: findings are scored by severity × evidence strength × source-kind exploitability. Medium or higher confidence findings are dynamically verified by default, then results are emitted to console, JSON, SARIF, and the browser UI.
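The capped SCC fixpoint in step 3 can be illustrated with a minimal sketch. It is not Nyx's implementation: `Summary`, `Func`, and `solve_scc` are hypothetical names, and the summary is reduced to a single "return value is tainted" bit so the iteration itself stays visible. Only the 64-round cap comes from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical per-function summary: does the return value carry taint?
#[derive(Clone, Debug, Default, PartialEq, Eq)]
struct Summary {
    returns_tainted: bool,
}

/// Hypothetical body facts for one member of a call-graph SCC.
struct Func {
    /// The body reads an attacker-controlled source directly.
    reads_source: bool,
    /// Functions whose return value flows into this function's return value.
    callees: Vec<&'static str>,
}

/// Iteration cap, mirroring the 64-round limit described in step 3.
const MAX_ROUNDS: usize = 64;

/// Recompute summaries for one SCC until nothing changes or the cap is hit.
/// Returns the summaries and the number of rounds that were executed.
fn solve_scc(
    funcs: &HashMap<&'static str, Func>,
) -> (HashMap<&'static str, Summary>, usize) {
    // Start from the bottom of the lattice: nothing is tainted yet.
    let mut summaries: HashMap<&'static str, Summary> = funcs
        .keys()
        .map(|&name| (name, Summary::default()))
        .collect();

    for round in 1..=MAX_ROUNDS {
        let mut changed = false;
        for (&name, func) in funcs {
            // A function returns taint if it reads a source itself or if
            // any callee is currently summarized as returning taint.
            let tainted = func.reads_source
                || func
                    .callees
                    .iter()
                    .any(|c| summaries.get(c).is_some_and(|s| s.returns_tainted));
            let slot = summaries.get_mut(&name).expect("seeded above");
            if slot.returns_tainted != tainted {
                slot.returns_tainted = tainted;
                changed = true;
            }
        }
        if !changed {
            return (summaries, round);
        }
    }
    // Cap reached: return the best summaries computed so far.
    (summaries, MAX_ROUNDS)
}
```

In this sketch, two mutually recursive functions `a` and `b`, where only `b` reads a source, both end up with `returns_tainted == true`, because the taint bit only ever flips from false to true and the loop repeats until a round changes nothing.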
Detector families: taint (cross-file source→sink), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://elicpeter.github.io/nyx/detectors.html). Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://nyxscan.dev/docs/detectors.html).
---
## Verify findings dynamically
Static analysis says a sink is reachable. Dynamic verification tries to prove it. With `--verify` (on by default), Nyx builds a small harness around each Medium-or-higher finding, runs it in a sandbox against a curated payload corpus, and stamps a verdict onto the finding.
```bash
nyx scan --verify # build + run a harness per finding (default)
nyx scan --no-verify # static analysis only, for fast local loops
```
A finding is **Confirmed** only when an attacker-controlled payload fires the sink *and* a paired benign control stays clean. That differential rule, plus behavioral oracles (a template that renders `49`, a deserializer that resolves a gadget class, a redirect that leaves the origin), keeps the verifier from confirming on an echoed string. Sinks behind a recognized guard demote to `ConfirmedWithKnownGuard`; sinks reached without a completed exploit chain land as `PartiallyConfirmed`.
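The differential rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: the `HarnessRun` fields and the order in which the checks are applied are hypothetical, and only the verdict names (`Confirmed`, `ConfirmedWithKnownGuard`, `PartiallyConfirmed`, `NotConfirmed`) are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "Confirmed"
    CONFIRMED_WITH_KNOWN_GUARD = "ConfirmedWithKnownGuard"
    PARTIALLY_CONFIRMED = "PartiallyConfirmed"
    NOT_CONFIRMED = "NotConfirmed"


@dataclass
class HarnessRun:
    payload_fired_sink: bool          # attacker payload tripped the oracle
    control_fired_sink: bool          # benign control tripped the same oracle
    behind_known_guard: bool = False  # hypothetical flag for a recognized guard
    chain_completed: bool = True      # hypothetical flag for a full exploit chain


def classify(run: HarnessRun) -> Verdict:
    """Apply the differential rule first, then the demotions (assumed order)."""
    # An oracle that also fires on the benign control proves nothing,
    # for example a string that is simply echoed back.
    if not run.payload_fired_sink or run.control_fired_sink:
        return Verdict.NOT_CONFIRMED
    if run.behind_known_guard:
        return Verdict.CONFIRMED_WITH_KNOWN_GUARD
    if not run.chain_completed:
        return Verdict.PARTIALLY_CONFIRMED
    return Verdict.CONFIRMED
```

The key design point is the paired control: a payload hit alone is not enough, because the same observation on benign input means the oracle is measuring something other than exploitation.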
Coverage spans 18 verifiable capability classes and 120+ registered adapters across all ten languages (Flask, Django, Express, NestJS, Spring, Rails, Laravel, Gin, Axum, and more), with per-language build pools and copy-on-write workdirs to keep the per-finding cost low. Confirmed findings write a hermetic repro bundle with a `reproduce.sh`. Runs are deterministic: every payload is seeded from the spec hash.
```bash
# CI: fail the build if a new Confirmed finding appears vs. a baseline
nyx scan --baseline .nyx/baseline.json --gate no-new-confirmed
```
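Conceptually, a `no-new-confirmed` gate is a set difference between the Confirmed findings in the baseline and those in the current scan. The Python sketch below shows that idea only. The finding fields (`rule_id`, `path`, `sink`, `verdict`) are hypothetical and do not describe the real baseline schema or how Nyx actually keys findings.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical identity for a finding; the real baseline format may differ.
Key = Tuple[str, str, str]


def confirmed_keys(findings: Iterable[Dict[str, str]]) -> Set[Key]:
    """Collect identity keys for findings whose verdict is Confirmed."""
    return {
        (f["rule_id"], f["path"], f["sink"])
        for f in findings
        if f.get("verdict") == "Confirmed"
    }


def new_confirmed(
    baseline: Iterable[Dict[str, str]], current: Iterable[Dict[str, str]]
) -> List[Key]:
    """Return Confirmed findings present now but absent from the baseline."""
    return sorted(confirmed_keys(current) - confirmed_keys(baseline))


baseline = [
    {"rule_id": "sqli", "path": "app/db.py", "sink": "execute", "verdict": "Confirmed"},
]
current = baseline + [
    {"rule_id": "ssrf", "path": "app/fetch.py", "sink": "get", "verdict": "Confirmed"},
    {"rule_id": "xss", "path": "app/view.py", "sink": "render", "verdict": "NotConfirmed"},
]

# A CI wrapper would exit non-zero when this list is non-empty.
regressions = new_confirmed(baseline, current)
```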
Backends: Docker (preferred, network-blocked by default) or an in-process runner with `--harden {standard,strict}`. Full matrix, oracle list, and limitations: [Dynamic verification](https://nyxscan.dev/docs/dynamic.html).
---
@ -188,13 +246,13 @@ kind = "sanitizer"
cap = "html_escape" cap = "html_escape"
``` ```
-Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all`. Full schema: [Configuration](https://elicpeter.github.io/nyx/configuration.html).
+Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://nyxscan.dev/docs/configuration.html). Run `nyx rules list` to browse the registry from the terminal.
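One way to think about caps is as bit flags: a tainted value carries the set of sink classes it is still dangerous for, a sanitizer clears its own bit, and a sink fires only if its bit is still set. The Python sketch below models that reading with three of the cap names listed above. It is an assumption about the semantics for illustration, not a description of how Nyx stores or evaluates caps.

```python
from enum import IntFlag


class Cap(IntFlag):
    """Three of the documented cap names, modeled as bits (values are arbitrary)."""
    HTML_ESCAPE = 1
    SHELL_ESCAPE = 2
    SQL_QUERY = 4
    ALL = 7  # union of the bits above


def apply_sanitizer(taint: Cap, sanitizer_cap: Cap) -> Cap:
    """Clear the bits that the sanitizer neutralizes."""
    return Cap(int(taint) & ~int(sanitizer_cap))


def sink_fires(taint: Cap, sink_cap: Cap) -> bool:
    """A sink reports only if the value is still tainted for its cap."""
    return bool(int(taint) & int(sink_cap))


# A fresh user-controlled value is dangerous for every sink class.
value = Cap.ALL
# Passing it through an `escapeHtml`-style sanitizer clears only the HTML bit.
value = apply_sanitizer(value, Cap.HTML_ESCAPE)
```

Under this model the custom rule in the TOML example would stop HTML-sink findings for values that pass through `escapeHtml`, while the same value reaching a SQL sink would still be reported.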
---
## Status
-Under active development. APIs, detector behavior, and configuration options may change between releases. Rule-level F1 on the 433-case corpus is the CI regression floor; per-language detail lives in [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md).
+Under active development. APIs, detector behavior, and configuration options may change between releases. Rule-level F1 on the synthetic corpus is the CI regression floor; per-language detail lives in [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md).
Taint analysis is interprocedural. Persisted per-function SSA summaries carry per-return-path transforms and parameter-granularity points-to, and call-graph SCCs (including SCCs that span files) iterate to a joint fixed-point. The default `balanced` profile also runs k=1 context-sensitive inlining for intra-file callees. Symex (with cross-file and interprocedural frames) and the demand-driven backwards walk are opt-in. Enable them individually with `--symex` and `--backwards-analysis`, or together with `--engine-profile deep`.
@ -209,12 +267,12 @@ Limitations:
## Documentation
-Browse the full docs site at **[elicpeter.github.io/nyx](https://elicpeter.github.io/nyx/)**.
+Browse the full docs site at **[nyxscan.dev/docs](https://nyxscan.dev/docs/)**.
-- [Quick Start](https://elicpeter.github.io/nyx/quickstart.html) · [CLI Reference](https://elicpeter.github.io/nyx/cli.html) · [Installation](https://elicpeter.github.io/nyx/installation.html)
+- [Quick Start](https://nyxscan.dev/docs/quickstart.html) · [CLI Reference](https://nyxscan.dev/docs/cli.html) · [Installation](https://nyxscan.dev/docs/installation.html)
-- [`nyx serve`](https://elicpeter.github.io/nyx/serve.html) · [Output Formats](https://elicpeter.github.io/nyx/output.html) · [Configuration](https://elicpeter.github.io/nyx/configuration.html)
+- [`nyx serve`](https://nyxscan.dev/docs/serve.html) · [Output Formats](https://nyxscan.dev/docs/output.html) · [Configuration](https://nyxscan.dev/docs/configuration.html) · [Dynamic verification](https://nyxscan.dev/docs/dynamic.html)
-- [How it works](https://elicpeter.github.io/nyx/how-it-works.html) · [Detectors](https://elicpeter.github.io/nyx/detectors.html) ([Taint](https://elicpeter.github.io/nyx/detectors/taint.html), [CFG](https://elicpeter.github.io/nyx/detectors/cfg.html), [State](https://elicpeter.github.io/nyx/detectors/state.html), [AST Patterns](https://elicpeter.github.io/nyx/detectors/patterns.html))
+- [How it works](https://nyxscan.dev/docs/how-it-works.html) · [Detectors](https://nyxscan.dev/docs/detectors.html) ([Taint](https://nyxscan.dev/docs/detectors/taint.html), [CFG](https://nyxscan.dev/docs/detectors/cfg.html), [State](https://nyxscan.dev/docs/detectors/state.html), [AST Patterns](https://nyxscan.dev/docs/detectors/patterns.html))
-- [Rule Reference](https://elicpeter.github.io/nyx/rules.html) · [Language Maturity](https://elicpeter.github.io/nyx/language-maturity.html) · [Advanced Analysis](https://elicpeter.github.io/nyx/advanced-analysis.html) · [Auth Analysis](https://elicpeter.github.io/nyx/auth.html)
+- [Rule Reference](https://nyxscan.dev/docs/rules.html) · [Language Maturity](https://nyxscan.dev/docs/language-maturity.html) · [Advanced Analysis](https://nyxscan.dev/docs/advanced-analysis.html) · [Auth Analysis](https://nyxscan.dev/docs/auth.html)
---

README.zh-CN.md Normal file

@ -0,0 +1,276 @@
<div align="center">
<img src="assets/nyx-readme-header.png" alt="NYX" width="640"/>
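**A local-first security scanner with sandboxed dynamic verification and a browser UI. Scan your repositories locally and triage in the browser. No cloud, no account.**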
[![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org)
[![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions)
[![Docs](https://img.shields.io/badge/docs-nyxscan.dev%2Fdocs-blue)](https://nyxscan.dev/docs/)
[English](./README.md) · 简体中文
</div>
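<p align="center"><img src="assets/screenshots/demo.gif" alt="Nyx UI demo: start a scan from the empty welcome page, view the overview page with its health score, drill into the flow visualization for a HIGH finding, then walk through the triage workflow" width="900"/></p>
---
## Scan locally, browse locally
Nyx runs multi-language taint analysis on your repository, then runs a small sandboxed harness against Medium-or-higher confidence findings to check whether the source-to-sink flow actually fires in real code. Results are served through a React UI bound to `127.0.0.1`. You see severity, static evidence, dynamic verification results, and a step-by-step **flow visualization** that walks the data flow from source → sanitizer → sink. Triage decisions persist in `.nyx/triage.json`, which is committed alongside the code so the whole team shares the same triage state.
```bash
cargo install nyx-scanner
nyx scan # run the analyzer and cache findings in .nyx/
nyx serve # open http://localhost:9700 in the browser
```
Everything stays on your machine: loopback-only binding, enforced Host header validation, CSRF protection on every mutating operation, no remote telemetry, no login.
<p align="center"><img src="assets/screenshots/overview.png" alt="Overview dashboard for a small JS app: health score C (78), a five-component breakdown (severity pressure, confidence quality, trend, triage coverage, regression resistance), 3 findings, OWASP A03 and A02 categories, confidence distribution and issue-category bar charts, and the most-affected files" width="900"/></p>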
---
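## What's in the UI
| Page | What it shows |
|---|---|
| **Overview** | Dashboard: finding counts by severity, hotspot files, engine profile summary |
| **Findings** | Browsable list with severity badges, triage status, rule filter, language filter |
| **Finding detail** | Flow-path visualization with numbered steps (source → sanitizer → sink), dynamic verification result, code snippets, evidence, cross-file markers, triage dropdown |
| **Triage** | Bulk status updates (open, investigating, fixed, false_positive, accepted_risk, suppressed), audit log, JSON import/export |
| **Explorer** | File tree with per-file symbol lists and finding overlays |
| **Scans** | History, metrics, and a diff between any two scans |
| **Rules** | Built-in and custom rules per language; rules can be added from the UI |
| **Config** | Live config editor; reloads without a restart |
`nyx serve` flags: `--port <N>` (default `9700`), `--host <addr>` (loopback only: `127.0.0.1`, `localhost`, `::1`), `--no-browser`. For persistent settings, see the `[server]` section of `nyx.conf`. The page-by-page UI tour and the security model are covered in the [Browser UI guide](https://nyxscan.dev/docs/serve.html).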
---
## CLI for CI
The same engine runs headless in CI pipelines. SARIF output uploads directly to GitHub Code Scanning.
<p align="center"><img src="assets/screenshots/cli-scan.gif" alt="nyx scan terminal output: HIGH taint findings in JS and Python files with source → sink arrows" width="820"/></p>
```bash
# Fail CI on medium or higher and emit SARIF
nyx scan --format sarif --fail-on MEDIUM > results.sarif
# Ad-hoc JSON, no index
nyx scan ./server --format json --index off
# AST-only mode (fastest; skips CFG + taint)
nyx scan --mode ast
# Engine-depth shortcut: fast | balanced (default) | deep
# `deep` adds symex and demand-driven backwards taint: higher precision at roughly 2-3x the cost
nyx scan --engine-profile deep
```
Forward cross-file taint runs under every profile. Symex and the demand-driven backwards walk are opt-in: enable both with `--engine-profile deep`, or individually with `--symex` and `--backwards-analysis`. The full switch matrix is in the [CLI reference](https://nyxscan.dev/docs/cli.html#engine-depth-profile).
### GitHub Action
```yaml
- uses: elicpeter/nyx@v0.8.0
  with:
    format: sarif
    fail-on: MEDIUM
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: nyx-results.sarif
```
Inputs: `path`, `version`, `format` (`sarif`|`json`|`console`), `fail-on`, `args`, `token`. Outputs: `finding-count`, `sarif-file`, `exit-code`, `nyx-version`. Linux and macOS runners are supported (x86_64, ARM64).
---
## Install
**Cargo (recommended):**
```bash
cargo install nyx-scanner
```
**Prebuilt binaries:** download the archive for your platform from [Releases](https://github.com/elicpeter/nyx/releases), verify it against `SHA256SUMS` (and the accompanying `SHA256SUMS.asc` GPG signature, when provided), unpack it, and put `nyx` on your `PATH`.
```bash
# Optional: verify the GPG signature on the checksum file (when SHA256SUMS.asc is published)
gpg --verify SHA256SUMS.asc SHA256SUMS
sha256sum -c SHA256SUMS --ignore-missing
unzip nyx-x86_64-unknown-linux-gnu.zip && chmod +x nyx && sudo mv nyx /usr/local/bin/
```
**Build from source:**
```bash
git clone https://github.com/elicpeter/nyx.git
cd nyx && cargo build --release
```
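Requires stable Rust 1.88+. The frontend is bundled into the binary at build time, so `nyx serve` needs no separate install step.
---
## Language support
All 10 languages are parsed with tree-sitter and run through the full pipeline, but rule depth and engine coverage are uneven. On the synthetic corpus in [`tests/benchmark/ground_truth.json`](tests/benchmark/ground_truth.json), all ten languages scored F1 = 100% in the most recent baseline measurement (see [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md)), so F1 alone no longer separates the tiers. The tiers reflect rule depth, gated-sink coverage, and structural idioms the synthetic corpus does not fully exercise:
| Tier | Languages | F1 | Suitable as a CI gate? |
|---|---|---|---|
| **Stable** | Python, JavaScript, TypeScript | 100% | Yes |
| **Beta** | Java, PHP, Ruby, Rust, Go | 100% | Yes, with light FP triage |
| **Preview** | C, C++ | 100% on the synthetic corpus | No. STL container flows, builder chains, and inline class member functions are tracked; deep pointer aliasing and function pointers are not yet covered. Pair with clang-tidy or the Clang Static Analyzer |
Every real-CVE case fires, and the corpus has no open FPs at the recorded baseline (P = R = F1 = 1.000). Per-dimension detail and known blind spots are on the [Language maturity page](https://nyxscan.dev/docs/language-maturity.html).
### Validated against real CVEs
The corpus also includes a small set of vulnerable/patched pairs extracted from public advisories, so the benchmark floor is guarded not only by synthetic look-alike cases but also by regression protection against real bugs. For each pair, Nyx fires on the vulnerable file and reports nothing on the patched file.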
| CVE | Project | Language | Class |
|---|---|---|---|
| [CVE-2023-48022](https://nvd.nist.gov/vuln/detail/CVE-2023-48022) | Ray | Python | Command injection |
| [CVE-2017-18342](https://nvd.nist.gov/vuln/detail/CVE-2017-18342) | PyYAML | Python | Deserialization |
| [CVE-2019-14939](https://nvd.nist.gov/vuln/detail/CVE-2019-14939) | mongo-express | JavaScript | Code execution (`eval`) |
| [CVE-2023-22621](https://nvd.nist.gov/vuln/detail/CVE-2023-22621) | Strapi | JavaScript | Code execution (SSTI) |
| [CVE-2025-64430](https://nvd.nist.gov/vuln/detail/CVE-2025-64430) | Parse Server | JavaScript | SSRF |
| [CVE-2023-26159](https://nvd.nist.gov/vuln/detail/CVE-2023-26159) | follow-redirects | TypeScript | SSRF |
| [GHSA-4x48-cgf9-q33f](https://github.com/advisories/GHSA-4x48-cgf9-q33f) | Novu | TypeScript | SSRF |
| [CVE-2026-25544](https://nvd.nist.gov/vuln/detail/CVE-2026-25544) | Payload CMS | TypeScript | SQL injection |
| [CVE-2022-30323](https://nvd.nist.gov/vuln/detail/CVE-2022-30323) | hashicorp/go-getter | Go | Command injection |
| [CVE-2024-31450](https://nvd.nist.gov/vuln/detail/CVE-2024-31450) | owncast | Go | Path traversal |
| [CVE-2023-3188](https://nvd.nist.gov/vuln/detail/CVE-2023-3188) | owncast | Go | SSRF |
| [CVE-2026-41422](https://github.com/daptin/daptin/security/advisories/GHSA-rw2c-8rfq-gwfv) | daptin | Go | SQL injection |
| [CVE-2015-7501](https://nvd.nist.gov/vuln/detail/CVE-2015-7501) | Apache Commons Collections | Java | Deserialization |
| [CVE-2017-12629](https://nvd.nist.gov/vuln/detail/CVE-2017-12629) | Apache Solr | Java | Command injection |
| [CVE-2022-1471](https://nvd.nist.gov/vuln/detail/CVE-2022-1471) | SnakeYAML | Java | Deserialization |
| [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) | Apache Commons Text | Java | Code execution |
| [GHSA-h8cj-hpmg-636v](https://github.com/advisories/GHSA-h8cj-hpmg-636v) | Appsmith | Java | SQL injection |
| [CVE-2013-0156](https://nvd.nist.gov/vuln/detail/CVE-2013-0156) | Ruby on Rails | Ruby | Deserialization |
| [CVE-2020-8130](https://nvd.nist.gov/vuln/detail/CVE-2020-8130) | Rake | Ruby | Command injection |
| [CVE-2021-21288](https://nvd.nist.gov/vuln/detail/CVE-2021-21288) | CarrierWave | Ruby | SSRF |
| [CVE-2023-38337](https://nvd.nist.gov/vuln/detail/CVE-2023-38337) | rswag-api | Ruby | Path traversal |
| [CVE-2017-9841](https://nvd.nist.gov/vuln/detail/CVE-2017-9841) | PHPUnit | PHP | Code execution (`eval`) |
| [CVE-2018-15133](https://nvd.nist.gov/vuln/detail/CVE-2018-15133) | Laravel | PHP | Deserialization |
| [CVE-2018-20997](https://nvd.nist.gov/vuln/detail/CVE-2018-20997) | tar-rs | Rust | Path traversal |
| [CVE-2022-36113](https://nvd.nist.gov/vuln/detail/CVE-2022-36113) | cargo | Rust | Path traversal |
| [CVE-2024-24576](https://nvd.nist.gov/vuln/detail/CVE-2024-24576) | Rust stdlib | Rust | Command injection |
| [CVE-2023-42456](https://rustsec.org/advisories/RUSTSEC-2023-0069.html) | sudo-rs | Rust | Path traversal |
| [CVE-2024-32884](https://rustsec.org/advisories/RUSTSEC-2024-0335.html) | gitoxide | Rust | Command injection |
| [CVE-2025-53549](https://rustsec.org/advisories/RUSTSEC-2025-0043.html) | matrix-rust-sdk | Rust | SQL injection |
| [CVE-2016-3714](https://nvd.nist.gov/vuln/detail/CVE-2016-3714) | ImageMagick (ImageTragick) | C | Command injection |
| [CVE-2019-18634](https://nvd.nist.gov/vuln/detail/CVE-2019-18634) | sudo (pwfeedback) | C | Memory safety |
| [CVE-2019-13132](https://nvd.nist.gov/vuln/detail/CVE-2019-13132) | ZeroMQ libzmq | C++ | Memory safety |
| [CVE-2022-1941](https://nvd.nist.gov/vuln/detail/CVE-2022-1941) | Protocol Buffers | C++ | Memory safety |
| [CVE-2025-69662](https://nvd.nist.gov/vuln/detail/CVE-2025-69662) | geopandas | Python | SQL injection |
| [CVE-2026-33626](https://nvd.nist.gov/vuln/detail/CVE-2026-33626) | LMDeploy | Python | SSRF |
Case files live in [`tests/benchmark/cve_corpus/`](tests/benchmark/cve_corpus/), each with an upstream attribution header comment.
---
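## How it works
Two passes over the filesystem, with an optional SQLite index to skip unchanged files:
1. **Pass 1**: parse each file with tree-sitter, build an intraprocedural CFG (petgraph), lower to pruned SSA (Cytron phi insertion on dominance frontiers), and export per-function summaries (source/sanitizer/sink capability bits, taint transforms, points-to sets, callee sets).
2. **Summary merge**: union the per-file summaries into a `GlobalSummaries` map.
3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries.
4. **Rank, dedupe, verify, emit**: findings are scored by severity × evidence strength × source-kind exploitability. Default builds dynamically verify Medium-or-higher confidence findings, then results are emitted to console, JSON, SARIF, and the browser UI.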
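The ranking step (severity × evidence strength × source-kind exploitability, followed by deduplication) can be sketched in Python as below. The numeric weights, the `Finding` fields, and the choice of `(rule_id, path, line)` as the dedupe key are all hypothetical; only the three-factor product comes from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical weights; the real scale is not given in this README.
SEVERITY_WEIGHT = {"HIGH": 3.0, "MEDIUM": 2.0, "LOW": 1.0}


@dataclass(frozen=True)
class Finding:
    rule_id: str
    path: str
    line: int
    severity: str
    evidence_strength: float      # assumed range 0.0 to 1.0
    source_exploitability: float  # assumed range 0.0 to 1.0


def score(f: Finding) -> float:
    """Product of the three factors named in the README."""
    return SEVERITY_WEIGHT[f.severity] * f.evidence_strength * f.source_exploitability


def rank_and_dedupe(findings: Iterable[Finding]) -> List[Finding]:
    """Keep the best-scoring finding per location, then sort descending."""
    best: Dict[Tuple[str, str, int], Finding] = {}
    for f in findings:
        key = (f.rule_id, f.path, f.line)
        if key not in best or score(f) > score(best[key]):
            best[key] = f
    return sorted(best.values(), key=score, reverse=True)


weak = Finding("sqli", "db.py", 10, "HIGH", 0.5, 1.0)
strong = Finding("sqli", "db.py", 10, "HIGH", 0.9, 1.0)  # same location, better evidence
other = Finding("xss", "view.py", 4, "MEDIUM", 1.0, 1.0)
ranked = rank_and_dedupe([weak, other, strong])
```

A multiplicative score means a finding with weak evidence or an unlikely source sinks in the list even when its nominal severity is high.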
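Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://nyxscan.dev/docs/detectors.html).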
---
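## Dynamic verification
Static analysis says a sink is reachable from a source. Dynamic verification tries to prove that the path actually fires in real code. Default builds enable it: `nyx scan` generates a harness for each Medium-or-higher confidence finding, runs it in a sandbox against curated payloads, and writes the result to `evidence.dynamic_verdict`.
```bash
nyx scan --verify # explicit form of the default behavior
nyx scan --no-verify # static analysis only, for fast local loops
```
`Confirmed` appears only when the attack payload fires the sink and the paired benign control stays clean. `NotConfirmed` means the harness ran to completion without firing; it does not mean the finding is closed. The full capability matrix, backends, and limitations are in [Dynamic verification](https://nyxscan.dev/docs/dynamic.html).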
---
## Configuration
Configuration is merged from `nyx.conf` (defaults) and `nyx.local` (your overrides), read from the platform config directory (`~/.config/nyx/` on Linux, `~/Library/Application Support/nyx/` on macOS, `%APPDATA%\elicpeter\nyx\config\` on Windows).
```toml
[scanner]
mode = "full" # full | ast | cfg | taint
min_severity = "Medium"
[server]
host = "127.0.0.1"
port = 9700
open_browser = true
# Project-specific sanitizer
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"
cap = "html_escape"
```
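Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://nyxscan.dev/docs/configuration.html). Run `nyx rules list` to browse the registry from the terminal.
---
## Status
Under active development. APIs, detector behavior, and configuration options may change between releases. Rule-level F1 on the synthetic corpus is the CI regression floor; per-language detail lives in [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md).
Taint analysis is interprocedural. Persisted per-function SSA summaries carry per-return-path transforms and parameter-granularity points-to, and call-graph SCCs (including SCCs that span files) iterate to a joint fixed-point. The default `balanced` profile also runs k=1 context-sensitive inlining for intra-file callees. Symex (with cross-file and interprocedural frames) and the demand-driven backwards walk are opt-in. Enable them individually with `--symex` and `--backwards-analysis`, or together with `--engine-profile deep`.
Limitations:
- Interprocedural precision is bounded, not unlimited. Context-sensitive inlining is k=1 with a callee body-size cap, and the SCC fixpoint has an iteration cap. When the engine hits a cap it falls back to summaries and records an `engine_note` on the finding.
- Calls are not tracked across languages (FFI, subprocess, WASM). Each language is analyzed independently.
- Several language features are not modeled: macros, most dynamic dispatch, aliased imports, reflection.
- C/C++ is in the Preview tier. STL container flows, builder chains, and inline class member functions are tracked; deep pointer aliasing and function pointers are not. A clean report should not be read as a clean audit. Pair with clang-based tooling before using it as a hard CI gate.
- Results may contain false positives or false negatives; expect to review them manually.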
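The first limitation (bounded inlining with summary fallback) amounts to a small decision per call site. The Python sketch below is one hedged reading of it: the `Callee` type, the numeric body-size cap, the note strings, and the rule that cross-file callees always use summaries without a note are assumptions for illustration. Only k=1, the existence of a body-size cap, and the `engine_note` field come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INLINE_DEPTH_K = 1     # k=1, from the text
INLINE_BODY_CAP = 200  # hypothetical value; the real cap is not stated here


@dataclass(frozen=True)
class Callee:
    name: str
    file: str
    body_size: int  # assumed unit: SSA instructions


def plan_call(
    caller_file: str,
    callee: Callee,
    inline_depth: int,
    body_cap: int = INLINE_BODY_CAP,
) -> Tuple[str, Optional[str]]:
    """Return (strategy, engine_note) for one call site."""
    if callee.file != caller_file:
        # Cross-file callees are resolved through persisted summaries.
        return "summary", None
    if inline_depth >= INLINE_DEPTH_K:
        return "summary", "inline depth cap (k=1) reached"
    if callee.body_size > body_cap:
        return "summary", "callee body exceeds inline size cap"
    return "inline", None
```

Recording a note whenever a cap forces the fallback lets a reviewer tell a precise result from one that was approximated.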
---
## Documentation
Browse the full docs site at **[nyxscan.dev/docs](https://nyxscan.dev/docs/)**.
- [Quick Start](https://nyxscan.dev/docs/quickstart.html) · [CLI Reference](https://nyxscan.dev/docs/cli.html) · [Installation](https://nyxscan.dev/docs/installation.html)
- [`nyx serve`](https://nyxscan.dev/docs/serve.html) · [Output Formats](https://nyxscan.dev/docs/output.html) · [Configuration](https://nyxscan.dev/docs/configuration.html)
- [How it works](https://nyxscan.dev/docs/how-it-works.html) · [Detectors](https://nyxscan.dev/docs/detectors.html) ([Taint](https://nyxscan.dev/docs/detectors/taint.html), [CFG](https://nyxscan.dev/docs/detectors/cfg.html), [State](https://nyxscan.dev/docs/detectors/state.html), [AST Patterns](https://nyxscan.dev/docs/detectors/patterns.html))
- [Rule Reference](https://nyxscan.dev/docs/rules.html) · [Language Maturity](https://nyxscan.dev/docs/language-maturity.html) · [Advanced Analysis](https://nyxscan.dev/docs/advanced-analysis.html) · [Auth Analysis](https://nyxscan.dev/docs/auth.html)
---
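## Contributing
Contributions are welcome.
Nyx is open source and will always keep a fully open-source core. To support long-term development and keep the project sustainable, contributors may be asked to sign a Contributor License Agreement before their first merge.
Run `sh scripts/check.sh` before submitting. The full guide, including how to add rules and support new languages, is in [`CONTRIBUTING.md`](CONTRIBUTING.md). For crashes, panics, or suspicious results, open an issue with a minimal reproducing snippet and the Nyx version.
---
## AI disclosure
- **Engine code** (taint, SSA, CFG, call graph, abstract interpretation, symbolic execution): primarily human-written. AI was used only for selective refactors and boilerplate, and every merge was human-reviewed.
- **Documentation and most of this README**: AI-generated from the code and human-edited. Please report drift between docs and code as a bug.
- **Test cases and `expected.yaml` files**: drafted with AI assistance and human-reviewed before landing.
- **Frontend UI** (React app): built with AI assistance and human-reviewed.
As with any static analyzer, validate findings against your own corpus before using Nyx as a CI gate.
---
## License
GNU General Public License v3.0 or later (GPL-3.0-or-later). The optional `smt` feature bundles Z3 (MIT license); when distributing binaries built with `--features smt`, include Z3's license in your attribution. Full text is in [LICENSE](./LICENSE); third-party dependencies are listed in [THIRDPARTY-LICENSES.html](./THIRDPARTY-LICENSES.html).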

RELEASE_CHECKLIST.md Normal file

@ -0,0 +1,94 @@
# Release checklist: 0.8.0 (dynamic verification)
Maintainer-facing gate for cutting `0.8.0`. The release ships the dynamic
verifier (Tracks J through S of `.pitboss/play/plan.md`). Sign-off requires
every row below green, and every CI matrix row green for at least three
consecutive runs on `master`.
Legend: `[x]` verified locally on the dev reference machine, `[ ]` still to be
confirmed by CI (must hold for three consecutive runs before tagging).
## Cross-cutting invariants
- [x] `cargo check --no-default-features --features serve` green.
- [x] `cargo check --features dynamic` green.
- [x] `cargo nextest run --features dynamic` green: 6545 passed, 0 failed, 16 skipped.
- [x] Determinism: every payload RNG seeds from `spec.spec_hash`; oracle canaries derive from `BLAKE3(spec_hash || run_nonce)`. `scripts/check_no_unseeded_rand.sh` audits the tree.
- [x] Observability: each new code path emits a `VerifyTrace` event and a typed `Inconclusive` / `Unsupported` reason.
- [x] Security: every sink-under-test routes through `src/dynamic/policy.rs` deny rules; no phase weakened the seccomp / `.sb` profile sets.
- [ ] Performance: default `nyx scan` (no `--verify`) latency does not regress.
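The determinism invariant above can be illustrated with a short Python sketch: the payload RNG is seeded only from the spec hash, while the canary also mixes in a per-run nonce. BLAKE3 is not in the Python standard library, so this sketch substitutes `hashlib.blake2b` as a stand-in. The function names and the example byte strings are hypothetical.

```python
import hashlib
import random


def payload_rng(spec_hash: bytes) -> random.Random:
    """Seed the payload RNG from the spec hash alone, so reruns are identical."""
    return random.Random(int.from_bytes(spec_hash, "big"))


def oracle_canary(spec_hash: bytes, run_nonce: bytes) -> str:
    """Derive a per-run canary from spec_hash || run_nonce.

    Stand-in hash: BLAKE2b from the standard library. The checklist names
    BLAKE3, which would need a third-party package.
    """
    return hashlib.blake2b(spec_hash + run_nonce, digest_size=16).hexdigest()


spec_hash = hashlib.sha256(b"example spec").digest()  # placeholder spec hash

# Same spec hash: identical payload choices on every run.
first = [payload_rng(spec_hash).randrange(1000) for _ in range(3)]
second = [payload_rng(spec_hash).randrange(1000) for _ in range(3)]

# Same spec, different nonce: a fresh canary that a stale echo cannot match.
canary_a = oracle_canary(spec_hash, b"run-1")
canary_b = oracle_canary(spec_hash, b"run-2")
```

Keeping the nonce out of the payload seed but inside the canary keeps payload selection reproducible while making each run's marker unique.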
## Ship gates (`scripts/m7_ship_gate.sh`)
- [x] Gate 1: static-only scan green on `tests/benchmark/corpus`.
- [x] Gate 2: `cargo nextest run --features dynamic` green (covers Gate 4 + Gate 5 binaries).
- [x] Gate 3: with-verify / static-only wall-clock ratio <= 1.5x on `benches/fixtures/`.
- [x] Gate 4: SARIF schema validation on every dynamic verdict variant.
- [x] Gate 5: layering boundary test green.
- [ ] Gate 6: Java OWASP Benchmark v1.2 `--verify` acceptance (wall-clock <= 15 min CI, per-cap precision >= 0.85 / recall >= 0.40, per-`(cap, lang)` budget). Self-skips without `NYX_OWASP_CORPUS`.
- [ ] Gate 7: NodeGoat + Juice Shop acceptance. Self-skips without `NYX_NODEGOAT_CORPUS` / `NYX_JUICESHOP_CORPUS`.
- [ ] Gate 8: RailsGoat / DVWA / DVPWA / gosec / RustSec acceptance. Self-skips without the matching `NYX_*_CORPUS`.
Gates 6 through 8 run against real corpora that are not vendored into the repo.
They are enforced in the `eval` workflow with the corpora cached on the CI
runner. Locally they self-skip with a clear message.
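The per-cap acceptance check in Gate 6 reduces to computing precision and recall for each capability class and comparing them with the thresholds. The Python sketch below uses the 0.85 and 0.40 floors from the checklist. The count tuples, the cap names used as keys, and the choice to score an empty denominator as 0.0 are assumptions for the example.

```python
from typing import Dict, List, Tuple

MIN_PRECISION = 0.85  # from the Gate 6 row
MIN_RECALL = 0.40     # from the Gate 6 row


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP), recall = TP / (TP + FN).

    An empty denominator is scored as 0.0 here so that a cap with no
    data fails the gate instead of passing silently (an assumption).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def failing_caps(counts: Dict[str, Tuple[int, int, int]]) -> List[str]:
    """Return the caps whose precision or recall is under the floor."""
    failed = []
    for cap, (tp, fp, fn) in sorted(counts.items()):
        p, r = precision_recall(tp, fp, fn)
        if p < MIN_PRECISION or r < MIN_RECALL:
            failed.append(cap)
    return failed


# Hypothetical counts as (TP, FP, FN) per capability class.
counts = {
    "sql_query": (9, 1, 10),  # precision 0.90, recall about 0.47: passes
    "ssrf": (4, 1, 10),       # precision 0.80: fails the precision floor
    "code_exec": (3, 0, 9),   # recall 0.25: fails the recall floor
}
failed = failing_caps(counts)
```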
## CI matrix rows (must be green for three consecutive runs)
`ci.yml`:
- [ ] frontend, rustfmt, clippy-stable, cargo-deny, unused-deps, third-party-licenses
- [ ] docs-fresh (`nyx-docgen` output committed), rustdoc
- [ ] rust-beta-build, msrv
- [ ] rust-stable-test-linux-without-docker, rust-stable-test-linux-with-docker (`cargo nextest run --all-features`)
`dynamic.yml` (each runs `cargo nextest run --features dynamic`):
- [ ] linux-process-only
- [ ] linux-with-docker
- [ ] macos
`eval.yml`:
- [ ] owasp (Gate 6)
- [ ] jsts matrix: nodegoat, juiceshop (Gate 7)
- [ ] polyglot matrix: railsgoat, dvwa, dvpwa, gosec, rustsec (Gate 8)
## Docs and metadata
- [x] `Cargo.toml` version bumped to `0.8.0`; `Cargo.lock` regenerated.
- [x] `docs/dynamic.md` rewritten: cap x lang matrix, framework adapter table, oracle table, performance budgets, limitations.
- [x] `README.md` dynamic verification section + docs link.
- [x] `CHANGELOG.md` `[0.8.0]` entry covers Tracks J through S.
- [x] Stray version strings updated (README GitHub Action pin, telemetry doc example).
## Known limitations carried into 0.8.0
These are documented in `docs/dynamic.md` and accepted for the MVP. They are
not release blockers, but the release notes should not overstate the verifier.
- **Guarded-sink over-confirmation (resolved on `dynamic`).** The synthesized
harness now drives the finding's enclosing entry function when one is
derivable, routing the payload to the tainted parameter, so a guard that
lives in the caller (an `Object.create(null)` merge target, an allowlisting
`resolveClass`, a const-name check before `Marshal.load`) runs first and
participates in the verdict. The build-time entry-vs-sink choice is recorded
on the verify trace as `entry_invocation`. When no enclosing entry can be
derived the harness falls back to driving the sink directly, which can still
over-confirm a guard it never executes. On the in-house fixture set the
verify scan now confirms the 8 genuine vulnerabilities and reads
`NotConfirmed` on all 4 negative-control files.
- **In-house confirmed rate is modest.** A `--verify` scan of
`tests/dynamic_fixtures` (process backend) lands 8 Confirmed / 15
NotConfirmed / 115 Inconclusive / 137 Unsupported of 275. The Unsupported
bulk is `SoundOracleUnavailable` (ENV_VAR / SHELL_ESCAPE / URL_ENCODE source
and sanitizer caps, correct by design); the Inconclusive bulk is
`SpecDerivationFailed` on benign and scaffolding fixtures with no derivable
flow. The authoritative confirmed / precision / recall numbers come from the
real-corpus gates (6 through 8), which require the corpora.
- **Real-corpus gates unverified locally.** Gates 6 through 8 self-skip without
`NYX_*_CORPUS`. The >= 40% confirmed and >= 0.85 precision targets are
enforced only in the `eval` workflow.
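As a quick check of the in-house numbers quoted in the list above, the four verdict counts do sum to 275, and the Confirmed share works out to roughly 3%. A few lines of Python make the breakdown explicit, using only the figures from the list.

```python
# Verdict counts from the `tests/dynamic_fixtures` run quoted above.
verdicts = {
    "Confirmed": 8,
    "NotConfirmed": 15,
    "Inconclusive": 115,
    "Unsupported": 137,
}

total = sum(verdicts.values())

# Percentage share of each verdict, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in verdicts.items()}

for name, share in shares.items():
    print(f"{name}: {verdicts[name]} of {total} ({share}%)")
```

Most of the mass sits in `Unsupported` and `Inconclusive`, which is why the checklist points to the real-corpus gates for the authoritative precision and recall figures.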
## Tag
- [ ] Three consecutive green CI runs on `master` confirmed.
- [ ] Real-corpus gates (6 through 8) green in the `eval` workflow with corpora wired.
- [ ] `git tag v0.8.0` and push; `release-build.yml` publishes the binaries and `SHA256SUMS`.


@ -1,22 +1,23 @@
# Roadmap
-Nyx today is a static-only multi-language vulnerability scanner. The roadmap below extends it into a hybrid scanner that combines static analysis with controlled execution and AI-assisted reasoning.
-## Phase 1: Static Analysis (current)
-The shipped scanner. Multi-language taint tracking on a pruned SSA IR, cross-file function summaries, points-to and abstract interpretation, symbolic execution with an optional SMT backend, and a local web UI for triage. See the [Changelog](CHANGELOG.md) for the full breakdown of what's landed through 0.5.0.
-## Phase 2: Dynamic Capability
-| Feature | Description |
-| --- | --- |
-| Controlled dynamic execution | Local sandbox: identify entry points, spin up test harnesses, inject payloads, detect runtime crashes and command execution. Deterministic automated exploit validation: static finds `exec(user_input)`, dynamic confirms it with `; id`. |
-| Fuzzing integration | libFuzzer (C/C++), cargo-fuzz (Rust), go-fuzz, HTTP fuzzing harness. Static engine identifies interesting functions, fuzzer targets only those. |
-## Phase 3: Intelligent Reasoning Layer
-| Feature | Description |
-| --- | --- |
-| Semantic similarity | Embeddings for finding similar vulnerability patterns across codebases. |
-| LLM reasoning | AI-assisted detection of non-obvious logic bugs. |
-| Exploit refinement | Automated loops to refine and validate exploit chains. |
+## Now: recall and precision on real codebases
+The current focus is straightforward. Run Nyx against real open-source repositories and real CVEs, then close the gap between what it finds and what it should find.
+That means:
+- **Recall.** Pick CVEs with public fixes. Reproduce them on the vulnerable commit. If Nyx misses, figure out why (missing source, missing sink, lost flow across a call, dropped at a sanitizer that was not actually a sanitizer) and fix the underlying analysis, not the fixture.
+- **Precision.** Triage the noise on large repos (phpMyAdmin, Nextcloud, and others). Each false positive gets reduced to a pattern: receiver-type gate, non-crypto context for `md5`/`sha1`, type-safe sink suppression, etc. Land the gate, re-run the corpus, confirm the count drops without taking real bugs with it.
+- **Corpus discipline.** Every fix lands with a fixture (positive or negative) and a corpus row. Rule-level F1 on `tests/benchmark/corpus/` is the scoreboard. CI floors only ratchet up.
+The scanner internals (SSA, cross-file summaries, abstract interpretation, symbolic execution, auth analysis) are in place. They get refined in service of the recall/precision work, not extended for their own sake.
+## Later: dynamic capability
+Static analysis confirms a flow exists. Dynamic execution confirms it fires. The plan is a local sandbox that picks up entry points Nyx already identifies, builds a harness, injects a payload, and watches for the crash or shell. Pairs naturally with fuzzing (libFuzzer, cargo-fuzz, go-fuzz, HTTP) where the static engine picks the targets.
+Not started. Lands after the static side is honest on real corpora.
+## Later still: reasoning layer
+Embeddings for cross-codebase pattern similarity. LLM-assisted detection for logic bugs that resist taint modeling. Automated exploit refinement loops. All speculative until the foundation is solid.


@ -1,46 +1,88 @@
# Security Policy # Security Policy
## Supported Versions ## Reporting a vulnerability
| Version | Supported | Notes | Report privately. Do not open a public GitHub issue for a security bug.
|---------|-----------|----------------------|
| 0.5.x | ✅ | Latest stable line |
| 0.4.x | ✅ | Critical fixes only |
| < 0.4 | | End-of-life |
We follow [Semantic Versioning] as soon as we hit **1.0.0**. Use [GitHub Security Advisories](https://github.com/elicpeter/nyx/security/advisories/new) to file a private report. Only the maintainers see it.
Before that, breaking changes may land in any minor release.
## Reporting a Vulnerability Include:
* **Private disclosure first.** - Affected version (`nyx --version`) and OS
Please **do not** open public GitHub issues for security bugs. - Reproduction steps or a minimal PoC
- Impact (RCE, file read or write, sandbox escape, auth bypass in `nyx serve`, etc.)
- Whether you have a fix in mind
* **How to report** You'll get an acknowledgement within 3 business days, and a status update every 7 days until the issue is closed.
1. To report a vulnerability, please use the GitHub disclosure in the security tab to alert us to a security issue.
* **What to include** ## Scope
A minimal PoC or reproduction steps
Affected Nyx version (`nyx --version`) and OS
Impact explanation (e.g. RCE, DoS, data leak)
* **Response timeline** In scope: bugs that let untrusted input reach the Nyx process and cause harm.
We acknowledge within **3 business days** and give a status update every **7 days** thereafter until resolution.
## Disclosure Process - Code execution in the scanner: parser exploits, deserialization, command injection in helpers, custom-rule sandbox escape.
- Path traversal or arbitrary file access outside the target repo.
- `nyx serve` issues: auth bypass, host-header bypass, CSRF on mutating routes, XSS in the UI, cross-origin access from a non-loopback origin.
- Memory safety bugs in any unsafe Rust we introduce.
- Tampering with `.nyx/` triage state from outside the user's repo.
- Supply chain issues affecting published `nyx-scanner` crates or release artifacts.
1. We confirm the issue and assign a CVE (via GitHub or MITRE). Out of scope:
2. A fix is developed on a private branch and back-ported if needed.
3. Coordinated release: new version on crates.io + public advisory.
4. Credit is given to the reporter unless they request anonymity.
## Scope & Severity - False positives or missed detections in scan output. File a regular GitHub issue with the rule ID and a fixture.
- Findings Nyx reports against your own code. That's the scanner working, not a Nyx vulnerability.
- Anything requiring physical or local-account access to the user's machine.
- Self-XSS and missing security headers on `127.0.0.1` endpoints. The UI is loopback-only.
- Performance pathologies on hostile input (a 50 GB file, deeply nested grammars). We harden where we can.
- Issues only reachable by a user editing their own `nyx.conf` to weaken defaults.
This policy covers vulnerabilities that let an **untrusted Nyx input** cause: ## Supported versions
* Remote or local code execution in the Nyx process | Version | Status |
* Privilege escalation, data exfiltration, or denial of service |---------|-----------------------|
| 0.7.x | Supported |
| 0.6.x | Critical fixes only |
| < 0.6 | End of life |
**False positives / missed detections** in scan results are *quality issues*, not security issues. Please file normal GitHub issues for those. The project follows [Semantic Versioning](https://semver.org) once it reaches 1.0.0. Until then, breaking changes can land in any minor release.
[Semantic Versioning]: https://semver.org ## Severity
We use [CVSS 3.1](https://www.first.org/cvss/v3.1/specification-document) to rate reports.
| Severity | Examples |
|----------|-----------------------------------------------------------------------------------------------|
| Critical | Unauthenticated RCE in `nyx serve`, custom-rule sandbox escape during a default scan |
| High | Auth bypass against `nyx serve`, arbitrary file write outside the repo |
| Medium | Stored XSS in the UI, CSRF on a mutating route, host-header bypass |
| Low | Information disclosure with no privilege change, log-injection, denial of service via input |
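As a rough illustration of how a CVSS 3.1 base score maps onto the severity names used in the table above, here is a minimal sketch. The thresholds are the qualitative rating scale from the CVSS 3.1 specification; the `Severity` enum and `rating` function are hypothetical names, not part of Nyx, and the maintainers may weigh other factors when assigning a final severity.

```rust
// Hypothetical helper: names are illustrative and not taken from the Nyx codebase.

/// Qualitative ratings defined by the CVSS 3.1 specification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    None,
    Low,
    Medium,
    High,
    Critical,
}

/// Maps a CVSS 3.1 base score (0.0 to 10.0) to its qualitative rating.
/// Returns `None` (the `Option` variant) for scores outside that range,
/// including NaN.
pub fn rating(score: f64) -> Option<Severity> {
    if !(0.0..=10.0).contains(&score) {
        return None;
    }
    let severity = if score == 0.0 {
        Severity::None
    } else if score < 4.0 {
        Severity::Low // 0.1 to 3.9
    } else if score < 7.0 {
        Severity::Medium // 4.0 to 6.9
    } else if score < 9.0 {
        Severity::High // 7.0 to 8.9
    } else {
        Severity::Critical // 9.0 to 10.0
    };
    Some(severity)
}
```

For example, `rating(9.8)` yields `Some(Severity::Critical)`, which lines up with the "Unauthenticated RCE" row.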
## Disclosure
Coordinated disclosure.
1. We confirm the report and assign severity.
2. We request a CVE through GitHub or MITRE.
3. A fix is developed on a private branch, with backports to supported lines if needed.
4. A new release ships on crates.io and a public advisory goes out.
5. The reporter is credited in the advisory and the changelog, unless they ask to stay anonymous.
Target window from report to fix is 90 days. If you need to publish on a shorter timeline, tell us in the report and we'll work toward it.
## Safe harbor
Good-faith security research is welcome. We won't pursue legal action against researchers who:
- Report privately and give a reasonable window before publishing.
- Test against their own installations, not third-party deployments running Nyx.
- Avoid data destruction, account takeover, and service disruption.
- Stop and reach out if a test starts to affect data or systems they don't own.
If you're not sure whether a test is in scope, ask first.
## Bounty
There is no paid bug bounty program. Credit, a thank-you in the advisory, and a mention in the changelog are what we offer today.
## Security model recap
Nyx runs locally. The browser UI binds to `127.0.0.1` by default, requires a matching `Host` header, and uses a CSRF token on every mutating request. There is no login, no telemetry, and no remote control plane. If you find a way around any of those defaults, that's a security issue and we want to hear about it.
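The two request-level defaults described above (a matching `Host` header and a CSRF token on mutating requests) can be sketched as a single gate function. This is only an illustration under stated assumptions: the function names, the exact `127.0.0.1:<port>` match, and the set of methods treated as mutating are hypothetical and are not taken from the Nyx source.

```rust
// Hypothetical sketch: none of these names come from the Nyx codebase.

/// Returns true when the `Host` header names the loopback address and port
/// the server is assumed to be bound to.
pub fn host_matches(host_header: &str, port: u16) -> bool {
    host_header == format!("127.0.0.1:{port}")
}

/// Compares two tokens without returning early on the first differing byte.
pub fn tokens_match(expected: &str, presented: &str) -> bool {
    let (a, b) = (expected.as_bytes(), presented.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the result is 0 only if all match.
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Decides whether a request may proceed: the Host header must match, and
/// mutating methods must also carry the expected CSRF token.
pub fn allow_request(
    method: &str,
    host_header: &str,
    port: u16,
    expected_token: &str,
    presented_token: Option<&str>,
) -> bool {
    if !host_matches(host_header, port) {
        return false;
    }
    // Assumption: these four methods are the "mutating" ones.
    let mutating = matches!(method, "POST" | "PUT" | "PATCH" | "DELETE");
    if !mutating {
        return true;
    }
    presented_token.map_or(false, |t| tokens_match(expected_token, t))
}
```

Under this sketch, a `GET` with a foreign `Host` value is rejected before any token check runs, and a `POST` without a token is rejected even when the `Host` header is correct.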

View file

@ -44,7 +44,7 @@
<h2>Overview of licenses:</h2> <h2>Overview of licenses:</h2>
<ul class="licenses-overview"> <ul class="licenses-overview">
<li><a href="#Apache-2.0">Apache License 2.0</a> (159)</li> <li><a href="#Apache-2.0">Apache License 2.0</a> (160)</li>
<li><a href="#MIT">MIT License</a> (71)</li> <li><a href="#MIT">MIT License</a> (71)</li>
<li><a href="#Zlib">zlib License</a> (2)</li> <li><a href="#Zlib">zlib License</a> (2)</li>
<li><a href="#BSD-2-Clause">BSD 2-Clause &quot;Simplified&quot; License</a> (1)</li> <li><a href="#BSD-2-Clause">BSD 2-Clause &quot;Simplified&quot; License</a> (1)</li>
@ -1542,14 +1542,13 @@
<li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle-query 1.1.5</a></li> <li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle-query 1.1.5</a></li>
<li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle-wincon 3.0.11</a></li> <li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle-wincon 3.0.11</a></li>
<li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle 1.0.14</a></li> <li><a href=" https://github.com/rust-cli/anstyle.git ">anstyle 1.0.14</a></li>
<li><a href=" https://github.com/assert-rs/assert_cmd.git ">assert_cmd 2.2.1</a></li> <li><a href=" https://github.com/assert-rs/assert_cmd.git ">assert_cmd 2.2.2</a></li>
<li><a href=" https://github.com/bytesize-rs/bytesize ">bytesize 2.3.1</a></li> <li><a href=" https://github.com/bytesize-rs/bytesize ">bytesize 2.3.1</a></li>
<li><a href=" https://github.com/clap-rs/clap ">clap 4.6.1</a></li> <li><a href=" https://github.com/clap-rs/clap ">clap 4.6.1</a></li>
<li><a href=" https://github.com/clap-rs/clap ">clap_builder 4.6.0</a></li> <li><a href=" https://github.com/clap-rs/clap ">clap_builder 4.6.0</a></li>
<li><a href=" https://github.com/clap-rs/clap ">clap_derive 4.6.1</a></li> <li><a href=" https://github.com/clap-rs/clap ">clap_derive 4.6.1</a></li>
<li><a href=" https://github.com/clap-rs/clap ">clap_lex 1.1.0</a></li> <li><a href=" https://github.com/clap-rs/clap ">clap_lex 1.1.0</a></li>
<li><a href=" https://github.com/rust-cli/anstyle.git ">colorchoice 1.0.5</a></li> <li><a href=" https://github.com/rust-cli/anstyle.git ">colorchoice 1.0.5</a></li>
<li><a href=" https://github.com/srijs/rust-crc32fast ">crc32fast 1.5.0</a></li>
<li><a href=" https://github.com/sfackler/rust-fallible-iterator ">fallible-iterator 0.3.0</a></li> <li><a href=" https://github.com/sfackler/rust-fallible-iterator ">fallible-iterator 0.3.0</a></li>
<li><a href=" https://github.com/sfackler/fallible-streaming-iterator ">fallible-streaming-iterator 0.1.9</a></li> <li><a href=" https://github.com/sfackler/fallible-streaming-iterator ">fallible-streaming-iterator 0.1.9</a></li>
<li><a href=" https://github.com/polyfill-rs/is_terminal_polyfill ">is_terminal_polyfill 1.70.2</a></li> <li><a href=" https://github.com/polyfill-rs/is_terminal_polyfill ">is_terminal_polyfill 1.70.2</a></li>
@ -2616,13 +2615,16 @@ limitations under the License.</pre>
<h4>Used by:</h4> <h4>Used by:</h4>
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/bluss/arrayvec ">arrayvec 0.7.6</a></li> <li><a href=" https://github.com/bluss/arrayvec ">arrayvec 0.7.6</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">async-compression 0.4.42</a></li>
<li><a href=" https://github.com/smol-rs/atomic-waker ">atomic-waker 1.1.2</a></li> <li><a href=" https://github.com/smol-rs/atomic-waker ">atomic-waker 1.1.2</a></li>
<li><a href=" https://github.com/cuviper/autocfg ">autocfg 1.5.0</a></li> <li><a href=" https://github.com/cuviper/autocfg ">autocfg 1.5.0</a></li>
<li><a href=" https://github.com/bitflags/bitflags ">bitflags 2.11.1</a></li> <li><a href=" https://github.com/bitflags/bitflags ">bitflags 2.11.1</a></li>
<li><a href=" https://github.com/BurntSushi/bstr ">bstr 1.12.1</a></li> <li><a href=" https://github.com/BurntSushi/bstr ">bstr 1.12.1</a></li>
<li><a href=" https://github.com/japaric/cast.rs ">cast 0.3.0</a></li> <li><a href=" https://github.com/japaric/cast.rs ">cast 0.3.0</a></li>
<li><a href=" https://github.com/rust-lang/cc-rs ">cc 1.2.60</a></li> <li><a href=" https://github.com/rust-lang/cc-rs ">cc 1.2.62</a></li>
<li><a href=" https://github.com/rust-lang/cfg-if ">cfg-if 1.0.4</a></li> <li><a href=" https://github.com/rust-lang/cfg-if ">cfg-if 1.0.4</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">compression-codecs 0.4.38</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">compression-core 0.4.32</a></li>
<li><a href=" https://github.com/servo/core-foundation-rs ">core-foundation-sys 0.8.7</a></li> <li><a href=" https://github.com/servo/core-foundation-rs ">core-foundation-sys 0.8.7</a></li>
<li><a href=" https://github.com/criterion-rs/criterion.rs ">criterion-plot 0.8.2</a></li> <li><a href=" https://github.com/criterion-rs/criterion.rs ">criterion-plot 0.8.2</a></li>
<li><a href=" https://github.com/criterion-rs/criterion.rs ">criterion 0.8.2</a></li> <li><a href=" https://github.com/criterion-rs/criterion.rs ">criterion 0.8.2</a></li>
@ -2636,13 +2638,13 @@ limitations under the License.</pre>
<li><a href=" https://github.com/smol-rs/fastrand ">fastrand 2.4.1</a></li> <li><a href=" https://github.com/smol-rs/fastrand ">fastrand 2.4.1</a></li>
<li><a href=" https://github.com/rust-lang/cc-rs ">find-msvc-tools 0.1.9</a></li> <li><a href=" https://github.com/rust-lang/cc-rs ">find-msvc-tools 0.1.9</a></li>
<li><a href=" https://github.com/petgraph/fixedbitset ">fixedbitset 0.5.7</a></li> <li><a href=" https://github.com/petgraph/fixedbitset ">fixedbitset 0.5.7</a></li>
<li><a href=" https://github.com/rust-lang/flate2-rs ">flate2 1.1.9</a></li> <li><a href=" https://github.com/servo/rust-fnv ">fnv 1.0.7</a></li>
<li><a href=" https://github.com/servo/rust-url ">form_urlencoded 1.2.2</a></li> <li><a href=" https://github.com/servo/rust-url ">form_urlencoded 1.2.2</a></li>
<li><a href=" https://github.com/rust-lang/glob ">glob 0.3.3</a></li> <li><a href=" https://github.com/rust-lang/glob ">glob 0.3.3</a></li>
<li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.14.5</a></li> <li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.14.5</a></li>
<li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.15.5</a></li> <li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.15.5</a></li>
<li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.16.1</a></li> <li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.16.1</a></li>
<li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.17.0</a></li> <li><a href=" https://github.com/rust-lang/hashbrown ">hashbrown 0.17.1</a></li>
<li><a href=" https://github.com/withoutboats/heck ">heck 0.5.0</a></li> <li><a href=" https://github.com/withoutboats/heck ">heck 0.5.0</a></li>
<li><a href=" https://github.com/seanmonstar/httparse ">httparse 1.10.1</a></li> <li><a href=" https://github.com/seanmonstar/httparse ">httparse 1.10.1</a></li>
<li><a href=" https://github.com/indexmap-rs/indexmap ">indexmap 2.14.0</a></li> <li><a href=" https://github.com/indexmap-rs/indexmap ">indexmap 2.14.0</a></li>
@ -2660,6 +2662,8 @@ limitations under the License.</pre>
<li><a href=" https://github.com/servo/rust-url/ ">percent-encoding 2.3.2</a></li> <li><a href=" https://github.com/servo/rust-url/ ">percent-encoding 2.3.2</a></li>
<li><a href=" https://github.com/petgraph/petgraph ">petgraph 0.8.3</a></li> <li><a href=" https://github.com/petgraph/petgraph ">petgraph 0.8.3</a></li>
<li><a href=" https://github.com/rust-lang/pkg-config-rs ">pkg-config 0.3.33</a></li> <li><a href=" https://github.com/rust-lang/pkg-config-rs ">pkg-config 0.3.33</a></li>
<li><a href=" https://github.com/tokio-rs/prost ">prost-derive 0.14.3</a></li>
<li><a href=" https://github.com/tokio-rs/prost ">prost 0.14.3</a></li>
<li><a href=" https://github.com/rayon-rs/rayon ">rayon-core 1.13.0</a></li> <li><a href=" https://github.com/rayon-rs/rayon ">rayon-core 1.13.0</a></li>
<li><a href=" https://github.com/rayon-rs/rayon ">rayon 1.12.0</a></li> <li><a href=" https://github.com/rayon-rs/rayon ">rayon 1.12.0</a></li>
<li><a href=" https://github.com/rust-lang/regex ">regex-automata 0.4.14</a></li> <li><a href=" https://github.com/rust-lang/regex ">regex-automata 0.4.14</a></li>
@ -3689,215 +3693,6 @@ APPENDIX: How to apply the Apache License to your work.
Copyright [yyyy] [name of copyright owner] Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an &quot;AS IS&quot; BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
</pre>
</li>
<li class="license">
<h3 id="Apache-2.0">Apache License 2.0</h3>
<h4>Used by:</h4>
<ul class="license-used-by">
<li><a href=" https://github.com/oyvindln/adler2 ">adler2 2.0.1</a></li>
</ul>
<pre class="license-text"> Apache License
Version 2.0, January 2004
https://www.apache.org/licenses/LICENSE-2.0
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
&quot;License&quot; shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
&quot;Licensor&quot; shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
&quot;Legal Entity&quot; shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
&quot;control&quot; means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
&quot;You&quot; (or &quot;Your&quot;) shall mean an individual or Legal Entity
exercising permissions granted by this License.
&quot;Source&quot; form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
&quot;Object&quot; form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
&quot;Work&quot; shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
&quot;Derivative Works&quot; shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
&quot;Contribution&quot; shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, &quot;submitted&quot;
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as &quot;Not a Contribution.&quot;
&quot;Contributor&quot; shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a &quot;NOTICE&quot; text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an &quot;AS IS&quot; BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets &quot;[]&quot;
replaced with your own identifying information. (Don&#x27;t include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same &quot;printed page&quot; as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the &quot;License&quot;); Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
You may obtain a copy of the License at You may obtain a copy of the License at
@ -4335,23 +4130,21 @@ limitations under the License.
<h4>Used by:</h4> <h4>Used by:</h4>
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/zrzka/anes-rs ">anes 0.1.6</a></li> <li><a href=" https://github.com/zrzka/anes-rs ">anes 0.1.6</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">async-compression 0.4.41</a></li> <li><a href=" https://github.com/dtolnay/anyhow ">anyhow 1.0.102</a></li>
<li><a href=" https://github.com/BLAKE3-team/BLAKE3 ">blake3 1.8.5</a></li> <li><a href=" https://github.com/BLAKE3-team/BLAKE3 ">blake3 1.8.5</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">compression-codecs 0.4.37</a></li>
<li><a href=" https://github.com/Nullus157/async-compression ">compression-core 0.4.31</a></li>
<li><a href=" https://github.com/cesarb/constant_time_eq ">constant_time_eq 0.4.2</a></li> <li><a href=" https://github.com/cesarb/constant_time_eq ">constant_time_eq 0.4.2</a></li>
<li><a href=" https://github.com/soc/directories-rs ">directories 6.0.0</a></li> <li><a href=" https://github.com/soc/directories-rs ">directories 6.0.0</a></li>
<li><a href=" https://github.com/dirs-dev/dirs-sys-rs ">dirs-sys 0.5.0</a></li> <li><a href=" https://github.com/dirs-dev/dirs-sys-rs ">dirs-sys 0.5.0</a></li>
<li><a href=" https://github.com/VoidStarKat/half-rs ">half 2.7.1</a></li> <li><a href=" https://github.com/VoidStarKat/half-rs ">half 2.7.1</a></li>
<li><a href=" https://github.com/dtolnay/itoa ">itoa 1.0.18</a></li> <li><a href=" https://github.com/dtolnay/itoa ">itoa 1.0.18</a></li>
<li><a href=" https://github.com/rust-lang/libc ">libc 0.2.185</a></li> <li><a href=" https://github.com/rust-lang/libc ">libc 0.2.186</a></li>
<li><a href=" https://github.com/Frommi/miniz_oxide/tree/master/miniz_oxide ">miniz_oxide 0.8.9</a></li>
<li><a href=" https://github.com/jhpratt/num-conv ">num-conv 0.2.1</a></li> <li><a href=" https://github.com/jhpratt/num-conv ">num-conv 0.2.1</a></li>
<li><a href=" https://github.com/taiki-e/pin-project-lite ">pin-project-lite 0.2.17</a></li> <li><a href=" https://github.com/taiki-e/pin-project-lite ">pin-project-lite 0.2.17</a></li>
<li><a href=" https://github.com/taiki-e/portable-atomic ">portable-atomic 1.13.1</a></li> <li><a href=" https://github.com/taiki-e/portable-atomic ">portable-atomic 1.13.1</a></li>
<li><a href=" https://github.com/dtolnay/proc-macro2 ">proc-macro2 1.0.106</a></li> <li><a href=" https://github.com/dtolnay/proc-macro2 ">proc-macro2 1.0.106</a></li>
<li><a href=" https://github.com/dtolnay/quote ">quote 1.0.45</a></li> <li><a href=" https://github.com/dtolnay/quote ">quote 1.0.45</a></li>
<li><a href=" https://github.com/rust-random/rand ">rand 0.10.1</a></li> <li><a href=" https://github.com/rust-random/rand ">rand 0.10.1</a></li>
<li><a href=" https://github.com/rust-lang/rustc-hash ">rustc-hash 2.1.2</a></li>
<li><a href=" https://github.com/dtolnay/ryu ">ryu 1.0.23</a></li> <li><a href=" https://github.com/dtolnay/ryu ">ryu 1.0.23</a></li>
<li><a href=" https://github.com/serde-rs/serde ">serde 1.0.228</a></li> <li><a href=" https://github.com/serde-rs/serde ">serde 1.0.228</a></li>
<li><a href=" https://github.com/serde-rs/serde ">serde_core 1.0.228</a></li> <li><a href=" https://github.com/serde-rs/serde ">serde_core 1.0.228</a></li>
@ -4360,7 +4153,7 @@ limitations under the License.
<li><a href=" https://github.com/dtolnay/path-to-error ">serde_path_to_error 0.1.20</a></li> <li><a href=" https://github.com/dtolnay/path-to-error ">serde_path_to_error 0.1.20</a></li>
<li><a href=" https://github.com/nox/serde_urlencoded ">serde_urlencoded 0.7.1</a></li> <li><a href=" https://github.com/nox/serde_urlencoded ">serde_urlencoded 0.7.1</a></li>
<li><a href=" https://github.com/comex/rust-shlex ">shlex 1.3.0</a></li> <li><a href=" https://github.com/comex/rust-shlex ">shlex 1.3.0</a></li>
<li><a href=" https://github.com/jedisct1/rust-siphash ">siphasher 1.0.2</a></li> <li><a href=" https://github.com/jedisct1/rust-siphash ">siphasher 1.0.3</a></li>
<li><a href=" https://github.com/dtolnay/syn ">syn 2.0.117</a></li> <li><a href=" https://github.com/dtolnay/syn ">syn 2.0.117</a></li>
<li><a href=" https://github.com/Actyx/sync_wrapper ">sync_wrapper 1.0.2</a></li> <li><a href=" https://github.com/Actyx/sync_wrapper ">sync_wrapper 1.0.2</a></li>
<li><a href=" https://github.com/dtolnay/thiserror ">thiserror-impl 2.0.18</a></li> <li><a href=" https://github.com/dtolnay/thiserror ">thiserror-impl 2.0.18</a></li>
@ -4768,7 +4561,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
<h3 id="GPL-3.0">GNU General Public License v3.0 only</h3> <h3 id="GPL-3.0">GNU General Public License v3.0 only</h3>
<h4>Used by:</h4> <h4>Used by:</h4>
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/elicpeter/nyx ">nyx-scanner 0.5.0</a></li> <li><a href=" https://github.com/elicpeter/nyx ">nyx-scanner 0.8.0</a></li>
</ul> </ul>
<pre class="license-text"> <pre class="license-text">
GNU GENERAL PUBLIC LICENSE GNU GENERAL PUBLIC LICENSE
@ -5105,6 +4898,39 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE. THE SOFTWARE.
</pre>
</li>
<li class="license">
<h3 id="MIT">MIT License</h3>
<h4>Used by:</h4>
<ul class="license-used-by">
<li><a href=" https://github.com/hyperium/h2 ">h2 0.4.14</a></li>
</ul>
<pre class="license-text">Copyright (c) 2017 h2 authors
Permission is hereby granted, free of charge, to any
person obtaining a copy of this software and associated
documentation files (the &quot;Software&quot;), to deal in the
Software without restriction, including without
limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software
is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice
shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
</pre> </pre>
</li> </li>
<li class="license"> <li class="license">
@ -5337,7 +5163,7 @@ DEALINGS IN THE SOFTWARE.
<h3 id="MIT">MIT License</h3> <h3 id="MIT">MIT License</h3>
<h4>Used by:</h4> <h4>Used by:</h4>
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/tower-rs/tower-http ">tower-http 0.6.8</a></li> <li><a href=" https://github.com/tower-rs/tower-http ">tower-http 0.6.10</a></li>
</ul> </ul>
<pre class="license-text">Copyright (c) 2019-2021 Tower Contributors <pre class="license-text">Copyright (c) 2019-2021 Tower Contributors
@ -5735,7 +5561,7 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/tokio-rs/tokio ">tokio-stream 0.1.18</a></li> <li><a href=" https://github.com/tokio-rs/tokio ">tokio-stream 0.1.18</a></li>
<li><a href=" https://github.com/tokio-rs/tokio ">tokio-util 0.7.18</a></li> <li><a href=" https://github.com/tokio-rs/tokio ">tokio-util 0.7.18</a></li>
<li><a href=" https://github.com/tokio-rs/tokio ">tokio 1.52.1</a></li> <li><a href=" https://github.com/tokio-rs/tokio ">tokio 1.52.3</a></li>
</ul> </ul>
<pre class="license-text">MIT License <pre class="license-text">MIT License
@ -5751,35 +5577,6 @@ furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software. copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
</pre>
</li>
<li class="license">
<h3 id="MIT">MIT License</h3>
<h4>Used by:</h4>
<ul class="license-used-by">
<li><a href=" https://github.com/mcountryman/simd-adler32 ">simd-adler32 0.3.9</a></li>
</ul>
<pre class="license-text">MIT License
Copyright (c) [2021] [Marvin Countryman]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the &quot;Software&quot;), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR THE SOFTWARE IS PROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
@ -6213,7 +6010,7 @@ SOFTWARE.
<h3 id="MIT">MIT License</h3> <h3 id="MIT">MIT License</h3>
<h4>Used by:</h4> <h4>Used by:</h4>
<ul class="license-used-by"> <ul class="license-used-by">
<li><a href=" https://github.com/ivanceras/r2d2-sqlite ">r2d2_sqlite 0.33.0</a></li> <li><a href=" https://github.com/ivanceras/r2d2-sqlite ">r2d2_sqlite 0.34.0</a></li>
</ul> </ul>
<pre class="license-text">The MIT License (MIT) <pre class="license-text">The MIT License (MIT)

@ -27,7 +27,7 @@ esac
# ── Resolve "latest" to an actual release tag ────────────────────────────────
if [[ "$VERSION" == "latest" ]]; then
-echo "::warning::version: latest follows a mutable tag. Pin to a specific release (e.g. v0.5.0) for supply-chain safety."
+echo "::warning::version: latest follows a mutable tag. Pin to a specific release (e.g. v0.7.0) for supply-chain safety."
API_URL="https://api.github.com/repos/${REPO}/releases/latest"
CURL_ARGS=(-fsSL)
if [[ -n "${GITHUB_TOKEN:-}" ]]; then

@ -12,9 +12,9 @@ inputs:
required: false
default: '.'
version:
-description: 'Nyx release tag (e.g. v0.5.0). "latest" is accepted but discouraged, pinning to a specific tag protects against upstream compromise.'
+description: 'Nyx release tag (e.g. v0.7.0). "latest" is accepted but discouraged, pinning to a specific tag protects against upstream compromise.'
required: false
-default: 'v0.5.0'
+default: 'v0.7.0'
format:
description: 'Output format: sarif, json, or console'
required: false

Binary file not shown. Size before: 1.4 MiB; after: 432 KiB

Binary file not shown. Size before: 324 KiB

Binary file not shown. Size before: 520 KiB

Binary file not shown. Size after: 9.9 KiB

@ -0,0 +1,24 @@
<svg xmlns="http://www.w3.org/2000/svg" width="900" height="275" viewBox="0 0 900 275" role="img" aria-labelledby="title desc">
<title id="title">NYX</title>
<desc id="desc">NYX security scanner.</desc>
<defs>
<style>
.banner {
font-family: ui-monospace, SFMono-Regular, Menlo, Consolas, "Liberation Mono", monospace;
font-size: 38px;
font-weight: 800;
letter-spacing: 0;
white-space: pre;
}
</style>
</defs>
<g transform="translate(146 48)" xml:space="preserve">
<text class="banner" x="0" y="0" fill="#2ea067" xml:space="preserve">███╗ ██╗██╗ ██╗██╗ ██╗</text>
<text class="banner" x="0" y="43" fill="#2ea067" xml:space="preserve">████╗ ██║╚██╗ ██╔╝╚██╗██╔╝</text>
<text class="banner" x="0" y="86" fill="#2ea067" xml:space="preserve">██╔██╗ ██║ ╚████╔╝ ╚███╔╝</text>
<text class="banner" x="0" y="129" fill="#2ea067" xml:space="preserve">██║╚██╗██║ ╚██╔╝ ██╔██╗</text>
<text class="banner" x="0" y="172" fill="#2ea067" xml:space="preserve">██║ ╚████║ ██║ ██╔╝ ██╗</text>
<text class="banner" x="0" y="215" fill="#2ea067" xml:space="preserve">╚═╝ ╚═══╝ ╚═╝ ╚═╝ ╚═╝</text>
</g>
</svg>

Size after: 1.4 KiB

@ -6,5 +6,5 @@
font-weight="700"
font-size="100"
letter-spacing="-1"
-fill="#5856d6">nyx</text>
+fill="#72f3d7">nyx</text>
</svg>

Size before: 392 B; after: 392 B

Binary file not shown. Size after: 225 KiB

Binary file not shown. Size before: 231 KiB; after: 257 KiB

Binary file not shown. Size after: 204 KiB

Binary file not shown. Size after: 248 KiB

Binary file not shown. Size after: 19 MiB

Binary file not shown. Size before: 16 MiB; after: 24 MiB

Binary file not shown. Size after: 12 MiB

Binary file not shown. Size before: 51 KiB; after: 72 KiB

Binary file not shown. Size after: 49 KiB

Binary file not shown. Size before: 196 KiB; after: 222 KiB

Binary file not shown. Size after: 190 KiB

Binary file not shown. Size before: 231 KiB; after: 257 KiB

Binary file not shown. Size after: 248 KiB

Binary file not shown. Size before: 42 KiB; after: 62 KiB

Binary file not shown. Size after: 38 KiB

Binary file not shown. Size before: 296 KiB; after: 276 KiB

Binary file not shown. Size before: 198 KiB; after: 132 KiB

Binary file not shown. Size after: 96 KiB

Binary file not shown. Size before: 169 KiB; after: 137 KiB

Binary file not shown. Size after: 102 KiB

Binary file not shown. Size before: 207 KiB; after: 160 KiB

Binary file not shown. Size after: 122 KiB

Binary file not shown. Size before: 228 KiB; after: 145 KiB

Binary file not shown. Size after: 109 KiB

Binary file not shown. Size before: 168 KiB; after: 134 KiB

Binary file not shown. Size after: 99 KiB

Binary file not shown. Size before: 245 KiB; after: 168 KiB

Binary file not shown. Size after: 130 KiB

Binary file not shown. Size before: 136 KiB; after: 109 KiB

Binary file not shown. Size after: 76 KiB

Binary file not shown. Size before: 91 KiB; after: 85 KiB

Binary file not shown. Size after: 52 KiB

Binary file not shown. Size before: 113 KiB; after: 101 KiB

Binary file not shown. Size after: 68 KiB

Binary file not shown. Size before: 357 KiB; after: 167 KiB

Binary file not shown. Size before: 416 KiB; after: 233 KiB

Binary file not shown. Size before: 168 KiB; after: 134 KiB

Binary file not shown. Size before: 355 KiB; after: 166 KiB

benches/dynamic_bench.rs Normal file (686 lines)

@ -0,0 +1,686 @@
//! Dynamic verification benchmarks (§8.4).
//!
//! Tracks the per-scan cost anchors:
//!
//! 1. `harness_build_cold` — fresh workdir, spec → BuiltHarness (source gen + disk write).
//! 2. `harness_build_warm` — same spec, workdir already staged (file write skipped).
//! 3. `sandbox_run_payload` — single payload run via process backend against
//! sqli_positive.py (subprocess + settrace overhead, no networking).
//! 4. `docker_image_build` — cold image pull/build for the python:3-slim base.
//! 5. `docker_exec_warm` — `docker exec` into a running container (no cold start).
//! 6. `docker_payload_cost` — per-payload sandbox cost via docker backend end-to-end.
//! 7. `composite_chain_reverify_dispatch` — `reverify_top_chains` on a
//! synthetic 3-member chain with no member diags. Measures the no-derive
//! dispatch path (chain_step_specs miss, early-exit build/run loops,
//! Inconclusive verdict allocation, severity downgrade).
//! 8. `composite_chain_reverify_stub_confirmed` — same chain shape, stubbed
//! reverifier returning `Confirmed`. Measures the apply-verdict happy path
//! (no severity bucket change).
//! 9. `composite_chain_reverify_top_n_slice` — 5-chain slice with `top_n=3`.
//! Measures the slice traversal cost so a regression that walks the full
//! slice instead of the prefix is visible.
//! 10. `composite_chain_reverify_replay_stable` — same chain shape as
//! `stub_confirmed`, but with `VerifyOptions::replay_stable_check=true`
//! and a stub that stamps `replay_stable=Some(true)`. Anchors the
//! apply-verdict allocation cost when the telemetry stability field
//! is populated; a regression that adds per-chain work behind the
//! replay opt-in (e.g. an extra run_chain_steps call leaking out of
//! the live path into the stub layer) shows up here.
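//!
//! The `rust_`, `js_`, `go_`, `java_`, and `php_harness_build_cold` benches
//! repeat anchor 1 with a spec for each of those languages.
//!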
//! Wall-clock budget anchors for the composite reverify path: the live
//! process backend stays under 400ms per 3-member chain, the docker
//! backend under 1500ms. Those live-run numbers are covered by the
//! `flask_eval_chain_reverify_populates_dynamic_verdict` integration
//! test in `tests/chain_emission_e2e.rs`; the microbenches here anchor
//! the dispatch + verdict-application overhead so regressions on the
//! API-shape half land in the criterion baseline.
//!
//! Baselines committed to `benches/dynamic_bench_baseline.json`.
//! Run: `cargo bench --features dynamic -- dynamic`
//!
//! Docker benchmarks are no-ops when docker is unavailable (skipped, not failed).
use criterion::{Criterion, criterion_group, criterion_main};
#[cfg(feature = "dynamic")]
use nyx_scanner::dynamic::spec::{
EntryKind, HarnessSpec, JavaToolchain, PayloadSlot, SpecDerivationStrategy,
};
#[cfg(feature = "dynamic")]
use nyx_scanner::labels::Cap;
#[cfg(feature = "dynamic")]
use nyx_scanner::symbol::Lang;
#[cfg(feature = "dynamic")]
fn make_rust_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_rust_0001".into(),
entry_file: "tests/dynamic_fixtures/rust/sqli_positive.rs".into(),
entry_name: "run".into(),
entry_kind: EntryKind::Function,
lang: Lang::Rust,
toolchain_id: "rust-stable".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/rust/sqli_positive.rs".into(),
sink_line: 18,
spec_hash: "benchrustsqli0001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench0000000001".into(),
entry_file: "tests/dynamic_fixtures/python/sqli_positive.py".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Python,
toolchain_id: "python-3".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/python/sqli_positive.py".into(),
sink_line: 7,
spec_hash: "benchsqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn bench_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_sqli_spec();
c.bench_function("harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn bench_harness_build_warm(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_sqli_spec();
harness::build(&spec).expect("harness pre-stage");
c.bench_function("harness_build_warm", |b| {
b.iter(|| harness::build(&spec).expect("harness build warm"));
});
}
#[cfg(feature = "dynamic")]
fn bench_sandbox_run_payload(c: &mut Criterion) {
use nyx_scanner::dynamic::corpus::payloads_for;
use nyx_scanner::dynamic::harness;
use nyx_scanner::dynamic::sandbox::{self, SandboxOptions};
let spec = make_sqli_spec();
let harness = harness::build(&spec).expect("harness build");
let payloads = payloads_for(Cap::SQL_QUERY);
let payload = payloads
.iter()
.find(|p| !p.is_benign)
.expect("sqli payload");
let opts = SandboxOptions {
timeout: std::time::Duration::from_secs(10),
..SandboxOptions::default()
};
c.bench_function("sandbox_run_payload", |b| {
b.iter(|| sandbox::run(&harness, payload.bytes, &opts).expect("sandbox run"));
});
}
#[cfg(feature = "dynamic")]
fn docker_available() -> bool {
std::process::Command::new("docker")
.arg("info")
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status()
.map(|s| s.success())
.unwrap_or(false)
}
/// Cold docker image pull/build.
///
/// Measures how long `docker pull python:3-slim` takes. Only the first
/// iteration can hit a cold host and download the image; once the image is
/// cached locally, each pull just re-checks the tag against the registry.
///
/// Registers a labelled noop measurement when Docker is absent so criterion's
/// output is never empty for this slot.
#[cfg(feature = "dynamic")]
fn bench_docker_image_build(c: &mut Criterion) {
if !docker_available() {
c.bench_function("docker_image_build_no_docker", |b| b.iter(|| ()));
return;
}
c.bench_function("docker_image_build", |b| {
b.iter(|| {
// `docker pull` is idempotent and fast when image is already local.
let _ = std::process::Command::new("docker")
.args(["pull", "python:3-slim"])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
});
});
}
/// Warm `docker exec` reuse benchmark.
///
/// Starts a single container before the benchmark loop and measures the cost
/// of each `docker exec` call. Container start-up is excluded; compare against
/// `bench_docker_payload_cost` to see what the cold start adds. Skips with a
/// message on stderr when Docker is absent.
#[cfg(feature = "dynamic")]
fn bench_docker_exec_warm(c: &mut Criterion) {
if !docker_available() {
eprintln!("bench_docker_exec_warm: docker unavailable, skipping");
return;
}
// Start a long-lived container for the benchmark.
let container = "nyx-bench-exec-warm";
let _ = std::process::Command::new("docker")
.args([
"run",
"-d",
"--rm",
"--name",
container,
"--cap-drop=ALL",
"--security-opt",
"no-new-privileges:true",
"--network",
"none",
"python:3-slim",
"sleep",
"300",
])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
c.bench_function("docker_exec_warm", |b| {
b.iter(|| {
let _ = std::process::Command::new("docker")
.args(["exec", container, "python3", "-c", "pass"])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
});
});
let _ = std::process::Command::new("docker")
.args(["stop", container])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
}
/// Per-payload sandbox cost via docker backend end-to-end.
///
/// Measures the complete path with the harness already built: the docker
/// backend runs one SQLi payload against the sqli_positive fixture. The first
/// call includes container start; subsequent calls show exec-reuse cost.
///
/// Registers a labelled noop measurement when Docker is absent so criterion's
/// output is never empty for this slot.
#[cfg(feature = "dynamic")]
fn bench_docker_payload_cost(c: &mut Criterion) {
if !docker_available() {
c.bench_function("docker_payload_cost_no_docker", |b| b.iter(|| ()));
return;
}
use nyx_scanner::dynamic::corpus::payloads_for;
use nyx_scanner::dynamic::harness;
use nyx_scanner::dynamic::sandbox::{self, SandboxBackend, SandboxOptions};
let spec = make_sqli_spec();
let built = harness::build(&spec).expect("harness build");
let payloads = payloads_for(Cap::SQL_QUERY);
let payload = payloads
.iter()
.find(|p| !p.is_benign)
.expect("sqli payload");
let opts = SandboxOptions {
timeout: std::time::Duration::from_secs(30),
backend: SandboxBackend::Docker,
..SandboxOptions::default()
};
c.bench_function("docker_payload_cost", |b| {
b.iter(|| {
let _ = sandbox::run(&built, payload.bytes, &opts);
});
});
}
/// Rust harness build (source gen + disk write, no compilation).
///
/// Measures only `harness::build()` — staging files to the workdir.
/// The expensive `cargo build --release` step is NOT included here
/// (that is the province of an integration benchmark, not this microbench).
#[cfg(feature = "dynamic")]
fn bench_rust_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_rust_sqli_spec();
c.bench_function("rust_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn make_js_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_js_0001".into(),
entry_file: "tests/dynamic_fixtures/js/sqli_positive.js".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::JavaScript,
toolchain_id: "node-20".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/js/sqli_positive.js".into(),
sink_line: 8,
spec_hash: "benchjssqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_go_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_go_0001".into(),
entry_file: "tests/dynamic_fixtures/go/sqli_positive.go".into(),
entry_name: "Login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Go,
toolchain_id: "go-1.21".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/go/sqli_positive.go".into(),
sink_line: 12,
spec_hash: "benchgosqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_java_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_java_0001".into(),
entry_file: "tests/dynamic_fixtures/java/sqli_positive.java".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Java,
toolchain_id: "java-21".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/java/sqli_positive.java".into(),
sink_line: 9,
spec_hash: "benchjavasqli00001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_php_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_php_0001".into(),
entry_file: "tests/dynamic_fixtures/php/sqli_positive.php".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Php,
toolchain_id: "php-8".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/php/sqli_positive.php".into(),
sink_line: 9,
spec_hash: "benchphpsqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
/// JS harness build (source gen + disk write).
#[cfg(feature = "dynamic")]
fn bench_js_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_js_sqli_spec();
c.bench_function("js_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("JS harness build")
});
});
}
/// Go harness build (source gen + disk write, no compilation).
#[cfg(feature = "dynamic")]
fn bench_go_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_go_sqli_spec();
c.bench_function("go_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("Go harness build")
});
});
}
/// Java harness build (source gen + disk write, no compilation).
#[cfg(feature = "dynamic")]
fn bench_java_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_java_sqli_spec();
c.bench_function("java_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("Java harness build")
});
});
}
/// PHP harness build (source gen + disk write).
#[cfg(feature = "dynamic")]
fn bench_php_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_php_sqli_spec();
c.bench_function("php_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("PHP harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn mk_chain_member(hash: u64, idx: usize) -> nyx_scanner::chain::FindingRef {
use nyx_scanner::surface::SourceLocation;
nyx_scanner::chain::FindingRef {
finding_id: format!("bench-chain-member-{idx}"),
stable_hash: hash,
location: SourceLocation::new("bench/synthetic.py", (idx as u32) + 1, 1),
rule_id: "taint-unsanitised-flow".into(),
cap_bits: 0,
}
}
#[cfg(feature = "dynamic")]
fn mk_synthetic_chain(hash: u64, members: usize) -> nyx_scanner::chain::ChainFinding {
use nyx_scanner::chain::{ChainFinding, ChainSeverity, ChainSink, ImpactCategory};
ChainFinding {
stable_hash: hash,
members: (0..members)
.map(|i| mk_chain_member(hash.wrapping_add(i as u64 + 1), i))
.collect(),
sink: ChainSink {
file: "bench/synthetic.py".into(),
line: 99,
col: 1,
function_name: "sink".into(),
cap_bits: 0,
},
implied_impact: ImpactCategory::Rce,
severity: ChainSeverity::Critical,
score: 100.0,
dynamic_verdict: None,
reverify_reason: None,
}
}
#[cfg(feature = "dynamic")]
struct BenchConfirmedReverifier;
#[cfg(feature = "dynamic")]
impl nyx_scanner::chain::CompositeReverifier for BenchConfirmedReverifier {
fn reverify(
&self,
_chain: &nyx_scanner::chain::ChainFinding,
_member_diags: &[nyx_scanner::commands::scan::Diag],
_surface: &nyx_scanner::surface::SurfaceMap,
opts: &nyx_scanner::dynamic::verify::VerifyOptions,
) -> nyx_scanner::evidence::VerifyResult {
// Mirror `DefaultCompositeReverifier::reverify`'s replay-stable
// stamping shape so the apply-verdict allocation cost matches
// the live path when the opt-in is on. The stub does not
// re-run any work (it has none to re-run) but the resulting
// `VerifyResult` populates `replay_stable=Some(true)` so
// downstream sites that branch on the field exercise the same
// path they would for a real Confirmed-with-stable run.
let replay_stable = if opts.replay_stable_check {
Some(true)
} else {
None
};
nyx_scanner::evidence::VerifyResult {
finding_id: "bench".into(),
status: nyx_scanner::evidence::VerifyStatus::Confirmed,
triggered_payload: None,
reason: None,
inconclusive_reason: None,
detail: None,
attempts: vec![],
toolchain_match: None,
differential: None,
replay_stable,
wrong: None,
hardening_outcome: None,
}
}
}
/// Phase 26 dispatch-cost anchor: synthetic 3-member chain with no
/// matching member diags. The reverifier walks chain_step_specs (3
/// HashMap misses → 3 NoFlowSteps errors), the build loop sees zero
/// derived specs and exits early, the run loop sees zero built steps
/// and exits early. The composed VerifyResult is allocated and applied
/// via `apply_dynamic_verdict` (Inconclusive → severity downgrade).
///
/// This is the no-toolchain-dep dispatch overhead — a regression here
/// signals a hot-path allocation introduced into the reverify pipeline.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_dispatch(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
c.bench_function("composite_chain_reverify_dispatch", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC1A1, 3)];
let _ = reverify::reverify_top_chains(&mut chains, &[], &surface, &opts, 1);
});
});
}
/// Phase 26 stub-reverifier happy-path anchor: synthetic 3-member
/// chain driven through `reverify_top_chains_with` + a stubbed
/// reverifier returning `Confirmed`. Measures the apply-verdict path
/// when the verdict does NOT trigger a severity downgrade, so the
/// `ChainReverifyResult` allocation + `chain.apply_dynamic_verdict`
/// transition cost is exercised independent of the verdict-side
/// allocation in the dispatch bench.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_stub_confirmed(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_stub_confirmed", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC2A2, 3)];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
1,
&reverifier,
);
});
});
}
/// Phase 26 top-N slice anchor: 5-chain slice with `top_n=3`. Asserts
/// (by way of regression) that the reverify pass never walks past the
/// top-N prefix. The fan-in is the per-chain dispatch cost times three;
/// a regression that drops the `bound = top_n.min(chains.len())` cap
/// would show up as a ~5/3 increase in this bench.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_top_n_slice(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_top_n_slice", |b| {
b.iter(|| {
let mut chains: [nyx_scanner::chain::ChainFinding; 5] = [
mk_synthetic_chain(0xC301, 3),
mk_synthetic_chain(0xC302, 3),
mk_synthetic_chain(0xC303, 3),
mk_synthetic_chain(0xC304, 3),
mk_synthetic_chain(0xC305, 3),
];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
3,
&reverifier,
);
});
});
}
/// Phase 26 replay-stable anchor: same 3-member synthetic chain as
/// `stub_confirmed`, driven through `reverify_top_chains_with` with
/// `VerifyOptions::replay_stable_check=true`. The `BenchConfirmedReverifier`
/// stub honours the opt-in by stamping `replay_stable=Some(true)` on
/// the returned `VerifyResult`, exercising the apply-verdict path with
/// the telemetry stability field populated.
///
/// Purpose: anchor the cost of the replay-stable apply path so a
/// regression that leaks a real `run_chain_steps` invocation into the
/// stubbed verifier layer (or that allocates extra state behind the
/// `replay_stable_check` toggle in `chain::reverify::apply_one`) shows
/// up immediately against the `stub_confirmed` baseline.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_replay_stable(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions {
replay_stable_check: true,
..VerifyOptions::default()
};
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_replay_stable", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC4A3, 3)];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
1,
&reverifier,
);
});
});
}
#[cfg(feature = "dynamic")]
#[allow(dead_code)]
fn bench_noop(_c: &mut Criterion) {}
// When dynamic feature is off, provide a stub so the binary still links.
#[cfg(not(feature = "dynamic"))]
fn bench_noop(c: &mut Criterion) {
c.bench_function("dynamic_disabled_noop", |b| b.iter(|| ()));
}
#[cfg(feature = "dynamic")]
criterion_group!(
dynamic,
bench_harness_build_cold,
bench_harness_build_warm,
bench_sandbox_run_payload,
bench_docker_image_build,
bench_docker_exec_warm,
bench_docker_payload_cost,
bench_rust_harness_build_cold,
bench_js_harness_build_cold,
bench_go_harness_build_cold,
bench_java_harness_build_cold,
bench_php_harness_build_cold,
bench_composite_chain_reverify_dispatch,
bench_composite_chain_reverify_stub_confirmed,
bench_composite_chain_reverify_top_n_slice,
bench_composite_chain_reverify_replay_stable,
);
#[cfg(not(feature = "dynamic"))]
criterion_group!(dynamic, bench_noop);
criterion_main!(dynamic);
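
The doc comment on `bench_composite_chain_reverify_top_n_slice` refers to a `bound = top_n.min(chains.len())` cap that keeps the reverify pass inside the top-N prefix. The following is a minimal standalone sketch of that idea, not the scanner's actual code: `Chain` and `reverify_prefix` are hypothetical stand-ins, and the only detail taken from the source is the `min` bound itself.

```rust
/// Hypothetical stand-in for a chain finding; only the fields the sketch needs.
struct Chain {
    id: u32,
    verified: bool,
}

/// Visits at most `top_n` chains from the front of the slice and returns
/// how many were visited. The `min` keeps the bound valid when the slice
/// is shorter than `top_n`.
fn reverify_prefix(chains: &mut [Chain], top_n: usize) -> usize {
    let bound = top_n.min(chains.len());
    for chain in &mut chains[..bound] {
        // Placeholder for the real per-chain work.
        chain.verified = true;
    }
    bound
}

fn main() {
    // Same shape as the bench: a 5-chain slice with top_n = 3.
    let mut chains: Vec<Chain> = (0..5).map(|id| Chain { id, verified: false }).collect();
    let visited = reverify_prefix(&mut chains, 3);
    println!("visited {} of {} chains", visited, chains.len());
    for c in &chains {
        println!("chain {} verified: {}", c.id, c.verified);
    }
}
```

If the cap were dropped and the loop walked the whole slice, the per-iteration work in this sketch would grow from three chains to five, which is the roughly 5/3 increase the bench comment describes.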

@ -0,0 +1,26 @@
{
"schema": 1,
"note": "ASPIRATIONAL placeholder — values were hand-typed, not captured from a real bench run. Regenerate with: benches/regen_baseline.sh (requires --features dynamic and python3 on PATH). Commit the updated file to establish a real regression reference for M3+.",
"benchmarks": {
"harness_build_cold": {
"mean_ns": 800000,
"stddev_ns": 120000,
"description": "Fresh workdir; spec → BuiltHarness including source gen + disk write."
},
"harness_build_warm": {
"mean_ns": 180000,
"stddev_ns": 30000,
"description": "Workdir already staged; file write skipped by dst.exists() guard."
},
"sandbox_run_payload": {
"mean_ns": 120000000,
"stddev_ns": 15000000,
"description": "Single process-backend run with sqli payload; includes python3 startup + settrace."
}
},
"regression_thresholds": {
"harness_build_cold": 2.0,
"harness_build_warm": 2.0,
"sandbox_run_payload": 1.5
}
}
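
The `regression_thresholds` values read naturally as ratios against `mean_ns`, although the diff shown here does not include the code that consumes them. Assuming that reading, a gate could be sketched as below; `is_regression` is a hypothetical helper, and the numbers come from the placeholder baseline above.

```rust
/// Hypothetical helper: returns true when `measured_ns` exceeds the baseline
/// mean by more than the allowed ratio `threshold`.
fn is_regression(baseline_mean_ns: u64, measured_ns: u64, threshold: f64) -> bool {
    if baseline_mean_ns == 0 {
        // A zero baseline carries no information, so nothing can be judged.
        return false;
    }
    (measured_ns as f64) / (baseline_mean_ns as f64) > threshold
}

fn main() {
    // harness_build_cold: baseline 800_000 ns, threshold 2.0.
    println!("{}", is_regression(800_000, 1_500_000, 2.0)); // ratio 1.875
    println!("{}", is_regression(800_000, 1_700_000, 2.0)); // ratio 2.125
    // sandbox_run_payload: baseline 120_000_000 ns, threshold 1.5.
    println!("{}", is_regression(120_000_000, 200_000_000, 1.5));
}
```

Because the baseline values are labelled as hand-typed placeholders, any gate built this way is only as meaningful as the numbers captured by a real run.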

File diff suppressed because it is too large

benches/regen_baseline.sh Executable file (84 lines)

@ -0,0 +1,84 @@
#!/usr/bin/env bash
# Regenerate benches/dynamic_bench_baseline.json from a real cargo bench run.
#
# Usage:
# bash benches/regen_baseline.sh
#
# Requirements:
# - python3 on PATH
# - cargo (nightly or stable with edition 2024)
# - Criterion's JSON output (criterion feature already in dev-deps)
#
# The script runs the dynamic bench group, parses Criterion's estimates JSON,
# and overwrites dynamic_bench_baseline.json with real numbers.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
BASELINE_FILE="${SCRIPT_DIR}/dynamic_bench_baseline.json"
echo "Running cargo bench --features dynamic -- dynamic ..."
cargo bench --manifest-path "${REPO_ROOT}/Cargo.toml" \
--features dynamic \
-- dynamic \
2>&1 | tee /tmp/nyx_bench_raw.txt
# Criterion writes the estimates of the most recent run to
# target/criterion/<benchmark_id>/new/estimates.json.
# Extract the mean and standard deviation (in ns) for each tracked benchmark.
extract_ns() {
local path="$1"
if [[ -f "${path}" ]]; then
# Pass the path as an argument rather than interpolating it into the Python source.
python3 -c "
import json, sys
d = json.load(open(sys.argv[1]))
mean = d['mean']['point_estimate']
stddev = d['std_dev']['point_estimate'] if 'std_dev' in d else 0
print(int(mean), int(stddev))
" "${path}"
else
# Missing estimates are recorded as zeros; warn so the gap is visible.
echo "warning: ${path} not found, recording 0 0" >&2
echo "0 0"
fi
}
TARGET="${REPO_ROOT}/target/criterion"
read -r COLD_MEAN COLD_STDDEV < <(extract_ns "${TARGET}/harness_build_cold/new/estimates.json")
read -r WARM_MEAN WARM_STDDEV < <(extract_ns "${TARGET}/harness_build_warm/new/estimates.json")
read -r RUN_MEAN RUN_STDDEV < <(extract_ns "${TARGET}/sandbox_run_payload/new/estimates.json")
MACHINE="$(uname -m) / $(uname -s)"
NYX_VER="$(cargo metadata --manifest-path "${REPO_ROOT}/Cargo.toml" --no-deps --format-version 1 \
| python3 -c "import json,sys; d=json.load(sys.stdin); print(next(p['version'] for p in d['packages'] if p['name']=='nyx-scanner'))")"
DATE="$(date +%Y-%m-%d)"
cat > "${BASELINE_FILE}" <<EOF
{
"schema": 1,
"note": "Baseline captured on ${MACHINE}, nyx v${NYX_VER}, ${DATE}. Regenerate with: benches/regen_baseline.sh",
"benchmarks": {
"harness_build_cold": {
"mean_ns": ${COLD_MEAN},
"stddev_ns": ${COLD_STDDEV},
"description": "Fresh workdir; spec → BuiltHarness including source gen + disk write."
},
"harness_build_warm": {
"mean_ns": ${WARM_MEAN},
"stddev_ns": ${WARM_STDDEV},
"description": "Workdir already staged; file write skipped by dst.exists() guard."
},
"sandbox_run_payload": {
"mean_ns": ${RUN_MEAN},
"stddev_ns": ${RUN_STDDEV},
"description": "Single process-backend run with sqli payload; includes python3 startup + settrace."
}
},
"regression_thresholds": {
"harness_build_cold": 2.0,
"harness_build_warm": 2.0,
"sandbox_run_payload": 1.5
}
}
EOF
echo "Updated ${BASELINE_FILE}"
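The `regression_thresholds` values written above read as ratio limits against `mean_ns`. A minimal sketch of how a gate might consume them follows, assuming a regression means "current mean exceeds baseline mean times the threshold". The names `GateResult` and `check_regression` are hypothetical and are not taken from this repository; the `NoBaseline` case reflects the `0 0` that `extract_ns` prints when an estimates file is missing.

```rust
/// Hypothetical outcome of comparing one benchmark against its baseline.
#[derive(Debug, PartialEq)]
pub enum GateResult {
    /// Current mean is within `threshold` times the baseline mean.
    Pass,
    /// Current mean exceeded the allowed ratio.
    Regressed { ratio: f64 },
    /// Baseline mean is zero (for example, the estimates file was missing).
    NoBaseline,
}

/// Compare a fresh mean against the recorded baseline mean.
/// `threshold` is a ratio such as 2.0 or 1.5 (assumed semantics).
pub fn check_regression(baseline_mean_ns: u64, current_mean_ns: u64, threshold: f64) -> GateResult {
    if baseline_mean_ns == 0 {
        return GateResult::NoBaseline;
    }
    let ratio = current_mean_ns as f64 / baseline_mean_ns as f64;
    if ratio > threshold {
        GateResult::Regressed { ratio }
    } else {
        GateResult::Pass
    }
}
```

Treating a zero baseline as its own case avoids a division by zero and keeps a missing estimates file from silently passing or failing the gate.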


@ -157,6 +157,7 @@ fn bench_state_analysis_only(c: &mut Criterion) {
&[],
&std::collections::HashSet::new(),
None,
None,
)
});
});
@ -172,6 +173,266 @@ fn bench_classify(c: &mut Criterion) {
});
}
/// Per-file fused analysis throughput on a realistic ~1.5k-line Go module
/// (gin context.go, ~147 fns). Guards the
/// `ParsedFile::body_const_facts_cache` optimization that collapses the
/// 2-3× per-body re-lowering that previously dominated `analyse_file_fused`
/// (~14% of wall-clock on the gin-scan profile). Regressions here mean
/// per-body work is being recomputed across passes again.
fn bench_analyse_file_fused_large_go(c: &mut Criterion) {
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let bytes = std::fs::read(&fixture).expect("read fixture");
let mut cfg = Config::default();
cfg.scanner.mode = AnalysisMode::Full;
cfg.scanner.enable_state_analysis = true;
cfg.performance.worker_threads = Some(1);
// One-shot diagnostic: count `build_body_const_facts` calls per fused
// analysis so a regression that removes the per-file cache surfaces here
// (expected ~148 calls on this fixture; pre-cache was ~444).
nyx_scanner::cfg_analysis::BUILD_BODY_CONST_FACTS_CALLS
.store(0, std::sync::atomic::Ordering::Relaxed);
let _ = nyx_scanner::ast::analyse_file_fused(&bytes, &fixture, &cfg, None, None)
.expect("warmup analyse");
let calls = nyx_scanner::cfg_analysis::BUILD_BODY_CONST_FACTS_CALLS
.load(std::sync::atomic::Ordering::Relaxed);
eprintln!("[diag] build_body_const_facts calls per analyse_file_fused: {calls}");
c.bench_function("analyse_file_fused_large_go", |b| {
b.iter(|| {
nyx_scanner::ast::analyse_file_fused(&bytes, &fixture, &cfg, None, None)
.expect("analyse_file_fused")
});
});
}
/// Per-file `extract_authorization_model` throughput on the realistic
/// ~1.5k-line Go fixture (gin context.go). Guards the
/// `extract_authorization_model` orchestrator hoist that pulled the
/// shared `collect_top_level_units` AST walk out of every supporting
/// extractor's `extract()` (one walk per file instead of one per
/// matching extractor). On Go files both `EchoExtractor` and
/// `GinExtractor` match by default — pre-hoist this bench measured the
/// AST being walked twice; regressions here mean the hoist has been
/// broken or a new Go extractor was added that re-walks the tree.
fn bench_extract_authorization_model_go(c: &mut Criterion) {
use tree_sitter::Parser;
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let bytes = std::fs::read(&fixture).expect("read fixture");
let mut parser = Parser::new();
let go_lang: tree_sitter::Language = tree_sitter_go::LANGUAGE.into();
parser.set_language(&go_lang).expect("set go grammar");
let tree = parser.parse(&bytes, None).expect("parse fixture");
let cfg = Config::default();
let rules = nyx_scanner::auth_analysis::config::build_auth_rules(&cfg, "go");
c.bench_function("extract_authorization_model_go", |b| {
b.iter(|| {
nyx_scanner::auth_analysis::extract::extract_authorization_model(
"go",
cfg.framework_ctx.as_ref(),
&tree,
&bytes,
&fixture,
&rules,
None,
)
});
});
}
/// Per-file shared-vs-double `extract_authorization_model` cost on a
/// realistic Go fixture (gin context.go). Pre-fix
/// `analyse_file_fused` called `extract_authorization_model` twice per
/// file (once for diagnostics via `run_auth_analysis`, once for
/// per-file summary keying via `extract_auth_summaries_by_key`). This
/// bench records the **shared-model path** only (extract once, derive
/// both summaries + diagnostics) so a regression that re-introduces
/// the double-call surfaces as a ≥1.7× slowdown here.
fn bench_extract_authorization_model_shared_go(c: &mut Criterion) {
use tree_sitter::Parser;
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let bytes = std::fs::read(&fixture).expect("read fixture");
let mut parser = Parser::new();
let go_lang: tree_sitter::Language = tree_sitter_go::LANGUAGE.into();
parser.set_language(&go_lang).expect("set go grammar");
let tree = parser.parse(&bytes, None).expect("parse fixture");
let cfg = Config::default();
let rules = nyx_scanner::auth_analysis::config::build_auth_rules(&cfg, "go");
c.bench_function("extract_authorization_model_shared_go", |b| {
b.iter(|| {
// Mirror `analyse_file_fused`: extract once, derive both
// per-file summaries (cheap iter over units) AND run the
// full diagnostic pipeline against the same model.
let model = nyx_scanner::auth_analysis::extract::extract_authorization_model(
"go",
cfg.framework_ctx.as_ref(),
&tree,
&bytes,
&fixture,
&rules,
None,
);
let summaries = nyx_scanner::auth_analysis::extract_auth_summaries_from_model(
&model, "go", &fixture, None,
);
let diags = nyx_scanner::auth_analysis::run_auth_analysis_with_model(
model, &tree, "go", &fixture, &rules, None, None, None,
);
(summaries, diags)
});
});
}
/// Per-file `collect_top_level_units` cost on a realistic Go fixture
/// (gin context.go, ~147 functions). Targets the inner per-function
/// AST-walk path: `collect_top_level_units` →
/// `build_function_unit_with_meta` → `collect_unit_state` (recursive
/// per-AST-node walk that emits per-node value-refs).
///
/// Pre-fix (2026-05-04 perfhunt session-0009) `collect_unit_state`
/// called `extract_value_refs(node, bytes)` at every AST node, and that
/// helper recursively walked the node's full subtree. Combined with
/// the recursion below, every descendant got walked once for each of
/// its ancestors — total work O(N²) per function body. The fix
/// replaced that call with an O(1)-per-node `append_shallow_value_ref`
/// helper. A regression that re-introduces the deep walk surfaces
/// here as a ≥2× slowdown.
fn bench_collect_top_level_units_go(c: &mut Criterion) {
use tree_sitter::Parser;
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let bytes = std::fs::read(&fixture).expect("read fixture");
let mut parser = Parser::new();
let go_lang: tree_sitter::Language = tree_sitter_go::LANGUAGE.into();
parser.set_language(&go_lang).expect("set go grammar");
let tree = parser.parse(&bytes, None).expect("parse fixture");
let cfg = Config::default();
let rules = nyx_scanner::auth_analysis::config::build_auth_rules(&cfg, "go");
c.bench_function("collect_top_level_units_go", |b| {
b.iter(|| {
let mut model = nyx_scanner::auth_analysis::model::AuthorizationModel::default();
nyx_scanner::auth_analysis::extract::common::collect_top_level_units(
tree.root_node(),
&bytes,
&rules,
&mut model,
);
model
});
});
}
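The O(N²) claim in the doc comment above can be illustrated with a toy tree. This is only a sketch under stated assumptions: `Node`, `visits_deep`, `visits_shallow` and `chain` are hypothetical stand-ins, not the scanner's `collect_unit_state` or `extract_value_refs`. The deep variant re-walks the full subtree at every node before recursing, while the shallow variant does constant work per node.

```rust
/// Hypothetical AST node: only the child list matters for counting visits.
pub struct Node {
    pub children: Vec<Node>,
}

/// Number of nodes in the subtree rooted at `node` (one full walk).
fn subtree_size(node: &Node) -> usize {
    1 + node.children.iter().map(subtree_size).sum::<usize>()
}

/// Deep variant: every node triggers a full subtree walk, then recursion
/// repeats the same walk for each descendant.
pub fn visits_deep(node: &Node) -> usize {
    subtree_size(node) + node.children.iter().map(visits_deep).sum::<usize>()
}

/// Shallow variant: one unit of work per node.
pub fn visits_shallow(node: &Node) -> usize {
    1 + node.children.iter().map(visits_shallow).sum::<usize>()
}

/// Build a single-child chain containing `depth` nodes (for `depth >= 1`).
pub fn chain(depth: usize) -> Node {
    let mut node = Node { children: Vec::new() };
    for _ in 1..depth {
        node = Node { children: vec![node] };
    }
    node
}
```

On a chain of n nodes the shallow walk does n units of work and the deep walk does n(n+1)/2, which is the worst case for this shape; bushier trees fall somewhere in between.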
/// SCCP throughput on every SSA body lowered from the gin context.go
/// fixture. Targets `nyx_scanner::ssa::const_prop::const_propagate`
/// directly, isolating it from the surrounding `optimize_ssa` pass and
/// the full-fused per-file analysis.
///
/// Pre-fix (2026-05-04 perfhunt) `const_propagate` stored its lattice in
/// `HashMap<SsaValue, ConstLattice>` and walked
/// `inst_uses(inst).contains(&val)` for every block re-evaluation in the
/// SSA worklist — both shapes paid `SipHash` cost on every operand, and
/// the `inst_uses` factory allocated a fresh `Vec<SsaValue>` on every
/// call. Switching the lattice + executable-edge maps to dense
/// `Vec`-indexed storage and the use-check to a zero-allocation
/// predicate cut `const_propagate` self-time roughly in half on the
/// large-Go fixture. A regression that re-introduces the hash-keyed
/// inner loop will surface here as a ≥1.4× slowdown.
fn bench_const_propagate_large_go(c: &mut Criterion) {
use nyx_scanner::ssa;
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let cfg_obj = Config::default();
let (file_cfg, _lang) = nyx_scanner::ast::build_cfg_for_file(&fixture, &cfg_obj)
.expect("build cfg")
.expect("supported language");
// Lower every body once outside the bench loop so we measure only
// SCCP cost. The collected `(SsaBody, Cfg)` pairs are the input to
// the inner loop.
let mut bodies: Vec<ssa::ir::SsaBody> = Vec::new();
for body in &file_cfg.bodies {
// Use `body.meta.name` as the scope filter so the SSA lowering
// pulls only this function's nodes; `scope_all=true` is reserved
// for the synthetic top-level body where `name` is None.
let scope = body.meta.name.as_deref();
let scope_all = scope.is_none();
match ssa::lower_to_ssa(&body.graph, body.entry, scope, scope_all) {
Ok(ssa_body) => bodies.push(ssa_body),
Err(_) => continue,
}
}
eprintln!(
"[diag] const_propagate bench: {} bodies lowered",
bodies.len()
);
c.bench_function("const_propagate_large_go", |b| {
b.iter(|| {
let mut total_values = 0usize;
for body in &bodies {
let result = ssa::const_prop::const_propagate(body);
total_values += result.values.len();
}
total_values
});
});
}
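The dense-storage idea described in the doc comment above can be sketched as follows. This assumes SSA values are numbered `0..n` so that a `Vec` slot can replace a hash lookup. `Lattice` and `DenseLattice` are hypothetical names for illustration and do not reproduce the scanner's `ConstLattice` or `const_propagate`.

```rust
/// Hypothetical three-level constant lattice, as used in textbook SCCP.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Lattice {
    /// No information yet.
    Top,
    /// Known constant.
    Const(i64),
    /// Conflicting values observed; not a constant.
    Bottom,
}

/// Lattice cells indexed directly by SSA value number.
pub struct DenseLattice {
    slots: Vec<Lattice>,
}

impl DenseLattice {
    /// One slot per SSA value, all starting at `Top`.
    pub fn new(value_count: usize) -> Self {
        Self { slots: vec![Lattice::Top; value_count] }
    }

    /// Plain index, no hashing.
    pub fn get(&self, value: usize) -> Lattice {
        self.slots[value]
    }

    /// Meet `incoming` into the slot; returns true when the slot changed,
    /// which is the signal a worklist uses to revisit dependants.
    pub fn meet(&mut self, value: usize, incoming: Lattice) -> bool {
        let old = self.slots[value];
        let new = match (old, incoming) {
            (Lattice::Top, x) | (x, Lattice::Top) => x,
            (Lattice::Const(a), Lattice::Const(b)) if a == b => Lattice::Const(a),
            _ => Lattice::Bottom,
        };
        self.slots[value] = new;
        new != old
    }
}
```

Indexing by value number costs one bounds check per access, which is the kind of saving the comment attributes to dropping the hash-keyed inner loop.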
/// `GlobalSummaries::lookup_same_lang` cost on a populated index. The
/// inner loop hashes `(Lang, String)` once per call, then `FuncKey` once
/// per candidate via `by_key.get(k)`. Pre-fix the four secondary
/// indices used `std::collections::HashMap` (SipHash). Post-fix
/// (2026-05-04 perfhunt session-0015) they use `rustc_hash::FxHashMap`,
/// trading DoS hardening (irrelevant for in-process program-keyed
/// indices) for ~5x faster hashing on the 30+ byte 3-string `FuncKey`
/// hash workload. A regression that re-introduces SipHash would
/// surface here as a ≥3x slowdown.
fn bench_global_summaries_lookup_same_lang_go(c: &mut Criterion) {
let fixture = Path::new("benches/perf_fixtures/large_go_module.go")
.canonicalize()
.expect("perf fixture");
let cfg = Config::default();
let summaries =
nyx_scanner::ast::extract_summaries_from_file(&fixture, &cfg).expect("extract summaries");
let names: Vec<String> = summaries.iter().map(|s| s.name.clone()).collect();
let global = nyx_scanner::summary::merge_summaries(summaries, None);
let lang = nyx_scanner::symbol::Lang::Go;
eprintln!("[diag] lookup_same_lang bench: {} names", names.len());
c.bench_function("global_summaries_lookup_same_lang_go", |b| {
b.iter(|| {
let mut total = 0usize;
for name in &names {
total += global.lookup_same_lang(lang, name).len();
}
total
});
});
}
criterion_group!(
benches,
bench_ast_only_scan,
@ -180,5 +441,11 @@ criterion_group!(
bench_single_file_parse_and_cfg,
bench_state_analysis_only,
bench_classify,
bench_analyse_file_fused_large_go,
bench_extract_authorization_model_go,
bench_extract_authorization_model_shared_go,
bench_collect_top_level_units_go,
bench_const_propagate_large_go,
bench_global_summaries_lookup_same_lang_go,
);
criterion_main!(benches);


@ -11,6 +11,8 @@ preferred-dark-theme = "navy"
git-repository-url = "https://github.com/elicpeter/nyx"
edit-url-template = "https://github.com/elicpeter/nyx/edit/master/{path}"
site-url = "/nyx/"
additional-css = ["docs/mermaid.css"]
additional-js = ["docs/mermaid-init.js"]
[output.html.fold]
enable = true

370
build.rs

@ -1,8 +1,21 @@
use std::collections::BTreeMap;
use std::path::Path;
use std::process::Command;
fn main() {
-// Only relevant when the serve feature is active
// Phase 17 (Track E.1): always emit the seccomp policy table to
// OUT_DIR. The runtime side is gated via `#[cfg(target_os = "linux")]`, but the
// codegen runs on every host so `cargo check` on macOS still emits
// the file (the include never actually compiles on non-Linux).
emit_seccomp_policy();
// Phase 19 (Track E.3): emit the IMAGE_DIGESTS table from
// tools/image-builder/images.toml. The runtime side (src/dynamic/
// toolchain.rs) `include!`s the generated file unconditionally so
// every host build has the same pinned-digest catalogue.
emit_image_digests();
// Only relevant when the serve feature is active.
if std::env::var("CARGO_FEATURE_SERVE").is_err() {
return;
}
@ -14,11 +27,11 @@ fn main() {
println!("cargo:rerun-if-changed=src/server/assets/dist/index.html");
if index_html.exists() {
-// Dist already built nothing to do
// Dist already built, nothing to do
return;
}
-// Dist missing try to build frontend
// Dist missing, try to build frontend
let frontend_dir = Path::new("frontend");
if !frontend_dir.join("package.json").exists() {
emit_placeholder_and_warn(dist_dir);
@ -70,3 +83,354 @@ fn emit_placeholder_and_warn(dist_dir: &Path) {
"cargo:warning=Node.js/npm not available — wrote placeholder frontend assets. Run 'cd frontend && npm install && npm run build' for the real UI."
);
}
// ── Phase 17 (Track E.1) — seccomp policy codegen ────────────────────────────
const SECCOMP_POLICY_PATH: &str = "src/dynamic/sandbox/seccomp/seccomp_policy.toml";
/// Cap-name → Cap bit value table. Mirrors the `bitflags!` block in
/// `src/labels/mod.rs`. Keep in sync when adding/removing `Cap`
/// constants.
const CAP_BIT_FOR_NAME: &[(&str, u32)] = &[
("ENV_VAR", 1 << 0),
("HTML_ESCAPE", 1 << 1),
("SHELL_ESCAPE", 1 << 2),
("URL_ENCODE", 1 << 3),
("JSON_PARSE", 1 << 4),
("FILE_IO", 1 << 5),
("FMT_STRING", 1 << 6),
("SQL_QUERY", 1 << 7),
("DESERIALIZE", 1 << 8),
("SSRF", 1 << 9),
("CODE_EXEC", 1 << 10),
("CRYPTO", 1 << 11),
("UNAUTHORIZED_ID", 1 << 12),
("DATA_EXFIL", 1 << 13),
("LDAP_INJECTION", 1 << 14),
("XPATH_INJECTION", 1 << 15),
("HEADER_INJECTION", 1 << 16),
("OPEN_REDIRECT", 1 << 17),
("SSTI", 1 << 18),
("XXE", 1 << 19),
("PROTOTYPE_POLLUTION", 1 << 20),
];
fn emit_seccomp_policy() {
println!("cargo:rerun-if-changed={}", SECCOMP_POLICY_PATH);
let out_dir = std::env::var("OUT_DIR").expect("OUT_DIR must be set by cargo");
let out_path = Path::new(&out_dir).join("seccomp_policy.rs");
// Read the policy file; on missing file (e.g. fresh checkout on a
// foreign target), emit empty tables so compilation still succeeds.
let toml_text = match std::fs::read_to_string(SECCOMP_POLICY_PATH) {
Ok(s) => s,
Err(_) => {
std::fs::write(
&out_path,
"pub static BASE: &[&str] = &[];\npub static CAP: &[(u32, &[&str])] = &[];\n",
)
.expect("write empty seccomp policy stub");
return;
}
};
let parsed = parse_seccomp_toml(&toml_text);
let mut out = String::new();
out.push_str("// generated by build.rs from seccomp_policy.toml — do not edit\n\n");
// Base allowlist.
out.push_str("pub static BASE: &[&str] = &[\n");
for name in &parsed.base {
out.push_str(&format!(" \"{}\",\n", escape(name)));
}
out.push_str("];\n\n");
// Per-cap allowlists.
out.push_str("pub static CAP: &[(u32, &[&str])] = &[\n");
for (cap_name, allow) in &parsed.caps {
let bit = CAP_BIT_FOR_NAME
.iter()
.find(|(n, _)| *n == cap_name.as_str())
.map(|(_, b)| *b)
.unwrap_or_else(|| {
panic!(
"seccomp_policy.toml references unknown Cap '{cap_name}' — \
add it to CAP_BIT_FOR_NAME in build.rs first"
)
});
out.push_str(&format!(" (0x{bit:08x}_u32, &[\n"));
for name in allow {
out.push_str(&format!(" \"{}\",\n", escape(name)));
}
out.push_str(" ]),\n");
}
out.push_str("];\n");
std::fs::write(&out_path, out).expect("write seccomp policy table");
}
#[derive(Default)]
struct SeccompPolicy {
base: Vec<String>,
caps: BTreeMap<String, Vec<String>>,
}
/// Tiny line-oriented TOML parser scoped to the shape used by
/// `seccomp_policy.toml`:
///
/// [base]
/// allow = ["read", "write", ...]
///
/// [cap.SQL_QUERY]
/// allow = [
/// "fdatasync",
/// ...
/// ]
///
/// Comments (`#`) and blank lines are skipped. Multi-line array bodies
/// are accumulated until the closing `]`.
fn parse_seccomp_toml(src: &str) -> SeccompPolicy {
let mut policy = SeccompPolicy::default();
let mut current_section: Option<String> = None;
let mut accumulating_array: Option<String> = None;
let mut array_buf = String::new();
for raw_line in src.lines() {
let line = strip_comment(raw_line).trim();
if line.is_empty() {
continue;
}
if accumulating_array.is_some() {
array_buf.push_str(line);
array_buf.push('\n');
if line.contains(']') {
let key = accumulating_array.take().unwrap();
let values = parse_string_array(&array_buf);
store_allow(&mut policy, current_section.as_deref(), &key, values);
array_buf.clear();
}
continue;
}
if let Some(section) = line.strip_prefix('[').and_then(|s| s.strip_suffix(']')) {
current_section = Some(section.to_string());
continue;
}
if let Some((key, rest)) = line.split_once('=') {
let key = key.trim().to_string();
let rest = rest.trim();
if rest.starts_with('[') && rest.contains(']') {
let values = parse_string_array(rest);
store_allow(&mut policy, current_section.as_deref(), &key, values);
} else if rest.starts_with('[') {
accumulating_array = Some(key);
array_buf.push_str(rest);
array_buf.push('\n');
}
continue;
}
}
policy
}
fn strip_comment(line: &str) -> &str {
let mut in_string = false;
let bytes = line.as_bytes();
for (i, &b) in bytes.iter().enumerate() {
match b {
b'"' => in_string = !in_string,
b'#' if !in_string => return &line[..i],
_ => {}
}
}
line
}
fn parse_string_array(src: &str) -> Vec<String> {
// Find every "..." run between the first `[` and the last `]`.
let start = src.find('[').map(|i| i + 1).unwrap_or(0);
let end = src.rfind(']').unwrap_or(src.len());
let body = &src[start..end];
let mut out = Vec::new();
let mut chars = body.chars().peekable();
while let Some(c) = chars.next() {
if c == '"' {
let mut s = String::new();
for c2 in chars.by_ref() {
if c2 == '"' {
break;
}
s.push(c2);
}
out.push(s);
}
}
out
}
fn store_allow(policy: &mut SeccompPolicy, section: Option<&str>, key: &str, values: Vec<String>) {
if key != "allow" {
return;
}
match section {
Some("base") => policy.base = values,
Some(other) => {
if let Some(cap_name) = other.strip_prefix("cap.") {
policy.caps.insert(cap_name.to_string(), values);
}
}
None => {}
}
}
fn escape(s: &str) -> String {
s.replace('\\', "\\\\").replace('"', "\\\"")
}
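The generated `BASE` and `CAP` tables have the shapes `&[&str]` and `&[(u32, &[&str])]`. A sketch of how a consumer might fold them into one syscall allowlist for a given cap bitmask follows. The table contents here are illustrative samples (only `fdatasync` under `SQL_QUERY` appears in the parser's doc comment; the `FILE_IO` entries are made up), and `allowed_syscalls` is a hypothetical helper, not the sandbox's real lookup.

```rust
/// Sample tables in the shape `emit_seccomp_policy` writes; contents are illustrative.
pub static BASE: &[&str] = &["read", "write"];
pub static CAP: &[(u32, &[&str])] = &[
    (0x0000_0080_u32, &["fdatasync"]),       // SQL_QUERY = 1 << 7
    (0x0000_0020_u32, &["openat", "fstat"]), // FILE_IO = 1 << 5 (assumed entries)
];

/// Union of the base allowlist and every per-cap allowlist whose bit is set
/// in `caps`, returned sorted and de-duplicated.
pub fn allowed_syscalls(caps: u32) -> Vec<&'static str> {
    let mut out: Vec<&'static str> = BASE.to_vec();
    for (bit, names) in CAP {
        if caps & bit != 0 {
            out.extend_from_slice(names);
        }
    }
    out.sort_unstable();
    out.dedup();
    out
}
```

Keying the per-cap rows by bit value rather than by name means the consumer never needs the name table that `build.rs` keeps in `CAP_BIT_FOR_NAME`.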
// ── Phase 19 (Track E.3) — image digest codegen ──────────────────────────────
const IMAGE_CATALOGUE_PATH: &str = "tools/image-builder/images.toml";
/// Parse `tools/image-builder/images.toml` and emit two tables to
/// `$OUT_DIR/image_digests.rs`:
///
/// pub static IMAGE_DIGESTS: phf::Map<&'static str, &'static str> = …;
/// pub static IMAGE_BASES: phf::Map<&'static str, &'static str> = …;
///
/// `IMAGE_DIGESTS` keys are toolchain IDs (`python-3.11`, …) and values are
/// `<base>@sha256:…` strings ready to hand to `docker pull`. An empty digest
/// in `images.toml` is treated as "not yet pinned" and the entry is omitted
/// from `IMAGE_DIGESTS`; `IMAGE_BASES` always carries the unpinned reference
/// so `docker.rs` can fall back to a tag pull when no digest is recorded.
fn emit_image_digests() {
println!("cargo:rerun-if-changed={}", IMAGE_CATALOGUE_PATH);
let out_dir = std::env::var("OUT_DIR").expect("OUT_DIR must be set by cargo");
let out_path = Path::new(&out_dir).join("image_digests.rs");
let toml_text = match std::fs::read_to_string(IMAGE_CATALOGUE_PATH) {
Ok(s) => s,
Err(_) => {
// Missing catalogue (fresh checkout without the file) — emit
// empty maps so the runtime include still compiles.
std::fs::write(
&out_path,
"/// generated empty IMAGE_DIGESTS — images.toml missing\n\
pub static IMAGE_DIGESTS: phf::Map<&'static str, &'static str> = \
phf::phf_map! {};\n\
pub static IMAGE_BASES: phf::Map<&'static str, &'static str> = \
phf::phf_map! {};\n",
)
.expect("write empty image digests stub");
return;
}
};
let entries = parse_image_catalogue(&toml_text);
let mut out = String::new();
out.push_str("// generated by build.rs from tools/image-builder/images.toml — do not edit\n\n");
// IMAGE_DIGESTS: only entries with a non-empty digest survive.
out.push_str(
"pub static IMAGE_DIGESTS: phf::Map<&'static str, &'static str> = phf::phf_map! {\n",
);
for e in &entries {
if e.digest.is_empty() {
continue;
}
let pinned = format!("{}@{}", e.base, e.digest);
out.push_str(&format!(
" \"{}\" => \"{}\",\n",
escape(&e.toolchain_id),
escape(&pinned),
));
}
out.push_str("};\n\n");
// IMAGE_BASES: every entry, digest stripped. Used by docker.rs when no
// digest is pinned yet so a `docker pull <base>` is still possible.
out.push_str(
"pub static IMAGE_BASES: phf::Map<&'static str, &'static str> = phf::phf_map! {\n",
);
for e in &entries {
out.push_str(&format!(
" \"{}\" => \"{}\",\n",
escape(&e.toolchain_id),
escape(&e.base),
));
}
out.push_str("};\n");
std::fs::write(&out_path, out).expect("write image_digests.rs");
}
#[derive(Default)]
struct ImageEntry {
toolchain_id: String,
base: String,
digest: String,
}
/// Tiny TOML parser scoped to the `[[image]] toolchain_id = …` shape used
/// by `images.toml`. Only the three fields we consume here are extracted;
/// the rest of each entry (`toolchain`, `packages`) is ignored.
fn parse_image_catalogue(src: &str) -> Vec<ImageEntry> {
let mut entries: Vec<ImageEntry> = Vec::new();
let mut current: Option<ImageEntry> = None;
for raw_line in src.lines() {
let line = strip_comment(raw_line).trim();
if line.is_empty() {
continue;
}
if line == "[[image]]" {
if let Some(prev) = current.take()
&& !prev.toolchain_id.is_empty()
{
entries.push(prev);
}
current = Some(ImageEntry::default());
continue;
}
if line.starts_with('[') {
// Any other section ends accumulation.
if let Some(prev) = current.take()
&& !prev.toolchain_id.is_empty()
{
entries.push(prev);
}
continue;
}
let Some(slot) = current.as_mut() else {
continue;
};
let Some((key, value)) = line.split_once('=') else {
continue;
};
let key = key.trim();
let value = value.trim().trim_matches('"').trim_matches('\'');
match key {
"toolchain_id" => slot.toolchain_id = value.to_owned(),
"base" => slot.base = value.to_owned(),
"digest" => slot.digest = value.to_owned(),
_ => {}
}
}
if let Some(prev) = current.take()
&& !prev.toolchain_id.is_empty()
{
entries.push(prev);
}
entries
}
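The doc comment on `emit_image_digests` describes a fallback: use the pinned `<base>@sha256:…` reference when one exists, otherwise the unpinned base. A minimal sketch of that lookup order follows, using plain slices as a stand-in for the generated `phf::Map` tables. `resolve_image_ref` and `lookup` are hypothetical names, and any digest passed in by a caller is that caller's data, not something this sketch knows about.

```rust
/// Linear lookup in a (key, value) table; stands in for a `phf::Map::get`.
fn lookup<'a>(table: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    table.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

/// Prefer the digest-pinned reference; fall back to the unpinned base.
/// Returns `None` when the toolchain id is unknown to both tables.
pub fn resolve_image_ref<'a>(
    digests: &[(&'a str, &'a str)],
    bases: &[(&'a str, &'a str)],
    toolchain_id: &str,
) -> Option<&'a str> {
    lookup(digests, toolchain_id).or_else(|| lookup(bases, toolchain_id))
}
```

Because entries with an empty digest are omitted from `IMAGE_DIGESTS` at build time, the fallback needs no "is this digest empty" check at run time.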


@ -69,6 +69,21 @@ enable_state_analysis = true
## Per-language auth overrides live under [analysis.languages.<slug>.auth].
enable_auth_analysis = true
## Run dynamic verification on Medium/High confidence findings after static analysis.
## Default builds include this support. Use --no-verify or set this false for
## fast static-only scans, or when building with --no-default-features.
verify = true
## Also verify Low-confidence findings. Slower; intended for payload tuning.
verify_all_confidence = false
## Dynamic sandbox backend: auto | docker | process | firecracker
## auto uses Docker when available, otherwise the process backend.
verify_backend = "auto"
## Process-backend hardening profile: standard | strict
harden_profile = "standard"
## Catch per-file panics during analysis and continue the scan.
## When false (default), a panic in one file's analyser aborts the whole
## scan — useful for catching engine bugs loudly in development.
@ -299,6 +314,31 @@ interprocedural = true
smt = true
# ─── Detector knobs ──────────────────────────────────────────────────
# Per-detector class suppression and enablement. These knobs target
# common false-positive classes that show up on legitimate forwarding
# pipelines (telemetry / analytics / metrics dispatch).
#
# [detectors.data_exfil]
#
# # Toggle the entire `taint-data-exfiltration` detector class. Set to
# # false on projects whose architecture routes user-derived payloads
# # through trusted forwarding boundaries by design.
# enabled = true
#
# # URL prefixes treated as trusted destinations. Outbound calls whose
# # destination argument has a static prefix (proven by the abstract
# # string domain or visible as a literal) matching one of these entries
# # have `Cap::DATA_EXFIL` dropped before event emission. Mirrors the
# # SSRF prefix-lock semantics. Use full origins or origin-prefixed
# # paths (e.g. "https://api.internal/") so partial matches across
# # unrelated hosts cannot occur.
# trusted_destinations = [
# "https://api.internal/",
# "https://telemetry.",
# ]
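The prefix rule described in the comments above can be sketched as a plain string-prefix test against the destination's statically known prefix. This is an assumption about the matching semantics, and `is_trusted_destination` is a hypothetical name rather than the detector's real entry point.

```rust
/// True when the statically known prefix of a destination URL starts with
/// one of the configured trusted entries.
pub fn is_trusted_destination(static_prefix: &str, trusted: &[&str]) -> bool {
    trusted.iter().any(|entry| static_prefix.starts_with(*entry))
}
```

Under this reading, an entry that ends at a host boundary such as `"https://api.internal/"` cannot match `https://api.internal.evil.example/`, whereas an entry ending mid-host would match any host that continues the string. That is why the comments recommend full origins or origin-prefixed paths.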
# ─── Per-language analysis rules ─────────────────────────────────────
# [analysis.languages.javascript.auth]


@ -9,6 +9,7 @@
- [CLI reference](cli.md)
- [Browser UI](serve.md)
- [Dynamic verification](dynamic.md)
- [Configuration](configuration.md)
- [Output formats](output.md)
@ -27,3 +28,8 @@
- [CFG](detectors/cfg.md)
- [State](detectors/state.md)
- [Taint](detectors/taint.md)
# Project
- [Roadmap](roadmap.md)
- [Changelog](changelog.md)


@ -31,6 +31,22 @@ SQL sink as an injection risk; an SSRF sink whose URL prefix is locked to a
trusted host stays quiet. This turns a large class of FPs on numeric and
locked-prefix paths into true negatives.
**Path traversal.** The path domain accepts canonicalised-and-rooted
shapes via `PathFact::is_path_traversal_safe`: a path that is
dotdot-free and either non-absolute or carries a verified prefix-lock has
its `Cap::FILE_IO` cleared. When the lock argument is a string literal
the lock prefix is recorded directly; when it is a method call, field
access, or configured root, an `OPAQUE_PREFIX_LOCK` marker captures the
structural invariant ("rooted under SOME prefix") instead. This closes
the Ruby `File.expand_path + start_with?(root)`, Python
`os.path.realpath + .startswith(root)`, and JS
`path.resolve + .startsWith(root)` shapes. `classify_path_assertion`
recognises JS `.startsWith(...)`, Python `.startswith(...)`, Ruby
`.start_with?(...)` (paren and paren-less), and Go `strings.HasPrefix(...)`.
Branch narrowing flips lock attachment under condition negation
(`if !target.startsWith(ROOT) { return; }` attaches the lock to the
surviving block, not the rejection arm).
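The safety condition stated above ("dotdot-free and either non-absolute or carries a verified prefix-lock") can be written out as a small predicate. This is a sketch over concrete `std::path` values, whereas the analysis reasons over abstract facts; `is_dotdot_free` and `is_traversal_safe` are hypothetical helpers and do not reproduce `PathFact::is_path_traversal_safe`.

```rust
use std::path::{Component, Path};

/// True when no component of the path is `..`.
pub fn is_dotdot_free(p: &Path) -> bool {
    !p.components().any(|c| matches!(c, Component::ParentDir))
}

/// Dotdot-free, and either relative or rooted under `verified_root`.
/// `Path::starts_with` compares whole components, so `/srv/data-evil`
/// is not considered to be under `/srv/data`.
pub fn is_traversal_safe(p: &Path, verified_root: Option<&Path>) -> bool {
    if !is_dotdot_free(p) {
        return false;
    }
    let locked = verified_root.map_or(false, |root| p.starts_with(root));
    !p.is_absolute() || locked
}
```

The component-wise comparison is the main difference from a raw string `startsWith` check, which would accept a sibling directory that merely shares a textual prefix with the root.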
**How to turn it off.**
| Surface | Value |
@ -80,8 +96,24 @@ hash per-argument `Cap` bits but not source-origin identity, so two
callers with identical caps but different origins share cached
origin-attribution.
-**Source**: [`src/taint/ssa_transfer.rs`](https://github.com/elicpeter/nyx/blob/master/src/taint/ssa_transfer.rs)
-(`ArgTaintSig`, `InlineCache`, `inline_analyse_callee`).
**Helper-validator propagation.** SSA summaries carry a
`validated_params_to_return` field listing parameter indices whose
taint flow to the return value is fully validated by a dominating
predicate (regex allowlist, type check, validation call) on every
return path. At call sites, each tainted argument passed to a
validated position, and the call's own return value, are marked
`validated_must` / `validated_may` in the caller's SSA taint state,
the same way an inline `if (!regex.test(x)) throw …` would validate
the surviving branch. This is sound because the summary is recorded only when
the parameter's name is in `validated_must` at *every* return block; a
normal-returning call therefore proves the validating arm. JS/TS
object-pattern formals (`({ column, operator, value }) => …`) seed
every destructured sibling in the per-parameter probe, so flow through
any of them counts toward the slot being validated.
**Source**: [`src/taint/ssa_transfer/`](https://github.com/elicpeter/nyx/tree/master/src/taint/ssa_transfer/)
(`ArgTaintSig`, `InlineCache`, `inline_analyse_callee`,
`propagate_validated_params_to_return`).
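The kind of helper this paragraph has in mind can be illustrated with a small allowlist validator whose every normal return path has passed the check. The example is written in Rust purely for illustration; `validated_column` and `build_query` are hypothetical, and nothing here claims which languages or exact shapes the summary extraction recognises.

```rust
/// Returns the column name only when it is on the allowlist, so every
/// `Ok` return is dominated by the validating predicate.
pub fn validated_column(name: &str) -> Result<&str, String> {
    const ALLOWED: [&str; 3] = ["id", "name", "created_at"];
    if !ALLOWED.contains(&name) {
        return Err(format!("column not allowed: {name}"));
    }
    Ok(name)
}

/// Caller: a normal return from the helper proves the validating arm,
/// so the value interpolated into the query has been checked.
pub fn build_query(user_column: &str) -> Result<String, String> {
    let column = validated_column(user_column)?;
    Ok(format!("SELECT {column} FROM users"))
}
```

In the terms used above, parameter index 0 of `validated_column` would be the one listed as validated on the way to the return value, and the caller benefits without repeating the check inline.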
---
@ -111,9 +143,8 @@ identity independent of the parent value.
|---|---|
| Env var | `NYX_POINTER_ANALYSIS=0` |
-The pass is **on by default** as of 2026-04-26. The env-var override is
-kept for one release so you can compare against the pre-pointer baseline,
-then will be removed.
The pass is **on by default**. The env-var override exists so you can
compare against the pre-pointer baseline.
**Limitations.** This is not a general escape analysis. Function pointers
and arbitrary indirect calls still resolve to no callee, and deep alias
@ -236,15 +267,28 @@ while the pass stabilises.
| CLI flag | `--backwards-analysis` / `--no-backwards-analysis` |
| Env var (legacy) | `NYX_BACKWARDS=1` |
-**Limitations (first cut).** Reverse call-graph expansion past a
-`ReachedParam` is deferred; the walk terminates at function parameters
-rather than crossing back into callers. Path-constraint pruning is
-conservative: only the accumulated `PredicateSummary` bits are consulted,
-not the full symbolic predicate stack. Depth-bounded at k=2 for
**Limitations.** Reverse call-graph expansion stops at `ReachedParam`; the walk
terminates at function parameters rather than crossing back into callers.
Path-constraint pruning is conservative: only the accumulated
`PredicateSummary` bits are consulted, not the full symbolic predicate stack.
Depth-bounded at k=2 for
cross-function body expansion. See `DEFAULT_BACKWARDS_DEPTH`,
`BACKWARDS_VALUE_BUDGET`, and `MAX_BACKWARDS_CALLEE_BLOCKS` in
`src/taint/backwards.rs` for the exact bounds.
**Cap parity.** The walk treats `DemandState.caps` as opaque bitflags, so
every cap defined in `src/labels/mod.rs` round-trips identically through
the demand transfer. This includes `Cap::DATA_EXFIL` (bit 13): a
`taint-data-exfiltration` forward finding receives `backwards-confirmed`
exactly like a `taint-unsanitised-flow` SQL/CMD/SSRF finding when its
demand walk reaches a Sensitive source. The cap-routing logic in
`src/ast.rs` then surfaces the rule id correctly regardless of which
direction confirmed the flow. See
`tests/backwards_analysis_tests.rs::demand_driven_suite` (the
`data_exfil` sub-case) and
`taint::backwards::tests::driver_walks_data_exfil_source_to_sink` for
the regression guards.
**Source**: [`src/taint/backwards.rs`](https://github.com/elicpeter/nyx/blob/master/src/taint/backwards.rs). **Source**: [`src/taint/backwards.rs`](https://github.com/elicpeter/nyx/blob/master/src/taint/backwards.rs).
--- ---


@ -1,19 +1,62 @@
# Auth analysis # Auth analysis
**Rust today.** Other languages have rule scaffolding in [`src/auth_analysis/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/config.rs) (Python, Ruby, Go, Java, JavaScript, TypeScript), but only Rust has benchmark corpus coverage and the precision work to back it. Treat findings on other languages as preview; the rule prefix (`py.auth.*`, `js.auth.*`, `rb.auth.*`, `go.auth.*`, `java.auth.*`) is reserved but the matchers haven't been validated against real codebases yet. **Rust is the stable target.** Python and Go have shipped precision work as of 0.7.0 (FastAPI cross-file dependencies, Go DAO-helper filtering, same-file caller-scope IPA) and are usable on real codebases. Ruby, Java, JavaScript, and TypeScript have rule scaffolding in [`src/auth_analysis/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/config.rs) but no benchmark corpus yet; treat findings there as preview.
## What it catches ## What it catches
The Rust rule is `rs.auth.missing_ownership_check`. It fires when a request handler reaches a privileged operation that takes a scoped identifier (`*_id`, row reference, scoped resource) without a preceding ownership or membership check. The Rust rule is `rs.auth.missing_ownership_check`. It fires when a request handler reaches a privileged operation that takes a scoped identifier (`*_id`, row reference, scoped resource) without a preceding ownership or membership check.
Concretely, it looks for five patterns of authorization in the function body and flags the call when none are present: Concretely, it looks for these patterns of authorization in the function body and flags the call when none are present:
- A call to a recognised authorization helper. Defaults: `check_ownership`, `has_ownership`, `require_ownership`, `ensure_ownership`, `is_owner`, `authorize`, `verify_access`, `has_permission`, `can_access`, `can_manage`, plus `*_membership` and `require_{group,org,workspace,tenant,team}_member` variants. Extend in `[analysis.languages.rust]`. - A call to a recognised authorization helper. Defaults: `check_ownership`, `has_ownership`, `require_ownership`, `ensure_ownership`, `is_owner`, `authorize`, `verify_access`, `has_permission`, `can_access`, `can_manage`, plus `*_membership` and `require_{group,org,workspace,tenant,team}_member` variants. Extend in `[analysis.languages.rust]`.
- An ownership-equality check on a row reference: `if owner_id != user.id { return 403 }` or any `field_id != self_actor` shape. The check writes `AuthCheck` evidence back to the row-fetch arguments via `AnalysisUnit.row_field_vars`. - An ownership-equality check on a row reference: `if owner_id != user.id { return 403 }` or any `field_id != self_actor` shape. The check writes `AuthCheck` evidence back to the row-fetch arguments via `AnalysisUnit.row_field_vars`.
- A self-actor reference: `let user = require_auth(...).await?` followed by use of `user.id`, `user.user_id`, `user.uid`. The actor is recognised from typed extractor params (`Extension<Session>`, `CurrentUser`, etc.) and from typed helper bindings. - A self-actor reference: `let user = require_auth(...).await?` followed by use of `user.id`, `user.user_id`, `user.uid`. The actor is recognised from typed extractor params (`Extension<Session>`, `CurrentUser`, etc.) and from typed helper bindings.
- A typed extractor wrapper that proves route-level capability/policy enforcement: meilisearch-style `GuardedData<ActionPolicy<X>, _>`. Recognised by outer wrapper name (last segment, case-insensitive `starts_with`) so `GuardedData<ActionPolicy<X>, Data<AuthController>>` is classified by the outer `GuardedData`, not by whether an inner generic arg substring-matches `auth`. Configured via `policy_guard_names` (Rust default: `["Guarded"]`). Distinct from authentication-only wrappers so the pattern doesn't pollute regular call recognition.
- A SQL query that joins through an ACL table or filters by `user_id` predicate. Detected without a SQL parser via [`sql_semantics.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/sql_semantics.rs); the authorized result variable propagates through `let row = ...prepare(LIT)...`, `for row in result`, `let id = row.get(...)`. - A SQL query that joins through an ACL table or filters by `user_id` predicate. Detected without a SQL parser via [`sql_semantics.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/sql_semantics.rs); the authorized result variable propagates through `let row = ...prepare(LIT)...`, `for row in result`, `let id = row.get(...)`.
- A helper-summary lift: handler calls `validate_target(db, widget_id, user.id)` whose body contains a `require_*_member` call. Cross-function summaries are merged at fixed-point (capped at 4 iterations). - A helper-summary lift: handler calls `validate_target(db, widget_id, user.id)` whose body contains a `require_*_member` call. Cross-function summaries are merged at fixed-point (capped at 4 iterations).
Handlers registered through attribute macros (`#[get("/path")]`, `#[routes::path(…)]`) or external service-config builders are also walked for typed-extractor guards, complementing the `.route(...)` registration path.
## Caller-scope-entity exemption
`<entity>.id` / `<entity>.pk` is not flagged when `<entity>` is a unit parameter named after a multi-tenant scope primitive: `organization` / `org`, `project`, `team`, `workspace`, `tenant`, `account`, `community`, `group`, `repository` / `repo`, `company`. The argument represents the caller's scope, not a user-controlled target, so internal helpers like `def get_environments(request, organization): Environment.objects.filter(organization_id=organization.id, …)` inherit the caller's authorization. Other field names (`.name`, `.slug`) still flag, and `user` / `member` / `actor` are deliberately excluded; those are handled by the actor-context recogniser.
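The exemption rule above can be sketched as a small predicate. This is an illustration only: the function name and its `entity_is_unit_param` flag are hypothetical, and the real recogniser in the engine may be structured differently.

```rust
/// Scope-primitive names listed in the paragraph above.
const SCOPE_PRIMITIVES: &[&str] = &[
    "organization", "org", "project", "team", "workspace", "tenant",
    "account", "community", "group", "repository", "repo", "company",
];

/// Hypothetical helper: returns true when `<entity>.<field>` should not be flagged.
/// `entity_is_unit_param` models "is a parameter of the unit under analysis".
fn is_caller_scope_exempt(entity: &str, field: &str, entity_is_unit_param: bool) -> bool {
    entity_is_unit_param
        && SCOPE_PRIMITIVES.iter().any(|name| *name == entity)
        // Only `.id` / `.pk` are exempt; `.name`, `.slug`, etc. still flag.
        && matches!(field, "id" | "pk")
}
```

Under this sketch, `organization.id` on a parameter is exempt, while `organization.slug` and `user.id` are not.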
## Project-level web-framework gate (Rust)
In Rust, the `context_inputs` and param-name arms of the user-input heuristic are gated by a project-level web-framework signal. The signal is three-valued:
- `Some(true)`: the project's `Cargo.toml` names `axum`, `actix-web`, or `rocket`, OR the file directly imports one (`axum::`, `actix_web::`, `rocket::`, `axum_extra::`). Heuristics stay on.
- `Some(false)`: `Cargo.toml` was inspected and named no web framework, AND the file does not directly import one. Heuristics off; only `RouteHandler` classification (concrete route-registration evidence) survives.
- `None`: no detection ran (single-file scan with no project root). Heuristics on; behavior unchanged.
This avoids a class of FPs in non-web Rust crates where a debug-session handle named `session` would trip on `session.update(cx, …)`-style desktop-app code. Other languages keep prior behavior; the gate is currently Rust-only.
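A minimal sketch of the three-valued signal, assuming plain substring checks stand in for whatever manifest and import parsing the engine really does. The function names are hypothetical; only the `Some(true)` / `Some(false)` / `None` semantics come from the list above.

```rust
/// Crate names and import prefixes named in the text above.
const WEB_CRATES: &[&str] = &["axum", "actix-web", "rocket"];
const WEB_IMPORTS: &[&str] = &["axum::", "actix_web::", "rocket::", "axum_extra::"];

/// Hypothetical detector. `cargo_toml` is `None` when no project root was found.
/// Substring matching is a simplification for illustration.
fn web_framework_signal(cargo_toml: Option<&str>, file_source: &str) -> Option<bool> {
    // A direct import is positive evidence regardless of the manifest.
    if WEB_IMPORTS.iter().any(|prefix| file_source.contains(*prefix)) {
        return Some(true);
    }
    // Otherwise the answer depends on whether a manifest was inspected at all.
    cargo_toml.map(|manifest| WEB_CRATES.iter().any(|name| manifest.contains(*name)))
}

/// Only a definite "no web framework" switches the heuristics off.
fn heuristics_enabled(signal: Option<bool>) -> bool {
    signal != Some(false)
}
```

Treating `None` as "on" keeps single-file scans behaving as before, which matches the third bullet.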
## Python: FastAPI cross-file dependencies
FastAPI's `include_router` chain is resolved across files. A child router declared in `routes/task_instances.py` and attached on a parent in `routes/__init__.py` inherits the parent's `dependencies=[...]`.
- Module-level `router = APIRouter(dependencies=[Security(...)])` is pre-walked once per file and merged onto every `@<router>.<verb>(...)` route attached in the same file.
- `<parent>.include_router(<child_module>.<child_var>)` edges are captured per file in pass 1, persisted into `GlobalSummaries::router_facts_by_module`, and lifted onto the active file's `AuthorizationModel::cross_file_router_deps` at pass 2 entry. Transitive lifts (grandparent to parent to child) iterate to fixpoint.
- `Security(callable, scopes=[...])` is recognised distinctly from `Depends(callable)` and promotes the synthetic `AuthCheck` to `AuthCheckKind::Other` (route-level scope-checked authorization). Bare `Depends(callable)` is still a Login-only check.
Module identity is the file basename without `.py`. This is sufficient for airflow-style `task_instances.router` naming; a project with two files of the same name in different subtrees will currently collide.
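The collision is easy to see in a sketch. The helper name below is hypothetical; it only reproduces the "basename without `.py`" rule stated above, using the standard library's path handling.

```rust
use std::path::Path;

/// Hypothetical helper: module identity as the file basename without `.py`.
/// Returns `None` for paths that are not `.py` files.
fn module_identity(path: &str) -> Option<&str> {
    let p = Path::new(path);
    if p.extension()?.to_str()? != "py" {
        return None;
    }
    p.file_stem()?.to_str()
}
```

With this rule, `routes/task_instances.py` maps to `task_instances`, while `api/v1/util.py` and `api/v2/util.py` both map to `util`, which is the collision the paragraph describes.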
## Go: DAO-helper id-scalar precision pass
For non-route Go units, a parameter whose declared type is a bounded primitive scalar (`int64`, `uint32`, `string`, `bool`, `byte`, `rune`, `float64`, etc.) and whose name is id-shaped (`id`, `*Id`, `*_id`, `*ids`) is dropped from `unit.params` before ownership-check evaluation.
Real Go HTTP handlers always carry a framework-request-typed param (`*http.Request`, `*gin.Context`, `echo.Context`, `*fiber.Ctx`); per-framework route extractors set `include_id_like_typed=true` so id-shaped path params survive on real routes. The filter only fires when the unit was not classified as a route handler, so helpers like `func GetRunByRepoAndID(ctx, repoID, runID int64)` are recognised as DAO callees and the ownership check is expected at the calling route handler, not inside the helper.
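The filter can be sketched as follows. Everything here is illustrative: the `Param` record and function names are hypothetical, the scalar list contains only the types named above (the text says "etc."), and treating an `ID` suffix as id-shaped is an assumption drawn from the `repoID` / `runID` example.

```rust
/// Hypothetical parameter record; the engine's real types are not shown here.
#[derive(Debug)]
struct Param<'a> {
    name: &'a str,
    ty: &'a str,
}

/// Only the scalar types named in the text; the full set is not listed there.
const PRIMITIVE_SCALARS: &[&str] = &["int64", "uint32", "string", "bool", "byte", "rune", "float64"];

/// Id-shaped names: `id`, `*Id`, `*_id`, `*ids` (plus `*ID`, an assumption).
fn is_id_shaped(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    lower == "id"
        || name.ends_with("Id")
        || name.ends_with("ID")
        || lower.ends_with("_id")
        || lower.ends_with("ids")
}

/// Drops id-shaped primitive-scalar params, but only on non-route units.
fn filter_params<'a>(params: Vec<Param<'a>>, is_route_handler: bool) -> Vec<Param<'a>> {
    if is_route_handler {
        return params; // route extractors keep id-shaped path params
    }
    params
        .into_iter()
        .filter(|p| !(PRIMITIVE_SCALARS.iter().any(|t| *t == p.ty) && is_id_shaped(p.name)))
        .collect()
}
```

For `GetRunByRepoAndID(ctx, repoID, runID int64)` classified as a non-route unit, only `ctx` survives, so the ownership check is left to the calling handler.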
## Same-file caller-scope IPA
When a private helper is called only from authorized route handlers in the same file, the caller's auth checks lift onto the helper as synthetic `is_route_level=true` `AuthCheck` entries.
- Iterated to a small fixpoint so transitive chains (route to mid_helper to leaf_helper) are covered.
- Refuses to authorize helpers with no in-file caller, helpers called from a mix of authorized and unauthorized callers, and helpers called only from un-lifted helpers.
- Cross-file caller-scope lifting is not implemented yet.
This closes the FastAPI / Django / Flask shape where a route authenticates via decorator or dependency, then delegates to a private helper that performs the sink.
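A minimal sketch of the lifting rule, assuming the file's call edges are already available as a map from helper to in-file callers. The function name and data shapes are hypothetical; the refusal cases follow the bullets above.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical fixpoint: a helper is lifted when it has at least one in-file
/// caller and every caller is an authorized route or an already-lifted helper.
fn lift_caller_scope<'a>(
    callers: &HashMap<&'a str, Vec<&'a str>>,
    authorized_routes: &HashSet<&'a str>,
) -> HashSet<&'a str> {
    let mut lifted: HashSet<&'a str> = HashSet::new();
    loop {
        let mut changed = false;
        for (helper, callers_of) in callers {
            // Skip helpers already lifted, and refuse helpers with no in-file caller.
            if lifted.contains(helper) || callers_of.is_empty() {
                continue;
            }
            let all_authorized = callers_of
                .iter()
                .all(|c| authorized_routes.contains(c) || lifted.contains(c));
            if all_authorized {
                lifted.insert(*helper);
                changed = true;
            }
        }
        // The lifted set only grows, so the loop terminates.
        if !changed {
            break;
        }
    }
    lifted
}
```

A chain such as route to `mid_helper` to `leaf_helper` needs two rounds: `mid_helper` is lifted first, which then lets `leaf_helper` through. A helper with one unauthorized caller is never lifted.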
## Sink classification ## Sink classification
The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag: The same call name can be safe on a local collection and dangerous on a database. The detector categorises each candidate sink before deciding whether to flag:
@ -62,9 +105,18 @@ cap = "unauthorized_id"
The same rule recognised in the standalone analyser also strips `Cap::UNAUTHORIZED_ID` for the taint-based variant. The same rule recognised in the standalone analyser also strips `Cap::UNAUTHORIZED_ID` for the taint-based variant.
### Add a project-specific typed-extractor policy wrapper
```toml
[analysis.languages.rust.auth]
policy_guard_names = ["MyAppGuarded", "PolicyExtractor"]
```
Matched as last-segment + case-insensitive `starts_with` (so a single entry `"Guarded"` covers `Guarded`, `GuardedData`, `GuardedRoute`). Distinct from `login_guard_names` and `admin_guard_names`.
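The matching rule can be sketched in a few lines. The function names are hypothetical and the type text is handled as a plain string, which is a simplification of whatever the engine does with parsed types.

```rust
/// Outer wrapper name: drop generic arguments, then keep the last path segment.
fn outer_wrapper_name(type_text: &str) -> &str {
    let head = type_text.split('<').next().unwrap_or(type_text).trim();
    head.rsplit("::").next().unwrap_or(head)
}

/// Hypothetical matcher: last-segment + case-insensitive `starts_with`.
fn is_policy_guard(type_text: &str, policy_guard_names: &[&str]) -> bool {
    let outer = outer_wrapper_name(type_text).to_ascii_lowercase();
    policy_guard_names
        .iter()
        .any(|name| outer.starts_with(name.to_ascii_lowercase().as_str()))
}
```

Because only the outer wrapper is inspected, `GuardedData<ActionPolicy<X>, Data<AuthController>>` matches `"Guarded"`, while `Data<GuardedData<X>>` does not.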
### Recognised actor names ### Recognised actor names
Recognised by default: `user.id`, `user.user_id`, `user.uid`, `session.user_id`, `current_user.id`, plus typed extractor parameters with `CurrentUser`, `SessionUser`, `AuthUser`, `Extension<...>` shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic is in [`src/auth_analysis/checks.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/checks.rs) under `extract_validation_target` and friends. Recognised by default: `user.id`, `user.user_id`, `user.uid`, `session.user_id`, `current_user.id`, plus typed extractor parameters with `CurrentUser`, `SessionUser`, `AuthUser`, `Extension<...>` shapes. To add a custom binding pattern, file an issue or add a fixture; the heuristic lives in [`src/auth_analysis/extract/common.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/extract/common.rs) under the `*self_actor*` helpers (`collect_self_actor_binding`, `collect_typed_extractor_self_actor`, `is_self_actor_type_text`).
### Suppress ### Suppress
@ -84,8 +136,8 @@ nyx scan . --severity ">=MEDIUM" --min-confidence medium
Auth findings render alongside taint findings in the [browser UI](serve.md). The flow visualiser shows the sink call, the actor reference (when one was found), and any helper-summary path the engine traversed; the How to fix panel mirrors the rule's recommendation. Auth findings render alongside taint findings in the [browser UI](serve.md). The flow visualiser shows the sink call, the actor reference (when one was found), and any helper-summary path the engine traversed; the How to fix panel mirrors the rule's recommendation.
<p align="center"><img src="../assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: numbered source → call → sink walk with a How to fix panel and an inline evidence object" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: numbered source → call → sink walk with a How to fix panel and an inline evidence object" width="900"/></p>
## Where the work was done ## Benchmark corpus
The remediation work is documented release-by-release in `tests/benchmark/RESULTS.md` under the Rust auth row. Phases A1 through B5 (precision and structural improvements) and Phase C (taint-based variant) all landed on the 0.5.0 release branch. The benchmark corpus at [`tests/benchmark/corpus/rust/auth/`](https://github.com/elicpeter/nyx/tree/master/tests/benchmark/corpus/rust/auth/) is 10 fixtures covering the five FP patterns plus a true-positive control. The Rust auth corpus at [`tests/benchmark/corpus/rust/auth/`](https://github.com/elicpeter/nyx/tree/master/tests/benchmark/corpus/rust/auth/) covers the recognised authorization patterns, true-positive controls, typed-extractor guard injection, and the project-level web-framework gate (full-Cargo.toml fixtures under `safe_non_web_rust_project/` and `unsafe_actix_web_project_no_check/`). Per-row metrics live under the Rust auth row in `tests/benchmark/RESULTS.md`.


@ -74,7 +74,7 @@ nyx scan [PATH] [OPTIONS]
| `--fail-on <SEV>` | *(none)* | Exit code 1 if any finding >= this severity | | `--fail-on <SEV>` | *(none)* | Exit code 1 if any finding >= this severity |
| `--show-suppressed` | off | Show inline-suppressed findings (dimmed, tagged `[SUPPRESSED]`) | | `--show-suppressed` | off | Show inline-suppressed findings (dimmed, tagged `[SUPPRESSED]`) |
| `--keep-nonprod-severity` | off | Don't downgrade severity for test/vendor paths | | `--keep-nonprod-severity` | off | Don't downgrade severity for test/vendor paths |
| `--all` | off | Disable category filtering, rollups, and LOW budgets -- show everything | | `--all` | off | Disable category filtering, rollups, and LOW budgets. Shows everything |
| `--include-quality` | off | Include Quality-category findings (hidden by default) | | `--include-quality` | off | Include Quality-category findings (hidden by default) |
| `--max-low <N>` | `20` | Maximum total LOW findings to show | | `--max-low <N>` | `20` | Maximum total LOW findings to show |
| `--max-low-per-file <N>` | `1` | Maximum LOW findings per file | | `--max-low-per-file <N>` | `1` | Maximum LOW findings per file |
@ -82,6 +82,13 @@ nyx scan [PATH] [OPTIONS]
| `--rollup-examples <N>` | `5` | Number of example locations in rollup findings | | `--rollup-examples <N>` | `5` | Number of example locations in rollup findings |
| `--show-instances <RULE>` | *(none)* | Expand all instances of a specific rule (bypass rollup) | | `--show-instances <RULE>` | *(none)* | Expand all instances of a specific rule (bypass rollup) |
`nyx scan` automatically reads `.nyx/triage.json` from the scan root when the
file exists. Terminal triage states written by `nyx serve` (`false_positive`,
`accepted_risk`, `suppressed`, and `fixed`) are hidden from CLI output and do
not trigger `--fail-on` by default. Use `--show-suppressed` to include them in
console, JSON, or SARIF output with their `triage_state` and optional
`triage_note`.
**Severity expression formats**: **Severity expression formats**:
```bash ```bash
@ -95,11 +102,11 @@ nyx scan [PATH] [OPTIONS]
`--fail-on` returns a non-zero exit code when the threshold trips, so CI jobs fail without further wiring: `--fail-on` returns a non-zero exit code when the threshold trips, so CI jobs fail without further wiring:
<p align="center"><img src="../assets/screenshots/docs/cli-failon.png" alt="nyx scan with --fail-on HIGH against a small fixture: three HIGH taint findings printed, followed by exit=1 from the shell" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/cli-failon.png" alt="nyx scan with --fail-on HIGH against a small fixture: three HIGH taint findings printed, followed by exit=1 from the shell" width="900"/></p>
Quality-category and rollup-prone Low findings are filtered down by default. The footer tells you exactly what got dropped and which knob to turn: Quality-category and rollup-prone Low findings are filtered down by default. The footer tells you exactly what got dropped and which knob to turn:
<p align="center"><img src="../assets/screenshots/docs/cli-rollup-tail.png" alt="nyx scan tail: warning '*' generated 57 issues; Suppressed 92 LOW/Quality findings; Active filters max_low=20, max_low_per_file=1, max_low_per_rule=10; Use --include-quality, --max-low, or --all to adjust" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/cli-rollup-tail.png" alt="nyx scan tail: warning '*' generated 57 issues; Suppressed 92 LOW/Quality findings; Active filters max_low=20, max_low_per_file=1, max_low_per_rule=10; Use --include-quality, --max-low, or --all to adjust" width="900"/></p>
### Analysis Engine Toggles ### Analysis Engine Toggles
@ -150,7 +157,29 @@ Individual flags override the profile. For example, `--engine-profile fast --ba
nyx scan --engine-profile deep --no-smt --explain-engine nyx scan --engine-profile deep --no-smt --explain-engine
``` ```
<p align="center"><img src="../assets/screenshots/docs/cli-explain-engine.png" alt="nyx scan --engine-profile deep --explain-engine output: resolved config showing every analysis pass, its current state, and the CLI flag/env var that controls it" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/cli-explain-engine.png" alt="nyx scan --engine-profile deep --explain-engine output: resolved config showing every analysis pass, its current state, and the CLI flag/env var that controls it" width="900"/></p>
### Dynamic verification
Available in default builds, or in custom builds with `--features dynamic`. See [dynamic.md](dynamic.md) for the full pipeline and verdict semantics.
| Flag | Default | Description |
|------|---------|-------------|
| `--verify` | on | Enable dynamic verification (default when built with `dynamic`). Conflicts with `--no-verify` |
| `--no-verify` | off | Skip verification for this run. Useful for fast static-only scans without editing config |
| `--verify-all-confidence` | off | Also verify findings below `Confidence >= Medium`. Slower; intended for payload tuning |
| `--backend <BACKEND>` | `auto` | Sandbox backend: `auto` (docker if available, else process), `docker` (required), `process` (in-process runner) |
| `--unsafe-sandbox` | off | Force the process backend. Equivalent to `--backend process`. Cannot combine with `--backend docker` |
| `--harden <PROFILE>` | `standard` | Process-backend lockdown: `standard` (no-new-privs + rlimit on Linux) or `strict` (namespaces + chroot + seccomp on Linux; `sandbox-exec` on macOS) |
| `--verbose` | off | Flush the per-finding `VerifyTrace` to stderr after each verdict. Same stream that lands in `expected/trace.jsonl` in the repro bundle |
### Baseline / patch validation
| Flag | Default | Description |
|------|---------|-------------|
| `--baseline <FILE>` | *(none)* | Read a prior scan's JSON (or a stripped `.nyx/baseline.json`) and diff it against this scan on `stable_hash`. Reports `New` / `Resolved` / `FlippedConfirmed` / `FlippedNotConfirmed` transitions |
| `--baseline-write <FILE>` | *(none)* | After scanning, write a stripped baseline (only `stable_hash`, `dynamic_verdict`, `severity`, `path`, `rule_id`; no source). Safe to commit |
| `--gate <GATE>` | *(none)* | CI gate to enforce when `--baseline` is active. `no-new-confirmed` exits 2 on any new Confirmed finding; `resolve-all-confirmed` exits 2 if any baseline-Confirmed finding is not fully resolved |
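A rough sketch of the diff on `stable_hash`, assuming each scan is reduced to a map from hash to "was the dynamic verdict Confirmed". The types and function names are hypothetical, and the gate below assumes "new Confirmed finding" means a hash absent from the baseline whose verdict is Confirmed; the real gate may count flips differently.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Transition {
    New,
    Resolved,
    FlippedConfirmed,
    FlippedNotConfirmed,
}

/// Hypothetical diff: both maps are `stable_hash -> confirmed?`.
fn diff_baseline(
    baseline: &HashMap<String, bool>,
    current: &HashMap<String, bool>,
) -> Vec<(String, Transition)> {
    let mut out = Vec::new();
    for (hash, &confirmed) in current {
        match baseline.get(hash) {
            None => out.push((hash.clone(), Transition::New)),
            Some(&was) if !was && confirmed => out.push((hash.clone(), Transition::FlippedConfirmed)),
            Some(&was) if was && !confirmed => out.push((hash.clone(), Transition::FlippedNotConfirmed)),
            Some(_) => {} // verdict unchanged: no transition
        }
    }
    for hash in baseline.keys() {
        if !current.contains_key(hash) {
            out.push((hash.clone(), Transition::Resolved));
        }
    }
    out.sort_by(|a, b| a.0.cmp(&b.0)); // deterministic order for reporting
    out
}

/// Assumed reading of `no-new-confirmed`: exit 2 on any new Confirmed hash.
fn gate_no_new_confirmed(baseline: &HashMap<String, bool>, current: &HashMap<String, bool>) -> i32 {
    let tripped = current.iter().any(|(hash, &confirmed)| confirmed && !baseline.contains_key(hash));
    if tripped { 2 } else { 0 }
}
```

Keying on `stable_hash` alone is what lets the stripped baseline omit source text and still line up findings across scans.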
### Examples ### Examples
@ -191,6 +220,43 @@ nyx scan . --max-low 50 --max-low-per-file 5
--- ---
## `nyx repro`
Replay a dynamic repro bundle for a confirmed finding.
```
nyx repro (--finding <ID> | --spec-hash <HASH> | --bundle <DIR>) [OPTIONS]
```
Nyx writes repro bundles under the platform cache directory and keys them by
`spec_hash`. The browser UI and scan output show `finding_id`, so
`--finding` scans cached bundle manifests and replays the newest match.
| Flag | Description |
|------|-------------|
| `--finding <ID>` | Find the newest cached bundle whose manifest carries this stable finding ID |
| `--spec-hash <HASH>` | Replay an exact cache bundle by spec hash |
| `--bundle <DIR>` | Replay an explicit bundle directory |
| `--docker` | Run the bundle's Docker replay path (`./reproduce.sh --docker`) |
| `--print-path` | Print the resolved bundle path and exit without replaying |
| `--list` | With `--finding`, list all matching cached bundles newest first |
Examples:
```bash
nyx repro --finding b9caa35df2213040
nyx repro --finding b9caa35df2213040 --docker
nyx repro --finding b9caa35df2213040 --print-path
nyx repro --spec-hash 8bca7f8e0311d6c9
nyx repro --bundle /path/to/repro/8bca7f8e0311d6c9
```
Exit codes mirror `reproduce.sh`: `0` pass, `1` replay mismatch, `2` Docker
unavailable, `3` process-backend toolchain mismatch. Any other script exit is
passed through.
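For callers that wrap `nyx repro` in their own tooling, the exit codes above map naturally onto an enum. This is a sketch with hypothetical names, not part of the Nyx API.

```rust
/// Hypothetical classification of the documented `nyx repro` exit codes.
#[derive(Debug, PartialEq)]
enum ReproOutcome {
    Pass,
    ReplayMismatch,
    DockerUnavailable,
    ToolchainMismatch,
    /// Any other exit code from the script, passed through unchanged.
    Other(i32),
}

fn classify_repro_exit(code: i32) -> ReproOutcome {
    match code {
        0 => ReproOutcome::Pass,
        1 => ReproOutcome::ReplayMismatch,
        2 => ReproOutcome::DockerUnavailable,
        3 => ReproOutcome::ToolchainMismatch,
        other => ReproOutcome::Other(other),
    }
}
```

Keeping an `Other` variant preserves the pass-through behaviour instead of collapsing unknown codes into a generic failure.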
---
## `nyx index` ## `nyx index`
Manage the SQLite file index. Manage the SQLite file index.
@ -215,7 +281,7 @@ nyx index status [PATH]
Display index statistics (file count, size, last modified) for the given path. Display index statistics (file count, size, last modified) for the given path.
<p align="center"><img src="../assets/screenshots/docs/cli-idxstatus.png" alt="nyx index status output: project name, index path under the platform config dir, exists/size/modified fields" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/cli-idxstatus.png" alt="nyx index status output: project name, index path under the platform config dir, exists/size/modified fields" width="900"/></p>
--- ---
@ -248,6 +314,64 @@ Remove index data.
--- ---
## `nyx surface`
Print the project's attack-surface map.
```
nyx surface [PATH] [--format <FMT>] [--build]
```
Loads the `SurfaceMap` persisted by the most recent indexed scan when available; otherwise runs the per-language framework probes against the on-disk source to produce an entry-points-only map. Pass `--build` to force a full inline build (pass-1 summary extraction + call-graph construction) on an unscanned project, which adds `DataStore` / `ExternalService` / `DangerousLocal` nodes the entry-points-only fallback omits.
| Flag | Default | Description |
|------|---------|-------------|
| `--format <FMT>` | `text` | Output format: `text` (indented tree), `json` (canonical SurfaceMap), `dot` (Graphviz source), or `svg` (spawns `dot` locally) |
| `--build` | off | Force a full SurfaceMap build inline when no indexed scan exists. Same cost as `nyx index build` |
Pipe `dot` output through `dot -Tsvg` for a renderable graph, or use `--format svg` for a one-step render when graphviz is installed.
---
## `nyx serve`
Start the local browser UI for browsing scan results.
```
nyx serve [PATH] [OPTIONS]
```
**PATH** defaults to `.` (current directory). The server binds to a loopback address only and refuses non-loopback hosts at startup.
| Flag | Default | Description |
|------|---------|-------------|
| `-p, --port <PORT>` | *(from config)* | Port to bind to (overrides `[server].port`) |
| `--host <HOST>` | *(from config)* | Host to bind to (overrides `[server].host`) |
| `--no-browser` | off | Skip opening the browser automatically |
See [serve.md](serve.md) for the UI tour, route map, and CSRF / host-header behaviour.
---
## `nyx verify-feedback`
Record a correction or confirmation against a dynamic-verifier verdict. Requires `--features dynamic`.
```
nyx verify-feedback <FINDING_ID> [--wrong <REASON> | --right] [--upload]
```
| Argument/Flag | Description |
|---------------|-------------|
| `FINDING_ID` | Stable 16-char hex id shown in `nyx scan --verify` output |
| `--wrong <REASON>` | Mark the verdict wrong and record the reason. Conflicts with `--right` |
| `--right` | Confirm the verdict. Conflicts with `--wrong` |
| `--upload` | Reserved; uploading to Nyx telemetry is not yet implemented |
Feedback is written to the local telemetry log under the platform cache dir.
---
## `nyx config` ## `nyx config`
Manage configuration. Manage configuration.
@ -256,7 +380,7 @@ Manage configuration.
Print the effective merged configuration as TOML. Useful for sanity-checking what the scanner is actually using after `nyx.conf` and `nyx.local` merge: Print the effective merged configuration as TOML. Useful for sanity-checking what the scanner is actually using after `nyx.conf` and `nyx.local` merge:
<p align="center"><img src="../assets/screenshots/docs/cli-configshow.png" alt="nyx config show output: TOML dump of the merged scanner config showing [scanner] mode/min_severity/excluded_extensions/excluded_directories, [database] settings, and resolved engine toggles" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/cli-configshow.png" alt="nyx config show output: TOML dump of the merged scanner config showing [scanner] mode/min_severity/excluded_extensions/excluded_directories, [database] settings, and resolved engine toggles" width="900"/></p>
### `nyx config path` ### `nyx config path`
@ -275,7 +399,7 @@ Add a custom taint rule. Written to `nyx.local`.
| `--lang` | `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby` | | `--lang` | `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby` |
| `--matcher` | Function or property name to match | | `--matcher` | Function or property name to match |
| `--kind` | `source`, `sanitizer`, `sink` | | `--kind` | `source`, `sanitizer`, `sink` |
| `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all` | | `--cap` | `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `data_exfil`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all` |
### `nyx config add-terminator` ### `nyx config add-terminator`
@ -287,6 +411,41 @@ Add a terminator function (e.g. `process.exit`). Written to `nyx.local`.
--- ---
## `nyx rules`
Browse the built-in rule registry from the terminal. Same dataset the dashboard's Rules page reads from: cap-class entries (one per `Cap` with a canonical rule id), per-language label rules (sink / source / sanitizer), gated sinks, and any custom rules from your config.
### `nyx rules list`
```
nyx rules list [--lang <SLUG>] [--kind <KIND>] [--class-only|--no-class] [--json]
```
| Flag | Values |
|------|--------|
| `--lang` | Language slug (`javascript`, `typescript`, `python`, `java`, `php`, `go`, `ruby`, `rust`, `c`, `cpp`). Cap-class entries (`language = "all"`) still surface alongside any language filter unless `--no-class` is set. |
| `--kind` | `class` (cap-class entry), `source`, `sink`, `sanitizer` |
| `--class-only` | Show only the cap-class registry entries, suppressing per-language label rules and gated sinks. |
| `--no-class` | Suppress cap-class registry entries, show only per-language label rules and gated sinks. Conflicts with `--class-only`. |
| `--json` | Emit JSON instead of the human-readable table. Schema matches the `/api/rules` response. |
Examples:
```bash
# Browse only the cap-class registry entries (one per `Cap`)
nyx rules list --class-only
# All Java sinks
nyx rules list --lang java --kind sink
# JSON output for scripted filtering
nyx rules list --json | jq '.[] | select(.cap == "ldap_injection")'
```
The `enabled` column reflects the `analysis.disabled_rules` overlay from your config, so a rule disabled in `nyx.local` shows up here too. Custom rules added via `nyx config add-rule` appear at the end with `is_custom: true`.
---
## Exit codes ## Exit codes
See [output.md](output.md#exit-codes). Summary: `0` on success (including findings without `--fail-on`), `1` when `--fail-on` trips, non-zero on scan errors. See [output.md](output.md#exit-codes). Summary: `0` on success (including findings without `--fail-on`), `1` when `--fail-on` trips, non-zero on scan errors.


@ -2,7 +2,7 @@
Nyx uses TOML configuration files. A default config is auto-generated on first run. If you'd rather edit settings and rules from the browser, the [Config page in `nyx serve`](serve.md#config) is a live editor that writes back to `nyx.local`: Nyx uses TOML configuration files. A default config is auto-generated on first run. If you'd rather edit settings and rules from the browser, the [Config page in `nyx serve`](serve.md#config) is a live editor that writes back to `nyx.local`:
<p align="center"><img src="../assets/screenshots/docs/serve-config.png" alt="Nyx config page: General settings, Triage Sync toggle, Sources panel with language/matcher/capability dropdowns and a per-language matcher table" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-config.png" alt="Nyx config page: General settings, Triage Sync toggle, Sources panel with language/matcher/capability dropdowns and a per-language matcher table" width="900"/></p>
## File Locations ## File Locations
@ -16,8 +16,8 @@ Run `nyx config path` to see the exact directory on your system.
## File Precedence ## File Precedence
1. **`nyx.conf`** -- Default config (auto-created from built-in template on first run) 1. **`nyx.conf`**: default config (auto-created from built-in template on first run)
2. **`nyx.local`** -- User overrides (loaded on top of defaults) 2. **`nyx.local`**: user overrides (loaded on top of defaults)
Both files are optional. CLI flags take precedence over both. Both files are optional. CLI flags take precedence over both.
@ -40,7 +40,7 @@ excluded_extensions = ["jpg", "png", "exe"]
excluded_extensions = ["foo", "jpg"] excluded_extensions = ["foo", "jpg"]
# Effective result: # Effective result:
# ["exe", "foo", "jpg", "png"] -- sorted, deduped union # ["exe", "foo", "jpg", "png"] (sorted, deduped union)
``` ```
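The "sorted, deduped union" in the example above can be reproduced with an ordered set. This is a sketch of the merge semantics only; the function name is hypothetical and the real config loader may implement it differently.

```rust
use std::collections::BTreeSet;

/// Hypothetical merge for list-valued settings: union of both lists,
/// deduplicated and returned in sorted order.
fn merge_list(defaults: &[&str], overrides: &[&str]) -> Vec<String> {
    defaults
        .iter()
        .chain(overrides)
        .map(|s| s.to_string())
        .collect::<BTreeSet<String>>() // BTreeSet dedups and iterates in order
        .into_iter()
        .collect()
}
```

Merging `["jpg", "png", "exe"]` with `["foo", "jpg"]` this way gives the same four entries as the comment in the example.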
--- ---
@ -65,6 +65,13 @@ excluded_extensions = ["foo", "jpg"]
| `scan_hidden_files` | bool | `false` | Scan dot-files | | `scan_hidden_files` | bool | `false` | Scan dot-files |
| `include_nonprod` | bool | `false` | Keep original severity for test/vendor paths | | `include_nonprod` | bool | `false` | Keep original severity for test/vendor paths |
| `enable_state_analysis` | bool | `true` | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires `mode = "full"` or `mode = "taint"`. | | `enable_state_analysis` | bool | `true` | Enable resource lifecycle + auth state analysis. Detects use-after-close, double-close, resource leaks (per-function scope), and unauthenticated access. Requires `mode = "full"` or `mode = "taint"`. |
| `enable_auth_analysis` | bool | `true` | Enable auth-state analysis within the state engine. When false, only resource lifecycle findings (leak, use-after-close, double-close) are produced. |
| `enable_panic_recovery` | bool | `false` | Catch per-file analysis panics as warnings and continue. When false, a panic aborts the scan, preserving the loud-fail behaviour for users debugging engine bugs. |
| `enable_auth_as_taint` | bool | `false` | Fold auth analysis into the SSA/taint engine via `Cap::UNAUTHORIZED_ID`. Off while the standalone path still carries stable detection. |
| `verify` | bool | `true` | Run dynamic verification on each `Confidence >= Medium` finding after the static pass. Included in default builds; custom `--no-default-features` builds need `--features dynamic`. CLI overrides: `--verify` / `--no-verify`. |
| `verify_all_confidence` | bool | `false` | Extend dynamic verification to findings below `Confidence::Medium`. Intended for corpus-building, not production scans. CLI: `--verify-all-confidence`. |
| `verify_backend` | string | `"auto"` | Sandbox backend for dynamic verification. `"auto"` picks docker when available else process; `"docker"` requires docker; `"process"` runs in-process (same as `--unsafe-sandbox`). |
| `harden_profile` | string | `"standard"` | Process-backend hardening profile. `"standard"` engages `PR_SET_NO_NEW_PRIVS` + `setrlimit(RLIMIT_AS)` on Linux; `"strict"` adds namespace unshare, chroot to workdir, and a default-deny seccomp filter on Linux, plus `sandbox-exec` wrapping on macOS keyed off the finding's expected cap. |
### `[database]` ### `[database]`
@ -119,6 +126,7 @@ Configuration for the local web UI (`nyx serve`).
| `auto_reload` | bool | `true` | Auto-reload UI when scan results change | | `auto_reload` | bool | `true` | Auto-reload UI when scan results change |
| `persist_runs` | bool | `true` | Persist scan runs for history view | | `persist_runs` | bool | `true` | Persist scan runs for history view |
| `max_saved_runs` | int | `50` | Maximum number of saved runs | | `max_saved_runs` | int | `50` | Maximum number of saved runs |
| `triage_sync` | bool | `true` | Auto-sync triage decisions to `.nyx/triage.json` in the project root so changes can be committed to git. |
### `[runs]` ### `[runs]`
@ -173,10 +181,10 @@ Release-grade switches for the optional analysis passes. Each toggle has a
matching CLI flag (pair of `--foo` / `--no-foo`) that overrides the config matching CLI flag (pair of `--foo` / `--no-foo`) that overrides the config
value for a single run. These used to be `NYX_*` environment variables value for a single run. These used to be `NYX_*` environment variables
(`NYX_CONSTRAINT`, `NYX_ABSTRACT_INTERP`, `NYX_SYMEX`, `NYX_CROSS_FILE_SYMEX`, (`NYX_CONSTRAINT`, `NYX_ABSTRACT_INTERP`, `NYX_SYMEX`, `NYX_CROSS_FILE_SYMEX`,
`NYX_SYMEX_INTERPROC`, `NYX_CONTEXT_SENSITIVE`, `NYX_PARSE_TIMEOUT_MS`, `NYX_SYMEX_INTERPROC`, `NYX_CONTEXT_SENSITIVE`, `NYX_BACKWARDS`,
`NYX_SMT`); those env vars are still honored as a last-resort override when `NYX_PARSE_TIMEOUT_MS`, `NYX_SMT`); those env vars are still honored as a
nyx is used as a library (no CLI entry point), but the config/CLI surface is fallback default when nyx is used as a library (no CLI entry point), but the
the stable path. config/CLI surface is the stable path.
| Field | Type | Default | Description | | Field | Type | Default | Description |
|-------|------|---------|-------------| |-------|------|---------|-------------|
@ -185,6 +193,8 @@ the stable path.
| `context_sensitive` | bool | `true` | k=1 context-sensitive callee inlining for intra-file calls | | `context_sensitive` | bool | `true` | k=1 context-sensitive callee inlining for intra-file calls |
| `backwards_analysis` | bool | `false` | Demand-driven backwards taint walk from sinks (adds scan time; default off) | | `backwards_analysis` | bool | `false` | Demand-driven backwards taint walk from sinks (adds scan time; default off) |
| `parse_timeout_ms` | int | `10000` | Per-file tree-sitter parse timeout; `0` disables the cap | | `parse_timeout_ms` | int | `10000` | Per-file tree-sitter parse timeout; `0` disables the cap |
| `max_origins` | int | `32` | Maximum taint origins retained per lattice value. Excess origins are dropped deterministically (sorted by source location) and an `OriginsTruncated` engine note is recorded. CLI: `--max-origins`. |
| `max_pointsto` | int | `32` | Maximum abstract heap objects retained per intra-procedural points-to set. Excess objects are dropped and a `PointsToTruncated` engine note is recorded. CLI: `--max-pointsto`. |
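The `max_origins` cap can be pictured as a sort-then-truncate step. The sketch below is an assumption-laden illustration, not the engine's code: the `Origin` tuple, the `cap_origins` name, and the choice to keep the earliest source locations are all hypothetical. The source only states that the drop is deterministic and ordered by source location.

```python
from typing import NamedTuple


class Origin(NamedTuple):
    # Hypothetical shape of a taint origin: where the tainted value came from.
    file: str
    line: int
    col: int


def cap_origins(origins, max_origins=32):
    """Keep at most `max_origins` origins, chosen independently of input order.

    Assumption: origins are ordered by (file, line, col) and the earliest are kept.
    Returns (kept, truncated); `truncated` stands in for an OriginsTruncated note.
    """
    ordered = sorted(set(origins))
    return ordered[:max_origins], len(ordered) > max_origins
```

Sorting before truncating is what makes the result reproducible: two scans that discover the same origins in a different order still retain the same subset.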
**`[analysis.engine.symex]`** sub-section: **`[analysis.engine.symex]`** sub-section:
@ -208,11 +218,53 @@ CLI flag map (each pair is `--enable / --no-enable`):
| `symex.cross_file` | `--cross-file-symex` / `--no-cross-file-symex` | | `symex.cross_file` | `--cross-file-symex` / `--no-cross-file-symex` |
| `symex.interprocedural` | `--symex-interproc` / `--no-symex-interproc` | | `symex.interprocedural` | `--symex-interproc` / `--no-symex-interproc` |
| `symex.smt` | `--smt` / `--no-smt` | | `symex.smt` | `--smt` / `--no-smt` |
| `max_origins` | `--max-origins <N>` |
| `max_pointsto` | `--max-pointsto <N>` |
**Engine-depth profile shortcut**: instead of flipping individual toggles, pass `--engine-profile {fast,balanced,deep}` to set the whole stack at once. Individual flags override the profile, so `--engine-profile fast --backwards-analysis` runs the fast stack with backwards analysis on. See `docs/cli.md` for the exact toggle matrix. **Engine-depth profile shortcut**: instead of flipping individual toggles, pass `--engine-profile {fast,balanced,deep}` to set the whole stack at once. Individual flags override the profile, so `--engine-profile fast --backwards-analysis` runs the fast stack with backwards analysis on. See `docs/cli.md` for the exact toggle matrix.
**Explain effective engine**: pass `--explain-engine` to print the resolved engine configuration (profile + config + CLI overrides) and exit without scanning. **Explain effective engine**: pass `--explain-engine` to print the resolved engine configuration (profile + config + CLI overrides) and exit without scanning.
### `[chain]`
Bounded-DFS path search across taint findings. Emits multi-step attack chains when several findings link through shared SSA values or call edges.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `max_depth` | int | `4` | Maximum per-finding hops in a single chain path. |
| `min_score` | float | `9.5` | Score threshold; chains below this value are dropped. |
| `reverify_top_n` | int | `5` | Only the top-N chains by score are eligible for composite dynamic re-verification. `0` disables composite re-verification. |
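To show how the three `[chain]` fields interact, here is a rough Python sketch of a bounded depth-first search followed by score filtering. It is not the Nyx implementation: the adjacency-dict graph, the function names, and the caller-supplied `score` function are hypothetical, and `max_depth` is assumed to count hops (edges) between findings.

```python
def enumerate_chains(edges, start, max_depth=4):
    """Yield simple paths from `start` that use at most `max_depth` hops."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if len(path) > 1:
            yield path
        if len(path) - 1 == max_depth:
            continue  # depth bound reached; do not extend this path
        for nxt in edges.get(node, []):
            if nxt not in path:  # skip cycles
                stack.append((nxt, path + [nxt]))


def select_chains(chains, score, min_score=9.5, reverify_top_n=5):
    """Drop chains below `min_score`; return (kept, eligible_for_reverification)."""
    kept = sorted((c for c in chains if score(c) >= min_score), key=score, reverse=True)
    return kept, kept[:reverify_top_n]
```

With `reverify_top_n = 0` the second list is empty, which matches the documented way to disable composite re-verification.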
### `[telemetry]`
Sampling policy for the on-disk event log written by dynamic verification (`~/.cache/nyx/dynamic/events.jsonl`). Confirmed and Inconclusive verdicts are calibration-critical and kept by default; other verdict statuses can be downsampled to bound log growth. Decisions are seeded by `spec_hash` for determinism. See `docs/dynamic.md` for the on-disk schema and `NYX_NO_TELEMETRY=1` opt-out.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `keep_all_confirmed` | bool | `true` | Always retain `Confirmed` verdicts. |
| `keep_all_inconclusive` | bool | `true` | Always retain `Inconclusive` verdicts. |
| `sample_rate_other` | float | `1.0` | Retention probability for verdicts not covered by the keep-all flags. `1.0` keeps everything, `0.0` drops everything. |
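One way to make a sampling decision deterministic is to derive it from a hash of `spec_hash` rather than from a random number generator. The Python sketch below illustrates that idea under stated assumptions: the use of SHA-256, the `keep_event` name, and the string status values are hypothetical, and the real policy may differ in detail.

```python
import hashlib


def keep_event(status, spec_hash, keep_all_confirmed=True,
               keep_all_inconclusive=True, sample_rate_other=1.0):
    """Decide whether an event is written to the log; same inputs, same answer."""
    if status == "Confirmed" and keep_all_confirmed:
        return True
    if status == "Inconclusive" and keep_all_inconclusive:
        return True
    # Map the first 8 bytes of the digest to an integer in [0, 2**64) and
    # compare it against the retention probability scaled to the same range.
    digest = hashlib.sha256(spec_hash.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    return value < sample_rate_other * 2**64
```

A rate of `1.0` keeps every event and `0.0` drops every event not covered by a keep-all flag, consistent with the table above.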
### `[detectors.data_exfil]`
Per-project tuning for the `taint-data-exfiltration` rule. All fields are optional.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enabled` | bool | `true` | Set `false` to strip `Cap::DATA_EXFIL` from sink caps before emission. No `taint-data-exfiltration` finding reaches the report. Other taint classes are not affected. |
| `trusted_destinations` | [string] | `[]` | URL prefixes that drop `Cap::DATA_EXFIL` on the call site. Matched against the abstract-string domain prefix of the destination arg, so a literal URL or a template literal with a static prefix both work. Use full origins or origin-pinned paths and include the trailing `/`, otherwise `https://api.` matches `https://api.evil.example.com/` too. |
```toml
[detectors.data_exfil]
enabled = true
trusted_destinations = [
"https://api.internal/",
"https://telemetry.example.com/",
]
```
For the sanitizer convention, source sensitivity gate, and per-language sink coverage, see [Detectors / Taint / DATA_EXFIL](detectors/taint.md#data_exfil-suppression-layers).
### `[analysis.languages.<slug>]` ### `[analysis.languages.<slug>]`
Per-language custom rules. `<slug>` is one of: `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby`. Per-language custom rules. `<slug>` is one of: `rust`, `javascript`, `typescript`, `python`, `go`, `java`, `c`, `cpp`, `php`, `ruby`.
@ -232,9 +284,15 @@ kind = "sanitizer" # "source" | "sanitizer" | "sink"
cap = "html_escape" # "env_var" | "html_escape" | "shell_escape" | cap = "html_escape" # "env_var" | "html_escape" | "shell_escape" |
# "url_encode" | "json_parse" | "file_io" | # "url_encode" | "json_parse" | "file_io" |
# "fmt_string" | "sql_query" | "deserialize" | # "fmt_string" | "sql_query" | "deserialize" |
# "ssrf" | "code_exec" | "crypto" | "all" # "ssrf" | "data_exfil" | "code_exec" | "crypto" |
# "unauthorized_id" | "ldap_injection" |
# "xpath_injection" | "header_injection" |
# "open_redirect" | "ssti" | "xxe" |
# "prototype_pollution" | "all"
``` ```
Aliases accepted by `parse_cap` and `[..rules].cap`: `data_exfiltration` for `data_exfil`, `ldapi` for `ldap_injection`, `xpathi` for `xpath_injection`, `crlf` and `response_splitting` for `header_injection`, `redirect` for `open_redirect`, `template_injection` for `ssti`, `proto_pollution` for `prototype_pollution`.
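The alias list reads naturally as a lookup table from alias to canonical cap name. A small Python sketch of that mapping, for illustration (the `canonical_cap` helper is hypothetical; the real `parse_cap` lives in the Rust codebase and may also normalise case or whitespace):

```python
# Alias -> canonical cap name, copied from the list above.
CAP_ALIASES = {
    "data_exfiltration": "data_exfil",
    "ldapi": "ldap_injection",
    "xpathi": "xpath_injection",
    "crlf": "header_injection",
    "response_splitting": "header_injection",
    "redirect": "open_redirect",
    "template_injection": "ssti",
    "proto_pollution": "prototype_pollution",
}


def canonical_cap(name):
    """Resolve an alias to its canonical cap name; canonical names pass through."""
    return CAP_ALIASES.get(name, name)
```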
--- ---
## Example Configurations ## Example Configurations
@ -328,7 +386,7 @@ nyx config show
Config is validated after loading and merging. Validation checks include: Config is validated after loading and merging. Validation checks include:
- Server port must be 1–65535 - Server port must be 1 to 65535
- Server host must not be empty - Server host must not be empty
- `max_saved_runs` must be > 0 when `persist_runs` is true - `max_saved_runs` must be > 0 when `persist_runs` is true
- `max_runs` must be > 0 when `persist` is true - `max_runs` must be > 0 when `persist` is true
@ -365,15 +423,15 @@ State analysis requires `mode = "full"` or `mode = "taint"`. It has no effect in
### Engine-version mismatch is handled automatically ### Engine-version mismatch is handled automatically
Nyx stores the scanner's `CARGO_PKG_VERSION` in the project index database. Nyx stores the scanner's `CARGO_PKG_VERSION` in the project index database.
When the version recorded in the DB differs from the running binary; or the When the version recorded in the DB differs from the running binary, or the
row is missing entirely; every cached summary, SSA body, and file-hash row row is missing entirely, every cached summary, SSA body, and file-hash row
is wiped on the next open so the next scan rebuilds the index against the new is wiped on the next open. The next scan rebuilds the index against the new
engine. No flag is needed; CI pipelines keep working across upgrades. engine. No flag is needed; CI pipelines keep working across upgrades.
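The check-and-wipe behaviour described above can be sketched with an in-memory SQLite database. This is only a model of the idea: the table names, the `meta` key, and the `open_index` function are hypothetical and do not reflect Nyx's actual schema.

```python
import sqlite3

# Hypothetical cache tables standing in for summaries, SSA bodies, and file hashes.
CACHE_TABLES = ("summaries", "ssa_bodies", "file_hashes")


def open_index(conn, running_version):
    """Wipe cached rows when the stored engine version is missing or different.

    Returns True when the cache was wiped and must be rebuilt by the next scan.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    for table in CACHE_TABLES:
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (path TEXT, payload TEXT)")
    row = conn.execute("SELECT value FROM meta WHERE key = 'engine_version'").fetchone()
    if row is not None and row[0] == running_version:
        return False  # same engine: the cache stays valid
    for table in CACHE_TABLES:
        conn.execute(f"DELETE FROM {table}")
    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('engine_version', ?)",
        (running_version,),
    )
    conn.commit()
    return True
```

Treating a missing version row the same as a mismatch means an index created before versioning existed is rebuilt rather than trusted.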
The rebuild is logged at `info` level: The rebuild is logged at `info` level:
``` ```
engine version changed (0.4.0 → 0.5.0), rebuilding index engine version changed (<old> → <new>), rebuilding index
``` ```
If you see this once per upgrade it is working as intended. If you see it on If you see this once per upgrade it is working as intended. If you see it on
@ -410,4 +468,4 @@ On the next scan Nyx builds a fresh index from scratch.
## Reserved Fields ## Reserved Fields
Some config fields are defined but not yet implemented. They are marked `(RESERVED)` in the default config and accept values without effect. This allows forward-compatible config files; settings will activate when the feature is implemented without requiring config changes. Some config fields are defined but not yet implemented. They are marked `(RESERVED)` in the default config and accept values without effect. Config files stay forward-compatible: settings start having an effect when the feature ships, with no edit needed.


@ -9,6 +9,36 @@ Nyx ships four independent detector families. They run together in `--mode full`
| [State model](detectors/state.md) | `state-*` | Per-function state lattice | Use-after-close, double-close, leaks, unauthenticated access | | [State model](detectors/state.md) | `state-*` | Per-function state lattice | Use-after-close, double-close, leaks, unauthenticated access |
| [AST patterns](detectors/patterns.md) | `<lang>.<cat>.<name>` | Tree-sitter structural match | Banned APIs, weak crypto, dangerous constructs | | [AST patterns](detectors/patterns.md) | `<lang>.<cat>.<name>` | Tree-sitter structural match | Banned APIs, weak crypto, dangerous constructs |
```mermaid
flowchart LR
Taint["Taint analysis<br/>cross-file source-to-sink"] --> Normalize["Normalize findings"]
Cfg["CFG structural<br/>guards, exits, resource paths"] --> Normalize
State["State model<br/>resource and auth lattice"] --> Normalize
Ast["AST patterns<br/>tree-sitter structural match"] --> Normalize
Normalize --> Dedupe["Deduplicate<br/>same site, rule, severity"]
Dedupe --> Rank["Rank<br/>severity, evidence, context"]
Rank --> Output["Console, JSON, SARIF, UI"]
```
The taint family is split into cap-specific rule classes when a sink callee carries multiple vulnerability classes:
| Rule id | Cap | Surface |
|---|---|---|
| `taint-unsanitised-flow` | `sql_query`, `ssrf`, `code_exec`, `file_io`, `fmt_string`, `deserialize`, `crypto` | Catch-all class for the legacy caps that have not migrated to a dedicated rule id yet. |
| `taint-ldap-injection` | `ldap_injection` | Attacker-controlled data concatenated into an LDAP filter or DN without RFC 4515 escaping. Receivers typed as `LdapClient` (JNDI `DirContext`, Spring `LdapTemplate`, ldapjs `Client`, python-ldap `LDAPObject`, ldap3 `Connection`) and chained `.search` / `.searchByEntity` / `.search_s` form the sink set. |
| `taint-xpath-injection` | `xpath_injection` | Attacker-controlled string passed as the XPath expression to `xpath.evaluate` / `xpath.compile` / `document.evaluate` / `DOMXPath::query` / `etree.XPath`. Suppressed when the receiver was bound to an `XPathVariableResolver` (parameterised XPath shape). |
| `taint-header-injection` | `header_injection` | Attacker-controlled bytes landing in an HTTP response header without `\r\n` stripping (response splitting, cache poisoning). Covers `setHeader` / `res.set` / `res.append` / `headers["X-Foo"] = bar` / `Header().Set` / `add_header` / `setcookie` / `http.Header.Set`. |
| `taint-open-redirect` | `open_redirect` | Attacker-controlled URL driving a redirect / `Location` header without an allowlist or relative-URL check. Includes the Spring MVC `return "redirect:" + url` view-name shape via the `__spring_redirect__` synthetic sink. Suppressed by `RelativeUrlValidated` (`startsWith("/")` family) and `HostAllowlistValidated` (`new URL(x).host === ALLOWED`, `urlparse(x).netloc == ...`) inline predicates. |
| `taint-template-injection` | `ssti` | Attacker controls the *template source string* fed to a server-side renderer (Jinja2 / Mako / FreeMarker / Twig / Handlebars / EJS / Mustache / ERB / `text/template` / `html/template` / Smarty / Blade `Template(...)` / `compile(...)`), distinct from rendering a trusted template with tainted variables. |
| `taint-xxe` | `xxe` | Attacker-controlled XML reaching a parser that resolves external entities. Covers JAXP `DocumentBuilder.parse` / `SAXParser.parse` / `XMLReader.parse`, lxml `etree.parse`, Nokogiri, fast-xml-parser, xml2js, libxml2 `xmlReadFile` / `xmlReadMemory`. Suppressed when the receiver carries a hardening fact in `xml_parser_config` (`secure_processing`, `disallow_doctype`, `processEntities: false`, `LIBXML_NOENT` not set). |
| `taint-prototype-pollution` | `prototype_pollution` | Attacker-controlled key reaching an object property assignment that can mutate `Object.prototype`. JS/TS only. Covers `obj[tainted] = v` (synthetic `__index_set__` sink), library-mediated deep-merge / set helpers (`_.merge`, `_.set`, `dotProp.set`, `objectPath.set`, `setValue`), and jQuery's `extend(true, target, src)` deep-merge form via the `LiteralOnly` activation gate. Suppressed by constant-key fold (`__proto__` / `constructor` / `prototype` filtering), reject / allowlist guards on the key, and `Object.create(null)` receivers (flow-sensitive `NullPrototypeObject` type). Python equivalent (`dict.update`) is opt-in via `NYX_PYTHON_PROTO_POLLUTION=1`. |
| `taint-data-exfiltration` | `data_exfil` | Sensitive data flowing into the payload of an outbound network request (body / headers / json on `fetch`, body on `XMLHttpRequest.send`). Distinct from SSRF: the destination is fixed but attacker-influenced bytes leave the process. |
| `rs.auth.missing_ownership_check.taint` | `unauthorized_id` | Rust auth subsystem fold-in; see [auth.md](auth.md). |
A single call site can fire several of these at once when it carries multiple gates. `fetch(taintedUrl, {body: tainted})` produces both an SSRF finding (URL flow) and a `taint-data-exfiltration` finding (body flow), each with its own cap mask rather than a conflated union.
Each cap-class entry is registered in `CAP_RULE_REGISTRY` (`src/labels/mod.rs`) with its title, severity, OWASP 2021 code, and description. Browse the registry from the CLI with `nyx rules list --class-only`, or `nyx rules list --kind class --json` for machine output.
For Rust auth-specific rules (`rs.auth.*`), see [auth.md](auth.md). For Rust auth-specific rules (`rs.auth.*`), see [auth.md](auth.md).
## How they combine ## How they combine
@ -39,11 +69,13 @@ score = severity_base + analysis_kind + evidence_strength + state_bonus - valida
| Component | Values | | Component | Values |
|---|---| |---|---|
| Severity base | High=60, Medium=30, Low=10 | | Severity base | High=60, Medium=30, Low=10 |
| Analysis kind | taint=+10, state=+8, cfg with evidence=+5, cfg without evidence=+3, ast=+0 | | Analysis kind | taint=+10, taint-data-exfiltration=+7, state=+8, cfg with evidence=+5, cfg without evidence=+3, ast=+0 |
| Evidence strength | +1 per evidence item up to 4; +2 to +6 for source kind | | Evidence strength | +1 per evidence item up to 4; +2 to +6 for source kind |
| State bonus | use-after-close / unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 | | State bonus | use-after-close / unauthed=+6, double-close=+3, must-leak=+2, may-leak=+1 |
| Validation penalty | -5 if path-validated | | Validation penalty | -5 if path-validated |
DATA_EXFIL is calibrated below other taint classes by design. Severity is High only when the source carries credential / session material (cookies, env vars); other Sensitive sources (request headers, file system, database, caught exception) downgrade to Medium. Confidence is capped at Medium and only fires Medium when the abstract / symbolic domain corroborates a concrete string body reaching the outbound payload; otherwise it falls to Low. A guarded flow (`path_validated`) drops a confidence tier. The intent is to seat data-exfiltration findings below SSRF / SQLi / command-injection but above informational AST patterns.
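The calibration rules in the paragraph above can be restated as a small decision function. The Python sketch below is an interpretation, not the engine's code: `data_exfil_rating` is a hypothetical name, the source kinds are passed as plain strings, the caller is assumed to pass only Sensitive-tier kinds, and `Low` is assumed to be the floor when a guarded flow drops a tier.

```python
# Source kinds described as carrying credential / session material.
CREDENTIAL_SOURCES = {"Cookie", "EnvironmentConfig"}


def data_exfil_rating(source_kind, body_corroborated, path_validated=False):
    """Return (severity, confidence) for a DATA_EXFIL finding."""
    severity = "High" if source_kind in CREDENTIAL_SOURCES else "Medium"
    # Confidence is capped at Medium and needs a corroborated concrete body.
    confidence = "Medium" if body_corroborated else "Low"
    # A guarded flow drops one tier; Low is assumed to be the floor.
    if path_validated and confidence == "Medium":
        confidence = "Low"
    return severity, confidence
```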
Source-kind contributions (taint only): Source-kind contributions (taint only):
| Source | Bonus | | Source | Bonus |
@ -61,7 +93,9 @@ Approximate score ranges:
| High taint with user input | 76 to 81 | | High taint with user input | 76 to 81 |
| High state (use-after-close) | ~74 | | High state (use-after-close) | ~74 |
| High CFG structural | 63 to 68 | | High CFG structural | 63 to 68 |
| High DATA_EXFIL (cookie / env source, body confirmed) | ~76 |
| Medium taint with env source | 45 to 50 | | Medium taint with env source | 45 to 50 |
| Medium DATA_EXFIL (header / fs / db / caught-exception source) | 40 to 45 |
| Medium state (resource leak) | ~40 | | Medium state (resource leak) | ~40 |
| Low AST-only pattern | ~10 | | Low AST-only pattern | ~10 |
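As a worked check of the formula, the component table can be turned into a few lines of Python. This is a sketch under assumptions: the function and dictionary names are hypothetical, and the source-kind bonus is passed in by the caller as a number in the documented +2 to +6 range rather than looked up.

```python
SEVERITY_BASE = {"High": 60, "Medium": 30, "Low": 10}
ANALYSIS_KIND = {
    "taint": 10,
    "taint-data-exfiltration": 7,
    "state": 8,
    "cfg_with_evidence": 5,
    "cfg_without_evidence": 3,
    "ast": 0,
}
STATE_BONUS = {"use-after-close": 6, "unauthed": 6, "double-close": 3,
               "must-leak": 2, "may-leak": 1}


def score(severity, kind, evidence_items=0, source_bonus=0,
          state=None, path_validated=False):
    """severity_base + analysis_kind + evidence_strength + state_bonus - validation_penalty."""
    total = SEVERITY_BASE[severity] + ANALYSIS_KIND[kind]
    total += min(evidence_items, 4) + source_bonus  # +1 per evidence item, up to 4
    total += STATE_BONUS.get(state, 0)
    if path_validated:
        total -= 5
    return total


print(score("High", "state", state="use-after-close"))  # 74
print(score("Medium", "state", state="must-leak"))      # 40
print(score("Low", "ast"))                              # 10
```

With no evidence items counted, these three calls reproduce the approximate figures listed for High state (use-after-close), Medium state (resource leak), and Low AST-only findings.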


@ -56,8 +56,10 @@ Full list: [rules.md](../rules.md).
| `eval("hardcoded literal")` | Pattern matches structure | Run `--mode cfg` to drop AST patterns and rely on taint | | `eval("hardcoded literal")` | Pattern matches structure | Run `--mode cfg` to drop AST patterns and rely on taint |
| `unsafe` block with sound justification | Every `unsafe` matches `rs.quality.unsafe_block` | Filter `>=MEDIUM` (it's Medium) or accept the noise | | `unsafe` block with sound justification | Every `unsafe` matches `rs.quality.unsafe_block` | Filter `>=MEDIUM` (it's Medium) or accept the noise |
| `.unwrap()` in tests | Acceptable in test code | Default non-prod severity downgrade reduces it | | `.unwrap()` in tests | Acceptable in test code | Default non-prod severity downgrade reduces it |
| `md5` for non-cryptographic checksums | Pattern can't see intent | Suppress with `--severity ">=MEDIUM"` or per-line `nyx:ignore` | | `md5` for non-cryptographic checksums | Pattern can't see intent in most languages | PHP recognises non-crypto consuming context structurally (cache keys, ETag, dedup, `getCacheKey()` returns) and suppresses. Other languages: `--severity ">=MEDIUM"` or per-line `nyx:ignore` |
| SQL concat with trusted data (Tier B) | Heuristic can't verify the source | Taint is more precise; or convert to a parameterized query | | SQL concat with trusted data (Tier B) | Heuristic can't verify the source | Taint is more precise; or convert to a parameterized query |
| C++ `reinterpret_cast<T>(...)` for byte-pointer / void* / `sockaddr` | Pattern fires on every cast regardless of target type | Suppressed when the target is well-defined by C++ aliasing rules: `char*`, `unsigned char*`, `signed char*`, `wchar_t*`, `uint8_t*`, `int8_t*`, `std::byte*`, `byte*`, `void*`, `uintptr_t` / `intptr_t` (and `std::` variants), and the BSD socket address family. User-defined struct or class pointer targets keep firing. |
| JS / TS `secrets.fallback_secret` on `process.env.X \|\| ""` | Empty-string fallback satisfies non-undefined string types without committing a secret | Empty-string fallbacks are excluded from the rule. Non-empty literal fallbacks still fire. |
## Confidence levels ## Confidence levels


@ -59,7 +59,7 @@ Higher confidence:
Lower confidence: Lower confidence:
- Path-validated taint (`path_validated: true`). - Path-validated taint (`path_validated: true`).
- Source is a database read or internal file (pre-validated at insertion is common). - Source is a database read or internal file (pre-validated at insertion is common).
- Engine note `ForwardBailed` / `PathWidened`. Use `--require-converged` to drop these in strict gates. - Any non-informational engine note (`SsaLoweringBailed`, `ParseTimeout`, `PredicateStateWidened`, `PathEnvCapped`, `WorklistCapped`, etc.). Use `--require-converged` to drop over-report and bail notes in strict gates.
## Tuning ## Tuning
@ -92,7 +92,7 @@ AST-only mode gives you structural pattern matches without taint.
In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took: In the browser UI, taint findings render as a numbered flow walk so you can see each hop the engine took:
<p align="center"><img src="../../assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: HIGH taint-unsanitised-flow with numbered source → call → sink steps and How to fix guidance" width="900"/></p> <p align="center"><img src="../assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: HIGH taint-unsanitised-flow with numbered source → call → sink steps and How to fix guidance" width="900"/></p>
## Example ## Example
@ -130,14 +130,142 @@ Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer onl
| `shell_escape` | | `shlex.quote`, `shell_escape::escape` | `system`, `Command::new`, `eval` | | `shell_escape` | | `shlex.quote`, `shell_escape::escape` | `system`, `Command::new`, `eval` |
| `url_encode` | | `encodeURIComponent` | `location.href`, HTTP client URL arg | | `url_encode` | | `encodeURIComponent` | `location.href`, HTTP client URL arg |
| `json_parse` | | `JSON.parse` | | | `json_parse` | | `JSON.parse` | |
| `file_io` | | `os.path.realpath`, `filepath.Clean` | `open`, `fs::read_to_string`, `send_file` | | `file_io` | | `os.path.realpath`, `filepath.Clean`, canonicalise + `starts_with`-rooted guard | `open`, `fs::read_to_string`, `send_file` |
| `fmt_string` | | | `printf(var)` | | `fmt_string` | | | `printf(var)` |
| `sql_query` | | parameterized query binders | `cursor.execute`, `db.query` with concatenation | | `sql_query` | | parameterized query binders | `cursor.execute`, `db.query` with concatenation |
| `deserialize` | | | `pickle.loads`, `yaml.load`, `Marshal.load` | | `deserialize` | | | `pickle.loads`, `yaml.load`, `Marshal.load` |
| `ssrf` | | URL-prefix locks | `requests.get`, `fetch`, `HttpClient.send` | | `ssrf` | | URL-prefix locks | `requests.get`, `fetch` URL arg, outbound HTTP destination |
| `code_exec` | | | `eval`, `exec`, `Function` | | `code_exec` | | | `eval`, `exec`, `Function` |
| `crypto` | | | weak-algorithm constructors | | `crypto` | | | weak-algorithm constructors |
| `unauthorized_id` | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write | | `unauthorized_id` | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write |
| `ldap_injection` | | `ldap-escape` filter / dn helpers, project-local `escapeLdapFilter` | `DirContext.search`, `LdapClient.search`, `ldap_search`, `Net::LDAP#search`, `ldap_search_ext_s` |
| `xpath_injection` | | bound `XPathVariableResolver`, `escapeXpath` / `xpathEscape` helpers | `XPath.evaluate`, `DOMXPath::query`, `document.evaluate`, `xpath.select`, `etree.XPath` |
| `header_injection` | | `stripCRLF` / `escapeHeader` / `sanitizeHeader` | `setHeader`, `res.set`, `res.append`, `headers["X-Foo"] = bar`, `Header().Set`, `header()`, `setcookie` |
| `open_redirect` | | leading-slash check (`startsWith("/")`), URL-parse + host allowlist (`new URL(x).host === ALLOWED`) | `Redirect::to`, Spring `redirect:` view name, `flask.redirect`, `http.Redirect`, `redirect_to` |
| `ssti` | | | template constructors fed by tainted source: `Jinja2 Template(...)`, `freemarker.Template`, `Twig::createTemplate`, Handlebars `compile`, `ERB.new`, Mako `Template(...)` |
| `xxe` | | hardened parser config (`secure_processing`, `disallow-doctype-decl`, `processEntities: false`, `LIBXML_NOENT` not set) | `DocumentBuilder.parse`, `SAXParser.parse`, `xml2js`, `fast-xml-parser`, `lxml.etree.parse`, `xmlReadFile` |
| `prototype_pollution` | | constant-key fold, reject / allowlist guards on the key, `Object.create(null)` receivers | `obj[tainted] = v` synthetic `__index_set__`, `_.merge`, `_.set`, `dotProp.set`, `objectPath.set`, jQuery `extend(true, ...)` |
| `data_exfil` | cookies, headers, env, db rows, file reads (Sensitive-tier sources only) | | `fetch` body / headers / json, `XMLHttpRequest.send` body |
| `all` | Sources typically use `all` so they match any sink | | | | `all` | Sources typically use `all` so they match any sink | | |
Sources typically use `cap = "all"` so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name. Sources typically use `cap = "all"` so they match every sink. Sinks declare the specific cap they need. Sanitizers only clear the cap they name.
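The source / sanitizer / sink relationship behaves like a bitmask: a source sets every bit, a sanitizer clears exactly one, and a sink fires when its own bit is still set. A minimal Python sketch of that model, with a hypothetical three-cap subset and helper names that are not taken from Nyx:

```python
from enum import IntFlag, auto


class Cap(IntFlag):
    # Small illustrative subset of the capability table.
    HTML_ESCAPE = auto()
    SHELL_ESCAPE = auto()
    SQL_QUERY = auto()


ALL = Cap.HTML_ESCAPE | Cap.SHELL_ESCAPE | Cap.SQL_QUERY  # what cap = "all" stands for


def sanitize(taint, cap):
    """A sanitizer clears only the cap it names."""
    return taint & ~cap


def sink_fires(taint, sink_cap):
    """A sink fires when the cap it needs is still present on the value."""
    return bool(taint & sink_cap)
```

For example, a value that has passed through an HTML escaper no longer triggers an HTML sink but still triggers a SQL sink, because only the `HTML_ESCAPE` bit was cleared.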
## Source sensitivity
Some detector classes need to know not just *that* a value is attacker-influenced but *what kind* of value it is. Each source carries a `SourceKind` (`UserInput`, `Cookie`, `Header`, `EnvironmentConfig`, `FileSystem`, `Database`, `CaughtException`, `Unknown`) and a derived sensitivity tier:
| Tier | Source kinds | Meaning |
|---|---|---|
| `Plain` | `UserInput` (request bodies, query strings, form fields, argv, stdin) | Attacker-controlled but already in the attacker's hands. Echoing it back to them is not a disclosure. |
| `Sensitive` | `Cookie`, `Header`, `EnvironmentConfig`, `FileSystem`, `Database`, `CaughtException`, `Unknown` | Operator-bound state that should not leak across boundaries. |
| `Secret` | (reserved for explicit credential sources) | Highest tier; treated identically to `Sensitive` today. |
`Cap::DATA_EXFIL` only fires when the contributing source is at least `Sensitive`. Plain user input flowing into an outbound `fetch` body is suppressed at finding-emission time. That is the canonical false-positive class for API gateways and telemetry forwarders that proxy `req.body`. SSRF and other classes are unaffected; the gate is scoped to `DATA_EXFIL`.
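The tier table and the gate can be modelled in a few lines. The Python sketch below mirrors the `SourceKind` names from the text, but the enums and the `data_exfil_allowed` helper are hypothetical stand-ins for the Rust types, and the `Secret` tier is left unused because the source says it is reserved.

```python
from enum import Enum, IntEnum


class SourceKind(Enum):
    USER_INPUT = "UserInput"
    COOKIE = "Cookie"
    HEADER = "Header"
    ENVIRONMENT_CONFIG = "EnvironmentConfig"
    FILE_SYSTEM = "FileSystem"
    DATABASE = "Database"
    CAUGHT_EXCEPTION = "CaughtException"
    UNKNOWN = "Unknown"


class Tier(IntEnum):
    PLAIN = 0
    SENSITIVE = 1
    SECRET = 2  # reserved; no source kind maps here today


def sensitivity(kind):
    """Only UserInput is Plain; every other kind is Sensitive."""
    return Tier.PLAIN if kind is SourceKind.USER_INPUT else Tier.SENSITIVE


def data_exfil_allowed(kind):
    """The DATA_EXFIL gate: the source must be at least Sensitive."""
    return sensitivity(kind) >= Tier.SENSITIVE
```

Under this model a `req.body` value (UserInput) reaching a `fetch` body is filtered out, while a cookie reaching the same sink is not.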
If a project legitimately classifies a request body as sensitive (e.g. an internal forwarder where `req.body` carries a pre-authenticated user token), override via custom rules in `nyx.conf`:
```toml
# Treat the forwarder's outbound payload as already-sanitized so the
# DATA_EXFIL gate stops firing on it.
[[analysis.languages.javascript.rules]]
matchers = ["sanitizeOutbound"]
kind = "sanitizer"
cap = "data_exfil"
```
Or re-classify the source itself with a custom Source rule whose name matches one of the Sensitive substrings (`cookie`, `header`).
## DATA_EXFIL suppression layers
Three suppression knobs ship by default so projects can match the cap to their architecture without per-call suppressions.
### 1. Forwarding-wrapper sanitizer convention
A named function that exists to *forward* a payload across a known boundary is the developer's explicit decision to send the data. The default sanitizer rules treat the following identifiers as `Sanitizer(data_exfil)` in JavaScript and TypeScript:
```
serializeForUpstream
forwardPayload
tracker.send
analytics.track
metrics.report
logEvent
```
If your codebase follows this convention, the cap stops firing on these calls automatically. Extend the convention with your own forwarding wrappers via the standard custom-rule path:
```toml
[[analysis.languages.javascript.rules]]
matchers = ["dispatchTelemetry", "sendToBus"]
kind = "sanitizer"
cap = "data_exfil"
```
The rule of thumb: a function that *only* exists to ship a payload to a known boundary belongs in this list. A function that *might* leak (a generic HTTP wrapper, a logging helper that writes to an arbitrary destination) does not.
### 2. Destination allowlist
Configure a set of trusted outbound prefixes once and the cap is dropped on every site whose destination argument has a static prefix that begins with one of them:
```toml
[detectors.data_exfil]
trusted_destinations = [
"https://api.internal/",
"https://telemetry.example.com/",
]
```
Use full origins or origin-pinned paths so a partial-host match across unrelated origins cannot occur. `https://api.` would also match `https://api.evil.example.com/`, so the entry must include the path separator (`/`) at the end of the host.
The match consults the abstract string domain: a literal URL is a static prefix; a template literal `` `https://api.internal/${id}` `` exposes the prefix `https://api.internal/`; a fully dynamic URL has no prefix and the cap fires as usual.
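The allowlist check reduces to a string-prefix test on whatever static prefix the analysis recovered. A minimal Python sketch, assuming the static prefix is available as a string or `None` for a fully dynamic URL (`cap_dropped` is a hypothetical name, not a Nyx function):

```python
def cap_dropped(static_prefix, trusted_destinations):
    """Return True when the destination's static prefix starts with a trusted entry."""
    if static_prefix is None:
        return False  # fully dynamic URL: nothing to match, the cap fires as usual
    return any(static_prefix.startswith(entry) for entry in trusted_destinations)


trusted = ["https://api.internal/"]
print(cap_dropped("https://api.internal/", trusted))          # True
print(cap_dropped("https://api.evil.example.com/", trusted))  # False
print(cap_dropped(None, trusted))                             # False
```

The same function shows why a trailing `/` matters: with `["https://api."]` as the allowlist, the second call would return `True` and the hostile origin would be trusted.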
### 3. Detector-class disable
Some projects forward user-bound payloads as a matter of architecture. Turn the entire detector class off when the noise is permanent:
```toml
[detectors.data_exfil]
enabled = false
```
`enabled = false` strips `Cap::DATA_EXFIL` from sink caps before event emission, so no `taint-data-exfiltration` finding reaches the report. The decision is per-project; other projects loaded by the same `nyx serve` instance keep their own settings.
## DATA_EXFIL sinks per language
Sinks Nyx ships with for `Cap::DATA_EXFIL`. The body, headers, or json payload arg fires; the URL arg routes through the SSRF gate and emits `taint-unsanitised-flow` instead.
| Language | Sinks | Example |
|---|---|---|
| JavaScript, TypeScript | `fetch(url, {body, headers, json})` body-bind, `XMLHttpRequest.prototype.send`, type-qualified `HttpClient.send` | `fetch('/upload', {method: 'POST', body: req.cookies.session})` |
| Python | `requests.post / put / patch` body and json kwargs, `httpx.AsyncClient().post` json kwarg, `aiohttp.ClientSession().post` body, dict round-trip into json | `requests.post('https://api.internal/ingest', json={'k': os.environ.get('SECRET')})` |
| Java | `HttpClient.send` with `BodyPublishers.ofString`, OkHttp `newCall(req).execute` body chain, Apache `HttpClient.execute(HttpPost)`, `RestTemplate.postForEntity / exchange`, `WebClient.post().bodyValue / body` | `client.send(HttpRequest.newBuilder().uri(...).POST(BodyPublishers.ofString(token)).build(), ...)` |
| Go | `http.Post(url, ct, body)` body arg, `http.PostForm` form arg, `(*http.Client).Do(req)` after `http.NewRequest`, `(*http.Request).Body` assignment | `http.Post("https://analytics.internal/track", "text/plain", strings.NewReader(c.Value))` |
| Rust | `reqwest::Client.post().body / json / form / multipart().send()`, `ureq::post().send_string / send_form / send_json`, `surf::post().body_string / body_json`, `hyper::Request::builder().body()` | `reqwest::Client::new().post(url).form(&secret).send()` |
| Ruby | `Net::HTTP.post(uri, body)` body arg, `Net::HTTP::Post.new(uri).body=`, `RestClient.post / put`, `HTTParty.post(url, body: ...)` body | `Net::HTTP.post(URI('https://analytics.internal/track'), "session=#{request.cookies[:auth]}")` |
| C, C++ | `curl_easy_setopt(handle, CURLOPT_POSTFIELDS, body)` and `CURLOPT_COPYPOSTFIELDS` gated sinks (macro-arg activation), `CURLOPT_POSTFIELDSIZE` body-bind | `curl_easy_setopt(curl, CURLOPT_POSTFIELDS, getenv("AUTH_TOKEN"));` |
| PHP | `curl_setopt($ch, CURLOPT_POSTFIELDS, $body)`, `Guzzle\Client.post($url, ['body' => $tainted])`, `Symfony\HttpClient->request('POST', $url, ['body' => $tainted])` | `curl_setopt($ch, CURLOPT_POSTFIELDS, $_COOKIE['session']);` |
Add project-specific sinks with `nyx config add-rule --kind sink --cap data_exfil --matcher <name>` or the equivalent TOML rule.
## DATA_EXFIL calibration ranges
`taint-data-exfiltration` is calibrated below the other taint classes on purpose.
| Source kind | Severity | Confidence ceiling |
|---|---|---|
| Cookie, environment variable | High | Medium |
| Header | Medium | Medium |
| File system, database | Medium | Medium |
| Caught exception | Medium | Low |
Path-validated flows (`path_validated: true`) drop one severity tier. Confidence drops to Low when the abstract or symbolic domain cannot corroborate a concrete string reaching the outbound payload (for example, when the body comes from a callee with no summary).
Attack-surface score ranges:
| Finding shape | Score |
|---|---|
| High DATA_EXFIL, cookie or env source, body confirmed | around 76 |
| Medium DATA_EXFIL, header, fs, db, or caught-exception source | 40 to 45 |
| Low DATA_EXFIL, no abstract corroboration, path-validated | 18 to 25 |
For reference: High SSRF, SQLi, cmdi land at 76 to 81; Medium taint with env source lands at 45 to 50; AST-only patterns sit around 10. Data-exfil sits below the direct-compromise classes but above informational AST patterns.
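The calibration table and the two adjustment rules can be read as a lookup followed by two demotions. The sketch below is a hedged illustration in Python, not Nyx's implementation: the key names are made up, and clamping severity at `Low` is an assumption, since the text does not say what happens to a flow that is already at the lowest tier.

```python
SEVERITY_ORDER = ["Low", "Medium", "High"]

# (severity, confidence ceiling) per source kind, transcribed from the table.
# The dictionary keys are hypothetical labels chosen for this sketch.
CALIBRATION = {
    "cookie": ("High", "Medium"),
    "env": ("High", "Medium"),
    "header": ("Medium", "Medium"),
    "file_system": ("Medium", "Medium"),
    "database": ("Medium", "Medium"),
    "caught_exception": ("Medium", "Low"),
}


def calibrate(source_kind, path_validated=False, corroborated=True):
    """Return (severity, confidence) for a data-exfil flow."""
    severity, confidence = CALIBRATION[source_kind]
    if path_validated:
        # Path-validated flows drop one severity tier (assumed floor: Low).
        index = max(SEVERITY_ORDER.index(severity) - 1, 0)
        severity = SEVERITY_ORDER[index]
    if not corroborated:
        # No concrete string corroborated by the abstract or symbolic domain.
        confidence = "Low"
    return severity, confidence


print(calibrate("cookie"))                       # ('High', 'Medium')
print(calibrate("cookie", path_validated=True))  # ('Medium', 'Medium')
print(calibrate("header", corroborated=False))   # ('Medium', 'Low')
```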

396
docs/dynamic.md Normal file

@ -0,0 +1,396 @@
# Dynamic verification
Static analysis tells you a sink is reachable from a source. Dynamic
verification tries to prove it. When verification is on, Nyx builds a small
harness around each finding, runs it in a sandbox against a curated payload
set, and stamps the result onto `evidence.dynamic_verdict`.
It is a second signal, not a replacement for review. A `Confirmed` verdict
means Nyx triggered the sink in its harness with an attacker-controlled
payload and proved the benign control stayed clean. `NotConfirmed` means the
harness ran but nothing fired. Neither verdict closes a finding on its own.
Default Nyx builds include the `dynamic` feature. Custom
`--no-default-features` builds run static-only unless rebuilt with
`--features dynamic`.
## How confirmation works
Every cap that can be verified ships a curated corpus of payload pairs: at
least one vulnerable payload and one benign control. The verifier runs both
through the same harness and compares.
- The vulnerable payload must fire the sink. A payload "fires" when an
oracle predicate matches the observed behavior, not when a string appears
in the output.
- The benign control must stay clean. It exercises the same code path with a
value that a correct implementation handles safely.
A finding is `Confirmed` only when at least one vulnerable payload fires and
every paired benign control stays clean. This differential rule is what keeps
the verifier from confirming a finding just because the harness echoed an
input.
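The differential rule can be written down as a small predicate. This is a minimal sketch with hypothetical names (`PairResult`, `is_confirmed`); it covers only the `Confirmed` decision and assumes each corpus entry yields one vulnerable result and one benign-control result.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PairResult:
    vulnerable_fired: bool  # the oracle matched on the vulnerable payload
    benign_fired: bool      # the oracle matched on the paired benign control


def is_confirmed(results: Iterable[PairResult]) -> bool:
    """Confirmed only if some vulnerable payload fired and every control stayed clean."""
    results = list(results)
    any_vulnerable_fired = any(r.vulnerable_fired for r in results)
    all_controls_clean = all(not r.benign_fired for r in results)
    return any_vulnerable_fired and all_controls_clean


# One payload fires and no control fires: confirmed.
print(is_confirmed([PairResult(True, False), PairResult(False, False)]))  # True
# The harness echoes everything, so the control fires too: not confirmed.
print(is_confirmed([PairResult(True, True)]))                             # False
```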
Oracles are behavioral, scoped to the cap:
| Cap | Oracle | What it observes |
| --- | --- | --- |
| Command/code injection | stub event | the harness's exec boundary saw the injected command |
| SQL injection | stub event | the SQL boundary saw the injected clause |
| SSRF, data exfil | outbound host | the request left for a host outside the allowlist |
| Path traversal | stub event | the filesystem boundary opened a path outside the root |
| Template injection | template eval | `{{7*7}}` rendered as `49`, not echoed as text |
| Deserialization | gadget marker | a non-allowlisted class was resolved during decode |
| XXE | entity expansion | an external entity was expanded by the parser |
| LDAP / XPath injection | result count | the malicious filter returned more rows than the benign one |
| Header / CRLF | header split | an injected `\r\n` split or added a response header |
| Open redirect | redirect host | the `Location` header pointed off-origin |
| Prototype pollution | canary touch | a property write reached `Object.prototype` |
| Weak crypto | key entropy | the produced key fit inside a 16-bit search space |
| JSON parse abuse | parse depth | the parser accepted a depth past its limit |
| IDOR | ownership cross | the read crossed from the caller's id to another owner's |
Every canary is derived per-run from `BLAKE3(spec_hash || run_nonce)`, so it is
unique per finding, collision-resistant against ambient harness output, and
never appears on the host.
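As a rough illustration of the derivation, the sketch below hashes the spec hash concatenated with a run nonce. It swaps in `hashlib.blake2b` because BLAKE3 is not in the Python standard library, so the digests will not match Nyx's; the function name, the 16-byte digest size, and the sample inputs are assumptions.

```python
import hashlib
import os


def derive_canary(spec_hash: bytes, run_nonce: bytes) -> str:
    """Derive a canary token from spec_hash || run_nonce.

    Stand-in hash: BLAKE2b instead of BLAKE3 (assumption for this sketch).
    """
    return hashlib.blake2b(spec_hash + run_nonce, digest_size=16).hexdigest()


spec_hash = b"example-spec-hash"  # hypothetical value
nonce_a = os.urandom(16)
nonce_b = os.urandom(16)

# Same spec and nonce give the same canary; a new nonce gives a new canary.
print(derive_canary(spec_hash, nonce_a) == derive_canary(spec_hash, nonce_a))
print(derive_canary(spec_hash, nonce_a) != derive_canary(spec_hash, nonce_b))
```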
## Running it
```bash
nyx scan # verifies Medium and High confidence findings
nyx scan --no-verify # static analysis only
nyx scan --verify # explicit form of the default behavior
nyx scan --verify-all-confidence # also verify Low-confidence findings
```
Use `--no-verify` for fast local checks or editor workflows. Keep
verification on for CI when scan time allows it. `--verify-all-confidence` is
slower and noisier; reach for it when tuning payloads or chasing coverage.
## Verdicts
| Status | Meaning |
| --- | --- |
| `Confirmed` | A vulnerable payload fired the sink and every benign control stayed clean. |
| `PartiallyConfirmed` | The sink was reached but no oracle marker was observed. The exploit chain did not complete. Treat as a strong lead, not a proof. |
| `NotConfirmed` | The harness ran but no payload fired. The path is likely infeasible or the corpus does not cover this shape. The original finding stays open until reviewed. |
| `Inconclusive` | Nyx could not finish the check. Carries a typed reason (build failed, spec derivation failed, sandbox error, policy denied, and others). |
| `Unsupported` | Nyx did not attempt the finding. Carries a typed reason (language unsupported, entry kind unsupported, no payloads for cap, confidence below threshold, no sound oracle). |
When a `Confirmed` sink sits behind a recognized input-validation or
output-sanitization guard (Spring `@PreAuthorize`, Express `helmet`, Nest
`@UseGuards`, Django `@permission_classes`), the verdict demotes to
`ConfirmedWithKnownGuard` and the guard names land on
`differential.known_guards`. Authentication-only filters do not trigger the
demotion, since they do not mitigate injection.
`PartiallyConfirmed` is deliberate. It marks cases where the sink was reached but
exploitation was not proven, so engine work can ratchet coverage forward without
the tool overstating what it proved.
## Capability coverage
Caps split into two groups. Data-style injection (SQL, command, path,
SSRF, XSS) uses language-neutral payload bytes (`' OR 1=1--`, `../../etc/passwd`,
a callback URL), so the harness emitter for any language can carry them. The
caps below have language-specific payloads (a Java gadget chain is not a
Python pickle), so each language is curated on its own.
A checkmark means a tuned per-language payload set ships for that cell. Cells
marked `union` in the data-style rows still run, falling back to the
language-neutral payload union.
| Cap | Py | JS | TS | Java | PHP | Ruby | Go | Rust | C | C++ |
| --- | -- | -- | -- | ---- | --- | ---- | -- | ---- | - | --- |
| Command / code injection | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SQL injection | union | union | union | union | union | union | union | ✓ | union | union |
| Path traversal | union | union | union | union | union | union | union | ✓ | union | union |
| SSRF | union | union | union | union | union | union | union | ✓ | union | union |
| XSS | union | union | union | union | union | union | union | ✓ | union | union |
| Format string | | | | | | | | | ✓ | |
| Deserialization | ✓ | | | ✓ | ✓ | ✓ | | | | |
| Template injection | ✓ | ✓ | | ✓ | ✓ | ✓ | | | | |
| XXE | ✓ | | | ✓ | ✓ | ✓ | ✓ | | | |
| LDAP injection | ✓ | | | ✓ | ✓ | | | | | |
| XPath injection | ✓ | ✓ | | ✓ | ✓ | | | | | |
| Header / CRLF | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Open redirect | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Prototype pollution | | ✓ | ✓ | | | | | | | |
| Weak crypto | ✓ | | | ✓ | ✓ | | ✓ | ✓ | | |
| JSON parse abuse | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| IDOR | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | |
| Data exfiltration | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | |
`ENV_VAR`, `SHELL_ESCAPE`, and `URL_ENCODE` are source and sanitizer caps with
no externally observable sink behavior. They route to
`Unsupported(SoundOracleUnavailable)` rather than counting as a missing-payload
gap.
## Framework adapters
Adapters bind a function to its external entry surface so the harness can
drive the real entry point (an HTTP request through the framework, a published
message, a scheduled fire) instead of calling the function in isolation.
Middleware and request validation participate in the verdict that way.
| Language | HTTP routers | Other surfaces |
| --- | --- | --- |
| Python | Flask, Django, FastAPI, Starlette | Jinja2, pickle, LDAP, Celery, Kafka, SQS, Pub/Sub, RabbitMQ, Django Channels, Socket.IO, Django middleware, Django + Flask migrations |
| JavaScript | Express, Koa, NestJS, Fastify | Handlebars, Apollo + Relay GraphQL, lodash.merge + JSON deep-assign, Socket.IO, SQS, Express middleware, Knex + Prisma + Sequelize migrations |
| TypeScript | NestJS | Object.assign + lodash.merge + JSON deep-assign |
| Java | Spring, Quarkus, Micronaut, Jakarta Servlet | Thymeleaf, ObjectInputStream, Spring LDAP, Kafka, SQS, RabbitMQ, Quartz, Spring middleware, Flyway + Liquibase migrations |
| PHP | Laravel, Symfony, CodeIgniter | Twig, unserialize, LDAP, Laravel middleware, Laravel migrations |
| Ruby | Rails, Sinatra, Hanami | ERB, Marshal, Sidekiq, ActionCable, Rails middleware, Rails migrations |
| Go | Gin, Echo, Fiber, Chi | gqlgen GraphQL, NATS, Pub/Sub, go-migrate migrations |
| Rust | Axum, Actix, Rocket, Warp | Juniper GraphQL, Refinery + SQLx migrations |
| C / C++ | none | argv / stdin entry only |
Adapters are sanitizer-aware. An XXE, header-injection, open-redirect, SSTI,
LDAP, XPath, deserialization, crypto, or data-exfil adapter declines the
binding when the surrounding source visibly hardens the call: a parser set to
`disallow-doctype-decl` or `resolve_entities=False`, a value routed through
`LdapEncoder.filterEncode` or `escape_filter_chars`, a weak primitive swapped
for `secrets.token_bytes` or `crypto.randomBytes` or `SecureRandom`, or a
redirect host checked against an allowlist. That cuts adapter false positives
without losing the genuinely dangerous calls.
## Entry points
The verifier knows how to stand up these entry shapes:
`Function`, `HttpRoute`, `CliSubcommand`, `LibraryApi`, `ClassMethod`,
`MessageHandler`, `ScheduledJob`, `GraphQLResolver`, `WebSocket`,
`Middleware`, `Migration`.
`ClassMethod` walks constructor parameters and builds the receiver, preferring
a default constructor and otherwise stubbing dependencies (`MockHttpClient`,
`MockDatabaseConnection`, `MockLogger`) up to a bounded depth. `MessageHandler`
boots an in-sandbox broker stub on loopback and publishes the payload.
`Migration` runs under a database-in-test-mode profile with no real
connection. An entry kind a language emitter does not yet support produces
`Inconclusive(EntryKindUnsupported)` with a hint, never a silent skip.
## Sandbox backends
```bash
nyx scan --backend auto # docker when available, else process (default)
nyx scan --backend docker # require docker
nyx scan --backend process # run on the host with weaker isolation
nyx scan --unsafe-sandbox # alias for --backend process
nyx scan --harden strict # full process-backend lockdown
```
Docker is the preferred backend. It mounts only the entry file's directory and
blocks outbound network by default. Nyx binds a loopback OOB listener at scan
start for callback-style payloads (SSRF, blind SSTI). When the bind succeeds,
Docker switches to bridge networking with a host-gateway route so the harness
can reach the listener; OOB payloads are skipped if the bind fails.
The process backend runs on the host. It is useful for development and
machines without Docker, and it does not provide the same isolation. Hardening
profiles apply to it:
- `standard` (default): no-new-privs plus a memory rlimit on Linux. No
`sandbox-exec` wrap on macOS.
- `strict`: namespace unshare, chroot to the workdir, and a default-deny
seccomp filter on Linux; `sandbox-exec -f <cap>.sb` on macOS. Opt-in,
because interpreted Linux harnesses can SIGSYS until the per-language seccomp
allowlists are widened.
Every sink under test passes through the policy deny rules in
`src/dynamic/policy.rs` before the harness builds. Network egress, writes
outside the sandbox root, and process spawns can be denied per rule, and the
deny decision lands in the trace.
## Performance
Verification adds a harness build and a sandbox run per finding. Two pieces of
infrastructure keep that affordable at corpus scale.
Per-language build pools reuse a warm toolchain across findings instead of
cold-starting one each time. Java runs a long-lived `javac` daemon; Node, PHP,
Ruby, Go, Rust, C, and C++ reuse shared module, package, and object caches;
Python layers a read-only venv with a warmed bytecode cache. The target is a
P50 harness build at or under 200 ms hot and 1.5 s cold, with an OWASP-scale run
finishing within 10 minutes on the dev reference machine.
Copy-on-write workdirs (`clonefile` on macOS, `reflink` or `copy_file_range`
on Linux) replace per-finding file copies, and the worker pool routes findings
into per-cap concurrency lanes so a slow `DESERIALIZE` harness does not block
fast `SSRF` ones.
The CI ship gate holds the ratio of with-verify to static-only wall-clock time
at or under 1.5x on `benches/fixtures/`. If a change pushes the ratio past that,
the gate fails.
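The gate condition is a single ratio check. A minimal sketch, assuming both timings are measured in seconds on the same fixture set (`gate_passes` is a hypothetical name, not a script in the repo):

```python
MAX_RATIO = 1.5  # threshold stated in the text


def gate_passes(static_only_seconds: float,
                with_verify_seconds: float,
                max_ratio: float = MAX_RATIO) -> bool:
    """Return True when the with-verify run is within max_ratio of the static-only run."""
    if static_only_seconds <= 0:
        raise ValueError("static-only time must be positive")
    return with_verify_seconds / static_only_seconds <= max_ratio


print(gate_passes(10.0, 15.0))  # ratio 1.5 -> True
print(gate_passes(10.0, 16.0))  # ratio 1.6 -> False
```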
## Repro artifacts
Confirmed findings write a hermetic bundle under Nyx's platform cache
directory:
```text
<cache-dir>/nyx/dynamic/repro/<spec_hash>/
```
On Linux this is usually `~/.cache/nyx/dynamic/repro/<spec_hash>/`; on macOS
it is usually `~/Library/Caches/nyx/dynamic/repro/<spec_hash>/`.
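A script that needs to locate a bundle could resolve the directory roughly as follows. This is a sketch under assumptions: it covers only the Linux and macOS defaults named above, and honouring `XDG_CACHE_HOME` on Linux is a guess about how the platform cache directory is chosen, not something the text states.

```python
import os
import sys
from pathlib import Path


def repro_bundle_dir(spec_hash: str) -> Path:
    """Best-effort guess at the repro bundle directory for a spec hash."""
    if sys.platform == "darwin":
        cache = Path.home() / "Library" / "Caches"
    else:
        # Assumption: XDG_CACHE_HOME overrides ~/.cache when it is set.
        cache = Path(os.environ.get("XDG_CACHE_HOME") or Path.home() / ".cache")
    return cache / "nyx" / "dynamic" / "repro" / spec_hash


print(repro_bundle_dir("<spec_hash>"))
```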
The bundle carries the harness spec, payload, expected output, trace, and a
`reproduce.sh`. When the toolchain is pinned in `tools/image-builder/images.toml`
it also writes a `docker_pull.sh`.
The easiest replay path starts from the finding id shown in scan output or the
browser UI:
```bash
nyx repro --finding <finding_id>
nyx repro --finding <finding_id> --docker
```
You can also replay an exact bundle by spec hash, or inspect the shell script
directly:
```bash
nyx repro --spec-hash <spec_hash>
cd <cache-dir>/nyx/dynamic/repro/<spec_hash>
./reproduce.sh
./reproduce.sh --docker
```
Use the Docker form when the bundle records a pinned image or when host
toolchains differ from the original run.
## Configuration
```toml
[scanner]
verify = true # run dynamic verification after static analysis
verify_all_confidence = false # include findings below Confidence::Medium
verify_backend = "auto" # auto | docker | process | firecracker
harden_profile = "standard" # standard | strict
```
Set `verify = false` to make scans static-only unless the command line
overrides it. See [Configuration](configuration.md) for the full table.
## Event log
Nyx writes verdict events to:
```text
~/.cache/nyx/dynamic/events.jsonl
```
Each line is a JSON object with a versioned envelope:
```json
{
"schema_version": 1,
"nyx_version": "0.8.0",
"corpus_version": "15",
"kind": "verdict",
"ts": "2026-06-01T18:42:09Z",
"finding_id": "a3b1...",
"spec_hash": "9f4e...",
"lang": "python",
"cap": "SQL_QUERY",
"status": "Confirmed",
"toolchain_id": "python-3.11",
"toolchain_match": "exact",
"duration_ms": 312,
"build_attempts": 1
}
```
The literal `nyx_version` and `corpus_version` values shift between releases;
see `crate::dynamic::telemetry::CORPUS_VERSION` for the active payload-corpus
version your binary writes.
| Field | Meaning |
| --- | --- |
| `schema_version` | Event schema version. Readers reject mismatches. |
| `nyx_version` | Version of the Nyx binary that wrote the event. |
| `corpus_version` | Payload corpus version used for the verdict. |
| `kind` | `verdict` or `rank_delta`. Feedback rows use an `event: "verify_feedback"` field instead. |
| `ts` | Write time in RFC 3339 format. |
| `finding_id` | Stable finding identifier. |
| `spec_hash` | Hash of the harness spec. |
| `lang` | Language slug, or `unknown` when spec derivation failed. |
| `cap` | Sink capability, such as `SQL_QUERY` or `CODE_EXEC`. |
| `status` | `Confirmed`, `PartiallyConfirmed`, `NotConfirmed`, `Inconclusive`, or `Unsupported`. |
| `inconclusive_reason` | Present when `status` is `Inconclusive`. |
If the schema changes, move or delete the old `events.jsonl` before reading it
with the new binary. Programmatic readers should use
`crate::dynamic::telemetry::read_events(path)`.
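For quick inspection outside Rust, the log is plain JSON Lines and can be read with a few lines of Python. The sketch below is not a replacement for `read_events`: the function names are hypothetical, it assumes the envelope fields listed above, and it skips any row whose `kind` is not `verdict` (which includes feedback rows) before checking the schema version.

```python
import json
from collections import Counter
from pathlib import Path

SUPPORTED_SCHEMA = 1  # matches the example envelope above


def read_verdicts(path):
    """Return verdict events from an events.jsonl file, rejecting schema mismatches."""
    events = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            if event.get("kind") != "verdict":
                continue  # rank_delta and feedback rows are ignored here
            version = event.get("schema_version")
            if version != SUPPORTED_SCHEMA:
                raise ValueError(f"line {line_no}: unsupported schema_version {version!r}")
            events.append(event)
    return events


def status_counts(events):
    """Count verdict events per status."""
    return Counter(event["status"] for event in events)


if __name__ == "__main__":
    log = Path.home() / ".cache" / "nyx" / "dynamic" / "events.jsonl"
    if log.exists():
        print(status_counts(read_verdicts(log)))
```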
### Sampling
`[telemetry]` in `nyx.toml` controls event retention:
```toml
[telemetry]
keep_all_confirmed = true
keep_all_inconclusive = true
sample_rate_other = 1.0
```
`sample_rate_other` accepts `0.0` to `1.0` and applies to `NotConfirmed` and
`Unsupported` verdicts. The decision is deterministic for a given `spec_hash`.
Confirmed, Inconclusive, and rank-delta events are always kept by default.
Set `NYX_NO_TELEMETRY=1` to disable event writes.
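One way a sampling decision can be made deterministic per `spec_hash` is to hash the value into a fixed bucket and compare it with the rate. The sketch below illustrates that idea only; the use of SHA-256, the function name, and the treatment of statuses other than `Confirmed` and `Inconclusive` are assumptions, not Nyx's actual scheme.

```python
import hashlib


def keep_event(status: str,
               spec_hash: str,
               sample_rate_other: float = 1.0,
               keep_all_confirmed: bool = True,
               keep_all_inconclusive: bool = True) -> bool:
    """Decide whether to retain a verdict event, deterministically per spec_hash."""
    if status == "Confirmed" and keep_all_confirmed:
        return True
    if status == "Inconclusive" and keep_all_inconclusive:
        return True
    # Map the spec hash to an integer bucket in [0, 2**64) and compare it
    # with the rate scaled to the same range, avoiding float rounding at 1.0.
    digest = hashlib.sha256(spec_hash.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    threshold = int(sample_rate_other * 2**64)
    return bucket < threshold


print(keep_event("NotConfirmed", "9f4e", sample_rate_other=1.0))  # True
print(keep_event("NotConfirmed", "9f4e", sample_rate_other=0.0))  # False
print(keep_event("Confirmed", "9f4e", sample_rate_other=0.0))     # True
```

Because the bucket depends only on `spec_hash`, repeated scans of the same spec make the same keep-or-drop decision at a given rate.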
## Feedback
To record a bad verdict:
```bash
nyx verify-feedback <finding_id> --wrong "reason"
```
Feedback is written to the local event log. Nyx does not upload it.
## Determinism
Every random source is seeded from the spec hash, so two runs of the same spec
produce identical payloads and identical verdicts. `scripts/check_no_unseeded_rand.sh`
audits the tree for unseeded `rand` usage on every CI run.
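The property is easy to see in miniature. This Python sketch (Nyx itself is Rust, and `payload_rng` is a hypothetical name) seeds a generator from a hex spec hash and shows that two generators built from the same hash produce the same sequence.

```python
import random


def payload_rng(spec_hash_hex: str) -> random.Random:
    """Build a random generator seeded only by the spec hash."""
    return random.Random(int(spec_hash_hex, 16))


first = payload_rng("9f4e")
second = payload_rng("9f4e")

# Same seed, same draws: the two sequences are identical.
print([first.randrange(1000) for _ in range(3)] ==
      [second.randrange(1000) for _ in range(3)])  # True
```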
## Limitations
- The harness drives the finding's enclosing entry function when one is
derivable, routing the payload to the tainted parameter, so a guard in the
code around the sink (a merge target built with `Object.create(null)`, an
`ObjectInputStream` subclass whose `resolveClass` enforces an allowlist, a
const-name check before `Marshal.load`) runs first and participates in the
verdict. The build-time choice is recorded on the verify trace as
`entry_invocation` (`mode=entry_function` or `mode=direct_sink`). When no
enclosing entry can be derived the harness falls back to driving the sink
directly, and that fallback can over-confirm a guard it never executes. Read
a `direct_sink` `Confirmed` as "this sink is reachable and fires on attacker
input," not "this exact code path has no in-line mitigation." Framework-level
guards (auth middleware, helmet) are also recognized and demote to
`ConfirmedWithKnownGuard`.
- Per-language payload curation is uneven. Command and code injection ship for
all ten languages; the classic data-style injection caps (SQL, path
traversal, SSRF, XSS) ship a tuned set for Rust and fall back to a
language-neutral payload union elsewhere; the framework-specific caps are
curated for the languages where they occur. The matrix above is the precise
state.
- A `NotConfirmed` verdict is not a clean bill. It means the harness did not
fire, which can be an infeasible path or a corpus that does not cover the
shape. Keep reviewing `NotConfirmed` findings.
- The process backend is weaker isolation than Docker. Use `--backend docker`
or `--harden strict` for untrusted code, and never `--unsafe-sandbox` in CI.
- Real-corpus acceptance rows (OWASP Benchmark, NodeGoat, Juice Shop, and the
polyglot set) self-skip in CI unless the corresponding `NYX_*_CORPUS`
environment variable points at a checkout. They are not vendored into the
repo.
- C and C++ have no framework adapters. Findings in those languages verify
through `argv` and `stdin` entry points only.
## Browser UI
`nyx serve` shows dynamic verdicts on finding detail pages, uses them in
ranking, and can compare verdict changes between saved scans. See
[Output formats](output.md) for the `dynamic_verdict` schema.


@ -6,6 +6,23 @@ If you're going to act on a finding, it helps to know how the scanner got there.
A scan runs in two passes over the file tree, with an optional SQLite index that lets the second scan skip files whose content hash hasn't changed.
```mermaid
flowchart TD
Walk["Walk file tree"] --> Pass1["Pass 1 per file<br/>tree-sitter parse, CFG, SSA"]
Pass1 --> Summaries["Per-function summaries<br/>sources, sinks, sanitizers, returns, points-to"]
Pass1 --> Hierarchy["Type hierarchy index<br/>extends, implements, impl-for, includes"]
Summaries --> Global["GlobalSummaries map<br/>plus optional SQLite cache"]
Hierarchy --> Global
Global --> Pass2["Pass 2 per file<br/>cross-file context"]
Pass2 --> Taint["Forward SSA taint worklist<br/>finite lattice, guaranteed convergence"]
Pass2 --> Calls["Call precision<br/>k=1 inline, summaries, SCC fixed-point"]
Taint --> Findings["Findings with evidence<br/>source, path, sink, engine notes"]
Calls --> Findings
Findings --> Rank["Rank and dedupe<br/>severity, confidence, score"]
Rank --> Verify["Dynamic verification<br/>sandboxed harnesses, verdicts"]
Verify --> Emit["Emit<br/>console, JSON, SARIF, UI"]
```
**Pass 1, per file.** Tree-sitter parses the file. Nyx builds an intra-procedural control-flow graph, lowers it to SSA, and extracts a summary per function describing what that function does at the boundary: which arguments flow to sinks, which sources it reads from, which sinks it calls, what taint it strips, what it returns. Summaries are persisted to SQLite ([`src/summary/`](https://github.com/elicpeter/nyx/tree/master/src/summary/), [`src/database.rs`](https://github.com/elicpeter/nyx/blob/master/src/database.rs)).
**Summary merge.** All per-file summaries get unioned into a global map keyed by qualified function name.
@ -16,7 +33,9 @@ Two extra layers tune precision around calls. **Context-sensitive inlining** (k=
When a method call has a receiver typed as a super-class, trait, or interface, **hierarchy fan-out** widens the resolved callee set to every concrete implementer the engine has seen. A class diagram extracted in pass 1 (Java extends/implements, Rust impl-for, TS/JS extends, Python bases, Ruby includes, PHP extends/implements, C++ inheritance) feeds an index that the call resolver consults during pass 2. The fan-out is capped at 8 implementers per call site; over-fanning is a precision tax, not a soundness issue.
-A separate **field-sensitive points-to** pass tracks abstract locations down to the field level, so `c.mu.Lock()` is a lock on `Field(c, mu)` rather than on `c` as a whole. That distinction is what lets the resource-lifecycle and taint passes tell `obj.field = tainted; sink(obj.other_field)` apart from the conservative whole-variable approximation. Subscript reads and writes (`arr[i]`, `map[k] = v`) lower to synthetic `__index_get__` / `__index_set__` calls so the same container model handles them. Set `NYX_POINTER_ANALYSIS=0` to fall back to the pre-pointer-pass behaviour for one release if you need to compare baselines.
+A separate **field-sensitive points-to** pass tracks abstract locations down to the field level, so `c.mu.Lock()` is a lock on `Field(c, mu)` rather than on `c` as a whole. That distinction is what lets the resource-lifecycle and taint passes tell `obj.field = tainted; sink(obj.other_field)` apart from the conservative whole-variable approximation. Subscript reads and writes (`arr[i]`, `map[k] = v`) lower to synthetic `__index_get__` / `__index_set__` calls so the same container model handles them. Set `NYX_POINTER_ANALYSIS=0` to fall back to the pre-pointer-pass behaviour for baseline comparison.
**Dynamic verification.** After ranking and dedupe, default builds verify Medium and High confidence findings unless `--no-verify` or `scanner.verify = false` is set. The verifier derives a small harness from the finding, runs it in a sandbox against curated payloads, and stores the result on `evidence.dynamic_verdict`. `Confirmed` means a vulnerable payload fired and its benign control stayed clean. `NotConfirmed` means the harness ran but did not fire, not that the finding is closed.
## Optional analyses on top
@ -47,6 +66,6 @@ Findings whose engine notes indicate a bound was hit can be filtered with `--req
## What you get out
-Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.
+Each finding carries the source location, the sink location, the path in between (when symex produced one), the rule ID, severity, attack-surface score, confidence level, dynamic verdict when one was attempted, and a list of engine notes describing any precision loss along the way. Console output is human-readable; JSON and SARIF carry the full evidence object for tooling.
For the JSON shape and SARIF mapping, see [output.md](output.md).


@ -61,7 +61,7 @@ Optional features:
Nyx stores its scanner version in the project's index database. When the binary's version differs from the stored version, the index is wiped on the next scan and rebuilt against the new engine. You'll see one info-level log line:
```
-engine version changed (0.4.0 → 0.5.0), rebuilding index
+engine version changed (<old><new>), rebuilding index
```
No flag needed. If you see this on *every* scan, the metadata row isn't being persisted; file an issue.


@ -9,24 +9,23 @@ The classifications here are grounded in three concrete signals:
1. **Rule depth**: how many distinct source / sanitizer / sink matchers exist
for the language in `src/labels/<lang>.rs`, and how many vulnerability
classes (Cap bits) those matchers cover.
-2. **Benchmark results**: rule-level precision / recall / F1 on the 433-case
+2. **Benchmark results**: rule-level precision / recall / F1 on the synthetic
corpus in
-[`tests/benchmark/RESULTS.md`](https://github.com/elicpeter/nyx/blob/master/tests/benchmark/RESULTS.md),
-last measured 2026-04-29 with scanner version 0.5.0.
+[`tests/benchmark/RESULTS.md`](https://github.com/elicpeter/nyx/blob/master/tests/benchmark/RESULTS.md).
+`RESULTS.md` is the authoritative source for case counts and per-language scores.
3. **Known weak spots**: FPs and FNs the maintainers have deliberately left
in the benchmark rather than suppressed, plus structural engine
-limitations the corpus does not stress, documented release-by-release in
+limitations the corpus does not stress, documented in
[`RESULTS.md`](https://github.com/elicpeter/nyx/blob/master/tests/benchmark/RESULTS.md).
-As of 2026-04-29 the synthetic corpus has effectively saturated: every
-real-CVE fixture fires and rule-level recall is 100%. Nine of ten
-languages report rule-level F1 = 100.0%; Go reports 98.0% on the back of
-a single safe-fixture FP. Aggregate rule-level P=0.995, R=1.000, F1=0.998.
-That means F1 alone no longer differentiates tiers, so the differentiators
-are **rule depth**, **gated-sink coverage**, and **structural idioms the
-corpus does not fully stress** (deep pointer aliasing in C/C++,
-framework-specific context). All parser integrations use tree-sitter and
-are stable; parsing is not a differentiator.
+The synthetic corpus has effectively saturated: every
+real-CVE fixture fires and rule-level precision and recall are both 100%.
+All ten languages report rule-level F1 = 100.0%. Aggregate rule-level
+P=1.000, R=1.000, F1=1.000. That means F1 alone no longer differentiates
+tiers, so the differentiators are **rule depth**, **gated-sink coverage**,
+and **structural idioms the corpus does not fully stress** (deep pointer
+aliasing in C/C++, framework-specific context). All parser integrations
+use tree-sitter and are stable; parsing is not a differentiator.
---
@ -35,7 +34,7 @@ are stable; parsing is not a differentiator.
| Tier | Languages | F1 | What to expect |
|------|-----------|----|----------------|
| **Stable** | Python, JavaScript, TypeScript | 100% | Deep rule sets, gated sinks (argument-role-aware), framework detection, extensive fixtures, and the bulk of advanced-analysis (SSA two-level solve, context-sensitivity, symbolic execution, abstract interpretation) coverage. Safe to depend on in CI gates. |
-| **Beta** | Go, Java, PHP, Ruby, Rust | 98.0% to 100% | Solid mid-depth rule sets with narrower cap coverage and **no gated sinks**. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates. |
+| **Beta** | Go, Java, PHP, Ruby, Rust | 100% | Solid mid-depth rule sets with narrower cap coverage and **no gated sinks**. Cross-file flows work; some idioms (variable-typed method receivers, framework context, string interpolation, match-arm guards) are partially modeled. Usable in CI; review FP/FN lists before tightening gates. |
| **Preview** | C, C++ | 100% on synthetic corpus | Recent work taught the engine to follow taint through `std::vector` / `std::string` / map containers (including `c_str()`), through fluent builder chains like `Socket::builder().host(h).connect()`, and through inline class member functions. Function pointers and deeper pointer aliasing through `*p` / `p->field` are still not tracked. Rule-level scores against a corpus of obvious unsafe-API uses look perfect, but that is not the same as a clean audit on a real codebase. Pair with clang-tidy, Clang Static Analyzer, or Infer. |
---
@ -44,23 +43,25 @@ are stable; parsing is not a differentiator.
### Stable tier ### Stable tier
#### Python: 100% P / 100% R / 100% F1 *(46-case corpus)* #### Python
- **Rule depth**: 5 source families, 7 sanitizer families, 21 sink matchers - **Rule depth**: deep source / sanitizer / sink coverage in
[`src/labels/python.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/python.rs)
spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization. spanning HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- **Framework context**: Flask, Django, argparse source matchers; `flask_request` - **Framework context**: Flask, Django, argparse source matchers; `flask_request`
import-alias support. import-alias support.
- **Advanced analysis**: gated sinks (`Popen`, `subprocess.run/call` with - **Advanced analysis**: gated sinks (`Popen`, `subprocess.run/call` with
activation-arg awareness), most SSA-equivalence and symbolic-execution activation-arg awareness), most SSA-equivalence and symbolic-execution
fixtures target Python. fixtures target Python.
- **Fixtures**: 125 under `tests/fixtures/` plus 42 benchmark cases. - **Fixtures**: extensive `.py` coverage under `tests/fixtures/` plus the benchmark cases.
- **Blind spots**: f-string interpolation is not explicitly modeled as a - **Blind spots**: f-string interpolation is not explicitly modeled as a
distinct taint-producing construct; string-formatting flows are caught by distinct taint-producing construct; string-formatting flows are caught by
the general concatenation path. the general concatenation path.
#### JavaScript: 100% P / 100% R / 100% F1 *(42-case corpus)* #### JavaScript
- **Rule depth**: 3 source families, 10 sanitizer families, 24 sink matchers - **Rule depth**: deep source / sanitizer / sink coverage in
[`src/labels/javascript.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/javascript.rs)
spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O. spanning HTML, URL, JSON, Shell, SQL, Code, SSRF, and File I/O.
- **Advanced analysis**: gated sinks (`setAttribute`, `parseFromString`), - **Advanced analysis**: gated sinks (`setAttribute`, `parseFromString`),
two-level SSA solve for top-level + per-function scopes two-level SSA solve for top-level + per-function scopes
@ -68,15 +69,16 @@ are stable; parsing is not a differentiator.
StringFact, abstract-interpretation interval tracking. StringFact, abstract-interpretation interval tracking.
- **Framework context**: Express, Koa, Fastify (via in-file import scan when - **Framework context**: Express, Koa, Fastify (via in-file import scan when
`package.json` is absent). `package.json` is absent).
- **Fixtures**: 238 under `tests/fixtures/`; the largest fixture set of any - **Fixtures**: the largest `.js` set under `tests/fixtures/` of any
language. language.
- **Blind spots**: template literals are lowered through concatenation rather - **Blind spots**: template literals are lowered through concatenation rather
than modeled as a first-class taint operator; dynamic property access than modeled as a first-class taint operator; dynamic property access
(`obj[user]`) is conservatively treated. (`obj[user]`) is conservatively treated.
#### TypeScript: 100% P / 100% R / 100% F1 *(47-case corpus)* #### TypeScript
- **Rule depth**: Shares the JS ruleset (3 sources, 10 sanitizers, 24 sinks) - **Rule depth**: shares the JS ruleset (see
[`src/labels/typescript.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/typescript.rs))
plus TS-specific grammar handling. plus TS-specific grammar handling.
- **Advanced analysis**: TSX and JSX grammars wired; - **Advanced analysis**: TSX and JSX grammars wired;
discriminated-union narrowing, generic erasure, decorator flow, and discriminated-union narrowing, generic erasure, decorator flow, and
@ -84,28 +86,32 @@ are stable; parsing is not a differentiator.
stressors. stressors.
- **Framework context**: Fastify detection via `detect_in_file_frameworks` - **Framework context**: Fastify detection via `detect_in_file_frameworks`
(import-driven, no `package.json` required). (import-driven, no `package.json` required).
- **Fixtures**: 39 test fixtures plus 42 benchmark cases. - **Fixtures**: dedicated `.ts` / `.tsx` set under `tests/fixtures/` plus the benchmark cases.
- **Blind spots**: `as any` casts and `any`-typed flows are handled - **Blind spots**: `as any` casts and `any`-typed flows are handled
conservatively (treated as tainted). conservatively (treated as tainted).
### Beta tier ### Beta tier
#### Go: 96.2% P / 100.0% R / 98.0% F1 *(53-case corpus, 1 FP, 0 FNs)* #### Go
- **Rule depth**: 4 source families, 4 sanitizer families, 9 sink matchers - **Rule depth**: mid-depth source / sanitizer / sink coverage in
[`src/labels/go.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/go.rs)
covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O. covering HTML, URL, Shell, SQL, SSRF, Crypto, and File I/O.
- **Framework context**: Gin, Echo source matchers. - **Framework context**: Gin, Echo source matchers.
- **Open weak spots**: one safe Go fixture (`go-safe-009`) draws a spurious - **Recent fix**: `strings.ReplaceAll` is now recognised as a CMDi sanitiser
CMDi finding. in chain-wrapper / call-site-replace shapes, clearing the last open
Go safe-fixture FP (`go-safe-009`, `validate(s string)` wrapping a
`strings.ReplaceAll` over `;`).
- **Known gaps**: no gated sinks, no deserialization class. `fmt.Sprintf` - **Known gaps**: no gated sinks, no deserialization class. `fmt.Sprintf`
is deliberately not a sink. Cap coverage is narrower than the Stable is deliberately not a sink. Cap coverage is narrower than the Stable
tier and argument-role-aware sink modeling is not yet implemented for Go, tier and argument-role-aware sink modeling is not yet implemented for Go,
so production CI gates may surface additional FPs the corpus does not so production CI gates may surface additional FPs the corpus does not
exercise. exercise.
#### Java: 100% P / 100% R / 100% F1 *(35-case corpus)* #### Java
- **Rule depth**: 3 source families, 8 sanitizer families, 10 sink matchers - **Rule depth**: mid-depth source / sanitizer / sink coverage in
[`src/labels/java.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/java.rs)
covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization. covering HTML, URL, Shell, SQL, Code, SSRF, and Deserialization.
- **Framework context**: Spring, JPA, Hibernate ORM rules; JNDI injection - **Framework context**: Spring, JPA, Hibernate ORM rules; JNDI injection
sinks. sinks.
@ -115,54 +121,58 @@ are stable; parsing is not a differentiator.
cannot be inferred are conservatively over-tainted on unusual builder cannot be inferred are conservatively over-tainted on unusual builder
chains. chains.
#### PHP: 100% P / 100% R / 100% F1 *(37-case corpus)* #### PHP
- **Rule depth**: 3 source families (`$_GET`, `$_POST`, `$_REQUEST` - **Rule depth**: sources include `$_GET`, `$_POST`, `$_REQUEST`
superglobals), 7 sanitizer families, 10 sink matchers covering HTML, URL, superglobals plus sanitizer / sink matchers in
Shell, SQL, Code, SSRF, File I/O, and Deserialization. [`src/labels/php.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/php.rs)
covering HTML, URL, Shell, SQL, Code, SSRF, File I/O, and Deserialization.
- **Known gaps**: no gated sinks. Limited framework context (Laravel raw - **Known gaps**: no gated sinks. Limited framework context (Laravel raw
methods only). `echo` language-construct detection is wired but its methods only). `echo` language-construct detection is wired but its
inner-argument propagation is narrower than function-call sinks. inner-argument propagation is narrower than function-call sinks.
#### Ruby: 100% P / 100% R / 100% F1 *(39-case corpus)* #### Ruby
- **Rule depth**: 3 source families, 7 sanitizer families, 15 sink matchers - **Rule depth**: source / sanitizer / sink coverage in
covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization. [`src/labels/ruby.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/ruby.rs)
covering HTML, Shell, SQL, Code, SSRF, File I/O, and Deserialization. SSRF
coverage includes `URI.open` and the low-level `OpenURI.open_uri` it
delegates to (the canonical CarrierWave CVE-2021-21288 sink).
Statement-level chained-call wrappers
(`YAML.safe_load(File.read(filename))`, `Marshal.load(File.read(p))`,
`String.new(File.read(x))`) classify the inner sink for cross-function
summary extraction so the outer call does not strip the sink classification
on the helper.
- **Framework context**: Rails helpers (`sanitize_sql`, `permit`, `require`). - **Framework context**: Rails helpers (`sanitize_sql`, `permit`, `require`).
- **Known gaps**: string interpolation inside shell and SQL strings is - **Known gaps**: string interpolation inside shell and SQL strings is
recognized structurally but not modeled as a distinct operator. recognized structurally but not modeled as a distinct operator.
`begin/rescue/ensure` exception-edge wiring is documented as deferred `begin/rescue/ensure` exception-edge wiring is not implemented.
(structurally incompatible with `build_try()`). The previous open
`rb-interproc-001` FN closed in the 2026-04-28 baseline after the
Ruby `Kernel#open` CMDI sink and exact-match sigil work landed.
#### Rust: 100% P / 100% R / 100% F1 *(70-case adversarial corpus)* #### Rust
Rust holds the largest per-language adversarial corpus and was promoted Rust holds the largest per-language adversarial corpus. PathFact-driven
from Experimental to Beta in the 2026-04-25 measurement after the PathFact path-domain narrowing covers the `rs-safe-*` regression set.
landings closed every previously-open `rs-safe-*` regression.
- **Rule depth**: 6 source families, **2** sanitizer families (prefix and - **Rule depth**: source / sanitizer / sink coverage in
type-coercion), 11 sink matchers covering HTML, Shell, SQL, SSRF, [`src/labels/rust.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/rust.rs)
Deserialization, and File I/O. Extensive framework source coverage covering HTML, Shell, SQL, SSRF, Deserialization, and File I/O.
(Axum, Actix, Rocket); the most of any language on the source side. The Extensive framework source coverage (Axum, Actix, Rocket); the most of
narrow sanitizer count is the primary reason Rust is not in the Stable any language on the source side. The narrow sanitizer rule set (prefix
tier. Engine-side path/typed sanitizer recognition (PathFact) compensates, and type-coercion only) is the primary reason Rust is not in the Stable
but the ruleset itself is shallow. tier. Engine-side path/typed sanitizer recognition (PathFact)
- **Recent additions**: SQL class (`rusqlite`, `sqlx`, `diesel`, compensates, but the ruleset itself is shallow.
`postgres`), Deserialization class (`serde_yaml`, `bincode`, - **Coverage**: SQL class (`rusqlite`, `sqlx`, `diesel`, `postgres`),
`rmp_serde`, `ciborium`, `ron`, `toml`), expanded file I/O Deserialization class (`serde_yaml`, `bincode`, `rmp_serde`, `ciborium`,
(`fs::remove_file/dir/rename/copy`), `reqwest` SSRF builder chain. `ron`, `toml`), file I/O (`fs::remove_file/dir/rename/copy`), and the
- **Closed by recent PathFact landings** `reqwest` SSRF builder chain.
(`src/abstract_interp/path_domain.rs` + per-return-path PathFact entries - **PathFact-narrowed shapes** (`src/abstract_interp/path_domain.rs` plus
on `SsaFuncSummary`): `rs-safe-007` (`.replace("..","")` sanitiser), per-return-path PathFact entries on `SsaFuncSummary`) cover
`rs-safe-008` (negative-validation return), `rs-safe-009` (match-arm `.replace("..","")` sanitisers, negative-validation returns, match-arm
guards via condition lifting), `rs-safe-010` (static-map lookup), guards via condition lifting, static-map lookups,
`rs-safe-012` (`.contains("..")` + `.starts_with('/')` rejection), `.contains("..")` + `.starts_with('/')` rejection, Option-returning
`rs-safe-014` (Option-returning user sanitiser), `rs-safe-015` user sanitisers, `Path::new(p).is_absolute()` typed rejection,
(`Path::new(p).is_absolute()` typed rejection), `rs-safe-016` cross-function `.contains("..")` rejection, and the
(cross-function `.contains("..")` rejection), and CVE patches `CVE-2018-20997` / `CVE-2022-36113` / `CVE-2024-24576` patch shapes.
`CVE-2018-20997`, `CVE-2022-36113`, `CVE-2024-24576`.
- **Not yet covered**: unsafe FFI / `std::mem::transmute` (no rules), Tokio - **Not yet covered**: unsafe FFI / `std::mem::transmute` (no rules), Tokio
`process::Command` async variants (not distinguished from sync), `process::Command` async variants (not distinguished from sync),
`hyper` / `surf` / `ureq` SSRF clients (reqwest family only). `hyper` / `surf` / `ureq` SSRF clients (reqwest family only).
@ -170,17 +180,16 @@ landings closed every previously-open `rs-safe-*` regression.
### Preview tier ### Preview tier
C and C++ remain **Preview** despite reporting 100% rule-level F1 on the C and C++ remain **Preview** despite reporting 100% rule-level F1 on the
synthetic corpus. A run of additions in late April taught the engine to synthetic corpus. The engine follows taint through STL containers, builder
follow taint through several constructs that used to be hard cutoffs (STL chains, inline member functions, and the wider `std::sto*` family, so the
containers, builder chains, inline member functions, the wider `std::sto*` gap between "passes the synthetic corpus" and "would catch the same flow
family), so the gap between "passes the synthetic corpus" and "would catch on a real codebase" is narrower than the synthetic numbers suggest. It is
the same flow on a real codebase" is narrower than it used to be. It is not not zero. The biggest remaining gaps are deep pointer aliasing and function
zero. The biggest remaining gaps are deep pointer aliasing and function
pointers, both of which are pervasive in real C/C++ code. Treat a clean pointers, both of which are pervasive in real C/C++ code. Treat a clean
report as a starting point, not an audit. Pair Nyx with clang-tidy, the report as a starting point, not an audit. Pair Nyx with clang-tidy, the
Clang Static Analyzer, or Infer for production use. Clang Static Analyzer, or Infer for production use.
**What now works** (added in late April): **What works:**
- STL container flow. `vec.push_back(tainted)` followed by - STL container flow. `vec.push_back(tainted)` followed by
`vec.front().c_str()` carries taint into a downstream `system()` sink. `vec.front().c_str()` carries taint into a downstream `system()` sink.
@ -216,24 +225,26 @@ Clang Static Analyzer, or Infer for production use.
`void (*fn)(char *)` resolves to no callee, so cross-pointer flows are `void (*fn)(char *)` resolves to no callee, so cross-pointer flows are
invisible. invisible.
- Array-element taint by index. Writes to `buf[i]` do not always propagate - Array-element taint by index. Writes to `buf[i]` do not always propagate
taint to `buf` as a whole; the recent subscript-handling work helps the taint to `buf` as a whole; subscript-handling helps the general case but
general case but doesn't make `buf` an alias for every element. doesn't make `buf` an alias for every element.
- Nested classes beyond one level (C++ only). - Nested classes beyond one level (C++ only).
#### C: 100% P / 100% R / 100% F1 *(30-case corpus)* #### C
- **Rule depth**: 3 source families, **2** sanitizer families (the - **Rule depth**: source / sanitizer / sink coverage in
`sanitize_*` prefix and numeric-parse functions), 5 sink matchers spanning [`src/labels/c.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/c.rs).
Shell, File, SSRF, and Format-String. Sanitizers are limited to the `sanitize_*` prefix and numeric-parse
functions; sinks span Shell, File, SSRF, and Format-String.
- **Known gaps**: no framework rules, no gated sinks. The structural - **Known gaps**: no framework rules, no gated sinks. The structural
limitations listed above are the dominant concern; rule additions alone limitations listed above are the dominant concern; rule additions alone
will not lift this language out of the Preview tier. will not lift this language out of the Preview tier.
#### C++: 100% P / 100% R / 100% F1 *(33-case corpus, plus 6 new fixtures for STL / builder / inline-method flows)* #### C++
- **Rule depth**: Builds on the C ruleset with `std::cin` / `std::getline` - **Rule depth**: builds on the C ruleset (see
sources and a wider numeric-sanitizer set covering the full `std::sto*` [`src/labels/cpp.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/cpp.rs))
family (3 sources, 3 sanitizer families, 5 sinks). with `std::cin` / `std::getline` sources and a wider numeric-sanitizer
set covering the full `std::sto*` family.
- **Known gaps**: still no framework rules and no gated sinks. The - **Known gaps**: still no framework rules and no gated sinks. The
structural blind spots are now narrower than they were a release ago structural blind spots are now narrower than they were a release ago
(see "What now works" above), but function pointers and the harder (see "What now works" above), but function pointers and the harder
@ -269,9 +280,8 @@ have moved out of the blind-spot list. Synthetic-corpus F1 is not a
reliable signal for Preview-tier languages: a clean report can coexist reliable signal for Preview-tier languages: a clean report can coexist
with structural gaps. with structural gaps.
(The previous **Experimental** tier was retired in the 2026-04-25 (No language currently sits in the **Experimental** tier; it is reserved
measurement when Rust's adversarial corpus reached 100% F1; no language for future additions whose corpus has not yet stabilised.)
currently sits in that tier.)
--- ---

69
docs/mermaid-init.js Normal file

@ -0,0 +1,69 @@
(function () {
const MERMAID_URL =
"https://cdn.jsdelivr.net/npm/mermaid@10.9.3/dist/mermaid.esm.min.mjs";
async function renderMermaid() {
const blocks = Array.from(
document.querySelectorAll("pre > code.language-mermaid"),
);
if (blocks.length === 0) {
return;
}
try {
const mermaidModule = await import(MERMAID_URL);
const mermaid = mermaidModule.default;
mermaid.initialize({
startOnLoad: false,
securityLevel: "strict",
theme: "base",
themeVariables: {
background: "transparent",
fontFamily:
"Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, Segoe UI, sans-serif",
primaryColor: "#0f172a",
primaryTextColor: "#e5e7eb",
primaryBorderColor: "#22d3ee",
secondaryColor: "#134e4a",
secondaryTextColor: "#e5e7eb",
secondaryBorderColor: "#2dd4bf",
tertiaryColor: "#1e293b",
tertiaryTextColor: "#e5e7eb",
tertiaryBorderColor: "#64748b",
lineColor: "#94a3b8",
edgeLabelBackground: "#0f172a",
clusterBkg: "#111827",
clusterBorder: "#475569",
},
});
for (const block of blocks) {
const pre = block.parentElement;
if (!pre) {
continue;
}
const wrapper = document.createElement("div");
wrapper.className = "nyx-mermaid";
const diagram = document.createElement("div");
diagram.className = "mermaid";
diagram.textContent = block.textContent.trim();
wrapper.appendChild(diagram);
pre.replaceWith(wrapper);
}
await mermaid.run({ querySelector: ".nyx-mermaid .mermaid" });
} catch (error) {
console.warn("Mermaid rendering failed", error);
}
}
if (document.readyState === "loading") {
document.addEventListener("DOMContentLoaded", renderMermaid);
} else {
renderMermaid();
}
})();

15
docs/mermaid.css Normal file

@ -0,0 +1,15 @@
.nyx-mermaid {
margin: 1.5rem 0;
padding: 1rem;
overflow-x: auto;
border: 1px solid rgba(148, 163, 184, 0.35);
border-radius: 8px;
background: rgba(15, 23, 42, 0.28);
}
.nyx-mermaid svg {
display: block;
max-width: 100%;
height: auto;
margin: 0 auto;
}


@ -19,9 +19,9 @@ Human-readable, color-coded output to stdout. Status messages go to stderr.
| Tag | Color | Meaning | | Tag | Color | Meaning |
|-----|-------|---------| |-----|-------|---------|
| `[HIGH]` | Red, bold | Critical -- likely exploitable | | `[HIGH]` | Red, bold | Critical, likely exploitable |
| `[MEDIUM]` | Orange, bold | Important -- may be exploitable | | `[MEDIUM]` | Orange, bold | Important, may be exploitable |
| `[LOW]` | Muted blue-gray | Informational -- code quality or weak signal | | `[LOW]` | Muted blue-gray | Informational: code quality or weak signal |
### Evidence fields ### Evidence fields
@ -69,48 +69,71 @@ Use --include-quality, --max-low, or --all to adjust.
## JSON ## JSON
Machine-readable JSON array. Each finding is an object: Machine-readable JSON object. The main keys are:
| Key | Type | Description |
|-----|------|-------------|
| `findings` | array | Finding objects |
| `chains` | array | Composed exploit chains, when emitted |
| `dynamic_verification` | object | Count of attached dynamic verdicts |
| `verdict_diff` | object | Baseline comparison, only when `--baseline` is used |
```json ```json
[ {
{ "findings": [
"path": "src/handler.rs", {
"line": 12, "path": "src/handler.rs",
"col": 5, "line": 12,
"severity": "High", "col": 5,
"id": "taint-unsanitised-flow (source 5:11)", "severity": "High",
"path_validated": false, "id": "taint-unsanitised-flow (source 5:11)",
"labels": [ "path_validated": false,
["Source", "env::var(\"CMD\") at 5:11"], "labels": [
["Sink", "Command::new(\"sh\").arg(\"-c\")"] ["Source", "env::var(\"CMD\") at 5:11"],
], ["Sink", "Command::new(\"sh\").arg(\"-c\")"]
"confidence": "High", ],
"evidence": { "confidence": "High",
"source": { "evidence": {
"path": "src/handler.rs", "source": {
"line": 5, "path": "src/handler.rs",
"col": 11, "line": 5,
"kind": "source", "col": 11,
"snippet": "env::var(\"CMD\")" "kind": "source",
"snippet": "env::var(\"CMD\")"
},
"sink": {
"path": "src/handler.rs",
"line": 12,
"col": 5,
"kind": "sink",
"snippet": "Command::new(\"sh\")"
},
"notes": ["source_kind:EnvironmentConfig"],
"dynamic_verdict": {
"finding_id": "a3b12f0c91e04420",
"status": "Confirmed",
"triggered_payload": "cmdi-echo-marker"
}
}, },
"sink": { "rank_score": 76.0,
"path": "src/handler.rs", "rank_reason": [
"line": 12, ["severity_base", "60"],
"col": 5, ["analysis_kind", "10"],
"kind": "sink", ["source_kind", "5"],
"snippet": "Command::new(\"sh\")" ["evidence_count", "1"]
}, ]
"notes": ["source_kind:EnvironmentConfig"] }
}, ],
"rank_score": 76.0, "chains": [],
"rank_reason": [ "dynamic_verification": {
["severity_base", "60"], "total": 1,
["analysis_kind", "10"], "confirmed": 1,
["source_kind", "5"], "partially_confirmed": 0,
["evidence_count", "1"] "not_confirmed": 0,
] "inconclusive": 0,
"unsupported": 0
} }
] }
``` ```
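
Because this change moves the report from a bare array to an object with a `findings` key, downstream scripts may want to accept both shapes during a transition. A minimal sketch, assuming the report has already been read into a string; `load_findings` is a hypothetical helper name, not part of Nyx:

```python
import json


def load_findings(text: str) -> list[dict]:
    """Return the finding objects from a JSON report.

    Accepts the object shape ({"findings": [...], ...}) as well as a bare
    array of findings, which is what the earlier format emitted.
    """
    report = json.loads(text)
    if isinstance(report, list):
        return report
    return report.get("findings", [])


# Trimmed sample in the object shape; only a few finding fields are shown.
sample = '{"findings": [{"path": "src/handler.rs", "line": 12, "severity": "High"}], "chains": []}'
for finding in load_findings(sample):
    print(f'{finding["path"]}:{finding["line"]} [{finding["severity"]}]')
```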
### Field descriptions ### Field descriptions
@ -132,6 +155,7 @@ Machine-readable JSON array. Each finding is an object:
| `rank_score` | float | no | Attack-surface score (omitted when ranking disabled) | | `rank_score` | float | no | Attack-surface score (omitted when ranking disabled) |
| `rank_reason` | array | no | Score breakdown (omitted when ranking disabled) | | `rank_reason` | array | no | Score breakdown (omitted when ranking disabled) |
| `rollup` | object | no | Rollup data when findings are grouped (see below) | | `rollup` | object | no | Rollup data when findings are grouped (see below) |
| `chain_member_of` | int | no | Stable hash of the emitted chain this finding belongs to |
Fields marked "no" are omitted when empty/null/false to keep output compact. Fields marked "no" are omitted when empty/null/false to keep output compact.
@ -139,9 +163,9 @@ Fields marked "no" are omitted when empty/null/false to keep output compact.
| Level | Meaning | | Level | Meaning |
|-------|---------| |-------|---------|
| `High` | Strong signal -- taint-confirmed flow, definite state violation | | `High` | Strong signal: taint-confirmed flow, definite state violation |
| `Medium` | Moderate signal -- resource leak, path-validated taint, CFG structural | | `Medium` | Moderate signal: resource leak, path-validated taint, CFG structural |
| `Low` | Weak signal -- AST pattern match, possible resource leak, degraded analysis | | `Low` | Weak signal: AST pattern match, possible resource leak, degraded analysis |
### Evidence object ### Evidence object
@ -155,9 +179,40 @@ The `evidence` field provides structured provenance data:
| `sanitizers` | array | Sanitizer spans | | `sanitizers` | array | Sanitizer spans |
| `state` | object | State-machine evidence (machine, subject, from_state, to_state) | | `state` | object | State-machine evidence (machine, subject, from_state, to_state) |
| `notes` | array | Free-form notes (e.g. `"source_kind:UserInput"`, `"path_validated"`) | | `notes` | array | Free-form notes (e.g. `"source_kind:UserInput"`, `"path_validated"`) |
| `dynamic_verdict` | object | Dynamic verification result, when verification ran or was skipped for a typed reason |
All fields are omitted when empty/null. All fields are omitted when empty/null.
### Dynamic verdict object
`evidence.dynamic_verdict` uses this shape:
| Field | Type | Description |
|-------|------|-------------|
| `finding_id` | string | Stable 16-character hex finding id |
| `status` | string | `Confirmed`, `PartiallyConfirmed`, `NotConfirmed`, `Inconclusive`, or `Unsupported` |
| `triggered_payload` | string | Payload label for `Confirmed` verdicts |
| `reason` | object/string | Typed reason for `Unsupported` |
| `inconclusive_reason` | object/string | Typed reason for `Inconclusive` |
| `detail` | string | Extra build, sandbox, or policy detail |
| `attempts` | array | Per-payload attempt summaries |
| `toolchain_match` | string | `exact` or `drift` |
| `differential` | object | Vulnerable versus benign control result, when both ran |
| `hardening_outcome` | object | Process-backend hardening result, when recorded |
The top-level `dynamic_verification` object counts verdict statuses across the emitted findings:
```json
{
"total": 4,
"confirmed": 2,
"partially_confirmed": 0,
"not_confirmed": 1,
"inconclusive": 0,
"unsupported": 1
}
```
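
The relationship between per-finding verdicts and the summary counts can be sketched as a simple tally. This is an illustration only: it assumes `total` equals the number of findings that carry a verdict (the table above describes it as a count of attached verdicts), and the mapping from status strings to summary keys is inferred from the names shown, not taken from the Nyx source.

```python
from collections import Counter

# Assumed mapping from verdict status strings to summary keys.
STATUS_KEYS = {
    "Confirmed": "confirmed",
    "PartiallyConfirmed": "partially_confirmed",
    "NotConfirmed": "not_confirmed",
    "Inconclusive": "inconclusive",
    "Unsupported": "unsupported",
}


def summarize_verdicts(findings: list[dict]) -> dict[str, int]:
    """Count dynamic verdict statuses across a list of finding objects."""
    counts = Counter()
    for finding in findings:
        verdict = finding.get("evidence", {}).get("dynamic_verdict")
        if verdict is None:
            continue  # no verdict attached to this finding
        key = STATUS_KEYS.get(verdict.get("status"))
        if key is not None:
            counts[key] += 1
    summary = {key: counts[key] for key in STATUS_KEYS.values()}
    return {"total": sum(summary.values()), **summary}


findings = [
    {"evidence": {"dynamic_verdict": {"status": "Confirmed"}}},
    {"evidence": {"dynamic_verdict": {"status": "Unsupported"}}},
    {"evidence": {}},  # static-only finding
]
print(summarize_verdicts(findings))
```

Recomputing the tally from `findings` and comparing it against the emitted `dynamic_verification` object is a cheap consistency check for a consumer that filters findings before reporting.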
### Rollup object ### Rollup object
When a finding is a rollup (grouped from multiple occurrences), the `rollup` field is present: When a finding is a rollup (grouped from multiple occurrences), the `rollup` field is present:
@ -192,12 +247,13 @@ nyx scan . --format sarif > results.sarif
The SARIF output includes: The SARIF output includes:
- **Tool metadata** -- Nyx name and version - **Tool metadata**: Nyx name and version
- **Rules** -- Rule ID, description, severity mapping - **Rules**: Rule ID, description, severity mapping
- **Results** -- One result per finding with location, message, and properties - **Results**: One result per finding with location, message, and properties
- **Properties** -- Each result includes `category` and optionally `confidence` and `rollup.count` - **Properties**: Each result includes `category` and optionally `confidence`, `rollup.count`, and `nyx_dynamic_verdict`
- **Related locations** -- Rollup findings include example locations in `relatedLocations` - **Fingerprints**: Dynamic verdict status is added as `partialFingerprints.dynamic_verdict_status` when present
- **Artifacts** -- File paths referenced by findings - **Related locations**: Rollup findings include example locations in `relatedLocations`
- **Artifacts**: File paths referenced by findings
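
A consumer can pull the verdict status back out of a SARIF file by walking the standard `runs[].results[]` structure. The sketch below is hedged: it relies only on the `partialFingerprints.dynamic_verdict_status` key named above, and the sample value `"Confirmed"` is an assumption about how the status is spelled in SARIF output.

```python
import json


def verdict_statuses(sarif_text: str) -> list[tuple[str, str]]:
    """Return (ruleId, verdict status) pairs for results that carry a status."""
    sarif = json.loads(sarif_text)
    pairs = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            fingerprints = result.get("partialFingerprints", {})
            status = fingerprints.get("dynamic_verdict_status")
            if status is not None:
                pairs.append((result.get("ruleId", ""), status))
    return pairs


# Minimal hand-written SARIF fragment; real output carries many more fields.
sample_sarif = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "results": [
            {"ruleId": "taint-unsanitised-flow",
             "partialFingerprints": {"dynamic_verdict_status": "Confirmed"}},
            {"ruleId": "taint-unsanitised-flow"},  # no verdict attached
        ],
    }],
})
print(verdict_statuses(sample_sarif))
```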
### GitHub Code Scanning integration ### GitHub Code Scanning integration
@ -219,9 +275,29 @@ The SARIF output includes:
|------|---------| |------|---------|
| `0` | Scan completed successfully; no findings matched `--fail-on` threshold | | `0` | Scan completed successfully; no findings matched `--fail-on` threshold |
| `1` | `--fail-on` threshold breached (at least one finding meets or exceeds the specified severity) | | `1` | `--fail-on` threshold breached (at least one finding meets or exceeds the specified severity) |
| Non-zero | Error (I/O, config, database, parse error) | | `2` | `--gate` policy tripped (e.g. `no-new-confirmed` saw a new Confirmed finding, or `resolve-all-confirmed` saw a previously Confirmed finding still open) |
| Other non-zero | Error (I/O, config, database, parse error) |
Without `--fail-on`, Nyx always exits `0` on a successful scan regardless of findings count. Without `--fail-on` or `--gate`, Nyx always exits `0` on a successful scan regardless of findings count.
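
A CI wrapper can branch on these codes instead of treating every non-zero exit as the same failure. A minimal sketch of the mapping in the table above; `describe_exit` is a hypothetical helper and the returned labels are illustrative:

```python
def describe_exit(code: int) -> str:
    """Map a scan exit code to the meaning given in the exit-code table."""
    if code == 0:
        return "ok"
    if code == 1:
        return "fail-on threshold breached"
    if code == 2:
        return "gate policy tripped"
    return "error"  # any other non-zero code: I/O, config, database, parse error


for code in (0, 1, 2, 101):
    print(code, describe_exit(code))
```

In practice the code would come from something like `subprocess.run(...).returncode`; separating `1` and `2` from other non-zero values lets a pipeline distinguish policy failures from a scan that did not complete.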
---
## Repository Triage
`nyx scan` and `nyx serve` share `.nyx/triage.json` in the scan root. The file
uses portable fingerprints so committed triage decisions survive different
checkout paths in local runs and CI.
When the file exists, CLI scans apply it automatically:
- `open` and `investigating` findings remain active.
- `false_positive`, `accepted_risk`, `suppressed`, and `fixed` findings are
excluded from output and `--fail-on` checks by default.
- `--show-suppressed` includes terminal triage findings and emits
`triage_state` plus `triage_note` when present.
`nyx serve` continues to read and write the same file when triage sync is
enabled, so browser triage and CI gating use the same decisions.
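
The filtering rules above can be sketched as follows. This illustrates the behaviour only: the on-disk schema of `.nyx/triage.json` is not shown here, so the `triage` mapping (fingerprint to state), the `fingerprint` key on findings, and the choice to treat untriaged findings as `open` are all assumptions.

```python
ACTIVE_STATES = {"open", "investigating"}
TERMINAL_STATES = {"false_positive", "accepted_risk", "suppressed", "fixed"}


def visible_findings(findings, triage, show_suppressed=False):
    """Apply triage decisions to a list of findings.

    `triage` is a hypothetical {fingerprint: state} mapping. Findings in a
    terminal state are dropped unless `show_suppressed` is set, in which
    case they are kept and annotated with `triage_state`.
    """
    kept = []
    for finding in findings:
        # Assumption: a finding with no triage entry counts as "open".
        state = triage.get(finding["fingerprint"], "open")
        if state in TERMINAL_STATES:
            if not show_suppressed:
                continue
            finding = {**finding, "triage_state": state}
        kept.append(finding)
    return kept


findings = [{"fingerprint": "aaa"}, {"fingerprint": "bbb"}, {"fingerprint": "ccc"}]
triage = {"aaa": "investigating", "bbb": "false_positive"}
print(visible_findings(findings, triage))
print(visible_findings(findings, triage, show_suppressed=True))
```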
--- ---
@ -229,9 +305,9 @@ Without `--fail-on`, Nyx always exits `0` on a successful scan regardless of fin
| Level | Description | Typical rules | | Level | Description | Typical rules |
|-------|-------------|---------------| |-------|-------------|---------------|
| **High** | Critical vulnerabilities -- likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources | | **High** | Critical vulnerabilities, likely exploitable | Command injection, unsafe deserialization, banned C functions, taint-confirmed flows with user input sources |
| **Medium** | Important issues -- may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks | | **Medium** | Important issues, may be exploitable with additional context | SQL concatenation, XSS sinks, reflection, unguarded sinks, resource leaks |
| **Low** | Informational -- code quality or weak signals | Weak crypto algorithms, insecure randomness, `unwrap()`/`panic!()`, type-safety escapes | | **Low** | Informational: code quality or weak signals | Weak crypto algorithms, insecure randomness, `unwrap()`/`panic!()`, type-safety escapes |
### Non-production severity downgrade ### Non-production severity downgrade
@ -260,13 +336,13 @@ Suppress specific findings directly in source code using `nyx:ignore` comments.
### Directive forms ### Directive forms
```python ```python
x = dangerous() # nyx:ignore taint-unsanitised-flow ← suppresses this line x = dangerous() # nyx:ignore taint-unsanitised-flow (suppresses this line)
# nyx:ignore-next-line taint-unsanitised-flow # nyx:ignore-next-line taint-unsanitised-flow
x = dangerous() ← suppresses this line x = dangerous() (suppressed by the comment above)
``` ```
- `nyx:ignore <RULE_ID>` -- suppresses findings on the **same line** as the comment. - `nyx:ignore <RULE_ID>`: suppresses findings on the **same line** as the comment.
- `nyx:ignore-next-line <RULE_ID>` -- suppresses findings on the **next line**. - `nyx:ignore-next-line <RULE_ID>`: suppresses findings on the **next line**.
- For taint findings, the primary line is the **sink line** (the `line` field in output). - For taint findings, the primary line is the **sink line** (the `line` field in output).
### Rule ID matching ### Rule ID matching


@ -6,11 +6,11 @@ After `cargo install nyx-scanner` (or dropping a release binary on your PATH), p
nyx scan ./my-project nyx scan ./my-project
``` ```
First run builds a SQLite index under `.nyx/`; later runs skip files whose content hash hasn't changed. First run builds a SQLite index under `.nyx/`; later runs skip files whose content hash hasn't changed. Default builds also verify Medium and High confidence findings in a sandbox. Use `--no-verify` when you want a static-only local loop.
## What a finding looks like ## What a finding looks like
<p align="center"><img src="../assets/screenshots/cli-scan.png" alt="nyx scan output: HIGH taint flows from req.params.user, req.query.url, and req.query.path into exec/fetch/fs.readFileSync, framed by the brand purple gradient" width="900"/></p> <p align="center"><img src="assets/screenshots/cli-scan.png" alt="nyx scan output: HIGH taint flows from req.params.user, req.query.url, and req.query.path into exec/fetch/fs.readFileSync, framed by the brand mint-cyan gradient" width="900"/></p>
The same scan in console form: The same scan in console form:
@ -21,6 +21,7 @@ The same scan in console form:
Source: request.args.get (5:11) Source: request.args.get (5:11)
Sink: os.system Sink: os.system
[DYN: confirmed via cmdi-echo-marker-python]
6:5 ✖ [HIGH] py.cmdi.os_system (Score: 64, Confidence: High) 6:5 ✖ [HIGH] py.cmdi.os_system (Score: 64, Confidence: High)
os.system() runs a shell command os.system() runs a shell command
@ -31,12 +32,15 @@ The same scan in console form:
Source: req.query.content (3:18) Source: req.query.content (3:18)
Sink: document.write Sink: document.write
[DYN: confirmed via xss-script-marker]
5:5 ⚠ [MEDIUM] js.xss.document_write (Score: 34, Confidence: High) 5:5 ⚠ [MEDIUM] js.xss.document_write (Score: 34, Confidence: High)
document.write() is an XSS sink document.write() is an XSS sink
Dynamic verification: 4 verdicts (2 confirmed, 0 partially confirmed, 1 not confirmed, 0 inconclusive, 1 unsupported)
warning 'demo' generated 10 issues. warning 'demo' generated 10 issues.
Finished in 0.054s. Finished in 1.842s.
``` ```
Each finding is one line of header plus evidence. Fields that matter: Each finding is one line of header plus evidence. Fields that matter:
@ -48,6 +52,7 @@ Each finding is one line of header plus evidence. Fields that matter:
| Score | Attack-surface ranking (severity + analysis kind + source kind + evidence). Higher is more exploitable | | Score | Attack-surface ranking (severity + analysis kind + source kind + evidence). Higher is more exploitable |
| Confidence | `High`, `Medium`, `Low`. Drops for AST-only matches, capped widened flows, and lowered-to-Low backwards-infeasible findings | | Confidence | `High`, `Medium`, `Low`. Drops for AST-only matches, capped widened flows, and lowered-to-Low backwards-infeasible findings |
| Source / Sink | Where tainted data entered and where the dangerous call happened | | Source / Sink | Where tainted data entered and where the dangerous call happened |
| `[DYN: ...]` | Dynamic verifier result, when Nyx built and ran a harness for the finding |
Two rules firing on the same line (the taint finding plus the AST pattern) is normal. The pattern matches the structural presence of `document.write`; the taint rule adds the evidence that `req.query.content` actually reached it. Both carry distinct rule IDs so suppressions can target one without the other. Two rules firing on the same line (the taint finding plus the AST pattern) is normal. The pattern matches the structural presence of `document.write`; the taint rule adds the evidence that `req.query.content` actually reached it. Both carry distinct rule IDs so suppressions can target one without the other.
@ -85,14 +90,17 @@ nyx scan . --require-converged
`--require-converged` keeps `under-report` findings (the emitted flow is still real) but drops over-reports and widenings. Intended for strict gates where a noisy finding is worse than nothing. `--require-converged` keeps `under-report` findings (the emitted flow is still real) but drops over-reports and widenings. Intended for strict gates where a noisy finding is worse than nothing.
## Skip dataflow for a fast first pass ## Skip work for a fast first pass
```bash ```bash
nyx scan . --mode ast nyx scan . --mode ast
nyx scan . --no-verify
``` ```
AST-only mode runs tree-sitter patterns without building a CFG or running taint. It's fast and still catches banned-API uses, weak crypto, and obvious XSS sinks, but it can't tell `eval("1+1")` apart from `eval(userInput)`. Use it as a pre-commit filter, not as a CI gate replacement. AST-only mode runs tree-sitter patterns without building a CFG or running taint. It's fast and still catches banned-API uses, weak crypto, and obvious XSS sinks, but it can't tell `eval("1+1")` apart from `eval(userInput)`. Use it as a pre-commit filter, not as a CI gate replacement.
`--no-verify` keeps the static engine on but skips sandboxed execution. Use it when you are iterating locally and only need the analyzer result.
## Next ## Next
- [CLI reference](cli.md) for every flag and subcommand. - [CLI reference](cli.md) for every flag and subcommand.


@ -1,12 +1,12 @@
# Rule reference # Rule reference
Every finding Nyx emits has a rule ID. This page enumerates the IDs that ship with scanner 0.5.0, grouped by family. Every finding Nyx emits has a rule ID. This page enumerates the IDs that ship with the scanner, grouped by family.
> This page is written by hand and drifts against the code. Authoritative sources: [`src/patterns/<lang>.rs`](https://github.com/elicpeter/nyx/tree/master/src/patterns) for AST patterns, [`src/labels/<lang>.rs`](https://github.com/elicpeter/nyx/tree/master/src/labels) for taint matchers, and [`src/auth_analysis/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/config.rs) for auth rules. If a rule fires that isn't listed here, the source file is right and this page is wrong. > This page is written by hand and drifts against the code. Authoritative sources: [`src/patterns/<lang>.rs`](https://github.com/elicpeter/nyx/tree/master/src/patterns) for AST patterns, [`src/labels/<lang>.rs`](https://github.com/elicpeter/nyx/tree/master/src/labels) for taint matchers, and [`src/auth_analysis/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/auth_analysis/config.rs) for auth rules. If a rule fires that isn't listed here, the source file is right and this page is wrong.
If you'd rather browse rules interactively, [`nyx serve`](serve.md) ships a Rules page that lists every loaded matcher with its language, kind, and capability: If you'd rather browse rules interactively, [`nyx serve`](serve.md) ships a Rules page that lists every loaded matcher with its language, kind, and capability:
<p align="center"><img src="../assets/screenshots/docs/serve-rules.png" alt="Nyx Rules page: filterable list of 218 rules with language, kind (SOURCE/SANITIZER/SINK), capability, and finding count columns" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-rules.png" alt="Nyx Rules page: filterable list of 218 rules with language, kind (SOURCE/SANITIZER/SINK), capability, and finding count columns" width="900"/></p>
## ID format ## ID format
@ -24,13 +24,22 @@ Language prefixes: `rs`, `c`, `cpp`, `go`, `java`, `js`, `ts`, `py`, `php`, `rb`
### Taint ### Taint
One rule covers every source-to-sink flow. The parenthetical identifies the source location. The taint family is split into cap-specific rule classes. The `taint-unsanitised-flow` id is the catch-all for the legacy caps that have not migrated to a dedicated rule id yet (`sql_query`, `ssrf`, `code_exec`, `file_io`, `fmt_string`, `deserialize`, `crypto`). The seven new vulnerability classes plus auth and data-exfil emerge under their own rule id. The parenthetical identifies the source location.
| Rule ID | Severity | | Rule ID | Cap | Severity |
|---|---| |---|---|---|
| `taint-unsanitised-flow (source L:C)` | Varies by source kind and sink capability | | `taint-unsanitised-flow (source L:C)` | `sql_query` / `ssrf` / `code_exec` / `file_io` / `fmt_string` / `deserialize` / `crypto` | Varies |
| `taint-ldap-injection` | `ldap_injection` | High |
| `taint-xpath-injection` | `xpath_injection` | High |
| `taint-header-injection` | `header_injection` | High |
| `taint-open-redirect` | `open_redirect` | Medium |
| `taint-template-injection` | `ssti` | High |
| `taint-xxe` | `xxe` | High |
| `taint-prototype-pollution` | `prototype_pollution` | High |
| `taint-data-exfiltration` | `data_exfil` | High / Medium |
| `rs.auth.missing_ownership_check.taint` | `unauthorized_id` | High |
The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in `src/labels/<lang>.rs`. [Language maturity](language-maturity.md) gives per-language counts and what's covered. Each cap-class entry is registered in `CAP_RULE_REGISTRY` (`src/labels/mod.rs`). Browse the registry from the CLI with `nyx rules list --class-only`, or via the dashboard's Rules page. The matcher sets (sources, sanitizers, sinks, gated sinks) live per-language in `src/labels/<lang>.rs`. [Language maturity](language-maturity.md) gives per-language counts and what's covered.
### CFG structural ### CFG structural
@ -112,20 +121,21 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `go.crypto.md5` | Low | A | Medium | | `go.crypto.md5` | Low | A | Medium |
| `go.crypto.sha1` | Low | A | Medium | | `go.crypto.sha1` | Low | A | Medium |
### Java: 8 patterns ### Java: 9 patterns
| Rule ID | Severity | Tier | Confidence | | Rule ID | Severity | Tier | Confidence |
|---|---|---|---| |---|---|---|---|
| `java.cmdi.runtime_exec` | High | A | High | | `java.cmdi.runtime_exec` | High | A | High |
| `java.code_exec.text4shell_interpolator` | High | A | High |
| `java.deser.readobject` | High | A | High | | `java.deser.readobject` | High | A | High |
| `java.deser.snakeyaml_unsafe_constructor` | High | A | High |
| `java.crypto.weak_algorithm` | Medium | A | Medium |
| `java.reflection.class_forname` | Medium | A | High | | `java.reflection.class_forname` | Medium | A | High |
| `java.reflection.method_invoke` | Medium | A | High | | `java.reflection.method_invoke` | Medium | A | High |
| `java.sqli.execute_concat` | Medium | B | Medium | | `java.sqli.execute_concat` | Medium | B | Medium |
| `java.xss.getwriter_print` | Medium | A | High |
| `java.crypto.insecure_random` | Low | A | Medium | | `java.crypto.insecure_random` | Low | A | Medium |
| `java.crypto.weak_digest` | Low | A | Medium |
### JavaScript: 22 patterns ### JavaScript: 23 patterns
| Rule ID | Severity | Tier | Confidence | | Rule ID | Severity | Tier | Confidence |
|---|---|---|---| |---|---|---|---|
@ -147,6 +157,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `js.xss.outer_html` | Medium | A | High | | `js.xss.outer_html` | Medium | A | High |
| `js.config.insecure_session_samesite` | Low | A | High | | `js.config.insecure_session_samesite` | Low | A | High |
| `js.config.insecure_session_secure` | Low | A | Medium | | `js.config.insecure_session_secure` | Low | A | Medium |
| `js.crypto.hardcoded_key` | Low | A | Medium |
| `js.crypto.math_random` | Low | A | Medium | | `js.crypto.math_random` | Low | A | Medium |
| `js.crypto.weak_hash` | Low | A | Medium | | `js.crypto.weak_hash` | Low | A | Medium |
| `js.secrets.hardcoded_secret` | Low | A | Medium | | `js.secrets.hardcoded_secret` | Low | A | Medium |
@ -168,7 +179,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `php.crypto.rand` | Low | A | Medium | | `php.crypto.rand` | Low | A | Medium |
| `php.crypto.sha1` | Low | A | Medium | | `php.crypto.sha1` | Low | A | Medium |
### Python: 13 patterns ### Python: 17 patterns
| Rule ID | Severity | Tier | Confidence | | Rule ID | Severity | Tier | Confidence |
|---|---|---|---| |---|---|---|---|
@ -182,9 +193,13 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `py.code_exec.compile` | Medium | A | High | | `py.code_exec.compile` | Medium | A | High |
| `py.deser.shelve_open` | Medium | A | High | | `py.deser.shelve_open` | Medium | A | High |
| `py.sqli.execute_format` | Medium | B | Medium | | `py.sqli.execute_format` | Medium | B | Medium |
| `py.sqli.text_format` | Medium | B | Medium |
| `py.xss.jinja_from_string` | Medium | A | High | | `py.xss.jinja_from_string` | Medium | A | High |
| `py.xss.make_response_format` | Medium | B | Medium |
| `py.crypto.md5` | Low | A | Medium | | `py.crypto.md5` | Low | A | Medium |
| `py.crypto.md5_bare` | Low | A | Low |
| `py.crypto.sha1` | Low | A | Medium | | `py.crypto.sha1` | Low | A | Medium |
| `py.crypto.sha1_bare` | Low | A | Low |
### Ruby: 11 patterns ### Ruby: 11 patterns
@ -220,7 +235,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `rs.quality.todo` | Low | A | High | | `rs.quality.todo` | Low | A | High |
| `rs.quality.unwrap` | Low | A | High | | `rs.quality.unwrap` | Low | A | High |
### TypeScript: 22 patterns ### TypeScript: 23 patterns
| Rule ID | Severity | Tier | Confidence | | Rule ID | Severity | Tier | Confidence |
|---|---|---|---| |---|---|---|---|
@ -240,6 +255,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `ts.xss.outer_html` | Medium | A | High | | `ts.xss.outer_html` | Medium | A | High |
| `ts.config.insecure_session_samesite` | Low | A | High | | `ts.config.insecure_session_samesite` | Low | A | High |
| `ts.config.insecure_session_secure` | Low | A | Medium | | `ts.config.insecure_session_secure` | Low | A | Medium |
| `ts.crypto.hardcoded_key` | Low | A | Medium |
| `ts.crypto.math_random` | Low | A | Medium | | `ts.crypto.math_random` | Low | A | Medium |
| `ts.crypto.weak_hash` | Low | A | Medium | | `ts.crypto.weak_hash` | Low | A | Medium |
| `ts.quality.any_annotation` | Low | A | Medium | | `ts.quality.any_annotation` | Low | A | Medium |
@ -253,6 +269,8 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
`nyx config add-rule --cap <name>` and `[analysis.languages.*.rules]` in config accept: `nyx config add-rule --cap <name>` and `[analysis.languages.*.rules]` in config accept:
`env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `all` `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `code_exec`, `crypto`, `unauthorized_id`, `data_exfil`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`
Source for both the enum and the `to_cap` mapping: [`src/labels/mod.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/mod.rs) (`Cap`) and [`src/utils/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/utils/config.rs) (`CapName`). Aliases: `data_exfiltration` for `data_exfil`, `ldapi` for `ldap_injection`, `xpathi` for `xpath_injection`, `crlf` and `response_splitting` for `header_injection`, `redirect` for `open_redirect`, `template_injection` for `ssti`, `proto_pollution` for `prototype_pollution`.
Source for both the enum and the `to_cap` mapping: [`src/labels/mod.rs`](https://github.com/elicpeter/nyx/blob/master/src/labels/mod.rs) (`Cap` and `CAP_RULE_REGISTRY`) and [`src/utils/config.rs`](https://github.com/elicpeter/nyx/blob/master/src/utils/config.rs) (`CapName`).
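
The alias list can be read as a lookup that runs before validation. The sketch below is illustrative only: `normalize_cap` and the two tables are hypothetical names, and the real mapping lives in `CapName` in `src/utils/config.rs`. The capability names and aliases are copied from the lists above.

```python
# Canonical capability names accepted by `--cap`, copied from the list above.
CANONICAL = {
    "env_var", "html_escape", "shell_escape", "url_encode", "json_parse",
    "file_io", "fmt_string", "sql_query", "deserialize", "ssrf", "code_exec",
    "crypto", "unauthorized_id", "data_exfil", "ldap_injection",
    "xpath_injection", "header_injection", "open_redirect", "ssti", "xxe",
    "prototype_pollution", "all",
}

# Alias -> canonical name, as documented in the alias list.
ALIASES = {
    "data_exfiltration": "data_exfil",
    "ldapi": "ldap_injection",
    "xpathi": "xpath_injection",
    "crlf": "header_injection",
    "response_splitting": "header_injection",
    "redirect": "open_redirect",
    "template_injection": "ssti",
    "proto_pollution": "prototype_pollution",
}


def normalize_cap(name):
    """Resolve an alias to its canonical capability, or raise ValueError."""
    canonical = ALIASES.get(name, name)
    if canonical not in CANONICAL:
        raise ValueError(f"unknown capability: {name}")
    return canonical


print(normalize_cap("crlf"))
print(normalize_cap("ssti"))
```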


@ -11,7 +11,21 @@ nyx serve --no-browser # don't auto-open
Persistent settings live under `[server]` in `nyx.conf` / `nyx.local`. Persistent settings live under `[server]` in `nyx.conf` / `nyx.local`.
<p align="center"><img src="../assets/screenshots/docs/serve-overview.png" alt="Nyx UI overview: total findings, severity breakdown, language and category distribution, top affected files" width="900"/></p> ```mermaid
flowchart LR
Scan["nyx scan<br/>or UI-started scan"] --> Cache[".nyx findings<br/>plus SQLite project index"]
Cache --> Serve["nyx serve<br/>loopback API and embedded React UI"]
Serve --> Review["Review findings<br/>flow, evidence, history"]
Review --> Triage["Update triage state<br/>investigate, suppress, accept, fix"]
Triage --> Sync[".nyx/triage.json<br/>optional repo-synced state"]
Sync --> Cache
```
Starting a scan from the UI runs dynamic verification on `Confidence >= Medium`
findings by default. Check "Skip dynamic verification" in the scan modal to get
a fast static-only result. See [Dynamic verification](dynamic.md) for details.
<p align="center"><img src="assets/screenshots/docs/serve-overview.png" alt="Nyx UI overview: total findings, severity breakdown, language and category distribution, top affected files" width="900"/></p>
## What it serves, and what it doesn't ## What it serves, and what it doesn't
@ -21,10 +35,10 @@ There is **no** account, no telemetry, no remote logging, no auto-update ping. T
## Security model ## Security model
`nyx serve` enforces three things at the HTTP layer ([`src/server/security.rs`](https://github.com/elicpeter/nyx/blob/master/src/server/security.rs)): `nyx serve` enforces three things:
1. **Loopback bind only.** `--host` and `[server].host` are clamped to `127.0.0.1`, `localhost`, or `::1`. Any other value is refused at startup with `Nyx serve only binds to loopback addresses; refused host '<value>'`. 1. **Loopback bind only.** `--host` and `[server].host` are clamped to `127.0.0.1`, `localhost`, or `::1`. Any other value is refused at startup with `Nyx serve only binds to loopback addresses; refused host '<value>'` ([`src/commands/serve.rs`](https://github.com/elicpeter/nyx/blob/master/src/commands/serve.rs)).
2. **Host-header check.** Every request must carry a `Host` header that matches the bound address and port. Missing or mismatched headers get a `400 invalid Host header`. Defends against DNS rebinding. 2. **Host-header check.** Every request must carry a `Host` header that matches the bound address and port. Missing or mismatched headers get a `400 invalid Host header`. Defends against DNS rebinding ([`src/server/security.rs`](https://github.com/elicpeter/nyx/blob/master/src/server/security.rs)).
3. **CSRF on mutations.** `POST` / `PUT` / `PATCH` / `DELETE` requests must carry a per-process CSRF token in the `x-nyx-csrf` header. The token is generated once when the server starts and exposed at `GET /api/health` so the embedded SPA can read it. Cross-origin mutations are rejected before the CSRF check via the `Origin` header. 3. **CSRF on mutations.** `POST` / `PUT` / `PATCH` / `DELETE` requests must carry a per-process CSRF token in the `x-nyx-csrf` header. The token is generated once when the server starts and exposed at `GET /api/health` so the embedded SPA can read it. Cross-origin mutations are rejected before the CSRF check via the `Origin` header.
If you forward the port over SSH or expose it through a reverse proxy, the host-header check will reject the request because the `Host` won't match `localhost:9700`. That's the intended behaviour. Don't do this without a deliberate reason; the loopback bind is part of the security model. If you forward the port over SSH or expose it through a reverse proxy, the host-header check will reject the request because the `Host` won't match `localhost:9700`. That's the intended behaviour. Don't do this without a deliberate reason; the loopback bind is part of the security model.
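
The request-level checks (items 2 and 3) can be sketched as a single function. This is a simplified model under stated assumptions, not the code in `src/server/security.rs`: `check_request` is a hypothetical helper, the `Host` comparison is an exact string match, the expected `Origin` is assumed to be `http://<bound>`, and the `403` statuses are assumptions. Only the `400 invalid Host header` response and the `x-nyx-csrf` header name come from the text.

```python
import hmac

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def check_request(method, headers, bound, csrf_token):
    """Return (status, reason) for a request.

    `headers` is a dict with lower-case keys; `bound` is the address and
    port the server listens on, e.g. "localhost:9700".
    """
    # Host-header check: missing or mismatched Host is rejected up front.
    if headers.get("host") != bound:
        return 400, "invalid Host header"

    if method in MUTATING_METHODS:
        # Cross-origin mutations are rejected before the CSRF check.
        origin = headers.get("origin")
        if origin is not None and origin != f"http://{bound}":
            return 403, "cross-origin mutation"  # status is an assumption

        # Constant-time comparison of the per-process token.
        supplied = headers.get("x-nyx-csrf", "")
        if not hmac.compare_digest(supplied.encode(), csrf_token.encode()):
            return 403, "missing or invalid CSRF token"  # assumption

    return 200, "ok"


token = "example-token"
print(check_request("GET", {"host": "localhost:9700"}, "localhost:9700", token))
print(check_request("GET", {"host": "evil.example"}, "localhost:9700", token))
```

A request forwarded through a proxy under a different hostname fails the first check, which is the rejection described above.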
@ -82,25 +96,25 @@ Modifiers in the ±5 range nudge the result for trend (only after the second sca
It's a Nyx-finding-pressure metric, not a security audit. Score 100 means Nyx didn't find anything under its current rules and language coverage; it doesn't certify the absence of vulnerabilities. The score doesn't see runtime config, IAM, secret stores, dependency CVEs, or anything outside the source tree being scanned. A repo of mostly Kotlin (where Nyx coverage is thin) will score artificially well because most of the code never gets evaluated. It's a Nyx-finding-pressure metric, not a security audit. Score 100 means Nyx didn't find anything under its current rules and language coverage; it doesn't certify the absence of vulnerabilities. The score doesn't see runtime config, IAM, secret stores, dependency CVEs, or anything outside the source tree being scanned. A repo of mostly Kotlin (where Nyx coverage is thin) will score artificially well because most of the code never gets evaluated.
The current ceilings are calibrated for v0.5 scanner false-positive rates. As symex coverage and rule precision improve, the ceilings tighten. Calibration data and the rationale behind each tunable lives in [health-score-audit.md](health-score-audit.md). Ceilings are calibrated for the current scanner false-positive rates. As symex coverage and rule precision improve, the ceilings may tighten.
### Findings and Finding detail ### Findings and Finding detail
The findings list is filterable by severity, confidence, category, language, rule ID, and triage state. The findings list is filterable by severity, confidence, category, language, rule ID, and triage state.
<p align="center"><img src="../assets/screenshots/docs/serve-findings-list.png" alt="Nyx findings list: 13 findings filtered by severity/confidence/rule, with status badges, file paths, and language tags" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-findings-list.png" alt="Nyx findings list: 13 findings filtered by severity/confidence/rule, with status badges, file paths, and language tags" width="900"/></p>
Clicking through opens the **flow visualiser**: a numbered walk from source to sink with the snippet at each step, cross-file markers when the path leaves the current file, the rule's "How to fix" guidance, and the engine's evidence object inline. Clicking through opens the **flow visualiser**: a numbered walk from source to sink with the snippet at each step, cross-file markers when the path leaves the current file, the rule's "How to fix" guidance, and the engine's evidence object inline.
<p align="center"><img src="../assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: HIGH taint-unsanitised-flow showing source → call → sink steps, How to fix guidance, and evidence panel" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-finding-detail.png" alt="Nyx finding detail: HIGH taint-unsanitised-flow showing source → call → sink steps, How to fix guidance, and evidence panel" width="900"/></p>
Engine notes call out when precision was bounded for that finding (`OriginsTruncated`, `PointsToTruncated`, `PathWidened`, `ForwardBailed`, etc.). Anything tagged `under-report` means the emitted flow is real and the result set is a lower bound; `over-report` means widening or bail. `--require-converged` in the CLI drops the over-report ones for strict gates. Engine notes call out when precision was bounded for that finding (`OriginsTruncated`, `PointsToTruncated`, `WorklistCapped`, `PredicateStateWidened`, `SsaLoweringBailed`, etc.). Each note carries a direction tag: `under-report` means the emitted flow is real and the result set is a lower bound; `over-report` means widening dropped a guard; `bail` means analysis aborted before producing a trustworthy result. `--require-converged` in the CLI drops over-report and bail notes for strict gates.
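
To make the direction tags concrete, here is a small sketch of the filter that `--require-converged` is described as applying. The finding and note shapes are hypothetical, and the pairing of note kinds with directions in the sample data is illustrative only. The rule that `under-report` findings are kept while `over-report` and `bail` findings are dropped comes from the text.

```python
# Directions that mean the result did not converge to a trustworthy flow.
NON_CONVERGED = {"over-report", "bail"}


def require_converged(findings):
    """Keep findings with no over-report or bail engine note."""
    return [
        f for f in findings
        if not any(n["direction"] in NON_CONVERGED for n in f.get("notes", []))
    ]


# Sample data: note kinds are real names from the docs, but the direction
# attached to each kind here is an assumption for illustration.
findings = [
    {"id": 1, "notes": []},
    {"id": 2, "notes": [{"kind": "OriginsTruncated", "direction": "under-report"}]},
    {"id": 3, "notes": [{"kind": "PredicateStateWidened", "direction": "over-report"}]},
    {"id": 4, "notes": [{"kind": "SsaLoweringBailed", "direction": "bail"}]},
]
print([f["id"] for f in require_converged(findings)])
```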
### Triage ### Triage
Each finding carries a triage state: `open`, `investigating`, `false_positive`, `accepted_risk`, `suppressed`, or `fixed`. The triage page bulk-updates them and shows the audit trail. Each finding carries a triage state: `open`, `investigating`, `false_positive`, `accepted_risk`, `suppressed`, or `fixed`. The triage page bulk-updates them and shows the audit trail.
<p align="center"><img src="../assets/screenshots/docs/serve-triage.png" alt="Nyx triage page: 13 findings need attention, severity breakdown, Findings/Suppression rules/Audit log tabs, rule chips, Investigate buttons" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-triage.png" alt="Nyx triage page: 13 findings need attention, severity breakdown, Findings/Suppression rules/Audit log tabs, rule chips, Investigate buttons" width="900"/></p>
State writes are persisted to SQLite immediately, and (when `[server].triage_sync = true`, default on) mirrored to `.nyx/triage.json` in the project root. Commit that file: State writes are persisted to SQLite immediately, and (when `[server].triage_sync = true`, default on) mirrored to `.nyx/triage.json` in the project root. Commit that file:
@ -114,7 +128,7 @@ It carries decisions across machines so a teammate's local scan reflects yours.
A file tree with per-file finding counts, syntax-highlighted source, and a right rail with the file's symbols and findings. Useful for "what's wrong with this module" rather than "what's wrong with this finding". A file tree with per-file finding counts, syntax-highlighted source, and a right rail with the file's symbols and findings. Useful for "what's wrong with this module" rather than "what's wrong with this finding".
<p align="center"><img src="../assets/screenshots/docs/serve-explorer.png" alt="Nyx explorer: file tree with per-file finding counts, syntax-highlighted Python source with red sink marker on the os.system line, file-summary right rail with findings" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-explorer.png" alt="Nyx explorer: file tree with per-file finding counts, syntax-highlighted Python source with red sink marker on the os.system line, file-summary right rail with findings" width="900"/></p>
The path query string preselects a file: `/explorer?file=src/handler.rs`. The path query string preselects a file: `/explorer?file=src/handler.rs`.
@ -122,11 +136,11 @@ The path query string preselects a file: `/explorer?file=src/handler.rs`.
Past runs are persisted when `[runs].persist = true` (off by default to avoid disk growth on heavy users). When persistence is on, `/scans` lists historical runs. Past runs are persisted when `[runs].persist = true` (off by default to avoid disk growth on heavy users). When persistence is on, `/scans` lists historical runs.
<p align="center"><img src="../assets/screenshots/docs/serve-scans.png" alt="Nyx scans list: completed scan run with root, duration, finding count, languages, and started timestamp" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-scans.png" alt="Nyx scans list: completed scan run with root, duration, finding count, languages, and started timestamp" width="900"/></p>
Each run drills into a detail page with files scanned, findings count, duration, languages, and a per-pass timing breakdown. Each run drills into a detail page with files scanned, findings count, duration, languages, and a per-pass timing breakdown.
<p align="center"><img src="../assets/screenshots/docs/serve-scan-detail.png" alt="Nyx scan detail: Summary tab with files scanned, findings, duration, languages; Details panel with Scan ID, Root, Engine version, started/finished timestamps; Timing breakdown bar showing Walk/Pass 1/Call Graph/Pass 2/Post" width="900"/></p> <p align="center"><img src="assets/screenshots/docs/serve-scan-detail.png" alt="Nyx scan detail: Summary tab with files scanned, findings, duration, languages; Details panel with Scan ID, Root, Engine version, started/finished timestamps; Timing breakdown bar showing Walk/Pass 1/Call Graph/Pass 2/Post" width="900"/></p>
Pick two scans to diff and see what got introduced, fixed, or rediscovered between runs. The retention cap is `[runs].max_runs` (default 100). Each run can also optionally save its log and stdout (`save_logs`, `save_stdout`); both are off by default. Code snippets are saved (`save_code_snippets = true`); turn off if storage is tight. Pick two scans to diff and see what got introduced, fixed, or rediscovered between runs. The retention cap is `[runs].max_runs` (default 100). Each run can also optionally save its log and stdout (`save_logs`, `save_stdout`); both are off by default. Code snippets are saved (`save_code_snippets = true`); turn off if storage is tight.
@ -134,7 +148,7 @@ Pick two scans to diff and see what got introduced, fixed, or rediscovered betwe
Every rule the engine knows about, built-in plus user-added. Each row shows the matchers, kind (source / sanitiser / sink), capability, language, and how many findings it produced in the latest scan. Filter by language, by kind, or by free text. Every rule the engine knows about, built-in plus user-added. Each row shows the matchers, kind (source / sanitiser / sink), capability, language, and how many findings it produced in the latest scan. Filter by language, by kind, or by free text.
-<p align="center"><img src="../assets/screenshots/docs/serve-rules.png" alt="Nyx rules page: 218 rules with language/kind dropdowns and a matcher search; rows showing rule title, language, kind (SOURCE/SANITIZER/SINK), cap, and finding count" width="900"/></p>
+<p align="center"><img src="assets/screenshots/docs/serve-rules.png" alt="Nyx rules page: 218 rules with language/kind dropdowns and a matcher search; rows showing rule title, language, kind (SOURCE/SANITIZER/SINK), cap, and finding count" width="900"/></p>
 User-added rules can be deleted from this page; built-ins are immutable. Built-ins live in `src/labels/<lang>.rs` and `src/patterns/<lang>.rs`; user-added entries write to `nyx.local`.
@@ -142,7 +156,7 @@ User-added rules can be deleted from this page; built-ins are immutable. Built-i
 A live config editor. Reads the merged config (`nyx.conf` + `nyx.local`), lets you flip switches and add custom source / sanitizer / sink rules, and writes back to `nyx.local`. Changes apply to the next scan; the running server uses its initial config snapshot.
-<p align="center"><img src="../assets/screenshots/docs/serve-config.png" alt="Nyx config page: General settings (analysis mode, max file size, excluded extensions, attack-surface ranking), Triage Sync toggle, Sources section with language/matcher/capability dropdowns and a per-language matcher table" width="900"/></p>
+<p align="center"><img src="assets/screenshots/docs/serve-config.png" alt="Nyx config page: General settings (analysis mode, max file size, excluded extensions, attack-surface ranking), Triage Sync toggle, Sources section with language/matcher/capability dropdowns and a per-language matcher table" width="900"/></p>
 The custom-rule form picks a language, a matcher (function or property name), and a capability. The capability list matches the `Cap` bitflags the taint engine uses; see [rules.md](rules.md#capability-list-for-custom-rules) for what each one means.
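
The layering described in the hunk above (read `nyx.conf` + `nyx.local` merged, write only to `nyx.local`, keep serving from the initial snapshot) could be sketched roughly as follows. This is a minimal illustration, not Nyx's actual implementation: the flat `Settings` map, the `merged` and `apply_edit` helpers, and the rule that `nyx.local` overrides `nyx.conf` on key collisions are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical flat key/value view of a config file (not Nyx's real schema).
type Settings = BTreeMap<String, String>;

/// Assumed merge rule: start from the `nyx.conf` values, then let
/// `nyx.local` override any key that both layers define.
fn merged(base: &Settings, local: &Settings) -> Settings {
    let mut out = base.clone();
    for (key, value) in local {
        out.insert(key.clone(), value.clone());
    }
    out
}

/// An edit made on the config page touches only the `nyx.local` layer;
/// the base layer is never modified.
fn apply_edit(local: &mut Settings, key: &str, value: &str) {
    local.insert(key.to_string(), value.to_string());
}
```

Under these assumptions, a snapshot taken with `merged` before an `apply_edit` call keeps its old values, which mirrors the "changes apply to the next scan" behavior: only a fresh call to `merged` sees the edit.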

@@ -3,8 +3,16 @@
 <head>
 <meta charset="UTF-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-<title>Nyx Scanner</title>
-<link rel="icon" href="/favicon.svg" type="image/svg+xml" />
+<title>Nyx</title>
+<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32.png" />
+<link rel="icon" type="image/png" sizes="64x64" href="/favicon-64.png" />
+<link rel="apple-touch-icon" sizes="180x180" href="/favicon-180.png" />
+<link rel="preconnect" href="https://fonts.googleapis.com" />
+<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
+<link
+href="https://fonts.googleapis.com/css2?family=Playfair+Display:wght@700&display=swap"
+rel="stylesheet"
+/>
 </head>
 <body>
 <div id="root"></div>

File diff suppressed because it is too large

@@ -1,7 +1,7 @@
 {
 "name": "nyx-frontend",
 "private": true,
-"version": "0.5.0",
+"version": "0.7.0",
 "license": "GPL-3.0-or-later",
 "type": "module",
 "scripts": {
@@ -18,33 +18,33 @@
 "test:coverage": "vitest run --coverage"
 },
 "dependencies": {
-"@tanstack/react-query": "^5.100.6",
+"@tanstack/react-query": "^5.101.0",
 "elkjs": "^0.11.1",
 "graphology": "^0.26.0",
-"react": "^19.2.5",
-"react-dom": "^19.2.5",
-"react-router-dom": "^7.14.2",
-"sigma": "^3.0.2"
+"react": "^19.2.7",
+"react-dom": "^19.2.7",
+"react-router-dom": "^7.17.0",
+"sigma": "^3.0.3"
 },
 "devDependencies": {
 "@eslint/js": "^10.0.1",
 "@testing-library/jest-dom": "^6.9.1",
 "@testing-library/react": "^16.3.2",
 "@testing-library/user-event": "^14.6.1",
-"@types/react": "^19.2.14",
+"@types/react": "^19.2.16",
 "@types/react-dom": "^19.2.3",
-"@vitejs/plugin-react": "^6.0.1",
-"@vitest/coverage-v8": "^4.1.5",
-"eslint": "^10.2.1",
+"@vitejs/plugin-react": "^6.0.2",
+"@vitest/coverage-v8": "^4.1.8",
+"eslint": "^10.4.1",
 "eslint-plugin-react-hooks": "^7.1.1",
 "eslint-plugin-react-refresh": "^0.5.2",
-"globals": "^17.5.0",
-"jsdom": "^29.1.0",
-"license-checker-rseidelsohn": "^4.4.2",
+"globals": "^17.6.0",
+"jsdom": "^29.1.1",
+"license-checker-rseidelsohn": "^5.0.1",
 "prettier": "^3.8.3",
 "typescript": "~6.0.3",
-"typescript-eslint": "^8.59.1",
-"vite": "^8.0.10",
-"vitest": "^4.1.5"
+"typescript-eslint": "^8.60.1",
+"vite": "^8.0.16",
+"vitest": "^4.1.8"
 }
 }

Binary file not shown. Size after: 17 KiB

Binary file not shown. Size after: 1.4 KiB

Some files were not shown because too many files have changed in this diff