Compare commits

...

81 commits

Author SHA1 Message Date
Eli Peter
c9776a5caf
Introduce repro cli subcommand
2026-06-05 13:34:07 -05:00
elipeter
a2d1a1583f updated CHANGELOG.md 2026-06-05 13:13:42 -05:00
elipeter
8a7d2b8010 added repro subcommand 2026-06-05 13:10:58 -05:00
Eli Peter
c1fa6a87cf
ui-fixes 2026-06-05 12:39:39 -05:00
elipeter
f52b3bed1e changed sizes 2026-06-05 12:39:13 -05:00
elipeter
214bf91b63 bumped dep 2026-06-05 12:27:16 -05:00
elipeter
49fa174607 added svg for confirmed verdict badge 2026-06-05 12:04:09 -05:00
elipeter
291fe5d7be updated CHANGELOG.md 2026-06-05 11:36:52 -05:00
Eli Peter
25863d222a
Merge pull request #86 from nyx-sec/triage-works-in-cli
fix(cli): apply repository triage file during scans
2026-06-05 10:59:40 -05:00
elipeter
d09a97008e updated CHANGELOG.md 2026-06-05 10:53:09 -05:00
elipeter
1148e65f36 fix(cli): apply repository triage file during scans 2026-06-05 10:50:25 -05:00
Eli Peter
991c84a1eb
Dynamic (#77) 2026-06-05 10:16:30 -05:00
Eli Peter
55247b7fcd
Critical bug fixes and recall improvements (#68) 2026-05-11 12:42:39 -04:00
Eli Peter
7d0e7320e2
new capacity bits (#67) 2026-05-07 01:29:31 -04:00
elipeter
afaffc0df6 updated third party licenses 2026-05-06 05:03:00 -04:00
elipeter
c6f4c3e1cf chore: Update CHANGELOG with recent UI refresh, layout improvements, and screenshot enhancements 2026-05-06 05:01:43 -04:00
elipeter
6c607634da style: Improve code formatting for better readability in CSS and JSX files 2026-05-06 04:49:13 -04:00
elipeter
b51ae4f89d feat: Increase screenshot resolution to 1600x992 for improved quality 2026-05-06 04:45:50 -04:00
elipeter
77be7f10d9 refactor: Update UI components for consistency and improve layout 2026-05-06 04:38:04 -04:00
elipeter
da619171cf chore: Update package versions in Cargo.lock and package.json 2026-05-05 19:53:40 -04:00
elipeter
e8f1c64dc9 feat: Add asset mirroring for nyxscan.dev landing site and update favicon 2026-05-05 19:21:11 -04:00
elipeter
e830fd0a7e fix: Correct image paths in documentation for consistency 2026-05-05 19:08:51 -04:00
elipeter
c6baa4d5dc feat: Update brand color to mint-cyan across screenshots and UI elements 2026-05-05 19:02:47 -04:00
elipeter
bbf6f91c56 feat: Enhance CLI screenshot capture with raw file saving and GIF generation 2026-05-05 18:17:53 -04:00
Eli Peter
fb698d2c27
Performance and precision pass (#64) 2026-05-04 19:58:04 -04:00
Eli Peter
c7c5e0f3a1
Precision pass on auth and resource analysis (#63) 2026-05-03 13:51:46 -04:00
elipeter
064801a3a4 feat: Simplify inner-call release detection logic in resource filtering 2026-05-02 21:49:01 -04:00
elipeter
ebe4a15a72 feat: Enhance resource leak detection by recognizing inner-call release patterns and err-companion guards 2026-05-02 21:47:03 -04:00
elipeter
48bc43e1a6 feat: Add SSA summaries support for validated parameter propagation and enhance loop body error handling 2026-05-02 21:02:47 -04:00
elipeter
92aaa36ed6 chore: Update version placeholders and changelog for release 0.6.0 2026-05-02 18:06:50 -04:00
elipeter
215dd02eff docs: Update CVE list in README to include recent vulnerabilities and their details 2026-05-02 17:51:42 -04:00
Eli Peter
1f2bfe76c1
docs: Enhance module documentation across various files for clarity a… (#62)
* docs: Enhance module documentation across various files for clarity and completeness

* fix: Remove unnecessary blank line in build.rs for cleaner code

* docs: Update documentation to improve clarity and consistency in code comments
2026-05-02 17:46:45 -04:00
Eli Peter
40995e45e7
Authorization analysis logic improvements (#61) 2026-05-02 16:44:49 -04:00
Eli Peter
3c89bddbf2
Improved path traversal detection and enhanced sink classification logic 2026-05-02 03:36:14 -04:00
Eli Peter
58f1794a4e
Added Cap::DATA_EXFIL and taint fp and fn fixes on real repos (#59)
* feat: Enhance data exfiltration detection with source sensitivity gating for cookies and headers

* feat: Implement cross-file data exfiltration detection with parameter-specific gate filters

* feat: Add calibration tests and refine DATA_EXFIL severity scoring logic

* feat: Introduce per-detector configuration for data exfiltration suppression

* feat: Enhance DATA_EXFIL findings with destination field tracking in diagnostics and SARIF output

* feat: Add tainted body and URL handling for data exfiltration detection

* feat: Add integration tests and fixtures for DATA_EXFIL and SSRF detection in Go

* feat: Add Java integration tests and fixtures for DATA_EXFIL detection across multiple HTTP clients

* feat: Add synthetic externals handling for closure-captured variables in SSA

* feat: Implement closure-based suppression for resource leak findings

* feat: Add regression guards for shell-injection and taint propagation in for-of destructure patterns

* feat: Implement constructor cap narrowing for data exfiltration detection in HTTP request builders

* feat: Add gated sinks for data exfiltration detection in C and C++ using curl_easy_setopt

* feat: Implement DATA_EXFIL cap parity for backwards analysis and add integration tests

* feat: Add data exfiltration sinks for various languages and enhance documentation

* refactor: Simplify formatting and improve readability in various files

* refactor: Improve readability by simplifying conditional statements and adding clippy linting

* docs: Update CHANGELOG and comments for data exfiltration features and configuration

* docs: Clarify configuration instructions for data exfiltration trusted destinations

* docs: Enhance comments for evidence routing logic in data exfiltration
2026-05-01 10:59:52 -04:00
Eli Peter
a438886217
Python fp and docs updates (#58)
* refactor: Update comments for clarity and add expectations.json files for performance metrics

* feat: Implement FP guard for JS/TS local-collection receivers to suppress missing ownership checks

* feat: Enhance Rust parameter handling to classify local collections and prevent false ownership checks

* refactor: Simplify code formatting for better readability in multiple files

* refactor: Improve UTF-8 sequence length handling and enhance clarity in loop iteration

* feat: Update Java and Python patterns to include new security rules

* refactor: Improve comment clarity and consistency across multiple Rust files

* refactor: Simplify code formatting for improved readability in integration tests and module files

* refactor: Improve comment formatting and enhance clarity in assertions across multiple files
2026-04-29 19:53:34 -04:00
elipeter
4db0805de6 ci: Enhance release workflow to support manual tag input and ensure consistent artifact naming 2026-04-29 11:59:50 -04:00
elipeter
65add619a0 ci: Update cosign signing commands to use bundle output format 2026-04-29 11:53:55 -04:00
Eli Peter
832533a8cd
Fix fn and bump frontend packages (#57)
* chore(deps): update frontend dependencies to latest versions

* fix: update reconnectTimer type and adjust tsconfig paths for consistency

* fix: add toast to dependencies in FindingsPage component

* fix: add toast to dependencies in FindingsPage component

* fix: update language maturity metrics and improve Go validation handling

* fix: update CHANGELOG with recent enhancements and dependency bumps

* fix: format reconnectTimer initialization for improved readability
2026-04-29 02:57:57 -04:00
dependabot[bot]
281699faae
chore(deps): bump react-router-dom from 6.30.3 to 7.14.2 in /frontend (#49)
Bumps [react-router-dom](https://github.com/remix-run/react-router/tree/HEAD/packages/react-router-dom) from 6.30.3 to 7.14.2.
- [Release notes](https://github.com/remix-run/react-router/releases)
- [Changelog](https://github.com/remix-run/react-router/blob/main/packages/react-router-dom/CHANGELOG.md)
- [Commits](https://github.com/remix-run/react-router/commits/react-router-dom@7.14.2/packages/react-router-dom)

---
updated-dependencies:
- dependency-name: react-router-dom
  dependency-version: 7.14.2
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-29 01:07:22 -04:00
dependabot[bot]
d08c835ea3
chore(deps): bump blake3 in the cargo-minor-and-patch group (#47)
Bumps the cargo-minor-and-patch group with 1 update: [blake3](https://github.com/BLAKE3-team/BLAKE3).


Updates `blake3` from 1.8.4 to 1.8.5
- [Release notes](https://github.com/BLAKE3-team/BLAKE3/releases)
- [Commits](https://github.com/BLAKE3-team/BLAKE3/compare/1.8.4...1.8.5)

---
updated-dependencies:
- dependency-name: blake3
  dependency-version: 1.8.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: cargo-minor-and-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-29 01:06:51 -04:00
dependabot[bot]
f4b1ab8a34
chore(deps): bump the frontend-minor-and-patch group (#48)
Bumps the frontend-minor-and-patch group in /frontend with 8 updates:

| Package | From | To |
| --- | --- | --- |
| [@tanstack/react-query](https://github.com/TanStack/query/tree/HEAD/packages/react-query) | `5.95.2` | `5.100.6` |
| [@vitest/coverage-v8](https://github.com/vitest-dev/vitest/tree/HEAD/packages/coverage-v8) | `4.1.1` | `4.1.5` |
| [eslint-plugin-react-hooks](https://github.com/facebook/react/tree/HEAD/packages/eslint-plugin-react-hooks) | `7.0.1` | `7.1.1` |
| [globals](https://github.com/sindresorhus/globals) | `17.4.0` | `17.5.0` |
| [jsdom](https://github.com/jsdom/jsdom) | `29.0.1` | `29.1.0` |
| [prettier](https://github.com/prettier/prettier) | `3.8.1` | `3.8.3` |
| [typescript-eslint](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/typescript-eslint) | `8.57.2` | `8.59.1` |
| [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) | `4.1.1` | `4.1.5` |


Updates `@tanstack/react-query` from 5.95.2 to 5.100.6
- [Release notes](https://github.com/TanStack/query/releases)
- [Changelog](https://github.com/TanStack/query/blob/main/packages/react-query/CHANGELOG.md)
- [Commits](https://github.com/TanStack/query/commits/@tanstack/react-query@5.100.6/packages/react-query)

Updates `@vitest/coverage-v8` from 4.1.1 to 4.1.5
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.5/packages/coverage-v8)

Updates `eslint-plugin-react-hooks` from 7.0.1 to 7.1.1
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/packages/eslint-plugin-react-hooks/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/eslint-plugin-react-hooks@7.1.1/packages/eslint-plugin-react-hooks)

Updates `globals` from 17.4.0 to 17.5.0
- [Release notes](https://github.com/sindresorhus/globals/releases)
- [Commits](https://github.com/sindresorhus/globals/compare/v17.4.0...v17.5.0)

Updates `jsdom` from 29.0.1 to 29.1.0
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Commits](https://github.com/jsdom/jsdom/compare/v29.0.1...v29.1.0)

Updates `prettier` from 3.8.1 to 3.8.3
- [Release notes](https://github.com/prettier/prettier/releases)
- [Changelog](https://github.com/prettier/prettier/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prettier/prettier/compare/3.8.1...3.8.3)

Updates `typescript-eslint` from 8.57.2 to 8.59.1
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/typescript-eslint/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.59.1/packages/typescript-eslint)

Updates `vitest` from 4.1.1 to 4.1.5
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.5/packages/vitest)

---
updated-dependencies:
- dependency-name: "@tanstack/react-query"
  dependency-version: 5.100.6
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: frontend-minor-and-patch
- dependency-name: "@vitest/coverage-v8"
  dependency-version: 4.1.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: frontend-minor-and-patch
- dependency-name: eslint-plugin-react-hooks
  dependency-version: 7.1.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: frontend-minor-and-patch
- dependency-name: globals
  dependency-version: 17.5.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: frontend-minor-and-patch
- dependency-name: jsdom
  dependency-version: 29.1.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: frontend-minor-and-patch
- dependency-name: prettier
  dependency-version: 3.8.3
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: frontend-minor-and-patch
- dependency-name: typescript-eslint
  dependency-version: 8.59.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: frontend-minor-and-patch
- dependency-name: vitest
  dependency-version: 4.1.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: frontend-minor-and-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-29 01:05:33 -04:00
Eli Peter
82f18184b1
Prerelease cleanup (#46)
* feat: Add const_bound_vars tracking to prevent false positives in ownership checks

* feat: Introduce field interner and typed bounded vars for enhanced type tracking

* feat: Add typed_call_receivers and typed_bounded_dto_fields for enhanced type tracking

* feat: Centralize method name extraction with bare_method_name helper

* feat: Implement Phase-6 hierarchy fan-out for runtime virtual dispatch

* feat: Enhance C++ taint tracking with additional container operations and inline method resolution

* feat: Introduce field-sensitive points-to analysis for enhanced resource tracking

* feat: Implement Pointer-Phase 6 subscript handling for enhanced container analysis

* test: Add comprehensive tests for JavaScript control flow constructs and lattice operations

* docs: Update advanced analysis documentation with field-sensitive points-to and hierarchy fan-out details

* test: Add comprehensive tests for lattice algebra laws and SSA edge cases

* feat: Add destructured session user handling and safe user ID access patterns

* feat: Implement row-population reverse-walk for enhanced authorization checks

* feat: Enhance authorization checks with local alias chain for self-actor types

* feat: Introduce ActiveRecord query safety checks and enhance snippet extraction

* feat: Implement chained method call inner-gate rebinding for SSRF prevention

* feat: Add observability and error modules, enhance debug functionality, and implement theme context

* feat: Remove Auth Analysis page and update navigation to redirect to Explorer

* feat: Optimize SSA lowering by sharing results between taint engine and artifact extractor

* feat: Optimize SSA lowering by sharing results between taint engine and artifact extractor

* feat: Reset path-safe-suppressed spans before lowering to maintain analysis integrity

* fix(ssa): ungate debug_assert_bfs_ordering for release-tests build

The helper at src/ssa/lower.rs was gated `#[cfg(debug_assertions)]` while
the unit test at the bottom of the file was gated only `#[cfg(test)]`.
Since `cfg(test)` is set in release builds with `--tests` but
`cfg(debug_assertions)` is not, `cargo build --release --tests` failed
with E0425. Removing the gate fixes the build; the body is `debug_assert!`
only, so the helper is free in release. Also drop the gate at the call
site to avoid a `dead_code` warning when the lib is built without
`--tests`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
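
For illustration only, a minimal sketch of the gating mismatch described above. The names `debug_assert_sorted` and `lower` are hypothetical stand-ins, not the actual code in src/ssa/lower.rs. It shows a helper left ungated so that a `#[cfg(test)]` unit test can still call it when debug assertions are disabled.

```rust
// Hypothetical stand-in for the helper discussed above. If this function were
// gated with `#[cfg(debug_assertions)]`, the test module below (gated only on
// `cfg(test)`) would fail to resolve it under `cargo build --release --tests`,
// which is the E0425 failure the commit message describes.
// Left ungated, it is still cheap: the body is `debug_assert!` only, so the
// check is skipped when debug assertions are disabled.
fn debug_assert_sorted(ids: &[u32]) {
    debug_assert!(
        ids.windows(2).all(|w| w[0] <= w[1]),
        "ids are not in ascending order"
    );
}

// Hypothetical call site. The call is also ungated so the helper is always
// used and never triggers a `dead_code` warning.
pub fn lower(ids: &[u32]) -> usize {
    debug_assert_sorted(ids);
    ids.len()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn sorted_input_passes() {
        // Compiles in both debug and release test builds because the helper
        // no longer depends on `cfg(debug_assertions)`.
        debug_assert_sorted(&[1, 2, 3]);
        assert_eq!(lower(&[1, 2, 3]), 3);
    }
}
```

Running the sketch's test under both `cargo test` and `cargo test --release` exercises the two configurations in question.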

* test(closure-capture): flip JS/TS fixtures to required-finding

The JS and TS closure-capture fixtures pinned the old broken behaviour
via `forbidden_findings: [{ "id_prefix": "taint-" }]`. The engine now
correctly traces taint through the closure boundary (env source captured
by an arrow function, sunk via `child_process.exec` inside the body), so
the formerly-forbidden finding is a true positive.

Match the Python sibling's shape — `required_findings` with
`id_prefix` + `min_count` plus a small `noise_budget` — and rewrite the
companion READMEs and the phase8_fragility_tests doc-comments from
"known gap" to "regression guard".

Verified:
- cargo test --release --test phase8_fragility_tests → 8/8 pass
- cargo test --release --lib bfs_assertion → pass
- corpus benchmark F1 = 0.9976 (TP=205, FP=1, FN=0) — unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
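
As a quick check of the benchmark figure quoted above, here is a small sketch that recomputes F1 from the reported counts. It assumes the standard definition F1 = 2·TP / (2·TP + FP + FN). The function name `f1_score` is hypothetical and is not taken from the repository.

```rust
// Recompute F1 from raw confusion counts, assuming the standard definition:
//   F1 = 2*TP / (2*TP + FP + FN)
// `f1_score` is a hypothetical helper written for this illustration.
fn f1_score(true_pos: u32, false_pos: u32, false_neg: u32) -> f64 {
    let tp = f64::from(true_pos);
    let denom = 2.0 * tp + f64::from(false_pos) + f64::from(false_neg);
    if denom == 0.0 {
        // No positives predicted or expected: report 0.0 rather than NaN.
        return 0.0;
    }
    2.0 * tp / denom
}
```

With the counts from the message (TP=205, FP=1, FN=0) this gives 410/411, which rounds to the reported 0.9976.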

* feat: Add OWASP mapping and baseline mutation hooks for enhanced security analysis

* feat: Introduce health module and enhance health score computation with calibration tests

* feat: Add expectations configuration and cleanup .gitignore for log files

* feat: Implement theme selection and enhance settings panel for triage sync

* feat: Suppress false positives for strcpy calls with literal sources in AST

* feat: Update analyse_function_ssa to return body CFG for accurate analysis

* feat: Add bug report and feature request templates for improved issue tracking

* feat: removed dev scripts

* feat: update README.md for clarity and consistency in fixture descriptions

* feat: removed dev docs

* feat: clean up error handling and UI elements for improved user experience

* feat: adjust button sizes in HeaderBar for better UI consistency

* feat: enhance taint analysis with additional context for sanitizer and taint findings

* cargo fmt

* prettier

* refactor: simplify conditional checks and improve code readability in AST and screenshot capture scripts

* feat: add script to frame PNG screenshots with brand gradient

* feat: add fuzzing support with new targets and CI workflows

* refactor: streamline match expressions and improve formatting in CLI and output handling

* feat: enhance configuration display with detailed output options

* feat: stage demo configuration for improved CLI screenshot output

* feat: expose merge_configs function for user-configurable settings

* refactor: simplify code structure and improve readability in config handling

* refactor: improve descriptions for vulnerability patterns in various languages

* feat: update MIT License section with additional usage details and copyright information

* feat: update screenshots

* refactor: update build process and paths for frontend assets

* feat: add cross-file taint fuzzing target and supporting dictionary

* refactor: clean up formatting and comments in fuzz configuration and example files

* refactor: remove outdated comments and clean up CI configuration files

* chore: update changelog dates and improve formatting in documentation

* refactor: update Cargo.toml and CI configuration for improved packaging and build process

* refactor: enhance quote-stripping logic to prevent panics and add regression tests

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 00:58:38 -04:00
dependabot[bot]
79c29b394d
chore(deps): bump postcss from 8.5.8 to 8.5.10 in /frontend (#43)
Bumps [postcss](https://github.com/postcss/postcss) from 8.5.8 to 8.5.10.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/8.5.8...8.5.10)

---
updated-dependencies:
- dependency-name: postcss
  dependency-version: 8.5.10
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-25 18:42:30 -04:00
dependabot[bot]
134fd6913d
chore(deps-dev): bump vite from 6.4.1 to 6.4.2 in /frontend (#44)
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 6.4.1 to 6.4.2.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v6.4.2/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v6.4.2/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-version: 6.4.2
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-25 18:42:08 -04:00
Eli Peter
41128177d2
Release/0.5.0 (#35)
* feat: Introduce function-scoped variable interning for state analysis with new tests and fixtures

* feat: Add Phase 26 symbolic execution enhancements with bitwise operator support, abstract interpretation refinements, and new taint analysis tests

* feat: Refine state analysis to handle factory-pattern resource returns with mixed-path tests and leak detection enhancements

* feat: Add Phase 27 debug views with symbolic execution, abstract interpretation, SSA, and call graph viewers; integrate with debug layout and styles

* feat: Add Phase 31 type-qualified symbolic resolution with receiver-based callee disambiguation and testing

* feat: Extend symbolic execution with state iteration, enhanced debug views, and debounced input handling

* feat: Add Phase 13 resource and auth pattern extensions with new tests and fixtures

* feat: Introduce CFG debug graph renderer with compact mode, toolbar, and DAG layout integration

* feat: Add Phase 28 encoding and decoding transform modeling with structural symex enhancements and new taint analysis tests

* feat: Extend abstract interpretation with type facts and constant value tracking in debug views and server logic

* feat: Add linear path handling and witness extraction to symbolic execution with Phase 28 transform mismatch detection

* feat: Refine Go auth and sanitizer handling with enhanced rules, state updates, and benchmark improvements

* feat: Enable auth-state analysis by default and update relevant tests in benchmark config

* test: Update state_tests to reflect default enablement of auth-state analysis and add auth suppression test

* docs: update CHANGELOG.md

* feat: Introduce per-index taint tracking in `HeapState` with `HeapSlot`, overflow handling, and revised SSA transfers

* feat: Introduce C/C++ language labels and refine heap state tracking in SSA transfers

* feat: Implement per-index array slot tracking in symbolic heap with overflow collapse

* feat: Add implicit definition handling for uninitialized declarations in SSA value allocation

* feat: Refactor function parameters and constants for improved clarity and maintainability

* refactor: Reorder module imports and improve formatting for consistency

* refactor: Fix formatting errors

* refactor: Fix clippy warnings

* refactor: Fix fmt warnings (again)

* chore: Update dependencies and improve feature configuration

* Add comprehensive tests for undertested modules (#36) (COPILOT)

* Add comprehensive tests for undertested modules

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/f3fc877e-f386-49ba-9793-fc93d3805083

* Add comprehensive tests for ext, project, walk, and errors modules

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/f3fc877e-f386-49ba-9793-fc93d3805083

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* chore: Update dependencies and improve feature configuration

* fix: formatting errors in new tests

* chore: Update license list in about.toml

* chore: made functions input inline

* chore: updated cfg graph to take up the full page

* chore: add Prettier configuration and update code formatting

* Add frontend test suite with Vitest (111 tests) (#37)

* Add Vitest test suite for frontend - 111 tests across utils, components, hooks, and graph utilities

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/7cf0dba2-ecff-4740-ba4d-92717e74a0b7

* ci: add frontend test step to CI workflow

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/5bc0ac9f-0a32-4d03-9cb7-7a15aea53fca

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* chore: simplify array initialization in test files for consistency

* ran typecheck

* feat: add AnalysisWorkspace component and integrate it into CfgViewerPage

* feat: update routing in AppLayout and improve empty state message in ExplorerPage

* feat: enhance scan progress tracking with additional metrics and stages

* feat: update license information and add license check script

* feat: implement cross-file symbolic execution with callee body persistence

* feat: replace dagre graphs with Graphology + ELK + Sigma for more advanced call stack and cfg rendering

* feat: ensure CFG function view is scoped to the selected function, preventing bleed into sibling functions

* feat: enhance resource tracking with proxy method summaries and improve finding extraction

* feat: add terminal function exit detection for accurate resource leak analysis

* feat: add warnings for loops and functions without bodies to improve error recovery

* feat: update lambda expression handling to ensure proper function classification and control flow

* feat: remove bounded formatting/string ops and add JSON.parse sanitizer for improved data handling

* feat: add inline return taint analysis and regression tests for improved security checks

* feat: add engine version management and migration handling for database schema updates

* feat: enhance first_call_ident to skip nested function bodies and add regression tests

* feat: enhance callee name resolution with two-segment normalization and disambiguation

* feat: add cross-file context flags and debug assertions for taint analysis

* feat: refactor taint analysis structure to unify context handling and improve clarity

* feat: enhance dead code elimination to preserve Sink, Source, and Sanitizer labels with new tests

* docs: updated CHANGELOG.md

* fmt: formatting fixes

* fix: fixed frontend formatting and lint warnings

* fix: optimized ci

* fix: optimized ci

* Add comprehensive multi-file test coverage to Nyx (#38)

* Initial checklist for multi-file test suite expansion

Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/e550cb88-9767-4442-94d4-101bf5bb0e23

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* Add 12 new multi-file test fixtures with TP/TN/near-miss coverage

Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/e550cb88-9767-4442-94d4-101bf5bb0e23

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* deleted root repo

* rebuilt to test for regressions

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Co-authored-by: elipeter <elicpeter@gmail.com>

* feat: enhance import alias resolution and taint tracking

* feat: implement security hardening with CSRF protection and path validation

* feat: add support for import alias bindings in Python, PHP, and Rust

* feat: enhance CFG analysis modes and improve code readability

* feat: add detection for parameterized SQL queries to enhance security

* feat: add safe internal redirect handling and enhance session destroy validation

* feat: implement security improvements by addressing vulnerabilities in execAsync, session management, and file downloads

* feat: enhance taint detection by adding support for inline source member expressions in call arguments

* feat: implement pre-emission of Source nodes for inline source member expressions in call arguments

* feat: add support for Throw statement in control flow and error handling

* feat: add debug and echo endpoints with potential information leakage

* feat: implement internal redirect suppression and enhance taint detection

* feat: implement module alias tracking for dynamic dispatch in JS/TS

* feat: add authorization analysis module with Express support

* feat: add authorization analysis module with Express support

* feat: add tests for admin guard requirements and clean checks in authorization analysis

* feat: integrate Koa and Fastify frameworks into authorization analysis

* feat: add Flask and Django support to authorization analysis module

* feat: add support for Rails and Sinatra frameworks in authorization analysis

* feat: add support for Axum, ActixWeb, and Rocket frameworks in authorization analysis

* feat: add support for ActixWeb, Axum, and Rocket frameworks in authorization analysis

* feat: add support for Rails and Sinatra in authorization analysis

* chore: add .DS_Store to .gitignore

* refactor: simplify conditional checks and improve readability in multiple files

* refactor: update usage of Option methods for improved clarity and consistency

* refactor: improve code readability by simplifying conditional checks and formatting

* refactor: improve code formatting and readability by simplifying conditional checks

* refactor: simplify conditional checks and improve readability in multiple files

* refactor: simplify conditional checks in axum.rs for improved readability

* feat: add CodeQL analysis configuration for enhanced security scanning

* test: add comprehensive tests for `src/output.rs` SARIF builder (#39)

* chore: start test coverage improvement work

Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/cd7ff398-134e-4728-a5e7-0353a0744423

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* test: add comprehensive tests for src/output.rs SARIF builder

Agent-Logs-Url: https://github.com/elicpeter/nyx/sessions/cd7ff398-134e-4728-a5e7-0353a0744423

Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>

* refactor: improve code formatting and readability in output.rs

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: elicpeter <54954007+elicpeter@users.noreply.github.com>
Co-authored-by: elipeter <elicpeter@gmail.com>

* refactor: improve code formatting and readability in output.rs

* Potential fix for code scanning alert no. 210: Uncontrolled data used in path expression

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 211: Uncontrolled data used in path expression

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* refactor: enhance triage file path handling with improved error management and validation

* refactor: updated func summaries for richer detail

* refactor: update SSA summary extraction to use canonical FuncKey for distinct entries

* refactor: enhance callee metadata structure to support arity, receiver, and qualifier for better overload resolution

* refactor: add support for keyword arguments in function calls and enhance receiver extraction for method-style calls

* refactor: implement new Flask routes for safe and unsafe shell command execution

* refactor: separate receiver handling in SSA operations and enhance taint propagation

* refactor: improve arity handling by using arg_uses for positional argument count and enhance witness scoring for tainted arguments

* refactor: implement auth decorator extraction and classification for multiple languages

* refactor: enhance Rust module path resolution and use map handling for cross-file disambiguation

* refactor: introduce CalleeQuery struct for structured callee resolution and enhance resolver logic

* refactor: implement same-file identity collision handling for `runTask` to ensure correct resolver behavior

* refactor: standardize default struct initialization across multiple files

* feat: add scripts for formatting checks and auto-fixes with test summaries

* refactor: simplify character splitting and enhance namespace qualifier handling

* refactor: improve documentation clarity and enhance code readability in resolver logic

* refactor: replace default struct initialization with explicit field assignments for clarity

* feat: enhance anonymous function naming by deriving context-based bindings

* refactor: streamline match expressions for improved readability and performance

* refactor: replace loop with while let for improved clarity and performance

* feat: add SSA constant propagation support to analysis context for improved accuracy

* feat: implement shell metacharacter validation and bounded-length checks in Rust analysis

* feat: add static map analysis for command injection suppression and type safety

* refactor: simplify match statements and reduce line breaks for improved readability

* feat(summary): phase 1/5 SinkSite data model for primary sink-location attribution

Introduce SinkSite (file_rel, line, col, snippet, cap) carrying the
primary sink source-location through function summaries. Swap
SsaFuncSummary.param_to_sink and FuncSummary.param_to_sink from a coarse
Cap map to a deduped SmallVec<[SinkSite; 1]> per parameter, with a
backward-compatible cap_sites() helper and serde defaults so pre-phase-1
on-disk rows continue to deserialise cleanly.

Extraction: SinkSiteLocator bundles the tree/bytes/file_rel needed by
extract_ssa_func_summary; ParsedFile::extract_ssa_artifacts wires the
locator in for the persisted pass-1 path, while pass-2 intra-file
transient summaries fall back to cap-only sites (behavior unchanged).
Merge: GlobalSummaries::insert now unions sink sites with
(file_rel, line, col, cap) dedup via shared union_param_sink_sites
helper.

Database: JSON-serialised summary columns carry the new shape
automatically; no schema change needed.

Phase 2 will consume SinkSite in build_taint_diag() to overwrite the
caller-site Finding.line with the callee's sink line when resolved via
summary. Phase 1 keeps behavior unchanged: scanning
tests/benchmark/corpus/rust/cmdi/cmdi_indirect.rs still produces the
same (wrong) line 10 finding.

Adds round-trip tests covering SinkSite solo, SsaFuncSummary with sink
sites, legacy-JSON default handling for both summary types, and merge
dedup.
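
The merge dedup described above can be sketched roughly as follows. This
is an illustration only, not the code in this commit: the field types,
the `u32` stand-in for the capability type, the plain `Vec` in place of
`SmallVec<[SinkSite; 1]>`, and the name `union_sink_sites_sketch` are all
assumptions.

```rust
// Hypothetical sketch; field types and the u32 capability bitmask are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SinkSite {
    pub file_rel: String,
    pub line: u32,
    pub col: u32,
    pub snippet: Option<String>,
    pub cap: u32, // stand-in for the real capability type
}

impl SinkSite {
    /// Dedup key (file_rel, line, col, cap); the snippet is not part of it.
    fn key(&self) -> (&str, u32, u32, u32) {
        (self.file_rel.as_str(), self.line, self.col, self.cap)
    }
}

/// Union `incoming` into `existing`, skipping sites whose key is already present.
pub fn union_sink_sites_sketch(existing: &mut Vec<SinkSite>, incoming: &[SinkSite]) {
    for site in incoming {
        if !existing.iter().any(|s| s.key() == site.key()) {
            existing.push(site.clone());
        }
    }
}
```

Keying on the four-field tuple rather than on whole-struct equality means
two rows that differ only in snippet text still merge into one site.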

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(taint): phase 2/5 thread SinkSite into SsaTaintEvent and Finding

Plumb Phase 1's SinkSite through the event pipeline into Findings, with
no output change yet.  SsaTaintEvent gains `primary_sink_site:
Option<SinkSite>`; when the main or callback sink-emission path has
non-empty `param_to_sink_sites`, filter to sites whose
`(line != 0) && (cap ∩ sink_caps != ∅)` and emit one event per
distinct site; the multi-primary collapse keeps each downstream
Finding single-primary.

Resolution: ResolvedSummary and SinkInfo gain mirror
`param_to_sink_sites` fields, populated from `SsaFuncSummary.param_to_sink`
(SSA + callback paths) and `FuncSummary.param_to_sink` (global paths).
Label, local-summary, and interop resolution paths leave the field
empty — they only ever had cap-level info to begin with.

Finding: new `primary_location: Option<SinkLocation>` with
`file_rel/line/col`.  `ssa_events_to_findings` maps
`event.primary_sink_site` → `Finding.primary_location`, filtering
cap-only sites (`line == 0`) to `None` so the (0,0) sentinel never
leaks to formatters.  Dedup key extended with the primary location
so multi-site events aren't collapsed back together.

Invariants (debug_assert!):
* every SinkSite reaching emission has `line != 0 && cap ∩ sink_caps
  != ∅` — enforced by the pick_primary_sink_sites* filters;
* every populated Finding.primary_location has `line != 0` AND
  non-empty `file_rel` — the cap-only → None translation upstream
  guarantees this.
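
A minimal sketch of the filter and the cap-only → None translation that
these invariants rely on. It assumes a `u32` bitmask for capabilities;
the `_sketch` function names are hypothetical and the real
pick_primary_sink_sites* signatures may differ.

```rust
// Hypothetical sketch; types and names are assumptions, not the real API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SinkSite {
    pub file_rel: String,
    pub line: u32,
    pub col: u32,
    pub cap: u32, // capability bitmask stand-in
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SinkLocation {
    pub file_rel: String,
    pub line: u32,
    pub col: u32,
}

/// Keep only sites with a real line whose caps intersect `sink_caps`.
pub fn pick_primary_sites_sketch(sites: &[SinkSite], sink_caps: u32) -> Vec<SinkSite> {
    sites
        .iter()
        .filter(|s| s.line != 0 && (s.cap & sink_caps) != 0)
        .cloned()
        .collect()
}

/// Map an event's site to a finding location; cap-only sites (line == 0) become None.
pub fn to_primary_location_sketch(site: Option<&SinkSite>) -> Option<SinkLocation> {
    site.filter(|s| s.line != 0).map(|s| SinkLocation {
        file_rel: s.file_rel.clone(),
        line: s.line,
        col: s.col,
    })
}
```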

Deliberately independent of `uses_summary`: that flag tracks whether
the *taint chain* used a summary, whereas primary attribution
requires only that the *sink* itself was summary-resolved.  A local
source reaching a cross-file sink produces `uses_summary=false`
alongside a populated primary_location — documented on
Finding.primary_location, covered by
`cross_file_sink_finding_carries_primary_location`.

build_taint_diag, SARIF/JSON/explanation formatters, and the
benchmark scorer remain untouched: finding.line still comes from
`cfg_graph[finding.sink]`, so cmdi_indirect.rs still reports line 10
and the benchmark's rs-cmdi-003 row still shows FN in the LOC column.

Tests: `cross_file_sink_finding_carries_primary_location` (proves
plumbing via a synthetic FuncSummary carrying a SinkSite at 42:5) and
`cross_file_sink_cap_only_site_leaves_primary_location_none`
(regression guard against cap-only sites surfacing).  All 1566 lib
tests + integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(output): phase 3/5 consume primary sink location in diag + SARIF

When a finding's primary_location (populated in phase 2 from a callee
summary's SinkSite) names the dangerous instruction inside a callee
body, attribute the diagnostic line to that location instead of the
caller's call site. The call site is demoted to a Call step in
flow_steps, and a synthetic Sink step at the primary location is
appended so analysts still see the full trace.

Changes:
- Add scan_root parameter to build_taint_diag so file_rel can be
  resolved back to an absolute path via a shared resolve_file_rel
  helper. Empty file_rel (single-file scans where namespace == "")
  resolves to the file under analysis.
- Extend SinkLocation with snippet, carried from the upstream
  SinkSite so the formatter needs no second file read.
- Relax the ssa_events_to_findings debug_assert to allow empty
  file_rel, which is valid when scan root equals the file itself.
- SARIF: emit data-flow as codeFlows[0].threadFlows[0].locations[];
  locations[0] already reflects the primary sink position via the
  updated diag line/col.

Acceptance: scan on tests/benchmark/corpus/rust/cmdi/cmdi_indirect.rs
now reports line 5 (Command::new) as the primary sink, with the call
site at line 10 visible in flow_steps.

Two expect.json fixtures updated (must_match line_range widened):
- javascript/taint/context_sensitive_call: 12-14 -> 7-14 (line 8 is
  the real sink inside run()).
- rust/cfg/closure_async: 10-10 -> 10-11 (line 11 is Command::new
  inside the closure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(bench): phase 4/5 validate primary sink attribution across corpus

Extend the benchmark scorer and ground truth to lock in phase 3's
primary-location behavior, and add fixtures that exercise the new
capability end-to-end.

Scorer (tests/benchmark_test.rs):
- Add optional `expected_call_site_lines: Option<Vec<[usize; 2]>>` on
  Case. When present, score_location_level additionally requires at
  least one flow_step in the finding's evidence trace to fall within
  ±2 of the call-site range. When absent, the check is skipped —
  fully forward-compatible with existing fixtures.
- Retain ±2 tolerance on expected_sink_lines (compared against the
  now-primary Diag.line post-phase-3).
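
The location check described above might look roughly like this. It is a
sketch under assumptions, not the scorer's actual code: the function
names and the flat `flow_step_lines` slice are hypothetical, and only
the ±2 tolerance and the optional call-site ranges come from this
message.

```rust
// Hypothetical sketch of the location-level check.
const TOLERANCE: usize = 2;

/// True when `line` falls inside `range` widened by TOLERANCE on both ends.
fn within_tolerance(line: usize, range: [usize; 2]) -> bool {
    let lo = range[0].saturating_sub(TOLERANCE);
    let hi = range[1] + TOLERANCE;
    (lo..=hi).contains(&line)
}

/// The sink line must match; the call-site check runs only when ranges are given.
pub fn location_matches_sketch(
    diag_line: usize,
    flow_step_lines: &[usize],
    expected_sink_lines: &[[usize; 2]],
    expected_call_site_lines: Option<&[[usize; 2]]>,
) -> bool {
    let sink_ok = expected_sink_lines
        .iter()
        .any(|r| within_tolerance(diag_line, *r));
    let call_ok = match expected_call_site_lines {
        None => true, // absent field: check skipped
        Some(ranges) => flow_step_lines
            .iter()
            .any(|l| ranges.iter().any(|r| within_tolerance(*l, *r))),
    };
    sink_ok && call_ok
}
```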

Ground truth edits:
- rs-cmdi-cross-001: expected_sink_lines [8,8] -> [9,9]. Line 8 is the
  transform::wrap call site (a cross-file propagator, not a sink);
  line 9 is Command::new, the real sink. The ±2 tolerance happened to
  mask this stale attribution but it was semantically wrong — phase 4
  is the right time to correct it. Also adds expected_call_site_lines
  [8,8] so the new field is exercised on an existing cross-file case.
- rs-cmdi-003: adds expected_call_site_lines [10,10] (run_cmd call).
  This fixture's sink (Command::new inside run_cmd at line 5) was the
  motivating case for phases 1-3; adding the call-site assertion
  guards against regression to caller-line attribution.

New fixtures:
- rust/cmdi/cmdi_indirect_multisink.rs (rs-cmdi-009): helper run_both
  takes two tainted params and invokes two Command sinks on
  consecutive lines. Locks in that primary line lands inside the
  helper (lines 5-6), not at the caller (line 12). Notes document
  that SinkSite is currently one-per-callee so both findings today
  collapse onto the first sink; expected_sink_lines=[5,6] and
  expected_call_site_lines=[12,12] stay valid either way.
- python/cmdi/cross_indirect_sink/{app.py,helper.py} (py-cmdi-cross-
  004): sink os.system lives in helper.py (cross-file), caller in
  app.py reads env source and calls run_cmd. Verifies phase 3's
  cross-file primary attribution: Diag.path = helper.py, Diag.line =
  5, with app.py:7 recorded in flow_steps as a Call step.

Acceptance:
- `cargo test --test benchmark_test -- --ignored --nocapture` passes.
- rs-cmdi-003 is TP/TP/TP (the target flip FN->TP at LOC). All
  pre-existing TP/TP/TP fixtures remain TP/TP/TP; 2 new fixtures are
  TP/TP/TP.
- Aggregate rule-level: TP=158 FP=10 FN=1 TN=97, P=0.940 R=0.994
  F1=0.966 on the 266-case corpus (was TP=156 FP=10 FN=1 TN=97 on
  264 pre-phase-4, delta is the +2 new cases both resolving TP).
- Full `cargo test` green (1566 lib tests + all integration tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(taint): phase 5/5 lock Finding.primary_location contract via regression test

Add a regression test in src/taint/ssa_transfer.rs that wires up a synthetic
SsaFuncSummary with a SinkSite at other.rs:42:10 and drives the three
emission stages (pick_primary_sink_sites → emit_ssa_taint_events →
ssa_events_to_findings) against a minimal caller SSA body.  Asserts the
resulting Finding.primary_location is exactly that triple.

The existing integration tests in src/taint/tests.rs cover the coarse
FuncSummary path end-to-end through analyse_file.  This test locks in the
lower-level SSA-side plumbing so a future refactor that silently drops the
site between pick → emit → findings fails here rather than only at the
benchmark layer.

Also refreshes tests/benchmark/results/latest.json (timestamp only; rs-cmdi-003
remains TP/TP/TP and the aggregate P/R/F1 are unchanged from phase 4).

Closes the primary sink-location attribution feature (phases 1-5/5):
* Phase 1: SinkSite data model on summaries.
* Phase 2: SinkSite threaded into SsaTaintEvent and Finding.
* Phase 3: diag + SARIF consume primary_location.
* Phase 4: benchmark validates expected_call_site_lines across corpus.
* Phase 5: regression test locks the event→finding contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: clean up formatting and improve readability in multiple files

* refactor: simplify type definition for deduplication key in findings

* test(harness): add must_not_match expectation for FP regression guards

Extends ExpectedFinding with must_not_match field that asserts a
diagnostic must NOT fire — presence is a hard failure. Non-consuming
scan so it coexists with must_match entries on the same rule_id.
Adds forbidden_violations accumulator and updates summary line.
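
A rough sketch of the non-consuming scan, for illustration only. The
struct shapes, the `line_range` field, and the assumption that must_match
entries consume the diagnostic they match are not taken from the harness
code.

```rust
// Hypothetical sketch; shapes and consuming semantics are assumptions.
#[derive(Debug, Clone)]
pub struct ExpectedFinding {
    pub rule_id: String,
    pub line_range: [usize; 2],
    pub must_not_match: bool, // false is treated as a must_match entry here
}

#[derive(Debug, Clone)]
pub struct Diag {
    pub rule_id: String,
    pub line: usize,
}

/// Returns (missing must_match count, forbidden_violations count).
pub fn check_expectations_sketch(expected: &[ExpectedFinding], diags: &[Diag]) -> (usize, usize) {
    let mut consumed = vec![false; diags.len()];
    let mut missing = 0;
    let mut forbidden_violations = 0;
    for exp in expected {
        let hits = |d: &Diag| {
            d.rule_id == exp.rule_id
                && (exp.line_range[0]..=exp.line_range[1]).contains(&d.line)
        };
        if exp.must_not_match {
            // Non-consuming: looks at every diag and marks nothing as used.
            if diags.iter().any(|d| hits(d)) {
                forbidden_violations += 1;
            }
        } else {
            // Consuming: each diag can satisfy at most one must_match entry.
            let found = (0..diags.len()).find(|&i| !consumed[i] && hits(&diags[i]));
            match found {
                Some(i) => consumed[i] = true,
                None => missing += 1,
            }
        }
    }
    (missing, forbidden_violations)
}
```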

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(regression): update expectations to ensure must_not_match for various taint and resource leak rules

* feat: implement auto-seeding for JS/TS handler parameters to enhance taint tracking

* feat: update switch statement handling to improve control flow analysis

* feat: implement promisify alias handling for JS/TS to enhance taint tracking

* feat: enhance taint tracking by refining expectation handling and adding mode filtering

* feat: refine SQL handling in stream processing and enhance auto-seeding for handler parameters

* feat: update taint tracking rules to enforce full mode matching and improve flow analysis

* feat: enhance Ruby subshell handling to improve taint tracking and flow analysis

* feat: update xss_response expectations to refine taint flow analysis and enhance regression guarding

* feat: refine framework detection and update expectation handling for Echo and Sinatra

* feat: implement max_count for taint tracking expectations and deduplicate findings

* feat: add strict_unexpected handling for taint-unsanitised-flow in expectation files

* feat: enhance deduplication of taint-unsanitised-flow findings by collapsing based on line and severity

* feat: add strict_unexpected handling for taint-unsanitised-flow in multiple expectation files

* feat: add structural invariant checks for SSA bodies

* feat: ensure deterministic phi emission order using BTreeSet

* feat: enhance handling of terminators to ensure authoritative flow through successor edges

* feat: enhance Goto terminator handling to ensure all successors are marked executable

* feat: refactor code for improved readability and organization

* feat: simplify predicate checks and enhance readability in SSA handling

* feat: implement per-file parse timeout and enhance file size handling

* feat: migrate analysis engine toggles from environment variables to configuration file

* feat: remove unnecessary whitespace in hostile_input_tests.rs

* feat: update dependencies and enhance documentation on language maturity

* feat: enhance security headers and improve request body limits

* feat: implement sink capability bits for deduplication and enhance evidence tagging

* feat: implement dynamic activation handling for gated sinks and enhance validation logic

* feat: enhance configuration documentation and clarify inline analysis cache behavior

* feat: implement panic recovery during analysis to continue scans past errors

* feat: add expectations configuration for taint analysis and performance metrics

* feat: enhance error handling and logging during file reading and mutex locking

* feat: add cross-file body loading tests and plumbing for CF-1 phase

* feat: implement cross-file k=1 context-sensitive inline taint analysis with new tests and fixtures

* feat: implement indexed-scan parity in cross-file inline analysis with new dropdown and copy functionality

* feat: enhance classification span handling in CFG and AST for improved source attribution

* feat: add new Express routes for handling user input and telemetry data

* feat: implement ternary expression handling in CFG with diamond structure for JS/TS

* feat: implement Phase CF-3 abstract-domain transfer channels in summaries

* feat: add support for string-prefix transfer in cross-file calls and update tests

* docs: reduce RESULTS.md doc size

* feat: implement Phase CF-4 per-return-path summary decomposition with tests

* feat: update parameter handling in pass1 and refactor SsaFuncSummary initialization

* feat: implement Phase CF-5 for cross-file SCC joint fixed-point convergence with new flags and tests

* feat: implement Phase CF-6 with parameter-granularity points-to summaries and associated tests

* refactor: update comments and documentation for clarity and consistency

* style: format code for consistency and readability

* refactor: simplify verdict handling and improve edge checking logic

* refactor: optimize path and identifier collection by avoiding unnecessary cloning

* chore: update Cargo.toml for Rust version 1.85 and add ignored files; modify CHANGELOG and README for clarity on state analysis defaults

* refactor: update documentation and improve clarity in configuration files

* feat: add JS/TS pass-2 convergence tests and expectations configuration

* feat: add Phase 5 regression tests for inline cache origin attribution and update related logic

* feat: implement Phase 7 deduplication and alternative path linking for taint findings

* feat: implement structural DFS index for anonymous functions and update naming conventions

* feat: add Phase 8 regression tests for container-element taint in JS and Python

* feat: add engine-depth profiles and explain-engine option for CLI

* feat: update expectations and add new README fixtures for multi-file scan regression

* feat: implement Phase 11 callback-alias and factory patterns with regression tests

* feat: implement Terminator::Switch for multi-way dispatch and add regression tests

* feat: add real-CVE benchmark fixtures for CVE-2023-48022, CVE-2019-14939, and CVE-2023-26159 with corresponding patched variants

* refactor: extract cfg and ssa_transfer to submodules

* refactor: cargo fmt

* refactor: remove unnecessary blank line in cfg_tests.rs

* refactor: remove unnecessary planning file

* chore: update Rust version to 1.88 and bump dependencies in Cargo files

* feat: enhance triage UI with new layout and controls, update README for clarity

* chore: remove outdated section from README for version 0.5.0

* docs: improve clarity and consistency in README content

* chore: add "GPL-3.0-or-later" to license options in about.toml

* chore: update license handling in about.toml and check-licenses.mjs

* style: format code for improved readability in TriagePage component

* chore: enhance license handling and improve body_id scoping in seed lookup

* feat: introduce owner and parent body IDs for enhanced seed scoping

* feat: implement direction-aware engine provenance with new CLI flag for strict CI gating

* feat: add Undef SSA operation for improved control-flow handling

* style: improve code formatting for consistency and readability in multiple files

* feat: add 16-function chain SCC across multiple files for enhanced analysis

* style: simplify code formatting for improved readability in multiple files

* fix: update CapHitReason default implementation and improve README clarity

* docs: enhance README with detailed explanations of taint analysis and limitations

* docs: refine README for clarity and consistency in taint analysis section

* style: improve code formatting for better readability in NewScanModal and scans

* fix: update cargo-about command to use --offline for deterministic license generation

* ci: add step to prime cargo registry cache for deterministic license generation

* feat: add support for non-sink collections in authorization analysis

* feat: enhance authorization checks with row-level ownership equality and binding tracking

* feat: implement self-scoped user handling and enhance ownership checks

* refactor: simplify assertions and formatting in authorization analysis tests

* fix: normalize line endings in THIRDPARTY-LICENSES.html generation and update README with AI disclosure

* docs: update AI disclosure section for clarity and conciseness

* feat: add AI Contribution Policy and update contributing guidelines for AI assistance disclosure

* feat: enhance authorization analysis with SSA-derived variable type classification

* feat: implement auth_finding_to_diag function for enhanced security diagnostics

* feat: add args_value_refs to CallSite struct for enhanced argument tracking

* feat: add direction-aware engine provenance with LossDirection classification and new CLI flag

* feat: simplify strip_cap_from_call_args call by removing unnecessary line breaks

* feat: enhance error message handling in cli_validation_tests for better Windows compatibility

* feat: optimize release profile settings in Cargo.toml and update CodeQL configuration

* feat: enhance release build process with SBOM generation and SLSA provenance

* feat: update actions/checkout and actions/setup-node to v6, enhance CLI options, and improve auth-check summaries

* feat: introduce PathFact handling for path safety checks and rejection logic

* feat: update benchmark data and enhance path sanitization logic with new safety checks

* feat: document AI assistance in frontend UI development and human review process

* feat: add return path facts for enhanced path safety checks and update documentation

* chore: update release date for version 0.5.0 in CHANGELOG.md

* chore: clean up ci.yml by removing outdated comments and clarifying steps

* feat: implement cross-language path sanitizers and validators for enhanced security

* feat: enhance SSA value usage tracking by including block terminators and improve path safety checks

* feat: enhance switch statement handling by adding per-case path constraints and support for exclusive cases

* refactor: simplify conditional formatting and improve code readability in executor and lower modules

* feat: add vulnerable examples for various languages demonstrating authentication and sanitization issues

* feat: enhance actor context recognition for self-actor identifiers and add support for global non-sink receivers

* feat: add transform classifiers for Java, Go, and Ruby with corresponding tests

* refactor: clarify comments on reassign-to-constant idiom and sink behavior in guards.rs

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 17:59:11 -04:00
Eli Peter
c4ce08b452
fix: Exclude 'docs/' directory from package inclusion in Cargo.toml (#34) 2026-02-25 21:29:26 -05:00
Eli Peter
1bbe4b1cfb
Phase 1 (#33)
* chore: Exclude CLAUDE.md from Cargo.toml

* feat: add callgraph module and integrate into main analysis flow

* feat: enhance CLI with new severity filtering and analysis modes

* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling

* feat: implement state-model dataflow analysis for resource lifecycle and auth state

* feat: enhance diagnostic output formatting and add evidence structure

* feat: implement attack surface ranking for diagnostics with scoring and sorting

* feat: add comprehensive documentation for installation, usage, and rules reference

* feat: add multiple language support for command execution and evaluation endpoints

* feat: implement inline suppression for findings using `nyx:ignore` comments

* feat: add confidence levels to AST patterns and update output structure

* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets

* feat: bump version to 0.4.0 and update changelog with new features and improvements

* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
2026-02-25 21:16:36 -05:00
Eli Peter
19b578c5c4
Feat/configurable sanitizers and js precision (#32)
* chore: Exclude CLAUDE.md from Cargo.toml

* feat: Add configurable analysis rules and CLI commands for custom sanitizers and terminators

* feat: Enhance resource management and analysis efficiency

- Implemented parallel summary merging in `scan_filesystem` using rayon for improved performance.
- Introduced `GlobalSummaries::merge()` for efficient merging of summaries.
- Optimized file reading and hashing to eliminate redundant I/O operations.
- Added `should_scan_with_hash()` and `upsert_file_with_hash()` methods to streamline file processing.
- Enhanced taint analysis with in-place mutations to reduce memory allocations.
- Updated resource acquisition patterns to exclude false positives for `freopen` and wrapper functions.

* feat: Implement severity downgrade for findings in non-production paths and add source kind inference

* feat: Update versioning information in SECURITY.md for new stable line

* feat: Update categories in Cargo.toml to include parser-implementations and text-processing

* feat: Update dependencies in Cargo.lock for improved compatibility and performance

* feat: Update dependencies in Cargo.lock and Cargo.toml for improved compatibility
2026-02-25 04:02:11 -05:00
Eli Peter
f96a89e7c1
Feat/full cfg (#30)
* feat: Enhance control flow analysis with function summaries and taint analysis

* feat: Update taint analysis to utilize function summaries for enhanced tracking

* Refactor `walk.rs` batch processing and override handling:

- Renamed `Batcher` to `BatchSender` for clarity.
- Added `BatchSender::new` constructor for cleaner initialization.
- Simplified batch size management in `BatchSender`.
- Extracted `build_overrides` function for reusable override construction.
- Improved error handling and validation in override building.
- Enhanced performance with directory and file type filtering in `walk`.

* Improve logging and streamline directory walk process:

- Added detailed `tracing` logs for debugging batch flushes, override construction, and walk initialization/completion.
- Optimized and simplified `filter_entry` logic for directory and file type filters.
- Improved metadata checks and max file size enforcement during the scan.

* Refactor and optimize taint tracking, label rules, and directory walk process:

- Replaced `DefaultHasher` with `blake3::Hasher` for improved taint hashing.
- Enhanced sorting and hashing logic in `taint.rs` for consistency and efficiency.
- Removed unused `set_hash` function and redundant imports across files.
- Improved batch sender logic in `walk.rs`, renaming key components for clarity.
- Unified `spawn_senders` and `spawn_file_walker` with thread handling and channel tuple return.
- Expanded label rules with additional matchers for sources, sanitizers, and sinks.
- Deprecated `dump_cfg` and specific logging utilities in `cfg.rs` for code cleanup.

* fix: fixed let chains error in walk.rs

* fix: updated dependencies

* chore: Remove standard error in scan.rs

* feat: Introduce function summaries for enhanced taint and control flow analysis

* feat: Enhance taint analysis with interop support and function summaries

* feat: Add configuration analysis module and enhance matcher rules

* feat: Add arity column to function_summaries and handle schema migration

* fix: fixed clippy &PathBuf warnings

* chore: Update dependencies and versioning in Cargo files

* docs: Update README to enhance clarity and detail on features and analysis modes

* chore: Update CHANGELOG for version 0.2.0 with new features, changes, and fixes

* docs: Update SECURITY.md to clarify version support status

---------

Co-authored-by: elipeter <eli.peter@es.fcm.travel>
2026-02-24 23:44:07 -05:00
Eli Peter
8cbbec7d90
Update README.md to clarify config files (#27) 2025-07-03 17:02:57 +02:00
Eli Peter
3be352abb7
Update README.md (#26) 2025-07-03 15:49:20 +02:00
Eli Peter
6f78f95efb
Create SECURITY.md (#25) 2025-06-28 18:34:22 +02:00
Eli Peter
aedd4a90a1
Potential fix for code scanning alert no. 2: Workflow does not contain permissions (#24)
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-06-28 18:18:10 +02:00
elipeter
cc33a9cfd9 chore: Add GPL-3.0 to license list in about.toml 2025-06-28 17:41:47 +02:00
Eli Peter
3c21efba75
Added experimental control flow analysis and syntax classification for Rust (#22)
* Introduce control flow graph (CFG) support:

- Added `cfg.rs` with CFG generation and analysis utilities.
- Integrated `petgraph` library for graph-based computations.
- Updated `ast.rs` to utilize CFG for function analysis.
- Modified `Cargo.toml` and `Cargo.lock` to include new dependencies.
- Improved static analysis with taint tracking through CFG paths.

* feat: enhance control flow analysis with taint tracking and node labeling

* feat: improve control flow graph with enhanced node handling and new tests

* Remove unnecessary reference marker in `byte_offset_to_point` comment.

* Refactor `ast.rs` for performance and clarity; enhance `cfg.rs` with recursive CFG generation and improved classification logic for AST analysis.

* Refactor CFG and taint tracking logic:

- Enhanced `cfg.rs` with inline helper function `text_of` for cleaner UTF-8 handling in AST nodes.
- Expanded `labels.rs` rules with detailed `Sources`, `Sanitizers`, and `Sinks` for improved classification.
- Refined `push_node` to handle method call expressions with object-function pairing.
- Simplified code handling in trivia skipping and debug-only logic.

* Enhance `cfg.rs` with `first_call_ident` helper and improve identifier extraction logic in `push_node`.

* Add targeted CFG taint-tracking tests to enhance analysis coverage.

* Enhance CFG generation with loop expression handling and improve taint tracking logic. Add new sanitization example in `examples/sanitize/example.rs`.

* Update README with installation instructions for Cargo and GitHub releases.

* Expand taint-tracking with precise `def-use` computation and enhance `labels.rs` for detailed classification. Extend `examples/sanitize` with realistic scenarios demonstrating new rules.

* Refactor `labels.rs`:

- Removed redundant `LabelRule` entries for cleaner rule definitions.
- Adjusted matching logic to prioritize suffix and prefix matches effectively.

* Add test for taint tracking with multiple sources in `cfg.rs`.

* Add `function_summaries` table and implement summary upsert/load methods. Refactor to handle summary storage and retrieval efficiently, with placeholder clean/drop logic.

* refactor: split `labels.rs` into modular structure with language-specific files

* refactor: clean up SQL table definitions in `database.rs` for better readability

* refactor: simplify CFG structure by removing lifetime parameters and enhancing taint metadata handling

* refactor: update TODO comments in `cfg.rs` to clarify future enhancements for cap labels and function details

* refactor: remove redundant header from README.md for improved clarity

* feat: add PHF-based syntax classifiers and Kind enum for efficient syntax mapping across languages

* feat: introduce analysis modes for enhanced scanner configuration and diagnostics

* feat: define Kind enum for syntax classification in control flow analysis

* feat: bump version to 0.2.0-alpha and update CHANGELOG for new features and fixes

* refactor: clean up imports and formatting in AST and CFG modules for improved readability

* refactor: simplify function signatures and improve code readability in CFG and module files

* fix: correct rayon_thread_stack_size comment to reflect actual value of 8 MiB

* refactor: update string formatting in clean and project modules for consistency

* refactor: fix indentation in clean.rs for improved readability

---------

Co-authored-by: elipeter <eli.peter@es.fcm.travel>
2025-06-28 17:36:14 +02:00
Eli Peter
fd65360818
Update licensing to GPL-3: (#19)
* Update licensing to GPL-3:

- Changed project license from "MIT OR Apache-2.0" to "GPL-3".
- Added LICENSE file with GNU GPL-3 full text.
- Removed MIT and Apache-2.0 license files from the repository.

* docs: Update license badge in README to reflect GPL v3

* Update Cargo.toml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* docs: updated README.md for new license

* docs: update license information to GPL-3.0 in README and Cargo.toml

---------

Co-authored-by: elipeter <eli.peter@es.fcm.travel>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-25 17:23:31 +02:00
elipeter
cdd89dab4a docs: Update badges in README to reflect crates.io links and license information 2025-06-25 03:56:07 +02:00
elipeter
eb5bd2a244 fix: Update author email in Cargo.toml for consistency 2025-06-25 03:50:27 +02:00
Eli Peter
0366f66b42
Fix/updated binary name (#18)
* feat: Add binary configuration for nyx in Cargo.toml

* fix: Set default binary to nyx in Cargo.toml
2025-06-25 03:46:21 +02:00
Eli Peter
423e6bffd1
fix: Clarify changelog entry for Windows zip command issue in release pipeline (#17) 2025-06-25 03:37:39 +02:00
Eli Peter
ef0a6f80bb
Rename CHANGELOG.MD to CHANGELOG.md (#16) 2025-06-25 02:52:06 +02:00
Eli Peter
238ed095a3
Fix/update release pipeline (#15)
* fix: Enhance release packaging for cross-platform compatibility

* fix: Resolve pipeline bug with zip command on Windows

* fix: Clarify changelog entry for Windows zip command issue in release pipeline

* Update CHANGELOG.MD

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-25 02:49:09 +02:00
Eli Peter
e221fdd7d6
ci: Update license generation command to use output flag for consistency (#14) 2025-06-25 02:37:06 +02:00
elipeter
cd8ae3c47e ci: Update license generation command to use output flag for consistency 2025-06-25 02:26:59 +02:00
elipeter
c6c41bf0ce ci: Update license generation command to use handlebars template 2025-06-25 02:17:01 +02:00
Eli Peter
faf70b9eb6
ci: Update license generation format to use handlebars (#13) 2025-06-25 02:13:28 +02:00
Eli Peter
90fa775a48
docs: Add third-party licenses documentation and update build process (#12) 2025-06-25 02:05:15 +02:00
Eli Peter
9c76fd1e9f
Delete THIRDPARTY-LICENSES.html (#11) 2025-06-25 01:54:23 +02:00
Eli Peter
d50684e31b
docs: Add section on advantages of using Nyx in README (#10)
* docs: Add section on advantages of using Nyx in README

* ci: Update branch references from 'main' to 'master' in CI configuration

* docs: Add third-party licenses documentation and update build process

* Update .github/workflows/release-build.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-25 01:42:10 +02:00
elipeter
a614e157b3 ci: Update CI workflow with matrix strategy, security checks, and linting rules adjustments 2025-06-25 00:49:29 +02:00
elipeter
24689be6f7 ci: Add rust-cache action to improve build performance 2025-06-25 00:37:36 +02:00
elipeter
47d4f589af Refactor CI workflow: rename file, update job name, and remove verbose flag from cargo build 2025-06-25 00:33:58 +02:00
elipeter
4872c5acb5 docs: Add initial CHANGELOG with project release history and key updates 2025-06-25 00:31:30 +02:00
elipeter
0efc26d28d chore: Add dual licensing information and contribution guidelines 2025-06-25 00:24:05 +02:00
Eli Peter
72ca7fa45d
test: Add unit tests for index building and scanning functionality (#9) 2025-06-24 23:57:27 +02:00
Eli Peter
46c4732f6e
test: Add unit tests for file handling and configuration merging (#7)
* test: Add unit tests for file handling and configuration merging

* test: Update IO error conversion test to use new error creation method
2025-06-24 23:38:32 +02:00
Eli Peter
8497800b13
test: Add unit tests for config merging and project name sanitization (#6)
* test: Add unit tests for config merging and project name sanitization

* Update src/utils/project.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* test: Update assertion for follow_symlinks in scanner configuration

* test: Fix typo in test function name for project info retrieval

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-24 23:18:01 +02:00
Eli Peter
a0c9d0f9d4
Merge pull request #2 from ecpeter23/bug/fix-max-results
fix: Limit diagnostics output on non indexed scan to a maximum number…
2025-06-24 22:51:30 +02:00
elipeter
a75b6cfabe fix: Remove unnecessary whitespace in diagnostics output handling 2025-06-24 22:51:16 +02:00
elipeter
ebe78b270c fix: Limit diagnostics output on non indexed scan to a maximum number of results based on configuration 2025-06-24 22:44:57 +02:00
4560 changed files with 614686 additions and 1458 deletions

19
.config/nextest.toml Normal file

@ -0,0 +1,19 @@
# nextest configuration
#
# See https://nexte.st/docs/configuration/ for the full schema.
# ── Test groups ──────────────────────────────────────────────────────────────
#
# `hostile-input-timing` serialises the two timing-bounded
# `hostile_input_tests` cases that pass under nextest in isolation but fail
# under the full-suite parallel run on darwin (resource contention from the
# other ~4000 tests pushes them past their internal budget). Pinning them to
# a single thread within their own group keeps their wall-clock predictable
# without slowing the rest of the suite.
[test-groups]
hostile-input-timing = { max-threads = 1 }
[[profile.default.overrides]]
filter = 'binary(hostile_input_tests) and (test(very_long_single_line_parses) or test(many_small_functions_do_not_explode))'
test-group = 'hostile-input-timing'

1
.github/CODEOWNERS vendored Normal file

@ -0,0 +1 @@
* @elicpeter

1
.github/FUNDING.yml vendored Normal file

@ -0,0 +1 @@
github: elicpeter

75
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file

@ -0,0 +1,75 @@
name: Bug report
description: Report a crash, incorrect output, or other broken behavior in Nyx.
labels: ["bug"]
body:
- type: markdown
attributes:
value: |
Thanks for taking the time to file a bug. **Please do not file security vulnerabilities here** — use the private advisory link in SECURITY.md.
For false positives or missed detections (rule quality), this is the right place — those are quality bugs.
- type: textarea
id: summary
attributes:
label: Summary
description: One or two sentences describing what's wrong.
validations:
required: true
- type: textarea
id: repro
attributes:
label: Reproduction
description: Minimal source snippet or repo + the exact `nyx` command you ran. The smaller, the better — ideally a single file.
render: shell
validations:
required: true
- type: textarea
id: expected
attributes:
label: Expected behavior
validations:
required: true
- type: textarea
id: actual
attributes:
label: Actual behavior
description: Include the finding (or lack of finding), error output, or stack trace.
validations:
required: true
- type: input
id: version
attributes:
label: Nyx version
description: Output of `nyx --version`.
placeholder: "nyx 0.7.0"
validations:
required: true
- type: input
id: os
attributes:
label: OS / arch
placeholder: "macOS 14.5 arm64"
validations:
required: true
- type: dropdown
id: language
attributes:
label: Target language (if applicable)
options:
- "n/a"
- JavaScript / TypeScript
- Python
- Java
- Go
- Ruby
- PHP
- Rust
- C / C++
- Other
validations:
required: false
- type: textarea
id: extra
attributes:
label: Additional context
description: Logs, screenshots, related issues — anything else that helps.

8
.github/ISSUE_TEMPLATE/config.yml vendored Normal file

@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
  - name: Security vulnerability
    url: https://github.com/elicpeter/nyx/security/advisories/new
    about: Do NOT file public issues for security bugs. Use private disclosure (see SECURITY.md).
  - name: Question or discussion
    url: https://github.com/elicpeter/nyx/discussions
    about: Open-ended questions, ideas, or help using Nyx belong in Discussions.


@ -0,0 +1,27 @@
name: Feature request
description: Suggest a new capability, rule, language, or UX improvement.
labels: ["enhancement"]
body:
  - type: textarea
    id: problem
    attributes:
      label: Problem
      description: What are you trying to do that Nyx can't do today? Concrete scenarios beat abstract wishes.
    validations:
      required: true
  - type: textarea
    id: proposal
    attributes:
      label: Proposed solution
      description: How should it work? Sketches, example commands, or example findings are welcome.
    validations:
      required: true
  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives considered
      description: Other approaches you've thought about, and why they don't fit.
  - type: textarea
    id: extra
    attributes:
      label: Additional context

20
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file

@ -0,0 +1,20 @@
## Summary
<!-- What does this PR change, and why? Keep it short. The diff already shows the "what". -->
## Related issues
<!-- "Closes #123", "Refs #456". Delete if none. -->
## Checklist
- [ ] `cargo test --bin nyx` passes
- [ ] `cargo clippy --all -- -D warnings` is clean
- [ ] `cargo fmt -- --check` passes
- [ ] User-visible changes are noted in `CHANGELOG.md` under `## [Unreleased]`
- [ ] Docs updated if behavior, flags, or config changed (`docs/`, `README.md`, `CONTRIBUTING.md`)
- [ ] New rules / language support include fixtures and integration tests
## Notes for reviewers
<!-- Anything you want a reviewer to look at first, tradeoffs, follow-ups. Delete if none. -->

6
.github/codeql/codeql-config.yml vendored Normal file

@ -0,0 +1,6 @@
name: "CodeQL Config"
paths-ignore:
  - examples
  - tests
  - benches

33
.github/dependabot.yml vendored Normal file

@ -0,0 +1,33 @@
version: 2
updates:
  - package-ecosystem: cargo
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 10
    groups:
      cargo-minor-and-patch:
        update-types:
          - minor
          - patch
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
    groups:
      actions-minor-and-patch:
        update-types:
          - minor
          - patch
  - package-ecosystem: npm
    directory: "/frontend"
    schedule:
      interval: weekly
    open-pull-requests-limit: 10
    groups:
      frontend-minor-and-patch:
        update-types:
          - minor
          - patch

404
.github/workflows/ci.yml vendored Normal file

@ -0,0 +1,404 @@
name: CI
permissions:
contents: read
on:
push:
branches: ["master"]
pull_request:
branches: ["master"]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
frontend:
name: frontend
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Install frontend dependencies
working-directory: frontend
run: npm ci
- name: Frontend license check
working-directory: frontend
run: npm run license:check
- name: Frontend format check
working-directory: frontend
run: npm run format:check
- name: Frontend lint
working-directory: frontend
run: npm run lint
- name: Frontend type check
working-directory: frontend
run: npm run typecheck
- name: Frontend tests
working-directory: frontend
run: npm test
- name: Frontend build
working-directory: frontend
run: npm run build
rustfmt:
name: rustfmt
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
components: rustfmt
cache: true
- name: Format check
run: cargo fmt --all -- --check
clippy-stable:
name: clippy-stable
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
components: clippy
cache: true
- name: Lint (Clippy)
run: cargo clippy --all-targets --all-features -- -D warnings
cargo-deny:
name: cargo-deny
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@cargo-deny
- name: License & advisory checks
run: cargo deny check advisories licenses bans sources
unused-deps:
name: unused-deps
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: bnjbvr/cargo-machete@v0.9.2
third-party-licenses:
name: third-party-licenses
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@v2
with:
tool: cargo-about@0.7.1
- name: Prime cargo registry cache
run: cargo fetch --locked
- name: Regenerate license attribution
run: cargo about generate --offline about.hbs | tr -d '\r' > /tmp/THIRDPARTY-LICENSES.html
- name: Diff against committed file
run: diff -u --strip-trailing-cr THIRDPARTY-LICENSES.html /tmp/THIRDPARTY-LICENSES.html
docs-fresh:
name: docs-fresh
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- name: Regenerate rule reference
run: cargo run --features docgen --bin nyx-docgen
- name: Verify docs/rules.md is fresh
run: |
if ! git diff --exit-code docs/rules.md; then
echo "::error::docs/rules.md is stale. Run 'cargo run --features docgen --bin nyx-docgen' and commit the result."
exit 1
fi
rustdoc:
name: rustdoc
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- name: Check rustdoc links
env:
RUSTDOCFLAGS: "-D warnings"
run: cargo doc --workspace --no-deps --all-features
rust-beta-build:
name: rust-beta-build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: beta
cache: true
- name: Beta compile compatibility check
run: cargo check --all-features --tests
msrv:
name: msrv
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: "1.88"
cache: true
- name: Compile check at MSRV
run: cargo check --all-features --tests
rust-stable-test-linux-without-docker:
name: rust-stable-test / linux-without-docker
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Rust tests (stable, no docker)
run: cargo nextest run --no-fail-fast --all-features
rust-stable-test-linux-with-docker:
name: rust-stable-test / linux-with-docker
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Pull language images for sandbox tests
run: |
docker pull python:3-slim
docker pull node:20-slim
docker pull eclipse-temurin:21-jre-jammy
docker pull php:8-cli
- name: Smoke-test interpreter availability
run: |
docker run --rm python:3-slim python3 --version
docker run --rm node:20-slim node --version
docker run --rm eclipse-temurin:21-jre-jammy java -version
docker run --rm php:8-cli php --version
- name: Rust tests with docker (sandbox escape gate)
run: cargo nextest run --no-fail-fast --all-features --test dynamic_sandbox_escape --test dynamic_parity
escape-positive-control:
name: escape-positive-control
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Pull python image
run: docker pull python:3-slim
- name: Escape positive control (gate wiring check)
run: |
cargo nextest run --no-fail-fast --all-features --test dynamic_sandbox_escape \
-- --include-ignored positive_control_cap_sys_admin
cross-platform-smoke:
name: cross-platform-smoke
strategy:
fail-fast: false
matrix:
os: [macos-latest, windows-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Build
run: cargo build --release --all-features
- name: Smoke tests
run: cargo nextest run --no-fail-fast --all-features --test integration_tests --test pattern_tests --test cli_validation_tests
rust-beta-test:
name: rust-beta-test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: beta
cache: true
- uses: taiki-e/install-action@nextest
- name: Rust tests (beta)
run: cargo nextest run --no-fail-fast --all-features
cargo-package:
name: cargo-package
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
- name: Verify dist embedded in package
run: |
for f in src/server/assets/dist/index.html src/server/assets/dist/app.js src/server/assets/dist/style.css src/server/assets/favicon.svg default-nyx.conf build.rs; do
if ! cargo package --list --allow-dirty | grep -qx "$f"; then
echo "::error::missing from cargo package: $f"
exit 1
fi
done
- name: cargo package (verify build)
run: cargo package --allow-dirty
benchmark-gate:
name: benchmark-gate
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
cache-key: benchmark-gate-release
- uses: taiki-e/install-action@nextest
- name: Build benchmark + perf test binaries
run: cargo nextest run --release --all-features --test benchmark_test --test perf_tests --no-run
- name: Accuracy regression gate (P/R/F1)
run: cargo nextest run --no-fail-fast --release --all-features --test benchmark_test --run-ignored only --no-capture benchmark_evaluation
- name: Performance regression gate
env:
NYX_CI_BENCH: "1"
run: cargo nextest run --no-fail-fast --release --all-features --test perf_tests --no-capture
- name: Upload benchmark results
if: always()
uses: actions/upload-artifact@v7
with:
name: benchmark-results
path: tests/benchmark/results/latest.json
if-no-files-found: warn
corpus-marker-audit:
name: corpus-marker-audit
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: Marker collision audit (§16.3)
run: python3 scripts/corpus_dashboard.py
# Exits non-zero if any oracle marker from one cap appears in another
# cap's payload bytes. This catches cross-cap oracle collisions that
# would cause false-positive confirmed verdicts.
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Corpus unit tests (no_marker_collisions, all_payloads_have_fixture_paths)
run: cargo nextest run --no-fail-fast --lib -p nyx-scanner dynamic::corpus
env:
RUST_LOG: error
- name: Corpus dashboard sync check (Python/Rust payload table parity)
run: python3 scripts/check_corpus_sync.py
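The `third-party-licenses` job in the workflow above regenerates the attribution file and diffs it against the committed copy, stripping carriage returns so that CRLF and LF checkouts compare equal. A minimal sketch of that comparison follows, assuming a `diff` that supports `--strip-trailing-cr` (GNU diff does). The `same_ignoring_cr` helper and the temporary file names are hypothetical and exist only for illustration; they are not part of the workflow.

```shell
set -eu

# Hypothetical helper: succeed when two files differ only in CR characters.
# Like the job, it strips CRs from the generated side with `tr` and lets
# diff ignore trailing CRs on the committed side.
same_ignoring_cr() {
  local committed="$1" generated="$2" normalized status=0
  normalized=$(mktemp)
  tr -d '\r' < "$generated" > "$normalized"
  diff -u --strip-trailing-cr "$committed" "$normalized" > /dev/null || status=$?
  rm -f "$normalized"
  return "$status"
}

# Throwaway inputs standing in for THIRDPARTY-LICENSES.html.
workdir=$(mktemp -d)
printf 'MIT\r\nApache-2.0\r\n' > "$workdir/committed.html"  # CRLF line endings
printf 'MIT\nApache-2.0\n' > "$workdir/generated.html"      # LF line endings
printf 'MIT\nGPL-3.0\n' > "$workdir/changed.html"           # real content change

if same_ignoring_cr "$workdir/committed.html" "$workdir/generated.html"; then
  echo "line-ending-only difference: treated as identical"
fi
if ! same_ignoring_cr "$workdir/committed.html" "$workdir/changed.html"; then
  echo "content difference: the gate would fail"
fi
```

Normalizing both sides keeps the gate sensitive to real attribution changes while ignoring the checkout's line-ending policy.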

45
.github/workflows/codeql.yml vendored Normal file

@ -0,0 +1,45 @@
name: "CodeQL Advanced"
on:
  push:
    branches: ["master"]
  pull_request:
    branches: ["master"]
  schedule:
    - cron: "0 9 * * 2"
jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
    permissions:
      security-events: write
      packages: read
      actions: read
      contents: read
    strategy:
      fail-fast: false
      matrix:
        include:
          - language: actions
            build-mode: none
          - language: javascript-typescript
            build-mode: none
          - language: rust
            build-mode: none
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v4
        with:
          languages: ${{ matrix.language }}
          build-mode: ${{ matrix.build-mode }}
          config-file: ./.github/codeql/codeql-config.yml
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v4
        with:
          category: "/language:${{ matrix.language }}"
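The `runs-on` expression in the workflow above uses the `cond && a || b` idiom to choose a runner per matrix language. A minimal sketch of the same decision in shell, where `runner_for` is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper mirroring the workflow expression
#   (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest'
runner_for() {
  if [ "$1" = "swift" ]; then
    echo "macos-latest"
  else
    echo "ubuntu-latest"
  fi
}

# The matrix lists only actions, javascript-typescript, and rust,
# so every current row resolves to ubuntu-latest.
for language in actions javascript-typescript rust; do
  echo "$language -> $(runner_for "$language")"
done
```

Because the matrix has no `swift` row, the macOS branch is never taken as the workflow stands.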

167
.github/workflows/corpus_promote.yml vendored Normal file

@ -0,0 +1,167 @@
name: Corpus Promote
# Weekly automated promotion-PR template.
#
# Scans fuzz-discovered/ for candidates not yet in src/dynamic/corpus.rs
# and opens a PR proposing them for human review (§16.4 — no auto-merge).
#
# Also runs the marker-collision audit as a hard gate: if any collision is
# found the workflow fails rather than proposing the promotion.
on:
schedule:
# Sundays at 09:00 UTC — offset from the fuzz run (06:00 UTC) so
# discovered candidates are ready before the promotion job runs.
- cron: "0 9 * * 0"
workflow_dispatch:
inputs:
dry_run:
description: "Dry run (print PR body but do not open)"
required: false
default: "false"
permissions:
contents: write
pull-requests: write
concurrency:
group: corpus-promote
cancel-in-progress: true
jobs:
promote:
name: Propose corpus promotions
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
# ── Marker collision audit ──────────────────────────────────────────────
- name: Marker collision audit
run: |
set -euo pipefail
cargo build --features dynamic -p nyx-scanner 2>/dev/null || true
cd fuzz/dynamic_corpus
cargo run -- audit-markers
env:
RUST_LOG: error
# ── Discover candidates ─────────────────────────────────────────────────
- name: Find promotion candidates
id: candidates
run: |
set -euo pipefail
count=0
files=""
if [ -d fuzz-discovered ]; then
while IFS= read -r f; do
# Skip .gitkeep, sidecar JSONs, and files already listed in corpus.rs.
[[ "$f" == *".gitkeep" ]] && continue
[[ "$f" == *".json" ]] && continue
bytes=$(xxd -p "$f" | tr -d '\n')
if ! grep -q "$bytes" src/dynamic/corpus.rs 2>/dev/null; then
count=$((count + 1))
files="$files $f"
fi
done < <(find fuzz-discovered -type f | sort)
fi
echo "count=$count" >> "$GITHUB_OUTPUT"
echo "files=$files" >> "$GITHUB_OUTPUT"
- name: Skip if no new candidates
if: steps.candidates.outputs.count == '0'
run: |
echo "No new candidates found in fuzz-discovered/. Nothing to promote."
# ── Open promotion PR ───────────────────────────────────────────────────
- name: Open promotion PR
if: >
steps.candidates.outputs.count != '0' &&
github.event.inputs.dry_run != 'true'
env:
GH_TOKEN: ${{ github.token }}
CANDIDATE_COUNT: ${{ steps.candidates.outputs.count }}
CANDIDATE_FILES: ${{ steps.candidates.outputs.files }}
run: |
set -euo pipefail
branch="corpus-promote-$(date +%Y%m%d)"
git checkout -b "$branch"
# Stage candidate files into fuzz-discovered (already there).
# The PR body provides the reviewer with everything they need.
# Build PR body into a temp file to avoid shell re-interpolation of
# sidecar JSON content (which may contain backticks or $(...) sequences).
body_file=$(mktemp)
cat > "$body_file" <<'PREAMBLE'
## Corpus Promotion Proposal
This PR was generated automatically by the weekly corpus-promote workflow.
It does **not** auto-merge — a human reviewer must approve each candidate
before it can land in `src/dynamic/corpus.rs` (§16.4).
### Candidates
The following payloads were discovered by the internal mutation fuzzer and
confirmed via `sink_hit && oracle_fired` against instrumented fixtures:
PREAMBLE
for f in $CANDIDATE_FILES; do
sidecar="${f}.json"
printf -- '- `%s`\n' "$f" >> "$body_file"
if [ -f "$sidecar" ]; then
printf ' ```json\n' >> "$body_file"
cat "$sidecar" >> "$body_file"
printf '\n ```\n' >> "$body_file"
fi
done
cat >> "$body_file" <<'CHECKLIST'
### Review checklist
- [ ] Bytes are a genuine attack vector, not a fixture artifact
- [ ] Oracle marker is unique (no collision with other caps)
- [ ] `fixture_paths` updated in `src/dynamic/corpus.rs`
- [ ] `since_corpus_version` set to next version
- [ ] `CORPUS_VERSION` bumped and bump history updated
_Generated by corpus_promote.yml — do not auto-merge._
CHECKLIST
git add fuzz-discovered/ || true
git diff --cached --quiet || git commit -m "chore: add ${CANDIDATE_COUNT} fuzzer-discovered corpus candidates"
git push origin "$branch"
gh pr create \
--title "chore(corpus): promote ${CANDIDATE_COUNT} fuzzer-discovered payload(s)" \
--body "$(cat "$body_file")" \
--base master \
--label "corpus-promotion" || true
rm -f "$body_file"
- name: Dry run summary
if: github.event.inputs.dry_run == 'true'
run: |
echo "Dry run: would promote ${{ steps.candidates.outputs.count }} candidate(s)."
echo "Files: ${{ steps.candidates.outputs.files }}"
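The "Find promotion candidates" step above hex-encodes each file under `fuzz-discovered/` and treats it as new when that hex string does not occur in `src/dynamic/corpus.rs`. The sketch below restates that logic as two functions. It makes three assumptions: `od` stands in for the workflow's `xxd -p` so the sketch runs on hosts without `xxd`, `grep -F` is used for a literal match, and the function names and demo paths are hypothetical.

```shell
set -eu

# Hex-encode a file's bytes as one lowercase string with no separators.
# Assumption: `od` replaces `xxd -p`; both yield the same hex digits.
hex_of() {
  od -An -v -tx1 "$1" | tr -d ' \n'
}

# Print each candidate under directory $1 whose hex bytes are absent from
# corpus file $2. Skips .gitkeep placeholders and .json sidecars, as the
# workflow step does.
new_candidates() {
  local dir="$1" corpus="$2" f
  [ -d "$dir" ] || return 0
  find "$dir" -type f | sort | while IFS= read -r f; do
    case "$f" in
      *.gitkeep|*.json) continue ;;
    esac
    if ! grep -qF -- "$(hex_of "$f")" "$corpus" 2>/dev/null; then
      printf '%s\n' "$f"
    fi
  done
}

# Demo with throwaway paths (stand-ins for fuzz-discovered/ and corpus.rs;
# the corpus file content here is made up).
demo=$(mktemp -d)
mkdir "$demo/found"
printf 'hi' > "$demo/found/known.bin"       # hex 6869, listed in the corpus file
printf 'yo' > "$demo/found/fresh.bin"       # hex 796f, not listed
printf '{}' > "$demo/found/fresh.bin.json"  # sidecar, skipped
: > "$demo/found/.gitkeep"                  # placeholder, skipped
printf 'payload: "6869"\n' > "$demo/corpus.rs"

new_candidates "$demo/found" "$demo/corpus.rs"
```

In the demo only `fresh.bin` is reported, since `known.bin` already appears in the corpus file and the other two entries are skipped by name.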


@ -0,0 +1,30 @@
name: Dependabot auto-merge
on: pull_request
permissions:
  contents: write
  pull-requests: write
jobs:
  auto-merge:
    runs-on: ubuntu-latest
    # Skip fork PRs entirely (the merge would fail anyway, but no need to run).
    if: >-
      github.event.pull_request.user.login == 'dependabot[bot]' &&
      github.event.pull_request.head.repo.full_name == github.repository
    steps:
      - name: Fetch Dependabot metadata
        id: metadata
        uses: dependabot/fetch-metadata@v3
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - name: Enable auto-merge for patch and minor updates
        if: >-
          steps.metadata.outputs.update-type == 'version-update:semver-patch' ||
          steps.metadata.outputs.update-type == 'version-update:semver-minor'
        run: gh pr merge --auto --squash "$PR_URL"
        env:
          PR_URL: ${{ github.event.pull_request.html_url }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
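The second step's `if:` condition above enables auto-merge only for patch and minor updates, so major bumps still wait for a human. A minimal sketch of that filter in shell, with `is_auto_mergeable` as a hypothetical helper name; the three update-type strings are the values the condition compares against plus the major counterpart:

```shell
# Hypothetical helper: succeed only for the update types the workflow
# auto-merges (semver patch and minor).
is_auto_mergeable() {
  case "$1" in
    version-update:semver-patch|version-update:semver-minor) return 0 ;;
    *) return 1 ;;
  esac
}

for update_type in \
  version-update:semver-patch \
  version-update:semver-minor \
  version-update:semver-major
do
  if is_auto_mergeable "$update_type"; then
    echo "$update_type: enable auto-merge"
  else
    echo "$update_type: leave for manual review"
  fi
done
```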

53
.github/workflows/docs.yml vendored Normal file

@ -0,0 +1,53 @@
name: docs
on:
  push:
    branches: [master]
    paths:
      - "docs/**"
      - "book.toml"
      - ".github/workflows/docs.yml"
      - "assets/screenshots/**"
  workflow_dispatch:
permissions:
  contents: read
  pages: write
  id-token: write
concurrency:
  group: pages
  cancel-in-progress: false
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions-rust-lang/setup-rust-toolchain@v1
        with:
          toolchain: stable
          cache: true
      - name: Cache mdbook
        id: cache-mdbook
        uses: actions/cache@v5
        with:
          path: ~/.cargo/bin/mdbook
          key: mdbook-0.5.2-${{ runner.os }}
      - name: Install mdbook
        if: steps.cache-mdbook.outputs.cache-hit != 'true'
        run: cargo install mdbook --version 0.5.2 --locked
      - name: Build
        run: mdbook build
      - name: Upload artifact
        uses: actions/upload-pages-artifact@v5
        with:
          path: book
      - name: Deploy to GitHub Pages
        uses: actions/deploy-pages@v5

146
.github/workflows/dynamic.yml vendored Normal file

@ -0,0 +1,146 @@
# Phase 29 (Track I): dedicated dynamic-verification matrix.
#
# Three rows exercise the dynamic harness pipeline (`cargo nextest run
# --features dynamic`) under the host configurations the Phase 17-28
# tracks documented as supported:
#
# linux-process-only — Ubuntu host, no docker daemon. Forces the
# process backend and exercises the Phase 17
# Linux hardening primitives (chroot, seccomp,
# unshare, no_new_privs). `libc6-dev` is
# installed so the hardening probe + escape
# suite can `cc -static`; without it the
# chroot-leg of the escape suite skips silently
# (Phase 20 follow-up #4 in deferred.md).
#
# linux-with-docker — Ubuntu host with the runner Docker daemon. Exercises
# the docker backend (Phase 19) and the
# differential-confirmation parity tests.
#
# macos — macOS-latest, no docker. Exercises the
# Phase-18 `sandbox-exec` primitives plus the
# process backend on Darwin. Track-I acceptance
# literal: "cargo nextest run --features dynamic
# is green on macOS without docker."
name: dynamic
permissions:
contents: read
on:
push:
branches: ["master"]
pull_request:
branches: ["master"]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
linux-process-only:
name: dynamic / linux-process-only
runs-on: ubuntu-latest
env:
# Force the process backend even when callers default to Auto so
# docker-unavailable paths cannot accidentally hide a regression.
NYX_SANDBOX_BACKEND: process
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# Phase 17 / Phase 20 follow-up: the hardening probe + escape
# suite chroot leg need static glibc. Without these packages the
# `cc -static probe.c` step in tests/sandbox_hardening_linux.rs +
# tests/sandbox_escape_suite.rs falls back to dynamic linking and
# the chroot leg silently skips.
- name: Install fixture prerequisites (static libc)
run: |
sudo apt-get update -y
sudo apt-get install -y --no-install-recommends libc6-dev libc-dev-bin
- name: Smoke-test interpreter availability
run: |
python3 --version
node --version || sudo apt-get install -y --no-install-recommends nodejs
ruby --version || true
php --version || true
- name: Dynamic suite (process backend only)
run: cargo nextest run --no-fail-fast --features dynamic
linux-with-docker:
name: dynamic / linux-with-docker
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Install fixture prerequisites (static libc)
run: |
sudo apt-get update -y
sudo apt-get install -y --no-install-recommends libc6-dev libc-dev-bin
- name: Pull language images for sandbox tests
run: |
docker pull python:3-slim
docker pull node:20-slim
docker pull eclipse-temurin:21-jre-jammy
docker pull php:8-cli
- name: Smoke-test docker interpreter availability
run: |
docker run --rm python:3-slim python3 --version
docker run --rm node:20-slim node --version
docker run --rm eclipse-temurin:21-jre-jammy java -version
docker run --rm php:8-cli php --version
- name: Dynamic suite (process + docker backends)
run: cargo nextest run --no-fail-fast --features dynamic
macos:
name: dynamic / macos
runs-on: macos-latest
env:
# macOS runners ship without docker; force process backend so the
# `Auto` resolver in src/dynamic/sandbox.rs cannot accidentally
# pick up a stray Lima/Colima daemon and confuse the matrix.
NYX_SANDBOX_BACKEND: process
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
- name: Smoke-test sandbox-exec availability
run: |
/usr/bin/sandbox-exec -p '(version 1)(allow default)' /bin/echo ok
- name: Smoke-test interpreter availability
run: |
python3 --version
node --version
ruby --version
# Phase 29 acceptance literal: "cargo nextest run --features
# dynamic is green on macOS without docker (process-only row)."
- name: Dynamic suite (macOS, process backend)
run: cargo nextest run --no-fail-fast --features dynamic

348
.github/workflows/eval.yml vendored Normal file

@ -0,0 +1,348 @@
# Real-corpus acceptance (Track R).
#
# * owasp (Phase 27 / Track R.0): Gate 6 vs a real OWASP BenchmarkJava
# checkout (Java).
# * jsts (Phase 28 / Track R.1): Gate 7 vs OWASP NodeGoat (Express, .js)
# and OWASP Juice Shop (TypeScript, .ts), one matrix row per corpus.
# * polyglot (Phase 29 / Track R.2): Gate 8 vs OWASP RailsGoat (Rails, .rb),
# DVWA (PHP), DVPWA (aiohttp, .py), gosec (Go) and the RustSec advisory-db
# (Rust negative control), one matrix row per corpus.
#
# Runs on every PR that touches the dynamic verifier (src/dynamic/), the
# eval-corpus harness (tests/eval_corpus/), or the gate script itself.
#
# Each gate enforces, against the committed ground truth:
# * verify wall-clock <= 15 min (CI budget; the owasp job raises it to 20 min, and the dev reference is 10 min),
# * the per-(cap,lang) budget in tests/eval_corpus/budget.toml,
# * per-cap confirmed-rate / precision / recall: hard-gated only for caps
# in NYX_*_FLOOR_CAPS (empty by default → published report-only until a
# cap Confirms end to end), with targets of >= 40% confirmed-rate,
# >= 0.85 precision, and >= 0.40 recall.
#
# No corpus is vendored. Each is cloned at a pinned ref and cached so reruns
# skip the clone. Before the gate runs, the committed ground truth is
# regenerated from its source against the fresh clone and asserted in sync,
# and the converter hard-errors on any labelled path missing from the corpus,
# so a corpus bump that drifts the labels fails the job loudly.
name: eval
permissions:
contents: read
on:
push:
branches: ["master"]
paths:
- "src/dynamic/**"
- "tests/eval_corpus/**"
- "scripts/m7_ship_gate.sh"
- ".github/workflows/eval.yml"
pull_request:
branches: ["master"]
paths:
- "src/dynamic/**"
- "tests/eval_corpus/**"
- "scripts/m7_ship_gate.sh"
- ".github/workflows/eval.yml"
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
owasp:
name: eval / owasp-benchmark-v1.2
runs-on: ubuntu-latest
env:
# Gate 6 self-skips unless this points at a real checkout.
NYX_OWASP_CORPUS: ${{ github.workspace }}/.eval-corpus/owasp_benchmark_v1.2
# CI wall-clock budget: 20 min. The 2740-file OWASP scan+verify lands
# right at the old 15-min ceiling on the hosted runners (observed 900.2s),
# so the gate tripped on CI variance alone; 1200s restores headroom. The
# dev reference stays 10 min — override locally to tighten.
NYX_OWASP_WALLCLOCK_BUDGET_SECONDS: "1200"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The Phase 22 Java compile pool drives `com.sun.tools.javac` out of a
# warm JDK; temurin 21 ships the compiler module the pool loads.
- name: Set up JDK 21
uses: actions/setup-java@v5
with:
distribution: temurin
java-version: "21"
- name: Cache OWASP BenchmarkJava (1.2beta)
id: cache-owasp
uses: actions/cache@v5
with:
path: .eval-corpus/owasp_benchmark_v1.2
key: owasp-benchmark-1.2beta
- name: Clone OWASP BenchmarkJava (1.2beta tag)
if: steps.cache-owasp.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch 1.2beta \
https://github.com/OWASP-Benchmark/BenchmarkJava \
.eval-corpus/owasp_benchmark_v1.2
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the pinned CSV produces. Catches GT drift (a
# corpus bump, a hand-edit) before the gate runs on stale labels.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/owasp_gt_convert.py \
--corpus-dir .eval-corpus/owasp_benchmark_v1.2 \
--output /tmp/owasp_gt_regen.json
python3 - <<'PY'
import json, sys
committed = json.load(open("tests/eval_corpus/ground_truth/owasp_benchmark_v1.2.json"))
regen = json.load(open("/tmp/owasp_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the 1.2beta CSV; regenerate with owasp_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 6 — OWASP Benchmark v1.2 acceptance
run: scripts/m7_ship_gate.sh --sets owasp
jsts:
name: eval / ${{ matrix.corpus.name }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
corpus:
- name: nodegoat
repo: https://github.com/OWASP/NodeGoat
# NodeGoat ships no release tags; pin the default branch and let
# the cache key hold it stable. The manifest's path layout
# (app/, config/) has been constant for years.
ref: master
env: NYX_NODEGOAT_CORPUS
manifest: nodegoat.manifest.toml
ground_truth: nodegoat.json
- name: juiceshop
repo: https://github.com/juice-shop/juice-shop
ref: v15.0.0
env: NYX_JUICESHOP_CORPUS
manifest: juiceshop.manifest.toml
ground_truth: juiceshop.json
env:
# CI wall-clock budget: 15 min. Override locally to tighten.
NYX_JSTS_WALLCLOCK_BUDGET_SECONDS: "900"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The dynamic verifier's Node build pool (Phase 23) compiles its
# harnesses with a real node/npm toolchain.
- name: Set up Node 20
uses: actions/setup-node@v6
with:
node-version: "20"
- name: Cache ${{ matrix.corpus.name }}
id: cache-corpus
uses: actions/cache@v5
with:
path: .eval-corpus/${{ matrix.corpus.name }}
key: jsts-${{ matrix.corpus.name }}-${{ matrix.corpus.ref }}
- name: Clone ${{ matrix.corpus.name }} (${{ matrix.corpus.ref }})
if: steps.cache-corpus.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch ${{ matrix.corpus.ref }} \
${{ matrix.corpus.repo }} \
.eval-corpus/${{ matrix.corpus.name }}
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the curated manifest produces *against this
# corpus*. manifest_gt_convert.py hard-errors on any labelled path that
# no longer exists in the clone (corpus drift / typo), and the diff
# below catches a stale committed JSON.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/manifest_gt_convert.py \
--manifest tests/eval_corpus/ground_truth/${{ matrix.corpus.manifest }} \
--corpus-dir .eval-corpus/${{ matrix.corpus.name }} \
--output /tmp/${{ matrix.corpus.name }}_gt_regen.json
python3 - <<'PY'
import json, sys
name = "${{ matrix.corpus.ground_truth }}"
committed = json.load(open(f"tests/eval_corpus/ground_truth/{name}"))
regen = json.load(open("/tmp/${{ matrix.corpus.name }}_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the manifest against the pinned corpus; regenerate with "
"manifest_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 7 — ${{ matrix.corpus.name }} acceptance
run: |
export ${{ matrix.corpus.env }}="${{ github.workspace }}/.eval-corpus/${{ matrix.corpus.name }}"
scripts/m7_ship_gate.sh --sets ${{ matrix.corpus.name }}
polyglot:
name: eval / ${{ matrix.corpus.name }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
corpus:
- name: railsgoat
repo: https://github.com/OWASP/railsgoat
ref: rails.5.0.0
lang: ruby
env: NYX_RAILSGOAT_CORPUS
manifest: railsgoat.manifest.toml
ground_truth: railsgoat.json
- name: dvwa
repo: https://github.com/digininja/DVWA
ref: "2.5"
lang: php
env: NYX_DVWA_CORPUS
manifest: dvwa.manifest.toml
ground_truth: dvwa.json
- name: dvpwa
repo: https://github.com/anxolerd/dvpwa
# DVPWA ships no release tags; pin the default branch and let the
# cache key hold it stable.
ref: master
lang: python
env: NYX_DVPWA_CORPUS
manifest: dvpwa.manifest.toml
ground_truth: dvpwa.json
- name: gosec
repo: https://github.com/securego/gosec
ref: v2.26.1
lang: go
env: NYX_GOSEC_CORPUS
manifest: gosec.manifest.toml
ground_truth: gosec.json
- name: rustsec
repo: https://github.com/rustsec/advisory-db
# advisory-db ships no release tags; pin the default branch. This
# is the Rust NEGATIVE CONTROL (advisory metadata, no scannable
# source) — its committed ground truth is empty by construction.
ref: main
lang: rust
env: NYX_RUSTSEC_CORPUS
manifest: rustsec.manifest.toml
ground_truth: rustsec.json
env:
# CI wall-clock budget: 15 min. Override locally to tighten.
NYX_POLYGLOT_WALLCLOCK_BUDGET_SECONDS: "900"
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- uses: taiki-e/install-action@nextest
# The dynamic verifier's per-language build pool (Phase 22/23) compiles
# its harnesses with a real toolchain. Each matrix row sets up only the
# toolchain for its corpus's target language; the Rust row needs no extra
# step (the rust toolchain above covers it, and advisory-db has no
# buildable source anyway).
- name: Set up Ruby
if: matrix.corpus.lang == 'ruby'
uses: ruby/setup-ruby@v1
with:
ruby-version: "3.3"
- name: Set up PHP
if: matrix.corpus.lang == 'php'
uses: shivammathur/setup-php@v2
with:
php-version: "8.3"
- name: Set up Python
if: matrix.corpus.lang == 'python'
uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: Set up Go
if: matrix.corpus.lang == 'go'
uses: actions/setup-go@v6
with:
go-version: "1.22"
- name: Cache ${{ matrix.corpus.name }}
id: cache-corpus
uses: actions/cache@v5
with:
path: .eval-corpus/${{ matrix.corpus.name }}
key: polyglot-${{ matrix.corpus.name }}-${{ matrix.corpus.ref }}
- name: Clone ${{ matrix.corpus.name }} (${{ matrix.corpus.ref }})
if: steps.cache-corpus.outputs.cache-hit != 'true'
run: |
git clone --depth 1 --branch ${{ matrix.corpus.ref }} \
${{ matrix.corpus.repo }} \
.eval-corpus/${{ matrix.corpus.name }}
# No-compromise guard: the committed ground truth must be exactly what a
# fresh conversion of the curated manifest produces *against this corpus*.
# manifest_gt_convert.py hard-errors on any labelled path that no longer
# exists in the clone (corpus drift / typo); the diff below catches a
# stale committed JSON. For the RustSec negative control the manifest
# carries `negative_control = true` and zero entries, so the converter
# emits an empty `[]` — still validated against the real clone.
- name: Verify ground truth is in sync with the pinned corpus
run: |
python3 tests/eval_corpus/manifest_gt_convert.py \
--manifest tests/eval_corpus/ground_truth/${{ matrix.corpus.manifest }} \
--corpus-dir .eval-corpus/${{ matrix.corpus.name }} \
--output /tmp/${{ matrix.corpus.name }}_gt_regen.json
python3 - <<'PY'
import json, sys
name = "${{ matrix.corpus.ground_truth }}"
committed = json.load(open(f"tests/eval_corpus/ground_truth/{name}"))
regen = json.load(open("/tmp/${{ matrix.corpus.name }}_gt_regen.json"))
if committed != regen:
sys.exit("committed ground truth diverges from a fresh conversion of "
"the manifest against the pinned corpus; regenerate with "
"manifest_gt_convert.py")
print(f"ground truth in sync: {len(committed)} records")
PY
- name: eval-corpus harness regression tests
run: |
python3 tests/eval_corpus/test_tabulate_regression.py
python3 tests/eval_corpus/test_manifest_gt_convert.py
- name: Gate 8 — ${{ matrix.corpus.name }} acceptance
run: |
export ${{ matrix.corpus.env }}="${{ github.workspace }}/.eval-corpus/${{ matrix.corpus.name }}"
scripts/m7_ship_gate.sh --sets ${{ matrix.corpus.name }}
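The "Verify ground truth is in sync" steps above all follow one pattern: regenerate the labels, decode both JSON files, and fail on any structural difference. A minimal sketch of that check as a reusable function follows. The names `load_records` and `check_in_sync` are hypothetical and not part of the repository, and the sketch assumes each ground-truth file decodes to a list of records.

```python
import json


def load_records(path):
    """Load a ground-truth JSON file (assumed here to decode to a list)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def check_in_sync(committed_path, regen_path):
    """Return the record count when both files decode to equal JSON values.

    Raises SystemExit with a message otherwise, mirroring the inline
    heredoc scripts in the workflow. Hypothetical helper, not repo code.
    """
    committed = load_records(committed_path)
    regen = load_records(regen_path)
    if committed != regen:
        raise SystemExit(
            "committed ground truth diverges from a fresh conversion; "
            "regenerate it with the converter script"
        )
    return len(committed)
```

Because the comparison runs on decoded values, whitespace and object key order do not matter, while the order of records in the list does.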

217
.github/workflows/fuzz.yml vendored Normal file

@ -0,0 +1,217 @@
name: Fuzz
on:
pull_request:
branches: ["master"]
paths:
- "src/**"
- "fuzz/**"
- "Cargo.toml"
- "Cargo.lock"
- ".github/workflows/fuzz.yml"
schedule:
# Long-form weekly run, Sundays at 06:00 UTC.
- cron: "0 6 * * 0"
workflow_dispatch:
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
fuzz:
name: fuzz-${{ matrix.target }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
target: [scan_bytes, extract_summaries, cross_file_taint]
steps:
- uses: actions/checkout@v6
# cargo-fuzz needs nightly for the libFuzzer codegen flags.
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: nightly
cache: true
cache-workspaces: |
.
fuzz
- uses: taiki-e/install-action@v2
with:
tool: cargo-fuzz
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
- name: Restore fuzz corpus
uses: actions/cache@v5
with:
path: fuzz/corpus/${{ matrix.target }}
key: fuzz-corpus-${{ matrix.target }}-${{ github.sha }}
restore-keys: |
fuzz-corpus-${{ matrix.target }}-
# The harness reads inputs as <lang_idx_byte><source>, so we prefix
# each seed with its language index here at stage time. Files in
# fuzz/seed_corpus/ are committed as plain source without the byte
# because some IDEs strip 0x00 on save.
- name: Layer seed corpus
run: |
set -euo pipefail
target=${{ matrix.target }}
dest="fuzz/corpus/$target"
mkdir -p "$dest"
ext_to_idx() {
case "$1" in
rs) echo 0 ;;
js) echo 1 ;;
ts) echo 2 ;;
py) echo 3 ;;
go) echo 4 ;;
java) echo 5 ;;
rb) echo 6 ;;
php) echo 7 ;;
c) echo 8 ;;
cpp) echo 9 ;;
*) return 1 ;;
esac
}
stage() {
src="$1"
ext="${src##*.}"
idx=$(ext_to_idx "$ext") || return 0
hash=$(sha256sum "$src" | cut -c1-16)
out="$dest/seed-${ext}-${hash}"
[ -e "$out" ] && return 0
printf '%b' "$(printf '\\%03o' "$idx")" > "$out"
cat "$src" >> "$out"
}
for f in benches/fixtures/sample.*; do
[ -e "$f" ] && stage "$f"
done
while IFS= read -r f; do
stage "$f"
done < <(find tests/benchmark/corpus -type f \( \
-name '*.rs' -o -name '*.js' -o -name '*.ts' \
-o -name '*.py' -o -name '*.go' -o -name '*.java' \
-o -name '*.rb' -o -name '*.php' -o -name '*.c' \
-o -name '*.cpp' \))
if [ -d "fuzz/seed_corpus/$target" ]; then
while IFS= read -r f; do
stage "$f"
done < <(find "fuzz/seed_corpus/$target" -type f \( \
-name '*.rs' -o -name '*.js' -o -name '*.ts' \
-o -name '*.py' -o -name '*.go' -o -name '*.java' \
-o -name '*.rb' -o -name '*.php' -o -name '*.c' \
-o -name '*.cpp' \))
fi
echo "Corpus dir: $(ls "$dest" | wc -l) files"
- name: Choose fuzz duration
id: budget
run: |
if [ "${{ github.event_name }}" = "schedule" ] || [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "seconds=18000" >> "$GITHUB_OUTPUT"
else
echo "seconds=600" >> "$GITHUB_OUTPUT"
fi
- name: Run fuzz target
run: |
cargo fuzz run --target x86_64-unknown-linux-gnu ${{ matrix.target }} -- \
-max_total_time=${{ steps.budget.outputs.seconds }} \
-max_len=65536 \
-timeout=60 \
-rss_limit_mb=8192 \
-dict=fuzz/dict/all.dict
- name: Upload crash artifacts
if: failure()
uses: actions/upload-artifact@v7
with:
name: fuzz-artifacts-${{ matrix.target }}-${{ github.run_id }}
path: fuzz/artifacts/${{ matrix.target }}/
if-no-files-found: ignore
retention-days: 14
harness-fuzz:
name: harness-fuzz-${{ matrix.cap }}
runs-on: ubuntu-latest
# Run only on schedule and manual dispatch: 50,000 iterations per cap is
# too slow for PR checks but is the right cadence for weekly corpus growth.
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
strategy:
fail-fast: false
matrix:
include:
- cap: sql_query
harness: tests/dynamic_fixtures/python/sqli_positive.py
- cap: code_exec
harness: tests/dynamic_fixtures/python/cmdi_positive.py
- cap: file_io
harness: tests/dynamic_fixtures/python/fileio_positive.py
- cap: ssrf
harness: tests/dynamic_fixtures/python/ssrf_positive.py
- cap: html_escape
harness: tests/dynamic_fixtures/python/xss_positive.py
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
cache: true
cache-workspaces: |
.
fuzz/dynamic_corpus
- uses: actions/setup-node@v6
with:
node-version: 20
cache: npm
cache-dependency-path: frontend/package-lock.json
- name: Build frontend
working-directory: frontend
run: |
npm ci
npm run build
- name: Build nyx-dynamic-corpus
working-directory: fuzz/dynamic_corpus
run: cargo build
- uses: actions/setup-python@v6
with:
python-version: "3.x"
- name: Run harness fuzzer — ${{ matrix.cap }}
run: |
fuzz/dynamic_corpus/target/debug/nyx-dynamic-corpus run \
--cap ${{ matrix.cap }} \
--spec-hash "ci-${{ matrix.cap }}" \
--harness-cmd "python3 ${{ matrix.harness }}" \
--iterations 50000 \
--output fuzz-discovered
- name: Upload discovered candidates
if: always()
uses: actions/upload-artifact@v7
with:
name: harness-fuzz-${{ matrix.cap }}-${{ github.run_id }}
path: fuzz-discovered/
if-no-files-found: ignore
retention-days: 30
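The "Layer seed corpus" step in the `fuzz` job stages each seed as `<lang_idx_byte><source>` under a name derived from the file's SHA-256. A minimal Python sketch of the same staging logic is below, assuming the extension-to-index table shown in the shell script. `EXT_TO_IDX` and `stage_seed` are illustrative names, not part of the repository.

```python
import hashlib
from pathlib import Path

# Extension -> language index, copied from the shell `ext_to_idx` table.
EXT_TO_IDX = {
    "rs": 0, "js": 1, "ts": 2, "py": 3, "go": 4,
    "java": 5, "rb": 6, "php": 7, "c": 8, "cpp": 9,
}


def stage_seed(src, dest_dir):
    """Write `<idx byte><source bytes>` into dest_dir; return the output path.

    Returns None for extensions the table does not know, matching the
    shell function's silent skip. Hypothetical helper, not repo code.
    """
    src, dest_dir = Path(src), Path(dest_dir)
    ext = src.suffix.lstrip(".")
    idx = EXT_TO_IDX.get(ext)
    if idx is None:
        return None
    data = src.read_bytes()
    # First 16 hex characters of the digest, like `sha256sum | cut -c1-16`.
    digest = hashlib.sha256(data).hexdigest()[:16]
    out = dest_dir / f"seed-{ext}-{digest}"
    if not out.exists():  # already staged: leave the existing file alone
        dest_dir.mkdir(parents=True, exist_ok=True)
        out.write_bytes(bytes([idx]) + data)
    return out
```

Naming the output after the content hash makes staging idempotent: restoring a cached corpus and staging the same seeds again does not create duplicates.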

68
.github/workflows/image-builder.yml vendored Normal file

@ -0,0 +1,68 @@
name: image-builder
# Phase 19 (Track E.3): daily drift PR.
#
# Runs `nyx-image-builder build --all` on a Linux runner that has docker
# available, captures the rewritten `tools/image-builder/images.toml`, and
# opens a PR when any pinned digest changed. The PR is reviewed manually
# before merge so a hostile upstream image cannot silently land in
# `IMAGE_DIGESTS`.
permissions:
contents: write
pull-requests: write
on:
schedule:
# 04:23 UTC daily — off-peak for the major upstream registries so
# transient pull errors are rare.
- cron: "23 4 * * *"
workflow_dispatch:
concurrency:
group: image-builder
cancel-in-progress: false
jobs:
refresh-digests:
name: refresh image digests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: stable
cache: true
- name: Verify docker is reachable
run: docker info
- name: Build pinned-digest catalogue
run: |
cargo run -F image-builder --bin nyx-image-builder -- build --all
- name: Verify catalogue against local pulls
run: |
cargo run -F image-builder --bin nyx-image-builder -- verify
- name: Open PR on drift
uses: peter-evans/create-pull-request@v8
with:
token: ${{ secrets.GITHUB_TOKEN }}
commit-message: "image-builder: refresh pinned digests"
title: "image-builder: refresh pinned digests"
body: |
Automated digest refresh by `nyx-image-builder build --all`.
The CI job pulled every base image in
`tools/image-builder/images.toml`, captured the resolved
`sha256:` digest, and wrote it back into the file. Review
the diff before merging — a hostile upstream image would
show up here as an unexpected digest change.
branch: image-builder/refresh-digests
base: master
delete-branch: true
labels: |
image-builder
automation


@ -3,31 +3,80 @@ name: Release build & publish
  on:
  release:
  types: [created]
+ workflow_dispatch:
+ inputs:
+ tag:
+ description: "Existing release tag to (re)build and publish (e.g. v0.5.0)"
+ required: true
+ type: string
  permissions:
  contents: write
  env:
  BIN_NAME: nyx
+ RELEASE_TAG: ${{ github.event.release.tag_name || inputs.tag }}
  jobs:
- build-and-upload:
+ frontend:
+ name: build-frontend
+ runs-on: ubuntu-latest
+ steps:
+ - name: Check out sources
+ uses: actions/checkout@v6
+ with:
+ ref: ${{ env.RELEASE_TAG }}
+ - uses: actions/setup-node@v6
+ with:
+ node-version: 20
+ cache: npm
+ cache-dependency-path: frontend/package-lock.json
+ - name: Install frontend dependencies
+ working-directory: frontend
+ run: npm ci
+ - name: Build frontend
+ working-directory: frontend
+ run: npm run build
+ - name: Upload frontend dist
+ uses: actions/upload-artifact@v7
+ with:
+ name: frontend-dist
+ path: src/server/assets/dist/
+ if-no-files-found: error
+ retention-days: 1
+ build:
+ needs: frontend
  strategy:
  matrix:
  include:
  - target: x86_64-unknown-linux-gnu
  os: ubuntu-latest
+ - target: aarch64-unknown-linux-gnu
+ os: ubuntu-latest
  - target: x86_64-pc-windows-msvc
  os: windows-latest
  - target: x86_64-apple-darwin
  os: macos-14
  - target: aarch64-apple-darwin
  os: macos-14
  runs-on: ${{ matrix.os }}
  steps:
  - name: Check out sources
- uses: actions/checkout@v4
+ uses: actions/checkout@v6
+ with:
+ ref: ${{ env.RELEASE_TAG }}
+ - name: Download prebuilt frontend dist
+ uses: actions/download-artifact@v8
+ with:
+ name: frontend-dist
+ path: src/server/assets/dist/
  - name: Install Rust toolchain
  uses: actions-rust-lang/setup-rust-toolchain@v1
@ -35,14 +84,23 @@ jobs:
  toolchain: stable
  target: ${{ matrix.target }}
  cache: true
+ - name: Install cross-compilation tools (ARM Linux)
+ if: matrix.target == 'aarch64-unknown-linux-gnu'
+ run: |
+ sudo apt-get update
+ sudo apt-get install -y gcc-aarch64-linux-gnu
+ echo '[target.aarch64-unknown-linux-gnu]' >> ~/.cargo/config.toml
+ echo 'linker = "aarch64-linux-gnu-gcc"' >> ~/.cargo/config.toml
  - name: Install target
  run: rustup target add ${{ matrix.target }}
  - name: Build
  run: cargo build --release --bin ${{ env.BIN_NAME }} --target ${{ matrix.target }}
- - name: Package
+ - name: Package (Linux & macOS)
+ if: runner.os != 'Windows'
  shell: bash
  run: |
  set -euo pipefail
@ -50,19 +108,181 @@ jobs:
  TARGET=${{ matrix.target }}
  EXT=$([[ "$TARGET" == *windows* ]] && echo ".exe" || echo "")
  BIN_PATH=target/$TARGET/release/$BIN$EXT
+ if [[ ! -f "$BIN_PATH" ]]; then
+ echo "::error ::Binary $BIN_PATH not found"
+ ls -R target/$TARGET/release || true
+ exit 1
+ fi
  mkdir -p dist
  ARCHIVE=$BIN-$TARGET.zip
- zip -9 "dist/$ARCHIVE" "$BIN_PATH"
+ files=("$BIN_PATH" THIRDPARTY-LICENSES.html)
+ shopt -s nullglob
+ license_files=(LICENSE* COPYING*)
+ shopt -u nullglob
+ files+=("${license_files[@]}")
+ zip -9 "dist/$ARCHIVE" "${files[@]}"
  echo "ASSET=$ARCHIVE" >> "$GITHUB_ENV"
- - name: Upload to the release
- uses: softprops/action-gh-release@v2
+ - name: Package (Windows)
+ if: runner.os == 'Windows'
+ shell: pwsh
+ run: |
+ $Bin = '${{ env.BIN_NAME }}'
+ $Target = '${{ matrix.target }}'
+ $Ext = '.exe'
+ $BinPath = "target/$Target/release/$Bin$Ext"
+ New-Item -ItemType Directory -Path dist -Force | Out-Null
+ $Archive = "$Bin-$Target.zip"
+ $LicenseFiles = @(Get-ChildItem -Path 'LICENSE*', 'COPYING*' -File -ErrorAction SilentlyContinue | ForEach-Object { $_.FullName })
+ $Files = @($BinPath, 'THIRDPARTY-LICENSES.html') + $LicenseFiles
+ Compress-Archive `
+ -Path $Files `
+ -DestinationPath "dist/$Archive" `
+ -CompressionLevel Optimal
+ Add-Content -Path $env:GITHUB_ENV -Value "ASSET=$Archive"
+ - name: Upload build artifact
+ uses: actions/upload-artifact@v7
  with:
- files: dist/${{ env.ASSET }}
+ name: release-${{ matrix.target }}
+ path: dist/${{ env.ASSET }}
+ if-no-files-found: error
+ retention-days: 1
+ reproducibility:
+ name: reproducibility-check
+ needs: frontend
+ runs-on: ubuntu-latest
+ continue-on-error: true
+ steps:
+ - name: Check out sources
+ uses: actions/checkout@v6
+ with:
+ ref: ${{ env.RELEASE_TAG }}
+ - name: Download prebuilt frontend dist
+ uses: actions/download-artifact@v8
+ with:
+ name: frontend-dist
+ path: src/server/assets/dist/
+ - name: Install Rust toolchain
+ uses: actions-rust-lang/setup-rust-toolchain@v1
+ with:
+ toolchain: stable
+ target: x86_64-unknown-linux-gnu
+ cache: true
+ - name: Build twice and diff hashes
+ shell: bash
+ env:
+ RUSTFLAGS: "--remap-path-prefix=${{ github.workspace }}=/build"
+ run: |
+ set -euo pipefail
+ TARGET=x86_64-unknown-linux-gnu
+ BIN=${{ env.BIN_NAME }}
+ BIN_PATH="target/$TARGET/release/$BIN"
+ SOURCE_DATE_EPOCH=$(git log -1 --format=%ct HEAD)
+ export SOURCE_DATE_EPOCH
+ echo "SOURCE_DATE_EPOCH=$SOURCE_DATE_EPOCH"
+ cargo build --release --bin "$BIN" --target "$TARGET"
+ HASH1=$(sha256sum "$BIN_PATH" | awk '{print $1}')
+ echo "first build: $HASH1"
+ cargo clean --release --target "$TARGET"
+ cargo build --release --bin "$BIN" --target "$TARGET"
+ HASH2=$(sha256sum "$BIN_PATH" | awk '{print $1}')
+ echo "second build: $HASH2"
+ if [ "$HASH1" != "$HASH2" ]; then
+ echo "::error::Reproducibility check failed: builds are not bit-identical"
+ echo " first: $HASH1"
+ echo " second: $HASH2"
+ exit 1
+ fi
+ echo "::notice::Reproducible build verified (sha256=$HASH1)"
+ publish:
+ name: publish-release
+ runs-on: ubuntu-latest
+ needs: [build]
+ permissions:
+ contents: write
+ id-token: write
+ attestations: write
+ steps:
+ - name: Check out sources
+ uses: actions/checkout@v6
+ with:
+ ref: ${{ env.RELEASE_TAG }}
+ - name: Generate CycloneDX SBOM
+ uses: anchore/sbom-action@v0
+ with:
+ path: .
+ format: cyclonedx-json
+ output-file: nyx-${{ env.RELEASE_TAG }}.cdx.json
+ upload-artifact: false
+ upload-release-assets: false
+ - name: Download all build artifacts
+ uses: actions/download-artifact@v8
+ with:
+ path: release-artifacts
+ pattern: release-*
+ merge-multiple: true
+ - name: Generate SHA256SUMS
+ run: |
+ set -euo pipefail
+ cd release-artifacts
+ ls -lh
+ sha256sum *.zip > SHA256SUMS
+ cat SHA256SUMS
+ # Sigstore keyless signing. Verify with:
+ # cosign verify-blob --bundle <file>.bundle \
+ # --certificate-identity-regexp 'https://github.com/elicpeter/nyx/.*' \
+ # --certificate-oidc-issuer https://token.actions.githubusercontent.com \
+ # <file>
+ - name: Install cosign
+ uses: sigstore/cosign-installer@v4.1.2
+ - name: Cosign keyless sign release artifacts
+ shell: bash
+ run: |
+ set -euo pipefail
+ SBOM="nyx-${{ env.RELEASE_TAG }}.cdx.json"
+ (
+ cd release-artifacts
+ for f in *.zip SHA256SUMS; do
+ cosign sign-blob --yes \
+ --bundle "$f.bundle" \
+ "$f"
+ done
+ )
+ cosign sign-blob --yes \
+ --bundle "$SBOM.bundle" \
+ "$SBOM"
+ # SLSA v1 provenance. Verify with `gh attestation verify <file> --repo <repo>`.
+ - name: Generate SLSA build provenance
+ uses: actions/attest-build-provenance@v4
+ with:
+ subject-path: |
+ release-artifacts/*.zip
+ release-artifacts/SHA256SUMS
+ nyx-${{ env.RELEASE_TAG }}.cdx.json
+ - name: Upload to the release
+ uses: softprops/action-gh-release@v3
+ with:
+ tag_name: ${{ env.RELEASE_TAG }}
+ files: |
+ release-artifacts/*.zip
+ release-artifacts/*.zip.bundle
+ release-artifacts/SHA256SUMS
+ release-artifacts/SHA256SUMS.bundle
+ nyx-${{ env.RELEASE_TAG }}.cdx.json
+ nyx-${{ env.RELEASE_TAG }}.cdx.json.bundle
  env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
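The publish job writes a `SHA256SUMS` file with `sha256sum *.zip`, so a downloader can check archive integrity against it. A minimal sketch of that consumer-side check follows, assuming the standard `sha256sum` line format (`<hex digest>  <file name>`). The helper names `sha256_of` and `verify_sums` are illustrative, not part of the repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_sums(sums_file):
    """Return the names listed in sums_file that are missing or do not match.

    Files are resolved relative to the directory holding sums_file.
    Hypothetical helper; assumes `sha256sum` output format.
    """
    sums_file = Path(sums_file)
    bad = []
    for line in sums_file.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes '*' in binary mode
        target = sums_file.parent / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            bad.append(name)
    return bad
```

This covers integrity only. Authenticity still rests on the cosign bundles and the build provenance attestation, using the verification commands given in the workflow comments.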

104
.github/workflows/repro-bare.yml vendored Normal file

@ -0,0 +1,104 @@
# Replay every tree-committed dynamic repro bundle with host language
# toolchains blocked so we catch regressions where a bundle silently
# depends on an interpreter the operator does not have.
#
# The setup step prepends deny-list wrappers for python3, node, ruby,
# php, and Java so the only toolchain the bundle can use is the docker
# daemon. reproduce.sh in --docker mode pulls the pinned base image
# (via docker_pull.sh) and runs the harness inside the container; if the
# bundle accidentally relied on a host interpreter the run falls over
# before the sentinel check.
#
# Adding a new fixture: extend the `matrix.fixture` list with the new
# `tests/repro_fixtures/<toolchain_id>/<spec_hash>` path. The bundle
# must already exist on disk; see tests/repro_fixture_bundles.rs for
# the regeneration recipe.
name: repro-bare
permissions:
contents: read
on:
push:
branches: ["master"]
pull_request:
branches: ["master"]
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
bare-image-replay:
name: repro-bare / ${{ matrix.fixture }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
fixture:
- tests/repro_fixtures/python-3.11/repro
steps:
- uses: actions/checkout@v6
- name: Block host language toolchains
run: |
set -euo pipefail
# Do not mutate the hosted runner image. ubuntu-latest carries
# preinstalled and cached language runtimes, and apt package
# relationships can shift underneath us as the image is updated.
# A PATH-level deny layer gives this job the bare-host semantics it
# needs without depending on apt being able to uninstall core bits.
deny_dir="${RUNNER_TEMP}/nyx-deny-toolchains"
mkdir -p "$deny_dir"
for exe in \
python python3 python3.10 python3.11 python3.12 python3.13 python3.14 \
node npm npx corepack \
ruby gem bundle \
php \
java javac jar
do
{
printf '%s\n' '#!/bin/sh'
printf '%s\n' 'echo "error: host language toolchain is disabled in repro-bare; use the Docker replay path" >&2'
printf '%s\n' 'exit 127'
} > "${deny_dir}/${exe}"
chmod +x "${deny_dir}/${exe}"
done
export PATH="${deny_dir}:${PATH}"
echo "${deny_dir}" >> "${GITHUB_PATH}"
hash -r 2>/dev/null || true
# Confirm the deny layer is active — surface the failure here
# rather than inside reproduce.sh where it would look like a
# bundle bug.
for exe in python3 node ruby php java; do
resolved="$(command -v "${exe}" || true)"
if [ "${resolved}" != "${deny_dir}/${exe}" ]; then
echo "error: ${exe} deny wrapper is not first on PATH (got ${resolved:-not found})" >&2
exit 1
fi
if "${exe}" --version >/dev/null 2>&1; then
echo "error: ${exe} still runs after host-toolchain block" >&2
exit 1
fi
done
if ! command -v docker >/dev/null 2>&1; then
echo "error: docker is no longer reachable after host-toolchain block" >&2
exit 1
fi
- name: Verify docker is reachable
run: docker info
- name: Pre-pull pinned image
working-directory: ${{ matrix.fixture }}
run: ./docker_pull.sh
- name: Replay bundle via docker
working-directory: ${{ matrix.fixture }}
run: ./reproduce.sh --docker
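The "Block host language toolchains" step builds a PATH-level deny layer and then confirms that each blocked name resolves to its wrapper and fails when run. The same idea can be sketched in Python for local experiments on a POSIX host. The function names below are hypothetical, and the wrapper text is shortened from the workflow's.

```python
import os
import shutil
import subprocess
from pathlib import Path

# Shortened version of the wrapper script the workflow writes.
WRAPPER = (
    "#!/bin/sh\n"
    "echo 'error: host language toolchain is disabled' >&2\n"
    "exit 127\n"
)


def install_deny_layer(deny_dir, names):
    """Create one failing executable wrapper per name inside deny_dir."""
    deny_dir = Path(deny_dir)
    deny_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        wrapper = deny_dir / name
        wrapper.write_text(WRAPPER)
        wrapper.chmod(0o755)
    return deny_dir


def denied_env(deny_dir, base_env=None):
    """Return a copy of the environment with deny_dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["PATH"] = f"{deny_dir}{os.pathsep}{env.get('PATH', '')}"
    return env


def is_blocked(name, deny_dir, env):
    """True when name resolves to its deny wrapper and the wrapper fails."""
    resolved = shutil.which(name, path=env["PATH"])
    if resolved != str(Path(deny_dir) / name):
        return False
    result = subprocess.run([resolved, "--version"], env=env, capture_output=True)
    return result.returncode != 0
```

Like the workflow, this leaves the installed runtimes in place and only shadows them, so anything that calls an interpreter by absolute path bypasses the layer.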


@ -1,22 +0,0 @@
name: Rust
on:
push:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
env:
CARGO_TERM_COLOR: always
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build
run: cargo build --verbose
- name: Run linter
run: cargo clippy --all-targets --all-features -- -D warnings
- name: Run tests
run: cargo test --verbose

45
.github/workflows/scorecard.yml vendored Normal file

@ -0,0 +1,45 @@
name: OSSF Scorecard
on:
branch_protection_rule:
schedule:
- cron: "0 7 * * 1"
push:
branches: ["master"]
workflow_dispatch:
permissions: read-all
jobs:
analysis:
name: scorecard
runs-on: ubuntu-latest
permissions:
security-events: write
id-token: write
contents: read
steps:
- uses: actions/checkout@v6
with:
persist-credentials: false
- name: Run analysis
uses: ossf/scorecard-action@v2.4.3
with:
results_file: results.sarif
results_format: sarif
# Flip to true once we're happy with the score and want the badge.
publish_results: false
- name: Upload SARIF artifact
uses: actions/upload-artifact@v7
with:
name: scorecard-sarif
path: results.sarif
retention-days: 14
- name: Upload SARIF to Security tab
uses: github/codeql-action/upload-sarif@v4
with:
sarif_file: results.sarif

20
.gitignore vendored

@ -1,2 +1,22 @@
/target /target
/fuzz/target
/fuzz/corpus
/fuzz/dynamic_corpus/target
/fuzz/artifacts
/.idea /.idea
/frontend/node_modules
/src/server/assets/dist
/marketing
/.nyx
/.nyx-build-cache
/logs
/book
.DS_Store
.z3-trace
.pitboss
.eval-corpus
.node_modules-target
node_modules
__pycache__/
*.pyc
tools/sb-trace/*.trace.raw

36
AI-POLICY.md Normal file

@ -0,0 +1,36 @@
# AI Contribution Policy
Nyx accepts contributions that were drafted, refactored, or reviewed with the help of AI tools (LLMs, code assistants, agent systems). We care about the contribution, not the keystrokes. AI changes the failure modes though, so we ask contributors to follow a few rules.
## What we ask of contributors
By opening a pull request you affirm that:
1. **You have read and understood every line you are submitting.** If you cannot explain a change under review, it is not ready to merge. "The model wrote it" is not an answer we will accept for a bug or a regression.
2. **You have the right to submit the code.** AI-generated code is only as license-clean as its training data and its prompt. Do not paste proprietary, GPL-incompatible, or confidential code into an AI tool and then submit the output here. If a model reproduced a substantial verbatim snippet from an identifiable source, disclose it.
3. **You take responsibility for the change.** The DCO `Signed-off-by:` trailer applies the same way to AI-assisted code as it does to hand-written code. You are certifying origin and right-to-submit.
4. **You disclose material AI use in the PR description.** A one-line note is enough. For example, "Drafted with an AI assistant; reviewed and tested by me." Trivial uses like tab-completion, renames, or formatting do not need to be called out. New analysis passes, rule logic, or security-relevant code do.
## What we look for in review
AI-assisted PRs face the same bar as any other PR, but reviewers will pay extra attention to:
- **Tests that exercise the new behavior.** Not just "it compiles." Fixtures under `tests/fixtures/` and assertions in `expected.yaml` are how we verify security logic.
- **Consistency with the existing engine.** Drive-by refactors, speculative abstractions, or parallel implementations of existing passes will usually be rejected, even if they look clean in isolation.
- **Fabricated references.** AI tools sometimes invent function names, crate APIs, CVE IDs, or citations. Every symbol referenced in a PR must exist, and every external claim must be verifiable.
- **Rule metadata honesty.** Rule descriptions, CWE mappings, and severity ratings are part of how downstream users triage. Do not inflate severity or cite CWEs the rule does not actually detect.
## What we will not accept
- PRs that are clearly unreviewed agent output, such as changes in the wrong file, nonsense tests, hallucinated APIs, or code that does not compile.
- PRs that add "AI-generated" boilerplate, marketing copy, or filler documentation to pad scope.
- Mass-generated PRs across many unrelated areas in a single change.
- Code that was generated by pasting another project's proprietary source into an AI tool.
## Project's own use of AI
For transparency, the README includes an [AI Disclosure](README.md#ai-disclosure) describing where AI was used in Nyx itself. The short version: the analysis engine is predominantly human-written and human-reviewed, while documentation, fixtures, and rule metadata were drafted with AI assistance and audited before landing. We hold outside contributions to the same standard.
## Questions
If you are unsure whether a contribution falls inside this policy, open a draft PR or an issue and ask before investing time. We would rather have the conversation early than reject work at review.

CHANGELOG.md

@@ -0,0 +1,551 @@
# Changelog
All notable changes to Nyx are documented here. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). For where Nyx is going, see the [Roadmap](ROADMAP.md).
## [0.8.0] - 2026-06-06
The dynamic-verification release. An attack-surface map, a sandboxed dynamic verifier, a framework adapter registry that grounds both, the per-language build infrastructure that makes per-finding verification affordable at corpus scale, and the first real-corpus acceptance gates.
The attack-surface map and chain composer turn the flat finding list into a route-to-sink graph. The dynamic verifier re-runs every Medium-or-higher finding against a payload corpus and stamps a Confirmed / PartiallyConfirmed / NotConfirmed / Inconclusive / Unsupported verdict on each. The adapter registry (130+ entries across 8 languages) covers HTTP, message-broker, scheduled-job, GraphQL, WebSocket, middleware, and migration entry points. Per-language build pools and copy-on-write workdirs hold the with-verify wall-clock to within 1.5x of a static-only scan.
### Attack-surface map
- **`nyx surface` subcommand.** Prints the project's entry points, datastores, external services, and dangerous local sinks as text, JSON, Graphviz `dot`, or rendered SVG. Loads the persisted `SurfaceMap` from the most recent indexed scan when available, or rebuilds inline from source. `--build` forces a full pass-1 + call-graph walk so DataStore / ExternalService / DangerousLocal nodes populate on an unscanned project.
- **Surface page in `nyx serve`.** New `SurfacePage` renders the same graph in the browser UI, with ELK layout, sidebar navigation, and a wide-canvas SVG viewer. Persists alongside the index so the frontend reloads without a rescan.
- **Chain findings.** `ChainFinding` records connect a route entry point to a downstream sink via the call graph + surface map. The composer scores `(impact × evidence)` per chain, queues the top-N for composite reverification, and wires the result into `findings.json` / SARIF / the dashboard. Chains rank above isolated findings.
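
The chain-ranking step can be pictured with a small sketch. It assumes each chain carries a numeric `impact` and `evidence` value and that the composer keeps the `n` highest products; the `Chain` struct, its field names, and `top_n` are hypothetical and are not taken from the Nyx source.

```rust
/// Hypothetical stand-in for a chain record; field names are assumptions.
#[derive(Debug)]
struct Chain {
    id: u32,
    impact: f64,
    evidence: f64,
}

impl Chain {
    /// Score as described in the entry above: impact multiplied by evidence.
    fn score(&self) -> f64 {
        self.impact * self.evidence
    }
}

/// Return the ids of the `n` highest-scoring chains, best first.
fn top_n(chains: &[Chain], n: usize) -> Vec<u32> {
    let mut ranked: Vec<&Chain> = chains.iter().collect();
    // total_cmp is a total order over f64, so sorting cannot panic on NaN.
    ranked.sort_by(|a, b| b.score().total_cmp(&a.score()));
    ranked.into_iter().take(n).map(|c| c.id).collect()
}

fn main() {
    let chains = vec![
        Chain { id: 1, impact: 0.9, evidence: 0.5 },
        Chain { id: 2, impact: 0.6, evidence: 0.9 },
        Chain { id: 3, impact: 0.2, evidence: 0.4 },
    ];
    // Chain 2 (0.54) outranks chain 1 (0.45); chain 3 is cut.
    println!("{:?}", top_n(&chains, 2));
}
```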
### Framework adapter registry
`src/dynamic/framework/` ships a `FrameworkAdapter` trait with concrete adapters across 8 languages (116 entries today, growing per release). Each adapter binds a route / handler / consumer pattern to a `FrameworkBinding` so the surface map and dynamic verifier can locate entry points without re-walking the AST.
- **HTTP routers.** Flask, Django, FastAPI, Starlette (Python); Express, Koa, NestJS, Fastify (JS/TS); Spring, Quarkus, Micronaut, Jakarta Servlet (Java); Gin, Echo, Fiber, Chi (Go); Axum, Actix, Rocket, Warp (Rust); Rails, Sinatra, Hanami (Ruby); Laravel, Symfony, CodeIgniter (PHP).
- **New `EntryKind` variants.** `ClassMethod`, `MessageHandler`, `ScheduledJob`, `GraphQLResolver`, `WebSocket`, `Middleware`, `Migration` join the existing `RouteHandler` / `Function` set so the surface map shows non-HTTP entry surfaces.
- **Message broker handlers.** Kafka, AWS SQS, Google Pub/Sub, NATS, and RabbitMQ consumers across Python, Node, Java, and Go.
- **Scheduled jobs.** Celery (Python), Sidekiq (Ruby), Quartz (Java), plain cron expression recognition.
- **GraphQL resolvers.** Apollo, Relay, gqlgen, Juniper, Graphene.
- **WebSocket handlers.** ws, Socket.IO, ActionCable, Django Channels.
- **Middleware + migrations.** Express, Laravel, Spring, Django, Rails middleware; Django, Flask, Laravel, Rails, Prisma, Sequelize migration scripts.
- **Sanitizer-aware adapter strengthening.** Every XXE, header-injection, open-redirect, SSTI, LDAP, XPath, deserialization, crypto, and data-exfiltration adapter rejects bindings when the surrounding source visibly hardens the parser (`disallow-doctype-decl`, `resolve_entities=False`, `libxml_disable_entity_loader`), routes the value through a known encoder (`LdapEncoder.filterEncode`, `escape_filter_chars`, `ldap_escape`), swaps a weak primitive for a CSPRNG (`secrets.token_bytes`, `crypto.randomBytes`, `SecureRandom`), or validates the destination host through an allowlist. Cuts adapter FPs without losing the genuinely dangerous calls.
### Dynamic verification
- **`nyx scan --verify`.** Every finding with `Confidence >= Medium` is re-executed inside a sandboxed harness against a curated payload corpus. The verdict (`Confirmed` / `PartiallyConfirmed` / `NotConfirmed` / `Inconclusive` / `Unsupported`) lands on `Evidence.dynamic_verdict` and shows up in console output, JSON, SARIF, and the dashboard via a new `VerdictBadge` component on the finding detail page.
- **Backends.** In-process on Linux with `Standard` / `Strict` hardening (namespace unshare, chroot, RLIMIT cap, seccomp filter), in-process on macOS via `sandbox-exec` with a profile-per-policy wrap, Docker with a published image-builder catalogue, and a Firecracker trait stub for future microVM execution. The Docker backend ships native binary support for Rust and Go so harnesses no longer need to drag a toolchain into every image.
- **Language coverage.** Per-language harness emitters for Python, JS/TS, Go, Java, PHP, Ruby, Rust, C, and C++. Stub harness intercepts SQL, HTTP, Redis, and filesystem boundaries so the verdict reflects the sink, not the network. The `JSON_PARSE`, `UNAUTHORIZED_ID`, and `DATA_EXFIL` cap dispatchers are wired into every emitter that ships these caps (Python, JS, TS, Go, Java, PHP, Ruby, Rust), so the verdict pipeline closes the loop on each cap end-to-end rather than per-language piecemeal.
- **Abstract-interpretation and symex sanitizer suppression.** Symbolic execution and the interval/string abstract domain are now consulted at verdict time, so a payload that the static engine would call dangerous but symex can prove never reaches the sink lands as NotConfirmed.
- **Guard-aware verdicts.** When a known input-validation or output-sanitization middleware sits in front of a Confirmed sink (Spring `@PreAuthorize`, Express `helmet`, Nest `@UseGuards`, Django `@permission_classes`, and the per-language registry in `src/dynamic/framework/auth_markers.rs`), the verdict demotes to `ConfirmedWithKnownGuard` and the guard names land on `differential.known_guards`. Authentication-only filters do not trigger the demotion since they do not mitigate injection.
- **Repro bundles.** Every verified finding writes a hermetic bundle to `~/.cache/nyx/dynamic/repro/<spec_hash>/` with `reproduce.sh`, `expected/{verdict.json,outcome.json,trace.jsonl}`, and a `docker_pull.sh` when the toolchain is pinned in `tools/image-builder/images.toml`. `--verbose` flushes the per-step `VerifyTrace` to stderr for live triage.
- **Real-engine harness paths.** LDAP injection routes through an embedded LDAPv3 BER server, exercised from Java via JNDI `InitialDirContext` and from Python and PHP via pure-stdlib BER clients. XPath injection runs against the live parser in each language: Java `javax.xml.xpath`, PHP `DOMXPath`, JS `xpath` npm, Python `lxml`. `Cap::CRYPTO` lands a `WeakKey` probe across Python, Go, Java, PHP, and Rust that flags sub-2^16 keys produced by non-CSPRNG sources. A new `HeaderSmuggledInWire` oracle predicate catches CRLF smuggling on hand-rolled raw-socket HTTP servers (Python `http.server`, Node `net`, Rust `std::net::TcpListener`) where framework-level CRLF strip cannot intervene.
- **Differential rule v2 and partial confirmations.** A finding confirms when *any* vulnerable payload in the set fires and *every* paired benign control stays clean, replacing the strict pair-wise rule so a single missing control no longer downgrades a confirmable finding. A new `PartiallyConfirmed` verdict marks findings where the sink is reached but the exploit chain does not complete (no marker written, no callback observed), so engine work can ratchet without the tool overstating what it proved.
- **Spec derivation v2.** Every derivation strategy now runs and is scored on flow-step depth, framework binding, cross-file source resolution via `GlobalSummaries`, and payload availability; the highest-scoring candidate wins and the runner-up ranking lands in the trace so engine gaps stay visible. Cross-file seeding walks the call graph (max depth 5) until a `Source` step or framework binding is found. New `EntryKind` adapters auto-recover the entry surface from framework decorators and annotations.
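
As a rough illustration of the differential rule v2 entry above, the sketch below confirms when any vulnerable payload fires and every benign control stays clean, and falls back to a partial confirmation when the sink was reached without the oracle firing. The `Run` struct and `decide` function are hypothetical. Mapping a dirty control to `NotConfirmed` is an assumption, since the entry does not say which verdict that case receives.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Confirmed,
    PartiallyConfirmed,
    NotConfirmed,
}

/// Outcome of one payload run (hypothetical shape).
#[derive(Debug, Clone, Copy)]
struct Run {
    sink_reached: bool,
    oracle_fired: bool,
}

fn decide(vulnerable: &[Run], benign: &[Run]) -> Verdict {
    let any_fired = vulnerable.iter().any(|r| r.oracle_fired);
    // `all` over an empty slice is true, so a missing control does not downgrade.
    let controls_clean = benign.iter().all(|r| !r.oracle_fired);
    if any_fired && controls_clean {
        Verdict::Confirmed
    } else if !any_fired && vulnerable.iter().any(|r| r.sink_reached) {
        // Sink reached, but no marker or callback was observed.
        Verdict::PartiallyConfirmed
    } else {
        // Assumption: a firing benign control, or no sink hit at all, is not confirmed.
        Verdict::NotConfirmed
    }
}

fn main() {
    let vulnerable = [
        Run { sink_reached: true, oracle_fired: false },
        Run { sink_reached: true, oracle_fired: true },
    ];
    let benign = [Run { sink_reached: true, oracle_fired: false }];
    println!("{:?}", decide(&vulnerable, &benign));
}
```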
### Performance
- **Per-language build pools.** A warm `javac` daemon compiles batched harness sources in one long-lived JVM (Track O headline, Phase 22); Node, PHP, Ruby, Go, Rust, C, and C++ reuse shared module / package / object caches; Python layers a read-only venv per `requirements_hash` with a warmed bytecode cache. Target per-finding harness build: P50 ≤ 200ms hot, ≤ 1.5s cold. Pools self-skip when a toolchain is absent so toolchain-less CI rows stay green.
- **Copy-on-write workdirs.** Per-finding workdir setup uses `clonefile` on macOS and `reflink` / `copy_file_range` on Linux instead of copying every harness file, cutting setup cost to single-digit milliseconds.
- **Cap-routed concurrency lanes.** The verifier worker pool splits into per-cap lanes (`SSRF: 8`, `DESERIALIZE: 2`, `CRYPTO: 1`, and so on) so a slow harness for one cap cannot head-of-line block fast ones.
- **Ship-gate budgets.** Gate 3 holds the with-verify / static-only wall-clock ratio at ≤ 1.5x on `benches/fixtures/`; Gate 6 holds the Java OWASP Benchmark `--verify` run at ≤ 15 min on CI / ≤ 10 min on the dev reference machine.
### Determinism, policy, telemetry
- **YAML policy deny list.** `src/policy.rs` is consulted before harness build. Network egress, filesystem writes outside the sandbox root, and process spawns can be denied per-rule; deny decisions land in the trace, redacted via the shared scrubber.
- **Seeded RNG.** `dynamic::rand::SpecRng` is seeded from each `HarnessSpec` hash so two runs of the same spec produce identical payloads. `scripts/check_no_unseeded_rand.sh` audits the tree for unseeded `rand` usage on every CI run.
- **`VerifyTrace` observability.** Every per-step decision (probe selection, payload mutation, oracle check, deny verdict) writes to the trace stream and the repro bundle.
- **Schema-versioned telemetry.** `events.jsonl` carries `schema_version`, `nyx_version`, `corpus_version`, `kind`, and `ts` on every envelope. PII and secret scrubbing runs on every persisted artefact via `src/utils/redact.rs`.
- **`NYX_NO_TELEMETRY=1`** disables event persistence outright.
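
To make the seeded-RNG guarantee concrete, here is a minimal sketch of a generator whose whole state comes from a spec hash, so two instances built from the same hash emit the same sequence. It uses SplitMix64 as a stand-in; the algorithm behind `SpecRng` is not stated in this changelog, and `SketchRng`, the 32-byte hash width, and the choice of the first 8 bytes as the seed are all assumptions.

```rust
/// Hypothetical deterministic generator; not the real `SpecRng`.
struct SketchRng {
    state: u64,
}

impl SketchRng {
    /// Seed from the first 8 bytes of a spec hash (assumed 32 bytes wide).
    fn from_spec_hash(hash: &[u8; 32]) -> Self {
        let mut seed = [0u8; 8];
        seed.copy_from_slice(&hash[..8]);
        SketchRng { state: u64::from_le_bytes(seed) }
    }

    /// One SplitMix64 step.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

fn main() {
    let spec_hash = [7u8; 32];
    let mut first = SketchRng::from_spec_hash(&spec_hash);
    let mut second = SketchRng::from_spec_hash(&spec_hash);
    // Same spec hash, same stream: both runs would pick identical payloads.
    for _ in 0..3 {
        assert_eq!(first.next_u64(), second.next_u64());
    }
    println!("streams match");
}
```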
### CVE corpus and ground truth
- **New `Cap` corpora.** Vulnerable + patched fixtures landed for the seven new cap classes (LDAP injection, XPath injection, header injection, open redirect, SSTI, XXE, prototype pollution) plus deserialization, crypto, JSON parsing, unauthorized-id, and data exfiltration. Every cap now carries at least one positive / negative / adversarial / unsupported fixture quad per supported language.
- **OWASP Benchmark v1.2 importer.** `tests/eval_corpus/owasp_gt_convert.py` converts the OWASP Java Benchmark expected-results manifest into Nyx ground truth and lands a 16k-line `owasp_benchmark_v1.2.json` for evaluation.
- **NIST SARD importer.** `tests/eval_corpus/sard_gt_convert.py` converts SARD test cases into the same format so cross-dataset recall numbers stay comparable.
- **Evaluation corpus tooling.** `tests/eval_corpus/run_full.sh` runs the Nyx benchmark, OWASP Benchmark, and NIST SARD evaluation sets and writes `tests/eval_corpus/results.json`. `tests/eval_corpus/report.py` and `tabulate.py` produce the per-cap and per-language summary used to track coverage and accuracy.
- **Real-corpus acceptance gates.** `scripts/m7_ship_gate.sh` adds Gate 6 (Java OWASP Benchmark v1.2), Gate 7 (NodeGoat + Juice Shop), and Gate 8 (RailsGoat, DVWA, DVPWA, gosec, RustSec). Each row enforces the per-`(cap, lang)` budget in `tests/eval_corpus/budget.toml` and publishes per-cap precision / recall / confirmed-rate against a committed ground truth. The corpora are not vendored; each row self-skips unless its `NYX_<NAME>_CORPUS` points at a checkout.
- **Per-spec cryptographic canary.** Every oracle marker is now derived from `BLAKE3(spec_hash || run_nonce)` rather than a fixed literal, so markers are unique per finding, collision-resistant against ambient harness output, and never leak to the host. A compile-time audit rejects any new ad-hoc canary.
### Engine
- **DB fast-fail preflight.** `Indexer::init` reads the first 16 bytes of any candidate SQLite file and rejects anything without the standard `SQLite format 3\0` magic. Stops a misnamed JSON / text file from corrupting the index path with a SQLite error halfway through migration.
- **Symbolic-execution coverage.** Symex now recognises a wider set of string operations (`substr`, `replace`, `to_lower`, `to_upper`, `trim`, `strlen`) per the value/transfer pipeline, and the abstract-interpretation framework reasons about interval and prefix/suffix string facts during the dynamic verdict pass.
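
The DB fast-fail preflight amounts to a 16-byte header check. The sketch below shows one way to write it with the standard library only; `has_sqlite_magic` and `preflight` are hypothetical names, and treating a file shorter than 16 bytes as "not SQLite" is an assumption about how `Indexer::init` behaves.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// The 16-byte header every SQLite 3 database file starts with.
const SQLITE_MAGIC: &[u8] = b"SQLite format 3\0";

fn has_sqlite_magic(header: &[u8]) -> bool {
    header.starts_with(SQLITE_MAGIC)
}

/// Read only the first 16 bytes and report whether they match the magic.
fn preflight(path: &Path) -> io::Result<bool> {
    let mut header = [0u8; 16];
    let mut file = File::open(path)?;
    match file.read_exact(&mut header) {
        Ok(()) => Ok(has_sqlite_magic(&header)),
        // Assumption: a file too short to hold the header is rejected.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    // A misnamed JSON file should be rejected before any migration runs.
    let path = std::env::temp_dir().join("nyx_preflight_sketch.db");
    fs::write(&path, b"{\"not\": \"a database\"}")?;
    println!("looks like SQLite: {}", preflight(&path)?);
    fs::remove_file(&path)?;
    Ok(())
}
```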
### CLI
- **`nyx scan --verify`** (enabled by default in standard builds) and `--backend {auto,process,docker}` select the dynamic-verification harness. `--no-verify` skips verification for a single run without changing config.
- **`nyx scan --harden {standard,strict}`** picks the process-backend hardening profile. `standard` is no-new-privs plus a memory rlimit on Linux. `strict` layers namespace unshare, chroot to the workdir, and a default-deny seccomp filter on Linux, or wraps the harness with `sandbox-exec` on macOS.
- **Patch-validation CI mode.** `--baseline FILE` reads a previous scan's JSON (or a stripped `.nyx/baseline.json` written by `--baseline-write`) and diffs it against the current scan on `stable_hash`, emitting `New` / `Resolved` / `FlippedConfirmed` / `FlippedNotConfirmed` transitions. `--gate {no-new-confirmed,resolve-all-confirmed}` exits non-zero when the diff violates the policy so CI fails the build instead of merging an unreviewed regression. The stripped baseline carries only `stable_hash`, `dynamic_verdict`, `severity`, `path`, and `rule_id`, so persisting it between scans does not leak source.
- **Repository triage in CI.** `nyx scan` now reads the same `.nyx/triage.json` file written by `nyx serve`. Terminal triage states (`false_positive`, `accepted_risk`, `suppressed`, `fixed`) are hidden from CLI output and excluded from `--fail-on` by default, while `--show-suppressed` includes them with `triage_state` / `triage_note` metadata for JSON, SARIF, and console output.
- **`nyx scan --verify-all-confidence`** drops the Medium cutoff and re-verifies everything.
- **`nyx scan --unsafe-sandbox`** disables hardening (development only, never for CI).
- **`nyx verify-feedback <finding_id> --wrong <reason> | --right`** records a correction or confirmation for a finding's verdict in the local telemetry log.
- **`nyx scan --explain-engine`** prints the effective engine configuration and exits without scanning.
- **`nyx surface`** (described above) with `--format {text,json,dot,svg}` and `--build`.
- **`nyx repro` subcommand.** Replays dynamic repro bundles by finding id, spec hash, or explicit bundle path, with `--docker`, `--print-path`, and `--list` helpers. The CLI now matches the browser UI's reproduced command and uses bundle manifests to bridge stable finding ids to spec-hash cache directories.
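
The patch-validation diff described above can be sketched as a comparison of two maps keyed by `stable_hash`. This is an illustration only: reducing the verdict to a `confirmed` boolean, the `diff` function, and the reading of `no-new-confirmed` as "fail on a newly confirmed or newly appearing confirmed finding" are assumptions, not the shipped logic.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Transition {
    New,
    Resolved,
    FlippedConfirmed,
    FlippedNotConfirmed,
}

/// Maps stable_hash -> "verdict is Confirmed" (a simplification).
type Scan = BTreeMap<String, bool>;

fn diff(baseline: &Scan, current: &Scan) -> Vec<(String, Transition)> {
    let mut out = Vec::new();
    for (hash, &confirmed) in current {
        match baseline.get(hash) {
            None => out.push((hash.clone(), Transition::New)),
            Some(&was) if was != confirmed => {
                let t = if confirmed {
                    Transition::FlippedConfirmed
                } else {
                    Transition::FlippedNotConfirmed
                };
                out.push((hash.clone(), t));
            }
            Some(_) => {}
        }
    }
    for hash in baseline.keys() {
        if !current.contains_key(hash) {
            out.push((hash.clone(), Transition::Resolved));
        }
    }
    out
}

/// Assumed meaning of `--gate no-new-confirmed`.
fn violates_no_new_confirmed(current: &Scan, transitions: &[(String, Transition)]) -> bool {
    transitions.iter().any(|(hash, t)| match t {
        Transition::FlippedConfirmed => true,
        Transition::New => current.get(hash).copied().unwrap_or(false),
        _ => false,
    })
}

fn scan(pairs: &[(&str, bool)]) -> Scan {
    pairs.iter().map(|(h, c)| (h.to_string(), *c)).collect()
}

fn main() {
    let baseline = scan(&[("a", false), ("b", true), ("c", true)]);
    let current = scan(&[("a", true), ("b", true), ("d", false)]);
    let transitions = diff(&baseline, &current);
    println!("{:?}", transitions);
    // A CI wrapper would exit non-zero when the gate is violated.
    println!("gate violated: {}", violates_no_new_confirmed(&current, &transitions));
}
```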
### Frontend
- **Project target selector in `nyx serve`.** The sidebar now remembers scan roots, lets you switch the active target, and accepts a new project path without restarting the server. `/api/targets` backs the selector, scans can opt into a different `scan_root`, and `nyx scan` / `nyx index build` register the projects they touch so `nyx serve` can pick them up later.
- **Surface page** with ELK auto-layout and the shared node-style palette.
- **Verdict badge** on finding detail, plus a dynamic-verdict section that surfaces the verdict, the payload that triggered it, and a link to the repro bundle.
- **Scan compare** gains a dynamic-verdict diff column so two scans can be compared on what was confirmed versus what was downgraded.
### License
- **Internal license grants documentation** at `LICENSE-GRANTS.md`. Grant 1 covers Nyctos derived works. The repo stays GPL-3.0-or-later; the grants document the scope of internal product licensing.
## [0.7.0] - 2026-05-11
A focused release that adds seven new vulnerability classes, ships two SSA sidecars for XML and XPath parser hardening, deepens cross-file authorization for FastAPI, trims roughly a thousand auth false positives on Go DAO helpers along with the dominant Hibernate Criteria SQL cluster, and runs a performance pass on the auth extractor, SCCP, and the global summaries map. A `nyx rules list` CLI surfaces the rule registry, the web UI gets a brand-aligned visual refresh, and the CVE corpus grows across Python, PHP, JavaScript, and C.
### Highlights
- New caps for LDAP injection, XPath injection, header / CRLF injection, open redirect, server-side template injection, XXE, and prototype pollution, with per-language label rules across all eight supported languages.
- Cross-file FastAPI authorization: `include_router` chains and module-level `APIRouter(dependencies=[…])` now lift onto every attached route, with `Security(..., scopes=[...])` recognised distinctly from `Depends(...)`.
- Type-tracked XML and XPath hardening through two new SSA sidecars: parser bodies that set `secure_processing` / `processEntities: false` / `resolve_entities=False`, and `XPath` instances bound to `setXPathVariableResolver(...)`, are recognised as safe.
- ~957 `go.auth.missing_ownership_check` findings closed on gitea-shaped DAO helpers (id-scalar precision pass), 169 of 216 openmrs `cfg-unguarded-sink` findings closed on Hibernate Criteria-API receivers, joomla and drupal `php.deser.unserialize` closed on `Serializable::unserialize($input)` magic-method bodies.
- `nyx rules list` CLI subcommand, brand-aligned `nyx serve` visual refresh, and regenerated README / docs screenshots and GIFs.
### Detector classes
- New `Cap` bits and canonical rule ids: `Cap::LDAP_INJECTION` / `taint-ldap-injection`, `Cap::XPATH_INJECTION` / `taint-xpath-injection`, `Cap::HEADER_INJECTION` / `taint-header-injection`, `Cap::OPEN_REDIRECT` / `taint-open-redirect`, `Cap::SSTI` / `taint-template-injection`, `Cap::XXE` / `taint-xxe`, `Cap::PROTOTYPE_POLLUTION` / `taint-prototype-pollution`. Each ships per-language sink, sanitizer, and gated-sink rules across JS/TS, Python, Java, PHP, Go, Ruby, Rust, and C/C++. Severity, OWASP 2021 mapping, and human-readable description live in `CAP_RULE_REGISTRY` in `src/labels/mod.rs`; `cap_rule_meta()` and `rule_id_for_caps()` are the public lookups.
- `Cap` widened from `u16` to `u32` to fit the new bits. `Evidence.sink_caps` and `RuleInfo.cap_bits` follow. The serde decoder accepts any unsigned integer width so caches written before the bump still load. SQLite schema bumped from 3 to 4 to force a rescan, since older `source_caps` / `sanitizer_caps` / `sink_caps` blobs were emitted before any of the new bits could appear.
- `owasp_bucket_for` consults `CAP_RULE_REGISTRY` first so adding a cap class no longer requires a second-table edit. The match requires an exact rule id or a recognised separator (` `, `(`, `.`) so a future `taint-ssrf-allowlist-violation` cannot silently inherit `taint-ssrf`'s bucket. The legacy family-token table now also routes `xpath`, `header`, and `xxe` to A03 / A05.
- `issue_category_label` (dashboard badge) routes the seven new rule-id prefixes to dedicated labels: LDAP Injection, XPath Injection, Header Injection, Open Redirect, Template Injection, XXE, Prototype Pollution.
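
The separator-aware match used by `owasp_bucket_for` can be illustrated as follows. This is a sketch of the stated rule (exact rule id, or the rule id followed by a space, `(`, or `.`), not the function's actual body; `matches_rule` and the sample strings are hypothetical.

```rust
/// True when `candidate` is exactly `rule_id`, or `rule_id` followed by one of
/// the recognised separators: space, '(' or '.'.
fn matches_rule(candidate: &str, rule_id: &str) -> bool {
    match candidate.strip_prefix(rule_id) {
        Some("") => true,
        Some(rest) => rest.starts_with(|c: char| matches!(c, ' ' | '(' | '.')),
        None => false,
    }
}

fn main() {
    for candidate in [
        "taint-ssrf",
        "taint-ssrf (source 3:1)",
        "taint-ssrf-allowlist-violation",
    ] {
        // The last candidate shares the prefix but must not inherit the bucket.
        println!("{candidate}: {}", matches_rule(candidate, "taint-ssrf"));
    }
}
```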
### Engine
- **XML-parser configuration tracking.** `src/ssa/xml_config.rs` runs alongside type-fact analysis and carries per-receiver `secure_processing` / `disallow_doctype` / `external_entities` flags forward through copy assignments and phi joins (meet for safe flags, sticky union for the unsafe `external_entities` polarity). `xxe_safe()` queries the result at the type-qualified `XmlParser.parse` sink and strips `Cap::XXE` when the parser was provably hardened (JAXP `setFeature(FEATURE_SECURE_PROCESSING, true)`, lxml `XMLParser(resolve_entities=False, no_network=True)`, fast-xml-parser `processEntities: false`). Persisted to `OptimizeResult.xml_parser_config`.
- **XPath-receiver configuration tracking.** `src/ssa/xpath_config.rs` mirrors the XML sidecar for Java's `XPath` instances: `setXPathVariableResolver(...)` flips the receiver's `has_resolver` flag, copy assignments union, phi joins meet. `xpath_safe()` strips `Cap::XPATH_INJECTION` at `xpath.evaluate(expr, ...)` / `xpath.compile(expr)` sinks when the receiver was provably bound to a resolver. Persisted to `OptimizeResult.xpath_config`.
- **Five new `TypeKind` variants.** `LdapClient` (JNDI `InitialDirContext` / `InitialLdapContext`, Spring `LdapTemplate`, ldapjs `createClient`, python-ldap `initialize`, ldap3 `Connection`), `XPathClient` (JAXP `newXPath`, lxml `etree.XPath`, npm `xpath`), `XmlParser` (JAXP factory products: `newDocumentBuilder`, `newSAXParser`, `getXMLReader`), `Template` (FreeMarker `new Template(...)` / `Configuration.getTemplate`), and `NullPrototypeObject` for JS/TS values produced by `Object.create(null)`. Wired into `constructor_type` for return-type inference and `TypeKind::label_prefix()` for type-qualified callee resolution. `XPathClient` is kept distinct from `DatabaseConnection` so a generic `pdo->query` SQL_QUERY sink does not collide with `xpath.query`.
- **`GateActivation::LiteralOnly`.** Strict literal-value activation: the gate fires only when the activation argument is a literal that matches `dangerous_values` / `dangerous_prefixes`. An unknown or dynamic activation argument suppresses the gate (no conservative `ALL_ARGS_PAYLOAD` push). Used where the dangerous shape is identifiable only by an explicit literal flag, e.g. `jQuery.extend(true, target, src)` deep-merge against Backbone's `Model.extend({proto})`.
- **Two new path-state predicates for inline open-redirect sanitisers.** `RelativeUrlValidated` covers `x.startsWith("/")`, `x.starts_with("/")`, `x.startswith("/")`, PHP `strpos($x, "/") === 0`, and direct `x[0] === "/"`. `HostAllowlistValidated` covers `new URL(x).host === ALLOWED`, `urlparse(x).netloc == ALLOWED`, multi-statement `parsed.host_str() == "..."` for Rust, and `parsed.Host == "..."` / `parsed.Hostname() == "..."` for Go. Both clear `Cap::OPEN_REDIRECT` only on the validated branch, leaving any non-redirect taint downstream to fire on its own caps. The Go form gates on case-sensitive capital `H` so a lowercase `u.host == X` field comparison falls through to the generic `Comparison` predicate.
- **`Object.create(null)` recogniser.** `is_object_create_null_call` in `cfg/literals.rs` matches `Object.create(null)` (and parenthesised, awaited, or TS type-cast wrappers) and tags `CallMeta.produces_null_proto = true`. Type-fact analysis lifts the flag to `TypeKind::NullPrototypeObject` on the returned SSA value so the synthetic `__index_set__` sink is suppressed flow-sensitively. Phi joins drop the tag back to `Unknown` so a partial null-proto receiver still fires on the unsafe path.
- **CFG-layer prototype-pollution suppression** at the synthetic `__index_set__` sink (JS/TS, recognised by the existing `try_lower_subscript_write` lowering). Three flow-insensitive shapes elide the `Sink(PROTOTYPE_POLLUTION)` label before SSA sees the node: constant-key fold (literal key not in `__proto__` / `constructor` / `prototype`), reject pattern (sibling `if (idx === "__proto__" || ...) return / throw / break;`), and allowlist pattern (ancestor `if (idx === "name" || idx === "id") { obj[idx] = v }`). Walks stop at the enclosing function so closure-captured guards in an outer scope cannot silently authorise inner assignments.
- **Spring MVC `return "redirect:" + tainted` recogniser** (Java). `try_lower_spring_redirect_return` in `cfg/mod.rs` matches the leftmost `+`-chain whose root is a `redirect:` string literal and emits a synthetic `__spring_redirect__` Call sink with `Sink(Cap::OPEN_REDIRECT)` between the predecessors and the Return node. Concatenated identifiers from anywhere in the right-hand chain feed the synthetic node's `arg_uses[0]`, so the taint pipeline carries any tainted suffix through OPEN_REDIRECT.
- **Subscript-set form classification for header sinks.** `response.headers["X-Foo"] = bar` / `headers["X-Foo"] = bar` (Ruby `element_reference`, JS/TS `subscript_expression`, Python `subscript`) had no `property` field on the LHS. `push_node` now walks into the subscript's `object` and classifies its member-expression text, so `Cap::HEADER_INJECTION` fires on the bare bracket form alongside `setHeader` / `res.set` / `headers_mut.insert`.
- **PHP literal extraction** extended in `cfg/literals.rs`: PHP `encapsed_string` (double-quoted) when every child is a pure-literal segment; boolean literals (`true` / `false`) for the jQuery `extend(true, ...)` `LiteralOnly` gate; leading-string `binary_expression` concat (`"Location: " . $url`, JS/TS `"Location: " + url`) so `dangerous_prefixes` matching activates on partially dynamic concatenations.
- **PHP receiver-text strip** in `helpers::root_receiver_text` drops the leading `$` from `variable_name` nodes so `$smarty->fetch(...)` / `$twig->createTemplate(...)` reconstruct as `Smarty.fetch` / `Environment.createTemplate` for suffix-matcher gates.
- **Gate-callee resolution hardening for member-source rewrites.** When `first_member_label` rewrites a call's `text` to a Source like `req.body`, the gate matcher now reads the call's `function` / `method` / `name` field instead, so `setValue(target, req.body, ...)` matches the `setValue` proto-pollution gate. Whitespace stripped from the function field so multi-line chains still match flat gate matchers.
- **Ruby option-constant lookup in gate activation.** Bare `scope_resolution` / `constant` nodes (`Nokogiri::XML::ParseOptions::NOENT`) now fall back to the macro-arg extractor used by C/C++/PHP, so Nokogiri XXE gates activate on idiomatic option-flag arguments.
- **PHP `unary_op_expression` negation recognition.** tree-sitter-php emits `unary_op_expression` for unary `!`; CFG `detect_negation` and condition-chain decomposition now match it, so `if (!validate($x))` no longer carries `condition_negated=false` and the surviving branch is the rejection arm, not the validated one.
- **PHP container kinds.** `declaration_list`, `interface_declaration`, `trait_declaration`, `enum_declaration`, `enum_declaration_list` mapped to `Kind::Block` so methods inside them participate in CFG construction.
- **Go variadic `parameter_declaration` named-field handling** for `collect_param_names`. `name` and `type` named fields read directly so type-segment identifiers no longer pollute the param-name set (`info *PackageInfo` no longer contributes `PackageInfo`).
- **Empty-formals SSA lowering signal.** Per-parameter summary probing now seeds via `BodyMeta.param_destructured_fields`; JS/TS arrow `() => {…}` lowers with `with_params=true` so it is treated as "explicitly zero formals" rather than "no formals info".
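
The phi-join behaviour described for the XML-parser sidecar (meet for the safe flags, sticky union for the unsafe `external_entities` flag) can be sketched with plain booleans. The `XmlConfig` struct, `phi_join`, and especially the `xxe_safe` predicate below are assumptions for illustration; the real `xxe_safe()` may weigh the flags differently.

```rust
/// Hypothetical per-receiver parser facts.
#[derive(Debug, Clone, Copy, PartialEq)]
struct XmlConfig {
    secure_processing: bool,
    disallow_doctype: bool,
    external_entities: bool,
}

/// Join two incoming facts at a phi node.
fn phi_join(a: XmlConfig, b: XmlConfig) -> XmlConfig {
    XmlConfig {
        // Safe flags meet: they hold only if they hold on every path.
        secure_processing: a.secure_processing && b.secure_processing,
        disallow_doctype: a.disallow_doctype && b.disallow_doctype,
        // The unsafe flag is sticky: one path enabling it taints the join.
        external_entities: a.external_entities || b.external_entities,
    }
}

/// Assumed safety predicate: hardened on all paths, never re-opened.
fn xxe_safe(c: XmlConfig) -> bool {
    !c.external_entities && (c.secure_processing || c.disallow_doctype)
}

fn main() {
    let hardened = XmlConfig {
        secure_processing: true,
        disallow_doctype: false,
        external_entities: false,
    };
    let default = XmlConfig {
        secure_processing: false,
        disallow_doctype: false,
        external_entities: false,
    };
    // Hardening on only one branch is lost at the join.
    println!("{}", xxe_safe(phi_join(hardened, default)));
    println!("{}", xxe_safe(phi_join(hardened, hardened)));
}
```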
### Authorization
- **FastAPI cross-file `include_router` dependency tracking.** `auth_analysis/router_facts.rs` captures per-file router declarations (`<router> = X(deps=[…])`) and `<parent>.include_router(<child_module>.<child_var>)` edges in pass 1, persists them into `GlobalSummaries::router_facts_by_module`, and resolves them into the active file's `AuthorizationModel::cross_file_router_deps` at pass 2 entry. Transitive lifts (grandparent to parent to child) handled by iterative index walk. Module identity is the file basename without `.py`. Closes the airflow execution-API shape where a child router lives in `routes/task_instances.py` and its auth is declared on the parent in `routes/__init__.py`.
- **FastAPI router-level `dependencies=[...]` propagation.** Module-level `router = APIRouter(dependencies=[Security(...)])` is pre-walked once per file and merged onto every `@<router>.<verb>(...)` route attached in the same file. Closes airflow execution-API routes that re-use a single `ti_id_router` declared once at module scope.
- **FastAPI `Security(callable, scopes=[...])` recognised distinctly from `Depends(callable)`.** Scoped Security promotes the synthetic `AuthCheck` to `AuthCheckKind::Other` (route-level scope-checked authorization), not Login. New scope-tracking boolean threaded through `expand_decorator_calls` and `extract_fastapi_dependencies`.
- **Caller-scope IPA: same-file route-handler-to-helper auth lift.** `apply_caller_scope_propagation` walks every non-route helper unit; if its in-file callers are non-empty AND every caller is itself an authorized route handler (route-level non-Login auth check) or already authorized via this same propagation, the caller's checks lift onto the helper as synthetic `is_route_level=true` `AuthCheck`s. Iterated to a small fixpoint so transitive helper chains (route to mid_helper to leaf_helper) are covered. Refuses to authorize helpers with no in-file caller, helpers called from a mix of authorized and unauthorized callers, and helpers called only from un-lifted helpers. Cross-file lifting is not implemented. Closes the dominant FastAPI / Django / Flask "route authenticates via decorator/dependency, then delegates to a private helper that performs the sink" FP shape on sentry / saleor / airflow.
- **Go DAO-helper id-scalar precision pass.** For non-route Go units, a parameter whose declared type is a bounded primitive scalar (`int64`, `uint32`, `string`, `bool`, `byte`, `rune`, `float64`, …) and whose name is id-shaped (`id`, `*Id`, `*_id`, `*ids`) is dropped from `unit.params` before ownership-check evaluation. Real Go HTTP handlers always carry a framework-request-typed param (`*http.Request`, `*gin.Context`, `echo.Context`, `*fiber.Ctx`); per-framework route extractors set `include_id_like_typed=true` so id-shaped path params survive on real routes. Mirrors the existing Python `is_python_id_like_typed_param` filter. Closes ~957 `go.auth.missing_ownership_check` findings on gitea backend DAO helpers (`func GetRunByRepoAndID(ctx, repoID, runID int64)`, `func DeleteRunner(ctx, id int64)`, the entire `models/...` layer where the ownership check sits in the calling route handler) and equivalent shapes in minio / Go ORM codebases.
- **Bare-callee verb-name fallback gate.** `list(...)`, `filter(...)`, `update(...)`, `create_audit_entry(...)`, `update_coding_agent_state(...)` (no receiver dot at all) no longer classify as `DbMutation` / `DbCrossTenantRead` via the loose verb-name fallback. Real ORM/DB calls carry a receiver (`User.find(id)`, `Model.objects.filter`, `repo.save(x)`); a bare `list(events)` is the Python builtin and `filter(fn, xs)` is `Iterable.filter`. New helper `receiver_is_simple_chain(callee)` requires a non-chained receiver dot. The realtime / outbound / cache prefix dispatches still match by chain root.
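
The caller-scope lift described above amounts to a small fixpoint over the in-file call graph. A minimal sketch, assuming a simplified model in which units are plain names; `lift_caller_scope` and its inputs are hypothetical stand-ins, not the real `apply_caller_scope_propagation` signature:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for the propagation pass.
/// `authorized_routes`: route handlers carrying a route-level, non-Login check.
/// `callers`: helper name -> names of its in-file callers.
/// Returns the helpers that may inherit auth from their callers.
fn lift_caller_scope(
    authorized_routes: &HashSet<&str>,
    callers: &HashMap<&str, Vec<&str>>,
) -> HashSet<String> {
    let mut lifted: HashSet<String> = HashSet::new();
    loop {
        let mut changed = false;
        for (helper, in_file_callers) in callers {
            // Helpers with no in-file caller are never authorized.
            if lifted.contains(*helper) || in_file_callers.is_empty() {
                continue;
            }
            // Every caller must be an authorized route or an already-lifted helper.
            let all_ok = in_file_callers
                .iter()
                .all(|c| authorized_routes.contains(*c) || lifted.contains(*c));
            if all_ok {
                lifted.insert((*helper).to_string());
                changed = true;
            }
        }
        // Stop once a full pass adds nothing: the fixpoint is reached.
        if !changed {
            return lifted;
        }
    }
}
```

In this model a chain `route -> mid_helper -> leaf_helper` lifts both helpers over two passes, while a helper with one unauthorized caller is left alone.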
### Type-aware sinks and validators
- **Java JPA / Hibernate Criteria API as structural SQL.** `TypeKind::JpaCriteriaQuery` covers `CriteriaQuery<T>`, `CriteriaUpdate<T>`, `CriteriaDelete<T>`, `Subquery<T>`, `TypedQuery<T>`. `sink_args_jpa_criteria_query_safe` clears `cfg-unguarded-sink` SQL_QUERY when any positional argument to the sink call is JpaCriteriaQuery-typed (receiver excluded; receiver of `session.createQuery(cq)` is the Session/EntityManager channel, never the SQL payload). `cb.createQuery(...)`, `em.getCriteriaBuilder()`, and the JpaCriteriaQuery type chain are inferred via constructor / factory return-type hints in `type_facts.rs`. Closes the dominant FP cluster on openmrs (169 of 216 cfg-unguarded-sink), xwiki, and keycloak Hibernate DAO methods.
- **Receiver-side validator registry.** `labels::lookup_receiver_validator(lang, callee)` clears `Cap` from the receiver value (and call equivalents) on success, distinct from `Sanitizer` which clears caps from the return value. Python registers `relative_to => Cap::FILE_IO` so `path.relative_to(base)` drops the file-IO cap on the path. Closes the CVE-2024-23334 patched aiohttp `static_root_path.joinpath(filename).resolve().relative_to(static_root_path)` shape.
- **JS/TS Array-method validator-callback narrowing.** `arr.filter(isSafeIdentifier)`, `arr.find(isValidId)`, `arr.findLast(...)` with a `BooleanTrueIsValid` callback (`isValid…`, `isSafe…`, `hasValid…` and snake-case variants) propagate `validated_must` through the call's return value. Resolves callback name from `info.arg_callees` (call-shape arguments) and SSA `value_defs[v].var_name` (bare-identifier callbacks, the dominant patched-CVE form). Strict-additive: anonymous arrows / opaque identifiers leave existing propagation untouched. `findIndex` / `every` / `some` excluded (scalar return shape). Motivated by CVE-2026-42353.
- **JS/TS ternary-branch source classification.** `let arr = cond ? req.query.lng : "";` previously lowered each branch to a labelless Assign with empty uses; the join phi saw no taint. `lower_ternary_branch` now runs `first_member_label` on the branch AST when no `Source` label is already attached.
- **PHP `fopen` modeled as `Sink(Cap::SSRF)`** (same dual SSRF / LFI shape as `file_get_contents`; fires only on tainted argument). Closes CVE-2026-33486 (roadiz/documents `DownloadedFile::fromUrl` wrapping `fopen($url, 'r')`).
- **PHP `Serializable::unserialize($input)` magic-method passthrough recognition.** The legacy `Serializable` interface contract (deprecated since PHP 8.1) requires the implementation to call `\unserialize($input)` on the formal parameter inside `public function unserialize($x) { ... }`. PHP itself invokes the method when restoring an instance, so the body's call cannot be removed without breaking the interface. `php.deser.unserialize` now suppresses inside this exact shape (method named `unserialize`, single formal, bare-parameter argument). Class-level `Serializable` implementation is the actionable signal (fix is migration to `__serialize` / `__unserialize`). Closes joomla / drupal Serializable-implementing class FPs.
- **SQLAlchemy query-builder chained-call recognition.** `select(X).filter_by(...)`, `query(X).filter(...)`, `select().join().where()` chains now anchor through the chain root primitive when the chain receiver type is opaque. New `db_query_builder_roots` config (Python defaults: `select`, `query`). Closes airflow `session.scalar(select(C).filter_by(conn_id=user_input))` shapes that previously dropped under the chained-call suppression in `classify_sink_class`.
- **Python non-sink container constructor recognition.** Bare-callee `set()` / `dict()` / `list()` / `tuple()` / `frozenset()` / `defaultdict(...)` is treated as a non-sink constructor, so `verified_ids = set(); verified_ids.update(myteams)` does not classify the `.update` call as `DbMutation`. Type-annotation hint form `set[int]` / `dict[str, int]` recognised via PEP 585 generic suffix strip alongside the existing angle-bracket strip.
- **Python `request.match_info` source label** (aiohttp path-parameter source).
- **New Python pattern `py.xss.make_response_format` (Tier B).** Flask `make_response(<f-string-or-concat>)` reflection. Recognises both bare `make_response(...)` and `flask.make_response(...)`. Closes CVE-2023-6568 (mlflow auth `create_user` reflecting attacker-controlled `Content-Type` header into the response body).
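
The PEP 585 suffix strip mentioned in the container-constructor entry can be pictured as one normalisation step applied before the name lookup. A rough sketch under that assumption; `strip_generic_suffix` and `is_non_sink_container` are hypothetical names, and the name list is copied from the entry above:

```rust
/// Hypothetical helper: drop a generic suffix written with square brackets
/// (PEP 585, `dict[str, int]`) or angle brackets (`Set<Integer>`).
fn strip_generic_suffix(hint: &str) -> &str {
    let end = hint
        .find(|c: char| c == '[' || c == '<')
        .unwrap_or(hint.len());
    hint[..end].trim()
}

/// Hypothetical check against the Python container constructors listed above.
fn is_non_sink_container(hint: &str) -> bool {
    matches!(
        strip_generic_suffix(hint),
        "set" | "dict" | "list" | "tuple" | "frozenset" | "defaultdict"
    )
}
```

With this shape, `set[int]` and a bare `set` resolve to the same base name, so a later `.update(...)` on either is not mistaken for a database mutation.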
### Language coverage
Per-language label rules expanded for the seven new caps.
- **JavaScript / TypeScript:** ldapjs `LdapClient.search`, `escapeXpath` / `xpathEscape`, `document.evaluate` / npm `xpath.select`, `setHeader` / `res.set` / `res.append` / `res.headers[]=`, `stripCRLF` / `escapeHeader`, lodash / dot-prop / object-path deep-merge prototype-pollution gates, Handlebars / EJS / Mustache template sinks, fast-xml-parser / xml2js with `processEntities`-aware activation, `redirect` / `Location` open-redirect sinks.
- **Python:** python-ldap `LDAPObject.search_s`, ldap3 `Connection.search`, lxml `etree.XPath` / `lxml.etree.parse` with parser-config awareness, Flask `response.headers[]=` / `make_response`, Jinja2 `Template(...)` and Mako `Template(...)` SSTI sinks, `flask.redirect` / `aiohttp HTTPFound` open-redirect.
- **Java / Kotlin:** `DirContext.search`, `XPath.evaluate` / `XPath.compile`, JAXP `DocumentBuilder.parse` / `SAXParser.parse` / `XMLReader.parse`, FreeMarker `Template.process`, Spring `redirect:` view-name synthetic sink, `HttpServletResponse.setHeader` / `addHeader`.
- **PHP:** `ldap_search` / `ldap_list` / `ldap_read`, `DOMXPath::query` / `DOMXPath::evaluate`, `header()` with leading-prefix activation, Smarty `fetch` / Twig `createTemplate` / Blade compile + `eval` template forms, `loadXML` / `simplexml_load_string` with `LIBXML_NOENT` activation.
- **Go:** `go-ldap conn.Search`, `etree.Path` / `xmlpath.Compile`, `http.Header.Set` / `Response.Header().Set`, `html/template` and `text/template` `Parse(...)`, `encoding/xml.Unmarshal` / `Decoder.Decode`, `http.Redirect` with relative-URL / host-allowlist gating.
- **Ruby:** `Net::LDAP#search`, `Nokogiri::XML::Document#xpath`, `response.headers[]=`, `ERB.new` SSTI, `Nokogiri::XML.parse` with `NOENT` / `DTDLOAD` activation, `redirect_to` with relative-URL gate.
- **C / C++:** libldap `ldap_search_ext_s`, libxml2 `xmlXPathEval`, `curl_easy_setopt` with header-list activation, libxml2 `xmlReadFile` / `xmlReadMemory` with `XML_PARSE_NOENT` activation.
- **Rust:** actix-web `HeaderMap.insert` / `HeaderValue::from_str` header-injection gates. `Redirect::to` retagged from `Cap::SSRF` to `Cap::OPEN_REDIRECT` so the open-redirect rule fires distinctly from the SSRF rule.
`NYX_PYTHON_PROTO_POLLUTION` opt-in flag: Python `dict.update` / `__dict__.update` proto-pollution gates are off by default because bare `update` overlaps too broadly with `Counter.update` and ordinary state-mutation patterns to ship as a default sink.
### CVE corpus
- **C.** CVE-2017-1000117 (git argv injection via `ssh://-oProxyCommand=…`) vulnerable + patched fixtures under `tests/benchmark/cve_corpus/c/CVE-2017-1000117/`. Known remaining gaps: array-element taint propagation, `c.cmdi.exec*` AST patterns, and dash-prefix-byte sanitizer recognition.
- **Python.** CVE-2023-6568 (mlflow reflected XSS), CVE-2024-21513 (langchain SQL / Jinja), CVE-2024-23334 (aiohttp static-file path traversal) vulnerable + patched fixtures.
- **PHP.** CVE-2026-33486 (roadiz/documents SSRF) vulnerable + patched fixtures.
- **JavaScript.** CVE-2026-42353 (i18next-http-middleware path traversal) vulnerable + patched fixtures.
### CLI
- **`nyx rules list`** subcommand. Surfaces the same registry the dashboard's `/api/rules` page reads from: built-in cap-class entries (one per `Cap` with a canonical rule id), per-language label rules (sink / source / sanitizer), gated sinks, and any custom rules from config. Filters: `--lang <slug>`, `--kind <class|source|sink|sanitizer>`, `--class-only` for registry entries only, `--no-class` for per-language rules only. `--json` for machine output. Cap-class entries carry `language = "all"` so a language filter still surfaces them unless `--no-class` is set.
- **`RuleInfo.is_class` / `RuleInfo.emission_active` flags.** Cap-class entries carry `is_class = true` so dashboards can group them separately. `emission_active = false` marks legacy classes (SQL_QUERY, SSRF, FILE_IO, FMT_STRING, DESERIALIZE, CODE_EXEC, CRYPTO) whose findings still surface under the catch-all `taint-unsanitised-flow` rule id; the seven new classes plus `unauthorized_id` and `data_exfil` are `emission_active = true`. The active set is pinned in `cap_rule_registry_emission_active_set_is_pinned` so a future migration of a legacy cap cannot drift silently.
- **`parse_cap` and `CapName::FromStr`** accept the new short names: `ldap_injection` / `ldapi`, `xpath_injection` / `xpathi`, `header_injection` / `crlf` / `response_splitting`, `open_redirect` / `redirect`, `ssti` / `template_injection`, `xxe`, `prototype_pollution` / `proto_pollution`, plus the existing `data_exfil` alias. The `nyx config add-rule --cap` flag and `[analysis.languages.*.rules]` entries take any of these.
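
The short-name aliases above map naturally onto a `FromStr` implementation. A minimal sketch, assuming a hypothetical `CapSketch` enum that covers only the seven new classes; the real `Cap` / `CapName` types and their error type are not shown in this changelog:

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the analyzer's cap type (seven new classes only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CapSketch {
    LdapInjection,
    XpathInjection,
    HeaderInjection,
    OpenRedirect,
    Ssti,
    Xxe,
    PrototypePollution,
}

impl FromStr for CapSketch {
    type Err = String;

    /// Accepts the canonical name or any alias listed in the entry above.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "ldap_injection" | "ldapi" => Ok(Self::LdapInjection),
            "xpath_injection" | "xpathi" => Ok(Self::XpathInjection),
            "header_injection" | "crlf" | "response_splitting" => Ok(Self::HeaderInjection),
            "open_redirect" | "redirect" => Ok(Self::OpenRedirect),
            "ssti" | "template_injection" => Ok(Self::Ssti),
            "xxe" => Ok(Self::Xxe),
            "prototype_pollution" | "proto_pollution" => Ok(Self::PrototypePollution),
            other => Err(format!("unknown cap name: {other}")),
        }
    }
}
```

Exact, case-sensitive matching is a choice of this sketch; the changelog does not say how the real parser treats case or surrounding whitespace.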
### Frontend
- **Refreshed local web UI visual system** around the mint-cyan Nyx brand: warmer light surfaces, deep green accents, updated severity / confidence colors, tighter typography, smaller radii, denser cards, table, badge, button, header, and sidebar styling, and matched graph / code-viewer colors.
- **Reworked `nyx serve` surfaces** for a more operational layout. Overview uses the refreshed health-score card and chart grid; Scans has a fixed compact table with capped language badges; Scan Detail places summary and timing data side by side; Triage, Rules, Config, Explorer, Finding Detail, Scan Compare, and Debug pages received focused spacing, overflow, and density fixes.
- **Branded asset set** shared between the SPA and the embedded server bundle: PNG favicons, Apple touch icon, sidebar logo image, refreshed SVG favicon, and Rust static handlers for the new `/logo.png` and favicon files.
- **Frontend `RuleListItem` and `RuleDetailView`** carry the new `is_class` flag so the dashboard's Rules page can group cap-class entries separately.
- **Regenerated README and docs screenshots and GIFs** against the new UI at 1600x992, saving raw originals before framing and adding CLI GIF plus combined CLI-to-serve demo GIF capture support. Extended the screenshot capture workflow with mint-led framing copy, optional `nyxscan.dev` asset mirroring, WebP regeneration for mirrored PNGs, and raw `_raw` image / GIF outputs for downstream reuse.
### Performance
- **Hoisted `collect_top_level_units` out of the per-extractor loop** in `extract_authorization_model`. Multi-extractor languages (Go gin+echo, JS/TS express+koa+fastify, Python flask+django, Rust axum+actix_web+rocket, Ruby sinatra) had been re-walking the entire AST and rebuilding the `Function`-kind unit set per extractor, then deduping by span. New `AuthExtractor::requires_top_level_units()` opt-out for Spring / Rails which build their own. Was 46% of `extract_authorization_model` wall-clock on the mattermost/server/channels/app subtree.
- **Single `AuthorizationModel` build per file in fused mode.** The diag path and the per-file summary path each ran their own `extract_authorization_model`, duplicating the hoisted unit pass and every framework extractor's AST walk. Auth summaries now extract from the base model (pre var-types, pre helper-lifting) so the persisted per-file summary matches the legacy `extract_auth_summaries_by_key` path bit-for-bit.
- **O(N) shallow value-ref emission in `collect_unit_state`.** The previous per-node `extract_value_refs(node, bytes)` walked the entire subtree on every recursion level (O(N²) per body) even though the recursion below already visits every descendant once. New `append_shallow_value_ref` emits the node's own ref and lets recursion handle the descent. Public callers of `extract_value_refs` (`collect_call`, `collect_condition`, assignment-side extraction) keep the deep walk. Was ~17% + 15% + 11% of wall-clock split across `build_function_unit_with_meta`, `collect_unit_state`, and `extract_value_refs` on mattermost.
- **Per-`ParsedFile` `body_const_facts_cache: OnceCell`.** SSA + const-prop + type-fact build was running 2-3× per body across `run_cfg_analyses_with_lowered`, `run_auth_analyses`, and `collect_file_var_types`. Single-pass cache; gin profile dropped from 13.6% to ~4.5%.
- **SCCP switched from `HashMap<SsaValue, _>` and `HashSet<(BlockId, BlockId)>`** to dense `Vec` per-value lattice and per-destination predecessor `SmallVec<[BlockId; 2]>`. The inner fixed-point loop no longer SipHashes a 64-bit pair for every operand of every phi. Public `ConstPropResult` shape unchanged (one final O(num_values) HashMap conversion).
- **`GlobalSummaries.by_key` switched to `FxHashMap`** (rustc-hash 2.1) from stdlib SipHash. `FuncKey` carries 3 String fields, so any HashMap operation hashes at least 30 bytes; FxHash is ~5× faster on this workload. Seed is fixed (no DoS hardening), fine for an in-process index keyed by program-derived names.
- `large_go_module.go` perf fixture (1493 lines) added to `benches/perf_fixtures/`; `benches/scan_bench.rs` extended with auth-extractor, SCCP, and summary-resolution rows.
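
The O(N²) to O(N) change in `collect_unit_state` comes down to not re-walking a subtree that the recursion will visit anyway. A toy illustration on a hypothetical `Node` type (the real pass walks tree-sitter nodes); it only counts emitted labels and makes no claim about what the real value refs contain:

```rust
/// Hypothetical minimal AST node used only for this illustration.
struct Node {
    label: &'static str,
    children: Vec<Node>,
}

/// Deep extraction: emits the node and every descendant.
fn extract_deep(node: &Node, out: &mut Vec<&'static str>) {
    out.push(node.label);
    for child in &node.children {
        extract_deep(child, out);
    }
}

/// Quadratic shape: a deep walk at every recursion level.
fn collect_quadratic(node: &Node, out: &mut Vec<&'static str>) {
    extract_deep(node, out);
    for child in &node.children {
        collect_quadratic(child, out);
    }
}

/// Linear shape: emit only the node's own entry and let recursion descend.
fn collect_linear(node: &Node, out: &mut Vec<&'static str>) {
    out.push(node.label);
    for child in &node.children {
        collect_linear(child, out);
    }
}
```

On a three-node chain the quadratic form performs 3 + 2 + 1 = 6 emissions and the linear form 3; the gap widens with nesting depth.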
### Fixed (false positives)
- `Object.create(null)` receivers no longer fire prototype-pollution at the synthetic `__index_set__` sink. Suppression is flow-sensitive via `TypeKind::NullPrototypeObject` so a phi join that only sometimes resolves to a null-proto receiver still fires on the unsafe path.
- `cfg-unguarded-sink` over-fires on JS/TS object-literal property writes guarded by an explicit `__proto__` / `constructor` / `prototype` reject `if` (early `return` / `throw` / `break`) or by an allowlist `if` whose true arm contains the assignment. Resolved at the CFG layer before the SSA sink scan.
- Spring MVC `return "redirect:" + url` flagged generic `taint-unsanitised-flow` even when the redirect destination was the load-bearing taint. Now routed through the synthetic `__spring_redirect__` sink so the finding emerges as `taint-open-redirect`.
- `$smarty->fetch(...)` / `$twig->createTemplate(...)` no longer drop their SSTI gate match on idiomatic PHP receiver shapes.
- `setValue(target, req.body, ...)` and similar wrappers no longer gate-match on the rewritten Source `req.body` text.
- Nokogiri / lxml / fast-xml-parser parser bodies hardened with `setFeature` / `processEntities: false` / `XMLParser(resolve_entities=False)` no longer fire `taint-xxe`.
- `XPath` instances bound to `setXPathVariableResolver(...)` no longer fire `taint-xpath-injection` on subsequent `xpath.evaluate(expr, ...)` sinks.
- Inline `if (!url.startsWith("/")) reject` and `if (new URL(url).host !== ALLOWED) reject` open-redirect sanitisers narrow `Cap::OPEN_REDIRECT` on the validated branch instead of falling through to the generic `Comparison` predicate. Other taint downstream still fires on its own caps.
- Rust `Redirect::to` no longer fires `taint-ssrf` for what is structurally an open redirect; retagged to `Cap::OPEN_REDIRECT`.
- ~957 gitea backend DAO `go.auth.missing_ownership_check` findings (id-scalar precision pass).
- 169 of 216 openmrs `cfg-unguarded-sink` findings (JpaCriteriaQuery type). Equivalent reductions on xwiki / keycloak Hibernate DAO clusters.
- joomla and drupal `php.deser.unserialize` flagged inside `Serializable::unserialize($input)` magic-method bodies.
- airflow execution-API routes flagged `missing_ownership_check` despite being authorized via cross-file `include_router` chains and module-level `APIRouter(dependencies=[…])` declarations.
- sentry `verified_ids = set(); verified_ids.update(myteams)` flagged as `DbMutation`.
- aiohttp `path.relative_to(static_root_path)` not recognised as a path-traversal validator.
- i18next-http-middleware `arr.filter(utils.isSafeIdentifier)` not narrowing taint on the result.
- `cond ? req.query.lng : ""` ternary lost `Source` label on the truthy branch.
- `if (!validate($x))` rejection-arm narrowing flipped on PHP unary `!`.
- mlflow `make_response(f"Invalid content type: '{content_type}'")` (Tier B pattern).
- Bare-callee verb-name dispatch on Python builtins / locally-defined helpers (`list`, `filter`, `update`, `create_audit_entry`, `update_coding_agent_state`).
- FastAPI `Depends(...)` / `Security(...)` deps declared on a module-level `APIRouter` no longer dropped on every attached route.
- FastAPI `Security(callable, scopes=[...])` no longer downgraded to a Login-only check.
### Tests
- New per-cap integration suites: `tests/{xpath_injection,xxe,ssti,prototype_pollution,header_injection,open_redirect,ldap_injection}_tests.rs`, plus `python_proto_pollution_tests.rs` for the env-gated Python form. Per-cap fixture trees under `tests/fixtures/<class>/<lang>/` cover safe, unsafe, and irrelevant-baseline shapes for every supported language.
- Cross-file FastAPI integration test `tests/fastapi_cross_file_include_router_tests.rs` with airflow-shaped fixture tree under `tests/fixtures/auth_cross_file/airflow_execution_api_includes/`.
- New `cfg/cfg_tests.rs` covers ternary-branch CFG lowering shapes.
- New `summary/tests.rs` covers cross-file `include_router` summary persistence and resolution.
- Per-language safe / vuln auth and detector fixtures across Python, Java, Go, PHP, JS, TS.
### Other
- Refactor passes across `auth_analysis`, `ssa/const_prop`, `ssa/type_facts`, `summary`, and the per-framework auth extractors (cleaner conditional checks, simpler function signatures, deduplicated assertions). No behaviour change.
- README links to a Simplified Chinese translation (`README.zh-CN.md`).
## [0.6.1] - 2026-05-03
A precision pass on auth and resource analysis, three fresh CVE corpus pairs, and a fix for a UTF-8 slice panic in the path abstract domain. Closes ~1900 Go auth FPs on gitea-shaped helpers, the mastodon/diaspora private-callback Ruby controller pattern, and a phantom-taint outbreak from JS/TS / Java lambda shorthand in jest-style nested test callbacks.
### Added
- Java JDBC raw-SQL sinks. `Statement.execute`, `Statement.executeBatch`, and `Statement.executeLargeUpdate` modeled as `SQL_QUERY` sinks, classified via type-qualified resolution (`DatabaseConnection.execute`) so bare `execute` (Runnable, Executor, HttpClient) does not over-fire. `conn.createStatement()` and `conn.prepareCall()` now infer return type `DatabaseConnection`, so the JDBC chain `Statement s = conn.createStatement(); s.execute(q)` types `s` correctly. Closes GHSA-h8cj-hpmg-636v (Appsmith FilterDataServiceCE.dropTable). Vulnerable + patched Java fixtures added.
- Java/Kotlin `Pattern.matcher(value).matches()` chain recognised as a `ValidationCall` allowlist. Receiver of `.matcher(` must contain `regex` or `pattern`. Validation target is the `.matcher()` argument, not the bare `.matches()` receiver. Branch narrowing applies the `validated_must` to the input variable on the surviving branch. Same GHSA as above (`FILTER_TEMP_TABLE_NAME_PATTERN.matcher(tableName).matches()`).
- Per-parameter SSA summary probe now receives `BodyMeta.param_types`, so `extract_ssa_func_summary` runs a local `analyze_types_with_param_types` pass before extraction. Helper bodies whose sinks resolve only via type-qualified callees (e.g. `DatabaseConnection.execute` for JDBC `Statement.execute`) no longer drop the sink during cross-function summary extraction. Fixes the Appsmith helper `executeDbQuery(query)` that routed SQL through `statement.execute(query)`.
- Short-circuit branch condition CFG nodes now mirror `condition_vars` into `taint.uses`, so `apply_branch_predicates` interns the variable for short-circuit-decomposed validators (`if (x == null || !regex.matcher(x).matches()) throw`). Without this, the per-disjunct cond nodes built via `build_condition_chain` silently no-opped and `x` never reached `validated_must` on the surviving branch.
- Go `goqu.L(s)` and `goqu.Lit(s)` raw-SQL literal builders modeled as `SQL_QUERY` sinks. Safe siblings (`goqu.I` identifier, `goqu.C` column, `goqu.T` table, `goqu.V` parameterised value, `goqu.SUM`, `goqu.COUNT`, …) stay unlabeled. Gin source list extended with the array-returning siblings of the existing scalar helpers: `c.QueryArray`, `c.GetQueryArray`, `c.PostFormArray`, `c.GetPostFormArray`. Closes CVE-2026-41422 (daptin: `c.QueryArray("column")` → `goqu.L(project)` with the loop variable lifted through `for _, project := range columns`). Vulnerable + patched Go corpus pair under `tests/benchmark/cve_corpus/go/CVE-2026-41422/`.
- Go `for ident := range iter` def-use lifting. The `range_clause` child of `for_statement` is now consulted when `left`/`right` aren't direct fields of the `for` node, so taint from the iterable reaches the loop binding. Required for the daptin CVE shape above.
- Java `enhanced_for_statement`, PHP `foreach`, and Ruby `for` def-use lifting, completing the loop forms the Go `range_clause` fix above started. The `Kind::For` def-use arm only knew the JS/Python `left`/`right` pair and Go's `range_clause`; Java carries the binding on `name` and the iterable on `value`, Ruby's `for` on `pattern`/`value`, and PHP's `foreach` keeps both as unnamed children split by the `as` keyword, so none recorded the loop variable as a define and taint on the iterable never reached the binding (`for (Cookie c : req.getCookies()) { … c.getValue() … }` lost the flow at `c`). Each form now folds onto the shared define/use path. Lifts Java OWASP Benchmark recall: path_traversal 0.21 → 0.32, sqli 0.16 → 0.28, cmdi 0.04 → 0.08.
- Iterable-expression classification for the loop forms above. The loop node is classified against its iterable text, so a source-returning iterable (`req.getCookies()`, `req.getParameterValues("v")`, `$_GET['list']`) lands a `Source` on the loop node and the binding inherits its taint, the same rewrite JS/Python `for … of` / `for … in` already had. Subscript iterables (`$_GET['x']`, `params[:list]`) classify on their base object since sources key on the base name, not the index.
- Java iterable-returning request accessors modeled as sources: `getParameterValues`, `getParameterMap`, `getParameterNames`, `getHeaders`, `getHeaderNames`. The `getParameter` / `getHeader` matchers are word-boundary suffix matches and never covered the plural collection variants that feed for-each loops (`for (String s : req.getParameterValues("v"))`). The dominant OWASP Benchmark vulnerable-source shape.
- Rust format-string named-argument lifting (`format!("...{x}...")`, stable since 1.58). Identifiers captured by `{name}` / `{name:fmt-spec}` are pulled into the call's `uses` for known format-style macros: `format`, `print`/`println`, `eprint`/`eprintln`, `write`/`writeln`, `panic`, `format_args`, `assert`/`debug_assert`, `todo`, `unimplemented`, `unreachable`, plus log-crate severity macros (`info`, `warn`, `error`, `debug`, `trace`). Recursive descent through one or two layers of expression wrapping (`format!("{x}").to_owned()`, RHS chained method calls). Without this, taint stopped at the macro boundary. `let q = format!("...{x}...")` carried no `x` because the identifier lives in format-string bytes rather than as a separate AST argument node. Mirrors the Python f-string lifter.
- Rust CVE corpus extended. CVE-2023-42456, CVE-2024-32884, CVE-2025-53549 vulnerable + patched fixtures under `tests/benchmark/cve_corpus/rust/`.
- Java lambda shorthand recognised by `extract_param_meta`. `lambda_expression`'s `parameters` field as a bare `identifier` (`cmd -> …`) or as an `inferred_parameters` wrapper around identifiers (`(a, b) -> …`) was not matching the formal_parameter / spread_parameter kinds in `PARAM_CONFIG`, so the lambda appeared parameterless and the SSA pipeline treated its formals as closure captures. Mirrors the JS/TS arrow shorthand path.
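
The Rust format-string lifting above has to recover identifiers from the bytes of the format literal. A simplified sketch of that scan, assuming a hypothetical `captured_idents` helper; it does not distinguish implicit captures from explicitly named arguments (`format!("{x}", x = 1)`) and is not the analyzer's actual lifter:

```rust
/// Hypothetical helper: identifiers referenced as `{name}` or `{name:spec}`.
/// Escaped braces (`{{`, `}}`) and positional or empty placeholders are skipped.
fn captured_idents(fmt: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut chars = fmt.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '{' {
            continue;
        }
        if chars.peek() == Some(&'{') {
            chars.next(); // escaped `{{`, not a placeholder
            continue;
        }
        // Collect the placeholder body up to the closing brace.
        let mut inner = String::new();
        for d in chars.by_ref() {
            if d == '}' {
                break;
            }
            inner.push(d);
        }
        // The argument name is everything before an optional `:spec`.
        let name = inner.split(':').next().unwrap_or("").trim();
        let mut it = name.chars();
        let starts_ok = it.next().map_or(false, |f| f.is_alphabetic() || f == '_');
        if starts_ok && it.all(|r| r.is_alphanumeric() || r == '_') {
            out.push(name.to_string());
        }
    }
    out
}
```

Each returned name would then be added to the macro call's `uses`, which is what lets taint on `x` reach `let q = format!("...{x}...")`.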
### Fixed
- Panic on non-ASCII input to `has_first_char_absolute_check` in the path abstract domain. The 32-byte search window around `[0]` was sliced as `&clause[lo..hi]` (str), which panicked when `hi` landed inside a multi-byte UTF-8 char (e.g. the em dash `—`, bytes 34..37). Switched to `&bytes[lo..hi]` with `windows()` byte-pattern checks; all needles are ASCII so the searches are equivalent. Surfaced by `cargo fuzz` (`scan_bytes` target, `.c` extension path, embedded `—` in a comment near `s[0] == '/'`). Regression test added.
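
The failure mode is easy to reproduce in isolation: slicing a `str` at a byte offset inside a multi-byte character panics, while slicing its bytes does not. A minimal sketch of the byte-window approach, with `window_contains` as a hypothetical reduction of the real check (the name, parameters, and needle are illustrative):

```rust
/// Hypothetical reduction of the windowed search.
/// Looks for `needle` in a byte window around `center` without slicing the `str`.
fn window_contains(clause: &str, center: usize, radius: usize, needle: &[u8]) -> bool {
    let bytes = clause.as_bytes();
    let lo = center.saturating_sub(radius);
    let hi = (center + radius).min(bytes.len());
    // `windows(0)` would panic, and an empty window cannot match.
    if needle.is_empty() || lo >= hi {
        return false;
    }
    // A byte slice may begin or end inside a multi-byte character;
    // `&clause[lo..hi]` would panic in that situation.
    bytes[lo..hi].windows(needle.len()).any(|w| w == needle)
}
```

Because every needle is ASCII, a byte-level match can never start or end in the middle of a multi-byte character, so the result matches what the `str` search would have returned on valid boundaries.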
### Fixed (false positives)
- `cfg-unguarded-sink` parameter-only trace no longer clears a sink argument whose reaching definition is a loop binding. Once the loop variable resolves to its iterable (the def-use lifting above), a `foreach ($param as $v) { sink($v) }` element looked like a bare `sink($p)` wrapper pass-through and the structural finding was dropped. A loop element over a parameter collection is not wrapper plumbing, so the finding survives for loop-bound sink arguments; literal-keyed arrays stay suppressed through `sink_arg_uses_safe_foreach_key`. Keeps the negative case in `fp_guard_php_foreach_safe_literal_keys` firing.
- Go `unit_has_user_input_evidence` framework-request-name allow-list narrowed. `ctx`, `context`, `info`, `body`, `path`, `payload`, `dto`, `form`, `query` are no longer treated as user-input indicators in Go, where they name `context.Context` (cancellation/value-bag from the stdlib) or struct-pointer payload params (`info *PackageInfo`, `opts *FooOptions`), not request bindings. Go HTTP frameworks bind the request to per-framework typed params (`r *http.Request`, `c *gin.Context`, `c echo.Context`, `c *fiber.Ctx`); these arrive at the gate via `RouteHandler` kind or the type-aware param filter below. Stdlib `req` / `request` (the `*http.Request` convention) preserved. Other languages keep the broader allow-list.
- Go param collection drops `ctx context.Context` and `ctx context.CancelFunc` parameters entirely rather than seeding their names into `unit.params`. Tree-sitter-go's `parameter_declaration` exposes `name` and `type` as named fields; descend only into `name` so type-segment identifiers don't pollute the param-name set (`info *PackageInfo` no longer contributes `PackageInfo`). Together with the allow-list narrowing above, closes ~1900 `go.auth.missing_ownership_check` findings on gitea backend helpers whose only "user-input evidence" was the ubiquitous `ctx context.Context` first param.
- Ruby controller method visibility + filter-callback gate. Methods marked `private` (bare `private` directive, targeted `private :foo, :bar`, or `protected`) and Rails filter callback targets (`before_action`, `after_action`, `around_action`, their `prepend_*` / `append_*` / `skip_*` siblings, and the legacy `*_filter` aliases) are no longer emitted as `Function` units. Visibility tracking is class-body source-order with two directive forms (bare toggles default visibility, targeted explicitly marks named methods). Block-form filters (`before_action do … end`) carry no symbol arg and are correctly ignored. Closes mastodon / diaspora `rb.auth.missing_ownership_check` flood on `set_X` row-fetch helpers used as `before_action` callbacks.
- Field-LHS resource acquires no longer counted as local resource leaks at the `apply_assignment` site. `e->name = (char *)e + sizeof(*e)` (sub-buffer alias inside a returned struct) and `mem->buf = ptr` (local-into-field ownership transfer) now mark the RHS local `MOVED` and stop tracking the field as a separately OPEN resource. The parent struct owns the field's lifecycle. Cross-language (distinct from the Go-only `apply_call` field-LHS gate, which is restricted because JS/TS class-field acquires `this.fd = fs.openSync(...)` are the documented expected leak pattern in that path). Closes curl `entry_new` and equivalent C/C++ shapes in openssl / postgres.
- Empty-formals SSA lowering signal. `lower_to_ssa_with_params` now sets `with_params=true` even when `formal_params` is empty, so an arrow `() => {…}` is treated as "explicitly zero formals" rather than "no formals info". External vars in a zero-formal arrow are now correctly tagged as synthetic closure captures, so the JS/TS / Java auto-seed pass cannot mistake a bubbled-up free var (e.g. `userId` lifted from a nested jest test callback) for a real handler formal. Closes 934 phantom taint findings on the outline test suite (`describe("…", () => { test("…", () => { server.post(…) }) })`-shaped fixtures).
- Rust integer-typed values now suppress `Cap::FILE_IO` at the abstract-domain leaf gate (previously HTML_ESCAPE only). An integer's decimal representation is digits with optional leading `-`, never path metacharacters (`/`, `\`, `.`); magnitude is irrelevant. Closes the sudo-rs RUSTSEC-2023-0069 patched FP `let uid: u32 = user.parse()?; path.push(uid.to_string())`.
## [0.6.0] - 2026-05-02
A focused release that splits data-exfiltration off from SSRF and ships sinks for outbound HTTP request bodies across all 10 languages, with calibration tuned so plain user input echoed back upstream does not fire.
### Added
- New `taint-data-exfiltration` rule, separate from SSRF. Fires when a Sensitive-tier source (cookie, header, env, file, database, caught exception) reaches the body, headers, or json payload of an outbound HTTP call. Plain user input gets suppressed at emission time so a gateway echoing `req.body` back upstream is not flagged.
- Sinks ship for `fetch` body, `XMLHttpRequest.send`, Python `requests.post` and `httpx.AsyncClient.post`, Java JDK `HttpClient.send` with `BodyPublishers`, OkHttp builder chains, Apache HttpClient `execute`, RestTemplate, WebClient, Go `http.Post` and `http.NewRequest` + `Do`, Rust `reqwest`/`ureq`/`surf`/`hyper` body/json/form/multipart chains, Ruby `Net::HTTP.post` and RestClient, C and C++ `curl_easy_setopt(CURLOPT_POSTFIELDS, ...)` gated by the macro arg.
- Three suppression knobs:
  - Sanitizer convention. `logEvent`, `forwardPayload`, `tracker.send`, `analytics.track`, `metrics.report`, `serializeForUpstream` are treated as `Sanitizer(data_exfil)` by default. Add your own with the standard custom-rule path.
  - Trusted destination allowlist in `detectors.data_exfil.trusted_destinations`. Matched against the abstract-string domain prefix; a literal or template prefix that begins with one of these entries drops the cap.
  - Detector toggle `detectors.data_exfil.enabled = false` strips the cap before emission. Other taint classes are unaffected.
- Calibration. Severity is High for cookie or env sources, Medium for header, file, database, or caught-exception sources. Confidence stays at Medium even with strong corroboration, drops to Low without abstract or symbolic backing, and drops one tier on path-validated flows. SARIF output carries a `properties.data_exfil_field` entry on data-exfil findings, set to the destination object-literal field the leak reached (`body`, `headers`, or `json`).
- Benchmark coverage. 13 vulnerable fixtures across 8 languages under `tests/benchmark/corpus/{lang}/data_exfil/` and 6 paired safe fixtures for the sensitivity gate and sanitizer convention. New `data_exfil` row in the per-class breakdown. Per-class CI floor at P, R, F1 ≥ 0.85 (current baseline is 1.000).
- Backwards taint walk recognises `Cap::DATA_EXFIL` and emits the same rule ID.
- Ruby SSRF coverage. `OpenURI.open_uri` now classified as an SSRF sink (the low-level fetcher that `URI.open` delegates to). Closes the CarrierWave CVE-2021-21288 download path and equivalent gem shapes that route through `OpenURI` directly.
- Ruby chained-call wrapper classification. Statement-level wrappers like `YAML.safe_load(File.read(filename))` and `Marshal.load(File.read(p))` now classify the inner sink for cross-function summary extraction. Without this, the outer call became a non-sink node and the inner sink was lost when the helper was summarised.
- Ruby CVE corpus. Vulnerable + patched fixtures added for CVE-2021-21288 (CarrierWave SSRF) and CVE-2023-38337 (rswag path traversal).
- Lodash `_.template` modeled as a gated `Cap::CODE_EXEC` sink. Activates on the template-string argument; suppresses when arg-1 carries a literal `{ evaluate: false }`. Closes Strapi CVE-2023-22621 (server-side template injection → RCE via `<% … %>` evaluate blocks). Vulnerable + patched fixtures added under `tests/benchmark/cve_corpus/javascript/CVE-2023-22621/`.
- JS/TS gated-sink kwarg extractor falls back to inspecting arg-1 object literals (`fn(x, { evaluate: false })`) when the language has no `keyword_argument` node. Required so the lodash gate can read its options object.
- Lodash double-call form (`_.template(t)(data)`) routes through `find_chained_inner_call` so the outer call's gated-sink rebinding fires.
- Cross-function helper-validation propagation. New `SsaFuncSummary.validated_params_to_return` field records parameter indices whose taint flow to the return value is fully validated by a dominating predicate (regex allowlist, type check, validation call) on every return path. At call sites, each tainted argument passed to a validated position, and the call's own return value, are marked `validated_must` / `validated_may` in the caller's SSA taint state, the same way an inline `if (!regex.test(x)) throw` would. Closes the helper-validator gap behind PayloadCMS CVE-2026-25544 (Drizzle SQL injection in `sanitizeValue`). Vulnerable + patched TypeScript fixtures added.
- Destructured-arg sibling expansion in per-parameter taint summary probing. JS/TS object-pattern formals (`({ column, operator, value }) => …`) now seed every binding sharing the slot, and any sibling reaching `validated_must` counts as the slot being validated. New `BodyMeta.param_destructured_fields` carries sibling lists alongside `params` and `param_types`. JS `PARAM_CONFIG` accepts `assignment_pattern` (default-value formals) and `object_pattern` (destructured formals).
- Regex-allowlist branch narrowing. `<X>.test(value)` / `<X>.match(value)` / `<X>.matches(value)` where the receiver name contains `regex` or `pattern` classifies as a `ValidationCall` and narrows the call's first argument, not the regex receiver. The same handling was extended to `extract_validation_target` so the surviving branch validates `value`, not the regex object. Motivated by Payload CVE-2026-25544 (`if (!SAFE_STRING_REGEX.test(value)) throw …`).
- TypeScript template-substring (`${fn(arg)}`) call-resolution arity-hint fallback. When CFG lowering drops `arg_uses` but `args` is non-empty, the resolver passes `None` so the unique-name fallback can still pick up the lone candidate.
- Caller-scope-entity exemption in `rs.auth.missing_ownership_check`. `<entity>.id` / `<entity>.pk` no longer fires when `<entity>` is a unit parameter named after a multi-tenant scope primitive: `organization` / `org`, `project`, `team`, `workspace`, `tenant`, `account`, `community`, `group`, `repository` / `repo`, `company`. Other field names (`.name`, `.slug`) still flag, and `user` / `member` / `actor` are deliberately excluded (handled by `is_actor_context_subject`). Closes a flood of FPs in Sentry / Saleor / Discourse / Mastodon-shaped multi-tenant helpers (`get_environments(request, organization)`, `_filter_releases_by_query(qs, organization, …)`).
- Auth value-ref walker recurses into the `value` child of `keyword_argument` / `keyword_arg` / `named_argument` nodes. `Model.objects.filter(organization_id=org.id)` no longer surfaces the kwarg key (`organization_id`) as a bare-identifier user-input subject. The schema column name is fixed at call time.
- Test-decorator denylist for Flask route extraction. `mock.patch`, `mock.patch.object` / `.dict` / `.multiple`, `unittest.mock.*`, `monkeypatch.setattr` / `setenv` / `delattr` / `delenv`, and `pytest.mark.parametrize` no longer collide with `<app>.patch` route registration. Stops every `@mock.patch("…")`-decorated test method from being attached as a Flask PATCH handler and flagged as `missing_ownership_check`.
- Typed-extractor route-level guard injection for axum and actix-web. Handlers registered via attribute macros (`#[get("/path")]`, `#[routes::path(…)]`) or via external service-config builders previously never had their typed-extractor guards seeded. New `apply_typed_extractor_guards_to_units` walks every `Function`-kind unit and injects guard checks from typed-extractor params, complementing the route-walk path that already covered `.route(...)` registration.
- New auth config key `policy_guard_names`. Typed-extractor wrappers that prove route-level capability/policy enforcement (e.g. meilisearch's `GuardedData<ActionPolicy<X>, _>`) are recognised distinctly from authentication-only wrappers. Matched as last-segment + case-insensitive `starts_with`. Rust default: `["Guarded"]`. Distinct from `login_guard_names` so the pattern doesn't pollute regular call recognition (a function like `guarded_load(..)` is not a login guard).
- Outer-wrapper-aware classification of typed extractors. `GuardedData<ActionPolicy<X>, Data<AuthController>>` is classified by the outer `GuardedData` (policy-bearing → `AuthCheckKind::Other`), not by whether an inner generic arg substring-matches `auth`. Bare data-only extractors (`Path<u64>`, `Query<X>`, `Json<X>`, `Form<X>`, `State<X>`, `Extension<X>`, `Data<X>`) outer-name-match early-return to `None` regardless of inner type tokens. Reference-marker (`&`, `&mut`, `&'a`) and module-path (`std::collections::`) prefixes stripped before matching.
- Project-level web-framework signal in Rust auth analysis. New `FrameworkContext::lang_has_web_framework(lang)` is three-valued: `Some(true)` when manifest names a framework, `Some(false)` when the manifest was inspected and named none, `None` when no manifest was inspected. New `rust_file_imports_web_framework` does a per-file `axum::` / `actix_web::` / `rocket::` / `axum_extra::` import probe (8 KB head). When the project's Cargo.toml is inspected and lists no Rust web framework AND the file does not directly import one, the `context_inputs` and param-name-heuristic arms of `unit_has_user_input_evidence` are suppressed. `RouteHandler` classification (concrete route-registration evidence) still bypasses the gate. Closes a flood of `missing_ownership_check` FPs in non-web Rust crates such as zed-style desktop / GUI codebases where a debug-session handle named `session` would trip `matches_session_context` on `session.update(cx, …)`. Currently Rust-only; other languages keep prior behavior (`None`).
- Rust auth corpus extended with `safe_actix_guarded_data_extractor.rs` and `unsafe_actix_no_guarded_data_extractor.rs` (typed-extractor guard injection); `safe_non_web_rust_project/` and `unsafe_actix_web_project_no_check/` (full Cargo.toml + src/lib.rs project shapes for the framework-signal gate).
- Python auth corpus extended with `vuln_user_id_param_no_auth.py`, `safe_django_orm_caller_scoped_entity.py` (caller-scope-entity exemption), `safe_mock_patch_test_method.py` (test-decorator denylist).
- Go safe corpus extended with `safe_inner_call_close_in_arg.go` (`require.NoError(t, f.Close())` shape), `safe_struct_field_resource_owned_by_struct.go` (field-LHS ownership transfer), and a `vuln_resource_leak_no_close.go` regression guard.
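The project-level web-framework gate described in the list above can be sketched as follows. This is a simplified illustration, not the Nyx implementation: `file_imports_web_framework` and `heuristic_arms_suppressed` are hypothetical stand-ins, and the probe here is a plain substring search over the first 8 KB.

```rust
/// Hypothetical stand-in for the per-file import probe: looks for a framework
/// path prefix in the first 8 KB of the source text.
fn file_imports_web_framework(source: &str) -> bool {
    const PROBES: [&str; 4] = ["axum::", "actix_web::", "rocket::", "axum_extra::"];
    // Clamp to 8 KB, then back off to a valid UTF-8 boundary before slicing.
    let mut end = source.len().min(8 * 1024);
    while !source.is_char_boundary(end) {
        end -= 1;
    }
    let head = &source[..end];
    PROBES.iter().any(|p| head.contains(*p))
}

/// Three-valued manifest signal:
///   Some(true)  - manifest inspected and names a web framework
///   Some(false) - manifest inspected and names none
///   None        - no manifest inspected
/// The heuristic arms are suppressed only when the manifest was inspected,
/// named no framework, and the file itself imports none.
fn heuristic_arms_suppressed(manifest_signal: Option<bool>, source: &str) -> bool {
    manifest_signal == Some(false) && !file_imports_web_framework(source)
}
```

Note that `None` deliberately does not suppress anything: without an inspected manifest there is no evidence either way, so prior behavior is kept.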
### Fixed (false positives)
- C++ `cpp.memory.reinterpret_cast` no longer fires when the target type is well-defined by C++ aliasing rules. Suppressed targets: byte-pointer family (`char*`, `unsigned char*`, `signed char*`, `wchar_t*`, `uint8_t*`, `int8_t*`, `std::byte*`, `byte*`), `void*`, integer round-trip (`uintptr_t`, `intptr_t`, and `std::` variants, no pointer required), and the BSD socket address family (`sockaddr*`, `struct sockaddr*`, `sockaddr_in*`, `sockaddr_in6*`, `sockaddr_un*`, `sockaddr_storage*`). User-defined struct or class pointer targets keep firing. Closes ~70% over-fire on serialization, hashing, IPC, and socket-API code where the cast is the standard-blessed idiom.
- PHP `php.crypto.md5` and `php.crypto.sha1` suppress when the call's consuming context yields a non-cryptographic identifier name. Recognised contexts: assignment LHS (variable, `$obj->property`, `$arr['key']`), array element keys, subscript indices, return statements (resolved to enclosing method or function name with `get` prefix stripped), and method-call arguments where the method is a key/cache/lookup verb (`get`, `set`, `has`, `delete`, `fetch`, `store`, `find`, `getItem`, `setItem`). Names containing a crypto keyword (`password`, `secret`, `token`, `signature`, `hmac`, `digest`, `salt`, `key`) keep firing. Closes ETag generation, cache-key hashing, dedup fingerprint, and `getCacheKey()`-style false positives in real PHP repos (phpmyadmin, nextcloud).
- JS and TS `secrets.fallback_secret` no longer fire on empty-string fallbacks (`process.env.X || ""`). Developers write `|| ""` to satisfy non-undefined string types without committing a real secret. Non-empty literal fallbacks still fire.
- Path-traversal sink suppression accepts canonicalised-and-rooted shapes. New `PathFact::is_path_traversal_safe` predicate clears `Cap::FILE_IO` when the path is dotdot-free and either non-absolute or carries a verified prefix-lock. New `OPAQUE_PREFIX_LOCK` marker records the structural invariant ("rooted under SOME prefix") when the `starts_with`-style guard's argument is a method call, field access, or configured root rather than a string literal. Closes the Ruby `File.expand_path + start_with?(root)` shape (rswag CVE-2023-38337 patched counterpart), the Python `os.path.realpath + .startswith(root)` shape, and the JS `path.resolve + .startsWith(root)` shape. `classify_path_assertion` extended to JS `.startsWith(...)`, Python `.startswith(...)`, Ruby `.start_with?(...)` (paren and paren-less), and Go `strings.HasPrefix(...)`.
- Branch narrowing now flips prefix-lock attachment under condition negation. For `if !target.startsWith(ROOT) { return; }` the lock attaches to the surviving block, not the rejection arm. Rejection-axis narrowing is unchanged because the rejection classifier is text-level and already accounts for leading `!`.
- Go field-LHS resource acquires no longer counted as local resource leaks. `b.cpuprof = os.Create(...)` transfers ownership to the containing struct; closure responsibility belongs to a paired `Stop()` / `Release()` method on the struct's lifecycle. Gated in both `state/transfer.rs::apply_call` and `cfg_analysis/resources.rs::run`. Restricted to Go (`Lang::Go` check). JS/TS class-field acquires (`this.fd = fs.openSync(...)`) keep being tracked because the leak fixtures rely on it. Production trigger: prometheus `cmd/promtool/tsdb.go::startProfiling` cluster (`b.cpuprof`, `b.memprof`, `b.blockprof`, `b.mtxprof`).
- Go inner-call release in argument position. `require.NoError(t, f.Close())`, `errs = append(errs, f.Close())`, JUnit `assertEquals(0, in.read())`: releases that live in argument position now mark the receiver `CLOSED`. Bare-receiver inner calls only (chained-receiver releases stay owned by `chain_proxies`); marks `CLOSED` only with no `DoubleClose` attribution; respects `in_defer` for symmetry.
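The path-traversal predicate described in the list above ("dotdot-free and either non-absolute or carries a verified prefix-lock") reduces to a small boolean check. The sketch below uses a hypothetical `PathFactSketch` type with assumed field names; the real `PathFact` layout is not shown in this changelog.

```rust
/// Assumed encoding of the prefix-lock state. `Opaque` mirrors the idea of
/// "rooted under SOME prefix" when the guard argument is not a string literal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PrefixLock {
    None,
    Literal,
    Opaque,
}

/// Hypothetical stand-in for the abstract path fact; field names are assumptions.
#[derive(Debug, Clone, Copy)]
struct PathFactSketch {
    dotdot_free: bool,
    absolute: bool,
    prefix_lock: PrefixLock,
}

impl PathFactSketch {
    /// Safe when the path has no `..` component and is either relative
    /// or locked under some verified prefix.
    fn is_path_traversal_safe(&self) -> bool {
        self.dotdot_free && (!self.absolute || self.prefix_lock != PrefixLock::None)
    }
}
```

A canonicalised absolute path with no prefix guard stays unsafe under this rule, which matches the intent that canonicalisation alone is not enough.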
### Other
- Action download script warning for the mutable `latest` tag now references `v0.6.0` instead of `v0.5.0`.
## [0.5.0] - 2026-04-29
The biggest release since launch. The taint engine was rebuilt on top of an SSA IR, cross-file analysis was deepened across the board, and Nyx now ships a local web UI for triaging findings without leaving your machine.
> Heads-up: false positives or regressions on cross-file flows are possible. Please open an issue with a minimal reproduction if you hit one.
### Highlights
- **New SSA-based taint engine.** Block-level worklist analysis over a pruned SSA IR, replacing the legacy BFS engine across all 10 languages. More precise, easier to extend, and the foundation for everything else in this release.
- **Cross-file analysis.** Function summaries (including the new SSA summaries) flow across files via SQLite-backed persistence. Callee bodies can be inlined for context-sensitive analysis (k=1) and walked symbolically across file boundaries.
- **Symbolic execution layer.** Candidate findings are walked symbolically from source to sink, producing concrete attack witnesses, pruning infeasible paths, and (optionally) handing constraints off to Z3.
- **Local web UI (`nyx serve`).** React + Vite frontend for browsing findings, viewing flow paths, and triaging results. Triage decisions persist to `.nyx/triage.json` so they version with your code.
- **Hostile-repo hardening.** Path containment, loopback-only serving, CSRF tokens, bounded artifact reads. Safe to run on untrusted code.
- **Tighter false-positive controls.** Type-aware sink suppression, abstract interpretation (intervals + string prefixes), constraint solving, allowlist and type-check guard recognition, and confidence scoring on every finding.
### Engine
- SSA IR with dominance-frontier phi insertion. The optimization pipeline runs constant propagation, branch pruning, copy propagation, alias analysis, DCE, type facts, and points-to in sequence.
- Multi-label classification. A single API can carry both Source and Sink labels (e.g. PHP `file_get_contents`, Java `readObject`).
- Gated sinks. `setAttribute`, `parseFromString`, etc. only activate when the constant attribute argument is dangerous, and only the payload argument is treated as taint-bearing.
- Container taint with per-index precision and bounded points-to. Aliased containers share heap identity correctly.
- Loop-aware analysis: induction-variable pruning, widening at loop heads, bounded unrolling in symex.
- Path-sensitive phi evaluation propagates validation when all tainted predecessors are guarded.
- Per-return-path summaries decompose function effects when paths produce different taint behavior.
- Cross-file SCC fixed-point. Mutually recursive functions across files now reach a joint convergence.
- Demand-driven backwards analysis (off by default) annotates findings with cutoff diagnostics.
- Direction-aware engine notes (`UnderReport`, `OverReport`, `Bail`) flow into confidence scoring, ranking, and the new `--require-converged` strict mode.
- Synthetic field-write inheritance: `u.Path = "/foo"` no longer drops taint carried by other fields of `u`. Fixes Owncast CVE-2023-3188 (SSRF).
- Phantom-Param-aware field suppression skips method/function references that share a base name with a tainted variable.
- Validation err-check narrowing for the two-statement Go idiom `_, err := strconv.Atoi(input); if err != nil { return }`: `input` is marked validated on the surviving `err == nil` branch.
- Go: `strings.Replace` / `strings.ReplaceAll` recognised as a sanitizer when the OLD literal contains a known-dangerous payload (shell metachars, path-traversal, HTML, SQL) and the NEW literal does not reintroduce one.
- Go: literal-strip cap detection extended to shell metachars (`;`, `|`, `&`, `$`, backtick) and SQL metachars (`'`, `"`, `--`).
- Go: `interpreted_string_literal` / `raw_string_literal` handled in tree-sitter so const-string arg extraction works for Go's double-quoted and backtick forms.
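The `strings.Replace` sanitizer rule in the list above (OLD literal contains a dangerous payload, NEW literal does not reintroduce one) can be sketched per payload class. The shell and SQL fragments are the ones listed in this section; the path-traversal and HTML fragments, the `PayloadClass` enum, and the function names are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PayloadClass {
    Shell,
    PathTraversal,
    Html,
    Sql,
}

const CLASSES: [PayloadClass; 4] = [
    PayloadClass::Shell,
    PayloadClass::PathTraversal,
    PayloadClass::Html,
    PayloadClass::Sql,
];

/// Fragments treated as dangerous for each class (partly assumed, see lead-in).
fn fragments(class: PayloadClass) -> &'static [&'static str] {
    match class {
        PayloadClass::Shell => &[";", "|", "&", "$", "`"],
        PayloadClass::PathTraversal => &["../", "..\\"],
        PayloadClass::Html => &["<", ">"],
        PayloadClass::Sql => &["'", "\"", "--"],
    }
}

fn hits(class: PayloadClass, literal: &str) -> bool {
    fragments(class).iter().any(|f| literal.contains(*f))
}

/// Classes for which replacing `old` with `new` behaves like a sanitizer:
/// `old` carries a payload of that class and `new` does not bring one back.
fn sanitized_classes(old: &str, new: &str) -> Vec<PayloadClass> {
    CLASSES
        .iter()
        .copied()
        .filter(|&c| hits(c, old) && !hits(c, new))
        .collect()
}
```

Checking per class is a design choice of this sketch: it lets `Replace(s, "<", "&lt;")` count as an HTML sanitizer even though the replacement contains `&` and `;`.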
### Symbolic Execution
- Expression trees (`SymbolicValue`) preserve computation structure through the path walk: integers, strings, binary ops, concatenations, calls, phi merges.
- Witness strings reconstruct concrete attack payloads at sink nodes.
- Bounded multi-path forking with reachability pruning.
- Cross-file: callee summaries are modeled directly, and pre-lowered callee bodies are loaded from SQLite so witnesses can keep walking across files.
- Interprocedural mode: nested frames with full state propagation, transitive descent up to 3 levels, structured cutoff tracking.
- Field-sensitive symbolic heap with bounded fields per object.
- Symbolic string theory: `Substr`, `Replace`, `ToLower`, `ToUpper`, `Trim`, `StrLen` modeled with concrete folding and sanitizer pattern detection.
- Optional Z3 integration (compile-time `smt` feature) for cross-variable constraint solving.
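To illustrate what "concrete folding" over symbolic string operations might look like, here is a small self-contained sketch. `SymStr` is a hypothetical, much-reduced stand-in for an expression tree, covering only two of the listed operations plus concatenation; it is not the `SymbolicValue` type itself.

```rust
#[derive(Debug, Clone, PartialEq)]
enum SymStr {
    /// Fully known string.
    Concrete(String),
    /// Named symbolic input whose value is unknown.
    Unknown(String),
    ToLower(Box<SymStr>),
    Trim(Box<SymStr>),
    Concat(Box<SymStr>, Box<SymStr>),
}

/// Bottom-up folding: an operation collapses to a concrete string when all
/// of its operands fold to concrete strings; otherwise the structure is kept.
fn fold(value: SymStr) -> SymStr {
    match value {
        SymStr::ToLower(inner) => match fold(*inner) {
            SymStr::Concrete(s) => SymStr::Concrete(s.to_lowercase()),
            other => SymStr::ToLower(Box::new(other)),
        },
        SymStr::Trim(inner) => match fold(*inner) {
            SymStr::Concrete(s) => SymStr::Concrete(s.trim().to_string()),
            other => SymStr::Trim(Box::new(other)),
        },
        SymStr::Concat(left, right) => match (fold(*left), fold(*right)) {
            (SymStr::Concrete(a), SymStr::Concrete(b)) => SymStr::Concrete(a + &b),
            (a, b) => SymStr::Concat(Box::new(a), Box::new(b)),
        },
        leaf => leaf,
    }
}
```

Expressions that still depend on an `Unknown` input keep their shape after folding, which is what would let a later stage inspect them for sanitizer patterns or hand them to a solver.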
### Security & Coverage
- Vulnerability classes added: SSRF (10 languages), deserialization (Python, Ruby, Java, PHP), and `Cap::UNAUTHORIZED_ID` for auth-as-taint (off by default behind config flag).
- Auth analysis: receiver-type sink gating, row-level ownership-equality detection, self-actor recognition (`let user = require_auth()`), sink classification (in-memory vs realtime vs outbound), helper-summary lifting, and SQL JOIN-through-ACL recognition.
- State analysis (resource lifecycle, use-after-close, leaks, unauthed access) is now on by default. RAII-aware for Rust and C++; recognizes Python `with`, Go `defer`, Java try-with-resources.
- Framework rule packs: Express, Flask/Django, Spring/JNDI, Rails. Per-language label depth significantly expanded.
- C/C++ taint depth: output-parameter source propagation, implicit definitions for uninitialized declarations.
- Negative test corpus (30 fixtures) and a 262-case benchmark with CI gates on rule-level Precision/Recall/F1.
### Detection metrics
- Aggregate rule-level F1 reaches **0.998** (P=0.995, R=1.000). All real-CVE fixtures fire; only one open FP (`go-safe-009`).
- Go: 98.0% F1 on the 53-case corpus (1 FP / 0 FNs).
- CVE-2023-3188 (Owncast SSRF) is now detected.
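For readers who want to relate FP/FN counts to the precision, recall, and F1 figures above, the standard definitions can be written as a short sketch. The function names are illustrative and the counts in the usage note are made up; they are not taken from the benchmark.

```rust
/// Precision = TP / (TP + FP). None when nothing was reported.
fn precision(tp: u32, fp: u32) -> Option<f64> {
    let denom = u64::from(tp) + u64::from(fp);
    if denom == 0 {
        return None;
    }
    Some(f64::from(tp) / denom as f64)
}

/// Recall = TP / (TP + FN). None when there was nothing to find.
fn recall(tp: u32, fn_count: u32) -> Option<f64> {
    let denom = u64::from(tp) + u64::from(fn_count);
    if denom == 0 {
        return None;
    }
    Some(f64::from(tp) / denom as f64)
}

/// F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
fn f1_score(tp: u32, fp: u32, fn_count: u32) -> Option<f64> {
    let denom = 2 * u64::from(tp) + u64::from(fp) + u64::from(fn_count);
    if denom == 0 {
        return None;
    }
    Some(2.0 * f64::from(tp) / denom as f64)
}
```

With hypothetical counts of 24 true positives, 1 false positive, and 0 false negatives, `f1_score(24, 1, 0)` evaluates to 48/49, roughly 0.98, while recall is exactly 1.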
### CLI & Output
- `nyx serve`: local web UI on `localhost` only (refuses non-loopback binds).
- `--require-converged` filters out findings where the engine bailed early.
- Analysis-engine toggles graduated from `NYX_*` env vars to first-class flags and `[analysis.engine]` config: `--constraint-solving`, `--abstract-interp`, `--context-sensitive`, `--symex`, `--cross-file-symex`, `--symex-interproc`, `--smt`, `--parse-timeout-ms`. Old env vars still work when Nyx is consumed as a library.
- Confidence (`High`/`Medium`/`Low`) shown on every finding, including console headers.
- Engine notes surfaced in console (`[capped: N notes, over-report]`), JSON (`engine_notes`, `confidence_capped`), and SARIF (`result.properties.loss_direction`).
- Flow paths reconstructed step-by-step with file/line/snippet for each hop.
- Concrete attack witness strings synthesized by the symbolic executor.
- Primary sink locations now point at the callee's real sink line; caller call sites are preserved as flow steps.
- Richer scan progress: explicit stages, timing breakdowns, language counters, skipped/reused file counts.
- Tighter taint-finding deduplication.
### Hardening
- Centralized path containment rejects traversal, symlink escapes, and oversized reads across UI, debug, and triage routes.
- `nyx serve` validates `Host` headers, requires per-session CSRF tokens for mutations, and refuses scans outside the original repo root.
- Walker re-validates symlink targets against the scan root.
- Bounded reads on framework manifests and `.nyx/triage.json` imports.
- UI falls back to plain text on pathologically long lines to defeat regex-DoS in syntax highlighting.
- Parser timeout is now configuration-backed with hostile-input regression coverage.
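Path containment and bounded reads of the kind listed above are commonly built from `canonicalize` plus a prefix check, and from `Read::take`. The sketch below shows that general technique using only the standard library; `is_contained` and `read_bounded` are hypothetical helper names, not the functions Nyx uses.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Resolves both paths (following symlinks and `..`) and checks that the
/// candidate still lives under the root. Fails if either path does not exist.
fn is_contained(root: &Path, candidate: &Path) -> io::Result<bool> {
    let root = root.canonicalize()?;
    let candidate = candidate.canonicalize()?;
    Ok(candidate.starts_with(&root))
}

/// Reads at most `max_bytes` from the file and rejects anything larger.
/// Reading one extra byte is how the oversize case is detected.
fn read_bounded(path: &Path, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    File::open(path)?
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file exceeds read limit",
        ));
    }
    Ok(buf)
}
```

Because `Path::starts_with` compares whole components, a sibling directory such as `/repo-evil` is not treated as being inside `/repo`.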
### Persistence
- SQLite schema bumped to v2. Anonymous-function identity is now a structural DFS index instead of a byte offset, so inserting a line above an unchanged function no longer invalidates its `FuncKey`. Pre-0.5.0 caches are silently cleared on open; triage data and scan history are preserved.
- Engine-version metadata; persisted summaries and file hashes invalidate on mismatch.
- Stale SSA tables recreate when required columns are missing; deserialization failures log instead of silently dropping rows.
### Frontend
- Replaced the legacy `app.js` with a React + Vite + TypeScript SPA.
- Interactive graph workspace for CFG and call-graph views (Graphology + ELK + Sigma) with neighborhood reduction and a full-page inspector.
- Triage UI with database-backed decisions (true positive, false positive, accepted risk, suppressed) and `.nyx/triage.json` round-trip.
- Scan history, rules management, and finding detail panels with evidence and flow visualization.
- Vitest browser-side test suite wired into CI.
- Bumped to React 19, Vite 8, TypeScript 6.0, ESLint 10, `@vitejs/plugin-react` 6, with aligned `@types/react*`.
- `SSEContext`: typed `reconnectTimer` ref as `ReturnType<typeof setTimeout> | undefined` to satisfy TS 6's stricter `useRef` overloads.
- `FindingsPage`: included `toast` in `useCallback` deps to avoid stale-closure warnings.
- `tsconfig.json`: dropped `baseUrl`, using a relative `./src/*` path mapping instead.
### Removed
- Legacy BFS taint engine, `TaintTransfer`, `TaintState`, and the `NYX_LEGACY` fallback.
- Legacy vanilla-JS frontend (`app.js`).
## [0.4.0] - 2026-02-25
A precision and ergonomics release. Findings are now ranked, lower-noise by default, and easier to triage in CI.
### Highlights
- **Attack-surface ranking.** Every finding gets an exploitability score combining severity, analysis kind, evidence strength, and path-validation. Console output shows the score in the header line; `--no-rank` opts out.
- **Low-noise prioritization.** Quality-category findings are excluded by default (`--include-quality` brings them back). High-frequency Quality rules are rolled up per `(file, rule)` with example occurrences. LOW budgets cap noise without ever displacing High/Medium findings.
- **State-model dataflow analysis.** New per-variable resource-lifecycle and auth-level analysis catches use-after-close, double-close, must-leak, may-leak (branch-aware), and unauthenticated-sink access. Opt-in via `scanner.enable_state_analysis`.
- **Inline `nyx:ignore` suppressions** with same-line and next-line directives, comma lists, wildcard suffixes, and string-literal guards across all 10 languages.
- **AST pattern overhaul.** All 10 language pattern files rewritten with consistent metadata, namespaced IDs (`<lang>.<category>.<specific>`), and 30+ new patterns. 11 broken tree-sitter queries fixed.
- **Monotone forward-dataflow taint engine.** Replaced the BFS engine with a proper worklist over a finite lattice. Termination is now guaranteed by lattice height, eliminating BFS-budget bailouts on large files.
- **Path-sensitive taint analysis.** Branch predicates flow with the analysis. Contradictory guards prune infeasible paths; validation calls produce annotated findings without changing severity.
- **Interprocedural call graph.** Whole-program graph with three-valued callee resolution (`Resolved`/`NotFound`/`Ambiguous`), SCC analysis, and topo ordering ready for bottom-up taint propagation.
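The "worklist over a finite lattice" idea from the taint-engine highlight above can be shown on a toy CFG. This is a textbook-style sketch under simplifying assumptions (at most 64 variables, taint tracked as one bit per variable, join is bitwise OR); `Stmt`, `transfer`, and `solve` are illustrative names and not the engine's API.

```rust
use std::collections::VecDeque;

/// Effect of one CFG node on the taint set (bit i set = variable i tainted).
/// Variable indices must be below 64.
#[derive(Clone, Copy)]
enum Stmt {
    /// var := untrusted input
    Source(u8),
    /// dst := src
    Copy { dst: u8, src: u8 },
    /// var := constant or sanitized value
    Clear(u8),
    Nop,
}

/// Monotone transfer function: more taint in never yields less taint out.
fn transfer(stmt: Stmt, input: u64) -> u64 {
    match stmt {
        Stmt::Source(v) => input | (1u64 << v),
        Stmt::Copy { dst, src } => {
            let cleared = input & !(1u64 << dst);
            if input & (1u64 << src) != 0 {
                cleared | (1u64 << dst)
            } else {
                cleared
            }
        }
        Stmt::Clear(v) => input & !(1u64 << v),
        Stmt::Nop => input,
    }
}

/// Returns the taint set at the exit of every node. Entry states only grow,
/// and each can grow at most 64 times, so the loop terminates.
fn solve(stmts: &[Stmt], succs: &[Vec<usize>]) -> Vec<u64> {
    let n = stmts.len();
    let mut in_state = vec![0u64; n];
    let mut out_state = vec![0u64; n];
    let mut work: VecDeque<usize> = (0..n).collect();
    while let Some(node) = work.pop_front() {
        let out = transfer(stmts[node], in_state[node]);
        out_state[node] = out;
        for &succ in &succs[node] {
            let joined = in_state[succ] | out;
            if joined != in_state[succ] {
                in_state[succ] = joined;
                work.push_back(succ);
            }
        }
    }
    out_state
}
```

The termination argument is the one stated in the highlight: the lattice has finite height, so a node can be re-queued only a bounded number of times, independent of file size or loop structure.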
### CLI & Output
- `--severity <EXPR>` replaces `--high-only`. Supports `HIGH`, `HIGH,MEDIUM`, `>=MEDIUM`. Filtering is now applied at the output stage so taint and CFG findings are correctly downgraded too.
- `--mode <full|ast|cfg|taint>` replaces `--ast-only` and `--cfg-only`.
- `--index <auto|off|rebuild>` replaces `--no-index` and `--rebuild-index`.
- `--fail-on <SEVERITY>` for CI exit-code gating.
- `--min-score <N>` for ranking-aware filtering.
- `--show-suppressed` reveals suppressed findings dimmed with `[SUPPRESSED]`.
- `--keep-nonprod-severity` (renamed from `--include-nonprod`).
- `--quiet` mirrors `output.quiet`.
- Console renderer overhauled: severity is the strongest visual anchor, file paths are dim blue, taint flows use `→` arrows, multi-line call chains are normalized.
- Confidence shown alongside score in the header line.
- Pattern-level confidence is now set at the pattern definition site, not heuristically inferred from severity.
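The three `--severity` expression forms shown above (`HIGH`, `HIGH,MEDIUM`, `>=MEDIUM`) could be parsed along the following lines. This is a sketch under assumptions: the real grammar, case handling, and error type are not documented here, and `parse_severity_expr` is a hypothetical name. The `FromStr` impl rejects unknown values instead of defaulting to a tier.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Low,
    Medium,
    High,
}

impl FromStr for Severity {
    type Err = String;

    /// Case-insensitive matching is an assumption of this sketch.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_uppercase().as_str() {
            "LOW" => Ok(Severity::Low),
            "MEDIUM" => Ok(Severity::Medium),
            "HIGH" => Ok(Severity::High),
            other => Err(format!("unknown severity: {}", other)),
        }
    }
}

/// Expands an expression into the list of severities it selects.
/// `>=X` selects X and everything above it; otherwise a comma list is read.
fn parse_severity_expr(expr: &str) -> Result<Vec<Severity>, String> {
    const ALL: [Severity; 3] = [Severity::Low, Severity::Medium, Severity::High];
    let expr = expr.trim();
    if let Some(rest) = expr.strip_prefix(">=") {
        let floor: Severity = rest.parse()?;
        return Ok(ALL.iter().copied().filter(|s| *s >= floor).collect());
    }
    let mut selected = Vec::new();
    for part in expr.split(',') {
        let sev: Severity = part.parse()?;
        if !selected.contains(&sev) {
            selected.push(sev);
        }
    }
    Ok(selected)
}
```

Deriving `Ord` on the enum with variants declared from `Low` to `High` is what makes the `>=` comparison work without a separate ranking table.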
### Breaking
- Config and data directory renamed from `dev.ecpeter23.nyx` to `nyx`. Existing config and SQLite indexes at the old path won't be picked up. Copy them across or re-run `nyx scan`.
- `Severity::from_str` now returns `Err` for unknown values instead of silently defaulting to Low.
### Notable Fixes
- KINDS-map audit across all 10 languages: 89 missing tree-sitter node types added. Switch/case, try/catch/finally, class bodies, lambdas, closures, and namespaces are no longer silently dropped.
- `else_clause` mapping fixed for C, C++, Rust, JS, TS, Python, PHP. Code inside else blocks was being dropped from the CFG.
- Rust `if let` / `while let` taint propagation now works.
- Taint BFS non-termination on large JS files (the BFS engine has since been replaced).
- C++ `popen` pattern ID collision with C.
- Constant-arg sink suppression for AST patterns.
## [0.3.0] - 2026-02-25
Configurability, SARIF, and an aggressive false-positive purge.
### Highlights
- **Configurable analysis rules.** Sources, sanitizers, sinks, terminators, and event handlers can be defined per language in `nyx.local` or via `nyx config add-rule`/`add-terminator`. Config rules take priority over built-in rules.
- **`nyx config` CLI subcommand** with `show`, `path`, `add-rule`, `add-terminator`.
- **SARIF 2.1.0 output (`-f sarif`).** Spec-compliant for GitHub Code Scanning, Azure DevOps, and other SARIF consumers.
- **`SourceKind` taint classification.** Findings carry an inferred source kind (`UserInput`, `EnvironmentConfig`, `FileSystem`, `Database`, `Unknown`) and severity is now derived from it instead of being hardcoded to High.
- **Non-prod severity downgrade by default.** Findings in tests, vendor, benchmarks, examples, fixtures, build scripts, and `*.min.js` are downgraded one tier. `--include-nonprod` restores original severity.
- **Resource leak detection** for Python, Ruby, PHP, JavaScript, and TypeScript (file handles, sockets, locks, mysqli, curl, fs streams).
- **Progress bars and quiet mode.** Indicatif-driven progress for discovery, Pass 1, and Pass 2 (auto-hidden in JSON/SARIF/quiet modes).
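The non-prod downgrade in the highlights above amounts to a path classifier plus a one-tier step down. The sketch below is only an approximation: the directory names come from the highlight, but matching on exact path segments, leaving build scripts out, and keeping `Low` at `Low` are all assumptions, and every name is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Low,
    Medium,
    High,
}

/// Steps a severity down one tier; Low has nowhere to go in this sketch.
fn downgrade_one_tier(sev: Severity) -> Severity {
    match sev {
        Severity::High => Severity::Medium,
        Severity::Medium | Severity::Low => Severity::Low,
    }
}

/// Rough non-prod check on a '/'-separated relative path.
fn looks_non_prod(path: &str) -> bool {
    const DIRS: [&str; 5] = ["tests", "vendor", "benchmarks", "examples", "fixtures"];
    path.ends_with(".min.js") || path.split('/').any(|seg| DIRS.contains(&seg))
}

/// `keep_original` models an opt-out flag that restores the original severity.
fn effective_severity(sev: Severity, path: &str, keep_original: bool) -> Severity {
    if !keep_original && looks_non_prod(path) {
        downgrade_one_tier(sev)
    } else {
        sev
    }
}
```

Under these assumptions a High finding in `tests/app_test.py` is reported as Medium by default and stays High when the opt-out is set.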
### Performance
- Single fused parse+CFG pass replaces the previous two-parse summary extraction.
- Light-weight dataflow sweep in CFG builder is now O(N) per function instead of O(N²) over the whole file.
- Parallel summary merging via rayon fold/reduce.
- Indexed scans now read and hash each file once instead of up to 4 times.
- SQLite mutex mode relaxed (r2d2 + WAL provides safety without global lock).
- Zero-allocation taint hashing and in-place taint transfer.
### Notable Fixes
- One-hop constant-binding suppression: `cmd = "git"; subprocess.run([cmd, ...])` no longer flags.
- Exec-path guards (`which`, `resolve_binary`, `shutil.which`) recognized.
- `signal.connect` / `event.connect` no longer match Python db-connection acquire patterns.
- `threading.Lock()` without `.acquire()` no longer flags as unreleased.
- `FileResponse(f)` / `send_file(f)` recognized as ownership transfer.
- `el.href` no longer matches `location.href` patterns.
- Constant-only sink calls (`subprocess.run(["make","clean"])`) suppressed.
- `std::cout` no longer treated as a sink.
- Break/continue inside loops correctly wires into the loop header/exit, fixing false unreachable-code findings.
- Preprocessor `#ifdef`/`#endif` blocks no longer orphan subsequent code in C/C++.
- `freopen` no longer matches `fopen` acquire patterns.
- Struct-field, linked-list, and global assignment recognized as ownership transfers.
## [0.2.0] - 2026-02-24
The cross-file release.
- **Two-pass cross-file taint analysis.** Pass 1 extracts `FuncSummary` per function (caps, propagation, callees), Pass 2 runs BFS taint propagation with cross-file callee resolution.
- **CFG analysis engine** with five detectors: unguarded sinks, auth gaps in web handlers, unreachable security code, error fallthrough, resource leaks.
- **Cross-language interop** via explicit `InteropEdge` structs (no false-positive name collisions).
- **Function summaries persisted to SQLite** (`function_summaries` table).
- **Multi-language CFG + taint support** for all 10 languages.
- **Resource leak detection** for C/C++, Go, Rust, and Java.
- **Finding scoring system** combining severity, entry-point proximity, path complexity, taint confirmation, and confidence.
- **Analysis modes**: `Full` (default), `Ast` (`--ast-only`), `Taint` (`--cfg-only`).
- **Cap bitflags expanded**: `ENV_VAR`, `HTML_ESCAPE`, `SHELL_ESCAPE`, `URL_ENCODE`, `JSON_PARSE`, `FILE_IO`.
- Performance: read-once/hash-once via `_from_bytes` variants, lock-free rayon, SQLite WAL + 8 MB cache + 256 MB mmap.
- Tracing instrumentation on all pipeline stages; criterion benchmark suite.
## [0.2.0-alpha] - 2025-06-28
- Experimental intra-procedural CFG + taint analysis for Rust. Builds a CFG, applies dataflow, and flags unsanitised Source → Sink paths (e.g. `env::var` → `Command::new`).
- O(1) node-kind lookup via per-language PHF tables.
- Debug channel `target=cfg` (`RUST_LOG=nyx::cfg=debug`) to inspect generated graphs.
- Fixed Windows release pipeline (PowerShell has no `zip` command).
## [0.1.1-alpha] - 2025-06-25
- Fixed `scan --no-index` not respecting the `max_results` config setting (#1).
- Integration tests covering indexing and scanning pipelines (#3, #4, #5, #8).
## [0.1.0-alpha] - 2025-06-25
Initial alpha release.
- Multi-language AST pattern scanning via `tree-sitter` for Rust, C/C++, Java, Go, PHP, Python, Ruby, TypeScript, JavaScript.
- `scan` command: filesystem walker, pattern execution, console output.
- `index` command: build, rebuild, and status reporting of SQLite-backed index.
- `list` command: list indexed projects with optional verbosity.
- `clean` command: remove one or all project indexes.
- Configuration system with `nyx.conf` (generated) and `nyx.local` (user overrides).
- Default severity levels: High, Medium, Low.

73
CLA.md Normal file

@ -0,0 +1,73 @@
# Nyx Contributor License Agreement
## Why this exists
Nyx is an open-source project and will always have a fully open-source core available to the community.
This Contributor License Agreement (CLA) exists to ensure the long-term sustainability of the project. It allows Nyx to evolve over time, including improving, distributing, and potentially offering commercial versions or services that support continued development.
**You retain ownership of your contributions.** This agreement simply grants the project the rights needed to use and evolve them.
---
Thank you for your interest in contributing to Nyx (the "Project"). This Contributor License Agreement ("Agreement") clarifies the intellectual property rights granted with each Contribution from any person or entity. It is for Your protection as a contributor as well as the protection of the Project and its users.
By submitting a Contribution to the Project, You accept and agree to the terms below. If You do not agree to these terms, please do not submit Contributions.
## 1. Definitions
**"You"** (or **"Your"**) means the individual or legal entity making a Contribution to the Project. For a legal entity, "You" includes the entity and any entity that controls, is controlled by, or is under common control with that entity.
**"Contribution"** means any work of authorship, including any modifications or additions to an existing work, that is intentionally submitted by You to the Project for inclusion in, or documentation of, the Project. "Submitted" means any form of electronic, verbal, or written communication sent to the Project (including but not limited to pull requests, patches, and issue comments) but excluding communication that is conspicuously marked or otherwise designated in writing by You as "Not a Contribution."
## 2. Copyright License Grant
Subject to the terms of this Agreement, You hereby grant to the Project, to any entity that maintains or succeeds it, and to recipients of software distributed by the Project a perpetual, worldwide, non-exclusive, royalty-free, irrevocable copyright license, with the right to sublicense through multiple tiers of sublicensees, to reproduce, prepare derivative works of, publicly display, publicly perform, distribute, and sublicense Your Contribution and such derivative works.
## 3. Patent License Grant
Subject to the terms of this Agreement, You hereby grant to the Project, to any entity that maintains or succeeds it, and to recipients of software distributed by the Project a perpetual, worldwide, non-exclusive, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer Your Contribution and any combination of Your Contribution with the Project to which it was submitted. This patent license applies only to those patent claims licensable by You that are necessarily infringed by Your Contribution alone or by combination of Your Contribution with the Project.
If any entity institutes patent litigation against You or any other entity (including a cross-claim or counterclaim in a lawsuit) alleging that Your Contribution, or the Project to which You have contributed, constitutes direct or contributory patent infringement, then any patent licenses granted to that entity under this Agreement for that Contribution or Project shall terminate as of the date such litigation is filed.
## 4. Relicensing Right
In addition to the licenses granted in Sections 2 and 3, You grant the Project and any entity that maintains or succeeds it the right to relicense Your Contribution, in whole or in part, under terms other than the Project's current license (currently GPL-3.0-or-later), where necessary to support the long-term sustainability, distribution, and evolution of the Project.
This may include, without limitation:
1. Dual-licensing the Project under a commercial license;
2. Combining Your Contribution with proprietary components; or
3. Moving the Project to a different open source license.
This right is irrevocable and may be exercised by the Project's maintainers as part of maintaining and evolving the Project.
## 5. Moral Rights Waiver
To the maximum extent permitted by applicable law, You waive, and agree not to assert, any moral rights or similar rights of attribution and integrity that You may have in Your Contribution against the Project, its successors, and recipients of software distributed by the Project. To the extent such rights cannot be waived under applicable law, You agree not to enforce them in a manner that would limit the rights granted under this Agreement.
## 6. Representations
You represent that:
1. Each of Your Contributions is Your original creation, or You otherwise have the legal right to submit it under the terms of this Agreement;
2. To the best of Your knowledge, Your Contribution does not infringe any third party's copyright, patent, trade secret, or other intellectual property rights; and
3. You have the legal authority to enter into this Agreement and to grant the licenses set forth above.
If any portion of Your Contribution is not Your original creation, You will identify the source and any license or other restriction applicable to that material as part of Your submission.
## 7. Employer Authorization
If You are submitting a Contribution on behalf of Your employer, or the Contribution was made within the scope of Your employment, You represent that Your employer has authorized You to make the Contribution and to grant the licenses set forth in this Agreement. If You are unsure, please confirm with Your employer before submitting.
## 8. No Warranty
You provide Your Contributions on an "AS IS" basis, without warranties or conditions of any kind, either express or implied, including, without limitation, any warranties of title, non-infringement, merchantability, or fitness for a particular purpose. You are not required to provide support for Your Contributions, except to the extent You desire to provide such support.
## 9. Copyright Retained
You retain copyright to Your Contribution. This Agreement grants the licenses set forth above; it does not transfer ownership. Its purpose is to give the Project flexibility to evolve and to relicense the codebase over time without needing to obtain permission from each past contributor on a case-by-case basis.
## 10. Notice of Changes
If You become aware of any facts or circumstances that would make any representation in this Agreement inaccurate in any respect, You agree to notify the Project promptly.

129
CODE_OF_CONDUCT.md Normal file

@ -0,0 +1,129 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
- The use of sexualized language or imagery, and sexual attention or advances of
any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email address,
without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official email address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement by
**opening a private issue** at <https://github.com/elicpeter/nyx/issues/new/choose>.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the
[Contributor Covenant](https://www.contributor-covenant.org/), version 2.1,
available at
<https://www.contributor-covenant.org/version/2/1/code_of_conduct/>.
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/inclusion).
For answers to common questions about this code of conduct, see the FAQ at
<https://www.contributor-covenant.org/faq/>. Translations are available at
<https://www.contributor-covenant.org/translations/>.

399
CONTRIBUTING.md Normal file

@ -0,0 +1,399 @@
# Contributing to Nyx
Thank you for your interest in improving Nyx. This guide covers everything you need to contribute effectively.
User-facing documentation lives at **[elicpeter.github.io/nyx](https://elicpeter.github.io/nyx/)**; the source for those pages is in [`docs/`](docs/).
Please read our [Code of Conduct](CODE_OF_CONDUCT.md) before participating.
---
## Table of Contents
1. [Development Setup](#development-setup)
2. [Project Layout](#project-layout)
3. [How to Add a New AST Pattern](#how-to-add-a-new-ast-pattern)
4. [How to Add a New Taint Rule](#how-to-add-a-new-taint-rule)
5. [How to Add a New Language](#how-to-add-a-new-language)
6. [Testing](#testing)
7. [Pull Request Guidelines](#pull-request-guidelines)
8. [Bug Reports](#bug-reports)
9. [Feature Requests](#feature-requests)
10. [Release Process](#release-process)
---
## Development Setup
### Prerequisites
- **Rust 1.88+** (edition 2024)
- Git
- **Node 20+** — only if you touch the browser UI under `frontend/` (the
`nyx serve` web app). Pure-Rust changes do not need it.
### Building
```bash
git clone https://github.com/elicpeter/nyx.git
cd nyx
cargo build # Debug build
cargo build --release # Release build
cargo install --path . # Install as `nyx` binary
```
### Running Quality Checks
The fastest way to reproduce CI locally is the bundled script — it runs the same
commands CI runs (fmt, Clippy, tests, and the frontend checks):
```bash
./scripts/check.sh # Mirror CI: fmt + clippy + tests (+ frontend)
./scripts/check.sh --rust-only # Skip the frontend checks
./scripts/fix.sh # Auto-fix: cargo fmt + clippy --fix + prettier/eslint
```
Or run the steps individually:
```bash
cargo test --all-features # Tests, incl. tests/ integration suite
cargo clippy --all-targets --all-features -- -D warnings # Lint, warnings = errors
cargo fmt # Format code
cargo fmt -- --check # Check formatting without modifying
```
> **Match CI exactly.** CI lints and tests with `--all-targets --all-features`.
> The older `cargo test --bin nyx` / `cargo clippy --all` commands skip the
> `tests/` integration suite and feature-gated code, so they can pass locally
> while CI fails. Prefer `./scripts/check.sh`.
> **Note**: The first build downloads and compiles tree-sitter grammars for all 10 languages. Subsequent builds are faster.
### Benchmarks
```bash
cargo bench --bench scan_bench
```
Benchmark fixtures live in `benches/fixtures/`. Criterion produces HTML reports in `target/criterion/`.
---
## Project Layout
> **New here?** [`docs/how-it-works.md`](docs/how-it-works.md) walks the analysis
> pipeline end to end (with a diagram), and [`docs/detectors/taint.md`](docs/detectors/taint.md)
> covers the taint engine. The easiest first contribution is usually a new AST
> pattern (see [below](#how-to-add-a-new-ast-pattern)) — small, self-contained,
> and well templated.
```
src/
main.rs CLI entry point
lib.rs Library re-exports (benchmarks, integration tests)
cli.rs Clap command definitions
commands/ Subcommand handlers (scan, index, list, clean, config, serve)
ast.rs Entry points for both passes; tree-sitter parsing
cfg/ CFG construction from AST, type hierarchy
cfg_analysis/ CFG structural detectors
guards.rs Unguarded sink detection (dominator analysis)
auth.rs Auth gap detection
resources.rs Resource leak detection
error_handling.rs Error fallthrough detection
unreachable.rs Unreachable security code detection
rules.rs Guard rules, auth rules, resource pairs
ssa/ SSA IR (lowering, optimization passes, const prop)
taint/ SSA-based taint engine (sole engine since 0.5.0)
mod.rs Facade + JS two-level solve
domain.rs Shared lattice types (VarTaint, Cap, TaintOrigin)
ssa_transfer/ Block-level worklist, k=1 inline cache, gated sinks
backwards.rs Demand-driven backwards taint walk (opt-in)
path_state.rs Predicate tracking and contradiction pruning
state/
engine.rs Generic monotone dataflow engine (Transfer<S: Lattice>)
transfer.rs DefaultTransfer: resource lifecycle + auth state
summary/ FuncSummary, SsaFuncSummary, GlobalSummaries, hierarchy index
abstract_interp/ Interval + string prefix/suffix domains
pointer/ Field-sensitive points-to (Steensgaard-style)
symex/ Symbolic execution + witness generation
constraint/ Path-constraint solving (optional Z3 via `smt` feature)
auth_analysis/ Rust auth rule (`rs.auth.missing_ownership_check`) + sink classes
suppress/ Inline `nyx:ignore` directive parsing
labels/ Per-language label rules (one file per language)
patterns/ Per-language AST pattern queries (one file per language)
callgraph.rs Call graph construction (petgraph), SCC, topo sort
database.rs SQLite indexing via r2d2 pool
rank.rs Attack-surface ranking
fmt.rs Console output formatting
output.rs SARIF 2.1 builder
walk.rs Parallel file walker (ignore crate, respects .gitignore)
symbol/ Symbol interning (SymbolId)
server/ `nyx serve` HTTP layer, routes, triage sync
interop.rs Cross-language interop edges
engine_notes.rs Direction-aware engine notes (UnderReport / OverReport / Bail)
evidence.rs Structured evidence emitted with each finding
errors.rs NyxError, NyxResult types
utils/
config.rs TOML config loading, merging, Config struct
```
---
## How to Add a New AST Pattern
AST patterns are the simplest detector to add. Each pattern is a tree-sitter query that matches a structural code construct.
### Step-by-step
1. **Pick the language file** under `src/patterns/<lang>.rs`.
2. **Choose the metadata**:
| Field | Options | Guidelines |
|-------|---------|------------|
| **ID** | `<lang>.<category>.<specific>` | e.g. `py.cmdi.os_popen` |
| **Tier** | `A` or `B` | `A` = presence alone is high-signal; `B` = query includes a heuristic guard |
| **Severity** | `High`, `Medium`, `Low` | High: command exec, deser, banned functions. Medium: SQL concat, reflection, XSS. Low: weak crypto, code quality. |
| **Category** | See `PatternCategory` enum | `CommandExec`, `CodeExec`, `Deserialization`, `SqlInjection`, `PathTraversal`, `Xss`, `Crypto`, `Secrets`, `InsecureTransport`, `Reflection`, `MemorySafety`, `Prototype`, `CodeQuality` |
3. **Write the tree-sitter query**:
```rust
Pattern {
id: "py.cmdi.os_popen",
description: "os.popen() shell command execution",
query: r#"(call
function: (attribute
object: (identifier) @pkg (#eq? @pkg "os")
attribute: (identifier) @fn (#eq? @fn "popen")))
@vuln"#,
severity: Severity::High,
tier: PatternTier::A,
category: PatternCategory::CommandExec,
},
```
The query **must** capture a `@vuln` node. That node's span determines the reported location.
4. **Test it**:
```bash
cargo test --all-features
```
5. **Update docs**: Add the new rule to `docs/rules/<lang>.md`.
### Tips
- Use the [tree-sitter playground](https://tree-sitter.github.io/tree-sitter/playground) to develop and test queries.
- Avoid duplicating taint coverage. If the same function is already a labeled sink in `src/labels/<lang>.rs`, the AST pattern is still useful for `--mode ast`, but use a distinct ID namespace. The dedup pass prevents exact-duplicate findings at the same location.
- Test with real-world code to check false positive rates before choosing a tier.
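
The ID convention from the metadata table (`<lang>.<category>.<specific>`) is easy to check mechanically. The sketch below is illustrative only: `split_pattern_id` is a hypothetical helper, not a function in Nyx, and it assumes that any dots after the second one belong to the `<specific>` part.

```rust
/// Splits a pattern ID of the form `<lang>.<category>.<specific>`.
/// Hypothetical helper for illustration; not part of the Nyx codebase.
fn split_pattern_id(id: &str) -> Option<(&str, &str, &str)> {
    // Split into at most three parts; extra dots stay in `specific` (assumption).
    let mut parts = id.splitn(3, '.');
    let lang = parts.next()?;
    let category = parts.next()?;
    let specific = parts.next()?;
    // Reject IDs with an empty component, such as "py..x".
    if lang.is_empty() || category.is_empty() || specific.is_empty() {
        return None;
    }
    Some((lang, category, specific))
}

fn main() {
    println!("{:?}", split_pattern_id("py.cmdi.os_popen"));
    println!("{:?}", split_pattern_id("py.cmdi"));
}
```

A check like this could sit in a unit test that walks a language's pattern list, so that a malformed ID fails fast during review.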
---
## How to Add a New Taint Rule
Taint rules define sources (where untrusted data enters), sinks (where dangerous operations happen), and sanitizers (where data is made safe).
### Step-by-step
1. **Open the language file** in `src/labels/<lang>.rs`.
2. **Add an entry** to the `RULES` slice:
```rust
LabelRule {
matchers: &["dangerouslySetInnerHTML"],
label: DataLabel::Sink(Cap::HTML_ESCAPE),
},
```
3. **Choose the right label type**:
| Type | Purpose | Example |
|------|---------|---------|
| `DataLabel::Source(cap)` | Introduces tainted data | `env::var`, `req.body` |
| `DataLabel::Sanitizer(cap)` | Strips matching capability bits | `html_escape`, `encodeURIComponent` |
| `DataLabel::Sink(cap)` | Dangerous operation requiring sanitization | `eval`, `innerHTML`, `Command::new` |
4. **Choose capabilities**:
| Capability | When to use |
|-----------|-------------|
| `Cap::all()` | Sources that produce universally dangerous data |
| `Cap::SHELL_ESCAPE` | Shell command injection sinks/sanitizers |
| `Cap::HTML_ESCAPE` | XSS sinks/sanitizers |
| `Cap::URL_ENCODE` | URL injection sinks/sanitizers |
| `Cap::JSON_PARSE` | JSON parsing sanitizers |
| `Cap::FILE_IO` | File I/O sinks |
| `Cap::FMT_STRING` | Format string sinks |
| `Cap::ENV_VAR` | Environment/config data sources |
5. **Matcher semantics**:
- Case-insensitive suffix matching by default.
- If a matcher ends with `_`, it acts as a prefix match.
- Multiple matchers in one rule are alternatives (any match triggers the rule).
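
The matcher semantics above can be sketched in a few lines of plain Rust. This is a reading of the three bullets, not Nyx's actual matching code: `matcher_hits` and `rule_hits` are hypothetical names, and the sketch assumes the trailing `_` is kept as part of the prefix being compared.

```rust
/// Returns true when `callee` matches `matcher` under the rules listed above.
/// Hypothetical helper; the real matching logic in Nyx may differ.
fn matcher_hits(matcher: &str, callee: &str) -> bool {
    // Matching is case-insensitive, so compare lowercased copies.
    let m = matcher.to_ascii_lowercase();
    let c = callee.to_ascii_lowercase();
    if m.ends_with('_') {
        // A trailing underscore turns the matcher into a prefix match.
        // Assumption: the underscore itself is part of the prefix.
        c.starts_with(m.as_str())
    } else {
        // Default behavior: suffix match.
        c.ends_with(m.as_str())
    }
}

/// Matchers in one rule are alternatives: any hit triggers the rule.
fn rule_hits(matchers: &[&str], callee: &str) -> bool {
    matchers.iter().any(|m| matcher_hits(m, callee))
}

fn main() {
    println!("{}", matcher_hits("innerHTML", "element.innerhtml"));
    println!("{}", matcher_hits("sanitize_", "sanitize_html"));
    println!("{}", rule_hits(&["eval", "exec"], "child_process.exec"));
}
```

Suffix matching is what lets a short matcher such as `popen` cover a qualified call like `os.popen`; the prefix form suits families of helpers that share a stem.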
### User-defined rules (no code change needed)
Users can add taint rules via config:
```toml
[[analysis.languages.javascript.rules]]
matchers = ["dangerouslySetInnerHTML"]
kind = "sink"
cap = "html_escape"
```
Or via CLI:
```bash
nyx config add-rule --lang javascript --matcher dangerouslySetInnerHTML --kind sink --cap html_escape
```
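
The label types and capabilities in this section can be read as a small bitset model: a source sets capability bits, a sanitizer strips the bits it covers, and a sink is concerned with specific bits. The sketch below illustrates that reading with plain integers. The bit values, the `sanitize` and `sink_fires` helpers, and the rule that a sink reports when any of its bits is still set are assumptions made for illustration; the real `Cap` type lives in the taint engine and may behave differently.

```rust
// Hypothetical capability bits; the values are illustrative, not Nyx's.
type Caps = u8;
const SHELL_ESCAPE: Caps = 1 << 0;
const HTML_ESCAPE: Caps = 1 << 1;
// Stand-in for `Cap::all()`: every bit set.
const ALL: Caps = Caps::MAX;

/// A sanitizer strips the capability bits it covers from the taint set.
fn sanitize(taint: Caps, cap: Caps) -> Caps {
    taint & !cap
}

/// Assumed sink rule: report when any bit the sink cares about is still set.
fn sink_fires(taint: Caps, cap: Caps) -> bool {
    (taint & cap) != 0
}

fn main() {
    let tainted = ALL; // e.g. data read from a request body
    let escaped = sanitize(tainted, HTML_ESCAPE);
    println!("html sink before sanitizer: {}", sink_fires(tainted, HTML_ESCAPE));
    println!("html sink after sanitizer:  {}", sink_fires(escaped, HTML_ESCAPE));
    println!("shell sink after sanitizer: {}", sink_fires(escaped, SHELL_ESCAPE));
}
```

Under this model an HTML sanitizer does nothing for a shell sink, which is why the capability on a sanitizer rule should match the capability of the sinks it is meant to protect.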
---
## How to Add a New Language
Adding a new language requires changes across several modules. Use an existing language (e.g. Go or Python) as a template.
### Checklist
1. **Tree-sitter parser**: Add `tree-sitter-<lang>` to `Cargo.toml`.
2. **Language registration**: Register the parser in `ast.rs` (language detection from file extension, parser initialization).
3. **CFG node kinds**: Create `src/labels/<lang>.rs` with a `KINDS` map that maps tree-sitter node types to the internal `Kind` enum (`Block`, `If`, `While`, `For`, `Return`, `CallFn`, `CallMethod`, `Assignment`, etc.).
4. **Parameter extraction**: Add a `PARAM_CONFIG` constant specifying how to extract function parameters from the AST (field name for parameter list, node type for individual parameters, extraction field for parameter names).
5. **Label rules**: Add `RULES` (sources, sinks, sanitizers) and `TERMINATORS` to the labels file.
6. **AST patterns**: Create `src/patterns/<lang>.rs` with a `PATTERNS` constant.
7. **Registry updates**:
- `src/patterns/mod.rs`: add to the `REGISTRY` HashMap
- `src/labels/mod.rs`: add to the `classify()` dispatch
8. **File extension mapping**: Add the extension in `ast.rs`.
9. **Tests**: Write unit tests and add test fixtures.
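
Step 3 of the checklist amounts to a lookup from grammar node types to the internal `Kind` enum. The following is a rough, standalone sketch of that idea. The node type strings are placeholders (the real ones come from each grammar's node types), the `Other` variant and the `kind_for` function are hypothetical, and the actual `KINDS` map in Nyx may use a different container.

```rust
/// Subset of the internal kinds named in the checklist, plus a hypothetical
/// `Other` fallback for node types the CFG builder does not care about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Block,
    If,
    While,
    For,
    Return,
    CallFn,
    CallMethod,
    Assignment,
    Other,
}

/// Illustrative node-type table; the strings are placeholders, not taken
/// from any particular tree-sitter grammar.
const KINDS: &[(&str, Kind)] = &[
    ("block", Kind::Block),
    ("if_statement", Kind::If),
    ("while_statement", Kind::While),
    ("for_statement", Kind::For),
    ("return_statement", Kind::Return),
    ("call_expression", Kind::CallFn),
    ("method_invocation", Kind::CallMethod),
    ("assignment_expression", Kind::Assignment),
];

/// Looks up the kind for a node type, falling back to `Kind::Other`.
fn kind_for(node_type: &str) -> Kind {
    KINDS
        .iter()
        .find(|(name, _)| *name == node_type)
        .map(|(_, kind)| *kind)
        .unwrap_or(Kind::Other)
}

fn main() {
    println!("{:?}", kind_for("if_statement"));
    println!("{:?}", kind_for("comment"));
}
```

When porting a new language, most of the work in this step is finding which node types in the grammar correspond to each control-flow construct; an existing labels file is the best reference for that.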
---
## Testing
### Tests
Unit tests are inline `#[test]` blocks inside source modules; integration tests
live under `tests/`. Run everything the way CI does:
```bash
cargo test --all-features
```
### What to Test
- **New AST patterns**: Ensure the tree-sitter query matches the intended construct and does not match safe alternatives.
- **New taint rules**: Verify that source-to-sink flows are detected and that sanitizers properly neutralize findings.
- **New CFG rules**: Test that guard dominance logic correctly suppresses findings when guards are present.
- **Edge cases**: Empty files, files with syntax errors (tree-sitter is error-tolerant), deeply nested structures.
### Linting
CI runs Clippy with strict settings. Before submitting:
```bash
cargo clippy --all-targets --all-features -- -D warnings
```
---
## Pull Request Guidelines
First-time contributors are welcome. If you are unsure where to start, open an issue and we can help identify a focused starter task.
1. **Branch from `master`**. Use descriptive branch names: `feat/add-kotlin-support`, `fix/false-positive-sql-concat`, `docs/update-rule-reference`.
2. **Keep PRs focused**. One logical change per PR.
3. **Ensure CI passes** — run `./scripts/check.sh` (mirrors CI), or the steps individually:
```bash
cargo test --all-features
cargo clippy --all-targets --all-features -- -D warnings
cargo fmt -- --check
```
4. **Commit style**: Use [Conventional Commits](https://www.conventionalcommits.org/).
```
feat(patterns): add Python subprocess.Popen pattern
fix(taint): prevent false positive on sanitized innerHTML
docs(rules): update JavaScript rule reference
```
5. **Document new rules**. If you add patterns or taint rules, update the corresponding `docs/rules/<lang>.md` page.
6. **Include test cases** for any new detection rules.
7. **Disclose material AI assistance** in the PR description if the change was drafted, generated, or substantially refactored by an AI tool. One line is enough. See [AI-POLICY.md](AI-POLICY.md) for the full policy and the bar we hold AI-assisted contributions to.
---
## Bug Reports
Please [open an issue](https://github.com/elicpeter/nyx/issues) for:
- **Crashes or panics**: include the backtrace (`RUST_BACKTRACE=1 nyx scan .`)
- **False positives**: include the minimal code snippet, rule ID, and Nyx version
- **False negatives**: describe what you expected Nyx to find and why
- **Documentation errors**: point to the specific page and what's wrong
---
## Feature Requests
We welcome well-motivated feature proposals. Please describe:
1. **Problem statement**: what pain point does this solve?
2. **Proposed solution**: high-level description, optionally with pseudo-code.
3. **Alternatives considered**: why existing functionality is not enough.
---
## Release Process
1. Update version in `Cargo.toml`.
2. Update `CHANGELOG.md` with the new version section.
3. Run full checks: `./scripts/check.sh` (or `cargo test --all-features && cargo clippy --all-targets --all-features -- -D warnings`).
4. Create a git tag: `git tag v0.x.y`.
5. Push tag: `git push origin v0.x.y`.
6. CI builds release binaries and publishes to crates.io.
---
## Security Issues
Please do **not** open public issues for security-sensitive bugs. See [SECURITY.md](SECURITY.md) for our responsible disclosure process.
---
## License
### Contributions are released under GPL-3.0-or-later
By submitting a pull request, patch, or other contribution to Nyx, you agree that your contribution will be released under the [GPL-3.0-or-later](./LICENSE), the same license as the project.
### Developer Certificate of Origin
We use the Developer Certificate of Origin (DCO) as a lightweight baseline for contributions. All commits must include a `Signed-off-by:` trailer, which certifies that you wrote the code yourself or otherwise have the right to submit it under the project license.
Use `git commit -s` to add this automatically.
### Contributor License Agreement
Before your first contribution can be merged, you must sign the Nyx [Contributor License Agreement](./CLA.md).
The CLA does not transfer ownership of your work. You retain copyright to your contributions. It grants Nyx the rights needed to maintain, distribute, and evolve the project over time, including the flexibility to support long-term sustainability through future licensing or commercial offerings.
If you do not agree to these terms, please do not submit contributions to Nyx.

1983
Cargo.lock generated

File diff suppressed because it is too large


@ -1,37 +1,167 @@
 [package]
-name = "nyx"
+name = "nyx-scanner"
-version = "0.1.0"
+version = "0.8.0"
 edition = "2024"
+rust-version = "1.88"
+description = "A multi-language static analysis tool for detecting security vulnerabilities"
+license = "GPL-3.0-or-later"
+authors = ["Eli Peter <elicpeter@example.com>"]
+homepage = "https://nyxsec.dev/scanner"
+repository = "https://github.com/elicpeter/nyx"
+documentation = "https://nyxsec.dev/docs/nyx/"
+keywords = ["security", "vulnerability", "scanner", "static-analysis", "cli"]
+categories = ["security", "command-line-utilities", "development-tools", "parser-implementations", "text-processing"]
+readme = "README.md"
+default-run = "nyx"
+include = [
+"/src/**",
+"/tools/**",
+"/build.rs",
+"/Cargo.toml",
+"/Cargo.lock",
+"/README.md",
+"/LICENSE",
+"/THIRDPARTY-LICENSES.html",
+"/default-nyx.conf",
+]
+autoexamples = false
+[package.metadata.binstall]
+pkg-url = "{ repo }/releases/download/v{ version }/nyx-{ target }{ archive-suffix }"
+pkg-fmt = "zip"
+bin-dir = "target/{ target }/release/{ bin }{ binary-ext }"
+# docs.rs builds the `serve` feature (default) so the server module renders.
+# `smt` is left off — bundled Z3 takes too long on docs.rs builders, and
+# `smt-system-z3` needs a system library that isn't available there.
+[package.metadata.docs.rs]
+features = ["serve"]
+rustdoc-args = ["--cfg", "docsrs"]
+[features]
+default = ["serve", "dynamic"]
+serve = ["dep:axum", "dep:tokio", "dep:tokio-stream", "dep:tower-http"]
+smt = ["dep:z3", "z3/bundled"]
+smt-system-z3 = ["dep:z3"]
+docgen = []
+# Dynamic verification layer: builds harnesses from findings, runs them in a
+# sandbox, reports back whether the sink fires.
+dynamic = ["dep:bytes", "dep:h2", "dep:http", "dep:prost", "dep:tempfile", "dep:tokio"]
+# Phase 19 (Track E.3): the `nyx-image-builder` helper binary that builds
+# and pins per-toolchain Docker images. Gated so it does not bloat the
+# default `nyx` build with extra TOML-write logic CI-only operators need.
+image-builder = []
+# Phase 20 (Track E.4): the firecracker VM backend. Off by default so
+# the standard build pulls in zero Firecracker-related code; turning it
+# on adds the `firecracker.rs` backend module and exposes
+# `SandboxBackend::Firecracker` to callers. When the feature is on but
+# the `firecracker` binary is absent on PATH, the backend returns
+# `SandboxError::BackendUnavailable(SandboxBackend::Firecracker)` so the
+# verifier can route around it cleanly.
+firecracker = ["dynamic"]
+[lib]
+name = "nyx_scanner"
+path = "src/lib.rs"
+[[bin]]
+name = "nyx"
+path = "src/main.rs"
+[[bin]]
+name = "nyx-docgen"
+path = "tools/docgen/main.rs"
+required-features = ["docgen"]
+[[bin]]
+name = "nyx-image-builder"
+path = "tools/image-builder/main.rs"
+required-features = ["image-builder"]
+[[bench]]
+name = "scan_bench"
+harness = false
+[[bench]]
+name = "dynamic_bench"
+harness = false
+required-features = []
+[dev-dependencies]
+tempfile = "3.27.0"
+criterion = { version = "0.8.2", features = ["html_reports"] }
+assert_cmd = "2.2.2"
+predicates = "3.1.4"
+glob = "0.3.3"
+tower = { version = "0.5.3", features = ["util"] }
 [dependencies]
 directories = "6.0.0"
-clap = { version = "4.5.40", features = ["derive"] }
-serde = { version = "1.0.219", features = ["derive"] }
-toml = "0.8.23"
-tracing-subscriber = { version = "0.3.19", features = ["env-filter", "json", "ansi","time"] }
-tracing = "0.1.41"
+clap = { version = "4.6.1", features = ["derive"] }
+serde = { version = "1.0.228", features = ["derive"] }
+serde_json = "1.0.150"
+rmp-serde = "1.3.1"
+toml = "1.1.2"
+tracing-subscriber = { version = "0.3.23", features = ["env-filter", "json", "ansi","time"] }
+tracing = "0.1.44"
 num_cpus = "1.17.0"
-rusqlite = "0.36.0"
-ignore = "0.4.23"
-tree-sitter = "0.25.6"
-tree-sitter-rust = "0.24.0"
-tree-sitter-c = "0.24.1"
+rusqlite = { version = "0.39.0", features = ["bundled"] }
+r2d2_sqlite = { version = "0.34.0", features = ["bundled"] }
+ignore = "0.4.26"
+tree-sitter = "0.26.9"
+tree-sitter-rust = "0.24.2"
+tree-sitter-c = "0.24.2"
 tree-sitter-cpp = "0.23.4"
 tree-sitter-java = "0.23.5"
 tree-sitter-typescript = "0.23.2"
-tree-sitter-javascript = "0.23.1"
-tree-sitter-go = "0.23.4"
-tree-sitter-php = "0.23.11"
-tree-sitter-python = "0.23.6"
+tree-sitter-javascript = "0.25.0"
+tree-sitter-go = "0.25.0"
+tree-sitter-php = "0.24.2"
+tree-sitter-python = "0.25.0"
 tree-sitter-ruby = "0.23.1"
 crossbeam-channel = "0.5.15"
-blake3 = "1.8.2"
-once_cell = "1.21.3"
-console = "0.15.11"
-rayon = "1.10.0"
-r2d2_sqlite = "0.30.0"
+blake3 = "1.8.5"
+once_cell = "1.21.4"
+console = "0.16.3"
+terminal_size = "0.4.4"
+rayon = "1.12.0"
 r2d2 = "0.8.10"
-bytesize = "2.0.1"
-chrono = { version = "0.4.41", default-features = false, features = ["std", "clock"] }
-thiserror = "2.0.12"
-dashmap = "7.0.0-rc2"
+bytesize = "2.3.1"
+chrono = { version = "0.4.45", default-features = false, features = ["std", "clock", "serde"] }
+thiserror = "2.0.18"
+dashmap = "6.2.1"
+parking_lot = "0.12.5"
+petgraph = { version = "0.8.3", features = ["serde-1"] }
+bitflags = "2.12.1"
+phf = { version = "0.13.1", features = ["macros"] }
+indicatif = "0.18.4"
+smallvec = { version = "1.15.1", features = ["serde"] }
+rustc-hash = "2.1.2"
+uuid = { version = "1.23.2", features = ["v4"] }
+axum = { version = "0.8.9", optional = true }
+bytes = { version = "1.11.1", optional = true }
+h2 = { version = "0.4.14", optional = true }
+http = { version = "1.4.1", optional = true }
+prost = { version = "0.14.3", optional = true }
+tokio = { version = "1.52.3", features = ["rt-multi-thread", "macros", "signal", "sync", "net", "io-util"], optional = true }
+tokio-stream = { version = "0.1.18", features = ["sync"], optional = true }
+tower-http = { version = "0.6.11", features = ["cors", "compression-gzip", "trace", "set-header", "limit"], optional = true }
+z3 = { version = "0.20.0", optional = true}
+tempfile = { version = "3.27.0", optional = true }
+[lints.clippy]
+# Allowed project-wide instead of per-file. The vast majority of
+# `collapsible_if` hits are `if let Some(x) = .. { if cond { .. } }` patterns
+# whose only "fix" is to collapse into a let-chain, which hurts readability on
+# the complex extractor expressions throughout the engine. Keeping the decision
+# here means the rationale lives in one place and new files inherit it
+# automatically rather than re-declaring `#![allow(clippy::collapsible_if)]`.
+collapsible_if = "allow"
+[profile.release]
+lto = true
+codegen-units = 1
+debug = 1
+strip = "none"

226
LICENSE Normal file

@@ -0,0 +1,226 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright © 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for software and other kinds of works.
The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.
Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and modification follow.
TERMS AND CONDITIONS
0. Definitions.
“This License” refers to version 3 of the GNU General Public License.
“Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.
“The Program” refers to any copyrightable work licensed under this License. Each licensee is addressed as “you”. “Licensees” and “recipients” may be individuals or organizations.
To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a “modified version” of the earlier work or a work “based on” the earlier work.
A “covered work” means either the unmodified Program or a work based on the Program.
To “propagate” a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
To “convey” a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays “Appropriate Legal Notices” to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.
1. Source Code.
The “source code” for a work means the preferred form of the work for making modifications to it. “Object code” means any non-source form of a work.
A “Standard Interface” means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.
The “System Libraries” of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A “Major Component”, in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.
The “Corresponding Source” for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.
The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.
The Corresponding Source for a work in source code form is that same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified it, and giving a relevant date.
b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to “keep intact all notices”.
c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:
a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.
d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.
A “User Product” is either (1) a “consumer product”, which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, “normally used” refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.
“Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).
The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.
7. Additional Terms.
“Additional permissions” are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or authors of the material; or
e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.
All other non-permissive additional terms are considered “further restrictions” within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.
An “entity transaction” is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.
11. Patents.
A “contributor” is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's “contributor version”.
A contributor's “essential patent claims” are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, “control” includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.
In the following three paragraphs, a “patent license” is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To “grant” such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. “Knowingly relying” means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.
A patent license is “discriminatory” if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License “or any later version” applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program.
Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an “about box”.
You should also get your employer (if you work as a programmer) or school, if any, to sign a “copyright disclaimer” for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see <https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read <https://www.gnu.org/licenses/why-not-lgpl.html>.

89
LICENSE-GRANTS.md Normal file

@ -0,0 +1,89 @@
# Internal License Grants
This file records dual-licensing grants the copyright holder of Nyx has issued
beyond the public GPL-3.0-or-later release.
Nyx ships publicly under GPL-3.0-or-later. That license continues to apply to
every public release on GitHub, crates.io, and any other channel. The grants
recorded here are separate, private licenses from the copyright holder to
specific projects. They do not modify the public GPL terms and they are not
transferable to third parties.
The right to issue these grants is preserved in `CLA.md` Section 4
(Relicensing Right):
> [The contributor] grants the Project and any entity that maintains or
> succeeds it the right to relicense Your Contribution, in whole or in part,
> under terms other than the Project's current license (currently
> GPL-3.0-or-later), where necessary to support the long-term sustainability,
> distribution, and evolution of the Project.
The copyright holder is the sole author of every Contribution to Nyx
(verifiable via `git log`). The CLA covers any future external Contributions.
The copyright holder may therefore grant any party, including projects owned
by the same copyright holder, a license to use Nyx under terms other than
GPL-3.0-or-later, without affecting the public GPL release.
## How forks are affected
A third-party fork of nyx-agent that obtains the nyx-agent source under PolyForm
Small Business 1.0.0 (or any successor source-available license) does not
acquire any rights to Nyx beyond the public GPL-3.0-or-later terms. The
internal grant below is project-to-project and non-transferable. Anyone
redistributing a binary that statically or dynamically links the `nyx` crate
must comply with the GPL on the `nyx` portion of the work. GPL is viral
copyleft on distribution. Only the copyright holder may issue further
dual-licensing grants.
---
## Grant Register
### Grant 1: nyx-agent
| Field | Value |
|---|---|
| Grantor | Eli Peter, sole copyright holder of Nyx as of the effective date |
| Grantee | The nyx-agent project (`nyx-agent` daemon, web UI, and accompanying tooling). Repository: `nyx-agent` |
| Effective date | 2026-05-17 |
| Scope | All Nyx source code, documentation, fixtures, build artefacts, and binaries (the "Licensed Material") in any version released as of the effective date or thereafter, plus any future modifications the Grantor authors or accepts under the CLA |
| Permitted uses | (a) static or dynamic linking of the Licensed Material into the nyx-agent daemon; (b) modification of the Licensed Material as required for nyx-agent integration; (c) redistribution of the Licensed Material as part of the nyx-agent distribution; (d) sublicensing the Licensed Material to end users of nyx-agent solely under whatever license terms nyx-agent itself is distributed under (currently PolyForm Small Business 1.0.0, or a separately negotiated commercial license) |
| Restrictions | (a) this grant does not modify, supersede, or revoke the public GPL-3.0-or-later release of Nyx; (b) this grant is non-transferable; only the nyx-agent project, owned by the Grantor, may exercise it; (c) any third-party fork of nyx-agent must obtain Nyx under the public GPL terms unless it negotiates a separate grant from the Grantor; (d) attribution of Nyx authorship must be preserved in any redistribution per the CLA's moral-rights waiver |
| Duration | Perpetual and irrevocable, subject only to the Grantee maintaining ownership-or-control by the Grantor. If the nyx-agent project is sold, assigned, or otherwise transferred to a third party, this grant terminates and the new owner must negotiate a separate license |
| Sublicensing of the grant itself | Not permitted. The Grantee may distribute Nyx as part of nyx-agent to end users under nyx-agent's outward terms, but the Grantee may not grant any other project the right to use Nyx outside the public GPL terms |
| Governing law | Same as Nyx CLA |
---
## Adding future grants
New grants follow the same format as Grant 1. Append a new section
(`### Grant N: <recipient name>`) below the existing entries and commit to
the Nyx repository. Grants are append-only. Revisions land as superseding
entries with their own date, not as edits to the original.
Grants the Grantor anticipates issuing in the future include:
- Commercial-license SKU grants to individual customers of nyx-agent that
exceed the PolyForm Small Business threshold. These will be issued
per-customer under a separate Nyx Commercial License contract.
- Stewardship-transition grants if the project is ever handed off (for
example, to a foundation). These would be a single grant to the receiving
entity.
The Grantor reserves the right to refuse to issue any grant.
---
## What this file is NOT
- It is not a redistribution license. Third parties cannot rely on it to use
Nyx outside the public GPL terms.
- It is not a Contributor License Agreement. `CLA.md` covers contribution
terms separately.
- It is not a public-facing license file. The canonical public license for
Nyx is `LICENSE` (GPL-3.0-or-later).
---
Copyright (c) 2026 Eli Peter. All rights reserved.

339
README.md

@ -1,131 +1,302 @@
<div align="center">
<img src="assets/nyx-readme-header.png" alt="NYX" width="640"/>
**A local-first security scanner with sandboxed dynamic verification and a browser UI. Scan your repo and triage in your browser, with no cloud and no account.**
-# Nyx - Lightweight Multi-Language Vulnerability Scanner
[![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org)
[![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions)
[![Docs](https://img.shields.io/badge/docs-nyxscan.dev%2Fdocs-blue)](https://nyxscan.dev/docs/)
-Nyx is a lightweight Rust CLI tool for scanning code across multiple programming languages to detect potential vulnerabilities and code quality issues. It works by converting source code to Abstract Syntax Trees (ASTs), analyzing control flow graphs, performing taint analysis, and searching for common vulnerability patterns.
English · [简体中文](./README.zh-CN.md)
</div>
-## Features
-- **Fast and Lightweight**: Written in Rust for optimal performance
-- **Multi-Language Support**: Scans code in multiple programming languages
-- **AST-Based Analysis**: Uses tree-sitter for accurate code parsing
-- **Project Indexing**: Maintains an index to avoid rescanning unchanged files
-- **Configurable**: Extensive configuration options for customizing scans
-- **Multiple Output Formats**: Supports table, JSON, CSV, and SARIF output formats
-## Installation
-### From Source
<p align="center"><img src="assets/screenshots/demo.gif" alt="Nyx UI walkthrough: empty Welcome state, kicking off a scan, the populated overview with Health Score, drilling into a HIGH finding's flow visualizer, then the triage flow" width="900"/></p>
---
## Scan locally, browse locally
Nyx runs cross-language taint analysis on your repository, then verifies Medium or higher confidence findings by running small sandboxed harnesses against the real code. Results are served to a React UI bound to `127.0.0.1`. You get severity, static evidence, dynamic verdicts, and a step-by-step **flow visualiser** that walks the dataflow from source → sanitizer → sink. Triage decisions persist to `.nyx/triage.json`, which commits alongside your code so the team shares one triage state.
```bash
-# Clone the repository
-git clone https://github.com/yourusername/nyx.git
-cd nyx
-# Build the project
-cargo build --release
-# Install the binary
-cargo install --path .
cargo install nyx-scanner
nyx scan # runs the analyzer, caches findings in .nyx/
nyx serve # opens http://localhost:9700 in your browser
```
-## Usage
-### Basic Scanning
Everything stays on your machine: loopback-only bind, host-header enforcement, CSRF on every mutation, no remote telemetry, no login.
<p align="center"><img src="assets/screenshots/overview.png" alt="Overview dashboard for a small JS app: Health Score C 78 with the five-component breakdown (Severity pressure, Confidence quality, Trend, Triage coverage, Regression resistance), 3 findings detected, OWASP A03 and A02 buckets, confidence distribution and issue category bars, top affected files" width="900"/></p>
---
## What's in the UI
| Page | What it shows |
|---|---|
| **Overview** | Dashboard: finding counts by severity, top offenders, engine profile summary |
| **Findings** | Browsable list with severity badges, triage status, rule filter, language filter |
| **Finding detail** | Flow-path visualiser with numbered steps (source → sanitizer → sink), dynamic verdicts, code snippets, evidence, cross-file markers, triage dropdown |
| **Triage** | Bulk update states (open, investigating, fixed, false_positive, accepted_risk, suppressed), audit trail, import/export JSON |
| **Explorer** | File tree with per-file symbol list and finding overlay |
| **Scans** | Run history, metrics, diff two scans to see what changed |
| **Rules** | Built-in and custom rules per language; add rules from the UI |
| **Config** | Live config editor; reload without restart |
`nyx serve` flags: `--port <N>` (default `9700`), `--host <addr>` (loopback only: `127.0.0.1`, `localhost`, or `::1`), `--no-browser`. See `[server]` in `nyx.conf` for persistent settings, and the [Browser UI guide](https://nyxscan.dev/docs/serve.html) for the page-by-page UI tour and security model.
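The loopback-only rule for `--host` could be checked along the following lines. This is a minimal sketch using only the Rust standard library; the helper name `is_loopback_host` is hypothetical and is not taken from the Nyx source.

```rust
use std::net::IpAddr;

/// Returns true when `host` is the literal `localhost` or parses as a
/// loopback IP address. Hypothetical helper for illustration only.
pub fn is_loopback_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    // Accept bracketed IPv6 literals such as "[::1]" as well as bare ones.
    let trimmed = host.trim_start_matches('[').trim_end_matches(']');
    match trimmed.parse::<IpAddr>() {
        Ok(addr) => addr.is_loopback(),
        Err(_) => false,
    }
}
```

Note that `IpAddr::is_loopback` accepts the whole `127.0.0.0/8` range, which is broader than the three values listed above; a stricter check would compare against those values directly.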
---
## CLI for CI
The same engine runs headless for CI pipelines. SARIF output uploads directly to GitHub Code Scanning.
<p align="center"><img src="assets/screenshots/cli-scan.gif" alt="nyx scan console output: HIGH taint findings across a JS and Python file with source → sink arrows" width="820"/></p>
```bash
-# Scan the current directory
-nyx scan
-# Scan a specific directory
-nyx scan /path/to/project
-# Scan with specific output format
-nyx scan --format json
-# Scan only for high severity issues
-nyx scan --high-only
# Fail the job on medium or higher, emit SARIF
nyx scan --format sarif --fail-on MEDIUM > results.sarif
# Ad-hoc JSON, no index
nyx scan ./server --format json --index off
# AST patterns only (fastest; skips CFG + taint)
nyx scan --mode ast
# Engine-depth shortcut: fast | balanced (default) | deep
# `deep` adds symex + demand-driven backwards taint for higher precision at ~2-3× cost
nyx scan --engine-profile deep
```
-### Managing Project Indexes
Forward cross-file taint runs in every profile. Symex and the demand-driven backwards walk are opt-in. Turn them on either via `--engine-profile deep`, or individually (`--symex`, `--backwards-analysis`). See the [CLI reference](https://nyxscan.dev/docs/cli.html#engine-depth-profile) for the full toggle matrix.
### GitHub Action
```yaml
- uses: elicpeter/nyx@v0.8.0
with:
format: sarif
fail-on: MEDIUM
- uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: nyx-results.sarif
```
Inputs: `path`, `version`, `format` (`sarif`|`json`|`console`), `fail-on`, `args`, `token`. Outputs: `finding-count`, `sarif-file`, `exit-code`, `nyx-version`. Linux and macOS runners (x86_64, ARM64).
---
## Install
**Cargo (recommended):**
```bash
cargo install nyx-scanner
```
**Pre-built binaries:** Grab the archive for your platform from [Releases](https://github.com/elicpeter/nyx/releases), verify against `SHA256SUMS` (and the detached `SHA256SUMS.asc` GPG signature, when present), unzip, and drop `nyx` on your `PATH`.
```bash
-# Build or update index for current project
-nyx index build
-# Force rebuild index
-nyx index build --force
-# Show index status
-nyx index status
-# List all indexed projects
-nyx list
-# List all indexed projects with details
-nyx list --verbose
-# Remove a project from index
-nyx clean project-name
-# Clean all projects
-nyx clean --all
# Optional: verify the checksum file's GPG signature (when SHA256SUMS.asc is published)
gpg --verify SHA256SUMS.asc SHA256SUMS
sha256sum -c SHA256SUMS --ignore-missing
unzip nyx-x86_64-unknown-linux-gnu.zip && chmod +x nyx && sudo mv nyx /usr/local/bin/
```
-## Supported Languages
**From source:**
```bash
git clone https://github.com/elicpeter/nyx.git
cd nyx && cargo build --release
```
-Nyx currently supports scanning code in the following languages:
-- Rust
-- C
-- C++
-- Java
-- Go
-- PHP
-- Python
-- TypeScript
-- JavaScript
-## How It Works
-1. **Code Traversal**: Nyx walks through your project's directory structure, respecting ignore files and exclusion patterns.
-2. **AST Generation**: For each supported file, Nyx uses tree-sitter to parse the code into an Abstract Syntax Tree (AST).
-3. **Pattern Matching**: Nyx applies language-specific vulnerability patterns to the AST to identify potential issues.
-4. **Control Flow Analysis**: (Planned) Nyx will convert ASTs to control flow graphs for more sophisticated analysis.
-5. **Taint Analysis**: (Planned) Nyx will track the flow of untrusted data through your application.
-6. **Reporting**: Issues are reported with severity levels, file locations, and descriptions.
Requires stable Rust 1.88+. The frontend is compiled and embedded in the binary at build time, so there is no separate install step for `nyx serve`.
---
## Languages
All 10 languages parse via tree-sitter and run through the full pipeline, but rule depth and engine coverage are uneven. Benchmark F1 on the synthetic corpus at [`tests/benchmark/ground_truth.json`](tests/benchmark/ground_truth.json) is 100% across all ten languages at the last measured baseline (see [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md)), so F1 alone no longer separates the tiers. Tiering reflects rule depth, gated-sink coverage, and structural idioms the synthetic corpus does not fully stress:
| Tier | Languages | F1 | Use as a CI gate? |
|---|---|---|---|
| **Stable** | Python, JavaScript, TypeScript | 100% | Yes |
| **Beta** | Java, PHP, Ruby, Rust, Go | 100% | Yes, with light FP triage |
| **Preview** | C, C++ | 100% on synthetic corpus | No. STL container flow, builder chains, and inline class member functions are tracked, but deep pointer aliasing and function pointers are not. Pair with clang-tidy or Clang Static Analyzer |
All real-CVE fixtures fire and the corpus carries zero open FPs at the recorded baseline (P=R=F1=1.000). Per-dimension detail and known blind spots live on the [Language maturity page](https://nyxscan.dev/docs/language-maturity.html).
### Validated against real CVEs
The corpus also holds a small set of vulnerable/patched pairs extracted from published advisories, so the benchmark floor is defended by regression protection on demonstrably real bugs rather than just synthetic analogues. Nyx fires on the vulnerable file and emits zero findings on the patched file for each pair.
| CVE | Project | Language | Class |
|---|---|---|---|
| [CVE-2023-48022](https://nvd.nist.gov/vuln/detail/CVE-2023-48022) | Ray | Python | Command injection |
| [CVE-2017-18342](https://nvd.nist.gov/vuln/detail/CVE-2017-18342) | PyYAML | Python | Deserialization |
| [CVE-2019-14939](https://nvd.nist.gov/vuln/detail/CVE-2019-14939) | mongo-express | JavaScript | Code execution (`eval`) |
| [CVE-2023-22621](https://nvd.nist.gov/vuln/detail/CVE-2023-22621) | Strapi | JavaScript | Code execution (SSTI) |
| [CVE-2025-64430](https://nvd.nist.gov/vuln/detail/CVE-2025-64430) | Parse Server | JavaScript | SSRF |
| [CVE-2023-26159](https://nvd.nist.gov/vuln/detail/CVE-2023-26159) | follow-redirects | TypeScript | SSRF |
| [GHSA-4x48-cgf9-q33f](https://github.com/advisories/GHSA-4x48-cgf9-q33f) | Novu | TypeScript | SSRF |
| [CVE-2026-25544](https://nvd.nist.gov/vuln/detail/CVE-2026-25544) | Payload CMS | TypeScript | SQL injection |
| [CVE-2022-30323](https://nvd.nist.gov/vuln/detail/CVE-2022-30323) | hashicorp/go-getter | Go | Command injection |
| [CVE-2024-31450](https://nvd.nist.gov/vuln/detail/CVE-2024-31450) | owncast | Go | Path traversal |
| [CVE-2023-3188](https://nvd.nist.gov/vuln/detail/CVE-2023-3188) | owncast | Go | SSRF |
| [CVE-2026-41422](https://github.com/daptin/daptin/security/advisories/GHSA-rw2c-8rfq-gwfv) | daptin | Go | SQL injection |
| [CVE-2015-7501](https://nvd.nist.gov/vuln/detail/CVE-2015-7501) | Apache Commons Collections | Java | Deserialization |
| [CVE-2017-12629](https://nvd.nist.gov/vuln/detail/CVE-2017-12629) | Apache Solr | Java | Command injection |
| [CVE-2022-1471](https://nvd.nist.gov/vuln/detail/CVE-2022-1471) | SnakeYAML | Java | Deserialization |
| [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) | Apache Commons Text | Java | Code execution |
| [GHSA-h8cj-hpmg-636v](https://github.com/advisories/GHSA-h8cj-hpmg-636v) | Appsmith | Java | SQL injection |
| [CVE-2013-0156](https://nvd.nist.gov/vuln/detail/CVE-2013-0156) | Ruby on Rails | Ruby | Deserialization |
| [CVE-2020-8130](https://nvd.nist.gov/vuln/detail/CVE-2020-8130) | Rake | Ruby | Command injection |
| [CVE-2021-21288](https://nvd.nist.gov/vuln/detail/CVE-2021-21288) | CarrierWave | Ruby | SSRF |
| [CVE-2023-38337](https://nvd.nist.gov/vuln/detail/CVE-2023-38337) | rswag-api | Ruby | Path traversal |
| [CVE-2017-9841](https://nvd.nist.gov/vuln/detail/CVE-2017-9841) | PHPUnit | PHP | Code execution (`eval`) |
| [CVE-2018-15133](https://nvd.nist.gov/vuln/detail/CVE-2018-15133) | Laravel | PHP | Deserialization |
| [CVE-2018-20997](https://nvd.nist.gov/vuln/detail/CVE-2018-20997) | tar-rs | Rust | Path traversal |
| [CVE-2022-36113](https://nvd.nist.gov/vuln/detail/CVE-2022-36113) | cargo | Rust | Path traversal |
| [CVE-2024-24576](https://nvd.nist.gov/vuln/detail/CVE-2024-24576) | Rust stdlib | Rust | Command injection |
| [CVE-2023-42456](https://rustsec.org/advisories/RUSTSEC-2023-0069.html) | sudo-rs | Rust | Path traversal |
| [CVE-2024-32884](https://rustsec.org/advisories/RUSTSEC-2024-0335.html) | gitoxide | Rust | Command injection |
| [CVE-2025-53549](https://rustsec.org/advisories/RUSTSEC-2025-0043.html) | matrix-rust-sdk | Rust | SQL injection |
| [CVE-2016-3714](https://nvd.nist.gov/vuln/detail/CVE-2016-3714) | ImageMagick (ImageTragick) | C | Command injection |
| [CVE-2019-18634](https://nvd.nist.gov/vuln/detail/CVE-2019-18634) | sudo (pwfeedback) | C | Memory safety |
| [CVE-2019-13132](https://nvd.nist.gov/vuln/detail/CVE-2019-13132) | ZeroMQ libzmq | C++ | Memory safety |
| [CVE-2022-1941](https://nvd.nist.gov/vuln/detail/CVE-2022-1941) | Protocol Buffers | C++ | Memory safety |
| [CVE-2025-69662](https://nvd.nist.gov/vuln/detail/CVE-2025-69662) | geopandas | Python | SQL injection |
| [CVE-2026-33626](https://nvd.nist.gov/vuln/detail/CVE-2026-33626) | LMDeploy | Python | SSRF |
Fixtures live under [`tests/benchmark/cve_corpus/`](tests/benchmark/cve_corpus/) with upstream attribution headers.
<!--
### Real-world findings
- **Nextcloud server**, [PR #59979](https://github.com/nextcloud/server/pull/59979), merged. The runtime decoder for this column already restricted `allowed_classes`, but the repair routine called `unserialize()` without it, so magic methods on referenced classes could still run. Fix matches the runtime path.
-->
---
## How it works
Two passes over the filesystem, with an optional SQLite index to skip unchanged files:
```mermaid
flowchart LR
Repo["Repository files"] --> Pass1["Pass 1 per file<br/>tree-sitter, CFG, SSA"]
Pass1 --> Summaries["Function summaries<br/>sources, sinks, sanitizers, points-to"]
Summaries --> Index["SQLite index<br/>optional incremental cache"]
Index --> Pass2["Pass 2 cross-file<br/>global summaries, k=1 inline, SCC fixpoint"]
Pass2 --> Rank["Rank and dedupe<br/>severity, evidence, exploitability"]
Rank --> Verify["Dynamic verification<br/>sandboxed harnesses, verdicts"]
Verify --> Output["Console, JSON, SARIF<br/>and browser UI"]
```
1. **Pass 1**: parse each file via tree-sitter, build an intra-procedural CFG (petgraph), lower to pruned SSA (Cytron phi insertion over dominance frontiers), and export per-function summaries (source/sanitizer/sink caps, taint transforms, points-to, callees).
2. **Summary merge**: union all per-file summaries into a `GlobalSummaries` map.
3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries.
4. **Rank, dedupe, verify, emit**: findings are scored by severity × evidence strength × source-kind exploitability. Medium or higher confidence findings are dynamically verified by default, then results are emitted to console, JSON, SARIF, and the browser UI.
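The scoring rule in step 4 (severity × evidence strength × source-kind exploitability) can be pictured with a small sketch. The numeric weights, the `Finding` fields, and the function names below are all assumptions made for illustration; the README does not specify how Nyx represents or weights these factors.

```rust
/// Illustrative severity levels; the weights assigned below are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Hypothetical finding record. `evidence` and `exploitability` are
/// assumed to be normalized to the range 0.0..=1.0.
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub id: String,
    pub severity: Severity,
    pub evidence: f64,
    pub exploitability: f64,
}

fn severity_weight(s: Severity) -> f64 {
    match s {
        Severity::Low => 1.0,
        Severity::Medium => 2.0,
        Severity::High => 3.0,
    }
}

/// Multiplicative score: a weak factor pulls the whole product down.
pub fn score(f: &Finding) -> f64 {
    severity_weight(f.severity) * f.evidence * f.exploitability
}

/// Sorts findings from highest to lowest score.
pub fn rank(mut findings: Vec<Finding>) -> Vec<Finding> {
    findings.sort_by(|a, b| score(b).total_cmp(&score(a)));
    findings
}
```

With a product rather than a sum, a High-severity finding backed by weak evidence can rank below a Medium-severity finding with strong evidence, which appears to be the intent of combining the three factors.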
Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://nyxscan.dev/docs/detectors.html).
---
## Verify findings dynamically
Static analysis says a sink is reachable. Dynamic verification tries to prove it. With `--verify` (on by default), Nyx builds a small harness around each Medium-or-higher finding, runs it in a sandbox against a curated payload corpus, and stamps a verdict onto the finding.
```bash
nyx scan --verify # build + run a harness per finding (default)
nyx scan --no-verify # static analysis only, for fast local loops
```
A finding is **Confirmed** only when an attacker-controlled payload fires the sink *and* a paired benign control stays clean. That differential rule, plus behavioral oracles (a template that renders `49`, a deserializer that resolves a gadget class, a redirect that leaves the origin), keeps the verifier from confirming on an echoed string. Sinks behind a recognized guard demote to `ConfirmedWithKnownGuard`; sinks reached without a completed exploit chain land as `PartiallyConfirmed`.
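The differential rule described above might be sketched as follows. The verdict names `Confirmed`, `ConfirmedWithKnownGuard`, and `PartiallyConfirmed` come from the text; the `Unconfirmed` variant, the `Observation` fields, and the order in which the checks are applied are assumptions for illustration, not Nyx's actual logic.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Confirmed,
    ConfirmedWithKnownGuard,
    PartiallyConfirmed,
    /// Placeholder for "nothing proven"; this name is an assumption.
    Unconfirmed,
}

/// Hypothetical summary of one harness run pair (payload plus benign control).
pub struct Observation {
    pub payload_fired_sink: bool,
    pub benign_control_clean: bool,
    pub behind_known_guard: bool,
    pub exploit_chain_completed: bool,
}

pub fn classify(obs: &Observation) -> Verdict {
    // Differential rule: the payload must fire the sink AND the paired
    // benign control must stay clean, otherwise nothing is confirmed.
    if !(obs.payload_fired_sink && obs.benign_control_clean) {
        return Verdict::Unconfirmed;
    }
    // A recognized guard in front of the sink demotes the verdict.
    if obs.behind_known_guard {
        return Verdict::ConfirmedWithKnownGuard;
    }
    // The sink was reached but the exploit chain did not complete.
    if !obs.exploit_chain_completed {
        return Verdict::PartiallyConfirmed;
    }
    Verdict::Confirmed
}
```

The benign control is what separates a real exploit from an echoed string: if the control input also trips the oracle, the observation says nothing about attacker control.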
Coverage spans 18 verifiable capability classes and 120+ registered adapters across all ten languages (Flask, Django, Express, NestJS, Spring, Rails, Laravel, Gin, Axum, and more), with per-language build pools and copy-on-write workdirs to keep the per-finding cost low. Confirmed findings write a hermetic repro bundle with a `reproduce.sh`. Runs are deterministic: every payload is seeded from the spec hash.
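Seeding from a spec hash is what makes a run repeatable: the same spec always yields the same payload choice. The sketch below illustrates the idea with FNV-1a, chosen only because it is short and stable across platforms. The README does not say which hash Nyx uses or how the seed drives payload selection, so both functions are hypothetical.

```rust
/// 64-bit FNV-1a hash. Used here as a stand-in; the actual hash is not
/// documented in the README.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Picks a payload deterministically from `corpus` based on the spec text.
/// Returns None for an empty corpus.
pub fn pick_payload<'a>(spec: &str, corpus: &[&'a str]) -> Option<&'a str> {
    if corpus.is_empty() {
        return None;
    }
    let seed = fnv1a64(spec.as_bytes());
    let idx = (seed % corpus.len() as u64) as usize;
    Some(corpus[idx])
}
```

Because no clock or OS randomness is involved, two runs over the same spec and corpus select the same payload.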
```bash
# CI: fail the build if a new Confirmed finding appears vs. a baseline
nyx scan --baseline .nyx/baseline.json --gate no-new-confirmed
```
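One way to picture the `no-new-confirmed` gate is as a set difference between the Confirmed findings in the current scan and those recorded in the baseline. The sketch below assumes findings can be compared by a stable string identifier and that a failing gate maps to exit code 1; neither detail is stated in the README, and the function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Returns the IDs confirmed in `current` that are absent from `baseline`,
/// in ascending order.
pub fn new_confirmed(baseline: &BTreeSet<String>, current: &BTreeSet<String>) -> Vec<String> {
    current.difference(baseline).cloned().collect()
}

/// Assumed convention: 0 when the gate passes, 1 when a new Confirmed
/// finding appears.
pub fn gate_exit_code(baseline: &BTreeSet<String>, current: &BTreeSet<String>) -> i32 {
    if new_confirmed(baseline, current).is_empty() {
        0
    } else {
        1
    }
}
```

Findings that were already in the baseline do not fail the gate, so a team can adopt it on a codebase with existing confirmed issues and still block regressions.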
Backends: Docker (preferred, network-blocked by default) or an in-process runner with `--harden {standard,strict}`. Full matrix, oracle list, and limitations: [Dynamic verification](https://nyxscan.dev/docs/dynamic.html).
---
## Configuration
-Nyx uses a configuration system with defaults that can be overridden by a user-specific configuration file. The configuration file is located at:
-- Linux/macOS: `~/.config/nyx/nyx.local`
-- Windows: `C:\Users\<username>\AppData\Roaming\ecpeter23\nyx\config\nyx.local`
-Example configuration:
Config merges `nyx.conf` (defaults) and `nyx.local` (your overrides) from the platform config directory (`~/.config/nyx/` on Linux, `~/Library/Application Support/nyx/` on macOS, `%APPDATA%\elicpeter\nyx\config\` on Windows).
```toml
[scanner]
mode = "full" # full | ast | cfg | taint
min_severity = "Medium"
-follow_symlinks = true
-[output]
-default_format = "json"
-color_output = true
-[performance]
-worker_threads = 8
[server]
host = "127.0.0.1"
port = 9700
open_browser = true
# Project-specific sanitizer
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"
cap = "html_escape"
```
-## License
-[Add your license information here]
Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://nyxscan.dev/docs/configuration.html). Run `nyx rules list` to browse the registry from the terminal.
---
## Status
Under active development. APIs, detector behavior, and configuration options may change between releases. Rule-level F1 on the synthetic corpus is the CI regression floor; per-language detail lives in [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md).
Taint analysis is interprocedural. Persisted per-function SSA summaries carry per-return-path transforms and parameter-granularity points-to, and call-graph SCCs (including SCCs that span files) iterate to a joint fixed-point. The default `balanced` profile also runs k=1 context-sensitive inlining for intra-file callees. Symex (with cross-file and interprocedural frames) and the demand-driven backwards walk are opt-in. Enable them individually with `--symex` and `--backwards-analysis`, or together with `--engine-profile deep`.
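The capped SCC fixed-point can be illustrated with a toy model in which each function's summary is a single bit ("may return tainted data"). The cap of 64 comes from the pipeline description in this README; the data model, the function name `solve_scc`, and the convergence flag are simplifying assumptions and do not reflect Nyx's real summary lattice.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Iteration cap; the value 64 is taken from the README's Pass 2 description.
pub const MAX_ITERATIONS: usize = 64;

/// `calls` maps each function in the SCC to the functions it calls.
/// `seeds` are functions known to return tainted data directly.
/// Returns the tainted set and whether the iteration converged within the cap.
pub fn solve_scc(
    calls: &BTreeMap<&str, Vec<&str>>,
    seeds: &BTreeSet<&str>,
) -> (BTreeSet<String>, bool) {
    let mut tainted: BTreeSet<String> = seeds.iter().map(|s| s.to_string()).collect();
    for _ in 0..MAX_ITERATIONS {
        let mut changed = false;
        for (caller, callees) in calls {
            // A caller becomes tainted once any of its callees is tainted.
            if !tainted.contains(*caller) && callees.iter().any(|c| tainted.contains(*c)) {
                tainted.insert(caller.to_string());
                changed = true;
            }
        }
        if !changed {
            return (tainted, true); // fixed point reached
        }
    }
    // Cap hit: a caller would fall back to coarser summaries at this point.
    (tainted, false)
}
```

Because the tainted set only grows, this toy version always converges in at most as many rounds as there are functions; the cap matters for richer lattices where each round can refine a summary many times.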
Limitations:
- Interprocedural precision is bounded rather than unlimited. Context-sensitive inlining is k=1 with a callee body-size cap, and SCC fixed-point has an iteration cap. When the engine hits a bound it falls back to summaries and records an `engine_note` on the finding.
- Cross-language calls (FFI, subprocess, WASM) are not traversed. Each language is analysed independently.
- Several language features are not modeled: macros, most dynamic dispatch, aliased imports, reflection.
- C/C++ are preview tier. STL container flow, builder chains, and inline class member functions are tracked now; deep pointer aliasing and function pointers are not. A clean report should not be read as a clean audit. Pair with a clang-based tool before using as a hard CI gate.
- Results may contain false positives or false negatives; manual review is expected.
---
## Documentation
Browse the full docs site at **[nyxscan.dev/docs](https://nyxscan.dev/docs/)**.
- [Quick Start](https://nyxscan.dev/docs/quickstart.html) · [CLI Reference](https://nyxscan.dev/docs/cli.html) · [Installation](https://nyxscan.dev/docs/installation.html)
- [`nyx serve`](https://nyxscan.dev/docs/serve.html) · [Output Formats](https://nyxscan.dev/docs/output.html) · [Configuration](https://nyxscan.dev/docs/configuration.html) · [Dynamic verification](https://nyxscan.dev/docs/dynamic.html)
- [How it works](https://nyxscan.dev/docs/how-it-works.html) · [Detectors](https://nyxscan.dev/docs/detectors.html) ([Taint](https://nyxscan.dev/docs/detectors/taint.html), [CFG](https://nyxscan.dev/docs/detectors/cfg.html), [State](https://nyxscan.dev/docs/detectors/state.html), [AST Patterns](https://nyxscan.dev/docs/detectors/patterns.html))
- [Rule Reference](https://nyxscan.dev/docs/rules.html) · [Language Maturity](https://nyxscan.dev/docs/language-maturity.html) · [Advanced Analysis](https://nyxscan.dev/docs/advanced-analysis.html) · [Auth Analysis](https://nyxscan.dev/docs/auth.html)
---
## Contributing
-[Add contribution guidelines here]
Contributions are welcome.
Nyx is open source and will always have a fully open-source core. To support long-term development and keep the project sustainable, contributors may be asked to sign a Contributor License Agreement before their first merged contribution.
Run `sh scripts/check.sh` before submitting. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full guide, including how to add rules and support new languages. Open an issue for crashes, panics, or suspicious results; attach a minimal snippet and the Nyx version.
---
## AI Disclosure
- **Engine code** (taint, SSA, CFG, call graph, abstract interp, symbolic exec): predominantly human-written. AI was used selectively for refactors and boilerplate, with all merges human-reviewed.
- **Docs and most of this README**: AI-generated from the code and hand-edited. Report doc/code drift as a bug.
- **Test fixtures and `expected.yaml` files**: AI-assisted drafting, human-audited before landing.
- **Frontend UI** (React app): built with AI assistance, human-reviewed.
As with any static analyzer, validate findings against your own corpus before using Nyx as a CI gate.
---
## License
GNU General Public License v3.0 or later (GPL-3.0-or-later). The optional `smt` feature bundles Z3 (MIT-licensed); distributors of binaries built with `--features smt` should include Z3's license in their attribution. Full text in [LICENSE](./LICENSE); third-party dependencies in [THIRDPARTY-LICENSES.html](./THIRDPARTY-LICENSES.html).

276
README.zh-CN.md Normal file

@ -0,0 +1,276 @@
<div align="center">
<img src="assets/nyx-readme-header.png" alt="NYX" width="640"/>
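**A local-first security scanner with sandboxed dynamic verification and a browser UI. Scan a repository locally and triage the findings in your browser, with no cloud and no account.**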
[![crates.io](https://img.shields.io/crates/v/nyx-scanner.svg)](https://crates.io/crates/nyx-scanner)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org)
[![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions)
[![Docs](https://img.shields.io/badge/docs-nyxscan.dev%2Fdocs-blue)](https://nyxscan.dev/docs/)
[English](./README.md) · 简体中文
</div>
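<p align="center"><img src="assets/screenshots/demo.gif" alt="Nyx UI demo: starting a scan from the empty welcome page, viewing the overview page with its health score, drilling into the flow visualization of a HIGH finding, then the triage workflow" width="900"/></p>
---
## Scan locally, browse locally
Nyx runs cross-language taint analysis on your repository, then runs small sandboxed harnesses against medium- and high-confidence findings to verify whether the source-to-sink flow actually fires in real code. Results are served through a React UI bound to `127.0.0.1`. You see severity, static evidence, dynamic verification results, and a step-by-step **flow visualization** that walks the data flow from source → sanitizer → sink. Triage decisions persist in `.nyx/triage.json`, which is committed alongside the code so the whole team shares one triage state.
```bash
cargo install nyx-scanner
nyx scan # run the analyzer and cache findings in .nyx/
nyx serve # open http://localhost:9700 in the browser
```
Everything stays on your machine: loopback-only binding, enforced Host header validation, CSRF protection on every mutating operation, no remote telemetry, no login.
<p align="center"><img src="assets/screenshots/overview.png" alt="Overview dashboard for a small JS app: health score C 78, a five-component breakdown (severity pressure, confidence quality, trend, triage coverage, regression resistance), 3 findings, OWASP A03 and A02 categories, confidence distribution and issue-category bar charts, and the most affected files" width="900"/></p>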
---
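## What's in the UI
| Page | What it shows |
|---|---|
| **Overview** | Dashboard: finding counts by severity, hotspot files, engine profile summary |
| **Findings** | Browsable list with severity badges, triage status, rule filter, language filter |
| **Finding detail** | Flow path visualization with numbered steps (source → sanitizer → sink), dynamic verification result, code snippet, evidence, cross-file markers, triage dropdown |
| **Triage** | Bulk status updates (open, investigating, fixed, false_positive, accepted_risk, suppressed), audit log, JSON import/export |
| **Explorer** | File tree with a per-file symbol list and findings overlay |
| **Scans** | History, metrics, and a diff between two scans |
| **Rules** | Built-in and custom rules per language; rules can be added from the UI |
| **Config** | Live config editor; reloads without a restart |
`nyx serve` flags: `--port <N>` (default `9700`), `--host <addr>` (loopback only: `127.0.0.1`, `localhost`, `::1`), `--no-browser`. Persistent settings live in the `[server]` section of `nyx.conf`; a page-by-page UI tour and the security model are in the [Browser UI guide](https://nyxscan.dev/docs/serve.html).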
---
## CLI for CI
The same engine runs headless in CI pipelines. SARIF output uploads directly to GitHub Code Scanning.
<p align="center"><img src="assets/screenshots/cli-scan.gif" alt="nyx scan terminal output: HIGH taint findings in JS and Python files with source → sink arrows" width="820"/></p>
```bash
# Fail CI at medium and above, and emit SARIF
nyx scan --format sarif --fail-on MEDIUM > results.sarif
# Ad-hoc JSON, no index
nyx scan ./server --format json --index off
# AST-only mode (fastest; skips CFG + taint)
nyx scan --mode ast
# Engine-depth shortcut: fast | balanced (default) | deep
# `deep` adds symex and demand-driven backward taint: higher precision at roughly 2-3x the cost
nyx scan --engine-profile deep
```
Forward cross-file taint runs under every profile. Symex and the demand-driven backward walk are opt-in: enable both at once with `--engine-profile deep`, or individually with `--symex` and `--backwards-analysis`. The full switch matrix is in the [CLI reference](https://nyxscan.dev/docs/cli.html#engine-depth-profile).
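A `--fail-on` threshold can be pictured as a rank comparison. The sketch below is illustrative only and is not Nyx's implementation: `severity_rank` and `should_fail` are hypothetical helper names, and the three-level scale (LOW, MEDIUM, HIGH) is an assumption.

```bash
#!/usr/bin/env bash
# Illustrative only: hypothetical helpers, not part of the nyx CLI.

# Map a severity name to a numeric rank (assumed scale: LOW < MEDIUM < HIGH).
severity_rank() {
  case "$1" in
    LOW)    echo 1 ;;
    MEDIUM) echo 2 ;;
    HIGH)   echo 3 ;;
    *)      echo 0 ;;   # unknown name
  esac
}

# Exit 0 when a finding's severity meets or exceeds the threshold.
# Unknown names on either side never trip the gate.
should_fail() {
  local f t
  f="$(severity_rank "$1")"
  t="$(severity_rank "$2")"
  [ "$f" -gt 0 ] && [ "$t" -gt 0 ] && [ "$f" -ge "$t" ]
}
```

With a `MEDIUM` threshold, `should_fail HIGH MEDIUM` and `should_fail MEDIUM MEDIUM` succeed, while `should_fail LOW MEDIUM` does not.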
### GitHub Action
```yaml
- uses: elicpeter/nyx@v0.8.0
  with:
    format: sarif
    fail-on: MEDIUM
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: nyx-results.sarif
```
Inputs: `path`, `version`, `format` (`sarif`|`json`|`console`), `fail-on`, `args`, `token`. Outputs: `finding-count`, `sarif-file`, `exit-code`, `nyx-version`. Supports Linux and macOS runners (x86_64, ARM64).
---
## Installation
**Cargo (recommended):**
```bash
cargo install nyx-scanner
```
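**Prebuilt binaries:** Download the archive for your platform from [Releases](https://github.com/elicpeter/nyx/releases), verify it against `SHA256SUMS` (and the accompanying `SHA256SUMS.asc` GPG signature, when provided), extract it, and put `nyx` on your `PATH`.
```bash
# Optional: verify the GPG signature on the checksum file (when SHA256SUMS.asc is published)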
gpg --verify SHA256SUMS.asc SHA256SUMS
sha256sum -c SHA256SUMS --ignore-missing
unzip nyx-x86_64-unknown-linux-gnu.zip && chmod +x nyx && sudo mv nyx /usr/local/bin/
```
**Build from source:**
```bash
git clone https://github.com/elicpeter/nyx.git
cd nyx && cargo build --release
```
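Requires stable Rust 1.88+. The frontend is bundled into the binary at build time, so `nyx serve` needs no separate install step.
---
## Language support
All 10 languages are parsed with tree-sitter and run through the full pipeline, but rule depth and engine coverage are uneven. On the synthetic corpus in [`tests/benchmark/ground_truth.json`](tests/benchmark/ground_truth.json), all ten languages scored an F1 of 100% in the most recent baseline measurement (see [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md)), so F1 alone no longer separates the tiers. The tiers reflect rule depth, gated-sink coverage, and structural idioms the synthetic corpus does not fully exercise:
| Tier | Languages | F1 | Suitable as a CI gate? |
|---|---|---|---|
| **Stable** | Python, JavaScript, TypeScript | 100% | Yes |
| **Beta** | Java, PHP, Ruby, Rust, Go | 100% | Yes, with light FP triage |
| **Preview** | C, C++ | 100% on the synthetic corpus | No. STL container flows, builder chains, and inline class member functions are tracked; deep pointer aliasing and function pointers are not yet covered. Pair with clang-tidy or the Clang Static Analyzer |
Every real-CVE case fires, and the corpus has no open FPs at the recorded baseline (P=R=F1=1.000). Per-dimension details and known blind spots are on the [language maturity page](https://nyxscan.dev/docs/language-maturity.html).
### Validated against real CVEs
The corpus also includes a small set of vulnerable/patched pairs drawn from public advisories, so the benchmark floor is guarded not only by synthetic isomorphic test cases but also by regression protection against real bugs. For every pair, Nyx fires on the vulnerable file and reports nothing on the patched file.
| CVE | Project | Language | Class |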
|---|---|---|---|
| [CVE-2023-48022](https://nvd.nist.gov/vuln/detail/CVE-2023-48022) | Ray | Python | Command injection |
| [CVE-2017-18342](https://nvd.nist.gov/vuln/detail/CVE-2017-18342) | PyYAML | Python | Deserialization |
| [CVE-2019-14939](https://nvd.nist.gov/vuln/detail/CVE-2019-14939) | mongo-express | JavaScript | Code execution (`eval`) |
| [CVE-2023-22621](https://nvd.nist.gov/vuln/detail/CVE-2023-22621) | Strapi | JavaScript | Code execution (SSTI) |
| [CVE-2025-64430](https://nvd.nist.gov/vuln/detail/CVE-2025-64430) | Parse Server | JavaScript | SSRF |
| [CVE-2023-26159](https://nvd.nist.gov/vuln/detail/CVE-2023-26159) | follow-redirects | TypeScript | SSRF |
| [GHSA-4x48-cgf9-q33f](https://github.com/advisories/GHSA-4x48-cgf9-q33f) | Novu | TypeScript | SSRF |
| [CVE-2026-25544](https://nvd.nist.gov/vuln/detail/CVE-2026-25544) | Payload CMS | TypeScript | SQL injection |
| [CVE-2022-30323](https://nvd.nist.gov/vuln/detail/CVE-2022-30323) | hashicorp/go-getter | Go | Command injection |
| [CVE-2024-31450](https://nvd.nist.gov/vuln/detail/CVE-2024-31450) | owncast | Go | Path traversal |
| [CVE-2023-3188](https://nvd.nist.gov/vuln/detail/CVE-2023-3188) | owncast | Go | SSRF |
| [CVE-2026-41422](https://github.com/daptin/daptin/security/advisories/GHSA-rw2c-8rfq-gwfv) | daptin | Go | SQL injection |
| [CVE-2015-7501](https://nvd.nist.gov/vuln/detail/CVE-2015-7501) | Apache Commons Collections | Java | Deserialization |
| [CVE-2017-12629](https://nvd.nist.gov/vuln/detail/CVE-2017-12629) | Apache Solr | Java | Command injection |
| [CVE-2022-1471](https://nvd.nist.gov/vuln/detail/CVE-2022-1471) | SnakeYAML | Java | Deserialization |
| [CVE-2022-42889](https://nvd.nist.gov/vuln/detail/CVE-2022-42889) | Apache Commons Text | Java | Code execution |
| [GHSA-h8cj-hpmg-636v](https://github.com/advisories/GHSA-h8cj-hpmg-636v) | Appsmith | Java | SQL injection |
| [CVE-2013-0156](https://nvd.nist.gov/vuln/detail/CVE-2013-0156) | Ruby on Rails | Ruby | Deserialization |
| [CVE-2020-8130](https://nvd.nist.gov/vuln/detail/CVE-2020-8130) | Rake | Ruby | Command injection |
| [CVE-2021-21288](https://nvd.nist.gov/vuln/detail/CVE-2021-21288) | CarrierWave | Ruby | SSRF |
| [CVE-2023-38337](https://nvd.nist.gov/vuln/detail/CVE-2023-38337) | rswag-api | Ruby | Path traversal |
| [CVE-2017-9841](https://nvd.nist.gov/vuln/detail/CVE-2017-9841) | PHPUnit | PHP | Code execution (`eval`) |
| [CVE-2018-15133](https://nvd.nist.gov/vuln/detail/CVE-2018-15133) | Laravel | PHP | Deserialization |
| [CVE-2018-20997](https://nvd.nist.gov/vuln/detail/CVE-2018-20997) | tar-rs | Rust | Path traversal |
| [CVE-2022-36113](https://nvd.nist.gov/vuln/detail/CVE-2022-36113) | cargo | Rust | Path traversal |
| [CVE-2024-24576](https://nvd.nist.gov/vuln/detail/CVE-2024-24576) | Rust stdlib | Rust | Command injection |
| [CVE-2023-42456](https://rustsec.org/advisories/RUSTSEC-2023-0069.html) | sudo-rs | Rust | Path traversal |
| [CVE-2024-32884](https://rustsec.org/advisories/RUSTSEC-2024-0335.html) | gitoxide | Rust | Command injection |
| [CVE-2025-53549](https://rustsec.org/advisories/RUSTSEC-2025-0043.html) | matrix-rust-sdk | Rust | SQL injection |
| [CVE-2016-3714](https://nvd.nist.gov/vuln/detail/CVE-2016-3714) | ImageMagick (ImageTragick) | C | Command injection |
| [CVE-2019-18634](https://nvd.nist.gov/vuln/detail/CVE-2019-18634) | sudo (pwfeedback) | C | Memory safety |
| [CVE-2019-13132](https://nvd.nist.gov/vuln/detail/CVE-2019-13132) | ZeroMQ libzmq | C++ | Memory safety |
| [CVE-2022-1941](https://nvd.nist.gov/vuln/detail/CVE-2022-1941) | Protocol Buffers | C++ | Memory safety |
| [CVE-2025-69662](https://nvd.nist.gov/vuln/detail/CVE-2025-69662) | geopandas | Python | SQL injection |
| [CVE-2026-33626](https://nvd.nist.gov/vuln/detail/CVE-2026-33626) | LMDeploy | Python | SSRF |
Case files live in [`tests/benchmark/cve_corpus/`](tests/benchmark/cve_corpus/), each with an upstream attribution header comment.
---
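## How it works
A two-pass scan over the file system, with an optional SQLite index that skips unchanged files:
1. **Pass 1**: parse each file with tree-sitter, build an intraprocedural CFG (petgraph), lower it to pruned SSA (Cytron phi insertion on dominance frontiers), and export per-function summaries (source/sanitizer/sink capability bits, taint transforms, points-to sets, callee sets).
2. **Summary merge**: union the per-file summaries into a `GlobalSummaries` map.
3. **Pass 2**: re-analyze each file with cross-file context and bounded context sensitivity (k=1 inlining of intra-file callees, an SCC fixed-point cap of 64 iterations, and summary fallback for callees above the inline body-size threshold). A forward dataflow worklist propagates taint over the SSA lattice and is guaranteed to converge. Call-graph SCCs iterate to a fixed point (within the cap) so mutually recursive functions get accurate summaries.
4. **Rank, dedupe, dynamically verify, emit**: score by severity × evidence strength × source-class exploitability. The default build dynamically verifies medium- and high-confidence findings, then emits to the console, JSON, SARIF, and the browser UI.
Detector families: taint (cross-file source→sink, including SQLi, XSS, command/code execution, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header/response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the capability-bit rules folded in from auth), CFG structure (missing auth, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), and AST patterns (tree-sitter structural matching). Full detector docs: [Detectors](https://nyxscan.dev/docs/detectors.html).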
---
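## Dynamic verification
Static analysis says a sink is reachable from a source. Dynamic verification tries to prove that the path actually fires in real code. It is on in the default build: `nyx scan` generates harnesses for medium- and high-confidence findings, runs them in a sandbox with curated payloads, and writes the result to `evidence.dynamic_verdict`.
```bash
nyx scan --verify # explicit form of the default behavior
nyx scan --no-verify # static analysis only, for fast local loops
```
`Confirmed` appears only when the attack payload triggers the sink and the matching benign control stays clean. `NotConfirmed` means the harness ran to completion without triggering; it does not mean the finding is closed. The full capability matrix, backends, and limitations are in [Dynamic verification](https://nyxscan.dev/docs/dynamic.html).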
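The `Confirmed` rule is a two-sided check: the attack must fire and the benign control must not. A minimal sketch of that condition, assuming the two observations are already available as `yes`/`no` flags (`is_confirmed` is a hypothetical helper, not a nyx command, and the other verdicts are not modeled here):

```bash
# Illustrative only: `is_confirmed` is a hypothetical helper, not a nyx command.
# Arguments are "yes"/"no" flags:
#   $1 - did the attack payload trigger the sink?
#   $2 - did the benign control also trigger it?
is_confirmed() {
  local attack_fired="$1" control_fired="$2"
  [ "$attack_fired" = "yes" ] && [ "$control_fired" = "no" ]
}
```

Requiring the control to stay clean is what keeps a sink that fires on every input from being counted as a confirmed exploit.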
---
## Configuration
Configuration is merged from `nyx.conf` (defaults) and `nyx.local` (your overrides), read from the platform config directory (`~/.config/nyx/` on Linux, `~/Library/Application Support/nyx/` on macOS, `%APPDATA%\elicpeter\nyx\config\` on Windows).
```toml
[scanner]
mode = "full" # full | ast | cfg | taint
min_severity = "Medium"
[server]
host = "127.0.0.1"
port = 9700
open_browser = true
# Project-specific sanitizer
[[analysis.languages.javascript.rules]]
matchers = ["escapeHtml"]
kind = "sanitizer"
cap = "html_escape"
```
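Or add a rule interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Capabilities (caps): `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://nyxscan.dev/docs/configuration.html). Run `nyx rules list` to browse the registry in the terminal.
---
## Status
Under active development. APIs, detector behavior, and config options may change between releases. Rule-level F1 on the synthetic corpus is the CI regression floor; per-language details are in [`tests/benchmark/RESULTS.md`](tests/benchmark/RESULTS.md).
Taint analysis is interprocedural. Persisted per-function SSA summaries carry per-return-path transforms and parameter-granularity points-to sets, and call-graph SCCs (including cross-file SCCs) iterate to a joint fixed point. The default `balanced` profile also applies k=1 context-sensitive inlining to intra-file callees. Symex (with cross-file and interprocedural frames) and the demand-driven backward walk are opt-in: enable them individually with `--symex` and `--backwards-analysis`, or together with `--engine-profile deep`.
Limitations:
- Interprocedural precision is bounded, not unlimited. Context-sensitive inlining is k=1 with a callee body-size cap, and the SCC fixed point has an iteration cap. When the engine hits a cap it falls back to summaries and records an `engine_note` on the finding.
- Calls are not tracked across languages (FFI, subprocesses, WASM). Each language is analyzed independently.
- Several language features are not modeled: macros, most dynamic dispatch, aliased imports, reflection.
- C/C++ is in the Preview tier. STL container flows, builder chains, and inline class member functions are tracked; deep pointer aliasing and function pointers are not. A clean report should not be read as a clean audit. Pair with clang-based tooling before using it as a hard CI gate.
- Results may contain false positives or false negatives; expect to review them by hand.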
---
## Documentation
Full documentation site: **[nyxscan.dev/docs](https://nyxscan.dev/docs/)**.
- [Quick Start](https://nyxscan.dev/docs/quickstart.html) · [CLI Reference](https://nyxscan.dev/docs/cli.html) · [Installation](https://nyxscan.dev/docs/installation.html)
- [`nyx serve`](https://nyxscan.dev/docs/serve.html) · [Output Formats](https://nyxscan.dev/docs/output.html) · [Configuration](https://nyxscan.dev/docs/configuration.html)
- [How it works](https://nyxscan.dev/docs/how-it-works.html) · [Detectors](https://nyxscan.dev/docs/detectors.html) ([Taint](https://nyxscan.dev/docs/detectors/taint.html), [CFG](https://nyxscan.dev/docs/detectors/cfg.html), [State](https://nyxscan.dev/docs/detectors/state.html), [AST Patterns](https://nyxscan.dev/docs/detectors/patterns.html))
- [Rule Reference](https://nyxscan.dev/docs/rules.html) · [Language Maturity](https://nyxscan.dev/docs/language-maturity.html) · [Advanced Analysis](https://nyxscan.dev/docs/advanced-analysis.html) · [Auth Analysis](https://nyxscan.dev/docs/auth.html)
---
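## Contributing
Contributions are welcome.
Nyx is open source and will always have a fully open-source core. To support long-term development and keep the project sustainable, contributors may be asked to sign a Contributor License Agreement before their first merged contribution.
Run `sh scripts/check.sh` before submitting. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full guide, including how to add rules and support new languages. Open an issue for crashes, panics, or suspicious results; attach a minimal snippet and the Nyx version.
---
## AI Disclosure
- **Engine code** (taint, SSA, CFG, call graph, abstract interp, symbolic exec): predominantly human-written. AI was used selectively for refactors and boilerplate, with all merges human-reviewed.
- **Docs and most of this README**: AI-generated from the code and hand-edited. Report doc/code drift as a bug.
- **Test fixtures and `expected.yaml` files**: AI-assisted drafting, human-audited before landing.
- **Frontend UI** (React app): built with AI assistance, human-reviewed.
As with any static analyzer, validate findings against your own corpus before using Nyx as a CI gate.
---
## License
GNU General Public License v3.0 or later (GPL-3.0-or-later). The optional `smt` feature bundles Z3 (MIT-licensed); distributors of binaries built with `--features smt` should include Z3's license in their attribution. Full text in [LICENSE](./LICENSE); third-party dependencies in [THIRDPARTY-LICENSES.html](./THIRDPARTY-LICENSES.html).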

94
RELEASE_CHECKLIST.md Normal file

@ -0,0 +1,94 @@
# Release checklist: 0.8.0 (dynamic verification)
Maintainer-facing gate for cutting `0.8.0`. The release ships the dynamic
verifier (Tracks J through S of `.pitboss/play/plan.md`). Sign-off requires
every row below green, and every CI matrix row green for at least three
consecutive runs on `master`.
Legend: `[x]` verified locally on the dev reference machine, `[ ]` still to be
confirmed by CI (must hold for three consecutive runs before tagging).
## Cross-cutting invariants
- [x] `cargo check --no-default-features --features serve` green.
- [x] `cargo check --features dynamic` green.
- [x] `cargo nextest run --features dynamic` green: 6545 passed, 0 failed, 16 skipped.
- [x] Determinism: every payload RNG seeds from `spec.spec_hash`; oracle canaries derive from `BLAKE3(spec_hash || run_nonce)`. `scripts/check_no_unseeded_rand.sh` audits the tree.
- [x] Observability: each new code path emits a `VerifyTrace` event and a typed `Inconclusive` / `Unsupported` reason.
- [x] Security: every sink-under-test routes through `src/dynamic/policy.rs` deny rules; no phase weakened the seccomp / `.sb` profile sets.
- [ ] Performance: default `nyx scan` (no `--verify`) latency does not regress.
## Ship gates (`scripts/m7_ship_gate.sh`)
- [x] Gate 1: static-only scan green on `tests/benchmark/corpus`.
- [x] Gate 2: `cargo nextest run --features dynamic` green (covers Gate 4 + Gate 5 binaries).
- [x] Gate 3: with-verify / static-only wall-clock ratio <= 1.5x on `benches/fixtures/`.
- [x] Gate 4: SARIF schema validation on every dynamic verdict variant.
- [x] Gate 5: layering boundary test green.
- [ ] Gate 6: Java OWASP Benchmark v1.2 `--verify` acceptance (wall-clock <= 15 min CI, per-cap precision >= 0.85 / recall >= 0.40, per-`(cap, lang)` budget). Self-skips without `NYX_OWASP_CORPUS`.
- [ ] Gate 7: NodeGoat + Juice Shop acceptance. Self-skips without `NYX_NODEGOAT_CORPUS` / `NYX_JUICESHOP_CORPUS`.
- [ ] Gate 8: RailsGoat / DVWA / DVPWA / gosec / RustSec acceptance. Self-skips without the matching `NYX_*_CORPUS`.
Gates 6 through 8 run against real corpora that are not vendored into the repo.
They are enforced in the `eval` workflow with the corpora cached on the CI
runner. Locally they self-skip with a clear message.
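
Gate 3's budget is a plain ratio check. A minimal sketch, assuming both timings are available as whole milliseconds; `within_verify_budget` is a hypothetical helper, and the real check lives in `scripts/m7_ship_gate.sh`, whose internals are not shown here:

```bash
# Illustrative only: hypothetical helper, not taken from scripts/m7_ship_gate.sh.
# $1 - static-only wall-clock time in milliseconds
# $2 - with-verify wall-clock time in milliseconds
within_verify_budget() {
  local static_ms="$1" verify_ms="$2"
  # verify/static <= 1.5 is the same as verify*10 <= static*15,
  # which avoids floating point in the shell.
  [ "$static_ms" -gt 0 ] && [ $(( verify_ms * 10 )) -le $(( static_ms * 15 )) ]
}
```

For a 1000 ms static-only scan, a 1500 ms with-verify run passes and a 1501 ms run does not.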
## CI matrix rows (must be green for three consecutive runs)
`ci.yml`:
- [ ] frontend, rustfmt, clippy-stable, cargo-deny, unused-deps, third-party-licenses
- [ ] docs-fresh (`nyx-docgen` output committed), rustdoc
- [ ] rust-beta-build, msrv
- [ ] rust-stable-test-linux-without-docker, rust-stable-test-linux-with-docker (`cargo nextest run --all-features`)
`dynamic.yml` (each runs `cargo nextest run --features dynamic`):
- [ ] linux-process-only
- [ ] linux-with-docker
- [ ] macos
`eval.yml`:
- [ ] owasp (Gate 6)
- [ ] jsts matrix: nodegoat, juiceshop (Gate 7)
- [ ] polyglot matrix: railsgoat, dvwa, dvpwa, gosec, rustsec (Gate 8)
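
The "three consecutive runs" rule can be checked mechanically once the run conclusions are listed oldest to newest. A minimal sketch; `last_three_green` is a hypothetical helper, and the `success` / `failure` strings are assumed conclusion names rather than something this checklist specifies:

```bash
# Illustrative only: hypothetical helper. Arguments are run conclusions,
# oldest first; the strings "success" / "failure" are assumptions.
last_three_green() {
  [ "$#" -ge 3 ] || return 1          # fewer than three runs cannot qualify
  local recent=("${@: -3}")           # the three most recent conclusions
  local r
  for r in "${recent[@]}"; do
    [ "$r" = "success" ] || return 1
  done
}
```

`last_three_green failure success success success` succeeds; `last_three_green success success failure` does not.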
## Docs and metadata
- [x] `Cargo.toml` version bumped to `0.8.0`; `Cargo.lock` regenerated.
- [x] `docs/dynamic.md` rewritten: cap x lang matrix, framework adapter table, oracle table, performance budgets, limitations.
- [x] `README.md` dynamic verification section + docs link.
- [x] `CHANGELOG.md` `[0.8.0]` entry covers Tracks J through S.
- [x] Stray version strings updated (README GitHub Action pin, telemetry doc example).
## Known limitations carried into 0.8.0
These are documented in `docs/dynamic.md` and accepted for the MVP. They are
not release blockers, but the release notes should not overstate the verifier.
- **Guarded-sink over-confirmation (resolved on `dynamic`).** The synthesized
harness now drives the finding's enclosing entry function when one is
derivable, routing the payload to the tainted parameter, so a guard that
lives in the caller (an `Object.create(null)` merge target, an allowlisting
`resolveClass`, a const-name check before `Marshal.load`) runs first and
participates in the verdict. The build-time entry-vs-sink choice is recorded
on the verify trace as `entry_invocation`. When no enclosing entry can be
derived the harness falls back to driving the sink directly, which can still
over-confirm a guard it never executes. On the in-house fixture set the
verify scan now confirms the 8 genuine vulnerabilities and reads
`NotConfirmed` on all 4 negative-control files.
- **In-house confirmed rate is modest.** A `--verify` scan of
`tests/dynamic_fixtures` (process backend) lands 8 Confirmed / 15
NotConfirmed / 115 Inconclusive / 137 Unsupported of 275. The Unsupported
bulk is `SoundOracleUnavailable` (ENV_VAR / SHELL_ESCAPE / URL_ENCODE source
and sanitizer caps, correct by design); the Inconclusive bulk is
`SpecDerivationFailed` on benign and scaffolding fixtures with no derivable
flow. The authoritative confirmed / precision / recall numbers come from the
real-corpus gates (6 through 8), which require the corpora.
- **Real-corpus gates unverified locally.** Gates 6 through 8 self-skip without
`NYX_*_CORPUS`. The >= 40% confirmed and >= 0.85 precision targets are
enforced only in the `eval` workflow.
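
As a quick sanity check on the in-house numbers above, the four verdict counts sum to the stated total and the confirmed share works out to just under 3%. The variable names below are illustrative:

```bash
# Verdict counts from the tests/dynamic_fixtures run quoted above.
confirmed=8 not_confirmed=15 inconclusive=115 unsupported=137

total=$(( confirmed + not_confirmed + inconclusive + unsupported ))
echo "total=${total}"            # prints: total=275

# Confirmed share as a percentage, one decimal place.
rate="$(LC_ALL=C awk -v c="$confirmed" -v t="$total" 'BEGIN { printf "%.1f", 100 * c / t }')"
echo "confirmed=${rate}%"        # prints: confirmed=2.9%
```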
## Tag
- [ ] Three consecutive green CI runs on `master` confirmed.
- [ ] Real-corpus gates (6 through 8) green in the `eval` workflow with corpora wired.
- [ ] `git tag v0.8.0` and push; `release-build.yml` publishes the binaries and `SHA256SUMS`.

23
ROADMAP.md Normal file

@ -0,0 +1,23 @@
# Roadmap
## Now: recall and precision on real codebases
The current focus is straightforward. Run Nyx against real open-source repositories and real CVEs, then close the gap between what it finds and what it should find.
That means:
- **Recall.** Pick CVEs with public fixes. Reproduce them on the vulnerable commit. If Nyx misses, figure out why (missing source, missing sink, lost flow across a call, dropped at a sanitizer that was not actually a sanitizer) and fix the underlying analysis, not the fixture.
- **Precision.** Triage the noise on large repos (phpMyAdmin, Nextcloud, and others). Each false positive gets reduced to a pattern: receiver-type gate, non-crypto context for `md5`/`sha1`, type-safe sink suppression, etc. Land the gate, re-run the corpus, confirm the count drops without taking real bugs with it.
- **Corpus discipline.** Every fix lands with a fixture (positive or negative) and a corpus row. Rule-level F1 on `tests/benchmark/corpus/` is the scoreboard. CI floors only ratchet up.
The scanner internals (SSA, cross-file summaries, abstract interpretation, symbolic execution, auth analysis) are in place. They get refined in service of the recall/precision work, not extended for their own sake.
## Later: dynamic capability
Static analysis confirms a flow exists. Dynamic execution confirms it fires. The plan is a local sandbox that picks up entry points Nyx already identifies, builds a harness, injects a payload, and watches for the crash or shell. Pairs naturally with fuzzing (libFuzzer, cargo-fuzz, go-fuzz, HTTP) where the static engine picks the targets.
Not started. Lands after the static side is honest on real corpora.
## Later still: reasoning layer
Embeddings for cross-codebase pattern similarity. LLM-assisted detection for logic bugs that resist taint modeling. Automated exploit refinement loops. All speculative until the foundation is solid.

88
SECURITY.md Normal file

@ -0,0 +1,88 @@
# Security Policy
## Reporting a vulnerability
Report privately. Do not open a public GitHub issue for a security bug.
Use [GitHub Security Advisories](https://github.com/elicpeter/nyx/security/advisories/new) to file a private report. Only the maintainers see it.
Include:
- Affected version (`nyx --version`) and OS
- Reproduction steps or a minimal PoC
- Impact (RCE, file read or write, sandbox escape, auth bypass in `nyx serve`, etc.)
- Whether you have a fix in mind
You'll get an acknowledgement within 3 business days, and a status update every 7 days until the issue is closed.
## Scope
In scope: bugs that let untrusted input reach the Nyx process and cause harm.
- Code execution in the scanner: parser exploits, deserialization, command injection in helpers, custom-rule sandbox escape.
- Path traversal or arbitrary file access outside the target repo.
- `nyx serve` issues: auth bypass, host-header bypass, CSRF on mutating routes, XSS in the UI, cross-origin access from a non-loopback origin.
- Memory safety bugs in any unsafe Rust we introduce.
- Tampering with `.nyx/` triage state from outside the user's repo.
- Supply chain issues affecting published `nyx-scanner` crates or release artifacts.
Out of scope:
- False positives or missed detections in scan output. File a regular GitHub issue with the rule ID and a fixture.
- Findings Nyx reports against your own code. That's the scanner working, not a Nyx vulnerability.
- Anything requiring physical or local-account access to the user's machine.
- Self-XSS and missing security headers on `127.0.0.1` endpoints. The UI is loopback-only.
- Performance pathologies on hostile input (a 50 GB file, deeply nested grammars). We harden where we can.
- Issues only reachable by a user editing their own `nyx.conf` to weaken defaults.
## Supported versions
| Version | Status |
|---------|-----------------------|
| 0.7.x | Supported |
| 0.6.x | Critical fixes only |
| < 0.6 | End of life |
The project follows [Semantic Versioning](https://semver.org) once it reaches 1.0.0. Until then, breaking changes can land in any minor release.
## Severity
We use [CVSS 3.1](https://www.first.org/cvss/v3.1/specification-document) to rate reports.
| Severity | Examples |
|----------|-----------------------------------------------------------------------------------------------|
| Critical | Unauthenticated RCE in `nyx serve`, custom-rule sandbox escape during a default scan |
| High | Auth bypass against `nyx serve`, arbitrary file write outside the repo |
| Medium | Stored XSS in the UI, CSRF on a mutating route, host-header bypass |
| Low | Information disclosure with no privilege change, log-injection, denial of service via input |
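
For reference, the CVSS 3.1 specification maps base scores to qualitative ratings as None (0.0), Low (0.1-3.9), Medium (4.0-6.9), High (7.0-8.9), and Critical (9.0-10.0). A minimal sketch of that mapping, assuming reports are rated on the standard scale; `cvss_rating` is a hypothetical helper name:

```bash
# Illustrative only: maps a CVSS 3.1 base score (0.0-10.0, one decimal)
# to its qualitative rating per the specification's rating scale.
cvss_rating() {
  LC_ALL=C awk -v s="$1" 'BEGIN {
    if      (s == 0)  print "None"
    else if (s < 4.0) print "Low"
    else if (s < 7.0) print "Medium"
    else if (s < 9.0) print "High"
    else              print "Critical"
  }'
}
```

For example, `cvss_rating 9.8` prints `Critical` and `cvss_rating 4.0` prints `Medium`.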
## Disclosure
Coordinated disclosure.
1. We confirm the report and assign severity.
2. We request a CVE through GitHub or MITRE.
3. A fix is developed on a private branch, with backports to supported lines if needed.
4. A new release ships on crates.io and a public advisory goes out.
5. The reporter is credited in the advisory and the changelog, unless they ask to stay anonymous.
Target window from report to fix is 90 days. If you need to publish on a shorter timeline, tell us in the report and we'll work toward it.
## Safe harbor
Good-faith security research is welcome. We won't pursue legal action against researchers who:
- Report privately and give a reasonable window before publishing.
- Test against their own installations, not third-party deployments running Nyx.
- Avoid data destruction, account takeover, and service disruption.
- Stop and reach out if a test starts to affect data or systems they don't own.
If you're not sure whether a test is in scope, ask first.
## Bounty
There is no paid bug bounty program. Credit, a thank-you in the advisory, and a mention in the changelog are what we offer today.
## Security model recap
Nyx runs locally. The browser UI binds to `127.0.0.1` by default, requires a matching `Host` header, and uses a CSRF token on every mutating request. There is no login, no telemetry, and no remote control plane. If you find a way around any of those defaults, that's a security issue and we want to hear about it.
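
To make the `Host` header requirement concrete, here is a minimal sketch of a loopback allowlist. It is illustrative only and is not Nyx's implementation: `is_loopback_host` is a hypothetical helper, and the accepted forms (`127.0.0.1`, `localhost`, `[::1]`, each with an optional port) are an assumption about what "matching" means.

```bash
# Illustrative only: hypothetical helper, not the check nyx serve performs.
# Accepts a Host header value naming a loopback address, with an optional
# numeric port; everything else is rejected.
is_loopback_host() {
  local re='^(127\.0\.0\.1|localhost|\[::1\])(:[0-9]{1,5})?$'
  [[ "$1" =~ $re ]]
}
```

Anchoring the pattern at both ends matters: a value such as `127.0.0.1.evil.example` must not pass just because it starts with a loopback address.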

6498
THIRDPARTY-LICENSES.html Normal file

File diff suppressed because it is too large

70
about.hbs Normal file

@ -0,0 +1,70 @@
<html>
<head>
<style>
@media (prefers-color-scheme: dark) {
body {
background: #333;
color: white;
}
a {
color: skyblue;
}
}
.container {
font-family: sans-serif;
max-width: 800px;
margin: 0 auto;
}
.intro {
text-align: center;
}
.licenses-list {
list-style-type: none;
margin: 0;
padding: 0;
}
.license-used-by {
margin-top: -10px;
}
.license-text {
max-height: 200px;
overflow-y: scroll;
white-space: pre-wrap;
}
</style>
</head>
<body>
<main class="container">
<div class="intro">
<h1>Third Party Licenses</h1>
<p>This page lists the licenses of the projects used in cargo-about.</p>
</div>
<h2>Overview of licenses:</h2>
<ul class="licenses-overview">
{{#each overview}}
<li><a href="#{{id}}">{{name}}</a> ({{count}})</li>
{{/each}}
</ul>
<h2>All license text:</h2>
<ul class="licenses-list">
{{#each licenses}}
<li class="license">
<h3 id="{{id}}">{{name}}</h3>
<h4>Used by:</h4>
<ul class="license-used-by">
{{#each used_by}}
<li><a href="{{#if crate.repository}} {{crate.repository}} {{else}} https://crates.io/crates/{{crate.name}} {{/if}}">{{crate.name}} {{crate.version}}</a></li>
{{/each}}
</ul>
<pre class="license-text">{{text}}</pre>
</li>
{{/each}}
</ul>
</main>
</body>
</html>

80
about.toml Normal file

@ -0,0 +1,80 @@
# Pin the target triples scanned so `cargo about generate` produces the
# same output regardless of host OS. Must match the release build matrix
# in .github/workflows/release-build.yml — otherwise the CI diff step
# (third-party-licenses) will fail on platform-specific crates like
# linux-raw-sys, android_system_properties, etc.
targets = [
"x86_64-unknown-linux-gnu",
"aarch64-unknown-linux-gnu",
"x86_64-pc-windows-msvc",
"x86_64-apple-darwin",
"aarch64-apple-darwin",
]
accepted = [
# --- Apache / MIT / BSD / permissive ---
"Apache-2.0",
"MIT",
"MIT-0",
"BSD-2-Clause",
"BSD-3-Clause",
"ISC",
"Zlib",
"zlib-acknowledgement",
"BSL-1.0",
"NCSA",
"PostgreSQL",
"curl",
"BlueOak-1.0.0",
"X11",
"HPND",
"TCL",
"ICU",
"Info-ZIP",
# --- Unicode / data / specs ---
"Unicode-DFS-2016",
"Unicode-3.0",
# --- compression / libs ---
"bzip2-1.0.6",
"Libpng",
"libpng-2.0",
"IJG",
"FTL",
# --- public domain style ---
"CC0-1.0",
"Unlicense",
"0BSD",
# --- weak copyleft (GPL-compatible) ---
"MPL-2.0",
"LGPL-3.0",
"EPL-2.0",
# --- GPL family ---
"GPL-3.0",
"GPL-2.0",
# --- Python / PSF ---
"PSF-2.0",
"Python-2.0",
"Python-2.0.1",
# --- Artistic / Perl ---
"Artistic-2.0",
# --- LLVM / clang ---
"Apache-2.0 WITH LLVM-exception",
# --- data / ML ---
"CDLA-Permissive-2.0",
# --- fonts ---
"OFL-1.1",
# --- Creative Commons (code-safe ones) ---
"CC-BY-3.0",
"CC-BY-4.0",
]

148
action-scripts/download.sh Executable file

@ -0,0 +1,148 @@
#!/usr/bin/env bash
set -euo pipefail
REPO="elicpeter/nyx"
VERSION="${NYX_VERSION:-latest}"
INSTALL_DIR="${RUNNER_TOOL_CACHE:-/tmp}/nyx"
# Optional: pin a GPG key fingerprint here (40-char, no spaces) or set
# NYX_GPG_FINGERPRINT in the calling env to require GPG-signed SHA256SUMS.
# Empty ⇒ GPG verification is skipped (SHA256 + SLSA attestation still run).
PINNED_GPG_FINGERPRINT="${NYX_GPG_FINGERPRINT:-}"
# ── Detect runner OS and architecture ─────────────────────────────────────────
OS="$(uname -s)"
ARCH="$(uname -m)"
case "${OS}-${ARCH}" in
Linux-x86_64) TARGET="x86_64-unknown-linux-gnu" ;;
Linux-aarch64) TARGET="aarch64-unknown-linux-gnu" ;;
Darwin-x86_64) TARGET="x86_64-apple-darwin" ;;
Darwin-arm64) TARGET="aarch64-apple-darwin" ;;
*)
echo "::error::Unsupported platform: ${OS} ${ARCH}"
exit 1
;;
esac
# ── Resolve "latest" to an actual release tag ────────────────────────────────
if [[ "$VERSION" == "latest" ]]; then
echo "::warning::version: latest follows a mutable tag. Pin to a specific release (e.g. v0.7.0) for supply-chain safety."
API_URL="https://api.github.com/repos/${REPO}/releases/latest"
CURL_ARGS=(-fsSL)
if [[ -n "${GITHUB_TOKEN:-}" ]]; then
CURL_ARGS+=(-H "Authorization: token ${GITHUB_TOKEN}")
fi
RELEASE_JSON="$(curl "${CURL_ARGS[@]}" "$API_URL")"
VERSION="$(echo "$RELEASE_JSON" | grep -o '"tag_name":[[:space:]]*"[^"]*"' | head -1 | cut -d'"' -f4)"
if [[ -z "$VERSION" ]]; then
echo "::error::Failed to resolve latest release tag from ${API_URL}"
exit 1
fi
echo "Resolved latest version: ${VERSION}"
fi
# ── Download the release asset into an isolated staging dir ──────────────────
ASSET_NAME="nyx-${TARGET}.zip"
RELEASE_BASE="https://github.com/${REPO}/releases/download/${VERSION}"
DOWNLOAD_URL="${RELEASE_BASE}/${ASSET_NAME}"
STAGING="$(mktemp -d)"
trap 'rm -rf "$STAGING"' EXIT
CURL_COMMON=(-fsSL)
if [[ -n "${GITHUB_TOKEN:-}" ]]; then
CURL_COMMON+=(-H "Authorization: token ${GITHUB_TOKEN}")
fi
echo "Downloading nyx ${VERSION} for ${TARGET}..."
curl "${CURL_COMMON[@]}" -o "${STAGING}/${ASSET_NAME}" "$DOWNLOAD_URL"
# SHA256SUMS is required — the whole release signing chain hinges on it.
echo "Downloading SHA256SUMS..."
curl "${CURL_COMMON[@]}" -o "${STAGING}/SHA256SUMS" "${RELEASE_BASE}/SHA256SUMS"
# SHA256SUMS.asc is optional (GPG signing was wired up mid-0.x); fetch it if
# present so we can attempt signature verification.
SIG_PATH=""
if curl "${CURL_COMMON[@]}" -o "${STAGING}/SHA256SUMS.asc" "${RELEASE_BASE}/SHA256SUMS.asc" 2>/dev/null; then
SIG_PATH="${STAGING}/SHA256SUMS.asc"
fi
# ── Mandatory: verify the binary's SHA256 matches SHA256SUMS ─────────────────
(
cd "$STAGING"
# --ignore-missing: SHA256SUMS lists every platform archive; we only have one.
if ! sha256sum --ignore-missing -c SHA256SUMS >/dev/null 2>&1; then
echo "::error::SHA256 verification failed for ${ASSET_NAME}. Release may be tampered."
echo "Expected (from SHA256SUMS):"
grep -F "${ASSET_NAME}" SHA256SUMS || true
echo "Actual:"
sha256sum "${ASSET_NAME}" || true
exit 1
fi
)
echo "::notice::SHA256 checksum verified for ${ASSET_NAME}."
# ── Best-effort: GPG verify SHA256SUMS.asc against a pinned fingerprint ──────
# Trust model: only accept a signature from a fingerprint we have pinned. A
# signature from any other key is treated as a failure, not a success. If no
# fingerprint is pinned, GPG verification is skipped (SHA256+SLSA still run).
if [[ -n "$SIG_PATH" ]]; then
if [[ -z "$PINNED_GPG_FINGERPRINT" ]]; then
echo "::warning::SHA256SUMS.asc found but no GPG fingerprint pinned. Set NYX_GPG_FINGERPRINT (40-char, no spaces) to enforce GPG verification."
elif ! command -v gpg >/dev/null 2>&1; then
echo "::warning::gpg not installed on runner; skipping SHA256SUMS.asc verification."
else
# Fetch the pinned key from keys.openpgp.org into an ephemeral keyring.
GNUPGHOME="$(mktemp -d)"
export GNUPGHOME
chmod 700 "$GNUPGHOME"
trap 'rm -rf "$STAGING" "$GNUPGHOME"' EXIT
if ! gpg --batch --keyserver hkps://keys.openpgp.org \
--recv-keys "$PINNED_GPG_FINGERPRINT" >/dev/null 2>&1; then
echo "::error::Failed to fetch GPG key ${PINNED_GPG_FINGERPRINT} from keys.openpgp.org."
exit 1
fi
# --status-fd 1 gives machine-readable output; VALIDSIG + the pinned fpr
# is the only accept condition.
GPG_STATUS="$(gpg --batch --status-fd 1 --verify \
"$SIG_PATH" "${STAGING}/SHA256SUMS" 2>/dev/null || true)"
if ! grep -q "^\[GNUPG:\] VALIDSIG ${PINNED_GPG_FINGERPRINT} " <<<"$GPG_STATUS"; then
echo "::error::GPG signature on SHA256SUMS does not match pinned fingerprint ${PINNED_GPG_FINGERPRINT}."
echo "$GPG_STATUS"
exit 1
fi
echo "::notice::GPG signature verified against ${PINNED_GPG_FINGERPRINT}."
fi
else
echo "::warning::SHA256SUMS.asc not published for ${VERSION}; relying on SHA256 + SLSA only."
fi
# ── Best-effort: SLSA build-provenance attestation (Sigstore) ────────────────
# gh attestation verify ships with the gh CLI (preinstalled on GH-hosted
# runners) and validates attestations produced by actions/attest-build-
# provenance against the Sigstore public-good transparency log. Unlike GPG
# this requires no pre-shared key and is the preferred trust root.
if command -v gh >/dev/null 2>&1; then
  if gh attestation verify "${STAGING}/${ASSET_NAME}" --repo "${REPO}" >/dev/null 2>&1; then
    echo "::notice::SLSA build provenance verified for ${ASSET_NAME}."
  else
    echo "::warning::gh attestation verify failed or no attestation present for ${VERSION}. (Expected for releases predating attest-build-provenance.)"
  fi
else
  echo "::warning::gh CLI not available; skipping SLSA attestation verification."
fi
# ── Extract and install ──────────────────────────────────────────────────────
mkdir -p "$INSTALL_DIR"
# The zip stores target/{TARGET}/release/nyx — use -j to flatten paths
unzip -o -j "${STAGING}/${ASSET_NAME}" "*/nyx" -d "$INSTALL_DIR"
chmod +x "${INSTALL_DIR}/nyx"
# ── Add to PATH for subsequent steps ─────────────────────────────────────────
echo "${INSTALL_DIR}" >> "$GITHUB_PATH"
# ── Verify and set output ────────────────────────────────────────────────────
INSTALLED_VERSION="$("${INSTALL_DIR}/nyx" --version 2>&1 | head -1 || echo "unknown")"
echo "nyx-version=${INSTALLED_VERSION}" >> "$GITHUB_OUTPUT"
echo "Installed nyx: ${INSTALLED_VERSION} (${TARGET})"
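
The accept condition of the GPG step can also be sketched outside the shell. The following minimal Python sketch assumes GnuPG's documented `VALIDSIG` status-line layout, in which the first field is the fingerprint of the key that made the signature and the optional last field is the primary key fingerprint. The function name `signed_by_pinned_key` and the sample fingerprints are hypothetical, and accepting a match on the primary-key field is an extension that the script above does not perform.

```python
def signed_by_pinned_key(status_output: str, pinned_fpr: str) -> bool:
    """Return True if a VALIDSIG status line names the pinned fingerprint."""
    pinned = pinned_fpr.replace(" ", "").upper()
    prefix = "[GNUPG:] VALIDSIG "
    for line in status_output.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if not fields:
            continue
        # fields[0]: fingerprint of the signing key (may be a subkey).
        candidates = {fields[0].upper()}
        # fields[-1]: primary key fingerprint, present when gpg emits all
        # ten VALIDSIG arguments (assumption based on GnuPG's DETAILS file).
        if len(fields) >= 10:
            candidates.add(fields[-1].upper())
        if pinned in candidates:
            return True
    return False


PINNED = "A" * 40   # hypothetical pinned primary-key fingerprint
SUBKEY = "B" * 40   # hypothetical signing subkey fingerprint
sample = (
    "[GNUPG:] NEWSIG\n"
    f"[GNUPG:] VALIDSIG {SUBKEY} 2026-06-05 1780000000 0 4 0 1 10 00 {PINNED}\n"
)
print(signed_by_pinned_key(sample, PINNED))
print(signed_by_pinned_key(sample, "C" * 40))
```

With a signature made by a subkey, a check anchored only on the first field would reject the pinned primary key, so whether to pin the primary or the signing fingerprint is worth deciding explicitly.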

87
action-scripts/run.sh Executable file
View file

@ -0,0 +1,87 @@
#!/usr/bin/env bash
set -uo pipefail
# Note: NOT -e — we capture nyx's exit code manually.
# ── Build the nyx command ────────────────────────────────────────────────────
FORMAT="${INPUT_FORMAT:-sarif}"
ARGS=("scan" "${INPUT_PATH:-.}" "--quiet" "--format" "$FORMAT")
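if [[ -n "${INPUT_FAIL_ON:-}" ]]; then
  ARGS+=("--fail-on" "$INPUT_FAIL_ON")
fi

# Append raw user args. Word-splitting on whitespace is intentional; quotes
# inside INPUT_ARGS are not interpreted, and only the first line is read.
if [[ -n "${INPUT_ARGS:-}" ]]; then
  read -ra EXTRA <<< "$INPUT_ARGS"
  # Guarded expansion: a whitespace-only INPUT_ARGS leaves EXTRA empty, which
  # trips `set -u` on bash older than 4.4.
  ARGS+=(${EXTRA[@]+"${EXTRA[@]}"})
fi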
# ── Execute the scan ─────────────────────────────────────────────────────────
OUTDIR="${RUNNER_TEMP:-/tmp}"
SARIF_FILE=""
NYX_EXIT=0
echo "::group::nyx scan"
echo "Running: nyx ${ARGS[*]}"
case "$FORMAT" in
  sarif)
    SARIF_FILE="${OUTDIR}/nyx-results.sarif"
    nyx "${ARGS[@]}" > "$SARIF_FILE" || NYX_EXIT=$?
    ;;
  json)
    nyx "${ARGS[@]}" > "${OUTDIR}/nyx-results.json" || NYX_EXIT=$?
    ;;
  *)
    nyx "${ARGS[@]}" || NYX_EXIT=$?
    ;;
esac
echo "::endgroup::"
# ── Count findings ───────────────────────────────────────────────────────────
count_findings() {
  python3 -c "
import json, sys
try:
    data = json.load(open(sys.argv[1]))
    fmt = sys.argv[2]
    if fmt == 'sarif':
        runs = data.get('runs', [])
        print(len(runs[0].get('results', [])) if runs else 0)
    else:
        print(len(data) if isinstance(data, list) else 0)
except Exception:
    print(0)
" "$1" "$2" 2>/dev/null || echo "0"
}

FINDING_COUNT="unknown"
case "$FORMAT" in
  sarif)
    if [[ -f "$SARIF_FILE" ]]; then
      FINDING_COUNT="$(count_findings "$SARIF_FILE" sarif)"
    fi
    ;;
  json)
    if [[ -f "${OUTDIR}/nyx-results.json" ]]; then
      FINDING_COUNT="$(count_findings "${OUTDIR}/nyx-results.json" json)"
    fi
    ;;
esac
# ── Set outputs ──────────────────────────────────────────────────────────────
echo "exit-code=${NYX_EXIT}" >> "$GITHUB_OUTPUT"
echo "finding-count=${FINDING_COUNT}" >> "$GITHUB_OUTPUT"
if [[ -n "$SARIF_FILE" ]]; then
  echo "sarif-file=${SARIF_FILE}" >> "$GITHUB_OUTPUT"
fi
# ── Summary ──────────────────────────────────────────────────────────────────
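if [[ "$NYX_EXIT" -eq 0 ]]; then
  echo "::notice::Nyx scan completed. Findings: ${FINDING_COUNT}"
elif [[ "$NYX_EXIT" -eq 1 ]]; then
  echo "::warning::Nyx scan found issues meeting threshold. Findings: ${FINDING_COUNT}"
else
  # action.yml documents only 0 (clean) and 1 (threshold breached); anything
  # else is reported as an error rather than as a threshold breach.
  echo "::error::Nyx exited with unexpected code ${NYX_EXIT}. Findings: ${FINDING_COUNT}"
fi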
exit "$NYX_EXIT"
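
The counting logic embedded in `count_findings` can be exercised on its own. The sketch below is a hypothetical standalone variant (the name `count_results` and the sample documents are illustrative, not part of the action). Unlike the script, which counts only the first SARIF run, it sums `results` over every run; that difference matters only if a SARIF file ever contains more than one run.

```python
import json


def count_results(text: str, fmt: str) -> int:
    """Count findings in a SARIF or JSON report; return 0 on unusable input."""
    try:
        data = json.loads(text)
    except ValueError:
        return 0
    if fmt == "sarif":
        if not isinstance(data, dict):
            return 0
        runs = data.get("runs") or []
        # Sum over all runs instead of reading runs[0] only.
        return sum(
            len(run.get("results") or [])
            for run in runs
            if isinstance(run, dict)
        )
    # Plain JSON output is assumed to be a top-level list of findings.
    return len(data) if isinstance(data, list) else 0


sarif_doc = json.dumps(
    {"runs": [{"results": [{}, {}]}, {"results": [{}]}]}
)
json_doc = json.dumps([{"id": 1}, {"id": 2}])

print(count_results(sarif_doc, "sarif"))
print(count_results(json_doc, "json"))
print(count_results("not json", "sarif"))
```

Returning 0 on a parse failure mirrors the script's behaviour; a stricter variant could surface the error so that a truncated report is not mistaken for a clean scan.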

68
action.yml Normal file
View file

@ -0,0 +1,68 @@
name: 'Nyx Security Scanner'
description: 'Run the Nyx multi-language vulnerability scanner on your codebase. Supports Linux and macOS runners (x86_64 and ARM64).'
author: 'Eli Peter'
branding:
  icon: 'shield'
  color: 'purple'

inputs:
  path:
    description: 'Directory to scan'
    required: false
    default: '.'
  version:
    description: 'Nyx release tag (e.g. v0.7.0). "latest" is accepted but discouraged; pinning to a specific tag protects against upstream compromise.'
    required: false
    default: 'v0.7.0'
  format:
    description: 'Output format: sarif, json, or console'
    required: false
    default: 'sarif'
  fail-on:
    description: 'Exit non-zero if findings meet this severity threshold: HIGH, MEDIUM, or LOW'
    required: false
    default: ''
  args:
    description: 'Additional CLI arguments (e.g. "--severity >=MEDIUM --profile ci")'
    required: false
    default: ''
  token:
    description: 'GitHub token for release download (avoids rate limits)'
    required: false
    default: ${{ github.token }}

outputs:
  finding-count:
    description: 'Number of findings detected ("unknown" when format is neither sarif nor json)'
    value: ${{ steps.scan.outputs.finding-count }}
  sarif-file:
    description: 'Path to SARIF results file (empty if format is not sarif)'
    value: ${{ steps.scan.outputs.sarif-file }}
  exit-code:
    description: 'Nyx exit code (0 = clean, 1 = threshold breached)'
    value: ${{ steps.scan.outputs.exit-code }}
  nyx-version:
    description: 'Installed nyx version'
    value: ${{ steps.install.outputs.nyx-version }}

runs:
  using: 'composite'
  steps:
    - name: Install nyx
      id: install
      shell: bash
      env:
        NYX_VERSION: ${{ inputs.version }}
        GITHUB_TOKEN: ${{ inputs.token }}
      run: ${{ github.action_path }}/action-scripts/download.sh

    - name: Run nyx scan
      id: scan
      shell: bash
      env:
        INPUT_PATH: ${{ inputs.path }}
        INPUT_FORMAT: ${{ inputs.format }}
        INPUT_FAIL_ON: ${{ inputs.fail-on }}
        INPUT_ARGS: ${{ inputs.args }}
      run: ${{ github.action_path }}/action-scripts/run.sh

BIN
assets/logo.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 432 KiB

View file

@ -0,0 +1,24 @@
<svg xmlns="http://www.w3.org/2000/svg" width="900" height="275" viewBox="0 0 900 275" role="img" aria-labelledby="title desc">
<title id="title">NYX</title>
<desc id="desc">NYX security scanner.</desc>
<defs>
<style>
.banner {
font-family: ui-monospace, SFMono-Regular, Menlo, Consolas, "Liberation Mono", monospace;
font-size: 38px;
font-weight: 800;
letter-spacing: 0;
white-space: pre;
}
</style>
</defs>
<g transform="translate(146 48)" xml:space="preserve">
<text class="banner" x="0" y="0" fill="#2ea067" xml:space="preserve">███╗ ██╗██╗ ██╗██╗ ██╗</text>
<text class="banner" x="0" y="43" fill="#2ea067" xml:space="preserve">████╗ ██║╚██╗ ██╔╝╚██╗██╔╝</text>
<text class="banner" x="0" y="86" fill="#2ea067" xml:space="preserve">██╔██╗ ██║ ╚████╔╝ ╚███╔╝</text>
<text class="banner" x="0" y="129" fill="#2ea067" xml:space="preserve">██║╚██╗██║ ╚██╔╝ ██╔██╗</text>
<text class="banner" x="0" y="172" fill="#2ea067" xml:space="preserve">██║ ╚████║ ██║ ██╔╝ ██╗</text>
<text class="banner" x="0" y="215" fill="#2ea067" xml:space="preserve">╚═╝ ╚═══╝ ╚═╝ ╚═╝ ╚═╝</text>
</g>
</svg>

10
assets/nyx-wordmark.svg Normal file
View file

@ -0,0 +1,10 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 220 100" role="img" aria-label="nyx">
<text x="110" y="72"
text-anchor="middle"
dominant-baseline="alphabetic"
font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', system-ui, sans-serif"
font-weight="700"
font-size="100"
letter-spacing="-1"
fill="#72f3d7">nyx</text>
</svg>

BIN
assets/screenshots/demo.gif Normal file

Binary file not shown.

686
benches/dynamic_bench.rs Normal file
View file

@ -0,0 +1,686 @@
//! Dynamic verification benchmarks (§8.4).
//!
//! Tracks the per-scan cost anchors:
//!
//! 1. `harness_build_cold` — fresh workdir, spec → BuiltHarness (source gen + disk write).
//! 2. `harness_build_warm` — same spec, workdir already staged (file write skipped).
//! 3. `sandbox_run_payload` — single payload run via process backend against
//! sqli_positive.py (subprocess + settrace overhead, no networking).
//! 4. `docker_image_build` — cold image pull/build for the python:3-slim base.
//! 5. `docker_exec_warm` — `docker exec` into a running container (no cold start).
//! 6. `docker_payload_cost` — per-payload sandbox cost via docker backend end-to-end.
//! 7. `composite_chain_reverify_dispatch` — `reverify_top_chains` on a
//! synthetic 3-member chain with no member diags. Measures the no-derive
//! dispatch path (chain_step_specs miss, early-exit build/run loops,
//! Inconclusive verdict allocation, severity downgrade).
//! 8. `composite_chain_reverify_stub_confirmed` — same chain shape, stubbed
//! reverifier returning `Confirmed`. Measures the apply-verdict happy path
//! (no severity bucket change).
//! 9. `composite_chain_reverify_top_n_slice` — 5-chain slice with `top_n=3`.
//! Measures the slice traversal cost so a regression that walks the full
//! slice instead of the prefix is visible.
//! 10. `composite_chain_reverify_replay_stable` — same chain shape as
//! `stub_confirmed`, but with `VerifyOptions::replay_stable_check=true`
//! and a stub that stamps `replay_stable=Some(true)`. Anchors the
//! apply-verdict allocation cost when the telemetry stability field
//! is populated; a regression that adds per-chain work behind the
//! replay opt-in (e.g. an extra run_chain_steps call leaking out of
//! the live path into the stub layer) shows up here.
//!
//! Wall-clock budget anchors for the composite reverify path: the live
//! process backend stays under 400ms per 3-member chain, the docker
//! backend under 1500ms. Those live-run numbers are covered by the
//! `flask_eval_chain_reverify_populates_dynamic_verdict` integration
//! test in `tests/chain_emission_e2e.rs`; the microbenches here anchor
//! the dispatch + verdict-application overhead so regressions on the
//! API-shape half land in the criterion baseline.
//!
//! Baselines committed to `benches/dynamic_bench_baseline.json`.
//! Run: `cargo bench --features dynamic -- dynamic`
//!
//! Docker benchmarks are no-ops when docker is unavailable (skipped, not failed).
use criterion::{Criterion, criterion_group, criterion_main};
#[cfg(feature = "dynamic")]
use nyx_scanner::dynamic::spec::{
EntryKind, HarnessSpec, JavaToolchain, PayloadSlot, SpecDerivationStrategy,
};
#[cfg(feature = "dynamic")]
use nyx_scanner::labels::Cap;
#[cfg(feature = "dynamic")]
use nyx_scanner::symbol::Lang;
#[cfg(feature = "dynamic")]
fn make_rust_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_rust_0001".into(),
entry_file: "tests/dynamic_fixtures/rust/sqli_positive.rs".into(),
entry_name: "run".into(),
entry_kind: EntryKind::Function,
lang: Lang::Rust,
toolchain_id: "rust-stable".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/rust/sqli_positive.rs".into(),
sink_line: 18,
spec_hash: "benchrustsqli0001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench0000000001".into(),
entry_file: "tests/dynamic_fixtures/python/sqli_positive.py".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Python,
toolchain_id: "python-3".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/python/sqli_positive.py".into(),
sink_line: 7,
spec_hash: "benchsqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn bench_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_sqli_spec();
c.bench_function("harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn bench_harness_build_warm(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_sqli_spec();
harness::build(&spec).expect("harness pre-stage");
c.bench_function("harness_build_warm", |b| {
b.iter(|| harness::build(&spec).expect("harness build warm"));
});
}
#[cfg(feature = "dynamic")]
fn bench_sandbox_run_payload(c: &mut Criterion) {
use nyx_scanner::dynamic::corpus::payloads_for;
use nyx_scanner::dynamic::harness;
use nyx_scanner::dynamic::sandbox::{self, SandboxOptions};
let spec = make_sqli_spec();
let harness = harness::build(&spec).expect("harness build");
let payloads = payloads_for(Cap::SQL_QUERY);
let payload = payloads
.iter()
.find(|p| !p.is_benign)
.expect("sqli payload");
let opts = SandboxOptions {
timeout: std::time::Duration::from_secs(10),
..SandboxOptions::default()
};
c.bench_function("sandbox_run_payload", |b| {
b.iter(|| sandbox::run(&harness, payload.bytes, &opts).expect("sandbox run"));
});
}
#[cfg(feature = "dynamic")]
fn docker_available() -> bool {
std::process::Command::new("docker")
.arg("info")
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status()
.map(|s| s.success())
.unwrap_or(false)
}
/// Cold docker image pull/build.
///
/// Measures the time to ensure `python:3-slim` is present locally. On a
/// warm cache `docker pull` only re-checks the tag against the registry
/// (typically sub-second, but still a network round trip). On a cold host it
/// includes the pull from the registry.
///
/// Registers a labelled noop measurement when Docker is absent so criterion's
/// output is never empty for this slot.
#[cfg(feature = "dynamic")]
fn bench_docker_image_build(c: &mut Criterion) {
if !docker_available() {
c.bench_function("docker_image_build_no_docker", |b| b.iter(|| ()));
return;
}
c.bench_function("docker_image_build", |b| {
b.iter(|| {
// `docker pull` is idempotent and fast when image is already local.
let _ = std::process::Command::new("docker")
.args(["pull", "python:3-slim"])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
});
});
}
/// Warm `docker exec` reuse benchmark.
///
/// Starts a single container before the benchmark loop and measures the cost
/// of each `docker exec` call. Container cold-start is excluded here; compare
/// against `bench_docker_payload_cost` to see its amortised cost.
#[cfg(feature = "dynamic")]
fn bench_docker_exec_warm(c: &mut Criterion) {
if !docker_available() {
eprintln!("bench_docker_exec_warm: docker unavailable, skipping");
return;
}
// Start a long-lived container for the benchmark.
let container = "nyx-bench-exec-warm";
let _ = std::process::Command::new("docker")
.args([
"run",
"-d",
"--rm",
"--name",
container,
"--cap-drop=ALL",
"--security-opt",
"no-new-privileges:true",
"--network",
"none",
"python:3-slim",
"sleep",
"300",
])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
c.bench_function("docker_exec_warm", |b| {
b.iter(|| {
let _ = std::process::Command::new("docker")
.args(["exec", container, "python3", "-c", "pass"])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
});
});
let _ = std::process::Command::new("docker")
.args(["stop", container])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
}
/// Per-payload sandbox cost via docker backend end-to-end.
///
/// Measures the complete path with the harness already built: the docker
/// backend runs one payload against the sqli_positive fixture. The first
/// call includes container start; subsequent calls show exec-reuse cost.
///
/// Registers a labelled noop measurement when Docker is absent so criterion's
/// output is never empty for this slot.
#[cfg(feature = "dynamic")]
fn bench_docker_payload_cost(c: &mut Criterion) {
if !docker_available() {
c.bench_function("docker_payload_cost_no_docker", |b| b.iter(|| ()));
return;
}
use nyx_scanner::dynamic::corpus::payloads_for;
use nyx_scanner::dynamic::harness;
use nyx_scanner::dynamic::sandbox::{self, SandboxBackend, SandboxOptions};
let spec = make_sqli_spec();
let built = harness::build(&spec).expect("harness build");
let payloads = payloads_for(Cap::SQL_QUERY);
let payload = payloads
.iter()
.find(|p| !p.is_benign)
.expect("sqli payload");
let opts = SandboxOptions {
timeout: std::time::Duration::from_secs(30),
backend: SandboxBackend::Docker,
..SandboxOptions::default()
};
c.bench_function("docker_payload_cost", |b| {
b.iter(|| {
let _ = sandbox::run(&built, payload.bytes, &opts);
});
});
}
/// Rust harness build (source gen + disk write, no compilation).
///
/// Measures only `harness::build()` — staging files to the workdir.
/// The expensive `cargo build --release` step is NOT included here
/// (that is the province of an integration benchmark, not this microbench).
#[cfg(feature = "dynamic")]
fn bench_rust_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_rust_sqli_spec();
c.bench_function("rust_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn make_js_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_js_0001".into(),
entry_file: "tests/dynamic_fixtures/js/sqli_positive.js".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::JavaScript,
toolchain_id: "node-20".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/js/sqli_positive.js".into(),
sink_line: 8,
spec_hash: "benchjssqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_go_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_go_0001".into(),
entry_file: "tests/dynamic_fixtures/go/sqli_positive.go".into(),
entry_name: "Login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Go,
toolchain_id: "go-1.21".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/go/sqli_positive.go".into(),
sink_line: 12,
spec_hash: "benchgosqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_java_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_java_0001".into(),
entry_file: "tests/dynamic_fixtures/java/sqli_positive.java".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Java,
toolchain_id: "java-21".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/java/sqli_positive.java".into(),
sink_line: 9,
spec_hash: "benchjavasqli00001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
#[cfg(feature = "dynamic")]
fn make_php_sqli_spec() -> HarnessSpec {
HarnessSpec {
finding_id: "bench_php_0001".into(),
entry_file: "tests/dynamic_fixtures/php/sqli_positive.php".into(),
entry_name: "login".into(),
entry_kind: EntryKind::Function,
lang: Lang::Php,
toolchain_id: "php-8".into(),
payload_slot: PayloadSlot::Param(0),
expected_cap: Cap::SQL_QUERY,
constraint_hints: vec![],
sink_file: "tests/dynamic_fixtures/php/sqli_positive.php".into(),
sink_line: 9,
spec_hash: "benchphpsqli000001".into(),
derivation: SpecDerivationStrategy::FromFlowSteps,
stubs_required: vec![],
framework: None,
java_toolchain: JavaToolchain::default(),
}
}
/// JS harness build (source gen + disk write).
#[cfg(feature = "dynamic")]
fn bench_js_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_js_sqli_spec();
c.bench_function("js_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("JS harness build")
});
});
}
/// Go harness build (source gen + disk write, no compilation).
#[cfg(feature = "dynamic")]
fn bench_go_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_go_sqli_spec();
c.bench_function("go_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("Go harness build")
});
});
}
/// Java harness build (source gen + disk write, no compilation).
#[cfg(feature = "dynamic")]
fn bench_java_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_java_sqli_spec();
c.bench_function("java_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("Java harness build")
});
});
}
/// PHP harness build (source gen + disk write).
#[cfg(feature = "dynamic")]
fn bench_php_harness_build_cold(c: &mut Criterion) {
use nyx_scanner::dynamic::harness;
let spec = make_php_sqli_spec();
c.bench_function("php_harness_build_cold", |b| {
b.iter(|| {
let workdir = std::env::temp_dir()
.join("nyx-harness")
.join(&spec.spec_hash);
let _ = std::fs::remove_dir_all(&workdir);
harness::build(&spec).expect("PHP harness build")
});
});
}
#[cfg(feature = "dynamic")]
fn mk_chain_member(hash: u64, idx: usize) -> nyx_scanner::chain::FindingRef {
use nyx_scanner::surface::SourceLocation;
nyx_scanner::chain::FindingRef {
finding_id: format!("bench-chain-member-{idx}"),
stable_hash: hash,
location: SourceLocation::new("bench/synthetic.py", (idx as u32) + 1, 1),
rule_id: "taint-unsanitised-flow".into(),
cap_bits: 0,
}
}
#[cfg(feature = "dynamic")]
fn mk_synthetic_chain(hash: u64, members: usize) -> nyx_scanner::chain::ChainFinding {
use nyx_scanner::chain::{ChainFinding, ChainSeverity, ChainSink, ImpactCategory};
ChainFinding {
stable_hash: hash,
members: (0..members)
.map(|i| mk_chain_member(hash.wrapping_add(i as u64 + 1), i))
.collect(),
sink: ChainSink {
file: "bench/synthetic.py".into(),
line: 99,
col: 1,
function_name: "sink".into(),
cap_bits: 0,
},
implied_impact: ImpactCategory::Rce,
severity: ChainSeverity::Critical,
score: 100.0,
dynamic_verdict: None,
reverify_reason: None,
}
}
#[cfg(feature = "dynamic")]
struct BenchConfirmedReverifier;
#[cfg(feature = "dynamic")]
impl nyx_scanner::chain::CompositeReverifier for BenchConfirmedReverifier {
fn reverify(
&self,
_chain: &nyx_scanner::chain::ChainFinding,
_member_diags: &[nyx_scanner::commands::scan::Diag],
_surface: &nyx_scanner::surface::SurfaceMap,
opts: &nyx_scanner::dynamic::verify::VerifyOptions,
) -> nyx_scanner::evidence::VerifyResult {
// Mirror `DefaultCompositeReverifier::reverify`'s replay-stable
// stamping shape so the apply-verdict allocation cost matches
// the live path when the opt-in is on. The stub does not
// re-run any work (it has none to re-run) but the resulting
// `VerifyResult` populates `replay_stable=Some(true)` so
// downstream sites that branch on the field exercise the same
// path they would for a real Confirmed-with-stable run.
let replay_stable = if opts.replay_stable_check {
Some(true)
} else {
None
};
nyx_scanner::evidence::VerifyResult {
finding_id: "bench".into(),
status: nyx_scanner::evidence::VerifyStatus::Confirmed,
triggered_payload: None,
reason: None,
inconclusive_reason: None,
detail: None,
attempts: vec![],
toolchain_match: None,
differential: None,
replay_stable,
wrong: None,
hardening_outcome: None,
}
}
}
/// Phase 26 dispatch-cost anchor: synthetic 3-member chain with no
/// matching member diags. The reverifier walks chain_step_specs (3
/// HashMap misses → 3 NoFlowSteps errors), the build loop sees zero
/// derived specs and exits early, the run loop sees zero built steps
/// and exits early. The composed VerifyResult is allocated and applied
/// via `apply_dynamic_verdict` (Inconclusive → severity downgrade).
///
/// This is the no-toolchain-dep dispatch overhead — a regression here
/// signals a hot-path allocation introduced into the reverify pipeline.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_dispatch(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
c.bench_function("composite_chain_reverify_dispatch", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC1A1, 3)];
let _ = reverify::reverify_top_chains(&mut chains, &[], &surface, &opts, 1);
});
});
}
/// Phase 26 stub-reverifier happy-path anchor: synthetic 3-member
/// chain driven through `reverify_top_chains_with` + a stubbed
/// reverifier returning `Confirmed`. Measures the apply-verdict path
/// when the verdict does NOT trigger a severity downgrade, so the
/// `ChainReverifyResult` allocation + `chain.apply_dynamic_verdict`
/// transition cost is exercised independent of the verdict-side
/// allocation in the dispatch bench.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_stub_confirmed(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_stub_confirmed", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC2A2, 3)];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
1,
&reverifier,
);
});
});
}
/// Phase 26 top-N slice anchor: 5-chain slice with `top_n=3`. Asserts
/// (by way of regression) that the reverify pass never walks past the
/// top-N prefix. The fan-in is the per-chain dispatch cost times three;
/// a regression that drops the `bound = top_n.min(chains.len())` cap
/// would show up as a ~5/3 increase in this bench.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_top_n_slice(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions::default();
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_top_n_slice", |b| {
b.iter(|| {
let mut chains: [nyx_scanner::chain::ChainFinding; 5] = [
mk_synthetic_chain(0xC301, 3),
mk_synthetic_chain(0xC302, 3),
mk_synthetic_chain(0xC303, 3),
mk_synthetic_chain(0xC304, 3),
mk_synthetic_chain(0xC305, 3),
];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
3,
&reverifier,
);
});
});
}
/// Phase 26 replay-stable anchor: same 3-member synthetic chain as
/// `stub_confirmed`, driven through `reverify_top_chains_with` with
/// `VerifyOptions::replay_stable_check=true`. The `BenchConfirmedReverifier`
/// stub honours the opt-in by stamping `replay_stable=Some(true)` on
/// the returned `VerifyResult`, exercising the apply-verdict path with
/// the telemetry stability field populated.
///
/// Purpose: anchor the cost of the replay-stable apply path so a
/// regression that leaks a real `run_chain_steps` invocation into the
/// stubbed verifier layer (or that allocates extra state behind the
/// `replay_stable_check` toggle in `chain::reverify::apply_one`) shows
/// up immediately against the `stub_confirmed` baseline.
#[cfg(feature = "dynamic")]
fn bench_composite_chain_reverify_replay_stable(c: &mut Criterion) {
use nyx_scanner::chain::reverify;
use nyx_scanner::dynamic::verify::VerifyOptions;
use nyx_scanner::surface::SurfaceMap;
let surface = SurfaceMap::new();
let opts = VerifyOptions {
replay_stable_check: true,
..VerifyOptions::default()
};
let reverifier = BenchConfirmedReverifier;
c.bench_function("composite_chain_reverify_replay_stable", |b| {
b.iter(|| {
let mut chains = [mk_synthetic_chain(0xC4A3, 3)];
let _ = reverify::reverify_top_chains_with(
&mut chains,
&[],
&surface,
&opts,
1,
&reverifier,
);
});
});
}
#[cfg(feature = "dynamic")]
#[allow(dead_code)]
fn bench_noop(_c: &mut Criterion) {}
// When dynamic feature is off, provide a stub so the binary still links.
#[cfg(not(feature = "dynamic"))]
fn bench_noop(c: &mut Criterion) {
c.bench_function("dynamic_disabled_noop", |b| b.iter(|| ()));
}
#[cfg(feature = "dynamic")]
criterion_group!(
dynamic,
bench_harness_build_cold,
bench_harness_build_warm,
bench_sandbox_run_payload,
bench_docker_image_build,
bench_docker_exec_warm,
bench_docker_payload_cost,
bench_rust_harness_build_cold,
bench_js_harness_build_cold,
bench_go_harness_build_cold,
bench_java_harness_build_cold,
bench_php_harness_build_cold,
bench_composite_chain_reverify_dispatch,
bench_composite_chain_reverify_stub_confirmed,
bench_composite_chain_reverify_top_n_slice,
bench_composite_chain_reverify_replay_stable,
);
#[cfg(not(feature = "dynamic"))]
criterion_group!(dynamic, bench_noop);
criterion_main!(dynamic);
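
The prefix bound that `composite_chain_reverify_top_n_slice` guards can be illustrated in a few lines. This is a language-neutral sketch written in Python, not the crate's Rust implementation; `reverify_prefix`, the stub, and the call log are hypothetical names introduced only to show why a missing `top_n.min(chains.len())` cap would raise the work from three calls to five.

```python
def reverify_prefix(chains, top_n, reverify):
    """Apply `reverify` to at most the first `top_n` chains."""
    bound = min(top_n, len(chains))
    return [reverify(chain) for chain in chains[:bound]]


calls = []


def stub_reverifier(chain):
    # Records each invocation so the amount of work is observable.
    calls.append(chain)
    return "Confirmed"


chains = ["c1", "c2", "c3", "c4", "c5"]
verdicts = reverify_prefix(chains, 3, stub_reverifier)

print(len(calls))            # invocations with the bound in place
print(len(chains) / len(calls))  # cost ratio if the bound were dropped
```

The ratio printed last corresponds to the roughly 5/3 increase the bench comment says a dropped bound would produce.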

View file

@ -0,0 +1,26 @@
{
"schema": 1,
"note": "ASPIRATIONAL placeholder — values were hand-typed, not captured from a real bench run. Regenerate with: benches/regen_baseline.sh (requires --features dynamic and python3 on PATH). Commit the updated file to establish a real regression reference for M3+.",
"benchmarks": {
"harness_build_cold": {
"mean_ns": 800000,
"stddev_ns": 120000,
"description": "Fresh workdir; spec → BuiltHarness including source gen + disk write."
},
"harness_build_warm": {
"mean_ns": 180000,
"stddev_ns": 30000,
"description": "Workdir already staged; file write skipped by dst.exists() guard."
},
"sandbox_run_payload": {
"mean_ns": 120000000,
"stddev_ns": 15000000,
"description": "Single process-backend run with sqli payload; includes python3 startup + settrace."
}
},
"regression_thresholds": {
"harness_build_cold": 2.0,
"harness_build_warm": 2.0,
"sandbox_run_payload": 1.5
}
}
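A minimal sketch of how the `regression_thresholds` above might be applied, assuming each threshold is a multiplier on the baseline `mean_ns`. The diff does not show the actual gate, so that reading, the `find_regressions` helper, and the measured values in the usage note are all hypothetical.

```python
import json

# Inline copy of the placeholder numbers from the baseline file above
# (hand-typed values, not captured from a real bench run).
BASELINE_JSON = """
{
  "benchmarks": {
    "harness_build_cold": {"mean_ns": 800000, "stddev_ns": 120000},
    "harness_build_warm": {"mean_ns": 180000, "stddev_ns": 30000},
    "sandbox_run_payload": {"mean_ns": 120000000, "stddev_ns": 15000000}
  },
  "regression_thresholds": {
    "harness_build_cold": 2.0,
    "harness_build_warm": 2.0,
    "sandbox_run_payload": 1.5
  }
}
"""


def find_regressions(baseline, measured_mean_ns):
    """Return {name: ratio} for benchmarks slower than baseline * threshold.

    Assumption: a threshold of 2.0 means "fail if the measured mean is more
    than twice the baseline mean".
    """
    regressions = {}
    for name, threshold in baseline["regression_thresholds"].items():
        base_mean = baseline["benchmarks"][name]["mean_ns"]
        current = measured_mean_ns.get(name)
        if current is None or base_mean <= 0:
            continue  # nothing meaningful to compare against
        ratio = current / base_mean
        if ratio > threshold:
            regressions[name] = ratio
    return regressions


baseline = json.loads(BASELINE_JSON)
```

With hypothetical measurements, `find_regressions(baseline, {"harness_build_cold": 1_700_000, "harness_build_warm": 200_000})` would flag only `harness_build_cold`, since 1,700,000 / 800,000 exceeds its 2.0 threshold.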

31
benches/fixtures/sample.c Normal file

@ -0,0 +1,31 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* get_env_value(void) {
return getenv("SECRET");
}
void execute_command(const char* cmd) {
system(cmd);
}
void safe_flow(void) {
char* val = get_env_value();
if (val != NULL) {
printf("Value: %s\n", val);
}
}
void unsafe_flow(void) {
char* val = get_env_value();
if (val != NULL) {
execute_command(val);
}
}
int main(void) {
safe_flow();
unsafe_flow();
return 0;
}


@ -0,0 +1,28 @@
#include <cstdlib>
#include <iostream>
#include <string>
std::string get_env_value() {
const char* val = std::getenv("APP_SECRET");
return val ? std::string(val) : "";
}
void execute_command(const std::string& cmd) {
std::system(cmd.c_str());
}
void safe_flow() {
std::string val = get_env_value();
std::cout << "Value: " << val << std::endl;
}
void unsafe_flow() {
std::string val = get_env_value();
execute_command(val);
}
int main() {
safe_flow();
unsafe_flow();
return 0;
}


@ -0,0 +1,36 @@
package main
import (
"fmt"
"os"
"os/exec"
"html"
)
func getEnv() string {
return os.Getenv("APP_SECRET")
}
func sanitizeHTML(input string) string {
return html.EscapeString(input)
}
func runCommand(cmd string) {
exec.Command("sh", "-c", cmd).Run()
}
func safeFlow() {
val := getEnv()
clean := sanitizeHTML(val)
fmt.Println(clean)
}
func unsafeFlow() {
val := getEnv()
runCommand(val)
}
func main() {
safeFlow()
unsafeFlow()
}


@ -0,0 +1,31 @@
import java.io.IOException;
public class Sample {
public static String getEnv() {
return System.getenv("DB_PASSWORD");
}
public static String sanitize(String input) {
return input.replaceAll("[<>&]", "");
}
public static void executeCommand(String cmd) throws IOException {
Runtime.getRuntime().exec(cmd);
}
public static void safeFlow() throws IOException {
String val = getEnv();
String clean = sanitize(val);
System.out.println(clean);
}
public static void unsafeFlow() throws IOException {
String val = getEnv();
executeCommand(val);
}
public static void main(String[] args) throws IOException {
safeFlow();
unsafeFlow();
}
}


@ -0,0 +1,35 @@
const { execSync } = require("child_process");
function getUserInput() {
return process.env.USER_INPUT || "";
}
function sanitizeHtml(input) {
return input.replace(/[<>&"']/g, "");
}
function renderPage(data) {
document.innerHTML = data;
}
function safeRender() {
const input = getUserInput();
const clean = sanitizeHtml(input);
renderPage(clean);
}
function unsafeRender() {
const input = getUserInput();
renderPage(input);
}
function runShell(cmd) {
execSync(cmd);
}
function unsafeExec() {
const input = getUserInput();
runShell(input);
}
module.exports = { safeRender, unsafeRender, unsafeExec };


@ -0,0 +1,27 @@
<?php
function getEnvValue(): string {
return getenv('APP_SECRET') ?: '';
}
function sanitizeHtml(string $input): string {
return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
}
function executeCommand(string $cmd): void {
exec($cmd);
}
function safeFlow(): void {
$val = getEnvValue();
$clean = sanitizeHtml($val);
echo $clean;
}
function unsafeFlow(): void {
$val = getEnvValue();
executeCommand($val);
}
safeFlow();
unsafeFlow();


@ -0,0 +1,25 @@
import os
import subprocess
import html

def get_env_value():
    return os.environ.get("SECRET_KEY", "")

def sanitize_input(val):
    return html.escape(val)

def execute_command(cmd):
    subprocess.run(cmd, shell=True)

def safe_flow():
    val = get_env_value()
    clean = sanitize_input(val)
    print(clean)

def unsafe_flow():
    val = get_env_value()
    execute_command(val)

if __name__ == "__main__":
    safe_flow()
    unsafe_flow()


@ -0,0 +1,27 @@
require 'cgi'
def get_env_value
ENV['APP_SECRET'] || ''
end
def sanitize_html(input)
CGI.escapeHTML(input)
end
def execute_command(cmd)
system(cmd)
end
def safe_flow
val = get_env_value
clean = sanitize_html(val)
puts clean
end
def unsafe_flow
val = get_env_value
execute_command(val)
end
safe_flow
unsafe_flow


@ -0,0 +1,34 @@
use std::env;
use std::process::Command;
fn get_config() -> String {
env::var("APP_CONFIG").unwrap_or_default()
}
fn sanitize_shell(input: &str) -> String {
shell_escape::unix::escape(input.into()).to_string()
}
fn run_command(cmd: &str) {
Command::new("sh")
.arg("-c")
.arg(cmd)
.status()
.expect("failed to execute");
}
fn safe_run() {
let config = get_config();
let clean = sanitize_shell(&config);
run_command(&clean);
}
fn unsafe_run() {
let config = get_config();
run_command(&config);
}
fn main() {
safe_run();
unsafe_run();
}


@ -0,0 +1,30 @@
import { execSync } from "child_process";
function getUserInput(): string {
return process.env.USER_INPUT || "";
}
function sanitizeHtml(input: string): string {
return input.replace(/[<>&"']/g, "");
}
function renderPage(data: string): void {
document.body.innerHTML = data;
}
function runCommand(cmd: string): void {
execSync(cmd);
}
function safeRender(): void {
const input = getUserInput();
const clean = sanitizeHtml(input);
renderPage(clean);
}
function unsafeExec(): void {
const input = getUserInput();
runCommand(input);
}
export { safeRender, unsafeExec };


@ -0,0 +1,61 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Clean open/close — no findings expected */
void clean_usage(void) {
FILE *f = fopen("data.txt", "r");
char buf[256];
fread(buf, 1, 256, f);
fclose(f);
}
/* Resource leak — fopen without fclose */
void leaky_function(void) {
FILE *f = fopen("log.txt", "w");
fprintf(f, "hello");
}
/* Use after close */
void use_after_close(void) {
FILE *f = fopen("tmp.txt", "r");
fclose(f);
char buf[64];
fread(buf, 1, 64, f);
}
/* Branch leak — closed on one path only */
void branch_leak(int cond) {
FILE *f = fopen("x.txt", "r");
if (cond) {
fclose(f);
}
}
/* Multiple handles — both properly closed */
void multi_handle(void) {
FILE *a = fopen("a.txt", "r");
FILE *b = fopen("b.txt", "w");
fclose(a);
fclose(b);
}
/* Double close */
void double_close(void) {
FILE *f = fopen("d.txt", "r");
fclose(f);
fclose(f);
}
/* Malloc/free — clean */
void malloc_clean(void) {
char *p = malloc(1024);
memset(p, 0, 1024);
free(p);
}
/* Malloc leak — never freed */
void malloc_leak(void) {
char *p = malloc(512);
memset(p, 0, 512);
}

File diff suppressed because it is too large

84
benches/regen_baseline.sh Executable file

@ -0,0 +1,84 @@
#!/usr/bin/env bash
# Regenerate benches/dynamic_bench_baseline.json from a real cargo bench run.
#
# Usage:
# bash benches/regen_baseline.sh
#
# Requirements:
# - python3 on PATH
# - cargo (nightly or stable with edition 2024)
# - Criterion's JSON output (criterion feature already in dev-deps)
#
# The script runs the dynamic bench group, parses Criterion's estimates JSON,
# and overwrites dynamic_bench_baseline.json with real numbers.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
BASELINE_FILE="${SCRIPT_DIR}/dynamic_bench_baseline.json"
echo "Running cargo bench --features dynamic -- dynamic ..."
cargo bench --manifest-path "${REPO_ROOT}/Cargo.toml" \
--features dynamic \
-- dynamic \
2>&1 | tee /tmp/nyx_bench_raw.txt
# Criterion writes the latest estimates to target/criterion/<benchmark-id>/new/estimates.json.
# Extract the mean and standard deviation (both in nanoseconds) for each tracked benchmark.
extract_ns() {
  local path="$1"
  if [[ -f "${path}" ]]; then
    # Pass the path as an argument instead of interpolating it into the Python source.
    python3 -c "
import json, sys
with open(sys.argv[1]) as f:
    d = json.load(f)
mean = d['mean']['point_estimate']
stddev = d['std_dev']['point_estimate'] if 'std_dev' in d else 0
print(int(mean), int(stddev))
" "${path}"
  else
    echo "warning: ${path} not found; recording 0" >&2
    echo "0 0"
  fi
}
TARGET="${REPO_ROOT}/target/criterion"
read -r COLD_MEAN COLD_STDDEV < <(extract_ns "${TARGET}/harness_build_cold/new/estimates.json")
read -r WARM_MEAN WARM_STDDEV < <(extract_ns "${TARGET}/harness_build_warm/new/estimates.json")
read -r RUN_MEAN RUN_STDDEV < <(extract_ns "${TARGET}/sandbox_run_payload/new/estimates.json")
MACHINE="$(uname -m) / $(uname -s)"
NYX_VER="$(cargo metadata --manifest-path "${REPO_ROOT}/Cargo.toml" --no-deps --format-version 1 \
| python3 -c "import json,sys; d=json.load(sys.stdin); print(next(p['version'] for p in d['packages'] if p['name']=='nyx-scanner'))")"
DATE="$(date +%Y-%m-%d)"
cat > "${BASELINE_FILE}" <<EOF
{
"schema": 1,
"note": "Baseline captured on ${MACHINE}, nyx v${NYX_VER}, ${DATE}. Regenerate with: benches/regen_baseline.sh",
"benchmarks": {
"harness_build_cold": {
"mean_ns": ${COLD_MEAN},
"stddev_ns": ${COLD_STDDEV},
"description": "Fresh workdir; spec → BuiltHarness including source gen + disk write."
},
"harness_build_warm": {
"mean_ns": ${WARM_MEAN},
"stddev_ns": ${WARM_STDDEV},
"description": "Workdir already staged; file write skipped by dst.exists() guard."
},
"sandbox_run_payload": {
"mean_ns": ${RUN_MEAN},
"stddev_ns": ${RUN_STDDEV},
"description": "Single process-backend run with sqli payload; includes python3 startup + settrace."
}
},
"regression_thresholds": {
"harness_build_cold": 2.0,
"harness_build_warm": 2.0,
"sandbox_run_payload": 1.5
}
}
EOF
echo "Updated ${BASELINE_FILE}"
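Because `extract_ns` falls back to `0 0` when an estimates file is missing, a regenerated baseline can end up with zero means. A hedged sketch of a sanity check that could run on the generated file before committing it; `baseline_problems` is a hypothetical helper, not part of the repository, and the checks are assumptions about what a usable baseline looks like.

```python
import json


def baseline_problems(baseline):
    """Return a list of human-readable problems found in a baseline dict."""
    problems = []
    benchmarks = baseline.get("benchmarks", {})
    thresholds = baseline.get("regression_thresholds", {})
    for name, entry in benchmarks.items():
        # A zero or absent mean suggests the estimates file was not found.
        if entry.get("mean_ns", 0) <= 0:
            problems.append(f"{name}: mean_ns is missing or zero")
        if name not in thresholds:
            problems.append(f"{name}: no regression threshold")
    for name in thresholds:
        if name not in benchmarks:
            problems.append(f"{name}: threshold has no benchmark entry")
    return problems


def check_baseline_file(path):
    """Load a baseline JSON file and return its problems (empty list if none)."""
    with open(path) as f:
        return baseline_problems(json.load(f))
```

Calling `check_baseline_file("benches/dynamic_bench_baseline.json")` after the script finishes would return an empty list for a complete baseline and one message per gap otherwise.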

Some files were not shown because too many files have changed in this diff