mirror of
https://github.com/elicpeter/nyx.git
synced 2026-06-09 19:45:13 +02:00
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
[0.4.0] - 2025-02-25
Added
- Low-noise prioritization system — post-analysis pipeline that reduces noise from high-frequency LOW/Quality findings without hiding security signal. Three-stage process: category filtering, rollup grouping, and LOW budgets.
  - `FindingCategory` enum (`Security`, `Reliability`, `Quality`): every `Diag` now carries a `category` field. AST pattern findings derive their category from `PatternCategory` metadata (`CodeQuality` → `Quality`, all others → `Security`). Taint, CFG, and state findings are always `Security`.
  - Category filtering: Quality-category findings (e.g. `rs.quality.unwrap`, `rs.quality.expect`) are excluded by default. Use `--include-quality` to include them.
  - Rollup grouping: eligible high-frequency rules (`rs.quality.unwrap`, `rs.quality.expect`, `rs.quality.panic_macro`) are grouped by `(file, rule)` into a single rollup finding with occurrence count and example locations. Canonical location is the first sorted occurrence. Example count is controlled by `--rollup-examples` (default 5).
  - LOW budgets: three configurable limits enforce noise caps, namely `--max-low` (default 20, total), `--max-low-per-file` (default 1), and `--max-low-per-rule` (default 10). Rollups count as one finding for all budgets. High/Medium findings are never dropped.
  - `--all` CLI flag: disables all prioritization (no category filtering, no rollups, no budgets).
  - `--show-instances <RULE>`: bypasses rollup for a specific rule, expanding all individual occurrences.
  - Console suppression footer: when findings are suppressed, a footer displays the count and active filter values with adjustment hints.
  - `rollup` field on `Diag`: optional `RollupData` with `count` and `occurrences` (example `Location`s). Serializes to JSON automatically; omitted when not a rollup.
  - SARIF rollup support: `category` in result properties, rollup count in `properties.rollup.count`, example locations in `relatedLocations`.
  - `max_results` severity stability: when `max_results` truncation is needed, High findings are kept first, then Medium, then Low. Low findings never displace higher-severity ones.
  - New config fields in `[output]`: `include_quality`, `show_all`, `max_low`, `max_low_per_file`, `max_low_per_rule`, `rollup_examples`.
  - 14 new unit tests covering category filtering, rollup grouping/examples/canonical, LOW budgets (per-file/per-rule/total), High/Medium immunity, rollup-counts-as-one, `show_instances` bypass, JSON serialization, and determinism.
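The rollup grouping step can be pictured with a small sketch. This is not the project's code: the `Finding` and `Rollup` types and the `roll_up` function are hypothetical stand-ins, and only the behaviour stated above (group by `(file, rule)`, canonical location is the first sorted occurrence, a capped number of examples) is modelled.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified finding: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    file: String,
    rule: String,
    line: u32,
}

/// Hypothetical rollup record: one entry per (file, rule) group.
#[derive(Debug, PartialEq)]
struct Rollup {
    file: String,
    rule: String,
    canonical_line: u32, // first sorted occurrence
    count: usize,        // total occurrences in the group
    examples: Vec<u32>,  // at most `max_examples` lines
}

/// Group findings by (file, rule) and summarise each group.
fn roll_up(findings: &[Finding], max_examples: usize) -> Vec<Rollup> {
    // BTreeMap iterates in sorted key order, so the output is deterministic
    // regardless of the input order.
    let mut groups: BTreeMap<(String, String), Vec<u32>> = BTreeMap::new();
    for f in findings {
        groups
            .entry((f.file.clone(), f.rule.clone()))
            .or_default()
            .push(f.line);
    }
    groups
        .into_iter()
        .map(|((file, rule), mut lines)| {
            lines.sort_unstable();
            Rollup {
                file,
                rule,
                // Every group holds at least one line, so indexing is safe.
                canonical_line: lines[0],
                count: lines.len(),
                examples: lines.iter().copied().take(max_examples).collect(),
            }
        })
        .collect()
}
```

Because a rollup is a single record, counting it once against the LOW budgets follows naturally from this shape.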
- Pattern-level confidence for AST rules: each AST pattern in `src/patterns/` now carries an explicit `confidence: Confidence` field (High, Medium, or Low). Confidence is set at the pattern definition site and flows directly into emitted `Diag`s, replacing the old heuristic that inferred AST confidence from severity alone. `compute_confidence()` is retained as a fallback for detectors that don't set confidence (taint, state, legacy).
  - Tier A patterns with High/Medium severity → `Confidence::High` (deterministic structural match).
  - Tier A patterns with Low severity → `Confidence::Medium` (quality/crypto signals).
  - Tier B patterns (heuristic-guarded) → `Confidence::Medium`.
  - Example: `rs.quality.expect` now produces `Confidence: High` regardless of its Low severity.
- Inline per-finding suppressions: suppress specific findings directly in source code using `nyx:ignore` comments. Two directive forms: `nyx:ignore <RULE_ID>` (same line) and `nyx:ignore-next-line <RULE_ID>` (next line). Supports comma-separated IDs, wildcard suffixes (`rs.quality.*`), and automatic canonicalization of taint rule IDs (parenthetical suffixes stripped). Comment detection covers all 10 languages with string/raw-string/template-literal guards to avoid false positives.
  - `--show-suppressed` CLI flag: reveal suppressed findings in output, dimmed with a `[SUPPRESSED]` tag. Summary shows `"N issues (M suppressed)"`. In JSON/SARIF mode, suppressed findings include `"suppressed": true` and `"suppression": {...}` metadata fields.
  - `suppressed` and `suppression` fields on `Diag`: conditionally serialized; JSON output is unchanged when no suppressions are active.
  - Suppressed findings are excluded from `--fail-on` exit-code checks and severity counts.
  - New module `src/suppress/mod.rs` with 22 unit tests covering all comment styles, string guards, wildcard matching, canonicalization, CRLF, and edge cases.
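As a rough illustration of the directive handling described above, the sketch below parses a comment body and matches rule IDs. The function names (`canonical_rule_id`, `pattern_matches`, `parse_directive`) and the sample taint rule ID are hypothetical; the real module also handles per-language comment syntax and string guards, which are left out here.

```rust
/// Strip a trailing parenthetical suffix from a rule ID,
/// e.g. "some-taint-rule (source 5:1)" -> "some-taint-rule".
fn canonical_rule_id(id: &str) -> &str {
    // `split` always yields at least one piece, so `unwrap_or` is only a formality.
    id.split('(').next().unwrap_or(id).trim()
}

/// True when `pattern` matches `rule_id`: either exactly, or via a
/// trailing-`*` wildcard such as "rs.quality.*".
fn pattern_matches(pattern: &str, rule_id: &str) -> bool {
    let rule_id = canonical_rule_id(rule_id);
    match pattern.strip_suffix('*') {
        Some(prefix) => rule_id.starts_with(prefix),
        None => pattern == rule_id,
    }
}

/// Parse a comment body. Returns `(applies_to_next_line, rule patterns)`,
/// or `None` when the comment carries no directive or no rule IDs.
fn parse_directive(comment: &str) -> Option<(bool, Vec<String>)> {
    const MARKER: &str = "nyx:ignore";
    let idx = comment.find(MARKER)?;
    let rest = &comment[idx + MARKER.len()..];
    // "nyx:ignore-next-line" targets the following line instead of this one.
    let (next_line, rest) = match rest.strip_prefix("-next-line") {
        Some(r) => (true, r),
        None => (false, rest),
    };
    let ids: Vec<String> = rest
        .split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect();
    if ids.is_empty() {
        None
    } else {
        Some((next_line, ids))
    }
}
```

In source code, the two forms would look like `let v = x.unwrap(); // nyx:ignore rs.quality.unwrap` for the same line, or a standalone `// nyx:ignore-next-line rs.quality.*` comment above the offending line.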
- `--min-score <N>` CLI flag and `output.min_score` config option: filter out findings whose attack-surface rank score falls below the given threshold. Applied after ranking and severity filtering, before `max_results` truncation. Has no effect when `--no-rank` is used. CLI value overrides config.
- Attack surface ranking: deterministic post-analysis scoring layer that prioritizes findings by exploitability. Each `Diag` receives an `f64` score computed from five components: severity base (High=60, Medium=30, Low=10), analysis kind bonus (taint +10 > state +8 > cfg +3/5 > ast 0), evidence strength (+1 per item, +2–6 for source-kind priority), state rule type bonus (+1–6), and a path-validation penalty (−5 for guarded paths). Findings are sorted by descending score before truncation so `max_results` keeps the most important results. Tie-breaking is deterministic by severity, rule ID, file path, line, column, and message hash.
  - `rank_score` and `rank_reason` fields on `Diag`: optional fields with `#[serde(skip_serializing_if = "Option::is_none")]`; JSON output is unchanged when ranking is disabled.
  - `--no-rank` CLI flag: disables attack-surface ranking (enabled by default).
  - `output.attack_surface_ranking` config key: boolean (default `true`) to control ranking via config file.
  - Console score display: dim `Score: N` appended to each finding's header line when ranking is enabled.
  - New module `src/rank.rs`: `compute_attack_rank()`, `rank_diags()`, and `sort_key()` functions. Scoring uses only in-memory data; no extra file I/O or graph recomputation.
  - 10 new unit tests: ordering correctness (high taint > medium file-io, must-leak > may-leak, taint > cfg-only, state rules, AST lowest at same severity), determinism (input-order-independent), path-validation penalty, and JSON serialization (rank fields omitted when None, present when set).
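A simplified sketch of the scoring idea follows. It models only the severity base, the analysis-kind bonus, the per-item evidence bonus, and the path-validation penalty. The source-kind and state-rule bonuses are omitted because their exact values are not listed, and the CFG bonus is fixed at +3 purely as an assumption. The types and function names are hypothetical, not those of `src/rank.rs`.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Severity { High, Medium, Low }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind { Taint, State, Cfg, Ast }

/// Hypothetical, simplified finding for the sketch.
struct Finding {
    rule: &'static str,
    severity: Severity,
    kind: Kind,
    evidence_items: u32,
    path_validated: bool,
}

/// Partial score: base + kind bonus + evidence - guard penalty.
fn score(f: &Finding) -> f64 {
    let base = match f.severity {
        Severity::High => 60.0,
        Severity::Medium => 30.0,
        Severity::Low => 10.0,
    };
    let kind_bonus = match f.kind {
        Kind::Taint => 10.0,
        Kind::State => 8.0,
        Kind::Cfg => 3.0, // assumption: the changelog lists "+3/5"
        Kind::Ast => 0.0,
    };
    let evidence = f.evidence_items as f64; // +1 per evidence item
    let penalty = if f.path_validated { 5.0 } else { 0.0 };
    base + kind_bonus + evidence - penalty
}

/// Sort by descending score; ties fall back to rule ID so the order
/// does not depend on input order (the real tie-break uses more keys).
fn rank(findings: &mut [Finding]) {
    findings.sort_by(|a, b| {
        score(b)
            .total_cmp(&score(a))
            .then_with(|| a.rule.cmp(b.rule))
    });
}
```

Sorting before truncation is what lets a `max_results` cut keep the highest-scoring findings.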
- State-model dataflow analysis: new `src/state/` module implementing a forward worklist dataflow engine over the existing CFG. Tracks per-variable resource lifecycle (`UNINIT`, `OPEN`, `CLOSED`, `MOVED`) via bitset lattice and per-path authentication level (`Unauthed`, `Authed`, `Admin`) as a composable product domain. Detects:
  - Use-after-close (`state-use-after-close`, High): variable read/written after its resource handle was closed.
  - Double-close (`state-double-close`, Medium): resource handle closed more than once.
  - Must-leak (`state-resource-leak`, High): resource acquired but never closed on any exit path.
  - May-leak (`state-resource-leak-possible`, Medium): resource open on some but not all exit paths (branch-aware via lattice join).
  - Unauthenticated access (`state-unauthed-access`, High): sensitive sink reached without a preceding auth/admin check.
- State analysis architecture: six-module design:
  - `lattice.rs`: `Lattice` trait (`bot`, `join`, `leq`) for generic fixed-point computation.
  - `domain.rs`: `ResourceLifecycle` (bitflag), `ResourceDomainState`, `AuthLevel`, `AuthDomainState`, `ProductState` with lattice impls.
  - `symbol.rs`: `SymbolInterner` that builds a string-interning table from CFG node defines/uses; `SymbolId` newtype.
  - `transfer.rs`: `DefaultTransfer` function, which maps CFG node kinds (Call, Assignment, If, Return) to state transitions using the existing `ResourcePair` definitions from `cfg_analysis::rules`. Emits `TransferEvent` for illegal transitions.
  - `engine.rs`: two-phase forward worklist solver. Phase 1 iterates to a fixed point (no events collected to avoid spurious reports from intermediate states); Phase 2 re-applies transfer once over converged states to collect events. Bounded by `MAX_TRACKED_VARS` (64) with guarded degradation.
  - `facts.rs`: post-analysis pass that extracts `StateFinding`s from transfer events (use-after-close, double-close) and exit-node state inspection (must-leak, may-leak, unauthed access).
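To make the lattice vocabulary concrete, here is a minimal sketch of a `Lattice` trait with `bot`, `join`, and `leq`, plus a bitset lifecycle value. The bit assignments, the `Lifecycle` type, and the leak predicates are assumptions for illustration; they are not taken from `lattice.rs` or `domain.rs`.

```rust
/// Minimal lattice interface, mirroring the three operations named above.
trait Lattice: Clone + PartialEq {
    fn bot() -> Self;
    fn join(&self, other: &Self) -> Self;
    fn leq(&self, other: &Self) -> bool;
}

/// Bitset of lifecycle states a variable may be in at a program point.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Lifecycle(u8);

#[allow(dead_code)]
impl Lifecycle {
    // Bit positions are an assumption made for this sketch.
    const UNINIT: Lifecycle = Lifecycle(0b0001);
    const OPEN: Lifecycle = Lifecycle(0b0010);
    const CLOSED: Lifecycle = Lifecycle(0b0100);
    const MOVED: Lifecycle = Lifecycle(0b1000);

    fn contains(self, other: Lifecycle) -> bool {
        self.0 & other.0 == other.0
    }
}

impl Lattice for Lifecycle {
    fn bot() -> Self {
        Lifecycle(0) // no information yet
    }
    fn join(&self, other: &Self) -> Self {
        Lifecycle(self.0 | other.0) // union of possible states
    }
    fn leq(&self, other: &Self) -> bool {
        self.0 & !other.0 == 0 // subset ordering
    }
}

/// Open on every path reaching the exit.
fn is_must_leak(exit: Lifecycle) -> bool {
    exit == Lifecycle::OPEN
}

/// Open on some paths, but not all.
fn is_may_leak(exit: Lifecycle) -> bool {
    exit.contains(Lifecycle::OPEN) && exit != Lifecycle::OPEN
}
```

Joining the state from a branch that closes the handle with one that does not yields `OPEN | CLOSED`, which is how a branch-aware may-leak can be told apart from a must-leak.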
- `scanner.enable_state_analysis` config option: opt-in boolean (default `false`) in `ScannerConfig` and `default-nyx.conf`. Requires CFG mode (`full` or `taint`).
- `Diag.message` field: optional human-readable message on diagnostic output. State findings carry variable-specific context (e.g. "variable `f` used after close"). Surfaced in console output (dimmed line below the finding), JSON, and SARIF (`message.text` prefers per-finding message over generic rule description).
- State finding dedup: when state analysis produces findings on a line, overlapping `cfg-resource-leak` and `cfg-auth-gap` findings on the same line are suppressed (state analysis is more precise).
- SARIF rule descriptions for all five state rule IDs.
- 21 integration tests (`tests/state_tests.rs`) with 19 C fixture files covering: use-after-close, double-close, resource leak, clean usage, opt-in gating, may-leak vs must-leak branch semantics, early return, nested branches, both-branches-close, loop convergence, loop use-after-close, handle overwrite, reopen-after-close, multiple handles, conservative join masking, chain operations, malloc/free pairs, straight-line double-close, and message field population.
- 30+ unit tests across state modules: lattice properties, lifecycle join/leq, domain merging, auth-level join, product state composition, may/must leak semantics, symbol interning, and transfer event generation.
- `--severity <EXPR>` filter: replaces `--high-only` with a flexible severity expression supporting single levels (`HIGH`), comma lists (`HIGH,MEDIUM`), and thresholds (`>=MEDIUM`). Parsing is case-insensitive with whitespace tolerance.
- `SeverityFilter` type with `parse()` and `matches()` in `patterns/mod.rs`.
- `--mode <full|ast|cfg|taint>`: replaces `--ast-only` and `--cfg-only` with a single canonical analysis mode flag. Enforces mutual exclusivity via clap `ValueEnum`.
- `--index <auto|off|rebuild>`: replaces `--no-index` and `--rebuild-index` with a single flag (default `auto`).
- `--fail-on <SEVERITY>`: for CI ergonomics, exits with code 1 if any emitted finding meets or exceeds the threshold severity. Example: `--fail-on HIGH`.
- `--quiet`: CLI flag to suppress all human-readable status output (equivalent to `output.quiet = true` in config).
- `--keep-nonprod-severity`: renamed from `--include-nonprod` for clarity; old name kept as hidden alias.
- `OutputFormat` enum: `--format` now uses clap `ValueEnum` with typed `Console`, `Json`, `Sarif` variants (default `Console`). No more empty-string default.
- 10 new unit tests: `SeverityFilter` parsing (single, comma list, threshold, case-insensitive, whitespace, empty rejection, invalid level rejection), `Severity::from_str` rejection of unknown values, and `severity_filter_applied_at_output_stage` integration test verifying that downgraded findings are correctly filtered.
- AST pattern overhaul -- all 10 language pattern files (`src/patterns/*.rs`) rewritten with consistent conventions, structured metadata, and validated tree-sitter queries.
  - Pattern schema extensions -- `PatternTier` (A = structural, B = heuristic-guarded), `PatternCategory` (13 vulnerability classes), and `Hash` on `Severity`. Module-level docs explain conventions and how to add new patterns.
  - Namespaced IDs -- all pattern IDs follow `<lang>.<category>.<specific>` format (e.g. `java.deser.readobject`, `py.cmdi.os_system`, `js.xss.document_write`).
  - New vulnerability coverage -- 30+ new patterns across languages: Python deserialization (`pickle.loads`, `yaml.load`, `shelve.open`), Python command injection (`os.system`, `os.popen`), Python weak crypto (`hashlib.md5`/`sha1`), Java reflection (`Method.invoke`), Java weak digest (`MessageDigest.getInstance("MD5")`), Java XSS (`getWriter().println`), Go TLS misconfiguration (`InsecureSkipVerify: true`), Go SQL concat, Go hardcoded secrets, Go gob deserialization, PHP `assert()` code exec, PHP `include $var` path traversal, PHP weak crypto (`md5`/`sha1`/`rand`), C/C++ `popen()`, C/C++ format-string with variable first arg, C++ `const_cast`, Ruby `Digest::MD5`.
  - Query fixes -- fixed 11 broken tree-sitter queries: Java `object_creation_expression` used wrong type node (`identifier` → `type_identifier`), C++ `reinterpret_cast`/`const_cast` used non-existent node types (→ `template_function` match), Ruby backtick used `shell_command` (→ `subshell`), Python SQL used `binary_expression` (→ `binary_operator`), TypeScript `as any` used inaccessible field (→ positional child), PHP patterns missing `argument` wrapper nodes, Rust `unsafe fn` regex used unsupported `\b`.
  - No-duplicate rule -- patterns that overlap with taint sinks use distinct ID namespaces and are documented; dedup in `ast.rs` prevents duplicate findings at the same location.
  - Severity recalibration -- `unwrap`/`expect`/`panic!`/`todo!` moved to Low (filtered by default `min_severity`). Security patterns remain High/Medium.
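The three `--severity` expression forms listed earlier in this section (single level, comma list, `>=` threshold) can be sketched as a small parser. This is an illustrative reconstruction under assumptions: the variant names `AtLeast` and `AnyOf`, the `parse_level` helper, and the error strings are hypothetical and may differ from the actual `SeverityFilter` in `patterns/mod.rs`.

```rust
/// Declaration order gives Low < Medium < High through the derived `Ord`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Severity { Low, Medium, High }

#[derive(Debug, PartialEq)]
enum SeverityFilter {
    AtLeast(Severity),    // ">=MEDIUM"
    AnyOf(Vec<Severity>), // "HIGH" or "HIGH,MEDIUM"
}

/// Case-insensitive, whitespace-tolerant level parser; unknown levels are errors.
fn parse_level(s: &str) -> Result<Severity, String> {
    match s.trim().to_ascii_uppercase().as_str() {
        "LOW" => Ok(Severity::Low),
        "MEDIUM" => Ok(Severity::Medium),
        "HIGH" => Ok(Severity::High),
        other => Err(format!("unknown severity: {other:?}")),
    }
}

impl SeverityFilter {
    fn parse(expr: &str) -> Result<Self, String> {
        let expr = expr.trim();
        if expr.is_empty() {
            return Err("empty severity expression".to_string());
        }
        if let Some(rest) = expr.strip_prefix(">=") {
            return parse_level(rest).map(SeverityFilter::AtLeast);
        }
        // A single level is simply a one-element comma list.
        expr.split(',')
            .map(parse_level)
            .collect::<Result<Vec<_>, _>>()
            .map(SeverityFilter::AnyOf)
    }

    fn matches(&self, s: Severity) -> bool {
        match self {
            SeverityFilter::AtLeast(min) => s >= *min,
            SeverityFilter::AnyOf(levels) => levels.contains(&s),
        }
    }
}
```

Returning an error for unknown levels, rather than silently defaulting to a level, matches the stricter `Severity::from_str` behaviour noted under Changed.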
- Pattern test suite (`tests/pattern_tests.rs`, 26 tests) -- sanity checks (unique IDs, query compilation, non-empty descriptions, naming convention, severity distribution), positive fixture tests (10 languages), and negative fixture tests (10 languages verifying no false positives on safe code).
- Pattern test fixtures -- positive and negative fixture files for all 10 languages under `tests/fixtures/patterns/<lang>/`.
- Real world test suite: comprehensive fixture-based test suite (`tests/real_world_tests.rs`) with ~180 test fixtures across all 10 supported languages (C, C++, Go, Java, JavaScript, PHP, Python, Ruby, Rust, TypeScript). Each fixture has an `.expect.json` file declaring expected findings (with `must_match` for hard requirements and soft expectations for aspirational coverage). Fixtures are organized by analysis type (`taint/`, `state/`, `cfg/`, `mixed/`) under `tests/fixtures/real_world/<lang>/`. A single parameterized test runner validates all fixtures in both `full` and `ast` modes, with verbose output via `NYX_TEST_VERBOSE=1`.
Changed
- Console header line now includes confidence: the finding header shows score and confidence together as a parenthesized suffix: `(Score: 36, Confidence: Medium)`. The previous standalone `Confidence: ...` body line is removed. All four combinations are handled (both, score-only, confidence-only, neither).
- Confidence display uses Title Case: `Confidence::Display` now renders as `Low`, `Medium`, `High` (previously lowercase).
- Breaking: Config and data directory changed from `dev.ecpeter23.nyx` to `nyx` (e.g. `~/Library/Application Support/nyx/` on macOS). Existing config files (`nyx.conf`, `nyx.local`) and SQLite indexes at the old path will not be picked up automatically; copy them to the new location or re-run `nyx scan` to regenerate.
- Improved diagnostic output formatting: overhauled console renderer for a professional, security-tool-grade look:
  - Severity is now the strongest visual anchor: HIGH (bold red with ✖), MEDIUM (bold orange ⚠), LOW (muted blue-gray ●). Fewer colors, clearer hierarchy.
  - File paths rendered dim blue (never brighter than severity).
  - Taint flow messages now use a `→` arrow between shortened source/sink instead of backtick-wrapped text.
  - Evidence values (Source, Sink) no longer wrapped in backticks, giving cleaner rendering with no risk of broken backtick spans across wrapped lines.
  - Fixed taint expression rendering: multi-line sink/source call chains are now normalised before display:
    - Whitespace collapsed (`foo() .bar()` → `foo().bar()`).
    - Newlines joined into single-line canonical form.
    - Spacing artefacts between `)` and `.` in method chains cleaned up.
    - Long chains truncated with `…` ellipsis.
  - Added `terminal_size` dependency for terminal-width-aware line wrapping.
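The call-chain normalisation steps listed above can be approximated in a few lines. The function name `normalise_chain` and the `max_chars` parameter are hypothetical; the actual renderer may differ in its truncation rules.

```rust
/// Collapse a multi-line call chain into one line and truncate it for display.
fn normalise_chain(expr: &str, max_chars: usize) -> String {
    // Collapse every whitespace run (including newlines) into a single space.
    let collapsed = expr.split_whitespace().collect::<Vec<_>>().join(" ");
    // Remove the space left between `)` and `.` in method chains.
    let joined = collapsed.replace(") .", ").");
    // Truncate on a character boundary and mark the cut with an ellipsis.
    if joined.chars().count() > max_chars {
        let mut out: String = joined.chars().take(max_chars).collect();
        out.push('…');
        out
    } else {
        joined
    }
}
```

Counting `chars()` rather than bytes keeps the cut from landing inside a multi-byte character.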
terminal_sizedependency for terminal-width-aware line wrapping. - Monotone forward dataflow taint analysis — replaced the BFS taint engine in
taint/mod.rswith a proper worklist-based forward dataflow analysis where termination is guaranteed by lattice finiteness. The genericTransfer<S: Lattice>trait instate/engine.rsnow powers both the resource lifecycle/auth analysis and taint analysis.TaintStatelattice (taint/domain.rs) — bounded abstract state with per-variableVarTaint(Cap bitflags + multi-origin tracking viaSmallVec<[TaintOrigin; 2]>), dual validation bitsets (validated_mustfor intersection/all-paths,validated_mayfor union/any-path), and monotonePredicateSummaryfor contradiction pruning. Variables stored in sortedSmallVeckeyed bySymbolIdfor O(n) merge-join. Lattice height bounded at ~8700 (7-bit Cap × 64 vars + validation bits + predicate bits).TaintTransfer(taint/transfer.rs) — implementsTransfer<TaintState>with identical taint logic to the old BFS (source → propagation → sanitization → sink check). Callee resolution unchanged (local → global same-lang → interop edges). EmitsTaintEvent::SinkReachedevents during Phase 2 of the engine.- JS/TS two-level solve — prevents cross-function taint leakage (the main source of state explosion in the old BFS) while preserving global-to-function flows. Level 1 solves top-level code; Level 2 solves each function seeded with read-only top-level taint via
global_seed. - Monotone predicate tracking — path-sensitivity predicates moved from per-BFS-item
PathState(which duplicated state exponentially) to monotonePredicateSummaryin the lattice. Contradiction pruning usesknown_true & known_falsebit intersection (NullCheck/EmptyCheck/ErrorCheck only), which is both more precise and guaranteed monotone. - Multi-origin tracking — each tainted variable tracks up to 4
TaintOrigin(node +SourceKind), enabling multiple findings when distinct sources flow to the same sink. - Guaranteed termination — no more
MAX_BFS_ITERATIONS/MAX_SEEN_STATESsafety nets needed (though a 100K worklist iteration budget remains as defense-in-depth). Convergence follows from finite lattice height × finite CFG edges. analyse_file()signature unchanged —Findingstruct,Diagconversion, and all callers are unaffected.
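The worklist idea behind this change can be shown on a toy CFG. The sketch below is a generic forward solver over an adjacency list with a bitset taint state. It covers only the fixed-point phase (no event collection), and the `Tainted` type, the `run_forward` signature, and the budget handling are assumptions that do not mirror `state/engine.rs`.

```rust
use std::collections::VecDeque;

trait Lattice: Clone + PartialEq {
    fn bot() -> Self;
    fn join(&self, other: &Self) -> Self;
}

/// Toy taint state: bit i set means variable i may be tainted.
#[derive(Clone, Debug, PartialEq)]
struct Tainted(u64);

impl Lattice for Tainted {
    fn bot() -> Self {
        Tainted(0)
    }
    fn join(&self, other: &Self) -> Self {
        Tainted(self.0 | other.0)
    }
}

/// Iterate to a fixed point. `succs[n]` lists the successors of node `n`.
/// Returns the converged input state of every node, or `None` if the
/// iteration budget is exhausted first.
fn run_forward<S: Lattice>(
    succs: &[Vec<usize>],
    transfer: impl Fn(usize, &S) -> S,
    budget: usize,
) -> Option<Vec<S>> {
    let mut input = vec![S::bot(); succs.len()];
    // Seed with every node so nodes that generate facts from `bot` are visited.
    let mut work: VecDeque<usize> = (0..succs.len()).collect();
    let mut steps = 0;
    while let Some(n) = work.pop_front() {
        steps += 1;
        if steps > budget {
            return None;
        }
        let out = transfer(n, &input[n]);
        for &s in &succs[n] {
            let joined = input[s].join(&out);
            // Re-queue a successor only when its state strictly grows; with a
            // finite-height lattice this can happen only finitely often.
            if joined != input[s] {
                input[s] = joined;
                work.push_back(s);
            }
        }
    }
    Some(input)
}
```

For example, with edges 0→1, 1→2, 1→3, 2→3, 3→1, where node 0 taints a variable and node 2 sanitizes it, the state reaching node 3 stays tainted because the unsanitized branch 1→3 is joined in.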
- Generic dataflow engine (`state/engine.rs`): `run_forward()` and `DataflowResult` are now generic over any `S: Lattice` + `T: Transfer<S>`. `DefaultTransfer` (resource lifecycle) implements `Transfer<ProductState>`; `TaintTransfer` implements `Transfer<TaintState>`. Per-domain iteration budget and `on_budget_exceeded` hooks added.
- `path_state.rs` simplified: removed `PathState`, `Predicate`, `MAX_PATH_PREDICATES`, `state_hash()`, `priority()` structs/methods. Kept `PredicateKind` enum and `classify_condition()` function (used by the new transfer for predicate classification).
- Removed BFS infrastructure: `taint_hash()`, BFS `Item` struct, `pred` predecessor map, two-tier seen-state map, and all bail-out constants (`MAX_BFS_ITERATIONS`=200K, `MAX_SEEN_STATES`=100K, `PATH_SENSITIVITY_NODE_LIMIT`=500, `PATH_SENSITIVITY_QUEUE_LIMIT`=10K, `MAX_PATH_VARIANTS_PER_KEY`=4) are no longer needed and have been removed.
- Severity filtering applied at output stage: `--severity` (and legacy `--high-only`) filtering is now applied ONCE in `scan::handle()` after all severity normalization (nonprod downgrades, dedup, truncation). Previously `--high-only` only filtered AST patterns during analysis; taint and CFG findings bypassed the filter entirely.
- `--format` default is `console`: previously defaulted to empty string, requiring fallback logic.
- All status/progress output goes to stderr: "Checking...", "Finished in...", config notes, and progress bars now use `eprintln!`/stderr exclusively. JSON and SARIF output is stdout-only.
- `Severity::from_str` returns `Err` for unknown values: previously returned `Ok(Severity::Low)` for any unrecognized input.
- Deprecated CLI flags preserved as hidden aliases: `--high-only`, `--no-index`, `--rebuild-index`, `--ast-only`, `--cfg-only`, and `--include-nonprod` are hidden from help but still functional, mapping to their canonical replacements.
--high-only,--no-index,--rebuild-index,--ast-only,--cfg-only, and--include-nonprodare hidden from help but still functional, mapping to their canonical replacements. - Path-sensitive taint analysis -- the BFS taint engine now carries a
PathState(bounded set of branch predicates) alongside the taint map. When the BFS traverses a True or False edge from anIfnode, it records aPredicatewith the condition's variables, kind, and polarity. This enables two new capabilities:- Infeasible path pruning -- paths with contradictory predicates (e.g.
if x.is_none() { return; } if x.is_none() { sink }) are detected and pruned, eliminating false positives on code guarded by redundant null/empty/error checks. Contradiction detection is conservative: only whitelisted kinds (NullCheck,EmptyCheck,ErrorCheck) with single-variable predicates are pruned. - Validation guard annotation -- when all tainted variables reaching a sink are guarded by a
ValidationCallpredicate (e.g.if validate(&x) { sink }orif !validate(&x) { return; } sink), the finding is annotated withpath_validated: trueandguard_kind: ValidationCall. This metadata is surfaced in JSON and console output without changing severity.
- Infeasible path pruning -- paths with contradictory predicates (e.g.
- Condition metadata on CFG nodes -- `NodeInfo` now carries `condition_text`, `condition_vars`, and `condition_negated` for `If` nodes, extracted during CFG construction. Negation detection handles `!expr`, `not expr`, and Ruby `unless`. Classification of condition text into `PredicateKind` (NullCheck, EmptyCheck, ErrorCheck, ValidationCall, SanitizerCall, Comparison, Unknown) is conservative: call-based kinds require `(` in the text and a matching callee token.
- `path_validated` and `guard_kind` fields on `Diag` -- taint findings carry path-sensitivity metadata in JSON output (fields omitted when not set) and console output (suffix line `Path guard: ValidationCall` when present). Finding IDs are unchanged for dedup stability.
- `smallvec` dependency -- used for inline-allocated predicate storage in `PathState` (avoids heap allocation for the common case of ≤4 predicates per path).
- Interprocedural call graph -- a whole-program `CallGraph` (`petgraph::DiGraph<FuncKey, CallEdge>`) is now built between Pass 1 and Pass 2 of every taint-enabled scan. Each function definition is a node; resolved callee relationships are edges. The graph is constructed from the merged `GlobalSummaries` and is available in both the filesystem and indexed scan paths.
- Three-valued callee resolution -- `CalleeResolution` enum distinguishes `Resolved(FuncKey)`, `NotFound`, and `Ambiguous(Vec<FuncKey>)`. Ambiguous callees (same name in multiple namespaces, caller in a third namespace) are tracked separately from missing callees for diagnostics.
- Shared resolution helper -- `GlobalSummaries::resolve_callee_key()` centralizes same-language callee resolution with arity-aware filtering and namespace disambiguation. Both the call graph builder and the taint engine now use the same resolution logic.
- Callee-name normalization -- `normalize_callee_name()` extracts the last segment from qualified callee text (`"env::var"` → `"var"`, `"obj.method"` → `"method"`) before resolution. The raw call-site text is preserved on graph edges for diagnostics.
- SCC / topological analysis -- `CallGraphAnalysis` computes strongly connected components via Tarjan's algorithm and exposes a callee-first (leaves-first) topological ordering of SCC indices, ready for future bottom-up taint propagation.
- Call graph tracing -- `tracing::info!` log with node count, edge count, unresolved-not-found count, unresolved-ambiguous count, and SCC count is emitted after every call graph build.
- 8 new path-sensitivity integration tests: early-return validation guard, failed-validation branch, contradictory null-check pruning, if/else validation annotation, sanitize-one-branch regression, path-state budget graceful degradation, unknown-predicate non-pruning, multi-var non-pruning.
- 35 new unit tests in `taint::path_state`: `classify_condition` variants, `PathState` push/truncation, contradiction detection (whitelisted kinds, single-var only), `has_validation_for` semantics, `state_hash` determinism, priority ordering.
- 11 new unit tests: callee normalization, same-name-different-namespaces resolution, cross-language isolation, arity separation, recursive SCC detection, not-found vs ambiguous diagnostics, diamond topo ordering, interop edge resolution, namespace normalization consistency, and raw call-site preservation.
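The callee normalization and three-valued resolution entries above can be sketched together. This is a simplified guess at the shape of the logic: the `FuncKey` fields, the `resolve` function, and the rule that prefers the caller's own namespace are assumptions, and `GlobalSummaries::resolve_callee_key()` may resolve differently.

```rust
/// Hypothetical function identity used only for this sketch.
#[derive(Clone, Debug, PartialEq)]
struct FuncKey {
    namespace: String,
    name: String,
    arity: usize,
}

#[derive(Debug, PartialEq)]
enum CalleeResolution {
    Resolved(FuncKey),
    NotFound,
    Ambiguous(Vec<FuncKey>),
}

/// Keep only the last segment of a qualified callee: "env::var" -> "var".
fn normalize_callee_name(raw: &str) -> &str {
    raw.rsplit(|c: char| c == ':' || c == '.').next().unwrap_or(raw)
}

/// Resolve a call site against all known functions of the same language.
fn resolve(all: &[FuncKey], caller_ns: &str, raw_callee: &str, arity: usize) -> CalleeResolution {
    let name = normalize_callee_name(raw_callee);
    // Arity-aware filtering on the normalized name.
    let mut candidates: Vec<FuncKey> = all
        .iter()
        .filter(|k| k.name == name && k.arity == arity)
        .cloned()
        .collect();
    // Assumption: a definition in the caller's own namespace wins outright.
    if let Some(local) = candidates.iter().find(|k| k.namespace == caller_ns) {
        return CalleeResolution::Resolved(local.clone());
    }
    match candidates.len() {
        0 => CalleeResolution::NotFound,
        1 => CalleeResolution::Resolved(candidates.remove(0)),
        _ => CalleeResolution::Ambiguous(candidates),
    }
}
```

Keeping `NotFound` and `Ambiguous` as separate variants is what allows the unresolved-not-found and unresolved-ambiguous counts to be reported independently.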
- Edge-aware taint traversal -- `analyse_file()` now uses `cfg.edges(node)` instead of `cfg.neighbors(node)`, inspecting `EdgeKind` on each edge. This is required for predicate recording but also makes the taint engine aware of the CFG's branch structure for the first time.
- Two-tier seen-state deduplication -- the BFS seen-state map changed from `HashSet<(NodeIndex, u64)>` to a `HashMap` keyed by `(NodeIndex, taint_hash)` mapping to a bounded list of `(path_hash, priority)` pairs. At most `MAX_PATH_VARIANTS_PER_KEY` (4) path variants are tracked per taint state, with deterministic eviction preferring non-truncated states with fewer predicates.
- Finding deduplication -- taint findings are now deduplicated by `(sink, source)` pair after analysis, preferring findings with `path_validated = true` (most informative metadata).
- `taint::Finding` struct -- added `path_validated: bool` and `guard_kind: Option<PredicateKind>` fields. Code that constructs `Finding` directly must include these fields.
- `Diag` struct -- added `path_validated: bool` and `guard_kind: Option<String>` fields. Both use `#[serde(skip_serializing_if)]` to omit from JSON when not set.
- `taint::resolve_callee()` refactored -- the global resolution step now delegates to `GlobalSummaries::resolve_callee_key()` and applies `normalize_callee_name()` before lookup, unifying resolution logic with the call graph builder.
- Label rules expanded across 8 languages:
  - Go: added `r.URL.Query`, `r.URL.Query.Get`, `Request.FormValue`, `Request.URL` sources; `filepath.Clean`/`filepath.Base` sanitizers; `fmt.Fprintf`/`fmt.Sprintf`/`fmt.Printf` format-string sinks; `os.Open`/`os.OpenFile`/`os.Create`/`ioutil.ReadFile`/`os.ReadFile` FILE_IO sinks; `template.HTML` HTML sink; `db.QueryRow`/`db.Prepare` SQL sinks.
  - PHP: sources now match both `$_GET` and `_GET` (without `$` prefix, matching `collect_idents` stripping); added `$_FILES`/`_FILES`, `$_SERVER`/`_SERVER`, `$_ENV`/`_ENV` sources; `eval`/`assert` shell sinks; `include`/`include_once`/`require`/`require_once` FILE_IO sinks; `unserialize` sink; `move_uploaded_file`/`copy`/`file_put_contents`/`fwrite` FILE_IO sinks; `basename` FILE_IO sanitizer; `query` SQL sink.
  - Java: added `readObject`/`readLine` sources; `ProcessBuilder` shell sink; `Class.forName` reflection sink; `println`/`print`/`write` HTML sinks.
  - Python: added `send_file`/`send_from_directory` FILE_IO sinks; `os.path.realpath` FILE_IO sanitizer; `open` changed from source to FILE_IO sink (fixes source/sink conflict for path traversal detection).
  - Ruby: `params` source detection now works via subscript handling.
  - Rust: added `fs::read_to_string`/`fs::write`/`fs::read`/`File::open`/`File::create` as FILE_IO sinks; `fs::read_to_string` removed from sources (was source/sink conflict).
  - C/C++: added `fopen`/`open` as FILE_IO sinks.
- Ruby `rb.cmdi.system_interp` pattern broadened: no longer requires string interpolation in arguments; now matches any `system`/`exec` call, promoted from Tier B to Tier A.
- C++ `cpp.cmdi.popen` pattern added: `popen()` command execution detection for C++, using the language-namespaced ID (the C pattern retains `c.cmdi.popen`).
- Test config enables state analysis: `test_config()` now sets `enable_state_analysis = true`.
Fixed
- Taint source kind misclassified as "unknown" for non-call sources: source-bearing nodes with `CallWrapper` or `Assignment` kind (e.g. `userInput = req.query.data`) had their `callee` field set to `None` because the CFG builder only populated `callee` for `StmtKind::Call` nodes. This caused `infer_source_kind()` to receive an empty string, failing to match any keyword pattern and defaulting to `SourceKind::Unknown`. Fixed by also setting `callee` when a label (Source/Sink/Sanitizer) is detected, so the extracted member text (e.g. "req.query") flows through to source kind inference. Affects severity classification and diagnostic output for property-access sources across all languages.
- Full KINDS map audit across all 10 languages: 89 missing tree-sitter node types added to KINDS maps so the CFG builder no longer silently drops code inside switch/case, try/catch/finally, class bodies, closures/lambdas, and other container nodes. Previously, any node not in a language's KINDS map hit the `build_sub` fallback, which created a terminal Seq node without recursing into children, effectively making all wrapped code invisible to analysis.
  - C (+3): `switch_statement`, `case_statement`, `labeled_statement`
  - C++ (+7, 1 fix): `switch_statement`, `case_statement`, `labeled_statement`, `throw_statement` (Return), `try_statement`, `catch_clause`, `lambda_expression`; critical fix: `namespace_definition` changed from `Trivia` to `Block` (all function definitions inside namespaces were silently dropped)
  - Java (+11): `do_statement` (While), `throw_statement` (Return), `switch_expression`, `switch_block`, `switch_block_statement_group`, `try_statement`, `catch_clause`, `finally_clause`, `lambda_expression`, `constructor_body`, `static_initializer`
  - JavaScript (+11): `switch_statement`, `switch_body`, `switch_case`, `switch_default`, `try_statement`, `catch_clause`, `finally_clause`, `class_declaration`, `class` (expression), `class_body`, `export_statement`
  - TypeScript (+13): all JS switch/try/class entries plus `abstract_class_declaration`, `export_statement`, `enum_declaration` (Trivia)
  - PHP (+11): `do_statement` (While), `throw_expression` (Return), `switch_statement`, `switch_block`, `case_statement`, `default_statement`, `try_statement`, `catch_clause`, `finally_clause`, `colon_block`, `class_declaration`
  - Python (+7): `try_statement`, `except_clause`, `finally_clause`, `class_definition`, `decorated_definition`, `match_statement`, `case_clause`
  - Ruby (+11): `until` (While), `begin`, `rescue`, `ensure`, `case`, `when`, `class`, `module`, `singleton_method` (Function), `do`, `block`
  - Go (+10): `expression_switch_statement`, `type_switch_statement`, `expression_case`, `type_case`, `default_case`, `select_statement`, `communication_case`, `go_statement`, `defer_statement`, `func_literal` (Function)
  - Rust (+5, 1 removal): `closure_expression`, `async_block`, `impl_item`, `trait_item`, `declaration_list`; removed dead `loop_statement` entry (node doesn't exist in tree-sitter-rust 0.24.0)
- Removed unused `Kind::LoopBody` enum variant from `labels/mod.rs` (no arm in `build_sub`, last reference was the dead Rust `loop_statement` entry)
- CFG: `else_clause` not recursed into for C/C++: tree-sitter's C and C++ grammars wrap else bodies in an `else_clause` node. This node was missing from both languages' `KINDS` maps, so the CFG builder's fallback arm treated it as a terminal `Seq` node without descending into children. All statements inside else blocks (e.g. `fclose(f)`) were silently dropped from the CFG, causing false-positive resource leak findings and incorrect branch analysis. Fixed by mapping `"else_clause" => Kind::Block` in `src/labels/c.rs` and `src/labels/cpp.rs`.
- CFG: `else_clause` missing from Rust, JavaScript, TypeScript, Python, PHP KINDS maps: same bug class as C/C++: tree-sitter wraps else bodies in an `else_clause` node that was not in KINDS, silently dropping all code inside else blocks from the CFG. Fixed by mapping `"else_clause" => Kind::Block` in all five languages. Also added `"elif_clause" => Kind::Block` (Python), `"else_if_clause" => Kind::Block` (PHP), and `"elsif" => Kind::If` (Ruby) to handle chained elif/elsif nodes.
- Rust KINDS using wrong tree-sitter node names: tree-sitter-rust uses `_expression` suffixes (not `_statement`) for `while`, `for`, and `return` nodes. The existing `while_statement`, `for_statement`, and `return_statement` entries were dead code (0 grammar matches). Added `while_expression`, `for_expression`, and `return_expression` mappings.
- Rust `match_expression`, `match_block`, `match_arm`, `unsafe_block` missing from KINDS: these wrapper nodes were not mapped, causing all code inside match arms and unsafe blocks to be silently dropped from the CFG. Mapped to `Kind::Block` for sequential traversal.
- TypeScript missing `throw_statement` and `do_statement`: `throw` was mapped in JavaScript but not TypeScript; `do_statement` (do-while loops) was missing from both JS and TS. Added `"throw_statement" => Kind::Return` and `"do_statement" => Kind::While` to both languages.
- Python `raise_statement` and `with_statement` missing from KINDS: `raise` terminates the current path (mapped to `Kind::Return`); `with` wraps code in a context manager (mapped to `Kind::Block`). Both were silently dropping enclosed code.
- Dead KINDS entries removed: `"for_of_statement"` in TypeScript (0 grammar matches; TS inherits `for_in_statement` from JS) and `"method_call"` in Ruby (0 grammar matches; Ruby only has `call`).
- `--high-only` emitting Low/Medium taint and CFG findings: severity filter was only applied to AST pattern queries during analysis. Taint findings (whose severity derives from `SourceKind`) and CFG structural findings passed through unfiltered. The filter is now applied at the final output stage after all severity normalization, ensuring `--severity HIGH` never emits downgraded Medium/Low findings.
- JSON/SARIF output contaminated with status messages on stdout: status messages ("Checking...", "Finished in...") used `println!` and appeared in stdout alongside machine output. Now all status goes to stderr.
- CFG: False edge to then-block exits in no-else if statements -- previously, `if (cond) { body }` without an else block created a `False` edge from the condition node directly to the then-block's exit nodes. This made the false path appear to traverse the then-block, causing incorrect predicate polarity in path-sensitive analysis and duplicate taint findings with contradictory metadata. The CFG now creates a synthetic pass-through `Seq` node for the false path with an explicit `False` edge from the condition, correctly modeling "skip the then-block." This also fixes the frontier: previously, the no-else non-terminating case duplicated `then_exits` in the frontier (`then_exits ++ then_exits.clone()`); it now correctly produces `then_exits ∪ [pass_through]`.
- Taint BFS non-termination on large JS files: the BFS taint engine in
taint/mod.rshad no global iteration bound. The seen-state deduplication keyed on(node, taint_hash), so every distinct taint map at a CFG node was treated as a novel state. In files with loops and many tainted variables (e.g. a 2,200-line JS file with 18+ top-level variables tainted viawindow.location.search), each loop iteration produced a slightly different taint map, causing the BFS to revisit loop bodies indefinitely. Both--no-indexand--rebuild-indexscans hung near completion (progress showed e.g. 87/88 files). Fixed by adding two hard bounds:MAX_BFS_ITERATIONS(200,000 queue pops) andMAX_SEEN_STATES(100,000 unique(node, taint_hash)entries in the seen-state map). When either limit is reached the analysis bails out gracefully and returns all findings collected so far. Atracing::warn!is emitted on iteration-limit bail-out. Normal files are unaffected (typical BFS uses <1,000 iterations). - Rust
if let/while lettaint propagation — the CFG builder now extracts pattern bindings fromlet_conditionnodes as variable definitions indef_use(), and classifies the value expression (e.g.env::var("CMD")) for source/sink labels inpush_node(). Previously,if let Ok(cmd) = env::var("CMD") { Command::new("sh").arg(&cmd) }produced no taint finding becausecmdwas never recognized as a tainted definition. Now correctly detects taint flow throughif letandwhile letbindings. - C++
popenpattern ID collision — renamedc.cmdi.popentocpp.cmdi.popenin C++ patterns to fix a cross-language duplicate ID that causedall_pattern_ids_are_globally_uniquetest failure. - State analysis early-return leak duplication —
extract_findingsinstate/facts.rsnow skips early-return nodes when checking for resource leaks, only inspecting the synthesized function exit node. Previously, early-return nodes with path-specific state (OPEN only) emittedstate-resource-leakalongside the correctstate-resource-leak-possiblefrom the merged exit state. - Severity filter bug —
min_severitycomparison inast.rswas inverted (<=instead of>), causing all AST patterns at the minimum severity level to be silently dropped. With the defaultmin_severity = Low, all Low-severity patterns (.unwrap(),.expect(),panic!,todo!,mem::forget, Go crypto patterns, narrow casts) were never reported. Fixed 29 test cases. - Nested function analysis — CFG builder now recurses into function expressions passed as call arguments (e.g., Express
app.get('/path', function(req, res) { ... }), Sinatraget '/path' do...end). Addedcollect_nested_function_nodes()to discoverKind::Functionnodes insideCallWrapper/CallFnAST subtrees. Also addedfunction_expressionto JS/TS KINDS maps, anddo_block/blockasKind::Functionin Ruby for Sinatra/Rails blocks. Anonymous functions now get unique names (<anon@{offset}>) to prevent scope collisions in JS two-level taint solve. - Chained method call classification —
classify()now normalizes chained calls liker.URL.Query().Getby stripping internal()between.segments, producingr.URL.Query.Get. Suffix matching is attempted against both the original head and the normalized form, fixing Go HTTP handler source detection and similar patterns. - Subscript access source detection —
first_member_labelandfirst_member_textnow handlesubscript_expression,subscript, andelement_referencenodes, enabling source classification for PHP$_GET['cmd'], Rubyparams[:cmd], and Pythonos.environ['KEY']. - Return-statement call extraction —
Kind::Returnadded to the node types that extract inner call identifiers viafirst_call_ident, fixing cases likereturn send_file(path)where the sink was not classified. - Nested call classification — new
find_classifiable_inner_call()tries all nested calls when the outermost one doesn't classify, fixingstr(eval(expr))whereevalis a sink wrapped in a non-sink call. - Java
newexpression text extraction — addedtypefield fallback inpush_nodeandfirst_call_identforCallFnnodes, fixingnew ProcessBuilder(...)not matching as a sink. - Function body lookup for anonymous functions —
Kind::Functionhandler now falls back to finding aKind::Blockchild whenchild_by_field_name("body")returns None, supporting JS/TS anonymous function expressions and Ruby blocks. - Function-level resource leak detection —
extract_findingsinstate/facts.rsnow inspects per-function Return nodes for leaked resources, not just the file-level Exit node. Previously, variables from one function could be overwritten by same-named variables in subsequent functions, masking leaks. - Use-after-free for memory functions — added
strcpy,strncpy,memcpy,memmove,memset,memcmp,strcmp,strncmp,strlen,sprintf,snprintftoRESOURCE_USE_PATTERNSin state analysis, enabling use-after-free detection for common C/C++ string and memory functions.
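
The taint BFS bounds described in the Fixed list above can be illustrated with a minimal sketch. Only the two constant names and their values come from the entry; the `BfsOutcome` type, the `bounded_bfs` function, and the `successors` closure are hypothetical stand-ins, not the code in `taint/mod.rs`.

```rust
use std::collections::{HashSet, VecDeque};

/// Limits quoted in the changelog entry.
pub const MAX_BFS_ITERATIONS: usize = 200_000;
pub const MAX_SEEN_STATES: usize = 100_000;

/// Hypothetical result type: states processed so far, plus a bail-out flag.
pub struct BfsOutcome {
    pub visited: Vec<(usize, u64)>,
    pub bailed_out: bool,
}

/// Breadth-first walk over (node, taint_hash) states that stops once either
/// bound is reached and returns whatever was collected up to that point.
/// `successors` is an assumed stand-in for the real transfer function.
pub fn bounded_bfs<F>(
    start: (usize, u64),
    max_iterations: usize,
    max_seen: usize,
    successors: F,
) -> BfsOutcome
where
    F: Fn((usize, u64)) -> Vec<(usize, u64)>,
{
    let mut seen: HashSet<(usize, u64)> = HashSet::new();
    let mut queue: VecDeque<(usize, u64)> = VecDeque::new();
    let mut visited = Vec::new();
    let mut iterations = 0usize;

    seen.insert(start);
    queue.push_back(start);

    while let Some(state) = queue.pop_front() {
        iterations += 1;
        // Bail out gracefully instead of looping forever on ever-changing taint maps.
        if iterations > max_iterations || seen.len() >= max_seen {
            return BfsOutcome { visited, bailed_out: true };
        }
        visited.push(state);
        for next in successors(state) {
            // Only enqueue states that have not been seen before.
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    BfsOutcome { visited, bailed_out: false }
}
```

A caller would pass `MAX_BFS_ITERATIONS` and `MAX_SEEN_STATES` as the two limits and emit a warning when `bailed_out` is set.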
[0.3.0] - 2026-02-25
Added
- Configurable analysis rules -- users can define custom sources, sanitizers, and sinks per language via TOML config (`nyx.local`) or the new `nyx config` CLI. Config rules take priority over built-in rules, so project-specific sanitizers like `escapeHtml()` are recognized without code changes.
- `nyx config` CLI subcommand with four actions:
  - `show` -- print effective merged configuration as TOML
  - `path` -- print config directory path
  - `add-rule --lang <LANG> --matcher <NAME> --kind <KIND> --cap <CAP>` -- append a label rule to `nyx.local`
  - `add-terminator --lang <LANG> --name <NAME>` -- append a terminator function to `nyx.local`
- `--include-nonprod` CLI flag -- by default, findings in non-production paths (tests, vendor, benchmarks, examples, fixtures, build scripts, `*.min.js`) are now downgraded by one severity tier (High→Medium, Medium→Low). Pass `--include-nonprod` to restore original severity. Controlled by `scanner.include_nonprod` config key.
- `SourceKind` enum in the taint engine -- taint findings now carry a `source_kind` field (`UserInput`, `EnvironmentConfig`, `FileSystem`, `Database`, `Unknown`) inferred from the source callee name and capabilities. Severity is based on source kind rather than hardcoded to High: filesystem and database sources produce Medium, user input and environment sources produce High.
- Configurable terminators -- functions like `process.exit()` can be declared as terminators per language; the CFG treats them as dead ends, preventing false positives on code after termination calls.
- Event handler callback suppression -- functions passed as arguments to configured event handler calls (e.g. `addEventListener`) are no longer flagged as unreachable code.
- Exec-path guard rules -- calls to `which`, `resolve_binary`, `find_program`, `lookup_path`, and `shutil.which` are recognized as guards for `SHELL_ESCAPE` sinks. If such a guard dominates a shell-exec sink, the `cfg-unguarded-sink` finding is suppressed.
- One-hop constant binding trace -- the constant-arg sink suppression now traces one hop through the CFG. If a sink's variable was defined by a node with no uses and no Source label, it is treated as constant. Fixes false positives on patterns like `cmd = "git"; subprocess.run([cmd, "status"])`.
- Evidence-based severity in cfg-only mode -- when taint analysis is not active (no global summaries and no taint findings), structural `cfg-unguarded-sink` findings without source-derived evidence are downgraded from Medium to Low.
- FileResponse ownership transfer -- file handles passed to consuming sinks (`FileResponse`, `StreamingHttpResponse`, `send_file`, `make_response`) are no longer flagged as resource leaks.
- Lock-not-released refinement -- mutex findings now require an explicit `.acquire()` or `.lock()` call on the acquired variable. Constructor-only patterns like `lock = threading.Lock()` without acquire no longer produce `cfg-lock-not-released`.
- Python `connect`/`cursor` exclusions -- `signal.connect`, `event.connect`, and `.register` are excluded from the Python db-connection acquire pattern, preventing false `cfg-resource-leak` findings on Django signal handlers and event registrations.
- `location.href` sink rules for JavaScript -- `location.href`, `window.location.href`, and `document.location.href` assignments are classified as `Sink(URL_ENCODE)`.
- `throw_statement` as terminator in JavaScript -- `throw` now terminates the current block in the CFG (mapped to `Kind::Return`), preventing false `cfg-error-fallthrough` findings after throw statements.
- `Cap::FMT_STRING` capability bit -- new bitflag (`0b0100_0000`) for format-string vulnerabilities, distinct from HTML injection. Sources using `Cap::all()` automatically match.
- Python taint sources -- `open`, `argparse.parse_args`, `urllib.request.urlopen`, `requests.get`, `requests.post` added as `Cap::all()` sources for broader attack-surface coverage.
- SARIF 2.1.0 output format (`-f sarif`) -- produces spec-compliant Static Analysis Results Interchange Format JSON on stdout. Includes tool metadata, deduplicated rule definitions with descriptions, severity-to-level mapping (High→`error`, Medium→`warning`, Low→`note`), and physical locations with relative paths. Suitable for GitHub Code Scanning, Azure DevOps, and other SARIF-consuming CI tools.
- Progress bars via `indicatif` -- file discovery, Pass 1, and Pass 2 each display a progress bar on stderr with file counts and ETA. Bars are automatically hidden when output format is `json`/`sarif` or quiet mode is enabled. Index building also shows progress.
- Quiet mode (`output.quiet = true`) -- suppresses all status messages (config notes, "Checking...", "Finished in...") on stderr. Useful for CI pipelines and scripted invocations.
- Resource leak detection for Python, Ruby, PHP, JavaScript, and TypeScript -- new acquire/release pairs: Python (`open`/`.close`, `socket`/`.close`, `connect`/`.close`, `threading.Lock`/`.release`), Ruby (`File.open`/`.close`, `TCPSocket.new`/`.close`, `.lock`/`.unlock`), PHP (`fopen`/`fclose`, `mysqli_connect`/`mysqli_close`, `curl_init`/`curl_close`), JS/TS (`fs.open`/`fs.close`, `createReadStream`/`.close`).
- Walker config wired up -- `performance.max_depth`, `scanner.one_file_system`, `scanner.require_git_to_read_vcsignore`, and `scanner.excluded_files` are now enforced during directory walking (previously parsed but ignored).
- `database.vacuum_on_startup` -- when enabled, runs SQLite VACUUM before indexed scans to reclaim space.
- 31 new unit tests covering config round-trip, rule merging, classify extension, href classification, throw termination, terminator detection, config sanitizer suppression, Python/C++ precision, unreachable+unguarded dedup, resource leak detection, one-hop constant binding, exec-path guards, cfg-only severity downgrade, FileResponse ownership, lock constructor suppression, signal.connect exclusion, nonprod path detection, and severity downgrade.
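
As a rough illustration of the `SourceKind`-based severity and the non-production downgrade listed above, the mapping could look like the following sketch. The variant names and the High/Medium assignments come from the entries; the `Severity` enum, both function names, and the choice to leave Low unchanged on downgrade are assumptions.

```rust
/// Source kinds named in the changelog entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceKind {
    UserInput,
    EnvironmentConfig,
    FileSystem,
    Database,
    Unknown,
}

/// Hypothetical severity type used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Severity derived from the source kind instead of a hardcoded High.
pub fn severity_for(kind: SourceKind) -> Severity {
    match kind {
        SourceKind::UserInput | SourceKind::EnvironmentConfig | SourceKind::Unknown => {
            Severity::High
        }
        SourceKind::FileSystem | SourceKind::Database => Severity::Medium,
    }
}

/// One-tier downgrade for findings in non-production paths.
/// Assumption: Low has no lower tier and stays Low.
pub fn downgrade_nonprod(severity: Severity) -> Severity {
    match severity {
        Severity::High => Severity::Medium,
        Severity::Medium | Severity::Low => Severity::Low,
    }
}
```

With `--include-nonprod`, a scanner built this way would simply skip the `downgrade_nonprod` step.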
Changed
- `taint::Finding` struct -- added `source_kind: SourceKind` field. Code that constructs `Finding` directly must include this field.
- `AnalysisContext` struct -- added `taint_active: bool` and `analysis_rules` fields. Code that constructs `AnalysisContext` directly must include these fields.
- `ScannerConfig` struct -- added `include_nonprod: bool` field (default `false`). Deserialization is unaffected due to `#[serde(default)]`.
- `proto_pollution` AST pattern severity -- downgraded from High to Low. The AST-only pattern is a structural indicator; the taint engine separately produces High findings when attacker-controlled data flows to `__proto__`.
- `location_href_assignment` AST pattern -- constrained to require a known browser global object (`window`, `location`, `document`, `self`, `top`, `parent`, `frames`). Prevents `el.href = val` from matching; only `window.location.href = val` and similar patterns trigger the finding.
- Taint finding severity -- no longer hardcoded to High. Severity is now derived from `SourceKind`: UserInput/EnvironmentConfig/Unknown → High, FileSystem/Database → Medium.
- C/C++ sink reclassification -- `printf`/`fprintf` moved from `Sink(HTML_ESCAPE)` to `Sink(FMT_STRING)`. `std::cout`, `std::cerr`, `std::clog` removed from sinks entirely (output/logging, not injection vectors). `sprintf`/`strcpy`/`strcat` remain `Sink(HTML_ESCAPE)`.
- `classify()` now accepts an optional `extra: Option<&[RuntimeLabelRule]>` parameter; config-defined rules are checked first (higher priority) before built-in static rules.
- `build_cfg()`, `build_sub()`, and `push_node()` accept optional `LangAnalysisRules` for config-driven label classification, terminator detection, and event handler awareness.
- `find_guard_nodes()` and `is_guard_call()` now recognize config-defined sanitizers as guards with matching capability bits.
- `merge_configs()` union-merges analysis rules, terminators, and event handlers per language key with dedup.
- Assignment LHS classification now tries the full member expression text (e.g. `location.href`) before falling back to property-only (e.g. `innerHTML`), fixing false positives on `a.href` assignments.
- `handle_command()` now receives `config_dir` to support the `config` subcommand.
- Fused single-pass analysis -- AST-only mode now runs a single fused pass (`analyse_file_fused`) that parses each file and builds the CFG once, producing both function summaries and diagnostics. Previously every file was parsed twice (once for summary extraction, once for analysis). Taint mode uses the fused pass for Pass 1, eliminating redundant CFG construction during summary extraction.
- O(N²) → O(N) function-level dataflow sweep in CFG builder -- the light-weight dataflow sweep and return-node wiring in `build_sub` for `Kind::Function` now iterate only over nodes created within the current function scope (tracked via a snapshot of the node count) instead of scanning the entire graph. Eliminates quadratic scaling in files with many functions.
- Parallel summary merging -- `scan_filesystem` now uses rayon `fold`/`reduce` to build per-thread `GlobalSummaries` maps in parallel, then merges them in a binary reduce tree. Eliminates the serial `merge_summaries` bottleneck. Added `GlobalSummaries::merge()`.
- Redundant file I/O eliminated in indexed path -- files are now read once and hashed once per scan. Added `Indexer::should_scan_with_hash()` and `Indexer::upsert_file_with_hash()` to accept pre-computed hashes. Pass 2 uses `run_rules_on_bytes` with already-read bytes instead of re-reading from disk. Previously files could be read up to 4 times and hashed up to 3 times per indexed scan.
- SQLite mutex mode relaxed -- switched from `SQLITE_OPEN_FULL_MUTEX` (global serialization) to `SQLITE_OPEN_NO_MUTEX`. The r2d2 connection pool guarantees one-connection-per-thread safety; combined with WAL mode this allows concurrent readers without a global lock.
- Parallel JSON deserialization in `load_all_summaries` -- for large result sets (>256 summaries), JSON deserialization is now parallelized with rayon.
- Zero-allocation taint hashing -- `taint_hash()` replaced sorted-`Vec` + blake3 with an order-independent XOR-of-FNV scheme. Eliminates a heap allocation and sort per BFS edge in the taint engine.
- In-place taint transfer -- `apply_taint()` now mutates the taint map in place instead of cloning and returning a new `HashMap` per node visit. The BFS loop caches hash values and uses `std::mem::take` for the last successor to avoid unnecessary clones.
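
The order-independent XOR-of-FNV idea behind the taint hashing change can be sketched as follows. This is an illustration under assumptions, not the project's `taint_hash()`: the map shape (variable name to capability bits) and the use of 64-bit FNV-1a per entry are guesses made for the example.

```rust
use std::collections::HashMap;

// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a over `bytes`, continuing from `seed`.
fn fnv1a(seed: u64, bytes: &[u8]) -> u64 {
    let mut hash = seed;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hash a taint map without sorting or allocating: each entry is hashed on
/// its own and the per-entry hashes are combined with XOR, which is
/// commutative, so the map's iteration order does not affect the result.
/// Assumed map shape: variable name -> capability bits.
pub fn taint_hash(taint: &HashMap<String, u8>) -> u64 {
    let mut combined = 0u64;
    for (name, caps) in taint {
        let entry_hash = fnv1a(fnv1a(FNV_OFFSET, name.as_bytes()), &[*caps]);
        combined ^= entry_hash;
    }
    combined
}
```

XOR combination is cheap but weaker against collisions than hashing a sorted sequence, which is a trade-off this kind of scheme accepts for speed.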
Fixed
- False positives on one-hop constant bindings -- `cmd = "git"; Command::new(cmd)` no longer triggers `cfg-unguarded-sink` because the variable is traced back to a constant definition.
- False positives from exec-path guards -- `resolve_binary(&bin); Command::new(bin)` is now recognized as guarded.
- False `cfg-resource-leak` on Django signal handlers -- `signal.connect(handler)` no longer matches the Python db-connection acquire pattern.
- False `cfg-lock-not-released` on Lock constructors -- `threading.Lock()` without `.acquire()` no longer produces a finding.
- False `cfg-resource-leak` on FileResponse -- `f = open(...); return FileResponse(f)` is recognized as ownership transfer.
- Inflated severity in cfg-only mode -- structural findings without taint evidence now correctly produce Low severity instead of Medium.
- `el.href = val` false positive in AST patterns -- the `location_href_assignment` pattern now requires a known browser global, eliminating matches on DOM element `.href` assignments.
- Structured output modes (`-f json`, `-f sarif`) now produce zero stderr noise -- config notes, "Checking …", and "Finished in …" messages are fully suppressed (not just redirected to stderr) so that `nyx scan -f json | jq` and CI SARIF upload work without extraneous output. Human-readable console format continues to show status messages.
- Console output column alignment -- severity tags are now bracketed and padded to a fixed display width (`[HIGH]`, `[MEDIUM]`, `[LOW]`) so that rule IDs align consistently regardless of severity. ANSI color codes are applied after width calculation, not before.
- `.href` false positives -- `el.href = "/about"` no longer triggers `location_href_assignment` or sink classification; only `location.href` (and `window.location.href`, `document.location.href`) match.
- Constant-arg sink false positives -- sinks whose arguments are all constants (no variable uses beyond the callee name) with no taint confirmation are now suppressed. Fixes false positives on patterns like `subprocess.run(["make","clean"])` and `printf("hello\n")`.
- Unreachable + unguarded dedup -- when both `cfg-unreachable-sink` and `cfg-unguarded-sink` fire on the same span, the unguarded finding is suppressed (unreachable is more specific).
- `std::cout` false positives -- `std::cout` no longer classified as a sink, eliminating spurious findings on every C++ iostream print.
- Break/continue scope correctness -- `break` and `continue` inside loops now correctly wire to their enclosing loop header/exit. Previously, `break` in a `while`/`for` body created a dead-end node that left post-loop code unreachable, producing false `cfg-unreachable-*` findings. The If handler's no-else case also now correctly flows the false branch to subsequent code when the then-branch terminates (return/break/continue). True/False edge labels are applied to branch entry nodes rather than exit nodes, fixing `cfg-error-fallthrough` false positives on `if (err) { return; }` patterns.
- Preprocessor dangling-else CFG recovery -- `#ifdef`/`#endif` blocks that split an `if`/`else` across preprocessor boundaries no longer orphan subsequent code. The CFG block handler now recovers the frontier after preprocessor nodes, preventing false unreachable-code findings on code following `#ifdef ... #endif` blocks.
- Wrapper resource function recognition -- `curlx_fopen`, `curlx_fdopen`, `fdopen`, and `curlx_fclose` are now recognized as acquire/release functions for C file handles, eliminating false `cfg-resource-leak` findings on codebases (e.g. curl) that use wrapper functions around standard I/O.
- `freopen` false positive -- `freopen()` (and `curlx_freopen`) no longer triggers `cfg-resource-leak` findings. Previously `freopen` matched the `fopen` acquire pattern via `ends_with`; a new `exclude_acquire` field on `ResourcePair` filters out these false matches for both the file handle and file descriptor resource pairs.
- Struct field ownership transfer -- resource leak detection now recognizes ownership transfer via struct field assignment (`s->stream = fp`, `obj.field = ptr`). When an acquired resource is stored into a struct field downstream, the finding is suppressed since the receiving struct assumes lifetime responsibility.
- Linked-list/global insertion -- resource leak detection now recognizes linked-list insertion patterns (`p->next = list; list = p`) and global variable assignment as ownership transfers, eliminating false `cfg-resource-leak` findings on common C allocation-and-insert idioms.
- Removed incorrect `value_enum` attribute from CLI `--format` argument.
- Benchmark compilation error: `classify()` calls in `benches/scan_bench.rs` were missing the third `extra` parameter.
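
The console alignment fix in the list above (pad first, color afterwards) can be shown with a small sketch. The function names and the width of 8, taken from the longest tag `[MEDIUM]`, are assumptions for illustration and not the project's actual formatter.

```rust
/// Assumed fixed display width: the length of the longest tag, "[MEDIUM]".
const TAG_WIDTH: usize = 8;

/// Bracket the severity label and pad it to the fixed width using plain text,
/// so the width calculation never sees escape sequences.
pub fn padded_tag(label: &str) -> String {
    let bracketed = format!("[{label}]");
    format!("{bracketed:<width$}", width = TAG_WIDTH)
}

/// Wrap the already padded tag in an ANSI SGR color sequence.
/// Coloring after padding keeps the invisible escape bytes out of the width.
pub fn colored_tag(label: &str, ansi_code: &str) -> String {
    format!("\x1b[{ansi_code}m{}\x1b[0m", padded_tag(label))
}
```

Padding a string that already contains escape codes would count the invisible bytes toward the width and misalign the rule IDs, which is the failure mode the entry describes.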
[0.2.0] - 2026-02-24
Added
- Cross-file taint analysis -- two-pass architecture: Pass 1 extracts `FuncSummary` per function (source/sanitizer/sink capabilities, taint propagation, callees), Pass 2 runs BFS taint propagation with cross-file callee resolution.
- CFG analysis engine with five detectors: unguarded sinks (`cfg-unguarded-sink`), auth gaps in web handlers (`cfg-auth-gap`), unreachable security code (`cfg-unreachable-*`), error fallthrough (`cfg-error-fallthrough`), and resource leaks (`cfg-resource-leak`).
- Cross-language interop -- taint flows across language boundaries via explicit `InteropEdge` structs without false-positive name collisions.
- Function summaries persisted to SQLite (`function_summaries` table) with arity, parameter names, capability bitflags, and callee lists.
- Multi-language CFG + taint support -- all 10 languages (Rust, C, C++, Java, Go, PHP, Python, Ruby, TypeScript, JavaScript) now have `KINDS` maps, `RULES`, and `PARAM_CONFIG` for full CFG construction and taint analysis.
- Resource leak detection for C/C++ (malloc/free, fopen/fclose), Go (os.Open/Close, Lock/Unlock), Rust (alloc/dealloc), and Java (streams, connections).
- Finding scoring system -- numeric scores based on severity, proximity to entry point, path complexity, taint confirmation, and confidence multiplier.
- Analysis modes -- `Full` (default), `Ast` (`--ast-only`), and `Taint` (`--cfg-only`) selectable via CLI flags or `scanner.mode` config.
- `GlobalSummaries` with conservative merge: union caps, OR booleans, union param/callee lists on name collisions across files.
- Performance optimizations -- `_from_bytes` variants to read-once/hash-once, lock-free rayon parallelism, SQLite WAL + 8 MB cache + 256 MB mmap.
- Tracing instrumentation -- `tracing` spans on all pipeline phases (walk, pass1, merge, pass2, per-file ops, db_init).
- Benchmark suite -- criterion benchmarks in `benches/scan_bench.rs` with fixtures.
- 107 unit tests covering taint propagation, cross-file resolution, cross-language interop, CFG analysis, and summaries.
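
The conservative merge described for `GlobalSummaries` above (union caps, OR booleans, union lists on name collisions) might be sketched like this. The type names come from the changelog, but every field name and method signature here is hypothetical and chosen only for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, reduced summary: real summaries carry more fields.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct FuncSummary {
    pub source_caps: u8,           // capability bitflags
    pub propagates_taint: bool,    // example boolean property
    pub callees: BTreeSet<String>, // names of called functions
}

impl FuncSummary {
    /// Conservative merge: keep everything either side claims.
    pub fn merge(&mut self, other: FuncSummary) {
        self.source_caps |= other.source_caps; // union of capability bits
        self.propagates_taint |= other.propagates_taint; // OR of booleans
        self.callees.extend(other.callees); // union of callee lists
    }
}

#[derive(Debug, Default)]
pub struct GlobalSummaries {
    by_name: HashMap<String, FuncSummary>,
}

impl GlobalSummaries {
    /// Insert a summary, merging when the same function name was already
    /// recorded from another file.
    pub fn insert(&mut self, name: &str, summary: FuncSummary) {
        match self.by_name.entry(name.to_string()) {
            Entry::Occupied(mut slot) => slot.get_mut().merge(summary),
            Entry::Vacant(slot) => {
                slot.insert(summary);
            }
        }
    }

    pub fn get(&self, name: &str) -> Option<&FuncSummary> {
        self.by_name.get(name)
    }
}
```

Merging this way can only over-approximate what a function does, which errs toward reporting rather than silently missing a flow when two files define the same name.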
Changed
- Bumped all dependencies to latest compatible versions.
- `Cap` bitflags expanded: `ENV_VAR`, `HTML_ESCAPE`, `SHELL_ESCAPE`, `URL_ENCODE`, `JSON_PARSE`, `FILE_IO`.
- `classify()` in labels uses zero-allocation byte-level case-insensitive comparisons.
- Indexed scans now always re-analyze all files in Pass 2 when taint is enabled (conservative: global summaries may have changed even if a file didn't).
Fixed
- Clippy `ptr_arg` lint in perf tests (`&PathBuf` -> `&Path`).
[0.2.0-alpha] - 2025-06-28
Added
- Experimental intra‑procedural CFG + taint analysis for Rust. Nyx now builds a control‑flow graph, applies data‑flow rules, and flags unsanitised Source → Sink paths (e.g. env::var → Command::new).
- O(1) node‑kind lookup via per‑language PHF tables for zero‑cost dispatch.
- Six unit tests covering conditionals, loops, sanitizers, and multiple sources.
- Debug channel target=cfg (use RUST_LOG=nyx::cfg=debug) to inspect generated graphs.
Fixed
- Fixed a bug in the release pipeline where the Windows build tried to call `zip`, which PowerShell does not provide.
[0.1.1-alpha] - 2025-06-25
Fixed
- Fixed a bug where the `scan --no-index` command would not respect the `max_results` config setting (#1)
Added
- Integration tests covering indexing and scanning pipelines (#3, #4, #5, #8)
[0.1.0-alpha] - 2025-06-25
Added
- Initial alpha release of Nyx CLI tool
- Multi-language AST pattern scanning via `tree-sitter` for Rust, C/C++, Java, Go, PHP, Python, Ruby, TypeScript, JavaScript
- `scan` command: filesystem walker, pattern execution, console output
- `index` command: build, rebuild, and status reporting of SQLite-backed index
- `list` command: list indexed projects with optional verbosity
- `clean` command: remove one or all project indexes
- Configuration system with `nyx.conf` (generated) and `nyx.local` (user overrides)
- Default severity levels: High, Medium, Low
- Unit tests for core modules (config, ext, project utils)