diff --git a/README.md b/README.md index 81c7d5a9..3bb04b24 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) [![Rust 1.88+](https://img.shields.io/badge/rust-1.88%2B-orange)](https://www.rust-lang.org) [![CI](https://img.shields.io/github/actions/workflow/status/elicpeter/nyx/ci.yml?branch=master)](https://github.com/elicpeter/nyx/actions) -[![Docs](https://img.shields.io/badge/docs-elicpeter.github.io%2Fnyx-blue)](https://elicpeter.github.io/nyx/) +[![Docs](https://img.shields.io/badge/docs-nyxscan.dev%2Fdocs-blue)](https://nyxscan.dev/docs/) English · [简体中文](./README.zh-CN.md) @@ -46,7 +46,7 @@ Everything stays on your machine: loopback-only bind, host-header enforcement, C | **Config** | Live config editor; reload without restart | -`nyx serve` flags: `--port ` (default `9700`), `--host ` (loopback only: `127.0.0.1`, `localhost`, or `::1`), `--no-browser`. See `[server]` in `nyx.conf` for persistent settings, and the [Browser UI guide](https://elicpeter.github.io/nyx/serve.html) for the page-by-page UI tour and security model. +`nyx serve` flags: `--port ` (default `9700`), `--host ` (loopback only: `127.0.0.1`, `localhost`, or `::1`), `--no-browser`. See `[server]` in `nyx.conf` for persistent settings, and the [Browser UI guide](https://nyxscan.dev/docs/serve.html) for the page-by-page UI tour and security model. --- @@ -71,7 +71,7 @@ nyx scan --mode ast nyx scan --engine-profile deep ``` -Forward cross-file taint runs in every profile. Symex and the demand-driven backwards walk are opt-in. Turn them on either via `--engine-profile deep`, or individually (`--symex`, `--backwards-analysis`). See the [CLI reference](https://elicpeter.github.io/nyx/cli.html#engine-depth-profile) for the full toggle matrix. +Forward cross-file taint runs in every profile. Symex and the demand-driven backwards walk are opt-in. 
Turn them on either via `--engine-profile deep`, or individually (`--symex`, `--backwards-analysis`). See the [CLI reference](https://nyxscan.dev/docs/cli.html#engine-depth-profile) for the full toggle matrix. ### GitHub Action @@ -125,7 +125,7 @@ All 10 languages parse via tree-sitter and run through the full pipeline, but ru | **Beta** | Java, PHP, Ruby, Rust, Go | 100% | Yes, with light FP triage | | **Preview** | C, C++ | 100% on synthetic corpus | No. STL container flow, builder chains, and inline class member functions are tracked, but deep pointer aliasing and function pointers are not. Pair with clang-tidy or Clang Static Analyzer | -Aggregate rule-level F1: 100.0% (P=1.000, R=1.000). All real-CVE fixtures fire and the corpus carries zero open FPs. Per-dimension detail and known blind spots live on the [Language maturity page](https://elicpeter.github.io/nyx/language-maturity.html). +Aggregate rule-level F1: 100.0% (P=1.000, R=1.000). All real-CVE fixtures fire and the corpus carries zero open FPs. Per-dimension detail and known blind spots live on the [Language maturity page](https://nyxscan.dev/docs/language-maturity.html). ### Validated against real CVEs @@ -188,7 +188,7 @@ Two passes over the filesystem, with an optional SQLite index to skip unchanged 3. **Pass 2**: re-analyze each file with cross-file context under bounded context sensitivity (k=1 inlining for intra-file callees, SCC fixpoint capped at 64 iterations, and summary fallback for callees above the inline body-size cap). A forward dataflow worklist propagates taint through the SSA lattice with guaranteed convergence. Call-graph SCCs iterate to fixed-point (within the cap) so mutually recursive functions get accurate summaries. 4. **Rank, dedupe, emit**: findings are scored by severity × evidence strength × source-kind exploitability, then emitted to console, JSON, or SARIF. 
-Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://elicpeter.github.io/nyx/detectors.html). +Detector families: taint (cross-file source→sink, with cap-specific rule classes for SQLi, XSS, command/code exec, deserialization, SSRF, path traversal, format string, crypto, LDAP injection, XPath injection, HTTP header / response splitting, open redirect, server-side template injection, XXE, prototype pollution, data exfiltration, and the auth fold-in), CFG structural (auth gaps, unguarded sinks, resource leaks), state model (use-after-close, double-close, must-leak, unauthed-access), AST patterns (tree-sitter structural match). Full detector docs: [Detectors](https://nyxscan.dev/docs/detectors.html). --- @@ -213,7 +213,7 @@ kind = "sanitizer" cap = "html_escape" ``` -Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://elicpeter.github.io/nyx/configuration.html). Run `nyx rules list` to browse the registry from the terminal. 
+Or add rules interactively: `nyx config add-rule --lang javascript --matcher escapeHtml --kind sanitizer --cap html_escape`. Caps: `env_var`, `html_escape`, `shell_escape`, `url_encode`, `json_parse`, `file_io`, `fmt_string`, `sql_query`, `deserialize`, `ssrf`, `data_exfil`, `code_exec`, `crypto`, `unauthorized_id`, `ldap_injection`, `xpath_injection`, `header_injection`, `open_redirect`, `ssti`, `xxe`, `prototype_pollution`, `all`. Full schema: [Configuration](https://nyxscan.dev/docs/configuration.html). Run `nyx rules list` to browse the registry from the terminal. --- diff --git a/src/dynamic/framework/adapters/mod.rs b/src/dynamic/framework/adapters/mod.rs index 72b7b09b..013fb93c 100644 --- a/src/dynamic/framework/adapters/mod.rs +++ b/src/dynamic/framework/adapters/mod.rs @@ -463,3 +463,26 @@ pub(super) fn strip_sigils(s: &str) -> &str { .trim_start_matches('@') .trim_start_matches('&') } + +/// True when the source file visibly mitigates prototype-pollution +/// through a known guard pattern: a quoted `'__proto__'` / `"__proto__"` +/// comparison (canonical per-key filter), or a global +/// `Object.freeze(Object.prototype)` / `Object.seal(Object.prototype)` +/// mitigation. Used by the Phase 10 `pp-lodash-merge` / +/// `pp-object-assign` / `pp-json-deep-assign` adapters to skip binding +/// when the surrounding code already neutralises the gadget. +/// +/// The quoted-string form deliberately excludes backtick-wrapped +/// `__proto__` in doc comments so fixtures that mention the key in +/// prose still bind correctly. 
+pub(super) fn source_filters_proto_keys(file_bytes: &[u8]) -> bool { + const NEEDLES: &[&[u8]] = &[ + b"'__proto__'", + b"\"__proto__\"", + b"Object.freeze(Object.prototype", + b"Object.seal(Object.prototype", + ]; + NEEDLES + .iter() + .any(|n| file_bytes.windows(n.len()).any(|w| w == *n)) +} diff --git a/src/dynamic/framework/adapters/pp_json_deep_assign.rs b/src/dynamic/framework/adapters/pp_json_deep_assign.rs index bd184d3a..612f0a30 100644 --- a/src/dynamic/framework/adapters/pp_json_deep_assign.rs +++ b/src/dynamic/framework/adapters/pp_json_deep_assign.rs @@ -75,6 +75,9 @@ impl FrameworkAdapter for PpJsonDeepAssignJsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_json_parse); let matches_source = source_has_deep_merge_helper(file_bytes); if matches_call && matches_source { @@ -104,6 +107,9 @@ impl FrameworkAdapter for PpJsonDeepAssignTsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_json_parse); let matches_source = source_has_deep_merge_helper(file_bytes); if matches_call && matches_source { @@ -153,4 +159,25 @@ mod tests { .detect(&summary, tree.root_node(), src) .is_none()); } + + #[test] + fn skips_when_proto_key_filter_present() { + let src: &[u8] = b"function deepMerge(t, s) {\n\ + for (const k of Object.keys(s)) {\n\ + if (k === '__proto__' || k === 'constructor') continue;\n\ + t[k] = s[k];\n\ + }\n\ + return t;\n\ + }\n\ + function run(payload) { return deepMerge({}, JSON.parse(payload)); }\n"; + let tree = parse_js(src); + let summary = FuncSummary { + name: "run".into(), + callees: vec![crate::summary::CalleeSite::bare("JSON.parse")], + ..Default::default() + }; + assert!(PpJsonDeepAssignJsAdapter + .detect(&summary, 
tree.root_node(), src) + .is_none()); + } } diff --git a/src/dynamic/framework/adapters/pp_lodash_merge.rs b/src/dynamic/framework/adapters/pp_lodash_merge.rs index 68197b17..8b89ccdd 100644 --- a/src/dynamic/framework/adapters/pp_lodash_merge.rs +++ b/src/dynamic/framework/adapters/pp_lodash_merge.rs @@ -65,6 +65,9 @@ impl FrameworkAdapter for PpLodashMergeJsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_lodash_merge); let matches_source = source_imports_lodash(file_bytes); if matches_call && matches_source { @@ -94,6 +97,9 @@ impl FrameworkAdapter for PpLodashMergeTsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_lodash_merge); let matches_source = source_imports_lodash(file_bytes); if matches_call && matches_source { @@ -142,4 +148,40 @@ mod tests { .detect(&summary, tree.root_node(), src) .is_none()); } + + #[test] + fn skips_when_proto_key_filter_present() { + let src: &[u8] = b"const _ = require('lodash');\n\ + function run(payload) {\n\ + for (const k of Object.keys(payload)) {\n\ + if (k === '__proto__' || k === 'constructor') continue;\n\ + }\n\ + return _.merge({}, payload);\n\ + }\n"; + let tree = parse_js(src); + let summary = FuncSummary { + name: "run".into(), + callees: vec![crate::summary::CalleeSite::bare("merge")], + ..Default::default() + }; + assert!(PpLodashMergeJsAdapter + .detect(&summary, tree.root_node(), src) + .is_none()); + } + + #[test] + fn skips_when_object_prototype_frozen() { + let src: &[u8] = b"const _ = require('lodash');\n\ + Object.freeze(Object.prototype);\n\ + function run(payload) { return _.merge({}, payload); }\n"; + let tree = parse_js(src); + let summary = FuncSummary { + name: "run".into(), + callees: 
vec![crate::summary::CalleeSite::bare("merge")], + ..Default::default() + }; + assert!(PpLodashMergeJsAdapter + .detect(&summary, tree.root_node(), src) + .is_none()); + } } diff --git a/src/dynamic/framework/adapters/pp_object_assign.rs b/src/dynamic/framework/adapters/pp_object_assign.rs index d986a856..d2dc7398 100644 --- a/src/dynamic/framework/adapters/pp_object_assign.rs +++ b/src/dynamic/framework/adapters/pp_object_assign.rs @@ -12,16 +12,11 @@ use crate::summary::FuncSummary; use crate::symbol::Lang; fn callee_is_object_assign(name: &str) -> bool { - let last = name.rsplit_once('.').map(|(_, s)| s).unwrap_or(name); - matches!(last, "assign" | "create") - && (name == "Object.assign" || name == "Object.create" || name == "assign" || name == "create") + matches!(name, "Object.assign" | "assign") } fn source_uses_object_assign(file_bytes: &[u8]) -> bool { - const NEEDLES: &[&[u8]] = &[ - b"Object.assign", - b"Object.create", - ]; + const NEEDLES: &[&[u8]] = &[b"Object.assign"]; NEEDLES .iter() .any(|n| file_bytes.windows(n.len()).any(|w| w == *n)) @@ -57,6 +52,9 @@ impl FrameworkAdapter for PpObjectAssignJsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_object_assign); let matches_source = source_uses_object_assign(file_bytes); if matches_call && matches_source { @@ -86,6 +84,9 @@ impl FrameworkAdapter for PpObjectAssignTsAdapter { _ast: tree_sitter::Node<'_>, file_bytes: &[u8], ) -> Option { + if super::source_filters_proto_keys(file_bytes) { + return None; + } let matches_call = super::any_callee_matches(summary, callee_is_object_assign); let matches_source = source_uses_object_assign(file_bytes); if matches_call && matches_source { @@ -133,4 +134,38 @@ mod tests { .detect(&summary, tree.root_node(), src) .is_none()); } + + #[test] + fn skips_object_create_null_mitigation() { + let src: &[u8] = + 
b"function run(payload) { return Object.create(null); }\n"; + let tree = parse_js(src); + let summary = FuncSummary { + name: "run".into(), + callees: vec![crate::summary::CalleeSite::bare("Object.create")], + ..Default::default() + }; + assert!(PpObjectAssignJsAdapter + .detect(&summary, tree.root_node(), src) + .is_none()); + } + + #[test] + fn skips_when_proto_key_filter_present() { + let src: &[u8] = b"function run(payload) {\n\ + for (const k of Object.keys(payload)) {\n\ + if (k === '__proto__' || k === 'constructor') continue;\n\ + }\n\ + return Object.assign({}, payload);\n\ + }\n"; + let tree = parse_js(src); + let summary = FuncSummary { + name: "run".into(), + callees: vec![crate::summary::CalleeSite::bare("Object.assign")], + ..Default::default() + }; + assert!(PpObjectAssignJsAdapter + .detect(&summary, tree.root_node(), src) + .is_none()); + } } diff --git a/tools/image-builder/main.rs b/tools/image-builder/main.rs index 0da5c198..c2a4ab30 100644 --- a/tools/image-builder/main.rs +++ b/tools/image-builder/main.rs @@ -334,19 +334,19 @@ fn parse_catalogue(src: &str) -> Vec { continue; } if line == "[[image]]" { - if let Some(prev) = current.take() { - if !prev.toolchain_id.is_empty() { - entries.push(prev); - } + if let Some(prev) = current.take() + && !prev.toolchain_id.is_empty() + { + entries.push(prev); } current = Some(ImageEntry::default()); continue; } if line.starts_with("[[") || line.starts_with('[') { - if let Some(prev) = current.take() { - if !prev.toolchain_id.is_empty() { - entries.push(prev); - } + if let Some(prev) = current.take() + && !prev.toolchain_id.is_empty() + { + entries.push(prev); } continue; } @@ -361,10 +361,10 @@ fn parse_catalogue(src: &str) -> Vec { _ => {} } } - if let Some(prev) = current.take() { - if !prev.toolchain_id.is_empty() { - entries.push(prev); - } + if let Some(prev) = current.take() + && !prev.toolchain_id.is_empty() + { + entries.push(prev); } entries } @@ -415,19 +415,15 @@ fn rewrite_digests(src: &str, 
updates: &[(String, String)]) -> String { current_tid = Some(value); } - if parse_toml_string_value(trimmed, "digest").is_some() { - if let Some(tid) = &current_tid { - if let Some((_, new_digest)) = - updates.iter().find(|(id, _)| id == tid) - { - // Preserve indentation. - let indent_len = raw.len() - raw.trim_start().len(); - out.push_str(&raw[..indent_len]); - out.push_str(&format!("digest = \"{new_digest}\"")); - out.push('\n'); - continue; - } - } + if parse_toml_string_value(trimmed, "digest").is_some() + && let Some(tid) = &current_tid + && let Some((_, new_digest)) = updates.iter().find(|(id, _)| id == tid) + { + let indent_len = raw.len() - raw.trim_start().len(); + out.push_str(&raw[..indent_len]); + out.push_str(&format!("digest = \"{new_digest}\"")); + out.push('\n'); + continue; } }
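
Note on the new `source_filters_proto_keys` guard: the check is a plain byte-substring scan over the whole file, so it is worth seeing in isolation what does and does not suppress binding. The following is a minimal standalone sketch, not the crate's code: the needle list is copied from this diff, while the `contains_any` helper and its empty-needle guard are assumptions added for illustration.

```rust
/// Hypothetical helper (not part of the diff): true when any needle occurs
/// as a contiguous byte run in `haystack`. Empty needles are skipped because
/// `slice::windows(0)` panics.
fn contains_any(haystack: &[u8], needles: &[&[u8]]) -> bool {
    needles
        .iter()
        .any(|n| !n.is_empty() && haystack.windows(n.len()).any(|w| w == *n))
}

/// Standalone copy of the guard added in `adapters/mod.rs`, using the same
/// four needles as the diff.
fn source_filters_proto_keys(file_bytes: &[u8]) -> bool {
    const NEEDLES: &[&[u8]] = &[
        b"'__proto__'",
        b"\"__proto__\"",
        b"Object.freeze(Object.prototype",
        b"Object.seal(Object.prototype",
    ];
    contains_any(file_bytes, NEEDLES)
}
```

Under this sketch, a quoted comparison such as `k === '__proto__'` or a call to `Object.freeze(Object.prototype)` anywhere in the file returns `true`, while a backtick-wrapped mention of `__proto__` in a comment returns `false`. Because the match is textual and file-wide, a guard that appears in the file but does not actually protect the merge call would still suppress binding; the `skips_when_proto_key_filter_present` test for the lodash adapter exercises exactly that shape.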