nyx/src/labels/python.rs

105 lines
3 KiB
Rust
Raw Normal View History

Feat/full cfg (#30) * feat: Enhance control flow analysis with function summaries and taint analysis * feat: Update taint analysis to utilize function summaries for enhanced tracking * Refactor `walk.rs` batch processing and override handling: - Renamed `Batcher` to `BatchSender` for clarity. - Added `BatchSender::new` constructor for cleaner initialization. - Simplified batch size management in `BatchSender`. - Extracted `build_overrides` function for reusable override construction. - Improved error handling and validation in override building. - Enhanced performance with directory and file type filtering in `walk`. * Improve logging and streamline directory walk process: - Added detailed `tracing` logs for debugging batch flushes, override construction, and walk initialization/completion. - Optimized and simplified `filter_entry` logic for directory and file type filters. - Improved metadata checks and max file size enforcement during the scan. * Refactor and optimize taint tracking, label rules, and directory walk process: - Replaced `DefaultHasher` with `blake3::Hasher` for improved taint hashing. - Enhanced sorting and hashing logic in `taint.rs` for consistency and efficiency. - Removed unused `set_hash` function and redundant imports across files. - Improved batch sender logic in `walk.rs`, renaming key components for clarity. - Unified `spawn_senders` and `spawn_file_walker` with thread handling and channel tuple return. - Expanded label rules with additional matchers for sources, sanitizers, and sinks. - Deprecated `dump_cfg` and specific logging utilities in `cfg.rs` for code cleanup. * fix: fixed let chains error in walk.rs * fix: updated dependencies * fix: updated dependencies * chore: Remove standard error in scan.rs * feat: Introduce function summaries for enhanced taint and control flow analysis * feat: Enhance taint analysis with interop support and function summaries * feat: Add configuration analysis module and enhance matcher rules * feat: Add arity column to function_summaries and handle schema migration * fix: fixed clippy &PathBuf warnings * chore: Update dependencies and versioning in Cargo files * docs: Update README to enhance clarity and detail on features and analysis modes * chore: Update CHANGELOG for version 0.2.0 with new features, changes, and fixes * docs: Update SECURITY.md to clarify version support status --------- Co-authored-by: elipeter <eli.peter@es.fcm.travel>
2026-02-24 23:44:07 -05:00
use crate::labels::{Cap, DataLabel, Kind, LabelRule, ParamConfig};
use phf::{Map, phf_map};
pub static RULES: &[LabelRule] = &[
// ─────────── Sources ───────────
LabelRule {
matchers: &["os.getenv", "os.environ"],
label: DataLabel::Source(Cap::all()),
},
LabelRule {
matchers: &[
"request.args",
"request.form",
"request.json",
"request.headers",
"request.cookies",
"input",
],
label: DataLabel::Source(Cap::all()),
},
LabelRule {
matchers: &["sys.argv"],
label: DataLabel::Source(Cap::all()),
},
LabelRule {
matchers: &["open"],
label: DataLabel::Source(Cap::all()),
},
LabelRule {
matchers: &[
"argparse.parse_args",
"urllib.request.urlopen",
"requests.get",
"requests.post",
],
label: DataLabel::Source(Cap::all()),
},
Feat/full cfg (#30) * feat: Enhance control flow analysis with function summaries and taint analysis * feat: Update taint analysis to utilize function summaries for enhanced tracking * Refactor `walk.rs` batch processing and override handling: - Renamed `Batcher` to `BatchSender` for clarity. - Added `BatchSender::new` constructor for cleaner initialization. - Simplified batch size management in `BatchSender`. - Extracted `build_overrides` function for reusable override construction. - Improved error handling and validation in override building. - Enhanced performance with directory and file type filtering in `walk`. * Improve logging and streamline directory walk process: - Added detailed `tracing` logs for debugging batch flushes, override construction, and walk initialization/completion. - Optimized and simplified `filter_entry` logic for directory and file type filters. - Improved metadata checks and max file size enforcement during the scan. * Refactor and optimize taint tracking, label rules, and directory walk process: - Replaced `DefaultHasher` with `blake3::Hasher` for improved taint hashing. - Enhanced sorting and hashing logic in `taint.rs` for consistency and efficiency. - Removed unused `set_hash` function and redundant imports across files. - Improved batch sender logic in `walk.rs`, renaming key components for clarity. - Unified `spawn_senders` and `spawn_file_walker` with thread handling and channel tuple return. - Expanded label rules with additional matchers for sources, sanitizers, and sinks. - Deprecated `dump_cfg` and specific logging utilities in `cfg.rs` for code cleanup. * fix: fixed let chains error in walk.rs * fix: updated dependencies * fix: updated dependencies * chore: Remove standard error in scan.rs * feat: Introduce function summaries for enhanced taint and control flow analysis * feat: Enhance taint analysis with interop support and function summaries * feat: Add configuration analysis module and enhance matcher rules * feat: Add arity column to function_summaries and handle schema migration * fix: fixed clippy &PathBuf warnings * chore: Update dependencies and versioning in Cargo files * docs: Update README to enhance clarity and detail on features and analysis modes * chore: Update CHANGELOG for version 0.2.0 with new features, changes, and fixes * docs: Update SECURITY.md to clarify version support status --------- Co-authored-by: elipeter <eli.peter@es.fcm.travel>
2026-02-24 23:44:07 -05:00
// ───────── Sanitizers ──────────
LabelRule {
matchers: &["html.escape"],
label: DataLabel::Sanitizer(Cap::HTML_ESCAPE),
},
LabelRule {
matchers: &["shlex.quote"],
label: DataLabel::Sanitizer(Cap::SHELL_ESCAPE),
},
// ─────────── Sinks ─────────────
LabelRule {
matchers: &["eval", "exec"],
label: DataLabel::Sink(Cap::SHELL_ESCAPE),
},
LabelRule {
matchers: &[
"os.system",
"os.popen",
"subprocess.call",
"subprocess.run",
"subprocess.Popen",
"subprocess.check_output",
"subprocess.check_call",
],
label: DataLabel::Sink(Cap::SHELL_ESCAPE),
},
LabelRule {
matchers: &["cursor.execute", "cursor.executemany"],
label: DataLabel::Sink(Cap::SHELL_ESCAPE),
},
];
pub static KINDS: Map<&'static str, Kind> = phf_map! {
// control-flow
"if_statement" => Kind::If,
"while_statement" => Kind::While,
"for_statement" => Kind::For,
"return_statement" => Kind::Return,
"break_statement" => Kind::Break,
"continue_statement" => Kind::Continue,
// structure
"module" => Kind::SourceFile,
"block" => Kind::Block,
"function_definition" => Kind::Function,
// data-flow
"call" => Kind::CallFn,
"assignment" => Kind::Assignment,
"expression_statement" => Kind::CallWrapper,
// trivia
"comment" => Kind::Trivia,
":" => Kind::Trivia, "," => Kind::Trivia,
"(" => Kind::Trivia, ")" => Kind::Trivia,
"\n" => Kind::Trivia,
"import_statement" => Kind::Trivia,
"import_from_statement" => Kind::Trivia,
};
pub static PARAM_CONFIG: ParamConfig = ParamConfig {
params_field: "parameters",
param_node_kinds: &["identifier"],
self_param_kinds: &[],
ident_fields: &["name"],
};