Feat/full cfg (#30)

* feat: Enhance control flow analysis with function summaries and taint analysis

* feat: Update taint analysis to utilize function summaries for enhanced tracking

* Refactor `walk.rs` batch processing and override handling:

- Renamed `Batcher` to `BatchSender` for clarity.
- Added `BatchSender::new` constructor for cleaner initialization.
- Simplified batch size management in `BatchSender`.
- Extracted `build_overrides` function for reusable override construction.
- Improved error handling and validation in override building.
- Enhanced performance with directory and file type filtering in `walk`.

* Improve logging and streamline directory walk process:

- Added detailed `tracing` logs for debugging batch flushes, override construction, and walk initialization/completion.
- Optimized and simplified `filter_entry` logic for directory and file type filters.
- Improved metadata checks and max file size enforcement during the scan.

* Refactor and optimize taint tracking, label rules, and directory walk process:

- Replaced `DefaultHasher` with `blake3::Hasher` for improved taint hashing.
- Enhanced sorting and hashing logic in `taint.rs` for consistency and efficiency.
- Removed unused `set_hash` function and redundant imports across files.
- Improved batch sender logic in `walk.rs`, renaming key components for clarity.
- Unified `spawn_senders` and `spawn_file_walker` with thread handling and channel tuple return.
- Expanded label rules with additional matchers for sources, sanitizers, and sinks.
- Deprecated `dump_cfg` and specific logging utilities in `cfg.rs` for code cleanup.

* fix: fixed let chains error in walk.rs

* fix: updated dependencies

* fix: updated dependencies

* chore: Remove standard error in scan.rs

* feat: Introduce function summaries for enhanced taint and control flow analysis

* feat: Enhance taint analysis with interop support and function summaries

* feat: Add configuration analysis module and enhance matcher rules

* feat: Add arity column to function_summaries and handle schema migration

* fix: fixed clippy &PathBuf warnings

* chore: Update dependencies and versioning in Cargo files

* docs: Update README to enhance clarity and detail on features and analysis modes

* chore: Update CHANGELOG for version 0.2.0 with new features, changes, and fixes

* docs: Update SECURITY.md to clarify version support status

---------

Co-authored-by: elipeter <eli.peter@es.fcm.travel>
Eli Peter 2026-02-24 23:44:07 -05:00 committed by GitHub
parent 8cbbec7d90
commit f96a89e7c1
87 changed files with 11505 additions and 1099 deletions


@@ -1,17 +1,91 @@
-use crate::labels::{Cap, DataLabel, LabelRule};
+use crate::labels::{Cap, DataLabel, Kind, LabelRule, ParamConfig};
use phf::{Map, phf_map};
// TODO: refactor this
pub static RULES: &[LabelRule] = &[
// ─────────── Sources ───────────
LabelRule {
-matchers: &["document.location", "window.location"],
+matchers: &[
+"document.location",
+"window.location",
+"req.body",
+"req.query",
+"req.params",
+"req.headers",
+"req.cookies",
+"process.env",
+],
label: DataLabel::Source(Cap::all()),
},
// ───────── Sanitizers ──────────
LabelRule {
matchers: &["JSON.parse"],
label: DataLabel::Sanitizer(Cap::JSON_PARSE),
},
LabelRule {
matchers: &["encodeURIComponent", "encodeURI"],
label: DataLabel::Sanitizer(Cap::URL_ENCODE),
},
LabelRule {
matchers: &["DOMPurify.sanitize"],
label: DataLabel::Sanitizer(Cap::HTML_ESCAPE),
},
// ─────────── Sinks ─────────────
LabelRule {
matchers: &["eval"],
label: DataLabel::Sink(Cap::SHELL_ESCAPE),
},
LabelRule {
matchers: &["innerHTML"],
label: DataLabel::Sink(Cap::HTML_ESCAPE),
},
LabelRule {
matchers: &[
"child_process.exec",
"child_process.execSync",
"child_process.spawn",
],
label: DataLabel::Sink(Cap::SHELL_ESCAPE),
},
];
pub static KINDS: Map<&'static str, Kind> = phf_map! {
// control-flow
"if_statement" => Kind::If,
"while_statement" => Kind::While,
"for_statement" => Kind::For,
"for_in_statement" => Kind::For,
"return_statement" => Kind::Return,
"break_statement" => Kind::Break,
"continue_statement" => Kind::Continue,
// structure
"program" => Kind::SourceFile,
"statement_block" => Kind::Block,
"function_declaration" => Kind::Function,
"arrow_function" => Kind::Function,
"method_definition" => Kind::Function,
// data-flow
"call_expression" => Kind::CallFn,
"new_expression" => Kind::CallFn,
"assignment_expression" => Kind::Assignment,
"variable_declaration" => Kind::CallWrapper,
"lexical_declaration" => Kind::CallWrapper,
"expression_statement" => Kind::CallWrapper,
// trivia
"comment" => Kind::Trivia,
";" => Kind::Trivia, "," => Kind::Trivia,
"(" => Kind::Trivia, ")" => Kind::Trivia,
"{" => Kind::Trivia, "}" => Kind::Trivia,
"\n" => Kind::Trivia,
"import_statement" => Kind::Trivia,
};
pub static PARAM_CONFIG: ParamConfig = ParamConfig {
params_field: "parameters",
param_node_kinds: &["identifier"],
self_param_kinds: &[],
ident_fields: &["name", "pattern"],
};
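
A rough sketch of how a rule table like RULES might be consulted during taint analysis. Everything here is an assumption made for illustration: the `Cap`, `DataLabel`, and `LabelRule` definitions are simplified stand-ins for the types in `crate::labels` (which this diff does not show), the rule table is a shortened copy, and `classify` is a hypothetical helper that uses exact string matching. The real matcher may normalize names or match on suffixes.

```rust
// Simplified stand-ins for the crate's types. The real definitions live in
// `crate::labels`, are not part of this diff, and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cap(u8);

impl Cap {
    pub const HTML_ESCAPE: Cap = Cap(0b0001);
    pub const SHELL_ESCAPE: Cap = Cap(0b0010);

    /// Union of every capability bit defined above.
    pub const fn all() -> Cap {
        Cap(0b0011)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataLabel {
    Source(Cap),
    Sanitizer(Cap),
    Sink(Cap),
}

pub struct LabelRule {
    pub matchers: &'static [&'static str],
    pub label: DataLabel,
}

// Shortened copy of the rule table, for illustration only.
pub static RULES: &[LabelRule] = &[
    LabelRule {
        matchers: &["req.body", "req.query"],
        label: DataLabel::Source(Cap::all()),
    },
    LabelRule {
        matchers: &["DOMPurify.sanitize"],
        label: DataLabel::Sanitizer(Cap::HTML_ESCAPE),
    },
    LabelRule {
        matchers: &["child_process.exec"],
        label: DataLabel::Sink(Cap::SHELL_ESCAPE),
    },
];

/// Hypothetical lookup: returns the label of the first rule that lists
/// `name` verbatim among its matchers, or `None` if no rule applies.
pub fn classify(name: &str) -> Option<DataLabel> {
    RULES
        .iter()
        .find(|rule| rule.matchers.iter().any(|m| *m == name))
        .map(|rule| rule.label)
}
```

Under these assumptions, `classify("req.body")` yields a `Source` label carrying every capability, `classify("child_process.exec")` yields a `Sink` that requires `SHELL_ESCAPE`, and an unlisted name such as `console.log` yields `None`.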