Python FP and docs updates (#58)

* refactor: Update comments for clarity and add expectations.json files for performance metrics

* feat: Implement FP guard for JS/TS local-collection receivers to suppress missing ownership checks

* feat: Enhance Rust parameter handling to classify local collections and prevent false ownership checks

* refactor: Simplify code formatting for better readability in multiple files

* refactor: Improve UTF-8 sequence length handling and enhance clarity in loop iteration

* feat: Update Java and Python patterns to include new security rules

* refactor: Improve comment clarity and consistency across multiple Rust files

* refactor: Simplify code formatting for improved readability in integration tests and module files

* refactor: Improve comment formatting and enhance clarity in assertions across multiple files
Eli Peter 2026-04-29 19:53:34 -04:00 committed by GitHub
parent 4db0805de6
commit a438886217
291 changed files with 9485 additions and 3851 deletions


@@ -9,6 +9,16 @@ Nyx ships four independent detector families. They run together in `--mode full`
| [State model](detectors/state.md) | `state-*` | Per-function state lattice | Use-after-close, double-close, leaks, unauthenticated access |
| [AST patterns](detectors/patterns.md) | `<lang>.<cat>.<name>` | Tree-sitter structural match | Banned APIs, weak crypto, dangerous constructs |
+The taint family is split into cap-specific rule classes when a sink callee carries multiple vulnerability classes:
+| Rule id | Cap | Surface |
+|---|---|---|
+| `taint-unsanitised-flow` | every cap except `data_exfil` and `unauthorized_id` | Default taint flow class |
+| `taint-data-exfiltration` | `data_exfil` | Sensitive data flowing into the payload of an outbound network request (body / headers / json on `fetch`, body on `XMLHttpRequest.send`). Distinct from SSRF: the destination is fixed but attacker-influenced bytes leave the process. |
+| `rs.auth.missing_ownership_check.taint` | `unauthorized_id` | Rust auth subsystem fold-in; see [auth.md](auth.md). |
+A single call site can fire several of these at once when it carries multiple gates: `fetch(taintedUrl, {body: tainted})` produces both an SSRF finding (URL flow) and a `taint-data-exfiltration` finding (body flow), each with its own cap mask rather than a conflated union.
+For Rust auth-specific rules (`rs.auth.*`), see [auth.md](auth.md).
## How they combine

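The rule-class split added in this hunk can be illustrated with a minimal sketch. It assumes capabilities behave like bit flags; the constant values and the `split_findings` helper are hypothetical and are not taken from the Nyx source. Only the rule ids and cap names come from the table above.

```rust
// Hypothetical capability bits. The names mirror the "Cap" column in the
// table; the numeric values are invented for this sketch.
const SSRF: u32 = 1 << 0;
const DATA_EXFIL: u32 = 1 << 1;
const UNAUTHORIZED_ID: u32 = 1 << 2;

/// Split the caps that reached a sink into (rule id, cap mask) pairs,
/// one pair per rule class, instead of one finding carrying the union.
fn split_findings(caps: u32) -> Vec<(&'static str, u32)> {
    let mut out = Vec::new();

    // Every cap except data_exfil and unauthorized_id maps to the default class.
    let default_caps = caps & !(DATA_EXFIL | UNAUTHORIZED_ID);
    if default_caps != 0 {
        out.push(("taint-unsanitised-flow", default_caps));
    }
    if caps & DATA_EXFIL != 0 {
        out.push(("taint-data-exfiltration", DATA_EXFIL));
    }
    if caps & UNAUTHORIZED_ID != 0 {
        out.push(("rs.auth.missing_ownership_check.taint", UNAUTHORIZED_ID));
    }
    out
}

fn main() {
    // fetch(taintedUrl, {body: tainted}): the URL flow carries ssrf,
    // the body flow carries data_exfil.
    for (rule, mask) in split_findings(SSRF | DATA_EXFIL) {
        println!("{rule}: mask {mask:#06b}");
    }
}
```

Under these assumptions the `fetch` call site yields two findings, each carrying only the cap that belongs to its rule class.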

@@ -134,7 +134,8 @@ Sources, sanitizers, and sinks are linked by named capabilities. A sanitizer onl
| `fmt_string` | | | `printf(var)` |
| `sql_query` | | parameterized query binders | `cursor.execute`, `db.query` with concatenation |
| `deserialize` | | | `pickle.loads`, `yaml.load`, `Marshal.load` |
-| `ssrf` | | URL-prefix locks | `requests.get`, `fetch`, `HttpClient.send` |
+| `ssrf` | | URL-prefix locks | `requests.get`, `fetch` URL arg, outbound HTTP destination |
+| `data_exfil` | | | `fetch` body / headers / json, `XMLHttpRequest.send` body |
| `code_exec` | | | `eval`, `exec`, `Function` |
| `crypto` | | | weak-algorithm constructors |
| `unauthorized_id` | request-bound scoped IDs (Rust auth analysis) | ownership check | row-level write |

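The new `ssrf` / `data_exfil` rows suggest that a URL-prefix lock does nothing for a tainted request body. The sketch below shows that reading, assuming capabilities are bit flags and that a sanitizer clears only the caps it is registered for. `sanitize`, `sink_fires`, and the constant values are hypothetical and are not part of Nyx's API.

```rust
// Hypothetical capability bits; the values are invented for this sketch.
const SQL_QUERY: u32 = 1 << 0;
const SSRF: u32 = 1 << 1;
const DATA_EXFIL: u32 = 1 << 2;

/// Assumption: a sanitizer removes only the caps it is registered for.
fn sanitize(tainted: u32, sanitizer_caps: u32) -> u32 {
    tainted & !sanitizer_caps
}

/// Assumption: a sink reports when any cap it gates on is still set.
fn sink_fires(tainted: u32, sink_caps: u32) -> bool {
    tainted & sink_caps != 0
}

fn main() {
    // A value from an untrusted source starts out tainted for every cap.
    let value = SQL_QUERY | SSRF | DATA_EXFIL;

    // A URL-prefix lock is listed as a sanitizer for `ssrf` only.
    let after_prefix_lock = sanitize(value, SSRF);

    println!("fetch URL arg fires: {}", sink_fires(after_prefix_lock, SSRF));
    println!("fetch body fires: {}", sink_fires(after_prefix_lock, DATA_EXFIL));
}
```

In this model the prefix lock silences the URL-argument sink while the body sink still reports, which matches the table listing no sanitizer for `data_exfil`.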

@@ -112,12 +112,14 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `go.crypto.md5` | Low | A | Medium |
| `go.crypto.sha1` | Low | A | Medium |
-### Java: 8 patterns
+### Java: 10 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
| `java.cmdi.runtime_exec` | High | A | High |
+| `java.code_exec.text4shell_interpolator` | High | A | High |
| `java.deser.readobject` | High | A | High |
+| `java.deser.snakeyaml_unsafe_constructor` | High | A | High |
| `java.reflection.class_forname` | Medium | A | High |
| `java.reflection.method_invoke` | Medium | A | High |
| `java.sqli.execute_concat` | Medium | B | Medium |
@@ -168,7 +170,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `php.crypto.rand` | Low | A | Medium |
| `php.crypto.sha1` | Low | A | Medium |
-### Python: 13 patterns
+### Python: 14 patterns
| Rule ID | Severity | Tier | Confidence |
|---|---|---|---|
@@ -182,6 +184,7 @@ The tables below are generated from `src/patterns/<lang>.rs` by [`tools/docgen`]
| `py.code_exec.compile` | Medium | A | High |
| `py.deser.shelve_open` | Medium | A | High |
| `py.sqli.execute_format` | Medium | B | Medium |
+| `py.sqli.text_format` | Medium | B | Medium |
| `py.xss.jinja_from_string` | Medium | A | High |
| `py.crypto.md5` | Low | A | Medium |
| `py.crypto.sha1` | Low | A | Medium |