Phase 1 (#33)

* chore: Exclude CLAUDE.md from Cargo.toml
* feat: add callgraph module and integrate into main analysis flow
* feat: enhance CLI with new severity filtering and analysis modes
* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling
* feat: implement state-model dataflow analysis for resource lifecycle and auth state
* feat: enhance diagnostic output formatting and add evidence structure
* feat: implement attack surface ranking for diagnostics with scoring and sorting
* feat: add comprehensive documentation for installation, usage, and rules reference
* feat: add multiple language support for command execution and evaluation endpoints
* feat: implement inline suppression for findings using `nyx:ignore` comments
* feat: add confidence levels to AST patterns and update output structure
* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets
* feat: bump version to 0.4.0 and update changelog with new features and improvements
* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
2026-06-15 20:05:13 +02:00 · 2026-02-25 21:16:36 -05:00 · 1bbe4b1cfb
commit 1bbe4b1cfb
parent 19b578c5c4
456 changed files with 25628 additions and 1228 deletions
--- a/docs/rules/c.md
+++ b/docs/rules/c.md
@@ -0,0 +1,89 @@
+# C Rules
+
+Nyx detects C vulnerabilities through AST patterns (banned functions, format strings) and taint analysis (user input → shell execution, buffer overflow sinks).
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `getenv` | `all` | EnvironmentConfig |
+| `fgets`, `scanf`, `fscanf`, `gets`, `read` | `all` | UserInput |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `system`, `popen`, `exec*` family | `SHELL_ESCAPE` |
+| `sprintf`, `strcpy`, `strcat` | `HTML_ESCAPE` |
+| `printf`, `fprintf` | `FMT_STRING` |
+| `fopen`, `open` | `FILE_IO` |
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety (Banned Functions)
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `c.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
+| `c.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination buffer |
+| `c.memory.strcat` | High | A | `strcat()` — no bounds checking on destination buffer |
+| `c.memory.sprintf` | High | A | `sprintf()` — no length limit on output buffer |
+| `c.memory.scanf_percent_s` | High | A | `scanf("%s")` — unbounded string read |
+| `c.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability (non-literal first arg) |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `c.cmdi.system` | High | A | `system()` — shell command execution |
+| `c.cmdi.popen` | Medium | A | `popen()` — shell command execution with pipe |
+
+---
+
+## Examples
+
+### `c.memory.gets` — Banned function
+
+**Vulnerable:**
+```c
+char buf[64];
+gets(buf);  // No bounds checking — buffer overflow
+```
+
+**Safe alternative:**
+```c
+char buf[64];
+fgets(buf, sizeof(buf), stdin);
+```
+
+### `c.memory.printf_no_fmt` — Format string
+
+**Vulnerable:**
+```c
+char *user_input = get_input();
+printf(user_input);  // Format string vulnerability
+```
+
+**Safe alternative:**
+```c
+char *user_input = get_input();
+printf("%s", user_input);
+```
+
+### `c.cmdi.system` — Shell execution
+
+**Vulnerable:**
+```c
+char cmd[256];
+snprintf(cmd, sizeof(cmd), "ls %s", user_dir);
+system(cmd);  // Command injection if user_dir contains shell metacharacters
+```
+
+**Safe alternative:**
+```c
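+// Use execvp with an explicit argument array (no shell is involved).
+// "--" stops option parsing, so user_dir cannot be read as a flag.
+char *args[] = {"ls", "--", user_dir, NULL};
+execvp("ls", args);  // Replaces the current process image on success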
+```
--- a/docs/rules/cpp.md
+++ b/docs/rules/cpp.md
@@ -0,0 +1,66 @@
+# C++ Rules
+
+C++ rules inherit C banned-function concerns and add C++-specific patterns like dangerous casts.
+
+## Taint Labels
+
+C++ shares taint labels with C. See [C Rules](c.md) for the full source and sink listing.
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `cpp.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
+| `cpp.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination |
+| `cpp.memory.strcat` | High | A | `strcat()` — no bounds checking on destination |
+| `cpp.memory.sprintf` | High | A | `sprintf()` — no length limit on output |
+| `cpp.memory.reinterpret_cast` | Medium | A | `reinterpret_cast` — type-punning cast |
+| `cpp.memory.const_cast` | Medium | A | `const_cast` — removes const/volatile qualifier |
+| `cpp.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `cpp.cmdi.system` | High | A | `system()` — shell command execution |
+| `cpp.cmdi.popen` | High | A | `popen()` — shell command execution |
+
+---
+
+## Examples
+
+### `cpp.memory.reinterpret_cast` — Type-punning cast
+
+**Flagged:**
+```cpp
+int x = 42;
+float* fp = reinterpret_cast<float*>(&x);  // Type-punning, may violate strict aliasing
+```
+
+**Safe alternative:**
+```cpp
+#include <cstring>  // std::memcpy
+
+int x = 42;
+float f;
+std::memcpy(&f, &x, sizeof(f));  // Well-defined type punning
+```
+
+### `cpp.memory.const_cast` — Removing const
+
+**Flagged:**
+```cpp
+void process(const std::string& s) {
+    char* p = const_cast<char*>(s.c_str());  // Removes const
+    p[0] = 'X';  // Undefined behavior
+}
+```
+
+**Safe alternative:**
+```cpp
+void process(std::string s) {  // Take by value
+    s[0] = 'X';
+}
+```
--- a/docs/rules/go.md
+++ b/docs/rules/go.md
@@ -0,0 +1,148 @@
+# Go Rules
+
+Nyx detects Go vulnerabilities through AST patterns and taint analysis, covering command execution, unsafe pointer usage, TLS misconfiguration, weak crypto, SQL injection, hardcoded secrets, and deserialization.
+
+## Taint Labels
+
+Go has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/go.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `os.Getenv` | all |
+| `http.Request`, `r.FormValue`, `r.URL`, `r.Body`, `r.Header` | all |
+| `r.URL.Query`, `r.URL.Query.Get`, `Request.FormValue`, `Request.URL` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `html.EscapeString`, `template.HTMLEscapeString` | HTML_ESCAPE |
+| `url.QueryEscape`, `url.PathEscape` | URL_ENCODE |
+| `filepath.Clean`, `filepath.Base` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `exec.Command` | SHELL_ESCAPE |
+| `db.Query`, `db.Exec`, `db.QueryRow`, `db.Prepare` | SHELL_ESCAPE |
+| `fmt.Fprintf`, `fmt.Sprintf`, `fmt.Printf` | FMT_STRING |
+| `os.Open`, `os.OpenFile`, `os.Create`, `ioutil.ReadFile`, `os.ReadFile` | FILE_IO |
+| `template.HTML` | HTML_ESCAPE |
+
+> **Note:** Chained calls like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments before matching, so `r.URL.Query.Get` matches the source rule.
+
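The normalization described in the note can be sketched in a few lines of Python. This is an illustrative sketch only, not Nyx's implementation: `normalize_callee`, `GO_SOURCES`, and `is_source` are hypothetical names, and the matcher set holds just a few entries from the Sources table.

```python
def normalize_callee(expr: str) -> str:
    """Drop every parenthesised argument list, keeping only the dotted path."""
    out = []
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif depth == 0:
            # Keep characters that sit outside any argument list
            out.append(ch)
    return "".join(out)


# Hypothetical subset of the Go source matchers listed above
GO_SOURCES = {"os.Getenv", "r.FormValue", "r.URL.Query.Get"}


def is_source(expr: str) -> bool:
    """True when the normalized call path matches a known source matcher."""
    return normalize_callee(expr) in GO_SOURCES
```

Under these assumptions, `r.URL.Query().Get("host")` normalizes to `r.URL.Query.Get` and matches. The sketch does not handle parentheses inside string literals; an AST-based approach would avoid that limitation.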
+---
+
+## AST Pattern Rules
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.cmdi.exec_command` | High | A | `exec.Command()` — arbitrary process execution |
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.memory.unsafe_pointer` | Medium | A | `unsafe.Pointer` — bypasses Go type system |
+
+### Insecure Transport
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.transport.insecure_skip_verify` | High | A | `InsecureSkipVerify: true` — disables TLS certificate validation |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.crypto.md5` | Low | A | `md5.New()` / `md5.Sum()` — weak hash algorithm |
+| `go.crypto.sha1` | Low | A | `sha1.New()` / `sha1.Sum()` — weak hash algorithm |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.sqli.query_concat` | Medium | B | `db.Query`/`Exec`/`QueryRow` with concatenated string |
+
+### Secrets
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.secrets.hardcoded_key` | Medium | A | Variable with secret-like name assigned a string literal |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `go.deser.gob_decode` | Medium | A | `gob.NewDecoder` — Go binary deserialization |
+
+---
+
+## Examples
+
+### `go.transport.insecure_skip_verify` — TLS misconfiguration
+
+**Vulnerable:**
+```go
+tr := &http.Transport{
+    TLSClientConfig: &tls.Config{
+        InsecureSkipVerify: true,  // Disables certificate verification
+    },
+}
+```
+
+**Safe alternative:**
+```go
+tr := &http.Transport{
+    TLSClientConfig: &tls.Config{
+        // Use proper CA certificates
+        RootCAs: certPool,
+    },
+}
+```
+
+### `go.sqli.query_concat` — SQL concatenation
+
+**Vulnerable:**
+```go
+rows, err := db.Query("SELECT * FROM users WHERE id=" + userID)
+```
+
+**Safe alternative:**
+```go
+rows, err := db.Query("SELECT * FROM users WHERE id=$1", userID)
+```
+
+### `go.secrets.hardcoded_key` — Hardcoded secret
+
+**Flagged:**
+```go
+apiKey := "sk-1234567890abcdef"
+password := "hunter2"
+```
+
+**Safe alternative:**
+```go
+apiKey := os.Getenv("API_KEY")
+password := os.Getenv("DB_PASSWORD")
+```
+
+### `go.cmdi.exec_command` — Command execution
+
+**Vulnerable:**
+```go
+cmd := exec.Command("sh", "-c", userInput)
+cmd.Run()
+```
+
+**Safe alternative:**
+```go
+// Use explicit command and arguments, not shell
+cmd := exec.Command("ls", "-la", safeDir)
+cmd.Run()
+```
--- a/docs/rules/index.md
+++ b/docs/rules/index.md
@@ -0,0 +1,79 @@
+# Rule Reference
+
+This section lists every detection rule in Nyx, organized by language.
+
+## Rule ID Format
+
+| Prefix | Detector Family | Example |
+|--------|----------------|---------|
+| `taint-*` | [Taint analysis](../detectors/taint.md) | `taint-unsanitised-flow (source 5:11)` |
+| `cfg-*` | [CFG structural](../detectors/cfg.md) | `cfg-unguarded-sink`, `cfg-auth-gap` |
+| `state-*` | [State model](../detectors/state.md) | `state-use-after-close`, `state-resource-leak` |
+| `<lang>.*.*` | [AST patterns](../detectors/patterns.md) | `rs.memory.transmute`, `js.code_exec.eval` |
+
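As a rough illustration of the prefix scheme in the table, the following Python sketch maps a rule ID to its detector family. The function name `detector_family` and the returned labels are hypothetical and exist only for this example; the sketch also assumes the `(source L:C)` suffix is separated from the ID by a space, as in the table.

```python
def detector_family(rule_id: str) -> str:
    """Classify a rule ID by the prefix conventions in the table above."""
    # Taint IDs may carry a "(source L:C)" suffix; keep only the ID itself
    base = rule_id.split(" ", 1)[0]
    for prefix in ("taint", "cfg", "state"):
        if base.startswith(prefix + "-"):
            return prefix
    # AST pattern IDs have the shape <lang>.<category>.<name>
    if len(base.split(".")) >= 3:
        return "ast-pattern"
    return "unknown"
```

A consumer of Nyx output could use a helper like this to group findings by detector before filtering them.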
+## Cross-Language Rules
+
+These rules apply to all supported languages:
+
+### Taint Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `taint-unsanitised-flow (source L:C)` | Varies by source kind | Unsanitized data flows from source to sink |
+
+### CFG Structural Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `cfg-unguarded-sink` | High/Medium | Sink without dominating guard |
+| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth |
+| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
+| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
+| `cfg-unreachable-source` | Low | Source in unreachable code |
+| `cfg-error-fallthrough` | High/Medium | Error path doesn't terminate before dangerous code |
+| `cfg-resource-leak` | Medium | Resource not released on all exit paths |
+| `cfg-lock-not-released` | Medium | Lock not released on all exit paths |
+
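`cfg-unguarded-sink` relies on dominance: a guard dominates a sink when every path from the function entry to the sink passes through the guard. The sketch below computes dominator sets with the textbook iterative algorithm. It assumes the CFG is a dict of successor lists in which every node appears as a key; the node names and the `is_guarded` helper are hypothetical and are not taken from Nyx's internals.

```python
def dominators(cfg, entry):
    """Return {node: set of nodes that dominate it} for a successor-list CFG."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def is_guarded(cfg, entry, guard, sink):
    """True when the guard node lies on every path from entry to sink."""
    return guard in dominators(cfg, entry)[sink]


# The sink is reachable directly from entry, so the guard does not dominate it
unguarded = {"entry": ["guard", "sink"], "guard": ["sink"], "sink": []}
# Every path to the sink passes through the guard
guarded = {"entry": ["guard"], "guard": ["sink", "exit"], "sink": ["exit"], "exit": []}
```

In the first graph a path skips the guard, which is the shape this rule is meant to report; in the second the guard dominates the sink.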
+### State Model Rules
+
+| Rule ID | Severity | Description |
+|---------|----------|-------------|
+| `state-use-after-close` | High | Variable used after being closed |
+| `state-double-close` | Medium | Resource closed twice |
+| `state-resource-leak` | Medium | Resource never closed (definite) |
+| `state-resource-leak-possible` | Low | Resource may not close on all paths |
+| `state-unauthed-access` | High | Privileged operation without authentication |
+
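The resource-lifecycle rules can be pictured as a small state machine per variable. The following Python sketch walks a straight-line list of events and reports three of the rule IDs above. It is a simplification under stated assumptions: the event tuples and the `check_lifecycle` function are hypothetical, and branching paths (which would give rise to `state-resource-leak-possible`) and auth state are ignored.

```python
def check_lifecycle(events):
    """events: list of (op, var) pairs with op in {"open", "use", "close"}."""
    state = {}      # var -> "open" or "closed"
    findings = []
    for op, var in events:
        if op == "open":
            state[var] = "open"
        elif op == "use":
            if state.get(var) == "closed":
                findings.append(("state-use-after-close", var))
        elif op == "close":
            if state.get(var) == "closed":
                findings.append(("state-double-close", var))
            state[var] = "closed"
    # Anything still open at the end of the function was never released
    findings += [("state-resource-leak", v) for v, s in state.items() if s == "open"]
    return findings
```

For example, opening `f`, closing it, using it, closing it again, and then opening `g` without closing it yields one finding of each kind, in that order.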
+## Per-Language AST Pattern Rules
+
+Each language page lists all AST pattern rules with examples:
+
+- [Rust](rust.md) — 12 rules (memory safety, code quality)
+- [C](c.md) — 8 rules (banned functions, command execution, format strings)
+- [C++](cpp.md) — 9 rules (banned functions, dangerous casts, command execution)
+- [Java](java.md) — 8 rules (deserialization, command execution, reflection, SQL, crypto, XSS)
+- [Go](go.md) — 8 rules (command execution, unsafe pointer, TLS, crypto, SQL, secrets, deserialization)
+- [JavaScript](javascript.md) — 13 rules (code execution, XSS, prototype pollution, crypto, transport)
+- [TypeScript](typescript.md) — 10 rules (mirrors JS + type-safety escapes)
+- [Python](python.md) — 13 rules (code execution, command execution, deserialization, SQL, crypto, XSS)
+- [PHP](php.md) — 11 rules (code execution, command execution, deserialization, SQL, path traversal, crypto)
+- [Ruby](ruby.md) — 11 rules (code execution, command execution, deserialization, reflection, SSRF, crypto)
+
+## Taint Label Coverage
+
+Taint analysis uses language-specific source/sink/sanitizer labels. Coverage varies by language:
+
+| Language | Sources | Sinks | Sanitizers | Coverage |
+|----------|---------|-------|------------|----------|
+| Rust | Complete | Complete | Complete | Full |
+| JavaScript | Complete | Complete | Partial | Full |
+| TypeScript | Partial | Partial | Partial | Moderate |
+| Python | Partial | Complete | Partial | Moderate |
+| C | Partial | Complete | Minimal | Moderate |
+| C++ | Partial | Complete | Minimal | Moderate |
+| Java | Partial | Partial | Partial | Moderate |
+| Go | Complete | Complete | Partial | Full |
+| PHP | Complete | Complete | Partial | Full |
+| Ruby | Partial | Partial | Partial | Moderate |
+
+"Moderate" coverage means basic rules exist but many common library functions are not yet labeled. Contributions are welcome.
--- a/docs/rules/java.md
+++ b/docs/rules/java.md
@@ -0,0 +1,135 @@
+# Java Rules
+
+Nyx detects Java vulnerabilities through AST patterns and taint analysis, covering deserialization, command execution, reflection, SQL injection, weak crypto, and XSS.
+
+## Taint Labels
+
+Java has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/java.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `System.getenv` | all |
+| `getParameter`, `getInputStream`, `getHeader`, `getCookies`, `getReader`, `getQueryString`, `getPathInfo` | all |
+| `readObject`, `readLine` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `HtmlUtils.htmlEscape`, `StringEscapeUtils.escapeHtml4` | HTML_ESCAPE |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `Runtime.exec`, `ProcessBuilder` | SHELL_ESCAPE |
+| `executeQuery`, `executeUpdate`, `prepareStatement` | SHELL_ESCAPE |
+| `Class.forName` | SHELL_ESCAPE |
+| `println`, `print`, `write` | HTML_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.deser.readobject` | High | A | `ObjectInputStream.readObject()` — unsafe deserialization |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.cmdi.runtime_exec` | High | A | `Runtime.getRuntime().exec()` — shell command execution |
+
+### Reflection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.reflection.class_forname` | Medium | A | `Class.forName()` — dynamic class loading |
+| `java.reflection.method_invoke` | Medium | A | `Method.invoke()` — reflective method invocation |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.sqli.execute_concat` | Medium | B | SQL `execute*()` with concatenated string argument |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.crypto.insecure_random` | Low | A | `new Random()` — `java.util.Random` is not cryptographically secure |
+| `java.crypto.weak_digest` | Low | A | `MessageDigest.getInstance("MD5"/"SHA1")` |
+
+### XSS
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `java.xss.getwriter_print` | Medium | A | `response.getWriter().print/println/write` — direct output |
+
+---
+
+## Examples
+
+### `java.deser.readobject` — Unsafe deserialization
+
+**Vulnerable:**
+```java
+ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
+Object obj = ois.readObject();  // Arbitrary object instantiation
+```
+
+**Safe alternative:**
+```java
+// Use a safe format like JSON
+ObjectMapper mapper = new ObjectMapper();
+MyType obj = mapper.readValue(request.getInputStream(), MyType.class);
+```
+
+### `java.sqli.execute_concat` — SQL concatenation
+
+**Vulnerable:**
+```java
+String query = "SELECT * FROM users WHERE id=" + userId;
+stmt.executeQuery(query);  // SQL injection
+```
+
+**Safe alternative:**
+```java
+PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE id=?");
+ps.setString(1, userId);
+ResultSet rs = ps.executeQuery();
+```
+
+### `java.cmdi.runtime_exec` — Command execution
+
+**Vulnerable:**
+```java
+Runtime.getRuntime().exec("cmd /c " + userCommand);
+```
+
+**Safe alternative:**
+```java
+// Pass an explicit argument list; never concatenate user input into the command
+ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
+```
+
+### `java.reflection.class_forname` — Dynamic class loading
+
+**Flagged:**
+```java
+Class<?> cls = Class.forName(className);
+Object obj = cls.getDeclaredConstructor().newInstance();
+```
+
+**Safe alternative:**
+```java
+// Use an allowlist of permitted class names
+Map<String, Class<?>> allowed = Map.of("User", User.class, "Order", Order.class);
+Class<?> cls = allowed.get(className);
+if (cls != null) { /* ... */ }
+```
--- a/docs/rules/javascript.md
+++ b/docs/rules/javascript.md
@@ -0,0 +1,138 @@
+# JavaScript Rules
+
+JavaScript has the most complete taint label coverage alongside Rust. Nyx detects code execution, XSS, prototype pollution, command injection, and weak crypto.
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `document.location`, `window.location` | `all` | UserInput |
+| `req.body`, `req.query`, `req.params` | `all` | UserInput |
+| `req.headers`, `req.cookies` | `all` | UserInput |
+| `process.env` | `all` | EnvironmentConfig |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `eval` | `SHELL_ESCAPE` |
+| `innerHTML` | `HTML_ESCAPE` |
+| `location.href`, `window.location.href` | `URL_ENCODE` |
+| `child_process.exec`, `child_process.execSync` | `SHELL_ESCAPE` |
+| `child_process.spawn` | `SHELL_ESCAPE` |
+
+## Taint Sanitizers
+
+| Function | Strips Capability |
+|----------|------------------|
+| `JSON.parse` | `JSON_PARSE` |
+| `encodeURIComponent`, `encodeURI` | `URL_ENCODE` |
+| `DOMPurify.sanitize` | `HTML_ESCAPE` |
+
+> **Note:** Anonymous function expressions and arrow functions passed as callback arguments (e.g., Express `app.get('/path', function(req, res) { ... })`) are automatically walked as separate function scopes for taint analysis. Each anonymous function gets a unique scope identifier to prevent cross-function taint leakage.
+
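The three tables above suggest a capability model: a source carries `all` capabilities, a sanitizer strips one, and a sink requires one. The Python sketch below illustrates that reading with a bit-flag type. It assumes a finding is raised when the tainted value still carries the capability the sink requires; `Cap`, `sanitize`, and `is_finding` are hypothetical names, and Nyx's actual representation may differ.

```python
from enum import Flag, auto


class Cap(Flag):
    HTML_ESCAPE = auto()
    SHELL_ESCAPE = auto()
    URL_ENCODE = auto()
    JSON_PARSE = auto()


# A source labeled `all` taints the value for every capability
ALL = Cap.HTML_ESCAPE | Cap.SHELL_ESCAPE | Cap.URL_ENCODE | Cap.JSON_PARSE


def sanitize(taint: Cap, stripped: Cap) -> Cap:
    """Remove the capability a sanitizer strips, leaving the rest intact."""
    return taint & ~stripped


def is_finding(taint: Cap, required: Cap) -> bool:
    """A sink fires when the value still carries the capability it requires."""
    return bool(taint & required)
```

Under this model, passing `req.query` through `DOMPurify.sanitize` clears `HTML_ESCAPE`, so an `innerHTML` sink stays quiet, while the same value reaching `child_process.exec` would still be reported because `SHELL_ESCAPE` was never stripped.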
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `js.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
+| `js.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
+
+### XSS Sinks
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
+| `js.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
+| `js.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
+| `js.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` — open redirect |
+| `js.xss.cookie_write` | Medium | A | Write to `document.cookie` |
+
+### Prototype Pollution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
+| `js.prototype.extend_object` | Medium | A | Assignment to `Object.prototype.*` |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.crypto.weak_hash` | Low | A | `crypto.createHash("md5"/"sha1")` |
+| `js.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
+
+### Insecure Transport
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `js.transport.fetch_http` | Low | A | `fetch("http://...")` — plaintext HTTP |
+
+---
+
+## Examples
+
+### `js.code_exec.eval` — Dynamic code execution
+
+**Vulnerable:**
+```javascript
+const code = req.query.code;
+eval(code);  // Remote code execution
+```
+
+**Safe alternative:**
+```javascript
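+// Avoid eval entirely: dispatch to an allowlist of operations.
+// A Map lookup cannot reach inherited keys such as "constructor".
+const allowed = new Map([["add", (a, b) => a + b]]);
+const op = allowed.get(req.query.operation);
+const result = op?.(Number(req.query.a), Number(req.query.b));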
+```
+
+### `js.xss.document_write` — XSS sink
+
+**Vulnerable:**
+```javascript
+document.write("<h1>" + userName + "</h1>");
+```
+
+**Safe alternative:**
+```javascript
+const el = document.createElement("h1");
+el.textContent = userName;
+document.body.appendChild(el);
+```
+
+### `js.prototype.proto_assignment` — Prototype pollution
+
+**Vulnerable:**
+```javascript
+function merge(target, source) {
+    for (let key in source) {
+        target[key] = source[key];  // If key is "__proto__", pollutes prototype
+    }
+}
+```
+
+**Safe alternative:**
+```javascript
+function merge(target, source) {
+    for (let key in source) {
+        if (key === "__proto__" || key === "constructor" || key === "prototype") continue;
+        target[key] = source[key];
+    }
+}
+```
+
+### Taint: `req.body` → `eval()`
+
+**Finding:**
+```
+[HIGH]   taint-unsanitised-flow (source 2:18)  src/handler.js:3:5
+         Source: req.body at 2:18
+         Sink: eval()
+         Score: 78
+```
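For scripts that post-process this text output, the header line of a finding can be split into fields with a regular expression. The Python sketch below infers the layout from the single example above, so treat the pattern as an assumption and not as a documented format; `FINDING_RE` and `parse_header` are hypothetical names.

```python
import re

# Layout assumed from the example: [SEVERITY] rule-id (source L:C) path:line:col
FINDING_RE = re.compile(
    r"\[(?P<severity>[A-Z]+)\]\s+"
    r"(?P<rule>[\w.-]+)"
    r"(?:\s+\(source\s+(?P<src_line>\d+):(?P<src_col>\d+)\))?"
    r"\s+(?P<path>\S+?):(?P<line>\d+):(?P<col>\d+)\s*$"
)


def parse_header(line):
    """Return the header fields as a dict of strings, or None if no match."""
    m = FINDING_RE.match(line.strip())
    return m.groupdict() if m else None
```

The source position is optional in the pattern, so headers for rules without a `(source L:C)` suffix would parse with `src_line` and `src_col` set to `None`.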
--- a/docs/rules/php.md
+++ b/docs/rules/php.md
@@ -0,0 +1,138 @@
+# PHP Rules
+
+Nyx detects PHP vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, path traversal, and weak crypto.
+
+## Taint Labels
+
+PHP has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/php.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `$_GET` / `_GET`, `$_POST` / `_POST`, `$_REQUEST` / `_REQUEST`, `$_COOKIE` / `_COOKIE`, `$_FILES` / `_FILES`, `$_SERVER` / `_SERVER`, `$_ENV` / `_ENV` | all |
+| `file_get_contents`, `fread` | all |
+
+> **Note:** PHP superglobal names are matched both with and without the `$` prefix because the CFG's `collect_idents` strips the leading `$` from variable names. Subscript access like `$_GET['cmd']` is handled via `element_reference` / `subscript_expression` node detection.
+
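The effect of matching superglobals with and without the `$` prefix can be illustrated in Python. This sketch is not how Nyx's `collect_idents` works internally; it only mimics the described outcome with a regular expression, and `base_identifier` and `is_superglobal_source` are hypothetical helper names.

```python
import re

# Superglobal names from the Sources table, stored without the leading "$"
PHP_SOURCES = {"_GET", "_POST", "_REQUEST", "_COOKIE", "_FILES", "_SERVER", "_ENV"}

# Optional "$" followed by an identifier; any subscript after it is ignored
IDENT_RE = re.compile(r"\$?([A-Za-z_][A-Za-z0-9_]*)")


def base_identifier(expr):
    """Return the variable name with "$" and any subscript stripped, or None."""
    m = IDENT_RE.match(expr.strip())
    return m.group(1) if m else None


def is_superglobal_source(expr):
    """True when the expression's base variable is a listed superglobal."""
    return base_identifier(expr) in PHP_SOURCES
```

With this sketch, `$_GET['cmd']`, `$_GET`, and `_GET` all resolve to the same matcher entry, which is the behavior the note describes.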
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `htmlspecialchars`, `htmlentities` | HTML_ESCAPE |
+| `escapeshellarg`, `escapeshellcmd` | SHELL_ESCAPE |
+| `basename` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `system`, `exec`, `passthru`, `shell_exec`, `proc_open`, `popen` | SHELL_ESCAPE |
+| `eval`, `assert` | SHELL_ESCAPE |
+| `include`, `include_once`, `require`, `require_once` | FILE_IO |
+| `unserialize` | SHELL_ESCAPE |
+| `move_uploaded_file`, `copy`, `file_put_contents`, `fwrite` | FILE_IO |
+| `echo`, `print` | HTML_ESCAPE |
+| `mysqli_query`, `pg_query`, `query` | SHELL_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `php.code_exec.create_function` | High | A | `create_function()` — deprecated eval-like constructor |
+| `php.code_exec.preg_replace_e` | High | A | `preg_replace` with `/e` modifier — code execution via regex |
+| `php.code_exec.assert_string` | High | A | `assert()` with string argument — evaluates PHP code |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.cmdi.system` | High | A | `system`/`shell_exec`/`exec`/`passthru`/`proc_open`/`popen` |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.deser.unserialize` | High | A | `unserialize()` — PHP object injection |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.sqli.query_concat` | Medium | B | `mysql_query`/`mysqli_query` with concatenated SQL |
+
+### Path Traversal
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.path.include_variable` | High | B | `include`/`require` with variable path — file inclusion |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `php.crypto.md5` | Low | A | `md5()` — weak hash function |
+| `php.crypto.sha1` | Low | A | `sha1()` — weak hash function |
+| `php.crypto.rand` | Low | A | `rand()`/`mt_rand()` — not cryptographically secure |
+
+---
+
+## Examples
+
+### `php.code_exec.eval` — Dynamic code execution
+
+**Vulnerable:**
+```php
+eval($_GET['code']);
+```
+
+**Safe alternative:**
+```php
+// Never use eval with user input
+// Use a template engine or allowlisted operations
+```
+
+### `php.deser.unserialize` — Object injection
+
+**Vulnerable:**
+```php
+$obj = unserialize($_COOKIE['data']);
+```
+
+**Safe alternative:**
+```php
+$data = json_decode($_COOKIE['data'], true);
+```
+
+### `php.path.include_variable` — File inclusion
+
+**Vulnerable:**
+```php
+include($_GET['page']);  // Local/remote file inclusion
+```
+
+**Safe alternative:**
+```php
+$allowed = ['home', 'about', 'contact'];
+$page = in_array($_GET['page'], $allowed) ? $_GET['page'] : 'home';
+include("pages/{$page}.php");
+```
+
+### `php.sqli.query_concat` — SQL concatenation
+
+**Vulnerable:**
+```php
+mysqli_query($conn, "SELECT * FROM users WHERE id=" . $_GET['id']);
+```
+
+**Safe alternative:**
+```php
+$stmt = $conn->prepare("SELECT * FROM users WHERE id=?");
+$stmt->bind_param("i", $_GET['id']);
+$stmt->execute();
+```
--- a/docs/rules/python.md
+++ b/docs/rules/python.md
@@ -0,0 +1,142 @@
+# Python Rules
+
+Nyx detects Python vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, and weak crypto.
+
+## Taint Labels
+
+Python has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/python.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `os.getenv`, `os.environ` | all |
+| `request.args`, `request.form`, `request.json`, `request.headers`, `request.cookies`, `input` | all |
+| `sys.argv` | all |
+| `argparse.parse_args`, `urllib.request.urlopen`, `requests.get`, `requests.post` | all |
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `html.escape` | HTML_ESCAPE |
+| `shlex.quote` | SHELL_ESCAPE |
+| `os.path.realpath` | FILE_IO |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `eval`, `exec` | SHELL_ESCAPE |
+| `os.system`, `os.popen`, `subprocess.call`, `subprocess.run`, `subprocess.Popen`, `subprocess.check_output`, `subprocess.check_call` | SHELL_ESCAPE |
+| `cursor.execute`, `cursor.executemany` | SHELL_ESCAPE |
+| `send_file`, `send_from_directory` | FILE_IO |
+| `open` | FILE_IO |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `py.code_exec.exec` | High | A | `exec()` — dynamic code execution |
+| `py.code_exec.compile` | Medium | A | `compile()` with exec/eval mode |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.cmdi.os_system` | High | A | `os.system()` — shell command execution |
+| `py.cmdi.os_popen` | High | A | `os.popen()` — shell command execution |
+| `py.cmdi.subprocess_shell` | High | B | `subprocess.*` with `shell=True` |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.deser.pickle_loads` | High | A | `pickle.loads()` / `pickle.load()` — arbitrary object deserialization |
+| `py.deser.yaml_load` | High | A | `yaml.load()` without SafeLoader |
+| `py.deser.shelve_open` | Medium | A | `shelve.open()` — pickle-backed deserialization |
+
+### SQL Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.sqli.execute_format` | Medium | B | `cursor.execute()` with string concatenation |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.crypto.md5` | Low | A | `hashlib.md5()` — weak hash algorithm |
+| `py.crypto.sha1` | Low | A | `hashlib.sha1()` — weak hash algorithm |
+
+### Template Injection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `py.xss.jinja_from_string` | Medium | A | `jinja2.Template.from_string()` — template injection |
+
+---
+
+## Examples
+
+### `py.deser.pickle_loads` — Unsafe deserialization
+
+**Vulnerable:**
+```python
+import pickle
+data = pickle.loads(request.body)  # Arbitrary code execution
+```
+
+**Safe alternative:**
+```python
+import json
+data = json.loads(request.body)  # Parses data only; cannot instantiate arbitrary objects
+```
+
+### `py.cmdi.subprocess_shell` — Shell execution
+
+**Vulnerable:**
+```python
+import subprocess
+subprocess.call(user_input, shell=True)  # Command injection
+```
+
+**Safe alternative:**
+```python
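+import subprocess
+import shlex
+# Avoids the shell, but the user still chooses the program and its arguments
+subprocess.call(shlex.split(user_input), shell=False)
+# Better: fix the program and pass user data as a single argument
+subprocess.call(["ls", "-la", "--", user_dir])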
+```
+
+### `py.deser.yaml_load` — Unsafe YAML
+
+**Vulnerable:**
+```python
+import yaml
+config = yaml.load(user_data)  # Can instantiate arbitrary objects
+```
+
+**Safe alternative:**
+```python
+import yaml
+config = yaml.safe_load(user_data)  # Only basic Python types
+```
+
+### `py.sqli.execute_format` — SQL concatenation
+
+**Vulnerable:**
+```python
+cursor.execute("SELECT * FROM users WHERE id=" + user_id)
+```
+
+**Safe alternative:**
+```python
+cursor.execute("SELECT * FROM users WHERE id=?", (user_id,))
+```
--- a/docs/rules/ruby.md
+++ b/docs/rules/ruby.md
@@ -0,0 +1,132 @@
+# Ruby Rules
+
+Nyx detects Ruby vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, reflection, SSRF, and weak crypto.
+
+## Taint Labels
+
+Ruby has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/ruby.rs`.
+
+### Sources
+
+| Matcher | Cap |
+|---------|-----|
+| `ENV`, `gets` | all |
+| `params` | all |
+
+> **Note:** Ruby's `params[:cmd]` subscript access is detected via `element_reference` node handling in the CFG. Sinatra/Rails `do...end` blocks are walked as function scopes.
+
+### Sanitizers
+
+| Matcher | Cap |
+|---------|-----|
+| `CGI.escapeHTML`, `ERB::Util.html_escape` | HTML_ESCAPE |
+| `Shellwords.escape`, `Shellwords.shellescape` | SHELL_ESCAPE |
+
+### Sinks
+
+| Matcher | Cap |
+|---------|-----|
+| `system`, `exec` | SHELL_ESCAPE |
+| `eval` | SHELL_ESCAPE |
+| `puts`, `print` | HTML_ESCAPE |
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.code_exec.eval` | High | A | `Kernel#eval` — dynamic code execution |
+| `rb.code_exec.instance_eval` | High | A | `instance_eval` — evaluates string in object context |
+| `rb.code_exec.class_eval` | High | A | `class_eval` / `module_eval` — evaluates string in class context |
+
+### Command Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.cmdi.backtick` | High | A | Backtick shell execution (`` `cmd` ``) |
+| `rb.cmdi.system_interp` | High | A | `system`/`exec` call — command execution risk |
+
+### Deserialization
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.deser.yaml_load` | High | A | `YAML.load` — arbitrary object deserialization |
+| `rb.deser.marshal_load` | High | A | `Marshal.load` — arbitrary Ruby object deserialization |
+
+### Reflection
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.reflection.send_dynamic` | Medium | B | `send()` with non-symbol argument — arbitrary method dispatch |
+| `rb.reflection.constantize` | Medium | A | `constantize` / `safe_constantize` — dynamic class resolution |
+
+### SSRF
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.ssrf.open_uri` | Medium | A | `Kernel#open` with HTTP URL — SSRF via open-uri |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rb.crypto.md5` | Low | A | `Digest::MD5` — weak hash algorithm |
+
+---
+
+## Examples
+
+### `rb.deser.yaml_load` — Unsafe YAML deserialization
+
+**Vulnerable:**
+```ruby
+data = YAML.load(params[:config])  # Arbitrary object instantiation before Psych 4
+```
+
+**Safe alternative:**
+```ruby
+data = YAML.safe_load(params[:config])  # Only basic Ruby types
+```
+
+### `rb.cmdi.backtick` — Backtick shell execution
+
+**Vulnerable:**
+```ruby
+output = `ls #{user_dir}`  # Command injection via interpolation
+```
+
+**Safe alternative:**
+```ruby
+require 'open3'
+output, status = Open3.capture2('ls', '--', user_dir)  # Argument list; no shell involved
+```
+
+### `rb.reflection.send_dynamic` — Dynamic method dispatch
+
+**Vulnerable:**
+```ruby
+obj.send(params[:method], params[:arg])  # Arbitrary method invocation
+```
+
+**Safe alternative:**
+```ruby
+allowed = %w[name email phone]
+if allowed.include?(params[:method])
+  obj.public_send(params[:method])  # Allowlisted name, public methods only
+end
+```
+
+### `rb.deser.marshal_load` — Marshal deserialization
+
+**Vulnerable:**
+```ruby
+obj = Marshal.load(request.body.read)
+```
+
+**Safe alternative:**
+```ruby
+data = JSON.parse(request.body.read)  # Plain data only; needs require 'json'
+```
--- a/docs/rules/rust.md
+++ b/docs/rules/rust.md
@ -0,0 +1,105 @@
+# Rust Rules
+
+Nyx detects Rust vulnerabilities through AST patterns (memory safety, code quality) and taint analysis (command injection via `env::var` → `Command::new`).
+
+## Taint Sources
+
+| Function | Capability | Source Kind |
+|----------|-----------|-------------|
+| `std::env::var`, `env::var` | `all` | EnvironmentConfig |
+
+## Taint Sinks
+
+| Function | Required Capability |
+|----------|-------------------|
+| `Command::new`, `Command::arg`, `Command::args` | `SHELL_ESCAPE` |
+| `Command::status`, `Command::output` | `SHELL_ESCAPE` |
+| `fs::read_to_string`, `fs::write`, `fs::read`, `File::open`, `File::create` | `FILE_IO` |
+
+## Taint Sanitizers
+
+| Function | Strips Capability |
+|----------|------------------|
+| `html_escape::encode_safe`, `sanitize_html` | `HTML_ESCAPE` |
+| `shell_escape::unix::escape`, `sanitize_shell` | `SHELL_ESCAPE` |
+
+> **Note:** `fs::read_to_string` was moved from taint sources to sinks to support path traversal detection (`env::var` → `fs::read_to_string`).
+
+---
+
+## AST Pattern Rules
+
+### Memory Safety
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rs.memory.transmute` | High | A | `std::mem::transmute` — unchecked type reinterpretation |
+| `rs.memory.copy_nonoverlapping` | High | A | `ptr::copy_nonoverlapping` — raw pointer memcpy |
+| `rs.memory.get_unchecked` | High | A | `get_unchecked` / `get_unchecked_mut` — unchecked indexing |
+| `rs.memory.mem_zeroed` | High | A | `std::mem::zeroed` — UB for types where all-zero bits are invalid (e.g. references) |
+| `rs.memory.ptr_read` | High | A | `ptr::read` / `ptr::read_volatile` — raw pointer dereference |
+| `rs.memory.narrow_cast` | Low | A | `as u8`/`i8`/`u16`/`i16` — possible truncation |
+| `rs.memory.mem_forget` | Low | A | `std::mem::forget` — may leak resources |
+
+### Code Quality
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `rs.quality.unsafe_block` | Medium | A | `unsafe { }` block — manual memory safety obligation |
+| `rs.quality.unsafe_fn` | Medium | A | `unsafe fn` declaration |
+| `rs.quality.unwrap` | Low | A | `.unwrap()` — panics on `None`/`Err` |
+| `rs.quality.expect` | Low | A | `.expect()` — panics on `None`/`Err` |
+| `rs.quality.panic_macro` | Low | A | `panic!()` macro invocation |
+| `rs.quality.todo` | Low | A | `todo!()` / `unimplemented!()` placeholder |
+
+---
+
+## Examples
+
+### `rs.memory.transmute` — Unchecked type reinterpretation
+
+**Vulnerable:**
+```rust
+let x: u32 = 42;
+let y: f32 = unsafe { std::mem::transmute(x) };
+```
+
+**Safe alternative:**
+```rust
+let x: u32 = 42;
+let y: f32 = f32::from_bits(x);
+```
+
+### `rs.quality.unsafe_block` — Unsafe block
+
+**Flagged:**
+```rust
+unsafe {
+    let ptr = &x as *const i32;
+    println!("{}", *ptr);
+}
+```
+
+**Safe alternative:**
+```rust
+// Use safe abstractions when possible
+println!("{}", x);
+```
+
+### Taint: `env::var` → `Command::new`
+
+**Vulnerable:**
+```rust
+let cmd = std::env::var("USER_CMD").unwrap();
+std::process::Command::new("sh").arg("-c").arg(&cmd).output()?;  // Shell runs attacker-controlled text
+```
+
+**Safe alternative:**
+```rust
+let cmd = std::env::var("USER_CMD").unwrap();
+// Run only programs on a fixed allowlist, directly and without a shell
+let allowed = ["ls", "whoami", "date"];
+if allowed.contains(&cmd.as_str()) {
+    std::process::Command::new(&cmd).output()?;
+}
+```
--- a/docs/rules/typescript.md
+++ b/docs/rules/typescript.md
@ -0,0 +1,81 @@
+# TypeScript Rules
+
+TypeScript rules mirror JavaScript patterns plus TypeScript-specific type-safety escape detectors. Taint labels are shared with JavaScript (see [JavaScript Rules](javascript.md)).
+
+---
+
+## AST Pattern Rules
+
+### Code Execution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.code_exec.eval` | High | A | `eval()` — dynamic code execution |
+| `ts.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
+| `ts.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
+
+### XSS Sinks
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
+| `ts.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
+| `ts.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
+| `ts.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` |
+| `ts.xss.cookie_write` | Low | A | Write to `document.cookie` |
+
+### Prototype Pollution
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
+
+### Weak Crypto
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
+
+### Code Quality (TypeScript-specific)
+
+| Rule ID | Severity | Tier | Description |
+|---------|----------|------|-------------|
+| `ts.quality.any_annotation` | Low | A | Type annotation of `any` — disables type checking |
+| `ts.quality.as_any` | Low | A | Type assertion `as any` — type-safety escape hatch |
+
+---
+
+## Examples
+
+### `ts.quality.any_annotation` — `any` type
+
+**Flagged:**
+```typescript
+function process(data: any) {  // ts.quality.any_annotation
+    data.whatever();  // No type checking
+}
+```
+
+**Safe alternative:**
+```typescript
+interface UserData { name: string; email: string; }
+function process(data: UserData) {
+    console.log(data.name);
+}
+```
+
+### `ts.quality.as_any` — Type assertion escape
+
+**Flagged:**
+```typescript
+const result = someValue as any;  // ts.quality.as_any
+result.nonexistentMethod();
+```
+
+**Safe alternative:**
+```typescript
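+if (isValidType(someValue)) {  // Assumes a type guard: (v: unknown) => v is KnownType
+    const result: KnownType = someValue;  // Narrowed by the guard; no assertion needed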
+    result.knownMethod();
+}
+```