* chore: Exclude CLAUDE.md from Cargo.toml

* feat: add callgraph module and integrate into main analysis flow

* feat: enhance CLI with new severity filtering and analysis modes

* feat: update CHANGELOG with recent enhancements and fixes to severity filtering and output handling

* feat: implement state-model dataflow analysis for resource lifecycle and auth state

* feat: enhance diagnostic output formatting and add evidence structure

* feat: implement attack surface ranking for diagnostics with scoring and sorting

* feat: add comprehensive documentation for installation, usage, and rules reference

* feat: add multiple language support for command execution and evaluation endpoints

* feat: implement inline suppression for findings using `nyx:ignore` comments

* feat: add confidence levels to AST patterns and update output structure

* feat: implement low-noise prioritization system with category filtering, rollup grouping, and configurable budgets

* feat: bump version to 0.4.0 and update changelog with new features and improvements

* feat: add dead code allowances to various functions in mod.rs and real_world_tests.rs
Eli Peter 2026-02-25 21:16:36 -05:00 committed by GitHub
parent 19b578c5c4
commit 1bbe4b1cfb
GPG key ID: B5690EEEBB952194
456 changed files with 25628 additions and 1228 deletions

89
docs/rules/c.md Normal file

@ -0,0 +1,89 @@
# C Rules
Nyx detects C vulnerabilities through AST patterns (banned functions, format strings) and taint analysis (user input → shell execution, buffer overflow sinks).
## Taint Sources
| Function | Capability | Source Kind |
|----------|-----------|-------------|
| `getenv` | `all` | EnvironmentConfig |
| `fgets`, `scanf`, `fscanf`, `gets`, `read` | `all` | UserInput |
## Taint Sinks
| Function | Required Capability |
|----------|-------------------|
| `system`, `popen`, `exec*` family | `SHELL_ESCAPE` |
| `sprintf`, `strcpy`, `strcat` | `HTML_ESCAPE` |
| `printf`, `fprintf` | `FMT_STRING` |
| `fopen`, `open` | `FILE_IO` |
---
## AST Pattern Rules
### Memory Safety (Banned Functions)
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `c.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
| `c.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination buffer |
| `c.memory.strcat` | High | A | `strcat()` — no bounds checking on destination buffer |
| `c.memory.sprintf` | High | A | `sprintf()` — no length limit on output buffer |
| `c.memory.scanf_percent_s` | High | A | `scanf("%s")` — unbounded string read |
| `c.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability (non-literal first arg) |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `c.cmdi.system` | High | A | `system()` — shell command execution |
| `c.cmdi.popen` | Medium | A | `popen()` — shell command execution with pipe |
---
## Examples
### `c.memory.gets` — Banned function
**Vulnerable:**
```c
char buf[64];
gets(buf); // No bounds checking — buffer overflow
```
**Safe alternative:**
```c
char buf[64];
fgets(buf, sizeof(buf), stdin);
```
### `c.memory.printf_no_fmt` — Format string
**Vulnerable:**
```c
char *user_input = get_input();
printf(user_input); // Format string vulnerability
```
**Safe alternative:**
```c
char *user_input = get_input();
printf("%s", user_input);
```
### `c.cmdi.system` — Shell execution
**Vulnerable:**
```c
char cmd[256];
snprintf(cmd, sizeof(cmd), "ls %s", user_dir);
system(cmd); // Command injection if user_dir contains shell metacharacters
```
**Safe alternative:**
```c
// Use execvp with explicit argument array
char *args[] = {"ls", user_dir, NULL};
execvp("ls", args);
```

66
docs/rules/cpp.md Normal file

@ -0,0 +1,66 @@
# C++ Rules
C++ rules inherit C banned-function concerns and add C++-specific patterns like dangerous casts.
## Taint Labels
C++ shares taint labels with C. See [C Rules](c.md) for the source and sink listing.
---
## AST Pattern Rules
### Memory Safety
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `cpp.memory.gets` | High | A | `gets()` — no bounds checking, always exploitable |
| `cpp.memory.strcpy` | High | A | `strcpy()` — no bounds checking on destination |
| `cpp.memory.strcat` | High | A | `strcat()` — no bounds checking on destination |
| `cpp.memory.sprintf` | High | A | `sprintf()` — no length limit on output |
| `cpp.memory.reinterpret_cast` | Medium | A | `reinterpret_cast` — type-punning cast |
| `cpp.memory.const_cast` | Medium | A | `const_cast` — removes const/volatile qualifier |
| `cpp.memory.printf_no_fmt` | High | B | `printf(var)` — format-string vulnerability |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `cpp.cmdi.system` | High | A | `system()` — shell command execution |
| `cpp.cmdi.popen` | High | A | `popen()` — shell command execution |
---
## Examples
### `cpp.memory.reinterpret_cast` — Type-punning cast
**Flagged:**
```cpp
int x = 42;
float* fp = reinterpret_cast<float*>(&x); // Type-punning, may violate strict aliasing
```
**Safe alternative:**
```cpp
int x = 42;
float f;
std::memcpy(&f, &x, sizeof(f)); // Well-defined type punning
```
### `cpp.memory.const_cast` — Removing const
**Flagged:**
```cpp
void process(const std::string& s) {
    char* p = const_cast<char*>(s.c_str()); // Removes const
    p[0] = 'X'; // Undefined behavior
}
```
**Safe alternative:**
```cpp
void process(std::string s) { // Take by value
    s[0] = 'X';
}
```

148
docs/rules/go.md Normal file

@ -0,0 +1,148 @@
# Go Rules
Nyx detects Go vulnerabilities through AST patterns and taint analysis, covering command execution, unsafe pointer usage, TLS misconfiguration, weak crypto, SQL injection, hardcoded secrets, and deserialization.
## Taint Labels
Go has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/go.rs`.
### Sources
| Matcher | Cap |
|---------|-----|
| `os.Getenv` | all |
| `http.Request`, `r.FormValue`, `r.URL`, `r.Body`, `r.Header` | all |
| `r.URL.Query`, `r.URL.Query.Get`, `Request.FormValue`, `Request.URL` | all |
### Sanitizers
| Matcher | Cap |
|---------|-----|
| `html.EscapeString`, `template.HTMLEscapeString` | HTML_ESCAPE |
| `url.QueryEscape`, `url.PathEscape` | URL_ENCODE |
| `filepath.Clean`, `filepath.Base` | FILE_IO |
### Sinks
| Matcher | Cap |
|---------|-----|
| `exec.Command` | SHELL_ESCAPE |
| `db.Query`, `db.Exec`, `db.QueryRow`, `db.Prepare` | SHELL_ESCAPE |
| `fmt.Fprintf`, `fmt.Sprintf`, `fmt.Printf` | FMT_STRING |
| `os.Open`, `os.OpenFile`, `os.Create`, `ioutil.ReadFile`, `os.ReadFile` | FILE_IO |
| `template.HTML` | HTML_ESCAPE |
> **Note:** Chained calls like `r.URL.Query().Get("host")` are normalized by stripping internal `()` segments before matching, so `r.URL.Query.Get` matches the source rule.
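
The normalization described in the note can be sketched as follows. This is a minimal illustration, not Nyx's actual implementation: the helper names `normalize_call_chain` and `is_source` are hypothetical, and the sketch drops every parenthesized group (including the trailing argument list) rather than only the internal ones.

```rust
/// Hypothetical helper: remove every balanced `(...)` group from a call
/// chain so that `r.URL.Query().Get("host")` becomes `r.URL.Query.Get`.
/// Limitation: parentheses inside string literals are not handled.
fn normalize_call_chain(expr: &str) -> String {
    let mut out = String::with_capacity(expr.len());
    let mut depth = 0usize;
    for ch in expr.chars() {
        match ch {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            // Keep characters only when outside any argument list.
            _ if depth == 0 => out.push(ch),
            _ => {}
        }
    }
    out
}

/// Hypothetical helper: compare the normalized chain against source matchers.
fn is_source(expr: &str, matchers: &[&str]) -> bool {
    let normalized = normalize_call_chain(expr);
    matchers.iter().any(|m| normalized == *m)
}
```

With the matcher list from the Sources table, `is_source("r.URL.Query().Get(\"host\")", &["r.URL.Query.Get"])` returns `true`, because the chain is compared only after its `()` segments are removed.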
---
## AST Pattern Rules
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.cmdi.exec_command` | High | A | `exec.Command()` — arbitrary process execution |
### Memory Safety
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.memory.unsafe_pointer` | Medium | A | `unsafe.Pointer` — bypasses Go type system |
### Insecure Transport
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.transport.insecure_skip_verify` | High | A | `InsecureSkipVerify: true` — disables TLS certificate validation |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.crypto.md5` | Low | A | `md5.New()` / `md5.Sum()` — weak hash algorithm |
| `go.crypto.sha1` | Low | A | `sha1.New()` / `sha1.Sum()` — weak hash algorithm |
### SQL Injection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.sqli.query_concat` | Medium | B | `db.Query`/`Exec`/`QueryRow` with concatenated string |
### Secrets
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.secrets.hardcoded_key` | Medium | A | Variable with secret-like name assigned a string literal |
### Deserialization
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `go.deser.gob_decode` | Medium | A | `gob.NewDecoder` — Go binary deserialization |
---
## Examples
### `go.transport.insecure_skip_verify` — TLS misconfiguration
**Vulnerable:**
```go
tr := &http.Transport{
    TLSClientConfig: &tls.Config{
        InsecureSkipVerify: true, // Disables certificate verification
    },
}
```
**Safe alternative:**
```go
tr := &http.Transport{
    TLSClientConfig: &tls.Config{
        // Use proper CA certificates
        RootCAs: certPool,
    },
}
```
### `go.sqli.query_concat` — SQL concatenation
**Vulnerable:**
```go
rows, err := db.Query("SELECT * FROM users WHERE id=" + userID)
```
**Safe alternative:**
```go
rows, err := db.Query("SELECT * FROM users WHERE id=$1", userID)
```
### `go.secrets.hardcoded_key` — Hardcoded secret
**Flagged:**
```go
apiKey := "sk-1234567890abcdef"
password := "hunter2"
```
**Safe alternative:**
```go
apiKey := os.Getenv("API_KEY")
password := os.Getenv("DB_PASSWORD")
```
### `go.cmdi.exec_command` — Command execution
**Vulnerable:**
```go
cmd := exec.Command("sh", "-c", userInput)
cmd.Run()
```
**Safe alternative:**
```go
// Use explicit command and arguments, not shell
cmd := exec.Command("ls", "-la", safeDir)
cmd.Run()
```

79
docs/rules/index.md Normal file

@ -0,0 +1,79 @@
# Rule Reference
This section lists every detection rule in Nyx, organized by language.
## Rule ID Format
| Prefix | Detector Family | Example |
|--------|----------------|---------|
| `taint-*` | [Taint analysis](../detectors/taint.md) | `taint-unsanitised-flow (source 5:11)` |
| `cfg-*` | [CFG structural](../detectors/cfg.md) | `cfg-unguarded-sink`, `cfg-auth-gap` |
| `state-*` | [State model](../detectors/state.md) | `state-use-after-close`, `state-resource-leak` |
| `<lang>.*.*` | [AST patterns](../detectors/patterns.md) | `rs.memory.transmute`, `js.code_exec.eval` |
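
For tooling that consumes Nyx output, the prefix scheme in the table above can be mapped to a detector family with a few string checks. The sketch below is an assumption about how a consumer might do this; the `DetectorFamily` enum and `detector_family` function are hypothetical names and are not part of Nyx.

```rust
/// Hypothetical classification of a rule ID by the prefixes listed above.
#[derive(Debug, PartialEq)]
enum DetectorFamily {
    Taint,
    CfgStructural,
    StateModel,
    AstPattern,
    Unknown,
}

fn detector_family(rule_id: &str) -> DetectorFamily {
    if rule_id.starts_with("taint-") {
        DetectorFamily::Taint
    } else if rule_id.starts_with("cfg-") {
        DetectorFamily::CfgStructural
    } else if rule_id.starts_with("state-") {
        DetectorFamily::StateModel
    } else if rule_id.split('.').count() == 3 && rule_id.split('.').all(|part| !part.is_empty()) {
        // AST pattern IDs have the shape `<lang>.<category>.<name>`.
        DetectorFamily::AstPattern
    } else {
        DetectorFamily::Unknown
    }
}
```

Taint IDs carry a trailing `(source L:C)` suffix, so the sketch matches on the prefix only and does not try to parse the location.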
## Cross-Language Rules
These rules apply to all supported languages:
### Taint Rules
| Rule ID | Severity | Description |
|---------|----------|-------------|
| `taint-unsanitised-flow (source L:C)` | Varies by source kind | Unsanitized data flows from source to sink |
### CFG Structural Rules
| Rule ID | Severity | Description |
|---------|----------|-------------|
| `cfg-unguarded-sink` | High/Medium | Sink without dominating guard |
| `cfg-auth-gap` | High | Web handler reaches privileged sink without auth |
| `cfg-unreachable-sink` | Medium | Dangerous function in unreachable code |
| `cfg-unreachable-sanitizer` | Low | Sanitizer in unreachable code |
| `cfg-unreachable-source` | Low | Source in unreachable code |
| `cfg-error-fallthrough` | High/Medium | Error path doesn't terminate before dangerous code |
| `cfg-resource-leak` | Medium | Resource not released on all exit paths |
| `cfg-lock-not-released` | Medium | Lock not released on all exit paths |
### State Model Rules
| Rule ID | Severity | Description |
|---------|----------|-------------|
| `state-use-after-close` | High | Variable used after being closed |
| `state-double-close` | Medium | Resource closed twice |
| `state-resource-leak` | Medium | Resource never closed (definite) |
| `state-resource-leak-possible` | Low | Resource may not close on all paths |
| `state-unauthed-access` | High | Privileged operation without authentication |
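
To make the resource-lifecycle rules concrete, here is a minimal sketch of a state machine that reports three of the rule IDs above for a single, straight-line sequence of events. It is an illustration under simplifying assumptions, not Nyx's dataflow analysis: the `Event` and `ResourceState` types and the `check_lifecycle` function are hypothetical, and a real analysis would track states across all control-flow paths (which is where `state-resource-leak-possible` would come from).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ResourceState {
    Open,
    Closed,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    Open,
    Use,
    Close,
}

/// Walk one linear event sequence for a single resource and collect rule IDs.
fn check_lifecycle(events: &[Event]) -> Vec<&'static str> {
    let mut findings = Vec::new();
    let mut state: Option<ResourceState> = None;
    for &event in events {
        match (event, state) {
            (Event::Open, _) => state = Some(ResourceState::Open),
            (Event::Use, Some(ResourceState::Closed)) => findings.push("state-use-after-close"),
            (Event::Close, Some(ResourceState::Closed)) => findings.push("state-double-close"),
            (Event::Close, _) => state = Some(ResourceState::Closed),
            (Event::Use, _) => {}
        }
    }
    // Still open at the end of the sequence: definite leak on this path.
    if state == Some(ResourceState::Open) {
        findings.push("state-resource-leak");
    }
    findings
}
```

For example, the sequence open, close, use yields `state-use-after-close`, while open, use, close yields no findings.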
## Per-Language AST Pattern Rules
Each language page lists all AST pattern rules with examples:
- [Rust](rust.md): 12 rules (memory safety, code quality)
- [C](c.md): 8 rules (banned functions, command execution, format strings)
- [C++](cpp.md): 9 rules (banned functions, dangerous casts, command execution)
- [Java](java.md): 8 rules (deserialization, command execution, reflection, SQL, crypto, XSS)
- [Go](go.md): 8 rules (command execution, unsafe pointer, TLS, crypto, SQL, secrets, deserialization)
- [JavaScript](javascript.md): 13 rules (code execution, XSS, prototype pollution, crypto, transport)
- [TypeScript](typescript.md): 10 rules (mirrors JS + type-safety escapes)
- [Python](python.md): 13 rules (code execution, command execution, deserialization, SQL, crypto, XSS)
- [PHP](php.md): 11 rules (code execution, command execution, deserialization, SQL, path traversal, crypto)
- [Ruby](ruby.md): 11 rules (code execution, command execution, deserialization, reflection, SSRF, crypto)
## Taint Label Coverage
Taint analysis uses language-specific source/sink/sanitizer labels. Coverage varies by language:
| Language | Sources | Sinks | Sanitizers | Coverage |
|----------|---------|-------|------------|----------|
| Rust | Complete | Complete | Complete | Full |
| JavaScript | Complete | Complete | Partial | Full |
| TypeScript | Partial | Partial | Partial | Moderate |
| Python | Partial | Complete | Partial | Moderate |
| C | Partial | Complete | Minimal | Moderate |
| C++ | Partial | Complete | Minimal | Moderate |
| Java | Partial | Partial | Partial | Moderate |
| Go | Complete | Complete | Partial | Full |
| PHP | Complete | Complete | Partial | Full |
| Ruby | Partial | Partial | Partial | Moderate |
Coverage below "Full" means basic rules exist but many common library functions are not yet labeled. Contributions are welcome.

135
docs/rules/java.md Normal file

@ -0,0 +1,135 @@
# Java Rules
Nyx detects Java vulnerabilities through AST patterns and taint analysis, covering deserialization, command execution, reflection, SQL injection, weak crypto, and XSS.
## Taint Labels
Java has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/java.rs`.
### Sources
| Matcher | Cap |
|---------|-----|
| `System.getenv` | all |
| `getParameter`, `getInputStream`, `getHeader`, `getCookies`, `getReader`, `getQueryString`, `getPathInfo` | all |
| `readObject`, `readLine` | all |
### Sanitizers
| Matcher | Cap |
|---------|-----|
| `HtmlUtils.htmlEscape`, `StringEscapeUtils.escapeHtml4` | HTML_ESCAPE |
### Sinks
| Matcher | Cap |
|---------|-----|
| `Runtime.exec`, `ProcessBuilder` | SHELL_ESCAPE |
| `executeQuery`, `executeUpdate`, `prepareStatement` | SHELL_ESCAPE |
| `Class.forName` | SHELL_ESCAPE |
| `println`, `print`, `write` | HTML_ESCAPE |
---
## AST Pattern Rules
### Deserialization
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.deser.readobject` | High | A | `ObjectInputStream.readObject()` — unsafe deserialization |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.cmdi.runtime_exec` | High | A | `Runtime.getRuntime().exec()` — shell command execution |
### Reflection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.reflection.class_forname` | Medium | A | `Class.forName()` — dynamic class loading |
| `java.reflection.method_invoke` | Medium | A | `Method.invoke()` — reflective method invocation |
### SQL Injection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.sqli.execute_concat` | Medium | B | SQL `execute*()` with concatenated string argument |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.crypto.insecure_random` | Low | A | `new Random()`: `java.util.Random` is not cryptographically secure |
| `java.crypto.weak_digest` | Low | A | `MessageDigest.getInstance("MD5"/"SHA1")` |
### XSS
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `java.xss.getwriter_print` | Medium | A | `response.getWriter().print/println/write` — direct output |
---
## Examples
### `java.deser.readobject` — Unsafe deserialization
**Vulnerable:**
```java
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Object obj = ois.readObject(); // Arbitrary object instantiation
```
**Safe alternative:**
```java
// Use a safe format like JSON
ObjectMapper mapper = new ObjectMapper();
MyType obj = mapper.readValue(request.getInputStream(), MyType.class);
```
### `java.sqli.execute_concat` — SQL concatenation
**Vulnerable:**
```java
String query = "SELECT * FROM users WHERE id=" + userId;
stmt.executeQuery(query); // SQL injection
```
**Safe alternative:**
```java
PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE id=?");
ps.setString(1, userId);
ResultSet rs = ps.executeQuery();
```
### `java.cmdi.runtime_exec` — Command execution
**Vulnerable:**
```java
Runtime.getRuntime().exec("cmd /c " + userCommand);
```
**Safe alternative:**
```java
ProcessBuilder pb = new ProcessBuilder("cmd", "/c", "dir");
// Use explicit argument list, never concatenate user input
```
### `java.reflection.class_forname` — Dynamic class loading
**Flagged:**
```java
Class<?> cls = Class.forName(className);
Object obj = cls.getDeclaredConstructor().newInstance();
```
**Safe alternative:**
```java
// Use an allowlist of permitted class names
Map<String, Class<?>> allowed = Map.of("User", User.class, "Order", Order.class);
Class<?> cls = allowed.get(className);
if (cls != null) { /* ... */ }
```

138
docs/rules/javascript.md Normal file

@ -0,0 +1,138 @@
# JavaScript Rules
JavaScript has the most complete taint label coverage alongside Rust. Nyx detects code execution, XSS, prototype pollution, command injection, and weak crypto.
## Taint Sources
| Function | Capability | Source Kind |
|----------|-----------|-------------|
| `document.location`, `window.location` | `all` | UserInput |
| `req.body`, `req.query`, `req.params` | `all` | UserInput |
| `req.headers`, `req.cookies` | `all` | UserInput |
| `process.env` | `all` | EnvironmentConfig |
## Taint Sinks
| Function | Required Capability |
|----------|-------------------|
| `eval` | `SHELL_ESCAPE` |
| `innerHTML` | `HTML_ESCAPE` |
| `location.href`, `window.location.href` | `URL_ENCODE` |
| `child_process.exec`, `child_process.execSync` | `SHELL_ESCAPE` |
| `child_process.spawn` | `SHELL_ESCAPE` |
## Taint Sanitizers
| Function | Strips Capability |
|----------|------------------|
| `JSON.parse` | `JSON_PARSE` |
| `encodeURIComponent`, `encodeURI` | `URL_ENCODE` |
| `DOMPurify.sanitize` | `HTML_ESCAPE` |
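
One plausible reading of the three tables above is a bitmask model: a source taints a value with all capabilities, a sanitizer strips one, and a sink reports a finding if the capability it requires is still present. The sketch below illustrates that reading only. The bit values, the `Taint` type, and its methods are assumptions made for this example and are not taken from Nyx's source.

```rust
// Capability bits named after the labels in the tables; values are arbitrary.
const HTML_ESCAPE: u8 = 1 << 0;
const SHELL_ESCAPE: u8 = 1 << 1;
const URL_ENCODE: u8 = 1 << 2;
const JSON_PARSE: u8 = 1 << 3;
const ALL: u8 = HTML_ESCAPE | SHELL_ESCAPE | URL_ENCODE | JSON_PARSE;

/// Hypothetical taint value: the set of capabilities not yet sanitized.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Taint(u8);

impl Taint {
    /// A source with capability `all` taints every capability.
    fn from_source() -> Self {
        Taint(ALL)
    }

    /// A sanitizer clears the capability it strips.
    fn sanitize(self, stripped: u8) -> Self {
        Taint(self.0 & !stripped)
    }

    /// A sink is reached unsafely if its required capability is still set.
    fn reaches_sink(self, required: u8) -> bool {
        (self.0 & required) != 0
    }
}
```

Under this model, passing `req.query` through `encodeURIComponent` clears `URL_ENCODE`, so assigning the result to `location.href` is not reported, but passing the same value to `eval` still is, because `SHELL_ESCAPE` remains set.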
> **Note:** Anonymous function expressions and arrow functions passed as callback arguments (e.g., Express `app.get('/path', function(req, res) { ... })`) are automatically walked as separate function scopes for taint analysis. Each anonymous function gets a unique scope identifier to prevent cross-function taint leakage.
---
## AST Pattern Rules
### Code Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.code_exec.eval` | High | A | `eval()` — dynamic code execution |
| `js.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
| `js.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
### XSS Sinks
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
| `js.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
| `js.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
| `js.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` — open redirect |
| `js.xss.cookie_write` | Medium | A | Write to `document.cookie` |
### Prototype Pollution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
| `js.prototype.extend_object` | Medium | A | Assignment to `Object.prototype.*` |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.crypto.weak_hash` | Low | A | `crypto.createHash("md5"/"sha1")` |
| `js.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
### Insecure Transport
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `js.transport.fetch_http` | Low | A | `fetch("http://...")` — plaintext HTTP |
---
## Examples
### `js.code_exec.eval` — Dynamic code execution
**Vulnerable:**
```javascript
const code = req.query.code;
eval(code); // Remote code execution
```
**Safe alternative:**
```javascript
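// Avoid eval entirely: dispatch to an allowlist of known operations
const allowed = { add: (a, b) => a + b };
const op = req.query.operation;
// Object.hasOwn rejects inherited keys such as "constructor" or "toString"
const result = Object.hasOwn(allowed, op)
  ? allowed[op](Number(req.query.a), Number(req.query.b))
  : undefined;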
```
### `js.xss.document_write` — XSS sink
**Vulnerable:**
```javascript
document.write("<h1>" + userName + "</h1>");
```
**Safe alternative:**
```javascript
const el = document.createElement("h1");
el.textContent = userName;
document.body.appendChild(el);
```
### `js.prototype.proto_assignment` — Prototype pollution
**Vulnerable:**
```javascript
function merge(target, source) {
  for (let key in source) {
    target[key] = source[key]; // If key is "__proto__", pollutes prototype
  }
}
```
**Safe alternative:**
```javascript
function merge(target, source) {
  for (let key in source) {
    if (key === "__proto__" || key === "constructor") continue;
    target[key] = source[key];
  }
}
```
### Taint: `req.body` → `eval()`
**Finding:**
```
[HIGH] taint-unsanitised-flow (source 2:18) src/handler.js:3:5
Source: req.body at 2:18
Sink: eval()
Score: 78
```

138
docs/rules/php.md Normal file

@ -0,0 +1,138 @@
# PHP Rules
Nyx detects PHP vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, path traversal, and weak crypto.
## Taint Labels
PHP has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/php.rs`.
### Sources
| Matcher | Cap |
|---------|-----|
| `$_GET` / `_GET`, `$_POST` / `_POST`, `$_REQUEST` / `_REQUEST`, `$_COOKIE` / `_COOKIE`, `$_FILES` / `_FILES`, `$_SERVER` / `_SERVER`, `$_ENV` / `_ENV` | all |
| `file_get_contents`, `fread` | all |
> **Note:** PHP superglobal names are matched both with and without the `$` prefix because the CFG's `collect_idents` strips the leading `$` from variable names. Subscript access like `$_GET['cmd']` is handled via `element_reference` / `subscript_expression` node detection.
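
The effect of the `$` stripping described in the note can be illustrated with a short sketch. The names `PHP_SOURCES`, `normalize_php_ident`, and `is_superglobal_source` are hypothetical and chosen for this example; the real matcher in `src/labels/php.rs` may be organized differently (the table above suggests it lists both spellings explicitly).

```rust
/// Superglobal names without the leading `$`, as they would appear after
/// identifier collection strips it. Hypothetical list for illustration.
const PHP_SOURCES: &[&str] = &[
    "_GET", "_POST", "_REQUEST", "_COOKIE", "_FILES", "_SERVER", "_ENV",
];

/// Drop a single leading `$` if present; otherwise return the name unchanged.
fn normalize_php_ident(name: &str) -> &str {
    name.strip_prefix('$').unwrap_or(name)
}

/// True when the identifier names a superglobal source, with or without `$`.
fn is_superglobal_source(name: &str) -> bool {
    let normalized = normalize_php_ident(name);
    PHP_SOURCES.iter().any(|source| *source == normalized)
}
```

Both `is_superglobal_source("$_GET")` and `is_superglobal_source("_GET")` return `true`, which mirrors why the matcher accepts either form.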
### Sanitizers
| Matcher | Cap |
|---------|-----|
| `htmlspecialchars`, `htmlentities` | HTML_ESCAPE |
| `escapeshellarg`, `escapeshellcmd` | SHELL_ESCAPE |
| `basename` | FILE_IO |
### Sinks
| Matcher | Cap |
|---------|-----|
| `system`, `exec`, `passthru`, `shell_exec`, `proc_open`, `popen` | SHELL_ESCAPE |
| `eval`, `assert` | SHELL_ESCAPE |
| `include`, `include_once`, `require`, `require_once` | FILE_IO |
| `unserialize` | SHELL_ESCAPE |
| `move_uploaded_file`, `copy`, `file_put_contents`, `fwrite` | FILE_IO |
| `echo`, `print` | HTML_ESCAPE |
| `mysqli_query`, `pg_query`, `query` | SHELL_ESCAPE |
---
## AST Pattern Rules
### Code Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.code_exec.eval` | High | A | `eval()` — dynamic code execution |
| `php.code_exec.create_function` | High | A | `create_function()` — deprecated eval-like constructor |
| `php.code_exec.preg_replace_e` | High | A | `preg_replace` with `/e` modifier — code execution via regex |
| `php.code_exec.assert_string` | High | A | `assert()` with string argument — evaluates PHP code |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.cmdi.system` | High | A | `system`/`shell_exec`/`exec`/`passthru`/`proc_open`/`popen` |
### Deserialization
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.deser.unserialize` | High | A | `unserialize()` — PHP object injection |
### SQL Injection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.sqli.query_concat` | Medium | B | `mysql_query`/`mysqli_query` with concatenated SQL |
### Path Traversal
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.path.include_variable` | High | B | `include`/`require` with variable path — file inclusion |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `php.crypto.md5` | Low | A | `md5()` — weak hash function |
| `php.crypto.sha1` | Low | A | `sha1()` — weak hash function |
| `php.crypto.rand` | Low | A | `rand()`/`mt_rand()` — not cryptographically secure |
---
## Examples
### `php.code_exec.eval` — Dynamic code execution
**Vulnerable:**
```php
eval($_GET['code']);
```
**Safe alternative:**
```php
// Never use eval with user input
// Use a template engine or allowlisted operations
```
### `php.deser.unserialize` — Object injection
**Vulnerable:**
```php
$obj = unserialize($_COOKIE['data']);
```
**Safe alternative:**
```php
$data = json_decode($_COOKIE['data'], true);
```
### `php.path.include_variable` — File inclusion
**Vulnerable:**
```php
include($_GET['page']); // Local/remote file inclusion
```
**Safe alternative:**
```php
$allowed = ['home', 'about', 'contact'];
$page = in_array($_GET['page'], $allowed) ? $_GET['page'] : 'home';
include("pages/{$page}.php");
```
### `php.sqli.query_concat` — SQL concatenation
**Vulnerable:**
```php
mysqli_query($conn, "SELECT * FROM users WHERE id=" . $_GET['id']);
```
**Safe alternative:**
```php
$stmt = $conn->prepare("SELECT * FROM users WHERE id=?");
$stmt->bind_param("i", $_GET['id']);
$stmt->execute();
```

142
docs/rules/python.md Normal file

@ -0,0 +1,142 @@
# Python Rules
Nyx detects Python vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, SQL injection, weak crypto, and template injection.
## Taint Labels
Python has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/python.rs`.
### Sources
| Matcher | Cap |
|---------|-----|
| `os.getenv`, `os.environ` | all |
| `request.args`, `request.form`, `request.json`, `request.headers`, `request.cookies`, `input` | all |
| `sys.argv` | all |
| `argparse.parse_args`, `urllib.request.urlopen`, `requests.get`, `requests.post` | all |
### Sanitizers
| Matcher | Cap |
|---------|-----|
| `html.escape` | HTML_ESCAPE |
| `shlex.quote` | SHELL_ESCAPE |
| `os.path.realpath` | FILE_IO |
### Sinks
| Matcher | Cap |
|---------|-----|
| `eval`, `exec` | SHELL_ESCAPE |
| `os.system`, `os.popen`, `subprocess.call`, `subprocess.run`, `subprocess.Popen`, `subprocess.check_output`, `subprocess.check_call` | SHELL_ESCAPE |
| `cursor.execute`, `cursor.executemany` | SHELL_ESCAPE |
| `send_file`, `send_from_directory` | FILE_IO |
| `open` | FILE_IO |
---
## AST Pattern Rules
### Code Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.code_exec.eval` | High | A | `eval()` — dynamic code execution |
| `py.code_exec.exec` | High | A | `exec()` — dynamic code execution |
| `py.code_exec.compile` | Medium | A | `compile()` with exec/eval mode |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.cmdi.os_system` | High | A | `os.system()` — shell command execution |
| `py.cmdi.os_popen` | High | A | `os.popen()` — shell command execution |
| `py.cmdi.subprocess_shell` | High | B | `subprocess.*` with `shell=True` |
### Deserialization
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.deser.pickle_loads` | High | A | `pickle.loads()` / `pickle.load()` — arbitrary object deserialization |
| `py.deser.yaml_load` | High | A | `yaml.load()` without SafeLoader |
| `py.deser.shelve_open` | Medium | A | `shelve.open()` — pickle-backed deserialization |
### SQL Injection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.sqli.execute_format` | Medium | B | `cursor.execute()` with string concatenation |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.crypto.md5` | Low | A | `hashlib.md5()` — weak hash algorithm |
| `py.crypto.sha1` | Low | A | `hashlib.sha1()` — weak hash algorithm |
### Template Injection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `py.xss.jinja_from_string` | Medium | A | `jinja2.Template.from_string()` — template injection |
---
## Examples
### `py.deser.pickle_loads` — Unsafe deserialization
**Vulnerable:**
```python
import pickle
data = pickle.loads(request.body) # Arbitrary code execution
```
**Safe alternative:**
```python
import json
data = json.loads(request.body)  # Parses plain data; cannot run code while loading
```
### `py.cmdi.subprocess_shell` — Shell execution
**Vulnerable:**
```python
import subprocess
subprocess.call(user_input, shell=True) # Command injection
```
**Safe alternative:**
```python
import subprocess
import shlex
# No shell is involved, but the user still chooses which program runs
subprocess.call(shlex.split(user_input), shell=False)
# Better: fix the command and pass user data only as an argument
subprocess.call(["ls", "-la", user_dir])
```
### `py.deser.yaml_load` — Unsafe YAML
**Vulnerable:**
```python
import yaml
config = yaml.load(user_data) # Can instantiate arbitrary objects
```
**Safe alternative:**
```python
import yaml
config = yaml.safe_load(user_data) # Only basic Python types
```
### `py.sqli.execute_format` — SQL concatenation
**Vulnerable:**
```python
cursor.execute("SELECT * FROM users WHERE id=" + user_id)
```
**Safe alternative:**
```python
cursor.execute("SELECT * FROM users WHERE id=?", (user_id,))  # Placeholder style depends on the driver (? for sqlite3, %s for psycopg2)
```

132
docs/rules/ruby.md Normal file

@ -0,0 +1,132 @@
# Ruby Rules
Nyx detects Ruby vulnerabilities through AST patterns and taint analysis, covering code execution, command injection, deserialization, reflection, SSRF, and weak crypto.
## Taint Labels
Ruby has moderate taint label coverage. Sources, sinks, and sanitizers are defined in `src/labels/ruby.rs`.
### Sources
| Matcher | Cap |
|---------|-----|
| `ENV`, `gets` | all |
| `params` | all |
> **Note:** Ruby's `params[:cmd]` subscript access is detected via `element_reference` node handling in the CFG. Sinatra/Rails `do...end` blocks are walked as function scopes.
### Sanitizers
| Matcher | Cap |
|---------|-----|
| `CGI.escapeHTML`, `ERB::Util.html_escape` | HTML_ESCAPE |
| `Shellwords.escape`, `Shellwords.shellescape` | SHELL_ESCAPE |
### Sinks
| Matcher | Cap |
|---------|-----|
| `system`, `exec` | SHELL_ESCAPE |
| `eval` | SHELL_ESCAPE |
| `puts`, `print` | HTML_ESCAPE |
---
## AST Pattern Rules
### Code Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.code_exec.eval` | High | A | `Kernel#eval` — dynamic code execution |
| `rb.code_exec.instance_eval` | High | A | `instance_eval` — evaluates string in object context |
| `rb.code_exec.class_eval` | High | A | `class_eval` / `module_eval` — evaluates string in class context |
### Command Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.cmdi.backtick` | High | A | Backtick shell execution (`` `cmd` ``) |
| `rb.cmdi.system_interp` | High | A | `system`/`exec` call — command execution risk |
### Deserialization
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.deser.yaml_load` | High | A | `YAML.load` — arbitrary object deserialization |
| `rb.deser.marshal_load` | High | A | `Marshal.load` — arbitrary Ruby object deserialization |
### Reflection
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.reflection.send_dynamic` | Medium | B | `send()` with non-symbol argument — arbitrary method dispatch |
| `rb.reflection.constantize` | Medium | A | `constantize` / `safe_constantize` — dynamic class resolution |
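
For dynamic class resolution, an explicit mapping from user-facing names to classes is one way to avoid resolving arbitrary constant names. A minimal sketch in plain Ruby; the exporter classes, `EXPORTERS`, and `exporter_for` are hypothetical names used only for illustration:

```ruby
# Hypothetical exporter classes used only for illustration.
class CsvExporter; end
class JsonExporter; end

# Map user-facing names to classes explicitly instead of resolving
# arbitrary constant names taken from input.
EXPORTERS = {
  'csv' => CsvExporter,
  'json' => JsonExporter
}.freeze

def exporter_for(format)
  EXPORTERS.fetch(format.to_s) do
    raise ArgumentError, "unknown export format: #{format.inspect}"
  end
end
```

A call such as `exporter_for(params[:format]).new` can then only instantiate classes that appear in the table.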
### SSRF
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.ssrf.open_uri` | Medium | A | `Kernel#open` with HTTP URL — SSRF via open-uri |
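
A common mitigation for this kind of finding is to validate the URL before fetching it. The sketch below assumes a hypothetical host allowlist (`ALLOWED_HOSTS`) and helper name (`validated_uri`); it checks the scheme and host only and performs no network access.

```ruby
require 'uri'

# Hypothetical allowlist; a real application would load this from config.
ALLOWED_HOSTS = %w[api.example.com cdn.example.com].freeze

# Returns a parsed URI only for HTTPS URLs whose host is allowlisted,
# otherwise raises ArgumentError.
def validated_uri(raw_url)
  uri = URI.parse(raw_url.to_s)
  unless uri.is_a?(URI::HTTPS) && ALLOWED_HOSTS.include?(uri.host)
    raise ArgumentError, "URL not permitted: #{raw_url.inspect}"
  end
  uri
rescue URI::InvalidURIError
  raise ArgumentError, "malformed URL: #{raw_url.inspect}"
end
```

The validated URI can then be passed to an HTTP client such as `Net::HTTP.get_response` instead of `Kernel#open`. The check does not cover redirects or DNS rebinding, which need separate handling.
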
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rb.crypto.md5` | Low | A | `Digest::MD5` — weak hash algorithm |
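
Where MD5 is used for fingerprints or integrity checks, SHA-256 from the same standard library is usually a direct replacement. A minimal sketch; `fingerprint` is a hypothetical helper name:

```ruby
require 'digest'

# Returns the SHA-256 digest of the input as a 64-character hex string.
def fingerprint(data)
  Digest::SHA256.hexdigest(data)
end
```

For password storage, a dedicated password-hashing function is the appropriate choice rather than any plain digest.
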
---
## Examples
### `rb.deser.yaml_load` — Unsafe YAML deserialization
**Vulnerable:**
```ruby
data = YAML.load(params[:config]) # Arbitrary object instantiation
```
**Safe alternative:**
```ruby
data = YAML.safe_load(params[:config]) # Only basic Ruby types
```
### `rb.cmdi.backtick` — Backtick shell execution
**Vulnerable:**
```ruby
output = `ls #{user_dir}` # Command injection via interpolation
```
**Safe alternative:**
```ruby
require 'open3'
output, status = Open3.capture2('ls', user_dir)
```
### `rb.reflection.send_dynamic` — Dynamic method dispatch
**Vulnerable:**
```ruby
obj.send(params[:method], params[:arg]) # Arbitrary method invocation
```
**Safe alternative:**
```ruby
allowed = %w[name email phone]
if allowed.include?(params[:method])
  obj.send(params[:method])
end
```
### `rb.deser.marshal_load` — Marshal deserialization
**Vulnerable:**
```ruby
obj = Marshal.load(request.body.read)
```
**Safe alternative:**
```ruby
require 'json'
data = JSON.parse(request.body.read) # Plain data only, no object instantiation
```

105
docs/rules/rust.md Normal file

@ -0,0 +1,105 @@
# Rust Rules
Nyx detects Rust vulnerabilities through AST patterns (memory safety, code quality) and taint analysis (command injection via `env::var` → `Command::new`).
## Taint Sources
| Function | Capability | Source Kind |
|----------|-----------|-------------|
| `std::env::var`, `env::var` | `all` | EnvironmentConfig |
## Taint Sinks
| Function | Required Capability |
|----------|-------------------|
| `Command::new`, `Command::arg`, `Command::args` | `SHELL_ESCAPE` |
| `Command::status`, `Command::output` | `SHELL_ESCAPE` |
| `fs::read_to_string`, `fs::write`, `fs::read`, `File::open`, `File::create` | `FILE_IO` |
## Taint Sanitizers
| Function | Strips Capability |
|----------|------------------|
| `html_escape::encode_safe`, `sanitize_html` | `HTML_ESCAPE` |
| `shell_escape::unix::escape`, `sanitize_shell` | `SHELL_ESCAPE` |
> **Note:** `fs::read_to_string` was moved from taint sources to sinks to support path traversal detection (`env::var` → `fs::read_to_string`).
---
## AST Pattern Rules
### Memory Safety
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rs.memory.transmute` | High | A | `std::mem::transmute` — unchecked type reinterpretation |
| `rs.memory.copy_nonoverlapping` | High | A | `ptr::copy_nonoverlapping` — raw pointer memcpy |
| `rs.memory.get_unchecked` | High | A | `get_unchecked` / `get_unchecked_mut` — unchecked indexing |
| `rs.memory.mem_zeroed` | High | A | `std::mem::zeroed` — UB for types where all-zero bytes are not a valid value (e.g. references) |
| `rs.memory.ptr_read` | High | A | `ptr::read` / `ptr::read_volatile` — raw pointer dereference |
| `rs.memory.narrow_cast` | Low | A | `as u8`/`i8`/`u16`/`i16` — possible truncation |
| `rs.memory.mem_forget` | Low | A | `std::mem::forget` — may leak resources |
### Code Quality
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `rs.quality.unsafe_block` | Medium | A | `unsafe { }` block — manual memory safety obligation |
| `rs.quality.unsafe_fn` | Medium | A | `unsafe fn` declaration |
| `rs.quality.unwrap` | Low | A | `.unwrap()` — panics on `None`/`Err` |
| `rs.quality.expect` | Low | A | `.expect()` — panics on `None`/`Err` |
| `rs.quality.panic_macro` | Low | A | `panic!()` macro invocation |
| `rs.quality.todo` | Low | A | `todo!()` / `unimplemented!()` placeholder |
---
## Examples
### `rs.memory.transmute` — Unchecked type reinterpretation
**Vulnerable:**
```rust
let x: u32 = 42;
let y: f32 = unsafe { std::mem::transmute(x) };
```
**Safe alternative:**
```rust
let x: u32 = 42;
let y: f32 = f32::from_bits(x);
```
### `rs.quality.unsafe_block` — Unsafe block
**Flagged:**
```rust
let x: i32 = 42;
unsafe {
    let ptr = &x as *const i32;
    println!("{}", *ptr);
}
```
**Safe alternative:**
```rust
// Use safe abstractions when possible
let x: i32 = 42;
println!("{}", x);
```
### Taint: `env::var` → `Command::new`
**Vulnerable:**
```rust
use std::process::Command;

let cmd = std::env::var("USER_CMD").unwrap();
// The tainted value is passed to the shell unmodified
Command::new("sh").arg("-c").arg(&cmd).output()?;
```
**Safe alternative:**
```rust
use std::process::Command;

let cmd = std::env::var("USER_CMD").unwrap();
// Validate against allowlist
let allowed = ["ls", "whoami", "date"];
if allowed.contains(&cmd.as_str()) {
    // No shell is involved: the allowlisted program is executed directly
    Command::new(&cmd).output()?;
}
```

81
docs/rules/typescript.md Normal file

@ -0,0 +1,81 @@
# TypeScript Rules
TypeScript rules mirror JavaScript patterns plus TypeScript-specific type-safety escape detectors. Taint labels are shared with JavaScript (see [JavaScript Rules](javascript.md)).
---
## AST Pattern Rules
### Code Execution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.code_exec.eval` | High | A | `eval()` — dynamic code execution |
| `ts.code_exec.new_function` | High | A | `new Function()` — eval equivalent |
| `ts.code_exec.settimeout_string` | Medium | A | `setTimeout`/`setInterval` with string argument |
### XSS Sinks
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.xss.document_write` | Medium | A | `document.write()` / `document.writeln()` |
| `ts.xss.outer_html` | Medium | A | Assignment to `.outerHTML` |
| `ts.xss.insert_adjacent_html` | Medium | A | `insertAdjacentHTML()` |
| `ts.xss.location_assign` | Medium | A | Assignment to `location`/`location.href` |
| `ts.xss.cookie_write` | Low | A | Write to `document.cookie` |
### Prototype Pollution
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.prototype.proto_assignment` | Medium | A | Assignment to `__proto__` |
### Weak Crypto
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.crypto.math_random` | Low | A | `Math.random()` — not cryptographically secure |
### Code Quality (TypeScript-specific)
| Rule ID | Severity | Tier | Description |
|---------|----------|------|-------------|
| `ts.quality.any_annotation` | Low | A | Type annotation of `any` — disables type checking |
| `ts.quality.as_any` | Low | A | Type assertion `as any` — type-safety escape hatch |
---
## Examples
### `ts.quality.any_annotation` — `any` type
**Flagged:**
```typescript
function process(data: any) { // ts.quality.any_annotation
  data.whatever(); // No type checking
}
```
**Safe alternative:**
```typescript
interface UserData { name: string; email: string; }
function process(data: UserData) {
  console.log(data.name);
}
```
### `ts.quality.as_any` — Type assertion escape
**Flagged:**
```typescript
const result = someValue as any; // ts.quality.as_any
result.nonexistentMethod();
```
**Safe alternative:**
```typescript
if (isValidType(someValue)) {
  const result = someValue as KnownType;
  result.knownMethod();
}
```