flakestorm/docs/BEHAVIORAL_CONTRACTS.md

113 lines
5.5 KiB
Markdown
Raw Permalink Normal View History

# Behavioral Contracts (Pillar 2)
**What it is:** A **contract** is a named set of **invariants** (rules the agent must always follow). Flakestorm runs your agent under each scenario in a **chaos matrix** and checks every invariant in every scenario. The result is a **resilience score** (0100%) and a pass/fail matrix.
**Why it matters:** You need to know that the agent still obeys its rules when tools fail, the LLM is degraded, or context is poisoned — not just on the happy path.
**Question answered:** *Does the agent obey its rules when the world breaks?*
---
## When to use it
- You have hard rules: “always cite a source”, “never return PII”, “never fabricate numbers when tools fail”.
- You want a single **resilience score** for CI that reflects behavior across multiple failure modes.
- You run `flakestorm contract run` for contract-only checks, or `flakestorm ci` to include contract in the overall score.
---
## Configuration
In `flakestorm.yaml` with `version: "2.0"` add `contract` and `chaos_matrix`:
```yaml
contract:
name: "Finance Agent Contract"
description: "Invariants that must hold under all failure conditions"
invariants:
- id: always-cite-source
type: regex
pattern: "(?i)(source|according to|reference)"
severity: critical
when: always
description: "Must always cite a data source"
- id: never-fabricate-when-tools-fail
type: regex
pattern: '\\$[\\d,]+\\.\\d{2}'
negate: true
severity: critical
when: tool_faults_active
description: "Must not return dollar figures when tools are failing"
- id: max-latency
type: latency
max_ms: 60000
severity: medium
when: always
chaos_matrix:
- name: "no-chaos"
tool_faults: []
llm_faults: []
- name: "search-tool-down"
tool_faults:
- tool: market_data_api
mode: error
error_code: 503
- name: "llm-degraded"
llm_faults:
- mode: truncated_response
max_tokens: 20
```
### Invariant fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique identifier for this invariant. |
| `type` | Yes | Same as run invariants: `contains`, `regex`, `latency`, `valid_json`, `similarity`, `excludes_pii`, `refusal_check`, `completes`, `output_not_empty`, `contains_any`, `excludes_pattern`, `behavior_unchanged`, etc. |
| `severity` | No | `critical` \| `high` \| `medium` \| `low` (default `medium`). Weights the resilience score; **any critical failure** = automatic fail. |
| `when` | No | `always` \| `tool_faults_active` \| `llm_faults_active` \| `any_chaos_active` \| `no_chaos`. When this invariant is evaluated. |
| `negate` | No | If true, the check passes when the pattern does **not** match (e.g. “must NOT contain dollar figures”). |
| `description` | No | Human-readable description. |
| **`probes`** | No | For **system_prompt_leak_probe**: list of probe prompts to run instead of golden_prompts; use with `excludes_pattern` to ensure no leak. |
| **`baseline`** | No | For `behavior_unchanged`: `auto` or manual baseline string. |
| **`similarity_threshold`** | No | For `behavior_unchanged`/similarity; default 0.75. |
| Plus type-specific | — | `pattern`, `patterns`, `value`, `values`, `max_ms`, `threshold`, etc., same as [Configuration Guide](CONFIGURATION_GUIDE.md). |
### Chaos matrix
Each entry is a **scenario**: a name plus optional `tool_faults`, `llm_faults`, and `context_attacks`. The contract engine runs golden prompts (or **probes** for that invariant when set) under each scenario and verifies every invariant. Result: **invariants × scenarios** cells; resilience score is severity-weighted pass rate, and **any critical failure** fails the contract.
---
## Resilience score
- **Formula:** (Σ passed × severity_weight) / (Σ total × severity_weight) × 100.
- **Weights:** critical = 3, high = 2, medium = 1, low = 1.
- **Automatic FAIL:** If any invariant with severity `critical` fails in any scenario, the contract is considered failed regardless of the numeric score.
See [V2 Spec](V2_SPEC.md) for the exact formula and matrix isolation (reset) behavior.
---
## Commands
| Command | What it does |
|---------|----------------|
| `flakestorm contract run` | Run the contract across the chaos matrix; print resilience score and pass/fail. |
| `flakestorm contract validate` | Validate contract YAML without executing. |
| `flakestorm contract score` | Output only the resilience score (e.g. for CI: `flakestorm contract score -c flakestorm.yaml`). |
| `flakestorm ci` | Runs contract (if configured) and includes **contract_compliance** in the **overall** weighted score. |
---
## Stateful agents
If your agent keeps state between calls, each (invariant × scenario) cell should start from a clean state. Set **`agent.reset_endpoint`** (HTTP POST URL, e.g. `http://localhost:8000/reset`) or **`agent.reset_function`** (Python module path, e.g. `myagent:reset_state`) so Flakestorm can reset between cells. If the agent appears stateful (same prompt produces different responses on two calls) and no reset is configured, Flakestorm logs: *"Warning: No reset_endpoint configured. Contract matrix cells may share state. Results may be contaminated. Add reset_endpoint to your config for accurate isolation."* It does not fail the run.
---
## See also
- [Environment Chaos](ENVIRONMENT_CHAOS.md) — How tool/LLM faults and context attacks are defined.
- [Configuration Guide](CONFIGURATION_GUIDE.md) — Full `invariants` and checker reference.