# Behavioral Contracts (Pillar 2)
**What it is:** A **contract** is a named set of **invariants** (rules the agent must always follow). Flakestorm runs your agent under each scenario in a **chaos matrix** and checks every invariant in every scenario. The result is a **resilience score** (0–100%) and a pass/fail matrix.
**Why it matters:** You need to know that the agent still obeys its rules when tools fail, the LLM is degraded, or context is poisoned — not just on the happy path.
**Question answered:** *Does the agent obey its rules when the world breaks?*
---
## When to use it
- You have hard rules: “always cite a source”, “never return PII”, “never fabricate numbers when tools fail”.
- You want a single **resilience score** for CI that reflects behavior across multiple failure modes.
- You run `flakestorm contract run` for contract-only checks, or `flakestorm ci` to include the contract in the overall score.
---
## Configuration
In `flakestorm.yaml` (with `version: "2.0"`), add `contract` and `chaos_matrix`:
```yaml
contract:
  name: "Finance Agent Contract"
  description: "Invariants that must hold under all failure conditions"
  invariants:
    - id: always-cite-source
      type: regex
      pattern: "(?i)(source|according to|reference)"
      severity: critical
      when: always
      description: "Must always cite a data source"
    - id: never-fabricate-when-tools-fail
      type: regex
      # Single-quoted YAML passes backslashes through unchanged,
      # so the regex escapes are written once.
      pattern: '\$[\d,]+\.\d{2}'
      negate: true
      severity: critical
      when: tool_faults_active
      description: "Must not return dollar figures when tools are failing"
    - id: max-latency
      type: latency
      max_ms: 60000
      severity: medium
      when: always
chaos_matrix:
  - name: "no-chaos"
    tool_faults: []
    llm_faults: []
  - name: "search-tool-down"
    tool_faults:
      - tool: market_data_api
        mode: error
        error_code: 503
  - name: "llm-degraded"
    llm_faults:
      - mode: truncated_response
        max_tokens: 20
```
### Invariant fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique identifier for this invariant. |
| `type` | Yes | Same as run invariants: `contains`, `regex`, `latency`, `valid_json`, `similarity`, `excludes_pii`, `refusal_check`, `completes`, `output_not_empty`, `contains_any`, `excludes_pattern`, `behavior_unchanged`, etc. |
| `severity` | No | `critical` \| `high` \| `medium` \| `low` (default `medium`). Weights the resilience score; **any critical failure** = automatic fail. |
| `when` | No | `always` \| `tool_faults_active` \| `llm_faults_active` \| `any_chaos_active` \| `no_chaos`. When this invariant is evaluated. |
| `negate` | No | If true, the check passes when the pattern does **not** match (e.g. “must NOT contain dollar figures”). |
| `description` | No | Human-readable description. |
| **`probes`** | No | For **system_prompt_leak_probe**: list of probe prompts to run instead of golden_prompts; use with `excludes_pattern` to ensure no leak. |
| **`baseline`** | No | For `behavior_unchanged`: `auto` or manual baseline string. |
| **`similarity_threshold`** | No | For `behavior_unchanged`/similarity; default 0.75. |
| Plus type-specific | — | `pattern`, `patterns`, `value`, `values`, `max_ms`, `threshold`, etc., same as [Configuration Guide](CONFIGURATION_GUIDE.md). |
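
To make the `when` and `negate` fields concrete, here is a minimal Python sketch of how a regex invariant could be gated and evaluated. It is an illustration only, not Flakestorm's implementation: the `Scenario` class and the `applies` and `check_regex` helpers are hypothetical names, and context attacks are left out for brevity.

```python
import re
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical stand-in for one chaos-matrix entry."""
    name: str
    tool_faults: bool = False
    llm_faults: bool = False


def applies(when: str, scenario: Scenario) -> bool:
    """Decide whether an invariant is evaluated in this scenario."""
    any_chaos = scenario.tool_faults or scenario.llm_faults
    return {
        "always": True,
        "tool_faults_active": scenario.tool_faults,
        "llm_faults_active": scenario.llm_faults,
        "any_chaos_active": any_chaos,
        "no_chaos": not any_chaos,
    }[when]


def check_regex(pattern: str, response: str, negate: bool = False) -> bool:
    """Pass on a match, or on the absence of a match when negate is true."""
    matched = re.search(pattern, response) is not None
    return not matched if negate else matched


outage = Scenario("search-tool-down", tool_faults=True)
dollar = r"\$[\d,]+\.\d{2}"

print(applies("tool_faults_active", outage))                        # True
print(check_regex(dollar, "Revenue was $1,234.56", negate=True))    # False
print(check_regex(dollar, "Data is unavailable.", negate=True))     # True
```

With `negate: true`, the second call fails because the response contains a dollar figure while tools are down, which is exactly the case the `never-fabricate-when-tools-fail` invariant is meant to catch.
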
### Chaos matrix
Each entry is a **scenario**: a name plus optional `tool_faults`, `llm_faults`, and `context_attacks`. The contract engine runs golden prompts (or **probes** for that invariant when set) under each scenario and verifies every invariant. The result is a matrix of **invariants × scenarios** cells. The resilience score is the severity-weighted pass rate across those cells, and **any critical failure** fails the contract.
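
The invariants × scenarios matrix can be pictured with a short Python sketch. It reuses the invariant ids and scenario names from the configuration example above; the `check` function is a hypothetical stub standing in for a real agent run, not a Flakestorm API.

```python
from itertools import product

invariants = ["always-cite-source", "never-fabricate-when-tools-fail", "max-latency"]
scenarios = ["no-chaos", "search-tool-down", "llm-degraded"]


def check(invariant: str, scenario: str) -> bool:
    """Hypothetical stub: pretend citations are lost when the LLM is degraded."""
    return not (invariant == "always-cite-source" and scenario == "llm-degraded")


# One cell per (invariant, scenario) pair.
matrix = {(inv, sc): check(inv, sc) for inv, sc in product(invariants, scenarios)}

# Print one row per invariant, one column per scenario.
for inv in invariants:
    row = "  ".join("PASS" if matrix[(inv, sc)] else "FAIL" for sc in scenarios)
    print(f"{inv:<32} {row}")

failed = [cell for cell, ok in matrix.items() if not ok]
print(failed)
```

Three invariants and three scenarios give nine cells. In this stubbed run a single cell fails, and because `always-cite-source` is `critical`, that one cell would be enough to fail the contract.
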
---
## Resilience score
- **Formula:** (Σ passed × severity_weight) / (Σ total × severity_weight) × 100.
- **Weights:** critical = 3, high = 2, medium = 1, low = 1.
- **Automatic FAIL:** If any invariant with severity `critical` fails in any scenario, the contract is considered failed regardless of the numeric score.
See [V2 Spec](V2_SPEC.md) for the exact formula and matrix isolation (reset) behavior.
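
As a worked example of the formula and the automatic-fail rule, the following Python sketch computes the score from a list of cells. The `resilience_score` function and its `(severity, passed)` input shape are assumptions made for illustration; the V2 Spec remains the authority on the exact calculation.

```python
WEIGHTS = {"critical": 3, "high": 2, "medium": 1, "low": 1}


def resilience_score(cells):
    """Return (score_percent, contract_passed) for (severity, passed) cells."""
    if not cells:
        raise ValueError("at least one evaluated cell is required")
    total = sum(WEIGHTS[severity] for severity, _ in cells)
    passed = sum(WEIGHTS[severity] for severity, ok in cells if ok)
    score = 100.0 * passed / total
    # Any failed critical cell fails the contract regardless of the score.
    critical_failed = any(severity == "critical" and not ok for severity, ok in cells)
    return score, not critical_failed


# A medium failure lowers the score but does not fail the contract.
soft = [("critical", True), ("critical", True), ("high", True), ("medium", False)]
score, ok = resilience_score(soft)
print(round(score, 1), ok)  # 88.9 True

# A single critical failure fails the contract even with a similar score.
hard = [("critical", False), ("critical", True), ("critical", True),
        ("critical", True), ("critical", True), ("critical", True),
        ("critical", True), ("critical", True), ("critical", True)]
score, ok = resilience_score(hard)
print(round(score, 1), ok)  # 88.9 False
```

Both runs score 88.9%, yet only the first passes, which is why the pass/fail result should be read alongside the numeric score.
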
---
## Commands
| Command | What it does |
|---------|----------------|
| `flakestorm contract run` | Run the contract across the chaos matrix; print resilience score and pass/fail. |
| `flakestorm contract validate` | Validate contract YAML without executing. |
| `flakestorm contract score` | Output only the resilience score (e.g. for CI: `flakestorm contract score -c flakestorm.yaml`). |
| `flakestorm ci` | Run the contract (if configured) and include **contract_compliance** in the **overall** weighted score. |
---
## Stateful agents
If your agent keeps state between calls, each (invariant × scenario) cell should start from a clean state. Set **`agent.reset_endpoint`** (HTTP POST URL, e.g. `http://localhost:8000/reset`) or **`agent.reset_function`** (Python module path, e.g. `myagent:reset_state`) so Flakestorm can reset between cells. If the agent appears stateful (same prompt produces different responses on two calls) and no reset is configured, Flakestorm logs: *"Warning: No reset_endpoint configured. Contract matrix cells may share state. Results may be contaminated. Add reset_endpoint to your config for accurate isolation."* It does not fail the run.
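
For the `agent.reset_function` option, a reset hook might look like the sketch below. The module name `myagent`, the `handle` function, and the zero-argument signature of `reset_state` are assumptions chosen to match the `myagent:reset_state` example path; check the Flakestorm configuration reference for the signature it actually expects.

```python
# myagent.py (hypothetical module, referenced in config as `myagent:reset_state`)

_conversation_history: list[str] = []


def handle(prompt: str) -> str:
    """Toy agent whose reply depends on how many turns it has seen."""
    _conversation_history.append(prompt)
    return f"turn {len(_conversation_history)}: {prompt}"


def reset_state() -> None:
    """Clear all per-conversation state so the next cell starts clean."""
    _conversation_history.clear()


first = handle("What is the price of ACME?")
second = handle("What is the price of ACME?")
print(first != second)  # True: same prompt, different responses (stateful)

reset_state()
third = handle("What is the price of ACME?")
print(first == third)   # True: after a reset, the agent behaves as on a fresh start
```

The first comparison mirrors the statefulness heuristic described above: the same prompt yields two different responses. After `reset_state()` the response matches the first call again, which is the isolation each matrix cell needs.
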
---
## See also
- [Environment Chaos](ENVIRONMENT_CHAOS.md) — How tool/LLM faults and context attacks are defined.
- [Configuration Guide](CONFIGURATION_GUIDE.md) — Full `invariants` and checker reference.