flakestorm/docs/V2_SPEC.md

31 lines
1.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# V2 Spec Clarifications
## Python callable / tool interception
For `agent.type: python`, **tool fault injection** requires one of:
- An explicit list of tool callables in config that Flakestorm can wrap, or
- A `ToolRegistry` interface that Flakestorm wraps.
If neither is provided, Flakestorm **fails with a clear error** (does not silently skip tool fault injection).
## Contract matrix isolation
Each (invariant × scenario) cell is an **independent invocation**. Agent state must not leak between cells.
- **Reset is optional:** configure `agent.reset_endpoint` (HTTP) or `agent.reset_function` (Python) to clear state before each cell.
- If no reset is configured and the agent **appears stateful** (response variance across identical inputs), Flakestorm **warns** (does not fail):
*"Warning: No reset_endpoint configured. Contract matrix cells may share state. Results may be contaminated. Add reset_endpoint to your config for accurate isolation."*
## Resilience score formula
**Per-contract score:**
```
score = (Σ(passed_critical×3) + Σ(passed_high×2) + Σ(passed_medium×1))
/ (Σ(total_critical×3) + Σ(total_high×2) + Σ(total_medium×1)) × 100
```
**Automatic FAIL:** If any **critical** severity invariant fails in any scenario, the overall result is FAIL regardless of the numeric score.
**Overall score (mutation + chaos + contract + replay):** Configurable via `scoring.weights` (default: mutation 20%, chaos 35%, contract 35%, replay 10%).