Release v2.1.23 Receipt Lock hardening

Hardens Sanhedrin Receipt Lock for model-agnostic use, adds fail-open telemetry and receipt docs, fixes smart_ingest batch safety, wires opt-in CUDA Qwen3 device selection, and refreshes dashboard/release assets.

Fixes #54
Fixes #58
Fixes #60
Refs #59
Sam Valladares 2026-05-27 19:03:16 -05:00 committed by GitHub
parent a8550410b0
commit 14b061f124
GPG key ID: B5690EEEBB952194
161 changed files with 1775 additions and 262 deletions

@@ -21,7 +21,7 @@ The default Cognitive Sandwich installer only stages files and removes old v2.1.
└────────────────────────────────────────────────┘
```
Sanhedrin, preflight, and all Vestige Claude Code hooks are optional. The default installer wires none of them; it does not call Claude, start MLX, require a 19 GB model download, or require 20+ GB of RAM. Users who want preflight context can opt in with `--enable-preflight`. Users who want the post-response verifier can opt in with `--enable-sanhedrin` and point it at any OpenAI-compatible `/v1/chat/completions` endpoint. On Apple Silicon, an additional `--with-launchd` flag can auto-start the local MLX Qwen backend.
Sanhedrin, preflight, and all Vestige Claude Code hooks are optional. The default installer wires none of them; it does not call Claude, start MLX, require a 19 GB model download, or require 20+ GB of RAM. Users who want preflight context can opt in with `--enable-preflight`. Users who want the post-response verifier can opt in with `--enable-sanhedrin` and point it at any OpenAI-compatible `/v1/chat/completions` endpoint and model name. Sanhedrin is model-agnostic: if no verifier model is configured, it fails open and records guidance instead of guessing a large model. On Apple Silicon, an additional `--with-launchd` flag can auto-start the local MLX Qwen backend.
---
@@ -80,7 +80,7 @@ The bridge still prints legacy one-line `yes` / `no - ...` by default for Stop-h
- it never calls `smart_ingest`
- it cannot promote, demote, merge, suppress, or supersede durable memories
- it does not satisfy the durable-evidence requirement for `REFUTED_BY_ABSENCE`
- it does not satisfy the durable-evidence requirement for `SUPPORTED`, `REFUTED`, or `REFUTED_BY_ABSENCE`
- durable memory writes remain a separate commit-after-pass step
False-positive guards (added v2.1.0 after dogfood):
@@ -130,7 +130,8 @@ scripts/check-sandwich-prereqs.sh --preflight
Sanhedrin is a separate opt-in layer.
```bash
# Wire the Sanhedrin Stop hook, using the default OpenAI-compatible endpoint.
# Wire the Sanhedrin Stop hook without choosing a model yet.
# It will fail open until endpoint/model are configured.
vestige sandwich install --enable-sanhedrin
# Apple Silicon only, and only if the machine has enough memory:
@@ -143,6 +144,16 @@ vestige sandwich install \
--sanhedrin-model=qwen2.5:14b
```
Backend presets live at `hooks/sanhedrin-presets.json` and cover custom
OpenAI-compatible servers, small local laptops, balanced local Ollama, MLX,
vLLM, llama.cpp, hosted OpenAI-compatible APIs, and Anthropic via LiteLLM.
Presets are recipes, not requirements. The hook itself only needs an
OpenAI-compatible `/v1/chat/completions` endpoint and a model name chosen by the
user. Backend-specific payload extensions are enabled only by
`VESTIGE_SANHEDRIN_BACKEND=mlx` or `vllm`. For hosted APIs, use
`VESTIGE_SANHEDRIN_API_KEY`; Sanhedrin intentionally does not forward a generic
`OPENAI_API_KEY` to arbitrary configured endpoints.
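The key-handling rule in the paragraph above can be sketched as follows. This is a minimal illustration, not the hook's actual code: `build_headers` is a hypothetical helper, and sending the key as a `Bearer` token is an assumption based on common OpenAI-compatible servers. Only the two environment variable names come from the text.

```python
import os


def build_headers(env=None):
    """Build request headers for a Sanhedrin endpoint (hypothetical helper).

    Only VESTIGE_SANHEDRIN_API_KEY is consulted. A generic OPENAI_API_KEY is
    deliberately ignored so it is never sent to an arbitrary endpoint.
    """
    env = os.environ if env is None else env
    headers = {"Content-Type": "application/json"}
    key = env.get("VESTIGE_SANHEDRIN_API_KEY", "").strip()
    if key:
        # Assumption: the endpoint expects a standard Bearer token.
        headers["Authorization"] = f"Bearer {key}"
    return headers
```

Passing the environment as a plain mapping keeps the rule easy to check in isolation: a mapping that contains only `OPENAI_API_KEY` yields no `Authorization` header.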
### Prerequisites
| Tool | Install |
@@ -202,8 +213,9 @@ On M3 Max 14-core or M2/M1 Max: closer to 37s prompt processing, ~5060 tok
| `VESTIGE_SANHEDRIN_ENABLED` | `0` | Set to `1` to enable the optional Sanhedrin Stop hook |
| `VESTIGE_SWARM_ENABLED` | `1` | Set to `0` to disable preflight lateral-thinker swarm |
| `VESTIGE_DASHBOARD_PORT` | `3927` | Vestige MCP HTTP API port used by hooks |
| `VESTIGE_SANHEDRIN_ENDPOINT` | `http://127.0.0.1:8080/v1/chat/completions` | OpenAI-compatible chat completions endpoint for Sanhedrin |
| `VESTIGE_SANHEDRIN_MODEL` | `mlx-community/Qwen3.6-35B-A3B-4bit` | Model name sent to the Sanhedrin endpoint |
| `VESTIGE_SANHEDRIN_ENDPOINT` | unset | OpenAI-compatible chat completions endpoint for Sanhedrin |
| `VESTIGE_SANHEDRIN_MODEL` | unset | Model name sent to the Sanhedrin endpoint; choose any compatible model |
| `VESTIGE_SANHEDRIN_BACKEND` | unset | Optional backend hint (`ollama`, `llama.cpp`, `mlx`, `vllm`, `openai`, `litellm`) |
| `VESTIGE_SANHEDRIN_CLAIM_MODE` | `1` when installed with `--enable-sanhedrin` | Enables per-claim retrieval and fail-closed user-critical lanes |
| `VESTIGE_SANHEDRIN_OUTPUT` | `json` when installed with `--enable-sanhedrin` | Emits structured JSON from the bridge; shell hook also accepts legacy text |
| `VESTIGE_SANHEDRIN_STAGE_FILE` | unset | Optional JSON-array staged evidence overlay, read-only and non-durable |
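
As a rough illustration of how the `unset` defaults in the table lead to fail-open behavior, the sketch below resolves the verifier configuration from an environment mapping. The function name `resolve_verifier` and the status strings are hypothetical; only the variable names and the `0` default for `VESTIGE_SANHEDRIN_ENABLED` are taken from the table.

```python
import os


def resolve_verifier(env=None):
    """Return (status, endpoint, model) for the Sanhedrin hook (hypothetical).

    status is "disabled", "fail_open", or "ready". No model is ever guessed:
    if either the endpoint or the model is unset, the hook fails open.
    """
    env = os.environ if env is None else env
    if env.get("VESTIGE_SANHEDRIN_ENABLED", "0") != "1":
        return ("disabled", None, None)
    endpoint = env.get("VESTIGE_SANHEDRIN_ENDPOINT", "").strip()
    model = env.get("VESTIGE_SANHEDRIN_MODEL", "").strip()
    if not endpoint or not model:
        return ("fail_open", None, None)
    return ("ready", endpoint, model)
```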

@@ -0,0 +1,96 @@
# Sanhedrin Receipt Schema
Sanhedrin writes local, inspectable receipts so a Stop-hook veto is appealable
instead of opaque. The current schema is `vestige.sanhedrin.receipt.v1`.
## Locations
- Latest JSON: `~/.vestige/sanhedrin/latest.json`
- Latest HTML: `~/.vestige/sanhedrin/latest.html`
- Receipt archive: `~/.vestige/sanhedrin/receipts/<receipt-id>.json`
- Command receipt ledger: `~/.vestige/sanhedrin/command-receipts.jsonl`
- Appeals: `~/.vestige/sanhedrin/appeals.jsonl`
- Fail-open events: `~/.vestige/sanhedrin/fail-open.jsonl`
## v1 JSON Shape
```json
{
"schema": "vestige.sanhedrin.receipt.v1",
"id": "receipt_<stable hash>",
"draftId": "draft_<stable hash>",
"createdAt": "2026-05-25T18:00:00+00:00",
"overall": "pass|pass_with_warnings|veto|appealed",
"verdictBar": "PASS|NOTE|CAUTION|VETO|APPEALED",
"summary": "Human-readable result",
"draftPreview": "First 1000 chars of the assistant draft",
"claims": [
{
"id": "c001",
"text": "All tests passed.",
"fingerprint": "16-char sha256 prefix",
"class": "receipt_lock|TECHNICAL|ACHIEVEMENT|...",
"subject": "Sam|draft|command receipt",
"risk": "normal|hard",
"evidence_state": "supported|missing_receipt|contradicted|appealed|...",
"decision": "pass|pass_unverified|veto|appealed",
"precedent": [
{
"type": "command|receipt_lock|vestige|appeal",
"summary": "Why this claim passed or failed",
"command": "cargo test --workspace",
"exitCode": 0
}
],
"fix": "Suggested rewrite",
"appeal": {
"status": "open|appealed",
"actions": ["stale", "wrong", "too_strict"]
}
}
],
"receipts": [
{
"source": "transcript|codex-transcript",
"command": "cargo test --workspace",
"exitCode": 0,
"success": true,
"timestamp": "2026-05-25T18:00:00+00:00"
}
],
"source": {
"stateDir": "~/.vestige/sanhedrin",
"transcript": "/path/to/session.jsonl"
}
}
```
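A reader of this shape might look like the sketch below, which loads the latest receipt and lists the vetoed claims with their suggested fixes. The helper names are hypothetical. The field names (`claims`, `decision`, `id`, `text`, `fix`) and the `latest.json` location come from this document.

```python
import json
from pathlib import Path


def load_latest(state_dir="~/.vestige/sanhedrin"):
    """Load the latest receipt JSON from the state directory."""
    path = Path(state_dir).expanduser() / "latest.json"
    return json.loads(path.read_text(encoding="utf-8"))


def vetoed_claims(receipt):
    """Return (claim id, text, suggested fix) for every vetoed claim."""
    return [
        (claim.get("id"), claim.get("text"), claim.get("fix"))
        for claim in receipt.get("claims", [])
        if claim.get("decision") == "veto"
    ]
```

Using `.get()` throughout keeps the reader tolerant of receipts that omit optional fields such as `fix`.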
## Compatibility Rules
- Readers should accept `vestige.sanhedrin.receipt.v1` without warning.
- Readers should keep rendering unknown schemas defensively, but surface a
warning instead of silently treating them as v1.
- New schema versions must keep `id`, `createdAt`, `verdictBar`, `summary`, and
`claims` stable or provide a dashboard migration.
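The first two rules could be implemented along these lines. This is a sketch under the assumption that a reader signals unknown schemas through Python's `warnings` module; `check_schema` is a hypothetical name, and the real dashboard may surface the warning differently.

```python
import warnings

KNOWN_SCHEMA = "vestige.sanhedrin.receipt.v1"


def check_schema(receipt):
    """Return True for a v1 receipt; warn and return False otherwise.

    The caller can still render the receipt defensively when this returns
    False, but it should not treat the receipt as v1.
    """
    schema = receipt.get("schema")
    if schema == KNOWN_SCHEMA:
        return True
    warnings.warn(f"unknown receipt schema {schema!r}; rendering defensively")
    return False
```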
## Staged Evidence Boundary
`VESTIGE_SANHEDRIN_STAGE_FILE` is a non-durable overlay for current-turn context.
It may help the executioner understand a draft, but code enforces that staged
evidence cannot satisfy durable evidence requirements for `SUPPORTED`,
`REFUTED`, or `REFUTED_BY_ABSENCE`. Durable support must come from Vestige memory
or command receipts.
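The boundary can be pictured as a small gate like the one below. It is an illustrative sketch only: the `source` labels (`vestige`, `command`, `staged`) and the function name are assumptions, not the bridge's real data model. The three verdict names come from the text above.

```python
DURABLE_VERDICTS = {"SUPPORTED", "REFUTED", "REFUTED_BY_ABSENCE"}
# Assumed labels for durable evidence: Vestige memory and command receipts.
DURABLE_SOURCES = {"vestige", "command"}


def verdict_allowed(verdict, evidence):
    """Check whether a verdict is backed by at least one durable source.

    `evidence` is a list of dicts with a "source" key. Staged evidence may be
    present, but it never counts toward the durable-evidence requirement.
    """
    if verdict not in DURABLE_VERDICTS:
        return True
    return any(item.get("source") in DURABLE_SOURCES for item in evidence)
```

With this gate, a `SUPPORTED` verdict backed only by `{"source": "staged"}` entries is rejected, while the same verdict with one command receipt is accepted.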
## Receipt Lock Compatibility Flags
`VESTIGE_SANHEDRIN_ALLOW_COMMAND_LEDGER=1` lets Receipt Lock read
`command-receipts.jsonl` when no live transcript path is available.
`VESTIGE_SANHEDRIN_ALLOW_LOOSE_LEDGER=1` re-enables the legacy fallback that
regex-scans transcript JSON blobs for `command` or `cmd` fields. Keep this off
unless you are migrating old transcripts; structured tool-use receipts are safer
because loose scanning can mistake quoted text for a real command execution.
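The risk described above can be demonstrated with a toy comparison. Neither function is the actual Receipt Lock code: the regex is a stand-in for a loose scanner, and the `tool_use` entry shape with an `input.command` field is an assumed transcript format chosen for illustration.

```python
import json
import re

# Stand-in loose scanner: finds "command"/"cmd" keys anywhere in the raw line,
# including inside escaped JSON that an assistant merely quoted.
LOOSE = re.compile(r'\\?"(?:command|cmd)\\?"\s*:\s*\\?"([^"\\]+)')


def loose_commands(line):
    """Regex-scan a raw transcript line for command-looking fields."""
    return LOOSE.findall(line)


def structured_commands(line):
    """Count a command only when the entry is a tool-use record (assumed shape)."""
    entry = json.loads(line)
    if entry.get("type") == "tool_use" and isinstance(entry.get("input"), dict):
        command = entry["input"].get("command")
        return [command] if command else []
    return []


quoted = json.dumps({"type": "text", "text": 'Next I would run {"command": "cargo test"}'})
executed = json.dumps({"type": "tool_use", "input": {"command": "cargo test"}})
```

Here `loose_commands(quoted)` reports `cargo test` even though nothing was executed, while `structured_commands(quoted)` returns an empty list. Both functions agree on the `executed` line.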
Hosted Sanhedrin backends should use `VESTIGE_SANHEDRIN_API_KEY` in
`~/.claude/hooks/vestige-sanhedrin.env`. The installer keeps that file at mode
`0600`; do not store shared or unrelated API keys there.
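To confirm the permissions after manual edits, a check like the following could be used. It is a sketch for POSIX systems only, and `env_file_is_private` is a hypothetical helper rather than part of the installer.

```python
import stat
from pathlib import Path


def env_file_is_private(path):
    """Return True when the file's permission bits are exactly 0600."""
    mode = stat.S_IMODE(Path(path).expanduser().stat().st_mode)
    return mode == 0o600
```

For example, `env_file_is_private("~/.claude/hooks/vestige-sanhedrin.env")` should return `True` on a correctly installed system. If it does not, `chmod 600` on the file restores the expected mode.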