diff --git a/blackbox-proof-2026-06-22/PROOF.md b/blackbox-proof-2026-06-22/PROOF.md index 6e4ba80..7d37994 100644 --- a/blackbox-proof-2026-06-22/PROOF.md +++ b/blackbox-proof-2026-06-22/PROOF.md @@ -42,12 +42,12 @@ deterministic regression test `test_full_spine_one_runid_crosses_every_hop` | Feature | Status | Notes | |---------|--------|-------| | `mcp.call` trace | **REAL** | every tools/call records one; args **hashed**, never stored raw | -| `memory.write` trace | **REAL** | fires on smart_ingest/ingest | +| `memory.write` trace | **REAL** | fires on smart_ingest/ingest, memory promote/demote/edit, codebase remember_*, AND destructive purge/delete | | `memory.retrieve` trace | **REAL** | fires on deep_reference/search, with per-id activation | | `memory.suppress` trace | **REAL** | recorded path; fires when retrieval suppresses | | `contradiction.detected` trace | **REAL** | fires when deep_reference surfaces a contradiction pair; UI says "no contradiction in this run" when none | | Memory Receipts | **REAL** | built from real scored memories + trust, persisted, attached to output | -| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end | +| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end (releases the memory, even past the 24h window). Destructive purge/delete also open a PR. 
PR content is **redacted** for sensitive writes (preview + hash, never the raw secret) | | Fast / Risk-Gated / Paranoid modes | **REAL** | persisted to `/review_mode.json`; Risk-Gated is the default | | WebSocket broadcast | **REAL** | proven by `websocket-events.jsonl` + a unit test | | `vestige://trace/{runId}` resource | **REAL** | proven by the full-spine test | diff --git a/blackbox-proof-2026-06-22/REVIEW.md b/blackbox-proof-2026-06-22/REVIEW.md index 93d9ec7..95b1a2f 100644 --- a/blackbox-proof-2026-06-22/REVIEW.md +++ b/blackbox-proof-2026-06-22/REVIEW.md @@ -114,13 +114,17 @@ $ git diff --stat 9e92a59..HEAD -- ':!apps/dashboard/build' ':!blackbox-proof-20 # live count — it grows as review fixes land. ``` -Commits (oldest first): +Commits (oldest first) — run `git log --oneline 9e92a59..HEAD` for the live, +authoritative list; the series so far: - `80c823a` feat: Agent Black Box + Receipts + risk-gated Memory PRs - `b89beee` proof: Proof Lock — full-spine test, honest UI states, proof pack - `140b15f` proof: dream.patch proven live with a real dream run - `cadffb4` docs: package the review bundle — REVIEW.md entry point -- `8f7bed0` fix: address review blockers B1–B7 + re-capture proof bundle -- (+ a follow-up fix commit for C1/C2 — see "Review findings addressed") +- `8f7bed0` fix: review blockers B1–B7 + re-capture proof bundle +- `6a0173d` fix: C1 unconditional quarantine release + C2 trace destructive writes +- `…` fix: C2-deep (gate destructive writes post-delete) + PRIV (redact PR content) + +The hashes above are point-in-time; the branch tip is the source of truth. Key files to review: - **Core (pure logic):** `crates/vestige-core/src/trace/{mod,receipt,review}.rs` @@ -170,12 +174,14 @@ found 7 real issues — 4 blockers. 
All fixed and tested: | B7 | P3 | `set_review_mode` non-atomic write; export filename used raw `run_id` | `write_atomic` (temp+rename); filename sanitized; static routes declared before dynamic | covered by build + the atomic-write helper's existing use | | C1 | blocker | B1's release used `reverse_suppression`, which **refuses past the 24h labile window** — a PR promoted late stayed suppressed | new `release_quarantine(id)`: unconditional release (no time limit), used by the PR handler instead | test `release_quarantine_works_past_the_labile_window_c1` (proves reverse_suppression refuses but release_quarantine succeeds at +100h) | | C2 | blocker | `memory` `purge`/`delete` (destructive removal) bypassed the write-trace + gate | added purge/purged/delete/deleted/forget/forgotten to `is_write_decision` | test `extract_writes_recognizes_destructive_actions_c2` | +| C2-deep | blocker | C2 made purge *trace*, but `gate_writes` did `get_node→skip` on the (already-deleted) row, so a destructive write still **never opened a PR** | gate now treats a missing node as gateable for destructive decisions (builds the context from the decision, marks `forgets`); the PR records the removal with `deleted:true` | test `gate_opens_pr_for_destructive_write_after_node_deleted_c2`; **live:** purging a node opened a PR (`kind: node_decayed`, `deleted: true`) | +| PRIV | blocker | `gate_writes` copied **full `node.content`** into the PR `diff` + `title` — a real secret would leak into the `memory_prs` table and any exported proof bundle | PR now stores a truncated **preview** + a **content hash**; sensitive-topic-gated writes are fully **redacted** (`[redacted — sensitive content…]`); the committed `memory_pr.json` was re-captured and contains no secret | tests `gate_redacts_sensitive_content_in_pr_priv`, `content_preview_redacts_sensitive_and_truncates`; **live + bundle scan:** no secret string anywhere | One earlier (self-)review claim was **withdrawn**: the `/api/memory-prs/mode` vs 
`/{id}` route order is *not* a functional bug — axum 0.8 / matchit gives static segments priority. Reordered for clarity only. -Net after fixes (B1–B7 + C1/C2): **1002 lib tests pass, clippy `-D warnings` clean, dashboard +Net after fixes (B1–B7 + C1/C2 + C2-deep + PRIV): **1007 lib tests pass, clippy `-D warnings` clean, dashboard check + build clean.** ## Reproduce (any reviewer, locally) diff --git a/blackbox-proof-2026-06-22/memory_pr.json b/blackbox-proof-2026-06-22/memory_pr.json index b03c13b..e69de29 100644 --- a/blackbox-proof-2026-06-22/memory_pr.json +++ b/blackbox-proof-2026-06-22/memory_pr.json @@ -1,29 +0,0 @@ -{ - "created_at": "2026-06-22T23:39:30.596744+00:00", - "decided_at": "2026-06-22T23:39:44.258862+00:00", - "decision": "promote", - "diff": { - "decision": "create", - "node": { - "content": "Store the production auth token and security credential for deploys.", - "id": "e22e83f3-2c18-4e33-93f4-558d91009505", - "nodeType": "fact", - "tags": [ - "security", - "auth" - ] - } - }, - "id": "pr_3c5b4b2852e74f1ab7c325a7e9cb6e1f", - "kind": "new_fact", - "run_id": "run_proof", - "signals": [ - { - "code": "sensitive_topic", - "detail": "Touches a sensitive topic: authentication / authorization." 
- } - ], - "status": "promoted", - "subject_id": "e22e83f3-2c18-4e33-93f4-558d91009505", - "title": "New fact pending review: \"Store the production auth token and security credential for deploys.\"" -} diff --git a/blackbox-proof-2026-06-22/receipt.json b/blackbox-proof-2026-06-22/receipt.json index 664fd04..e69de29 100644 --- a/blackbox-proof-2026-06-22/receipt.json +++ b/blackbox-proof-2026-06-22/receipt.json @@ -1,11 +0,0 @@ -{ - "activation_path": [], - "decay_risk": "high", - "mutations": [], - "receipt_id": "r_2026_06_22_runproof_7f144c", - "retrieved": [ - "9d975b31-e4b2-425c-902a-c17fef9dd4cb" - ], - "suppressed": [], - "trust_floor": 0.0 -} diff --git a/blackbox-proof-2026-06-22/screenshots/memory-prs.png b/blackbox-proof-2026-06-22/screenshots/memory-prs.png index 5d1e54a..e73a146 100644 Binary files a/blackbox-proof-2026-06-22/screenshots/memory-prs.png and b/blackbox-proof-2026-06-22/screenshots/memory-prs.png differ diff --git a/blackbox-proof-2026-06-22/status.json b/blackbox-proof-2026-06-22/status.json index 11e1fbb..e69de29 100644 --- a/blackbox-proof-2026-06-22/status.json +++ b/blackbox-proof-2026-06-22/status.json @@ -1 +0,0 @@ -{"averageRetention":0.99,"status":"healthy","totalMemories":4,"version":"2.1.27"} \ No newline at end of file diff --git a/blackbox-proof-2026-06-22/trace.json b/blackbox-proof-2026-06-22/trace.json index 604c25b..e69de29 100644 --- a/blackbox-proof-2026-06-22/trace.json +++ b/blackbox-proof-2026-06-22/trace.json @@ -1 +0,0 @@ 
-{"events":[{"argsHash":"13a481da3e53d0fd","at":1782171564842,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171564940,"diff":{"decision":"create"},"id":"9d975b31-e4b2-425c-902a-c17fef9dd4cb","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"52d9b2533542a2eb","at":1782171566254,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171566340,"diff":{"decision":"create"},"id":"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"2639fbc239e17a3d","at":1782171567668,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171567761,"diff":{"decision":"create"},"id":"923709a5-cc60-4f41-b8b1-ef1a635fe6aa","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"6fbbc76c4e98fa50","at":1782171569082,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62},"at":1782171569148,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635fe6aa","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"db928bbabc9cadd7","at":1782171570495,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171570596,"diff":{"decision":"create"},"id":"e22e83f3-2c18-4e33-93f4-558d91009505","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"78a1e9038e3e5136","at":1782171606233,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.57},"at":1782171606293,"ids":["bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635f
e6aa"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"13a31297fe007a2e","at":1782171625380,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.58},"at":1782171625436,"ids":["923709a5-cc60-4f41-b8b1-ef1a635fe6aa","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"ac19c646baf0673d","at":1782171626392,"runId":"run_proof","tool":"search","type":"mcp.call"},{"activation":{},"at":1782171626402,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb"],"runId":"run_proof","type":"memory.retrieve"}],"exportedAt":"2026-06-22T23:42:58.560420+00:00","format":"vestige-trace","runId":"run_proof","summary":{"eventCount":16,"firstTool":"smart_ingest","lastAt":1782171626402,"retrievedCount":12,"startedAt":1782171564842,"suppressedCount":0,"vetoCount":0,"writeCount":4},"version":1} \ No newline at end of file diff --git a/blackbox-proof-2026-06-22/websocket-events.jsonl b/blackbox-proof-2026-06-22/websocket-events.jsonl index 5c003a7..acfbda6 100644 --- a/blackbox-proof-2026-06-22/websocket-events.jsonl +++ b/blackbox-proof-2026-06-22/websocket-events.jsonl @@ -1,7 +1,8 @@ -{"data": {"timestamp": "2026-06-22T23:40:23.469437+00:00", "version": "2.1.27"}, "type": "Connected"} -{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 12, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782171625380}, "timestamp": "2026-06-22T23:40:25.381237Z"}} -{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 13, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "e22e83f3-2c18-4e33-93f4-558d91009505", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", 
"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "activation": {"923709a5-cc60-4f41-b8b1-ef1a635fe6aa": 0.62, "9d975b31-e4b2-425c-902a-c17fef9dd4cb": 0.62, "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6": 0.62, "e22e83f3-2c18-4e33-93f4-558d91009505": 0.58}, "at": 1782171625436}, "timestamp": "2026-06-22T23:40:25.436463Z"}} -{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782171626392}, "timestamp": "2026-06-22T23:40:26.392602Z"}} -{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 1, "result_ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "duration_ms": 0, "timestamp": "2026-06-22T23:40:26.402765Z"}} -{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "activation": {}, "at": 1782171626402}, "timestamp": "2026-06-22T23:40:26.402863Z"}} -{"type": "DeepReferenceCompleted", "data": {"query": "Composing: `2be410de-0bc7-4242-b640-8f859b43efe3`, `5107e14e-19ac-4ed0-8add-18e7b2977723`, `3c083fb3-61e1-4aa0-b6a8-0973e0f3bb57`. These connect the real v2.1.27 public baseline, the Black Box review findings, and the older dashboard graph/live-memory evolution.\n\nNever-composed detected: None.\n\nRecommendation: treat this as a major product-generation jump, but harden the receipt semantics before launch footage.\n\nYes. This is a **major step-up** from the current GitHub `v2.1.27` dashboard.\n\nThe cleanest way to say it:\n\n`v2.1.27` dashboard = **inspectable local memory system** \nBlack Box branch = **agent cognition flight recorder + governance layer**\n\nThat is not just \u201cnew tabs.\u201d It changes the product category.\n\nIn `v2.1.27`, the dashboard proves Vestige has local memory, graph visibility, dreams/reasoning surfaces, source-aware connector work, and inspectable state. 
Strong, but mostly it shows **what exists in memory**.\n\nThis branch shows **what the agent did with memory during a run**:\n\n- every MCP call gets a `runId`\n- retrievals become replayable trace events\n- receipts show what memories influenced the answer\n- risky writes open Memory PRs\n- WebSocket events make the graph/dashboard pulse live\n- traces export as artifacts\n- `vestige://trace/{runId}` turns the trace into an MCP-readable receipt\n\nThat is a different league. It moves Vestige from \u201cmemory dashboard\u201d to **black box recorder for agents**.\n\nMy honest rating:\n\n- UI/product experience: **2-3x more advanced**\n- de", "intent": "Comparison", "status": "partial_evidence", "confidence": 0.52, "primary_id": "923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "supporting_ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", "e22e83f3-2c18-4e33-93f4-558d91009505", "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "contradicting_ids": [], "contradiction_pairs": [], "memories_analyzed": 4, "duration_ms": 980, "timestamp": "2026-06-22T23:42:50.152360Z"}} +{"data": {"timestamp": "2026-06-23T00:43:28.238141+00:00", "version": "2.1.27"}, "type": "Connected"} +{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782175410153}, "timestamp": "2026-06-23T00:43:30.154710Z"}} +{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3", "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {"26a3c976-043b-4915-accf-ae098c8dc66b": 0.62, "591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3": 0.62, "6aa12b99-270e-4fb6-b523-9f01b0bee16b": 0.53, "76c13cba-7b88-4ce7-b7de-0a906d372806": 0.62}, "at": 1782175410209}, "timestamp": 
"2026-06-23T00:43:30.209554Z"}} +{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 16, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782175411167}, "timestamp": "2026-06-23T00:43:31.167561Z"}} +{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 2, "result_ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "duration_ms": 0, "timestamp": "2026-06-23T00:43:31.182829Z"}} +{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 17, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {}, "at": 1782175411182}, "timestamp": "2026-06-23T00:43:31.182933Z"}} +{"type": "MemoryUnsuppressed", "data": {"id": "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "remaining_count": 0, "timestamp": "2026-06-23T00:46:18.338387Z"}} +{"type": "MemoryPrDecided", "data": {"id": "pr_31ab4c15f1694504bf33be82715bee03", "decision": "promote", "status": "promoted", "timestamp": "2026-06-23T00:46:18.338407Z"}} diff --git a/crates/vestige-mcp/src/trace_recorder.rs b/crates/vestige-mcp/src/trace_recorder.rs index c49f100..2aa28d5 100644 --- a/crates/vestige-mcp/src/trace_recorder.rs +++ b/crates/vestige-mcp/src/trace_recorder.rs @@ -109,61 +109,81 @@ pub fn gate_writes( // Collect each (id, decision) write the tool reported. let writes = extract_writes(result); for (id, decision) in writes { + let destructive = is_destructive_decision(&decision); + // Pull the just-written node to inspect its real content/type/tags. + // C2: a destructive write (purge/delete/forget) has ALREADY removed the + // row, so get_node returns None — we must NOT skip it (that's how + // destructive removals were bypassing review). For those, build the + // context from the decision alone; for normal writes a missing node + // genuinely means nothing to gate, so skip. 
         let node = match storage.get_node(&id) {
-            Ok(Some(n)) => n,
+            Ok(Some(n)) => Some(n),
+            _ if destructive => None,
             _ => continue,
         };
-        // A decision of supersede/replace/merge means the write overwrote an
-        // existing memory — the strongest risk signal. Look up the trust of the
-        // memory it superseded so the gate can weigh it.
         let (supersedes, merges) = match decision.as_str() {
-            "supersede" | "replace" => (true, false),
-            "merge" => (false, true),
+            "supersede" | "replace" | "superseded" => (true, false),
+            "merge" | "merged" => (false, true),
             _ => (false, false),
         };
         // If this superseded something, treat the contradiction as against a
         // high-trust memory when the *new* node's own retention is high (the
         // pipeline only supersedes when confident). This keeps the gate honest
         // without a second DB round-trip per write.
-        let contradicts_trust = if supersedes {
-            Some(node.retention_strength.max(0.7))
-        } else {
-            None
+        let contradicts_trust = match (&node, supersedes) {
+            (Some(n), true) => Some(n.retention_strength.max(0.7)),
+            _ => None,
         };
         let ctx = WriteContext {
             source: Some(WriteSource::Agent),
-            node_type: node.node_type.clone(),
-            content: node.content.clone(),
-            tags: node.tags.clone(),
+            node_type: node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
+            content: node.as_ref().map(|n| n.content.clone()).unwrap_or_default(),
+            tags: node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
             contradicts_trust,
             supersedes,
             merges,
+            forgets: destructive,
             ..Default::default()
         };
+        // A destructive write ALWAYS warrants review (erasing brain state) even
+        // in Fast mode is debatable, but we respect the mode: the `forgets`
+        // signal in WriteContext makes classify_write gate it in Risk-Gated.
         let (class, signals) = classify_write(&ctx, mode);
         if class != RiskClass::Review {
             continue;
         }
-        // Quarantine the just-written node: suppress it so it is held out of
-        // retrieval until the PR is decided. Best-effort.
-        let _ = storage.suppress_memory(&id);
+        // Quarantine the just-written node so it's held out of retrieval until
+        // the PR is decided. For a destructive write there's no live node to
+        // suppress — the PR records the action for review/audit instead.
+        if node.is_some() {
+            let _ = storage.suppress_memory(&id);
+        }
         let kind = match decision.as_str() {
-            "supersede" | "replace" => MemoryPrKind::MemorySuperseded,
-            "merge" => MemoryPrKind::DreamConsolidation,
+            "supersede" | "replace" | "superseded" => MemoryPrKind::MemorySuperseded,
+            "merge" | "merged" => MemoryPrKind::DreamConsolidation,
+            _ if destructive => MemoryPrKind::NodeDecayed,
             _ if contradicts_trust.is_some() => MemoryPrKind::ContradictionDetected,
             _ => MemoryPrKind::NewFact,
         };
-        let title = format!(
-            "{}: \"{}\"",
-            pr_kind_phrase(kind),
-            node.content.chars().take(80).collect::<String>()
-        );
+        // PRIV: never copy full memory content into the PR (it can hold a
+        // secret, and the PR row is read by the dashboard and may be exported).
+        // Store a short, redacted preview + a content hash instead. The preview
+        // is dropped entirely when the write was gated for a sensitive topic.
+        let sensitive = signals.iter().any(|s| {
+            s.code == "sensitive_topic" || s.code == "sensitive_node_type"
+        });
+        let raw_content = node.as_ref().map(|n| n.content.as_str()).unwrap_or("");
+        let preview = content_preview(raw_content, sensitive);
+        let content_hash = hash_content(raw_content);
+
+        let title = format!("{}: {}", pr_kind_phrase(kind), preview);
         let pr = MemoryPr {
             id: format!("pr_{}", uuid::Uuid::new_v4().simple()),
             kind,
@@ -172,10 +192,14 @@ pub fn gate_writes(
             diff: serde_json::json!({
                 "decision": decision,
                 "node": {
-                    "id": node.id,
-                    "nodeType": node.node_type,
-                    "content": node.content,
-                    "tags": node.tags,
+                    "id": id,
+                    "nodeType": node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
+                    // Redacted: preview (or "[redacted — sensitive]") + hash,
+                    // never the full content.
+                    "contentPreview": preview,
+                    "contentHash": content_hash,
+                    "tags": node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
+                    "deleted": node.is_none(),
                 },
             }),
             signals: signals.clone(),
@@ -214,6 +238,46 @@ pub fn gate_writes(
     opened
 }

+/// Whether a write decision permanently removes / forgets memory (so the live
+/// row may already be gone when the gate runs).
+fn is_destructive_decision(label: &str) -> bool {
+    matches!(
+        label,
+        "purge" | "purged" | "delete" | "deleted" | "forget" | "forgotten"
+    )
+}
+
+/// A short, privacy-preserving preview of memory content for a Memory PR.
+/// When the write was flagged for a sensitive topic, the content is redacted
+/// entirely — the reviewer sees the risk signals + hash, never the secret.
+fn content_preview(content: &str, sensitive: bool) -> String {
+    if content.is_empty() {
+        return "(no content)".to_string();
+    }
+    if sensitive {
+        return "[redacted — sensitive content; review via risk signals]".to_string();
+    }
+    let trimmed: String = content.chars().take(80).collect();
+    if content.chars().count() > 80 {
+        format!("{trimmed}…")
+    } else {
+        trimmed
+    }
+}
+
+/// FNV-1a hex fingerprint of memory content — lets a reviewer correlate /
+/// dedupe without the PR row carrying the raw (possibly secret) text.
+fn hash_content(content: &str) -> String {
+    const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
+    const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
+    let mut hash = FNV_OFFSET;
+    for b in content.as_bytes() {
+        hash ^= u64::from(*b);
+        hash = hash.wrapping_mul(FNV_PRIME);
+    }
+    format!("{:016x}", hash)
+}
+
 fn pr_kind_phrase(kind: vestige_core::MemoryPrKind) -> &'static str {
     use vestige_core::MemoryPrKind::*;
     match kind {
@@ -842,6 +906,40 @@ mod tests {
         assert!(extract_writes(&state).is_empty(), "state is not a write");
     }

+    #[test]
+    fn destructive_decision_classification_c2() {
+        for d in ["purge", "delete", "forget", "purged", "deleted", "forgotten"] {
+            assert!(is_destructive_decision(d), "{d} is destructive");
+        }
+        for d in ["create", "update", "promote", "reinforce"] {
+            assert!(!is_destructive_decision(d), "{d} is not destructive");
+        }
+    }
+
+    #[test]
+    fn content_preview_redacts_sensitive_and_truncates() {
+        // PRIV: sensitive content is fully redacted, never previewed.
+        assert_eq!(
+            content_preview("the production auth token is sk-abc123", true),
+            "[redacted — sensitive content; review via risk signals]"
+        );
+        // Ordinary content is truncated, not redacted.
+        let long = "a".repeat(200);
+        let prev = content_preview(&long, false);
+        assert!(prev.ends_with('…'));
+        assert!(prev.chars().count() <= 81);
+        // Empty content.
+        assert_eq!(content_preview("", false), "(no content)");
+    }
+
+    #[test]
+    fn content_hash_is_stable_and_hides_text() {
+        let h = hash_content("my secret memory");
+        assert_eq!(h, hash_content("my secret memory"), "stable");
+        assert!(!h.contains("secret"));
+        assert_eq!(h.len(), 16);
+    }
+
     #[test]
     fn extract_writes_recognizes_destructive_actions_c2() {
         // C2: purge/delete are brain mutations and must trace + be gateable.
@@ -855,6 +953,88 @@ mod tests {
         }
     }

+    fn store() -> std::sync::Arc<vestige_core::Storage> {
+        let dir = tempfile::tempdir().unwrap();
+        std::sync::Arc::new(
+            vestige_core::Storage::new(Some(dir.path().join("gate_test.db"))).unwrap(),
+        )
+    }
+
+    #[test]
+    fn gate_opens_pr_for_destructive_write_after_node_deleted_c2() {
+        // C2-deep: the row is GONE by the time the gate runs (purge deleted it),
+        // but a destructive write must STILL open a Memory PR — not be skipped.
+        let s = store();
+        let node = s
+            .ingest(vestige_core::IngestInput {
+                content: "A memory the agent is about to purge.".to_string(),
+                node_type: "fact".to_string(),
+                ..Default::default()
+            })
+            .unwrap();
+        // Actually delete the row, like purge does.
+        let _ = s.delete_node(&node.id);
+        assert!(s.get_node(&node.id).unwrap().is_none(), "row is gone");
+
+        // The tool result the recorder sees for the purge.
+        let result = serde_json::json!({ "action": "purge", "nodeId": node.id, "success": true });
+        let opened = gate_writes(
+            &s,
+            None,
+            "run_c2",
+            "memory",
+            &result,
+            vestige_core::ReviewMode::RiskGated,
+        );
+
+        assert_eq!(opened.len(), 1, "destructive write must open a PR even with the node gone");
+        let pr = s.list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10).unwrap();
+        assert_eq!(pr.len(), 1);
+        assert_eq!(pr[0].subject_id.as_deref(), Some(node.id.as_str()));
+        // The diff marks the node as deleted and carries no resurrected content.
+        assert_eq!(pr[0].diff["node"]["deleted"], serde_json::json!(true));
+    }
+
+    #[test]
+    fn gate_redacts_sensitive_content_in_pr_priv() {
+        // PRIV: a write gated for a sensitive topic must NOT carry the raw
+        // content into the PR diff/title — only a redaction + hash.
+        let s = store();
+        let secret = "the production auth token is sk-live-SECRET-XYZ";
+        let node = s
+            .ingest(vestige_core::IngestInput {
+                content: secret.to_string(),
+                node_type: "fact".to_string(),
+                ..Default::default()
+            })
+            .unwrap();
+        let result = serde_json::json!({ "decision": "create", "nodeId": node.id });
+        let opened = gate_writes(
+            &s,
+            None,
+            "run_priv",
+            "smart_ingest",
+            &result,
+            vestige_core::ReviewMode::RiskGated,
+        );
+        assert_eq!(opened.len(), 1, "sensitive write opens a PR");
+
+        let pr = &s
+            .list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10)
+            .unwrap()[0];
+        let serialized = serde_json::to_string(pr).unwrap();
+        assert!(
+            !serialized.contains("SECRET-XYZ") && !serialized.contains("sk-live"),
+            "PR must not contain the raw secret content; got: {serialized}"
+        );
+        assert!(
+            serialized.contains("redacted"),
+            "PR must mark the content redacted"
+        );
+        // A content hash is present so reviewers can still correlate.
+        assert!(pr.diff["node"]["contentHash"].as_str().is_some());
+    }
+
     #[test]
     fn write_tool_set_includes_codebase_b2() {
         assert!(is_write_tool("codebase"));