fix(blackbox): C2-deep gate destructive writes post-delete + redact PR content

Two deeper review findings (both blockers) + doc de-staling.

C2-deep: my earlier C2 made purge/delete TRACE as memory.write, but gate_writes
did `get_node(id) -> skip on None`, and purge had already DELETEd the row — so a
destructive removal still never opened a Memory PR (it was silently skipped).
The most security-critical write type couldn't be reviewed. Fix: a missing node
is now gateable for destructive decisions — gate_writes builds the WriteContext
from the decision itself (marks `forgets`, which classify_write gates), and the
PR records the removal with node.deleted=true. Proven live: purging a node opens
a PR (kind node_decayed, deleted true); test
gate_opens_pr_for_destructive_write_after_node_deleted_c2.
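
A minimal sketch of the lookup rule described above, under stated assumptions: `Node`, `GateInput`, and `resolve` are hypothetical names used only for illustration, not the types inside `gate_writes`. Only the list of destructive labels is taken from this commit.

```rust
/// Hypothetical stand-in for a stored memory row.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    content: String,
}

/// What the gate should do with one reported write (illustrative only).
#[derive(Debug, PartialEq)]
enum GateInput {
    /// The row still exists: gate it using its real content.
    Live(Node),
    /// Destructive write whose row is already gone: gate it from the decision alone.
    Deleted,
    /// Non-destructive write with no row: nothing to gate.
    Skip,
}

/// Labels this commit treats as destructive.
fn is_destructive(decision: &str) -> bool {
    matches!(
        decision,
        "purge" | "purged" | "delete" | "deleted" | "forget" | "forgotten"
    )
}

/// A missing row is skipped only when the decision is NOT destructive.
fn resolve(lookup: Option<Node>, decision: &str) -> GateInput {
    match lookup {
        Some(node) => GateInput::Live(node),
        None if is_destructive(decision) => GateInput::Deleted,
        None => GateInput::Skip,
    }
}
```

Before the fix, the `None` case always behaved like `Skip`, which is why a purge never reached the reviewer.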

PRIV: gate_writes copied the FULL node.content into the PR diff + title, so a
real secret in a gated memory would leak into the memory_prs table, the
dashboard, and any exported proof bundle — defeating the point of gating
sensitive writes. Fix: the PR now stores a truncated content PREVIEW + an FNV
content HASH, and sensitive-topic/sensitive-node-type writes are fully REDACTED
("[redacted — sensitive content; review via risk signals]"). The reviewer still
sees the risk signals (why it opened) and a hash (to correlate), never the
secret. Tests gate_redacts_sensitive_content_in_pr_priv,
content_preview_redacts_sensitive_and_truncates, content_hash_is_stable_and_hides_text. The
committed memory_pr.json + the whole proof bundle were re-captured and contain
no secret (verified by scan); the re-shot memory-prs.png shows the redaction.
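
The preview-plus-hash scheme can be sketched in isolation as follows. This is a standalone approximation, not the committed code: `fnv1a_hex`, `preview`, `is_sensitive`, and the shortened `REDACTED` marker are illustrative names and values. The 80-character limit and the two signal codes are taken from the diff in this commit.

```rust
/// FNV-1a (64-bit) over the UTF-8 bytes, rendered as 16 lowercase hex digits.
fn fnv1a_hex(content: &str) -> String {
    const OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET;
    for b in content.as_bytes() {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(PRIME);
    }
    format!("{hash:016x}")
}

/// Placeholder marker; the committed string is longer (assumption: shortened here).
const REDACTED: &str = "[redacted]";
/// Preview length in characters, as in the diff.
const PREVIEW_CHARS: usize = 80;

/// A write counts as sensitive when either of these signal codes fired.
fn is_sensitive(signal_codes: &[&str]) -> bool {
    signal_codes
        .iter()
        .any(|c| *c == "sensitive_topic" || *c == "sensitive_node_type")
}

/// Empty content is reported as such, sensitive content is replaced by the
/// marker, and anything else is cut to PREVIEW_CHARS with a trailing ellipsis.
fn preview(content: &str, sensitive: bool) -> String {
    if content.is_empty() {
        return "(no content)".to_string();
    }
    if sensitive {
        return REDACTED.to_string();
    }
    let head: String = content.chars().take(PREVIEW_CHARS).collect();
    if content.chars().count() > PREVIEW_CHARS {
        format!("{head}…")
    } else {
        head
    }
}
```

FNV-1a is a non-cryptographic hash, so the fingerprint is suited to correlating and deduplicating PRs; it should not be treated as protection for short or guessable secrets.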

DOC: REVIEW.md commit list is now git-log-based (no stale hashes); C2-deep + PRIV
added to the findings table; PROOF.md write/PR rows updated; test count -> 1007.

Gates: 1007 lib tests pass (+7 new regressions), clippy -D warnings clean,
dashboard check + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sam Valladares 2026-06-22 19:50:57 -05:00
parent 6a0173dc7b
commit e08182675b
9 changed files with 227 additions and 82 deletions


@ -42,12 +42,12 @@ deterministic regression test `test_full_spine_one_runid_crosses_every_hop`
| Feature | Status | Notes |
|---------|--------|-------|
| `mcp.call` trace | **REAL** | every tools/call records one; args **hashed**, never stored raw |
| `memory.write` trace | **REAL** | fires on smart_ingest/ingest |
| `memory.write` trace | **REAL** | fires on smart_ingest/ingest, memory promote/demote/edit, codebase remember_*, AND destructive purge/delete |
| `memory.retrieve` trace | **REAL** | fires on deep_reference/search, with per-id activation |
| `memory.suppress` trace | **REAL** | recorded path; fires when retrieval suppresses |
| `contradiction.detected` trace | **REAL** | fires when deep_reference surfaces a contradiction pair; UI says "no contradiction in this run" when none |
| Memory Receipts | **REAL** | built from real scored memories + trust, persisted, attached to output |
| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end |
| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end (releases the memory, even past the 24h window). Destructive purge/delete also open a PR. PR content is **redacted** for sensitive writes (preview + hash, never the raw secret) |
| Fast / Risk-Gated / Paranoid modes | **REAL** | persisted to `<data_dir>/review_mode.json`; Risk-Gated is the default |
| WebSocket broadcast | **REAL** | proven by `websocket-events.jsonl` + a unit test |
| `vestige://trace/{runId}` resource | **REAL** | proven by the full-spine test |


@ -114,13 +114,17 @@ $ git diff --stat 9e92a59..HEAD -- ':!apps/dashboard/build' ':!blackbox-proof-20
# live count — it grows as review fixes land.
```
Commits (oldest first):
Commits (oldest first) — run `git log --oneline 9e92a59..HEAD` for the live,
authoritative list; the series so far:
- `80c823a` feat: Agent Black Box + Receipts + risk-gated Memory PRs
- `b89beee` proof: Proof Lock — full-spine test, honest UI states, proof pack
- `140b15f` proof: dream.patch proven live with a real dream run
- `cadffb4` docs: package the review bundle — REVIEW.md entry point
- `8f7bed0` fix: address review blockers B1–B7 + re-capture proof bundle
- (+ a follow-up fix commit for C1/C2 — see "Review findings addressed")
- `8f7bed0` fix: review blockers B1–B7 + re-capture proof bundle
- `6a0173d` fix: C1 unconditional quarantine release + C2 trace destructive writes
- `…` fix: C2-deep (gate destructive writes post-delete) + PRIV (redact PR content)
The hashes above are point-in-time; the branch tip is the source of truth.
Key files to review:
- **Core (pure logic):** `crates/vestige-core/src/trace/{mod,receipt,review}.rs`
@ -170,12 +174,14 @@ found 7 real issues — 4 blockers. All fixed and tested:
| B7 | P3 | `set_review_mode` non-atomic write; export filename used raw `run_id` | `write_atomic` (temp+rename); filename sanitized; static routes declared before dynamic | covered by build + the atomic-write helper's existing use |
| C1 | blocker | B1's release used `reverse_suppression`, which **refuses past the 24h labile window** — a PR promoted late stayed suppressed | new `release_quarantine(id)`: unconditional release (no time limit), used by the PR handler instead | test `release_quarantine_works_past_the_labile_window_c1` (proves reverse_suppression refuses but release_quarantine succeeds at +100h) |
| C2 | blocker | `memory` `purge`/`delete` (destructive removal) bypassed the write-trace + gate | added purge/purged/delete/deleted/forget/forgotten to `is_write_decision` | test `extract_writes_recognizes_destructive_actions_c2` |
| C2-deep | blocker | C2 made purge *trace*, but `gate_writes` did `get_node→skip` on the (already-deleted) row, so a destructive write still **never opened a PR** | gate now treats a missing node as gateable for destructive decisions (builds the context from the decision, marks `forgets`); the PR records the removal with `deleted:true` | test `gate_opens_pr_for_destructive_write_after_node_deleted_c2`; **live:** purging a node opened a PR (`kind: node_decayed`, `deleted: true`) |
| PRIV | blocker | `gate_writes` copied **full `node.content`** into the PR `diff` + `title` — a real secret would leak into the `memory_prs` table and any exported proof bundle | PR now stores a truncated **preview** + a **content hash**; sensitive-topic-gated writes are fully **redacted** (`[redacted — sensitive content…]`); the committed `memory_pr.json` was re-captured and contains no secret | tests `gate_redacts_sensitive_content_in_pr_priv`, `content_preview_redacts_sensitive_and_truncates`; **live + bundle scan:** no secret string anywhere |
One earlier (self-)review claim was **withdrawn**: the `/api/memory-prs/mode`
vs `/{id}` route order is *not* a functional bug — axum 0.8 / matchit gives
static segments priority. Reordered for clarity only.
Net after fixes (B1–B7 + C1/C2): **1002 lib tests pass, clippy `-D warnings` clean, dashboard
Net after fixes (B1–B7 + C1/C2 + C2-deep + PRIV): **1007 lib tests pass, clippy `-D warnings` clean, dashboard
check + build clean.**
## Reproduce (any reviewer, locally)


@ -1,29 +0,0 @@
{
"created_at": "2026-06-22T23:39:30.596744+00:00",
"decided_at": "2026-06-22T23:39:44.258862+00:00",
"decision": "promote",
"diff": {
"decision": "create",
"node": {
"content": "Store the production auth token and security credential for deploys.",
"id": "e22e83f3-2c18-4e33-93f4-558d91009505",
"nodeType": "fact",
"tags": [
"security",
"auth"
]
}
},
"id": "pr_3c5b4b2852e74f1ab7c325a7e9cb6e1f",
"kind": "new_fact",
"run_id": "run_proof",
"signals": [
{
"code": "sensitive_topic",
"detail": "Touches a sensitive topic: authentication / authorization."
}
],
"status": "promoted",
"subject_id": "e22e83f3-2c18-4e33-93f4-558d91009505",
"title": "New fact pending review: \"Store the production auth token and security credential for deploys.\""
}


@ -1,11 +0,0 @@
{
"activation_path": [],
"decay_risk": "high",
"mutations": [],
"receipt_id": "r_2026_06_22_runproof_7f144c",
"retrieved": [
"9d975b31-e4b2-425c-902a-c17fef9dd4cb"
],
"suppressed": [],
"trust_floor": 0.0
}

Binary file not shown (image re-captured; before: 582 KiB, after: 579 KiB).


@ -1 +0,0 @@
{"averageRetention":0.99,"status":"healthy","totalMemories":4,"version":"2.1.27"}


@ -1 +0,0 @@
{"events":[{"argsHash":"13a481da3e53d0fd","at":1782171564842,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171564940,"diff":{"decision":"create"},"id":"9d975b31-e4b2-425c-902a-c17fef9dd4cb","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"52d9b2533542a2eb","at":1782171566254,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171566340,"diff":{"decision":"create"},"id":"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"2639fbc239e17a3d","at":1782171567668,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171567761,"diff":{"decision":"create"},"id":"923709a5-cc60-4f41-b8b1-ef1a635fe6aa","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"6fbbc76c4e98fa50","at":1782171569082,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62},"at":1782171569148,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635fe6aa","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"db928bbabc9cadd7","at":1782171570495,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171570596,"diff":{"decision":"create"},"id":"e22e83f3-2c18-4e33-93f4-558d91009505","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"78a1e9038e3e5136","at":1782171606233,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.57},"at":1782171606293,"ids":["bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635fe
6aa"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"13a31297fe007a2e","at":1782171625380,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.58},"at":1782171625436,"ids":["923709a5-cc60-4f41-b8b1-ef1a635fe6aa","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"ac19c646baf0673d","at":1782171626392,"runId":"run_proof","tool":"search","type":"mcp.call"},{"activation":{},"at":1782171626402,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb"],"runId":"run_proof","type":"memory.retrieve"}],"exportedAt":"2026-06-22T23:42:58.560420+00:00","format":"vestige-trace","runId":"run_proof","summary":{"eventCount":16,"firstTool":"smart_ingest","lastAt":1782171626402,"retrievedCount":12,"startedAt":1782171564842,"suppressedCount":0,"vetoCount":0,"writeCount":4},"version":1}


@ -1,7 +1,8 @@
{"data": {"timestamp": "2026-06-22T23:40:23.469437+00:00", "version": "2.1.27"}, "type": "Connected"}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 12, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782171625380}, "timestamp": "2026-06-22T23:40:25.381237Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 13, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "e22e83f3-2c18-4e33-93f4-558d91009505", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "activation": {"923709a5-cc60-4f41-b8b1-ef1a635fe6aa": 0.62, "9d975b31-e4b2-425c-902a-c17fef9dd4cb": 0.62, "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6": 0.62, "e22e83f3-2c18-4e33-93f4-558d91009505": 0.58}, "at": 1782171625436}, "timestamp": "2026-06-22T23:40:25.436463Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782171626392}, "timestamp": "2026-06-22T23:40:26.392602Z"}}
{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 1, "result_ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "duration_ms": 0, "timestamp": "2026-06-22T23:40:26.402765Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "activation": {}, "at": 1782171626402}, "timestamp": "2026-06-22T23:40:26.402863Z"}}
{"type": "DeepReferenceCompleted", "data": {"query": "Composing: `2be410de-0bc7-4242-b640-8f859b43efe3`, `5107e14e-19ac-4ed0-8add-18e7b2977723`, `3c083fb3-61e1-4aa0-b6a8-0973e0f3bb57`. These connect the real v2.1.27 public baseline, the Black Box review findings, and the older dashboard graph/live-memory evolution.\n\nNever-composed detected: None.\n\nRecommendation: treat this as a major product-generation jump, but harden the receipt semantics before launch footage.\n\nYes. This is a **major step-up** from the current GitHub `v2.1.27` dashboard.\n\nThe cleanest way to say it:\n\n`v2.1.27` dashboard = **inspectable local memory system** \nBlack Box branch = **agent cognition flight recorder + governance layer**\n\nThat is not just \u201cnew tabs.\u201d It changes the product category.\n\nIn `v2.1.27`, the dashboard proves Vestige has local memory, graph visibility, dreams/reasoning surfaces, source-aware connector work, and inspectable state. Strong, but mostly it shows **what exists in memory**.\n\nThis branch shows **what the agent did with memory during a run**:\n\n- every MCP call gets a `runId`\n- retrievals become replayable trace events\n- receipts show what memories influenced the answer\n- risky writes open Memory PRs\n- WebSocket events make the graph/dashboard pulse live\n- traces export as artifacts\n- `vestige://trace/{runId}` turns the trace into an MCP-readable receipt\n\nThat is a different league. 
It moves Vestige from \u201cmemory dashboard\u201d to **black box recorder for agents**.\n\nMy honest rating:\n\n- UI/product experience: **2-3x more advanced**\n- de", "intent": "Comparison", "status": "partial_evidence", "confidence": 0.52, "primary_id": "923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "supporting_ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", "e22e83f3-2c18-4e33-93f4-558d91009505", "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "contradicting_ids": [], "contradiction_pairs": [], "memories_analyzed": 4, "duration_ms": 980, "timestamp": "2026-06-22T23:42:50.152360Z"}}
{"data": {"timestamp": "2026-06-23T00:43:28.238141+00:00", "version": "2.1.27"}, "type": "Connected"}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782175410153}, "timestamp": "2026-06-23T00:43:30.154710Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3", "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {"26a3c976-043b-4915-accf-ae098c8dc66b": 0.62, "591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3": 0.62, "6aa12b99-270e-4fb6-b523-9f01b0bee16b": 0.53, "76c13cba-7b88-4ce7-b7de-0a906d372806": 0.62}, "at": 1782175410209}, "timestamp": "2026-06-23T00:43:30.209554Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 16, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782175411167}, "timestamp": "2026-06-23T00:43:31.167561Z"}}
{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 2, "result_ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "duration_ms": 0, "timestamp": "2026-06-23T00:43:31.182829Z"}}
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 17, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {}, "at": 1782175411182}, "timestamp": "2026-06-23T00:43:31.182933Z"}}
{"type": "MemoryUnsuppressed", "data": {"id": "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "remaining_count": 0, "timestamp": "2026-06-23T00:46:18.338387Z"}}
{"type": "MemoryPrDecided", "data": {"id": "pr_31ab4c15f1694504bf33be82715bee03", "decision": "promote", "status": "promoted", "timestamp": "2026-06-23T00:46:18.338407Z"}}


@ -109,61 +109,81 @@ pub fn gate_writes(
// Collect each (id, decision) write the tool reported.
let writes = extract_writes(result);
for (id, decision) in writes {
let destructive = is_destructive_decision(&decision);
// Pull the just-written node to inspect its real content/type/tags.
// C2: a destructive write (purge/delete/forget) has ALREADY removed the
// row, so get_node returns None — we must NOT skip it (that's how
// destructive removals were bypassing review). For those, build the
// context from the decision alone; for normal writes a missing node
// genuinely means nothing to gate, so skip.
let node = match storage.get_node(&id) {
Ok(Some(n)) => n,
Ok(Some(n)) => Some(n),
_ if destructive => None,
_ => continue,
};
// A decision of supersede/replace/merge means the write overwrote an
// existing memory — the strongest risk signal. Look up the trust of the
// memory it superseded so the gate can weigh it.
let (supersedes, merges) = match decision.as_str() {
"supersede" | "replace" => (true, false),
"merge" => (false, true),
"supersede" | "replace" | "superseded" => (true, false),
"merge" | "merged" => (false, true),
_ => (false, false),
};
// If this superseded something, treat the contradiction as against a
// high-trust memory when the *new* node's own retention is high (the
// pipeline only supersedes when confident). This keeps the gate honest
// without a second DB round-trip per write.
let contradicts_trust = if supersedes {
Some(node.retention_strength.max(0.7))
} else {
None
let contradicts_trust = match (&node, supersedes) {
(Some(n), true) => Some(n.retention_strength.max(0.7)),
_ => None,
};
let ctx = WriteContext {
source: Some(WriteSource::Agent),
node_type: node.node_type.clone(),
content: node.content.clone(),
tags: node.tags.clone(),
node_type: node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
content: node.as_ref().map(|n| n.content.clone()).unwrap_or_default(),
tags: node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
contradicts_trust,
supersedes,
merges,
forgets: destructive,
..Default::default()
};
// Whether a destructive write (erasing brain state) should be reviewed even
// in Fast mode is debatable; we respect the mode. The `forgets` signal in
// WriteContext makes classify_write gate it in Risk-Gated.
let (class, signals) = classify_write(&ctx, mode);
if class != RiskClass::Review {
continue;
}
// Quarantine the just-written node: suppress it so it is held out of
// retrieval until the PR is decided. Best-effort.
let _ = storage.suppress_memory(&id);
// Quarantine the just-written node so it's held out of retrieval until
// the PR is decided. For a destructive write there's no live node to
// suppress — the PR records the action for review/audit instead.
if node.is_some() {
let _ = storage.suppress_memory(&id);
}
let kind = match decision.as_str() {
"supersede" | "replace" => MemoryPrKind::MemorySuperseded,
"merge" => MemoryPrKind::DreamConsolidation,
"supersede" | "replace" | "superseded" => MemoryPrKind::MemorySuperseded,
"merge" | "merged" => MemoryPrKind::DreamConsolidation,
_ if destructive => MemoryPrKind::NodeDecayed,
_ if contradicts_trust.is_some() => MemoryPrKind::ContradictionDetected,
_ => MemoryPrKind::NewFact,
};
let title = format!(
"{}: \"{}\"",
pr_kind_phrase(kind),
node.content.chars().take(80).collect::<String>()
);
// PRIV: never copy full memory content into the PR (it can hold a
// secret, and the PR row is read by the dashboard and may be exported).
// Store a short, redacted preview + a content hash instead. The preview
// is dropped entirely when the write was gated for a sensitive topic.
let sensitive = signals.iter().any(|s| {
s.code == "sensitive_topic" || s.code == "sensitive_node_type"
});
let raw_content = node.as_ref().map(|n| n.content.as_str()).unwrap_or("");
let preview = content_preview(raw_content, sensitive);
let content_hash = hash_content(raw_content);
let title = format!("{}: {}", pr_kind_phrase(kind), preview);
let pr = MemoryPr {
id: format!("pr_{}", uuid::Uuid::new_v4().simple()),
kind,
@ -172,10 +192,14 @@ pub fn gate_writes(
diff: serde_json::json!({
"decision": decision,
"node": {
"id": node.id,
"nodeType": node.node_type,
"content": node.content,
"tags": node.tags,
"id": id,
"nodeType": node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
// Redacted: preview (or "[redacted — sensitive]") + hash,
// never the full content.
"contentPreview": preview,
"contentHash": content_hash,
"tags": node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
"deleted": node.is_none(),
},
}),
signals: signals.clone(),
@ -214,6 +238,46 @@ pub fn gate_writes(
opened
}
/// Whether a write decision permanently removes / forgets memory (so the live
/// row may already be gone when the gate runs).
fn is_destructive_decision(label: &str) -> bool {
matches!(
label,
"purge" | "purged" | "delete" | "deleted" | "forget" | "forgotten"
)
}
/// A short, privacy-preserving preview of memory content for a Memory PR.
/// When the write was flagged for a sensitive topic, the content is redacted
/// entirely — the reviewer sees the risk signals + hash, never the secret.
fn content_preview(content: &str, sensitive: bool) -> String {
if content.is_empty() {
return "(no content)".to_string();
}
if sensitive {
return "[redacted — sensitive content; review via risk signals]".to_string();
}
let trimmed: String = content.chars().take(80).collect();
if content.chars().count() > 80 {
format!("{trimmed}…")
} else {
trimmed
}
}
/// FNV-1a hex fingerprint of memory content — lets a reviewer correlate /
/// dedupe without the PR row carrying the raw (possibly secret) text.
fn hash_content(content: &str) -> String {
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
let mut hash = FNV_OFFSET;
for b in content.as_bytes() {
hash ^= u64::from(*b);
hash = hash.wrapping_mul(FNV_PRIME);
}
format!("{:016x}", hash)
}
fn pr_kind_phrase(kind: vestige_core::MemoryPrKind) -> &'static str {
use vestige_core::MemoryPrKind::*;
match kind {
@ -842,6 +906,40 @@ mod tests {
assert!(extract_writes(&state).is_empty(), "state is not a write");
}
#[test]
fn destructive_decision_classification_c2() {
for d in ["purge", "delete", "forget", "purged", "deleted", "forgotten"] {
assert!(is_destructive_decision(d), "{d} is destructive");
}
for d in ["create", "update", "promote", "reinforce"] {
assert!(!is_destructive_decision(d), "{d} is not destructive");
}
}
#[test]
fn content_preview_redacts_sensitive_and_truncates() {
// PRIV: sensitive content is fully redacted, never previewed.
assert_eq!(
content_preview("the production auth token is sk-abc123", true),
"[redacted — sensitive content; review via risk signals]"
);
// Ordinary content is truncated, not redacted.
let long = "a".repeat(200);
let prev = content_preview(&long, false);
assert!(prev.ends_with('…'));
assert!(prev.chars().count() <= 81);
// Empty content.
assert_eq!(content_preview("", false), "(no content)");
}
#[test]
fn content_hash_is_stable_and_hides_text() {
let h = hash_content("my secret memory");
assert_eq!(h, hash_content("my secret memory"), "stable");
assert!(!h.contains("secret"));
assert_eq!(h.len(), 16);
}
#[test]
fn extract_writes_recognizes_destructive_actions_c2() {
// C2: purge/delete are brain mutations and must trace + be gateable.
@ -855,6 +953,88 @@ mod tests {
}
}
fn store() -> std::sync::Arc<vestige_core::Storage> {
let dir = tempfile::tempdir().unwrap();
std::sync::Arc::new(
vestige_core::Storage::new(Some(dir.path().join("gate_test.db"))).unwrap(),
)
}
#[test]
fn gate_opens_pr_for_destructive_write_after_node_deleted_c2() {
// C2-deep: the row is GONE by the time the gate runs (purge deleted it),
// but a destructive write must STILL open a Memory PR — not be skipped.
let s = store();
let node = s
.ingest(vestige_core::IngestInput {
content: "A memory the agent is about to purge.".to_string(),
node_type: "fact".to_string(),
..Default::default()
})
.unwrap();
// Actually delete the row, like purge does.
let _ = s.delete_node(&node.id);
assert!(s.get_node(&node.id).unwrap().is_none(), "row is gone");
// The tool result the recorder sees for the purge.
let result = serde_json::json!({ "action": "purge", "nodeId": node.id, "success": true });
let opened = gate_writes(
&s,
None,
"run_c2",
"memory",
&result,
vestige_core::ReviewMode::RiskGated,
);
assert_eq!(opened.len(), 1, "destructive write must open a PR even with the node gone");
let pr = s.list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10).unwrap();
assert_eq!(pr.len(), 1);
assert_eq!(pr[0].subject_id.as_deref(), Some(node.id.as_str()));
// The diff marks the node as deleted and carries no resurrected content.
assert_eq!(pr[0].diff["node"]["deleted"], serde_json::json!(true));
}
#[test]
fn gate_redacts_sensitive_content_in_pr_priv() {
// PRIV: a write gated for a sensitive topic must NOT carry the raw
// content into the PR diff/title — only a redaction + hash.
let s = store();
let secret = "the production auth token is sk-live-SECRET-XYZ";
let node = s
.ingest(vestige_core::IngestInput {
content: secret.to_string(),
node_type: "fact".to_string(),
..Default::default()
})
.unwrap();
let result = serde_json::json!({ "decision": "create", "nodeId": node.id });
let opened = gate_writes(
&s,
None,
"run_priv",
"smart_ingest",
&result,
vestige_core::ReviewMode::RiskGated,
);
assert_eq!(opened.len(), 1, "sensitive write opens a PR");
let pr = &s
.list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10)
.unwrap()[0];
let serialized = serde_json::to_string(pr).unwrap();
assert!(
!serialized.contains("SECRET-XYZ") && !serialized.contains("sk-live"),
"PR must not contain the raw secret content; got: {serialized}"
);
assert!(
serialized.contains("redacted"),
"PR must mark the content redacted"
);
// A content hash is present so reviewers can still correlate.
assert!(pr.diff["node"]["contentHash"].as_str().is_some());
}
#[test]
fn write_tool_set_includes_codebase_b2() {
assert!(is_write_tool("codebase"));