mirror of
https://github.com/samvallad33/vestige.git
synced 2026-07-02 22:01:01 +02:00
fix(blackbox): C2-deep gate destructive writes post-delete + redact PR content
Two deeper review findings (both blockers) + doc de-staling.
C2-deep: my earlier C2 made purge/delete TRACE as memory.write, but gate_writes
did `get_node(id) -> skip on None`, and purge had already DELETEd the row — so a
destructive removal still never opened a Memory PR (it was silently skipped).
The most security-critical write type couldn't be reviewed. Fix: a missing node
is now gateable for destructive decisions — gate_writes builds the WriteContext
from the decision itself (marks `forgets`, which classify_write gates), and the
PR records the removal with node.deleted=true. Proven live: purging a node opens
a PR (kind node_decayed, deleted true); test
gate_opens_pr_for_destructive_write_after_node_deleted_c2.
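
A minimal sketch of the C2-deep lookup rule described above. The types and names here (`Node`, `GateInput`, `resolve`) are hypothetical stand-ins for illustration, not the vestige-core API, and the sketch ignores the storage-error case that the real `match` also covers:

```rust
/// Hypothetical stand-in for a stored memory row (not the real vestige type).
#[derive(Debug, Clone, PartialEq)]
struct Node {
    content: String,
}

/// What the gate should do with one reported write (illustrative only).
#[derive(Debug, PartialEq)]
enum GateInput {
    /// Live row found: gate on its real content, type, and tags.
    Live(Node),
    /// Row already removed by a destructive write: gate on the decision alone.
    Deleted,
    /// Non-destructive write with no row: nothing to gate.
    Skip,
}

/// Same label set the commit treats as destructive.
fn is_destructive(decision: &str) -> bool {
    matches!(
        decision,
        "purge" | "purged" | "delete" | "deleted" | "forget" | "forgotten"
    )
}

/// A missing row is only skipped when the decision is NOT destructive.
fn resolve(lookup: Option<Node>, decision: &str) -> GateInput {
    match lookup {
        Some(node) => GateInput::Live(node),
        None if is_destructive(decision) => GateInput::Deleted,
        None => GateInput::Skip,
    }
}
```

The point of the three-way result is that "row not found" stops being a single outcome: before the fix, both `None` arms collapsed into a skip.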
PRIV: gate_writes copied the FULL node.content into the PR diff + title, so a
real secret in a gated memory would leak into the memory_prs table, the
dashboard, and any exported proof bundle — defeating the point of gating
sensitive writes. Fix: the PR now stores a truncated content PREVIEW + an FNV
content HASH, and sensitive-topic/sensitive-node-type writes are fully REDACTED
("[redacted — sensitive content; review via risk signals]"). The reviewer still
sees the risk signals (why it opened) and a hash (to correlate), never the
secret. Tests gate_redacts_sensitive_content_in_pr_priv,
content_preview_redacts_sensitive_and_truncates, content_hash_is_stable_and_hides_text. The
committed memory_pr.json + the whole proof bundle were re-captured and contain
no secret (verified by scan); the re-shot memory-prs.png shows the redaction.
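
A minimal, standalone sketch of the preview-plus-hash idea from the PRIV fix. The function names, the `PREVIEW_CHARS` constant, and the short `"[redacted]"` marker are assumptions made for this example; the commit's own helpers appear in the diff further down:

```rust
/// Assumed preview length for this sketch (the commit truncates at 80 chars).
const PREVIEW_CHARS: usize = 80;

/// Return a reviewer-safe preview: a fixed marker for sensitive writes,
/// otherwise at most PREVIEW_CHARS characters plus an ellipsis if truncated.
fn preview(content: &str, sensitive: bool) -> String {
    if content.is_empty() {
        return "(no content)".to_string();
    }
    if sensitive {
        // Placeholder marker for this sketch, not the commit's exact string.
        return "[redacted]".to_string();
    }
    let mut out: String = content.chars().take(PREVIEW_CHARS).collect();
    if content.chars().count() > PREVIEW_CHARS {
        out.push('…');
    }
    out
}

/// 64-bit FNV-1a over the UTF-8 bytes, rendered as 16 lowercase hex digits.
fn fnv1a_64_hex(content: &str) -> String {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for byte in content.bytes() {
        hash ^= u64::from(byte); // xor first, then multiply: the "1a" order
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    format!("{:016x}", hash)
}
```

FNV-1a is a fast non-cryptographic hash, so the fingerprint is suited to correlating and deduplicating PR rows; it should not be read as a guarantee that a short or guessable secret cannot be recovered from it.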
DOC: REVIEW.md commit list is now git-log-based (no stale hashes); C2-deep + PRIV
added to the findings table; PROOF.md write/PR rows updated; test count -> 1007.
Gates: 1007 lib tests pass (+7 new regression tests), clippy -D warnings clean,
dashboard check + build clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
6a0173dc7b
commit
e08182675b
9 changed files with 227 additions and 82 deletions
|
|
@ -42,12 +42,12 @@ deterministic regression test `test_full_spine_one_runid_crosses_every_hop`
|
|||
| Feature | Status | Notes |
|
||||
|---------|--------|-------|
|
||||
| `mcp.call` trace | **REAL** | every tools/call records one; args **hashed**, never stored raw |
|
||||
| `memory.write` trace | **REAL** | fires on smart_ingest/ingest |
|
||||
| `memory.write` trace | **REAL** | fires on smart_ingest/ingest, memory promote/demote/edit, codebase remember_*, AND destructive purge/delete |
|
||||
| `memory.retrieve` trace | **REAL** | fires on deep_reference/search, with per-id activation |
|
||||
| `memory.suppress` trace | **REAL** | recorded path; fires when retrieval suppresses |
|
||||
| `contradiction.detected` trace | **REAL** | fires when deep_reference surfaces a contradiction pair; UI says "no contradiction in this run" when none |
|
||||
| Memory Receipts | **REAL** | built from real scored memories + trust, persisted, attached to output |
|
||||
| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end |
|
||||
| Risk-gated Memory PRs | **REAL** | quarantine review: commit-then-suppress, audit preserved, influence suspended. Promote verified end-to-end (releases the memory, even past the 24h window). Destructive purge/delete also open a PR. PR content is **redacted** for sensitive writes (preview + hash, never the raw secret) |
|
||||
| Fast / Risk-Gated / Paranoid modes | **REAL** | persisted to `<data_dir>/review_mode.json`; Risk-Gated is the default |
|
||||
| WebSocket broadcast | **REAL** | proven by `websocket-events.jsonl` + a unit test |
|
||||
| `vestige://trace/{runId}` resource | **REAL** | proven by the full-spine test |
|
||||
|
|
|
|||
|
|
@ -114,13 +114,17 @@ $ git diff --stat 9e92a59..HEAD -- ':!apps/dashboard/build' ':!blackbox-proof-20
|
|||
# live count — it grows as review fixes land.
|
||||
```
|
||||
|
||||
Commits (oldest first):
|
||||
Commits (oldest first) — run `git log --oneline 9e92a59..HEAD` for the live,
|
||||
authoritative list; the series so far:
|
||||
- `80c823a` feat: Agent Black Box + Receipts + risk-gated Memory PRs
|
||||
- `b89beee` proof: Proof Lock — full-spine test, honest UI states, proof pack
|
||||
- `140b15f` proof: dream.patch proven live with a real dream run
|
||||
- `cadffb4` docs: package the review bundle — REVIEW.md entry point
|
||||
- `8f7bed0` fix: address review blockers B1–B7 + re-capture proof bundle
|
||||
- (+ a follow-up fix commit for C1/C2 — see "Review findings addressed")
|
||||
- `8f7bed0` fix: review blockers B1–B7 + re-capture proof bundle
|
||||
- `6a0173d` fix: C1 unconditional quarantine release + C2 trace destructive writes
|
||||
- `…` fix: C2-deep (gate destructive writes post-delete) + PRIV (redact PR content)
|
||||
|
||||
The hashes above are point-in-time; the branch tip is the source of truth.
|
||||
|
||||
Key files to review:
|
||||
- **Core (pure logic):** `crates/vestige-core/src/trace/{mod,receipt,review}.rs`
|
||||
|
|
@ -170,12 +174,14 @@ found 7 real issues — 4 blockers. All fixed and tested:
|
|||
| B7 | P3 | `set_review_mode` non-atomic write; export filename used raw `run_id` | `write_atomic` (temp+rename); filename sanitized; static routes declared before dynamic | covered by build + the atomic-write helper's existing use |
|
||||
| C1 | blocker | B1's release used `reverse_suppression`, which **refuses past the 24h labile window** — a PR promoted late stayed suppressed | new `release_quarantine(id)`: unconditional release (no time limit), used by the PR handler instead | test `release_quarantine_works_past_the_labile_window_c1` (proves reverse_suppression refuses but release_quarantine succeeds at +100h) |
|
||||
| C2 | blocker | `memory` `purge`/`delete` (destructive removal) bypassed the write-trace + gate | added purge/purged/delete/deleted/forget/forgotten to `is_write_decision` | test `extract_writes_recognizes_destructive_actions_c2` |
|
||||
| C2-deep | blocker | C2 made purge *trace*, but `gate_writes` did `get_node→skip` on the (already-deleted) row, so a destructive write still **never opened a PR** | gate now treats a missing node as gateable for destructive decisions (builds the context from the decision, marks `forgets`); the PR records the removal with `deleted:true` | test `gate_opens_pr_for_destructive_write_after_node_deleted_c2`; **live:** purging a node opened a PR (`kind: node_decayed`, `deleted: true`) |
|
||||
| PRIV | blocker | `gate_writes` copied **full `node.content`** into the PR `diff` + `title`, so a real secret would leak into the `memory_prs` table, the dashboard, and any exported proof bundle | PR now stores a truncated **preview** + a **content hash**; writes gated for a sensitive topic or sensitive node type are fully **redacted** (`[redacted — sensitive content…]`); the committed `memory_pr.json` was re-captured and contains no secret | tests `gate_redacts_sensitive_content_in_pr_priv`, `content_preview_redacts_sensitive_and_truncates`; **live + bundle scan:** no secret string anywhere |
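
The C1 row in the table above contrasts a time-limited release with an unconditional one. A minimal sketch of that difference, under loud assumptions: the function names, the `age` parameter, and the error type are invented for this example and do not mirror the real `reverse_suppression` / `release_quarantine` signatures, which take a memory id.

```rust
use std::time::Duration;

/// The 24h labile window named in the C1 finding.
const LABILE_WINDOW: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ReleaseError {
    OutsideLabileWindow,
}

/// Models the time-limited path: refuses once the suppression is older
/// than the labile window.
fn time_limited_release(age: Duration) -> Result<(), ReleaseError> {
    if age > LABILE_WINDOW {
        Err(ReleaseError::OutsideLabileWindow)
    } else {
        Ok(())
    }
}

/// Models the unconditional path used when a reviewer promotes a PR:
/// the age of the suppression is deliberately ignored.
fn unconditional_release(_age: Duration) -> Result<(), ReleaseError> {
    Ok(())
}
```

At +100h, the figure used by the C1 regression test, the first function refuses and the second succeeds, which is the behavior a late promote needs.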
|
||||
|
||||
One earlier (self-)review claim was **withdrawn**: the `/api/memory-prs/mode`
|
||||
vs `/{id}` route order is *not* a functional bug — axum 0.8 / matchit gives
|
||||
static segments priority. Reordered for clarity only.
|
||||
|
||||
Net after fixes (B1–B7 + C1/C2): **1002 lib tests pass, clippy `-D warnings` clean, dashboard
|
||||
Net after fixes (B1–B7 + C1/C2 + C2-deep + PRIV): **1007 lib tests pass, clippy `-D warnings` clean, dashboard
|
||||
check + build clean.**
|
||||
|
||||
## Reproduce (any reviewer, locally)
|
||||
|
|
|
|||
|
|
@ -1,29 +0,0 @@
|
|||
{
|
||||
"created_at": "2026-06-22T23:39:30.596744+00:00",
|
||||
"decided_at": "2026-06-22T23:39:44.258862+00:00",
|
||||
"decision": "promote",
|
||||
"diff": {
|
||||
"decision": "create",
|
||||
"node": {
|
||||
"content": "Store the production auth token and security credential for deploys.",
|
||||
"id": "e22e83f3-2c18-4e33-93f4-558d91009505",
|
||||
"nodeType": "fact",
|
||||
"tags": [
|
||||
"security",
|
||||
"auth"
|
||||
]
|
||||
}
|
||||
},
|
||||
"id": "pr_3c5b4b2852e74f1ab7c325a7e9cb6e1f",
|
||||
"kind": "new_fact",
|
||||
"run_id": "run_proof",
|
||||
"signals": [
|
||||
{
|
||||
"code": "sensitive_topic",
|
||||
"detail": "Touches a sensitive topic: authentication / authorization."
|
||||
}
|
||||
],
|
||||
"status": "promoted",
|
||||
"subject_id": "e22e83f3-2c18-4e33-93f4-558d91009505",
|
||||
"title": "New fact pending review: \"Store the production auth token and security credential for deploys.\""
|
||||
}
|
||||
|
|
@ -1,11 +0,0 @@
|
|||
{
|
||||
"activation_path": [],
|
||||
"decay_risk": "high",
|
||||
"mutations": [],
|
||||
"receipt_id": "r_2026_06_22_runproof_7f144c",
|
||||
"retrieved": [
|
||||
"9d975b31-e4b2-425c-902a-c17fef9dd4cb"
|
||||
],
|
||||
"suppressed": [],
|
||||
"trust_floor": 0.0
|
||||
}
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 582 KiB After Width: | Height: | Size: 579 KiB |
|
|
@ -1 +0,0 @@
|
|||
{"averageRetention":0.99,"status":"healthy","totalMemories":4,"version":"2.1.27"}
|
||||
|
|
@ -1 +0,0 @@
|
|||
{"events":[{"argsHash":"13a481da3e53d0fd","at":1782171564842,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171564940,"diff":{"decision":"create"},"id":"9d975b31-e4b2-425c-902a-c17fef9dd4cb","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"52d9b2533542a2eb","at":1782171566254,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171566340,"diff":{"decision":"create"},"id":"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"2639fbc239e17a3d","at":1782171567668,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171567761,"diff":{"decision":"create"},"id":"923709a5-cc60-4f41-b8b1-ef1a635fe6aa","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"6fbbc76c4e98fa50","at":1782171569082,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62},"at":1782171569148,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635fe6aa","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"db928bbabc9cadd7","at":1782171570495,"runId":"run_proof","tool":"smart_ingest","type":"mcp.call"},{"at":1782171570596,"diff":{"decision":"create"},"id":"e22e83f3-2c18-4e33-93f4-558d91009505","runId":"run_proof","source":"agent","type":"memory.write"},{"argsHash":"78a1e9038e3e5136","at":1782171606233,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.57},"at":1782171606293,"ids":["bef6710c-a1ee-4cb3-8a33-82aac2fdaee6","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","923709a5-cc60-4f41-b8b1-ef1a635fe
6aa"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"13a31297fe007a2e","at":1782171625380,"runId":"run_proof","tool":"deep_reference","type":"mcp.call"},{"activation":{"923709a5-cc60-4f41-b8b1-ef1a635fe6aa":0.62,"9d975b31-e4b2-425c-902a-c17fef9dd4cb":0.62,"bef6710c-a1ee-4cb3-8a33-82aac2fdaee6":0.62,"e22e83f3-2c18-4e33-93f4-558d91009505":0.58},"at":1782171625436,"ids":["923709a5-cc60-4f41-b8b1-ef1a635fe6aa","e22e83f3-2c18-4e33-93f4-558d91009505","9d975b31-e4b2-425c-902a-c17fef9dd4cb","bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"],"runId":"run_proof","type":"memory.retrieve"},{"argsHash":"ac19c646baf0673d","at":1782171626392,"runId":"run_proof","tool":"search","type":"mcp.call"},{"activation":{},"at":1782171626402,"ids":["9d975b31-e4b2-425c-902a-c17fef9dd4cb"],"runId":"run_proof","type":"memory.retrieve"}],"exportedAt":"2026-06-22T23:42:58.560420+00:00","format":"vestige-trace","runId":"run_proof","summary":{"eventCount":16,"firstTool":"smart_ingest","lastAt":1782171626402,"retrievedCount":12,"startedAt":1782171564842,"suppressedCount":0,"vetoCount":0,"writeCount":4},"version":1}
|
||||
|
|
@ -1,7 +1,8 @@
|
|||
{"data": {"timestamp": "2026-06-22T23:40:23.469437+00:00", "version": "2.1.27"}, "type": "Connected"}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 12, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782171625380}, "timestamp": "2026-06-22T23:40:25.381237Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 13, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "e22e83f3-2c18-4e33-93f4-558d91009505", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "activation": {"923709a5-cc60-4f41-b8b1-ef1a635fe6aa": 0.62, "9d975b31-e4b2-425c-902a-c17fef9dd4cb": 0.62, "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6": 0.62, "e22e83f3-2c18-4e33-93f4-558d91009505": 0.58}, "at": 1782171625436}, "timestamp": "2026-06-22T23:40:25.436463Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782171626392}, "timestamp": "2026-06-22T23:40:26.392602Z"}}
|
||||
{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 1, "result_ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "duration_ms": 0, "timestamp": "2026-06-22T23:40:26.402765Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["9d975b31-e4b2-425c-902a-c17fef9dd4cb"], "activation": {}, "at": 1782171626402}, "timestamp": "2026-06-22T23:40:26.402863Z"}}
|
||||
{"type": "DeepReferenceCompleted", "data": {"query": "Composing: `2be410de-0bc7-4242-b640-8f859b43efe3`, `5107e14e-19ac-4ed0-8add-18e7b2977723`, `3c083fb3-61e1-4aa0-b6a8-0973e0f3bb57`. These connect the real v2.1.27 public baseline, the Black Box review findings, and the older dashboard graph/live-memory evolution.\n\nNever-composed detected: None.\n\nRecommendation: treat this as a major product-generation jump, but harden the receipt semantics before launch footage.\n\nYes. This is a **major step-up** from the current GitHub `v2.1.27` dashboard.\n\nThe cleanest way to say it:\n\n`v2.1.27` dashboard = **inspectable local memory system** \nBlack Box branch = **agent cognition flight recorder + governance layer**\n\nThat is not just \u201cnew tabs.\u201d It changes the product category.\n\nIn `v2.1.27`, the dashboard proves Vestige has local memory, graph visibility, dreams/reasoning surfaces, source-aware connector work, and inspectable state. Strong, but mostly it shows **what exists in memory**.\n\nThis branch shows **what the agent did with memory during a run**:\n\n- every MCP call gets a `runId`\n- retrievals become replayable trace events\n- receipts show what memories influenced the answer\n- risky writes open Memory PRs\n- WebSocket events make the graph/dashboard pulse live\n- traces export as artifacts\n- `vestige://trace/{runId}` turns the trace into an MCP-readable receipt\n\nThat is a different league. 
It moves Vestige from \u201cmemory dashboard\u201d to **black box recorder for agents**.\n\nMy honest rating:\n\n- UI/product experience: **2-3x more advanced**\n- de", "intent": "Comparison", "status": "partial_evidence", "confidence": 0.52, "primary_id": "923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "supporting_ids": ["923709a5-cc60-4f41-b8b1-ef1a635fe6aa", "9d975b31-e4b2-425c-902a-c17fef9dd4cb", "e22e83f3-2c18-4e33-93f4-558d91009505", "bef6710c-a1ee-4cb3-8a33-82aac2fdaee6"], "contradicting_ids": [], "contradiction_pairs": [], "memories_analyzed": 4, "duration_ms": 980, "timestamp": "2026-06-22T23:42:50.152360Z"}}
|
||||
{"data": {"timestamp": "2026-06-23T00:43:28.238141+00:00", "version": "2.1.27"}, "type": "Connected"}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 14, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "deep_reference", "argsHash": "13a31297fe007a2e", "at": 1782175410153}, "timestamp": "2026-06-23T00:43:30.154710Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 15, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3", "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {"26a3c976-043b-4915-accf-ae098c8dc66b": 0.62, "591c638e-1fc7-4b6d-bcb3-b7fcb6c0c7b3": 0.62, "6aa12b99-270e-4fb6-b523-9f01b0bee16b": 0.53, "76c13cba-7b88-4ce7-b7de-0a906d372806": 0.62}, "at": 1782175410209}, "timestamp": "2026-06-23T00:43:30.209554Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 16, "event": {"type": "mcp.call", "runId": "run_proof", "tool": "search", "argsHash": "ac19c646baf0673d", "at": 1782175411167}, "timestamp": "2026-06-23T00:43:31.167561Z"}}
|
||||
{"type": "SearchPerformed", "data": {"query": "dashboard", "result_count": 2, "result_ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "duration_ms": 0, "timestamp": "2026-06-23T00:43:31.182829Z"}}
|
||||
{"type": "TraceEvent", "data": {"run_id": "run_proof", "seq": 17, "event": {"type": "memory.retrieve", "runId": "run_proof", "ids": ["26a3c976-043b-4915-accf-ae098c8dc66b", "76c13cba-7b88-4ce7-b7de-0a906d372806"], "activation": {}, "at": 1782175411182}, "timestamp": "2026-06-23T00:43:31.182933Z"}}
|
||||
{"type": "MemoryUnsuppressed", "data": {"id": "6aa12b99-270e-4fb6-b523-9f01b0bee16b", "remaining_count": 0, "timestamp": "2026-06-23T00:46:18.338387Z"}}
|
||||
{"type": "MemoryPrDecided", "data": {"id": "pr_31ab4c15f1694504bf33be82715bee03", "decision": "promote", "status": "promoted", "timestamp": "2026-06-23T00:46:18.338407Z"}}
|
||||
|
|
|
|||
|
|
@ -109,61 +109,81 @@ pub fn gate_writes(
|
|||
// Collect each (id, decision) write the tool reported.
|
||||
let writes = extract_writes(result);
|
||||
for (id, decision) in writes {
|
||||
let destructive = is_destructive_decision(&decision);
|
||||
|
||||
// Pull the just-written node to inspect its real content/type/tags.
|
||||
// C2: a destructive write (purge/delete/forget) has ALREADY removed the
|
||||
// row, so get_node returns None — we must NOT skip it (that's how
|
||||
// destructive removals were bypassing review). For those, build the
|
||||
// context from the decision alone; for normal writes a missing node
|
||||
// genuinely means nothing to gate, so skip.
|
||||
let node = match storage.get_node(&id) {
|
||||
Ok(Some(n)) => n,
|
||||
Ok(Some(n)) => Some(n),
|
||||
_ if destructive => None,
|
||||
_ => continue,
|
||||
};
|
||||
|
||||
// A decision of supersede/replace/merge means the write overwrote an
|
||||
// existing memory — the strongest risk signal. Look up the trust of the
|
||||
// memory it superseded so the gate can weigh it.
|
||||
let (supersedes, merges) = match decision.as_str() {
|
||||
"supersede" | "replace" => (true, false),
|
||||
"merge" => (false, true),
|
||||
"supersede" | "replace" | "superseded" => (true, false),
|
||||
"merge" | "merged" => (false, true),
|
||||
_ => (false, false),
|
||||
};
|
||||
// If this superseded something, treat the contradiction as against a
|
||||
// high-trust memory when the *new* node's own retention is high (the
|
||||
// pipeline only supersedes when confident). This keeps the gate honest
|
||||
// without a second DB round-trip per write.
|
||||
let contradicts_trust = if supersedes {
|
||||
Some(node.retention_strength.max(0.7))
|
||||
} else {
|
||||
None
|
||||
let contradicts_trust = match (&node, supersedes) {
|
||||
(Some(n), true) => Some(n.retention_strength.max(0.7)),
|
||||
_ => None,
|
||||
};
|
||||
|
||||
let ctx = WriteContext {
|
||||
source: Some(WriteSource::Agent),
|
||||
node_type: node.node_type.clone(),
|
||||
content: node.content.clone(),
|
||||
tags: node.tags.clone(),
|
||||
node_type: node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
|
||||
content: node.as_ref().map(|n| n.content.clone()).unwrap_or_default(),
|
||||
tags: node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
|
||||
contradicts_trust,
|
||||
supersedes,
|
||||
merges,
|
||||
forgets: destructive,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
// Whether a destructive write (erasing brain state) should ALWAYS be
// reviewed, even in Fast mode, is debatable; we respect the mode: the
// `forgets` signal in WriteContext makes classify_write gate it in Risk-Gated.
|
||||
let (class, signals) = classify_write(&ctx, mode);
|
||||
if class != RiskClass::Review {
|
||||
continue;
|
||||
}
|
||||
|
||||
// Quarantine the just-written node: suppress it so it is held out of
|
||||
// retrieval until the PR is decided. Best-effort.
|
||||
let _ = storage.suppress_memory(&id);
|
||||
// Quarantine the just-written node so it's held out of retrieval until
|
||||
// the PR is decided. For a destructive write there's no live node to
|
||||
// suppress — the PR records the action for review/audit instead.
|
||||
if node.is_some() {
|
||||
let _ = storage.suppress_memory(&id);
|
||||
}
|
||||
|
||||
let kind = match decision.as_str() {
|
||||
"supersede" | "replace" => MemoryPrKind::MemorySuperseded,
|
||||
"merge" => MemoryPrKind::DreamConsolidation,
|
||||
"supersede" | "replace" | "superseded" => MemoryPrKind::MemorySuperseded,
|
||||
"merge" | "merged" => MemoryPrKind::DreamConsolidation,
|
||||
_ if destructive => MemoryPrKind::NodeDecayed,
|
||||
_ if contradicts_trust.is_some() => MemoryPrKind::ContradictionDetected,
|
||||
_ => MemoryPrKind::NewFact,
|
||||
};
|
||||
let title = format!(
|
||||
"{}: \"{}\"",
|
||||
pr_kind_phrase(kind),
|
||||
node.content.chars().take(80).collect::<String>()
|
||||
);
|
||||
|
||||
// PRIV: never copy full memory content into the PR (it can hold a
|
||||
// secret, and the PR row is read by the dashboard and may be exported).
|
||||
// Store a short, truncated preview + a content hash instead. The preview is
// replaced by a redaction marker when the write was gated for a sensitive
// topic or a sensitive node type.
|
||||
let sensitive = signals.iter().any(|s| {
|
||||
s.code == "sensitive_topic" || s.code == "sensitive_node_type"
|
||||
});
|
||||
let raw_content = node.as_ref().map(|n| n.content.as_str()).unwrap_or("");
|
||||
let preview = content_preview(raw_content, sensitive);
|
||||
let content_hash = hash_content(raw_content);
|
||||
|
||||
let title = format!("{}: {}", pr_kind_phrase(kind), preview);
|
||||
let pr = MemoryPr {
|
||||
id: format!("pr_{}", uuid::Uuid::new_v4().simple()),
|
||||
kind,
|
||||
|
|
@ -172,10 +192,14 @@ pub fn gate_writes(
|
|||
diff: serde_json::json!({
|
||||
"decision": decision,
|
||||
"node": {
|
||||
"id": node.id,
|
||||
"nodeType": node.node_type,
|
||||
"content": node.content,
|
||||
"tags": node.tags,
|
||||
"id": id,
|
||||
"nodeType": node.as_ref().map(|n| n.node_type.clone()).unwrap_or_default(),
|
||||
// Redacted: preview (or "[redacted — sensitive]") + hash,
|
||||
// never the full content.
|
||||
"contentPreview": preview,
|
||||
"contentHash": content_hash,
|
||||
"tags": node.as_ref().map(|n| n.tags.clone()).unwrap_or_default(),
|
||||
"deleted": node.is_none(),
|
||||
},
|
||||
}),
|
||||
signals: signals.clone(),
|
||||
|
|
@ -214,6 +238,46 @@ pub fn gate_writes(
|
|||
opened
|
||||
}
|
||||
|
||||
/// Whether a write decision permanently removes / forgets memory (so the live
|
||||
/// row may already be gone when the gate runs).
|
||||
fn is_destructive_decision(label: &str) -> bool {
|
||||
matches!(
|
||||
label,
|
||||
"purge" | "purged" | "delete" | "deleted" | "forget" | "forgotten"
|
||||
)
|
||||
}
|
||||
|
||||
/// A short, privacy-preserving preview of memory content for a Memory PR.
|
||||
/// When the write was flagged for a sensitive topic, the content is redacted
|
||||
/// entirely — the reviewer sees the risk signals + hash, never the secret.
|
||||
fn content_preview(content: &str, sensitive: bool) -> String {
|
||||
if content.is_empty() {
|
||||
return "(no content)".to_string();
|
||||
}
|
||||
if sensitive {
|
||||
return "[redacted — sensitive content; review via risk signals]".to_string();
|
||||
}
|
||||
let trimmed: String = content.chars().take(80).collect();
|
||||
if content.chars().count() > 80 {
|
||||
format!("{trimmed}…")
|
||||
} else {
|
||||
trimmed
|
||||
}
|
||||
}
|
||||
|
||||
/// FNV-1a hex fingerprint of memory content — lets a reviewer correlate /
|
||||
/// dedupe without the PR row carrying the raw (possibly secret) text.
|
||||
fn hash_content(content: &str) -> String {
|
||||
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
|
||||
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
|
||||
let mut hash = FNV_OFFSET;
|
||||
for b in content.as_bytes() {
|
||||
hash ^= u64::from(*b);
|
||||
hash = hash.wrapping_mul(FNV_PRIME);
|
||||
}
|
||||
format!("{:016x}", hash)
|
||||
}
|
||||
|
||||
fn pr_kind_phrase(kind: vestige_core::MemoryPrKind) -> &'static str {
|
||||
use vestige_core::MemoryPrKind::*;
|
||||
match kind {
|
||||
|
|
@ -842,6 +906,40 @@ mod tests {
|
|||
assert!(extract_writes(&state).is_empty(), "state is not a write");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn destructive_decision_classification_c2() {
|
||||
for d in ["purge", "delete", "forget", "purged", "deleted", "forgotten"] {
|
||||
assert!(is_destructive_decision(d), "{d} is destructive");
|
||||
}
|
||||
for d in ["create", "update", "promote", "reinforce"] {
|
||||
assert!(!is_destructive_decision(d), "{d} is not destructive");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn content_preview_redacts_sensitive_and_truncates() {
|
||||
// PRIV: sensitive content is fully redacted, never previewed.
|
||||
assert_eq!(
|
||||
content_preview("the production auth token is sk-abc123", true),
|
||||
"[redacted — sensitive content; review via risk signals]"
|
||||
);
|
||||
// Ordinary content is truncated, not redacted.
|
||||
let long = "a".repeat(200);
|
||||
let prev = content_preview(&long, false);
|
||||
assert!(prev.ends_with('…'));
|
||||
assert!(prev.chars().count() <= 81);
|
||||
// Empty content.
|
||||
assert_eq!(content_preview("", false), "(no content)");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn content_hash_is_stable_and_hides_text() {
|
||||
let h = hash_content("my secret memory");
|
||||
assert_eq!(h, hash_content("my secret memory"), "stable");
|
||||
assert!(!h.contains("secret"));
|
||||
assert_eq!(h.len(), 16);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn extract_writes_recognizes_destructive_actions_c2() {
|
||||
// C2: purge/delete are brain mutations and must trace + be gateable.
|
||||
|
|
@ -855,6 +953,88 @@ mod tests {
|
|||
}
|
||||
}
|
||||
|
||||
fn store() -> std::sync::Arc<vestige_core::Storage> {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
std::sync::Arc::new(
|
||||
vestige_core::Storage::new(Some(dir.path().join("gate_test.db"))).unwrap(),
|
||||
)
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn gate_opens_pr_for_destructive_write_after_node_deleted_c2() {
|
||||
// C2-deep: the row is GONE by the time the gate runs (purge deleted it),
|
||||
// but a destructive write must STILL open a Memory PR — not be skipped.
|
||||
let s = store();
|
||||
let node = s
|
||||
.ingest(vestige_core::IngestInput {
|
||||
content: "A memory the agent is about to purge.".to_string(),
|
||||
node_type: "fact".to_string(),
|
||||
..Default::default()
|
||||
})
|
||||
.unwrap();
|
||||
// Actually delete the row, like purge does.
|
||||
let _ = s.delete_node(&node.id);
|
||||
assert!(s.get_node(&node.id).unwrap().is_none(), "row is gone");
|
||||
|
||||
// The tool result the recorder sees for the purge.
|
||||
let result = serde_json::json!({ "action": "purge", "nodeId": node.id, "success": true });
|
||||
let opened = gate_writes(
|
||||
&s,
|
||||
None,
|
||||
"run_c2",
|
||||
"memory",
|
||||
&result,
|
||||
vestige_core::ReviewMode::RiskGated,
|
||||
);
|
||||
|
||||
assert_eq!(opened.len(), 1, "destructive write must open a PR even with the node gone");
|
||||
let pr = s.list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10).unwrap();
|
||||
assert_eq!(pr.len(), 1);
|
||||
assert_eq!(pr[0].subject_id.as_deref(), Some(node.id.as_str()));
|
||||
// The diff marks the node as deleted and carries no resurrected content.
|
||||
assert_eq!(pr[0].diff["node"]["deleted"], serde_json::json!(true));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn gate_redacts_sensitive_content_in_pr_priv() {
|
||||
// PRIV: a write gated for a sensitive topic must NOT carry the raw
|
||||
// content into the PR diff/title — only a redaction + hash.
|
||||
let s = store();
|
||||
let secret = "the production auth token is sk-live-SECRET-XYZ";
|
||||
let node = s
|
||||
.ingest(vestige_core::IngestInput {
|
||||
content: secret.to_string(),
|
||||
node_type: "fact".to_string(),
|
||||
..Default::default()
|
||||
})
|
||||
.unwrap();
|
||||
let result = serde_json::json!({ "decision": "create", "nodeId": node.id });
|
||||
let opened = gate_writes(
|
||||
&s,
|
||||
None,
|
||||
"run_priv",
|
||||
"smart_ingest",
|
||||
&result,
|
||||
vestige_core::ReviewMode::RiskGated,
|
||||
);
|
||||
assert_eq!(opened.len(), 1, "sensitive write opens a PR");
|
||||
|
||||
let pr = &s
|
||||
.list_memory_prs(Some(vestige_core::MemoryPrStatus::Pending), 10)
|
||||
.unwrap()[0];
|
||||
let serialized = serde_json::to_string(pr).unwrap();
|
||||
assert!(
|
||||
!serialized.contains("SECRET-XYZ") && !serialized.contains("sk-live"),
|
||||
"PR must not contain the raw secret content; got: {serialized}"
|
||||
);
|
||||
assert!(
|
||||
serialized.contains("redacted"),
|
||||
"PR must mark the content redacted"
|
||||
);
|
||||
// A content hash is present so reviewers can still correlate.
|
||||
assert!(pr.diff["node"]["contentHash"].as_str().is_some());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn write_tool_set_includes_codebase_b2() {
|
||||
assert!(is_write_tool("codebase"));
|
||||