fix(blackbox): address review blockers B1–B7 + re-capture proof bundle

A full multi-agent review found 7 real issues (4 blockers). All fixed + tested.

B1 (blocker): Promoting a Memory PR did not release the quarantined memory; the UI said "promoted" while the memory stayed suppressed/out of retrieval. act_on_memory_pr now calls reverse_suppression(subject_id) on accept actions; MemoryPrAction::releases_memory() encodes the rule (promote/merge/supersede release; forget/quarantine keep it held). Proven live: PR response subjectReleased:true, SQLite suppression_count 0.

B2 (blocker): memory promote/demote (returns `action`, not `decision`) and codebase remember_* writes bypassed the write-trace + PR gate. extract_writes now reads `action` too, filtered by is_write_decision (reads like get/state excluded); is_write_tool includes `codebase`.

B3 (blocker): receipt ids collided within a run (r_<date>_<runId> + INSERT OR REPLACE overwrote earlier receipts). IDs are now r_<date>_<runId8>_<unique6>; build() mints the suffix, build_with_unique() keeps tests deterministic.

B4 (blocker): proof bundle was assembled from two runs (trace.json=run_proof, websocket-events.jsonl=run_proof2). Re-captured the whole bundle from a single run; trace, websocket, receipt, and memory_pr all carry run_proof now.

B5: Black Box receipts panel showed global latest, not the selected run. Added list_receipts_for_run + /api/receipts?run= ; the page uses listForRun.

B6: SENSITIVE_TOPICS substring matching false-fired (tokenizer->token, author->auth, secretary->secret). Switched to word-boundary matching; real phrasings (auth token, security vulnerability, api key) still gate.

B7: set_review_mode now writes atomically (temp+rename via write_atomic); export_trace sanitizes run_id in the Content-Disposition filename; memory-prs static routes declared before the dynamic /{id} route.

Withdrawn: the /mode-vs-/{id} route order is NOT a functional bug (axum 0.8 / matchit prioritizes static segments); reordered for clarity only.

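
For reviewers, a minimal sketch of the word-boundary idea behind B6, using only the standard library. This is an illustration, not the code in this commit: the topic list, `is_word_char`, `contains_word`, and `is_sensitive` are hypothetical names, and the real SENSITIVE_TOPICS contents are not shown here.

```rust
// Hypothetical topic list for illustration; the real SENSITIVE_TOPICS may differ.
const SENSITIVE_TOPICS: &[&str] = &["token", "auth", "secret", "security", "api key"];

/// A character that extends a word, so a match touching it is NOT on a boundary.
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// True when `topic` occurs in `text` with a non-word character
/// (or the start/end of the string) on both sides.
fn contains_word(text: &str, topic: &str) -> bool {
    if topic.is_empty() {
        return false;
    }
    let mut start = 0;
    while let Some(pos) = text[start..].find(topic) {
        let begin = start + pos;
        let end = begin + topic.len();
        let before_ok = text[..begin]
            .chars()
            .next_back()
            .map_or(true, |c| !is_word_char(c));
        let after_ok = text[end..]
            .chars()
            .next()
            .map_or(true, |c| !is_word_char(c));
        if before_ok && after_ok {
            return true;
        }
        // Step past the first character of this match and keep scanning.
        start = begin + text[begin..].chars().next().map_or(1, |c| c.len_utf8());
    }
    false
}

/// Case-insensitive check of `text` against every topic.
fn is_sensitive(text: &str) -> bool {
    let lower = text.to_lowercase();
    SENSITIVE_TOPICS.iter().any(|t| contains_word(&lower, t))
}

fn main() {
    for sample in ["tokenizer", "author", "secretary", "auth token", "API key"] {
        println!("{sample:?} -> {}", is_sensitive(sample));
    }
}
```

With plain substring matching all five samples would gate; with the boundary check only the last two do, which matches the false-fire cases listed under B6.
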
Gates: 999 lib tests pass (+9 new regressions), clippy -D warnings clean, dashboard check + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-02 22:01:01 +02:00 · 2026-06-22 18:46:14 -05:00 · 2026-06-22 18:46:14 -05:00 · 8f7bed0463
commit 8f7bed0463
parent cadffb419b
17 changed files with 486 additions and 63 deletions
--- a/apps/dashboard/src/lib/stores/api.ts
+++ b/apps/dashboard/src/lib/stores/api.ts
@@ -146,6 +146,11 @@ export const api = {
 	// Memory Receipts (v2.2): the nutrition label for a retrieval.
 	receipts: {
 		list: (limit = 50) => fetcher<ReceiptListResponse>(`/receipts?limit=${limit}`),
+		// B5: scope to one run so the Black Box panel shows that run's receipts.
+		listForRun: (runId: string, limit = 50) =>
+			fetcher<ReceiptListResponse>(
+				`/receipts?run=${encodeURIComponent(runId)}&limit=${limit}`
+			),
 		get: (receiptId: string) => fetcher<Receipt>(`/receipts/${encodeURIComponent(receiptId)}`)
 	},

--- a/apps/dashboard/src/routes/(app)/blackbox/+page.svelte
+++ b/apps/dashboard/src/routes/(app)/blackbox/+page.svelte
@@ -86,9 +86,9 @@
 		try {
 			detail = await api.traces.get(runId);
 			scrubIndex = Math.max(0, (detail.events.length || 1) - 1);
-			// Receipts are the proof behind a run's retrievals. The list is
-			// recent-first; the newest typically belong to the just-selected run.
-			receipts = (await api.receipts.list(8)).receipts;
+			// Receipts are the proof behind THIS run's retrievals — scoped to
+			// the selected run (B5), not the global latest.
+			receipts = (await api.receipts.listForRun(runId, 8)).receipts;
 		} catch (e) {
 			error = String(e);
 			detail = null;