Sam is a solo founder and built Postdict alone — the pitch must say "I built,"
"I ported," "I already did it," never "we." For an early-stage solo founder that
is a strength (ships without a team = what investors bet on). Only remaining
"we" is "now we're handing debugging to AI agents" — there it means all of us as
an industry, not the company. Added delivery rule #4: "I built this" is a flex.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sam delivers this on a stage to 50-100 people — voice only, no screen. Rewrote
from a terminal-demo script into a pure ~60s spoken pitch built for the ear:
short sentences, hard stops, [pause] cues (silence = the loudest move on stage),
and one memorizable detonation line ("a root cause never looks like the bug it
creates"). Opens about the audience's pain, not the product; promises the
laptop demo afterward rather than fumbling a terminal live. Includes a 30s
hallway version and a 5s one-liner.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The detonation version of the funding pitch. Deliberately REJECTS a "X companies
got hacked" hook (commodity security-FUD, needs an uncitable stat that torches
the honest+reproducible moat, and shrinks the market from "every agent in
production" to "security"). Instead opens with the WOUND every engineer has
lived — "production breaks, the cause is a change you made days ago and forgot"
— then detonates the industry's flawed axiom ("relevance ≠ resemblance; a root
cause never looks like the bug it creates"), proves it live, defends the moat
(Nature port + their architecture IS the axiom), and ends on TAM.
Uses `postdict` on screen (the new name); includes the interactive-shell alias
so it can be recorded with the new name today. Contrast re-verified end-to-end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rewrote the 60s funding script around Sam's stronger framing: lead with the
flawed AXIOM the whole industry shares ("relevance equals resemblance"), then
detonate it with the fact ("a root cause never looks like the bug it creates").
This names the MECHANISM of the error, not just that one exists — a diagnosis,
not a claim. Keeps the live contrast as the proof, the Nature-port + "their
architecture IS the axiom" as the moat, and "every agent that touches
production" as the market. On-screen commands re-verified end-to-end:
similarity→billing lookalike (wrong), Postdict→API_TIMEOUT cause (right).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A separate, investor-grade script (vs the viral clip): leads with the thesis
"the entire AI-memory industry is trapped in a category error — a root cause
never looks like the bug it creates", proves it live with the similarity-vs-
Postdict contrast, then closes on the moat (faithful Nature port + incumbents'
architecture IS the category error) and the market (every agent that touches
production).
Verified the exact on-screen commands end-to-end: the 3-memory scenario (cause
+ billing-500 lookalike + crash) makes the contrast devastating — similarity
search returns the billing lookalike as its #1, Postdict reaches back 3 days to
the real env-var cause. (Without the lookalike distractor the contrast collapses
— similarity would also surface the cause — so the script plants all three.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
demo/README.md: the complete self-serve demo artifact — one-command run, the
seeded scenario explained, a "build your own scenario" section, the honest
boundary (won't invent a cause; can't reach a cause that was never recorded),
the Nature citation + the "field admits this is unsolved" sources, and the
recording playbook + paste-ready caption.
Writing/testing the README surfaced a real inconsistency, now fixed:
- The CLI's failure-finder used a hardcoded content-only marker subset and
ignored tags, so a "Checkout latency spiked" memory (regression tag, no crash
word in content) was never picked as the failure. The CLI now calls the SAME
public `looks_like_failure` (content + tags, full list) the backfill tool uses
— one definition, no drift.
- Extended FAILURE_MARKERS with performance/degradation failures (spiked,
latency, degraded, slow, hang, throttled, oom, 502/503/504, flaky, ...) so the
feature backfills from perf regressions, not just hard crashes.
clippy clean; 527 core + 453 mcp tests; both the main demo and the README's
custom scenario verified end-to-end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3-model rotation audit (DeepSeek V4-Pro / Kimi K2.7 / MiniMax M3, max thinking,
each model × each of 3 sections). Claude verified every finding against code.
CONFIRMED + FIXED:
- [FTS, consensus DeepSeek+MiniMax] sanitize_fts5_or_query split on
!is_alphanumeric()+'_', but the index uses tokenize='porter ascii' which
splits on '_' and non-ASCII. So "API_TIMEOUT"/"café" became single phrases that
could NEVER match. Now splits on !is_ascii_alphanumeric() + lowercases to mirror
the tokenizer; caps token count (64) and length (64) for DoS hardening. Also
fixes the pre-existing storage.search bug (multi-word queries silently returned
nothing). 5 new tests pin it.
- [Demo honesty, consensus Kimi+DeepSeek] the contrast labeled keyword search as
"SIMILARITY SEARCH" and asserted "NONE of these is the cause" universally. Now
prints the REAL engine ("keyword (BM25)" vs "semantic (vector + BM25 hybrid)")
and claims only what's true ("ranked by RESEMBLANCE; its top hit is a lookalike").
De-hardcoded the "Service crashed:" munging to a generic label-strip.
VERIFIED FALSE POSITIVE (not changed): MiniMax "fts.id non-existent column" —
the FTS5 table is declared `fts5(id, content, tags, ...)`, the JOIN is valid.
No injection found by any model (quote-doubling + operator-stripping confirmed safe).
clippy clean; 527 core + 453 mcp tests pass; demo verified.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>