mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
docs(plans): add historic-sql pattern shard smoke/docs plan
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
2d1efda176
commit
fed5e6f89f
1 changed files with 407 additions and 0 deletions
|
|
@ -0,0 +1,407 @@
|
|||
# Historic SQL Pattern Shard Smoke Docs Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Align the Postgres historic-SQL smoke and example docs with sharded pattern WorkUnits.
|
||||
|
||||
**Architecture:** The runtime already writes the full `patterns-input.json` audit file and bounded `patterns-input/part-0001.json` style shards. This plan updates the example acceptance assets so they verify the sharded contract instead of the pre-sharding root `historic-sql-patterns` WorkUnit.
|
||||
|
||||
**Tech Stack:** Bash, Node.js built-in test runner, pnpm workspace scripts, KTX local stage-only ingest.
|
||||
|
||||
---
|
||||
|
||||
## Spec And Existing Plan Status
|
||||
|
||||
Spec: `docs/superpowers/specs/2026-05-11-historic-sql-redesign-design.md`
|
||||
|
||||
Plans derived from this spec and implemented in this worktree:
|
||||
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-foundations.md` - implemented. Evidence: `packages/context/src/ingest/adapters/historic-sql/skill-schemas.ts`, `packages/context/src/sql-analysis/ports.ts`, daemon `/sql/analyze-batch`, `SemanticLayerSource.usage`, and `mergeUsagePreservingExternal()`.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-search-enrichment.md` - implemented. Evidence: usage-aware SL search text, SQLite FTS snippets, and local/MCP result fields `frequencyTier` plus `snippet`.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-unified-hot-path.md` - implemented. Evidence: `stageHistoricSqlAggregatedSnapshot()`, `chunkHistoricSqlUnifiedStagedDir()`, `PostgresPgssReader`, aggregate BigQuery/Snowflake readers, unified schemas, and package exports.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-skills-projection-cutover.md` - implemented. Evidence: `HistoricSqlSourceAdapter`, `historic_sql_table_digest`, `historic_sql_patterns`, `emit_historic_sql_evidence`, `HistoricSqlProjectionPostProcessor`, and legacy skill removal from runtime code.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-cross-dialect-readiness.md` - implemented. Evidence: local adapter registration tests for Postgres, BigQuery, and Snowflake plus PG doctor coverage for informational `pg_stat_statements.max`.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-docs-smoke-and-config-cleanup.md` - implemented at the time it was written, but its smoke assertions predate pattern shard WorkUnits.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-projection-archive-hardening.md` - implemented. Evidence: `isArchivedPatternPage()`, archive exclusion from slug matching, stale table tests, and legacy query-page cleanup coverage.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-end-to-end-retrieval-acceptance.md` - implemented. Evidence: `local-ingest-acceptance.test.ts` proves production adapter output reaches SL search and wiki search.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-redaction-hardening.md` - implemented. Evidence: `redaction.ts`, `redaction.test.ts`, and staged artifact redaction coverage in `stage-unified.test.ts`.
|
||||
- `docs/superpowers/plans/2026-05-11-historic-sql-pattern-workunit-sharding.md` - implemented. Evidence: `pattern-inputs.ts`, `pattern-inputs.test.ts`, `stage-unified.ts` writes `patterns-input/part-*.json`, `chunk-unified.ts` emits `historic-sql-patterns-part-*`, `historic_sql_patterns` reads shards, and acceptance tests use `rawPath: 'patterns-input/part-0001.json'`.
|
||||
|
||||
No existing spec-derived implementation plan is currently unimplemented in this worktree.
|
||||
|
||||
Remaining gap this plan fixes:
|
||||
|
||||
- `examples/postgres-historic/scripts/smoke.sh` still asserts a WorkUnit with `unitKey === 'historic-sql-patterns'`.
|
||||
- Current runtime emits pattern WorkUnits with keys like `historic-sql-patterns-part-0001` and raw files like `patterns-input/part-0001.json`.
|
||||
- The same smoke only validates the audit file `patterns-input.json`; it does not assert that the bounded shard files exist or contain only cross-table candidates.
|
||||
- `examples/postgres-historic/README.md` and `examples/README.md` describe unchanged "pattern inputs" but do not explain that `patterns-input.json` is now audit-only and `patterns-input/part-*.json` drives pattern WorkUnits.
|
||||
- `scripts/examples-docs.test.mjs` does not pin the sharded smoke/doc contract, so the stale root WorkUnit assertion can regress silently.
|
||||
|
||||
## File Structure
|
||||
|
||||
- Modify `scripts/examples-docs.test.mjs`
|
||||
Pins docs and smoke script to the sharded pattern WorkUnit contract.
|
||||
- Modify `examples/postgres-historic/scripts/smoke.sh`
|
||||
Validates `patterns-input/part-*.json` shard files and `historic-sql-patterns-part-*` stage-only WorkUnits.
|
||||
- Modify `examples/postgres-historic/README.md`
|
||||
Documents `patterns-input.json` as the full audit artifact and `patterns-input/part-*.json` as bounded pattern WorkUnit input.
|
||||
- Modify `examples/README.md`
|
||||
Updates the short example catalog entry with the same audit-vs-shard wording.
|
||||
|
||||
### Task 1: Pin Example Tests To Pattern Shards
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/examples-docs.test.mjs`
|
||||
|
||||
- [ ] **Step 1: Add failing assertions for sharded pattern smoke/docs**
|
||||
|
||||
In `scripts/examples-docs.test.mjs`, inside `it('documents the Postgres historic SQL smoke example', ...)`, add these assertions immediately after the existing `assert.match(readme, /patterns-input\.json/);` line:
|
||||
|
||||
```javascript
|
||||
assert.match(readme, /patterns-input\/part-\*\.json/);
|
||||
assert.match(readme, /full audit input/);
|
||||
assert.match(readme, /bounded pattern WorkUnit shards/);
|
||||
```
|
||||
|
||||
In the same test, add these assertions immediately after the existing `assert.match(smoke, /assert_stage_record "\$UNCHANGED_RECORD" unchanged zero/);` line:
|
||||
|
||||
```javascript
|
||||
assert.match(smoke, /assertPatternShards/);
|
||||
assert.match(smoke, /historic-sql-patterns-part-/);
|
||||
assert.match(smoke, /patterns-input\/part-/);
|
||||
assert.doesNotMatch(smoke, /unitKey === 'historic-sql-patterns'/);
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the example docs test to verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
node --test scripts/examples-docs.test.mjs
|
||||
```
|
||||
|
||||
Expected: FAIL. The test should report missing `patterns-input/part-*.json`, `full audit input`, `bounded pattern WorkUnit shards`, `assertPatternShards`, or it should fail because `smoke.sh` still contains `unitKey === 'historic-sql-patterns'`.
|
||||
|
||||
- [ ] **Step 3: Commit the failing test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git add scripts/examples-docs.test.mjs
|
||||
git commit -m "test: expect historic sql pattern shard smoke docs"
|
||||
```
|
||||
|
||||
### Task 2: Update The Postgres Historic Smoke
|
||||
|
||||
**Files:**
|
||||
- Modify: `examples/postgres-historic/scripts/smoke.sh`
|
||||
- Test: `scripts/examples-docs.test.mjs`
|
||||
|
||||
- [ ] **Step 1: Import `existsSync` in the embedded snapshot assertion**
|
||||
|
||||
In `examples/postgres-historic/scripts/smoke.sh`, inside `assert_unified_snapshot()`, replace this line:
|
||||
|
||||
```javascript
|
||||
const { readFileSync, readdirSync } = require('node:fs');
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```javascript
|
||||
const { existsSync, readFileSync, readdirSync } = require('node:fs');
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add shard validation to `assert_unified_snapshot()`**
|
||||
|
||||
In `examples/postgres-historic/scripts/smoke.sh`, inside the embedded Node script in `assert_unified_snapshot()`, add this function after the `legacyKeys` loop:
|
||||
|
||||
```javascript
|
||||
function assertPatternShards(root) {
|
||||
const shardDir = join(root, 'patterns-input');
|
||||
assert(existsSync(shardDir), 'Expected patterns-input shard directory');
|
||||
const shardFiles = readdirSync(shardDir)
|
||||
.filter((file) => /^part-\d{4}\.json$/.test(file))
|
||||
.sort()
|
||||
.map((file) => `patterns-input/${file}`);
|
||||
assert(shardFiles.length > 0, 'Expected at least one pattern shard file');
|
||||
|
||||
for (const shardFile of shardFiles) {
|
||||
const shard = JSON.parse(readFileSync(join(root, shardFile), 'utf8'));
|
||||
assert(Array.isArray(shard.templates), `${shardFile}: expected templates array`);
|
||||
assert(shard.templates.length > 0, `${shardFile}: expected at least one template`);
|
||||
assert(
|
||||
shard.templates.every((template) => Array.isArray(template.tablesTouched) && template.tablesTouched.length >= 2),
|
||||
`${shardFile}: expected only cross-table pattern candidates`,
|
||||
);
|
||||
}
|
||||
|
||||
return shardFiles;
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Assert the full audit input and bounded shards**
|
||||
|
||||
In the same embedded Node script, replace the current `patterns` block:
|
||||
|
||||
```javascript
|
||||
const patterns = JSON.parse(readFileSync(join(root, 'patterns-input.json'), 'utf8'));
|
||||
assert(Array.isArray(patterns.templates) && patterns.templates.length > 0, 'Expected patterns-input templates');
|
||||
assert(
|
||||
patterns.templates.every((template) => Array.isArray(template.tablesTouched) && template.tablesTouched.length > 0),
|
||||
'Expected every pattern template to have touched tables',
|
||||
);
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```javascript
|
||||
const patterns = JSON.parse(readFileSync(join(root, 'patterns-input.json'), 'utf8'));
|
||||
assert(Array.isArray(patterns.templates) && patterns.templates.length > 0, 'Expected patterns-input audit templates');
|
||||
assert(
|
||||
patterns.templates.every((template) => Array.isArray(template.tablesTouched) && template.tablesTouched.length > 0),
|
||||
'Expected every audit pattern template to have touched tables',
|
||||
);
|
||||
const shardFiles = assertPatternShards(root);
|
||||
assert(
|
||||
shardFiles.length <= patterns.templates.length,
|
||||
`Expected shard count ${shardFiles.length} to be no greater than audit template count ${patterns.templates.length}`,
|
||||
);
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update the stage record WorkUnit assertions**
|
||||
|
||||
In `examples/postgres-historic/scripts/smoke.sh`, inside the embedded Node script in `assert_stage_record()`, replace:
|
||||
|
||||
```javascript
|
||||
assert(record.rawFileCount >= 3, `${label}: expected manifest, patterns input, and at least one table file`);
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```javascript
|
||||
assert(record.rawFileCount >= 4, `${label}: expected manifest, audit patterns input, pattern shard, and at least one table file`);
|
||||
```
|
||||
|
||||
Then replace this nonzero WorkUnit block:
|
||||
|
||||
```javascript
|
||||
} else if (expectedWorkUnits === 'nonzero') {
|
||||
assert(record.workUnitCount > 0, `${label}: expected nonzero WorkUnits`);
|
||||
assert(record.workUnits.some((unit) => unit.unitKey === 'historic-sql-patterns'), `${label}: expected patterns WorkUnit`);
|
||||
assert(record.workUnits.some((unit) => unit.unitKey.startsWith('historic-sql-table-')), `${label}: expected table WorkUnit`);
|
||||
} else {
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```javascript
|
||||
} else if (expectedWorkUnits === 'nonzero') {
|
||||
assert(record.workUnitCount > 0, `${label}: expected nonzero WorkUnits`);
|
||||
const patternUnits = record.workUnits.filter((unit) => /^historic-sql-patterns-part-\d{4}$/.test(unit.unitKey));
|
||||
assert(patternUnits.length > 0, `${label}: expected sharded patterns WorkUnit`);
|
||||
for (const unit of patternUnits) {
|
||||
assert(
|
||||
unit.rawFiles.some((rawFile) => /^patterns-input\/part-\d{4}\.json$/.test(rawFile)),
|
||||
`${label}: expected ${unit.unitKey} to read a pattern shard`,
|
||||
);
|
||||
assert(
|
||||
!unit.rawFiles.includes('patterns-input.json'),
|
||||
`${label}: expected ${unit.unitKey} not to schedule the full audit patterns input`,
|
||||
);
|
||||
}
|
||||
assert(record.workUnits.some((unit) => unit.unitKey.startsWith('historic-sql-table-')), `${label}: expected table WorkUnit`);
|
||||
} else {
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run shell syntax and the docs test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
bash -n examples/postgres-historic/scripts/smoke.sh
|
||||
node --test scripts/examples-docs.test.mjs
|
||||
```
|
||||
|
||||
Expected: `bash -n` exits 0. The docs test still fails until the README files are updated in Task 3.
|
||||
|
||||
- [ ] **Step 6: Commit the smoke update**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git add examples/postgres-historic/scripts/smoke.sh
|
||||
git commit -m "test: assert historic sql pattern shard smoke"
|
||||
```
|
||||
|
||||
### Task 3: Update Example Documentation
|
||||
|
||||
**Files:**
|
||||
- Modify: `examples/postgres-historic/README.md`
|
||||
- Modify: `examples/README.md`
|
||||
- Test: `scripts/examples-docs.test.mjs`
|
||||
|
||||
- [ ] **Step 1: Update the artifact list in the Postgres historic README**
|
||||
|
||||
In `examples/postgres-historic/README.md`, replace this list:
|
||||
|
||||
```markdown
|
||||
- `manifest.json`
|
||||
- `tables/*.json`
|
||||
- `patterns-input.json`
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```markdown
|
||||
- `manifest.json`
|
||||
- `tables/*.json`
|
||||
- `patterns-input.json` as the full audit input
|
||||
- `patterns-input/part-*.json` as bounded pattern WorkUnit shards
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Update the idempotency wording**
|
||||
|
||||
In `examples/postgres-historic/README.md`, replace this paragraph:
|
||||
|
||||
```markdown
|
||||
The smoke also runs the same workload twice and verifies the second stage-only
|
||||
run has `workUnitCount: 0`, which proves unchanged bucketed table and pattern
|
||||
inputs do not schedule LLM work.
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```markdown
|
||||
The smoke also runs the same workload twice and verifies the second stage-only
|
||||
run has `workUnitCount: 0`, which proves unchanged bucketed table inputs and
|
||||
unchanged bounded pattern shards do not schedule LLM work.
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Update the manifest inspection wording**
|
||||
|
||||
In `examples/postgres-historic/README.md`, replace this paragraph:
|
||||
|
||||
```markdown
|
||||
The manifest should have `source: "historic-sql"`, `dialect: "postgres"`,
|
||||
positive `snapshotRowCount`, positive `touchedTableCount`, numeric
|
||||
`parseFailures`, `warnings`, and `probeWarnings`. The same directory should
|
||||
contain `patterns-input.json` and one `tables/*.json` file per touched table.
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```markdown
|
||||
The manifest should have `source: "historic-sql"`, `dialect: "postgres"`,
|
||||
positive `snapshotRowCount`, positive `touchedTableCount`, numeric
|
||||
`parseFailures`, `warnings`, and `probeWarnings`. The same directory should
|
||||
contain `patterns-input.json`, at least one `patterns-input/part-*.json` pattern
|
||||
shard for cross-table candidates, and one `tables/*.json` file per touched
|
||||
table.
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update the examples catalog entry**
|
||||
|
||||
In `examples/README.md`, replace this paragraph:
|
||||
|
||||
```markdown
|
||||
`postgres-historic/` is a manual Docker-backed smoke for Postgres historic-SQL
|
||||
ingest via `pg_stat_statements`. It verifies setup, unified Historic SQL artifacts,
|
||||
managed daemon batch SQL analysis, and no-WorkUnit idempotency for unchanged
|
||||
bucketed table and pattern inputs.
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```markdown
|
||||
`postgres-historic/` is a manual Docker-backed smoke for Postgres historic-SQL
|
||||
ingest via `pg_stat_statements`. It verifies setup, unified Historic SQL artifacts,
|
||||
managed daemon batch SQL analysis, bounded pattern WorkUnit shards, and
|
||||
no-WorkUnit idempotency for unchanged bucketed table inputs and pattern shards.
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run the example docs test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
node --test scripts/examples-docs.test.mjs
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit the docs update**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git add examples/postgres-historic/README.md examples/README.md
|
||||
git commit -m "docs: explain historic sql pattern shards"
|
||||
```
|
||||
|
||||
### Task 4: Verify The Smoke Contract
|
||||
|
||||
**Files:**
|
||||
- Verify: `scripts/examples-docs.test.mjs`
|
||||
- Verify: `examples/postgres-historic/scripts/smoke.sh`
|
||||
- Verify: `examples/postgres-historic/README.md`
|
||||
- Verify: `examples/README.md`
|
||||
|
||||
- [ ] **Step 1: Run focused local checks**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
bash -n examples/postgres-historic/scripts/smoke.sh
|
||||
node --test scripts/examples-docs.test.mjs
|
||||
```
|
||||
|
||||
Expected: both commands pass.
|
||||
|
||||
- [ ] **Step 2: Run the Docker-backed Postgres historic smoke**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
examples/postgres-historic/scripts/smoke.sh
|
||||
```
|
||||
|
||||
Expected: PASS with `Postgres historic SQL smoke passed`. The stage-only records should include pattern WorkUnits with keys like `historic-sql-patterns-part-0001`, each reading `patterns-input/part-0001.json`, and the unchanged run should report `workUnitCount: 0`.
|
||||
|
||||
- [ ] **Step 3: Run the drift grep**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
rg -n "unitKey === 'historic-sql-patterns'|expected patterns WorkUnit|patterns-input\\.json\\` and one \\`tables|unchanged bucketed table and pattern inputs" examples scripts
|
||||
```
|
||||
|
||||
Expected: no matches.
|
||||
|
||||
- [ ] **Step 4: Commit verification metadata if any test-only wording changed**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
Expected: no unstaged files. If a previous step required a wording fix, commit only the touched files:
|
||||
|
||||
```bash
|
||||
git add scripts/examples-docs.test.mjs examples/postgres-historic/scripts/smoke.sh examples/postgres-historic/README.md examples/README.md
|
||||
git commit -m "test: verify historic sql sharded smoke docs"
|
||||
```
|
||||
|
||||
## Self-Review
|
||||
|
||||
**Spec coverage:** This plan follows spec section 5.2's deterministic pattern sharding and preserves section 4.6's full `patterns-input.json` audit artifact. It updates the smoke and docs around the already implemented sharded runtime contract.
|
||||
|
||||
**Placeholder scan:** The plan contains exact file paths, exact snippets, commands, expected outcomes, and commit commands.
|
||||
|
||||
**Type consistency:** The plan uses the implemented runtime names consistently: `patterns-input.json` for the audit file, `patterns-input/part-*.json` for bounded shards, and `historic-sql-patterns-part-0001` style WorkUnit keys for pattern curation.
|
||||
|
||||
Plan complete and saved to `docs/superpowers/plans/2026-05-11-historic-sql-pattern-shard-smoke-docs.md`. Two execution options:
|
||||
|
||||
**1. Subagent-Driven (recommended)** - dispatch a fresh subagent per task, review between tasks, fast iteration
|
||||
|
||||
**2. Inline Execution** - execute tasks in this session using executing-plans, batch execution with checkpoints
|
||||
Loading…
Add table
Add a link
Reference in a new issue