From 56b20c44cfe16fbc6eb6a173205e31470578dfbc Mon Sep 17 00:00:00 2001
From: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
Date: Mon, 11 May 2026 20:31:35 +0200
Subject: [PATCH] docs: explain historic sql pattern shards

---
 examples/README.md                   |  4 ++--
 examples/postgres-historic/README.md | 11 +++++++----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/examples/README.md b/examples/README.md
index 7b463bef..2a3ed818 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -31,8 +31,8 @@ warehouse credential.
 
 `postgres-historic/` is a manual Docker-backed smoke for Postgres historic-SQL
 ingest via `pg_stat_statements`. It verifies setup, unified Historic SQL artifacts,
-managed daemon batch SQL analysis, and no-WorkUnit idempotency for unchanged
-bucketed table and pattern inputs.
+managed daemon batch SQL analysis, bounded pattern WorkUnit shards, and
+no-WorkUnit idempotency for unchanged bucketed table inputs and pattern shards.
 
 ## package-artifacts
 
diff --git a/examples/postgres-historic/README.md b/examples/postgres-historic/README.md
index 9e5e6ad4..b235c93f 100644
--- a/examples/postgres-historic/README.md
+++ b/examples/postgres-historic/README.md
@@ -7,11 +7,12 @@ preloaded, generates query workload under separate users, runs `ktx setup` with
 
 - `manifest.json`
 - `tables/*.json`
-- `patterns-input.json`
+- `patterns-input.json` as the full audit input
+- `patterns-input/part-*.json` as bounded pattern WorkUnit shards
 
 The smoke also runs the same workload twice and verifies the second stage-only
-run has `workUnitCount: 0`, which proves unchanged bucketed table and pattern
-inputs do not schedule LLM work.
+run has `workUnitCount: 0`, which proves unchanged bucketed table inputs and
+unchanged bounded pattern shards do not schedule LLM work.
 
 ## Prerequisites
 
@@ -114,7 +115,9 @@ find /tmp/ktx-postgres-historic/raw-sources/warehouse/historic-sql -name manifes
 The manifest should have `source: "historic-sql"`, `dialect: "postgres"`,
 positive `snapshotRowCount`, positive `touchedTableCount`, numeric
 `parseFailures`, `warnings`, and `probeWarnings`. The same directory should
-contain `patterns-input.json` and one `tables/*.json` file per touched table.
+contain `patterns-input.json`, at least one `patterns-input/part-*.json` pattern
+shard for cross-table candidates, and one `tables/*.json` file per touched
+table.
 
 ## Troubleshooting
 