mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
* feat(context): add warehouse dialect dispatch * feat(context): read warehouse scan catalog * feat(context): add entity details verification tool * feat(context): add ingest SQL verification tool * feat(context): add raw warehouse discovery tool * feat(context): expose warehouse verification tools to ingest * docs(context): add ingest identifier verification protocol * test(context): guard ingest identifier verification prompts * chore(context): verify warehouse verification tools * docs: add warehouse verification tools plan and spec * fix(context): expose target warehouses to Notion ingest * fix(context): update ingest prompts for warehouse verification tools * fix(context): scope raw schema discovery to allowed connections * fix(context): verify warehouse column display targets * docs: add notion warehouse verification gap closure plan * fix(context): include raw discovery connection names * fix(context): expose warehouse targets for LookML and MetricFlow * fix(context): pass connection config to ingest query executors * fix(cli): enable read-only SQL probes for local ingest * docs: add warehouse verification final v1 closure plan * fix(context): align warehouse sql probe prompt shape * docs: add warehouse verification prompt shape closure plan * test(context): catch connectionless sql execution prompt examples * fix(context): include connection name in sl capture sql example * docs: add warehouse verification sql example closure plan * fix(context): report structured entity detail misses * docs: add warehouse verification structured target miss closure plan * fix: report untracked squash merge conflicts * feat: require ingest verification ledger * fix: stabilize ingest wiki references
4.2 KiB
4.2 KiB
| name | description | callers | |
|---|---|---|---|
| historic_sql_patterns | Identify recurring cross-table historic-SQL analytical intents from a bounded pattern shard and emit typed pattern evidence for deterministic wiki projection. |
|
Historic SQL Patterns
Use this skill when the WorkUnit raw file is a patterns-input/part-0001.json style shard from the historic-sql adapter. Older staged bundles may still provide root patterns-input.json; when that is the WorkUnit raw file, read it the same way.
Required Workflow
- Read the WorkUnit notes first.
- Find the single pattern input file listed under the WorkUnit
rawFilessection. - Call
read_raw_filefor that exact raw file path. - Identify recurring analytical intents that span at least two tables and have repeated usage signal.
- Emit one
patternevidence object per durable cross-table intent by callingemit_historic_sql_evidence. - Set each evidence object's
rawPathto the exact raw file path read in step 3. - Stop after all pattern evidence has been emitted.
Every join column mentioned in pattern descriptions must be verified via entity_details for both sides of the join.
Identifier Verification Protocol
Before writing a wiki page or SL source on any topic:
discover_data({query: "<topic>"})- see what wikis, SL sources, and raw tables already exist. Prefer updating existing pages over creating new ones.
Before emitting any schema.table or schema.table.column into a wiki body,
SL source, tables: frontmatter, sl_refs, or emit_unmapped_fallback:
entity_details({connectionName, targets: [{display: "<identifier>"}]})- confirm the identifier resolves; inspect native types, FK/PK, and sampleValues.- For literal values from the source, such as status codes or plan tiers,
check whether they appear in
entity_detailssampleValues for the relevant column. If sampleValues is short or the sample may have missed real values, run asql_executionprobe with the same warehouse connection name:sql_execution({connectionName, sql: "SELECT DISTINCT <col> FROM <ref> LIMIT 50"}). - If the candidate identifier still does not resolve, do one of:
- Use
sql_execution({connectionName, sql: "SELECT 1 FROM <ref> LIMIT 0"}). If it errors, the identifier is fictional. - Wrap the identifier in
[unverified - from <rawPath>]in the wiki body, citing the exact raw path that mentioned it. - When recording
emit_unmapped_fallbackwithno_physical_table, include the failing probe error inclarification.
- Use
- Never copy
<schema>.<table>placeholder strings from these instructions into output.
Evidence Shape
Each call to emit_historic_sql_evidence must use this shape:
{
"kind": "pattern",
"rawPath": "patterns-input/part-0001.json",
"pattern": {
"slug": "order-lifecycle-analysis",
"title": "Order Lifecycle Analysis",
"narrative": "Analysts compare order statuses with customer segments to understand lifecycle movement.",
"definitionSql": "select o.status, count(*) from public.orders o join public.customers c on c.id = o.customer_id group by o.status",
"tablesInvolved": ["public.orders", "public.customers"],
"slRefs": ["orders", "customers"],
"constituentTemplateIds": ["pg:1", "pg:2"]
}
}
The pattern object must match patternOutputSchema; multiple calls together must form patternsArraySchema.
Pattern Selection Rules
- Prefer patterns that involve two or more tables.
- Prefer templates with
executionsBucketat least10-100anddistinctUsersBucketabove solo usage. - Merge templates into one pattern only when the business intent is the same.
- Use a stable kebab-case slug based on intent, not a template id.
- Set
definitionSqlto the clearest representative SQL from a constituent template. - Set
slRefsto source names when the source name is obvious from table names; omit uncertain refs rather than guessing. - Treat each pattern shard independently; do not read peer shard files from
peerFileIndex.
Boundaries
- Do not call wiki_write.
- Do not call sl_write_source.
- Do not call sl_edit_source.
- Do not call context_candidate_write.
- Do not create single-table pattern pages.
- Do not copy credentials, tokens, user emails, or unredacted literals into evidence.