mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
Fix historic SQL ingest setup and progress
parent
f3f6b36551
commit
1bd29c7eb1
14 changed files with 877 additions and 34 deletions
|
|
@ -0,0 +1,459 @@
|
|||
# External Hosted Postgres Discovery Manual Test Plan
|
||||
|
||||
This plan tests KTX from the point of view of a new external user who discovers
|
||||
the public CLI and connects the hosted Kaelio demo Postgres database as the
|
||||
source. It starts with the credential-free seeded demo, then creates a real KTX
|
||||
project that reads from `start.kaelio.com`.
|
||||
|
||||
The plan avoids writing the database password into this repository. Keep the
|
||||
password in a local environment variable and configure KTX with
|
||||
`env:KTX_DEMO_DATABASE_URL`.
|
||||
|
||||
## Scope
|
||||
|
||||
Use this plan when the goal is to test KTX as an external user with the hosted
|
||||
demo database. The commands use the published package shape through
|
||||
`npx @kaelio/ktx`. If you are testing from this repository, you can replace
|
||||
`npx @kaelio/ktx` with the local `ktx` alias.
|
||||
|
||||
The required checks cover:
|
||||
|
||||
- Running the packaged seeded demo without credentials.
|
||||
- Creating a new project that points to the hosted Postgres demo source.
|
||||
- Verifying the connection through the public CLI.
|
||||
- Running public ingest against the hosted database.
|
||||
- Searching semantic-layer sources through `agent sl list --query`.
|
||||
- Running the Postgres historic-SQL readiness doctor.
|
||||
- Running the historic-SQL adapter when the demo database exposes query
|
||||
history and local LLM configuration is available.
|
||||
- Searching generated historic-SQL usage and pattern pages when historic-SQL
|
||||
ingest runs.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Prepare a clean terminal before starting. The required path needs Node and
|
||||
network access to `start.kaelio.com:5432`. The optional historic-SQL ingest path
|
||||
also needs `uv` and an LLM provider configured for KTX.
|
||||
|
||||
1. Confirm Node 22 or newer is available:
|
||||
|
||||
```bash
|
||||
node --version
|
||||
```
|
||||
|
||||
Expected: the version is `v22` or newer.
|
||||
|
||||
2. Confirm the hosted Postgres endpoint is reachable from your network:
|
||||
|
||||
```bash
|
||||
nc -vz start.kaelio.com 5432
|
||||
```
|
||||
|
||||
Expected: the command reports that the TCP connection succeeds. If `nc` is
|
||||
unavailable, continue and let `ktx connection test` perform the real check.
|
||||
|
||||
3. Create an isolated test parent:
|
||||
|
||||
```bash
|
||||
export KTX_EXTERNAL_PARENT="$(mktemp -d)"
|
||||
export KTX_SEEDED_PROJECT="$KTX_EXTERNAL_PARENT/seeded-demo"
|
||||
export KTX_HOSTED_PROJECT="$KTX_EXTERNAL_PARENT/hosted-postgres"
|
||||
export KTX_RUNTIME_ROOT="$KTX_EXTERNAL_PARENT/managed-runtime"
|
||||
```
|
||||
|
||||
Expected: every file created by this test stays under
|
||||
`$KTX_EXTERNAL_PARENT`.
|
||||
|
||||
4. Set the hosted database URL without committing the password:
|
||||
|
||||
```bash
|
||||
read -rsp "Demo database password: " KTX_DEMO_DB_PASSWORD
|
||||
printf '\n'
|
||||
export KTX_DEMO_DATABASE_URL="postgresql://kaelio_demo:${KTX_DEMO_DB_PASSWORD}"
|
||||
export KTX_DEMO_DATABASE_URL="${KTX_DEMO_DATABASE_URL}@start.kaelio.com:5432/demo?sslmode=prefer"
|
||||
unset KTX_DEMO_DB_PASSWORD
|
||||
```
|
||||
|
||||
Expected: `KTX_DEMO_DATABASE_URL` is set only in your shell. The project
|
||||
config will store `env:KTX_DEMO_DATABASE_URL`, not the literal URL.
|
||||
|
||||
The hosted demo endpoint uses libpq-style `sslmode=prefer`, which means
|
||||
"try SSL, then fall back to non-SSL." KTX handles this mode explicitly for
|
||||
the Node Postgres connector so the setup check can connect to the hosted
|
||||
demo database.
|
||||
|
||||
5. Verify the required shell variables before running any `ktx` commands:
|
||||
|
||||
```bash
|
||||
: "${KTX_EXTERNAL_PARENT:?Run prerequisite step 3 in this shell first}"
|
||||
: "${KTX_SEEDED_PROJECT:?Run prerequisite step 3 in this shell first}"
|
||||
: "${KTX_HOSTED_PROJECT:?Run prerequisite step 3 in this shell first}"
|
||||
: "${KTX_RUNTIME_ROOT:?Run prerequisite step 3 in this shell first}"
|
||||
: "${KTX_DEMO_DATABASE_URL:?Run prerequisite step 4 in this shell first}"
|
||||
```
|
||||
|
||||
Expected: the command prints nothing and exits zero. If it prints a shell
|
||||
error, rerun the referenced prerequisite in the same terminal before
|
||||
continuing.
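Because step 4 builds a URL that embeds the password, it can help to inspect the value without printing the secret. The following is a minimal sketch, not part of the KTX CLI: `redact_db_url` and `db_url_sslmode` are hypothetical helper names, and the redaction assumes the password contains no unencoded `@`.

```bash
# Hypothetical helpers for inspecting KTX_DEMO_DATABASE_URL safely.
# Neither function is provided by KTX.

# Print a Postgres URL with the password replaced by ***.
# Assumes the password contains no unencoded "@".
redact_db_url() {
  printf '%s\n' "$1" | sed -E 's#(://[^:/@]+):[^@]*@#\1:***@#'
}

# Print the value of the sslmode query parameter, or "unset" when absent.
db_url_sslmode() {
  case "$1" in
    *'?sslmode='*|*'&sslmode='*)
      printf '%s\n' "$1" | sed -E 's/.*[?&]sslmode=([^&]*).*/\1/'
      ;;
    *)
      printf 'unset\n'
      ;;
  esac
}
```

For example, `redact_db_url "$KTX_DEMO_DATABASE_URL"` shows the host, port, and database while masking the password, and `db_url_sslmode "$KTX_DEMO_DATABASE_URL"` should print `prefer` for the URL built in step 4.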
|
||||
|
||||
## Step 1: Run the packaged seeded demo
|
||||
|
||||
Start with the shortest public path. The seeded demo uses packaged data and
|
||||
prebuilt context, so it must not ask for an LLM key.
|
||||
|
||||
1. Run the seeded demo:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx setup demo \
|
||||
--project-dir "$KTX_SEEDED_PROJECT" \
|
||||
--plain \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: output includes `Mode: seeded`, `Source: packaged demo project`,
|
||||
and `LLM calls: none`.
|
||||
|
||||
2. Inspect the seeded demo:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx setup demo inspect \
|
||||
--project-dir "$KTX_SEEDED_PROJECT" \
|
||||
--json > "$KTX_EXTERNAL_PARENT/seeded-inspect.json"
|
||||
```
|
||||
|
||||
Expected: the JSON reports seeded mode, semantic-layer sources, knowledge
|
||||
pages, and `reports/seeded-demo-report.json`.
|
||||
|
||||
3. Search seeded semantic-layer sources:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx agent sl list \
|
||||
--project-dir "$KTX_SEEDED_PROJECT" \
|
||||
--json \
|
||||
--query "revenue" \
|
||||
> "$KTX_EXTERNAL_PARENT/seeded-sl-search.json"
|
||||
```
|
||||
|
||||
Expected: the command exits zero and returns at least one source with a
|
||||
numeric `score`.
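The expected strings in this step can be checked mechanically if the command output is first captured to a file. The sketch below assumes you piped the seeded demo output through `tee` into a log file; `assert_log_contains` is a hypothetical helper, and the exact wording of the KTX output is taken from the expectations above, not verified against the CLI.

```bash
# Hypothetical helper: fail when a captured log is missing an expected marker.
# $1 is a log file; the remaining arguments are literal strings to look for.
assert_log_contains() {
  log_file="$1"
  shift
  for marker in "$@"; do
    # -F treats the marker as a literal string, not a regular expression.
    if ! grep -F -q -- "$marker" "$log_file"; then
      echo "FAIL: missing marker: $marker" >&2
      return 1
    fi
  done
  echo "ok: all markers found in $log_file"
}
```

A possible invocation, assuming the log path is your own choice: `assert_log_contains "$KTX_EXTERNAL_PARENT/seeded-demo.log" "Mode: seeded" "Source: packaged demo project" "LLM calls: none"`.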
|
||||
|
||||
## Step 2: Create a hosted Postgres project
|
||||
|
||||
Create a new KTX project that uses the hosted demo database as the warehouse
|
||||
source. This step enables historic SQL in the config, but it does not require
|
||||
LLM credentials yet.
|
||||
|
||||
If an earlier setup attempt failed after creating `$KTX_HOSTED_PROJECT/ktx.yaml`,
|
||||
start a fresh test project before rerunning the `--new` command:
|
||||
|
||||
```bash
|
||||
export KTX_HOSTED_PROJECT="$KTX_EXTERNAL_PARENT/hosted-postgres-retry"
|
||||
```
|
||||
|
||||
1. Create the project and connection:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx setup \
|
||||
--project-dir "${KTX_HOSTED_PROJECT:?Run prerequisite step 3 first}" \
|
||||
--new \
|
||||
--skip-llm \
|
||||
--skip-embeddings \
|
||||
--skip-sources \
|
||||
--skip-agents \
|
||||
--database postgres \
|
||||
--new-database-connection-id warehouse \
|
||||
--database-url env:KTX_DEMO_DATABASE_URL \
|
||||
--database-schema public \
|
||||
--enable-historic-sql \
|
||||
--historic-sql-min-executions 2 \
|
||||
--yes \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: `$KTX_HOSTED_PROJECT/ktx.yaml` exists and contains a `warehouse`
|
||||
Postgres connection whose URL is `env:KTX_DEMO_DATABASE_URL`.
|
||||
|
||||
2. Confirm the literal database URL, and with it the password, was not written to disk:
|
||||
|
||||
```bash
|
||||
grep -R "start.kaelio.com:5432/demo" "$KTX_HOSTED_PROJECT" || true
|
||||
```
|
||||
|
||||
Expected: no matches are printed.
|
||||
|
||||
3. Inspect the generated connection config:
|
||||
|
||||
```bash
|
||||
sed -n '1,120p' "$KTX_HOSTED_PROJECT/ktx.yaml"
|
||||
```
|
||||
|
||||
Expected: the `warehouse` connection has `driver: postgres`,
|
||||
`url: env:KTX_DEMO_DATABASE_URL` or an equivalent URL reference, and
|
||||
`historicSql.enabled: true`.
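The `grep ... || true` check in this step always exits zero, so a leak only shows up if someone reads the output. A stricter variant can return a failing status instead. This is a sketch under that assumption; `assert_no_literal` is a hypothetical helper name, and it deliberately avoids echoing the searched string in case it is sensitive.

```bash
# Hypothetical helper: fail when a literal string appears anywhere under a
# directory. $1 is the directory to search, $2 is the literal string.
assert_no_literal() {
  search_dir="$1"
  literal="$2"
  # -R searches recursively, -F matches the string literally, -q is silent.
  if grep -R -F -q -- "$literal" "$search_dir"; then
    echo "FAIL: literal found under $search_dir" >&2
    return 1
  fi
  echo "ok: literal not found under $search_dir"
}
```

For this plan, the call would be `assert_no_literal "$KTX_HOSTED_PROJECT" "start.kaelio.com:5432/demo"`.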
|
||||
|
||||
## Step 3: Test the hosted connection
|
||||
|
||||
Run the public connection check before ingest. This verifies that the external
|
||||
user can reach and introspect the hosted source.
|
||||
|
||||
1. Test the connection:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx connection test warehouse \
|
||||
--project-dir "$KTX_HOSTED_PROJECT"
|
||||
```
|
||||
|
||||
Expected: output includes `Driver: postgres` and a positive table count.
|
||||
|
||||
2. List configured connections:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx connection list \
|
||||
--project-dir "$KTX_HOSTED_PROJECT"
|
||||
```
|
||||
|
||||
Expected: output includes the `warehouse` connection.
|
||||
|
||||
## Step 4: Run public ingest
|
||||
|
||||
Run the public ingest command. For warehouse connections, this performs the
|
||||
database scan path and writes local context files that agent search can use.
|
||||
|
||||
1. Run ingest:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx ingest warehouse \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: output reports that ingest finished and that the `scan` step is
|
||||
`done`.
|
||||
|
||||
2. Inspect the latest public ingest status:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx ingest status \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: the status references the hosted `warehouse` source and a
|
||||
completed scan.
|
||||
|
||||
3. Confirm semantic-layer files exist:
|
||||
|
||||
```bash
|
||||
find "$KTX_HOSTED_PROJECT/semantic-layer/warehouse" \
|
||||
-name '*.yaml' -print | head
|
||||
```
|
||||
|
||||
Expected: at least one semantic-layer YAML file is printed.
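The `find ... | head` check above is visual. If you want a pass or fail signal, a count works better. The sketch below is illustrative only; `count_yaml_files` is a hypothetical helper, and it treats a missing directory as zero files so that a skipped ingest is reported as a failure, not a shell error.

```bash
# Hypothetical helper: print the number of *.yaml files under a directory.
# A missing directory counts as zero files.
count_yaml_files() {
  if [ ! -d "$1" ]; then
    echo 0
    return 0
  fi
  # wc -l may pad its output with spaces on some systems, so strip whitespace.
  find "$1" -type f -name '*.yaml' | wc -l | tr -d '[:space:]'
}
```

A possible assertion: `[ "$(count_yaml_files "$KTX_HOSTED_PROJECT/semantic-layer/warehouse")" -gt 0 ] && echo "semantic-layer files ok"`.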
|
||||
|
||||
## Step 5: Search the hosted database context
|
||||
|
||||
Use the agent-facing semantic-layer search command after ingest. This validates
|
||||
the discovery path that agents use for database analysis.
|
||||
|
||||
1. Run semantic-layer search:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx agent sl list \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--connection-id warehouse \
|
||||
--json \
|
||||
--query "orders revenue customers" \
|
||||
> "$KTX_EXTERNAL_PARENT/hosted-sl-search.json"
|
||||
```
|
||||
|
||||
Expected: the command exits zero.
|
||||
|
||||
2. Validate search metadata:
|
||||
|
||||
```bash
|
||||
node - "$KTX_EXTERNAL_PARENT/hosted-sl-search.json" <<'NODE'
|
||||
const { readFileSync } = require('node:fs');
|
||||
const result = JSON.parse(readFileSync(process.argv[2], 'utf8'));
|
||||
const assert = (ok, message) => {
|
||||
if (!ok) throw new Error(message);
|
||||
};
|
||||
assert(Array.isArray(result.sources), 'sources missing');
|
||||
assert(result.sources.length > 0, 'no semantic-layer hits');
|
||||
assert(Number.isFinite(result.sources[0].score), 'score missing');
|
||||
console.log('hosted semantic-layer search ok');
|
||||
NODE
|
||||
```
|
||||
|
||||
Expected: the script prints `hosted semantic-layer search ok`.
|
||||
|
||||
3. Read the top source:
|
||||
|
||||
```bash
|
||||
node - "$KTX_EXTERNAL_PARENT/hosted-sl-search.json" \
|
||||
> "$KTX_EXTERNAL_PARENT/hosted-top-source-name.txt" <<'NODE'
|
||||
const { readFileSync } = require('node:fs');
|
||||
const result = JSON.parse(readFileSync(process.argv[2], 'utf8'));
|
||||
process.stdout.write(result.sources[0].name);
|
||||
NODE
|
||||
|
||||
npx @kaelio/ktx agent sl read \
|
||||
"$(cat "$KTX_EXTERNAL_PARENT/hosted-top-source-name.txt")" \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--connection-id warehouse \
|
||||
--json \
|
||||
> "$KTX_EXTERNAL_PARENT/hosted-sl-read.json"
|
||||
```
|
||||
|
||||
Expected: the JSON includes the full semantic-layer source.
|
||||
|
||||
## Step 6: Check historic-SQL readiness
|
||||
|
||||
Run the Postgres historic-SQL doctor. It reports whether the hosted demo
database exposes the query-history prerequisites that the redesigned
historic-SQL adapter needs.
|
||||
|
||||
1. Run doctor:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx dev doctor \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: output includes a `Postgres Historic SQL (warehouse)` check.
|
||||
|
||||
2. Interpret the result:
|
||||
|
||||
- `PASS` means the hosted source is ready for the optional historic-SQL
|
||||
ingest path.
|
||||
- `WARN` or `FAIL` means the external discovery test still covers scan and
|
||||
semantic-layer search, but historic-SQL query-history ingestion is blocked
|
||||
by database permissions or configuration.
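The interpretation rules above amount to a small decision table. The sketch below encodes it, assuming you read the status of the `Postgres Historic SQL (warehouse)` check from the doctor output yourself and pass it in as an argument. `historic_sql_next_step` and the values it prints are hypothetical, and the sketch does not parse doctor output.

```bash
# Hypothetical helper: map a doctor status to the next action in this plan.
# $1 is the status you read from the doctor output: PASS, WARN, or FAIL.
historic_sql_next_step() {
  case "$1" in
    PASS)
      echo "run-optional-ingest"
      ;;
    WARN|FAIL)
      # Scan and semantic-layer search are still covered; record Step 7 as blocked.
      echo "skip-optional-ingest"
      ;;
    *)
      echo "unknown-status"
      return 2
      ;;
  esac
}
```

For example, `historic_sql_next_step WARN` prints `skip-optional-ingest`, which corresponds to recording the historic-SQL ingest area as blocked in Step 8.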
|
||||
|
||||
## Step 7: Optional historic-SQL ingest
|
||||
|
||||
Run this section only when the doctor passes and the KTX project has an LLM
|
||||
provider configured. Historic-SQL table and pattern curation uses LLM-backed
|
||||
skills, so this path is not credential-free.
|
||||
|
||||
1. Configure LLM and embeddings if you skipped them during setup:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx setup \
|
||||
--project-dir "$KTX_HOSTED_PROJECT"
|
||||
```
|
||||
|
||||
Expected: `npx @kaelio/ktx setup status --project-dir "$KTX_HOSTED_PROJECT"`
|
||||
reports that LLM and embedding setup are ready.
|
||||
|
||||
2. Run historic-SQL ingest:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx dev ingest run \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--connection-id warehouse \
|
||||
--adapter historic-sql \
|
||||
--plain \
|
||||
--yes \
|
||||
--no-input
|
||||
```
|
||||
|
||||
Expected: the command exits zero and, when the database has qualifying query
history, schedules WorkUnits whose keys start with `historic-sql-table-` and
`historic-sql-patterns-`.
|
||||
|
||||
3. Locate the latest historic-SQL manifest:
|
||||
|
||||
```bash
|
||||
find "$KTX_HOSTED_PROJECT/raw-sources/warehouse/historic-sql" \
|
||||
-name manifest.json -print | sort | tail -n 1
|
||||
```
|
||||
|
||||
Expected: a manifest path is printed.
|
||||
|
||||
4. Search for generated usage:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx agent sl list \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--connection-id warehouse \
|
||||
--json \
|
||||
--query "common filters joins usage" \
|
||||
> "$KTX_EXTERNAL_PARENT/historic-sl-search.json"
|
||||
```
|
||||
|
||||
Expected: hits produced from historic-SQL usage include `score`, and hits
|
||||
with projected usage include `frequencyTier` and `snippet`.
|
||||
|
||||
5. Search for generated pattern pages:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx agent wiki search "historic sql pattern" \
|
||||
--project-dir "$KTX_HOSTED_PROJECT" \
|
||||
--json \
|
||||
--limit 10 \
|
||||
> "$KTX_EXTERNAL_PARENT/historic-wiki-search.json"
|
||||
```
|
||||
|
||||
Expected: results include pages whose keys start with `historic-sql/` when
|
||||
the run produced cross-table patterns.
|
||||
|
||||
## Step 8: Record results
|
||||
|
||||
Capture the result in a way that separates the external discovery path from the
|
||||
optional historic-SQL path.
|
||||
|
||||
1. Save useful outputs:
|
||||
|
||||
```bash
|
||||
mkdir -p "$KTX_EXTERNAL_PARENT/results"
|
||||
cp "$KTX_EXTERNAL_PARENT/seeded-inspect.json" \
|
||||
"$KTX_EXTERNAL_PARENT/results/" 2>/dev/null || true
|
||||
cp "$KTX_EXTERNAL_PARENT/hosted-sl-search.json" \
|
||||
"$KTX_EXTERNAL_PARENT/results/" 2>/dev/null || true
|
||||
cp "$KTX_EXTERNAL_PARENT/hosted-sl-read.json" \
|
||||
"$KTX_EXTERNAL_PARENT/results/" 2>/dev/null || true
|
||||
cp "$KTX_EXTERNAL_PARENT/historic-sl-search.json" \
|
||||
"$KTX_EXTERNAL_PARENT/results/" 2>/dev/null || true
|
||||
cp "$KTX_EXTERNAL_PARENT/historic-wiki-search.json" \
|
||||
"$KTX_EXTERNAL_PARENT/results/" 2>/dev/null || true
|
||||
```
|
||||
|
||||
Expected: the results directory contains the JSON outputs created during the
|
||||
run.
|
||||
|
||||
2. Mark these areas as pass, fail, or blocked:
|
||||
|
||||
- Public package discovery through `npx @kaelio/ktx`.
|
||||
- Seeded demo without credentials.
|
||||
- Hosted Postgres project setup.
|
||||
- Hosted Postgres connection test.
|
||||
- Public ingest scan.
|
||||
- Semantic-layer search and read.
|
||||
- Historic-SQL doctor.
|
||||
- Historic-SQL ingest, if doctor and LLM setup allow it.
|
||||
- Historic-SQL usage search, if ingest ran.
|
||||
- Historic-SQL wiki pattern search, if ingest ran.
|
||||
|
||||
Expected: every required external discovery area passes. Historic-SQL ingest
|
||||
is pass, fail, or blocked based on the doctor result and local LLM
|
||||
configuration.
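The `cp ... || true` commands in this step silently skip files that were never created, which makes it hard to tell which optional paths ran. One way to make that visible is sketched below. `collect_results` is a hypothetical helper; the file names are the ones written earlier in this plan.

```bash
# Hypothetical helper: copy known result files into "$1/results" and report
# which ones were present. $1 is the test parent directory.
collect_results() {
  parent="$1"
  mkdir -p "$parent/results" || return 1
  for name in seeded-inspect.json hosted-sl-search.json hosted-sl-read.json \
    historic-sl-search.json historic-wiki-search.json; do
    if [ -f "$parent/$name" ]; then
      cp "$parent/$name" "$parent/results/" && echo "copied: $name"
    else
      # Missing files are expected when the optional historic-SQL path was skipped.
      echo "missing: $name"
    fi
  done
}
```

Running `collect_results "$KTX_EXTERNAL_PARENT"` prints one line per file, so the `missing:` lines can be matched against the areas you mark as blocked.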
|
||||
|
||||
## Cleanup
|
||||
|
||||
Remove the disposable projects after collecting results. Keep them only when you
need the files for debugging.
|
||||
|
||||
1. Stop the managed runtime:
|
||||
|
||||
```bash
|
||||
npx @kaelio/ktx runtime stop || true
|
||||
```
|
||||
|
||||
2. Remove the test parent:
|
||||
|
||||
```bash
|
||||
rm -rf "$KTX_EXTERNAL_PARENT"
|
||||
```
|
||||
|
||||
Expected: temporary projects and runtime files are removed.
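Because `rm -rf "$KTX_EXTERNAL_PARENT"` runs with whatever the variable holds, a new shell where the variable is unset or wrong is a risk. A guarded variant is sketched below. `safe_remove_dir` is a hypothetical helper, and its checks are a minimal example, not a complete safety net.

```bash
# Hypothetical helper: remove a directory only when the path is non-empty,
# is not the filesystem root, and actually exists as a directory.
safe_remove_dir() {
  target="$1"
  if [ -z "$target" ] || [ "$target" = "/" ]; then
    echo "refusing to remove: empty or root path" >&2
    return 1
  fi
  if [ ! -d "$target" ]; then
    echo "nothing to remove: $target"
    return 0
  fi
  rm -rf -- "$target"
  echo "removed: $target"
}
```

In this plan the call would be `safe_remove_dir "$KTX_EXTERNAL_PARENT"`, which fails loudly if the variable is empty.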
|
||||
|
|
@ -92,7 +92,7 @@ export function registerIngestCommands(
|
|||
sourceDir: options.sourceDir ? resolve(options.sourceDir) : undefined,
|
||||
databaseIntrospectionUrl: options.databaseIntrospectionUrl || undefined,
|
||||
cliVersion: context.packageInfo.version,
|
||||
- runtimeInstallPolicy: runtimeInstallPolicyFromFlags(options),
+ runtimeInstallPolicy: runtimeInstallPolicyFromFlags({ yes: options.yes }),
|
||||
...(options.debugLlmRequestFile ? { debugLlmRequestFile: resolve(options.debugLlmRequestFile) } : {}),
|
||||
outputMode: outputMode(options),
|
||||
...inputMode(options),
|
||||
|
|
|
|||
|
|
@ -920,7 +920,7 @@ describe('runKtxCli', () => {
|
|||
sourceDir: tempDir,
|
||||
databaseIntrospectionUrl: undefined,
|
||||
cliVersion: '0.0.0-private',
|
||||
- runtimeInstallPolicy: 'never',
+ runtimeInstallPolicy: 'prompt',
|
||||
debugLlmRequestFile: `${tempDir}/debug.jsonl`,
|
||||
outputMode: 'json',
|
||||
inputMode: 'disabled',
|
||||
|
|
@ -934,9 +934,9 @@ describe('runKtxCli', () => {
|
|||
expect(ingestReplayHelpIo.stderr()).toBe('');
|
||||
});
|
||||
|
||||
- it('routes ingest managed runtime install policies', async () => {
+ it('routes ingest managed runtime install policy separately from visualization input mode', async () => {
|
||||
const autoIo = makeIo();
|
||||
- const conflictIo = makeIo();
+ const nonInteractiveIo = makeIo();
|
||||
const ingest = vi.fn(async () => 0);
|
||||
|
||||
await expect(
|
||||
|
|
@ -972,10 +972,10 @@ describe('runKtxCli', () => {
|
|||
'--yes',
|
||||
'--no-input',
|
||||
],
|
||||
- conflictIo.io,
+ nonInteractiveIo.io,
|
||||
{ ingest },
|
||||
),
|
||||
- ).resolves.toBe(1);
+ ).resolves.toBe(0);
|
||||
|
||||
expect(ingest).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
|
|
@ -985,7 +985,16 @@ describe('runKtxCli', () => {
|
|||
}),
|
||||
autoIo.io,
|
||||
);
|
||||
- expect(conflictIo.stderr()).toContain('Choose only one runtime install mode: --yes or --no-input');
+ expect(ingest).toHaveBeenCalledWith(
+   expect.objectContaining({
+     command: 'run',
+     cliVersion: '0.0.0-private',
+     runtimeInstallPolicy: 'auto',
+     inputMode: 'disabled',
+   }),
+   nonInteractiveIo.io,
+ );
+ expect(nonInteractiveIo.stderr()).toBe('');
|
||||
});
|
||||
|
||||
it('dispatches public connection through the existing connection implementation', async () => {
|
||||
|
|
|
|||
|
|
@ -304,7 +304,7 @@ describe('runKtxIngest viz and replay', () => {
|
|||
expect(io.stdout()).toContain('KTX memory flow warehouse/fake done');
|
||||
});
|
||||
|
||||
- it('does not attach a live memory-flow sink for plain run output', async () => {
+ it('attaches a plain progress memory-flow sink for interactive plain run output', async () => {
|
||||
const projectDir = join(tempDir, 'project');
|
||||
await writeWarehouseConfig(projectDir);
|
||||
const sourceDir = join(tempDir, 'source');
|
||||
|
|
@ -329,7 +329,8 @@ describe('runKtxIngest viz and replay', () => {
|
|||
),
|
||||
).resolves.toBe(0);
|
||||
|
||||
- expect(runLocal).toHaveBeenCalledWith(expect.not.objectContaining({ memoryFlow: expect.anything() }));
+ expect(runLocal).toHaveBeenCalledWith(expect.objectContaining({ memoryFlow: expect.anything() }));
+ expect(io.stdout()).toContain('[5%] Fetching source files for warehouse/fake');
|
||||
expect(io.stdout()).toContain('Job: plain-run');
|
||||
expect(io.stdout()).not.toContain('KTX memory flow');
|
||||
});
|
||||
|
|
@ -403,7 +404,8 @@ describe('runKtxIngest viz and replay', () => {
|
|||
).resolves.toBe(0);
|
||||
|
||||
expect(startLiveMemoryFlow).not.toHaveBeenCalled();
|
||||
- expect(runLocal).toHaveBeenCalledWith(expect.not.objectContaining({ memoryFlow: expect.anything() }));
+ expect(runLocal).toHaveBeenCalledWith(expect.objectContaining({ memoryFlow: expect.anything() }));
+ expect(io.stdout()).toContain('[5%] Fetching source files for warehouse/fake');
|
||||
expect(io.stdout()).toContain('Job: raw-missing-viz-run');
|
||||
expect(io.stdout()).not.toContain('KTX memory flow');
|
||||
expect(io.stderr()).toContain(
|
||||
|
|
|
|||
|
|
@ -762,6 +762,103 @@ describe('runKtxIngest', () => {
|
|||
);
|
||||
});
|
||||
|
||||
it('prints live progress for plain local ingest in interactive terminals', async () => {
|
||||
const projectDir = join(tempDir, 'historic-sql-progress-project');
|
||||
await mkdir(projectDir, { recursive: true });
|
||||
await writeFile(
|
||||
join(projectDir, 'ktx.yaml'),
|
||||
[
|
||||
'project: historic-sql-progress-project',
'connections:',
'  warehouse:',
'    driver: postgres',
'    url: env:WAREHOUSE_DATABASE_URL',
'    historicSql:',
'      enabled: true',
'      dialect: postgres',
'      minExecutions: 2',
'ingest:',
'  adapters:',
'    - historic-sql',
'',
|
||||
].join('\n'),
|
||||
'utf-8',
|
||||
);
|
||||
const createdAdapters: SourceAdapter[] = [
|
||||
{ source: 'historic-sql', skillNames: [], detect: async () => true, chunk: async () => ({ workUnits: [] }) },
|
||||
];
|
||||
const createAdapters = vi.fn(() => createdAdapters as never);
|
||||
const runLocal = vi.fn(async (input: RunLocalIngestOptions) => {
|
||||
expect(input.memoryFlow).toBeDefined();
|
||||
input.memoryFlow?.emit({
|
||||
type: 'source_acquired',
|
||||
adapter: 'historic-sql',
|
||||
trigger: 'manual_resync',
|
||||
fileCount: 3,
|
||||
});
|
||||
input.memoryFlow?.update({ syncId: 'sync-progress-1' });
|
||||
input.memoryFlow?.emit({ type: 'raw_snapshot_written', syncId: 'sync-progress-1', rawFileCount: 3 });
|
||||
input.memoryFlow?.emit({ type: 'diff_computed', added: 2, modified: 0, deleted: 0, unchanged: 1 });
|
||||
input.memoryFlow?.update({
|
||||
plannedWorkUnits: [
|
||||
{
|
||||
unitKey: 'historic-sql-table-public-orders',
|
||||
rawFiles: ['tables/public/orders.json'],
|
||||
peerFileCount: 0,
|
||||
dependencyCount: 0,
|
||||
},
|
||||
],
|
||||
});
|
||||
input.memoryFlow?.emit({ type: 'chunks_planned', chunkCount: 1, workUnitCount: 1, evictionCount: 0 });
|
||||
input.memoryFlow?.emit({
|
||||
type: 'work_unit_started',
|
||||
unitKey: 'historic-sql-table-public-orders',
|
||||
skills: ['historic_sql_table_digest'],
|
||||
stepBudget: 40,
|
||||
});
|
||||
input.memoryFlow?.emit({
|
||||
type: 'work_unit_finished',
|
||||
unitKey: 'historic-sql-table-public-orders',
|
||||
status: 'success',
|
||||
});
|
||||
input.memoryFlow?.emit({ type: 'saved', commitSha: null, wikiCount: 0, slCount: 1 });
|
||||
input.memoryFlow?.emit({ type: 'provenance_recorded', rowCount: 3 });
|
||||
input.memoryFlow?.emit({ type: 'report_created', runId: 'run-live-1', reportPath: 'report-live-1' });
|
||||
input.memoryFlow?.finish('done');
|
||||
return completedLocalBundleRun(input, input.jobId ?? 'historic-progress-job');
|
||||
});
|
||||
const io = makeIo({ isTTY: true });
|
||||
|
||||
await expect(
|
||||
runKtxIngest(
|
||||
{
|
||||
command: 'run',
|
||||
projectDir,
|
||||
connectionId: 'warehouse',
|
||||
adapter: 'historic-sql',
|
||||
outputMode: 'plain',
|
||||
},
|
||||
io.io,
|
||||
{
|
||||
createAdapters,
|
||||
runLocalIngest: runLocal,
|
||||
jobIdFactory: () => 'historic-progress-job',
|
||||
},
|
||||
),
|
||||
).resolves.toBe(0);
|
||||
|
||||
const stdout = io.stdout();
|
||||
expect(stdout).toContain('[5%] Fetching source files for warehouse/historic-sql');
|
||||
expect(stdout).toContain('[15%] Fetched 3 source files from historic-sql');
|
||||
expect(stdout).toContain('[45%] Planned 1 work unit');
|
||||
expect(stdout).toContain('[80%] Processed 1/1 work units');
|
||||
expect(stdout).toContain('[100%] Ingest completed');
|
||||
expect(stdout.indexOf('[5%] Fetching source files for warehouse/historic-sql')).toBeLessThan(
|
||||
stdout.indexOf('Report: report-live-1'),
|
||||
);
|
||||
expect(io.stderr()).toBe('');
|
||||
});
|
||||
|
||||
it('passes local Looker pull-config options and agent runner into scheduled ingest for Looker scheduled ingest', async () => {
|
||||
const projectDir = join(tempDir, 'project');
|
||||
await writeWarehouseConfig(projectDir);
|
||||
|
|
|
|||
|
|
@ -8,6 +8,7 @@ import {
|
|||
ingestReportToMemoryFlowReplay,
|
||||
type LocalMetabaseFanoutResult,
|
||||
type LocalMetabaseFanoutProgress,
|
||||
type MemoryFlowEvent,
|
||||
type MemoryFlowReplayInput,
|
||||
type RunLocalIngestOptions,
|
||||
renderMemoryFlowReplay,
|
||||
|
|
@ -170,6 +171,118 @@ function createMetabaseFanoutProgress(
|
|||
};
|
||||
}
|
||||
|
||||
function formatDiffProgress(event: Extract<MemoryFlowEvent, { type: 'diff_computed' }>): string {
|
||||
return `+${event.added}/~${event.modified}/-${event.deleted}/=${event.unchanged}`;
|
||||
}
|
||||
|
||||
function completedWorkUnitCount(snapshot: MemoryFlowReplayInput): number {
|
||||
return snapshot.events.filter((event) => event.type === 'work_unit_finished').length;
|
||||
}
|
||||
|
||||
function plainIngestEventProgress(
|
||||
event: MemoryFlowEvent,
|
||||
snapshot: MemoryFlowReplayInput,
|
||||
): { percent: number; message: string } | null {
|
||||
switch (event.type) {
|
||||
case 'source_acquired':
|
||||
return {
|
||||
percent: 15,
|
||||
message: `Fetched ${pluralize(event.fileCount, 'source file')} from ${event.adapter}`,
|
||||
};
|
||||
case 'raw_snapshot_written':
|
||||
return {
|
||||
percent: 25,
|
||||
message: `Wrote raw snapshot ${event.syncId} with ${pluralize(event.rawFileCount, 'file')}`,
|
||||
};
|
||||
case 'diff_computed':
|
||||
return { percent: 35, message: `Computed source diff ${formatDiffProgress(event)}` };
|
||||
case 'chunks_planned':
|
||||
return {
|
||||
percent: 45,
|
||||
message: `Planned ${pluralize(event.workUnitCount, 'work unit')}`,
|
||||
};
|
||||
case 'stage_skipped':
|
||||
return { percent: 45, message: `Skipped ${event.stage}: ${event.reason}` };
|
||||
case 'work_unit_started':
|
||||
return { percent: 55, message: `Processing ${event.unitKey}` };
|
||||
case 'work_unit_finished': {
|
||||
const total = snapshot.plannedWorkUnits.length || completedWorkUnitCount(snapshot);
|
||||
const completed = completedWorkUnitCount(snapshot);
|
||||
const percent = total > 0 ? 55 + Math.round((completed / total) * 25) : 80;
|
||||
return {
|
||||
percent,
|
||||
message: `Processed ${completed}/${total} work units`,
|
||||
};
|
||||
}
|
||||
case 'reconciliation_finished':
|
||||
return {
|
||||
percent: 85,
|
||||
message: `Reconciled results with ${pluralize(event.conflictCount, 'conflict')} and ${pluralize(
|
||||
event.fallbackCount,
|
||||
'fallback',
|
||||
)}`,
|
||||
};
|
||||
case 'saved':
|
||||
return {
|
||||
percent: 90,
|
||||
message: `Saved memory updates (${event.wikiCount} wiki, ${event.slCount} SL)`,
|
||||
};
|
||||
case 'provenance_recorded':
|
||||
return { percent: 95, message: `Recorded ${pluralize(event.rowCount, 'provenance row')}` };
|
||||
case 'report_created':
|
||||
return { percent: 98, message: `Created ingest report ${event.reportPath ?? event.runId}` };
|
||||
case 'scope_detected':
|
||||
case 'work_unit_step':
|
||||
case 'candidate_action':
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
function shouldWritePlainIngestProgress(
|
||||
outputMode: KtxIngestOutputMode,
|
||||
io: KtxIngestIo,
|
||||
env: NodeJS.ProcessEnv,
|
||||
): boolean {
|
||||
return outputMode === 'plain' && io.stdout.isTTY === true && env.CI !== 'true';
|
||||
}
|
||||
|
||||
function createPlainIngestProgressRenderer(
|
||||
args: Extract<KtxIngestArgs, { command: 'run' }>,
|
||||
io: KtxIngestIo,
|
||||
): { start(): void; update(snapshot: MemoryFlowReplayInput): void } {
|
||||
let printedEvents = 0;
|
||||
let lastPercent = 0;
|
||||
let printedCompletion = false;
|
||||
|
||||
const write = (percent: number, message: string) => {
|
||||
const nextPercent = Math.max(lastPercent, Math.max(0, Math.min(100, percent)));
|
||||
lastPercent = nextPercent;
|
||||
io.stdout.write(`[${nextPercent}%] ${message}\n`);
|
||||
};
|
||||
|
||||
return {
|
||||
start() {
|
||||
write(5, `Fetching source files for ${args.connectionId}/${args.adapter}`);
|
||||
},
|
||||
update(snapshot) {
|
||||
while (printedEvents < snapshot.events.length) {
|
||||
const event = snapshot.events[printedEvents++];
|
||||
if (!event) {
|
||||
continue;
|
||||
}
|
||||
const progress = plainIngestEventProgress(event, snapshot);
|
||||
if (progress) {
|
||||
write(progress.percent, progress.message);
|
||||
}
|
||||
}
|
||||
if (!printedCompletion && snapshot.status !== 'running') {
|
||||
printedCompletion = true;
|
||||
write(100, snapshot.status === 'done' ? 'Ingest completed' : 'Ingest failed');
|
||||
}
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
function writeReportJson(report: IngestReportSnapshot, io: KtxIngestIo): void {
|
||||
io.stdout.write(`${JSON.stringify(report, null, 2)}\n`);
|
||||
}
|
||||
|
|
@ -366,10 +479,14 @@ export async function runKtxIngest(
|
|||
});
|
||||
const shouldUseLiveViz =
|
||||
runOutputMode === 'viz' && (args.inputMode ?? 'auto') === 'auto' && isInteractiveTerminal(io);
|
||||
- const initialMemoryFlow = shouldUseLiveViz ? initialRunMemoryFlowInput(args, jobId ?? 'pending') : undefined;
+ const plainProgress = shouldWritePlainIngestProgress(runOutputMode, io, env)
+   ? createPlainIngestProgressRenderer(args, io)
+   : null;
+ const initialMemoryFlow =
+   shouldUseLiveViz || plainProgress ? initialRunMemoryFlowInput(args, jobId ?? 'pending') : undefined;
|
||||
let latestMemoryFlowSnapshot: MemoryFlowReplayInput | null = initialMemoryFlow ?? null;
|
||||
|
||||
- if (initialMemoryFlow && isTuiCapableIo(io)) {
+ if (shouldUseLiveViz && initialMemoryFlow && isTuiCapableIo(io)) {
|
||||
const startLiveMemoryFlow = deps.startLiveMemoryFlow ?? startLiveMemoryFlowTui;
|
||||
liveTui = await startLiveMemoryFlow(initialMemoryFlow, io);
|
||||
}
|
||||
|
|
@ -382,13 +499,17 @@ export async function runKtxIngest(
|
|||
liveTui.update(snapshot);
|
||||
return;
|
||||
}
|
||||
- if (!liveTui) {
+ if (shouldUseLiveViz && !liveTui) {
  writeMemoryFlowInput(snapshot, io, { clear: true });
+ return;
  }
+ plainProgress?.update(snapshot);
  },
  })
  : undefined;

+ plainProgress?.start();
|
||||
|
||||
try {
|
||||
const result = await executeLocalIngest({
|
||||
project,
|
||||
|
|
@ -403,7 +524,7 @@ export async function runKtxIngest(
|
|||
...(args.debugLlmRequestFile ? { llmDebugRequestFile: args.debugLlmRequestFile } : {}),
|
||||
...(memoryFlow ? { memoryFlow } : {}),
|
||||
});
|
||||
- if (memoryFlow) {
+ if (shouldUseLiveViz && memoryFlow) {
|
||||
latestMemoryFlowSnapshot = memoryFlow.snapshot();
|
||||
liveTui?.close();
|
||||
liveTui = null;
|
||||
|
|
|
|||
|
|
@ -767,6 +767,9 @@ export async function runKtxSetupContextStep(
|
|||
|
||||
const missing = missingCapabilities(project);
|
||||
if (missing.length > 0) {
|
||||
if (args.allowEmpty === true) {
|
||||
return { status: 'skipped', projectDir: args.projectDir };
|
||||
}
|
||||
writeMissingCapabilities(missing, io);
|
||||
return { status: 'missing-input', projectDir: args.projectDir };
|
||||
}
|
||||

@@ -1174,6 +1174,66 @@ describe('setup status', () => {
     expect(calls).toEqual(['model', 'embeddings', 'databases', 'sources']);
   });
 
+  it('does not fail context build when prerequisites were explicitly skipped and agents are skipped', async () => {
+    const calls: string[] = [];
+    const io = makeIo();
+    await writeFile(
+      join(tempDir, 'ktx.yaml'),
+      [
+        'project: revenue',
+        'connections:',
+        '  warehouse:',
+        '    driver: postgres',
+        '    url: env:DEMO_DATABASE_URL',
+        '    readonly: true',
+        '',
+      ].join('\n'),
+      'utf-8',
+    );
+
+    await expect(
+      runKtxSetup(
+        {
+          command: 'run',
+          projectDir: tempDir,
+          mode: 'existing',
+          agents: false,
+          skipAgents: true,
+          inputMode: 'disabled',
+          yes: true,
+          cliVersion: '0.2.0',
+          skipLlm: true,
+          skipEmbeddings: true,
+          skipDatabases: true,
+          skipSources: true,
+          databaseSchemas: [],
+        },
+        io.io,
+        {
+          model: async () => {
+            calls.push('model');
+            return { status: 'skipped', projectDir: tempDir };
+          },
+          embeddings: async () => {
+            calls.push('embeddings');
+            return { status: 'skipped', projectDir: tempDir };
+          },
+          databases: async () => {
+            calls.push('databases');
+            return { status: 'skipped', projectDir: tempDir };
+          },
+          sources: async () => {
+            calls.push('sources');
+            return { status: 'skipped', projectDir: tempDir };
+          },
+        },
+      ),
+    ).resolves.toBe(0);
+
+    expect(calls).toEqual(['model', 'embeddings', 'databases', 'sources']);
+    expect(io.stderr()).not.toContain('KTX cannot build agent-ready context yet.');
+  });
+
   it('runs context after sources and before agents in full setup', async () => {
     const calls: string[] = [];
     const io = makeIo();

@@ -129,6 +129,25 @@ describe('KtxPostgresScanConnector', () => {
       options: '-c search_path=analytics,public',
       ssl: { rejectUnauthorized: false },
     });
+    const libpqPreferConfig = postgresPoolConfigFromConfig({
+      connectionId: 'warehouse',
+      connection: {
+        driver: 'postgres',
+        url: 'env:DEMO_DATABASE_URL',
+        readonly: true,
+      },
+      env: {
+        DEMO_DATABASE_URL: 'postgresql://reader@start.kaelio.com:5432/demo?sslmode=prefer',
+      },
+    });
+    expect(libpqPreferConfig).toMatchObject({
+      host: 'start.kaelio.com',
+      port: 5432,
+      database: 'demo',
+      user: 'reader',
+    });
+    expect(libpqPreferConfig).not.toHaveProperty('connectionString');
+    expect(libpqPreferConfig).not.toHaveProperty('ssl');
     expect(() =>
       postgresPoolConfigFromConfig({
         connectionId: 'warehouse',

@@ -57,6 +57,8 @@ export interface KtxPostgresConnectionConfig {
   schema?: string;
   schemas?: string[];
   ssl?: boolean;
+  sslmode?: string;
+  sslMode?: string;
   rejectUnauthorized?: boolean;
   readonly?: boolean;
   [key: string]: unknown;
@@ -253,15 +255,22 @@ function numberValue(value: unknown): number | undefined {
 
 function parsePostgresUrl(url: string): Partial<KtxPostgresConnectionConfig> {
   const parsed = new URL(url);
+  const sslmode = parsed.searchParams.get('sslmode') ?? undefined;
   return {
     host: parsed.hostname,
     port: parsed.port ? Number(parsed.port) : undefined,
     database: parsed.pathname.replace(/^\/+/, '') || undefined,
     username: parsed.username ? decodeURIComponent(parsed.username) : undefined,
     password: parsed.password ? decodeURIComponent(parsed.password) : undefined,
+    ...(sslmode ? { sslmode } : {}),
   };
 }
 
+function normalizedSslMode(connection: KtxPostgresConnectionConfig): string | undefined {
+  const value = connection.sslmode ?? connection.sslMode;
+  return typeof value === 'string' && value.trim().length > 0 ? value.trim().toLowerCase() : undefined;
+}
+
 function schemasFromConnection(connection: KtxPostgresConnectionConfig): string[] {
   if (Array.isArray(connection.schemas) && connection.schemas.length > 0) {
     return connection.schemas.filter((schema): schema is string => typeof schema === 'string' && schema.length > 0);
@@ -299,6 +308,7 @@ export function postgresPoolConfigFromConfig(input: {
   const database = stringConfigValue(merged, 'database', env);
   const user = stringConfigValue(merged, 'username', env) ?? stringConfigValue(merged, 'user', env);
   const password = stringConfigValue(merged, 'password', env);
+  const sslmode = normalizedSslMode(merged);
 
   if (!referencedUrl && !host) {
     throw new Error(`Native PostgreSQL connector requires connections.${input.connectionId}.host or url`);
@@ -314,7 +324,7 @@ export function postgresPoolConfigFromConfig(input: {
     max: 10,
     idleTimeoutMillis: 30_000,
     connectionTimeoutMillis: 10_000,
-    ...(referencedUrl
+    ...(referencedUrl && sslmode !== 'prefer' && sslmode !== 'disable'
       ? { connectionString: referencedUrl }
       : { host, port: numberValue(merged.port) ?? 5432, database, user, password }),
   };
@@ -322,7 +332,7 @@ export function postgresPoolConfigFromConfig(input: {
   if (searchPathSchemas.length > 0) {
     config.options = `-c search_path=${searchPathSchemas.join(',')}`;
   }
-  if (merged.ssl) {
+  if (merged.ssl && sslmode !== 'prefer' && sslmode !== 'disable') {
     config.ssl = { rejectUnauthorized: merged.rejectUnauthorized ?? true };
   }
   return config;
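
The `sslmode` handling in the hunks above can be sketched in isolation. This is a minimal illustration, not the connector's code: `sslModeFromUrl` and `poolTargetFromUrl` are hypothetical names, and the only rule taken from the diff is that `sslmode=prefer` and `sslmode=disable` switch from passing the raw connection string to passing discrete host, port, database and user fields (presumably so the driver does not act on the `sslmode` query parameter itself).

```typescript
// Hypothetical shape for illustration; the real pool config has more fields.
interface PoolTarget {
  connectionString?: string;
  host?: string;
  port?: number;
  database?: string;
  user?: string;
}

// Read and normalize the sslmode query parameter, mirroring the
// trim-and-lowercase rule shown in the diff.
function sslModeFromUrl(url: string): string | undefined {
  const raw = new URL(url).searchParams.get('sslmode');
  return raw !== null && raw.trim().length > 0 ? raw.trim().toLowerCase() : undefined;
}

// Pass the URL through untouched unless sslmode is 'prefer' or 'disable';
// in those two cases, hand over discrete fields instead.
function poolTargetFromUrl(url: string): PoolTarget {
  const sslmode = sslModeFromUrl(url);
  if (sslmode !== 'prefer' && sslmode !== 'disable') {
    return { connectionString: url };
  }
  const parsed = new URL(url);
  return {
    host: parsed.hostname,
    port: parsed.port ? Number(parsed.port) : 5432,
    database: parsed.pathname.replace(/^\/+/, '') || undefined,
    user: parsed.username ? decodeURIComponent(parsed.username) : undefined,
  };
}

const preferTarget = poolTargetFromUrl('postgresql://reader@start.kaelio.com:5432/demo?sslmode=prefer');
const requireTarget = poolTargetFromUrl('postgresql://reader@start.kaelio.com:5432/demo?sslmode=require');
console.log(preferTarget, requireTarget);
```

With the demo URL from the test hunk, the `prefer` case yields discrete fields and no `connectionString`, which is the property the added test asserts.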

@@ -1,7 +1,16 @@
 import { describe, expect, it, vi } from 'vitest';
+import { asSchema } from 'ai';
 import { createEmitHistoricSqlEvidenceTool } from './evidence-tool.js';
 
 describe('emit_historic_sql_evidence tool', () => {
+  it('exposes an AI SDK v6 tool input schema with top-level object type', async () => {
+    const tool = createEmitHistoricSqlEvidenceTool();
+
+    expect(await asSchema(tool.inputSchema).jsonSchema).toMatchObject({
+      type: 'object',
+    });
+  });
+
   it('writes table usage evidence to the ignored run evidence directory', async () => {
     const writeFile = vi.fn(async () => ({ success: true, commitHash: null }));
     const tool = createEmitHistoricSqlEvidenceTool();

@@ -6,6 +6,42 @@ import { patternOutputSchema, tableUsageOutputSchema } from './skill-schemas.js'
 const SYSTEM_AUTHOR = 'System User';
 const SYSTEM_EMAIL = 'system@example.com';
 
+const emitHistoricSqlEvidenceInputSchema = z
+  .object({
+    kind: z.enum(['table_usage', 'pattern']),
+    table: z.string().min(1).optional(),
+    rawPath: z.string().min(1),
+    usage: tableUsageOutputSchema.optional(),
+    pattern: patternOutputSchema.optional(),
+  })
+  .superRefine((input, ctx) => {
+    if (input.kind === 'table_usage') {
+      if (!input.table) {
+        ctx.addIssue({
+          code: 'custom',
+          path: ['table'],
+          message: 'table is required when kind is table_usage',
+        });
+      }
+      if (!input.usage) {
+        ctx.addIssue({
+          code: 'custom',
+          path: ['usage'],
+          message: 'usage is required when kind is table_usage',
+        });
+      }
+    }
+    if (input.kind === 'pattern' && !input.pattern) {
+      ctx.addIssue({
+        code: 'custom',
+        path: ['pattern'],
+        message: 'pattern is required when kind is pattern',
+      });
+    }
+  });
+
+type EmitHistoricSqlEvidenceInput = z.infer<typeof emitHistoricSqlEvidenceInputSchema>;
+
 interface EmitHistoricSqlEvidenceToolContext {
   connectionId?: string | null;
   session?: {
@@ -23,30 +59,42 @@ interface EmitHistoricSqlEvidenceToolContext {
   };
 }
 
-function unitKeyForEvidence(input: { kind: string; table?: string; pattern?: { slug: string } }): string {
+function unitKeyForEvidence(input: EmitHistoricSqlEvidenceInput): string {
   if (input.kind === 'table_usage') {
     return `historic-sql-table-${String(input.table).replace(/[^a-zA-Z0-9]+/g, '-').replace(/^-+|-+$/g, '')}`;
   }
   return `historic-sql-pattern-${String(input.pattern?.slug).replace(/[^a-zA-Z0-9]+/g, '-').replace(/^-+|-+$/g, '')}`;
 }
 
+function evidenceEnvelope(input: EmitHistoricSqlEvidenceInput, connectionId: string) {
+  if (input.kind === 'table_usage') {
+    if (!input.table || !input.usage) {
+      throw new Error('Invalid historic-SQL table usage evidence input.');
+    }
+    return {
+      kind: 'table_usage' as const,
+      connectionId,
+      table: input.table,
+      rawPath: input.rawPath,
+      usage: input.usage,
+    };
+  }
+  if (!input.pattern) {
+    throw new Error('Invalid historic-SQL pattern evidence input.');
+  }
+  return {
+    kind: 'pattern' as const,
+    connectionId,
+    rawPath: input.rawPath,
+    pattern: input.pattern,
+  };
+}
+
 export function createEmitHistoricSqlEvidenceTool(defaultContext?: EmitHistoricSqlEvidenceToolContext) {
   return tool({
     description:
       'Record typed historic-SQL evidence for deterministic projection. Use this instead of wiki_write, sl_write_source, sl_edit_source, or context_candidate_write during historic-SQL WorkUnits.',
-    inputSchema: z.discriminatedUnion('kind', [
-      z.object({
-        kind: z.literal('table_usage'),
-        table: z.string().min(1),
-        rawPath: z.string().min(1),
-        usage: tableUsageOutputSchema,
-      }),
-      z.object({
-        kind: z.literal('pattern'),
-        rawPath: z.string().min(1),
-        pattern: patternOutputSchema,
-      }),
-    ]),
+    inputSchema: emitHistoricSqlEvidenceInputSchema,
     execute: async (input, options): Promise<string> => {
       const context = (options.experimental_context as EmitHistoricSqlEvidenceToolContext | undefined) ?? defaultContext;
       const ingest = context?.session?.ingest;
@@ -56,7 +104,8 @@ export function createEmitHistoricSqlEvidenceTool(defaultContext?: EmitHistoricS
       }
 
       const unitKey = unitKeyForEvidence(input);
-      const content = serializeHistoricSqlEvidence({ ...input, connectionId: context.connectionId });
+      const evidence = evidenceEnvelope(input, context.connectionId);
+      const content = serializeHistoricSqlEvidence(evidence);
       await configService.writeFile(
         historicSqlEvidencePath(ingest.runId, unitKey),
         content,
@@ -65,7 +114,7 @@ export function createEmitHistoricSqlEvidenceTool(defaultContext?: EmitHistoricS
         `Record historic-SQL evidence: ${unitKey}`,
         { skipLock: true },
       );
-      const label = input.kind === 'table_usage' ? input.table : input.pattern.slug;
+      const label = evidence.kind === 'table_usage' ? evidence.table : evidence.pattern.slug;
       return `Recorded historic-SQL ${input.kind} evidence for ${label}.`;
     },
   });
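
To make the evidence file naming concrete, the slug rule inside `unitKeyForEvidence` can be tried on its own. The sketch below is a stand-alone approximation: `slugForUnitKey` and `unitKey` are hypothetical names, and only the two regular expressions and the two key prefixes are taken from the diff.

```typescript
type EvidenceKind = 'table_usage' | 'pattern';

// Collapse every run of non-alphanumeric characters into a single hyphen,
// then trim hyphens from both ends. Letter case is preserved.
function slugForUnitKey(value: string): string {
  return value.replace(/[^a-zA-Z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Build the unit key from the evidence kind and a table name or pattern slug.
function unitKey(kind: EvidenceKind, name: string): string {
  const prefix = kind === 'table_usage' ? 'historic-sql-table' : 'historic-sql-pattern';
  return `${prefix}-${slugForUnitKey(name)}`;
}

console.log(unitKey('table_usage', 'analytics.orders')); // historic-sql-table-analytics-orders
console.log(unitKey('pattern', ' Monthly revenue! ')); // historic-sql-pattern-Monthly-revenue
```

Because the original helper wraps its argument in `String(...)`, a missing table or slug would become the literal text `undefined` in the key; the schema refinement and the `evidenceEnvelope` checks in the diff appear intended to reject such input before it is written.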

@@ -61,10 +61,10 @@ def _column_name(column: exp.Column) -> str:
     return str(column.name)
 
 
-def _columns_from_nodes(nodes: list[exp.Expression | None]) -> list[str]:
+def _columns_from_nodes(nodes: list[object]) -> list[str]:
     names: list[str] = []
     for node in nodes:
-        if node is None:
+        if not isinstance(node, exp.Expression):
             continue
         names.extend(_column_name(column) for column in node.find_all(exp.Column))
     return _ordered_unique(names)

@@ -3,6 +3,7 @@ from __future__ import annotations
 from ktx_daemon.sql_analysis import (
     AnalyzeSqlBatchItem,
     AnalyzeSqlBatchRequest,
+    _columns_from_nodes,
     analyze_sql_batch_response,
 )
 
@@ -51,3 +52,7 @@ def test_analyze_sql_batch_returns_per_item_parse_errors() -> None:
     assert result.tables_touched == []
     assert result.columns_by_clause == {}
     assert result.error is not None
+
+
+def test_columns_from_nodes_ignores_non_expression_clause_values() -> None:
+    assert _columns_from_nodes([True, False, None]) == []