Commit graph

58 commits

Author SHA1 Message Date
Andrey Avtomonov
616fc211b0 Merge origin/main into simplify-ktx-releases 2026-05-19 16:35:49 +02:00
Andrey Avtomonov
bbd9568287 ci: simplify ktx release flow 2026-05-19 16:33:41 +02:00
Andrey Avtomonov
b58232115e fix: align release provenance metadata 2026-05-19 16:10:30 +02:00
Andrey Avtomonov
03e2f9f0a3
fix: prevent stable release pushes to main (#145) 2026-05-19 16:01:07 +02:00
Andrey Avtomonov
7110aa6f5c
fix: publish first stable release as 0.1.0 (#143) 2026-05-19 15:52:30 +02:00
Andrey Avtomonov
d1c84e5564
fix: improve setup wizard behavior (#127)
* fix: improve setup wizard behavior

* fix: derive runtime versions from release metadata

* test: validate metabase source mapping requirements

* Fix boundary check release identifiers
2026-05-17 19:15:09 +02:00
Andrey Avtomonov
d3d58a279b
fix(release): repair next npm release workflow (#122)
* fix(ci): run rc releases from next branch

* fix(context): allow release git askpass env

* fix(release): make npm publish noninteractive

* fix(release): use npm trusted publishing

* fix(release): tolerate npm propagation in smoke

* docs(release): document trusted publishing auth
2026-05-17 01:41:07 +02:00
Andrey Avtomonov
de72a10ffb
fix(cli): build runtime assets during dev setup (#121) 2026-05-17 01:04:44 +02:00
Andrey Avtomonov
b565e44a22
feat: add claude-code llm backend with runtime port (#115)
* docs: revise claude-code ingest backend spec

* docs: keep claude-code spec focused on ingest

* docs: expand claude-code spec to full llm parity

* Refine claude-code backend spec after adversarial review iteration 1

* Refine claude-code backend spec after adversarial review iteration 2

* Refine claude-code backend spec after adversarial review iteration 3

* feat: recognize claude-code llm backend

* feat: add ktx llm runtime port

* feat: add claude-code llm runtime

* feat: route non-agent llm calls through runtime

* feat: run ingest agents through llm runtime

* feat: support claude-code setup and status

* test: verify claude-code backend runtime

* docs: add claude-code backend v1 runtime plan

* fix: close claude-code runtime isolation checks

* fix: warn on claude-code prompt caching during setup

* chore: verify claude-code v1 closure

* docs: add claude-code backend v1 isolation closure plan

* fix: update claude-code ingest setup guidance

* docs: add claude-code backend v1 ingest guidance closure plan

* docs: align claude-code isolation spec with sdk metadata

* test: cover claude-code host discovery metadata

* fix: tolerate claude-code host discovery metadata

* docs: clarify claude-code host discovery metadata

* docs: add claude-code auth-probe isolation fix plan

* chore: prepare kaelio ktx rc1 release

* chore: add semantic release workflow

* fix: unblock ci checks

* chore(release): 0.1.0-rc.1

* feat: add Claude Code model selection to setup

* fix: keep git maintenance attached in local repos
2026-05-16 12:06:34 +02:00
Andrey Avtomonov
a72fca2b32
fix(cli): auto-install runtime during setup (#116)
* fix(cli): auto-install runtime during setup

* test: align docs smoke with readme
2026-05-16 11:39:43 +02:00
Andrey Avtomonov
f9532f549b
perf(cli): cache pnpm run ktx builds against a stamp file (#113)
The staleness check compared source mtimes against packages/cli/dist/bin.js,
but tsc only rewrites outputs whose source actually changed. Editing any
non-bin source (e.g. setup.ts) left bin.js untouched, so its mtime stayed
older than the sources forever and every `pnpm run ktx` invocation
rebuilt the whole workspace. Write a dedicated .ktx-build-stamp after a
successful build and check sources against that instead.
2026-05-15 15:49:39 +02:00
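The stamp-file check described in f9532f549b can be sketched roughly as below. This is a minimal illustration, not the repository's code: only the `.ktx-build-stamp` name comes from the commit message, while the function names (`newestMtimeMs`, `isBuildStale`, `writeStamp`) and the way source files are listed are assumptions.

```typescript
import * as fs from "node:fs";

// Returns the newest modification time (ms since epoch) among the given files.
function newestMtimeMs(files: string[]): number {
  return files.reduce((max, f) => Math.max(max, fs.statSync(f).mtimeMs), 0);
}

// A build counts as stale when no stamp exists yet, or when any source file
// is newer than the stamp. Comparing against a dedicated stamp avoids the
// problem in the commit message: tsc leaves untouched outputs (such as
// bin.js) with an old mtime, so an output file is a poor freshness marker.
function isBuildStale(sourceFiles: string[], stampPath: string): boolean {
  if (!fs.existsSync(stampPath)) return true;
  const stampMtime = fs.statSync(stampPath).mtimeMs;
  return newestMtimeMs(sourceFiles) > stampMtime;
}

// Hypothetical helper: call this only after a successful build, so that a
// failed build leaves the workspace marked stale.
function writeStamp(stampPath: string): void {
  fs.writeFileSync(stampPath, new Date().toISOString() + "\n");
}
```

A wrapper script would presumably call `isBuildStale(sources, ".ktx-build-stamp")`, run the build only when it returns `true`, and then call `writeStamp`.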
Andrey Avtomonov
2de4dd2c1b
perf(setup): speed up conductor setup and make it rerun-safe (#107)
Drop the duplicate `pnpm run build` (artifacts:build already builds every
package). Run package builds in parallel, in topological order, via one recursive pnpm
invocation. Enable incremental tsc and keep the cli's tsbuildinfo outside
its dist (moved the dist wipe into a separate `clean` script). Run the
final `ktx status` doctor from a temp dir so it stops walking up into a
parent ktx.yaml and failing the script.

Conductor setup drops from ~26s to ~9.8s cold and ~4.4s warm.
2026-05-15 12:06:37 +02:00
Andrey Avtomonov
b759a4a286
feat(mcp): added MCP server (#97)
* docs(specs): design research-agent MCP tools and ktx mcp daemon

Adds the 2026-05-14 design spec for exposing four new MCP tools
(discover_data, entity_details, dictionary_search, sql_execution),
shipping a ktx-research skill, and introducing an HTTP-only ktx mcp
daemon so external agents can use KTX as a research-capable context
layer.

* Refine research-agent MCP tools spec after adversarial review iteration 1

* Refine research-agent MCP tools spec after adversarial review iteration 2

* Refine research-agent MCP tools spec after adversarial review iteration 3

* Refine spec: drop connectionName compat carve-out and ground summary/snippet provenance per kind

* feat(daemon): validate read-only SQL with sqlglot

* feat(context): expose read-only SQL validation port

* feat(context): register MCP sql execution tool

* feat(context): execute MCP SQL through validated connector path

* test(context): update SQL analysis port fixtures

* docs: add research-agent MCP sql execution foundation plan

* feat(context): add scan-backed entity details service

* feat(context): register MCP entity details tool

* feat(context): expose local MCP entity details

* test(context): align entity details scan fixtures

* docs: add research-agent MCP entity_details plan

* feat(context): add dictionary search service

* feat(context): register MCP dictionary search tool

* feat(context): expose local MCP dictionary search

* docs: add research-agent MCP dictionary_search plan

* feat: add MCP discover data service

* feat: expose discover data MCP tool

* feat: wire local discover data MCP port

* docs: add research-agent MCP discover_data plan

* feat(cli): add mcp http security helpers

* feat(cli): host mcp over streamable http

* feat(cli): manage mcp daemon lifecycle

* feat(cli): add ktx mcp commands

* fix(cli): stabilize mcp daemon verification

* docs: add research-agent MCP http daemon plan

* feat(cli): install KTX research skill

* feat(cli): configure MCP clients in setup agents

* feat(cli): support Claude local MCP setup scope

* docs: add research-agent MCP setup-agents plan

* refactor(context): use connectionId in warehouse verification tools

* docs(context): update ingest verification prompts for connectionId

* docs: add research-agent MCP ingest contract convergence plan

* chore: build runtime artifacts in conductor setup

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-15 02:35:09 +02:00
Andrey Avtomonov
cb8902f1e5
fix(context): merge overlay columns onto manifest columns by name (#94)
* fix(context): merge overlay columns onto manifest columns by name

composeOverlay was appending overlay columns to the manifest column list,
producing duplicate entries when dbt/metabase overlays declared a column
just to attach descriptions. The duplicates carried no `type`, so the
pydantic SourceDefinition rejected them at semantic-query time and broke
`ktx sl query` for every overlay-backed measure. Now overlay columns
match base columns by name (case-insensitive): same-name entries merge
onto the manifest (overlay fields win, type/role fall back to the base,
descriptions merge per source key) and only new names append.

* refactor(sl): split overlay columns from column_overrides and enforce TS/Python wire contract

Overlay sources now have two distinct collections: `columns:` for computed
columns (requiring `expr` + `type`) and `column_overrides:` for metadata
patches to inherited manifest columns. Composing or loading an overlay that
mixes the two — or references an unknown column — fails with a typed error.

Introduce `ResolvedSemanticLayerSource` / `resolvedSourceSchema` /
`toResolvedWire` as the strict shape sent to the Python engine, and add a
schema contract test that diffs Zod against the Pydantic JSON schema dumped
by `python -m semantic_layer dump-schema`. `SourceDefinition` is now
`extra="forbid"` on the Python side.

`loadAllSources` surfaces per-file load errors instead of swallowing them,
so validation/query paths can report manifest shard parse failures.

* fix(context): make scan description generation resilient and quiet

A transient sampleTable failure during ingest used to take out every
table in a connection: generateTableDescription returned a hardcoded
'Table not found' string into descriptions.ai, and KtxDescriptionGenerator
was constructed without a logger, so the failure left no trail anywhere.

- sampleTable / sampleColumn calls retry 3x with 200/400/800ms backoff,
  honouring KtxScanContext.signal via a new KtxAbortedError.
- On retry exhaustion or missing capability, table generation falls back
  to a metadata-only prompt built from column name / native type / comment
  / rawDescriptions. The column path follows the same rule -- call the
  LLM when any of samples or rawDescriptions are available; skip only
  when both are absent.
- Logger is now threaded from KtxScanContext into the generator. Failures
  emit structured KtxScanWarning entries (new description_fallback_used
  code, plus existing sampling_failed / enrichment_failed /
  connector_capability_missing). ktx scan groups warnings by code so a
  batch of identical failures collapses to one summary line plus sample.
- Returns null on failure instead of the 'Table not found' sentinel; the
  manifest writer's existing guard already skips empty descriptions, so
  schema YAML no longer carries misleading text. SCAN_MANAGED_DESCRIPTION_KEYS
  already strips stale 'ai' on merge, so existing YAML clears on next run.

Also suppress AI SDK v6 'system in messages' warning: pull system messages
out of KtxMessageBuilder.wrapSimple's output via a new splitKtxSystemMessages
helper and pass them top-level to generateText (preserves cacheControl
providerOptions on the SystemModelMessage). Agent-runner's local
splitSystemPromptMessages dedupes onto the shared helper.

* test(docs): align examples-docs assertions with revamped docs

PR #103 (setup/guide doc revamp) reworded several CLI examples and
connection labels; the assertions in scripts/examples-docs.test.mjs
still referenced the pre-revamp wording and were failing in CI on main.
Update the regexes to match the post-revamp content:

- drop the `--json` flag from the sl-query example expectation
- move the `Driver:` / `Status: ok` probe to the connection reference,
  which is where that output now lives (driver id is lowercase
  `postgres`, not the display name `PostgreSQL`)
- drop the obsolete `Install \`uv\`...` troubleshooting line
- accept `<connectionId>` everywhere; the docs no longer use the
  hyphenated `<connection-id>` form
- match the `warehouse` connection id used in the quickstart instead of
  the `postgres-warehouse` id only used in the README and setup ref

* fix(sl): skip TS/Python schema contract test when uv is unavailable

The TypeScript checks CI job does not install uv or Python, so the
module-level `execFileSync('uv', ...)` in schemas.contract.test.ts threw
ENOENT and failed the suite. Wrap the schema dump in a try/catch and
guard the describe block with `describe.skipIf` so the test skips in
environments without uv. Local dev and any CI job that has uv on PATH
still runs the cross-language contract assertion.
2026-05-15 02:11:04 +02:00
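The name-based overlay merge described in the first item of cb8902f1e5 can be illustrated with a small sketch. It is not the actual `composeOverlay` implementation: the `Column` shape and the `mergeOverlayColumns` name are hypothetical, and keeping the manifest's casing of the column name is an assumption. The later `columns:` / `column_overrides:` split from the same commit is not modeled here.

```typescript
// Hypothetical column shape; the real manifest and overlay types are not
// shown in the commit log.
interface Column {
  name: string;
  type?: string;
  role?: string;
  descriptions?: Record<string, string>; // keyed by source, e.g. "dbt"
}

// Merge overlay columns onto base (manifest) columns by case-insensitive name:
// overlay fields win, type/role fall back to the base, descriptions merge per
// source key, and only previously unseen names are appended.
function mergeOverlayColumns(base: Column[], overlay: Column[]): Column[] {
  // A Map preserves insertion order, so base columns keep their positions.
  const byName = new Map<string, Column>();
  for (const c of base) byName.set(c.name.toLowerCase(), { ...c });

  for (const o of overlay) {
    const key = o.name.toLowerCase();
    const b = byName.get(key);
    if (b === undefined) {
      byName.set(key, { ...o }); // new name: append
      continue;
    }
    byName.set(key, {
      ...b,
      ...o,
      name: b.name, // assumption: keep the manifest's casing
      type: o.type ?? b.type,
      role: o.role ?? b.role,
      descriptions: { ...b.descriptions, ...o.descriptions },
    });
  }
  return Array.from(byName.values());
}
```

With this shape, an overlay entry that declares a column only to attach a description inherits the manifest `type` instead of producing a second, typeless entry.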
Luca Martial
5cf2c89093
Revise CLI reference docs (#100)
* docs: revise CLI reference

* docs: sync CLI reference with current commands
2026-05-14 12:53:55 -04:00
Andrey Avtomonov
ce23aca4c4
fix: remove project from ktx config (#95) 2026-05-14 17:39:31 +02:00
Andrey Avtomonov
b3be54e3fa
refactor(context): validate ktx.yaml with Zod and surface issues in status (#91)
* refactor(context): validate ktx.yaml with Zod and surface issues in status

- Replace hand-rolled ktx.yaml parsing with a strict Zod schema and
  derive KtxProjectConfig types from it.
- Add validateKtxProjectConfig returning structured KtxConfigIssue[]
  with migration hints for deprecated keys (ingest.llm,
  scan.enrichment.backend, etc.).
- Wire ktx status/doctor to run validation, render schema issues in
  plain and JSON output, and add a Config row to project status.
- Update the orbit example to camelCase scan.relationships keys to
  match the schema.

* fix(context): tolerate legacy setup.completed_steps and optional driver

- Accept and drop the legacy setup.completed_steps field so existing
  ktx.yaml files migrated from older versions still load.
- Make connections.<id>.driver optional in the schema; runtime code
  already produces a clear "no driver" error at use time.

* feat(cli): add ktx status --validate to run only ktx.yaml schema validation

- New --validate flag dispatches a focused runKtxDoctor 'validate' branch
  that reads ktx.yaml, runs validateKtxProjectConfig, and skips LLM,
  connection, embedding, and query-history checks.
- Plain output prints a single Config row; JSON output emits
  {ok: true} on success or the existing invalid_config / missing_project
  shapes on failure.
2026-05-14 15:36:35 +02:00
Andrey Avtomonov
b00c1a11a9
feat: merge ingest and scan
* docs: add CLI component reuse guidance

* docs: add unified ingest ux design

* Refine unified ingest UX design after adversarial review iteration 1

* Refine unified ingest UX design after adversarial review iteration 2

* Refine unified ingest UX design after adversarial review iteration 3

* feat(cli): route public connection ingest command

* feat(cli): hide standalone scan from public help

* feat(cli): plan public ingest depth and query history

* feat(cli): execute public database ingest facets

* feat(ingest): read connection query history config

* fix(cli): use public ingest wording

* fix(config): stop generating ingest adapter allow lists

* docs: document public ingest command

* test: align ingest surface expectations

* docs: add unified ingest public CLI surface plan

* feat(cli): preflight deep public ingest readiness

* feat(setup): store query history in connection context

* feat(setup): store database context depth

* feat(setup): verify context readiness by database depth

* fix(setup): keep context build foreground only

* fix(config): reject reserved ingest connection ids

* test: close unified ingest v1 expectations

* docs: add unified ingest v1 closure plan

* fix(ingest): bypass adapter allow-list for public source ingest

* fix(ingest): honor query history window intent

* fix(ingest): hide scan internals from public database ingest

* feat(ingest): use foreground view for interactive public ingest

* fix(setup): use schema context and query history wording

* test(cli): verify unified ingest public output

* docs: add unified ingest v1 public output closure plan

* fix(setup): forward query history flags

* fix(setup): prompt for postgres query history

* fix(status): report query history readiness

* fix(ingest): remove legacy public guidance

* fix(ingest): polish foreground retry copy

* docs(examples): use unified query history wording

* chore(ingest): finish public query history cleanup

* docs: add unified ingest v1 query history status cleanup plan

* test(docs): cover unified ingest public docs

* docs: align ingest CLI reference with unified UX

* docs: update context build guides for unified ingest

* docs: update setup and primary source ingest wording

* docs: stop advertising adapter-backed example ingest

* docs: close unified ingest public docs gaps

* docs: add unified ingest v1 docs site closure plan

* fix: render unified ingest foreground warnings

* fix: explain query history schema order

* fix: add public ingest retry guidance

* fix: align setup next steps with unified ingest

* fix: remove scan wording from demo progress

* test: verify unified ingest ux closure

* docs: add unified ingest v1 foreground and retry closure plan

* fix(cli): preserve query-history pull config in public ingest

* fix(cli): omit hidden commands from docs command tree

* test(cli): close unified ingest final public surface checks

* docs: add unified ingest v1 final public surface closure plan

* fix(cli): use public source labels in ingest reports

* fix(cli): suppress low-level public ingest output

* test(cli): verify unified ingest public plain output

* docs: add unified ingest v1 public plain output closure plan

* fix(cli): add public ingest copy sanitizers

* fix(cli): sanitize public ingest progress copy

* fix(cli): rename setup schema scope prompt

* docs(plan): add progress copy closure; test: align setup back-nav fixture

Adds the iter9 plan and updates the setup back-navigation test fixture
to pass disableQueryHistory plus listSchemas/listTables stubs that the
unified ingest setup step now requires.

* docs(plan): add final ux labels plan with narrowed label scans

* fix(cli): aggregate unsupported query-history warnings

* fix(cli): align setup database labels

* test(cli): fix setup database test type-check

* fix(cli): remove primary-source wording from setup output

* test(cli): verify unified ingest setup closure

* docs(plan): add unified ingest v1 verification copy closure plan

* fix(cli): remove top-level scan command

* fix(cli): remove legacy ingest and wiki commands

* Merge scan into ingest flow

* feat(cli): split ingest progress into per-phase rows, rename work units to tasks

Each database target in the unified ingest dashboard now renders one row per
real subprocess (Schema, then Query history when enabled) instead of a single
combined bar. Each phase has its own monotonic 0-100% bar so the progress
never snaps back to zero when historic-sql starts after scan completes.
Completed phases keep their final bar, summary, and elapsed time visible as
an inline audit trail; queued and skipped phases are shown explicitly.

Also rename user-facing "work units" / "Failed work units" to "tasks" /
"Failed tasks" in ingest output and parseIngestSummary. The parser still
accepts the legacy "Work units:" wording in captured output for backward
compatibility. Internal memory-flow event names and type fields are left alone.

* Fix test harness failures

* Fix CI smoke checks

---------

Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-14 01:43:06 +02:00
Andrey Avtomonov
1a472cf3ed
fix: clean up ktx yaml config parameters (#75)
* fix: clean up ktx yaml config parameters

* fix: align ci smoke checks with status output

* test: update artifact smoke status assertion
2026-05-14 01:27:31 +02:00
Andrey Avtomonov
0a261fe8a4
ci: add codecov coverage reporting (#82)
* ci: add codecov coverage reporting

* ci: fix codecov and secret scan checks

* ci: fix smoke and artifact checks
2026-05-14 01:13:31 +02:00
Andrey Avtomonov
fa9237956e
ci: run pre-commit checks in CI (#74)
* ci: run pre-commit in CI

* test: update CI workflow guardrail
2026-05-13 19:49:25 +02:00
Andrey Avtomonov
3fde4438b1
fix: stop requiring readonly connection config (#71) 2026-05-13 19:37:25 +02:00
Luca Martial
e50fef851f fix(cli): hide setup project banner 2026-05-13 09:16:35 -07:00
Andrey Avtomonov
d7147f9ca1
feat: rename project wiki directory (#66)
* feat: rename project wiki directory

* test: fix wiki skill ordering expectations

* Show configured context sources in setup
2026-05-13 16:05:58 +02:00
Andrey Avtomonov
97da9919e9
refactor: remove legacy compatibility paths (#64)
* refactor: remove legacy compatibility paths

* fix: support legacy metabase native queries

* test: use canonical semantic layer descriptions

* Rename CLI description

* Recover setup scan from SQLite ABI mismatch

* Remove legacy product name from CLI help
2026-05-13 15:55:00 +02:00
Andrey Avtomonov
c202202e6b
feat(cli): clean up wiki and sl commands (#65)
* feat(cli): clean up wiki and sl commands

* test(scripts): update package artifact CLI smoke assertion
2026-05-13 15:41:10 +02:00
Andrey Avtomonov
bcb0d2f8f7
chore: add TypeScript dead-code checks (#60)
* chore: add TypeScript dead-code checks

* chore: trim stale Knip ignores

* Fix CI smoke and artifact checks
2026-05-13 13:33:28 +02:00
Andrey Avtomonov
721f1a998f
feat(cli)!: remove ktx agent command (#58)
* feat(cli)!: remove ktx agent command

* test(context): update PGlite boundary guardrail
2026-05-13 13:01:56 +02:00
Andrey Avtomonov
eaaabb361e
fix(cli): clean up dev runtime commands (#59) 2026-05-13 12:28:24 +02:00
Andrey Avtomonov
b9e0a746af
feat(cli): clean up dev command surface (#57)
* feat(cli): clean up dev command surface

* test: align CI expectations with CLI cleanup

* test(cli): update slow test command expectations
2026-05-13 12:00:08 +02:00
Andrey Avtomonov
85fc408054
chore(deps): refresh workspace dependencies (#43)
* chore(deps): refresh workspace dependencies

* Fix pnpm artifact smoke build approvals
2026-05-13 01:15:35 +02:00
Andrey Avtomonov
17a2fee69a
fix(cli): remove ktx setup subcommands (#42)
* fix(cli): remove ktx setup subcommands

* test(scripts): update setup-dev status expectation
2026-05-13 00:38:26 +02:00
Andrey Avtomonov
e15a4ebaec feat(cli): clean up command surface 2026-05-12 23:51:46 +02:00
Luca Martial
60457e9407
Improve schema setup and Notion ingest UX (#14)
* Improve schema setup and Notion ingest UX

* Handle Postgres network scan failures

* WIP: save local changes before main merge

* Refine setup prompt choices

* Tighten ingest reconciliation guidance

* Commit setup config updates

* Canonicalize unmapped fallback details

* Count reconciliation actions in reports

* Harden semantic layer source validation

* Return wiki content after edits

* Validate SL sources against manifests

* Validate wiki refs before writes

* Simplify CLI next steps

* Clarify agent setup summary

* Surface dbt target SL sources

* Recover SL write fallbacks

* Preserve failed context build metadata

* Track raw paths for ingest actions

* test(cli): update seeded demo expectations

* fix(ingest): scope fallback recovery checks

* fix(sl): tighten source validation guards

* fix(wiki): ignore empty embedding vectors

* Improve Notion ingest UX

* Enforce flat wiki keys

* test(context): update wiki key assertion

---------

Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-05-12 22:56:58 +02:00
Andrey Avtomonov
5277c81b33 fix(ci): update artifact source test 2026-05-12 16:17:45 +02:00
Andrey Avtomonov
6a1fded5ce fix(ci): align smoke stderr expectations 2026-05-12 15:31:41 +02:00
Andrey Avtomonov
15f433930e
Merge branch 'main' into andreybavt/execute-context7-plan 2026-05-12 13:04:16 +02:00
Andrey Avtomonov
52400c599c chore: standardize pre-commit checks 2026-05-12 13:02:06 +02:00
Andrey Avtomonov
bd5154f918
Merge pull request #31 from Kaelio/andreybavt/pasted-text-attachment
fix: standardize KTX environment variables
2026-05-12 12:26:33 +02:00
Andrey Avtomonov
4c93a6e983
fix(ci): update stale KTX test expectations (#32)
Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-12 12:02:26 +02:00
Andrey Avtomonov
d5f484eb7e fix: standardize KTX environment variables 2026-05-12 11:21:37 +02:00
Andrey Avtomonov
bb8868f238
Merge pull request #19 from Kaelio/andreybavt/dbt-vertex-no-anthropic
fix(cli): honor configured LLM backends in setup
2026-05-12 10:25:24 +02:00
Andrey Avtomonov
3e9869340f ci: parallelize KTX CI checks 2026-05-12 01:44:15 +02:00
Andrey Avtomonov
0153ac4945 Relax boundary check for test fixtures 2026-05-12 01:34:18 +02:00
Andrey Avtomonov
f3f6b36551 Merge remote-tracking branch 'origin/main' into andreybavt/historic-sql-redesign 2026-05-11 20:52:19 +02:00
Andrey Avtomonov
ca325cd72c test: verify historic sql sharded smoke docs 2026-05-11 20:32:32 +02:00
Andrey Avtomonov
194752756a test: expect historic sql pattern shard smoke docs 2026-05-11 20:30:14 +02:00
Andrey Avtomonov
69128ccf72 fix: link Conductor agent overlays from root checkout 2026-05-11 20:06:13 +02:00
Andrey Avtomonov
81ec2eee7c test: verify historic sql docs and smoke cleanup 2026-05-11 19:42:51 +02:00
Andrey Avtomonov
598149b6a4 test: expect unified historic sql example docs 2026-05-11 19:39:21 +02:00