* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
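
A minimal sketch of the slicing fix described above. `expandTilde` and its
`home` parameter are hypothetical names for illustration; the shared module's
actual helper may be shaped differently.

```typescript
import { homedir } from 'node:os';
import { resolve } from 'node:path';

// Hypothetical helper, not the shared module's real API.
function expandTilde(input: string, home: string = homedir()): string {
  if (input === '~') return home;
  if (input.startsWith('~/')) {
    // Drop both "~" and "/" so the remainder is a relative path.
    // Dropping only "~" leaves "/path", which resolve() treats as absolute,
    // so the home directory argument is ignored.
    return resolve(home, input.slice(2));
  }
  return input;
}
```

With `slice(1)` instead of `slice(2)`, `resolve(home, '/data/app.db')` would
return the absolute `/data/app.db` and never touch `home`.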
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
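
The INSTALL/LOAD dedup can be sketched as follows. The `Member` shape, the
driver names, and `buildExtensionLoads` are assumptions made for this example,
not the code in federation.ts.

```typescript
// Assumed driver names; the real membership set lives in federation.ts.
type AttachType = 'postgres' | 'mysql' | 'sqlite';

// Hypothetical member shape, reduced to what the example needs.
interface Member {
  id: string;
  driver: AttachType;
}

// Emit one INSTALL/LOAD pair per distinct driver type, in first-seen order,
// instead of one pair per member.
function buildExtensionLoads(members: readonly Member[]): string[] {
  const distinct = Array.from(new Set(members.map((m) => m.driver)));
  return distinct.flatMap((type) => [`INSTALL ${type};`, `LOAD ${type};`]);
}
```

Two postgres members and one sqlite member then produce four statements
rather than six.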
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
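
The nested cleanup can be sketched like this. `Instance`, `Conn`, and
`withFederatedInstance` are hypothetical stand-ins so the example runs without
DuckDB; they are not the @duckdb/node-api types.

```typescript
// Hypothetical minimal interfaces, assumed for illustration only.
interface Conn {
  closeSync(): void;
}
interface Instance {
  connect(): Promise<Conn>;
  closeSync(): void;
}

async function withFederatedInstance<T>(
  create: () => Promise<Instance>,
  work: (conn: Conn) => Promise<T>,
): Promise<T> {
  const instance = await create();
  try {
    // connect() may throw; the outer finally still closes the instance.
    const conn = await instance.connect();
    try {
      return await work(conn);
    } finally {
      // closeSync() may throw too; the outer finally still runs afterwards.
      conn.closeSync();
    }
  } finally {
    instance.closeSync();
  }
}
```

The instance gets its own try/finally because a single block around both
resources would skip the instance close whenever the connection close throws.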
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
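
A minimal sketch of the coercion, assuming rows arrive as arrays of scalar
cells. `toJsonSafe` and `toJsonSafeRows` are hypothetical names, not the body
of executeFederatedQuery. The string fallback for values outside the safe
integer range follows the BIGINT bullet in a later commit of this log.

```typescript
type Cell = string | number | boolean | bigint | null;
type JsonCell = string | number | boolean | null;

// Hypothetical helper: make one cell safe for plain JSON.stringify.
function toJsonSafe(value: Cell): JsonCell {
  if (typeof value !== 'bigint') return value;
  const inSafeRange =
    value >= BigInt(Number.MIN_SAFE_INTEGER) && value <= BigInt(Number.MAX_SAFE_INTEGER);
  // Number() would round beyond 2^53, so keep the exact digits as a string.
  return inSafeRange ? Number(value) : value.toString();
}

function toJsonSafeRows(rows: readonly Cell[][]): JsonCell[][] {
  return rows.map((row) => row.map(toJsonSafe));
}
```

Coercing once at the executor boundary means callers can keep using plain
`JSON.stringify` without a replacer.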
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with an
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
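
The double-quoting guidance in the third bullet can be sketched as below.
`quoteIfNeeded` and `federatedTableRef` are hypothetical helpers, and the
bare-identifier pattern is an assumption that ignores reserved words.

```typescript
// Assumed definition of a "bare" identifier: letters, digits, underscore,
// not starting with a digit. Reserved words are not handled here.
const BARE_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Hypothetical helper: wrap in double quotes only when required, doubling
// any embedded double quote.
function quoteIfNeeded(part: string): string {
  if (BARE_IDENTIFIER.test(part)) return part;
  return `"${part.replace(/"/g, '""')}"`;
}

function federatedTableRef(connectionId: string, schema: string, table: string): string {
  return [connectionId, schema, table].map(quoteIfNeeded).join('.');
}
```

`federatedTableRef('books-db', 'public', 'books')` yields
`"books-db".public.books`, the form quoted in the bullet above.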
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
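
A sketch of the URL rule, not the attach builder itself. `withRequiredSsl` is
a hypothetical name, and treating only `verify-ca` and `verify-full` as
stronger than `require` (so weaker modes are overwritten) is an assumption
based on libpq's sslmode ordering.

```typescript
// libpq modes that give stronger guarantees than "require".
const STRONGER_THAN_REQUIRE = new Set(['verify-ca', 'verify-full']);

// Hypothetical helper: carry the member's ssl intent into its URL.
function withRequiredSsl(connectionUrl: string, ssl: boolean): string {
  if (!ssl) return connectionUrl;
  const url = new URL(connectionUrl);
  const pinned = url.searchParams.get('sslmode');
  // Leave a stronger pinned mode untouched.
  if (pinned !== null && STRONGER_THAN_REQUIRE.has(pinned)) return connectionUrl;
  url.searchParams.set('sslmode', 'require');
  return url.toString();
}
```

A member with `ssl: true` and a URL that has no `sslmode` gets
`sslmode=require`; one that already pins `verify-full` is returned unchanged.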
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
* fix(cli): double the height of the setup banner t crossbar
* fix(cli): unify setup multi-select hints and make Tab the select key
The six interactive multi-select surfaces in `ktx setup` documented three
different hint voices, one had no hint at all, and they named two different
select keys (Space vs Tab). Tab is the only key that can toggle selection
without colliding with type-to-search input, so make it the single documented
select key everywhere and compose every hint from one shared fragment
vocabulary in prompt-navigation.ts.
- Register `updateSettings({ aliases: { tab: 'space' } })` so Tab toggles flat
multiselects; the alias applies only to non-text prompts, leaving typed
search input (schema/Notion) untouched.
- Add the missing hint to the agent-targets prompt and drop the stray
"Space to select … Esc …" info line plus the now-dead writeSetupInfo helper.
- Replace the schema-scope ad-hoc hint with the searchable-multiselect voice
and standardize "filter" -> "search" vocabulary.
- Delete DEFAULT_TREE_PICKER_HELP_TEXT and the unused TreePickerChrome.helpText
seam; render the shared tree hint instead.
* refactor(cli): show LLM check progress for every setup backend
Rename runLlmHealthCheckWithProgress to validateModelWithProgress and
wrap the Claude subscription and Codex auth probes in the same spinner
progress as the Anthropic API and Vertex backends, so each backend shows
consistent "Checking <provider> LLM" output during setup.
* feat(cli): add ktx-orange progress spinners to setup steps
Add a shared runWithCliSpinner helper and a TTY-aware createCliSpinner:
an animated clack spinner in a terminal, and a static stderr-only spinner
before raw-mode pickers (the table tree picker and demo tour), where the
animated spinner's stdin grab would otherwise corrupt the next prompt.
Wrap the slow setup waits in progress spinners: managed runtime install,
embedding daemon start + first-run model download, embeddings health
check, the connection-test gate, and source validation / dbt clone /
Metabase discovery. Recolor every spinner frame from clack's magenta to
the ktx mascot orange (#FF8A4C) via the static helper and clack's
styleFrame option.
Align the tree with AGENTS.md/CLAUDE.md conventions:
- Rewrite user-facing strings, docs, and tests to lowercase `ktx`
(no bare uppercase `KTX` tokens remain outside literal identifiers).
- Drop the legacy `historicSql` migration path and its now-unused
helpers, per the no-backward-compat rule.
- Remove `as unknown as` / `any` casts: narrow `BaseTool` generics to
`z.ZodObject`, add a typed `createLookerClient`, and delete the dead
`getParametersSchema`/`toAnthropicFormat` pre-AI-SDK helpers.
- Use `InvalidArgumentError` for Commander parse failures.
- Finish the adapter→connector prose conversion in the `ktx.yaml` docs
while keeping the literal `adapters` config key.
Setup wizard flow tweaks:
- Add a reveal-tail password prompt (reveal-password-prompt.ts) that unmasks
the last few characters of a typed/pasted secret, and wire it into the setup
prompt adapter in place of clack's password(); adds the @clack/core dep.
- Reorder wizard select options: surface "Paste a key" before the
environment-variable option across embeddings/models/sources, promote
Metabase/Notion in the source list, put Git URL before Local path, reorder
the Notion crawl-mode choices, and relabel the sources "Done" action.
Query-history filter picker output:
- Collapse the per-template parse-failure lines into a single count in the
setup output and route the full template-id list to --debug stderr.
- Model parse failures as a structured parseFailedTemplateIds field instead of
warning strings.
- Add a privacy-safe query_history_filter_completed telemetry event
(counts/enums only), mirrored into the Python daemon schema.
* feat(cli): block context build when a required connection fails its live test
A context build can take several minutes, so a connection that is
unreachable or misconfigured should stop the build up front instead of
failing partway through. Before the build starts, run a live connection
test for every primary- and context-source connection the build depends
on.
Each test's output is captured in a discarded buffer so raw error text
(and database paths) never reach the user; failures are surfaced only by
connection id and connector type, with a pointer to `ktx connection test
<id>` for the underlying error.
- Interactive setup lets the user fix the connection and retry without
restarting, re-resolving targets so an added/removed/reconfigured
connection is honored.
- `--no-input` exits non-zero and writes a failed context state with a
failureReason, so scripts stop early and setup never reads as ready.
Extract the buffered command IO helper out of setup-databases into
src/io/buffered-command-io.ts so both call sites share one implementation.
* feat(cli): use recovery primitive for database setup
* feat(cli): use recovery primitive for source setup
* docs: document setup connection recovery
* fix(cli): close database recovery gaps
* fix(cli): target failing project in gate hint and preserve missing-input
Address two review findings on the connection-recovery work:
- The connection-gate failure hint emitted `ktx connection test <id>` with no
--project-dir, so a setup run started with `--project-dir ./analytics` pointed
users at cwd/KTX_PROJECT_DIR instead of the project that just failed. Emit the
resolved project dir, matching the contextBuildCommands convention.
- The non-interactive database configure path returned `cancelled`, which the
recovery primitive collapses to `failed`. Sibling paths still report
`missing-input` for absent flags, so incomplete-flag runs were
indistinguishable from real connection failures. The database wrapper now
tracks the configure missing-input signal and restores the `missing-input`
step status; the shared primitive keeps its four outcomes.
Notion's setup path read --source-api-key-ref while writing the auth_token_ref
config field, so --source-auth-token-ref was silently dropped. Align Notion to
the flag=field convention every other connector follows: it now reads
--source-auth-token-ref, and --source-api-key-ref becomes Metabase-only.
Also add validation rejecting any credential-ref flag not applicable to the
chosen --source, with a pointer to the correct flag, closing the silent-drop
class for all connectors.
Update CLI-reference docs, the ktx skill Notion example, and tests.
Fixes KLO-724.
* feat(cli): define full warehouse dialect contract
* test(cli): keep dialect edge tests focused
* fix(cli): stabilize dialect contract foundation
* refactor(connectors): own read-only query preparation
* refactor(connectors): resolve dialects through registry
* refactor(connectors): keep concrete dialect classes internal
* chore(workspace): enforce dialect import boundary
* refactor(cli): resolve relationship dialect at scan boundary
* refactor(cli): use dialect display parsing for entity details
* refactor(cli): use dialect display parsing for warehouse catalog
* refactor(cli): use dialect SQL in relationship workflows
* test(cli): verify solid dialect scan workflow closure
* test: split cli tests from source tree
* refactor(cli): standardize BigQuery scope listing
* feat(sqlite): implement connector scope listing
* test(connectors): cover required table listing
* feat(cli): add warehouse driver registry
* refactor(setup): route scope discovery through driver registry
* refactor(cli): route local query execution through driver registry
* refactor(historic-sql): route dialect support through driver registry
* refactor(cli): test warehouse connections through driver registry
* fix(cli): close driver registry type export gaps
* Improve setup daemon diagnostics
* refactor(setup): centralize rail-prefixed diagnostics + query-history fallback
Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput
into clack.ts so the setup wizard, managed daemons, and embedding/agent steps
share one rail-formatted writer. setup-databases.ts also adds a
"disable query history and retry" option when the schema-context build fails
and query history is the likely culprit, surfaced via a new
failed-query-history-unavailable status.
* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match
The setup picker's KtxTableListEntry was a 2-level { schema, name }, so
qualifiedTableId always wrote db.name into enabled_tables. When BigQuery,
Snowflake, or SQL Server later ran fast ingest, their introspect step filtered
the scope set with scopedTableNames(scope, { catalog: projectId|database, db })
— catalog was non-null on the introspect side but null in the scope refs, so
every entry was rejected, the live-database adapter staged zero table files,
and detect() failed with 'Adapter "live-database" did not recognize fetched
source output'.
Align the picker boundary with the canonical 3-level KtxTableRef:
- Add catalog: string | null to KtxTableListEntry.
- BigQuery/Snowflake/SQL Server listTables populate catalog from the
resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null.
- qualifiedTableId emits catalog.schema.name when catalog is non-null
(resolveEnabledTables already accepts the 3-part shape) and
schemasFromEnabledTables now goes through parseDottedTableEntry so it
recovers the schema correctly from both 2-part and 3-part entries.
- Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker
reuse.
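
The 2-part and 3-part round-trip can be sketched as follows. The function
names match the ones mentioned above, but these bodies and the `TableEntry`
shape are assumptions; names that themselves contain dots are not handled.

```typescript
// Assumed shape, modelled on the description of KtxTableListEntry.
interface TableEntry {
  catalog: string | null;
  schema: string;
  name: string;
}

// Emit catalog.schema.name only when a catalog is present.
function qualifiedTableId(entry: TableEntry): string {
  const parts =
    entry.catalog === null
      ? [entry.schema, entry.name]
      : [entry.catalog, entry.schema, entry.name];
  return parts.join('.');
}

// Recover the schema from either shape instead of assuming it comes first.
function parseDottedTableEntry(id: string): TableEntry {
  const parts = id.split('.');
  if (parts.length === 2) {
    const [schema, name] = parts as [string, string];
    return { catalog: null, schema, name };
  }
  if (parts.length === 3) {
    const [catalog, schema, name] = parts as [string, string, string];
    return { catalog, schema, name };
  }
  throw new Error(`expected schema.name or catalog.schema.name, got: ${id}`);
}
```

Taking the first dotted segment as the schema would return the catalog for
3-part entries, which is the mismatch the parser avoids.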
Update listTables expectations in all seven connector tests and the setup /
picker test fixtures. Add a picker regression test that covers the
catalog-bearing round-trip (save + refine).
* fix(cli): allow debug telemetry under opt-out env
* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm
* refactor(workspace): rewrite @ktx/llm imports to relative paths
* refactor(workspace): fold internal packages into cli
* chore(workspace): gate dead-code with knip production mode
Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.
* refactor(cli): delete internal barrel index.ts files
The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).
This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
(the published package entry).
- Rewrites ~270 source/test files to import each name directly from
the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
`create-warehouse-verification-tools.ts` (the function it defined
locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
live-database/extracted-schema, live-database/structural-sync,
relationship-* feedback/review chain) plus their tests and a
cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
(notion-client, connector barrels in scan/local-scan-connectors
tests) to mock the source files instead.
- Points the maintainer benchmark script
(`scripts/relationship-benchmark-report.mjs`) at source files
instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
production entries only for the benchmark code reached via dist by
the maintainer script.
Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.
`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.
* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly
Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.
Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.
* docs: align "agent clients" and "data agents" terminology
Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.
Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.
* refactor(release): single source of truth for package version
Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.
Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.
- update-public-release-version.mjs rewrites both Python pyproject.toml
files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
@semantic-release/git assets so the release commit back to main
carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
replaced with "?? getKtxCliPackageInfo().version", and
createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
always reflects the most recent release; no sentinel pin to
maintain.
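
The PEP 440 normalization in the first bullet can be sketched as below.
`toPep440` is a hypothetical name, and the sketch only covers plain releases
and `alpha`/`beta`/`rc` pre-releases, which is an assumption about the
versions the script sees.

```typescript
// Semver pre-release labels mapped to their PEP 440 spellings.
const PRE_RELEASE_TAGS: Record<string, string> = { alpha: 'a', beta: 'b', rc: 'rc' };

// Hypothetical helper, not the code in update-public-release-version.mjs.
function toPep440(semver: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(semver);
  if (match === null) throw new Error(`unsupported version: ${semver}`);
  const release = match[1] ?? semver;
  const label = match[2];
  const serial = match[3];
  // A plain release needs no rewriting.
  if (label === undefined || serial === undefined) return release;
  // Drop the "-" and "." separators: 0.1.0-rc.2 -> 0.1.0rc2.
  return `${release}${PRE_RELEASE_TAGS[label]}${serial}`;
}
```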
Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.
* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime
Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.
* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal
Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.
* fix(cli): use real package metadata in print-command-tree
The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead — keeps the rendered tree unchanged and removes a duplicate
source of truth.
* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts
Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with `SUM(embedding_json IS NOT NULL)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.
Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.
* refactor(context): export and describe mapping shape schemas
* feat(context): add driver-schemas module with warehouse drivers
* feat(context): add metabase, looker, lookml driver schemas with mappings
* feat(context): add notion, dbt, metricflow driver schemas
* refactor(context): make connectionSchema a driver-discriminated union
* chore(context): re-export KtxConnectionConfig from project package
* docs(context): add connection driver schema plan
* chore(secrets): allowlist example credentials in driver-schemas fixtures
* test(cli): align metabase fixtures with required api_url field
The driver-discriminated union added in this branch now requires api_url
for metabase connections and a known driver for warehouses. Update slow
CLI test fixtures and assertions so they exercise the new schema:
- ingest.test-utils.ts: add api_url to the prod-metabase fixture.
- setup.test.ts: switch metabase fixture from 'url' to 'api_url'.
- local-scan-connectors.test.ts: invalid-driver/missing-driver tests now
expect the schema error from loadKtxProject (parse-time rejection).
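
A plain-TypeScript sketch of what a driver-discriminated union enforces at
parse time. The driver list, `ConnectionConfig`, and `parseConnection` are
assumptions for illustration; the real connectionSchema may be built with a
schema library and carry more fields.

```typescript
// Assumed warehouse drivers, kept short for the example.
const WAREHOUSE_DRIVERS = ['postgres', 'mysql', 'sqlite'] as const;
type WarehouseDriver = (typeof WAREHOUSE_DRIVERS)[number];

// The `driver` field selects which other fields are required.
type ConnectionConfig =
  | { driver: WarehouseDriver }
  | { driver: 'metabase'; api_url: string };

// Hypothetical parser: reject bad shapes up front instead of at first use.
function parseConnection(raw: Record<string, unknown>): ConnectionConfig {
  const driver = raw['driver'];
  if (driver === 'metabase') {
    const apiUrl = raw['api_url'];
    if (typeof apiUrl !== 'string') {
      throw new Error('metabase connections require api_url');
    }
    return { driver: 'metabase', api_url: apiUrl };
  }
  const known = WAREHOUSE_DRIVERS.find((candidate) => candidate === driver);
  if (known === undefined) {
    throw new Error(`unknown or missing driver: ${String(driver)}`);
  }
  return { driver: known };
}
```

A metabase fixture that still uses `url` fails here, which mirrors why the
fixtures above had to switch to `api_url`.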
Wraps the validation clone in defaultValidateDbt so auth or network
failures surface as a clean validation error instead of an unhandled
RepoFetchError that exits the wizard. Verifies pasted tokens with
testGitRepo before saving them as a secret so bad tokens are caught at
paste time. In interactive setup, validation failures now bounce the
user back to source selection (with an "Edit the connection or pick a
different source" hint) instead of killing the process; --source flag
mode still exits with failed as before.
* docs: add CLI component reuse guidance
* docs: add unified ingest ux design
* Refine unified ingest UX design after adversarial review iteration 1
* Refine unified ingest UX design after adversarial review iteration 2
* Refine unified ingest UX design after adversarial review iteration 3
* feat(cli): route public connection ingest command
* feat(cli): hide standalone scan from public help
* feat(cli): plan public ingest depth and query history
* feat(cli): execute public database ingest facets
* feat(ingest): read connection query history config
* fix(cli): use public ingest wording
* fix(config): stop generating ingest adapter allow lists
* docs: document public ingest command
* test: align ingest surface expectations
* docs: add unified ingest public CLI surface plan
* feat(cli): preflight deep public ingest readiness
* feat(setup): store query history in connection context
* feat(setup): store database context depth
* feat(setup): verify context readiness by database depth
* fix(setup): keep context build foreground only
* fix(config): reject reserved ingest connection ids
* test: close unified ingest v1 expectations
* docs: add unified ingest v1 closure plan
* fix(ingest): bypass adapter allow-list for public source ingest
* fix(ingest): honor query history window intent
* fix(ingest): hide scan internals from public database ingest
* feat(ingest): use foreground view for interactive public ingest
* fix(setup): use schema context and query history wording
* test(cli): verify unified ingest public output
* docs: add unified ingest v1 public output closure plan
* fix(setup): forward query history flags
* fix(setup): prompt for postgres query history
* fix(status): report query history readiness
* fix(ingest): remove legacy public guidance
* fix(ingest): polish foreground retry copy
* docs(examples): use unified query history wording
* chore(ingest): finish public query history cleanup
* docs: add unified ingest v1 query history status cleanup plan
* test(docs): cover unified ingest public docs
* docs: align ingest CLI reference with unified UX
* docs: update context build guides for unified ingest
* docs: update setup and primary source ingest wording
* docs: stop advertising adapter-backed example ingest
* docs: close unified ingest public docs gaps
* docs: add unified ingest v1 docs site closure plan
* fix: render unified ingest foreground warnings
* fix: explain query history schema order
* fix: add public ingest retry guidance
* fix: align setup next steps with unified ingest
* fix: remove scan wording from demo progress
* test: verify unified ingest ux closure
* docs: add unified ingest v1 foreground and retry closure plan
* fix(cli): preserve query-history pull config in public ingest
* fix(cli): omit hidden commands from docs command tree
* test(cli): close unified ingest final public surface checks
* docs: add unified ingest v1 final public surface closure plan
* fix(cli): use public source labels in ingest reports
* fix(cli): suppress low-level public ingest output
* test(cli): verify unified ingest public plain output
* docs: add unified ingest v1 public plain output closure plan
* fix(cli): add public ingest copy sanitizers
* fix(cli): sanitize public ingest progress copy
* fix(cli): rename setup schema scope prompt
* docs(plan): add progress copy closure; test: align setup back-nav fixture
Adds the iter9 plan and updates the setup back-navigation test fixture
to pass disableQueryHistory plus listSchemas/listTables stubs that the
unified ingest setup step now requires.
* docs(plan): add final ux labels plan with narrowed label scans
* fix(cli): aggregate unsupported query-history warnings
* fix(cli): align setup database labels
* test(cli): fix setup database test type-check
* fix(cli): remove primary-source wording from setup output
* test(cli): verify unified ingest setup closure
* docs(plan): add unified ingest v1 verification copy closure plan
* fix(cli): remove top-level scan command
* fix(cli): remove legacy ingest and wiki commands
* Merge scan into ingest flow
* feat(cli): split ingest progress into per-phase rows, rename work units to tasks
Each database target in the unified ingest dashboard now renders one row per
real subprocess (Schema, then Query history when enabled) instead of a single
combined bar. Each phase has its own monotonic 0-100% bar, so progress never
snaps back to zero when historic-sql starts after the scan completes.
Completed phases keep their final bar, summary, and elapsed time visible as
an inline audit trail; queued and skipped phases are shown explicitly.
Also rename user-facing "work units" / "Failed work units" to "tasks" /
"Failed tasks" in ingest output and parseIngestSummary. The parser still
accepts the legacy "Work units:" wording in captured output for backward
compatibility. Internal memory-flow event names and type fields are unchanged.
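A minimal sketch of the backward-compatible parsing described above. The
function name parseTaskSummary, the TaskSummary shape, and the exact
"Label: <number>" line format are assumptions for illustration; the real
parseIngestSummary may look quite different.

```typescript
// Hypothetical shape; not the project's actual summary type.
interface TaskSummary {
  total: number | null;
  failed: number | null;
}

// Accept the current "Tasks:" label as well as the legacy "Work units:" label.
// The "Label: <number>" line format is an assumption.
const TOTAL_RE = /^\s*(?:Tasks|Work units):\s*(\d+)\s*$/im;
const FAILED_RE = /^\s*Failed (?:tasks|work units):\s*(\d+)\s*$/im;

function parseTaskSummary(output: string): TaskSummary {
  const total = TOTAL_RE.exec(output);
  const failed = FAILED_RE.exec(output);
  return {
    total: total ? Number(total[1]) : null,
    failed: failed ? Number(failed[1]) : null,
  };
}
```

Matching both labels in a single alternation keeps old captured output
parseable without a separate legacy code path.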
* Fix test harness failures
* Fix CI smoke checks
---------
Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
* feat(cli): add edit flow for primary database connections in setup
Allow users to edit existing primary database connections during setup
instead of only adding new ones. Preselects existing values (URL, schemas,
tables) so users can adjust without re-entering everything.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(cli): add edit flow for context source connections in setup
Allow users to edit existing context source connections during setup.
Preselects existing values (URLs, credentials, repo details) and offers
a "Keep existing credential" option for sensitive fields.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): rename "Add more" to "Add additional" in primary sources menu
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update every setup step to write completed_steps to .ktx/setup/state.json
instead of ktx.yaml, stripping legacy entries from config on write.
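A rough sketch of the two halves of this change, kept free of file I/O. The
SetupState and KtxConfig shapes, both function names, and the assumption that
the legacy completed_steps key sits at the top level of the config are
illustrative only and are not taken from the codebase.

```typescript
// Assumed shape of the data serialized to .ktx/setup/state.json.
interface SetupState {
  completed_steps: string[];
}

// Assumed loose shape of the parsed ktx.yaml config.
type KtxConfig = Record<string, unknown>;

// Return a state with `step` recorded exactly once.
function markStepCompleted(state: SetupState, step: string): SetupState {
  if (state.completed_steps.includes(step)) return state;
  return { completed_steps: [...state.completed_steps, step] };
}

// Return a copy of the config without the legacy key, so the next write of
// ktx.yaml no longer carries it. Assumes the key is top-level.
function stripLegacyCompletedSteps(config: KtxConfig): KtxConfig {
  const copy: KtxConfig = { ...config };
  delete copy["completed_steps"];
  return copy;
}
```

A setup step would then serialize the returned state (for example with
JSON.stringify(state, null, 2)) to the state file and write the stripped
config back to ktx.yaml.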
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>