* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
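Sketch of the tilde-slicing bug described above. This is an illustration, not the shipped resolveStringReference: `expandTilde` and `expandTildeBuggy` are hypothetical names, and `posix.resolve` is used only so the example behaves the same on every platform.

```typescript
import { homedir } from "node:os";
import { posix } from "node:path";

// Hypothetical helper: expands a leading "~" or "~/" to the home directory.
export function expandTilde(input: string, home: string = homedir()): string {
  if (input === "~") return home;
  if (input.startsWith("~/")) {
    // Drop BOTH characters of "~/" so the remainder is a relative path.
    return posix.resolve(home, input.slice(2));
  }
  return input;
}

// The latent bug: slicing only "~" leaves "/path", which resolve() treats as
// absolute, so the home directory is silently ignored.
export function expandTildeBuggy(input: string, home: string = homedir()): string {
  return input.startsWith("~/") ? posix.resolve(home, input.slice(1)) : input;
}
```

With `home` set to `/home/alice`, the fixed helper maps `~/keys/sa.json` under the home directory, while the buggy variant resolves it against the filesystem root.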
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
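Sketch of the reserved-prefix bypass and dialect resolution described above. The identifiers `assertSafeConnectionId`, `isReservedConnectionId`, and `FEDERATED_CONNECTION_ID` appear in this log; their bodies here, the `resolveComputeDialect` name, and the `driverFor` callback are assumptions made for illustration.

```typescript
const RESERVED_PREFIX = "_ktx_";
const FEDERATED_CONNECTION_ID = "_ktx_federated";

function isReservedConnectionId(id: string): boolean {
  return id.startsWith(RESERVED_PREFIX);
}

// Assumed behaviour: user-supplied ids may not use the reserved prefix.
function assertSafeConnectionId(id: string): void {
  if (isReservedConnectionId(id)) {
    throw new Error(`Connection id "${id}" uses the reserved ${RESERVED_PREFIX} prefix`);
  }
}

// Hypothetical resolver: the virtual connection short-circuits to duckdb
// before the safety check; everything else falls back to a driver lookup.
function resolveComputeDialect(
  connectionId: string,
  driverFor: (id: string) => string | undefined,
): string {
  if (connectionId === FEDERATED_CONNECTION_ID) return "duckdb";
  assertSafeConnectionId(connectionId);
  return driverFor(connectionId) ?? "postgres";
}
```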
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
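Sketch of the attach-statement generation touched by the last few commits: quoted catalog aliases, escaped quotes in the attach target, and one INSTALL/LOAD pair per distinct driver type. `buildAttachStatements` and `FederatedMember` are named in this log, but the field names, the three-driver union, and the function body below are assumptions.

```typescript
// Assumed membership rule: the drivers this log mentions in a federation context.
type AttachType = "postgres" | "mysql" | "sqlite";

interface FederatedMember {
  id: string;     // connection id, reused as the DuckDB catalog alias
  type: AttachType;
  target: string; // attach string, URL, or file path
}

// Double embedded quotes so hyphenated ids and quoted paths stay valid SQL.
const quoteIdentifier = (value: string): string => `"${value.replace(/"/g, '""')}"`;
const quoteLiteral = (value: string): string => `'${value.replace(/'/g, "''")}'`;

function buildAttachStatements(members: readonly FederatedMember[]): string[] {
  const statements: string[] = [];
  const loaded = new Set<AttachType>();
  for (const member of members) {
    if (loaded.has(member.type)) continue; // emit INSTALL/LOAD once per type
    loaded.add(member.type);
    statements.push(`INSTALL ${member.type};`, `LOAD ${member.type};`);
  }
  for (const member of members) {
    statements.push(
      `ATTACH ${quoteLiteral(member.target)} AS ${quoteIdentifier(member.id)} ` +
        `(TYPE ${member.type}, READ_ONLY);`,
    );
  }
  return statements;
}
```

Two sqlite members therefore produce one INSTALL, one LOAD, and two ATTACH statements.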
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
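Sketch of the nested try/finally shape described above. `NativeInstance` and `NativeConnection` are hypothetical synchronous stand-ins, not the @duckdb/node-api types; only the cleanup ordering is the point.

```typescript
interface NativeConnection {
  run(sql: string): unknown[];
  closeSync(): void;
}

interface NativeInstance {
  connect(): NativeConnection;
  closeSync(): void;
}

function runFederatedQuery(instance: NativeInstance, sql: string): unknown[] {
  try {
    const connection = instance.connect(); // may throw before any connection exists
    try {
      return connection.run(sql);
    } finally {
      connection.closeSync(); // may throw; the outer finally still runs
    }
  } finally {
    instance.closeSync(); // always releases the native instance
  }
}
```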
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
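Sketch of the coercion described above. `coerceBigints` is a hypothetical name, and the walker assumes rows are plain JSON-like values (arrays, plain objects, primitives).

```typescript
// JSON.stringify throws a TypeError on bigint, so convert before serializing.
function coerceBigints(value: unknown): unknown {
  if (typeof value === "bigint") return Number(value);
  if (Array.isArray(value)) return value.map(coerceBigints);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = coerceBigints(inner);
    }
    return out;
  }
  return value;
}
```

Applying it once where rows leave the executor means callers can use plain `JSON.stringify` without a replacer.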
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with an
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
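Sketch of the membership-based join pruning from the second bullet above. `Source`, `Join`, and `pruneJoins` are hypothetical shapes; the point is that a join survives only if its target is in the connection's own source set, regardless of what its first dotted segment looks like.

```typescript
interface Join {
  target: string;
}

interface Source {
  name: string;
  joins: Join[];
}

function pruneJoins(sources: readonly Source[]): Source[] {
  const ownNames = new Set(sources.map((source) => source.name));
  return sources.map((source) => ({
    ...source,
    // Keep joins to sources that exist here; drop orphan targets.
    joins: source.joins.filter((join) => ownNames.has(join.target)),
  }));
}
```

A source named `books.authors` is kept as a join target even when a sibling connection happens to be called `books`, whereas a target absent from the set is dropped.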
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
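Sketch of the URL rewrite described above. `withSslMode` is a hypothetical name, and treating `verify-ca` and `verify-full` as the "stronger" modes (with anything weaker overwritten) is an assumption based on libpq's sslmode ordering.

```typescript
const STRONGER_SSL_MODES = new Set(["verify-ca", "verify-full"]);

function withSslMode(url: string, ssl: boolean): string {
  if (!ssl) return url;
  const parsed = new URL(url);
  const current = parsed.searchParams.get("sslmode");
  // Leave a stricter, explicitly pinned mode untouched.
  if (current !== null && STRONGER_SSL_MODES.has(current)) return url;
  parsed.searchParams.set("sslmode", "require");
  return parsed.toString();
}
```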
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
* feat(cli): show cached update notices after commands
* docs(cli): describe update notices
* fix(cli): type update check environment
* fix(cli): decouple update notice display from refresh and harden suppression
Display a cached "update available" notice based solely on the lastNoticeAt
24h throttle, independent of checkedAt refresh freshness, matching the design's
independent display/refresh decisions. Suppress the check unconditionally under
--json, CI, and non-TTY before consulting output-mode preferences, so a
KTX_OUTPUT=pretty override can no longer make CI/non-TTY contexts phone npm.
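Sketch of the two independent decisions described above. `lastNoticeAt` and the 24h throttle come from this log; the type names, function names, and the simple inequality version comparison are assumptions.

```typescript
interface RunContext {
  json: boolean;  // --json was passed
  ci: boolean;    // running under CI
  isTty: boolean; // stdout is an interactive terminal
}

interface UpdateCache {
  latestVersion?: string;
  lastNoticeAt?: number; // epoch ms of the last notice shown
}

const NOTICE_THROTTLE_MS = 24 * 60 * 60 * 1000;

// Checked first, before any output-mode preference is consulted.
function updateCheckSuppressed(ctx: RunContext): boolean {
  return ctx.json || ctx.ci || !ctx.isTty;
}

// Display depends only on the notice throttle, not on refresh freshness.
function shouldShowNotice(
  cache: UpdateCache,
  currentVersion: string,
  now: number,
  ctx: RunContext,
): boolean {
  if (updateCheckSuppressed(ctx)) return false;
  if (cache.latestVersion === undefined || cache.latestVersion === currentVersion) return false;
  return cache.lastNoticeAt === undefined || now - cache.lastNoticeAt >= NOTICE_THROTTLE_MS;
}
```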
Setup wizard flow tweaks:
- Add a reveal-tail password prompt (reveal-password-prompt.ts) that unmasks
the last few characters of a typed/pasted secret, and wire it into the setup
prompt adapter in place of clack's password(); adds the @clack/core dep.
- Reorder wizard select options: surface "Paste a key" before the
environment-variable option across embeddings/models/sources, promote
Metabase/Notion in the source list, put Git URL before Local path, reorder
the Notion crawl-mode choices, and relabel the sources "Done" action.
Query-history filter picker output:
- Collapse the per-template parse-failure lines into a single count in the
setup output and route the full template-id list to --debug stderr.
- Model parse failures as a structured parseFailedTemplateIds field instead of
warning strings.
- Add a privacy-safe query_history_filter_completed telemetry event
(counts/enums only), mirrored into the Python daemon schema.
* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live
The agent-loop step budget only counted completed mcp_tool_call items, so
built-in command_execution steps (which the public Codex SDK/CLI surface can
still expose) never decremented the budget, letting ingest/reconciliation run
past stepBudget until Codex stopped on its own. onStepFinish was also replayed
only after the whole stream drained, so live work_unit_step / reconciliation
progress appeared stuck until the Codex process exited.
collectEvents is now the single live step accumulator: it counts every
completed agent-action item via a shared isCompletedAgentStep predicate
(command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish
as each step completes, and enforces the budget on that broader count. A
no-tool turn still counts as one step. toolFailures stays MCP-specific, since a
non-zero command exit is normal agent exploration, not a loop failure.
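Sketch of the broader step count described above. `isCompletedAgentStep`, `onStepFinish`, and the four item types come from this log; `StreamItem` is a simplified stand-in for the SDK event shape, and the synchronous array loop stands in for the real event stream.

```typescript
const AGENT_STEP_TYPES = new Set([
  "command_execution",
  "mcp_tool_call",
  "file_change",
  "web_search",
]);

interface StreamItem {
  type: string;
  completed: boolean;
}

function isCompletedAgentStep(item: StreamItem): boolean {
  return item.completed && AGENT_STEP_TYPES.has(item.type);
}

function collectSteps(
  items: readonly StreamItem[],
  stepBudget: number,
  onStepFinish: (stepCount: number) => void,
): { steps: number; budgetReached: boolean } {
  let steps = 0;
  for (const item of items) {
    if (!isCompletedAgentStep(item)) continue;
    steps += 1;
    onStepFinish(steps); // fired as each step completes, not after the drain
    if (steps >= stepBudget) return { steps, budgetReached: true };
  }
  // A turn with no tool activity still counts as one step.
  return { steps: Math.max(steps, 1), budgetReached: false };
}
```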
* test: align ingest llm-guard assertions with codex backend
The skip-llm ingest guard message now lists codex as a valid backend and
mentions a Claude Code/Codex session plus a codex setup hint, but this slow
suite test still asserted the pre-codex wording. Update it to match the
production message (already covered by the local-bundle-runtime unit test) and
add the codex setup-line assertion.
* fix: treat codex error:null tool calls as success
The Codex SDK serializes error: null on successful mcp_tool_call items, so
the failure check (item.error !== undefined) flagged every successful tool
call as failed with the empty-payload default "Codex turn failed". This
killed every ingest work unit under the codex backend before it could
produce a patch.
Key on status === 'failed' (authoritative, always set) and only treat a
populated error object as a failure. Add a regression test built from a
verbatim real-SDK event capture.
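Sketch of the corrected failure check. `McpToolCallItem` is a simplified, assumed shape, and "populated" is interpreted here as a non-empty `message`; the real check may differ.

```typescript
interface McpToolCallItem {
  type: "mcp_tool_call";
  status: "in_progress" | "completed" | "failed";
  error?: { message?: string } | null;
}

// Old check: `item.error !== undefined` is true for `error: null`,
// so every successful call looked like a failure.
function isFailedToolCallBuggy(item: McpToolCallItem): boolean {
  return item.error !== undefined;
}

function isFailedToolCall(item: McpToolCallItem): boolean {
  if (item.status === "failed") return true;
  // `!= null` excludes both null and undefined.
  return item.error != null && typeof item.error.message === "string" && item.error.message.length > 0;
}
```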
* fix: default codex backend to gpt-5.5 and report real probe errors
The previous default gpt-5.3-codex is an API-key-only model that the OpenAI
API rejects under ChatGPT-account (subscription) auth, so codex status/setup
failed with a misleading "authentication is not usable" message even though
auth was fine.
- Default codex model is now gpt-5.5 (works on both subscription and API-key
auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and
keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth
failure and surfaces the real API error: collectEvents retains stream
events when the SDK throws on a non-zero exit, and the API error JSON
envelope is unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.
* fix: require llm.models.default in status and match codex probe remediation
Status reported a project ready when a non-none LLM backend was configured
without llm.models.default, but the runtime (resolveModelSlots) hard-requires
it, so ingest/scan/memory threw after `ktx status` said the project was usable.
buildLlmStatus now fails for any non-none backend missing models.default and no
longer invents a fallback model for claude-code/codex.
Codex probe failures now carry a category-matched fix: a model-access failure
points the user to llm.models.default instead of the auth/install remediation.
runCodexAuthProbe returns the fix and status consumes it; the message stays
self-sufficient so setup output is unchanged.
Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx
states --llm-model only accepts codex/default or gpt-*/codex-* ids.
Repaired four doctor fixtures that configured a backend without models.default
(the now-correctly-blocked config) and added coverage for the new behavior.
Replace the tall portrait README ingestion SVG with two landscape
diagrams — "1 · Ingestion" (build the context layer) and "2 · Serving"
(agents query it through MCP) — wired in as transparent 2x PNGs that
read on GitHub light and dark.
Add docs-site/diagram-studio: a static React Flow page with custom
themed nodes and the inlined ktx mascot that renders both diagrams and
exports them to PNG via html-to-image (the diagrams' reproducible
source). Remove the superseded ingestion-flow SVGs.
* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm
* refactor(workspace): rewrite @ktx/llm imports to relative paths
* refactor(workspace): fold internal packages into cli
* chore(workspace): gate dead-code with knip production mode
Turn on production-mode knip plus an autofix run in pre-commit and the
`pnpm dead-code` script, document the `/** @internal */` convention for
test-only exports in AGENTS.md, annotate test-only exports across the
CLI with that JSDoc, and drop dead exports/wrappers the new gate
surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`,
`createLocalScanEnrichmentProvidersFromConfig`,
`PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).
Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit
production entries so cross-package barrel leaks are caught.
* refactor(cli): delete internal barrel index.ts files
The 34 `index.ts` re-export barrels inside `packages/cli/src/` were
holdovers from the pre-fold multi-workspace structure. Post-fold-in they
served no production purpose: external consumers go through the single
package main entry, and in-repo callers mostly imported through them
only because the path was short. Internally, knip flagged most barrel
re-exports as production-dead (only reached via tests).
This change:
- Deletes every internal barrel except `packages/cli/src/index.ts`
(the published package entry).
- Rewrites ~270 source/test files to import each name directly from
the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to
`create-warehouse-verification-tools.ts` (the function it defined
locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match
the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*,
live-database/extracted-schema, live-database/structural-sync,
relationship-* feedback/review chain) plus their tests and a
cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths
(notion-client, connector barrels in scan/local-scan-connectors
tests) to mock the source files instead.
- Points the maintainer benchmark script
(`scripts/relationship-benchmark-report.mjs`) at source files
instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit
production entries only for the benchmark code reached via dist by
the maintainer script.
Net: 413 files changed, ~1.2k insertions, ~9.4k deletions.
`pnpm run dead-code` (Biome + knip default + knip production) and
`pnpm run type-check` are clean; 2277 tests pass.
* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly
Promote the CLI workspace package to the public name `@kaelio/ktx` and
drop the separate `scripts/build-public-npm-package.mjs` wrapper. The
CLI package is now publishable in place (`publishConfig.access: public`,
`provenance: true`), so artifact packing uses `pnpm pack` against
`packages/cli/` instead of assembling a parallel package tree.
Updates all workspace filter invocations, docs, tests, and release
readiness checks to reference the new package name, and folds the
tarball-name helper into `scripts/public-npm-release-metadata.mjs`.
* docs: align "agent clients" and "data agents" terminology
Replace "client agents" with "agent clients" and "database agents" with
"data agents" across AGENTS.md, README.md, the docs-site copy, and the
matching setup-agents test description, matching the canonical
vocabulary in docs/terminology.md.
Also moves packages/cli/tsconfig.json's tsBuildInfoFile from
node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive
node_modules reinstalls.
* refactor(release): single source of truth for package version
Make packages/cli/package.json the single source of truth for the
@kaelio/ktx version. publicNpmPackageVersion() now reads it directly,
so artifact filenames, release-readiness checks, and the Python wheel
version all derive from one field. The duplicate
release-policy.json.publicNpmPackageVersion is removed.
Previously the two fields could drift: tarballs were named
kaelio-ktx-0.4.1.tgz while internally containing
@kaelio/ktx@0.0.0-private.
- update-public-release-version.mjs rewrites both Python pyproject.toml
files (ktx-daemon, ktx-sl) alongside the npm package.jsons,
normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to
@semantic-release/git assets so the release commit back to main
carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are
replaced with "?? getKtxCliPackageInfo().version", and
createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree
always reflects the most recent release; no sentinel pin to
maintain.
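Sketch of the PEP 440 normalization mentioned in the first bullet above. `toPep440` is a hypothetical name, and the sketch only handles plain `x.y.z` versions with an optional `alpha`, `beta`, or `rc` prerelease; update-public-release-version.mjs may cover more cases.

```typescript
const PEP440_LABELS: Record<string, string> = { alpha: "a", beta: "b", rc: "rc" };

function toPep440(version: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(version);
  if (match === null) throw new Error(`Unsupported version: ${version}`);
  const release = match[1];
  const label = match[2];
  const serial = match[3];
  if (label === undefined) return release; // stable versions pass through
  return `${release}${PEP440_LABELS[label]}${serial}`;
}
```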
Verified: pnpm run artifacts:build now produces
kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with
@kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and
2287 vitests + 173 script tests pass.
* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime
Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and
scan command entrypoints so tests can stub them, and teach
resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime
feature when ktx.yaml selects sentence-transformers.
* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal
Both symbols are consumed only by status-project.test.ts. Annotating with
/** @internal */ keeps knip's production-mode check clean without changing
runtime behavior.
* fix(cli): use real package metadata in print-command-tree
The stubbed package name embedded a forbidden product identifier that
tripped the boundary check in CI. Read the metadata from package.json
instead — keeps the rendered tree unchanged and removes a duplicate
source of truth.
* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts
Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer
source counts, computed with `SUM(embedding_json IS NOT NULL)` over
`knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to
"Wiki" (canonical per `docs/terminology.md`) and rename the matching
`localStats.knowledgePages` field to `localStats.wikiPages`.
Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those
duplicated the per-surface rows above. Disk now reports only actual byte
usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` /
`semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry`
helpers, and the `filter` arg on `summarizeDir` are removed.
* feat(release): commit version files back to branch for one-version-everywhere
Add @semantic-release/git to the release plugin chain so the bumped
package.json, release-policy.json, and packages/cli/package.json land
back on the release branch after publish. This keeps the published npm
version and the in-repo version files in sync, so local builds from
main report the released version (e.g. ktx --version and the daemon
/health endpoint via KTX_DAEMON_VERSION).
Also widens assertPublicNpmReleaseTag to accept branch-<sanitized> tags,
unblocking branch RC publishes that pass through update-public-release-
version.mjs.
* test(release): pin GITHUB_REF_NAME in main-rc releaseTag assertion
The bare releaseTag('rc') call defaulted to process.env.GITHUB_REF_NAME,
which on PR CI is the merge ref (e.g. 186/merge) and yields
'branch-186-merge' instead of 'next'. Pass an explicit { GITHUB_REF_NAME:
'main' } so the test exercises the main-rc path regardless of CI env.
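Sketch of why the unpinned call was environment-dependent. This `releaseTag` is a hypothetical reconstruction: the `branch-<sanitized>` form and the `next` tag for main RCs come from this log, while the `latest` tag for stable releases, the fallback to `main`, and the sanitizing regex are assumptions.

```typescript
interface ReleaseEnv {
  GITHUB_REF_NAME?: string;
}

function releaseTag(channel: "rc" | "stable", env: ReleaseEnv): string {
  if (channel === "stable") return "latest"; // assumed npm dist-tag
  const branch = env.GITHUB_REF_NAME ?? "main";
  if (branch === "main") return "next";
  // Collapse anything that is not alphanumeric into a single dash.
  return `branch-${branch.replace(/[^a-zA-Z0-9]+/g, "-")}`;
}
```

Passing `{ GITHUB_REF_NAME: 'main' }` explicitly makes the assertion independent of the ref that CI happens to export.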
* chore: standardize daemon naming on "KTX daemon"
Replace inconsistent names ("KTX Python daemon", "KTX local embeddings
daemon", "KTX managed daemon", "Python daemon") with the single name
"KTX daemon" in CLI output, errors, command descriptions, test
assertions, smoke scripts, docs, AGENTS.md, issue templates, and
codecov flags. The daemon is a portable compute server with endpoints
for SQL analysis, semantic layer, LookML, database introspection, and
embeddings; the previous labels misrepresented it as embeddings-only or
exposed implementation details ("Python", "managed").
The "KTX Python runtime" concept (installed interpreter + packages) is
deliberately left as-is — it is a separate concept from the daemon
process.
* refactor(release): drop release-policy.json runtime dep and next branch
Strips the release-policy.json fallback from release-version.ts so the CLI
reads its version straight from packages/cli/package.json: a dev checkout reports
0.0.0-private, and an installed @kaelio/ktx reports the real semver baked into the
published package.json.
KtxCliPackageInfo collapses to { name, version, contextPackageName }; /health
no longer depends on version files surviving past a CI run.
Replaces the dual-branch (main + next) semantic-release model with a single-
branch model on main. RCs and stable releases interleave on the same branch via
{ name: 'main', prerelease: 'rc', channel: 'next' } / ['main']. Drops
@semantic-release/git and @semantic-release/changelog (nothing is committed
back to the repo on any channel) and the workflow's "Prepare next prerelease
branch" step plus the KTX_PRERELEASE_BRANCH plumbing. The git tag plus the
published npm artifact carry the version forward.
Updates docs/release.md, removes the two now-unused devDeps, regenerates
pnpm-lock.yaml. 611/611 @ktx/cli tests, 173/173 script tests, type-check,
biome, knip all clean.
* fix(release): don't throw on non-main branches at config-load time
knip loads .releaserc.cjs on every PR run, where GITHUB_REF_NAME is the
merge ref (e.g. 180/merge). The previous version of releaseBranches threw
immediately when the branch wasn't main, which made knip fail to evaluate
the config and then mis-flag @semantic-release/exec as an unused dep.
semantic-release already refuses to publish when the current branch doesn't
match a configured release branch, so the explicit throw was redundant.
Drop it (and the unused currentBranch helper) and replace the
"rejects releases from non-main" assertion with one that exercises a CI-
shaped GITHUB_REF_NAME and confirms the config loads.
* feat(docs): visualize KTX ingestion with ReactFlow diagram
Reframe the introduction around the two user-facing ingestion outputs (wiki
and executable semantic layer) and replace the static product-mechanics card
flow with a ReactFlow diagram: sources fan into a sequential ingest pipeline,
which forks into wiki and semantic-layer outputs connected by a bidirectional
"references" edge. Drop the .ktx/raw-sources internal-implementation rows from
the intro table and update the content test to guard the new copy.
* Improve KTX docs introduction
* feat(docs): animate ingestion flow with running dots
Replace static smoothstep edges in the introduction page's ingestion
diagram with a custom animated edge that runs glowing cyan dots along
each path, conveying the source → stage → output flow. Dot duration
scales with path length, and the dots are hidden under prefers-reduced-motion.
* feat(docs): route ingestion atoms through full source→output journey
Replace per-edge dots with full-journey particles: each atom is born at
a source, threads the entire stage chain, and lands at either the wiki
or semantic layer. Particles are tinted by their source's accent so
the origin is legible. Each source produces exactly 2 atoms (8 total)
to guarantee every input is visibly active, while the destination and
begin offsets are randomized per page load. Particles populate on
client mount to avoid hydration mismatch, and are hidden under
prefers-reduced-motion.
* fix(context): merge overlay columns onto manifest columns by name
composeOverlay was appending overlay columns to the manifest column list,
producing duplicate entries when dbt/metabase overlays declared a column
just to attach descriptions. The duplicates carried no `type`, so the
pydantic SourceDefinition rejected them at semantic-query time and broke
`ktx sl query` for every overlay-backed measure. Now overlay columns
match base columns by name (case-insensitive): same-name entries merge
onto the manifest (overlay fields win, type/role fall back to the base,
descriptions merge per source key) and only new names append.
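Sketch of the merge-by-name rule described above. `Column` and `mergeColumns` are hypothetical; keeping the base column's casing for the merged name is an assumption.

```typescript
interface Column {
  name: string;
  type?: string;
  role?: string;
  descriptions?: Record<string, string>;
}

function mergeColumns(base: readonly Column[], overlay: readonly Column[]): Column[] {
  const merged: Column[] = base.map((column) => ({ ...column }));
  const indexByName = new Map<string, number>();
  merged.forEach((column, index) => indexByName.set(column.name.toLowerCase(), index));

  for (const patch of overlay) {
    const key = patch.name.toLowerCase();
    const index = indexByName.get(key);
    if (index === undefined) {
      // Genuinely new column: append it.
      indexByName.set(key, merged.length);
      merged.push({ ...patch });
      continue;
    }
    const existing = merged[index];
    merged[index] = {
      ...existing,
      ...patch, // overlay fields win
      name: existing.name,
      type: patch.type ?? existing.type, // fall back to the manifest type
      role: patch.role ?? existing.role,
      descriptions: { ...existing.descriptions, ...patch.descriptions },
    };
  }
  return merged;
}
```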
* refactor(sl): split overlay columns from column_overrides and enforce TS/Python wire contract
Overlay sources now have two distinct collections: `columns:` for computed
columns (requiring `expr` + `type`) and `column_overrides:` for metadata
patches to inherited manifest columns. Composing or loading an overlay that
mixes the two — or references an unknown column — fails with a typed error.
Introduce `ResolvedSemanticLayerSource` / `resolvedSourceSchema` /
`toResolvedWire` as the strict shape sent to the Python engine, and add a
schema contract test that diffs Zod against the Pydantic JSON schema dumped
by `python -m semantic_layer dump-schema`. `SourceDefinition` is now
`extra="forbid"` on the Python side.
`loadAllSources` surfaces per-file load errors instead of swallowing them,
so validation/query paths can report manifest shard parse failures.
* fix(context): make scan description generation resilient and quiet
A transient sampleTable failure during ingest used to take out every
table in a connection: generateTableDescription returned a hardcoded
'Table not found' string into descriptions.ai, and KtxDescriptionGenerator
was constructed without a logger, so the failure left no trail anywhere.
- sampleTable / sampleColumn calls retry 3x with 200/400/800ms backoff,
honouring KtxScanContext.signal via a new KtxAbortedError.
- On retry exhaustion or missing capability, table generation falls back
to a metadata-only prompt built from column name / native type / comment
/ rawDescriptions. The column path follows the same rule -- call the
LLM when either samples or rawDescriptions are available; skip only
when both are absent.
- Logger is now threaded from KtxScanContext into the generator. Failures
emit structured KtxScanWarning entries (new description_fallback_used
code, plus existing sampling_failed / enrichment_failed /
connector_capability_missing). ktx scan groups warnings by code so a
batch of identical failures collapses to one summary line plus sample.
- Returns null on failure instead of the 'Table not found' sentinel; the
manifest writer's existing guard already skips empty descriptions, so
schema YAML no longer carries misleading text. SCAN_MANAGED_DESCRIPTION_KEYS
already strips stale 'ai' on merge, so existing YAML clears on next run.
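Sketch of the retry policy from the first bullet above. `KtxAbortedError` is named in this log; `backoffDelays`, `withRetry`, the injectable `sleep`, and the structural `signal` type are assumptions, and the null-on-exhaustion fallback is left to the caller.

```typescript
class KtxAbortedError extends Error {
  constructor() {
    super("Operation aborted");
    this.name = "KtxAbortedError";
  }
}

// Doubling schedule: backoffDelays(3, 200) gives 200, 400, 800.
function backoffDelays(retries: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i += 1) delays.push(baseMs * 2 ** i);
  return delays;
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: {
    delaysMs: readonly number[];
    signal?: { aborted: boolean };
    sleep?: (ms: number) => Promise<void>;
  },
): Promise<T> {
  const sleep = options.sleep ?? ((ms: number) => new Promise<void>((done) => setTimeout(done, ms)));
  let lastError: unknown;
  // One initial attempt plus one retry per configured delay.
  for (let attempt = 0; attempt <= options.delaysMs.length; attempt += 1) {
    if (options.signal?.aborted) throw new KtxAbortedError();
    try {
      return await operation();
    } catch (error) {
      lastError = error;
    }
    if (attempt < options.delaysMs.length) await sleep(options.delaysMs[attempt]);
  }
  throw lastError;
}
```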
Also suppress AI SDK v6 'system in messages' warning: pull system messages
out of KtxMessageBuilder.wrapSimple's output via a new splitKtxSystemMessages
helper and pass them top-level to generateText (preserves cacheControl
providerOptions on the SystemModelMessage). Agent-runner's local
splitSystemPromptMessages dedupes onto the shared helper.
* test(docs): align examples-docs assertions with revamped docs
PR #103 (setup/guide doc revamp) reworded several CLI examples and
connection labels; the assertions in scripts/examples-docs.test.mjs
still referenced the pre-revamp wording and were failing in CI on main.
Update the regexes to match the post-revamp content:
- drop the `--json` flag from the sl-query example expectation
- move the `Driver:` / `Status: ok` probe to the connection reference,
which is where that output now lives (driver id is lowercase
`postgres`, not the display name `PostgreSQL`)
- drop the obsolete `Install \`uv\`...` troubleshooting line
- accept `<connectionId>` everywhere; the docs no longer use the
hyphenated `<connection-id>` form
- match the `warehouse` connection id used in the quickstart instead of
the `postgres-warehouse` id only used in the README and setup ref
* fix(sl): skip TS/Python schema contract test when uv is unavailable
The TypeScript checks CI job does not install uv or Python, so the
module-level `execFileSync('uv', ...)` in schemas.contract.test.ts threw
ENOENT and failed the suite. Wrap the schema dump in a try/catch and
guard the describe block with `describe.skipIf` so the test skips in
environments without uv. Local dev and any CI job that has uv on PATH
still run the cross-language contract assertion.
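Sketch of the availability probe behind the skip guard. `commandAvailable` is a hypothetical helper, and probing with `--version` is an assumption; the real test wraps the schema dump itself in the try/catch.

```typescript
import { execFileSync } from "node:child_process";

// Returns false when the binary is missing (ENOENT) or exits non-zero.
function commandAvailable(command: string, args: string[] = ["--version"]): boolean {
  try {
    execFileSync(command, args, { stdio: "ignore" });
    return true;
  } catch {
    return false;
  }
}

const hasUv = commandAvailable("uv");

// In a Vitest suite the flag would gate the block, for example:
//   describe.skipIf(!hasUv)("TS/Python schema contract", () => { /* ... */ });
```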