feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
---
title: Cross-database federation
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
description: How ktx federates postgres, mysql, sqlite, and duckdb connections so a single read-only SQL query can join across them without copying data.
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
---
Cross-database federation lets a single read-only SQL query join tables that
live in different databases. **ktx** achieves this by embedding DuckDB and
using its `ATTACH` mechanism to connect each member database read-only. The
join executes inside DuckDB at query time — live data, no ETL, no copy.
You run federated queries as raw SQL against the `_ktx_federated` connection
(see [Querying the federated connection
directly](#querying-the-federated-connection-directly)). Semantic-layer queries
(`ktx sl query` / the `sl_query` tool) stay per-connection; pointing one at
`_ktx_federated` returns an error telling you to use raw SQL instead.
Federation activates automatically when a `ktx.yaml` file declares two or more
attach-compatible connections. There is nothing to configure and no federation
block to add. With zero or one compatible connection the behavior is unchanged.
## Which connections participate
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
The v1 federation engine supports four drivers:
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
| Driver | Participates in federation |
|--------|---------------------------|
| `postgres` | Yes |
| `mysql` | Yes |
| `sqlite` | Yes |
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
| `duckdb` | Yes |
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
| `snowflake` | No — standalone connection |
| `bigquery` | No — standalone connection |
| `clickhouse` | No — standalone connection |
| `sqlserver` | No — standalone connection |
Non-participating connections continue to work exactly as they did. They are
queried independently; they do not appear as federation members.
## How it activates
**ktx** inspects the connections in `ktx.yaml` at startup. When it finds two or
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
more connections whose driver is `postgres`, `mysql`, `sqlite`, or `duckdb`, it
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
instantiates the DuckDB federation engine and attaches each one read-only.
There is no `federation:` key, no opt-in flag, and no connection-level setting
to enable. The engine is derived entirely from what is already declared.
A minimal `ktx.yaml` that triggers federation:
```yaml
connections:
pg_books:
driver: postgres
url: "postgres://user:pass@localhost:5432/books" # pragma: allowlist secret
sqlite_reviews:
driver: sqlite
path: ./data/reviews.db
```
Two attach-compatible connections are present, so federation is active.
## Table naming in federated queries
Inside a federated query, postgres and mysql tables use a three-part name:
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
`connectionId.schema.table`. SQLite and DuckDB tables use the two-part form
`connectionId.table`, since ktx addresses both as single-namespace members. In
both cases the connection's `id` field in `ktx.yaml` becomes the catalog name
inside DuckDB.
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
If a connection `id` is not a bare SQL identifier — for example it contains a
hyphen, like `books-db` — double-quote it in the query the same way DuckDB
quotes any identifier: `"books-db".public.books`. Writing it unquoted
(`books-db.public.books`) is a SQL syntax error, not a federation feature.
For the example above:
- `pg_books.public.books` — the `books` table in the `public` schema of the
postgres connection
- `sqlite_reviews.reviews` — the `reviews` table in the sqlite connection
These fully qualified names are what you write when you query the federated
connection with raw SQL (see [Querying the federated connection
directly](#querying-the-federated-connection-directly)). A source file's own
`table:` field is not prefixed this way — see [Source files keep member-native
table refs](#source-files-keep-member-native-table-refs) below.
## Source names in the federated view
When you list or search semantic-layer sources under the federated connection,
each source's `name` is prefixed with its member connection id — for example
`pg_books.books` and `sqlite_reviews.reviews`. The prefix keeps names unique
when two members own a table with the same name: a `users` table in each of
`pg_app` and `sqlite_app` surfaces as `pg_app.users` and `sqlite_app.users`
rather than colliding on a bare `users`.
## Source files keep member-native table refs
A source file's physical `table:` field is not prefixed with the connection id.
It stays the member-native reference the connector uses on its own —
`public.books` for the postgres member, `reviews` for the sqlite member —
because the same file backs a per-connection semantic-layer query against that
member, which runs on the member's own driver where a `pg_books.` catalog prefix
would point at a database that does not exist. The connection-id prefix is a
DuckDB catalog name that appears only in raw federated SQL; the member prefix on
the source `name` (above) is independent of it.
## Cross-database joins
Write a cross-database join as raw SQL against `_ktx_federated` — see
[Querying the federated connection
directly](#querying-the-federated-connection-directly) below for a runnable
example. DuckDB attaches both members and resolves the join live at query time.
Declaring the join in a source file's `joins:` block is not supported yet. The
semantic layer plans each connection on its own, so a `joins:` entry whose `to:`
points at a table in another member is not resolved across the federation
boundary. Until that lands, express cross-database joins as raw SQL.
## Querying the federated connection directly
The federated connection is addressable by its id,
`_ktx_federated`, anywhere **ktx** runs read-only SQL. The same id works for the
`ktx sql` command and for a data agent calling the `sql_execution` MCP tool, so
both surfaces can run a cross-database query without a source file:
```bash
ktx sql -c _ktx_federated \
"SELECT b.title, avg(r.rating) AS avg_rating
FROM pg_books.public.books b
JOIN sqlite_reviews.reviews r ON b.id = r.book_id
GROUP BY b.title"
```
Table names follow the rules from
[Table naming in federated queries](#table-naming-in-federated-queries):
three-part `connectionId.schema.table` for postgres and mysql, two-part
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
`connectionId.table` for sqlite and duckdb. The `_ktx_federated` id is virtual —
it is never written to `ktx.yaml` and only exists when two or more attach-compatible
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
connections are declared. It surfaces in `ktx connection` and in the agent's
connection list so the id is discoverable. Querying a single member database
directly with its own connection id (`ktx sql -c pg_books ...`) is unchanged.
feat: query_policy semantic-layer-only restricts agents to predefined semantic-layer measures (#334)
* feat(sl): add predefined_measures_only guard to semantic query planning
SemanticQuery gains a predefined_measures_only flag; the planner rejects
any measure resolved with Provenance.COMPOSED (runtime aggregate
expressions and query-time derivations) while predefined measures,
predefined derived chains, dimensions, filters, and segments pass.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(config): add per-connection query_policy to warehouse connections
query_policy: semantic-layer-only | read-only-sql (default) on the
warehouse connection schema, plus a policy module with the raw-SQL
guard, federated member restriction lookup, and the project-level
predicate used to gate sql_execution registration.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(cli): enforce query_policy on raw SQL through one shared executor
ktx sql and the MCP sql_execution tool now share executeProjectRawSql
(resolve, policy check, read-only validation, execute), collapsing
their duplicated validate-then-execute paths. Restricted connections
are rejected before validation; federated raw SQL is rejected when any
member is restricted. sql_execution is not registered when every SQL
connection is restricted, and connection_list marks restricted
connections so agents route to sl_query. executeProjectReadOnlySql
stays generic for ktx-internal SQL (scan, ingest, SL-generated).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* feat(sl): compile queries with predefined_measures_only from query_policy
compileLocalSlQuery injects the flag from the connection's query_policy,
never from caller input, covering both ktx sl query and the MCP
sl_query tool through the daemon compile path.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* docs: document query_policy semantic-layer-only
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(sl): close semantic-layer-only bypasses via filters and federated hint
The predefined_measures_only guard only inspected query.measures, so a
composed aggregate written into `filters` slipped through _classify_filters
into a HAVING clause untouched — letting a restricted agent evaluate
arbitrary aggregates (e.g. threshold-probing `sum(x) BETWEEN a AND b`).
Reject filter clauses that compose an aggregate function; a HAVING that
compares a predefined measure by name (`orders.revenue > 100`) still works.
Also make the federated sl_query error policy-aware: when a member is
restricted, raw federated SQL is disabled too, so stop directing the agent
to `ktx sql -c _ktx_federated` / sql_execution (a guaranteed failure) and
point to per-connection semantic-layer queries instead.
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-03 01:54:17 -07:00
If any member connection sets
[`query_policy: semantic-layer-only`](/docs/configuration/ktx-yaml#query-policy),
raw SQL against `_ktx_federated` is rejected as a whole: a federated query can
touch any member's tables, so one restricted member restricts the federation.
feat(duckdb): cross-database federation via derived DuckDB connection (#295)
* feat(duckdb): add @duckdb/node-api dependency for federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): extract resolveStringReference to shared module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(connectors): route all identical connectors through shared resolveStringReference
Collapse the 5 remaining private copies in bigquery, clickhouse, mysql,
snowflake, and sqlserver into the shared module. Fix a latent bug in the
shared module where `~/path` was incorrectly sliced (dropping only `~`,
leaving the leading `/` and making resolve() ignore homedir). Add a
tilde-expansion test that caught the bug and now covers that branch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): reserve _ktx_ connection-id prefix for virtual connections
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(connections): derive virtual federated connection from compatible members
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(duckdb): federated executor builds READ_ONLY attaches and runs SQL
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance and escape quotes in attach url
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(sl): union member source directories for _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(query): route _ktx_federated through DuckDB executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(sl): use duckdb dialect for federated query compilation
Bypass assertSafeConnectionId for _ktx_federated in resolveLocalConnectionId
and loadComputableSources, and resolve the compute dialect to 'duckdb' when
connectionId is FEDERATED_CONNECTION_ID instead of falling through to the
default postgres lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(duckdb): end-to-end cross-catalog federated join
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(duckdb): harden federated join test with multi-book join-key coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(ingest): keep declared cross-DB joins to federated siblings
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(setup): surface federated connection availability after adding a member
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(setup): mark federationNoticeFor @internal for dead-code gate
Also marks attachTypeForDriver, buildAttachStatements, and
isReservedConnectionId @internal — all three are exported solely for
unit-test access with no production cross-file consumer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): document cross-database federation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(concepts): correct sqlite two-part naming in federation doc
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): quote federated catalog alias so hyphenated connection ids attach
* refactor(duckdb): single-source federation driver list, dedup attach loads
Collapse the parallel ATTACH_COMPATIBLE_DRIVERS set and ATTACH_TYPE_BY_DRIVER
map into one map in federation.ts whose keys are the membership rule. Replace
FederatedMember.config (read only via a type-erasing cast) with a typed url
field extracted at derive time. Emit INSTALL/LOAD once per distinct driver
type instead of once per member.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(duckdb): close federated DuckDB instance on connect failure; dedup id validation
Wrap the federated DuckDB instance in its own try/finally so a failing
connect() or a throwing connection.closeSync() no longer leaks the native
instance. Route setup-sources connection-id validation through the canonical
assertSafeConnectionId so the reserved _ktx_ prefix guard applies there too.
Derive the federated dialect through sqlAnalysisDialectForDriver instead of a
hardcoded literal.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): carry member connection config and projectDir on FederatedMember
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): resolve per-member attach targets via canonical connector resolvers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): quote mysql attach-string values like postgres
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): resolve member attach targets via canonical resolvers, supporting sqlite path:
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): thread projectDir through deriveFederatedConnection callers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared project read-only SQL executor that routes _ktx_federated
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): exercise shared executor default federated path with real DuckDB
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): route ingest query executor through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route MCP sql_execution _ktx_federated through shared executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve cross-DB joins to federated siblings in manifest re-emit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve declared cross-DB joins through scan re-ingest
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): document sibling-ref invariant, drop unsafe casts in test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): namespace federated source names by member to avoid collisions
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document member-namespaced federated source names
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): preserve member SSL/search_path in attach, classify federated MCP errors
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify federated dispatch and parallelize sibling reads
Dedup the federated driver ternary in local-query, derive the prefixed
source.name from the already-built name, drop the duplicated error in
federatedAttachTarget's exhaustive switch, inline the one-line
cleanupConnector wrapper, and parallelize federatedSiblingTargets' shard
reads (was sequential await-in-for on the scan hot path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): carry headerTypes through shared SQL executor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): add shared federated connection listing builder
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): route ktx sql through shared executor for _ktx_federated parity
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): show _ktx_federated in ktx connection list
Surfaces the virtual federated connection in the output of
`ktx connection list` so agents and users can discover cross-database
querying when 2+ attach-compatible connections are configured.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(federation): surface _ktx_federated in MCP connection_list
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(federation): ktx sql federated cross-file join end-to-end
Drive runKtxSql with the real federated DuckDB executor against two on-disk
sqlite files, stubbing only SQL validation. The test surfaced that the JSON
output path could not serialize bigint values DuckDB returns for integer
columns; printJson now coerces bigint to JSON numbers, matching the
plain/pretty paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(federation): document direct _ktx_federated query surface
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): coerce DuckDB bigint to number in shared federated executor
DuckDB returns integer columns as JS bigint, which JSON.stringify cannot
serialize. The CLI --json path worked around this with a replacer, but the
MCP sql_execution tool serializes via plain JSON.stringify and crashed on
any federated query selecting an integer column. Coerce bigint to Number
once in executeFederatedQuery so every consumer (CLI, MCP, ingest, SL)
gets a JSON-safe result, and remove the now-redundant CLI replacer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* refactor(federation): simplify driver map and collapse forked MCP SQL path
- Replace the identity-valued ATTACH_TYPE_BY_DRIVER record with a
ATTACH_COMPATIBLE_DRIVERS Set; the driver name doubles as the attach
type, so the map encoded nothing beyond membership.
- Switch federatedAttachTarget directly on the driver with a default
throw, dropping the unreachable post-switch throw and its comment.
- Route the MCP sql_execution standard-connection case through the
shared executeProjectReadOnlySql instead of reimplementing the
connector create/capability-check/execute/cleanup ceremony, so
federated and standard connections share one execution path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* chore(federation): allowlist placeholder credentials for detect-secrets
The federation doc example URL and the federated-attach test fixtures use
literal placeholder credentials that trip detect-secrets. Mark them with
line-scoped pragma allowlist comments so a real secret added later is still
caught.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(federation): correct SL addressing, join pruning, and id-quoting guidance
- Federated SL list/search records carry the virtual `_ktx_federated`
connection id (member origin stays in the prefixed source name), so rows
round-trip to `ktx sl -c _ktx_federated read` and the fts index no longer
clobbers per-connection partitions.
- Prune semantic-layer joins by membership in the connection's own source set
instead of matching the target's first dotted segment against other
connection ids; a same-connection join whose target name collides with a
sibling connection id is preserved, and orphan targets that would poison the
planner are dropped.
- Document double-quoting for connection ids that are not bare SQL identifiers
(e.g. "books-db".public.books) in the federated naming hint, the sl-query
rejection error, and the federation docs.
- Preserve exact federated BIGINT values beyond 2^53 as strings instead of
rounding, and steer the setup federation notice to raw SQL against
`_ktx_federated`.
* fix(federation): carry ssl:true into postgres URL attach target
A postgres member configured with `url` plus `ssl: true` resolved to both a
connectionString and an ssl flag, but the federated attach builder early-returned
the bare URL and dropped the ssl intent. DuckDB then handed libpq a URL with no
sslmode, so the URL path silently diverged from the discrete-field path (which
emits sslmode=require) and from the direct scan path (which enforces TLS).
Append sslmode=require to the URL when the member sets ssl, unless the URL
already pins a stronger sslmode.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-06-15 22:01:39 +07:00
## Federated queries are read-only
DuckDB attaches every member database with read-only access. Federated queries
are `SELECT`/`WITH` only. No writes, no DDL, and no mutations reach any member
database through the federation engine.
## Current limitations
- **Raw SQL joins only.** Cross-database joins are written as raw SQL; declaring
them in a source's `joins:` block and automatic discovery of cross-database
relationships are not available yet. Intra-database relationship discovery for
each member connection is unchanged.
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 19:06:02 +07:00
- **postgres, mysql, sqlite, and duckdb only.** Other drivers (snowflake,
bigquery, clickhouse, sqlserver) do not participate in federation in this
version. They remain usable as standalone connections.