Commit graph

7 commits

Author SHA1 Message Date
Kevin Messiaen
3c4fcc27c7
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): add and register the duckdb primary connector

Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): route live-database ingest through the DuckDB connector

Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): add duckdb to the setup database flow

Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): attach native duckdb files in federation

Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(duckdb): document the duckdb primary-source connector

Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(duckdb): use single-namespace display shape for main-only refs

DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(duckdb): resolve connector dialect via getDialectForDriver

Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(duckdb): skip schema picker for single-file duckdb setup

DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(duckdb): reuse shared config helper and derive scope skip

Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(duckdb): use a genuinely unknown driver in the rejection test

The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(duckdb): cover dialect, connector, and live-introspection branches

Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.

- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
  cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction

Functions 100%, lines ~99% across the duckdb connector dir.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: add DuckDB to supported-driver references

The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 12:06:02 +00:00
Andrey Avtomonov
ca231df5fe
fix(gdrive): validate folder access, run config test, harden Drive API (#321)
* fix(gdrive): validate folder access, run config test, harden Drive API

Connection test and setup validation now verify folder_id resolves to an accessible Drive folder before counting Docs, via a shared verifyGdriveFolderAndCountDocs helper, so a wrong or unshared folder fails instead of passing with 0 docs.

Move gdrive-config.test.ts under test/ so Vitest's test/** glob actually runs it; escape folder_id in the Drive query; add retry/backoff on transient Google API responses; and record skipped non-Google-Doc files in the staged manifest.

* chore: sync uv.lock to ktx-daemon/ktx-sl 0.13.1
2026-06-28 01:02:37 +02:00
ARYAN
5645dc4d28
Add gdrive context source adapter (#209)
* Add gdrive context source adapter

* feat(gdrive): normalize internal doc links, tabs, and header/footer structure

* fix(gdrive): reject generic source credential flags

* test(gdrive): include local adapter in expected list

* fix(gdrive): remove dead exports and silence false positive secret checks

* fix(setup): restore notion source auth flow
2026-06-27 23:41:32 +02:00
Andrey Avtomonov
fb7b94b60e
feat(telemetry): collect PostHog $exception error reports in CLI and daemon (#262)
* feat(telemetry): add node exception reporter

* feat(telemetry): report node cli exceptions

* feat(telemetry): add daemon exception reporter

* feat(telemetry): report daemon exceptions

* docs(telemetry): document error reports

* fix(telemetry): pass redaction snapshots from node call sites

* test(telemetry): verify prepared node exception payload

* fix(telemetry): close daemon exception lifecycle gaps

* test(telemetry): verify prepared daemon exception payload

* test(telemetry): close error collection acceptance gaps

* test(telemetry): close posthog exception acceptance gaps
2026-06-05 19:36:21 +02:00
Andrey Avtomonov
ec7edf8f50
fix(telemetry): preserve driver error class and code in connection_test (#260)
Native connector test failures were flattened to `new Error(message)`,
collapsing every driver's error class to `Error` and dropping `.code` /
`.number`. connection_test telemetry could therefore not tell a SQL Server
login rejection (ELOGIN / 18456) apart from a network or TLS error, and the
only field that varied was a raw message.

Connectors now return `connectorTestFailure(error)`, which preserves the
original driver error as `cause`, and `testNativeConnection` re-throws that
cause. `scrubErrorClass` then records the real class (e.g. ConnectionError)
and `formatErrorDetail` keeps the code prefix (e.g. "ELOGIN: ..."). The
helper is the single source of truth for the failure shape across all seven
native connectors. User-facing terminal output is unchanged.
2026-06-04 14:51:14 +02:00
Andrey Avtomonov
6da8c3452a
feat(telemetry): include error details for failures (#254) 2026-06-02 17:23:51 +02:00
Andrey Avtomonov
56985b7e09
test: split cli tests from source tree (#216)
* feat(cli): define full warehouse dialect contract

* test(cli): keep dialect edge tests focused

* fix(cli): stabilize dialect contract foundation

* refactor(connectors): own read-only query preparation

* refactor(connectors): resolve dialects through registry

* refactor(connectors): keep concrete dialect classes internal

* chore(workspace): enforce dialect import boundary

* refactor(cli): resolve relationship dialect at scan boundary

* refactor(cli): use dialect display parsing for entity details

* refactor(cli): use dialect display parsing for warehouse catalog

* refactor(cli): use dialect SQL in relationship workflows

* test(cli): verify solid dialect scan workflow closure

* test: split cli tests from source tree

* refactor(cli): standardize BigQuery scope listing

* feat(sqlite): implement connector scope listing

* test(connectors): cover required table listing

* feat(cli): add warehouse driver registry

* refactor(setup): route scope discovery through driver registry

* refactor(cli): route local query execution through driver registry

* refactor(historic-sql): route dialect support through driver registry

* refactor(cli): test warehouse connections through driver registry

* fix(cli): close driver registry type export gaps

* Improve setup daemon diagnostics

* refactor(setup): centralize rail-prefixed diagnostics + query-history fallback

Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput
into clack.ts so the setup wizard, managed daemons, and embedding/agent steps
share one rail-formatted writer. setup-databases.ts also adds a
"disable query history and retry" option when the schema-context build fails
and query history is the likely culprit, surfaced via a new
failed-query-history-unavailable status.

* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match

The setup picker's KtxTableListEntry was a 2-level { schema, name }, so
qualifiedTableId always wrote db.name into enabled_tables. When BigQuery,
Snowflake, or SQL Server later ran fast ingest, their introspect step filtered
the scope set with scopedTableNames(scope, { catalog: projectId|database, db })
— catalog was non-null on the introspect side but null in the scope refs, so
every entry was rejected, the live-database adapter staged zero table files,
and detect() failed with 'Adapter "live-database" did not recognize fetched
source output'.

Align the picker boundary with the canonical 3-level KtxTableRef:

- Add catalog: string | null to KtxTableListEntry.
- BigQuery/Snowflake/SQL Server listTables populate catalog from the
  resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null.
- qualifiedTableId emits catalog.schema.name when catalog is non-null
  (resolveEnabledTables already accepts the 3-part shape) and
  schemasFromEnabledTables now goes through parseDottedTableEntry so it
  recovers the schema correctly from both 2-part and 3-part entries.
- Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker
  reuse.

Update listTables expectations in all seven connector tests and the setup /
picker test fixtures. Add a picker regression test that covers the
catalog-bearing round-trip (save + refine).

* fix(cli): allow debug telemetry under opt-out env
2026-05-26 08:49:05 +02:00
Renamed from packages/cli/src/connection.test.ts (Browse further)