ktx/packages/cli/test/context/mcp/dialect-notes.test.ts
Kevin Messiaen 3c4fcc27c7
feat: Add duckdb connector (#308)
* refactor(duckdb): extract shared json-safe bigint helper

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): add and register the duckdb primary connector

Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): route live-database ingest through the DuckDB connector

Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): add duckdb to the setup database flow

Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(duckdb): attach native duckdb files in federation

Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(duckdb): document the duckdb primary-source connector

Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(duckdb): use single-namespace display shape for main-only refs

DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(duckdb): resolve connector dialect via getDialectForDriver

Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(duckdb): skip schema picker for single-file duckdb setup

DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(duckdb): reuse shared config helper and derive scope skip

Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(duckdb): use a genuinely unknown driver in the rejection test

The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(duckdb): cover dialect, connector, and live-introspection branches

Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.

- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
  cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction

Functions 100%, lines ~99% across the duckdb connector dir.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: add DuckDB to supported-driver references

The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
2026-07-01 12:06:02 +00:00

import { readdirSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import { describe, expect, it } from 'vitest';
import { KtxExpectedError } from '../../../src/errors.js';
import { KTX_DATABASE_DRIVER_IDS } from '../../../src/connection-drivers.js';
import type { KtxProjectConnectionConfig } from '../../../src/context/project/config.js';
import { sqlAnalysisDialectForDriver } from '../../../src/context/sql-analysis/dialect.js';
import { DIALECTS_WITH_NOTES, sqlDialectNotes } from '../../../src/context/sql-analysis/dialect-notes.js';
import { resolveDialectNotesForConnection } from '../../../src/context/mcp/local-project-ports.js';

function conn(driver: string): KtxProjectConnectionConfig {
  return { driver } as KtxProjectConnectionConfig;
}

describe('per-dialect SQL notes', () => {
  it('covers every dialect reachable from a configured warehouse driver', () => {
    // Derived from the connector registry, not a hand-maintained list: a new
    // warehouse driver whose resolved dialect lacks authored notes fails here.
    for (const driver of KTX_DATABASE_DRIVER_IDS) {
      const dialect = sqlAnalysisDialectForDriver(driver);
      expect(DIALECTS_WITH_NOTES, `driver "${driver}" resolves to dialect "${dialect}"`).toContain(dialect);
      expect(sqlDialectNotes(dialect).length).toBeGreaterThan(0);
    }
  });

  it('keeps the authored-dialect list and the ./dialects markdown files in sync', () => {
    const dir = fileURLToPath(new URL('../../../src/context/sql-analysis/dialects/', import.meta.url));
    const files = readdirSync(dir)
      .filter((name) => name.endsWith('.md'))
      .map((name) => name.replace(/\.md$/, ''))
      .sort();
    expect(files).toEqual([...DIALECTS_WITH_NOTES].sort());
  });

  it('does not author notes for unreachable dialects', () => {
    // databricks appears in the resolver map but no connector produces it.
    expect(DIALECTS_WITH_NOTES).not.toContain('databricks');
  });

  it('answers the full rubric for every dialect', () => {
    for (const dialect of DIALECTS_WITH_NOTES) {
      const notes = sqlDialectNotes(dialect);
      expect(notes, `${dialect}: FQTN`).toContain('**FQTN:**');
      expect(notes, `${dialect}: identifiers`).toContain('**Identifiers:**');
      expect(notes, `${dialect}: date/time`).toContain('**Date/time:**');
      expect(notes, `${dialect}: top-N`).toMatch(/\*\*Top-N/);
      expect(notes, `${dialect}: series`).toMatch(/\*\*Series/);
      expect(notes, `${dialect}: rolling window`).toMatch(/\*\*Rolling/);
      expect(notes, `${dialect}: safe cast`).toMatch(/\*\*Safe cast/);
      expect(notes, `${dialect}: semi-structured`).toMatch(/\*\*(JSON|Semi-structured)/);
    }
  });

  it('gives each engine its own idioms and never leaks another engine-only construct', () => {
    // A sqlite analyst gets sqlite date idioms and never Snowflake/BigQuery-only syntax.
    expect(sqlDialectNotes('sqlite')).toMatch(/strftime|julianday/);
    expect(sqlDialectNotes('sqlite')).not.toContain('VARIANT');
    expect(sqlDialectNotes('sqlite')).not.toContain('_TABLE_SUFFIX');
    // QUALIFY appears only for the engines that actually support it.
    expect(sqlDialectNotes('snowflake')).toContain('QUALIFY');
    expect(sqlDialectNotes('bigquery')).toContain('QUALIFY');
    for (const dialect of ['postgres', 'mysql', 'sqlite', 'clickhouse', 'tsql'] as const) {
      expect(sqlDialectNotes(dialect), `${dialect} must not mention QUALIFY`).not.toContain('QUALIFY');
    }
    // Engine-exclusive markers stay in their own dialect.
    expect(sqlDialectNotes('snowflake')).toContain('VARIANT');
    expect(sqlDialectNotes('snowflake')).toContain('DATABASE.SCHEMA.TABLE');
    expect(sqlDialectNotes('bigquery')).toContain('_TABLE_SUFFIX');
    expect(sqlDialectNotes('clickhouse')).toContain('LIMIT n BY');
    expect(sqlDialectNotes('tsql')).toContain('TOP (n)');
  });

  it('contains no benchmark/grader or version-dated content', () => {
    for (const dialect of DIALECTS_WITH_NOTES) {
      const notes = sqlDialectNotes(dialect);
      expect(notes).not.toMatch(/\bspider\b|\bbenchmark\b|\bgold\b|\bgrader\b/i);
      expect(notes).not.toMatch(/\bas of v(ersion)?\b/i);
    }
  });

  it('falls back to postgres notes for a dialect without its own file', () => {
    expect(sqlAnalysisDialectForDriver('some-future-engine')).toBe('postgres');
    // redshift is a valid SqlAnalysisDialect but intentionally unauthored.
    expect(sqlDialectNotes('redshift')).toBe(sqlDialectNotes('postgres'));
  });
});

describe('resolveDialectNotesForConnection', () => {
  it('resolves a warehouse connection to its dialect notes', () => {
    expect(resolveDialectNotesForConnection('wh', conn('sqlite'))).toMatchObject({
      connectionId: 'wh',
      dialect: 'sqlite',
    });
    expect(resolveDialectNotesForConnection('wh', conn('snowflake')).dialect).toBe('snowflake');
    // The sqlserver driver resolves to the tsql dialect (resolver codomain).
    expect(resolveDialectNotesForConnection('wh', conn('sqlserver')).dialect).toBe('tsql');
  });

  it('rejects a non-SQL context source with a clear expected error, not postgres notes', () => {
    expect(() => resolveDialectNotesForConnection('mb', conn('metabase'))).toThrow(KtxExpectedError);
    expect(() => resolveDialectNotesForConnection('mb', conn('metabase'))).toThrow(/not a SQL warehouse/);
  });

  it('rejects an unconfigured connection', () => {
    expect(() => resolveDialectNotesForConnection('missing', undefined)).toThrow(KtxExpectedError);
    expect(() => resolveDialectNotesForConnection('missing', undefined)).toThrow(/not configured/);
  });
});
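
The fallback behavior these tests pin down can be sketched in isolation. The block below is a minimal illustration, not the ktx implementation: `dialectForDriver`, `notesFor`, the two maps, and the placeholder note strings are all hypothetical stand-ins for `sqlAnalysisDialectForDriver`, `sqlDialectNotes`, and the authored markdown files. It assumes only what the tests assert: an unknown driver resolves to `postgres`, `sqlserver` resolves to `tsql`, and a dialect without authored notes (such as `redshift`) reuses the postgres notes.

```typescript
// Hypothetical dialect union; the real SqlAnalysisDialect type may differ.
type Dialect = 'postgres' | 'sqlite' | 'snowflake' | 'tsql' | 'redshift';

// Hypothetical driver -> dialect table (the real resolver lives in ktx).
const DRIVER_TO_DIALECT = new Map<string, Dialect>([
  ['postgres', 'postgres'],
  ['sqlite', 'sqlite'],
  ['snowflake', 'snowflake'],
  ['sqlserver', 'tsql'],
  ['redshift', 'redshift'],
]);

// Placeholder note bodies; the real notes are authored markdown files.
const POSTGRES_NOTES = 'placeholder notes for postgres';
const AUTHORED_NOTES = new Map<Dialect, string>([
  ['postgres', POSTGRES_NOTES],
  ['sqlite', 'placeholder notes for sqlite'],
  ['snowflake', 'placeholder notes for snowflake'],
  ['tsql', 'placeholder notes for tsql'],
  // redshift is deliberately absent: it has no authored notes.
]);

// Unknown drivers fall back to the postgres dialect.
function dialectForDriver(driver: string): Dialect {
  return DRIVER_TO_DIALECT.get(driver) ?? 'postgres';
}

// Dialects without authored notes fall back to the postgres notes.
function notesFor(dialect: Dialect): string {
  return AUTHORED_NOTES.get(dialect) ?? POSTGRES_NOTES;
}
```

Under these assumptions, `notesFor('redshift')` returns the same string as `notesFor('postgres')`, which is the identity the last test in the first `describe` block checks with `toBe`.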