mirror of
https://github.com/Kaelio/ktx.git
synced 2026-07-04 10:52:13 +02:00
* refactor(duckdb): extract shared json-safe bigint helper
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add and register the duckdb primary connector
Add KtxDuckDbDialect, KtxDuckDbScanConnector (local file-backed, read-only,
never-create, main-schema introspection via information_schema and
duckdb_constraints() for foreign keys), and register the duckdb driver across
the dialect factory, driver registry, connection-type enum, warehouse descriptor,
config schema, scan normalization, connection test drivers, and status display.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): route live-database ingest through the DuckDB connector
Add the DuckDB live-database introspection bridge and dispatch duckdb
connections to it in local-adapters, matching the SQLite path. Repoint the
config-rejection test off duckdb (now a valid driver) onto the no-driver case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): add duckdb to the setup database flow
Offer DuckDB in the interactive checklist and via ktx setup --database duckdb,
with a file-path prompt and duckdb-local default connection id, parallel to SQLite.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(duckdb): attach native duckdb files in federation
Native .duckdb members ATTACH with (READ_ONLY) and no TYPE/INSTALL/LOAD, since
the duckdb format needs no extension. attachTypeForDriver returns null for the
native case; buildAttachStatements builds load statements from non-null types
only and emits a conditional ATTACH clause.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs(duckdb): document the duckdb primary-source connector
Add a DuckDB section to the primary-sources integration page (config, read-only
never-create behavior, main-schema scope, federation) and update the
supported-driver assertion in dialects.test.ts to include duckdb.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): use single-namespace display shape for main-only refs
DuckDB v1 introspects the main schema and sets db=null on every table, so its
display refs are single-namespace like SQLite. The ansi shape emitted a 1-part
table display it then refused to parse, breaking column-level display resolution.
Switch the dialect to the sqlite display shape and add a round-trip test plus a
composite-foreign-key test that were missing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): resolve connector dialect via getDialectForDriver
Route the connector's dialect through the shared factory like every other
connector, now that duckdb is registered. Single construction path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(duckdb): skip schema picker for single-file duckdb setup
DuckDB is a single-file, single-namespace ('main') database like SQLite,
but the setup scope step only skipped the schema picker for sqlite. DuckDB
fell into the multi-schema path with an empty schema list, rendering a
broken picker ("No matches found" for main). Extend the file-based-driver
early-return to cover duckdb so it ingests every table directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(duckdb): reuse shared config helper and derive scope skip
Route duckdb path resolution through the shared resolveStringReference
helper instead of a local third copy of env:/file: handling. Derive the
setup scope-picker skip from SCOPE_DISCOVERY_SPECS membership rather than
a hardcoded sqlite/duckdb driver list.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): use a genuinely unknown driver in the rejection test
The merged "rejects unknown drivers" test used `driver: duckdb` as its
unknown-driver stand-in, which stopped being unknown once this branch
added the duckdb connector. Switch to `nonsense` so it again exercises
the unsupported-driver config error.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(duckdb): cover dialect, connector, and live-introspection branches
Codecov flagged uncovered branches as dead code; all are real connector,
dialect, and live-ingest behavior. Add unit tests instead of removing them.
- dialect: precedence ladder, sample/clause builders, profiling expressions
- connector: url/env config forms, error throws, never-create guard,
cardinality cap branches, table-scope empty/non-empty paths
- live-introspection: full-schema and table-scope extraction
Functions 100%, lines ~99% across the duckdb connector dir.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* docs: add DuckDB to supported-driver references
The DuckDB connector PR documented the connector itself but left the
scattered supported-driver enumerations stale. Add duckdb to the
federation concept page (participation table, activation, table naming,
limitations), the ktx setup CLI reference, the ktx.yaml warehouse-driver
table, the primary-sources field reference, and the quickstart driver
list (which also restores the missing ClickHouse entry).
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>
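The federation change described above (native `.duckdb` members attach read-only with no `TYPE`/`INSTALL`/`LOAD`) can be sketched roughly as follows. This is an illustration only, not the ktx implementation: the function names `attachTypeForDriver` and `buildAttachStatements` come from the commit message, but their signatures, the `FederationMember` shape, the set of non-native drivers, and the choice to attach non-native members read-only are all assumptions.

```typescript
// Hypothetical shapes: the real ktx types are not shown in the commit message.
type AttachType = 'sqlite' | 'postgres' | 'mysql';

interface FederationMember {
  driver: string;
  alias: string;
  target: string; // file path or connection string
}

// Returns null for native DuckDB files, which need no extension and no TYPE option.
function attachTypeForDriver(driver: string): AttachType | null {
  switch (driver) {
    case 'sqlite':
      return 'sqlite';
    case 'postgres':
      return 'postgres';
    case 'mysql':
      return 'mysql';
    case 'duckdb':
      return null;
    default:
      throw new Error(`unsupported federation driver: ${driver}`);
  }
}

// SQL quoting helpers: double the quote character to escape it.
const quoteLiteral = (s: string): string => `'${s.replace(/'/g, "''")}'`;
const quoteIdent = (s: string): string => `"${s.replace(/"/g, '""')}"`;

function buildAttachStatements(members: FederationMember[]): string[] {
  // INSTALL/LOAD once per distinct extension, skipping native (null) members.
  const extensions = Array.from(
    new Set(
      members
        .map((m) => attachTypeForDriver(m.driver))
        .filter((t): t is AttachType => t !== null),
    ),
  );
  const loads = extensions.flatMap((ext) => [`INSTALL ${ext};`, `LOAD ${ext};`]);

  // The TYPE option is emitted only when the member is not a native DuckDB file.
  const attaches = members.map((m) => {
    const type = attachTypeForDriver(m.driver);
    const options = type === null ? 'READ_ONLY' : `TYPE ${type}, READ_ONLY`;
    return `ATTACH ${quoteLiteral(m.target)} AS ${quoteIdent(m.alias)} (${options});`;
  });

  return [...loads, ...attaches];
}
```

Under these assumptions, a SQLite member plus a native DuckDB member yields one `INSTALL`/`LOAD` pair for `sqlite` and two `ATTACH` statements, of which only the SQLite one carries a `TYPE` option.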
108 lines
5.1 KiB
TypeScript
import { describe, expect, it } from 'vitest';
import { KtxDuckDbDialect } from '../../../src/connectors/duckdb/dialect.js';

describe('KtxDuckDbDialect', () => {
  const dialect = new KtxDuckDbDialect();

  it('quotes identifiers with double quotes and escapes embedded quotes', () => {
    expect(dialect.quoteIdentifier('order"s')).toBe('"order""s"');
  });

  it('maps integer types to number dimension', () => {
    expect(dialect.mapToDimensionType('BIGINT')).toBe('number');
    expect(dialect.mapToDimensionType('DOUBLE')).toBe('number');
  });

  it('maps timestamp types to time dimension', () => {
    expect(dialect.mapToDimensionType('TIMESTAMP')).toBe('time');
    expect(dialect.mapToDimensionType('DATE')).toBe('time');
  });

  it('maps text types to string dimension', () => {
    expect(dialect.mapToDimensionType('VARCHAR')).toBe('string');
  });

  it('maps boolean types to boolean dimension', () => {
    expect(dialect.mapToDimensionType('BOOLEAN')).toBe('boolean');
    expect(dialect.mapToDimensionType('BOOL')).toBe('boolean');
  });

  it('falls back to string for an empty or unknown native type', () => {
    expect(dialect.mapToDimensionType('')).toBe('string');
    expect(dialect.mapToDimensionType('JSON')).toBe('string');
  });

  // The precedence ladder strips parameters before substring rules fire, so a
  // parameterized DECIMAL still resolves through the numeric branch rather than
  // the string fallback.
  it('strips type parameters before resolving the dimension', () => {
    expect(dialect.mapToDimensionType('DECIMAL(10,2)')).toBe('number');
    expect(dialect.mapToDimensionType('VARCHAR(255)')).toBe('string');
  });

  // Types absent from the exact-match table still resolve via substring rules:
  // TIMESTAMP_NS (time), UINT128/HUGEINT-like (number), and lowercase input.
  it('resolves unlisted types through substring matching, case-insensitively', () => {
    expect(dialect.mapToDimensionType('timestamp_ns')).toBe('time');
    expect(dialect.mapToDimensionType('INT128')).toBe('number');
    expect(dialect.mapToDimensionType(' double ')).toBe('number');
  });

  it('generates a limited sample query', () => {
    expect(dialect.generateSampleQuery('"t"', 5)).toBe('SELECT * FROM "t" LIMIT 5');
  });

  it('quotes selected columns in a sample query', () => {
    expect(dialect.generateSampleQuery('"t"', 5, ['a', 'b'])).toBe('SELECT "a", "b" FROM "t" LIMIT 5');
  });

  it('builds a non-null, non-blank column sample query', () => {
    expect(dialect.generateColumnSampleQuery('"t"', 'email', 3)).toBe(
      `SELECT "email" FROM "t" WHERE "email" IS NOT NULL AND TRIM(CAST("email" AS VARCHAR)) != '' LIMIT 3`,
    );
  });

  // A degenerate sample percentage (<=0 or >=1) means "no sampling", so both the
  // random filter and the TABLESAMPLE clause must collapse to an empty string.
  it('returns empty sample clauses outside the (0,1) range and real clauses inside it', () => {
    expect(dialect.getRandomSampleFilter(0)).toBe('');
    expect(dialect.getRandomSampleFilter(1)).toBe('');
    expect(dialect.getRandomSampleFilter(0.25)).toBe('RANDOM() < 0.25');
    expect(dialect.getTableSampleClause(0)).toBe('');
    expect(dialect.getTableSampleClause(0.1)).toBe('USING SAMPLE 10 PERCENT (bernoulli)');
  });

  // A type missing from the exact-match table but containing BOOL still resolves
  // through the substring branch rather than the string fallback.
  it('resolves a BOOL-substring type to boolean', () => {
    expect(dialect.mapToDimensionType('MYBOOL')).toBe('boolean');
  });

  it('builds limit/offset, sample-value aggregation, and randomized cardinality clauses', () => {
    expect(dialect.getLimitOffsetClause(10, 5)).toContain('LIMIT 10');
    expect(dialect.getSampleValueAggregation('SELECT 1')).toContain('STRING_AGG');
    expect(dialect.generateRandomizedCardinalitySampleQuery('"t"', 'c', 100)).toContain('USING SAMPLE 100 ROWS');
  });

  it('exposes profiling expressions and a null column-statistics query', () => {
    expect(dialect.getNullCountExpression('c')).toBe('SUM(CASE WHEN c IS NULL THEN 1 ELSE 0 END)');
    expect(dialect.getDistinctCountExpression('c')).toBe('COUNT(DISTINCT c)');
    expect(dialect.textLengthExpression('c')).toBe('LENGTH(CAST(c AS VARCHAR))');
    expect(dialect.castToText('c')).toBe('CAST(c AS VARCHAR)');
    expect(dialect.mapDataType('BIGINT')).toBe('BIGINT');
    expect(dialect.getTopClause(5)).toBe('');
    expect(dialect.generateColumnStatisticsQuery('main', 't')).toBeNull();
  });

  // Guards the single-namespace (db=null) display shape: v1 introspects only
  // `main`, so a display ref must round-trip as a bare table name. An ANSI shape
  // would emit a 1-part name it then refuses to parse, breaking column lookups.
  it('round-trips a single-namespace display ref and reports a 1-part column shape', () => {
    const table = { catalog: null, db: null, name: 'orders' };
    const display = dialect.formatDisplayRef(table);
    expect(display).toBe('orders');
    expect(dialect.parseDisplayRef(display)).toMatchObject({ name: 'orders' });
    expect(dialect.columnDisplayTablePartCount()).toBe(1);
    expect(dialect.formatTableName(table)).toBe('"orders"');
  });
});
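The `mapToDimensionType` tests above imply a precedence ladder: normalize the input, strip type parameters, try an exact-match table, then fall through substring rules to a string default. A minimal sketch of such a ladder follows. It is not the `KtxDuckDbDialect` source, which this file does not show; the function name `sketchMapToDimensionType`, the table contents, and the rule order are assumptions chosen only to satisfy the assertions in this test file.

```typescript
type DimensionType = 'number' | 'time' | 'string' | 'boolean';

// Hypothetical exact-match table: the real dialect's table is not visible here.
const EXACT_TYPES = new Map<string, DimensionType>([
  ['BIGINT', 'number'],
  ['DOUBLE', 'number'],
  ['TIMESTAMP', 'time'],
  ['DATE', 'time'],
  ['VARCHAR', 'string'],
  ['BOOLEAN', 'boolean'],
  ['BOOL', 'boolean'],
]);

// Hypothetical substring rules, checked in order; the first hit wins.
const SUBSTRING_RULES: ReadonlyArray<readonly [string, DimensionType]> = [
  ['BOOL', 'boolean'],
  ['TIME', 'time'],
  ['DATE', 'time'],
  ['INT', 'number'],
  ['DECIMAL', 'number'],
  ['DOUBLE', 'number'],
];

function sketchMapToDimensionType(nativeType: string): DimensionType {
  // Step 1: normalize case and whitespace, then drop a trailing "(...)" parameter list.
  const base = nativeType.trim().toUpperCase().replace(/\(.*\)$/, '').trim();
  if (base === '') return 'string';

  // Step 2: exact match on the stripped type name.
  const exact = EXACT_TYPES.get(base);
  if (exact !== undefined) return exact;

  // Step 3: substring rules for types missing from the exact-match table.
  for (const [needle, dimension] of SUBSTRING_RULES) {
    if (base.includes(needle)) return dimension;
  }

  // Step 4: anything unrecognized is treated as a string.
  return 'string';
}
```

Substring matching is deliberately loose, so rule order matters: in this sketch a name such as `INTERVAL` would hit the `INT` rule and resolve to `number`. A real dialect would need an exact-match entry or an earlier rule for any type where that is wrong.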