test: split cli tests from source tree (#216)

* feat(cli): define full warehouse dialect contract

* test(cli): keep dialect edge tests focused

* fix(cli): stabilize dialect contract foundation

* refactor(connectors): own read-only query preparation

* refactor(connectors): resolve dialects through registry

* refactor(connectors): keep concrete dialect classes internal

* chore(workspace): enforce dialect import boundary

* refactor(cli): resolve relationship dialect at scan boundary

* refactor(cli): use dialect display parsing for entity details

* refactor(cli): use dialect display parsing for warehouse catalog

* refactor(cli): use dialect SQL in relationship workflows

* test(cli): verify solid dialect scan workflow closure

* test: split cli tests from source tree

* refactor(cli): standardize BigQuery scope listing

* feat(sqlite): implement connector scope listing

* test(connectors): cover required table listing

* feat(cli): add warehouse driver registry

* refactor(setup): route scope discovery through driver registry

* refactor(cli): route local query execution through driver registry

* refactor(historic-sql): route dialect support through driver registry

* refactor(cli): test warehouse connections through driver registry

* fix(cli): close driver registry type export gaps

* Improve setup daemon diagnostics

* refactor(setup): centralize rail-prefixed diagnostics + query-history fallback

Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput
into clack.ts so the setup wizard, managed daemons, and embedding/agent steps
share one rail-formatted writer. setup-databases.ts also adds a
"disable query history and retry" option when the schema-context build fails
and query history is the likely culprit, surfaced via a new
failed-query-history-unavailable status.
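
As a rough illustration of the shared writer described above, the sketch below splits buffered output into lines, drops empty ones, and writes each line behind a prefix. The `RAIL_PREFIX` value and the `writeRailLines` name are assumptions made for this example; they are not taken from clack.ts.

```typescript
// Hypothetical sketch: the prefix string and the function name are
// illustrative assumptions, not the helpers extracted in this commit.
const RAIL_PREFIX = '│  ';

function writeRailLines(
  write: (chunk: string) => void,
  output: string,
  prefix: string = RAIL_PREFIX,
): void {
  // Accept both LF and CRLF line endings and skip blank lines.
  for (const line of output.split(/\r?\n/)) {
    if (line.length > 0) {
      write(`${prefix}${line}\n`);
    }
  }
}

// Collect the written chunks instead of printing them.
const chunks: string[] = [];
writeRailLines((chunk) => {
  chunks.push(chunk);
}, 'starting daemon\r\n\r\nlistening\n');
// chunks now holds two prefixed lines; the blank line was skipped.
```

Passing the writer as a callback is what would let one helper serve both stdout and stderr.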

* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match

The setup picker's KtxTableListEntry was a 2-level { schema, name }, so
qualifiedTableId always wrote db.name into enabled_tables. When BigQuery,
Snowflake, or SQL Server later ran fast ingest, their introspect step filtered
the scope set with scopedTableNames(scope, { catalog: projectId|database, db })
— catalog was non-null on the introspect side but null in the scope refs, so
every entry was rejected, the live-database adapter staged zero table files,
and detect() failed with 'Adapter "live-database" did not recognize fetched
source output'.

Align the picker boundary with the canonical 3-level KtxTableRef:

- Add catalog: string | null to KtxTableListEntry.
- BigQuery/Snowflake/SQL Server listTables populate catalog from the
  resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null.
- qualifiedTableId emits catalog.schema.name when catalog is non-null
  (resolveEnabledTables already accepts the 3-part shape) and
  schemasFromEnabledTables now goes through parseDottedTableEntry so it
  recovers the schema correctly from both 2-part and 3-part entries.
- Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker
  reuse.

Update listTables expectations in all seven connector tests and the setup /
picker test fixtures. Add a picker regression test that covers the
catalog-bearing round-trip (save + refine).
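
The round-trip described above can be sketched roughly as follows. The `Sketch`-suffixed names and the simplified types are assumptions for illustration only; the real `qualifiedTableId`, `parseDottedTableEntry`, and `schemasFromEnabledTables` may differ, and this version does not handle identifiers that themselves contain dots.

```typescript
// Hypothetical, simplified stand-ins for the picker helpers named in the
// commit message. Names ending in "Sketch" are not the real functions.
interface TableListEntrySketch {
  catalog: string | null;
  schema: string;
  name: string;
}

// Emit catalog.schema.name when a catalog is present, otherwise schema.name.
function qualifiedTableIdSketch(entry: TableListEntrySketch): string {
  return entry.catalog !== null
    ? `${entry.catalog}.${entry.schema}.${entry.name}`
    : `${entry.schema}.${entry.name}`;
}

// Parse a 2-part or 3-part dotted entry; anything else is rejected.
function parseDottedTableEntrySketch(value: string): TableListEntrySketch | null {
  const parts = value.split('.');
  if (parts.length < 2 || parts.length > 3 || parts.some((part) => part.length === 0)) {
    return null;
  }
  const name = parts[parts.length - 1] as string;
  const schema = parts[parts.length - 2] as string;
  const catalog = parts.length === 3 ? (parts[0] as string) : null;
  return { catalog, schema, name };
}

// Recover the distinct schemas from a mixed list of 2-part and 3-part entries.
function schemasFromEnabledTablesSketch(enabledTables: string[]): string[] {
  const schemas = new Set<string>();
  for (const entry of enabledTables) {
    const parsed = parseDottedTableEntrySketch(entry);
    if (parsed !== null) {
      schemas.add(parsed.schema);
    }
  }
  return Array.from(schemas);
}

const bigQueryId = qualifiedTableIdSketch({ catalog: 'my-project', schema: 'analytics', name: 'orders' });
const postgresId = qualifiedTableIdSketch({ catalog: null, schema: 'public', name: 'users' });
const schemas = schemasFromEnabledTablesSketch([bigQueryId, postgresId]);
// bigQueryId === 'my-project.analytics.orders'
// postgresId === 'public.users'
// schemas    === ['analytics', 'public']
```

Taking the schema from the second-to-last part is what keeps it stable across both shapes; reading it from the first part would return the catalog for 3-part entries.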

* fix(cli): allow debug telemetry under opt-out env
Andrey Avtomonov 2026-05-26 08:49:05 +02:00 committed by GitHub
parent 924868841d
commit 56985b7e09
548 changed files with 5048 additions and 2228 deletions

@@ -137,8 +137,10 @@ Enabling query history makes deep ingest readiness matter for later
 When query history is enabled for PostgreSQL, Snowflake, or BigQuery,
 `ktx setup` runs a non-blocking readiness probe after the connection test
 passes. A failed probe still writes setup changes, prints the warehouse-specific
-grant or extension remediation, and leaves query-history ingest disabled until
-you fix the prerequisite.
+grant or extension remediation, and skips query-history processing until you
+fix the prerequisite. If the later schema-context build also fails, interactive
+setup offers **Disable query history and retry** so you can finish database
+setup with `connections.<id>.context.queryHistory.enabled: false`.

 For BigQuery, the remediation tells you to grant `roles/bigquery.resourceViewer`
 on the BigQuery project, or grant a custom role that contains

@@ -89,3 +89,41 @@ enough reason to fix it even when the local code "works."
 (`loadX` vs `loadHigherX`, `createY` vs `createDefaultY`, `xClient`
 vs `xService`), assume callers will pick the wrong one. Unify, or
 document inline why both must exist.
+
+## Dispatch and contract leaks across per-variant layers
+
+Layers with multiple per-variant implementations (warehouse drivers,
+dialects, LLM providers, ingest adapters, historic-SQL probes) drift
+toward parallel switches and informal contracts. The patterns below
+look locally reasonable per file but multiply with the number of
+variants times the number of consumers — every fix has to be applied
+N times, and silent drift between variants is invisible until a user
+hits it.
+
+- **MUST NOT**: Maintain two or more files that switch on the same
+enum or string union to dispatch to per-variant behavior. Promote
+the dispatch to a single registry table keyed by the union, exposed
+through one resolution function. If you find yourself writing the
+third such switch, the second one was already a bug.
+- **MUST**: When every variant of an abstraction implements the same
+method, the method belongs on the shared interface. An informal
+contract that every implementation happens to satisfy is a leak
+waiting to happen — callers will reach for the concrete class
+instead of the contract, and the next variant added will silently
+forget to implement it.
+- **MUST**: When a layer has both a thin shared interface and rich
+per-variant concrete classes, they must agree. Either widen the
+interface so callers never need the concrete class, or make the
+concrete class private (test-only `/** @internal */` JSDoc plus a
+boundary check in `scripts/check-boundaries.mjs`). A class that is
+public AND has methods the interface does not expose is the exact
+configuration that produces leaks.
+
+The warehouse driver / dialect layer in
+`packages/cli/src/connectors/<driver>/` plus
+`packages/cli/src/context/connections/{dialects,drivers}.ts` is the
+canonical worked example: per-driver dialect classes carry
+`/** @internal */`, `scripts/check-boundaries.mjs` enforces the import
+boundary, and dispatch lives in the two registry files. Apply the
+same shape to any other per-variant layer that grows beyond two
+implementations.
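
A minimal sketch of the "single registry table keyed by the union" shape described in the guideline above. The driver subset, the `DialectSketch` interface, and the function names are assumptions for illustration; they are not the contents of `dialects.ts` or `drivers.ts`.

```typescript
// Hypothetical sketch of a registry keyed by a string union. The union here
// is a small subset and every name is illustrative.
type WarehouseDriverSketch = 'sqlite' | 'postgres' | 'bigquery';

// The shared contract: every variant must implement the same methods.
interface DialectSketch {
  readonly type: WarehouseDriverSketch;
  quoteIdentifier(identifier: string): string;
}

// Double-quote style, doubling embedded quotes.
const doubleQuote = (identifier: string): string => `"${identifier.replace(/"/g, '""')}"`;

// Record<Union, ...> makes the compiler reject a registry that forgets a variant.
const DIALECT_REGISTRY: Record<WarehouseDriverSketch, DialectSketch> = {
  sqlite: { type: 'sqlite', quoteIdentifier: doubleQuote },
  postgres: { type: 'postgres', quoteIdentifier: doubleQuote },
  // Backtick quoting; this sketch does not escape embedded backticks.
  bigquery: { type: 'bigquery', quoteIdentifier: (identifier) => '`' + identifier + '`' },
};

function isWarehouseDriverSketch(value: string): value is WarehouseDriverSketch {
  return Object.prototype.hasOwnProperty.call(DIALECT_REGISTRY, value);
}

// The one resolution function: callers never switch on the driver themselves.
function resolveDialectSketch(driver: string): DialectSketch | undefined {
  return isWarehouseDriverSketch(driver) ? DIALECT_REGISTRY[driver] : undefined;
}

const quoted = resolveDialectSketch('bigquery')?.quoteIdentifier('orders');
const unknown = resolveDialectSketch('oracle');
// quoted  === '`orders`'
// unknown === undefined
```

With this shape, adding a variant means extending the union and adding one registry entry; the type checker then points at the registry if the entry is missing.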

@@ -14,8 +14,8 @@
     "src/telemetry/schema-writer.ts!",
     "src/telemetry/index.ts!",
     "scripts/**/*.mjs",
-    "src/**/*.test-utils.ts",
-    "src/**/acceptance-fixtures.ts",
+    "test/**/*.test-utils.ts",
+    "test/**/acceptance-fixtures.ts",
     "src/context/scan/relationship-benchmarks.ts!",
     "src/context/scan/relationship-benchmark-report.ts!"
   ]

@ -32,12 +32,12 @@
"build": "tsc -p tsconfig.json && node dist/telemetry/schema-writer.js src/telemetry/events.schema.json ../../python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json && node scripts/copy-runtime-assets.mjs && node ../../scripts/prepare-cli-bin.mjs", "build": "tsc -p tsconfig.json && node dist/telemetry/schema-writer.js src/telemetry/events.schema.json ../../python/ktx-daemon/src/ktx_daemon/telemetry/events.schema.json && node scripts/copy-runtime-assets.mjs && node ../../scripts/prepare-cli-bin.mjs",
"clean": "node -e \"fs.rmSync('dist', { recursive: true, force: true })\"", "clean": "node -e \"fs.rmSync('dist', { recursive: true, force: true })\"",
"docs:commands": "pnpm run build && node dist/print-command-tree.js", "docs:commands": "pnpm run build && node dist/print-command-tree.js",
"smoke": "vitest run src/standalone-smoke.test.ts src/example-smoke.test.ts --testTimeout 30000", "smoke": "vitest run test/standalone-smoke.test.ts test/example-smoke.test.ts --testTimeout 30000",
"test": "vitest run --exclude src/standalone-smoke.test.ts --exclude src/example-smoke.test.ts --exclude src/setup-databases.test.ts --exclude src/scan.test.ts --exclude src/commands/connection-metabase-setup.test.ts --exclude src/setup-models.test.ts --exclude src/setup-sources.test.ts --exclude src/setup.test.ts --exclude src/connection.test.ts --exclude src/setup-embeddings.test.ts --exclude src/ingest.test.ts --exclude src/commands/connection-mapping.test.ts --exclude src/ingest-viz.test.ts --exclude src/demo.test.ts --exclude src/setup-project.test.ts --exclude src/sl.test.ts --exclude src/local-scan-connectors.test.ts --exclude src/commands/connection-notion.test.ts --exclude src/context/scan/local-scan.test.ts --exclude src/context/mcp/local-project-ports.test.ts --exclude src/context/ingest/local-stage-ingest.test.ts --exclude src/context/sl/pglite-sl-search-prototype.test.ts --exclude src/context/core/git.service.test.ts --exclude src/context/ingest/local-adapters.test.ts --exclude src/context/ingest/local-bundle-ingest.test.ts --exclude src/context/ingest/local-metabase-ingest.test.ts --exclude src/context/sl/local-sl.test.ts --exclude src/context/search/pglite-owner-process.test.ts --exclude src/context/scan/local-enrichment-artifacts.test.ts --exclude src/context/search/pglite-spike.test.ts --exclude src/context/wiki/local-knowledge.test.ts --exclude src/context/sl/local-query.test.ts --exclude src/context/scan/relationship-review-decisions.test.ts --exclude src/context/scan/relationship-profiling.test.ts", "test": "vitest run --exclude test/standalone-smoke.test.ts --exclude test/example-smoke.test.ts --exclude test/setup-databases.test.ts --exclude test/scan.test.ts --exclude test/commands/connection-metabase-setup.test.ts --exclude test/setup-models.test.ts --exclude test/setup-sources.test.ts --exclude test/setup.test.ts --exclude test/connection.test.ts --exclude test/setup-embeddings.test.ts --exclude test/ingest.test.ts --exclude 
test/commands/connection-mapping.test.ts --exclude test/ingest-viz.test.ts --exclude test/demo.test.ts --exclude test/setup-project.test.ts --exclude test/sl.test.ts --exclude test/local-scan-connectors.test.ts --exclude test/commands/connection-notion.test.ts --exclude test/context/scan/local-scan.test.ts --exclude test/context/mcp/local-project-ports.test.ts --exclude test/context/ingest/local-stage-ingest.test.ts --exclude test/context/sl/pglite-sl-search-prototype.test.ts --exclude test/context/core/git.service.test.ts --exclude test/context/ingest/local-adapters.test.ts --exclude test/context/ingest/local-bundle-ingest.test.ts --exclude test/context/ingest/local-metabase-ingest.test.ts --exclude test/context/sl/local-sl.test.ts --exclude test/context/search/pglite-owner-process.test.ts --exclude test/context/scan/local-enrichment-artifacts.test.ts --exclude test/context/search/pglite-spike.test.ts --exclude test/context/wiki/local-knowledge.test.ts --exclude test/context/sl/local-query.test.ts --exclude test/context/scan/relationship-review-decisions.test.ts --exclude test/context/scan/relationship-profiling.test.ts",
"test:slow": "vitest run src/setup-databases.test.ts src/scan.test.ts src/commands/connection-metabase-setup.test.ts src/setup-models.test.ts src/setup-sources.test.ts src/setup.test.ts src/connection.test.ts src/setup-embeddings.test.ts src/ingest.test.ts src/commands/connection-mapping.test.ts src/ingest-viz.test.ts src/demo.test.ts src/setup-project.test.ts src/sl.test.ts src/local-scan-connectors.test.ts src/commands/connection-notion.test.ts src/context/scan/local-scan.test.ts src/context/mcp/local-project-ports.test.ts src/context/ingest/local-stage-ingest.test.ts src/context/sl/pglite-sl-search-prototype.test.ts src/context/core/git.service.test.ts src/context/ingest/local-adapters.test.ts src/context/ingest/local-bundle-ingest.test.ts src/context/ingest/local-metabase-ingest.test.ts src/context/sl/local-sl.test.ts src/context/search/pglite-owner-process.test.ts src/context/scan/local-enrichment-artifacts.test.ts src/context/search/pglite-spike.test.ts src/context/wiki/local-knowledge.test.ts src/context/sl/local-query.test.ts src/context/scan/relationship-review-decisions.test.ts src/context/scan/relationship-profiling.test.ts --testTimeout 30000", "test:slow": "vitest run test/setup-databases.test.ts test/scan.test.ts test/commands/connection-metabase-setup.test.ts test/setup-models.test.ts test/setup-sources.test.ts test/setup.test.ts test/connection.test.ts test/setup-embeddings.test.ts test/ingest.test.ts test/commands/connection-mapping.test.ts test/ingest-viz.test.ts test/demo.test.ts test/setup-project.test.ts test/sl.test.ts test/local-scan-connectors.test.ts test/commands/connection-notion.test.ts test/context/scan/local-scan.test.ts test/context/mcp/local-project-ports.test.ts test/context/ingest/local-stage-ingest.test.ts test/context/sl/pglite-sl-search-prototype.test.ts test/context/core/git.service.test.ts test/context/ingest/local-adapters.test.ts test/context/ingest/local-bundle-ingest.test.ts 
test/context/ingest/local-metabase-ingest.test.ts test/context/sl/local-sl.test.ts test/context/search/pglite-owner-process.test.ts test/context/scan/local-enrichment-artifacts.test.ts test/context/search/pglite-spike.test.ts test/context/wiki/local-knowledge.test.ts test/context/sl/local-query.test.ts test/context/scan/relationship-review-decisions.test.ts test/context/scan/relationship-profiling.test.ts --testTimeout 30000",
"type-check": "tsc -p tsconfig.json --noEmit", "type-check": "tsc -p tsconfig.json --noEmit && tsc -p tsconfig.test.json --noEmit",
"relationships:benchmarks": "pnpm --silent run build && node ../../scripts/relationship-benchmark-report.mjs", "relationships:benchmarks": "pnpm --silent run build && node ../../scripts/relationship-benchmark-report.mjs",
"relationships:benchmarks:test": "KTX_RUN_RELATIONSHIP_BENCHMARKS=1 vitest run src/context/scan/relationship-benchmarks.test.ts", "relationships:benchmarks:test": "KTX_RUN_RELATIONSHIP_BENCHMARKS=1 vitest run test/context/scan/relationship-benchmarks.test.ts",
"search:pglite-spike": "node ../../scripts/pglite-hybrid-search-spike.mjs", "search:pglite-spike": "node ../../scripts/pglite-hybrid-search-spike.mjs",
"search:pglite-owner-prototype": "node ../../scripts/pglite-owner-process-prototype.mjs", "search:pglite-owner-prototype": "node ../../scripts/pglite-owner-process-prototype.mjs",
"search:pglite-sl-prototype": "node ../../scripts/pglite-sl-search-prototype.mjs" "search:pglite-sl-prototype": "node ../../scripts/pglite-sl-search-prototype.mjs"

@ -1,7 +1,30 @@
import { cancel, confirm, isCancel, log, spinner } from '@clack/prompts'; import { cancel, confirm, isCancel, log, spinner } from '@clack/prompts';
import type { KtxCliIo } from './cli-runtime.js';
const ESC = String.fromCharCode(0x1b); const ESC = String.fromCharCode(0x1b);
export interface RailBufferedSource {
stdoutText(): string;
stderrText(): string;
}
export function errorMessage(error: unknown): string {
return error instanceof Error ? error.message : String(error);
}
export function writePrefixedLines(write: (chunk: string) => void, output: string): void {
for (const line of output.split(/\r?\n/)) {
if (line.length > 0) {
write(`${line}\n`);
}
}
}
export function flushPrefixedBufferedCommandOutput(io: KtxCliIo, buffered: RailBufferedSource): void {
writePrefixedLines((chunk) => io.stdout.write(chunk), buffered.stdoutText());
writePrefixedLines((chunk) => io.stderr.write(chunk), buffered.stderrText());
}
export interface KtxCliSpinner { export interface KtxCliSpinner {
start(message: string): void; start(message: string): void;
message(message: string): void; message(message: string): void;

@ -6,6 +6,7 @@ import { type NotionBotInfo, NotionClient } from './context/ingest/adapters/noti
import { createLocalLookerCredentialResolver } from './context/ingest/adapters/looker/local-looker.adapter.js'; import { createLocalLookerCredentialResolver } from './context/ingest/adapters/looker/local-looker.adapter.js';
import { metabaseRuntimeConfigFromLocalConnection } from './context/ingest/adapters/metabase/local-metabase.adapter.js'; import { metabaseRuntimeConfigFromLocalConnection } from './context/ingest/adapters/metabase/local-metabase.adapter.js';
import { testRepoConnection } from './context/ingest/repo-fetch.js'; import { testRepoConnection } from './context/ingest/repo-fetch.js';
import { getDriverRegistration } from './context/connections/drivers.js';
import { parseNotionConnectionConfig, resolveNotionConnectionAuthToken } from './context/connections/notion-config.js'; import { parseNotionConnectionConfig, resolveNotionConnectionAuthToken } from './context/connections/notion-config.js';
import { resolveKtxConfigReference } from './context/core/config-reference.js'; import { resolveKtxConfigReference } from './context/core/config-reference.js';
import { type KtxLocalProject, loadKtxProject } from './context/project/project.js'; import { type KtxLocalProject, loadKtxProject } from './context/project/project.js';
@@ -272,15 +273,7 @@ async function testConnectionByDriver(
     return { driver, detailKey: 'Repo', detailValue: result.repoUrl };
   }

-  if (
-    driver === 'sqlite' ||
-    driver === 'postgres' ||
-    driver === 'mysql' ||
-    driver === 'clickhouse' ||
-    driver === 'sqlserver' ||
-    driver === 'bigquery' ||
-    driver === 'snowflake'
-  ) {
+  if (getDriverRegistration(driver)) {
     const result = await testNativeConnection(
       project,
       connectionId,

@ -1,5 +1,6 @@
import { BigQuery, type TableField } from '@google-cloud/bigquery'; import { BigQuery, type TableField } from '@google-cloud/bigquery';
import { normalizeBigQueryProjectId, normalizeBigQueryRegion } from '../../context/connections/bigquery-identifiers.js'; import { normalizeBigQueryProjectId, normalizeBigQueryRegion } from '../../context/connections/bigquery-identifiers.js';
import { getDialectForDriver } from '../../context/connections/dialects.js';
import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js'; import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js';
import { scopedTableNames } from '../../context/scan/table-ref.js'; import { scopedTableNames } from '../../context/scan/table-ref.js';
@ -26,7 +27,6 @@ import {
import { readFileSync } from 'node:fs'; import { readFileSync } from 'node:fs';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
import { resolve } from 'node:path'; import { resolve } from 'node:path';
import { KtxBigQueryDialect } from './dialect.js';
export interface KtxBigQueryConnectionConfig { export interface KtxBigQueryConnectionConfig {
driver?: string; driver?: string;
@ -235,6 +235,23 @@ function normalizeValue(value: unknown): unknown {
return value; return value;
} }
/** @internal */
export function prepareBigQueryReadOnlyQuery(
sql: string,
params?: Record<string, unknown>,
): { sql: string; params?: Record<string, unknown> } {
if (!params) {
return { sql, params: undefined };
}
let processedSql = sql;
const processedParams: Record<string, unknown> = {};
for (const [key, value] of Object.entries(params)) {
processedSql = processedSql.replace(new RegExp(`:${key}\\b`, 'g'), `@${key}`);
processedParams[key] = value;
}
return { sql: processedSql, params: Object.keys(processedParams).length > 0 ? processedParams : undefined };
}
export function isKtxBigQueryConnectionConfig( export function isKtxBigQueryConnectionConfig(
connection: KtxBigQueryConnectionConfig | undefined, connection: KtxBigQueryConnectionConfig | undefined,
): connection is KtxBigQueryConnectionConfig { ): connection is KtxBigQueryConnectionConfig {
@ -286,7 +303,7 @@ export class KtxBigQueryScanConnector implements KtxScanConnector {
private readonly now: () => Date; private readonly now: () => Date;
private readonly maxBytesBilled?: number | string; private readonly maxBytesBilled?: number | string;
private readonly queryTimeoutMs?: number; private readonly queryTimeoutMs?: number;
private readonly dialect = new KtxBigQueryDialect(); private readonly dialect = getDialectForDriver('bigquery');
private client: KtxBigQueryClient | null = null; private client: KtxBigQueryClient | null = null;
constructor(options: KtxBigQueryScanConnectorOptions) { constructor(options: KtxBigQueryScanConnectorOptions) {
@ -364,7 +381,7 @@ export class KtxBigQueryScanConnector implements KtxScanConnector {
async executeReadOnly(input: KtxBigQueryReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> { async executeReadOnly(input: KtxBigQueryReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> {
this.assertConnection(input.connectionId); this.assertConnection(input.connectionId);
const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows); const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows);
const prepared = this.dialect.prepareQuery(limitedSql, input.params); const prepared = prepareBigQueryReadOnlyQuery(limitedSql, input.params);
const result = await this.query(prepared.sql, prepared.params); const result = await this.query(prepared.sql, prepared.params);
return { ...result, rowCount: result.rows.length }; return { ...result, rowCount: result.rows.length };
} }
@ -411,7 +428,7 @@ export class KtxBigQueryScanConnector implements KtxScanConnector {
return this.dialect.quoteIdentifier(identifier); return this.dialect.quoteIdentifier(identifier);
} }
async listDatasets(): Promise<string[]> { async listSchemas(): Promise<string[]> {
const [datasets] = await this.getClient().getDatasets(); const [datasets] = await this.getClient().getDatasets();
return datasets.map((dataset) => dataset.id).filter((id): id is string => Boolean(id)); return datasets.map((dataset) => dataset.id).filter((id): id is string => Boolean(id));
} }
@ -437,6 +454,7 @@ export class KtxBigQueryScanConnector implements KtxScanConnector {
params, params,
); );
return rows.map((row) => ({ return rows.map((row) => ({
catalog: this.resolved.projectId,
schema: row.table_schema, schema: row.table_schema,
name: row.table_name, name: row.table_name,
kind: kind:

@ -1,9 +1,18 @@
import type { KtxDialect } from '../../context/connections/dialects.js';
import {
columnDisplayPartCount,
formatDialectDisplayRef,
formatDialectTableName,
limitOffsetClause,
parseDialectDisplayRef,
} from '../../context/connections/dialect-helpers.js';
import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js'; import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';
type BigQueryTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>; type BigQueryTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;
export class KtxBigQueryDialect { /** @internal */
readonly type = 'bigquery'; export class KtxBigQueryDialect implements KtxDialect {
readonly type = 'bigquery' as const;
private readonly typeMappings: Record<string, KtxSchemaDimensionType> = { private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
TIMESTAMP: 'time', TIMESTAMP: 'time',
@ -27,13 +36,19 @@ export class KtxBigQueryDialect {
} }
formatTableName(table: BigQueryTableNameRef): string { formatTableName(table: BigQueryTableNameRef): string {
if (table.catalog && table.db) { return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'three-part');
return `${this.quoteIdentifier(table.catalog)}.${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`; }
}
if (table.db) { formatDisplayRef(table: BigQueryTableNameRef): string {
return `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`; return formatDialectDisplayRef(table, 'three-part');
} }
return this.quoteIdentifier(table.name);
parseDisplayRef(display: string): KtxTableRef | null {
return parseDialectDisplayRef(display, 'three-part');
}
columnDisplayTablePartCount(): 1 | 2 | 3 {
return columnDisplayPartCount('three-part');
} }
mapDataType(nativeType: string): string { mapDataType(nativeType: string): string {
@ -93,19 +108,6 @@ export class KtxBigQueryDialect {
return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS STRING)) != '' ORDER BY RAND() LIMIT ${limit}`; return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS STRING)) != '' ORDER BY RAND() LIMIT ${limit}`;
} }
prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: Record<string, unknown> } {
if (!params) {
return { sql, params: undefined };
}
let processedSql = sql;
const processedParams: Record<string, unknown> = {};
for (const [key, value] of Object.entries(params)) {
processedSql = processedSql.replace(new RegExp(`:${key}\\b`, 'g'), `@${key}`);
processedParams[key] = value;
}
return { sql: processedSql, params: Object.keys(processedParams).length > 0 ? processedParams : undefined };
}
getRandomSampleFilter(samplePct: number): string { getRandomSampleFilter(samplePct: number): string {
if (samplePct <= 0 || samplePct >= 1) { if (samplePct <= 0 || samplePct >= 1) {
return ''; return '';
@ -121,7 +123,11 @@ export class KtxBigQueryDialect {
} }
getLimitOffsetClause(limit: number, offset?: number): string { getLimitOffsetClause(limit: number, offset?: number): string {
return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`; return limitOffsetClause(limit, offset);
}
getTopClause(_limit: number): string {
return '';
} }
getNullCountExpression(column: string): string { getNullCountExpression(column: string): string {
@ -132,6 +138,18 @@ export class KtxBigQueryDialect {
return `APPROX_COUNT_DISTINCT(${column})`; return `APPROX_COUNT_DISTINCT(${column})`;
} }
textLengthExpression(columnSql: string): string {
return `LENGTH(CAST(${columnSql} AS STRING))`;
}
castToText(columnSql: string): string {
return `CAST(${columnSql} AS STRING)`;
}
getSampleValueAggregation(innerSql: string): string {
return `(SELECT STRING_AGG(CAST(value AS STRING), '\\u001F') FROM (${innerSql}) AS relationship_profile_values)`;
}
generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string { generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
return ` return `
WITH sampled AS ( WITH sampled AS (
@ -172,36 +190,4 @@ export class KtxBigQueryDialect {
FROM sampled FROM sampled
`; `;
} }
getTimeTruncExpression(
column: string,
granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
timezone?: string,
): string {
const bigQueryGranularity = granularity.toUpperCase();
if (timezone) {
return `DATE_TRUNC(DATETIME(${column}, '${timezone}'), ${bigQueryGranularity})`;
}
return `DATE_TRUNC(${column}, ${bigQueryGranularity})`;
}
getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
const col = timezone ? `DATETIME(${column}, '${timezone}')` : column;
const [rawAmount, rawUnit] = interval.split(' ');
let diffUnit = rawUnit!.toUpperCase();
let amount = Number(rawAmount);
let addUnit = diffUnit;
if (diffUnit === 'WEEK') {
diffUnit = 'DAY';
amount = amount * 7;
addUnit = 'DAY';
}
const originExpr = origin ? `TIMESTAMP '${origin}'` : `TIMESTAMP '1970-01-01'`;
return `TIMESTAMP_ADD(${originExpr}, INTERVAL CAST(FLOOR(TIMESTAMP_DIFF(${col}, ${originExpr}, ${diffUnit}) / ${amount}) * ${amount} AS INT64) ${addUnit})`;
}
parseIntervalToSql(interval: string): string {
const [amount, unit] = interval.split(' ');
return `INTERVAL ${amount} ${unit!.toUpperCase()}`;
}
} }

@ -1,4 +1,5 @@
import { createClient } from '@clickhouse/client'; import { createClient } from '@clickhouse/client';
import { getDialectForDriver } from '../../context/connections/dialects.js';
import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
import { createKtxConnectorCapabilities, type KtxColumnSampleInput, type KtxColumnSampleResult, type KtxColumnStatsInput, type KtxColumnStatsResult, type KtxQueryResult, type KtxReadOnlyQueryInput, type KtxScanConnector, type KtxScanContext, type KtxScanInput, type KtxSchemaColumn, type KtxSchemaSnapshot, type KtxSchemaTable, type KtxTableRef, type KtxTableSampleInput, type KtxTableListEntry, type KtxTableSampleResult } from '../../context/scan/types.js'; import { createKtxConnectorCapabilities, type KtxColumnSampleInput, type KtxColumnSampleResult, type KtxColumnStatsInput, type KtxColumnStatsResult, type KtxQueryResult, type KtxReadOnlyQueryInput, type KtxScanConnector, type KtxScanContext, type KtxScanInput, type KtxSchemaColumn, type KtxSchemaSnapshot, type KtxSchemaTable, type KtxTableRef, type KtxTableSampleInput, type KtxTableListEntry, type KtxTableSampleResult } from '../../context/scan/types.js';
import { scopedTableNames } from '../../context/scan/table-ref.js'; import { scopedTableNames } from '../../context/scan/table-ref.js';
@ -6,7 +7,6 @@ import { readFileSync } from 'node:fs';
import { Agent as HttpsAgent } from 'node:https'; import { Agent as HttpsAgent } from 'node:https';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
import { resolve } from 'node:path'; import { resolve } from 'node:path';
import { KtxClickHouseDialect } from './dialect.js';
export interface KtxClickHouseConnectionConfig { export interface KtxClickHouseConnectionConfig {
driver?: string; driver?: string;
@ -198,6 +198,49 @@ function clickHouseTableKey(database: string, table: string): string {
return `${database}.${table}`; return `${database}.${table}`;
} }
function inferClickHouseQueryParamType(value: unknown): string {
if (value === null || value === undefined) {
return 'String';
}
if (typeof value === 'boolean') {
return 'Bool';
}
if (typeof value === 'number') {
return Number.isInteger(value) ? 'Int64' : 'Float64';
}
if (value instanceof Date) {
return 'DateTime';
}
return 'String';
}
/** @internal */
export function prepareClickHouseReadOnlyQuery(
sql: string,
params?: Record<string, unknown>,
): { sql: string; params?: Record<string, unknown> } {
if (!params) {
return { sql, params: undefined };
}
let parameterizedQuery = sql;
const queryParams: Record<string, unknown> = {};
const sortedKeys = Object.keys(params).sort((a, b) => b.length - a.length);
for (const key of sortedKeys) {
const placeholder = `:${key}`;
if (parameterizedQuery.includes(placeholder)) {
parameterizedQuery = parameterizedQuery.replace(
new RegExp(`:${key}\\b`, 'g'),
`{${key}:${inferClickHouseQueryParamType(params[key])}}`,
);
queryParams[key] = params[key];
}
}
return { sql: parameterizedQuery, params: Object.keys(queryParams).length > 0 ? queryParams : undefined };
}
export function isKtxClickHouseConnectionConfig( export function isKtxClickHouseConnectionConfig(
connection: KtxClickHouseConnectionConfig | undefined, connection: KtxClickHouseConnectionConfig | undefined,
): connection is KtxClickHouseConnectionConfig { ): connection is KtxClickHouseConnectionConfig {
@ -256,7 +299,7 @@ export class KtxClickHouseScanConnector implements KtxScanConnector {
private readonly clientFactory: KtxClickHouseClientFactory; private readonly clientFactory: KtxClickHouseClientFactory;
private readonly endpointResolver?: KtxClickHouseEndpointResolver; private readonly endpointResolver?: KtxClickHouseEndpointResolver;
private readonly now: () => Date; private readonly now: () => Date;
-private readonly dialect = new KtxClickHouseDialect();
+private readonly dialect = getDialectForDriver('clickhouse');
private client: KtxClickHouseClient | null = null; private client: KtxClickHouseClient | null = null;
private resolvedEndpoint: KtxClickHouseResolvedEndpoint | null = null; private resolvedEndpoint: KtxClickHouseResolvedEndpoint | null = null;
@ -408,7 +451,7 @@ export class KtxClickHouseScanConnector implements KtxScanConnector {
async executeReadOnly(input: KtxClickHouseReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> { async executeReadOnly(input: KtxClickHouseReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> {
this.assertConnection(input.connectionId); this.assertConnection(input.connectionId);
const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows); const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows);
-const prepared = this.dialect.prepareQuery(limitedSql, input.params);
+const prepared = prepareClickHouseReadOnlyQuery(limitedSql, input.params);
const result = await this.query(prepared.sql, prepared.params); const result = await this.query(prepared.sql, prepared.params);
return { ...result, rowCount: result.rows.length }; return { ...result, rowCount: result.rows.length };
} }
@ -488,6 +531,7 @@ export class KtxClickHouseScanConnector implements KtxScanConnector {
{ schemas: filterSchemas }, { schemas: filterSchemas },
); );
return rows.map((row) => ({ return rows.map((row) => ({
catalog: null,
schema: row.database, schema: row.database,
name: row.name, name: row.name,
kind: row.engine === 'View' || row.engine === 'MaterializedView' ? ('view' as const) : ('table' as const), kind: row.engine === 'View' || row.engine === 'MaterializedView' ? ('view' as const) : ('table' as const),
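
Review note: the connector now owns the rewrite from `:name` placeholders to ClickHouse `{name:Type}` placeholders. The following is a minimal standalone sketch of that rewrite, not the project's code. `inferParamType` and `toClickHousePlaceholders` are hypothetical names, and keys are assumed to be plain identifiers because they are interpolated into a regular expression unescaped.

```typescript
// Hypothetical standalone sketch; names are illustrative, not part of the codebase.
function inferParamType(value: unknown): string {
  if (typeof value === 'boolean') return 'Bool';
  if (typeof value === 'number') return Number.isInteger(value) ? 'Int64' : 'Float64';
  if (value instanceof Date) return 'DateTime';
  return 'String'; // null, undefined, strings, and anything else
}

function toClickHousePlaceholders(
  sql: string,
  params: Record<string, unknown>,
): { sql: string; params?: Record<string, unknown> } {
  let rewritten = sql;
  const used: Record<string, unknown> = {};
  // Longest keys first, mirroring the ordering used in the diff.
  const keys = Object.keys(params).sort((a, b) => b.length - a.length);
  for (const key of keys) {
    // Assumption: key is identifier-like, so no regex escaping is applied.
    const pattern = new RegExp(`:${key}\\b`, 'g');
    // This sketch tests with the same regex it replaces with;
    // the diff checks with String.includes instead.
    if (pattern.test(rewritten)) {
      rewritten = rewritten.replace(pattern, `{${key}:${inferParamType(params[key])}}`);
      used[key] = params[key];
    }
  }
  // Only keys that actually appear in the SQL are forwarded.
  return { sql: rewritten, params: Object.keys(used).length > 0 ? used : undefined };
}

const clickHouseExample = toClickHousePlaceholders(
  'SELECT * FROM events WHERE id = :id AND id_type = :id_type AND active = :active',
  { id: 7, id_type: 'user', active: true, unused: 1.5 },
);
console.log(clickHouseExample.sql);
```

Keys that never appear in the SQL (`unused` here) are dropped from the forwarded parameters, and an empty result collapses to `undefined`, which matches the return expression added in this commit.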

@ -1,9 +1,18 @@
import type { KtxDialect } from '../../context/connections/dialects.js';
import {
columnDisplayPartCount,
formatDialectDisplayRef,
formatDialectTableName,
limitOffsetClause,
parseDialectDisplayRef,
} from '../../context/connections/dialect-helpers.js';
import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js'; import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';
type ClickHouseTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>; type ClickHouseTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;
-export class KtxClickHouseDialect {
-readonly type = 'clickhouse';
+/** @internal */
+export class KtxClickHouseDialect implements KtxDialect {
+readonly type = 'clickhouse' as const;
private readonly typeMappings: Record<string, KtxSchemaDimensionType> = { private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
date: 'time', date: 'time',
@ -45,9 +54,19 @@ export class KtxClickHouseDialect {
} }
formatTableName(table: ClickHouseTableNameRef): string { formatTableName(table: ClickHouseTableNameRef): string {
-return table.db
-? `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`
-: this.quoteIdentifier(table.name);
+return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'ansi');
+}
+formatDisplayRef(table: ClickHouseTableNameRef): string {
+return formatDialectDisplayRef(table, 'ansi');
+}
+parseDisplayRef(display: string): KtxTableRef | null {
+return parseDialectDisplayRef(display, 'ansi');
+}
+columnDisplayTablePartCount(): 1 | 2 | 3 {
+return columnDisplayPartCount('ansi');
} }
mapDataType(nativeType: string): string { mapDataType(nativeType: string): string {
@ -97,29 +116,6 @@ export class KtxClickHouseDialect {
return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND trim(toString(${quotedColumn})) != '' LIMIT ${limit}`; return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND trim(toString(${quotedColumn})) != '' LIMIT ${limit}`;
} }
prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: Record<string, unknown> } {
if (!params) {
return { sql, params: undefined };
}
let parameterizedQuery = sql;
const queryParams: Record<string, unknown> = {};
const sortedKeys = Object.keys(params).sort((a, b) => b.length - a.length);
for (const key of sortedKeys) {
const placeholder = `:${key}`;
if (parameterizedQuery.includes(placeholder)) {
parameterizedQuery = parameterizedQuery.replace(
new RegExp(`:${key}\\b`, 'g'),
`{${key}:${this.inferClickHouseType(params[key])}}`,
);
queryParams[key] = params[key];
}
}
return { sql: parameterizedQuery, params: queryParams };
}
getRandomSampleFilter(samplePct: number): string { getRandomSampleFilter(samplePct: number): string {
if (samplePct <= 0 || samplePct >= 1) { if (samplePct <= 0 || samplePct >= 1) {
return ''; return '';
@ -132,7 +128,11 @@ export class KtxClickHouseDialect {
} }
getLimitOffsetClause(limit: number, offset?: number): string { getLimitOffsetClause(limit: number, offset?: number): string {
-return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
+return limitOffsetClause(limit, offset);
+}
+getTopClause(_limit: number): string {
+return '';
} }
getNullCountExpression(column: string): string { getNullCountExpression(column: string): string {
@ -143,6 +143,18 @@ export class KtxClickHouseDialect {
return `COUNT(DISTINCT ${column})`; return `COUNT(DISTINCT ${column})`;
} }
textLengthExpression(columnSql: string): string {
return `length(toString(${columnSql}))`;
}
castToText(columnSql: string): string {
return `toString(${columnSql})`;
}
getSampleValueAggregation(innerSql: string): string {
return `(SELECT arrayStringConcat(groupArray(toString(value)), '\\x1F') FROM (${innerSql}) AS relationship_profile_values)`;
}
generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string { generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
return ` return `
SELECT COUNT(DISTINCT val) AS cardinality SELECT COUNT(DISTINCT val) AS cardinality
@ -181,99 +193,9 @@ export class KtxClickHouseDialect {
) )
`; `;
} }
getTimeTruncExpression(
column: string,
granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
timezone?: string,
): string {
const tz = timezone ? `, '${timezone}'` : '';
switch (granularity) {
case 'day':
return `toStartOfDay(${column}${tz})`;
case 'week':
return `toStartOfWeek(${column}, 1${tz})`;
case 'month':
return `toStartOfMonth(${column}${tz})`;
case 'quarter':
return `toStartOfQuarter(${column}${tz})`;
case 'year':
return `toStartOfYear(${column}${tz})`;
}
}
getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
const col = timezone ? `toTimezone(${column}, '${timezone}')` : column;
const [rawAmount, rawUnit] = interval.split(' ');
const amount = Number(rawAmount);
const unit = rawUnit!.toLowerCase();
const originExpr = origin ? `toDateTime('${origin}')` : "toDateTime('1970-01-01')";
const calendarUnit = this.toClickHouseDateDiffUnit(unit);
if (calendarUnit) {
return `dateAdd(${calendarUnit}, intDiv(dateDiff(${calendarUnit}, ${originExpr}, ${col}), ${amount}) * ${amount}, ${originExpr})`;
}
const seconds = this.intervalToSeconds(amount, unit);
return `addSeconds(${originExpr}, intDiv(toUInt64(dateDiff('second', ${originExpr}, ${col})), ${seconds}) * ${seconds})`;
}
parseIntervalToSql(interval: string): string {
const [amount, unit] = interval.split(' ');
return `INTERVAL ${amount} ${unit!.toUpperCase()}`;
}
private unwrapClickHouseType(value: string, wrapper: string): string { private unwrapClickHouseType(value: string, wrapper: string): string {
const prefix = `${wrapper}(`; const prefix = `${wrapper}(`;
return value.startsWith(prefix) && value.endsWith(')') ? value.slice(prefix.length, -1) : value; return value.startsWith(prefix) && value.endsWith(')') ? value.slice(prefix.length, -1) : value;
} }
private inferClickHouseType(value: unknown): string {
if (value === null || value === undefined) {
return 'String';
}
if (typeof value === 'boolean') {
return 'Bool';
}
if (typeof value === 'number') {
return Number.isInteger(value) ? 'Int64' : 'Float64';
}
if (value instanceof Date) {
return 'DateTime';
}
return 'String';
}
private toClickHouseDateDiffUnit(unit: string): string | null {
if (unit === 'month' || unit === 'months') {
return "'month'";
}
if (unit === 'quarter' || unit === 'quarters') {
return "'quarter'";
}
if (unit === 'year' || unit === 'years') {
return "'year'";
}
return null;
}
private intervalToSeconds(amount: number, unit: string): number {
switch (unit) {
case 'second':
case 'seconds':
return amount;
case 'minute':
case 'minutes':
return amount * 60;
case 'hour':
case 'hours':
return amount * 3600;
case 'day':
case 'days':
return amount * 86400;
case 'week':
case 'weeks':
return amount * 604800;
default:
return amount * 86400;
}
}
} }

@ -2,6 +2,7 @@ import mysql, { type FieldPacket, type Pool, type RowDataPacket } from 'mysql2/p
import { readFileSync } from 'node:fs'; import { readFileSync } from 'node:fs';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
import { resolve } from 'node:path'; import { resolve } from 'node:path';
import { getDialectForDriver } from '../../context/connections/dialects.js';
import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
import { import {
constraintDiscoveryWarning, constraintDiscoveryWarning,
@ -30,7 +31,6 @@ import {
type KtxTableSampleInput, type KtxTableSampleInput,
type KtxTableSampleResult, type KtxTableSampleResult,
} from '../../context/scan/types.js'; } from '../../context/scan/types.js';
import { KtxMysqlDialect } from './dialect.js';
export interface KtxMysqlConnectionConfig { export interface KtxMysqlConnectionConfig {
driver?: string; driver?: string;
@ -303,6 +303,25 @@ function queryParams(params: Record<string, unknown> | unknown[] | undefined): u
return Array.isArray(params) ? params : Object.values(params); return Array.isArray(params) ? params : Object.values(params);
} }
/** @internal */
export function prepareMysqlReadOnlyQuery(
sql: string,
params?: Record<string, unknown>,
): { sql: string; params?: unknown[] } {
if (!params) {
return { sql, params: undefined };
}
const values: unknown[] = [];
const parameterizedQuery = sql.replace(/:([A-Za-z_][A-Za-z0-9_]*)\b/g, (placeholder, key: string) => {
if (!(key in params)) {
return placeholder;
}
values.push(params[key]);
return '?';
});
return { sql: parameterizedQuery, params: values };
}
export function isKtxMysqlConnectionConfig( export function isKtxMysqlConnectionConfig(
connection: KtxMysqlConnectionConfig | undefined, connection: KtxMysqlConnectionConfig | undefined,
): connection is KtxMysqlConnectionConfig { ): connection is KtxMysqlConnectionConfig {
@ -376,7 +395,7 @@ export class KtxMysqlScanConnector implements KtxScanConnector {
private readonly poolFactory: KtxMysqlPoolFactory; private readonly poolFactory: KtxMysqlPoolFactory;
private readonly endpointResolver?: KtxMysqlEndpointResolver; private readonly endpointResolver?: KtxMysqlEndpointResolver;
private readonly now: () => Date; private readonly now: () => Date;
-private readonly dialect = new KtxMysqlDialect();
+private readonly dialect = getDialectForDriver('mysql');
private pool: KtxMysqlPool | null = null; private pool: KtxMysqlPool | null = null;
private resolvedEndpoint: KtxMysqlResolvedEndpoint | null = null; private resolvedEndpoint: KtxMysqlResolvedEndpoint | null = null;
@ -550,7 +569,7 @@ export class KtxMysqlScanConnector implements KtxScanConnector {
const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows); const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows);
const prepared = Array.isArray(input.params) const prepared = Array.isArray(input.params)
? { sql: limitedSql, params: input.params } ? { sql: limitedSql, params: input.params }
-: this.dialect.prepareQuery(limitedSql, input.params);
+: prepareMysqlReadOnlyQuery(limitedSql, input.params);
const result = await this.query(prepared.sql, prepared.params); const result = await this.query(prepared.sql, prepared.params);
return { ...result, rowCount: result.rows.length }; return { ...result, rowCount: result.rows.length };
} }
@ -625,6 +644,7 @@ export class KtxMysqlScanConnector implements KtxScanConnector {
filterSchemas, filterSchemas,
); );
return rows.map((row) => ({ return rows.map((row) => ({
catalog: null,
schema: row.TABLE_SCHEMA, schema: row.TABLE_SCHEMA,
name: row.TABLE_NAME, name: row.TABLE_NAME,
kind: row.TABLE_TYPE === 'VIEW' ? ('view' as const) : ('table' as const), kind: row.TABLE_TYPE === 'VIEW' ? ('view' as const) : ('table' as const),
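
Review note: `prepareMysqlReadOnlyQuery` turns named placeholders into positional `?` markers in a single pass, so the value list follows the order of occurrence in the SQL rather than the key order of the params object. A minimal sketch of that behavior follows; `toPositionalPlaceholders` is a hypothetical name used only for illustration.

```typescript
// Hypothetical standalone sketch of named-to-positional placeholder rewriting.
function toPositionalPlaceholders(
  sql: string,
  params: Record<string, unknown>,
): { sql: string; params: unknown[] } {
  const values: unknown[] = [];
  const rewritten = sql.replace(
    /:([A-Za-z_][A-Za-z0-9_]*)\b/g,
    (placeholder: string, key: string) => {
      // Unknown names are left untouched so the database reports them.
      if (!(key in params)) return placeholder;
      // One value is pushed per occurrence, in SQL order.
      values.push(params[key]);
      return '?';
    },
  );
  return { sql: rewritten, params: values };
}

const mysqlExample = toPositionalPlaceholders(
  'SELECT * FROM t WHERE a = :x OR b = :x OR c = :y OR d = :missing',
  { y: 2, x: 1 },
);
console.log(mysqlExample.sql, mysqlExample.params);
```

A placeholder that occurs twice contributes its value twice, which keeps the number of `?` markers and bound values in step.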

@ -1,9 +1,18 @@
import type { KtxDialect } from '../../context/connections/dialects.js';
import {
columnDisplayPartCount,
formatDialectDisplayRef,
formatDialectTableName,
limitOffsetClause,
parseDialectDisplayRef,
} from '../../context/connections/dialect-helpers.js';
import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js'; import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';
type MysqlTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>; type MysqlTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;
-export class KtxMysqlDialect {
-readonly type = 'mysql';
+/** @internal */
+export class KtxMysqlDialect implements KtxDialect {
+readonly type = 'mysql' as const;
private readonly typeMappings: Record<string, KtxSchemaDimensionType> = { private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
datetime: 'time', datetime: 'time',
@ -41,9 +50,19 @@ export class KtxMysqlDialect {
} }
formatTableName(table: MysqlTableNameRef): string { formatTableName(table: MysqlTableNameRef): string {
-return table.db
-? `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`
-: this.quoteIdentifier(table.name);
+return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'ansi');
+}
+formatDisplayRef(table: MysqlTableNameRef): string {
+return formatDialectDisplayRef(table, 'ansi');
+}
+parseDisplayRef(display: string): KtxTableRef | null {
+return parseDialectDisplayRef(display, 'ansi');
+}
+columnDisplayTablePartCount(): 1 | 2 | 3 {
+return columnDisplayPartCount('ansi');
} }
mapDataType(nativeType: string): string { mapDataType(nativeType: string): string {
@ -91,21 +110,6 @@ export class KtxMysqlDialect {
return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS CHAR)) != '' LIMIT ${limit}`; return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS CHAR)) != '' LIMIT ${limit}`;
} }
prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: unknown[] } {
if (!params) {
return { sql, params: undefined };
}
const values: unknown[] = [];
const parameterizedQuery = sql.replace(/:([A-Za-z_][A-Za-z0-9_]*)\b/g, (placeholder, key: string) => {
if (!(key in params)) {
return placeholder;
}
values.push(params[key]);
return '?';
});
return { sql: parameterizedQuery, params: values };
}
getRandomSampleFilter(samplePct: number): string { getRandomSampleFilter(samplePct: number): string {
if (samplePct <= 0 || samplePct >= 1) { if (samplePct <= 0 || samplePct >= 1) {
return ''; return '';
@ -118,7 +122,11 @@ export class KtxMysqlDialect {
} }
getLimitOffsetClause(limit: number, offset?: number): string { getLimitOffsetClause(limit: number, offset?: number): string {
-return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
+return limitOffsetClause(limit, offset);
+}
+getTopClause(_limit: number): string {
+return '';
} }
getNullCountExpression(column: string): string { getNullCountExpression(column: string): string {
@ -129,6 +137,18 @@ export class KtxMysqlDialect {
return `COUNT(DISTINCT ${column})`; return `COUNT(DISTINCT ${column})`;
} }
textLengthExpression(columnSql: string): string {
return `CHAR_LENGTH(CAST(${columnSql} AS CHAR))`;
}
castToText(columnSql: string): string {
return `CAST(${columnSql} AS CHAR)`;
}
getSampleValueAggregation(innerSql: string): string {
return `(SELECT GROUP_CONCAT(CAST(value AS CHAR) SEPARATOR CHAR(31)) FROM (${innerSql}) AS relationship_profile_values)`;
}
generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string { generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
return ` return `
SELECT COUNT(DISTINCT val) AS cardinality SELECT COUNT(DISTINCT val) AS cardinality
@ -167,36 +187,4 @@ export class KtxMysqlDialect {
) AS sampled ) AS sampled
`; `;
} }
getTimeTruncExpression(
column: string,
granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
timezone?: string,
): string {
const col = timezone ? `CONVERT_TZ(${column}, '+00:00', '${timezone}')` : column;
switch (granularity) {
case 'day':
return `DATE(${col})`;
case 'week':
return `DATE(${col} - INTERVAL WEEKDAY(${col}) DAY)`;
case 'month':
return `DATE_FORMAT(${col}, '%Y-%m-01')`;
case 'quarter':
return `MAKEDATE(YEAR(${col}), 1) + INTERVAL (QUARTER(${col}) - 1) QUARTER`;
case 'year':
return `DATE_FORMAT(${col}, '%Y-01-01')`;
}
}
getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
const col = timezone ? `CONVERT_TZ(${column}, '+00:00', '${timezone}')` : column;
const [amount, unit] = interval.split(' ');
const originExpr = origin ? `'${origin}'` : `'1970-01-01'`;
return `DATE_ADD(${originExpr}, INTERVAL FLOOR(TIMESTAMPDIFF(${unit!.toUpperCase()}, ${originExpr}, ${col}) / ${amount}) * ${amount} ${unit!.toUpperCase()})`;
}
parseIntervalToSql(interval: string): string {
const [amount, unit] = interval.split(' ');
return `INTERVAL ${amount} ${unit!.toUpperCase()}`;
}
} }
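
Review note: the dialects now delegate `getLimitOffsetClause` to a shared `limitOffsetClause` helper whose body is not part of this diff. The sketch below assumes the helper keeps the behavior of the inline expression it replaces; `limitOffsetClauseSketch` is a hypothetical stand-in, not the real helper.

```typescript
// Hypothetical stand-in for the shared helper; assumes it preserves the
// inline expression removed from each dialect in this commit.
function limitOffsetClauseSketch(limit: number, offset?: number): string {
  // OFFSET is only emitted when it is a positive number.
  return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
}

const firstPage = limitOffsetClauseSketch(10);
const zeroOffset = limitOffsetClauseSketch(10, 0);
const thirdPage = limitOffsetClauseSketch(10, 20);
console.log(firstPage, '|', zeroOffset, '|', thirdPage);
```

Under that assumption an offset of `0` and a missing offset produce the same clause, so callers do not need to special-case the first page.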

@ -1,6 +1,7 @@
import { readFileSync } from 'node:fs'; import { readFileSync } from 'node:fs';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
import { resolve } from 'node:path'; import { resolve } from 'node:path';
import { getDialectForDriver } from '../../context/connections/dialects.js';
import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js'; import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js';
import { scopedTableNames } from '../../context/scan/table-ref.js'; import { scopedTableNames } from '../../context/scan/table-ref.js';
@ -26,7 +27,6 @@ import {
type KtxTableSampleResult, type KtxTableSampleResult,
} from '../../context/scan/types.js'; } from '../../context/scan/types.js';
import { Pool } from 'pg'; import { Pool } from 'pg';
import { KtxPostgresDialect } from './dialect.js';
const PG_OID_TYPE_MAP: Record<number, string> = { const PG_OID_TYPE_MAP: Record<number, string> = {
16: 'boolean', 16: 'boolean',
@ -219,6 +219,29 @@ function groupByTable<T extends { table_name: string }>(rows: T[]): Map<string,
return grouped; return grouped;
} }
/** @internal */
export function preparePostgresReadOnlyQuery(
sql: string,
params?: Record<string, unknown>,
): { sql: string; params?: unknown[] } {
if (!params) {
return { sql, params: undefined };
}
const paramNames = Object.keys(params);
const values: unknown[] = new Array(paramNames.length);
const paramIndexMap = new Map<string, number>();
paramNames.forEach((name, index) => {
paramIndexMap.set(name, index + 1);
values[index] = params[name];
});
const sortedKeys = [...paramNames].sort((a, b) => b.length - a.length);
let parameterizedQuery = sql;
for (const name of sortedKeys) {
parameterizedQuery = parameterizedQuery.replace(new RegExp(`:${name}\\b`, 'g'), `$${paramIndexMap.get(name)}`);
}
return { sql: parameterizedQuery, params: values };
}
function primaryKeyMap(rows: PostgresPrimaryKeyRow[]): Map<string, Set<string>> { function primaryKeyMap(rows: PostgresPrimaryKeyRow[]): Map<string, Set<string>> {
const grouped = new Map<string, Set<string>>(); const grouped = new Map<string, Set<string>>();
for (const row of rows) { for (const row of rows) {
@ -400,7 +423,7 @@ export class KtxPostgresScanConnector implements KtxScanConnector {
private readonly poolFactory: KtxPostgresPoolFactory; private readonly poolFactory: KtxPostgresPoolFactory;
private readonly endpointResolver?: KtxPostgresEndpointResolver; private readonly endpointResolver?: KtxPostgresEndpointResolver;
private readonly now: () => Date; private readonly now: () => Date;
-private readonly dialect = new KtxPostgresDialect();
+private readonly dialect = getDialectForDriver('postgres');
private pool: KtxPostgresPool | null = null; private pool: KtxPostgresPool | null = null;
private lastIdlePoolError: Error | null = null; private lastIdlePoolError: Error | null = null;
private resolvedEndpoint: KtxPostgresResolvedEndpoint | null = null; private resolvedEndpoint: KtxPostgresResolvedEndpoint | null = null;
@ -489,7 +512,7 @@ export class KtxPostgresScanConnector implements KtxScanConnector {
const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows); const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows);
const prepared = Array.isArray(input.params) const prepared = Array.isArray(input.params)
? { sql: limitedSql, params: input.params } ? { sql: limitedSql, params: input.params }
-: this.dialect.prepareQuery(limitedSql, input.params);
+: preparePostgresReadOnlyQuery(limitedSql, input.params);
const result = await this.query(prepared.sql, prepared.params); const result = await this.query(prepared.sql, prepared.params);
return { ...result, rowCount: result.rows.length }; return { ...result, rowCount: result.rows.length };
} }
@ -584,6 +607,7 @@ export class KtxPostgresScanConnector implements KtxScanConnector {
[filterSchemas], [filterSchemas],
); );
return rows.map((row) => ({ return rows.map((row) => ({
catalog: null,
schema: row.schema_name, schema: row.schema_name,
name: row.table_name, name: row.table_name,
kind: row.table_kind === 'v' ? ('view' as const) : ('table' as const), kind: row.table_kind === 'v' ? ('view' as const) : ('table' as const),
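
Review note: `preparePostgresReadOnlyQuery` assigns each named parameter a 1-based index in key order and rewrites `:name` to `$n`, so repeated names reuse the same index. The sketch below shows the same idea in a single pass instead of the per-key loop used in the diff. `toNumberedPlaceholders` is a hypothetical name, and the single-pass approach is a swapped-in variation rather than the commit's implementation.

```typescript
// Hypothetical single-pass variant of named-to-numbered placeholder rewriting.
function toNumberedPlaceholders(
  sql: string,
  params: Record<string, unknown>,
): { sql: string; params: unknown[] } {
  const names = Object.keys(params);
  // Each name gets a 1-based index in key order, as in the diff.
  const indexByName = new Map<string, number>(
    names.map((name, i): [string, number] => [name, i + 1]),
  );
  // Every value is forwarded, including ones the SQL never references.
  const values = names.map((name) => params[name]);
  const rewritten = sql.replace(
    /:([A-Za-z_][A-Za-z0-9_]*)\b/g,
    (placeholder: string, name: string) => {
      const index = indexByName.get(name);
      // A replacer function's return value is inserted literally, so "$1"
      // is never interpreted as a capture-group reference.
      return index === undefined ? placeholder : `$${index}`;
    },
  );
  return { sql: rewritten, params: values };
}

const postgresExample = toNumberedPlaceholders(
  'SELECT * FROM orders WHERE user_id = :user AND region = :region AND owner_id = :user',
  { user: 42, region: 'eu' },
);
console.log(postgresExample.sql, postgresExample.params);
```

Both occurrences of `:user` map to `$1`, so the value is bound once even though it is referenced twice.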

@ -1,9 +1,18 @@
import type { KtxDialect } from '../../context/connections/dialects.js';
import {
columnDisplayPartCount,
formatDialectDisplayRef,
formatDialectTableName,
limitOffsetClause,
parseDialectDisplayRef,
} from '../../context/connections/dialect-helpers.js';
import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js'; import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';
type PostgresTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>; type PostgresTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;
-export class KtxPostgresDialect {
-readonly type = 'postgresql';
+/** @internal */
+export class KtxPostgresDialect implements KtxDialect {
+readonly type = 'postgres' as const;
private readonly typeMappings: Record<string, KtxSchemaDimensionType> = { private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
timestamp: 'time', timestamp: 'time',
@ -45,9 +54,19 @@ export class KtxPostgresDialect {
} }
formatTableName(table: PostgresTableNameRef): string { formatTableName(table: PostgresTableNameRef): string {
-return table.db
-? `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`
-: this.quoteIdentifier(table.name);
+return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'ansi');
+}
+formatDisplayRef(table: PostgresTableNameRef): string {
+return formatDialectDisplayRef(table, 'ansi');
+}
+parseDisplayRef(display: string): KtxTableRef | null {
+return parseDialectDisplayRef(display, 'ansi');
+}
+columnDisplayTablePartCount(): 1 | 2 | 3 {
+return columnDisplayPartCount('ansi');
} }
mapDataType(nativeType: string): string { mapDataType(nativeType: string): string {
@ -92,25 +111,6 @@ export class KtxPostgresDialect {
return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS TEXT)) != '' LIMIT ${limit}`; return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS TEXT)) != '' LIMIT ${limit}`;
} }
prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: unknown[] } {
if (!params) {
return { sql, params: undefined };
}
const paramNames = Object.keys(params);
const values: unknown[] = new Array(paramNames.length);
const paramIndexMap = new Map<string, number>();
paramNames.forEach((name, index) => {
paramIndexMap.set(name, index + 1);
values[index] = params[name];
});
const sortedKeys = [...paramNames].sort((a, b) => b.length - a.length);
let parameterizedQuery = sql;
for (const name of sortedKeys) {
parameterizedQuery = parameterizedQuery.replace(new RegExp(`:${name}\\b`, 'g'), `$${paramIndexMap.get(name)}`);
}
return { sql: parameterizedQuery, params: values };
}
getRandomSampleFilter(samplePct: number): string { getRandomSampleFilter(samplePct: number): string {
if (samplePct <= 0 || samplePct >= 1) { if (samplePct <= 0 || samplePct >= 1) {
return ''; return '';
@ -126,7 +126,11 @@ export class KtxPostgresDialect {
} }
getLimitOffsetClause(limit: number, offset?: number): string { getLimitOffsetClause(limit: number, offset?: number): string {
-return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
+return limitOffsetClause(limit, offset);
+}
+getTopClause(_limit: number): string {
+return '';
} }
getNullCountExpression(column: string): string { getNullCountExpression(column: string): string {
@ -137,6 +141,18 @@ export class KtxPostgresDialect {
return `COUNT(DISTINCT ${column})`; return `COUNT(DISTINCT ${column})`;
} }
textLengthExpression(columnSql: string): string {
return `LENGTH(CAST(${columnSql} AS TEXT))`;
}
castToText(columnSql: string): string {
return `CAST(${columnSql} AS TEXT)`;
}
getSampleValueAggregation(innerSql: string): string {
return `(SELECT STRING_AGG(CAST(value AS TEXT), CHR(31)) FROM (${innerSql}) AS relationship_profile_values)`;
}
generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string { generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
return ` return `
WITH sampled AS ( WITH sampled AS (
@ -191,23 +207,4 @@ export class KtxPostgresDialect {
FROM sampled FROM sampled
`; `;
} }
getTimeTruncExpression(
column: string,
granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
timezone?: string,
): string {
const col = timezone ? `(${column} AT TIME ZONE '${timezone.replace(/'/g, "''")}')` : column;
return `DATE_TRUNC('${granularity}', ${col})`;
}
getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
const col = timezone ? `(${column} AT TIME ZONE '${timezone.replace(/'/g, "''")}')` : column;
const originExpr = origin ? `TIMESTAMP '${origin.replace(/'/g, "''")}'` : "TIMESTAMP '1970-01-01'";
return `${originExpr} + FLOOR(EXTRACT(EPOCH FROM (${col} - ${originExpr})) / EXTRACT(EPOCH FROM INTERVAL '${interval.replace(/'/g, "''")}')) * INTERVAL '${interval.replace(/'/g, "''")}'`;
}
parseIntervalToSql(interval: string): string {
return `INTERVAL '${interval.replace(/'/g, "''")}'`;
}
} }
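
Review note: the three `getSampleValueAggregation` implementations shown so far (ClickHouse, MySQL, Postgres) all join sampled values with character 31, the ASCII unit separator. How the result is consumed is not part of this diff; the sketch below assumes the caller receives one aggregated string per column and splits it back into values. `splitSampleValues` is a hypothetical name.

```typescript
// Hypothetical consumer of the aggregated sample string; not part of the codebase.
const UNIT_SEPARATOR = String.fromCharCode(31); // same character as CHAR(31) / CHR(31) / '\x1F'

function splitSampleValues(aggregated: string | null): string[] {
  // Aggregating zero rows is assumed to yield NULL or an empty string.
  if (aggregated === null || aggregated === '') return [];
  return aggregated.split(UNIT_SEPARATOR);
}

// Values that contain commas or spaces survive the round trip intact.
const aggregatedSample = ['a,b', 'c d', '42'].join(UNIT_SEPARATOR);
const sampleValues = splitSampleValues(aggregatedSample);
console.log(sampleValues);
```

A control character makes a safer delimiter than a comma because it is unlikely to appear in real column values.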

@ -2,6 +2,7 @@ import { createPrivateKey } from 'node:crypto';
import { readFileSync } from 'node:fs'; import { readFileSync } from 'node:fs';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
import { resolve } from 'node:path'; import { resolve } from 'node:path';
import { getDialectForDriver } from '../../context/connections/dialects.js';
import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js'; import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js';
import { scopedTableNames } from '../../context/scan/table-ref.js'; import { scopedTableNames } from '../../context/scan/table-ref.js';
@ -27,7 +28,6 @@ import {
} from '../../context/scan/types.js'; } from '../../context/scan/types.js';
import snowflake from 'snowflake-sdk'; import snowflake from 'snowflake-sdk';
import type { Bind, Binds, Connection, ConnectionOptions } from 'snowflake-sdk'; import type { Bind, Binds, Connection, ConnectionOptions } from 'snowflake-sdk';
import { KtxSnowflakeDialect } from './dialect.js';
import { assertSafeSnowflakeIdentifier, quoteSnowflakeIdentifier } from './identifiers.js'; import { assertSafeSnowflakeIdentifier, quoteSnowflakeIdentifier } from './identifiers.js';
import { configureSnowflakeSdkLogger } from './sdk-logger.js'; import { configureSnowflakeSdkLogger } from './sdk-logger.js';
@ -229,6 +229,14 @@ function toSnowflakeBinds(params: unknown[] | undefined): Binds | undefined {
return params?.map((value) => toSnowflakeBind(value)); return params?.map((value) => toSnowflakeBind(value));
} }
/** @internal */
export function prepareSnowflakeReadOnlyQuery(
sql: string,
params?: Record<string, unknown>,
): { sql: string; params?: unknown[] } {
return { sql, params: params ? Object.values(params) : undefined };
}
export function isKtxSnowflakeConnectionConfig( export function isKtxSnowflakeConnectionConfig(
connection: KtxSnowflakeConnectionConfig | undefined, connection: KtxSnowflakeConnectionConfig | undefined,
): connection is KtxSnowflakeConnectionConfig { ): connection is KtxSnowflakeConnectionConfig {
@ -430,6 +438,7 @@ class SnowflakeSdkDriver implements KtxSnowflakeDriver {
[this.resolved.database, ...(schemas ?? [])], [this.resolved.database, ...(schemas ?? [])],
); );
return result.rows.map((row) => ({ return result.rows.map((row) => ({
catalog: this.resolved.database,
schema: String(row[0]), schema: String(row[0]),
name: String(row[1]), name: String(row[1]),
kind: String(row[2]) === 'VIEW' ? ('view' as const) : ('table' as const), kind: String(row[2]) === 'VIEW' ? ('view' as const) : ('table' as const),
@ -550,7 +559,7 @@ export class KtxSnowflakeScanConnector implements KtxScanConnector {
private readonly resolved: KtxSnowflakeResolvedConnectionConfig; private readonly resolved: KtxSnowflakeResolvedConnectionConfig;
private readonly driverFactory: KtxSnowflakeDriverFactory; private readonly driverFactory: KtxSnowflakeDriverFactory;
-private readonly dialect = new KtxSnowflakeDialect();
+private readonly dialect = getDialectForDriver('snowflake');
private readonly now: () => Date; private readonly now: () => Date;
private driverInstance: KtxSnowflakeDriver | null = null; private driverInstance: KtxSnowflakeDriver | null = null;
@ -635,7 +644,7 @@ export class KtxSnowflakeScanConnector implements KtxScanConnector {
async executeReadOnly(input: KtxSnowflakeReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> { async executeReadOnly(input: KtxSnowflakeReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> {
this.assertConnection(input.connectionId); this.assertConnection(input.connectionId);
const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows); const limitedSql = limitSqlForExecution(assertReadOnlySql(input.sql), input.maxRows);
const prepared = this.dialect.prepareQuery(limitedSql, input.params); const prepared = prepareSnowflakeReadOnlyQuery(limitedSql, input.params);
return this.getDriver().query(prepared.sql, prepared.params); return this.getDriver().query(prepared.sql, prepared.params);
} }
@ -696,6 +705,7 @@ export class KtxSnowflakeScanConnector implements KtxScanConnector {
[this.resolved.database, ...(schemas ?? [])], [this.resolved.database, ...(schemas ?? [])],
); );
return result.rows.map((row) => ({ return result.rows.map((row) => ({
catalog: this.resolved.database,
schema: String(row[0]), schema: String(row[0]),
name: String(row[1]), name: String(row[1]),
kind: String(row[2]) === 'VIEW' ? ('view' as const) : ('table' as const), kind: String(row[2]) === 'VIEW' ? ('view' as const) : ('table' as const),
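
The Snowflake helper above turns a named-parameter record into positional binds with `Object.values`, so bind order follows JavaScript property order. A minimal sketch of that behaviour, assuming `?` placeholders in the SQL text; `toPositionalBinds` is a hypothetical stand-in, not the connector's function:

```typescript
// Hypothetical stand-in for the helper in the diff: same shape, illustrative name.
function toPositionalBinds(
  sqlText: string,
  params?: Record<string, unknown>,
): { sql: string; params?: unknown[] } {
  return { sql: sqlText, params: params ? Object.values(params) : undefined };
}

// String keys keep insertion order, so binds line up with the `?` placeholders.
const prepared = toPositionalBinds(
  'SELECT * FROM orders WHERE region = ? AND status = ?',
  { region: 'EU', status: 'open' },
);

// Integer-like keys are enumerated first, in ascending order, regardless of insertion order.
const reordered = Object.values({ b: 2, a: 1, 1: 'x' });

console.log(prepared.params, reordered);
```

Callers therefore need to build the params record in placeholder order and avoid integer-like keys, otherwise values may bind to the wrong positions.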


@@ -1,9 +1,18 @@
+import type { KtxDialect } from '../../context/connections/dialects.js';
+import {
+  columnDisplayPartCount,
+  formatDialectDisplayRef,
+  formatDialectTableName,
+  limitOffsetClause,
+  parseDialectDisplayRef,
+} from '../../context/connections/dialect-helpers.js';
 import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';

 type SnowflakeTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;

-export class KtxSnowflakeDialect {
-  readonly type = 'snowflake';
+/** @internal */
+export class KtxSnowflakeDialect implements KtxDialect {
+  readonly type = 'snowflake' as const;

   private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
     TIMESTAMP_NTZ: 'time',
@@ -45,13 +54,19 @@ export class KtxSnowflakeDialect {
   }

   formatTableName(table: SnowflakeTableNameRef): string {
-    if (table.catalog && table.db) {
-      return `${this.quoteIdentifier(table.catalog)}.${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`;
-    }
-    if (table.db) {
-      return `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`;
-    }
-    return this.quoteIdentifier(table.name);
+    return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'three-part');
+  }
+
+  formatDisplayRef(table: SnowflakeTableNameRef): string {
+    return formatDialectDisplayRef(table, 'three-part');
+  }
+
+  parseDisplayRef(display: string): KtxTableRef | null {
+    return parseDialectDisplayRef(display, 'three-part');
+  }
+
+  columnDisplayTablePartCount(): 1 | 2 | 3 {
+    return columnDisplayPartCount('three-part');
   }

   mapDataType(nativeType: string): string {
@@ -96,10 +111,6 @@ export class KtxSnowflakeDialect {
     return `SELECT ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND TRIM(CAST(${quotedColumn} AS STRING)) != '' LIMIT ${limit}`;
   }

-  prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: unknown[] } {
-    return { sql, params: params ? Object.values(params) : undefined };
-  }
-
   getRandomSampleFilter(samplePct: number): string {
     if (samplePct <= 0 || samplePct >= 1) {
       return '';
@@ -115,7 +126,11 @@ export class KtxSnowflakeDialect {
   }

   getLimitOffsetClause(limit: number, offset?: number): string {
-    return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
+    return limitOffsetClause(limit, offset);
+  }
+
+  getTopClause(_limit: number): string {
+    return '';
   }

   getNullCountExpression(column: string): string {
@@ -126,6 +141,18 @@ export class KtxSnowflakeDialect {
     return `APPROX_COUNT_DISTINCT(${column})`;
   }

+  textLengthExpression(columnSql: string): string {
+    return `LENGTH(CAST(${columnSql} AS TEXT))`;
+  }
+
+  castToText(columnSql: string): string {
+    return `CAST(${columnSql} AS VARCHAR)`;
+  }
+
+  getSampleValueAggregation(innerSql: string): string {
+    return `(SELECT LISTAGG(CAST(value AS VARCHAR), '\\x1f') FROM (${innerSql}) AS relationship_profile_values)`;
+  }
+
   generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
     return `
       WITH sampled AS (
@@ -164,24 +191,4 @@ export class KtxSnowflakeDialect {
       FROM sampled
     `;
   }
-
-  getTimeTruncExpression(
-    column: string,
-    granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
-    timezone?: string,
-  ): string {
-    const target = timezone ? `CONVERT_TIMEZONE('UTC', '${timezone}', ${column})` : column;
-    return `DATE_TRUNC('${granularity}', ${target})`;
-  }
-
-  getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
-    const target = timezone ? `CONVERT_TIMEZONE('UTC', '${timezone}', ${column})` : column;
-    const [amount, unit] = interval.split(' ');
-    const originExpr = origin ? `'${origin}'::TIMESTAMP` : `'1970-01-01'::TIMESTAMP`;
-    return `DATEADD(${unit}, FLOOR(DATEDIFF(${unit}, ${originExpr}, ${target}) / ${amount}) * ${amount}, ${originExpr})`;
-  }
-
-  parseIntervalToSql(interval: string): string {
-    return `INTERVAL '${interval}'`;
-  }
 }


@@ -3,11 +3,11 @@ import { existsSync, readFileSync, statSync } from 'node:fs';
 import { homedir } from 'node:os';
 import { isAbsolute, resolve } from 'node:path';
 import { fileURLToPath } from 'node:url';
+import { getDialectForDriver } from '../../context/connections/dialects.js';
 import { assertReadOnlySql, limitSqlForExecution } from '../../context/connections/read-only-sql.js';
 import { normalizeQueryRows } from '../../context/connections/query-executor.js';
-import { createKtxConnectorCapabilities, type KtxColumnSampleInput, type KtxColumnSampleResult, type KtxColumnStatsInput, type KtxColumnStatsResult, type KtxQueryResult, type KtxReadOnlyQueryInput, type KtxScanConnector, type KtxScanContext, type KtxScanInput, type KtxSchemaForeignKey, type KtxSchemaSnapshot, type KtxSchemaTable, type KtxTableRef, type KtxTableSampleInput, type KtxTableSampleResult } from '../../context/scan/types.js';
+import { createKtxConnectorCapabilities, type KtxColumnSampleInput, type KtxColumnSampleResult, type KtxColumnStatsInput, type KtxColumnStatsResult, type KtxQueryResult, type KtxReadOnlyQueryInput, type KtxScanConnector, type KtxScanContext, type KtxScanInput, type KtxSchemaForeignKey, type KtxSchemaSnapshot, type KtxSchemaTable, type KtxTableListEntry, type KtxTableRef, type KtxTableSampleInput, type KtxTableSampleResult } from '../../context/scan/types.js';
 import { scopedTableNames } from '../../context/scan/table-ref.js';
-import { KtxSqliteDialect } from './dialect.js';

 export interface KtxSqliteConnectionConfig {
   driver?: string;
@@ -157,7 +157,7 @@ export class KtxSqliteScanConnector implements KtxScanConnector {
   private readonly connectionId: string;
   private readonly dbPath: string;
   private readonly now: () => Date;
-  private readonly dialect = new KtxSqliteDialect();
+  private readonly dialect = getDialectForDriver('sqlite');
   private db: Database.Database | null = null;

   constructor(options: KtxSqliteScanConnectorOptions) {
@@ -209,6 +209,31 @@ export class KtxSqliteScanConnector implements KtxScanConnector {
     };
   }

+  async listSchemas(): Promise<string[]> {
+    return [];
+  }
+
+  async listTables(_schemas?: string[]): Promise<KtxTableListEntry[]> {
+    const rows = this.database()
+      .prepare(
+        `
+          SELECT name, type
+          FROM sqlite_master
+          WHERE type IN ('table', 'view')
+            AND name NOT LIKE 'sqlite_%'
+          ORDER BY name
+        `,
+      )
+      .all() as SqliteMasterRow[];
+
+    return rows.map((row) => ({
+      catalog: null,
+      schema: '',
+      name: row.name,
+      kind: row.type === 'view' ? ('view' as const) : ('table' as const),
+    }));
+  }
+
   async sampleTable(input: KtxTableSampleInput, _ctx: KtxScanContext): Promise<KtxTableSampleResult> {
     this.assertConnection(input.connectionId);
     const result = this.query(this.dialect.generateSampleQuery(this.qTableName(input.table), input.limit, input.columns));


@@ -1,9 +1,18 @@
+import type { KtxDialect } from '../../context/connections/dialects.js';
+import {
+  columnDisplayPartCount,
+  formatDialectDisplayRef,
+  formatDialectTableName,
+  limitOffsetClause,
+  parseDialectDisplayRef,
+} from '../../context/connections/dialect-helpers.js';
 import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';

 type SqliteTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;

-export class KtxSqliteDialect {
-  readonly type = 'sqlite';
+/** @internal */
+export class KtxSqliteDialect implements KtxDialect {
+  readonly type = 'sqlite' as const;

   private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
     DATETIME: 'time',
@@ -29,7 +38,19 @@ export class KtxSqliteDialect {
   }

   formatTableName(table: SqliteTableNameRef): string {
-    return this.quoteIdentifier(table.name);
+    return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'sqlite');
+  }
+
+  formatDisplayRef(table: SqliteTableNameRef): string {
+    return formatDialectDisplayRef(table, 'sqlite');
+  }
+
+  parseDisplayRef(display: string): KtxTableRef | null {
+    return parseDialectDisplayRef(display, 'sqlite');
+  }
+
+  columnDisplayTablePartCount(): 1 | 2 | 3 {
+    return columnDisplayPartCount('sqlite');
   }

   mapDataType(nativeType: string): string {
@@ -76,10 +97,6 @@ export class KtxSqliteDialect {
     return `SELECT ${quoted} FROM ${tableName} WHERE ${quoted} IS NOT NULL AND TRIM(CAST(${quoted} AS TEXT)) != '' LIMIT ${limit}`;
   }

-  prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: unknown } {
-    return params ? { sql, params } : { sql };
-  }
-
   getRandomSampleFilter(samplePct: number): string {
     if (samplePct <= 0 || samplePct >= 1) {
       return '';
@@ -92,7 +109,11 @@ export class KtxSqliteDialect {
   }

   getLimitOffsetClause(limit: number, offset?: number): string {
-    return offset !== undefined && offset > 0 ? `LIMIT ${limit} OFFSET ${offset}` : `LIMIT ${limit}`;
+    return limitOffsetClause(limit, offset);
+  }
+
+  getTopClause(_limit: number): string {
+    return '';
   }

   getNullCountExpression(column: string): string {
@@ -103,6 +124,18 @@ export class KtxSqliteDialect {
     return `COUNT(DISTINCT ${column})`;
   }

+  textLengthExpression(columnSql: string): string {
+    return `LENGTH(CAST(${columnSql} AS TEXT))`;
+  }
+
+  castToText(columnSql: string): string {
+    return `CAST(${columnSql} AS TEXT)`;
+  }
+
+  getSampleValueAggregation(innerSql: string): string {
+    return `(SELECT GROUP_CONCAT(CAST(value AS TEXT), char(31)) FROM (${innerSql}) AS relationship_profile_values)`;
+  }
+
   generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
     return `
       WITH sampled AS (
@@ -143,35 +176,4 @@ export class KtxSqliteDialect {
       FROM sampled
     `;
   }
-
-  getTimeTruncExpression(
-    column: string,
-    granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
-    _timezone?: string,
-  ): string {
-    switch (granularity) {
-      case 'day':
-        return `DATE(${column})`;
-      case 'week':
-        return `DATE(${column}, 'weekday 0', '-6 days')`;
-      case 'month':
-        return `DATE(${column}, 'start of month')`;
-      case 'quarter':
-        return `DATE(${column}, 'start of month', '-' || ((CAST(STRFTIME('%m', ${column}) AS INTEGER) - 1) % 3) || ' months')`;
-      case 'year':
-        return `DATE(${column}, 'start of year')`;
-    }
-  }
-
-  getCustomTimeTruncExpression(column: string, interval: string, origin?: string, _timezone?: string): string {
-    const [amount, unit] = interval.split(' ');
-    const originExpr = origin ? `julianday('${origin}')` : `julianday('1970-01-01')`;
-    const unitDays = unit === 'day' ? 1 : unit === 'week' ? 7 : 30;
-    const intervalDays = Number(amount) * unitDays;
-    return `DATE(julianday('1970-01-01') + (CAST((julianday(${column}) - ${originExpr}) / ${intervalDays} AS INTEGER) * ${intervalDays}))`;
-  }
-
-  parseIntervalToSql(interval: string): string {
-    return `'${interval}'`;
-  }
 }


@@ -1,4 +1,5 @@
 import { assertReadOnlySql } from '../../context/connections/read-only-sql.js';
+import { getDialectForDriver } from '../../context/connections/dialects.js';
 import { tryConstraintQuery } from '../../context/scan/constraint-discovery.js';
 import { scopedTableNames } from '../../context/scan/table-ref.js';
 import {
@@ -26,7 +27,6 @@ import { readFileSync } from 'node:fs';
 import { homedir } from 'node:os';
 import { resolve } from 'node:path';
 import sql from 'mssql';
-import { KtxSqlServerDialect } from './dialect.js';

 export interface KtxSqlServerConnectionConfig {
   driver?: string;
@@ -158,6 +158,21 @@ function tableScopeSql(
   return { clause: `AND ${columnExpression} IN (${placeholders.join(', ')})`, params };
 }

+/** @internal */
+export function prepareSqlServerReadOnlyQuery(
+  sql: string,
+  params?: Record<string, unknown>,
+): { sql: string; params?: Record<string, unknown> } {
+  if (!params) {
+    return { sql, params: undefined };
+  }
+  let parameterizedQuery = sql;
+  for (const key of Object.keys(params)) {
+    parameterizedQuery = parameterizedQuery.replace(new RegExp(`:${key}\\b`, 'g'), `@${key}`);
+  }
+  return { sql: parameterizedQuery, params };
+}
+
 class DefaultSqlServerPoolFactory implements KtxSqlServerPoolFactory {
   async createPool(config: KtxSqlServerPoolConfig): Promise<KtxSqlServerPool> {
     const pool = await new sql.ConnectionPool(config as sql.config).connect();
@@ -349,7 +364,7 @@ export class KtxSqlServerScanConnector implements KtxScanConnector {
   private readonly poolFactory: KtxSqlServerPoolFactory;
   private readonly endpointResolver?: KtxSqlServerEndpointResolver;
   private readonly now: () => Date;
-  private readonly dialect = new KtxSqlServerDialect();
+  private readonly dialect = getDialectForDriver('sqlserver');
   private pool: KtxSqlServerPool | null = null;
   private resolvedEndpoint: KtxSqlServerResolvedEndpoint | null = null;
@@ -427,7 +442,7 @@ export class KtxSqlServerScanConnector implements KtxScanConnector {
   async executeReadOnly(input: KtxSqlServerReadOnlyQueryInput, _ctx: KtxScanContext): Promise<KtxQueryResult> {
     this.assertConnection(input.connectionId);
     const limitedSql = limitSqlForSqlServerExecution(input.sql, input.maxRows);
-    const prepared = this.dialect.prepareQuery(limitedSql, input.params);
+    const prepared = prepareSqlServerReadOnlyQuery(limitedSql, input.params);
     const result = await this.query(prepared.sql, prepared.params);
     return { ...result, rowCount: result.rows.length };
   }
@@ -517,6 +532,7 @@ export class KtxSqlServerScanConnector implements KtxScanConnector {
       params,
     );
     return rows.map((row) => ({
+      catalog: this.poolConfig.database,
       schema: row.schema_name,
       name: row.table_name,
       kind: row.table_type === 'VIEW' ? ('view' as const) : ('table' as const),
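
The SQL Server helper above rewrites `:name` placeholders to `@name`. A minimal sketch of that rewrite, using a hypothetical `rewriteNamedParams` with the same body, shows why the `\b` word boundary matters when one key is a prefix of another:

```typescript
// Hypothetical re-statement of the rewrite in the diff; the name is illustrative.
function rewriteNamedParams(
  sqlText: string,
  params?: Record<string, unknown>,
): { sql: string; params?: Record<string, unknown> } {
  if (!params) {
    return { sql: sqlText, params: undefined };
  }
  let rewritten = sqlText;
  for (const key of Object.keys(params)) {
    // `\b` keeps `:id` from matching the leading part of `:id_2`,
    // because `_` is a word character and so no boundary follows `id` there.
    rewritten = rewritten.replace(new RegExp(`:${key}\\b`, 'g'), `@${key}`);
  }
  return { sql: rewritten, params };
}

const prepared = rewriteNamedParams(
  'SELECT * FROM orders WHERE id = :id AND parent_id = :id_2',
  { id: 1, id_2: 2 },
);

console.log(prepared.sql);
```

Keys are interpolated into the regular expression unescaped, so this approach assumes plain identifier-style keys. It also works on raw text, so a `:name` sequence inside a SQL string literal would be rewritten as well.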


@@ -1,9 +1,18 @@
+import type { KtxDialect } from '../../context/connections/dialects.js';
+import {
+  columnDisplayPartCount,
+  formatDialectDisplayRef,
+  formatDialectTableName,
+  parseDialectDisplayRef,
+  safeSqlLimit,
+} from '../../context/connections/dialect-helpers.js';
 import type { KtxSchemaDimensionType, KtxTableRef } from '../../context/scan/types.js';

 type SqlServerTableNameRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;

-export class KtxSqlServerDialect {
-  readonly type = 'sqlserver';
+/** @internal */
+export class KtxSqlServerDialect implements KtxDialect {
+  readonly type = 'sqlserver' as const;

   private readonly typeMappings: Record<string, KtxSchemaDimensionType> = {
     datetime: 'time',
@@ -39,9 +48,19 @@ export class KtxSqlServerDialect {
   }

   formatTableName(table: SqlServerTableNameRef): string {
-    return table.db
-      ? `${this.quoteIdentifier(table.db)}.${this.quoteIdentifier(table.name)}`
-      : this.quoteIdentifier(table.name);
+    return formatDialectTableName(table, this.quoteIdentifier.bind(this), 'three-part');
+  }
+
+  formatDisplayRef(table: SqlServerTableNameRef): string {
+    return formatDialectDisplayRef(table, 'three-part');
+  }
+
+  parseDisplayRef(display: string): KtxTableRef | null {
+    return parseDialectDisplayRef(display, 'three-part');
+  }
+
+  columnDisplayTablePartCount(): 1 | 2 | 3 {
+    return columnDisplayPartCount('three-part');
   }

   mapDataType(nativeType: string): string {
@@ -86,17 +105,6 @@ export class KtxSqlServerDialect {
     return `SELECT TOP ${limit} ${quotedColumn} FROM ${tableName} WHERE ${quotedColumn} IS NOT NULL AND LTRIM(RTRIM(CAST(${quotedColumn} AS NVARCHAR(MAX)))) != ''`;
   }

-  prepareQuery(sql: string, params?: Record<string, unknown>): { sql: string; params?: Record<string, unknown> } {
-    if (!params) {
-      return { sql, params: undefined };
-    }
-    let parameterizedQuery = sql;
-    for (const key of Object.keys(params)) {
-      parameterizedQuery = parameterizedQuery.replace(new RegExp(`:${key}\\b`, 'g'), `@${key}`);
-    }
-    return { sql: parameterizedQuery, params };
-  }
-
   getRandomSampleFilter(samplePct: number): string {
     if (samplePct <= 0 || samplePct >= 1) {
       return '';
@@ -111,12 +119,12 @@ export class KtxSqlServerDialect {
     return `TABLESAMPLE (${samplePct * 100} PERCENT)`;
   }

-  getLimitOffsetClause(limit: number, offset?: number): string {
-    return offset !== undefined && offset > 0 ? `OFFSET ${offset} ROWS FETCH NEXT ${limit} ROWS ONLY` : '';
+  getLimitOffsetClause(_limit: number, _offset?: number): string {
+    return '';
   }

   getTopClause(limit: number): string {
-    return `TOP ${limit}`;
+    return `TOP (${safeSqlLimit(limit)})`;
   }

   getNullCountExpression(column: string): string {
@@ -127,6 +135,18 @@ export class KtxSqlServerDialect {
     return `COUNT(DISTINCT ${column})`;
   }

+  textLengthExpression(columnSql: string): string {
+    return `LEN(CAST(${columnSql} AS NVARCHAR(MAX)))`;
+  }
+
+  castToText(columnSql: string): string {
+    return `CAST(${columnSql} AS NVARCHAR(MAX))`;
+  }
+
+  getSampleValueAggregation(innerSql: string): string {
+    return `(SELECT STRING_AGG(CAST(value AS NVARCHAR(MAX)), CHAR(31)) FROM (${innerSql}) AS relationship_profile_values)`;
+  }
+
   generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string {
     return `
       WITH sampled AS (
@@ -167,35 +187,4 @@ export class KtxSqlServerDialect {
       FROM sampled
     `;
   }
-
-  getTimeTruncExpression(
-    column: string,
-    granularity: 'day' | 'week' | 'month' | 'quarter' | 'year',
-    timezone?: string,
-  ): string {
-    const col = timezone ? `${column} AT TIME ZONE 'UTC' AT TIME ZONE '${timezone}'` : column;
-    switch (granularity) {
-      case 'day':
-        return `CAST(${col} AS DATE)`;
-      case 'week':
-        return `DATEADD(WEEK, DATEDIFF(WEEK, 0, ${col}), 0)`;
-      case 'month':
-        return `DATEFROMPARTS(YEAR(${col}), MONTH(${col}), 1)`;
-      case 'quarter':
-        return `DATEFROMPARTS(YEAR(${col}), (DATEPART(QUARTER, ${col}) - 1) * 3 + 1, 1)`;
-      case 'year':
-        return `DATEFROMPARTS(YEAR(${col}), 1, 1)`;
-    }
-  }
-
-  getCustomTimeTruncExpression(column: string, interval: string, origin?: string, timezone?: string): string {
-    const col = timezone ? `${column} AT TIME ZONE 'UTC' AT TIME ZONE '${timezone}'` : column;
-    const [amount, unit] = interval.split(' ');
-    const originExpr = origin ? `'${origin}'` : `'1970-01-01'`;
-    return `DATEADD(${unit}, (DATEDIFF(${unit}, ${originExpr}, ${col}) / ${amount}) * ${amount}, ${originExpr})`;
-  }
-
-  parseIntervalToSql(interval: string): string {
-    return `'${interval}'`;
-  }
 }


@@ -0,0 +1,87 @@
+import type { KtxTableRef } from '../scan/types.js';
+
+export type KtxDialectIdentifierShape = 'ansi' | 'sqlite' | 'three-part';
+
+export type KtxDialectTableRef = Pick<KtxTableRef, 'name'> & Partial<Pick<KtxTableRef, 'catalog' | 'db'>>;
+
+export function safeSqlLimit(limit: number): number {
+  return Math.max(1, Math.floor(limit));
+}
+
+function safeSqlOffset(offset: number | undefined): number | null {
+  if (offset === undefined) {
+    return null;
+  }
+  const normalized = Math.floor(offset);
+  return normalized > 0 ? normalized : null;
+}
+
+function cleanIdentifierPart(part: string): string {
+  return part.trim().replace(/^["'`\[]|["'`\]]$/g, '');
+}
+
+function splitDisplay(display: string): string[] {
+  return display.trim().split('.').map(cleanIdentifierPart).filter(Boolean);
+}
+
+function tableParts(table: KtxDialectTableRef, shape: KtxDialectIdentifierShape): string[] {
+  if (shape === 'sqlite') {
+    return [table.name];
+  }
+  return [table.catalog ?? null, table.db ?? null, table.name].filter((part): part is string => Boolean(part));
+}
+
+function acceptedDisplayPartCounts(shape: KtxDialectIdentifierShape): readonly number[] {
+  if (shape === 'sqlite') {
+    return [1];
+  }
+  if (shape === 'three-part') {
+    return [3];
+  }
+  return [2, 3];
+}
+
+export function formatDialectTableName(
+  table: KtxDialectTableRef,
+  quoteIdentifier: (identifier: string) => string,
+  shape: KtxDialectIdentifierShape,
+): string {
+  return tableParts(table, shape).map(quoteIdentifier).join('.');
+}
+
+export function formatDialectDisplayRef(table: KtxDialectTableRef, shape: KtxDialectIdentifierShape): string {
+  return tableParts(table, shape).join('.');
+}
+
+export function parseDialectDisplayRef(display: string, shape: KtxDialectIdentifierShape): KtxTableRef | null {
+  const parts = splitDisplay(display);
+  if (!acceptedDisplayPartCounts(shape).includes(parts.length)) {
+    return null;
+  }
+  if (parts.length === 1) {
+    return { catalog: null, db: null, name: parts[0]! };
+  }
+  if (parts.length === 2) {
+    return { catalog: null, db: parts[0]!, name: parts[1]! };
+  }
+  if (parts.length === 3) {
+    return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
+  }
+  return null;
+}
+
+export function columnDisplayPartCount(shape: KtxDialectIdentifierShape): 1 | 2 | 3 {
+  if (shape === 'sqlite') {
+    return 1;
+  }
+  if (shape === 'three-part') {
+    return 3;
+  }
+  return 2;
+}
+
+export function limitOffsetClause(limit: number, offset?: number): string {
+  const safeLimit = safeSqlLimit(limit);
+  const safeOffset = safeSqlOffset(offset);
+  return safeOffset === null ? `LIMIT ${safeLimit}` : `LIMIT ${safeLimit} OFFSET ${safeOffset}`;
+}
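
The shape-driven helpers above decide how many dot-separated parts a display reference may have. A minimal, self-contained sketch of that behaviour follows. The types and function names are simplified stand-ins for the module in the diff, not its actual exports:

```typescript
// Simplified stand-ins for the helper module; names here are illustrative.
type Shape = 'ansi' | 'sqlite' | 'three-part';

interface TableRef {
  catalog: string | null;
  db: string | null;
  name: string;
}

// Strip one leading and one trailing quote or bracket character from a part.
function cleanPart(part: string): string {
  return part.trim().replace(/^["'`\[]|["'`\]]$/g, '');
}

// Number of dot-separated parts each shape accepts.
function acceptedCounts(shape: Shape): readonly number[] {
  if (shape === 'sqlite') return [1];
  if (shape === 'three-part') return [3];
  return [2, 3];
}

function parseDisplayRef(display: string, shape: Shape): TableRef | null {
  const parts = display.trim().split('.').map(cleanPart).filter(Boolean);
  if (!acceptedCounts(shape).includes(parts.length)) return null;
  const name = parts[parts.length - 1]!;
  const db = parts.length >= 2 ? parts[parts.length - 2]! : null;
  const catalog = parts.length === 3 ? parts[0]! : null;
  return { catalog, db, name };
}

// Floors the limit, clamps it to at least 1, and drops non-positive offsets.
function limitOffsetClause(limit: number, offset?: number): string {
  const safeLimit = Math.max(1, Math.floor(limit));
  const safeOffset = offset === undefined ? 0 : Math.floor(offset);
  return safeOffset > 0 ? `LIMIT ${safeLimit} OFFSET ${safeOffset}` : `LIMIT ${safeLimit}`;
}

const threePart = parseDisplayRef('"analytics"."public"."orders"', 'three-part');
const tooShort = parseDisplayRef('public.orders', 'three-part');
const ansiTwoPart = parseDisplayRef('public.orders', 'ansi');
const clause = limitOffsetClause(10.9, 0);

console.log(threePart, tooShort, ansiTwoPart, clause);
```

Because the display string is split on every `.`, a quoted identifier that itself contains a dot would be broken into extra parts and rejected or misread.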


@@ -1,34 +0,0 @@
-import { describe, expect, it } from 'vitest';
-import { getDialectForDriver } from './dialects.js';
-
-describe('getDialectForDriver', () => {
-  it.each([
-    ['postgres', '"public"."orders"'],
-    ['mysql', '`public`.`orders`'],
-    ['clickhouse', '`public`.`orders`'],
-    ['sqlite', '"orders"'],
-    ['snowflake', '"analytics"."public"."orders"'],
-    ['bigquery', '`analytics`.`public`.`orders`'],
-    ['sqlserver', '[analytics].[public].[orders]'],
-  ] as const)('formats table names for %s', (driver, expected) => {
-    const dialect = getDialectForDriver(driver);
-    expect(
-      dialect.formatTableName({
-        catalog: driver === 'snowflake' || driver === 'bigquery' || driver === 'sqlserver' ? 'analytics' : null,
-        db: driver === 'sqlite' ? null : 'public',
-        name: 'orders',
-      }),
-    ).toBe(expected);
-  });
-
-  it('throws with a supported-driver list for unknown drivers', () => {
-    expect(() => getDialectForDriver('oracle')).toThrow(
-      'Unsupported warehouse driver "oracle". Supported drivers: bigquery, clickhouse, mysql, postgres, sqlite, snowflake, sqlserver',
-    );
-  });
-
-  it('rejects legacy driver aliases', () => {
-    expect(() => getDialectForDriver('postgresql')).toThrow('Unsupported warehouse driver "postgresql"');
-    expect(() => getDialectForDriver('sqlite3')).toThrow('Unsupported warehouse driver "sqlite3"');
-  });
-});


@ -1,22 +1,40 @@
import type { KtxSchemaDimensionType, KtxTableRef } from '../scan/types.js'; import { KtxBigQueryDialect } from '../../connectors/bigquery/dialect.js';
import { KtxClickHouseDialect } from '../../connectors/clickhouse/dialect.js';
type SupportedDriver = import { KtxMysqlDialect } from '../../connectors/mysql/dialect.js';
| 'postgres' import { KtxPostgresDialect } from '../../connectors/postgres/dialect.js';
| 'mysql' import { KtxSqliteDialect } from '../../connectors/sqlite/dialect.js';
| 'sqlserver' import { KtxSnowflakeDialect } from '../../connectors/snowflake/dialect.js';
| 'snowflake' import { KtxSqlServerDialect } from '../../connectors/sqlserver/dialect.js';
| 'bigquery' import type { KtxConnectionDriver, KtxSchemaDimensionType, KtxTableRef } from '../scan/types.js';
| 'clickhouse' import type { KtxDialectTableRef } from './dialect-helpers.js';
| 'sqlite';
export interface KtxDialect { export interface KtxDialect {
readonly type: SupportedDriver; readonly type: KtxConnectionDriver;
quoteIdentifier(identifier: string): string; quoteIdentifier(identifier: string): string;
formatTableName(table: KtxTableRef): string; formatTableName(table: KtxDialectTableRef): string;
formatDisplayRef(table: KtxDialectTableRef): string;
parseDisplayRef(display: string): KtxTableRef | null;
columnDisplayTablePartCount(): 1 | 2 | 3;
getLimitOffsetClause(limit: number, offset?: number): string;
getTopClause(limit: number): string;
getRandomSampleFilter(samplePct: number): string;
getTableSampleClause(samplePct: number): string;
generateSampleQuery(tableName: string, limit: number, columns?: string[]): string;
generateColumnSampleQuery(tableName: string, columnName: string, limit: number): string;
getSampleValueAggregation(innerSql: string): string;
generateCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string;
generateRandomizedCardinalitySampleQuery(tableName: string, columnName: string, sampleSize: number): string;
generateDistinctValuesQuery(tableName: string, columnName: string, limit: number): string;
generateColumnStatisticsQuery(schemaName: string, tableName: string): string | null;
getNullCountExpression(column: string): string;
getDistinctCountExpression(column: string): string;
textLengthExpression(columnSql: string): string;
castToText(columnSql: string): string;
mapToDimensionType(nativeType: string): KtxSchemaDimensionType; mapToDimensionType(nativeType: string): KtxSchemaDimensionType;
mapDataType(nativeType: string): string;
} }
const supportedDrivers: SupportedDriver[] = [ const supportedDrivers: KtxConnectionDriver[] = [
'bigquery', 'bigquery',
'clickhouse', 'clickhouse',
'mysql', 'mysql',
@ -26,71 +44,21 @@ const supportedDrivers: SupportedDriver[] = [
'sqlserver', 'sqlserver',
]; ];
function doubleQuoted(identifier: string): string { const dialectFactories: Record<KtxConnectionDriver, () => KtxDialect> = {
return `"${identifier.replace(/"/g, '""')}"`; bigquery: () => new KtxBigQueryDialect(),
} clickhouse: () => new KtxClickHouseDialect(),
mysql: () => new KtxMysqlDialect(),
function backtickQuoted(identifier: string): string { postgres: () => new KtxPostgresDialect(),
return `\`${identifier.replace(/`/g, '``')}\``; sqlite: () => new KtxSqliteDialect(),
} snowflake: () => new KtxSnowflakeDialect(),
sqlserver: () => new KtxSqlServerDialect(),
function bigQueryQuoted(identifier: string): string {
return `\`${identifier.replace(/`/g, '\\`')}\``;
}
function bracketQuoted(identifier: string): string {
return `[${identifier.replace(/\]/g, ']]')}]`;
}
function inferDimensionType(nativeType: string): KtxSchemaDimensionType {
const normalized = nativeType.toLowerCase().trim();
if (normalized.includes('date') || normalized.includes('time')) {
return 'time';
}
if (
normalized.includes('int') ||
normalized.includes('num') ||
normalized.includes('dec') ||
normalized.includes('float') ||
normalized.includes('double') ||
normalized.includes('real')
) {
return 'number';
}
if (normalized.includes('bool') || normalized === 'bit') {
return 'boolean';
}
return 'string';
}
function formatWithParts(table: KtxTableRef, quote: (identifier: string) => string, sqlite = false): string {
const parts = sqlite ? [table.name] : [table.catalog, table.db, table.name].filter((part): part is string => !!part);
return parts.map(quote).join('.');
}
function createDialect(type: SupportedDriver, quote: (identifier: string) => string, sqlite = false): KtxDialect {
return {
type,
quoteIdentifier: quote,
formatTableName: (table) => formatWithParts(table, quote, sqlite),
mapToDimensionType: inferDimensionType,
};
}
const dialects: Record<SupportedDriver, KtxDialect> = {
postgres: createDialect('postgres', doubleQuoted),
mysql: createDialect('mysql', backtickQuoted),
clickhouse: createDialect('clickhouse', backtickQuoted),
sqlite: createDialect('sqlite', doubleQuoted, true),
snowflake: createDialect('snowflake', doubleQuoted),
bigquery: createDialect('bigquery', bigQueryQuoted),
sqlserver: createDialect('sqlserver', bracketQuoted),
}; };
export function getDialectForDriver(driver: string): KtxDialect { export function getDialectForDriver(driver: string): KtxDialect {
const normalized = driver.toLowerCase().trim(); const normalized = driver.toLowerCase().trim();
if (normalized in dialects) { const factory = dialectFactories[normalized as KtxConnectionDriver];
return dialects[normalized as SupportedDriver]; if (factory) {
return factory();
} }
throw new Error(`Unsupported warehouse driver "${driver}". Supported drivers: ${supportedDrivers.join(', ')}`); throw new Error(`Unsupported warehouse driver "${driver}". Supported drivers: ${supportedDrivers.join(', ')}`);
} }
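
The factory lookup above can be sketched in isolation. This is a minimal illustration, not the PR's code: `Driver`, `Dialect`, `factories`, `isDriver`, and `dialectFor` are hypothetical names, and only two drivers are modeled. One deliberate difference, which is my assumption and not what the diff does: the sketch guards the lookup with an own-property check (the same idea as `isRegisteredDriver` in the driver registry), because a plain index on an object literal also resolves inherited members such as `constructor`.

```typescript
type Driver = 'postgres' | 'sqlite';

interface Dialect {
  readonly type: Driver;
  quoteIdentifier(identifier: string): string;
}

// Hypothetical helper: ANSI-style quoting, doubling embedded double quotes.
function doubleQuoted(identifier: string): string {
  return `"${identifier.replace(/"/g, '""')}"`;
}

// One factory per driver, so every lookup builds a fresh dialect object.
const factories: Record<Driver, () => Dialect> = {
  postgres: () => ({ type: 'postgres', quoteIdentifier: doubleQuoted }),
  sqlite: () => ({ type: 'sqlite', quoteIdentifier: doubleQuoted }),
};

// Own-property guard: rejects inherited keys such as "constructor".
function isDriver(value: string): value is Driver {
  return Object.prototype.hasOwnProperty.call(factories, value);
}

function dialectFor(driver: string): Dialect {
  const normalized = driver.toLowerCase().trim();
  if (!isDriver(normalized)) {
    const supported = Object.keys(factories).sort().join(', ');
    throw new Error(`Unsupported warehouse driver "${driver}". Supported drivers: ${supported}`);
  }
  return factories[normalized]();
}
```

Returning a new object per call means callers cannot share mutable dialect state by accident, at the cost of one small allocation per lookup.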


@ -0,0 +1,199 @@
import type { KtxConnectionDriver, KtxScanConnector } from '../scan/types.js';
/** @internal */
export type KtxScopeConfigKey = 'dataset_ids' | 'databases' | 'schemas' | 'schema_names';
/** @internal */
export interface KtxDriverConnectorModule {
isConnectionConfig(connection: unknown): boolean;
createScanConnector(args: {
connectionId: string;
connection: unknown;
projectDir: string;
}): KtxScanConnector;
}
export interface KtxDriverRegistration {
readonly driver: KtxConnectionDriver;
readonly scopeConfigKey: KtxScopeConfigKey | null;
readonly hasHistoricSqlReader: boolean;
readonly hasLocalQueryExecutor: boolean;
load(): Promise<KtxDriverConnectorModule>;
}
function invalidConnectionConfig(driver: KtxConnectionDriver): Error {
return new Error(`Connection config does not match warehouse driver "${driver}".`);
}
/** @internal */
export const driverRegistrations: Record<KtxConnectionDriver, KtxDriverRegistration> = {
bigquery: {
driver: 'bigquery',
scopeConfigKey: 'dataset_ids',
hasHistoricSqlReader: true,
hasLocalQueryExecutor: false,
load: async () => {
const m = await import('../../connectors/bigquery/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxBigQueryConnectionConfig>[0];
return m.isKtxBigQueryConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection }) => {
const typedConnection = connection as Parameters<typeof m.isKtxBigQueryConnectionConfig>[0];
if (!m.isKtxBigQueryConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('bigquery');
}
return new m.KtxBigQueryScanConnector({ connectionId, connection: typedConnection });
},
};
},
},
clickhouse: {
driver: 'clickhouse',
scopeConfigKey: 'databases',
hasHistoricSqlReader: false,
hasLocalQueryExecutor: false,
load: async () => {
const m = await import('../../connectors/clickhouse/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxClickHouseConnectionConfig>[0];
return m.isKtxClickHouseConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection }) => {
const typedConnection = connection as Parameters<typeof m.isKtxClickHouseConnectionConfig>[0];
if (!m.isKtxClickHouseConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('clickhouse');
}
return new m.KtxClickHouseScanConnector({ connectionId, connection: typedConnection });
},
};
},
},
mysql: {
driver: 'mysql',
scopeConfigKey: 'schemas',
hasHistoricSqlReader: false,
hasLocalQueryExecutor: false,
load: async () => {
const m = await import('../../connectors/mysql/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxMysqlConnectionConfig>[0];
return m.isKtxMysqlConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection }) => {
const typedConnection = connection as Parameters<typeof m.isKtxMysqlConnectionConfig>[0];
if (!m.isKtxMysqlConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('mysql');
}
return new m.KtxMysqlScanConnector({ connectionId, connection: typedConnection });
},
};
},
},
postgres: {
driver: 'postgres',
scopeConfigKey: 'schemas',
hasHistoricSqlReader: true,
hasLocalQueryExecutor: true,
load: async () => {
const m = await import('../../connectors/postgres/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxPostgresConnectionConfig>[0];
return m.isKtxPostgresConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection }) => {
const typedConnection = connection as Parameters<typeof m.isKtxPostgresConnectionConfig>[0];
if (!m.isKtxPostgresConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('postgres');
}
return new m.KtxPostgresScanConnector({ connectionId, connection: typedConnection });
},
};
},
},
sqlite: {
driver: 'sqlite',
scopeConfigKey: null,
hasHistoricSqlReader: false,
hasLocalQueryExecutor: true,
load: async () => {
const m = await import('../../connectors/sqlite/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxSqliteConnectionConfig>[0];
return m.isKtxSqliteConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection, projectDir }) => {
const typedConnection = connection as Parameters<typeof m.isKtxSqliteConnectionConfig>[0];
if (!m.isKtxSqliteConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('sqlite');
}
return new m.KtxSqliteScanConnector({ connectionId, connection: typedConnection, projectDir });
},
};
},
},
snowflake: {
driver: 'snowflake',
scopeConfigKey: 'schema_names',
hasHistoricSqlReader: true,
hasLocalQueryExecutor: false,
load: async () => {
const m = await import('../../connectors/snowflake/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxSnowflakeConnectionConfig>[0];
return m.isKtxSnowflakeConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection, projectDir }) => {
const typedConnection = connection as Parameters<typeof m.isKtxSnowflakeConnectionConfig>[0];
if (!m.isKtxSnowflakeConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('snowflake');
}
return new m.KtxSnowflakeScanConnector({ connectionId, connection: typedConnection, projectDir });
},
};
},
},
sqlserver: {
driver: 'sqlserver',
scopeConfigKey: 'schemas',
hasHistoricSqlReader: false,
hasLocalQueryExecutor: false,
load: async () => {
const m = await import('../../connectors/sqlserver/connector.js');
return {
isConnectionConfig: (connection) => {
const typedConnection = connection as Parameters<typeof m.isKtxSqlServerConnectionConfig>[0];
return m.isKtxSqlServerConnectionConfig(typedConnection);
},
createScanConnector: ({ connectionId, connection }) => {
const typedConnection = connection as Parameters<typeof m.isKtxSqlServerConnectionConfig>[0];
if (!m.isKtxSqlServerConnectionConfig(typedConnection)) {
throw invalidConnectionConfig('sqlserver');
}
return new m.KtxSqlServerScanConnector({ connectionId, connection: typedConnection });
},
};
},
},
};
const supportedDrivers = Object.keys(driverRegistrations).sort() as KtxConnectionDriver[];
function isRegisteredDriver(driver: string): driver is KtxConnectionDriver {
return Object.prototype.hasOwnProperty.call(driverRegistrations, driver);
}
export function getDriverRegistration(driver: string): KtxDriverRegistration | undefined {
const normalized = driver.toLowerCase().trim();
return isRegisteredDriver(normalized) ? driverRegistrations[normalized] : undefined;
}
export function listSupportedDrivers(): KtxConnectionDriver[] {
return [...supportedDrivers];
}
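
A reduced sketch of the registration pattern above, under stated assumptions: the names `Registration`, `ConnectorModule`, `hasDriver`, `registration`, and `connectorFor` are hypothetical, the connector is a plain object, and `load()` returns an in-memory module where the real registry does `await import('.../connector.js')`. It shows the shape only: static metadata is available synchronously, while connector code is reached through an async `load()` and validated before use.

```typescript
type Driver = 'postgres' | 'sqlite';

interface ScanConnector {
  driver: Driver;
  connectionId: string;
}

interface ConnectorModule {
  isConnectionConfig(connection: unknown): boolean;
  createScanConnector(args: { connectionId: string; connection: unknown }): ScanConnector;
}

interface Registration {
  readonly driver: Driver;
  readonly scopeConfigKey: 'schemas' | null;
  load(): Promise<ConnectorModule>;
}

// Hypothetical config check: the real guards live in each connector module.
function hasDriver(connection: unknown, driver: Driver): boolean {
  return (
    typeof connection === 'object' &&
    connection !== null &&
    (connection as { driver?: unknown }).driver === driver
  );
}

function registration(driver: Driver, scopeConfigKey: 'schemas' | null): Registration {
  return {
    driver,
    scopeConfigKey,
    // Stand-in for a dynamic import; nothing is built until load() is called.
    load: async () => ({
      isConnectionConfig: (connection: unknown) => hasDriver(connection, driver),
      createScanConnector: ({ connectionId, connection }) => {
        if (!hasDriver(connection, driver)) {
          throw new Error(`Connection config does not match warehouse driver "${driver}".`);
        }
        return { driver, connectionId };
      },
    }),
  };
}

const registrations: Record<Driver, Registration> = {
  postgres: registration('postgres', 'schemas'),
  sqlite: registration('sqlite', null),
};

async function connectorFor(connectionId: string, connection: { driver: string }): Promise<ScanConnector> {
  const key = connection.driver.toLowerCase().trim();
  if (!Object.prototype.hasOwnProperty.call(registrations, key)) {
    throw new Error(`Unsupported warehouse driver "${connection.driver}".`);
  }
  const loaded = await registrations[key as Driver].load();
  return loaded.createScanConnector({ connectionId, connection });
}
```

Keeping the flags and scope key outside `load()` lets callers answer capability questions without pulling in a warehouse client library.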


@ -1,3 +1,4 @@
import { driverRegistrations, getDriverRegistration } from './drivers.js';
import { createPostgresQueryExecutor } from './postgres-query-executor.js'; import { createPostgresQueryExecutor } from './postgres-query-executor.js';
import type { import type {
KtxSqlQueryExecutionInput, KtxSqlQueryExecutionInput,
@ -5,6 +6,7 @@ import type {
KtxSqlQueryExecutorPort, KtxSqlQueryExecutorPort,
} from './query-executor.js'; } from './query-executor.js';
import { createSqliteQueryExecutor } from './sqlite-query-executor.js'; import { createSqliteQueryExecutor } from './sqlite-query-executor.js';
import type { KtxConnectionDriver } from '../scan/types.js';
export interface DefaultLocalQueryExecutorOptions { export interface DefaultLocalQueryExecutorOptions {
postgres?: KtxSqlQueryExecutorPort; postgres?: KtxSqlQueryExecutorPort;
@ -15,20 +17,43 @@ function driverFor(input: KtxSqlQueryExecutionInput): string {
return String(input.connection?.driver ?? '').toLowerCase(); return String(input.connection?.driver ?? '').toLowerCase();
} }
function localExecutorMap(
options: DefaultLocalQueryExecutorOptions,
): Partial<Record<KtxConnectionDriver, KtxSqlQueryExecutorPort>> {
const wiredExecutors: Partial<Record<KtxConnectionDriver, KtxSqlQueryExecutorPort>> = {
postgres: options.postgres ?? createPostgresQueryExecutor(),
sqlite: options.sqlite ?? createSqliteQueryExecutor(),
};
const executors: Partial<Record<KtxConnectionDriver, KtxSqlQueryExecutorPort>> = {};
for (const registration of Object.values(driverRegistrations)) {
if (!registration.hasLocalQueryExecutor) continue;
const executor = wiredExecutors[registration.driver];
if (executor) {
executors[registration.driver] = executor;
}
}
return executors;
}
export function createDefaultLocalQueryExecutor(options: DefaultLocalQueryExecutorOptions = {}): KtxSqlQueryExecutorPort { export function createDefaultLocalQueryExecutor(options: DefaultLocalQueryExecutorOptions = {}): KtxSqlQueryExecutorPort {
const postgres = options.postgres ?? createPostgresQueryExecutor(); const executors = localExecutorMap(options);
const sqlite = options.sqlite ?? createSqliteQueryExecutor();
return { return {
async execute(input: KtxSqlQueryExecutionInput): Promise<KtxSqlQueryExecutionResult> { async execute(input: KtxSqlQueryExecutionInput): Promise<KtxSqlQueryExecutionResult> {
const driver = driverFor(input); const driver = driverFor(input);
if (driver === 'postgres') { const registration = getDriverRegistration(driver);
return postgres.execute(input); if (!registration?.hasLocalQueryExecutor) {
throw new Error(`No local query executor is configured for driver "${input.connection?.driver ?? 'unknown'}".`);
} }
if (driver === 'sqlite') {
return sqlite.execute(input); const executor = executors[registration.driver];
if (!executor) {
throw new Error(
`Local query executor flag is enabled for driver "${registration.driver}", but no executor factory is wired.`,
);
} }
throw new Error(`No local query executor is configured for driver "${input.connection?.driver ?? 'unknown'}".`); return executor.execute(input);
}, },
}; };
} }
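
The two distinct failure modes in the new `execute` path can be shown with a small synchronous resolver. This is a sketch with hypothetical names (`Executor`, `resolveExecutor`, a flat `hasLocalQueryExecutor` table); the flag values for the three modeled drivers are copied from the registry in this PR, everything else is simplified.

```typescript
type Driver = 'bigquery' | 'postgres' | 'sqlite';

interface Executor {
  execute(sql: string): Promise<{ rows: unknown[] }>;
}

// Capability flags as declared in the driver registry for these three drivers.
const hasLocalQueryExecutor: Record<Driver, boolean> = {
  bigquery: false,
  postgres: true,
  sqlite: true,
};

function resolveExecutor(driver: string, wired: Partial<Record<Driver, Executor>>): Executor {
  const key = driver.toLowerCase().trim();
  const known = Object.prototype.hasOwnProperty.call(hasLocalQueryExecutor, key);
  // Failure 1: unknown driver, or a driver that does not claim the capability.
  if (!known || !hasLocalQueryExecutor[key as Driver]) {
    throw new Error(`No local query executor is configured for driver "${driver || 'unknown'}".`);
  }
  // Failure 2: the flag says yes, but nobody supplied an executor. This is a wiring bug.
  const executor = wired[key as Driver];
  if (!executor) {
    throw new Error(`Local query executor flag is enabled for driver "${key}", but no executor factory is wired.`);
  }
  return executor;
}
```

Separating the messages makes it clear whether a user picked an unsupported driver or the codebase declared a capability it does not implement.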


@ -1,5 +1,9 @@
import { getDriverRegistration } from '../../../connections/drivers.js';
import type { KtxConnectionDriver } from '../../../scan/types.js';
import type { HistoricSqlDialect } from './types.js'; import type { HistoricSqlDialect } from './types.js';
const historicSqlDialects: readonly HistoricSqlDialect[] = ['postgres', 'bigquery', 'snowflake'];
function recordOrNull(value: unknown): Record<string, unknown> | null { function recordOrNull(value: unknown): Record<string, unknown> | null {
return value && typeof value === 'object' && !Array.isArray(value) ? (value as Record<string, unknown>) : null; return value && typeof value === 'object' && !Array.isArray(value) ? (value as Record<string, unknown>) : null;
} }
@ -10,6 +14,14 @@ function queryHistoryRecord(connection: unknown): Record<string, unknown> | null
return context ? recordOrNull(context.queryHistory) : null; return context ? recordOrNull(context.queryHistory) : null;
} }
function historicSqlDialectForDriver(driver: KtxConnectionDriver): HistoricSqlDialect {
const dialect = historicSqlDialects.find((candidate) => candidate === driver);
if (!dialect) {
throw new Error(`Driver "${driver}" is marked as historic-SQL capable but has no HistoricSqlDialect mapping.`);
}
return dialect;
}
export function isQueryHistoryEnabled(connection: unknown): boolean { export function isQueryHistoryEnabled(connection: unknown): boolean {
return queryHistoryRecord(connection)?.enabled === true; return queryHistoryRecord(connection)?.enabled === true;
} }
@ -25,8 +37,6 @@ export function queryHistoryDialectForConnection(connection: unknown): HistoricS
} }
const conn = recordOrNull(connection); const conn = recordOrNull(connection);
const driver = String(conn?.driver ?? '').toLowerCase(); const driver = String(conn?.driver ?? '').toLowerCase();
if (driver === 'postgres') return 'postgres'; const registration = getDriverRegistration(driver);
if (driver === 'bigquery') return 'bigquery'; return registration?.hasHistoricSqlReader ? historicSqlDialectForDriver(registration.driver) : null;
if (driver === 'snowflake') return 'snowflake';
return null;
} }
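
`historicSqlDialectForDriver` narrows the wide driver union to the smaller `HistoricSqlDialect` union by searching a readonly list. A minimal sketch of that narrowing, with hypothetical names and a reduced driver set; unlike the PR's helper, it returns `null` instead of throwing, since it does not model the `hasHistoricSqlReader` flag.

```typescript
type Driver = 'bigquery' | 'mysql' | 'postgres' | 'snowflake';

// `as const` keeps the literal types, so the element type is the narrow union.
const historicSqlDialects = ['postgres', 'bigquery', 'snowflake'] as const;
type HistoricSqlDialect = (typeof historicSqlDialects)[number];

function toHistoricSqlDialect(driver: Driver): HistoricSqlDialect | null {
  // find() returns the list element, which is already typed as HistoricSqlDialect,
  // so no cast from Driver is needed.
  return historicSqlDialects.find((candidate) => candidate === driver) ?? null;
}
```

Returning the matched list element rather than the input is what avoids an `as HistoricSqlDialect` assertion.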


@ -27,12 +27,13 @@ export function resolveEnabledTables(
function parseEnabledTableEntry(value: unknown): KtxTableRef | null { function parseEnabledTableEntry(value: unknown): KtxTableRef | null {
if (typeof value === 'string') { if (typeof value === 'string') {
return parseDottedEntry(value); return parseDottedTableEntry(value);
} }
return null; return null;
} }
function parseDottedEntry(value: string): KtxTableRef | null { /** @internal */
export function parseDottedTableEntry(value: string): KtxTableRef | null {
const trimmed = value.trim(); const trimmed = value.trim();
if (trimmed.length === 0) return null; if (trimmed.length === 0) return null;
const parts = trimmed.split('.'); const parts = trimmed.split('.');


@ -1,7 +1,7 @@
import type { KtxLocalProject } from '../../context/project/project.js'; import type { KtxLocalProject } from '../../context/project/project.js';
import { getDialectForDriver, type KtxDialect } from '../connections/dialects.js';
import { readLocalScanStructuralSnapshot } from './local-structural-artifacts.js'; import { readLocalScanStructuralSnapshot } from './local-structural-artifacts.js';
import type { import type {
KtxConnectionDriver,
KtxScanReport, KtxScanReport,
KtxSchemaColumn, KtxSchemaColumn,
KtxSchemaSnapshot, KtxSchemaSnapshot,
@ -88,59 +88,23 @@ function refsEqual(left: KtxTableRef, right: KtxTableRef): boolean {
); );
} }
function cleanIdentifierPart(part: string): string {
return part.trim().replace(/^["'`\[]|["'`\]]$/g, '');
}
function splitDisplay(display: string): string[] {
return display
.trim()
.split('.')
.map(cleanIdentifierPart)
.filter(Boolean);
}
function displayForTable(driver: KtxConnectionDriver, table: KtxTableRef): string {
if (driver === 'sqlite') {
return table.name;
}
return [table.catalog, table.db, table.name].filter((part): part is string => Boolean(part)).join('.');
}
function tableRef(table: KtxSchemaTable): KtxTableRef { function tableRef(table: KtxSchemaTable): KtxTableRef {
return { catalog: table.catalog, db: table.db, name: table.name }; return { catalog: table.catalog, db: table.db, name: table.name };
} }
function candidateList( function candidateList(
driver: KtxConnectionDriver, dialect: KtxDialect,
tables: KtxSchemaTable[], tables: KtxSchemaTable[],
): Array<{ tableRef: KtxTableRef; display: string }> { ): Array<{ tableRef: KtxTableRef; display: string }> {
return tables return tables
.map((table) => ({ .map((table) => ({
tableRef: tableRef(table), tableRef: tableRef(table),
display: displayForTable(driver, table), display: dialect.formatDisplayRef(table),
})) }))
.sort((left, right) => left.display.localeCompare(right.display)); .sort((left, right) => left.display.localeCompare(right.display));
} }
function parseDisplayRef(driver: KtxConnectionDriver, display: string): KtxTableRef | null { function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableInput, dialect: KtxDialect): ResolveResult {
const parts = splitDisplay(display);
if (driver === 'sqlite') {
return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null;
}
if (driver === 'bigquery' || driver === 'snowflake' || driver === 'sqlserver') {
return parts.length === 3 ? { catalog: parts[0]!, db: parts[1]!, name: parts[2]! } : null;
}
if (parts.length === 2) {
return { catalog: null, db: parts[0]!, name: parts[1]! };
}
if (parts.length === 3) {
return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
}
return null;
}
function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableInput): ResolveResult {
if (typeof input !== 'string') { if (typeof input !== 'string') {
const table = snapshot.tables.find((candidate) => refsEqual(candidate, input)) ?? null; const table = snapshot.tables.find((candidate) => refsEqual(candidate, input)) ?? null;
return table return table
@ -149,13 +113,13 @@ function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableI
table: null, table: null,
error: { error: {
code: 'table_not_found', code: 'table_not_found',
message: `Table not found in latest scan: ${displayForTable(snapshot.driver, input)}`, message: `Table not found in latest scan: ${dialect.formatDisplayRef(input)}`,
candidates: candidateList(snapshot.driver, snapshot.tables), candidates: candidateList(dialect, snapshot.tables),
}, },
}; };
} }
const parsed = parseDisplayRef(snapshot.driver, input); const parsed = dialect.parseDisplayRef(input);
if (parsed) { if (parsed) {
const table = snapshot.tables.find((candidate) => refsEqual(candidate, parsed)) ?? null; const table = snapshot.tables.find((candidate) => refsEqual(candidate, parsed)) ?? null;
return table return table
@ -165,7 +129,7 @@ function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableI
error: { error: {
code: 'table_not_found', code: 'table_not_found',
message: `Table not found in latest scan: ${input}`, message: `Table not found in latest scan: ${input}`,
candidates: candidateList(snapshot.driver, snapshot.tables), candidates: candidateList(dialect, snapshot.tables),
}, },
}; };
} }
@ -180,7 +144,7 @@ function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableI
error: { error: {
code: 'ambiguous_table', code: 'ambiguous_table',
message: `Table name "${input}" is ambiguous across schemas/catalogs; pass a structured table ref.`, message: `Table name "${input}" is ambiguous across schemas/catalogs; pass a structured table ref.`,
candidates: candidateList(snapshot.driver, byName), candidates: candidateList(dialect, byName),
}, },
}; };
} }
@ -189,7 +153,7 @@ function resolveTable(snapshot: KtxSchemaSnapshot, input: KtxEntityDetailsTableI
error: { error: {
code: 'table_not_found', code: 'table_not_found',
message: `Table not found in latest scan: ${input}`, message: `Table not found in latest scan: ${input}`,
candidates: candidateList(snapshot.driver, snapshot.tables), candidates: candidateList(dialect, snapshot.tables),
}, },
}; };
} }
@ -261,9 +225,10 @@ export function createKtxEntityDetailsService(project: KtxLocalProject) {
} }
const info = snapshotInfo(scan.report, scan.snapshot); const info = snapshotInfo(scan.report, scan.snapshot);
const dialect = getDialectForDriver(scan.snapshot.driver);
const results: KtxEntityDetailsResponse['results'] = []; const results: KtxEntityDetailsResponse['results'] = [];
for (const entity of input.entities) { for (const entity of input.entities) {
const resolved = resolveTable(scan.snapshot, entity.table); const resolved = resolveTable(scan.snapshot, entity.table, dialect);
if (!resolved.table) { if (!resolved.table) {
results.push({ results.push({
ok: false, ok: false,
@ -289,7 +254,7 @@ export function createKtxEntityDetailsService(project: KtxLocalProject) {
snapshot: info, snapshot: info,
error: { error: {
code: 'column_not_found', code: 'column_not_found',
message: `Column(s) not found on ${displayForTable(scan.snapshot.driver, resolved.table)}: ${missing.join(', ')}`, message: `Column(s) not found on ${dialect.formatDisplayRef(resolved.table)}: ${missing.join(', ')}`,
candidates: resolved.table.columns.map((column) => column.name), candidates: resolved.table.columns.map((column) => column.name),
}, },
}); });
@ -300,7 +265,7 @@ export function createKtxEntityDetailsService(project: KtxLocalProject) {
ok: true, ok: true,
connectionId: input.connectionId, connectionId: input.connectionId,
tableRef: tableRef(resolved.table), tableRef: tableRef(resolved.table),
display: displayForTable(scan.snapshot.driver, resolved.table), display: dialect.formatDisplayRef(resolved.table),
kind: resolved.table.kind, kind: resolved.table.kind,
comment: resolved.table.comment, comment: resolved.table.comment,
estimatedRows: resolved.table.estimatedRows, estimatedRows: resolved.table.estimatedRows,
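
The dialect implementations of `formatDisplayRef` and `parseDisplayRef` are not part of this hunk. As a rough sketch of what moving the removed `displayForTable` and `parseDisplayRef` helpers behind a dialect could look like, the following reuses the removed logic for SQLite and for the generic two-or-three-part case; `DisplayDialect`, `sqliteDisplay`, and `postgresDisplay` are hypothetical names, and the real `KtxDialect` classes may behave differently.

```typescript
interface TableRef {
  catalog: string | null;
  db: string | null;
  name: string;
}

interface DisplayDialect {
  formatDisplayRef(table: TableRef): string;
  parseDisplayRef(display: string): TableRef | null;
}

// Same splitting as the removed helper: split on dots, strip one layer of quoting.
// Limitation: identifiers that themselves contain a dot are not handled.
function splitDisplay(display: string): string[] {
  return display
    .trim()
    .split('.')
    .map((part) => part.trim().replace(/^["'`\[]|["'`\]]$/g, ''))
    .filter(Boolean);
}

// SQLite refs are a bare table name.
const sqliteDisplay: DisplayDialect = {
  formatDisplayRef: (table) => table.name,
  parseDisplayRef: (display) => {
    const parts = splitDisplay(display);
    return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null;
  },
};

// Generic case from the removed code: schema.table or catalog.schema.table.
const postgresDisplay: DisplayDialect = {
  formatDisplayRef: (table) =>
    [table.catalog, table.db, table.name].filter((part): part is string => Boolean(part)).join('.'),
  parseDisplayRef: (display) => {
    const parts = splitDisplay(display);
    if (parts.length === 2) return { catalog: null, db: parts[0]!, name: parts[1]! };
    if (parts.length === 3) return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
    return null;
  },
};
```

With both directions owned by one object, callers such as `resolveTable` no longer branch on the driver name.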


@ -1,5 +1,6 @@
import pLimit from 'p-limit'; import pLimit from 'p-limit';
import type { KtxLlmRuntimePort } from '../../context/llm/runtime-port.js'; import type { KtxLlmRuntimePort } from '../../context/llm/runtime-port.js';
import { getDialectForDriver } from '../connections/dialects.js';
import { buildDefaultKtxProjectConfig, type KtxScanRelationshipConfig } from '../project/config.js'; import { buildDefaultKtxProjectConfig, type KtxScanRelationshipConfig } from '../project/config.js';
import { KtxDescriptionGenerator } from './description-generation.js'; import { KtxDescriptionGenerator } from './description-generation.js';
import { buildKtxColumnEmbeddingText } from './embedding-text.js'; import { buildKtxColumnEmbeddingText } from './embedding-text.js';
@ -118,6 +119,18 @@ function targetMatchesForeignKey(table: KtxEnrichedTable, foreignKey: KtxSchemaF
); );
} }
function assertConnectorDriverMatchesSnapshot(input: {
connector: KtxScanConnector;
snapshot: KtxSchemaSnapshot;
connectionId: string;
}): void {
if (input.connector.driver !== input.snapshot.driver) {
throw new Error(
`ktx scan connector driver "${input.connector.driver}" does not match snapshot driver "${input.snapshot.driver}" for connection "${input.connectionId}"`,
);
}
}
function formalRelationshipsFromSnapshot( function formalRelationshipsFromSnapshot(
snapshot: KtxSchemaSnapshot, snapshot: KtxSchemaSnapshot,
tables: readonly KtxEnrichedTable[], tables: readonly KtxEnrichedTable[],
@ -468,6 +481,12 @@ export async function runLocalScanEnrichment(
)); ));
await progress?.update(0.05, `Loaded schema snapshot with ${snapshot.tables.length} tables`); await progress?.update(0.05, `Loaded schema snapshot with ${snapshot.tables.length} tables`);
assertConnectorDriverMatchesSnapshot({
connector: input.connector,
snapshot,
connectionId: input.connectionId,
});
const dialect = getDialectForDriver(snapshot.driver);
const now = input.now ?? (() => new Date()); const now = input.now ?? (() => new Date());
const state = completedKtxScanEnrichmentStateSummary(); const state = completedKtxScanEnrichmentStateSummary();
const syncId = input.syncId ?? input.context.runId; const syncId = input.syncId ?? input.context.runId;
@ -575,7 +594,7 @@ export async function runLocalScanEnrichment(
await relationshipProgress?.update(0, 'Detecting relationships'); await relationshipProgress?.update(0, 'Detecting relationships');
const detection = await discoverKtxRelationships({ const detection = await discoverKtxRelationships({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: snapshot.driver, dialect,
connector: input.connector, connector: input.connector,
schema, schema,
context: input.context, context: input.context,


@ -6,6 +6,7 @@ import { gunzipSync } from 'node:zlib';
import Database from 'better-sqlite3'; import Database from 'better-sqlite3';
import YAML from 'yaml'; import YAML from 'yaml';
import { z } from 'zod'; import { z } from 'zod';
import { getDialectForDriver } from '../connections/dialects.js';
import type { KtxLlmRuntimePort } from '../llm/runtime-port.js'; import type { KtxLlmRuntimePort } from '../llm/runtime-port.js';
import type { KtxEnrichedRelationship, KtxEnrichedSchema, KtxRelationshipType } from './enrichment-types.js'; import type { KtxEnrichedRelationship, KtxEnrichedSchema, KtxRelationshipType } from './enrichment-types.js';
import { snapshotToKtxEnrichedSchema } from './local-enrichment.js'; import { snapshotToKtxEnrichedSchema } from './local-enrichment.js';
@ -536,6 +537,7 @@ export function ktxRelationshipBenchmarkDetectorWithLlm(
const formalLinks = formalMetadata.accepted.map((relationship) => relationshipToBenchmarkLink(relationship)); const formalLinks = formalMetadata.accepted.map((relationship) => relationshipToBenchmarkLink(relationship));
const acceptedKeys = new Set(formalLinks.map(fkKey)); const acceptedKeys = new Set(formalLinks.map(fkKey));
const sqliteDataAvailable = Boolean(input.dataPath && input.snapshot.driver === 'sqlite'); const sqliteDataAvailable = Boolean(input.dataPath && input.snapshot.driver === 'sqlite');
const dialect = getDialectForDriver(input.snapshot.driver);
const profilingExecutor = const profilingExecutor =
sqliteDataAvailable && input.mode !== 'profiling_disabled' sqliteDataAvailable && input.mode !== 'profiling_disabled'
? new KtxRelationshipBenchmarkSqliteExecutor(input.dataPath as string) ? new KtxRelationshipBenchmarkSqliteExecutor(input.dataPath as string)
@ -550,7 +552,7 @@ export function ktxRelationshipBenchmarkDetectorWithLlm(
}) })
: await profileKtxRelationshipSchema({ : await profileKtxRelationshipSchema({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
schema: input.schema, schema: input.schema,
executor: profilingExecutor, executor: profilingExecutor,
ctx: { runId: `relationship-benchmark:${input.fixtureId}:${input.mode}:profile` }, ctx: { runId: `relationship-benchmark:${input.fixtureId}:${input.mode}:profile` },
@ -580,7 +582,7 @@ export function ktxRelationshipBenchmarkDetectorWithLlm(
: Math.max(0, input.validationBudget - profiles.queryCount); : Math.max(0, input.validationBudget - profiles.queryCount);
const validatedBroadCandidates = await validateKtxRelationshipDiscoveryCandidates({ const validatedBroadCandidates = await validateKtxRelationshipDiscoveryCandidates({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
candidates, candidates,
profiles, profiles,
executor: validationExecutor, executor: validationExecutor,
@ -597,7 +599,7 @@ export function ktxRelationshipBenchmarkDetectorWithLlm(
input.mode !== 'validation_disabled' input.mode !== 'validation_disabled'
? await discoverKtxCompositeRelationships({ ? await discoverKtxCompositeRelationships({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
schema: input.schema, schema: input.schema,
profiles, profiles,
executor: validationExecutor, executor: validationExecutor,
@ -671,6 +673,7 @@ export function currentKtxRelationshipBenchmarkDetector(): KtxRelationshipBenchm
const formalLinks = formalMetadata.accepted.map((relationship) => relationshipToBenchmarkLink(relationship)); const formalLinks = formalMetadata.accepted.map((relationship) => relationshipToBenchmarkLink(relationship));
const acceptedKeys = new Set(formalLinks.map(fkKey)); const acceptedKeys = new Set(formalLinks.map(fkKey));
const sqliteDataAvailable = Boolean(input.dataPath && input.snapshot.driver === 'sqlite'); const sqliteDataAvailable = Boolean(input.dataPath && input.snapshot.driver === 'sqlite');
const dialect = getDialectForDriver(input.snapshot.driver);
const profilingExecutor = const profilingExecutor =
sqliteDataAvailable && input.mode !== 'profiling_disabled' sqliteDataAvailable && input.mode !== 'profiling_disabled'
? new KtxRelationshipBenchmarkSqliteExecutor(input.dataPath as string) ? new KtxRelationshipBenchmarkSqliteExecutor(input.dataPath as string)
@ -685,7 +688,7 @@ export function currentKtxRelationshipBenchmarkDetector(): KtxRelationshipBenchm
}) })
: await profileKtxRelationshipSchema({ : await profileKtxRelationshipSchema({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
schema: input.schema, schema: input.schema,
executor: profilingExecutor, executor: profilingExecutor,
ctx: { runId: `relationship-benchmark:${input.fixtureId}:${input.mode}:profile` }, ctx: { runId: `relationship-benchmark:${input.fixtureId}:${input.mode}:profile` },
@ -702,7 +705,7 @@ export function currentKtxRelationshipBenchmarkDetector(): KtxRelationshipBenchm
: Math.max(0, input.validationBudget - profiles.queryCount); : Math.max(0, input.validationBudget - profiles.queryCount);
const validatedBroadCandidates = await validateKtxRelationshipDiscoveryCandidates({ const validatedBroadCandidates = await validateKtxRelationshipDiscoveryCandidates({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
candidates: broadRelationshipCandidates, candidates: broadRelationshipCandidates,
profiles, profiles,
executor: validationExecutor, executor: validationExecutor,
@ -719,7 +722,7 @@ export function currentKtxRelationshipBenchmarkDetector(): KtxRelationshipBenchm
input.mode !== 'validation_disabled' input.mode !== 'validation_disabled'
? await discoverKtxCompositeRelationships({ ? await discoverKtxCompositeRelationships({
connectionId: input.snapshot.connectionId, connectionId: input.snapshot.connectionId,
driver: input.snapshot.driver, dialect,
schema: input.schema, schema: input.schema,
profiles, profiles,
executor: validationExecutor, executor: validationExecutor,


@ -1,11 +1,10 @@
import type { KtxDialect } from '../connections/dialects.js';
import type { KtxEnrichedColumn, KtxEnrichedSchema, KtxEnrichedTable, KtxRelationshipType } from './enrichment-types.js'; import type { KtxEnrichedColumn, KtxEnrichedSchema, KtxEnrichedTable, KtxRelationshipType } from './enrichment-types.js';
import { import {
formatKtxRelationshipTableRef,
quoteKtxRelationshipIdentifier,
type KtxRelationshipProfileArtifact, type KtxRelationshipProfileArtifact,
type KtxRelationshipReadOnlyExecutor, type KtxRelationshipReadOnlyExecutor,
} from './relationship-profiling.js'; } from './relationship-profiling.js';
import type { KtxConnectionDriver, KtxQueryResult, KtxScanContext, KtxTableRef } from './types.js'; import type { KtxQueryResult, KtxScanContext, KtxTableRef } from './types.js';
type KtxCompositeRelationshipStatus = 'accepted' | 'review' | 'rejected'; type KtxCompositeRelationshipStatus = 'accepted' | 'review' | 'rejected';
@ -57,7 +56,7 @@ export interface KtxCompositeRelationshipCandidate {
export interface DiscoverKtxCompositeRelationshipsInput { export interface DiscoverKtxCompositeRelationshipsInput {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
schema: KtxEnrichedSchema; schema: KtxEnrichedSchema;
profiles: KtxRelationshipProfileArtifact; profiles: KtxRelationshipProfileArtifact;
executor: KtxRelationshipReadOnlyExecutor | null; executor: KtxRelationshipReadOnlyExecutor | null;
@ -224,28 +223,16 @@ function numberAt(result: KtxQueryResult, header: string): number {
return 0; return 0;
} }
function topSql(driver: KtxConnectionDriver, limit: number): string { function sqlSuffix(fragment: string): string {
if (driver === 'sqlserver') { return fragment ? ` ${fragment}` : '';
return ` TOP (${Math.max(1, Math.floor(limit))})`;
}
return '';
} }
function limitSql(driver: KtxConnectionDriver, limit: number): string { function aliasedTupleSelect(dialect: KtxDialect, columns: readonly string[]): string {
if (driver === 'sqlserver') { return columns.map((column, index) => `${dialect.quoteIdentifier(column)} AS c${index}`).join(', ');
return '';
}
return ` LIMIT ${Math.max(1, Math.floor(limit))}`;
} }
function aliasedTupleSelect(driver: KtxConnectionDriver, columns: readonly string[]): string { function nonNullPredicate(dialect: KtxDialect, columns: readonly string[]): string {
return columns return columns.map((column) => `${dialect.quoteIdentifier(column)} IS NOT NULL`).join(' AND ');
.map((column, index) => `${quoteKtxRelationshipIdentifier(driver, column)} AS c${index}`)
.join(', ');
}
function nonNullPredicate(driver: KtxConnectionDriver, columns: readonly string[]): string {
return columns.map((column) => `${quoteKtxRelationshipIdentifier(driver, column)} IS NOT NULL`).join(' AND ');
} }
function tupleEquality(columns: number): string { function tupleEquality(columns: number): string {
@@ -255,39 +242,39 @@ function tupleEquality(columns: number): string {
} }
function buildTupleDistinctSql(input: { function buildTupleDistinctSql(input: {
driver: KtxConnectionDriver; dialect: KtxDialect;
table: KtxTableRef; table: KtxTableRef;
columns: readonly string[]; columns: readonly string[];
}): string { }): string {
const tableSql = formatKtxRelationshipTableRef(input.driver, input.table); const tableSql = input.dialect.formatTableName(input.table);
return [ return [
'WITH tuple_values AS (', 'WITH tuple_values AS (',
`SELECT DISTINCT ${aliasedTupleSelect(input.driver, input.columns)} FROM ${tableSql}`, `SELECT DISTINCT ${aliasedTupleSelect(input.dialect, input.columns)} FROM ${tableSql}`,
`WHERE ${nonNullPredicate(input.driver, input.columns)}`, `WHERE ${nonNullPredicate(input.dialect, input.columns)}`,
')', ')',
'SELECT COUNT(*) AS distinct_count FROM tuple_values', 'SELECT COUNT(*) AS distinct_count FROM tuple_values',
].join(' '); ].join(' ');
} }
function buildCompositeCoverageSql(input: { function buildCompositeCoverageSql(input: {
driver: KtxConnectionDriver; dialect: KtxDialect;
childTable: KtxTableRef; childTable: KtxTableRef;
childColumns: readonly string[]; childColumns: readonly string[];
parentTable: KtxTableRef; parentTable: KtxTableRef;
parentColumns: readonly string[]; parentColumns: readonly string[];
maxDistinctSourceValues: number; maxDistinctSourceValues: number;
}): string { }): string {
const childTableSql = formatKtxRelationshipTableRef(input.driver, input.childTable); const childTableSql = input.dialect.formatTableName(input.childTable);
const parentTableSql = formatKtxRelationshipTableRef(input.driver, input.parentTable); const parentTableSql = input.dialect.formatTableName(input.parentTable);
const top = topSql(input.driver, input.maxDistinctSourceValues); const top = input.dialect.getTopClause(input.maxDistinctSourceValues);
const limit = limitSql(input.driver, input.maxDistinctSourceValues); const limit = sqlSuffix(input.dialect.getLimitOffsetClause(input.maxDistinctSourceValues));
return [ return [
'WITH child_values AS (', 'WITH child_values AS (',
`SELECT DISTINCT${top} ${aliasedTupleSelect(input.driver, input.childColumns)} FROM ${childTableSql}`, `SELECT DISTINCT${top ? ` ${top}` : ''} ${aliasedTupleSelect(input.dialect, input.childColumns)} FROM ${childTableSql}`,
`WHERE ${nonNullPredicate(input.driver, input.childColumns)}${limit}`, `WHERE ${nonNullPredicate(input.dialect, input.childColumns)}${limit}`,
'), parent_values AS (', '), parent_values AS (',
`SELECT DISTINCT ${aliasedTupleSelect(input.driver, input.parentColumns)} FROM ${parentTableSql}`, `SELECT DISTINCT ${aliasedTupleSelect(input.dialect, input.parentColumns)} FROM ${parentTableSql}`,
`WHERE ${nonNullPredicate(input.driver, input.parentColumns)}`, `WHERE ${nonNullPredicate(input.dialect, input.parentColumns)}`,
')', ')',
'SELECT', 'SELECT',
'(SELECT COUNT(*) FROM child_values) AS child_distinct,', '(SELECT COUNT(*) FROM child_values) AS child_distinct,',
@@ -335,7 +322,7 @@ function hasAcceptedSubset(
async function detectCompositePrimaryKeys(input: { async function detectCompositePrimaryKeys(input: {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
table: KtxEnrichedTable; table: KtxEnrichedTable;
profiles: KtxRelationshipProfileArtifact; profiles: KtxRelationshipProfileArtifact;
executor: KtxRelationshipReadOnlyExecutor; executor: KtxRelationshipReadOnlyExecutor;
@@ -379,7 +366,7 @@ async function detectCompositePrimaryKeys(input: {
{ {
connectionId: input.connectionId, connectionId: input.connectionId,
sql: buildTupleDistinctSql({ sql: buildTupleDistinctSql({
driver: input.driver, dialect: input.dialect,
table: input.table.ref, table: input.table.ref,
columns: columnNames, columns: columnNames,
}), }),
@@ -439,7 +426,7 @@ function compatibleTuple(sourceColumns: readonly KtxEnrichedColumn[], targetColu
async function validateCompositeRelationship(input: { async function validateCompositeRelationship(input: {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
sourceTable: KtxEnrichedTable; sourceTable: KtxEnrichedTable;
sourceColumns: readonly KtxEnrichedColumn[]; sourceColumns: readonly KtxEnrichedColumn[];
targetKey: KtxCompositePrimaryKeyCandidate; targetKey: KtxCompositePrimaryKeyCandidate;
@@ -454,7 +441,7 @@ async function validateCompositeRelationship(input: {
{ {
connectionId: input.connectionId, connectionId: input.connectionId,
sql: buildCompositeCoverageSql({ sql: buildCompositeCoverageSql({
driver: input.driver, dialect: input.dialect,
childTable: input.sourceTable.ref, childTable: input.sourceTable.ref,
childColumns: input.sourceColumns.map((column) => column.name), childColumns: input.sourceColumns.map((column) => column.name),
parentTable: input.targetTable.ref, parentTable: input.targetTable.ref,
@@ -552,7 +539,7 @@ export async function discoverKtxCompositeRelationships(
for (const table of tables) { for (const table of tables) {
const result = await detectCompositePrimaryKeys({ const result = await detectCompositePrimaryKeys({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
table, table,
profiles: input.profiles, profiles: input.profiles,
executor: input.executor, executor: input.executor,
@@ -595,7 +582,7 @@ export async function discoverKtxCompositeRelationships(
const result = await validateCompositeRelationship({ const result = await validateCompositeRelationship({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
sourceTable, sourceTable,
sourceColumns, sourceColumns,
targetKey, targetKey,
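
The hunks above replace the per-driver `topSql` and `limitSql` helpers with `dialect.getTopClause(...)` and `dialect.getLimitOffsetClause(...)`, padded by `sqlSuffix`. A minimal sketch of how the two clauses might compose follows. `SketchDialect`, `sqlServerLike`, `limitBased`, and `distinctValuesSql` are hypothetical names for illustration, and the clause bodies are assumed to mirror the removed helpers; the real `KtxDialect` contract is not shown in this diff and may differ.

```typescript
// Hypothetical, reduced stand-in for KtxDialect: only the two methods used here.
interface SketchDialect {
  getTopClause(limit: number): string; // '' when the dialect has no TOP
  getLimitOffsetClause(limit: number): string; // '' when the dialect uses TOP
}

// Same clamping as the removed helpers: at least 1, whole numbers only.
const clamp = (limit: number): number => Math.max(1, Math.floor(limit));

// Assumed behavior for a SQL Server style dialect.
const sqlServerLike: SketchDialect = {
  getTopClause: (limit) => `TOP (${clamp(limit)})`,
  getLimitOffsetClause: () => '',
};

// Assumed behavior for dialects that use a trailing LIMIT.
const limitBased: SketchDialect = {
  getTopClause: () => '',
  getLimitOffsetClause: (limit) => `LIMIT ${clamp(limit)}`,
};

// Same helper as in the diff: prefix a non-empty fragment with one space.
function sqlSuffix(fragment: string): string {
  return fragment ? ` ${fragment}` : '';
}

// Build a bounded DISTINCT query; exactly one of TOP or LIMIT is non-empty.
function distinctValuesSql(dialect: SketchDialect, column: string, table: string, limit: number): string {
  const top = sqlSuffix(dialect.getTopClause(limit));
  const tail = sqlSuffix(dialect.getLimitOffsetClause(limit));
  return `SELECT DISTINCT${top} ${column} AS value FROM ${table} WHERE ${column} IS NOT NULL${tail}`;
}
```

Because each dialect returns an empty string for the clause it does not support, the caller never branches on the driver name, which appears to be the point of the refactor.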

@@ -1,4 +1,5 @@
import type { KtxLlmRuntimePort } from '../../context/llm/runtime-port.js'; import type { KtxLlmRuntimePort } from '../../context/llm/runtime-port.js';
import type { KtxDialect } from '../connections/dialects.js';
import type { KtxScanRelationshipConfig } from '../project/config.js'; import type { KtxScanRelationshipConfig } from '../project/config.js';
import type { KtxEnrichedRelationship, KtxEnrichedSchema, KtxRelationshipUpdate } from './enrichment-types.js'; import type { KtxEnrichedRelationship, KtxEnrichedSchema, KtxRelationshipUpdate } from './enrichment-types.js';
import { import {
@@ -24,7 +25,6 @@ import {
} from './relationship-profiling.js'; } from './relationship-profiling.js';
import { validateKtxRelationshipDiscoveryCandidates } from './relationship-validation.js'; import { validateKtxRelationshipDiscoveryCandidates } from './relationship-validation.js';
import type { import type {
KtxConnectionDriver,
KtxScanConnector, KtxScanConnector,
KtxScanContext, KtxScanContext,
KtxScanEnrichmentSummary, KtxScanEnrichmentSummary,
@@ -34,7 +34,7 @@ import type {
export interface DiscoverKtxRelationshipsInput { export interface DiscoverKtxRelationshipsInput {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
connector: KtxScanConnector; connector: KtxScanConnector;
schema: KtxEnrichedSchema; schema: KtxEnrichedSchema;
context: KtxScanContext; context: KtxScanContext;
@@ -122,7 +122,7 @@ function compositeSummary(relationships: readonly KtxCompositeRelationshipCandid
async function detectCompositeRelationships(input: { async function detectCompositeRelationships(input: {
connectionId: string; connectionId: string;
driver: DiscoverKtxRelationshipsInput['driver']; dialect: KtxDialect;
schema: KtxEnrichedSchema; schema: KtxEnrichedSchema;
profile: KtxRelationshipProfileArtifact; profile: KtxRelationshipProfileArtifact;
executor: KtxRelationshipReadOnlyExecutor | null; executor: KtxRelationshipReadOnlyExecutor | null;
@@ -135,7 +135,7 @@ async function detectCompositeRelationships(input: {
try { try {
const compositeDetection = await discoverKtxCompositeRelationships({ const compositeDetection = await discoverKtxCompositeRelationships({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
schema: input.schema, schema: input.schema,
profiles: input.profile, profiles: input.profile,
executor: input.executor, executor: input.executor,
@@ -223,7 +223,7 @@ export async function discoverKtxRelationships(
const profileCache = createKtxRelationshipProfileCache(); const profileCache = createKtxRelationshipProfileCache();
const profile = await profileKtxRelationshipSchema({ const profile = await profileKtxRelationshipSchema({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
schema: input.schema, schema: input.schema,
executor, executor,
ctx: input.context, ctx: input.context,
@@ -256,7 +256,7 @@ export async function discoverKtxRelationships(
warnings.push(...llmProposalResult.warnings); warnings.push(...llmProposalResult.warnings);
const validated = await validateKtxRelationshipDiscoveryCandidates({ const validated = await validateKtxRelationshipDiscoveryCandidates({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
candidates, candidates,
profiles: profile, profiles: profile,
executor, executor,
@@ -282,7 +282,7 @@ export async function discoverKtxRelationships(
}); });
const compositeRelationships = await detectCompositeRelationships({ const compositeRelationships = await detectCompositeRelationships({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
schema: input.schema, schema: input.schema,
profile, profile,
executor, executor,

@@ -1,3 +1,4 @@
import type { KtxDialect } from '../connections/dialects.js';
import type { KtxEnrichedColumn, KtxEnrichedSchema, KtxEnrichedTable } from './enrichment-types.js'; import type { KtxEnrichedColumn, KtxEnrichedSchema, KtxEnrichedTable } from './enrichment-types.js';
import { mapWithConcurrency } from './relationship-validation.js'; import { mapWithConcurrency } from './relationship-validation.js';
import type { import type {
@@ -55,7 +56,7 @@ export interface KtxRelationshipProfileCache {
export interface ProfileKtxRelationshipSchemaInput { export interface ProfileKtxRelationshipSchemaInput {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
schema: KtxEnrichedSchema; schema: KtxEnrichedSchema;
executor: KtxRelationshipReadOnlyExecutor | null; executor: KtxRelationshipReadOnlyExecutor | null;
ctx: KtxScanContext; ctx: KtxScanContext;
@@ -71,75 +72,6 @@ export function createKtxRelationshipProfileCache(): KtxRelationshipProfileCache
const SAMPLE_VALUE_DELIMITER = '\u001f'; const SAMPLE_VALUE_DELIMITER = '\u001f';
type QuoteStyle = 'double' | 'backtick' | 'bracket';
function quoteStyle(driver: KtxConnectionDriver): QuoteStyle {
if (driver === 'mysql' || driver === 'clickhouse') {
return 'backtick';
}
if (driver === 'sqlserver') {
return 'bracket';
}
return 'double';
}
export function quoteKtxRelationshipIdentifier(driver: KtxConnectionDriver, identifier: string): string {
switch (quoteStyle(driver)) {
case 'backtick':
return `\`${identifier.replace(/`/g, '``')}\``;
case 'bracket':
return `[${identifier.replace(/\]/g, ']]')}]`;
case 'double':
return `"${identifier.replace(/"/g, '""')}"`;
}
}
export function formatKtxRelationshipTableRef(driver: KtxConnectionDriver, table: KtxTableRef): string {
const parts =
driver === 'sqlite'
? [table.name]
: [table.catalog, table.db, table.name].filter((value): value is string => Boolean(value));
return parts.map((part) => quoteKtxRelationshipIdentifier(driver, part)).join('.');
}
function textLengthExpression(driver: KtxConnectionDriver, columnSql: string): string {
if (driver === 'mysql') {
return `CHAR_LENGTH(CAST(${columnSql} AS CHAR))`;
}
if (driver === 'sqlserver') {
return `LEN(CAST(${columnSql} AS NVARCHAR(MAX)))`;
}
if (driver === 'bigquery') {
return `LENGTH(CAST(${columnSql} AS STRING))`;
}
if (driver === 'clickhouse') {
return `length(toString(${columnSql}))`;
}
return `LENGTH(CAST(${columnSql} AS TEXT))`;
}
function limitSql(driver: KtxConnectionDriver, limit: number): string {
if (driver === 'sqlserver') {
return '';
}
return ` LIMIT ${Math.max(1, Math.floor(limit))}`;
}
function topSql(driver: KtxConnectionDriver, limit: number): string {
if (driver === 'sqlserver') {
return ` TOP (${Math.max(1, Math.floor(limit))})`;
}
return '';
}
function sampledTableSql(driver: KtxConnectionDriver, tableSql: string, limit: number): string {
const safeLimit = Math.max(1, Math.floor(limit));
if (driver === 'sqlserver') {
return `(SELECT TOP (${safeLimit}) * FROM ${tableSql}) AS relationship_profile_sample`;
}
return `(SELECT * FROM ${tableSql}${limitSql(driver, safeLimit)}) AS relationship_profile_sample`;
}
function firstRow(result: KtxQueryResult): unknown[] { function firstRow(result: KtxQueryResult): unknown[] {
return result.rows[0] ?? []; return result.rows[0] ?? [];
} }
@@ -191,7 +123,7 @@ function columnKey(table: KtxEnrichedTable, column: KtxEnrichedColumn): string {
function tableProfileCacheKey(input: { function tableProfileCacheKey(input: {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
ctx: KtxScanContext; ctx: KtxScanContext;
table: KtxTableRef; table: KtxTableRef;
sampleValuesPerColumn: number; sampleValuesPerColumn: number;
@@ -200,7 +132,7 @@ function tableProfileCacheKey(input: {
return [ return [
input.ctx.runId, input.ctx.runId,
input.connectionId, input.connectionId,
input.driver, input.dialect.type,
input.table.catalog ?? '', input.table.catalog ?? '',
input.table.db ?? '', input.table.db ?? '',
input.table.name, input.table.name,
@@ -213,57 +145,47 @@ function sqlStringLiteral(value: string): string {
return `'${value.replace(/'/g, "''")}'`; return `'${value.replace(/'/g, "''")}'`;
} }
function sampleAggregateSql(driver: KtxConnectionDriver, innerSql: string): string { function sqlSuffix(fragment: string): string {
if (driver === 'postgres') { return fragment ? ` ${fragment}` : '';
return `(SELECT STRING_AGG(CAST(value AS TEXT), CHR(31)) FROM (${innerSql}) AS relationship_profile_values)`; }
function sampledTableSql(dialect: KtxDialect, tableSql: string, limit: number): string {
const top = dialect.getTopClause(limit);
if (top) {
return `(SELECT ${top} * FROM ${tableSql}) AS relationship_profile_sample`;
} }
if (driver === 'bigquery') { return `(SELECT * FROM ${tableSql}${sqlSuffix(dialect.getLimitOffsetClause(limit))}) AS relationship_profile_sample`;
return `(SELECT STRING_AGG(CAST(value AS STRING), '\\u001F') FROM (${innerSql}) AS relationship_profile_values)`;
}
if (driver === 'mysql') {
return `(SELECT GROUP_CONCAT(CAST(value AS CHAR) SEPARATOR CHAR(31)) FROM (${innerSql}) AS relationship_profile_values)`;
}
if (driver === 'sqlserver') {
return `(SELECT STRING_AGG(CAST(value AS NVARCHAR(MAX)), CHAR(31)) FROM (${innerSql}) AS relationship_profile_values)`;
}
if (driver === 'clickhouse') {
return `(SELECT arrayStringConcat(groupArray(toString(value)), '\\x1F') FROM (${innerSql}) AS relationship_profile_values)`;
}
if (driver === 'snowflake') {
return `(SELECT LISTAGG(CAST(value AS VARCHAR), '\\x1f') FROM (${innerSql}) AS relationship_profile_values)`;
}
return `(SELECT GROUP_CONCAT(CAST(value AS TEXT), char(31)) FROM (${innerSql}) AS relationship_profile_values)`;
} }
function sampleValuesSql(input: { function sampleValuesSql(input: {
driver: KtxConnectionDriver; dialect: KtxDialect;
tableSql: string; tableSql: string;
columnSql: string; columnSql: string;
limit: number; limit: number;
}): string { }): string {
const top = input.dialect.getTopClause(input.limit);
return [ return [
`SELECT${topSql(input.driver, input.limit)} ${input.columnSql} AS value`, `SELECT${top ? ` ${top}` : ''} ${input.columnSql} AS value`,
`FROM ${input.tableSql}`, `FROM ${input.tableSql}`,
`WHERE ${input.columnSql} IS NOT NULL`, `WHERE ${input.columnSql} IS NOT NULL`,
`GROUP BY ${input.columnSql}`, `GROUP BY ${input.columnSql}`,
`ORDER BY COUNT(*) DESC, ${input.columnSql} ASC`, `ORDER BY COUNT(*) DESC, ${input.columnSql} ASC`,
limitSql(input.driver, input.limit), sqlSuffix(input.dialect.getLimitOffsetClause(input.limit)),
].join(' '); ].join(' ');
} }
function columnProfileSelectSql(input: { function columnProfileSelectSql(input: {
connectionDriver: KtxConnectionDriver; dialect: KtxDialect;
tableSql: string; tableSql: string;
profileTableSql: string; profileTableSql: string;
column: KtxEnrichedColumn; column: KtxEnrichedColumn;
sampleValuesPerColumn: number; sampleValuesPerColumn: number;
}): string { }): string {
const columnSql = quoteKtxRelationshipIdentifier(input.connectionDriver, input.column.name); const columnSql = input.dialect.quoteIdentifier(input.column.name);
const textLengthSql = textLengthExpression(input.connectionDriver, columnSql); const textLengthSql = input.dialect.textLengthExpression(columnSql);
const samplesSql = sampleAggregateSql( const samplesSql = input.dialect.getSampleValueAggregation(
input.connectionDriver,
sampleValuesSql({ sampleValuesSql({
driver: input.connectionDriver, dialect: input.dialect,
tableSql: input.profileTableSql, tableSql: input.profileTableSql,
columnSql, columnSql,
limit: input.sampleValuesPerColumn, limit: input.sampleValuesPerColumn,
@@ -296,12 +218,12 @@ function splitSampleValues(value: unknown): string[] {
async function queryCount(input: { async function queryCount(input: {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
table: KtxTableRef; table: KtxTableRef;
executor: KtxRelationshipReadOnlyExecutor; executor: KtxRelationshipReadOnlyExecutor;
ctx: KtxScanContext; ctx: KtxScanContext;
}): Promise<{ rowCount: number; queryCount: number }> { }): Promise<{ rowCount: number; queryCount: number }> {
const tableSql = formatKtxRelationshipTableRef(input.driver, input.table); const tableSql = input.dialect.formatTableName(input.table);
const result = await input.executor.executeReadOnly( const result = await input.executor.executeReadOnly(
{ connectionId: input.connectionId, sql: `SELECT COUNT(*) AS row_count FROM ${tableSql}`, maxRows: 1 }, { connectionId: input.connectionId, sql: `SELECT COUNT(*) AS row_count FROM ${tableSql}`, maxRows: 1 },
input.ctx, input.ctx,
@@ -311,7 +233,7 @@ async function queryCount(input: {
async function queryTableProfile(input: { async function queryTableProfile(input: {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
table: KtxEnrichedTable; table: KtxEnrichedTable;
executor: KtxRelationshipReadOnlyExecutor; executor: KtxRelationshipReadOnlyExecutor;
ctx: KtxScanContext; ctx: KtxScanContext;
@@ -325,7 +247,7 @@ async function queryTableProfile(input: {
if (input.table.columns.length === 0) { if (input.table.columns.length === 0) {
const rowCount = await queryCount({ const rowCount = await queryCount({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
table: input.table.ref, table: input.table.ref,
executor: input.executor, executor: input.executor,
ctx: input.ctx, ctx: input.ctx,
@@ -337,12 +259,12 @@ async function queryTableProfile(input: {
}; };
} }
const tableSql = formatKtxRelationshipTableRef(input.driver, input.table.ref); const tableSql = input.dialect.formatTableName(input.table.ref);
const profileTableSql = sampledTableSql(input.driver, tableSql, input.profileSampleRows); const profileTableSql = sampledTableSql(input.dialect, tableSql, input.profileSampleRows);
const sql = input.table.columns const sql = input.table.columns
.map((column) => .map((column) =>
columnProfileSelectSql({ columnProfileSelectSql({
connectionDriver: input.driver, dialect: input.dialect,
tableSql, tableSql,
profileTableSql, profileTableSql,
column, column,
@@ -401,7 +323,7 @@ export async function profileKtxRelationshipSchema(
if (!input.executor) { if (!input.executor) {
return { return {
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, driver: input.dialect.type,
sqlAvailable: false, sqlAvailable: false,
queryCount: 0, queryCount: 0,
tables: [], tables: [],
@@ -425,7 +347,7 @@ export async function profileKtxRelationshipSchema(
const profileSampleRows = input.profileSampleRows ?? 10000; const profileSampleRows = input.profileSampleRows ?? 10000;
const cacheKey = tableProfileCacheKey({ const cacheKey = tableProfileCacheKey({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
ctx: input.ctx, ctx: input.ctx,
table: table.ref, table: table.ref,
sampleValuesPerColumn, sampleValuesPerColumn,
@@ -439,7 +361,7 @@ export async function profileKtxRelationshipSchema(
try { try {
const tableProfile = await queryTableProfile({ const tableProfile = await queryTableProfile({
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, dialect: input.dialect,
table, table,
executor, executor,
ctx: input.ctx, ctx: input.ctx,
@@ -481,7 +403,7 @@ export async function profileKtxRelationshipSchema(
return { return {
connectionId: input.connectionId, connectionId: input.connectionId,
driver: input.driver, driver: input.dialect.type,
sqlAvailable: true, sqlAvailable: true,
queryCount: queryTotal, queryCount: queryTotal,
tables, tables,
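
The helpers removed from this file show the three identifier-quoting styles that the dialect object now presumably owns through `quoteIdentifier` and `formatTableName`. A minimal sketch of that quoting logic follows. It takes a hypothetical `QuoteStyle` parameter instead of a real `KtxDialect`, whose implementation is not part of this diff, and it omits the SQLite name-only special case of the removed `formatKtxRelationshipTableRef`. `SketchTableRef` is likewise an illustrative stand-in.

```typescript
// Hypothetical reduced table reference; mirrors the catalog/db/name fields in the diff.
interface SketchTableRef {
  catalog: string | null;
  db: string | null;
  name: string;
}

type QuoteStyle = 'double' | 'backtick' | 'bracket';

// Quote one identifier, doubling the closing delimiter so it cannot end the quote early.
function quoteIdentifier(style: QuoteStyle, identifier: string): string {
  switch (style) {
    case 'backtick':
      return '`' + identifier.replace(/`/g, '``') + '`';
    case 'bracket':
      return '[' + identifier.replace(/\]/g, ']]') + ']';
    case 'double':
      return '"' + identifier.replace(/"/g, '""') + '"';
  }
}

// Join the non-empty parts of a table reference into a qualified, quoted name.
function formatTableName(style: QuoteStyle, table: SketchTableRef): string {
  return [table.catalog, table.db, table.name]
    .filter((part): part is string => Boolean(part))
    .map((part) => quoteIdentifier(style, part))
    .join('.');
}
```

Quoting every part separately keeps a dot inside an identifier from being read as a qualifier separator.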

@@ -1,13 +1,12 @@
import type { KtxDialect } from '../connections/dialects.js';
import type { KtxRelationshipEndpoint } from './enrichment-types.js'; import type { KtxRelationshipEndpoint } from './enrichment-types.js';
import { applyKtxRelationshipValidationBudget, type KtxRelationshipValidationBudget } from './relationship-budget.js'; import { applyKtxRelationshipValidationBudget, type KtxRelationshipValidationBudget } from './relationship-budget.js';
import type { KtxRelationshipDiscoveryCandidate } from './relationship-candidates.js'; import type { KtxRelationshipDiscoveryCandidate } from './relationship-candidates.js';
import { import {
formatKtxRelationshipTableRef,
type KtxRelationshipProfileArtifact, type KtxRelationshipProfileArtifact,
type KtxRelationshipReadOnlyExecutor, type KtxRelationshipReadOnlyExecutor,
quoteKtxRelationshipIdentifier,
} from './relationship-profiling.js'; } from './relationship-profiling.js';
import type { KtxConnectionDriver, KtxQueryResult, KtxScanContext } from './types.js'; import type { KtxQueryResult, KtxScanContext, KtxTableRef } from './types.js';
type KtxValidatedRelationshipStatus = 'accepted' | 'review' | 'rejected'; type KtxValidatedRelationshipStatus = 'accepted' | 'review' | 'rejected';
@@ -45,7 +44,7 @@ export interface KtxValidatedRelationshipDiscoveryCandidate
export interface ValidateKtxRelationshipDiscoveryCandidatesInput { export interface ValidateKtxRelationshipDiscoveryCandidatesInput {
connectionId: string; connectionId: string;
driver: KtxConnectionDriver; dialect: KtxDialect;
candidates: readonly KtxRelationshipDiscoveryCandidate[]; candidates: readonly KtxRelationshipDiscoveryCandidate[];
profiles: KtxRelationshipProfileArtifact; profiles: KtxRelationshipProfileArtifact;
executor: KtxRelationshipReadOnlyExecutor | null; executor: KtxRelationshipReadOnlyExecutor | null;
@@ -104,38 +103,28 @@ function numberAt(result: KtxQueryResult, header: string): number {
return 0; return 0;
} }
function limitSql(driver: KtxConnectionDriver, limit: number): string { function sqlSuffix(fragment: string): string {
if (driver === 'sqlserver') { return fragment ? ` ${fragment}` : '';
return '';
}
return ` LIMIT ${Math.max(1, Math.floor(limit))}`;
}
function topSql(driver: KtxConnectionDriver, limit: number): string {
if (driver === 'sqlserver') {
return ` TOP (${Math.max(1, Math.floor(limit))})`;
}
return '';
} }
function buildCoverageSql(input: { function buildCoverageSql(input: {
driver: KtxConnectionDriver; dialect: KtxDialect;
childTable: string; childTable: KtxTableRef;
childColumn: string; childColumn: string;
parentTable: string; parentTable: KtxTableRef;
parentColumn: string; parentColumn: string;
maxDistinctSourceValues: number; maxDistinctSourceValues: number;
}): string { }): string {
const childTable = formatKtxRelationshipTableRef(input.driver, { catalog: null, db: null, name: input.childTable }); const childTable = input.dialect.formatTableName(input.childTable);
const parentTable = formatKtxRelationshipTableRef(input.driver, { catalog: null, db: null, name: input.parentTable }); const parentTable = input.dialect.formatTableName(input.parentTable);
const childColumn = quoteKtxRelationshipIdentifier(input.driver, input.childColumn); const childColumn = input.dialect.quoteIdentifier(input.childColumn);
const parentColumn = quoteKtxRelationshipIdentifier(input.driver, input.parentColumn); const parentColumn = input.dialect.quoteIdentifier(input.parentColumn);
const limit = limitSql(input.driver, input.maxDistinctSourceValues); const limit = sqlSuffix(input.dialect.getLimitOffsetClause(input.maxDistinctSourceValues));
const top = topSql(input.driver, input.maxDistinctSourceValues); const top = input.dialect.getTopClause(input.maxDistinctSourceValues);
return [ return [
'WITH child_values AS (', 'WITH child_values AS (',
`SELECT DISTINCT${top} ${childColumn} AS value FROM ${childTable} WHERE ${childColumn} IS NOT NULL${limit}`, `SELECT DISTINCT${top ? ` ${top}` : ''} ${childColumn} AS value FROM ${childTable} WHERE ${childColumn} IS NOT NULL${limit}`,
'), parent_values AS (', '), parent_values AS (',
`SELECT DISTINCT ${parentColumn} AS value FROM ${parentTable} WHERE ${parentColumn} IS NOT NULL`, `SELECT DISTINCT ${parentColumn} AS value FROM ${parentTable} WHERE ${parentColumn} IS NOT NULL`,
')', ')',
@@ -271,10 +260,10 @@ export async function validateKtxRelationshipDiscoveryCandidates(
{ {
connectionId: input.connectionId, connectionId: input.connectionId,
sql: buildCoverageSql({ sql: buildCoverageSql({
driver: input.driver, dialect: input.dialect,
childTable: candidate.from.table.name, childTable: candidate.from.table,
childColumn: sourceColumn, childColumn: sourceColumn,
parentTable: candidate.to.table.name, parentTable: candidate.to.table,
parentColumn: targetColumn, parentColumn: targetColumn,
maxDistinctSourceValues: settings.maxDistinctSourceValues, maxDistinctSourceValues: settings.maxDistinctSourceValues,
}), }),

@@ -297,6 +297,7 @@ export interface KtxQueryResult {
} }
export interface KtxTableListEntry { export interface KtxTableListEntry {
catalog: string | null;
schema: string; schema: string;
name: string; name: string;
kind: 'table' | 'view'; kind: 'table' | 'view';
@@ -313,6 +314,8 @@ export interface KtxScanConnector {
capabilities: KtxConnectorCapabilities; capabilities: KtxConnectorCapabilities;
eventStreamDiscovery?: KtxEventStreamDiscoveryPort; eventStreamDiscovery?: KtxEventStreamDiscoveryPort;
introspect(input: KtxScanInput, ctx: KtxScanContext): Promise<KtxSchemaSnapshot>; introspect(input: KtxScanInput, ctx: KtxScanContext): Promise<KtxSchemaSnapshot>;
listSchemas(): Promise<string[]>;
listTables(schemas?: string[]): Promise<KtxTableListEntry[]>;
testConnection?(): Promise<KtxConnectorTestResult>; testConnection?(): Promise<KtxConnectorTestResult>;
sampleColumn?(input: KtxColumnSampleInput, ctx: KtxScanContext): Promise<KtxColumnSampleResult>; sampleColumn?(input: KtxColumnSampleInput, ctx: KtxScanContext): Promise<KtxColumnSampleResult>;
sampleTable?(input: KtxTableSampleInput, ctx: KtxScanContext): Promise<KtxTableSampleResult>; sampleTable?(input: KtxTableSampleInput, ctx: KtxScanContext): Promise<KtxTableSampleResult>;

@@ -1,4 +1,4 @@
import { getDialectForDriver } from '../../context/connections/dialects.js'; import { getDialectForDriver, type KtxDialect } from '../connections/dialects.js';
import type { KtxFileStorePort } from '../../context/core/file-store.js'; import type { KtxFileStorePort } from '../../context/core/file-store.js';
import type { import type {
KtxConnectionDriver, KtxConnectionDriver,
@@ -128,46 +128,22 @@ function splitDisplay(display: string): string[] {
.filter(Boolean); .filter(Boolean);
} }
function formatDisplay(driver: CatalogDriver, table: KtxTableRef): string { function formatDisplay(dialect: KtxDialect, table: KtxTableRef): string {
if (driver === 'sqlite') { return dialect.formatDisplayRef(table);
return table.name;
}
return [table.catalog, table.db, table.name].filter((part): part is string => Boolean(part)).join('.');
} }
function parseDisplay(driver: CatalogDriver, display: string): KtxTableRef | null { function parseDisplay(dialect: KtxDialect, display: string): KtxTableRef | null {
const parsed = dialect.parseDisplayRef(display);
if (parsed) {
return parsed;
}
const parts = splitDisplay(display); const parts = splitDisplay(display);
if (driver === 'sqlite') {
return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null;
}
if (driver === 'bigquery' || driver === 'snowflake' || driver === 'sqlserver') {
if (parts.length !== 3) {
return null;
}
return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
}
if (parts.length === 2) {
return { catalog: null, db: parts[0]!, name: parts[1]! };
}
if (parts.length === 3) {
return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
}
return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null; return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null;
} }
function expectedDisplayPartCount(driver: CatalogDriver): number { function parseColumnDisplay(dialect: KtxDialect, display: string): (KtxTableRef & { column: string }) | null {
if (driver === 'sqlite') {
return 1;
}
if (driver === 'bigquery' || driver === 'snowflake' || driver === 'sqlserver') {
return 3;
}
return 2;
}
function parseColumnDisplay(driver: CatalogDriver, display: string): (KtxTableRef & { column: string }) | null {
const parts = splitDisplay(display); const parts = splitDisplay(display);
const tablePartCount = expectedDisplayPartCount(driver); const tablePartCount = dialect.columnDisplayTablePartCount();
if (parts.length !== tablePartCount + 1) { if (parts.length !== tablePartCount + 1) {
return null; return null;
} }
@@ -175,7 +151,7 @@ function parseColumnDisplay(driver: CatalogDriver, display: string): (KtxTableRe
if (!column) { if (!column) {
return null; return null;
} }
const table = parseDisplay(driver, parts.slice(0, -1).join('.')); const table = dialect.parseDisplayRef(parts.slice(0, -1).join('.'));
return table ? { ...table, column } : null; return table ? { ...table, column } : null;
} }
@@ -272,6 +248,7 @@ export class WarehouseCatalogService {
if (!table) { if (!table) {
return null; return null;
} }
const dialect = getDialectForDriver(catalog.driver);
const profileTables = catalog.profile?.tables ?? []; const profileTables = catalog.profile?.tables ?? [];
const profileTable = profileTables.find((candidate) => candidate.table && refsEqual(candidate.table, table)); const profileTable = profileTables.find((candidate) => candidate.table && refsEqual(candidate.table, table));
const profileColumns = catalog.profile?.columns ?? {}; const profileColumns = catalog.profile?.columns ?? {};
@ -281,7 +258,7 @@ export class WarehouseCatalogService {
catalog: table.catalog, catalog: table.catalog,
db: table.db, db: table.db,
name: table.name, name: table.name,
display: formatDisplay(catalog.driver, table), display: formatDisplay(dialect, table),
kind: table.kind, kind: table.kind,
comment: table.comment, comment: table.comment,
description: firstDescription(table.descriptions), description: firstDescription(table.descriptions),
@ -321,16 +298,21 @@ export class WarehouseCatalogService {
if (!catalog) { if (!catalog) {
return { resolved: null, candidates: [], dialect: 'unknown' }; return { resolved: null, candidates: [], dialect: 'unknown' };
} }
const dialect = getDialectForDriver(catalog.driver).type; const dialect = getDialectForDriver(catalog.driver);
const parsed = parseDisplay(catalog.driver, display); const parsed = parseDisplay(dialect, display);
if (!parsed) { if (!parsed) {
return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect }; return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect: dialect.type };
} }
const table = catalog.tables.find((candidate) => refsEqual(candidate, parsed)); const exactTable = catalog.tables.find((candidate) => refsEqual(candidate, parsed));
const looseNameMatches =
parsed.catalog === null && parsed.db === null
? catalog.tables.filter((candidate) => normalize(candidate.name) === normalize(parsed.name))
: [];
const table = exactTable ?? (looseNameMatches.length === 1 ? looseNameMatches[0] : undefined);
if (!table) { if (!table) {
return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect }; return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect: dialect.type };
} }
return { resolved: { catalog: table.catalog, db: table.db, name: table.name }, candidates: [], dialect }; return { resolved: { catalog: table.catalog, db: table.db, name: table.name }, candidates: [], dialect: dialect.type };
} }
async resolveDisplayTarget(connectionId: string, display: string): Promise<DisplayTargetResolution> { async resolveDisplayTarget(connectionId: string, display: string): Promise<DisplayTargetResolution> {
@ -339,20 +321,20 @@ export class WarehouseCatalogService {
return { resolved: null, candidates: [], dialect: 'unknown' }; return { resolved: null, candidates: [], dialect: 'unknown' };
} }
const dialect = getDialectForDriver(catalog.driver).type; const dialect = getDialectForDriver(catalog.driver);
const tableResolution = await this.resolveDisplay(connectionId, display); const tableResolution = await this.resolveDisplay(connectionId, display);
if (tableResolution.resolved) { if (tableResolution.resolved) {
return tableResolution; return tableResolution;
} }
const parsedColumn = parseColumnDisplay(catalog.driver, display); const parsedColumn = parseColumnDisplay(dialect, display);
if (!parsedColumn) { if (!parsedColumn) {
return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect }; return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect: dialect.type };
} }
const table = catalog.tables.find((candidate) => refsEqual(candidate, parsedColumn)); const table = catalog.tables.find((candidate) => refsEqual(candidate, parsedColumn));
if (!table) { if (!table) {
return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect }; return { resolved: null, candidates: bestCandidates(catalog.tables, display), dialect: dialect.type };
} }
return { return {
@ -363,7 +345,7 @@ export class WarehouseCatalogService {
column: parsedColumn.column, column: parsedColumn.column,
}, },
candidates: [], candidates: [],
dialect, dialect: dialect.type,
}; };
} }
@ -372,6 +354,7 @@ export class WarehouseCatalogService {
if (!catalog) { if (!catalog) {
return []; return [];
} }
const dialect = getDialectForDriver(catalog.driver);
const hits: RawSchemaHit[] = []; const hits: RawSchemaHit[] = [];
for (const table of catalog.tables as TableWithDescriptions[]) { for (const table of catalog.tables as TableWithDescriptions[]) {
const tableMatch = matchedOnTable(table, query); const tableMatch = matchedOnTable(table, query);
@ -380,7 +363,7 @@ export class WarehouseCatalogService {
kind: 'table', kind: 'table',
connectionId, connectionId,
ref: { catalog: table.catalog, db: table.db, name: table.name }, ref: { catalog: table.catalog, db: table.db, name: table.name },
display: formatDisplay(catalog.driver, table), display: formatDisplay(dialect, table),
matchedOn: tableMatch, matchedOn: tableMatch,
}); });
} }
@ -393,7 +376,7 @@ export class WarehouseCatalogService {
kind: 'column', kind: 'column',
connectionId, connectionId,
ref: { catalog: table.catalog, db: table.db, name: table.name, column: column.name }, ref: { catalog: table.catalog, db: table.db, name: table.name, column: column.name },
display: `${formatDisplay(catalog.driver, table)}.${column.name}`, display: `${formatDisplay(dialect, table)}.${column.name}`,
matchedOn: columnMatch, matchedOn: columnMatch,
}); });
} }
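
Review note: the `resolveDisplay` hunk above adds a fallback so that a bare table name resolves when exactly one catalog table carries that name. A minimal sketch of that rule follows, assuming a case-insensitive `normalize` and a field-by-field `refsEqual`. Both helper bodies are hypothetical, since this diff only shows their call sites, and every name ending in `Sketch` is illustrative.

```typescript
// Hypothetical shapes and helpers; only the resolution rule itself is taken from the diff.
interface TableRefSketch {
  catalog: string | null;
  db: string | null;
  name: string;
}

// Assumption: normalization is trim + lowercase. The real `normalize` is not shown in this diff.
function normalizeSketch(value: string | null): string {
  return (value ?? '').trim().toLowerCase();
}

// Assumption: refs are equal when all three normalized parts match.
function refsEqualSketch(a: TableRefSketch, b: TableRefSketch): boolean {
  return (
    normalizeSketch(a.catalog) === normalizeSketch(b.catalog) &&
    normalizeSketch(a.db) === normalizeSketch(b.db) &&
    normalizeSketch(a.name) === normalizeSketch(b.name)
  );
}

// Exact match first; otherwise accept a bare name only when it is unambiguous.
function resolveSketch(tables: readonly TableRefSketch[], parsed: TableRefSketch): TableRefSketch | undefined {
  const exactTable = tables.find((candidate) => refsEqualSketch(candidate, parsed));
  const looseNameMatches =
    parsed.catalog === null && parsed.db === null
      ? tables.filter((candidate) => normalizeSketch(candidate.name) === normalizeSketch(parsed.name))
      : [];
  return exactTable ?? (looseNameMatches.length === 1 ? looseNameMatches[0] : undefined);
}

// Made-up catalog contents for illustration.
const catalogTablesSketch: TableRefSketch[] = [
  { catalog: null, db: 'public', name: 'orders' },
  { catalog: null, db: 'public', name: 'users' },
  { catalog: null, db: 'audit', name: 'users' },
];

const uniqueHit = resolveSketch(catalogTablesSketch, { catalog: null, db: null, name: 'Orders' });
const ambiguousHit = resolveSketch(catalogTablesSketch, { catalog: null, db: null, name: 'users' });
const exactHit = resolveSketch(catalogTablesSketch, { catalog: null, db: 'audit', name: 'users' });
```

In this sketch an ambiguous bare name stays unresolved, which is the case where the diff falls through to `bestCandidates`.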


@ -1,3 +1,4 @@
import { parseDottedTableEntry } from './context/scan/enabled-tables.js';
import type { KtxTableListEntry } from './context/scan/types.js'; import type { KtxTableListEntry } from './context/scan/types.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { profileMark } from './startup-profile.js'; import { profileMark } from './startup-profile.js';
@ -73,7 +74,9 @@ export interface PickDatabaseScopeArgs {
} }
function qualifiedTableId(entry: KtxTableListEntry): string { function qualifiedTableId(entry: KtxTableListEntry): string {
return `${entry.schema}.${entry.name}`; return entry.catalog !== null
? `${entry.catalog}.${entry.schema}.${entry.name}`
: `${entry.schema}.${entry.name}`;
} }
function tableTitle(entry: KtxTableListEntry): string { function tableTitle(entry: KtxTableListEntry): string {
@ -177,7 +180,8 @@ function schemasFromEnabledTables(enabledTables: readonly string[]): string[] {
const seen = new Set<string>(); const seen = new Set<string>();
const result: string[] = []; const result: string[] = [];
for (const qualified of enabledTables) { for (const qualified of enabledTables) {
const schema = qualified.split('.')[0] ?? ''; const ref = parseDottedTableEntry(qualified);
const schema = ref?.db ?? '';
if (schema.length === 0 || seen.has(schema)) continue; if (schema.length === 0 || seen.has(schema)) continue;
seen.add(schema); seen.add(schema);
result.push(schema); result.push(schema);
@ -228,11 +232,14 @@ async function runStageTwoTreePicker(input: {
? initialSelectionForExisting(args.existing.enabledTables, byId) ? initialSelectionForExisting(args.existing.enabledTables, byId)
: initialSelectionFromDefaults(selectedSchemas, schemaIds); : initialSelectionFromDefaults(selectedSchemas, schemaIds);
const initialState = buildInitialState({ const initialState = {
tree, ...buildInitialState({
existingSelectedIds: initialSelection, tree,
skipEmptyAction: 'save-empty', existingSelectedIds: initialSelection,
}); skipEmptyAction: 'save-empty',
}),
expanded: new Set(schemaIds),
};
const schemaWordPlural = schemaCount === 1 ? args.schemaNoun : args.schemaNounPlural; const schemaWordPlural = schemaCount === 1 ? args.schemaNoun : args.schemaNounPlural;
const subtitleLines = [ const subtitleLines = [
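
Review note: once `qualifiedTableId` can emit three-part ids, taking the first dotted segment no longer yields the schema, which is why `schemasFromEnabledTables` now parses the entry. The sketch below contrasts the two approaches. `parseDottedSketch` is a hypothetical stand-in for `parseDottedTableEntry` that assumes identifiers never contain dots, and the table ids are made up.

```typescript
// Hypothetical stand-in for parseDottedTableEntry; assumes no quoted or dotted identifiers.
interface DottedRefSketch {
  catalog: string | null;
  db: string | null;
  name: string;
}

function parseDottedSketch(qualified: string): DottedRefSketch | null {
  const parts = qualified.split('.');
  if (parts.some((part) => part.length === 0)) return null;
  if (parts.length === 3) return { catalog: parts[0]!, db: parts[1]!, name: parts[2]! };
  if (parts.length === 2) return { catalog: null, db: parts[0]!, name: parts[1]! };
  return parts.length === 1 ? { catalog: null, db: null, name: parts[0]! } : null;
}

// Keeps first occurrences, drops empty strings, preserves order.
function uniqueNonEmptySketch(values: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const value of values) {
    if (value.length === 0 || seen.has(value)) continue;
    seen.add(value);
    result.push(value);
  }
  return result;
}

// Old behavior: first dotted segment, which is the catalog for three-part ids.
function schemasBySplitSketch(enabledTables: readonly string[]): string[] {
  return uniqueNonEmptySketch(enabledTables.map((qualified) => qualified.split('.')[0] ?? ''));
}

// New behavior: parse the entry and read the schema (`db`) part.
function schemasByParseSketch(enabledTables: readonly string[]): string[] {
  return uniqueNonEmptySketch(enabledTables.map((qualified) => parseDottedSketch(qualified)?.db ?? ''));
}

// Made-up three-part ids of the kind a catalog-aware picker would store.
const enabledSketch = ['analytics-prod.sales.orders', 'analytics-prod.sales.items', 'analytics-prod.hr.people'];
const schemasOld = schemasBySplitSketch(enabledSketch);
const schemasNew = schemasByParseSketch(enabledSketch);
```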


@ -0,0 +1,28 @@
export function describeError(error: unknown): string {
if (!(error instanceof Error)) {
const text = String(error);
return text.length > 0 ? text : 'unknown error';
}
const parts: string[] = [];
if (error.message.length > 0) {
parts.push(error.message);
}
const seen = new Set<unknown>([error]);
let cause: unknown = error.cause;
while (cause && !seen.has(cause)) {
seen.add(cause);
if (cause instanceof Error) {
if (cause.message.length > 0) {
parts.push(cause.message);
}
cause = cause.cause;
} else {
const text = String(cause);
if (text.length > 0) {
parts.push(text);
}
break;
}
}
return parts.length > 0 ? parts.join(': ') : 'unknown error';
}
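
Review note: `describeError` walks the `cause` chain and joins the messages with `: `, guarding against cycles with a `seen` set. The sketch below is a condensed copy of the function so it runs on its own, followed by a few hypothetical inputs. `cause` is attached with `Object.assign` so the example does not depend on ES2022 `Error` constructor options; the error messages are made up.

```typescript
// Condensed copy of describeError from this diff, renamed so it can run standalone.
function describeErrorSketch(error: unknown): string {
  if (!(error instanceof Error)) {
    const text = String(error);
    return text.length > 0 ? text : 'unknown error';
  }
  const parts: string[] = [];
  if (error.message.length > 0) parts.push(error.message);
  const seen = new Set<unknown>([error]);
  let cause: unknown = (error as { cause?: unknown }).cause;
  while (cause && !seen.has(cause)) {
    seen.add(cause);
    if (cause instanceof Error) {
      if (cause.message.length > 0) parts.push(cause.message);
      cause = (cause as { cause?: unknown }).cause;
    } else {
      // A non-Error cause ends the chain after contributing its string form.
      const text = String(cause);
      if (text.length > 0) parts.push(text);
      break;
    }
  }
  return parts.length > 0 ? parts.join(': ') : 'unknown error';
}

// Hypothetical errors for illustration only.
const socketErrorSketch = new Error('connect ECONNREFUSED 127.0.0.1:8765');
const fetchErrorSketch = Object.assign(new Error('fetch failed'), { cause: socketErrorSketch });

// A self-referential cause; the `seen` set stops the walk after the first message.
const cyclicErrorSketch: Error & { cause?: unknown } = new Error('loop');
cyclicErrorSketch.cause = cyclicErrorSketch;

const chainedText = describeErrorSketch(fetchErrorSketch);
const cyclicText = describeErrorSketch(cyclicErrorSketch);
const emptyText = describeErrorSketch('');
```

Compared with the previous `error.message` fallback, a chained error would surface its underlying cause text instead of only the outermost message.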


@ -1,3 +1,4 @@
import { describeError } from '../error-message.js';
import { createKtxEmbeddingProvider, type KtxEmbeddingProviderDeps } from './embedding-provider.js'; import { createKtxEmbeddingProvider, type KtxEmbeddingProviderDeps } from './embedding-provider.js';
import type { KtxEmbeddingConfig } from './types.js'; import type { KtxEmbeddingConfig } from './types.js';
@ -48,7 +49,6 @@ export async function runKtxEmbeddingHealthCheck(
     }
     return { ok: true };
   } catch (error) {
-    const message = error instanceof Error ? error.message : String(error);
-    return { ok: false, message: redactHealthCheckMessage(message, config) };
+    return { ok: false, message: redactHealthCheckMessage(describeError(error), config) };
   }
 }


@ -1,7 +1,11 @@
import {
getDriverRegistration,
listSupportedDrivers,
} from './context/connections/drivers.js';
import type { KtxLocalProject } from './context/project/project.js'; import type { KtxLocalProject } from './context/project/project.js';
import type { KtxScanConnector } from './context/scan/types.js'; import type { KtxScanConnector } from './context/scan/types.js';
const SUPPORTED_DRIVERS = 'sqlite, postgres, mysql, clickhouse, sqlserver, bigquery, snowflake'; const SUPPORTED_DRIVERS = listSupportedDrivers().join(', ');
export async function createKtxCliScanConnector( export async function createKtxCliScanConnector(
project: KtxLocalProject, project: KtxLocalProject,
@ -17,58 +21,23 @@ export async function createKtxCliScanConnector(
`Connection "${connectionId}" has no \`driver\` field in ktx.yaml. Supported drivers: ${SUPPORTED_DRIVERS}.`, `Connection "${connectionId}" has no \`driver\` field in ktx.yaml. Supported drivers: ${SUPPORTED_DRIVERS}.`,
); );
} }
if (driver === 'sqlite') {
const { KtxSqliteScanConnector, isKtxSqliteConnectionConfig } = await import('./connectors/sqlite/connector.js');; const registration = getDriverRegistration(driver);
if (!isKtxSqliteConnectionConfig(connection)) { if (!registration) {
throw invalidConnectionConfigError(connectionId, driver); throw new Error(
} `Connection "${connectionId}" uses driver "${driver}", which has no native standalone KTX scan connector. Supported drivers: ${SUPPORTED_DRIVERS}.`,
return new KtxSqliteScanConnector({ connectionId, connection, projectDir: project.projectDir }); );
} }
if (driver === 'postgres') {
const { KtxPostgresScanConnector, isKtxPostgresConnectionConfig } = await import('./connectors/postgres/connector.js');; const connectorModule = await registration.load();
if (!isKtxPostgresConnectionConfig(connection)) { if (!connectorModule.isConnectionConfig(connection)) {
throw invalidConnectionConfigError(connectionId, driver); throw invalidConnectionConfigError(connectionId, driver);
}
return new KtxPostgresScanConnector({ connectionId, connection });
} }
if (driver === 'mysql') { return connectorModule.createScanConnector({
const { KtxMysqlScanConnector, isKtxMysqlConnectionConfig } = await import('./connectors/mysql/connector.js');; connectionId,
if (!isKtxMysqlConnectionConfig(connection)) { connection,
throw invalidConnectionConfigError(connectionId, driver); projectDir: project.projectDir,
} });
return new KtxMysqlScanConnector({ connectionId, connection });
}
if (driver === 'clickhouse') {
const { KtxClickHouseScanConnector, isKtxClickHouseConnectionConfig } = await import('./connectors/clickhouse/connector.js');;
if (!isKtxClickHouseConnectionConfig(connection)) {
throw invalidConnectionConfigError(connectionId, driver);
}
return new KtxClickHouseScanConnector({ connectionId, connection });
}
if (driver === 'sqlserver') {
const { KtxSqlServerScanConnector, isKtxSqlServerConnectionConfig } = await import('./connectors/sqlserver/connector.js');;
if (!isKtxSqlServerConnectionConfig(connection)) {
throw invalidConnectionConfigError(connectionId, driver);
}
return new KtxSqlServerScanConnector({ connectionId, connection });
}
if (driver === 'bigquery') {
const { KtxBigQueryScanConnector, isKtxBigQueryConnectionConfig } = await import('./connectors/bigquery/connector.js');;
if (!isKtxBigQueryConnectionConfig(connection)) {
throw invalidConnectionConfigError(connectionId, driver);
}
return new KtxBigQueryScanConnector({ connectionId, connection });
}
if (driver === 'snowflake') {
const { KtxSnowflakeScanConnector, isKtxSnowflakeConnectionConfig } = await import('./connectors/snowflake/connector.js');;
if (!isKtxSnowflakeConnectionConfig(connection)) {
throw invalidConnectionConfigError(connectionId, driver);
}
return new KtxSnowflakeScanConnector({ connectionId, connection, projectDir: project.projectDir });
}
throw new Error(
`Connection "${connectionId}" uses driver "${driver}", which has no native standalone KTX scan connector. Supported drivers: ${SUPPORTED_DRIVERS}.`,
);
} }
function invalidConnectionConfigError(connectionId: string, driver: string): Error { function invalidConnectionConfigError(connectionId: string, driver: string): Error {
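
Review note: the per-driver `if` ladder is replaced by a lookup in `./context/connections/drivers.js`, which this diff does not show. The sketch below is one possible shape for such a registry, assuming a plain `Map` keyed by driver name. Only the member names (`load`, `isConnectionConfig`, `createScanConnector`, `scopeConfigKey`) come from the call sites in this diff; the registered entries, the fake connector module, and all `Sketch` names are hypothetical.

```typescript
// Hypothetical shapes inferred from call sites; the real types live in drivers.js.
interface ScanConnectorSketch {
  listSchemas(): string[];
}

interface ConnectorModuleSketch {
  isConnectionConfig(connection: unknown): boolean;
  createScanConnector(input: { connectionId: string; connection: unknown; projectDir: string }): ScanConnectorSketch;
}

interface DriverRegistrationSketch {
  scopeConfigKey?: string;
  load(): Promise<ConnectorModuleSketch>;
}

// Stand-in for a lazily imported connector module.
function fakeModuleSketch(schemas: string[]): ConnectorModuleSketch {
  return {
    isConnectionConfig: (connection) => typeof connection === 'object' && connection !== null,
    createScanConnector: () => ({ listSchemas: () => schemas }),
  };
}

// Assumption: insertion order defines the "supported drivers" listing.
const registrySketch = new Map<string, DriverRegistrationSketch>();
registrySketch.set('postgres', { scopeConfigKey: 'schemas', load: () => Promise.resolve(fakeModuleSketch(['public'])) });
registrySketch.set('bigquery', { scopeConfigKey: 'dataset_ids', load: () => Promise.resolve(fakeModuleSketch(['analytics'])) });
registrySketch.set('sqlite', { load: () => Promise.resolve(fakeModuleSketch([])) });

function listSupportedDriversSketch(): string[] {
  return Array.from(registrySketch.keys());
}

function getDriverRegistrationSketch(driver: string): DriverRegistrationSketch | undefined {
  return registrySketch.get(driver);
}

// Same control flow as the new scriptedScopeConfigForDriver in this diff.
function scopeConfigSketch(driver: string, databaseSchemas: string[]): Record<string, unknown> {
  if (databaseSchemas.length === 0) return {};
  const registration = getDriverRegistrationSketch(driver);
  if (!registration?.scopeConfigKey) return {};
  return { [registration.scopeConfigKey]: databaseSchemas };
}

const supportedSketch = listSupportedDriversSketch().join(', ');
const bigqueryScopeSketch = scopeConfigSketch('bigquery', ['analytics']);
const sqliteScopeSketch = scopeConfigSketch('sqlite', ['main']);
```

With this shape, adding a driver means adding one registry entry, and the error text built from `listSupportedDrivers()` cannot drift from the set of drivers that actually load.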


@ -1,5 +1,6 @@
import type { KtxEmbeddingConfig } from './llm/types.js'; import type { KtxEmbeddingConfig } from './llm/types.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { writePrefixedLines } from './clack.js';
import { import {
ensureManagedPythonCommandRuntime, ensureManagedPythonCommandRuntime,
type KtxManagedPythonInstallPolicy, type KtxManagedPythonInstallPolicy,
@ -73,7 +74,7 @@ export async function ensureManagedLocalEmbeddingsDaemon(
   });
   const verb = daemon.status === 'started' ? 'Started' : 'Using';
-  options.io.stderr.write(`${verb} KTX daemon: ${daemon.baseUrl}\n`);
+  writePrefixedLines((chunk) => options.io.stderr.write(chunk), `${verb} KTX daemon: ${daemon.baseUrl}`);
   return {
     baseUrl: daemon.baseUrl,


@ -4,6 +4,7 @@ import { createServer } from 'node:net';
import { setTimeout as delay } from 'node:timers/promises'; import { setTimeout as delay } from 'node:timers/promises';
import { promisify } from 'node:util'; import { promisify } from 'node:util';
import { z } from 'zod'; import { z } from 'zod';
import { describeError } from './error-message.js';
import { import {
installManagedPythonRuntime, installManagedPythonRuntime,
managedPythonDaemonLayout, managedPythonDaemonLayout,
@ -16,6 +17,17 @@ import {
} from './managed-python-runtime.js'; } from './managed-python-runtime.js';
import { sanitizeChildProxyEnv } from './proxy-env.js'; import { sanitizeChildProxyEnv } from './proxy-env.js';
export class ManagedPythonDaemonStartError extends Error {
readonly detail: string;
readonly stderrLog: string;
constructor(detail: string, stderrLog: string) {
super(`KTX daemon failed to start: ${detail}. stderr: ${stderrLog}`);
this.name = 'ManagedPythonDaemonStartError';
this.detail = detail;
this.stderrLog = stderrLog;
}
}
export interface ManagedPythonDaemonState { export interface ManagedPythonDaemonState {
schemaVersion: 1; schemaVersion: 1;
pid: number; pid: number;
@ -237,7 +249,7 @@ async function healthOk(input: {
     }
     return { ok: true };
   } catch (error) {
-    return { ok: false, detail: error instanceof Error ? error.message : String(error) };
+    return { ok: false, detail: describeError(error) };
   }
 }
@ -328,7 +340,7 @@ async function waitForHealth(input: {
     return;
   }
   lastDetail = finalHealth.detail;
-  throw new Error(`KTX daemon failed to start: ${lastDetail}. stderr: ${input.state.stderrLog}`);
+  throw new ManagedPythonDaemonStartError(lastDetail, input.state.stderrLog);
 }
 async function removeState(layout: ManagedPythonDaemonLayout): Promise<void> {
@ -721,13 +733,21 @@ export async function startManagedPythonDaemon(
       stdoutLog: layout.daemonStdoutPath,
       stderrLog: layout.daemonStderrPath,
     };
-    await waitForHealth({
-      state,
-      cliVersion: options.cliVersion,
-      fetch: fetchImpl,
-      timeoutMs: options.startupTimeoutMs ?? 10_000,
-      pollIntervalMs: options.pollIntervalMs ?? 100,
-    });
+    try {
+      await waitForHealth({
+        state,
+        cliVersion: options.cliVersion,
+        fetch: fetchImpl,
+        timeoutMs: options.startupTimeoutMs ?? 30_000,
+        pollIntervalMs: options.pollIntervalMs ?? 100,
+      });
+    } catch (error) {
+      if (processAlive(state.pid)) {
+        killProcess(state.pid);
+      }
+      await removeState(layout);
+      throw error;
+    }
     await writeState(layout.daemonStatePath, state);
     return { status: 'started', layout, state, baseUrl: baseUrl(state) };
   } finally {
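
Review note: the hunk above makes a failed health wait clean up after itself (kill the child if it is still alive, remove the state file) and then rethrow the typed start error. The sketch below is a synchronous simplification of that control flow with injected dependencies, so it can run without spawning a process. The class mirrors the fields of `ManagedPythonDaemonStartError` from this diff; the dependency bundle, the detail text, and the log path are hypothetical.

```typescript
// Mirrors the fields of the error class added in this diff, under a sketch name.
class DaemonStartErrorSketch extends Error {
  readonly detail: string;
  readonly stderrLog: string;
  constructor(detail: string, stderrLog: string) {
    super(`KTX daemon failed to start: ${detail}. stderr: ${stderrLog}`);
    this.name = 'DaemonStartErrorSketch';
    this.detail = detail;
    this.stderrLog = stderrLog;
  }
}

// Hypothetical dependency bundle standing in for the real async helpers.
interface StartDepsSketch {
  waitForHealth(): void;
  processAlive(): boolean;
  killProcess(): void;
  removeState(): void;
  writeState(): void;
}

// On a failed health wait: kill the child if alive, drop the state, rethrow unchanged.
function startSketch(deps: StartDepsSketch): 'started' {
  try {
    deps.waitForHealth();
  } catch (error) {
    if (deps.processAlive()) {
      deps.killProcess();
    }
    deps.removeState();
    throw error;
  }
  deps.writeState();
  return 'started';
}

const callsSketch: string[] = [];
let caughtSketch: unknown;
try {
  startSketch({
    waitForHealth: () => {
      // Made-up failure detail and log path.
      throw new DaemonStartErrorSketch('health check timed out', '/tmp/ktx-daemon.stderr.log');
    },
    processAlive: () => true,
    killProcess: () => { callsSketch.push('kill'); },
    removeState: () => { callsSketch.push('removeState'); },
    writeState: () => { callsSketch.push('writeState'); },
  });
} catch (error) {
  caughtSketch = error;
}
```

Because the state file is written only after the health wait succeeds, a failed start in this sketch leaves neither a stale state record nor an orphaned process behind.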


@ -7,6 +7,7 @@ import type { LookerTableIdentifierParser } from './context/ingest/adapters/look
import { createHttpSqlAnalysisPort, type KtxSqlAnalysisHttpJsonRunner } from './context/sql-analysis/http-sql-analysis-port.js'; import { createHttpSqlAnalysisPort, type KtxSqlAnalysisHttpJsonRunner } from './context/sql-analysis/http-sql-analysis-port.js';
import type { SqlAnalysisPort } from './context/sql-analysis/ports.js'; import type { SqlAnalysisPort } from './context/sql-analysis/ports.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { writePrefixedLines } from './clack.js';
import { import {
ensureManagedPythonCommandRuntime, ensureManagedPythonCommandRuntime,
type KtxManagedPythonInstallPolicy, type KtxManagedPythonInstallPolicy,
@ -137,7 +138,7 @@ export function createManagedPythonDaemonBaseUrlResolver(
       force: false,
     });
     const verb = daemon.status === 'started' ? 'Started' : 'Using existing';
-    options.io.stderr.write(`${verb} KTX daemon: ${daemon.baseUrl}\n`);
+    writePrefixedLines((chunk) => options.io.stderr.write(chunk), `${verb} KTX daemon: ${daemon.baseUrl}`);
     cachedBaseUrl = daemon.baseUrl;
     return cachedBaseUrl;
   };


@ -10,6 +10,7 @@ import { markKtxSetupStateStepComplete } from './context/project/setup-config.js
import { serializeKtxProjectConfig } from './context/project/config.js'; import { serializeKtxProjectConfig } from './context/project/config.js';
import { strToU8, zipSync } from 'fflate'; import { strToU8, zipSync } from 'fflate';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { errorMessage, writePrefixedLines } from './clack.js';
import { import {
createKtxSetupPromptAdapter, createKtxSetupPromptAdapter,
createKtxSetupUiAdapter, createKtxSetupUiAdapter,
@ -1230,7 +1231,7 @@ export async function runKtxSetupAgentsStep(
     }
     return { status: 'ready', projectDir: args.projectDir, installs, nextActions };
   } catch (error) {
-    io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`);
+    writePrefixedLines((chunk) => io.stderr.write(chunk), errorMessage(error));
     return { status: 'failed', projectDir: args.projectDir };
   }
 }


@ -5,6 +5,7 @@ import { type KtxLocalProject, loadKtxProject } from './context/project/project.
import { markKtxSetupStateStepComplete, readKtxSetupState } from './context/project/setup-config.js'; import { markKtxSetupStateStepComplete, readKtxSetupState } from './context/project/setup-config.js';
import { serializeKtxProjectConfig } from './context/project/config.js'; import { serializeKtxProjectConfig } from './context/project/config.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { errorMessage, writePrefixedLines } from './clack.js';
import { buildPublicIngestPlan } from './public-ingest.js'; import { buildPublicIngestPlan } from './public-ingest.js';
import { import {
type KtxDatabaseContextDepth, type KtxDatabaseContextDepth,
@ -745,7 +746,7 @@ export async function runKtxSetupContextStep(
     return await runBuild(args, io, deps, project, targets);
   } catch (error) {
-    io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`);
+    writePrefixedLines((chunk) => io.stderr.write(chunk), errorMessage(error));
     return { status: 'failed', projectDir: args.projectDir };
   }
 }
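
Review note: the setup steps now route their stderr output through `writePrefixedLines` from `clack.ts`, whose shared implementation is not part of this excerpt. The sketch below assumes it behaves like the local helper removed from `setup-databases.ts` (split on newlines, skip empty lines) with a rail prefix prepended. The prefix string `'|  '` and the sample output are hypothetical.

```typescript
// Assumption: the shared writer prepends a rail prefix; the real prefix in clack.ts is not shown here.
const RAIL_PREFIX_SKETCH = '|  ';

// Same line handling as the helper removed from setup-databases.ts, plus the assumed prefix.
function writePrefixedLinesSketch(
  write: (chunk: string) => void,
  output: string,
  prefix: string = RAIL_PREFIX_SKETCH,
): void {
  for (const line of output.split(/\r?\n/)) {
    if (line.length > 0) {
      write(`${prefix}${line}\n`);
    }
  }
}

// Collect chunks in memory instead of writing to a real stream.
const writtenChunksSketch: string[] = [];
writePrefixedLinesSketch(
  (chunk) => { writtenChunksSketch.push(chunk); },
  'Started KTX daemon: http://127.0.0.1:8765\n\nready\r\n',
);
```

Callers pass the message without a trailing newline, as the updated call sites in this diff do, because the writer terminates each emitted line itself.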


@ -3,6 +3,7 @@ import { readFile, writeFile } from 'node:fs/promises';
import { delimiter, dirname, join } from 'node:path'; import { delimiter, dirname, join } from 'node:path';
import { fileURLToPath } from 'node:url'; import { fileURLToPath } from 'node:url';
import { promisify } from 'node:util'; import { promisify } from 'node:util';
import { getDriverRegistration } from './context/connections/drivers.js';
import { queryHistoryDialectForConnection } from './context/ingest/adapters/historic-sql/connection-dialect.js'; import { queryHistoryDialectForConnection } from './context/ingest/adapters/historic-sql/connection-dialect.js';
import type { HistoricSqlDialect } from './context/ingest/adapters/historic-sql/types.js'; import type { HistoricSqlDialect } from './context/ingest/adapters/historic-sql/types.js';
import { import {
@ -15,6 +16,11 @@ import { loadKtxProject } from './context/project/project.js';
import { markKtxSetupStateStepComplete, setKtxSetupDatabaseConnectionIds } from './context/project/setup-config.js'; import { markKtxSetupStateStepComplete, setKtxSetupDatabaseConnectionIds } from './context/project/setup-config.js';
import type { KtxTableListEntry } from './context/scan/types.js'; import type { KtxTableListEntry } from './context/scan/types.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import {
errorMessage,
flushPrefixedBufferedCommandOutput,
writePrefixedLines,
} from './clack.js';
import { runKtxConnection } from './connection.js'; import { runKtxConnection } from './connection.js';
import { import {
pickDatabaseScope as defaultPickDatabaseScope, pickDatabaseScope as defaultPickDatabaseScope,
@ -112,13 +118,13 @@ export interface KtxSetupDatabasesDeps {
} }
const DRIVER_OPTIONS: Array<{ value: KtxSetupDatabaseDriver; label: string }> = [ const DRIVER_OPTIONS: Array<{ value: KtxSetupDatabaseDriver; label: string }> = [
{ value: 'sqlite', label: 'SQLite' },
{ value: 'postgres', label: 'PostgreSQL' }, { value: 'postgres', label: 'PostgreSQL' },
{ value: 'bigquery', label: 'BigQuery' },
{ value: 'snowflake', label: 'Snowflake' },
{ value: 'mysql', label: 'MySQL' }, { value: 'mysql', label: 'MySQL' },
{ value: 'clickhouse', label: 'ClickHouse' }, { value: 'clickhouse', label: 'ClickHouse' },
{ value: 'sqlserver', label: 'SQL Server' }, { value: 'sqlserver', label: 'SQL Server' },
{ value: 'bigquery', label: 'BigQuery' }, { value: 'sqlite', label: 'SQLite' },
{ value: 'snowflake', label: 'Snowflake' },
]; ];
const DRIVER_LABELS = Object.fromEntries(DRIVER_OPTIONS.map((option) => [option.value, option.label])) as Record< const DRIVER_LABELS = Object.fromEntries(DRIVER_OPTIONS.map((option) => [option.value, option.label])) as Record<
@ -220,7 +226,7 @@ const SCOPE_DISCOVERY_SPECS: Partial<Record<KtxSetupDatabaseDriver, ScopeDiscove
}; };
type UrlDriverType = Extract<KtxSetupDatabaseDriver, 'postgres' | 'mysql' | 'clickhouse' | 'sqlserver'>; type UrlDriverType = Extract<KtxSetupDatabaseDriver, 'postgres' | 'mysql' | 'clickhouse' | 'sqlserver'>;
type ConnectionSetupStatus = 'ready' | 'back' | 'failed'; type ConnectionSetupStatus = 'ready' | 'back' | 'failed' | 'failed-query-history-unavailable';
const DRIVER_CONNECTION_DEFAULTS: Record<UrlDriverType, { port: string }> = { const DRIVER_CONNECTION_DEFAULTS: Record<UrlDriverType, { port: string }> = {
postgres: { port: '5432' }, postgres: { port: '5432' },
@ -361,74 +367,18 @@ async function defaultListSchemas(projectDir: string, connectionId: string): Pro
const project = await loadKtxProject({ projectDir }); const project = await loadKtxProject({ projectDir });
const connection = project.config.connections[connectionId]; const connection = project.config.connections[connectionId];
const driver = normalizeDriver(connection?.driver); const driver = normalizeDriver(connection?.driver);
const registration = driver ? getDriverRegistration(driver) : undefined;
if (!registration) return [];
if (driver === 'postgres') { const connectorModule = await registration.load();
const { KtxPostgresScanConnector, isKtxPostgresConnectionConfig } = await import('./connectors/postgres/connector.js');; if (!connectorModule.isConnectionConfig(connection)) return [];
if (!isKtxPostgresConnectionConfig(connection)) return [];
const connector = new KtxPostgresScanConnector({ connectionId, connection }); const connector = connectorModule.createScanConnector({ connectionId, connection, projectDir });
try { try {
return await connector.listSchemas(); return await connector.listSchemas();
} finally { } finally {
await connector.cleanup(); await connector.cleanup?.();
}
} }
if (driver === 'sqlserver') {
const { KtxSqlServerScanConnector, isKtxSqlServerConnectionConfig } = await import('./connectors/sqlserver/connector.js');;
if (!isKtxSqlServerConnectionConfig(connection)) return [];
const connector = new KtxSqlServerScanConnector({ connectionId, connection });
try {
return await connector.listSchemas();
} finally {
await connector.cleanup();
}
}
if (driver === 'mysql') {
const { KtxMysqlScanConnector, isKtxMysqlConnectionConfig } = await import('./connectors/mysql/connector.js');;
if (!isKtxMysqlConnectionConfig(connection)) return [];
const connector = new KtxMysqlScanConnector({ connectionId, connection });
try {
return await connector.listSchemas();
} finally {
await connector.cleanup();
}
}
if (driver === 'clickhouse') {
const { KtxClickHouseScanConnector, isKtxClickHouseConnectionConfig } = await import('./connectors/clickhouse/connector.js');;
if (!isKtxClickHouseConnectionConfig(connection)) return [];
const connector = new KtxClickHouseScanConnector({ connectionId, connection });
try {
return await connector.listSchemas();
} finally {
await connector.cleanup();
}
}
if (driver === 'bigquery') {
const { KtxBigQueryScanConnector, isKtxBigQueryConnectionConfig } = await import('./connectors/bigquery/connector.js');;
if (!isKtxBigQueryConnectionConfig(connection)) return [];
const connector = new KtxBigQueryScanConnector({ connectionId, connection });
try {
return await connector.listDatasets();
} finally {
await connector.cleanup();
}
}
if (driver === 'snowflake') {
const { KtxSnowflakeScanConnector, isKtxSnowflakeConnectionConfig } = await import('./connectors/snowflake/connector.js');;
if (!isKtxSnowflakeConnectionConfig(connection)) return [];
const connector = new KtxSnowflakeScanConnector({ connectionId, connection, projectDir });
try {
return await connector.listSchemas();
} finally {
await connector.cleanup();
}
}
return [];
} }
function configuredSchemas(connection: KtxProjectConnectionConfig | undefined, driver: KtxSetupDatabaseDriver): string[] | undefined { function configuredSchemas(connection: KtxProjectConnectionConfig | undefined, driver: KtxSetupDatabaseDriver): string[] | undefined {
@ -448,74 +398,18 @@ async function defaultListTables(
const connection = project.config.connections[connectionId]; const connection = project.config.connections[connectionId];
const driver = normalizeDriver(connection?.driver); const driver = normalizeDriver(connection?.driver);
const schemas = schemasOverride ?? (driver ? configuredSchemas(connection, driver) : undefined); const schemas = schemasOverride ?? (driver ? configuredSchemas(connection, driver) : undefined);
const registration = driver ? getDriverRegistration(driver) : undefined;
if (!registration) return [];
if (driver === 'postgres') { const connectorModule = await registration.load();
const { KtxPostgresScanConnector, isKtxPostgresConnectionConfig } = await import('./connectors/postgres/connector.js');; if (!connectorModule.isConnectionConfig(connection)) return [];
if (!isKtxPostgresConnectionConfig(connection)) return [];
const connector = new KtxPostgresScanConnector({ connectionId, connection }); const connector = connectorModule.createScanConnector({ connectionId, connection, projectDir });
try { try {
return await connector.listTables(schemas); return await connector.listTables(schemas);
} finally { } finally {
await connector.cleanup(); await connector.cleanup?.();
}
} }
if (driver === 'mysql') {
const { KtxMysqlScanConnector, isKtxMysqlConnectionConfig } = await import('./connectors/mysql/connector.js');;
if (!isKtxMysqlConnectionConfig(connection)) return [];
const connector = new KtxMysqlScanConnector({ connectionId, connection });
try {
return await connector.listTables(schemas);
} finally {
await connector.cleanup();
}
}
if (driver === 'sqlserver') {
const { KtxSqlServerScanConnector, isKtxSqlServerConnectionConfig } = await import('./connectors/sqlserver/connector.js');;
if (!isKtxSqlServerConnectionConfig(connection)) return [];
const connector = new KtxSqlServerScanConnector({ connectionId, connection });
try {
return await connector.listTables(schemas);
} finally {
await connector.cleanup();
}
}
if (driver === 'bigquery') {
const { KtxBigQueryScanConnector, isKtxBigQueryConnectionConfig } = await import('./connectors/bigquery/connector.js');;
if (!isKtxBigQueryConnectionConfig(connection)) return [];
const connector = new KtxBigQueryScanConnector({ connectionId, connection });
try {
return await connector.listTables(schemas);
} finally {
await connector.cleanup();
}
}
if (driver === 'snowflake') {
const { KtxSnowflakeScanConnector, isKtxSnowflakeConnectionConfig } = await import('./connectors/snowflake/connector.js');;
if (!isKtxSnowflakeConnectionConfig(connection)) return [];
const connector = new KtxSnowflakeScanConnector({ connectionId, connection, projectDir });
try {
return await connector.listTables(schemas);
} finally {
await connector.cleanup();
}
}
if (driver === 'clickhouse') {
const { KtxClickHouseScanConnector, isKtxClickHouseConnectionConfig } = await import('./connectors/clickhouse/connector.js');;
if (!isKtxClickHouseConnectionConfig(connection)) return [];
const connector = new KtxClickHouseScanConnector({ connectionId, connection });
try {
return await connector.listTables(schemas);
} finally {
await connector.cleanup();
}
}
return [];
} }
function existingConnectionIdsByDriver( function existingConnectionIdsByDriver(
@ -638,9 +532,9 @@ function scriptedScopeConfigForDriver(
databaseSchemas: string[], databaseSchemas: string[],
): Record<string, unknown> { ): Record<string, unknown> {
if (databaseSchemas.length === 0) return {}; if (databaseSchemas.length === 0) return {};
if (driver === 'bigquery') return { dataset_ids: databaseSchemas }; const registration = getDriverRegistration(driver);
if (driver === 'clickhouse') return { databases: databaseSchemas }; if (!registration?.scopeConfigKey) return {};
return { schemas: databaseSchemas }; return { [registration.scopeConfigKey]: databaseSchemas };
} }
function databaseNameFromLiteralUrl(url: string): string | undefined { function databaseNameFromLiteralUrl(url: string): string | undefined {
@ -1128,25 +1022,6 @@ function createBufferedCommandIo(): BufferedCommandIo {
}; };
} }
function flushBufferedCommandOutput(io: KtxCliIo, bufferedIo: BufferedCommandIo): void {
const stdout = bufferedIo.stdoutText();
const stderr = bufferedIo.stderrText();
if (stdout.length > 0) {
io.stdout.write(stdout);
}
if (stderr.length > 0) {
io.stderr.write(stderr);
}
}
function writePrefixedLines(write: (chunk: string) => void, output: string): void {
for (const line of output.split(/\r?\n/)) {
if (line.length > 0) {
write(`${line}\n`);
}
}
}
function envWithCurrentNodeFirst(env: NodeJS.ProcessEnv = process.env): NodeJS.ProcessEnv { function envWithCurrentNodeFirst(env: NodeJS.ProcessEnv = process.env): NodeJS.ProcessEnv {
return { return {
...env, ...env,
@ -1222,11 +1097,6 @@ async function defaultRebuildNativeSqlite(io: KtxCliIo): Promise<number> {
} }
} }
function flushPrefixedBufferedCommandOutput(io: KtxCliIo, bufferedIo: BufferedCommandIo): void {
writePrefixedLines((chunk) => io.stdout.write(chunk), bufferedIo.stdoutText());
writePrefixedLines((chunk) => io.stderr.write(chunk), bufferedIo.stderrText());
}
function nativeSqliteAbiMismatchDetail(output: string): string | null { function nativeSqliteAbiMismatchDetail(output: string): string | null {
const mentionsBetterSqlite = /\bbetter-sqlite3\b|better_sqlite3/i.test(output); const mentionsBetterSqlite = /\bbetter-sqlite3\b|better_sqlite3/i.test(output);
const mentionsAbiMismatch = /compiled against a different Node\.js version|NODE_MODULE_VERSION/i.test(output); const mentionsAbiMismatch = /compiled against a different Node\.js version|NODE_MODULE_VERSION/i.test(output);
@ -1318,6 +1188,20 @@ async function writeConnectionConfig(input: {
} }
} }
async function disableConnectionQueryHistory(projectDir: string, connectionId: string): Promise<void> {
const project = await loadKtxProject({ projectDir });
const connection = project.config.connections[connectionId];
if (!connection) {
return;
}
const existing = queryHistoryConfigRecord(connection) ?? historicSqlConfigRecord(connection) ?? {};
await writeConnectionConfig({
projectDir,
connectionId,
connection: withQueryHistoryConfig(connection, { ...existing, enabled: false }),
});
}
async function createConnectionConfigRollback(projectDir: string, connectionId: string): Promise<() => Promise<void>> { async function createConnectionConfigRollback(projectDir: string, connectionId: string): Promise<() => Promise<void>> {
const project = await loadKtxProject({ projectDir }); const project = await loadKtxProject({ projectDir });
const previousConnection = project.config.connections[connectionId]; const previousConnection = project.config.connections[connectionId];
@ -1519,9 +1403,9 @@ async function maybeConfigureDatabaseScope(input: {
input.connectionId, input.connectionId,
); );
} catch (error) { } catch (error) {
const detail = error instanceof Error ? error.message : String(error); writePrefixedLines(
input.io.stderr.write( (chunk) => input.io.stderr.write(chunk),
`Could not discover ${spec.promptLabel.toLowerCase()} for ${input.connectionId}; ${detail}\n`, `Could not discover ${spec.promptLabel.toLowerCase()} for ${input.connectionId}; ${errorMessage(error)}`,
); );
const typed = await promptCommaSeparatedScope({ const typed = await promptCommaSeparatedScope({
prompts: input.prompts, prompts: input.prompts,
@ -1573,11 +1457,12 @@ async function maybeConfigureDatabaseScope(input: {
input.io, input.io,
); );
} catch (error) { } catch (error) {
const detail = error instanceof Error ? error.message : String(error); const detail = errorMessage(error);
input.io.stderr.write( writePrefixedLines(
(chunk) => input.io.stderr.write(chunk),
input.forcePrompt === true input.forcePrompt === true
? `Could not discover tables for ${input.connectionId}; edit was not saved. ${detail}\n` ? `Could not discover tables for ${input.connectionId}; edit was not saved. ${detail}`
: `Could not discover tables for ${input.connectionId}; continuing without table filter. ${detail}\n`, : `Could not discover tables for ${input.connectionId}; continuing without table filter. ${detail}`,
); );
return input.forcePrompt === true ? 'failed' : 'ready'; return input.forcePrompt === true ? 'failed' : 'ready';
} }
@ -1665,19 +1550,19 @@ async function maybeRunHistoricSqlSetupProbe(input: {
connectionId: string; connectionId: string;
io: KtxCliIo; io: KtxCliIo;
deps: KtxSetupDatabasesDeps; deps: KtxSetupDatabasesDeps;
}): Promise<void> { }): Promise<boolean> {
const project = await loadKtxProject({ projectDir: input.projectDir }); const project = await loadKtxProject({ projectDir: input.projectDir });
const connection = project.config.connections[input.connectionId]; const connection = project.config.connections[input.connectionId];
const queryHistory = queryHistoryConfigRecord(connection) ?? historicSqlConfigRecord(connection); const queryHistory = queryHistoryConfigRecord(connection) ?? historicSqlConfigRecord(connection);
if (queryHistory?.enabled !== true) { if (queryHistory?.enabled !== true) {
return; return true;
} }
if (!connection) { if (!connection) {
return; return true;
} }
const dialect = queryHistoryDialectForConnection(connection); const dialect = queryHistoryDialectForConnection(connection);
if (!dialect) { if (!dialect) {
return; return true;
} }
input.io.stdout.write('│ Query history probe...\n'); input.io.stdout.write('│ Query history probe...\n');
@ -1696,6 +1581,7 @@ async function maybeRunHistoricSqlSetupProbe(input: {
if (!result.ok) { if (!result.ok) {
input.io.stdout.write('│ Setup written; query history will be skipped until fixed.\n'); input.io.stdout.write('│ Setup written; query history will be skipped until fixed.\n');
} }
return result.ok;
} }
async function applyHistoricSqlConfigToExistingConnection(input: { async function applyHistoricSqlConfigToExistingConnection(input: {
@ -1785,8 +1671,11 @@ async function validateAndScanConnection(input: {
const testIo = createBufferedCommandIo(); const testIo = createBufferedCommandIo();
const testCode = await testConnection(input.projectDir, input.connectionId, testIo); const testCode = await testConnection(input.projectDir, input.connectionId, testIo);
if (testCode !== 0) { if (testCode !== 0) {
flushBufferedCommandOutput(input.io, testIo); flushPrefixedBufferedCommandOutput(input.io, testIo);
input.io.stderr.write(`Connection test failed for ${input.connectionId}.\n`); writePrefixedLines(
(chunk) => input.io.stderr.write(chunk),
`Connection test failed for ${input.connectionId}.`,
);
return 'failed'; return 'failed';
} }
const testOutput = testIo.stdoutText(); const testOutput = testIo.stdoutText();
@ -1800,7 +1689,7 @@ async function validateAndScanConnection(input: {
return scopeStatus; return scopeStatus;
} }
await maybeRunHistoricSqlSetupProbe({ const queryHistoryAvailable = await maybeRunHistoricSqlSetupProbe({
projectDir: input.projectDir, projectDir: input.projectDir,
connectionId: input.connectionId, connectionId: input.connectionId,
io: input.io, io: input.io,
@ -1857,7 +1746,7 @@ async function validateAndScanConnection(input: {
); );
} }
if (scanCode !== 0) { if (scanCode !== 0) {
return 'failed'; return queryHistoryAvailable ? 'failed' : 'failed-query-history-unavailable';
} }
} }
const scanOutput = scanIo.stdoutText(); const scanOutput = scanIo.stdoutText();
@ -1999,7 +1888,10 @@ async function runPrimarySourceFullEdit(input: {
const existing = project.config.connections[input.connectionId]; const existing = project.config.connections[input.connectionId];
const driver = normalizeDriver(existing?.driver); const driver = normalizeDriver(existing?.driver);
if (!existing || !driver) { if (!existing || !driver) {
input.io.stderr.write(`Connection "${input.connectionId}" is not a configured database.\n`); writePrefixedLines(
(chunk) => input.io.stderr.write(chunk),
`Connection "${input.connectionId}" is not a configured database.`,
);
return 'failed'; return 'failed';
} }
@ -2053,7 +1945,7 @@ async function runPrimarySourceFullEdit(input: {
}); });
if (validated !== 'ready') { if (validated !== 'ready') {
await rollback(); await rollback();
return validated; return validated === 'failed-query-history-unavailable' ? 'failed' : validated;
} }
return 'ready'; return 'ready';
} }
@ -2188,7 +2080,7 @@ export async function runKtxSetupDatabasesStep(
prompts, prompts,
}); });
} catch (error) { } catch (error) {
io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`); writePrefixedLines((chunk) => io.stderr.write(chunk), errorMessage(error));
return { status: 'failed', projectDir: args.projectDir }; return { status: 'failed', projectDir: args.projectDir };
} }
if (connectionChoice === 'back') { if (connectionChoice === 'back') {
@ -2332,14 +2224,18 @@ export async function runKtxSetupDatabasesStep(
break; break;
} }
if (args.inputMode === 'disabled') return { status: 'failed', projectDir: args.projectDir }; if (args.inputMode === 'disabled') return { status: 'failed', projectDir: args.projectDir };
const failureOptions = [
{ value: 'retry', label: 'Retry connection test' },
{ value: 're-enter', label: 'Re-enter connection details' },
...(setupStatus === 'failed-query-history-unavailable'
? [{ value: 'disable-query-history', label: 'Disable query history and retry' }]
: []),
{ value: 'skip', label: 'Skip this database' },
{ value: 'back', label: 'Back' },
];
const action = await prompts.select({ const action = await prompts.select({
message: `Database setup failed for ${connectionChoice.connectionId}`, message: `Database setup failed for ${connectionChoice.connectionId}`,
options: [ options: failureOptions,
{ value: 'retry', label: 'Retry connection test' },
{ value: 're-enter', label: 'Re-enter connection details' },
{ value: 'skip', label: 'Skip this database' },
{ value: 'back', label: 'Back' },
],
}); });
if (action === 'back') { if (action === 'back') {
if (!canReturnToDriverSelection) return { status: 'back', projectDir: args.projectDir }; if (!canReturnToDriverSelection) return { status: 'back', projectDir: args.projectDir };
@ -2359,6 +2255,16 @@ export async function runKtxSetupDatabasesStep(
args, args,
prompts, prompts,
}); });
} else if (action === 'disable-query-history') {
await disableConnectionQueryHistory(args.projectDir, connectionChoice.connectionId);
setupStatus = await validateAndScanConnection({
projectDir: args.projectDir,
connectionId: connectionChoice.connectionId,
io,
deps,
args,
prompts,
});
} else if (action === 're-enter') { } else if (action === 're-enter') {
const connection = await buildConnectionConfig({ const connection = await buildConnectionConfig({
driver, driver,
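
Review note: the hunks above replace inline `error instanceof Error ? ...` formatting with `errorMessage` and `writePrefixedLines`, and add a `failed-query-history-unavailable` status that unlocks an extra recovery option. A minimal sketch of how these pieces could fit together, assuming the shared helpers keep the bodies this diff removes. The `prefix` parameter and the `SetupStatus` union are hypothetical and only illustrate the idea; the real `clack.ts` is not shown in this section.

```typescript
// Hypothetical stand-ins for the helpers imported from './clack.js'.
// The bodies mirror the inline code this diff deletes.
function errorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}

function writePrefixedLines(
  write: (chunk: string) => void,
  output: string,
  prefix = '', // assumption: the shared version may prepend a rail such as '│ '
): void {
  for (const line of output.split(/\r?\n/)) {
    // Blank lines are skipped so the rail is never broken by empty rows.
    if (line.length > 0) {
      write(`${prefix}${line}\n`);
    }
  }
}

// Assumption: a simplified status union; the real type likely has more members.
type SetupStatus = 'ready' | 'failed' | 'failed-query-history-unavailable';

// Same conditional-spread shape as the failureOptions array in the diff.
function failureOptions(status: SetupStatus): Array<{ value: string; label: string }> {
  return [
    { value: 'retry', label: 'Retry connection test' },
    { value: 're-enter', label: 'Re-enter connection details' },
    ...(status === 'failed-query-history-unavailable'
      ? [{ value: 'disable-query-history', label: 'Disable query history and retry' }]
      : []),
    { value: 'skip', label: 'Skip this database' },
    { value: 'back', label: 'Back' },
  ];
}

// Usage: capture what would be written to stderr.
const chunks: string[] = [];
writePrefixedLines((chunk) => chunks.push(chunk), errorMessage(new Error('first\n\nsecond')), '│ ');
const plainFailure = failureOptions('failed').map((option) => option.value);
const historyFailure = failureOptions('failed-query-history-unavailable').map((option) => option.value);
```

Passing the writer as a callback keeps the helper independent of `KtxCliIo`, which is presumably why every call site in the diff wraps `io.stderr.write` in an arrow function.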


@ -6,12 +6,13 @@ import { markKtxSetupStateStepComplete, readKtxSetupState } from './context/proj
import type { KtxEmbeddingConfig } from './llm/types.js'; import type { KtxEmbeddingConfig } from './llm/types.js';
import { type KtxEmbeddingHealthCheckResult, runKtxEmbeddingHealthCheck } from './llm/embedding-health.js'; import { type KtxEmbeddingHealthCheckResult, runKtxEmbeddingHealthCheck } from './llm/embedding-health.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { createStaticCliSpinner, type KtxCliSpinner } from './clack.js'; import { createStaticCliSpinner, errorMessage, writePrefixedLines, type KtxCliSpinner } from './clack.js';
import { import {
ensureManagedLocalEmbeddingsDaemon, ensureManagedLocalEmbeddingsDaemon,
managedLocalEmbeddingHealthConfig, managedLocalEmbeddingHealthConfig,
type ManagedLocalEmbeddingsDaemon, type ManagedLocalEmbeddingsDaemon,
} from './managed-local-embeddings.js'; } from './managed-local-embeddings.js';
import { ManagedPythonDaemonStartError } from './managed-python-daemon.js';
import type { KtxManagedPythonInstallPolicy } from './managed-python-command.js'; import type { KtxManagedPythonInstallPolicy } from './managed-python-command.js';
import { withTextInputNavigation } from './prompt-navigation.js'; import { withTextInputNavigation } from './prompt-navigation.js';
import { envCredentialReference, writeProjectLocalSecretReference } from './setup-secrets.js'; import { envCredentialReference, writeProjectLocalSecretReference } from './setup-secrets.js';
@ -419,7 +420,13 @@ export async function runKtxSetupEmbeddingsStep(
io, io,
}); });
} catch (error) { } catch (error) {
io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`); const write = (chunk: string) => io.stderr.write(chunk);
if (error instanceof ManagedPythonDaemonStartError) {
const tail = await readLocalEmbeddingDaemonStderrTail(error.stderrLog);
writePrefixedLines(write, localEmbeddingSetupMessage(error.detail, tail));
} else {
writePrefixedLines(write, errorMessage(error));
}
return { status: 'failed', projectDir: args.projectDir }; return { status: 'failed', projectDir: args.projectDir };
} }
} }


@ -1,6 +1,7 @@
import { loadKtxProject, type KtxLocalProject } from './context/project/project.js'; import { loadKtxProject, type KtxLocalProject } from './context/project/project.js';
import { markKtxSetupStateStepComplete } from './context/project/setup-config.js'; import { markKtxSetupStateStepComplete } from './context/project/setup-config.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { errorMessage, writePrefixedLines } from './clack.js';
import { import {
ensureManagedLocalEmbeddingsDaemon, ensureManagedLocalEmbeddingsDaemon,
type ManagedLocalEmbeddingsDaemon, type ManagedLocalEmbeddingsDaemon,
@ -88,7 +89,7 @@ export async function runKtxSetupRuntimeStep(
}); });
} }
} catch (error) { } catch (error) {
io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`); writePrefixedLines((chunk) => io.stderr.write(chunk), errorMessage(error));
return { status: 'failed', projectDir: args.projectDir, requirements }; return { status: 'failed', projectDir: args.projectDir, requirements };
} }


@ -17,6 +17,7 @@ import { type KtxProjectConfig, type KtxProjectConnectionConfig, serializeKtxPro
import { loadKtxProject } from './context/project/project.js'; import { loadKtxProject } from './context/project/project.js';
import { markKtxSetupStateStepComplete } from './context/project/setup-config.js'; import { markKtxSetupStateStepComplete } from './context/project/setup-config.js';
import type { KtxCliIo } from './cli-runtime.js'; import type { KtxCliIo } from './cli-runtime.js';
import { errorMessage, writePrefixedLines } from './clack.js';
import { pickNotionRootPages } from './notion-page-picker.js'; import { pickNotionRootPages } from './notion-page-picker.js';
import { runKtxSourceMapping } from './source-mapping.js'; import { runKtxSourceMapping } from './source-mapping.js';
import { withMultiselectNavigation, withTextInputNavigation } from './prompt-navigation.js'; import { withMultiselectNavigation, withTextInputNavigation } from './prompt-navigation.js';
@ -1983,7 +1984,7 @@ export async function runKtxSetupSourcesStep(
return { status: 'ready', projectDir: args.projectDir, connectionIds: readyConnectionIds }; return { status: 'ready', projectDir: args.projectDir, connectionIds: readyConnectionIds };
} }
} catch (error) { } catch (error) {
io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`); writePrefixedLines((chunk) => io.stderr.write(chunk), errorMessage(error));
return { status: 'failed', projectDir: args.projectDir }; return { status: 'failed', projectDir: args.projectDir };
} }
} }


@ -55,6 +55,10 @@ const emittedProjectSnapshots = new Set<string>();
const MCP_SAMPLE_RATE = 0.1 as const; const MCP_SAMPLE_RATE = 0.1 as const;
let mcpSampled: boolean | undefined; let mcpSampled: boolean | undefined;
function telemetryDebugEnabled(): boolean {
return process.env.KTX_TELEMETRY_DEBUG === '1';
}
export function shouldEmitMcpTelemetry(): boolean { export function shouldEmitMcpTelemetry(): boolean {
mcpSampled ??= Math.random() < MCP_SAMPLE_RATE; mcpSampled ??= Math.random() < MCP_SAMPLE_RATE;
return mcpSampled; return mcpSampled;
@ -71,19 +75,21 @@ export async function emitTelemetryEvent<Name extends TelemetryEventName>(input:
packageInfo?: KtxCliPackageInfo; packageInfo?: KtxCliPackageInfo;
projectDir?: string; projectDir?: string;
}): Promise<void> { }): Promise<void> {
const debug = telemetryDebugEnabled();
const identity = await loadTelemetryIdentity({ const identity = await loadTelemetryIdentity({
stdoutIsTTY: input.io.stdout.isTTY === true, stdoutIsTTY: input.io.stdout.isTTY === true,
stderr: input.io.stderr, stderr: input.io.stderr,
env: process.env, env: process.env,
}); });
if (!identity.enabled || !identity.installId) { if ((!identity.enabled || !identity.installId) && !debug) {
return; return;
} }
const packageInfo = input.packageInfo ?? getKtxCliPackageInfo(); const packageInfo = input.packageInfo ?? getKtxCliPackageInfo();
const installId = identity.installId ?? 'debug';
const projectId = input.projectDir ? computeTelemetryProjectId(identity.installId, input.projectDir) : undefined; const projectId = input.projectDir ? computeTelemetryProjectId(installId, input.projectDir) : undefined;
await trackTelemetryEvent({ await trackTelemetryEvent({
event: buildTelemetryEvent( event: buildTelemetryEvent(
input.name, input.name,
@ -93,7 +99,7 @@ export async function emitTelemetryEvent<Name extends TelemetryEventName>(input:
}), }),
input.fields, input.fields,
), ),
distinctId: identity.installId, distinctId: installId,
projectId, projectId,
env: process.env, env: process.env,
stderr: input.io.stderr, stderr: input.io.stderr,
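
Review note: the telemetry hunks above change when an event is emitted and which `distinctId` it carries. A small sketch of that decision as a pure function, assuming only what the diff shows: `KTX_TELEMETRY_DEBUG === '1'` bypasses the opt-out check, and a missing install ID falls back to the literal `'debug'`. The function name `resolveTelemetryDistinctId` and the `TelemetryIdentity` shape are hypothetical.

```typescript
// Hypothetical, reduced view of what loadTelemetryIdentity returns.
interface TelemetryIdentity {
  enabled: boolean;
  installId?: string;
}

// Returns the distinct ID to send, or undefined when the event should be dropped.
function resolveTelemetryDistinctId(
  identity: TelemetryIdentity,
  env: Record<string, string | undefined>,
): string | undefined {
  const debug = env.KTX_TELEMETRY_DEBUG === '1';
  // Without debug mode, telemetry needs both opt-in and an install ID.
  if ((!identity.enabled || !identity.installId) && !debug) {
    return undefined;
  }
  // In debug mode an existing install ID is still preferred over the placeholder.
  return identity.installId ?? 'debug';
}

const normal = resolveTelemetryDistinctId({ enabled: true, installId: 'abc' }, {});
const optedOut = resolveTelemetryDistinctId({ enabled: false }, {});
const debugNoId = resolveTelemetryDistinctId({ enabled: false }, { KTX_TELEMETRY_DEBUG: '1' });
const debugWithId = resolveTelemetryDistinctId({ enabled: false, installId: 'abc' }, { KTX_TELEMETRY_DEBUG: '1' });
```

One consequence worth checking in review: with the debug flag set, an opted-out identity that still has an install ID would emit under that real ID rather than under `'debug'`.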


@ -1,9 +1,9 @@
import { createRequire } from 'node:module'; import { createRequire } from 'node:module';
import type { ReindexSummary } from './context/index-sync/types.js'; import type { ReindexSummary } from '../src/context/index-sync/types.js';
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { renderReindexJson, renderReindexPlain, reindexHasErrors } from './admin-reindex.js'; import { renderReindexJson, renderReindexPlain, reindexHasErrors } from '../src/admin-reindex.js';
import { runKtxCli } from './index.js'; import { runKtxCli } from '../src/index.js';
const cliVersion = (createRequire(import.meta.url)('@kaelio/ktx/package.json') as { version: string }) const cliVersion = (createRequire(import.meta.url)('@kaelio/ktx/package.json') as { version: string })
.version; .version;


@ -1,5 +1,5 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { runKtxCli } from './index.js'; import { runKtxCli } from '../src/index.js';
function makeIo() { function makeIo() {
let stdout = ''; let stdout = '';


@ -3,8 +3,8 @@ import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { runCommanderKtxCli } from './cli-program.js'; import { runCommanderKtxCli } from '../src/cli-program.js';
import type { KtxCliDeps, KtxCliIo, KtxCliPackageInfo } from './cli-runtime.js'; import type { KtxCliDeps, KtxCliIo, KtxCliPackageInfo } from '../src/cli-runtime.js';
function makeIo(stdoutIsTTY = true): { io: KtxCliIo; stdout: () => string; stderr: () => string } { function makeIo(stdoutIsTTY = true): { io: KtxCliIo; stdout: () => string; stderr: () => string } {
let stdout = ''; let stdout = '';


@ -1,7 +1,7 @@
import { Command, type CommandUnknownOpts } from '@commander-js/extra-typings'; import { Command, type CommandUnknownOpts } from '@commander-js/extra-typings';
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { buildKtxProgram, collectCommandFlagsPresent } from './cli-program.js'; import { buildKtxProgram, collectCommandFlagsPresent } from '../src/cli-program.js';
import type { KtxCliIo, KtxCliPackageInfo } from './cli-runtime.js'; import type { KtxCliIo, KtxCliPackageInfo } from '../src/cli-runtime.js';
function stubIo(): KtxCliIo { function stubIo(): KtxCliIo {
return { return {


@ -1,6 +1,6 @@
import { Command } from '@commander-js/extra-typings'; import { Command } from '@commander-js/extra-typings';
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { formatCommandTree, walkCommandTree } from './command-tree.js'; import { formatCommandTree, walkCommandTree } from '../src/command-tree.js';
describe('walkCommandTree', () => { describe('walkCommandTree', () => {
it('captures name, description, aliases, and nested children', () => { it('captures name, description, aliases, and nested children', () => {


@ -1,7 +1,7 @@
import { Command } from '@commander-js/extra-typings'; import { Command } from '@commander-js/extra-typings';
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import type { KtxCliCommandContext } from '../cli-program.js'; import type { KtxCliCommandContext } from '../../src/cli-program.js';
import { registerMcpCommands } from './mcp-commands.js'; import { registerMcpCommands } from '../../src/commands/mcp-commands.js';
function makeContext(overrides: Partial<KtxCliCommandContext> = {}): KtxCliCommandContext { function makeContext(overrides: Partial<KtxCliCommandContext> = {}): KtxCliCommandContext {
let exitCode = 0; let exitCode = 0;


@ -1,7 +1,7 @@
import { Command } from '@commander-js/extra-typings'; import { Command } from '@commander-js/extra-typings';
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import type { KtxCliCommandContext } from '../cli-program.js'; import type { KtxCliCommandContext } from '../../src/cli-program.js';
import { registerSqlCommands } from './sql-commands.js'; import { registerSqlCommands } from '../../src/commands/sql-commands.js';
function makeContext(overrides: Partial<KtxCliCommandContext> = {}): KtxCliCommandContext { function makeContext(overrides: Partial<KtxCliCommandContext> = {}): KtxCliCommandContext {
let exitCode = 0; let exitCode = 0;


@ -1,14 +1,14 @@
import { mkdtemp, readFile, rm, writeFile } from 'node:fs/promises'; import { mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import type { LookerClient } from './context/ingest/adapters/looker/client.js'; import type { LookerClient } from '../src/context/ingest/adapters/looker/client.js';
import type { MetabaseRuntimeClient } from './context/ingest/adapters/metabase/client-port.js'; import type { MetabaseRuntimeClient } from '../src/context/ingest/adapters/metabase/client-port.js';
import type { NotionClient } from './context/ingest/adapters/notion/notion-client.js'; import type { NotionClient } from '../src/context/ingest/adapters/notion/notion-client.js';
import { initKtxProject } from './context/project/project.js'; import { initKtxProject } from '../src/context/project/project.js';
import { parseKtxProjectConfig, serializeKtxProjectConfig } from './context/project/config.js'; import { parseKtxProjectConfig, serializeKtxProjectConfig } from '../src/context/project/config.js';
import type { KtxConnectionDriver, KtxScanConnector } from './context/scan/types.js'; import type { KtxConnectionDriver, KtxScanConnector } from '../src/context/scan/types.js';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { runKtxConnection } from './connection.js'; import { runKtxConnection } from '../src/connection.js';
function stripAnsi(s: string): string { function stripAnsi(s: string): string {
return s.replace(/\x1b\[[0-9;]*m/g, ''); return s.replace(/\x1b\[[0-9;]*m/g, '');
@ -59,6 +59,8 @@ function nativeConnector(
introspect: vi.fn(async () => { introspect: vi.fn(async () => {
throw new Error('introspect should not be called from connection test'); throw new Error('introspect should not be called from connection test');
}), }),
listSchemas: vi.fn(async () => []),
listTables: vi.fn(async () => []),
testConnection, testConnection,
cleanup, cleanup,
}; };


@ -1,7 +1,7 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { bigQueryConnectionConfigFromConfig, isKtxBigQueryConnectionConfig, type KtxBigQueryClient, KtxBigQueryScanConnector, type KtxBigQueryClientFactory, type KtxBigQueryDataset, type KtxBigQueryQueryJob, type KtxBigQueryTableRef } from '../../connectors/bigquery/connector.js'; import { bigQueryConnectionConfigFromConfig, isKtxBigQueryConnectionConfig, type KtxBigQueryClient, KtxBigQueryScanConnector, type KtxBigQueryClientFactory, type KtxBigQueryDataset, type KtxBigQueryQueryJob, type KtxBigQueryTableRef, prepareBigQueryReadOnlyQuery } from '../../../src/connectors/bigquery/connector.js';
import { createBigQueryLiveDatabaseIntrospection } from '../../connectors/bigquery/live-database-introspection.js'; import { createBigQueryLiveDatabaseIntrospection } from '../../../src/connectors/bigquery/live-database-introspection.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
function fakeClientFactory(options: { primaryKeyError?: Error } = {}): KtxBigQueryClientFactory { function fakeClientFactory(options: { primaryKeyError?: Error } = {}): KtxBigQueryClientFactory {
const queryResults = vi.fn(async (): ReturnType<KtxBigQueryQueryJob['getQueryResults']> => [ const queryResults = vi.fn(async (): ReturnType<KtxBigQueryQueryJob['getQueryResults']> => [
@ -98,6 +98,17 @@ const connection = {
} as const; } as const;
describe('KtxBigQueryScanConnector', () => { describe('KtxBigQueryScanConnector', () => {
it('prepares read-only SQL parameters with BigQuery named placeholders', () => {
expect(prepareBigQueryReadOnlyQuery('SELECT * FROM orders WHERE id = :id AND id_2 = :id_2', { id: 1, id_2: 2 })).toEqual({
sql: 'SELECT * FROM orders WHERE id = @id AND id_2 = @id_2',
params: { id: 1, id_2: 2 },
});
expect(prepareBigQueryReadOnlyQuery('SELECT * FROM orders')).toEqual({
sql: 'SELECT * FROM orders',
params: undefined,
});
});
it('resolves configuration safely', () => { it('resolves configuration safely', () => {
expect(isKtxBigQueryConnectionConfig(connection)).toBe(true); expect(isKtxBigQueryConnectionConfig(connection)).toBe(true);
expect(isKtxBigQueryConnectionConfig({ driver: 'mysql' })).toBe(false); expect(isKtxBigQueryConnectionConfig({ driver: 'mysql' })).toBe(false);
@ -256,7 +267,7 @@ describe('KtxBigQueryScanConnector', () => {
), ),
).resolves.toEqual({ values: ['open', 'paid'], cardinality: 2 }); ).resolves.toEqual({ values: ['open', 'paid'], cardinality: 2 });
await expect(connector.getTableRowCount('orders')).resolves.toBe(12); await expect(connector.getTableRowCount('orders')).resolves.toBe(12);
await expect(connector.listDatasets()).resolves.toEqual(['analytics', 'staging']); await expect(connector.listSchemas()).resolves.toEqual(['analytics', 'staging']);
await expect( await expect(
connector.columnStats( connector.columnStats(
{ connectionId: 'warehouse', table: { catalog: 'project-1', db: 'analytics', name: 'orders' }, column: 'status' }, { connectionId: 'warehouse', table: { catalog: 'project-1', db: 'analytics', name: 'orders' }, column: 'status' },
@ -366,9 +377,9 @@ describe('KtxBigQueryScanConnector', () => {
}); });
await expect(connector.listTables(['analytics', 'mart'])).resolves.toEqual([ await expect(connector.listTables(['analytics', 'mart'])).resolves.toEqual([
{ schema: 'analytics', name: 'orders', kind: 'table' }, { catalog: 'project-1', schema: 'analytics', name: 'orders', kind: 'table' },
{ schema: 'analytics', name: 'order_clone', kind: 'table' }, { catalog: 'project-1', schema: 'analytics', name: 'order_clone', kind: 'table' },
{ schema: 'mart', name: 'orders_mv', kind: 'view' }, { catalog: 'project-1', schema: 'mart', name: 'orders_mv', kind: 'view' },
]); ]);
expect(createQueryJob).toHaveBeenCalledTimes(1); expect(createQueryJob).toHaveBeenCalledTimes(1);
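
Review note: the new `prepareBigQueryReadOnlyQuery` test above pins down the expected rewrite from `:name` placeholders to BigQuery's `@name` named parameters. A minimal sketch of one way to get that behaviour, not the connector's actual implementation. The function name `rewriteColonParamsForBigQuery` is hypothetical, and the sketch does not try to skip placeholders inside string literals or comments.

```typescript
type QueryParams = Record<string, unknown>;

// Hypothetical helper: rewrites ':name' to '@name' for names present in params.
function rewriteColonParamsForBigQuery(
  sql: string,
  params?: QueryParams,
): { sql: string; params: QueryParams | undefined } {
  if (!params) {
    return { sql, params: undefined };
  }
  const rewritten = sql.replace(
    /(:+)([A-Za-z_][A-Za-z0-9_]*)/g,
    (match: string, colons: string, name: string) => {
      // Leave '::type' style casts and unknown names untouched.
      if (colons.length !== 1 || !Object.prototype.hasOwnProperty.call(params, name)) {
        return match;
      }
      return `@${name}`;
    },
  );
  return { sql: rewritten, params };
}

const prepared = rewriteColonParamsForBigQuery(
  'SELECT * FROM orders WHERE id = :id AND id_2 = :id_2',
  { id: 1, id_2: 2 },
);
const untouched = rewriteColonParamsForBigQuery('SELECT * FROM orders');
```

Restricting the rewrite to names that are actually supplied keeps `:id` from being confused with `:id_2` and avoids touching unrelated colons elsewhere in the statement.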


@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxBigQueryDialect } from './dialect.js'; import { KtxBigQueryDialect } from '../../../src/connectors/bigquery/dialect.js';
describe('KtxBigQueryDialect', () => { describe('KtxBigQueryDialect', () => {
const dialect = new KtxBigQueryDialect(); const dialect = new KtxBigQueryDialect();
@ -38,14 +38,6 @@ describe('KtxBigQueryDialect', () => {
); );
}); });
it('rewrites colon parameters to BigQuery named parameters', () => {
expect(dialect.prepareQuery('SELECT * FROM orders WHERE id = :id AND id_2 = :id_2', { id: 1, id_2: 2 })).toEqual({
sql: 'SELECT * FROM orders WHERE id = @id AND id_2 = @id_2',
params: { id: 1, id_2: 2 },
});
expect(dialect.prepareQuery('SELECT * FROM orders')).toEqual({ sql: 'SELECT * FROM orders', params: undefined });
});
it('keeps unsupported statistics explicit', () => { it('keeps unsupported statistics explicit', () => {
expect(dialect.generateColumnStatisticsQuery('analytics', 'orders')).toBeNull(); expect(dialect.generateColumnStatisticsQuery('analytics', 'orders')).toBeNull();
}); });


@ -1,7 +1,7 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { clickHouseClientConfigFromConfig, isKtxClickHouseConnectionConfig, KtxClickHouseScanConnector, type KtxClickHouseClientFactory } from '../../connectors/clickhouse/connector.js'; import { clickHouseClientConfigFromConfig, isKtxClickHouseConnectionConfig, KtxClickHouseScanConnector, prepareClickHouseReadOnlyQuery, type KtxClickHouseClientFactory } from '../../../src/connectors/clickhouse/connector.js';
import { createClickHouseLiveDatabaseIntrospection } from '../../connectors/clickhouse/live-database-introspection.js'; import { createClickHouseLiveDatabaseIntrospection } from '../../../src/connectors/clickhouse/live-database-introspection.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
function result<T>(payload: T) { function result<T>(payload: T) {
return { return {
@ -15,8 +15,8 @@ function fakeClientFactory(): KtxClickHouseClientFactory {
const query = vi.fn(async (input: { query: string; format: string; query_params?: Record<string, unknown> }) => { const query = vi.fn(async (input: { query: string; format: string; query_params?: Record<string, unknown> }) => {
if (input.query.includes('FROM system.tables')) { if (input.query.includes('FROM system.tables')) {
return result([ return result([
{ name: 'events', engine: 'MergeTree', comment: 'Event stream' }, { database: 'analytics', name: 'event_summary', engine: 'View', comment: '' },
{ name: 'event_summary', engine: 'View', comment: '' }, { database: 'analytics', name: 'events', engine: 'MergeTree', comment: 'Event stream' },
]); ]);
} }
if (input.query.includes('FROM system.columns')) { if (input.query.includes('FROM system.columns')) {
@ -136,6 +136,33 @@ function multiDatabaseClickHouseClientFactory(): KtxClickHouseClientFactory {
} }
describe('KtxClickHouseScanConnector', () => { describe('KtxClickHouseScanConnector', () => {
it('prepares read-only SQL parameters with ClickHouse typed placeholders', () => {
expect(
prepareClickHouseReadOnlyQuery('select * from events where id = :id and event_name = :name', {
id: 10,
name: 'signup',
}),
).toEqual({
sql: 'select * from events where id = {id:Int64} and event_name = {name:String}',
params: { id: 10, name: 'signup' },
});
expect(
prepareClickHouseReadOnlyQuery('select * from events where enabled = :enabled and ratio = :ratio and created_at = :created_at', {
enabled: true,
ratio: 1.5,
created_at: new Date('2026-05-25T00:00:00.000Z'),
}),
).toEqual({
sql: 'select * from events where enabled = {enabled:Bool} and ratio = {ratio:Float64} and created_at = {created_at:DateTime}',
params: {
enabled: true,
ratio: 1.5,
created_at: new Date('2026-05-25T00:00:00.000Z'),
},
});
expect(prepareClickHouseReadOnlyQuery('select 1')).toEqual({ sql: 'select 1', params: undefined });
});
it('resolves ClickHouse connection configuration safely', () => { it('resolves ClickHouse connection configuration safely', () => {
expect(isKtxClickHouseConnectionConfig({ driver: 'clickhouse', host: 'localhost', database: 'analytics' })).toBe( expect(isKtxClickHouseConnectionConfig({ driver: 'clickhouse', host: 'localhost', database: 'analytics' })).toBe(
true, true,
@ -196,8 +223,8 @@ describe('KtxClickHouseScanConnector', () => {
}, },
}); });
expect(snapshot.tables.map((table) => [table.name, table.kind, table.estimatedRows, table.comment])).toEqual([ expect(snapshot.tables.map((table) => [table.name, table.kind, table.estimatedRows, table.comment])).toEqual([
['events', 'table', 2, 'Event stream'],
['event_summary', 'view', null, null], ['event_summary', 'view', null, null],
['events', 'table', 2, 'Event stream'],
]); ]);
expect(snapshot.tables.find((table) => table.name === 'events')?.columns[0]).toMatchObject({ expect(snapshot.tables.find((table) => table.name === 'events')?.columns[0]).toMatchObject({
name: 'id', name: 'id',
@ -344,6 +371,10 @@ describe('KtxClickHouseScanConnector', () => {
await expect(connector.getTableRowCount('events')).resolves.toBe(2); await expect(connector.getTableRowCount('events')).resolves.toBe(2);
await expect(connector.listSchemas()).resolves.toEqual(['analytics', 'warehouse']); await expect(connector.listSchemas()).resolves.toEqual(['analytics', 'warehouse']);
await expect(connector.listTables(['analytics'])).resolves.toEqual([
{ catalog: null, schema: 'analytics', name: 'event_summary', kind: 'view' },
{ catalog: null, schema: 'analytics', name: 'events', kind: 'table' },
]);
await expect( await expect(
connector.columnStats( connector.columnStats(
{ connectionId: 'warehouse', table: { catalog: null, db: 'analytics', name: 'events' }, column: 'event_name' }, { connectionId: 'warehouse', table: { catalog: null, db: 'analytics', name: 'events' }, column: 'event_name' },

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxClickHouseDialect } from './dialect.js'; import { KtxClickHouseDialect } from '../../../src/connectors/clickhouse/dialect.js';
describe('KtxClickHouseDialect', () => { describe('KtxClickHouseDialect', () => {
const dialect = new KtxClickHouseDialect(); const dialect = new KtxClickHouseDialect();
@ -23,7 +23,7 @@ describe('KtxClickHouseDialect', () => {
expect(dialect.mapToDimensionType('')).toBe('string'); expect(dialect.mapToDimensionType('')).toBe('string');
}); });
it('builds sampling, distinct-value, pagination, and time SQL', () => { it('builds sampling, distinct-value, and pagination SQL', () => {
expect(dialect.generateSampleQuery('`analytics`.`events`', 25, ['id', 'event_name'])).toBe( expect(dialect.generateSampleQuery('`analytics`.`events`', 25, ['id', 'event_name'])).toBe(
'SELECT `id`, `event_name` FROM `analytics`.`events` LIMIT 25', 'SELECT `id`, `event_name` FROM `analytics`.`events` LIMIT 25',
); );
@ -34,16 +34,6 @@ describe('KtxClickHouseDialect', () => {
'SELECT DISTINCT toString(`event_name`) AS val', 'SELECT DISTINCT toString(`event_name`) AS val',
); );
expect(dialect.getLimitOffsetClause(10, 20)).toBe('LIMIT 10 OFFSET 20'); expect(dialect.getLimitOffsetClause(10, 20)).toBe('LIMIT 10 OFFSET 20');
expect(dialect.getTimeTruncExpression('created_at', 'week')).toBe('toStartOfWeek(created_at, 1)');
}); });
it('prepares named parameters using ClickHouse typed placeholders', () => {
expect(dialect.prepareQuery('select * from events where id = :id and event_name = :name', {
id: 10,
name: 'signup',
})).toEqual({
sql: 'select * from events where id = {id:Int64} and event_name = {name:String}',
params: { id: 10, name: 'signup' },
});
});
}); });
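
Note on the hunk above: the typed-placeholder expectations removed from this dialect test reappear in the connector test as `prepareClickHouseReadOnlyQuery`. The sketch below illustrates the rewrite those assertions describe, from `:name` to ClickHouse's `{name:Type}` form. It is only a sketch: `toTypedPlaceholders` and `clickHouseType` are hypothetical names, the type mapping is inferred from the asserted values alone, and the regex does not skip string literals or comments.

```typescript
// Hypothetical helpers; not the project's prepareClickHouseReadOnlyQuery.
type PreparedNamed = { sql: string; params: Record<string, unknown> | undefined };

// Type mapping inferred from the test expectations only (assumption).
function clickHouseType(value: unknown): string {
  if (typeof value === 'boolean') return 'Bool';
  if (typeof value === 'number') return Number.isInteger(value) ? 'Int64' : 'Float64';
  if (value instanceof Date) return 'DateTime';
  return 'String';
}

function toTypedPlaceholders(sql: string, named?: Record<string, unknown>): PreparedNamed {
  if (!named) return { sql, params: undefined };
  // Single pass over the original SQL, so inserted "{id:Int64}" text is never rescanned.
  const rewritten = sql.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(named, name)
      ? `{${name}:${clickHouseType(named[name])}}`
      : match, // names without a supplied value are left untouched
  );
  // Parameters stay keyed by name; only the SQL text changes.
  return { sql: rewritten, params: named };
}
```

Keeping the parameters as a named object matches the `params: { id: 10, name: 'signup' }` shape asserted in the test.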

@ -1,8 +1,8 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import type { FieldPacket, RowDataPacket } from 'mysql2/promise'; import type { FieldPacket, RowDataPacket } from 'mysql2/promise';
import { createMysqlLiveDatabaseIntrospection } from '../../connectors/mysql/live-database-introspection.js'; import { createMysqlLiveDatabaseIntrospection } from '../../../src/connectors/mysql/live-database-introspection.js';
import { isKtxMysqlConnectionConfig, KtxMysqlScanConnector, mysqlConnectionPoolConfigFromConfig, type KtxMysqlConnectionConfig, type KtxMysqlPoolFactory } from '../../connectors/mysql/connector.js'; import { isKtxMysqlConnectionConfig, KtxMysqlScanConnector, mysqlConnectionPoolConfigFromConfig, prepareMysqlReadOnlyQuery, type KtxMysqlConnectionConfig, type KtxMysqlPoolFactory } from '../../../src/connectors/mysql/connector.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
function mysqlResult(rows: Record<string, unknown>[], fields: Array<{ name: string; type?: number }>): [RowDataPacket[], FieldPacket[]] { function mysqlResult(rows: Record<string, unknown>[], fields: Array<{ name: string; type?: number }>): [RowDataPacket[], FieldPacket[]] {
return [rows as RowDataPacket[], fields as FieldPacket[]]; return [rows as RowDataPacket[], fields as FieldPacket[]];
@ -13,9 +13,9 @@ function fakePoolFactory(): KtxMysqlPoolFactory {
if (sql.includes('INFORMATION_SCHEMA.TABLES')) { if (sql.includes('INFORMATION_SCHEMA.TABLES')) {
return mysqlResult( return mysqlResult(
[ [
{ TABLE_NAME: 'customers', TABLE_TYPE: 'BASE TABLE', TABLE_COMMENT: 'Customer table', TABLE_ROWS: 2 }, { TABLE_SCHEMA: 'analytics', TABLE_NAME: 'customers', TABLE_TYPE: 'BASE TABLE', TABLE_COMMENT: 'Customer table', TABLE_ROWS: 2 },
{ TABLE_NAME: 'orders', TABLE_TYPE: 'BASE TABLE', TABLE_COMMENT: 'InnoDB free: 1 kB; Order table', TABLE_ROWS: 2 }, { TABLE_SCHEMA: 'analytics', TABLE_NAME: 'orders', TABLE_TYPE: 'BASE TABLE', TABLE_COMMENT: 'InnoDB free: 1 kB; Order table', TABLE_ROWS: 2 },
{ TABLE_NAME: 'order_summary', TABLE_TYPE: 'VIEW', TABLE_COMMENT: '', TABLE_ROWS: null }, { TABLE_SCHEMA: 'analytics', TABLE_NAME: 'order_summary', TABLE_TYPE: 'VIEW', TABLE_COMMENT: '', TABLE_ROWS: null },
], ],
[{ name: 'TABLE_NAME' }, { name: 'TABLE_TYPE' }, { name: 'TABLE_COMMENT' }, { name: 'TABLE_ROWS' }], [{ name: 'TABLE_NAME' }, { name: 'TABLE_TYPE' }, { name: 'TABLE_COMMENT' }, { name: 'TABLE_ROWS' }],
); );
@ -173,6 +173,19 @@ function multiSchemaMysqlPoolFactory(
} }
describe('KtxMysqlScanConnector', () => { describe('KtxMysqlScanConnector', () => {
it('prepares read-only SQL parameters with MySQL positional placeholders', () => {
expect(
prepareMysqlReadOnlyQuery('select * from orders where id = :id and status = :status', {
status: 'paid',
id: 10,
}),
).toEqual({
sql: 'select * from orders where id = ? and status = ?',
params: [10, 'paid'],
});
expect(prepareMysqlReadOnlyQuery('select 1')).toEqual({ sql: 'select 1', params: undefined });
});
it('resolves MySQL connection configuration safely', () => { it('resolves MySQL connection configuration safely', () => {
expect(isKtxMysqlConnectionConfig({ driver: 'mysql', host: 'localhost', database: 'analytics' })).toBe(true); expect(isKtxMysqlConnectionConfig({ driver: 'mysql', host: 'localhost', database: 'analytics' })).toBe(true);
expect(isKtxMysqlConnectionConfig({ driver: 'postgres', host: 'localhost', database: 'analytics' })).toBe(false); expect(isKtxMysqlConnectionConfig({ driver: 'postgres', host: 'localhost', database: 'analytics' })).toBe(false);
@ -497,6 +510,11 @@ describe('KtxMysqlScanConnector', () => {
await expect(connector.getTableRowCount('orders')).resolves.toBe(2); await expect(connector.getTableRowCount('orders')).resolves.toBe(2);
await expect(connector.listSchemas()).resolves.toEqual(['analytics', 'warehouse']); await expect(connector.listSchemas()).resolves.toEqual(['analytics', 'warehouse']);
await expect(connector.listTables(['analytics'])).resolves.toEqual([
{ catalog: null, schema: 'analytics', name: 'customers', kind: 'table' },
{ catalog: null, schema: 'analytics', name: 'orders', kind: 'table' },
{ catalog: null, schema: 'analytics', name: 'order_summary', kind: 'view' },
]);
await expect(connector.columnStats( await expect(connector.columnStats(
{ connectionId: 'warehouse', table: { catalog: null, db: 'analytics', name: 'orders' }, column: 'status' }, { connectionId: 'warehouse', table: { catalog: null, db: 'analytics', name: 'orders' }, column: 'status' },
{ runId: 'scan-run-1' }, { runId: 'scan-run-1' },

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxMysqlDialect } from './dialect.js'; import { KtxMysqlDialect } from '../../../src/connectors/mysql/dialect.js';
describe('KtxMysqlDialect', () => { describe('KtxMysqlDialect', () => {
const dialect = new KtxMysqlDialect(); const dialect = new KtxMysqlDialect();
@ -23,7 +23,7 @@ describe('KtxMysqlDialect', () => {
expect(dialect.mapToDimensionType('')).toBe('string'); expect(dialect.mapToDimensionType('')).toBe('string');
}); });
it('builds sampling, distinct-value, pagination, and time SQL', () => { it('builds sampling, distinct-value, and pagination SQL', () => {
expect(dialect.generateSampleQuery('`analytics`.`orders`', 25, ['id', 'status'])).toBe( expect(dialect.generateSampleQuery('`analytics`.`orders`', 25, ['id', 'status'])).toBe(
'SELECT `id`, `status` FROM `analytics`.`orders` LIMIT 25', 'SELECT `id`, `status` FROM `analytics`.`orders` LIMIT 25',
); );
@ -34,16 +34,6 @@ describe('KtxMysqlDialect', () => {
'SELECT DISTINCT CAST(`status` AS CHAR) AS val', 'SELECT DISTINCT CAST(`status` AS CHAR) AS val',
); );
expect(dialect.getLimitOffsetClause(10, 20)).toBe('LIMIT 10 OFFSET 20'); expect(dialect.getLimitOffsetClause(10, 20)).toBe('LIMIT 10 OFFSET 20');
expect(dialect.getTimeTruncExpression('created_at', 'month')).toBe("DATE_FORMAT(created_at, '%Y-%m-01')");
}); });
it('prepares named parameters in deterministic SQL placeholder order', () => {
expect(dialect.prepareQuery('select * from orders where id = :id and status = :status', {
status: 'paid',
id: 10,
})).toEqual({
sql: 'select * from orders where id = ? and status = ?',
params: [10, 'paid'],
});
});
}); });
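
Note on the hunk above: the removed `prepareQuery` assertion (now covered by `prepareMysqlReadOnlyQuery` in the connector test) passes `status` before `id` in the params object yet expects `[10, 'paid']`, so values follow the order in which names appear in the SQL. A minimal sketch of that ordering, assuming a hypothetical `toQuestionMarks` helper and a simple regex that does not skip string literals:

```typescript
// Hypothetical helper; not the project's prepareMysqlReadOnlyQuery.
type PreparedPositional = { sql: string; params: unknown[] | undefined };

function toQuestionMarks(sql: string, named?: Record<string, unknown>): PreparedPositional {
  if (!named) return { sql, params: undefined };
  const params: unknown[] = [];
  const rewritten = sql.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (match: string, name: string) => {
    // Names without a supplied value are left untouched.
    if (!Object.prototype.hasOwnProperty.call(named, name)) return match;
    // Values are collected in order of appearance in the SQL, not object key order.
    params.push(named[name]);
    return '?';
  });
  return { sql: rewritten, params };
}
```

A name used twice would push its value twice here, which keeps each `?` paired with exactly one bound value.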

@ -1,7 +1,7 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { createPostgresLiveDatabaseIntrospection } from '../../connectors/postgres/live-database-introspection.js'; import { createPostgresLiveDatabaseIntrospection } from '../../../src/connectors/postgres/live-database-introspection.js';
import { isKtxPostgresConnectionConfig, KtxPostgresScanConnector, postgresPoolConfigFromConfig, type KtxPostgresConnectionConfig, type KtxPostgresPoolFactory } from '../../connectors/postgres/connector.js'; import { isKtxPostgresConnectionConfig, KtxPostgresScanConnector, postgresPoolConfigFromConfig, preparePostgresReadOnlyQuery, type KtxPostgresConnectionConfig, type KtxPostgresPoolFactory } from '../../../src/connectors/postgres/connector.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
interface FakeQueryResult { interface FakeQueryResult {
rows: Record<string, unknown>[]; rows: Record<string, unknown>[];
@ -44,9 +44,9 @@ function metadataResults(): Map<string, FakeQueryResponse> {
'FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n', 'FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n',
{ {
rows: [ rows: [
{ table_name: 'customers', table_kind: 'r', row_count: '2', table_comment: 'Customers' }, { schema_name: 'public', table_name: 'customers', table_kind: 'r', row_count: '2', table_comment: 'Customers' },
{ table_name: 'orders', table_kind: 'r', row_count: '3', table_comment: null }, { schema_name: 'public', table_name: 'orders', table_kind: 'r', row_count: '3', table_comment: null },
{ table_name: 'recent_orders', table_kind: 'v', row_count: '0', table_comment: 'Recent orders' }, { schema_name: 'public', table_name: 'recent_orders', table_kind: 'v', row_count: '0', table_comment: 'Recent orders' },
], ],
}, },
], ],
@ -102,6 +102,28 @@ function metadataResults(): Map<string, FakeQueryResponse> {
} }
describe('KtxPostgresScanConnector', () => { describe('KtxPostgresScanConnector', () => {
it('prepares read-only SQL parameters with PostgreSQL positional placeholders', () => {
expect(
preparePostgresReadOnlyQuery('select * from orders where id = :id and status = :status', {
id: 1,
status: 'paid',
}),
).toEqual({
sql: 'select * from orders where id = $1 and status = $2',
params: [1, 'paid'],
});
expect(
preparePostgresReadOnlyQuery('select :Client_Name_10, :Client_Name_1', {
Client_Name_1: 'short',
Client_Name_10: 'long',
}),
).toEqual({
sql: 'select $2, $1',
params: ['short', 'long'],
});
expect(preparePostgresReadOnlyQuery('select 1')).toEqual({ sql: 'select 1', params: undefined });
});
it('resolves configuration safely', () => { it('resolves configuration safely', () => {
expect(isKtxPostgresConnectionConfig({ driver: 'postgres', url: 'env:DATABASE_URL' })).toBe(true); expect(isKtxPostgresConnectionConfig({ driver: 'postgres', url: 'env:DATABASE_URL' })).toBe(true);
expect(isKtxPostgresConnectionConfig({ driver: 'postgresql', host: 'db', database: 'analytics' })).toBe(false); expect(isKtxPostgresConnectionConfig({ driver: 'postgresql', host: 'db', database: 'analytics' })).toBe(false);
@ -367,6 +389,11 @@ describe('KtxPostgresScanConnector', () => {
}); });
await expect(connector.getTableRowCount({ db: 'public', name: 'orders' })).resolves.toBe(3); await expect(connector.getTableRowCount({ db: 'public', name: 'orders' })).resolves.toBe(3);
await expect(connector.listSchemas()).resolves.toEqual(['public']); await expect(connector.listSchemas()).resolves.toEqual(['public']);
await expect(connector.listTables(['public'])).resolves.toEqual([
{ catalog: null, schema: 'public', name: 'customers', kind: 'table' },
{ catalog: null, schema: 'public', name: 'orders', kind: 'table' },
{ catalog: null, schema: 'public', name: 'recent_orders', kind: 'view' },
]);
await expect(connector.testConnection()).resolves.toEqual({ success: true }); await expect(connector.testConnection()).resolves.toEqual({ success: true });
await expect( await expect(

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxPostgresDialect } from './dialect.js'; import { KtxPostgresDialect } from '../../../src/connectors/postgres/dialect.js';
describe('KtxPostgresDialect', () => { describe('KtxPostgresDialect', () => {
const dialect = new KtxPostgresDialect(); const dialect = new KtxPostgresDialect();
@ -18,7 +18,7 @@ describe('KtxPostgresDialect', () => {
expect(dialect.mapToDimensionType('jsonb')).toBe('string'); expect(dialect.mapToDimensionType('jsonb')).toBe('string');
}); });
it('generates sample, distinct-value, statistics, and time SQL', () => { it('generates sample, distinct-value, and statistics SQL', () => {
expect(dialect.generateSampleQuery('"public"."orders"', 5, ['id', 'status'])).toBe( expect(dialect.generateSampleQuery('"public"."orders"', 5, ['id', 'status'])).toBe(
'SELECT "id", "status" FROM "public"."orders" LIMIT 5', 'SELECT "id", "status" FROM "public"."orders" LIMIT 5',
); );
@ -29,24 +29,6 @@ describe('KtxPostgresDialect', () => {
'SELECT DISTINCT "status"::text AS val', 'SELECT DISTINCT "status"::text AS val',
); );
expect(dialect.generateColumnStatisticsQuery('public', 'orders')).toContain('FROM pg_stats s'); expect(dialect.generateColumnStatisticsQuery('public', 'orders')).toContain('FROM pg_stats s');
expect(dialect.getTimeTruncExpression('"created_at"', 'month')).toBe('DATE_TRUNC(\'month\', "created_at")');
}); });
it('prepares named parameters with PostgreSQL positional parameters', () => {
expect(
dialect.prepareQuery('select * from orders where id = :id and status = :status', { id: 1, status: 'paid' }),
).toEqual({
sql: 'select * from orders where id = $1 and status = $2',
params: [1, 'paid'],
});
expect(
dialect.prepareQuery('select :Client_Name_10, :Client_Name_1', {
Client_Name_1: 'short',
Client_Name_10: 'long',
}),
).toEqual({
sql: 'select $2, $1',
params: ['short', 'long'],
});
});
}); });
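
Note on the hunk above: the `:Client_Name_10` / `:Client_Name_1` case (now asserted against `preparePostgresReadOnlyQuery`) guards against a shorter name matching as a prefix of a longer one. The expected `select $2, $1` with `['short', 'long']` suggests positions follow the key order of the params object. A minimal sketch under that assumption, using a hypothetical `toDollarPositions` helper:

```typescript
// Hypothetical helper; not the project's preparePostgresReadOnlyQuery.
type PreparedDollar = { sql: string; params: unknown[] | undefined };

function toDollarPositions(sql: string, named?: Record<string, unknown>): PreparedDollar {
  if (!named) return { sql, params: undefined };
  const keys = Object.keys(named);
  // Assumption: each name's position is its 1-based index in the params object.
  const position = new Map<string, number>(keys.map((key, i) => [key, i + 1] as const));
  // Matching whole identifiers keeps :Client_Name_10 from being read as :Client_Name_1 plus "0".
  const rewritten = sql.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (match: string, name: string) => {
    const index = position.get(name);
    // Unknown names, including the second colon of a "::text" cast, are left as-is.
    return index === undefined ? match : `$${index}`;
  });
  return { sql: rewritten, params: keys.map((key) => named[key]) };
}
```

Because `$n` placeholders are numbered, a name that appears more than once can reuse the same position, unlike the `?` style.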

@ -1,6 +1,6 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { KtxPostgresHistoricSqlQueryClient } from './historic-sql-query-client.js'; import { KtxPostgresHistoricSqlQueryClient } from '../../../src/connectors/postgres/historic-sql-query-client.js';
import type { KtxPostgresPoolConfig, KtxPostgresPoolFactory } from './connector.js'; import type { KtxPostgresPoolConfig, KtxPostgresPoolFactory } from '../../../src/connectors/postgres/connector.js';
describe('KtxPostgresHistoricSqlQueryClient', () => { describe('KtxPostgresHistoricSqlQueryClient', () => {
it('executes parameterized read-only SQL through the native Postgres connector pool', async () => { it('executes parameterized read-only SQL through the native Postgres connector pool', async () => {

@ -7,9 +7,9 @@ vi.mock('snowflake-sdk', () => ({
createPool, createPool,
})); }));
import { createSnowflakeLiveDatabaseIntrospection } from '../../connectors/snowflake/live-database-introspection.js'; import { createSnowflakeLiveDatabaseIntrospection } from '../../../src/connectors/snowflake/live-database-introspection.js';
import { isKtxSnowflakeConnectionConfig, KtxSnowflakeScanConnector, snowflakeConnectionConfigFromConfig, type KtxSnowflakeConnectionConfig, type KtxSnowflakeDriver, type KtxSnowflakeDriverFactory } from '../../connectors/snowflake/connector.js'; import { isKtxSnowflakeConnectionConfig, KtxSnowflakeScanConnector, prepareSnowflakeReadOnlyQuery, snowflakeConnectionConfigFromConfig, type KtxSnowflakeConnectionConfig, type KtxSnowflakeDriver, type KtxSnowflakeDriverFactory } from '../../../src/connectors/snowflake/connector.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
function fakeDriverFactory(): KtxSnowflakeDriverFactory { function fakeDriverFactory(): KtxSnowflakeDriverFactory {
const driver: KtxSnowflakeDriver = { const driver: KtxSnowflakeDriver = {
@ -64,8 +64,8 @@ function fakeDriverFactory(): KtxSnowflakeDriverFactory {
]), ]),
listSchemas: vi.fn(async () => ['PUBLIC', 'MART']), listSchemas: vi.fn(async () => ['PUBLIC', 'MART']),
listTables: vi.fn(async () => [ listTables: vi.fn(async () => [
{ schema: 'PUBLIC', name: 'ORDERS', kind: 'table' as const }, { catalog: 'ANALYTICS', schema: 'PUBLIC', name: 'ORDERS', kind: 'table' as const },
{ schema: 'PUBLIC', name: 'ORDER_SUMMARY', kind: 'view' as const }, { catalog: 'ANALYTICS', schema: 'PUBLIC', name: 'ORDER_SUMMARY', kind: 'view' as const },
]), ]),
cleanup: vi.fn(async () => undefined), cleanup: vi.fn(async () => undefined),
}; };
@ -105,6 +105,17 @@ function installSnowflakePoolMock() {
} }
describe('KtxSnowflakeScanConnector', () => { describe('KtxSnowflakeScanConnector', () => {
it('prepares read-only SQL parameters with Snowflake bind arrays', () => {
expect(prepareSnowflakeReadOnlyQuery('SELECT * FROM ORDERS WHERE ID = ? AND STATUS = ?', { id: 1, status: 'paid' })).toEqual({
sql: 'SELECT * FROM ORDERS WHERE ID = ? AND STATUS = ?',
params: [1, 'paid'],
});
expect(prepareSnowflakeReadOnlyQuery('SELECT * FROM ORDERS')).toEqual({
sql: 'SELECT * FROM ORDERS',
params: undefined,
});
});
it('resolves Snowflake connection configuration safely', () => { it('resolves Snowflake connection configuration safely', () => {
expect( expect(
isKtxSnowflakeConnectionConfig({ isKtxSnowflakeConnectionConfig({
@ -561,8 +572,8 @@ describe('KtxSnowflakeScanConnector', () => {
}); });
await expect(connector.listTables(['MART', 'PUBLIC'])).resolves.toEqual([ await expect(connector.listTables(['MART', 'PUBLIC'])).resolves.toEqual([
{ schema: 'MART', name: 'ORDERS', kind: 'table' }, { catalog: 'ANALYTICS', schema: 'MART', name: 'ORDERS', kind: 'table' },
{ schema: 'PUBLIC', name: 'ORDER_SUMMARY', kind: 'view' }, { catalog: 'ANALYTICS', schema: 'PUBLIC', name: 'ORDER_SUMMARY', kind: 'view' },
]); ]);
expect(queries).toHaveLength(1); expect(queries).toHaveLength(1);

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxSnowflakeDialect } from './dialect.js'; import { KtxSnowflakeDialect } from '../../../src/connectors/snowflake/dialect.js';
describe('KtxSnowflakeDialect', () => { describe('KtxSnowflakeDialect', () => {
const dialect = new KtxSnowflakeDialect(); const dialect = new KtxSnowflakeDialect();
@ -36,14 +36,6 @@ describe('KtxSnowflakeDialect', () => {
); );
}); });
it('passes Snowflake positional parameters as bind arrays', () => {
expect(dialect.prepareQuery('SELECT * FROM ORDERS WHERE ID = ? AND STATUS = ?', { id: 1, status: 'paid' })).toEqual({
sql: 'SELECT * FROM ORDERS WHERE ID = ? AND STATUS = ?',
params: [1, 'paid'],
});
expect(dialect.prepareQuery('SELECT * FROM ORDERS')).toEqual({ sql: 'SELECT * FROM ORDERS', params: undefined });
});
it('keeps unsupported statistics explicit', () => { it('keeps unsupported statistics explicit', () => {
expect(dialect.generateColumnStatisticsQuery('PUBLIC', 'ORDERS')).toBeNull(); expect(dialect.generateColumnStatisticsQuery('PUBLIC', 'ORDERS')).toBeNull();
}); });

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { assertSafeSnowflakeIdentifier, quoteSnowflakeIdentifier } from './identifiers.js'; import { assertSafeSnowflakeIdentifier, quoteSnowflakeIdentifier } from '../../../src/connectors/snowflake/identifiers.js';
describe('Snowflake identifier guards', () => { describe('Snowflake identifier guards', () => {
it('quotes simple Snowflake identifiers', () => { it('quotes simple Snowflake identifiers', () => {

@ -11,7 +11,7 @@ vi.mock('snowflake-sdk', () => ({
import { import {
configureSnowflakeSdkLogger, configureSnowflakeSdkLogger,
resetSnowflakeSdkLoggerConfigurationForTests, resetSnowflakeSdkLoggerConfigurationForTests,
} from './sdk-logger.js'; } from '../../../src/connectors/snowflake/sdk-logger.js';
describe('configureSnowflakeSdkLogger', () => { describe('configureSnowflakeSdkLogger', () => {
let projectDir: string; let projectDir: string;

@ -4,9 +4,9 @@ import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest'; import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { createSqliteLiveDatabaseIntrospection } from '../../connectors/sqlite/live-database-introspection.js'; import { createSqliteLiveDatabaseIntrospection } from '../../../src/connectors/sqlite/live-database-introspection.js';
import { isKtxSqliteConnectionConfig, KtxSqliteScanConnector, sqliteDatabasePathFromConfig } from '../../connectors/sqlite/connector.js'; import { isKtxSqliteConnectionConfig, KtxSqliteScanConnector, sqliteDatabasePathFromConfig } from '../../../src/connectors/sqlite/connector.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
describe('KtxSqliteScanConnector', () => { describe('KtxSqliteScanConnector', () => {
let tempDir: string; let tempDir: string;
@ -150,6 +150,20 @@ describe('KtxSqliteScanConnector', () => {
]); ]);
}); });
it('lists schemaless tables and views for setup discovery', async () => {
const connector = new KtxSqliteScanConnector({
connectionId: 'warehouse',
connection: { driver: 'sqlite', path: dbPath },
});
await expect(connector.listSchemas()).resolves.toEqual([]);
await expect(connector.listTables(['ignored'])).resolves.toEqual([
{ catalog: null, schema: '', name: 'customers', kind: 'table' },
{ catalog: null, schema: '', name: 'orders', kind: 'table' },
{ catalog: null, schema: '', name: 'recent_orders', kind: 'view' },
]);
});
it('runs samples, distinct values, statistics, and read-only SQL', async () => { it('runs samples, distinct values, statistics, and read-only SQL', async () => {
const connector = new KtxSqliteScanConnector({ const connector = new KtxSqliteScanConnector({
connectionId: 'warehouse', connectionId: 'warehouse',

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxSqliteDialect } from './dialect.js'; import { KtxSqliteDialect } from '../../../src/connectors/sqlite/dialect.js';
describe('KtxSqliteDialect', () => { describe('KtxSqliteDialect', () => {
const dialect = new KtxSqliteDialect(); const dialect = new KtxSqliteDialect();

@ -1,7 +1,7 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { createSqlServerLiveDatabaseIntrospection } from '../../connectors/sqlserver/live-database-introspection.js'; import { createSqlServerLiveDatabaseIntrospection } from '../../../src/connectors/sqlserver/live-database-introspection.js';
import { isKtxSqlServerConnectionConfig, KtxSqlServerScanConnector, sqlServerConnectionPoolConfigFromConfig, type KtxSqlServerConnectionConfig, type KtxSqlServerPoolFactory, type KtxSqlServerQueryResult } from '../../connectors/sqlserver/connector.js'; import { isKtxSqlServerConnectionConfig, KtxSqlServerScanConnector, prepareSqlServerReadOnlyQuery, sqlServerConnectionPoolConfigFromConfig, type KtxSqlServerConnectionConfig, type KtxSqlServerPoolFactory, type KtxSqlServerQueryResult } from '../../../src/connectors/sqlserver/connector.js';
import { tableRefSet } from '../../context/scan/table-ref.js'; import { tableRefSet } from '../../../src/context/scan/table-ref.js';
function recordset<T extends Record<string, unknown>>( function recordset<T extends Record<string, unknown>>(
rows: T[], rows: T[],
@ -21,9 +21,9 @@ function fakePoolFactory(options: { primaryKeyError?: Error; foreignKeyError?: E
if (sql.includes('INFORMATION_SCHEMA.TABLES')) { if (sql.includes('INFORMATION_SCHEMA.TABLES')) {
return result( return result(
[ [
{ table_name: 'customers', table_type: 'BASE TABLE' }, { schema_name: 'dbo', table_name: 'customers', table_type: 'BASE TABLE' },
{ table_name: 'orders', table_type: 'BASE TABLE' }, { schema_name: 'dbo', table_name: 'orders', table_type: 'BASE TABLE' },
{ table_name: 'order_summary', table_type: 'VIEW' }, { schema_name: 'dbo', table_name: 'order_summary', table_type: 'VIEW' },
], ],
['table_name', 'table_type'], ['table_name', 'table_type'],
); );
@ -100,13 +100,13 @@ function fakePoolFactory(options: { primaryKeyError?: Error; foreignKeyError?: E
['table_name', 'row_count'], ['table_name', 'row_count'],
); );
} }
if (sql.includes('SELECT TOP 1 [id], [status] FROM [dbo].[orders]')) { if (sql.includes('SELECT TOP 1 [id], [status] FROM [analytics].[dbo].[orders]')) {
return result([{ id: 10, status: 'paid' }], ['id', 'status']); return result([{ id: 10, status: 'paid' }], ['id', 'status']);
} }
if (sql.includes('SELECT TOP 1 * FROM (select id, status from dbo.orders) AS ktx_query_result')) { if (sql.includes('SELECT TOP 1 * FROM (select id, status from dbo.orders) AS ktx_query_result')) {
return result([{ id: 10, status: 'paid' }], ['id', 'status']); return result([{ id: 10, status: 'paid' }], ['id', 'status']);
} }
if (sql.includes('SELECT TOP 5 [status] FROM [dbo].[orders]')) { if (sql.includes('SELECT TOP 5 [status] FROM [analytics].[dbo].[orders]')) {
return result([{ status: 'paid' }, { status: 'open' }], ['status']); return result([{ status: 'paid' }, { status: 'open' }], ['status']);
} }
if (sql.includes('COUNT(DISTINCT val)')) { if (sql.includes('COUNT(DISTINCT val)')) {
@ -118,6 +118,16 @@ function fakePoolFactory(options: { primaryKeyError?: Error; foreignKeyError?: E
if (sql.includes('SUM(p.rows) AS row_count') && sql.includes('t.name = @tableName')) { if (sql.includes('SUM(p.rows) AS row_count') && sql.includes('t.name = @tableName')) {
return result([{ row_count: 2 }], ['row_count']); return result([{ row_count: 2 }], ['row_count']);
} }
if (sql.includes('FROM sys.objects o')) {
return result(
[
{ schema_name: 'dbo', table_name: 'customers', table_type: 'USER_TABLE' },
{ schema_name: 'dbo', table_name: 'order_summary', table_type: 'VIEW' },
{ schema_name: 'dbo', table_name: 'orders', table_type: 'USER_TABLE' },
],
['schema_name', 'table_name', 'table_type'],
);
}
if (sql.includes('SELECT s.name AS schema_name')) { if (sql.includes('SELECT s.name AS schema_name')) {
return result([{ schema_name: 'dbo' }, { schema_name: 'sales' }], ['schema_name']); return result([{ schema_name: 'dbo' }, { schema_name: 'sales' }], ['schema_name']);
} }
@ -140,6 +150,19 @@ function fakePoolFactory(options: { primaryKeyError?: Error; foreignKeyError?: E
} }
describe('KtxSqlServerScanConnector', () => { describe('KtxSqlServerScanConnector', () => {
it('prepares read-only SQL parameters with SQL Server named placeholders', () => {
expect(
prepareSqlServerReadOnlyQuery('select * from events where id = :id and name = :name', {
id: 10,
name: 'signup',
}),
).toEqual({
sql: 'select * from events where id = @id and name = @name',
params: { id: 10, name: 'signup' },
});
expect(prepareSqlServerReadOnlyQuery('select 1')).toEqual({ sql: 'select 1', params: undefined });
});
it('resolves SQL Server connection configuration safely', () => { it('resolves SQL Server connection configuration safely', () => {
expect( expect(
isKtxSqlServerConnectionConfig({ isKtxSqlServerConnectionConfig({
@ -366,6 +389,11 @@ describe('KtxSqlServerScanConnector', () => {
await expect(connector.getTableRowCount('orders')).resolves.toBe(2); await expect(connector.getTableRowCount('orders')).resolves.toBe(2);
await expect(connector.listSchemas()).resolves.toEqual(['dbo', 'sales']); await expect(connector.listSchemas()).resolves.toEqual(['dbo', 'sales']);
await expect(connector.listTables(['dbo'])).resolves.toEqual([
{ catalog: 'analytics', schema: 'dbo', name: 'customers', kind: 'table' },
{ catalog: 'analytics', schema: 'dbo', name: 'order_summary', kind: 'view' },
{ catalog: 'analytics', schema: 'dbo', name: 'orders', kind: 'table' },
]);
await expect( await expect(
connector.columnStats( connector.columnStats(
{ connectionId: 'warehouse', table: { catalog: 'analytics', db: 'dbo', name: 'orders' }, column: 'status' }, { connectionId: 'warehouse', table: { catalog: 'analytics', db: 'dbo', name: 'orders' }, column: 'status' },

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { KtxSqlServerDialect } from './dialect.js'; import { KtxSqlServerDialect } from '../../../src/connectors/sqlserver/dialect.js';
describe('KtxSqlServerDialect', () => { describe('KtxSqlServerDialect', () => {
const dialect = new KtxSqlServerDialect(); const dialect = new KtxSqlServerDialect();
@ -7,7 +7,9 @@ describe('KtxSqlServerDialect', () => {
it('quotes identifiers and formats schema-qualified table names', () => { it('quotes identifiers and formats schema-qualified table names', () => {
expect(dialect.quoteIdentifier('events')).toBe('[events]'); expect(dialect.quoteIdentifier('events')).toBe('[events]');
expect(dialect.quoteIdentifier('odd]name')).toBe('[odd]]name]'); expect(dialect.quoteIdentifier('odd]name')).toBe('[odd]]name]');
expect(dialect.formatTableName({ catalog: 'warehouse', db: 'dbo', name: 'events' })).toBe('[dbo].[events]'); expect(dialect.formatTableName({ catalog: 'warehouse', db: 'dbo', name: 'events' })).toBe(
'[warehouse].[dbo].[events]',
);
expect(dialect.formatTableName({ catalog: null, db: null, name: 'events' })).toBe('[events]'); expect(dialect.formatTableName({ catalog: null, db: null, name: 'events' })).toBe('[events]');
}); });
@ -20,7 +22,7 @@ describe('KtxSqlServerDialect', () => {
expect(dialect.mapToDimensionType('')).toBe('string'); expect(dialect.mapToDimensionType('')).toBe('string');
}); });
it('builds sampling, distinct-value, pagination, and time SQL', () => { it('builds sampling, distinct-value, and pagination SQL', () => {
expect(dialect.generateSampleQuery('[dbo].[events]', 25, ['id', 'event_name'])).toBe( expect(dialect.generateSampleQuery('[dbo].[events]', 25, ['id', 'event_name'])).toBe(
'SELECT TOP 25 [id], [event_name] FROM [dbo].[events]', 'SELECT TOP 25 [id], [event_name] FROM [dbo].[events]',
); );
@ -28,22 +30,8 @@ describe('KtxSqlServerDialect', () => {
"SELECT TOP 10 [event_name] FROM [dbo].[events] WHERE [event_name] IS NOT NULL AND LTRIM(RTRIM(CAST([event_name] AS NVARCHAR(MAX)))) != ''", "SELECT TOP 10 [event_name] FROM [dbo].[events] WHERE [event_name] IS NOT NULL AND LTRIM(RTRIM(CAST([event_name] AS NVARCHAR(MAX)))) != ''",
); );
expect(dialect.generateDistinctValuesQuery('[dbo].[events]', '[event_name]', 5)).toContain('SELECT TOP 5 val'); expect(dialect.generateDistinctValuesQuery('[dbo].[events]', '[event_name]', 5)).toContain('SELECT TOP 5 val');
expect(dialect.getTopClause(10)).toBe('TOP 10'); expect(dialect.getTopClause(10)).toBe('TOP (10)');
expect(dialect.getLimitOffsetClause(10, 20)).toBe('OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY'); expect(dialect.getLimitOffsetClause(10, 20)).toBe('');
expect(dialect.getTimeTruncExpression('created_at', 'month')).toBe(
'DATEFROMPARTS(YEAR(created_at), MONTH(created_at), 1)',
);
}); });
it('prepares named parameters using SQL Server @ parameters', () => {
expect(
dialect.prepareQuery('select * from events where id = :id and name = :name', {
id: 10,
name: 'signup',
}),
).toEqual({
sql: 'select * from events where id = @id and name = @name',
params: { id: 10, name: 'signup' },
});
});
}); });
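
The updated `formatTableName` expectation above (a catalog-qualified, three-part name) can be illustrated with a minimal sketch. This is not the `KtxSqlServerDialect` implementation, which this diff does not show; `TableRefSketch`, `quoteBracketIdentifier`, and `formatTableNameSketch` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: the real dialect class lives under src/ and may differ.
interface TableRefSketch {
  catalog: string | null;
  db: string | null;
  name: string;
}

// SQL Server escapes a closing bracket inside a delimited identifier by doubling it.
function quoteBracketIdentifier(identifier: string): string {
  return '[' + identifier.replace(/\]/g, ']]') + ']';
}

// Keep only the parts that are present, quote each one, and join with dots.
function formatTableNameSketch(table: TableRefSketch): string {
  const parts: string[] = [];
  if (table.catalog !== null) parts.push(table.catalog);
  if (table.db !== null) parts.push(table.db);
  parts.push(table.name);
  return parts.map(quoteBracketIdentifier).join('.');
}

console.log(formatTableNameSketch({ catalog: 'warehouse', db: 'dbo', name: 'events' }));
console.log(formatTableNameSketch({ catalog: null, db: null, name: 'events' }));
```

Under these assumptions the catalog is emitted whenever it is set, which is the behavior change the test asserts.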

@ -1,6 +1,6 @@
import { buildDefaultKtxProjectConfig, type KtxProjectConfig } from './context/project/config.js'; import { buildDefaultKtxProjectConfig, type KtxProjectConfig } from '../src/context/project/config.js';
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import type { KtxPublicIngestProject, KtxPublicIngestTargetResult } from './public-ingest.js'; import type { KtxPublicIngestProject, KtxPublicIngestTargetResult } from '../src/public-ingest.js';
import { import {
type ContextBuildTargetState, type ContextBuildTargetState,
extractProgressMessage, extractProgressMessage,
@ -11,7 +11,7 @@ import {
renderContextBuildView, renderContextBuildView,
runContextBuild, runContextBuild,
viewStateFromSourceProgress, viewStateFromSourceProgress,
} from './context-build-view.js'; } from '../src/context-build-view.js';
function makeIo(options: { isTTY?: boolean; columns?: number } = {}) { function makeIo(options: { isTTY?: boolean; columns?: number } = {}) {
let stdout = ''; let stdout = '';

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { normalizeBigQueryProjectId, normalizeBigQueryRegion } from './bigquery-identifiers.js'; import { normalizeBigQueryProjectId, normalizeBigQueryRegion } from '../../../src/context/connections/bigquery-identifiers.js';
describe('BigQuery identifier normalization', () => { describe('BigQuery identifier normalization', () => {
it('normalizes project ids and regions for information schema paths', () => { it('normalizes project ids and regions for information schema paths', () => {

@ -0,0 +1,316 @@
import { describe, expect, it } from 'vitest';
import { getDialectForDriver } from '../../../src/context/connections/dialects.js';
import type { KtxConnectionDriver, KtxTableRef } from '../../../src/context/scan/types.js';
interface DialectFixture {
driver: KtxConnectionDriver;
table: KtxTableRef;
quoteInput: string;
quotedIdentifier: string;
formattedTable: string;
display: string;
invalidDisplay: string;
columnDisplayTablePartCount: 1 | 2 | 3;
limitClause: string;
topClause: string;
randomFilter: string;
tableSampleClause: string;
sampleQuery: string;
columnSampleContains: string;
nullCountExpression: string;
distinctCountExpression: string;
textLengthExpression: string;
castToText: string;
sampleValueAggregation: string;
cardinalityContains: string;
randomizedCardinalityContains: string;
distinctValuesContains: string;
statisticsContains: string | null;
dimensionInput: string;
dimensionType: 'time' | 'string' | 'number' | 'boolean';
nativeTypeInput: string;
normalizedType: string;
}
const innerSampleSql = 'SELECT status AS value FROM orders';
const fixtures: DialectFixture[] = [
{
driver: 'postgres',
table: { catalog: null, db: 'public', name: 'orders' },
quoteInput: 'order"items',
quotedIdentifier: '"order""items"',
formattedTable: '"public"."orders"',
display: 'public.orders',
invalidDisplay: 'orders',
columnDisplayTablePartCount: 2,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: 'RANDOM() < 0.25',
tableSampleClause: 'TABLESAMPLE SYSTEM (25)',
sampleQuery: 'SELECT "id", "status" FROM "public"."orders" LIMIT 5',
columnSampleContains: 'TRIM(CAST("status" AS TEXT)) != \'\'',
nullCountExpression: 'COUNT(*) FILTER (WHERE "status" IS NULL)',
distinctCountExpression: 'COUNT(DISTINCT "status")',
textLengthExpression: 'LENGTH(CAST("status" AS TEXT))',
castToText: 'CAST("status" AS TEXT)',
sampleValueAggregation:
'(SELECT STRING_AGG(CAST(value AS TEXT), CHR(31)) FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY RANDOM()',
distinctValuesContains: 'SELECT DISTINCT "status"::text AS val',
statisticsContains: 'FROM pg_stats s',
dimensionInput: 'timestamp with time zone',
dimensionType: 'time',
nativeTypeInput: 'numeric(12,2)',
normalizedType: 'numeric(12,2)',
},
{
driver: 'mysql',
table: { catalog: null, db: 'analytics', name: 'orders' },
quoteInput: 'order`items',
quotedIdentifier: '`order``items`',
formattedTable: '`analytics`.`orders`',
display: 'analytics.orders',
invalidDisplay: 'orders',
columnDisplayTablePartCount: 2,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: 'RAND() < 0.25',
tableSampleClause: '',
sampleQuery: 'SELECT `id`, `status` FROM `analytics`.`orders` LIMIT 5',
columnSampleContains: 'TRIM(CAST(`status` AS CHAR)) != \'\'',
nullCountExpression: 'SUM(CASE WHEN `status` IS NULL THEN 1 ELSE 0 END)',
distinctCountExpression: 'COUNT(DISTINCT `status`)',
textLengthExpression: 'CHAR_LENGTH(CAST(`status` AS CHAR))',
castToText: 'CAST(`status` AS CHAR)',
sampleValueAggregation:
'(SELECT GROUP_CONCAT(CAST(value AS CHAR) SEPARATOR CHAR(31)) FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY RAND()',
distinctValuesContains: 'SELECT DISTINCT CAST(`status` AS CHAR) AS val',
statisticsContains: null,
dimensionInput: 'tinyint(1)',
dimensionType: 'boolean',
nativeTypeInput: 'varchar(255)',
normalizedType: 'varchar(255)',
},
{
driver: 'clickhouse',
table: { catalog: null, db: 'analytics', name: 'events' },
quoteInput: 'order`items',
quotedIdentifier: '`order``items`',
formattedTable: '`analytics`.`events`',
display: 'analytics.events',
invalidDisplay: 'events',
columnDisplayTablePartCount: 2,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: 'rand() / 4294967295.0 < 0.25',
tableSampleClause: '',
sampleQuery: 'SELECT `id`, `status` FROM `analytics`.`events` LIMIT 5',
columnSampleContains: 'trim(toString(`status`)) != \'\'',
nullCountExpression: 'countIf(`status` IS NULL)',
distinctCountExpression: 'COUNT(DISTINCT `status`)',
textLengthExpression: 'length(toString(`status`))',
castToText: 'toString(`status`)',
sampleValueAggregation:
'(SELECT arrayStringConcat(groupArray(toString(value)), \'\\x1F\') FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY rand()',
distinctValuesContains: 'SELECT DISTINCT toString(`status`) AS val',
statisticsContains: null,
dimensionInput: 'Nullable(DateTime64(3))',
dimensionType: 'time',
nativeTypeInput: 'LowCardinality(String)',
normalizedType: 'LowCardinality(String)',
},
{
driver: 'sqlite',
table: { catalog: null, db: null, name: 'orders' },
quoteInput: 'order"items',
quotedIdentifier: '"order""items"',
formattedTable: '"orders"',
display: 'orders',
invalidDisplay: 'public.orders',
columnDisplayTablePartCount: 1,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: '(RANDOM() % 100) < 25',
tableSampleClause: '',
sampleQuery: 'SELECT "id", "status" FROM "orders" LIMIT 5',
columnSampleContains: 'TRIM(CAST("status" AS TEXT)) != \'\'',
nullCountExpression: 'SUM(CASE WHEN "status" IS NULL THEN 1 ELSE 0 END)',
distinctCountExpression: 'COUNT(DISTINCT "status")',
textLengthExpression: 'LENGTH(CAST("status" AS TEXT))',
castToText: 'CAST("status" AS TEXT)',
sampleValueAggregation:
'(SELECT GROUP_CONCAT(CAST(value AS TEXT), char(31)) FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY RANDOM()',
distinctValuesContains: 'SELECT DISTINCT CAST("status" AS TEXT) AS val',
statisticsContains: null,
dimensionInput: 'INTEGER',
dimensionType: 'number',
nativeTypeInput: 'VARCHAR(255)',
normalizedType: 'VARCHAR(255)',
},
{
driver: 'snowflake',
table: { catalog: 'ANALYTICS', db: 'PUBLIC', name: 'ORDERS' },
quoteInput: 'order"items',
quotedIdentifier: '"order""items"',
formattedTable: '"ANALYTICS"."PUBLIC"."ORDERS"',
display: 'ANALYTICS.PUBLIC.ORDERS',
invalidDisplay: 'PUBLIC.ORDERS',
columnDisplayTablePartCount: 3,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: 'UNIFORM(0::FLOAT, 1::FLOAT, RANDOM()) < 0.25',
tableSampleClause: 'SAMPLE (25)',
sampleQuery: 'SELECT "id", "status" FROM "ANALYTICS"."PUBLIC"."ORDERS" SAMPLE ROW (5 ROWS)',
columnSampleContains: 'TRIM(CAST("status" AS STRING)) != \'\'',
nullCountExpression: 'COUNT_IF("status" IS NULL)',
distinctCountExpression: 'APPROX_COUNT_DISTINCT("status")',
textLengthExpression: 'LENGTH(CAST("status" AS TEXT))',
castToText: 'CAST("status" AS VARCHAR)',
sampleValueAggregation:
'(SELECT LISTAGG(CAST(value AS VARCHAR), \'\\x1f\') FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'SAMPLE ROW (100 ROWS)',
distinctValuesContains: 'SELECT DISTINCT "status"::VARCHAR AS val',
statisticsContains: null,
dimensionInput: 'TIMESTAMP_NTZ',
dimensionType: 'time',
nativeTypeInput: 'NUMBER(38,0)',
normalizedType: 'NUMBER(38,0)',
},
{
driver: 'bigquery',
table: { catalog: 'analytics-project', db: 'warehouse', name: 'orders' },
quoteInput: 'order`items',
quotedIdentifier: '`order\\`items`',
formattedTable: '`analytics-project`.`warehouse`.`orders`',
display: 'analytics-project.warehouse.orders',
invalidDisplay: 'warehouse.orders',
columnDisplayTablePartCount: 3,
limitClause: 'LIMIT 25 OFFSET 5',
topClause: '',
randomFilter: 'RAND() < 0.25',
tableSampleClause: 'TABLESAMPLE SYSTEM (25 PERCENT)',
sampleQuery: 'SELECT `id`, `status` FROM `analytics-project`.`warehouse`.`orders` ORDER BY RAND() LIMIT 5',
columnSampleContains: 'TRIM(CAST(`status` AS STRING)) != \'\'',
nullCountExpression: 'COUNTIF(`status` IS NULL)',
distinctCountExpression: 'APPROX_COUNT_DISTINCT(`status`)',
textLengthExpression: 'LENGTH(CAST(`status` AS STRING))',
castToText: 'CAST(`status` AS STRING)',
sampleValueAggregation:
'(SELECT STRING_AGG(CAST(value AS STRING), \'\\u001F\') FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT APPROX_COUNT_DISTINCT(val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY RAND()',
distinctValuesContains: 'SELECT DISTINCT CAST(`status` AS STRING) AS val',
statisticsContains: null,
dimensionInput: 'INT64',
dimensionType: 'number',
nativeTypeInput: 'INT64',
normalizedType: 'BIGINT',
},
{
driver: 'sqlserver',
table: { catalog: 'warehouse', db: 'dbo', name: 'events' },
quoteInput: 'odd]name',
quotedIdentifier: '[odd]]name]',
formattedTable: '[warehouse].[dbo].[events]',
display: 'warehouse.dbo.events',
invalidDisplay: 'dbo.events',
columnDisplayTablePartCount: 3,
limitClause: '',
topClause: 'TOP (25)',
randomFilter: 'ABS(CHECKSUM(NEWID())) % 100 < 25',
tableSampleClause: 'TABLESAMPLE (25 PERCENT)',
sampleQuery: 'SELECT TOP 5 [id], [status] FROM [warehouse].[dbo].[events]',
columnSampleContains: 'LTRIM(RTRIM(CAST([status] AS NVARCHAR(MAX)))) != \'\'',
nullCountExpression: 'SUM(CASE WHEN [status] IS NULL THEN 1 ELSE 0 END)',
distinctCountExpression: 'COUNT(DISTINCT [status])',
textLengthExpression: 'LEN(CAST([status] AS NVARCHAR(MAX)))',
castToText: 'CAST([status] AS NVARCHAR(MAX))',
sampleValueAggregation:
'(SELECT STRING_AGG(CAST(value AS NVARCHAR(MAX)), CHAR(31)) FROM (SELECT status AS value FROM orders) AS relationship_profile_values)',
cardinalityContains: 'SELECT COUNT(DISTINCT val) AS cardinality',
randomizedCardinalityContains: 'ORDER BY NEWID()',
distinctValuesContains: 'SELECT TOP 20 val',
statisticsContains: null,
dimensionInput: 'datetime2',
dimensionType: 'time',
nativeTypeInput: 'uniqueidentifier',
normalizedType: 'uniqueidentifier',
},
];
describe('getDialectForDriver', () => {
it.each(fixtures)('returns a full KtxDialect for $driver', (fixture) => {
const dialect = getDialectForDriver(fixture.driver);
const column = dialect.quoteIdentifier('status');
expect(dialect.type).toBe(fixture.driver);
expect(dialect.quoteIdentifier(fixture.quoteInput)).toBe(fixture.quotedIdentifier);
expect(dialect.formatTableName(fixture.table)).toBe(fixture.formattedTable);
expect(dialect.formatDisplayRef(fixture.table)).toBe(fixture.display);
expect(dialect.parseDisplayRef(fixture.display)).toEqual(fixture.table);
expect(dialect.parseDisplayRef(fixture.invalidDisplay)).toBeNull();
expect(dialect.columnDisplayTablePartCount()).toBe(fixture.columnDisplayTablePartCount);
expect(dialect.getLimitOffsetClause(25, 5)).toBe(fixture.limitClause);
expect(dialect.getTopClause(25)).toBe(fixture.topClause);
expect(dialect.getRandomSampleFilter(0.25)).toBe(fixture.randomFilter);
expect(dialect.getTableSampleClause(0.25)).toBe(fixture.tableSampleClause);
expect(dialect.generateSampleQuery(fixture.formattedTable, 5, ['id', 'status'])).toBe(fixture.sampleQuery);
expect(dialect.generateColumnSampleQuery(fixture.formattedTable, 'status', 10)).toContain(
fixture.columnSampleContains,
);
expect(dialect.getNullCountExpression(column)).toBe(fixture.nullCountExpression);
expect(dialect.getDistinctCountExpression(column)).toBe(fixture.distinctCountExpression);
expect(dialect.textLengthExpression(column)).toBe(fixture.textLengthExpression);
expect(dialect.castToText(column)).toBe(fixture.castToText);
expect(dialect.getSampleValueAggregation(innerSampleSql)).toBe(fixture.sampleValueAggregation);
expect(dialect.generateCardinalitySampleQuery(fixture.formattedTable, column, 100)).toContain(
fixture.cardinalityContains,
);
expect(dialect.generateRandomizedCardinalitySampleQuery(fixture.formattedTable, column, 100)).toContain(
fixture.randomizedCardinalityContains,
);
expect(dialect.generateDistinctValuesQuery(fixture.formattedTable, column, 20)).toContain(
fixture.distinctValuesContains,
);
const statistics = dialect.generateColumnStatisticsQuery(fixture.table.db ?? '', fixture.table.name);
if (fixture.statisticsContains) {
expect(statistics).toContain(fixture.statisticsContains);
} else {
expect(statistics).toBeNull();
}
expect(dialect.mapToDimensionType(fixture.dimensionInput)).toBe(fixture.dimensionType);
expect(dialect.mapDataType(fixture.nativeTypeInput)).toBe(fixture.normalizedType);
});
it('accepts three-part ANSI display refs while keeping one-part names caller-owned', () => {
for (const driver of ['postgres', 'mysql', 'clickhouse'] as const) {
const dialect = getDialectForDriver(driver);
expect(dialect.parseDisplayRef('warehouse.public.orders')).toEqual({
catalog: 'warehouse',
db: 'public',
name: 'orders',
});
expect(dialect.parseDisplayRef('orders')).toBeNull();
}
});
it('throws with a supported-driver list for unknown drivers', () => {
expect(() => getDialectForDriver('oracle')).toThrow(
'Unsupported warehouse driver "oracle". Supported drivers: bigquery, clickhouse, mysql, postgres, sqlite, snowflake, sqlserver',
);
});
it('rejects legacy driver aliases', () => {
expect(() => getDialectForDriver('postgresql')).toThrow('Unsupported warehouse driver "postgresql"');
expect(() => getDialectForDriver('sqlite3')).toThrow('Unsupported warehouse driver "sqlite3"');
});
});
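
The `parseDisplayRef` expectations in this file (two- or three-part refs for postgres, mysql, and clickhouse; one part for sqlite; exactly three parts for snowflake, bigquery, and sqlserver) can be sketched as a part-count check. This is an assumption-laden illustration, not the registry's code: `parseDisplayRefSketch` and its `allowedPartCounts` parameter are hypothetical, and the naive split on `.` ignores quoted identifiers that contain dots.

```typescript
// Hypothetical sketch: not the dialect implementation exercised by the tests above.
interface TableRefSketch {
  catalog: string | null;
  db: string | null;
  name: string;
}

// allowedPartCounts is an illustrative parameter, e.g. [2, 3] for postgres,
// [1] for sqlite, [3] for snowflake.
function parseDisplayRefSketch(
  display: string,
  allowedPartCounts: readonly number[],
): TableRefSketch | null {
  const parts = display.split('.');
  // Reject empty segments such as "public." or ".orders".
  if (parts.some((part) => part.length === 0)) return null;
  if (parts.length > 3 || allowedPartCounts.indexOf(parts.length) === -1) return null;

  const [first, second, third] = parts;
  if (first === undefined) return null;
  if (second === undefined) return { catalog: null, db: null, name: first };
  if (third === undefined) return { catalog: null, db: first, name: second };
  return { catalog: first, db: second, name: third };
}

console.log(parseDisplayRefSketch('public.orders', [2, 3]));
console.log(parseDisplayRefSketch('orders', [2, 3]));
```

Returning `null` for a disallowed part count, rather than guessing a default schema, mirrors the "one-part names stay caller-owned" expectation in the test above.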

@ -0,0 +1,145 @@
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import {
driverRegistrations,
getDriverRegistration,
listSupportedDrivers,
} from '../../../src/context/connections/drivers.js';
import type {
KtxDriverConnectorModule,
KtxScopeConfigKey,
} from '../../../src/context/connections/drivers.js';
import type { KtxConnectionDriver } from '../../../src/context/scan/types.js';
type FixtureFactory = (projectDir: string) => Record<string, unknown>;
const connectionFixtures: Record<KtxConnectionDriver, FixtureFactory> = {
postgres: () => ({
driver: 'postgres',
url: 'postgresql://reader:secret@localhost:5432/analytics', // pragma: allowlist secret
schemas: ['public'],
}),
sqlite: () => ({ driver: 'sqlite', path: 'warehouse.db' }),
mysql: () => ({
driver: 'mysql',
host: 'localhost',
database: 'analytics',
username: 'reader',
password: 'secret', // pragma: allowlist secret
schemas: ['analytics'],
}),
clickhouse: () => ({
driver: 'clickhouse',
url: 'http://localhost:8123',
database: 'analytics',
username: 'reader',
password: 'secret', // pragma: allowlist secret
}),
sqlserver: () => ({
driver: 'sqlserver',
host: 'localhost',
database: 'analytics',
username: 'reader',
password: 'secret', // pragma: allowlist secret
schemas: ['dbo'],
}),
bigquery: () => ({
driver: 'bigquery',
dataset_id: 'analytics',
credentials_json: JSON.stringify({
project_id: 'project-1',
client_email: 'reader@example.test',
private_key: '-----BEGIN PRIVATE KEY-----\nsecret\n-----END PRIVATE KEY-----\n', // pragma: allowlist secret
}),
location: 'US',
}),
snowflake: () => ({
driver: 'snowflake',
account: 'example-account',
username: 'reader',
password: 'secret', // pragma: allowlist secret
warehouse: 'COMPUTE_WH',
database: 'ANALYTICS',
schema: 'PUBLIC',
}),
};
const allowedScopeKeys = new Set(['dataset_ids', 'databases', 'schemas', 'schema_names']);
const historicSqlReaderDrivers = new Set<KtxConnectionDriver>(['postgres', 'bigquery', 'snowflake']);
const localExecutorDrivers = new Set<KtxConnectionDriver>(['postgres', 'sqlite']);
function assertExportedRegistryBoundaryTypes(input: {
scopeConfigKey: KtxScopeConfigKey;
connectorModule: KtxDriverConnectorModule;
}): {
scopeConfigKey: KtxScopeConfigKey;
connectorModule: KtxDriverConnectorModule;
} {
return input;
}
describe('driverRegistrations', () => {
let projectDir: string;
beforeEach(async () => {
projectDir = await mkdtemp(join(tmpdir(), 'ktx-driver-registry-'));
});
afterEach(async () => {
await rm(projectDir, { recursive: true, force: true });
});
it('lists every supported warehouse driver', () => {
const registryDrivers = Object.keys(driverRegistrations).sort();
expect(listSupportedDrivers()).toEqual(registryDrivers);
expect(listSupportedDrivers()).toEqual([
'bigquery',
'clickhouse',
'mysql',
'postgres',
'snowflake',
'sqlite',
'sqlserver',
]);
});
it('resolves registered drivers case-insensitively', () => {
expect(getDriverRegistration(' Postgres ')?.driver).toBe('postgres');
expect(getDriverRegistration('unknown')).toBeUndefined();
});
it.each(Object.values(driverRegistrations))('adapts $driver connector exports', async (registration) => {
const connectorModule = await registration.load();
const connection = connectionFixtures[registration.driver](projectDir);
const exportedBoundary = assertExportedRegistryBoundaryTypes({
scopeConfigKey: registration.scopeConfigKey ?? 'schemas',
connectorModule,
});
expect(exportedBoundary.connectorModule.createScanConnector).toEqual(expect.any(Function));
expect(connectorModule.isConnectionConfig(connection)).toBe(true);
expect(connectorModule.isConnectionConfig({})).toBe(false);
const connector = connectorModule.createScanConnector({
connectionId: 'warehouse',
connection,
projectDir,
});
expect(connector.driver).toBe(registration.driver);
expect(connector.listSchemas).toEqual(expect.any(Function));
expect(connector.listTables).toEqual(expect.any(Function));
await connector.cleanup?.();
if (registration.driver === 'sqlite') {
expect(registration.scopeConfigKey).toBeNull();
} else {
expect(registration.scopeConfigKey).not.toBeNull();
expect(allowedScopeKeys.has(registration.scopeConfigKey ?? '')).toBe(true);
}
expect(registration.hasHistoricSqlReader).toBe(historicSqlReaderDrivers.has(registration.driver));
expect(registration.hasLocalQueryExecutor).toBe(localExecutorDrivers.has(registration.driver));
});
});
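
The lookup behavior asserted above (trimmed, case-insensitive resolution, `undefined` for unknown drivers, and a sorted driver list) can be sketched as follows. The registry shape here is an assumption: `registrySketch`, `getDriverRegistrationSketch`, and `listSupportedDriversSketch` are hypothetical names, and only the capability flags that the test sets spell out are reproduced.

```typescript
// Hypothetical sketch: the real registry in src/context/connections/drivers.js may differ.
type DriverSketch =
  | 'bigquery' | 'clickhouse' | 'mysql' | 'postgres' | 'snowflake' | 'sqlite' | 'sqlserver';

interface DriverRegistrationSketch {
  driver: DriverSketch;
  hasHistoricSqlReader: boolean;
  hasLocalQueryExecutor: boolean;
}

// Capability flags follow the sets used in the test above.
const registrySketch: Record<DriverSketch, DriverRegistrationSketch> = {
  bigquery: { driver: 'bigquery', hasHistoricSqlReader: true, hasLocalQueryExecutor: false },
  clickhouse: { driver: 'clickhouse', hasHistoricSqlReader: false, hasLocalQueryExecutor: false },
  mysql: { driver: 'mysql', hasHistoricSqlReader: false, hasLocalQueryExecutor: false },
  postgres: { driver: 'postgres', hasHistoricSqlReader: true, hasLocalQueryExecutor: true },
  snowflake: { driver: 'snowflake', hasHistoricSqlReader: true, hasLocalQueryExecutor: false },
  sqlite: { driver: 'sqlite', hasHistoricSqlReader: false, hasLocalQueryExecutor: true },
  sqlserver: { driver: 'sqlserver', hasHistoricSqlReader: false, hasLocalQueryExecutor: false },
};

// Normalize user input before the lookup so " Postgres " resolves to postgres.
function getDriverRegistrationSketch(input: string): DriverRegistrationSketch | undefined {
  const key = input.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(registrySketch, key)
    ? registrySketch[key as DriverSketch]
    : undefined;
}

// Sort so callers get a stable list regardless of declaration order.
function listSupportedDriversSketch(): string[] {
  return Object.keys(registrySketch).sort();
}

console.log(listSupportedDriversSketch().join(', '));
```

The `hasOwnProperty` guard keeps inherited keys such as `toString` from resolving as drivers.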

@ -1,5 +1,5 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { createDefaultLocalQueryExecutor } from './local-query-executor.js'; import { createDefaultLocalQueryExecutor } from '../../../src/context/connections/local-query-executor.js';
describe('createDefaultLocalQueryExecutor', () => { describe('createDefaultLocalQueryExecutor', () => {
it('dispatches postgres and sqlite drivers to their executors', async () => { it('dispatches postgres and sqlite drivers to their executors', async () => {

@ -3,7 +3,7 @@ import {
localConnectionInfoFromConfig, localConnectionInfoFromConfig,
localConnectionToWarehouseDescriptor, localConnectionToWarehouseDescriptor,
localConnectionTypeForConfig, localConnectionTypeForConfig,
} from './local-warehouse-descriptor.js'; } from '../../../src/context/connections/local-warehouse-descriptor.js';
describe('localConnectionToWarehouseDescriptor', () => { describe('localConnectionToWarehouseDescriptor', () => {
it('maps local Postgres URLs to canonical warehouse descriptors', () => { it('maps local Postgres URLs to canonical warehouse descriptors', () => {

@ -7,7 +7,7 @@ import {
parseNotionConnectionConfig, parseNotionConnectionConfig,
redactNotionConnectionConfig, redactNotionConnectionConfig,
resolveNotionAuthToken, resolveNotionAuthToken,
} from './notion-config.js'; } from '../../../src/context/connections/notion-config.js';
describe('standalone Notion connection config', () => { describe('standalone Notion connection config', () => {
let tempDir: string; let tempDir: string;

@ -1,5 +1,5 @@
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { createPostgresQueryExecutor } from './postgres-query-executor.js'; import { createPostgresQueryExecutor } from '../../../src/context/connections/postgres-query-executor.js';
function makeClient() { function makeClient() {
const calls: unknown[] = []; const calls: unknown[] = [];

@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { assertReadOnlySql, limitSqlForExecution } from './read-only-sql.js'; import { assertReadOnlySql, limitSqlForExecution } from '../../../src/context/connections/read-only-sql.js';
describe('assertReadOnlySql', () => { describe('assertReadOnlySql', () => {
it('allows select and with queries', () => { it('allows select and with queries', () => {

@ -4,7 +4,7 @@ import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import Database from 'better-sqlite3'; import Database from 'better-sqlite3';
import { afterEach, beforeEach, describe, expect, it } from 'vitest'; import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { createSqliteQueryExecutor, sqliteDatabasePathFromConnection } from './sqlite-query-executor.js'; import { createSqliteQueryExecutor, sqliteDatabasePathFromConnection } from '../../../src/context/connections/sqlite-query-executor.js';
describe('createSqliteQueryExecutor', () => { describe('createSqliteQueryExecutor', () => {
let tempDir: string; let tempDir: string;

@ -2,7 +2,7 @@ import { mkdir, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { resolveKtxConfigReference, resolveKtxHomePath } from './config-reference.js'; import { resolveKtxConfigReference, resolveKtxHomePath } from '../../../src/context/core/config-reference.js';
describe('KTX config references', () => { describe('KTX config references', () => {
it('resolves env references without returning empty values', () => { it('resolves env references without returning empty values', () => {

@ -3,9 +3,9 @@ import { mkdir, mkdtemp, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import type { SimpleGit } from 'simple-git'; import type { SimpleGit } from 'simple-git';
import type { KtxCoreConfig } from './config.js'; import type { KtxCoreConfig } from '../../../src/context/core/config.js';
import { createSimpleGit } from './git-env.js'; import { createSimpleGit } from '../../../src/context/core/git-env.js';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
describe('GitService.assertWorktreeClean', () => { describe('GitService.assertWorktreeClean', () => {
let workdir: string; let workdir: string;

@ -3,9 +3,9 @@ import { mkdir, mkdtemp, readdir, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import type { SimpleGit } from 'simple-git'; import type { SimpleGit } from 'simple-git';
import type { KtxCoreConfig } from './config.js'; import type { KtxCoreConfig } from '../../../src/context/core/config.js';
import { createSimpleGit } from './git-env.js'; import { createSimpleGit } from '../../../src/context/core/git-env.js';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
describe('GitService.deleteDirectories', () => { describe('GitService.deleteDirectories', () => {
let workdir: string; let workdir: string;

@ -2,7 +2,7 @@ import { mkdir, mkdtemp, readFile, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
async function makeGit() { async function makeGit() {
const homeDir = await mkdtemp(join(tmpdir(), 'ktx-git-patch-')); const homeDir = await mkdtemp(join(tmpdir(), 'ktx-git-patch-'));

@ -3,9 +3,9 @@ import { mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import type { SimpleGit } from 'simple-git'; import type { SimpleGit } from 'simple-git';
import type { KtxCoreConfig } from './config.js'; import type { KtxCoreConfig } from '../../../src/context/core/config.js';
import { createSimpleGit } from './git-env.js'; import { createSimpleGit } from '../../../src/context/core/git-env.js';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
describe('GitService.resetHardTo', () => { describe('GitService.resetHardTo', () => {
let workdir: string; let workdir: string;

@ -2,8 +2,8 @@ import { mkdtemp, readFile, realpath, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest'; import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import type { KtxCoreConfig } from './config.js'; import type { KtxCoreConfig } from '../../../src/context/core/config.js';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
// These tests drive a real git repo inside a temp directory — simple-git shells out to the // These tests drive a real git repo inside a temp directory — simple-git shells out to the
// system `git` binary. They are fast enough to run as unit tests and catch real issues that // system `git` binary. They are fast enough to run as unit tests and catch real issues that

@ -2,9 +2,9 @@ import { mkdtemp, realpath, rm, stat } from 'node:fs/promises';
import { tmpdir } from 'node:os'; import { tmpdir } from 'node:os';
import { join } from 'node:path'; import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import type { KtxCoreConfig } from './config.js'; import type { KtxCoreConfig } from '../../../src/context/core/config.js';
import { GitService } from './git.service.js'; import { GitService } from '../../../src/context/core/git.service.js';
import { SessionWorktreeService, type WorktreeConfigPort } from './session-worktree.service.js'; import { SessionWorktreeService, type WorktreeConfigPort } from '../../../src/context/core/session-worktree.service.js';
interface TestWorktreeConfig extends WorktreeConfigPort<TestWorktreeConfig> { interface TestWorktreeConfig extends WorktreeConfigPort<TestWorktreeConfig> {
workdir?: string; workdir?: string;

@ -1,7 +1,7 @@
import { once } from 'node:events'; import { once } from 'node:events';
import { createServer } from 'node:http'; import { createServer } from 'node:http';
import { describe, expect, it, vi } from 'vitest'; import { describe, expect, it, vi } from 'vitest';
import { createHttpSemanticLayerComputePort, createPythonSemanticLayerComputePort } from './semantic-layer-compute.js'; import { createHttpSemanticLayerComputePort, createPythonSemanticLayerComputePort } from '../../../src/context/daemon/semantic-layer-compute.js';
const source = { const source = {
name: 'orders', name: 'orders',

@@ -2,10 +2,10 @@ import { mkdir, mkdtemp, rm, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { afterEach, beforeEach, describe, expect, it } from 'vitest';
-import type { KtxEmbeddingPort } from '../../context/core/embedding.js';
-import { initKtxProject, loadKtxProject, type KtxLocalProject } from '../../context/project/project.js';
-import { SqliteKnowledgeIndex } from '../wiki/sqlite-knowledge-index.js';
-import { reindexLocalIndexes } from './reindex.js';
+import type { KtxEmbeddingPort } from '../../../src/context/core/embedding.js';
+import { initKtxProject, loadKtxProject, type KtxLocalProject } from '../../../src/context/project/project.js';
+import { SqliteKnowledgeIndex } from '../../../src/context/wiki/sqlite-knowledge-index.js';
+import { reindexLocalIndexes } from '../../../src/context/index-sync/reindex.js';
 class FakeEmbeddingPort implements KtxEmbeddingPort {
 readonly maxBatchSize = 8;


@@ -1,5 +1,5 @@
 import { describe, expect, it } from 'vitest';
-import { actionTargetConnectionId, memoryActionIdentity } from './action-identity.js';
+import { actionTargetConnectionId, memoryActionIdentity } from '../../../src/context/ingest/action-identity.js';
 describe('memory action target identity', () => {
 it('keys SL actions by target connection and wiki actions by run connection', () => {


@@ -1,5 +1,5 @@
 import { describe, expect, it } from 'vitest';
-import { parseDbtSchemaFile, parseDbtSchemaFiles } from './parse-schema.js';
+import { parseDbtSchemaFile, parseDbtSchemaFiles } from '../../../../../src/context/ingest/adapters/dbt-descriptions/parse-schema.js';
 describe('dbt descriptions schema parser', () => {
 it('resolves shared dbt vars and defaults before parsing schema YAML', () => {


@@ -1,5 +1,5 @@
 import { describe, expect, it } from 'vitest';
-import { chunkDbtProject } from './chunk.js';
+import { chunkDbtProject } from '../../../../../src/context/ingest/adapters/dbt/chunk.js';
 describe('chunkDbtProject', () => {
 const diffSet = (modified: string[]) => ({ added: [], modified, deleted: [], unchanged: [] });


@@ -2,8 +2,8 @@ import { mkdir, mkdtemp, rm, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { afterEach, beforeEach, describe, expect, it } from 'vitest';
-import type { SourceAdapter } from '../../types.js';
-import { DbtSourceAdapter } from './dbt.adapter.js';
+import type { SourceAdapter } from '../../../../../src/context/ingest/types.js';
+import { DbtSourceAdapter } from '../../../../../src/context/ingest/adapters/dbt/dbt.adapter.js';
 describe('DbtSourceAdapter', () => {
 let stagedDir: string;


@@ -2,7 +2,7 @@ import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
-import { fetchDbtRepo } from './fetch.js';
+import { fetchDbtRepo } from '../../../../../src/context/ingest/adapters/dbt/fetch.js';
 describe('fetchDbtRepo', () => {
 let tempDir: string;


@@ -1,5 +1,5 @@
 import { describe, expect, it } from 'vitest';
-import { normalizeDbtPath } from './parse.js';
+import { normalizeDbtPath } from '../../../../../src/context/ingest/adapters/dbt/parse.js';
 describe('normalizeDbtPath', () => {
 it('normalizes Windows separators to POSIX separators', () => {

Some files were not shown because too many files have changed in this diff.