ktx/packages/cli/src/sql.ts

229 lines
7.6 KiB
TypeScript
Raw Permalink Normal View History

chore(workspace): gate dead-code with knip production mode (#196) * refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm * refactor(workspace): rewrite @ktx/llm imports to relative paths * refactor(workspace): fold internal packages into cli * chore(workspace): gate dead-code with knip production mode Turn on production-mode knip plus an autofix run in pre-commit and the `pnpm dead-code` script, document the `/** @internal */` convention for test-only exports in AGENTS.md, annotate test-only exports across the CLI with that JSDoc, and drop dead exports/wrappers the new gate surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`, `createLocalScanEnrichmentProvidersFromConfig`, `PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports). Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit production entries so cross-package barrel leaks are caught. * refactor(cli): delete internal barrel index.ts files The 34 `index.ts` re-export barrels inside `packages/cli/src/` were holdovers from the pre-fold multi-workspace structure. Post-fold-in they served no production purpose: external consumers go through the single package main entry, and in-repo callers mostly imported through them only because the path was short. Internally, knip flagged most barrel re-exports as production-dead (only reached via tests). This change: - Deletes every internal barrel except `packages/cli/src/index.ts` (the published package entry). - Rewrites ~270 source/test files to import each name directly from the file that defines it. - Moves `tools/warehouse-verification/index.ts` to `create-warehouse-verification-tools.ts` (the function it defined locally) and updates its single consumer. - Renames `search/backend-conformance.ts` → `.test-utils.ts` to match the existing test-helper file convention. - Deletes 13 dead test-only chains (dbt-descriptions/*, live-database/extracted-schema, live-database/structural-sync, relationship-* feedback/review chain) plus their tests and a cascading orphan integration test. - Updates test mocks that pointed at deleted barrel paths (notion-client, connector barrels in scan/local-scan-connectors tests) to mock the source files instead. - Points the maintainer benchmark script (`scripts/relationship-benchmark-report.mjs`) at source files instead of `dist/context/scan/index.js`. - Drops the barrel `!` entries from `knip.json`; adds explicit production entries only for the benchmark code reached via dist by the maintainer script. Net: 413 files changed, ~1.2k insertions, ~9.4k deletions. `pnpm run dead-code` (Biome + knip default + knip production) and `pnpm run type-check` are clean; 2277 tests pass. * refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly Promote the CLI workspace package to the public name `@kaelio/ktx` and drop the separate `scripts/build-public-npm-package.mjs` wrapper. The CLI package is now publishable in place (`publishConfig.access: public`, `provenance: true`), so artifact packing uses `pnpm pack` against `packages/cli/` instead of assembling a parallel package tree. Updates all workspace filter invocations, docs, tests, and release readiness checks to reference the new package name, and folds the tarball-name helper into `scripts/public-npm-release-metadata.mjs`. * docs: align "agent clients" and "data agents" terminology Replace "client agents" with "agent clients" and "database agents" with "data agents" across AGENTS.md, README.md, the docs-site copy, and the matching setup-agents test description, matching the canonical vocabulary in docs/terminology.md. Also moves packages/cli/tsconfig.json's tsBuildInfoFile from node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive node_modules reinstalls. * refactor(release): single source of truth for package version Make packages/cli/package.json the single source of truth for the @kaelio/ktx version. publicNpmPackageVersion() now reads it directly, so artifact filenames, release-readiness checks, and the Python wheel version all derive from one field. The duplicate release-policy.json.publicNpmPackageVersion is removed. Previously the two fields could drift: tarballs were named kaelio-ktx-0.4.1.tgz while internally containing @kaelio/ktx@0.0.0-private. - update-public-release-version.mjs rewrites both Python pyproject.toml files (ktx-daemon, ktx-sl) alongside the npm package.jsons, normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2). - semantic-release-config.cjs adds the two pyproject.toml files to @semantic-release/git assets so the release commit back to main carries every version source in lockstep. - The six "?? '0.0.0-private'" fallback literals across the CLI are replaced with "?? getKtxCliPackageInfo().version", and createDefaultKtxMcpServer makes its version arg required. - docs/release.md describes the actual commit-back model: the dev tree always reflects the most recent release; no sentinel pin to maintain. Verified: pnpm run artifacts:build now produces kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with @kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and 2287 vitests + 173 script tests pass. * refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and scan command entrypoints so tests can stub them, and teach resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime feature when ktx.yaml selects sentence-transformers. * chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal Both symbols are consumed only by status-project.test.ts. Annotating with /** @internal */ keeps knip's production-mode check clean without changing runtime behavior. * fix(cli): use real package metadata in print-command-tree The stubbed package name embedded a forbidden product identifier that tripped the boundary check in CI. Read the metadata from package.json instead — keeps the rendered tree unchanged and removes a duplicate source of truth. * feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer source counts, computed with `SUM(embedding_json IS NOT NULL)` over `knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to "Wiki" (canonical per `docs/terminology.md`) and rename the matching `localStats.knowledgePages` field to `localStats.wikiPages`. Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those duplicated the per-surface rows above. Disk now reports only actual byte usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` / `semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry` helpers, and the `filter` arg on `summarizeDir` are removed.
2026-05-21 15:28:58 +02:00
import { loadKtxProject, type KtxLocalProject } from './context/project/project.js';
import type { KtxQueryResult, KtxScanConnector } from './context/scan/types.js';
import type { SqlAnalysisDialect, SqlAnalysisPort } from './context/sql-analysis/ports.js';
import type { KtxCliIo } from './cli-runtime.js';
fix: surface silent failures and drop unused dead-code paths (#193) Address overengineering audit findings across cli/context/connector packages: - F1 Snowflake `query`: drop bare catch that flattened all errors to empty result - F2 memory-agent: treat LLM `stopReason === 'error'` as crash (skip squash-merge) - F3 WikiSearchTool: description honest about token-only fallback vs sqlite-fts5 hybrid - F5 Scan enrichment provider resolution: return discriminated status and surface distinct `llm_unavailable` / `embedding_unavailable` warnings per failure mode - F6 Relationship validation budget: drop dead `tableCount === undefined → 'all'` branch; update tests to pass `tableCount` like production - F8 `ktx sql`: use canonical `resolveOutputMode` (now honors KTX_OUTPUT/CI/TTY) - F9 MCP stdio server: default `protocolIo.stderr` to `process.stderr` so memory_ingest startup failures are visible - F13/F14 Scan/setup JSON readers: distinguish ENOENT from corruption instead of silently treating both as missing - F15 `createKtxCliScanConnector`: throw config-shape error when driver matches but type guard rejects, instead of "no native connector" - F16 ContextEvidenceSearchTool: surface `embedding_unhealthy:<reason>` instead of silently dropping the semantic lane - F17 PromptService: default partials to `[]` (removes stale `clinical_policy` reference from a prior product) - F20 `contextBuildCommands`: drop unused `runId` parameter Dead-code removal: - F4 Delete `AgentRunnerService` (duplicated `RuntimeAgentRunner`, only test-used); migrate tests to exercise `AiSdkKtxLlmRuntime.runAgentLoop` directly - F7 Delete `KtxScanOrchestrator` and its test (no production callers; the inline pipeline in `runLocalScan` is the single source of truth) - F18 Delete `generateKtxText`/`generateKtxObject` pass-through helpers; inline the single `runtime.generateObject` call at its caller Plus a clarifying comment on the SQLite `resolveStringReference` `file:` carve-out (load-bearing for SQLite URI form, not a bug).
2026-05-21 02:38:18 +02:00
import { type KtxOutputMode, resolveOutputMode } from './io/mode.js';
import { createKtxCliScanConnector } from './local-scan-connectors.js';
import { createManagedDaemonSqlAnalysisPort } from './managed-python-http.js';
import { profileMark } from './startup-profile.js';
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
import { isDemoConnection } from './telemetry/demo-detect.js';
import { emitTelemetryEvent } from './telemetry/index.js';
import { scrubErrorClass } from './telemetry/scrubber.js';
profileMark('module:sql');
fix: surface silent failures and drop unused dead-code paths (#193) Address overengineering audit findings across cli/context/connector packages: - F1 Snowflake `query`: drop bare catch that flattened all errors to empty result - F2 memory-agent: treat LLM `stopReason === 'error'` as crash (skip squash-merge) - F3 WikiSearchTool: description honest about token-only fallback vs sqlite-fts5 hybrid - F5 Scan enrichment provider resolution: return discriminated status and surface distinct `llm_unavailable` / `embedding_unavailable` warnings per failure mode - F6 Relationship validation budget: drop dead `tableCount === undefined → 'all'` branch; update tests to pass `tableCount` like production - F8 `ktx sql`: use canonical `resolveOutputMode` (now honors KTX_OUTPUT/CI/TTY) - F9 MCP stdio server: default `protocolIo.stderr` to `process.stderr` so memory_ingest startup failures are visible - F13/F14 Scan/setup JSON readers: distinguish ENOENT from corruption instead of silently treating both as missing - F15 `createKtxCliScanConnector`: throw config-shape error when driver matches but type guard rejects, instead of "no native connector" - F16 ContextEvidenceSearchTool: surface `embedding_unhealthy:<reason>` instead of silently dropping the semantic lane - F17 PromptService: default partials to `[]` (removes stale `clinical_policy` reference from a prior product) - F20 `contextBuildCommands`: drop unused `runId` parameter Dead-code removal: - F4 Delete `AgentRunnerService` (duplicated `RuntimeAgentRunner`, only test-used); migrate tests to exercise `AiSdkKtxLlmRuntime.runAgentLoop` directly - F7 Delete `KtxScanOrchestrator` and its test (no production callers; the inline pipeline in `runLocalScan` is the single source of truth) - F18 Delete `generateKtxText`/`generateKtxObject` pass-through helpers; inline the single `runtime.generateObject` call at its caller Plus a clarifying comment on the SQLite `resolveStringReference` `file:` carve-out (load-bearing for SQLite URI form, not a bug).
2026-05-21 02:38:18 +02:00
type KtxSqlOutputMode = KtxOutputMode;
export type KtxSqlArgs = {
command: 'execute';
projectDir: string;
connectionId: string;
sql: string;
maxRows: number;
output?: KtxSqlOutputMode;
json?: boolean;
cliVersion: string;
};
export interface KtxSqlDeps {
loadProject?: typeof loadKtxProject;
createSqlAnalysis?: () => SqlAnalysisPort;
createScanConnector?: typeof createKtxCliScanConnector;
}
interface SqlExecutionOutput {
connectionId: string;
headers: string[];
headerTypes?: string[];
rows: unknown[][];
rowCount: number;
}
function sqlAnalysisDialectForDriver(driver: string | undefined): SqlAnalysisDialect {
const normalized = String(driver ?? '').trim().toLowerCase();
const map: Record<string, SqlAnalysisDialect> = {
postgres: 'postgres',
postgresql: 'postgres',
bigquery: 'bigquery',
snowflake: 'snowflake',
mysql: 'mysql',
sqlserver: 'tsql',
mssql: 'tsql',
sqlite: 'sqlite',
sqlite3: 'sqlite',
clickhouse: 'clickhouse',
redshift: 'redshift',
};
return map[normalized] ?? 'postgres';
}
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
function queryVerb(sql: string): 'select' | 'explain' | 'show' | 'with' | 'other' {
const first = sql.trim().split(/\s+/, 1)[0]?.toLowerCase();
if (first === 'select' || first === 'explain' || first === 'show' || first === 'with') {
return first;
}
return 'other';
}
async function safeReferencedTableCount(
port: SqlAnalysisPort,
sql: string,
dialect: SqlAnalysisDialect,
): Promise<number> {
try {
const results = await port.analyzeBatch([{ id: 'cli-sql', sql }], dialect);
return results.get('cli-sql')?.tablesTouched.length ?? 0;
} catch {
return 0;
}
}
function formatValue(value: unknown): string {
if (value === null || value === undefined) return '';
if (typeof value === 'string') return value;
if (typeof value === 'number' || typeof value === 'boolean' || typeof value === 'bigint') return String(value);
return JSON.stringify(value);
}
function printJson(output: SqlExecutionOutput, io: KtxCliIo): void {
io.stdout.write(`${JSON.stringify(output, null, 2)}\n`);
}
function printPlain(output: SqlExecutionOutput, io: KtxCliIo): void {
io.stdout.write(`${output.headers.join('\t')}\n`);
for (const row of output.rows) {
io.stdout.write(`${row.map(formatValue).join('\t')}\n`);
}
}
function printPretty(output: SqlExecutionOutput, io: KtxCliIo): void {
const rows = output.rows.map((row) => row.map(formatValue));
const widths = output.headers.map((header, index) =>
Math.max(header.length, ...rows.map((row) => row[index]?.length ?? 0)),
);
const renderRow = (cells: string[]): string =>
cells.map((cell, index) => cell.padEnd(widths[index] ?? cell.length)).join(' ').trimEnd();
if (output.headers.length > 0) {
io.stdout.write(`${renderRow(output.headers)}\n`);
io.stdout.write(`${renderRow(widths.map((width) => '-'.repeat(width)))}\n`);
}
for (const row of rows) {
io.stdout.write(`${renderRow(row)}\n`);
}
io.stdout.write(`\n${output.rowCount} ${output.rowCount === 1 ? 'row' : 'rows'}\n`);
}
function printSqlResult(output: SqlExecutionOutput, mode: KtxSqlOutputMode, io: KtxCliIo): void {
if (mode === 'json') {
printJson(output, io);
return;
}
if (mode === 'plain') {
printPlain(output, io);
return;
}
printPretty(output, io);
}
async function cleanupConnector(connector: KtxScanConnector | null): Promise<void> {
if (connector?.cleanup) {
await connector.cleanup();
}
}
function resultOutput(connectionId: string, result: KtxQueryResult): SqlExecutionOutput {
return {
connectionId,
headers: result.headers,
...(result.headerTypes ? { headerTypes: result.headerTypes } : {}),
rows: result.rows,
rowCount: result.rowCount ?? result.rows.length,
};
}
export async function runKtxSql(args: KtxSqlArgs, io: KtxCliIo = process, deps: KtxSqlDeps = {}): Promise<number> {
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
const startedAt = performance.now();
let driver = 'unknown';
let demoConnection = false;
try {
const project = await (deps.loadProject ?? loadKtxProject)({ projectDir: args.projectDir });
const connection = project.config.connections[args.connectionId];
if (!connection) {
throw new Error(`Connection "${args.connectionId}" is not configured in ktx.yaml`);
}
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
driver = String(connection.driver ?? 'unknown').toLowerCase();
demoConnection = isDemoConnection(args.connectionId, connection);
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
const createSqlAnalysis =
deps.createSqlAnalysis ??
(() =>
createManagedDaemonSqlAnalysisPort({
cliVersion: args.cliVersion,
projectDir: args.projectDir,
installPolicy: 'auto',
io,
}));
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
const analysisPort = createSqlAnalysis();
const dialect = sqlAnalysisDialectForDriver(connection.driver);
const validation = await analysisPort.validateReadOnly(args.sql, dialect);
if (!validation.ok) {
throw new Error(validation.error ?? 'SQL is not read-only.');
}
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
const referencedTableCount = await safeReferencedTableCount(analysisPort, args.sql, dialect);
const createScanConnector = deps.createScanConnector ?? createKtxCliScanConnector;
let connector: KtxScanConnector | null = null;
try {
connector = await createScanConnector(project as KtxLocalProject, args.connectionId);
if (!connector.capabilities.readOnlySql || !connector.executeReadOnly) {
throw new Error(`Connection "${args.connectionId}" does not support read-only SQL execution.`);
}
const result = await connector.executeReadOnly(
{
connectionId: args.connectionId,
sql: args.sql,
maxRows: args.maxRows,
},
{ runId: 'cli-sql' },
);
fix: surface silent failures and drop unused dead-code paths (#193) Address overengineering audit findings across cli/context/connector packages: - F1 Snowflake `query`: drop bare catch that flattened all errors to empty result - F2 memory-agent: treat LLM `stopReason === 'error'` as crash (skip squash-merge) - F3 WikiSearchTool: description honest about token-only fallback vs sqlite-fts5 hybrid - F5 Scan enrichment provider resolution: return discriminated status and surface distinct `llm_unavailable` / `embedding_unavailable` warnings per failure mode - F6 Relationship validation budget: drop dead `tableCount === undefined → 'all'` branch; update tests to pass `tableCount` like production - F8 `ktx sql`: use canonical `resolveOutputMode` (now honors KTX_OUTPUT/CI/TTY) - F9 MCP stdio server: default `protocolIo.stderr` to `process.stderr` so memory_ingest startup failures are visible - F13/F14 Scan/setup JSON readers: distinguish ENOENT from corruption instead of silently treating both as missing - F15 `createKtxCliScanConnector`: throw config-shape error when driver matches but type guard rejects, instead of "no native connector" - F16 ContextEvidenceSearchTool: surface `embedding_unhealthy:<reason>` instead of silently dropping the semantic lane - F17 PromptService: default partials to `[]` (removes stale `clinical_policy` reference from a prior product) - F20 `contextBuildCommands`: drop unused `runId` parameter Dead-code removal: - F4 Delete `AgentRunnerService` (duplicated `RuntimeAgentRunner`, only test-used); migrate tests to exercise `AiSdkKtxLlmRuntime.runAgentLoop` directly - F7 Delete `KtxScanOrchestrator` and its test (no production callers; the inline pipeline in `runLocalScan` is the single source of truth) - F18 Delete `generateKtxText`/`generateKtxObject` pass-through helpers; inline the single `runtime.generateObject` call at its caller Plus a clarifying comment on the SQLite `resolveStringReference` `file:` carve-out (load-bearing for SQLite URI form, not a bug).
2026-05-21 02:38:18 +02:00
const mode = resolveOutputMode({ explicit: args.output, json: args.json, io });
printSqlResult(resultOutput(args.connectionId, result), mode, io);
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
await emitTelemetryEvent({
name: 'sql_completed',
projectDir: args.projectDir,
io,
fields: {
driver,
isDemoConnection: demoConnection,
queryVerb: queryVerb(args.sql),
referencedTableCount,
durationMs: Math.max(0, performance.now() - startedAt),
outcome: 'ok',
},
});
return 0;
} finally {
await cleanupConnector(connector);
}
} catch (error) {
feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests
2026-05-22 18:18:47 +02:00
const errorClass = scrubErrorClass(error);
await emitTelemetryEvent({
name: 'sql_completed',
projectDir: args.projectDir,
io,
fields: {
driver,
isDemoConnection: demoConnection,
queryVerb: queryVerb(args.sql),
referencedTableCount: 0,
durationMs: Math.max(0, performance.now() - startedAt),
outcome: 'error',
...(errorClass ? { errorClass } : {}),
},
});
io.stderr.write(`${error instanceof Error ? error.message : String(error)}\n`);
return 1;
}
}