chore(workspace): gate dead-code with knip production mode (#196)

* refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm

* refactor(workspace): rewrite @ktx/llm imports to relative paths

* refactor(workspace): fold internal packages into cli

* chore(workspace): gate dead-code with knip production mode

Turn on production-mode knip plus an autofix run in pre-commit and the `pnpm dead-code` script, document the `/** @internal */` convention for test-only exports in AGENTS.md, annotate test-only exports across the CLI with that JSDoc, and drop dead exports/wrappers the new gate surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`, `createLocalScanEnrichmentProvidersFromConfig`, `PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports).

Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit production entries so cross-package barrel leaks are caught.

* refactor(cli): delete internal barrel index.ts files

The 34 `index.ts` re-export barrels inside `packages/cli/src/` were holdovers from the pre-fold multi-workspace structure. Post-fold-in they served no production purpose: external consumers go through the single package main entry, and in-repo callers mostly imported through them only because the path was short. Internally, knip flagged most barrel re-exports as production-dead (only reached via tests).

This change:
- Deletes every internal barrel except `packages/cli/src/index.ts` (the published package entry).
- Rewrites ~270 source/test files to import each name directly from the file that defines it.
- Moves `tools/warehouse-verification/index.ts` to `create-warehouse-verification-tools.ts` (the function it defined locally) and updates its single consumer.
- Renames `search/backend-conformance.ts` → `.test-utils.ts` to match the existing test-helper file convention.
- Deletes 13 dead test-only chains (dbt-descriptions/*, live-database/extracted-schema, live-database/structural-sync, relationship-* feedback/review chain) plus their tests and a cascading orphan integration test.
- Updates test mocks that pointed at deleted barrel paths (notion-client, connector barrels in scan/local-scan-connectors tests) to mock the source files instead.
- Points the maintainer benchmark script (`scripts/relationship-benchmark-report.mjs`) at source files instead of `dist/context/scan/index.js`.
- Drops the barrel `!` entries from `knip.json`; adds explicit production entries only for the benchmark code reached via dist by the maintainer script.

Net: 413 files changed, ~1.2k insertions, ~9.4k deletions. `pnpm run dead-code` (Biome + knip default + knip production) and `pnpm run type-check` are clean; 2277 tests pass.

* refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly

Promote the CLI workspace package to the public name `@kaelio/ktx` and drop the separate `scripts/build-public-npm-package.mjs` wrapper. The CLI package is now publishable in place (`publishConfig.access: public`, `provenance: true`), so artifact packing uses `pnpm pack` against `packages/cli/` instead of assembling a parallel package tree.

Updates all workspace filter invocations, docs, tests, and release readiness checks to reference the new package name, and folds the tarball-name helper into `scripts/public-npm-release-metadata.mjs`.

* docs: align "agent clients" and "data agents" terminology

Replace "client agents" with "agent clients" and "database agents" with "data agents" across AGENTS.md, README.md, the docs-site copy, and the matching setup-agents test description, matching the canonical vocabulary in docs/terminology.md.

Also moves packages/cli/tsconfig.json's tsBuildInfoFile from node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive node_modules reinstalls.

* refactor(release): single source of truth for package version

Make packages/cli/package.json the single source of truth for the @kaelio/ktx version. publicNpmPackageVersion() now reads it directly, so artifact filenames, release-readiness checks, and the Python wheel version all derive from one field. The duplicate release-policy.json.publicNpmPackageVersion is removed.

Previously the two fields could drift: tarballs were named kaelio-ktx-0.4.1.tgz while internally containing @kaelio/ktx@0.0.0-private.

- update-public-release-version.mjs rewrites both Python pyproject.toml files (ktx-daemon, ktx-sl) alongside the npm package.jsons, normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2).
- semantic-release-config.cjs adds the two pyproject.toml files to @semantic-release/git assets so the release commit back to main carries every version source in lockstep.
- The six "?? '0.0.0-private'" fallback literals across the CLI are replaced with "?? getKtxCliPackageInfo().version", and createDefaultKtxMcpServer makes its version arg required.
- docs/release.md describes the actual commit-back model: the dev tree always reflects the most recent release; no sentinel pin to maintain.

Verified: pnpm run artifacts:build now produces kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with @kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and 2287 vitests + 173 script tests pass.

* refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime

Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and scan command entrypoints so tests can stub them, and teach resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime feature when ktx.yaml selects sentence-transformers.

* chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal

Both symbols are consumed only by status-project.test.ts. Annotating with /** @internal */ keeps knip's production-mode check clean without changing runtime behavior.

* fix(cli): use real package metadata in print-command-tree

The stubbed package name embedded a forbidden product identifier that tripped the boundary check in CI. Read the metadata from package.json instead; this keeps the rendered tree unchanged and removes a duplicate source of truth.

* feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts

Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer source counts, computed with `SUM(embedding_json IS NOT NULL)` over `knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to "Wiki" (canonical per `docs/terminology.md`) and rename the matching `localStats.knowledgePages` field to `localStats.wikiPages`.

Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those duplicated the per-surface rows above. Disk now reports only actual byte usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` / `semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry` helpers, and the `filter` arg on `summarizeDir` are removed.
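
For reviewers unfamiliar with the PEP 440 normalization mentioned in the release refactor above (0.1.0-rc.2 -> 0.1.0rc2), here is a minimal sketch of what such a mapping could look like. `toPep440Version` is a hypothetical name, not the helper in update-public-release-version.mjs, and the sketch assumes only `alpha`, `beta`, and `rc` prerelease labels with a single numeric suffix; the real script may handle more cases.

```typescript
// Hypothetical helper: maps a semver-style version to a PEP 440 string.
// Assumption: prereleases look like "-alpha.N", "-beta.N", or "-rc.N".
const PEP440_LABELS: Record<string, string> = { alpha: 'a', beta: 'b', rc: 'rc' };

function toPep440Version(version: string): string {
  const match = /^(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$/.exec(version);
  if (!match) {
    throw new Error(`unsupported version format: ${version}`);
  }
  const release = match[1];
  const label = match[2];
  const serial = match[3];
  // Plain releases pass through unchanged.
  if (!label) {
    return release;
  }
  // PEP 440 drops the "-" and "." separators around the prerelease segment.
  return `${release}${PEP440_LABELS[label]}${serial}`;
}
```

Under these assumptions, `toPep440Version('0.1.0-rc.2')` yields `0.1.0rc2`, matching the example in the commit message, and `toPep440Version('0.4.1')` is returned as-is.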
2026-06-22 08:38:08 +02:00 · 2026-05-21 15:28:58 +02:00 · 2026-05-21 15:28:58 +02:00 · 2366b00301
commit 2366b00301
parent a1cfb03d73
1002 changed files with 2286 additions and 12051 deletions
--- a/packages/cli/src/context/daemon/semantic-layer-compute.test.ts
+++ b/packages/cli/src/context/daemon/semantic-layer-compute.test.ts
@@ -0,0 +1,339 @@
+import { once } from 'node:events';
+import { createServer } from 'node:http';
+import { describe, expect, it, vi } from 'vitest';
+import { createHttpSemanticLayerComputePort, createPythonSemanticLayerComputePort } from './semantic-layer-compute.js';
+
+const source = {
+  name: 'orders',
+  table: 'public.orders',
+  grain: ['id'],
+  columns: [{ name: 'id', type: 'number' }],
+  joins: [],
+  measures: [{ name: 'order_count', expr: 'count(*)' }],
+};
+
+const sourceGenerationInput = {
+  tables: [
+    {
+      name: 'orders',
+      db: 'public',
+      comment: 'Orders table',
+      columns: [
+        { name: 'id', type: 'integer', primaryKey: true, nullable: false, comment: 'Order ID' },
+        { name: 'customer_id', type: 'integer' },
+        { name: 'amount', type: 'decimal', comment: 'Order amount' },
+      ],
+    },
+    {
+      name: 'customers',
+      db: 'public',
+      columns: [
+        { name: 'id', type: 'integer', primaryKey: true },
+        { name: 'email', type: 'varchar' },
+      ],
+    },
+  ],
+  links: [
+    {
+      fromTable: 'orders',
+      fromColumn: 'customer_id',
+      toTable: 'customers',
+      toColumn: 'id',
+      relationshipType: 'MANY_TO_ONE',
+    },
+  ],
+  dialect: 'postgres',
+};
+
+const sourceGenerationDaemonPayload = {
+  tables: [
+    {
+      name: 'orders',
+      db: 'public',
+      comment: 'Orders table',
+      columns: [
+        { name: 'id', type: 'integer', primary_key: true, nullable: false, comment: 'Order ID' },
+        { name: 'customer_id', type: 'integer' },
+        { name: 'amount', type: 'decimal', comment: 'Order amount' },
+      ],
+    },
+    {
+      name: 'customers',
+      db: 'public',
+      columns: [
+        { name: 'id', type: 'integer', primary_key: true },
+        { name: 'email', type: 'varchar' },
+      ],
+    },
+  ],
+  links: [
+    {
+      from_table: 'orders',
+      from_column: 'customer_id',
+      to_table: 'customers',
+      to_column: 'id',
+      relationship_type: 'MANY_TO_ONE',
+    },
+  ],
+  dialect: 'postgres',
+};
+
+const sourceGenerationDaemonResponse = {
+  source_count: 2,
+  sources: [
+    {
+      name: 'orders',
+      table: 'public.orders',
+      grain: ['id'],
+      columns: [{ name: 'id', type: 'number' }],
+      joins: [
+        {
+          to: 'customers',
+          on: 'customer_id = customers.id',
+          relationship: 'many_to_one',
+        },
+      ],
+      measures: [{ name: 'record_count', expr: 'count(id)' }],
+    },
+  ],
+};
+
+describe('createPythonSemanticLayerComputePort', () => {
+  it('calls the semantic-query stdio command', async () => {
+    const runJson = vi.fn(async () => ({
+      sql: 'select count(*) from public.orders',
+      dialect: 'postgres',
+      columns: [{ name: 'orders.order_count' }],
+      plan: { sources_used: ['orders'] },
+    }));
+    const port = createPythonSemanticLayerComputePort({ runJson });
+
+    await expect(
+      port.query({
+        sources: [source],
+        dialect: 'postgres',
+        query: { measures: ['orders.order_count'], dimensions: [] },
+      }),
+    ).resolves.toEqual({
+      sql: 'select count(*) from public.orders',
+      dialect: 'postgres',
+      columns: [{ name: 'orders.order_count' }],
+      plan: { sources_used: ['orders'] },
+    });
+
+    expect(runJson).toHaveBeenCalledWith('semantic-query', {
+      sources: [source],
+      dialect: 'postgres',
+      query: { measures: ['orders.order_count'], dimensions: [] },
+    });
+  });
+
+  it('calls the semantic-validate stdio command', async () => {
+    const runJson = vi.fn(async () => ({
+      valid: true,
+      errors: [],
+      warnings: [],
+      per_source_warnings: {},
+    }));
+    const port = createPythonSemanticLayerComputePort({ runJson });
+
+    await expect(
+      port.validateSources({
+        sources: [source],
+        dialect: 'postgres',
+        recentlyTouched: ['orders'],
+      }),
+    ).resolves.toEqual({
+      valid: true,
+      errors: [],
+      warnings: [],
+      perSourceWarnings: {},
+    });
+
+    expect(runJson).toHaveBeenCalledWith('semantic-validate', {
+      sources: [source],
+      dialect: 'postgres',
+      recently_touched: ['orders'],
+    });
+  });
+
+  it('calls the semantic-generate-sources stdio command', async () => {
+    const runJson = vi.fn(async () => sourceGenerationDaemonResponse);
+    const port = createPythonSemanticLayerComputePort({ runJson });
+
+    await expect(port.generateSources(sourceGenerationInput)).resolves.toEqual({
+      sourceCount: 2,
+      sources: sourceGenerationDaemonResponse.sources,
+    });
+
+    expect(runJson).toHaveBeenCalledWith('semantic-generate-sources', sourceGenerationDaemonPayload);
+  });
+});
+
+describe('createHttpSemanticLayerComputePort', () => {
+  it('calls semantic query and validate HTTP endpoints through an injected runner', async () => {
+    const requestJson = vi.fn(async (path: string) => {
+      if (path === '/semantic-layer/query') {
+        return {
+          sql: 'select count(*) from public.orders',
+          dialect: 'postgres',
+          columns: [{ name: 'orders.order_count' }],
+          plan: { sources_used: ['orders'] },
+        };
+      }
+      return {
+        valid: true,
+        errors: [],
+        warnings: [],
+        per_source_warnings: {},
+      };
+    });
+    const port = createHttpSemanticLayerComputePort({ baseUrl: 'http://127.0.0.1:8765/', requestJson });
+
+    await expect(
+      port.query({
+        sources: [source],
+        dialect: 'postgres',
+        query: { measures: ['orders.order_count'], dimensions: [] },
+      }),
+    ).resolves.toEqual({
+      sql: 'select count(*) from public.orders',
+      dialect: 'postgres',
+      columns: [{ name: 'orders.order_count' }],
+      plan: { sources_used: ['orders'] },
+    });
+
+    await expect(
+      port.validateSources({
+        sources: [source],
+        dialect: 'postgres',
+        recentlyTouched: ['orders'],
+      }),
+    ).resolves.toEqual({
+      valid: true,
+      errors: [],
+      warnings: [],
+      perSourceWarnings: {},
+    });
+
+    expect(requestJson).toHaveBeenNthCalledWith(1, '/semantic-layer/query', {
+      sources: [source],
+      dialect: 'postgres',
+      query: { measures: ['orders.order_count'], dimensions: [] },
+    });
+    expect(requestJson).toHaveBeenNthCalledWith(2, '/semantic-layer/validate', {
+      sources: [source],
+      dialect: 'postgres',
+      recently_touched: ['orders'],
+    });
+  });
+
+  it('calls the semantic source-generation HTTP endpoint through an injected runner', async () => {
+    const requestJson = vi.fn(async () => sourceGenerationDaemonResponse);
+    const port = createHttpSemanticLayerComputePort({ baseUrl: 'http://127.0.0.1:8765/', requestJson });
+
+    await expect(port.generateSources(sourceGenerationInput)).resolves.toEqual({
+      sourceCount: 2,
+      sources: sourceGenerationDaemonResponse.sources,
+    });
+
+    expect(requestJson).toHaveBeenCalledWith('/semantic-layer/generate-sources', sourceGenerationDaemonPayload);
+  });
+
+  it('posts JSON to a running HTTP daemon endpoint', async () => {
+    const requests: Array<{ url: string | undefined; body: unknown }> = [];
+    const server = createServer((request, response) => {
+      const chunks: Buffer[] = [];
+      request.on('data', (chunk: Buffer) => chunks.push(chunk));
+      request.on('end', () => {
+        requests.push({
+          url: request.url,
+          body: JSON.parse(Buffer.concat(chunks).toString('utf8')),
+        });
+        response.writeHead(200, { 'content-type': 'application/json' });
+        response.end(
+          JSON.stringify({
+            sql: 'select count(*) from public.orders',
+            dialect: 'postgres',
+            columns: [{ name: 'orders.order_count' }],
+            plan: { sources_used: ['orders'] },
+          }),
+        );
+      });
+    });
+
+    server.listen(0, '127.0.0.1');
+    await once(server, 'listening');
+    try {
+      const address = server.address();
+      if (!address || typeof address === 'string') {
+        throw new Error('expected TCP server address');
+      }
+      const port = createHttpSemanticLayerComputePort({ baseUrl: `http://127.0.0.1:${address.port}` });
+
+      await expect(
+        port.query({
+          sources: [source],
+          dialect: 'postgres',
+          query: { measures: ['orders.order_count'], dimensions: [] },
+        }),
+      ).resolves.toMatchObject({
+        sql: 'select count(*) from public.orders',
+        dialect: 'postgres',
+      });
+
+      expect(requests).toEqual([
+        {
+          url: '/semantic-layer/query',
+          body: {
+            sources: [source],
+            dialect: 'postgres',
+            query: { measures: ['orders.order_count'], dimensions: [] },
+          },
+        },
+      ]);
+    } finally {
+      server.close();
+    }
+  });
+
+  it('posts source-generation JSON to a running HTTP daemon endpoint', async () => {
+    const requests: Array<{ url: string | undefined; body: unknown }> = [];
+    const server = createServer((request, response) => {
+      const chunks: Buffer[] = [];
+      request.on('data', (chunk: Buffer) => chunks.push(chunk));
+      request.on('end', () => {
+        requests.push({
+          url: request.url,
+          body: JSON.parse(Buffer.concat(chunks).toString('utf8')),
+        });
+        response.writeHead(200, { 'content-type': 'application/json' });
+        response.end(JSON.stringify(sourceGenerationDaemonResponse));
+      });
+    });
+
+    server.listen(0, '127.0.0.1');
+    await once(server, 'listening');
+    try {
+      const address = server.address();
+      if (!address || typeof address === 'string') {
+        throw new Error('expected TCP server address');
+      }
+      const port = createHttpSemanticLayerComputePort({ baseUrl: `http://127.0.0.1:${address.port}` });
+
+      await expect(port.generateSources(sourceGenerationInput)).resolves.toEqual({
+        sourceCount: 2,
+        sources: sourceGenerationDaemonResponse.sources,
+      });
+
+      expect(requests).toEqual([
+        {
+          url: '/semantic-layer/generate-sources',
+          body: sourceGenerationDaemonPayload,
+        },
+      ]);
+    } finally {
+      server.close();
+    }
+  });
+});
--- a/packages/cli/src/context/daemon/semantic-layer-compute.ts
+++ b/packages/cli/src/context/daemon/semantic-layer-compute.ts
@@ -0,0 +1,314 @@
+import { request as httpRequest } from 'node:http';
+import { request as httpsRequest } from 'node:https';
+import { URL } from 'node:url';
+import { spawn } from 'node:child_process';
+import type { ResolvedSemanticLayerSource, SemanticLayerQueryInput } from '../sl/types.js';
+
+interface KtxSemanticLayerComputeQueryResult {
+  sql: string;
+  dialect: string;
+  columns: Array<Record<string, unknown>>;
+  plan: Record<string, unknown>;
+}
+
+interface KtxSemanticLayerComputeValidationResult {
+  valid: boolean;
+  errors: string[];
+  warnings: string[];
+  perSourceWarnings: Record<string, string[]>;
+}
+
+interface KtxSemanticLayerSourceGenerationColumnInput {
+  name: string;
+  type: string;
+  primaryKey?: boolean;
+  nullable?: boolean;
+  comment?: string | null;
+}
+
+interface KtxSemanticLayerSourceGenerationTableInput {
+  name: string;
+  catalog?: string | null;
+  db?: string | null;
+  comment?: string | null;
+  columns: KtxSemanticLayerSourceGenerationColumnInput[];
+}
+
+interface KtxSemanticLayerSourceGenerationLinkInput {
+  fromTable: string;
+  fromColumn: string;
+  toTable: string;
+  toColumn: string;
+  relationshipType: string;
+}
+
+interface KtxSemanticLayerSourceGenerationInput {
+  tables: KtxSemanticLayerSourceGenerationTableInput[];
+  links: KtxSemanticLayerSourceGenerationLinkInput[];
+  dialect?: string;
+}
+
+interface KtxSemanticLayerSourceGenerationResult {
+  sources: Array<Record<string, unknown>>;
+  sourceCount: number;
+}
+
+export interface KtxSemanticLayerComputePort {
+  /**
+   * Callers must pass sources sanitized through toResolvedWire. The Python
+   * daemon rejects authoring-only fields such as usage and inherits_columns_from.
+   */
+  query(input: {
+    sources: ResolvedSemanticLayerSource[];
+    query: SemanticLayerQueryInput;
+    dialect: string;
+  }): Promise<KtxSemanticLayerComputeQueryResult>;
+  /**
+   * Callers must pass sources sanitized through toResolvedWire. The Python
+   * daemon rejects authoring-only fields such as usage and inherits_columns_from.
+   */
+  validateSources(input: {
+    sources: ResolvedSemanticLayerSource[];
+    dialect: string;
+    recentlyTouched?: string[];
+  }): Promise<KtxSemanticLayerComputeValidationResult>;
+  generateSources(input: KtxSemanticLayerSourceGenerationInput): Promise<KtxSemanticLayerSourceGenerationResult>;
+}
+
+type KtxDaemonCommand = 'semantic-query' | 'semantic-validate' | 'semantic-generate-sources';
+
+type KtxDaemonJsonRunner = (
+  subcommand: KtxDaemonCommand,
+  payload: Record<string, unknown>,
+) => Promise<Record<string, unknown>>;
+
+type KtxDaemonHttpJsonRunner = (path: string, payload: Record<string, unknown>) => Promise<Record<string, unknown>>;
+
+export interface PythonSemanticLayerComputeOptions {
+  command?: string;
+  args?: string[];
+  cwd?: string;
+  env?: NodeJS.ProcessEnv;
+  runJson?: KtxDaemonJsonRunner;
+}
+
+/** @internal */
+export interface HttpSemanticLayerComputeOptions {
+  baseUrl: string;
+  requestJson?: KtxDaemonHttpJsonRunner;
+}
+
+function parseJsonObject(raw: string, subcommand: string): Record<string, unknown> {
+  const parsed = JSON.parse(raw) as unknown;
+  if (!parsed || typeof parsed !== 'object' || Array.isArray(parsed)) {
+    throw new Error(`ktx-daemon ${subcommand} returned non-object JSON`);
+  }
+  return parsed as Record<string, unknown>;
+}
+
+function runProcessJson(
+  options: Required<Pick<PythonSemanticLayerComputeOptions, 'command' | 'args'>> &
+    Pick<PythonSemanticLayerComputeOptions, 'cwd' | 'env'>,
+): KtxDaemonJsonRunner {
+  return async (subcommand: KtxDaemonCommand, payload: Record<string, unknown>): Promise<Record<string, unknown>> =>
+    new Promise((resolve, reject) => {
+      const child = spawn(options.command, [...options.args, subcommand], {
+        cwd: options.cwd,
+        env: { ...process.env, ...options.env },
+        stdio: ['pipe', 'pipe', 'pipe'],
+      });
+      const stdout: Buffer[] = [];
+      const stderr: Buffer[] = [];
+
+      child.stdout.on('data', (chunk: Buffer) => stdout.push(chunk));
+      child.stderr.on('data', (chunk: Buffer) => stderr.push(chunk));
+      child.on('error', reject);
+      child.on('close', (code) => {
+        const stdoutText = Buffer.concat(stdout).toString('utf8').trim();
+        const stderrText = Buffer.concat(stderr).toString('utf8').trim();
+        if (code !== 0) {
+          reject(new Error(`ktx-daemon ${subcommand} failed: ${stderrText || `exit code ${code}`}`));
+          return;
+        }
+        try {
+          resolve(parseJsonObject(stdoutText, subcommand));
+        } catch (error) {
+          reject(error);
+        }
+      });
+      child.stdin.end(`${JSON.stringify(payload)}\n`);
+    });
+}
+
+function normalizedBaseUrl(baseUrl: string): string {
+  return baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`;
+}
+
+function postJson(baseUrl: string): KtxDaemonHttpJsonRunner {
+  return async (path, payload) =>
+    new Promise((resolve, reject) => {
+      const target = new URL(path.replace(/^\//, ''), normalizedBaseUrl(baseUrl));
+      const body = JSON.stringify(payload);
+      const client = target.protocol === 'https:' ? httpsRequest : httpRequest;
+      const request = client(
+        target,
+        {
+          method: 'POST',
+          headers: {
+            accept: 'application/json',
+            'content-type': 'application/json',
+            'content-length': Buffer.byteLength(body),
+          },
+        },
+        (response) => {
+          const chunks: Buffer[] = [];
+          response.on('data', (chunk: Buffer) => chunks.push(chunk));
+          response.on('end', () => {
+            const text = Buffer.concat(chunks).toString('utf8');
+            const statusCode = response.statusCode ?? 0;
+            if (statusCode < 200 || statusCode >= 300) {
+              reject(new Error(`ktx-daemon HTTP ${path} failed with ${statusCode}: ${text}`));
+              return;
+            }
+            try {
+              resolve(parseJsonObject(text, path));
+            } catch (error) {
+              reject(error);
+            }
+          });
+        },
+      );
+      request.on('error', reject);
+      request.end(body);
+    });
+}
+
+function stringArray(value: unknown): string[] {
+  return Array.isArray(value) ? value.filter((item): item is string => typeof item === 'string') : [];
+}
+
+function recordValue(value: unknown): Record<string, unknown> {
+  return value && typeof value === 'object' && !Array.isArray(value) ? (value as Record<string, unknown>) : {};
+}
+
+function recordArray(value: unknown): Array<Record<string, unknown>> {
+  return Array.isArray(value)
+    ? value.filter(
+        (item): item is Record<string, unknown> => item !== null && typeof item === 'object' && !Array.isArray(item),
+      )
+    : [];
+}
+
+function sourceGenerationPayload(input: KtxSemanticLayerSourceGenerationInput): Record<string, unknown> {
+  return {
+    tables: input.tables.map((table) => ({
+      name: table.name,
+      ...(table.catalog !== undefined ? { catalog: table.catalog } : {}),
+      ...(table.db !== undefined ? { db: table.db } : {}),
+      ...(table.comment !== undefined ? { comment: table.comment } : {}),
+      columns: table.columns.map((column) => ({
+        name: column.name,
+        type: column.type,
+        ...(column.primaryKey !== undefined ? { primary_key: column.primaryKey } : {}),
+        ...(column.nullable !== undefined ? { nullable: column.nullable } : {}),
+        ...(column.comment !== undefined ? { comment: column.comment } : {}),
+      })),
+    })),
+    links: input.links.map((link) => ({
+      from_table: link.fromTable,
+      from_column: link.fromColumn,
+      to_table: link.toTable,
+      to_column: link.toColumn,
+      relationship_type: link.relationshipType,
+    })),
+    dialect: input.dialect ?? 'postgres',
+  };
+}
+
+function sourceGenerationResult(raw: Record<string, unknown>): KtxSemanticLayerSourceGenerationResult {
+  return {
+    sources: recordArray(raw.sources),
+    sourceCount: typeof raw.source_count === 'number' ? raw.source_count : recordArray(raw.sources).length,
+  };
+}
+
+export function createPythonSemanticLayerComputePort(
+  options: PythonSemanticLayerComputeOptions = {},
+): KtxSemanticLayerComputePort {
+  const command = options.command ?? 'python';
+  const args = options.args ?? ['-m', 'ktx_daemon'];
+  const runJson = options.runJson ?? runProcessJson({ command, args, cwd: options.cwd, env: options.env });
+
+  return {
+    async query(input) {
+      const raw = await runJson('semantic-query', {
+        sources: input.sources,
+        dialect: input.dialect,
+        query: input.query,
+      });
+      return {
+        sql: typeof raw.sql === 'string' ? raw.sql : '',
+        dialect: typeof raw.dialect === 'string' ? raw.dialect : input.dialect,
+        columns: recordArray(raw.columns),
+        plan: recordValue(raw.plan),
+      };
+    },
+    async validateSources(input) {
+      const raw = await runJson('semantic-validate', {
+        sources: input.sources,
+        dialect: input.dialect,
+        recently_touched: input.recentlyTouched,
+      });
+      return {
+        valid: raw.valid === true,
+        errors: stringArray(raw.errors),
+        warnings: stringArray(raw.warnings),
+        perSourceWarnings: recordValue(raw.per_source_warnings) as Record<string, string[]>,
+      };
+    },
+    async generateSources(input) {
+      const raw = await runJson('semantic-generate-sources', sourceGenerationPayload(input));
+      return sourceGenerationResult(raw);
+    },
+  };
+}
+
+/** @internal */
+export function createHttpSemanticLayerComputePort(
+  options: HttpSemanticLayerComputeOptions,
+): KtxSemanticLayerComputePort {
+  const requestJson = options.requestJson ?? postJson(options.baseUrl);
+
+  return {
+    async query(input) {
+      const raw = await requestJson('/semantic-layer/query', {
+        sources: input.sources,
+        dialect: input.dialect,
+        query: input.query,
+      });
+      return {
+        sql: typeof raw.sql === 'string' ? raw.sql : '',
+        dialect: typeof raw.dialect === 'string' ? raw.dialect : input.dialect,
+        columns: recordArray(raw.columns),
+        plan: recordValue(raw.plan),
+      };
+    },
+    async validateSources(input) {
+      const raw = await requestJson('/semantic-layer/validate', {
+        sources: input.sources,
+        dialect: input.dialect,
+        recently_touched: input.recentlyTouched,
+      });
+      return {
+        valid: raw.valid === true,
+        errors: stringArray(raw.errors),
+        warnings: stringArray(raw.warnings),
+        perSourceWarnings: recordValue(raw.per_source_warnings) as Record<string, string[]>,
+      };
+    },
+    async generateSources(input) {
+      const raw = await requestJson('/semantic-layer/generate-sources', sourceGenerationPayload(input));
+      return sourceGenerationResult(raw);
+    },
+  };
+}
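
Review note on `postJson`: the leading slash is stripped from `path` and a trailing slash is appended to `baseUrl` before calling `new URL(...)`. A minimal sketch of the likely reason, assuming the intent is to keep any path prefix on the daemon base URL; `resolveDaemonUrl` and the `/daemon` prefix are hypothetical names used only for illustration, mirroring the two-step logic in the diff.

```typescript
// Hypothetical standalone version of the URL resolution done inside postJson.
function resolveDaemonUrl(baseUrl: string, path: string): string {
  // Same normalization as normalizedBaseUrl: ensure a trailing slash.
  const base = baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`;
  // A relative path (no leading slash) resolves under the base path.
  return new URL(path.replace(/^\//, ''), base).href;
}

// Without a path prefix, both approaches give the same result.
const plain = resolveDaemonUrl('http://127.0.0.1:8765', '/semantic-layer/query');

// With a prefix, stripping the slash keeps "/daemon" in the final URL...
const prefixed = resolveDaemonUrl('http://127.0.0.1:8765/daemon', '/semantic-layer/query');

// ...whereas an absolute path would replace the base path entirely.
const naive = new URL('/semantic-layer/query', 'http://127.0.0.1:8765/daemon/').href;

console.log(plain, prefixed, naive);
```

The tests in this commit only exercise prefix-free base URLs, so the prefixed case shown here is not covered by them.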