ktx/packages/cli/test/context/ingest/tools/warehouse-verification/sql-execution.tool.test.ts

128 lines
5.2 KiB
TypeScript

fix: read semantic sources safely (#284)

* fix: read semantic sources safely

* test: retarget reindex per-scope error case to a broken manifest

  Reading a broken standalone source was made non-fatal in de1f1a8d (it is surfaced for repair instead of throwing), so the reindex per-scope error test no longer captured an error. Point it at a corrupt manifest shard, which is the remaining fatal read failure the per-scope catch must isolate, and assert the captured error names the offending file.

* fix(sl): decouple semantic-layer file names from warehouse naming rules

  The in-file `name:` field is now the sole source identity; the filename is a derived label that never participates in identity. This removes the "Unsafe semantic-layer source name" failure class entirely: any warehouse identifier (Snowflake's uppercase SIGNED_UP, EVENT$LOG, dotted names) can be read, overlaid, edited, and deleted.

  - New `source-files.ts`: one total filename derivation (safe lowercase names verbatim; otherwise slug + sha256-hash suffix, immune to case-insensitive-filesystem collisions) and one by-name file resolver.
  - Reads resolve by name everywhere; the path-from-name fast path and `assertSafeSourceName` are gone.
  - Writes resolve-then-write: rewrites land on the file that declares the name (human renames survive); new sources get a derived filename; a derived path occupied by a different source fails instead of clobbering.
  - `readSourceFile` returns null for missing files instead of forcing every caller to launder IO errors; `deleteSource` distinguishes manifest-backed sources from not-found instead of silently succeeding.
  - `sl_write_source` accepts verbatim warehouse identifiers (snake_case is now a recommendation for new sources) and rejects sourceName/source.name mismatches; `sl_edit_source` rejects name-changing edits.
  - Ingest projection commits, gate-repair allowlists, and touched-source derivation use resolved paths / in-file names instead of interpolating `<connId>/<name>.yaml`.
  - Collapsed the five parallel path derivations and duplicated path-token helpers onto the shared module; dropped dead service methods.

* fix(sl): resolve sources by declared name end-to-end and gate warehouse SQL with the parser-backed validator

  - Key broken/renamed semantic-layer files by their recoverable in-file name (slSourceNameForFile) so mid-edit sources stay reachable under their real identity in reads, listings, and search
  - Derive finalization touched sources from composed-source diffs and recover deleted files' declared names from the pre-change commit instead of parsing hash-derived filenames
  - Resolve revert/rollback paths against history (listFilesAtCommit) so human-renamed files are restored where they lived at preHead
  - Validate ingest sql_execution through the daemon's sqlglot validateReadOnly in the connection's dialect, sharing one driver-to-dialect map (sql-analysis/dialect.ts) across MCP and ingest
  - Harden the local read-only SQL backstop: accept leading comments, reject smuggled second statements, and strip trailing semicolons/comments before row-limit wrapping
2026-06-10 14:06:13 +02:00
import { beforeEach, describe, expect, it, vi } from 'vitest';
import type { SlConnectionCatalogPort } from '../../../../../src/context/sl/ports.js';
import type { SqlAnalysisPort } from '../../../../../src/context/sql-analysis/ports.js';
import type { ToolContext } from '../../../../../src/context/tools/base-tool.js';
import { SqlExecutionTool } from '../../../../../src/context/ingest/tools/warehouse-verification/sql-execution.tool.js';
describe('SqlExecutionTool', () => {
  const connections = {
    executeQuery: vi.fn(),
    getConnectionById: vi.fn(async () => ({ id: 'warehouse', name: 'warehouse', connectionType: 'POSTGRESQL' })),
  } as unknown as SlConnectionCatalogPort & {
    executeQuery: ReturnType<typeof vi.fn>;
    getConnectionById: ReturnType<typeof vi.fn>;
  };
  const sqlAnalysis = {
    validateReadOnly: vi.fn(async () => ({ ok: true, error: null })),
  } as unknown as SqlAnalysisPort & { validateReadOnly: ReturnType<typeof vi.fn> };
  const tool = new SqlExecutionTool(connections, sqlAnalysis);
  const context: ToolContext = {
    sourceId: 'ingest',
    messageId: 'm1',
    userId: 'system',
    session: { allowedConnectionNames: new Set(['warehouse']) } as any,
  };
  beforeEach(() => {
    connections.executeQuery.mockReset();
    connections.getConnectionById.mockReset();
    connections.getConnectionById.mockResolvedValue({ id: 'warehouse', name: 'warehouse', connectionType: 'POSTGRESQL' });
    sqlAnalysis.validateReadOnly.mockReset();
    sqlAnalysis.validateReadOnly.mockResolvedValue({ ok: true, error: null });
  });
  it('validates with the parser-backed validator in the connection dialect, then wraps with a capped row limit', async () => {
    connections.executeQuery.mockResolvedValue({ headers: ['status'], rows: [['paid']], totalRows: 1 });
    const result = await tool.call(
      { connectionId: 'warehouse', sql: 'select status from public.orders', rowLimit: 5 },
      context,
    );
expect(sqlAnalysis.validateReadOnly).toHaveBeenCalledWith('select status from public.orders', 'postgres');
expect(connections.executeQuery).toHaveBeenCalledWith(
'warehouse',
'select * from (select status from public.orders) as ktx_query_result limit 5',
);
expect(result.markdown).toContain('| status |');
expect(result.structured.wrappedSql).toContain('limit 5');
});
it('maps connection types to sqlglot dialects', async () => {
connections.getConnectionById.mockResolvedValue({ id: 'warehouse', name: 'warehouse', connectionType: 'SNOWFLAKE' });
connections.executeQuery.mockResolvedValue({ headers: [], rows: [], totalRows: 0 });
await tool.call({ connectionId: 'warehouse', sql: 'select 1' }, context);
expect(sqlAnalysis.validateReadOnly).toHaveBeenCalledWith('select 1', 'snowflake');
});
it('returns the validator error without executing when validation fails', async () => {
sqlAnalysis.validateReadOnly.mockResolvedValue({ ok: false, error: 'SQL contains read/write operation: Insert' });
const result = await tool.call(
{ connectionId: 'warehouse', sql: 'with x as (insert into t values (1) returning *) select * from x' },
context,
);
expect(result.markdown).toContain('SQL contains read/write operation: Insert');
expect(result.structured.error).toContain('SQL contains read/write operation: Insert');
expect(connections.executeQuery).not.toHaveBeenCalled();
});
it('throws when no parser-backed validator is configured', async () => {
const unvalidated = new SqlExecutionTool(connections);
await expect(unvalidated.call({ connectionId: 'warehouse', sql: 'select 1' }, context)).rejects.toThrow(
'sql_execution requires parser-backed SQL validation.',
);
expect(connections.executeQuery).not.toHaveBeenCalled();
});
it.each(['insert into x values (1)', 'drop table x', 'vacuum'])(
'keeps the local backstop even when the validator approves: %s',
async (sql) => {
const result = await tool.call({ connectionId: 'warehouse', sql }, context);
expect(result.markdown).toContain('Only read-only SELECT/WITH queries can be executed locally.');
expect(connections.executeQuery).not.toHaveBeenCalled();
},
);
it('surfaces connector errors verbatim', async () => {
connections.executeQuery.mockRejectedValue(new Error('relation "orbit_analytics.customer" does not exist'));
const result = await tool.call(
{ connectionId: 'warehouse', sql: 'select 1 from orbit_analytics.customer', rowLimit: 1 },
context,
);
expect(result.markdown).toContain('relation "orbit_analytics.customer" does not exist');
expect(result.structured.error).toContain('relation "orbit_analytics.customer" does not exist');
});
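  it('uses connectionId as the public input field', () => {
    // Legacy field name ('connectionName'), assembled at runtime rather than written as a literal.
    const legacyConnectionField = ['connection', 'Name'].join('');

    // The current input shape parses and round-trips unchanged.
    expect(
      tool.parseInput({
        connectionId: 'warehouse',
        sql: 'select 1',
        rowLimit: 5,
      }),
    ).toEqual({
      connectionId: 'warehouse',
      sql: 'select 1',
      rowLimit: 5,
    });

    // Input that uses the legacy field in place of connectionId is rejected.
    expect(() =>
      tool.parseInput({
        [legacyConnectionField]: 'warehouse',
        sql: 'select 1',
        rowLimit: 5,
      }),
    ).toThrow();
  });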
});