Commit graph

325 commits

Author SHA1 Message Date
Andrey Avtomonov
4612c9e4c4 Merge remote-tracking branch 'origin/explore-research-agent-tools' into explore-research-agent-tools 2026-05-15 02:13:10 +02:00
Andrey Avtomonov
05d666e75f Merge remote-tracking branch 'origin/main' into explore-research-agent-tools
# Conflicts:
#	packages/context/skills/metricflow_ingest/SKILL.md
2026-05-15 02:12:30 +02:00
Andrey Avtomonov
cb8902f1e5
fix(context): merge overlay columns onto manifest columns by name (#94)
* fix(context): merge overlay columns onto manifest columns by name

composeOverlay was appending overlay columns to the manifest column list,
producing duplicate entries when dbt/metabase overlays declared a column
just to attach descriptions. The duplicates carried no `type`, so the
pydantic SourceDefinition rejected them at semantic-query time and broke
`ktx sl query` for every overlay-backed measure. Now overlay columns
match base columns by name (case-insensitive): same-name entries merge
onto the manifest (overlay fields win, type/role fall back to the base,
descriptions merge per source key) and only new names append.
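
The merge rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual composeOverlay code: the `Column` shape and the `mergeOverlayColumns` name are hypothetical, and keeping the manifest's spelling of the column name is an assumption.

```typescript
// Hypothetical column shape; the real manifest/overlay types are not shown in the log.
interface Column {
  name: string;
  type?: string;
  role?: string;
  descriptions?: Record<string, string>;
}

// Merge overlay columns onto base (manifest) columns by case-insensitive name.
function mergeOverlayColumns(base: Column[], overlay: Column[]): Column[] {
  const merged = base.map((c) => ({ ...c }));
  const indexByName = new Map<string, number>();
  merged.forEach((c, i) => indexByName.set(c.name.toLowerCase(), i));

  for (const over of overlay) {
    const key = over.name.toLowerCase();
    const i = indexByName.get(key);
    if (i === undefined) {
      // Only names not present in the manifest are appended.
      indexByName.set(key, merged.length);
      merged.push({ ...over });
      continue;
    }
    const current = merged[i];
    merged[i] = {
      ...current,
      ...over, // overlay fields win
      name: current.name, // assumption: keep the manifest's spelling
      type: over.type ?? current.type, // type/role fall back to the base
      role: over.role ?? current.role,
      // descriptions merge per source key, overlay keys taking precedence
      descriptions: { ...current.descriptions, ...over.descriptions },
    };
  }
  return merged;
}
```

With a base column `Amount` of type `number` and an overlay entry `amount` that only carries a description, the result is still a single typed column rather than a typeless duplicate.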

* refactor(sl): split overlay columns from column_overrides and enforce TS/Python wire contract

Overlay sources now have two distinct collections: `columns:` for computed
columns (requiring `expr` + `type`) and `column_overrides:` for metadata
patches to inherited manifest columns. Composing or loading an overlay that
mixes the two — or references an unknown column — fails with a typed error.

Introduce `ResolvedSemanticLayerSource` / `resolvedSourceSchema` /
`toResolvedWire` as the strict shape sent to the Python engine, and add a
schema contract test that diffs Zod against the Pydantic JSON schema dumped
by `python -m semantic_layer dump-schema`. `SourceDefinition` is now
`extra="forbid"` on the Python side.

`loadAllSources` surfaces per-file load errors instead of swallowing them,
so validation/query paths can report manifest shard parse failures.

* fix(context): make scan description generation resilient and quiet

A transient sampleTable failure during ingest used to take out every
table in a connection: generateTableDescription returned a hardcoded
'Table not found' string into descriptions.ai, and KtxDescriptionGenerator
was constructed without a logger, so the failure left no trail anywhere.

- sampleTable / sampleColumn calls retry 3x with 200/400/800ms backoff,
  honouring KtxScanContext.signal via a new KtxAbortedError.
- On retry exhaustion or missing capability, table generation falls back
  to a metadata-only prompt built from column name / native type / comment
  / rawDescriptions. The column path follows the same rule: call the
  LLM when either samples or rawDescriptions are available; skip only
  when both are absent.
- Logger is now threaded from KtxScanContext into the generator. Failures
  emit structured KtxScanWarning entries (new description_fallback_used
  code, plus existing sampling_failed / enrichment_failed /
  connector_capability_missing). ktx scan groups warnings by code so a
  batch of identical failures collapses to one summary line plus sample.
- Returns null on failure instead of the 'Table not found' sentinel; the
  manifest writer's existing guard already skips empty descriptions, so
  schema YAML no longer carries misleading text. SCAN_MANAGED_DESCRIPTION_KEYS
  already strips stale 'ai' on merge, so existing YAML clears on next run.
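
The retry behaviour in the first bullet could look something like the sketch below. It assumes one initial attempt followed by up to three retries, and it returns null on exhaustion so the caller can take the metadata-only fallback. `AbortedError`, `sleep`, and `withRetry` are hypothetical names; `AbortedError` only stands in for the KtxAbortedError mentioned above.

```typescript
// Hypothetical stand-in for the KtxAbortedError named in the commit message.
class AbortedError extends Error {
  constructor() {
    super("operation aborted");
    this.name = "AbortedError";
  }
}

// Sleep that rejects early if the signal aborts while waiting.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(new AbortedError());
      return;
    }
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    function onAbort(): void {
      clearTimeout(timer);
      reject(new AbortedError());
    }
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

// Runs op, retrying after each delay in delaysMs. Returns null once the
// retries are exhausted (assumption: the caller then falls back).
async function withRetry<T>(
  op: () => Promise<T>,
  delaysMs: number[] = [200, 400, 800],
  signal?: AbortSignal,
): Promise<T | null> {
  for (let attempt = 0; ; attempt++) {
    if (signal?.aborted) throw new AbortedError();
    try {
      return await op();
    } catch {
      if (attempt >= delaysMs.length) return null;
      await sleep(delaysMs[attempt], signal);
    }
  }
}
```

Passing the delays as a parameter keeps the 200/400/800ms schedule as the default while letting tests use much shorter waits.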

Also suppress AI SDK v6 'system in messages' warning: pull system messages
out of KtxMessageBuilder.wrapSimple's output via a new splitKtxSystemMessages
helper and pass them top-level to generateText (preserves cacheControl
providerOptions on the SystemModelMessage). Agent-runner's local
splitSystemPromptMessages dedupes onto the shared helper.
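
As a rough illustration of the split described above, the sketch below partitions a message list into system messages and everything else. The `Message` shape and the `splitSystemMessages` name are hypothetical; the real splitKtxSystemMessages helper and the AI SDK message types are not shown in the log.

```typescript
// Hypothetical message shape, loosely modelled on chat-style message lists.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
  providerOptions?: Record<string, unknown>;
}

// Partition messages so system prompts can be passed separately from the
// conversation. Message objects are passed through untouched, so any
// providerOptions attached to a system message are preserved.
function splitSystemMessages(messages: Message[]): { system: Message[]; rest: Message[] } {
  const system: Message[] = [];
  const rest: Message[] = [];
  for (const message of messages) {
    (message.role === "system" ? system : rest).push(message);
  }
  return { system, rest };
}
```

Returning the original objects rather than copying only `content` is what keeps per-message options such as cache hints intact in this sketch.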

* test(docs): align examples-docs assertions with revamped docs

PR #103 (setup/guide doc revamp) reworded several CLI examples and
connection labels; the assertions in scripts/examples-docs.test.mjs
still referenced the pre-revamp wording and were failing in CI on main.
Update the regexes to match the post-revamp content:

- drop the `--json` flag from the sl-query example expectation
- move the `Driver:` / `Status: ok` probe to the connection reference,
  which is where that output now lives (driver id is lowercase
  `postgres`, not the display name `PostgreSQL`)
- drop the obsolete `Install \`uv\`...` troubleshooting line
- accept `<connectionId>` everywhere; the docs no longer use the
  hyphenated `<connection-id>` form
- match the `warehouse` connection id used in the quickstart instead of
  the `postgres-warehouse` id only used in the README and setup ref

* fix(sl): skip TS/Python schema contract test when uv is unavailable

The TypeScript checks CI job does not install uv or Python, so the
module-level `execFileSync('uv', ...)` in schemas.contract.test.ts threw
ENOENT and failed the suite. Wrap the schema dump in a try/catch and
guard the describe block with `describe.skipIf` so the test skips in
environments without uv. Local dev and any CI job that has uv on PATH
still run the cross-language contract assertion.
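
A guard of this kind might be sketched as follows. The `tryRun` helper is hypothetical, and the arguments passed to `uv` are elided in the commit message, so they are left as a placeholder here rather than guessed.

```typescript
import { execFileSync } from "node:child_process";

// Returns the command's stdout, or null if the command cannot be run
// (for example ENOENT when the binary is not on PATH) or exits non-zero.
function tryRun(command: string, args: string[]): string | null {
  try {
    return execFileSync(command, args, {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
    });
  } catch {
    return null;
  }
}

// Usage sketch with Vitest (not executed here). `uvArgs` stands for the
// arguments elided in the commit message.
//
//   const schemaJson = tryRun("uv", uvArgs);
//   describe.skipIf(schemaJson === null)("schema contract", () => {
//     // compare the TypeScript schema against JSON.parse(schemaJson)
//   });
```

Catching the error at module level is the important part: a throw outside any test fails the whole suite, whereas a null result lets the describe block skip cleanly.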
2026-05-15 02:11:04 +02:00
Luca Martial
6bc8d200ea
Remove deleted CLI command remnants (#105)
* fix(cli): reject unknown commands generically

* fix(cli): refresh ready command hints

* refactor(cli): drop removed wiki command internals
2026-05-14 19:04:22 -04:00
Luca Martial
db23fea609
Revamp setup and guide docs (#103)
* docs: revamp quickstart setup flow

* docs: refresh context build guide

* docs: rewrite context authoring guide

* docs: update agent serving guide
2026-05-14 18:09:26 -04:00
Luca Martial
17653e24f5
docs: revamp resource and integration sections (#104) 2026-05-14 18:09:13 -04:00
Andrey Avtomonov
644659fc1b
Merge branch 'main' into explore-research-agent-tools 2026-05-15 00:08:24 +02:00
Andrey Avtomonov
f8db99811a
feat(context): add driver-discriminated connection schemas (#96)
* refactor(context): export and describe mapping shape schemas

* feat(context): add driver-schemas module with warehouse drivers

* feat(context): add metabase, looker, lookml driver schemas with mappings

* feat(context): add notion, dbt, metricflow driver schemas

* refactor(context): make connectionSchema a driver-discriminated union

* chore(context): re-export KtxConnectionConfig from project package

* docs(context): add connection driver schema plan

* chore(secrets): allowlist example credentials in driver-schemas fixtures

* test(cli): align metabase fixtures with required api_url field

The driver-discriminated union added in this branch now requires api_url
for metabase connections and a known driver for warehouses. Update slow
CLI test fixtures and assertions so they exercise the new schema:
- ingest.test-utils.ts: add api_url to the prod-metabase fixture.
- setup.test.ts: switch metabase fixture from 'url' to 'api_url'.
- local-scan-connectors.test.ts: invalid-driver/missing-driver tests now
  expect the schema error from loadKtxProject (parse-time rejection).
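
To illustrate the parse-time rejection described above, here is a minimal hand-written sketch of a driver-discriminated union. It is not the project's connectionSchema: the two connection shapes, the `parseConnection` name, and the error messages are hypothetical, and only the `driver` discriminator and the required `api_url` for metabase come from the commit message.

```typescript
// Hypothetical, heavily reduced connection shapes.
interface PostgresConnection {
  driver: "postgres";
}

interface MetabaseConnection {
  driver: "metabase";
  api_url: string;
}

type ConnectionConfig = PostgresConnection | MetabaseConnection;

// Validate raw config by switching on the `driver` discriminator.
// Unknown or missing drivers and missing required fields throw.
function parseConnection(raw: unknown): ConnectionConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("connection must be an object");
  }
  const record = raw as Record<string, unknown>;
  switch (record.driver) {
    case "postgres":
      return { driver: "postgres" };
    case "metabase":
      if (typeof record.api_url !== "string") {
        throw new Error("metabase connection requires api_url");
      }
      return { driver: "metabase", api_url: record.api_url };
    default:
      throw new Error(`unknown or missing driver: ${String(record.driver)}`);
  }
}
```

In this sketch a fixture that still uses `url` instead of `api_url` fails at parse time, which mirrors why the test fixtures listed above had to change.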
2026-05-15 00:08:11 +02:00
Luca Martial
d244261aa7
docs: add context layer diagram (#102) 2026-05-14 17:57:16 -04:00
Andrey Avtomonov
d431fbfa5d
feat(docs-site): refresh nav mascot with SVG and bump size (#101)
Replace the PNG mascot with the refined "C" SVG (light + dark variants)
and enlarge the nav logo from 32px to 56px so it reads at a glance.
Also drop the same SVG pair into assets/ for repo-wide reuse.
2026-05-14 23:45:41 +02:00
Andrey Avtomonov
13ebe4fb6f chore: build runtime artifacts in conductor setup 2026-05-14 23:08:33 +02:00
Andrey Avtomonov
6c73029d0c Merge remote-tracking branch 'origin/main' into explore-research-agent-tools
# Conflicts:
#	packages/cli/src/print-command-tree.test.ts
#	packages/context/skills/sl_capture/SKILL.md
2026-05-14 22:05:00 +02:00
Andrey Avtomonov
8fd8ae6388 docs: add research-agent MCP ingest contract convergence plan 2026-05-14 19:27:38 +02:00
Andrey Avtomonov
fd3a25f14e docs(context): update ingest verification prompts for connectionId 2026-05-14 19:25:59 +02:00
Andrey Avtomonov
e4f2863fed refactor(context): use connectionId in warehouse verification tools 2026-05-14 19:25:03 +02:00
Andrey Avtomonov
6179667e45 docs: add research-agent MCP setup-agents plan 2026-05-14 19:11:30 +02:00
Andrey Avtomonov
c35aa760e6 feat(cli): support Claude local MCP setup scope 2026-05-14 19:09:48 +02:00
Andrey Avtomonov
6cb03d6924 feat(cli): configure MCP clients in setup agents 2026-05-14 19:07:22 +02:00
Andrey Avtomonov
0955b36887 feat(cli): install KTX research skill 2026-05-14 19:05:46 +02:00
Andrey Avtomonov
d79c51abaa docs: add research-agent MCP http daemon plan 2026-05-14 18:55:22 +02:00
Andrey Avtomonov
88f91f27c2 fix(cli): stabilize mcp daemon verification 2026-05-14 18:54:18 +02:00
Luca Martial
5cf2c89093
Revise CLI reference docs (#100)
* docs: revise CLI reference

* docs: sync CLI reference with current commands
2026-05-14 12:53:55 -04:00
Andrey Avtomonov
a9c7c152f1
docs(agents): instruct agents to update docs-site after code changes (#99) 2026-05-14 18:53:44 +02:00
Andrey Avtomonov
3bba9eec79 feat(cli): add ktx mcp commands 2026-05-14 18:52:00 +02:00
Andrey Avtomonov
db09df4d72 feat(cli): manage mcp daemon lifecycle 2026-05-14 18:50:54 +02:00
Andrey Avtomonov
6bff3c3492 feat(cli): host mcp over streamable http 2026-05-14 18:50:08 +02:00
Andrey Avtomonov
7ffa99983f feat(cli): add mcp http security helpers 2026-05-14 18:47:50 +02:00
Luca Martial
372c90b533
Polish documentation copy (#98) 2026-05-14 12:43:14 -04:00
Andrey Avtomonov
e974f3e59f docs: add research-agent MCP discover_data plan 2026-05-14 18:38:18 +02:00
Andrey Avtomonov
fc903f32b7 feat: wire local discover data MCP port 2026-05-14 18:36:18 +02:00
Andrey Avtomonov
e74976d321 feat: expose discover data MCP tool 2026-05-14 18:35:30 +02:00
Andrey Avtomonov
6a61f209e2 feat: add MCP discover data service 2026-05-14 18:34:44 +02:00
Andrey Avtomonov
f1c073b614 docs: add research-agent MCP dictionary_search plan 2026-05-14 18:24:48 +02:00
Andrey Avtomonov
b8418c7a79 feat(context): expose local MCP dictionary search 2026-05-14 18:22:55 +02:00
Andrey Avtomonov
edb62deed2 feat(context): register MCP dictionary search tool 2026-05-14 18:22:25 +02:00
Andrey Avtomonov
d0b8996456 feat(context): add dictionary search service 2026-05-14 18:21:52 +02:00
Andrey Avtomonov
ff3a8c5777 docs: add research-agent MCP entity_details plan 2026-05-14 18:14:17 +02:00
Andrey Avtomonov
a27400b3dd test(context): align entity details scan fixtures 2026-05-14 18:13:23 +02:00
Andrey Avtomonov
da6f8873d4 feat(context): expose local MCP entity details 2026-05-14 18:11:27 +02:00
Andrey Avtomonov
9d9fa9bc3b feat(context): register MCP entity details tool 2026-05-14 18:10:03 +02:00
Andrey Avtomonov
700c0ba5d7 feat(context): add scan-backed entity details service 2026-05-14 18:09:16 +02:00
Andrey Avtomonov
371fcd47eb docs: add research-agent MCP sql execution foundation plan 2026-05-14 18:00:40 +02:00
Andrey Avtomonov
66e7c9fca4 test(context): update SQL analysis port fixtures 2026-05-14 17:59:21 +02:00
Andrey Avtomonov
807f86d761 feat(context): execute MCP SQL through validated connector path 2026-05-14 17:57:41 +02:00
Andrey Avtomonov
c774870346 feat(context): register MCP sql execution tool 2026-05-14 17:56:33 +02:00
Andrey Avtomonov
06f020dca1 feat(context): expose read-only SQL validation port 2026-05-14 17:55:23 +02:00
Andrey Avtomonov
aa4431b295 feat(daemon): validate read-only SQL with sqlglot 2026-05-14 17:54:36 +02:00
Andrey Avtomonov
ce23aca4c4
fix: remove project from ktx config (#95) 2026-05-14 17:39:31 +02:00
Andrey Avtomonov
de9f4d97e7 Refine spec: drop connectionName compat carve-out and ground summary/snippet provenance per kind 2026-05-14 17:24:23 +02:00
Andrey Avtomonov
bef9d14b90 Refine research-agent MCP tools spec after adversarial review iteration 3 2026-05-14 17:06:35 +02:00