apunkt/ktx - bitfreedom.net: free all bits, everywhere

mirror of https://github.com/Kaelio/ktx.git synced 2026-07-25 12:01:03 +02:00

Author	SHA1	Message	Date
Andrey Avtomonov	eff904b60d	fix(sl): skip TS/Python schema contract test when uv is unavailable The TypeScript checks CI job does not install uv or Python, so the module-level `execFileSync('uv', ...)` in schemas.contract.test.ts threw ENOENT and failed the suite. Wrap the schema dump in a try/catch and guard the describe block with `describe.skipIf` so the test skips in environments without uv. Local dev and any CI job that has uv on PATH still runs the cross-language contract assertion.	2026-05-15 02:04:12 +02:00
Andrey Avtomonov	cd49d5d4ae	Merge remote-tracking branch 'origin/main' into fix-sl-query-source-column-type # Conflicts: # packages/context/skills/metabase_ingest/SKILL.md # packages/context/skills/sl_capture/SKILL.md	2026-05-15 01:43:02 +02:00
Andrey Avtomonov	81b67a96cb	fix(context): make scan description generation resilient and quiet A transient sampleTable failure during ingest used to take out every table in a connection: generateTableDescription returned a hardcoded 'Table not found' string into descriptions.ai, and KtxDescriptionGenerator was constructed without a logger, so the failure left no trail anywhere. - sampleTable / sampleColumn calls retry 3x with 200/400/800ms backoff, honouring KtxScanContext.signal via a new KtxAbortedError. - On retry exhaustion or missing capability, table generation falls back to a metadata-only prompt built from column name / native type / comment / rawDescriptions. The column path follows the same rule -- call the LLM when any of samples or rawDescriptions are available; skip only when both are absent. - Logger is now threaded from KtxScanContext into the generator. Failures emit structured KtxScanWarning entries (new description_fallback_used code, plus existing sampling_failed / enrichment_failed / connector_capability_missing). ktx scan groups warnings by code so a batch of identical failures collapses to one summary line plus sample. - Returns null on failure instead of the 'Table not found' sentinel; the manifest writer's existing guard already skips empty descriptions, so schema YAML no longer carries misleading text. SCAN_MANAGED_DESCRIPTION_KEYS already strips stale 'ai' on merge, so existing YAML clears on next run. Also suppress AI SDK v6 'system in messages' warning: pull system messages out of KtxMessageBuilder.wrapSimple's output via a new splitKtxSystemMessages helper and pass them top-level to generateText (preserves cacheControl providerOptions on the SystemModelMessage). Agent-runner's local splitSystemPromptMessages dedupes onto the shared helper.	2026-05-15 01:33:50 +02:00
Andrey Avtomonov	f561bfa850	refactor(sl): split overlay columns from column_overrides and enforce TS/Python wire contract Overlay sources now have two distinct collections: `columns:` for computed columns (requiring `expr` + `type`) and `column_overrides:` for metadata patches to inherited manifest columns. Composing or loading an overlay that mixes the two — or references an unknown column — fails with a typed error. Introduce `ResolvedSemanticLayerSource` / `resolvedSourceSchema` / `toResolvedWire` as the strict shape sent to the Python engine, and add a schema contract test that diffs Zod against the Pydantic JSON schema dumped by `python -m semantic_layer dump-schema`. `SourceDefinition` is now `extra="forbid"` on the Python side. `loadAllSources` surfaces per-file load errors instead of swallowing them, so validation/query paths can report manifest shard parse failures.	2026-05-15 00:36:52 +02:00
Andrey Avtomonov	f8db99811a	feat(context): add driver-discriminated connection schemas (#96) * refactor(context): export and describe mapping shape schemas * feat(context): add driver-schemas module with warehouse drivers * feat(context): add metabase, looker, lookml driver schemas with mappings * feat(context): add notion, dbt, metricflow driver schemas * refactor(context): make connectionSchema a driver-discriminated union * chore(context): re-export KtxConnectionConfig from project package * docs(context): add connection driver schema plan * chore(secrets): allowlist example credentials in driver-schemas fixtures * test(cli): align metabase fixtures with required api_url field The driver-discriminated union added in this branch now requires api_url for metabase connections and a known driver for warehouses. Update slow CLI test fixtures and assertions so they exercise the new schema: - ingest.test-utils.ts: add api_url to the prod-metabase fixture. - setup.test.ts: switch metabase fixture from 'url' to 'api_url'. - local-scan-connectors.test.ts: invalid-driver/missing-driver tests now expect the schema error from loadKtxProject (parse-time rejection).	2026-05-15 00:08:11 +02:00
Luca Martial	372c90b533	Polish documentation copy (#98)	2026-05-14 12:43:14 -04:00
Andrey Avtomonov	ce23aca4c4	fix: remove project from ktx config (#95)	2026-05-14 17:39:31 +02:00
Andrey Avtomonov	3e12a9fef4	fix(context): merge overlay columns onto manifest columns by name composeOverlay was appending overlay columns to the manifest column list, producing duplicate entries when dbt/metabase overlays declared a column just to attach descriptions. The duplicates carried no `type`, so the pydantic SourceDefinition rejected them at semantic-query time and broke `ktx sl query` for every overlay-backed measure. Now overlay columns match base columns by name (case-insensitive): same-name entries merge onto the manifest (overlay fields win, type/role fall back to the base, descriptions merge per source key) and only new names append.	2026-05-14 17:23:37 +02:00
Andrey Avtomonov	2bca308863	feat(cli): add ktx dev schema to emit ktx.yaml JSON Schema (#93) Annotates the Zod config schema with .describe() text on every field and adds generateKtxProjectConfigJsonSchema() plus a ktx dev schema command that prints (or writes) a draft-07 JSON Schema for editors and LLM agents.	2026-05-14 16:21:29 +02:00
Andrey Avtomonov	c7c5f63a66	feat(cli): extend `ktx connection test` to every supported driver (#92) * feat(cli): extend `ktx connection test` to every supported driver Dispatch by driver: native DBs now call `connector.testConnection()` (was `introspect(dryRun)`), looker/notion/metabase hit their auth endpoints, and dbt/metricflow/lookml run `git ls-remote` via the existing `testRepoConnection` helper. Unknown drivers exit 1 with a listing of supported ones. * feat(cli): add `ktx connection test --all` summary list Tests every configured connection in parallel and renders a single Clack-style list (◇/│/◆/└, green ✓ / red ✗) consistent with sl list, with per-row detail and a passed/failed footer. Exits non-zero if any connection fails. Single-id `ktx connection test` output is preserved. * fix(cli): read metabase status url from api_url `ktx status` was probing `url` / `base_url` on metabase connections, but ktx.yaml stores it as `api_url`, so the field always reported "url not set". Read `api_url` directly and align the warning text with the actual key.	2026-05-14 16:21:18 +02:00
Andrey Avtomonov	b3be54e3fa	refactor(context): validate ktx.yaml with Zod and surface issues in status (#91) * refactor(context): validate ktx.yaml with Zod and surface issues in status - Replace hand-rolled ktx.yaml parsing with a strict Zod schema and derive KtxProjectConfig types from it. - Add validateKtxProjectConfig returning structured KtxConfigIssue[] with migration hints for deprecated keys (ingest.llm, scan.enrichment.backend, etc.). - Wire ktx status/doctor to run validation, render schema issues in plain and JSON output, and add a Config row to project status. - Update the orbit example to camelCase scan.relationships keys to match the schema. * fix(context): tolerate legacy setup.completed_steps and optional driver - Accept and drop the legacy setup.completed_steps field so existing ktx.yaml files migrated from older versions still load. - Make connections.<id>.driver optional in the schema; runtime code already produces a clear "no driver" error at use time. * feat(cli): add ktx status --validate to run only ktx.yaml schema validation - New --validate flag dispatches a focused runKtxDoctor 'validate' branch that reads ktx.yaml, runs validateKtxProjectConfig, and skips LLM, connection, embedding, and query-history checks. - Plain output prints a single Config row; JSON output emits {ok: true} on success or the existing invalid_config / missing_project shapes on failure.	2026-05-14 15:36:35 +02:00
Andrey Avtomonov	49f1e2720e	fix(llm): wire prompt caching through all Anthropic call sites (#90) * fix(llm): wire prompt caching through all Anthropic call sites - page-triage classifier + light-extraction now put the static skill prompt in `system:` so the per-document caches hit instead of re-sending boilerplate in the user message every call. - Description generation builders return `{ system, user }` with instruction text + word limit moved into the cacheable system. - Relationship-LLM proposal framing moved to `system:`. - `KtxMessageBuilder.wrapSimple` skips the history breakpoint for single-message calls (cache write that could never be reused). - Gateway backend now sets `anthropic-beta: extended-cache-ttl-2025-04-11` so 1h TTLs don't silently downgrade to 5m on Gateway routes. * fix(llm): keep wrapSimple history breakpoint so multi-step agent loops cache Reverts the wrapSimple `messages.length > 1` guard from the prior commit. agent-runner uses wrapSimple with a single user message, but generateText runs a multi-step tool loop inside it — the cache marker on the first user message is reused by every subsequent step, so it isn't waste. The release validator (scripts/validate-llm-debug-jsonl.mjs) also requires a `message-part` marker target in captured debug JSONL.	2026-05-14 15:36:27 +02:00
Andrey Avtomonov	b00c1a11a9	feat: merge ingest and scan * docs: add CLI component reuse guidance * docs: add unified ingest ux design * Refine unified ingest UX design after adversarial review iteration 1 * Refine unified ingest UX design after adversarial review iteration 2 * Refine unified ingest UX design after adversarial review iteration 3 * feat(cli): route public connection ingest command * feat(cli): hide standalone scan from public help * feat(cli): plan public ingest depth and query history * feat(cli): execute public database ingest facets * feat(ingest): read connection query history config * fix(cli): use public ingest wording * fix(config): stop generating ingest adapter allow lists * docs: document public ingest command * test: align ingest surface expectations * docs: add unified ingest public CLI surface plan * feat(cli): preflight deep public ingest readiness * feat(setup): store query history in connection context * feat(setup): store database context depth * feat(setup): verify context readiness by database depth * fix(setup): keep context build foreground only * fix(config): reject reserved ingest connection ids * test: close unified ingest v1 expectations * docs: add unified ingest v1 closure plan * fix(ingest): bypass adapter allow-list for public source ingest * fix(ingest): honor query history window intent * fix(ingest): hide scan internals from public database ingest * feat(ingest): use foreground view for interactive public ingest * fix(setup): use schema context and query history wording * test(cli): verify unified ingest public output * docs: add unified ingest v1 public output closure plan * fix(setup): forward query history flags * fix(setup): prompt for postgres query history * fix(status): report query history readiness * fix(ingest): remove legacy public guidance * fix(ingest): polish foreground retry copy * docs(examples): use unified query history wording * chore(ingest): finish public query history cleanup * docs: add unified ingest v1 query history status cleanup plan * test(docs): cover unified ingest public docs * docs: align ingest CLI reference with unified UX * docs: update context build guides for unified ingest * docs: update setup and primary source ingest wording * docs: stop advertising adapter-backed example ingest * docs: close unified ingest public docs gaps * docs: add unified ingest v1 docs site closure plan * fix: render unified ingest foreground warnings * fix: explain query history schema order * fix: add public ingest retry guidance * fix: align setup next steps with unified ingest * fix: remove scan wording from demo progress * test: verify unified ingest ux closure * docs: add unified ingest v1 foreground and retry closure plan * fix(cli): preserve query-history pull config in public ingest * fix(cli): omit hidden commands from docs command tree * test(cli): close unified ingest final public surface checks * docs: add unified ingest v1 final public surface closure plan * fix(cli): use public source labels in ingest reports * fix(cli): suppress low-level public ingest output * test(cli): verify unified ingest public plain output * docs: add unified ingest v1 public plain output closure plan * fix(cli): add public ingest copy sanitizers * fix(cli): sanitize public ingest progress copy * fix(cli): rename setup schema scope prompt * docs(plan): add progress copy closure; test: align setup back-nav fixture Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires. * docs(plan): add final ux labels plan with narrowed label scans * fix(cli): aggregate unsupported query-history warnings * fix(cli): align setup database labels * test(cli): fix setup database test type-check * fix(cli): remove primary-source wording from setup output * test(cli): verify unified ingest setup closure * docs(plan): add unified ingest v1 verification copy closure plan * fix(cli): remove top-level scan command * fix(cli): remove legacy ingest and wiki commands * Merge scan into ingest flow * feat(cli): split ingest progress into per-phase rows, rename work units to tasks Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly. Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone. * Fix test harness failures * Fix CI smoke checks --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-14 01:43:06 +02:00
Andrey Avtomonov	1a472cf3ed	fix: clean up ktx yaml config parameters (#75) * fix: clean up ktx yaml config parameters * fix: align ci smoke checks with status output * test: update artifact smoke status assertion	2026-05-14 01:27:31 +02:00
Andrey Avtomonov	0a261fe8a4	ci: add codecov coverage reporting (#82) * ci: add codecov coverage reporting * ci: fix codecov and secret scan checks * ci: fix smoke and artifact checks	2026-05-14 01:13:31 +02:00
Andrey Avtomonov	28b5e2a83e	fix: align KTX agent tools and repair handling (#73)	2026-05-14 00:57:51 +02:00
Andrey Avtomonov	fa9237956e	ci: run pre-commit checks in CI (#74) * ci: run pre-commit in CI * test: update CI workflow guardrail	2026-05-13 19:49:25 +02:00
Andrey Avtomonov	3fde4438b1	fix: stop requiring readonly connection config (#71)	2026-05-13 19:37:25 +02:00
Andrey Avtomonov	d7147f9ca1	feat: rename project wiki directory (#66) * feat: rename project wiki directory * test: fix wiki skill ordering expectations * Show configured context sources in setup	2026-05-13 16:05:58 +02:00
Andrey Avtomonov	97da9919e9	refactor: remove legacy compatibility paths (#64) * refactor: remove legacy compatibility paths * fix: support legacy metabase native queries * test: use canonical semantic layer descriptions * Rename CLI description * Recover setup scan from SQLite ABI mismatch * Remove legacy product name from CLI help	2026-05-13 15:55:00 +02:00
Andrey Avtomonov	e1e9c4af91	fix(cli): clean up connection commands (#62) * fix(cli): clean up connection commands * test(cli): update connection smoke coverage * Fix setup output formatting * fix notion setup picker exit	2026-05-13 15:04:50 +02:00
Luca Martial	4973ca562f	Restore Vertex AI LLM setup (#56) * feat(context): resolve Vertex AI config references * feat(cli): restore Vertex AI LLM setup --------- Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>	2026-05-13 14:42:38 +02:00
Andrey Avtomonov	b75576279c	fix: store Metabase mappings in ktx.yaml (#61) * fix: store Metabase mappings in ktx.yaml * docs: note KTX has no public users * refactor: drop setup progress compatibility	2026-05-13 13:55:21 +02:00
Andrey Avtomonov	c22248dabf	feat(context): add warehouse verification tools (#46) * feat(context): add warehouse dialect dispatch * feat(context): read warehouse scan catalog * feat(context): add entity details verification tool * feat(context): add ingest SQL verification tool * feat(context): add raw warehouse discovery tool * feat(context): expose warehouse verification tools to ingest * docs(context): add ingest identifier verification protocol * test(context): guard ingest identifier verification prompts * chore(context): verify warehouse verification tools * docs: add warehouse verification tools plan and spec * fix(context): expose target warehouses to Notion ingest * fix(context): update ingest prompts for warehouse verification tools * fix(context): scope raw schema discovery to allowed connections * fix(context): verify warehouse column display targets * docs: add notion warehouse verification gap closure plan * fix(context): include raw discovery connection names * fix(context): expose warehouse targets for LookML and MetricFlow * fix(context): pass connection config to ingest query executors * fix(cli): enable read-only SQL probes for local ingest * docs: add warehouse verification final v1 closure plan * fix(context): align warehouse sql probe prompt shape * docs: add warehouse verification prompt shape closure plan * test(context): catch connectionless sql execution prompt examples * fix(context): include connection name in sl capture sql example * docs: add warehouse verification sql example closure plan * fix(context): report structured entity detail misses * docs: add warehouse verification structured target miss closure plan * fix: report untracked squash merge conflicts * feat: require ingest verification ledger * fix: stabilize ingest wiki references	2026-05-13 13:43:23 +02:00
Andrey Avtomonov	bcb0d2f8f7	chore: add TypeScript dead-code checks (#60) * chore: add TypeScript dead-code checks * chore: trim stale Knip ignores * Fix CI smoke and artifact checks	2026-05-13 13:33:28 +02:00
Andrey Avtomonov	721f1a998f	feat(cli)!: remove ktx agent command (#58) * feat(cli)!: remove ktx agent command * test(context): update PGlite boundary guardrail	2026-05-13 13:01:56 +02:00
Andrey Avtomonov	b9e0a746af	feat(cli): clean up dev command surface (#57) * feat(cli): clean up dev command surface * test: align CI expectations with CLI cleanup * test(cli): update slow test command expectations	2026-05-13 12:00:08 +02:00
Luca Martial	9a8cb08192	Refine setup table selection flow	2026-05-12 21:31:11 -07:00
Luca Martial	52ddb061a4	Add scan table filtering	2026-05-12 18:22:03 -07:00
Luca Martial	e13350c970	Merge pull request #47 from Kaelio/luca-martial/save-setup-in-dot-ktx Save setup completion state in .ktx/setup/state.json	2026-05-12 19:27:26 -04:00
Luca Martial	f70271152b	feat(context): add local .ktx/setup/state.json for setup completion tracking Move setup step completion state out of ktx.yaml into a gitignored local state file so it is not committed or shared across machines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-12 16:26:13 -07:00
Andrey Avtomonov	85fc408054	chore(deps): refresh workspace dependencies (#43) * chore(deps): refresh workspace dependencies * Fix pnpm artifact smoke build approvals	2026-05-13 01:15:35 +02:00
Luca Martial	60457e9407	Improve schema setup and Notion ingest UX (#14) * Improve schema setup and Notion ingest UX * Handle Postgres network scan failures * WIP: save local changes before main merge * Refine setup prompt choices * Tighten ingest reconciliation guidance * Commit setup config updates * Canonicalize unmapped fallback details * Count reconciliation actions in reports * Harden semantic layer source validation * Return wiki content after edits * Validate SL sources against manifests * Validate wiki refs before writes * Simplify CLI next steps * Clarify agent setup summary * Surface dbt target SL sources * Recover SL write fallbacks * Preserve failed context build metadata * Track raw paths for ingest actions * test(cli): update seeded demo expectations * fix(ingest): scope fallback recovery checks * fix(sl): tighten source validation guards * fix(wiki): ignore empty embedding vectors * Improve Notion ingest UX * Enforce flat wiki keys * test(context): update wiki key assertion --------- Co-authored-by: Andrey Avtomonov <andreybavt@gmail.com>	2026-05-12 22:56:58 +02:00
Andrey Avtomonov	7a315d580f	Merge remote-tracking branch 'origin/main' into andreybavt/execute-context7-plan # Conflicts: # packages/cli/src/ingest.test.ts # packages/cli/src/ingest.ts	2026-05-12 14:37:51 +02:00
Andrey Avtomonov	366933c755	perf: parallelize scan description generation	2026-05-12 14:34:59 +02:00
Andrey Avtomonov	4d4441ccd5	fix(context): avoid saving scan error descriptions (#37)	2026-05-12 14:34:15 +02:00
Andrey Avtomonov	15f433930e	Merge branch 'main' into andreybavt/execute-context7-plan	2026-05-12 13:04:16 +02:00
Andrey Avtomonov	d830e8c46e	docs: standardize env variable examples	2026-05-12 12:24:25 +02:00
Andrey Avtomonov	d7fb092cb0	feat(cli): route ingest adapter logs through operational logger	2026-05-12 11:26:34 +02:00
Andrey Avtomonov	d5f484eb7e	fix: standardize KTX environment variables	2026-05-12 11:21:37 +02:00
Andrey Avtomonov	9e80add72c	fix(context): make ingest adapter logging explicit	2026-05-12 11:21:29 +02:00
Andrey Avtomonov	a2dcd4eb08	fix: guide dev ingest llm setup (#15) Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-12 10:26:07 +02:00
Andrey Avtomonov	9d3b1015cc	fix: allow dbt ingest to discover warehouse schemas (#20) Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-12 10:25:56 +02:00
Andrey Avtomonov	b981cabdc6	Include historic SQL projection in memory counts	2026-05-11 22:52:47 +02:00
Andrey Avtomonov	1bd29c7eb1	Fix historic SQL ingest setup and progress	2026-05-11 22:35:07 +02:00
Andrey Avtomonov	2d1efda176	test: verify historic sql pattern shard work units	2026-05-11 20:23:37 +02:00
Andrey Avtomonov	8deac9d530	test: align historic sql pattern skill with shards	2026-05-11 20:21:39 +02:00
Andrey Avtomonov	3e11e33b8a	feat: emit historic sql pattern shard work units	2026-05-11 20:20:58 +02:00
Andrey Avtomonov	02b621be72	feat: write historic sql pattern shards	2026-05-11 20:20:22 +02:00
Andrey Avtomonov	2a91ea521f	feat: shard historic sql pattern inputs	2026-05-11 20:19:47 +02:00
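
Commit 81b67a96cb in the list above describes retrying sampleTable / sampleColumn calls three times with 200/400/800 ms backoff, honouring an abort signal, and returning null instead of a sentinel string once retries run out. The TypeScript below is a minimal sketch of that pattern, not the repository's code. `withRetry`, `backoffDelays`, `sleep`, and `AbortedError` are hypothetical names (the commit names KtxAbortedError but does not show its shape), and the sketch assumes "3x" means one initial attempt followed by up to three retries.

```typescript
// Hypothetical stand-in for the KtxAbortedError named in the commit message.
class AbortedError extends Error {
  constructor() {
    super("operation aborted");
    this.name = "AbortedError";
  }
}

// Doubling delay schedule: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelays(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Wait for `ms`, rejecting early if the signal aborts during the wait.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal?.aborted) {
      reject(new AbortedError());
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new AbortedError());
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

// Run `fn`, retrying after each delay in `delays`.
// Resolves to null when every attempt fails, so the caller can choose a fallback.
async function withRetry<T>(
  fn: () => Promise<T>,
  delays: number[],
  signal?: AbortSignal,
): Promise<T | null> {
  for (let attempt = 0; ; attempt++) {
    if (signal?.aborted) throw new AbortedError();
    try {
      return await fn();
    } catch {
      const delay = delays[attempt];
      if (delay === undefined) return null; // retries exhausted
      await sleep(delay, signal);
    }
  }
}
```

With `backoffDelays(3, 200)` the waits are 200, 400, and 800 ms. Under these assumptions, a caller would treat a null result as the cue to fall back to a metadata-only prompt, as the commit describes.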

86 commits
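
Commit 3e12a9fef4 in the list above states the overlay merge rule in prose: overlay columns match manifest columns by name, case-insensitively; same-name entries merge (overlay fields win, type/role fall back to the base, descriptions merge per source key), and only new names append. The TypeScript below is a rough sketch of that rule under stated assumptions. The `Column` shape and the function name `mergeColumnsByName` are hypothetical, not the repository's composeOverlay, and keeping the manifest's spelling of the name on a match is an assumption the commit does not confirm.

```typescript
// Hypothetical column shape; the real manifest and overlay types are not shown in the log.
interface Column {
  name: string;
  type?: string;
  role?: string;
  descriptions?: Record<string, string>;
}

function mergeColumnsByName(base: Column[], overlay: Column[]): Column[] {
  // A Map keeps insertion order, so manifest columns stay in place and new names append.
  const byName = new Map<string, Column>();
  for (const b of base) {
    byName.set(b.name.toLowerCase(), { ...b });
  }
  for (const o of overlay) {
    const key = o.name.toLowerCase();
    const b = byName.get(key);
    if (b === undefined) {
      byName.set(key, { ...o }); // new name: append
      continue;
    }
    byName.set(key, {
      ...b,
      ...o, // overlay fields win
      name: b.name, // assumption: keep the manifest's spelling
      type: o.type ?? b.type, // fall back to the base when the overlay omits it
      role: o.role ?? b.role,
      descriptions: { ...b.descriptions, ...o.descriptions }, // merge per source key
    });
  }
  return Array.from(byName.values());
}

const merged = mergeColumnsByName(
  [{ name: "Amount", type: "number", role: "measure", descriptions: { db: "raw total" } }],
  [
    { name: "amount", descriptions: { dbt: "Order total" } },
    { name: "margin", type: "number" },
  ],
);
console.log(merged.length); // 2: "Amount" merged in place, "margin" appended
```

The description-only overlay entry no longer produces a second, untyped "amount" column, which is the duplicate the commit says the Pydantic SourceDefinition rejected.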