apunkt/ktx - bitfreedom.net: free all bits, everywhere

mirror of https://github.com/Kaelio/ktx.git synced 2026-07-22 11:51:01 +02:00

Author	SHA1	Message	Date
Andrey Avtomonov	6149f67ce6	docs(docs-site): collapse agent setup explainer into a hover overlay	2026-05-28 16:04:08 +02:00
Andrey Avtomonov	6c6a3e7baf	docs(skills): correct ktx setup skill against agent-trial findings (#230 ) An external agent ran the skill end-to-end against `ktx setup` and reported seven concrete failures, all verified against the CLI source: - All useful setup flags are `.hideHelp()`, so the skill's "verify with --help" rule led the agent to conclude its own examples were wrong (setup-commands.ts:208-332). - The non-interactive LLM default is `anthropic` (and requires a key), not `claude-code` as the skill claimed (setup-models.ts:505-507). - `ktx status` exits 1 whenever the LLM is `none`, even with healthy embeddings and connections (status-project.ts:204-211, doctor.ts:647). - `ktx ingest` rejects `--yes`+`--no-input` while `ktx setup` accepts both (managed-python-command.ts:23-24). - `--database-url <raw>` auto-externalizes to `.ktx/secrets/<id>-url` — worth telling the agent (setup-databases.ts:671-683). - Resuming setup with only `--llm-backend` fails on missing DB flags even when `ktx.yaml` already has one (setup-databases.ts:1778-1782). - The `--agents` step prints `Required before using agents: ktx mcp start` but the skill never told agents to run it (setup-agents.ts:989,1227). Rewrite SKILL.md to: lead with the scripted (non-interactive) path; add a single "gather inputs once" checklist; correct the LLM default; document `--skip-*` flags and resumability; warn that `status` exit code ≠ readiness; fix the `ktx ingest` example to use `--no-input` only; require `ktx mcp start` after `--agents`; add a ktx-monorepo branch that avoids `npm install -g`. Add skills/ktx/troubleshooting.md (one level deep, per Anthropic's progressive-disclosure guidance) covering the five real failure signatures the agent hit: invalid ELF header, missing native CLI binary, missing Anthropic key, claude-code probe failure, and the resume-without-DB error. Description rewritten to combine what + when per the official skill authoring guidelines.	2026-05-28 15:36:56 +02:00
Andrey Avtomonov	35cecdf65d	docs(docs-site): tidy agent setup prompt copy and sizing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 15:30:51 +02:00
Andrey Avtomonov	c1ed5eedce	fix(cli): preserve project artifacts when ktx setup steps fail (#229 ) ktx setup wiped ktx.yaml, .ktx/setup/state.json, wiki/, semantic-layer/, raw-sources/, and .git/ — or removed the entire project dir — whenever any single source in the context-build step failed, destroying hours of ingest work and the persisted resume state. The cleanup hint was designed for an "early abort, leave no trace" semantic but was applied indiscriminately to every later step failure, in direct conflict with the .ktx/setup/state.json resume mechanism. Drop the cleanup mechanism entirely (KtxSetupCreatedProjectCleanup, cleanupForFolderState, createProjectWithCleanup, cleanupCreatedProjectScaffold, and the createdProjectCleanup plumbing through KtxSetupProjectResult). Step failures now return non-zero without touching the filesystem, so re-running ktx setup continues from completed steps and only re-attempts failed sources. Rewrites the two tests that documented the wipe behavior to assert preservation, and adds a regression test that simulates partial context-build artifacts (state.json, wiki/, semantic-layer/) and verifies all survive a failed context step. Refs KLO-719	2026-05-28 15:17:06 +02:00
Andrey Avtomonov	b687167bc1	Route ktx stars dashboard	2026-05-28 13:00:49 +02:00
Andrey Avtomonov	2a85346613	fix(docs-site): disable Geist Mono ligatures on every font-mono surface (#228 ) Geist Mono fuses `--` into an em-dash glyph that visually swallows the adjacent space, so prompts like `npx skills add Kaelio/ktx --skill ktx` rendered as `Kaelio/ktx--skill ktx` on the quickstart page. The existing ligature-off rule only covered <code>/<pre> and the .ktx-code wrapper — quickstart.mdx puts the prompt in a plain <div className="font-mono">, so the rule didn't apply. Extend the selector to also match the .font-mono Tailwind utility and any inline-style opt-in via the mono font CSS variable. Document the convention in AGENTS.md so future docs additions keep ligatures off on any new monospace container.	2026-05-28 12:51:17 +02:00
Andrey Avtomonov	39f94f39ff	docs: add ktx skills.sh setup skill (#227)	2026-05-28 12:28:10 +02:00
Luca Martial	27842e14a9	docs: add context layer terminology (#226)	2026-05-28 05:58:08 -04:00
Andrey Avtomonov	6837ab253d	fix(cli): align ingest step counter with SDK num_turns (#225 ) The Claude Code runtime counted every SDKAssistantMessage with parent_tool_use_id === null as a step, but the SDK emits extra messages within a single num_turns round-trip — `stop_reason: 'pause_turn'` continuations and errored partials it retries internally. The local counter then outran maxTurns and the ingest HUD rendered confusing ratios like `step 69/40`. Filter both cases in collectResult so stepIndex tracks num_turns and stays bounded by the work-unit stepBudget.	2026-05-28 02:09:53 +02:00
Andrey Avtomonov	a94f35800a	feat(docs-site): redirect ktx.sh/slack to Slack community invite (#224) Add a host-scoped redirect for /slack on ktx.sh before the existing catch-all so the path resolves to the community invite link instead of docs.kaelio.com/ktx/slack.	2026-05-27 18:20:51 +02:00
semantic-release-bot	5d74bd35de	chore(release): 0.6.0 [skip ci] ## [0.6.0](https://github.com/Kaelio/ktx/compare/v0.5.0...v0.6.0) (2026-05-26) ### Features * cli: skip-context-sources menu + clack-style tree picker UX ([#213](https://github.com/Kaelio/ktx/issues/213)) ([`cfd1749`](`cfd1749ab9`)) * cli: surface docs and demo-warehouse links in ktx setup ([#221](https://github.com/Kaelio/ktx/issues/221)) ([`62699bf`](`62699bfe9d`)) * connectors: generalize readiness and constraint handling ([#212](https://github.com/Kaelio/ktx/issues/212)) ([`78b8a0c`](`78b8a0c025`)) ### Bug Fixes * ingest: attribute historic-sql evidence writes in bundle report ([#220](https://github.com/Kaelio/ktx/issues/220)) ([`1071f9d`](`1071f9d1c9`)) * scripts: make package artifacts pnpm launch work on Windows ([`2a6fb19`](`2a6fb19ba4`)) * update ktx CI boundary checks ([#223](https://github.com/Kaelio/ktx/issues/223)) ([`bc7373f`](`bc7373fa8e`)) ### Documentation * ban ktx compatibility shims ([#214](https://github.com/Kaelio/ktx/issues/214)) ([`a9db379`](`a9db3797e6`)) * readme: restructure for clarity and add FAQ + comparison table ([#222](https://github.com/Kaelio/ktx/issues/222)) ([`0eeac6f`](`0eeac6f980`)) * standardize fanout terminology ([#218](https://github.com/Kaelio/ktx/issues/218)) ([`9248688`](`924868841d`)) ### Code Refactoring * remove legacy ktx compatibility shims ([#211](https://github.com/Kaelio/ktx/issues/211)) ([`96952fb`](`96952fb43c`)) ### Tests * split cli tests from source tree ([#216](https://github.com/Kaelio/ktx/issues/216)) ([`56985b7`](`56985b7e09`)) ### Continuous Integration * disable telemetry in workflows ([#217](https://github.com/Kaelio/ktx/issues/217)) ([`4827437`](`4827437f3a`))	2026-05-26 21:19:07 +00:00
Andrey Avtomonov	bc7373fa8e	fix: update ktx CI boundary checks (#223)	2026-05-26 23:03:47 +02:00
Andrey Avtomonov	0eeac6f980	docs(readme): restructure for clarity and add FAQ + comparison table (#222 ) * docs(readme): restructure for clarity and add FAQ + comparison table Restructure the README: trim Common Commands to the 6 essentials and link to the CLI Reference, add a "How ktx compares" table and "Who is ktx for" qualifier, introduce a small FAQ, wrap key prompts in GitHub callouts, merge the duplicate workspace-layout section into Development, move Telemetry next to License, and add a Star History chart. * docs(readme): tighten Skip-ktx list and convert FAQ to bullets	2026-05-26 14:29:53 +02:00
Andrey Avtomonov	62699bfe9d	feat(cli): surface docs and demo-warehouse links in ktx setup (#221 ) Add a Clack note pointing to https://docs.kaelio.com/ktx right after the setup intro, and a second note pointing to https://kaelio.com/start above the database driver multiselect — mirroring the docs-site CTA wording. Closes KLO-715 and KLO-716.	2026-05-26 13:42:52 +02:00
Andrey Avtomonov	1071f9d1c9	fix(ingest): attribute historic-sql evidence writes in bundle report (#220) The emit_historic_sql_evidence tool took rawPath as LLM-supplied input, so projection actions frequently lacked defensible raw paths and every row in bundle_ingest_reports fell through as actionType: 'skipped' with null artifact metadata, hiding the wiki pages and SL merges the run had actually produced (KLO-698). The tool now reads the work unit's rawFiles from session.allowedRawPaths and stores them on the evidence envelope; the projection emits actions with those paths, and stale/archive actions are anchored to manifest.json so they also surface as non-skipped provenance rows.	2026-05-26 12:21:53 +02:00
ARYAN	2a6fb19ba4	fix(scripts): make package artifacts pnpm launch work on Windows Fix Windows package artifact script invocation under pnpm.	2026-05-26 12:16:53 +02:00
Andrey Avtomonov	56985b7e09	test: split cli tests from source tree (#216 ) * feat(cli): define full warehouse dialect contract * test(cli): keep dialect edge tests focused * fix(cli): stabilize dialect contract foundation * refactor(connectors): own read-only query preparation * refactor(connectors): resolve dialects through registry * refactor(connectors): keep concrete dialect classes internal * chore(workspace): enforce dialect import boundary * refactor(cli): resolve relationship dialect at scan boundary * refactor(cli): use dialect display parsing for entity details * refactor(cli): use dialect display parsing for warehouse catalog * refactor(cli): use dialect SQL in relationship workflows * test(cli): verify solid dialect scan workflow closure * test: split cli tests from source tree * refactor(cli): standardize BigQuery scope listing * feat(sqlite): implement connector scope listing * test(connectors): cover required table listing * feat(cli): add warehouse driver registry * refactor(setup): route scope discovery through driver registry * refactor(cli): route local query execution through driver registry * refactor(historic-sql): route dialect support through driver registry * refactor(cli): test warehouse connections through driver registry * fix(cli): close driver registry type export gaps * Improve setup daemon diagnostics * refactor(setup): centralize rail-prefixed diagnostics + query-history fallback Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput into clack.ts so the setup wizard, managed daemons, and embedding/agent steps share one rail-formatted writer. setup-databases.ts also adds a "disable query history and retry" option when the schema-context build fails and query history is the likely culprit, surfaced via a new failed-query-history-unavailable status. 
* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match The setup picker's KtxTableListEntry was a 2-level { schema, name }, so qualifiedTableId always wrote db.name into enabled_tables. When BigQuery, Snowflake, or SQL Server later ran fast ingest, their introspect step filtered the scope set with scopedTableNames(scope, { catalog: projectId|database, db }) — catalog was non-null on the introspect side but null in the scope refs, so every entry was rejected, the live-database adapter staged zero table files, and detect() failed with 'Adapter "live-database" did not recognize fetched source output'. Align the picker boundary with the canonical 3-level KtxTableRef: - Add catalog: string | null to KtxTableListEntry. - BigQuery/Snowflake/SQL Server listTables populate catalog from the resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null. - qualifiedTableId emits catalog.schema.name when catalog is non-null (resolveEnabledTables already accepts the 3-part shape) and schemasFromEnabledTables now goes through parseDottedTableEntry so it recovers the schema correctly from both 2-part and 3-part entries. - Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker reuse. Update listTables expectations in all seven connector tests and the setup / picker test fixtures. Add a picker regression test that covers the catalog-bearing round-trip (save + refine). * fix(cli): allow debug telemetry under opt-out env	2026-05-26 08:49:05 +02:00
Luca Martial	924868841d	docs: standardize fanout terminology (#218)	2026-05-25 11:09:33 -04:00
Andrey Avtomonov	4827437f3a	ci: disable telemetry in workflows (#217)	2026-05-25 16:12:39 +02:00
Andrey Avtomonov	a9db3797e6	docs: ban ktx compatibility shims (#214)	2026-05-24 22:55:08 +02:00
Andrey Avtomonov	78b8a0c025	feat(connectors): generalize readiness and constraint handling (#212 ) * feat(connectors): add postgres maxConnections * feat(connectors): add mysql maxConnections * feat(connectors): add sqlserver maxConnections * feat(connectors): rename snowflake pool config * docs: document connector maxConnections * feat(scan): add constraint discovery warning helper * feat(scan): carry structural warnings through reports * feat(postgres): soft-fail denied constraint discovery * feat(mysql): soft-fail denied constraint discovery * feat(sqlserver): soft-fail denied constraint discovery * feat(bigquery): soft-fail denied primary key discovery * feat(snowflake): report denied primary key discovery * test(scan): verify constraint discovery warnings * feat(historic-sql): use shared readiness probes * docs: document query history readiness probes * test(historic-sql): verify readiness probe registry * test(ingest): account for live database warnings artifact * Add skip option for agent setup	2026-05-24 19:30:06 +02:00
Andrey Avtomonov	cfd1749ab9	feat(cli): skip-context-sources menu + clack-style tree picker UX (#213 ) * feat(cli): add 'skip context sources' option to database setup menu After databases are configured, the post-setup menu now offers a 'Skip context sources' choice equivalent to passing --skip-sources, which plumbs through KtxSetupDatabasesResult.skipSources to bypass the context-source step in the same run. * feat(cli): standardize tree picker UX after clack autocomplete-multiselect Search is always on (no '/' to enter): typed printable chars feed the query, Tab toggles selection on the focused node without leaving the search bar, and Space toggles only after arrow-key navigation (isNavigating); otherwise it is appended to the query. Esc clears a non-empty query before quitting, Ctrl+A and Ctrl+N replace bare-letter bulk bindings, and the cursor refocuses on the first match when the query change would hide it.	2026-05-24 19:29:37 +02:00
Andrey Avtomonov	96952fb43c	refactor: remove legacy ktx compatibility shims (#211) * refactor: remove legacy ktx compatibility shims * fix: restore overlay collision guidance	2026-05-24 16:57:23 +02:00
semantic-release-bot	a954a29a76	chore(release): 0.5.0 [skip ci] ## [0.5.0](https://github.com/Kaelio/ktx/compare/v0.4.1...v0.5.0) (2026-05-23) ### Features * cli: add --fast flag and Local data section to ktx status ([#198](https://github.com/Kaelio/ktx/issues/198)) ([`1c7131c`](`1c7131c6c2`)) * cli: redesign database scope picker for searchable schema-first setup ([#203](https://github.com/Kaelio/ktx/issues/203)) ([`c87d14a`](`c87d14a554`)) * telemetry: anonymous posthog usage telemetry across node cli and python daemon ([#205](https://github.com/Kaelio/ktx/issues/205)) ([`b0dd13c`](`b0dd13ce7c`)) ### Bug Fixes * cli: treat omitted sentence-transformers base_url as managed daemon ([#194](https://github.com/Kaelio/ktx/issues/194)) ([`9fc715a`](`9fc715ac6a`)), closes [#184](https://github.com/Kaelio/ktx/issues/184) [#192](https://github.com/Kaelio/ktx/issues/192) * snowflake: unblock multi-schema ingest and relationship discovery ([#204](https://github.com/Kaelio/ktx/issues/204)) ([`394a985`](`394a985d2a`)), closes [#206](https://github.com/Kaelio/ktx/issues/206) * surface silent failures and drop unused dead-code paths ([#193](https://github.com/Kaelio/ktx/issues/193)) ([`0958bc0`](`0958bc03dc`)) * surface silent failures in SL, wiki, and embedding wiring ([#195](https://github.com/Kaelio/ktx/issues/195)) ([`488b955`](`488b955024`)) ### Documentation * add agent terminology rules and link from AGENTS.md ([#197](https://github.com/Kaelio/ktx/issues/197)) ([`d67cf0a`](`d67cf0aab8`)) * add code-design principles and link from AGENTS.md ([#199](https://github.com/Kaelio/ktx/issues/199)) ([`a1cfb03`](`a1cfb03d73`)) * add ktx.yaml configuration reference ([#200](https://github.com/Kaelio/ktx/issues/200)) ([`5211a03`](`5211a0317e`)) * bold Claude Pro/Max subscription note in README ([`50c7bbc`](`50c7bbc957`)) * quickstart: redesign demo-warehouse callout with sticker icons ([#202](https://github.com/Kaelio/ktx/issues/202)) ([`fd2ba62`](`fd2ba62d92`)) * rewrite 
context-as-code as reviewing-context guide ([#201](https://github.com/Kaelio/ktx/issues/201)) ([`4d4296f`](`4d4296f397`)) ### Code Refactoring * remove legacy compatibility shims ([#208](https://github.com/Kaelio/ktx/issues/208)) ([`db09936`](`db09936085`)) ### Other Changes * workspace: gate dead-code with knip production mode ([#196](https://github.com/Kaelio/ktx/issues/196)) ([`2366b00`](`2366b00301`))	2026-05-23 23:06:49 +00:00
Andrey Avtomonov	db09936085	refactor: remove legacy compatibility shims (#208)	2026-05-24 01:00:20 +02:00
Andrey Avtomonov	394a985d2a	fix(snowflake): unblock multi-schema ingest and relationship discovery (#204 ) * feat(setup): drop redundant Snowflake schema prompt; fall back to free-text on listSchemas failure Snowflake setup previously asked for a single schema as free text, then ran a multiselect against the discovered schemas — two schema questions back-to-back, with the first being only a session bootstrap. The SDK's `schema` is optional, so the bootstrap step is unnecessary. - Remove the free-text Snowflake schema prompt; only pass `schema` to snowflake-sdk when one is configured. - When `listSchemas()` fails (e.g. role lacks SHOW SCHEMAS), prompt the user for a comma-separated list, persist it as `schema_names`, and use it as both the table-list filter and the multiselect default. Applies to every driver with a scope-discovery spec, not just Snowflake. - Update docs to lead with `schema_names`; keep `schema_name` as a documented single-schema shorthand. * fix(snowflake): keep introspecting when primary-key discovery is denied The PK query joins INFORMATION_SCHEMA.TABLE_CONSTRAINTS and INFORMATION_SCHEMA.KEY_COLUMN_USAGE, which require grants the connection role may not have. Previously a 'SQL compilation error: Object ANALYTICS.INFORMATION_SCHEMA.KEY_COLUMN_USAGE does not exist or not authorized' aborted the entire introspect — schemas, columns, and row counts were all discarded over a missing nice-to-have. Wrap the constraint query in try/catch, log a one-line warning per schema, and return an empty PK map. Columns end up with primaryKey=false; relationship inference still has FK and profiling to fall back on. * fix(scan): unblock relationship discovery on Snowflake Two adjacent bugs prevented the scan's relationship pipeline from producing any joins on a Snowflake warehouse: - relationship-profiling.ts fell through to a default `GROUP_CONCAT` branch for unknown drivers. 
Snowflake has no GROUP_CONCAT, so every per-table profile query failed with "Unknown function GROUP_CONCAT". Add an explicit Snowflake branch that uses LISTAGG with a literal '\x1f' delimiter (Snowflake requires the delimiter to be a constant, so CHR(31) is rejected). - description-generation.ts destructured `connector.sampleTable` and `connector.sampleColumn` into bare locals, losing the `this` binding when the class-method connectors (Snowflake, Postgres, MySQL) were invoked. Every sample call threw "Cannot read properties of undefined (reading 'assertConnection')" and degraded LLM descriptions to metadata-only prompts. Call the methods through the connector instead. Without these, even after the primary-key probe is allowed to fail softly, the scan ends up with 0 validated relationships and an empty `joins:` block in every shard YAML. * test(scan): cover table-ref helpers * feat(scan): plumb tableScope through live-database introspection port * feat(scan): apply tableScope during metadata fetch * feat(scan): enforce table scope at fetch boundary * feat(scan): pool Snowflake sessions and batch enrichment for faster ingest (#206) * feat(cli): add RSA key-pair auth option to Snowflake setup wizard Extends the interactive Snowflake setup flow with an authentication-method prompt (password vs RSA/JWT key-pair). The RSA branch collects a private-key path (env/file/absolute) and an optional passphrase; the resulting connection config records `authMethod: 'rsa'` with `privateKey` and `passphrase` instead of `password`. 
* feat(scan): pool Snowflake sessions * fix(scan): reuse structural snapshots and cleanup connectors * feat(scan): parallelize relationship profiling * feat(scan): batch table description generation * docs: document Snowflake ingest concurrency knobs * fix(scan): close Snowflake ingest perf verification gaps * fix(scan): keep batched description failure bounded * feat(scan): dispatch query-history probes by connection driver Extract historic-sql dialect resolution into a shared helper so the status-project readiness check and the local ingest factory agree on which connections enable query history and which probe to run. The status command now picks the postgres/snowflake/bigquery probe based on the connection's driver instead of always reporting against postgres, which previously caused snowflake connections with queryHistory.enabled to surface a misleading "driver is snowflake" failure. Also drops a noisy console.warn from Snowflake primary-key discovery — INFORMATION_SCHEMA.KEY_COLUMN_USAGE is commonly ungranted for read-only roles and the FK + profiling paths handle the empty PK map already. * fix(llm): allow StructuredOutput tool and raise maxTurns for generateObject The Claude Code agent SDK announces an internal pseudo-tool named StructuredOutput in the system/init message whenever outputFormat is set to { type: 'json_schema' }. The runtime's isolation check built its allowedToolIds set only from MCP tool ids and treated StructuredOutput as an unexpected host-injected tool, so every generateObject call threw "Claude Code runtime isolation failed: tools=StructuredOutput ..." and the table-descriptions and relationship-LLM-proposal enrichment stages recorded null output across the board. Whitelist StructuredOutput specifically in generateObject's allowedToolIds — the check also enforces missing_tools symmetry, so generateText and runAgentLoop, which do not see StructuredOutput, must not require it. 
generateObject also ran with maxTurns: 1, which the model intermittently breached when it emitted thinking text before the structured response. Raised to 5 to give the schema-bound call enough headroom without allowing unbounded loops. The existing tests now exercise the path with an init message that announces StructuredOutput so the regression cannot slip back in. * chore(scripts): add ktx-reset.sh project-cleanup helper Convenience script for repeatable ingest testing: takes a project directory and prunes everything except ktx.yaml and .ktx/secrets/, so the next ktx setup or ktx ingest run starts from a known-clean state.	2026-05-23 10:41:30 +02:00
Andrey Avtomonov	b0dd13ce7c	feat(telemetry): anonymous posthog usage telemetry across node cli and python daemon (#205 ) * feat: add telemetry phase 1 * feat: add node telemetry event catalog * feat: add telemetry event helpers * feat: emit setup and connection telemetry * feat: emit connection and stack telemetry * feat: emit ingest and scan telemetry * feat: emit query telemetry * feat: emit sampled mcp telemetry * docs: expand telemetry event catalog * feat: add telemetry schema sync artifact * feat: pass telemetry project id to semantic daemon * feat: add daemon telemetry foundation * feat: emit semantic daemon telemetry * feat: emit daemon lifecycle telemetry * docs: document full telemetry event catalog * feat(telemetry): dim first-run notice * feat(telemetry): show first-run notice before command output * feat(telemetry): wire ktx PostHog project for live ingestion * docs(telemetry): drop posthog project name and host from storage section * docs(telemetry): trim to general overview and disclaimer * docs(agents): add short telemetry guidelines * feat(telemetry): enable posthog geoip enrichment * docs(telemetry): drop ip-geoip note from public overview * refactor(telemetry): drop no-op groupIdentify, rely on capture groups field * fix(telemetry): respect CI kill switch in python daemon identity * fix(sql): route table-count analysis to existing analyze-batch endpoint * fix(telemetry): emit install_first_run from notice path and derive flagsPresent from commander * fix(telemetry): read package info via getKtxCliPackageInfo to satisfy boundary check * fix(telemetry): make python identity env={} bypass os.environ and unset CI in tests * fix(telemetry): unset CI kill switch in cli-program-telemetry tests	2026-05-22 18:18:47 +02:00
Andrey Avtomonov	c87d14a554	feat(cli): redesign database scope picker for searchable schema-first setup (#203 ) * feat: add searchable setup prompt pickers * fix: make snowflake scope discovery single query * fix: make bigquery table discovery schema scoped * fix: honor mysql and clickhouse database scope * feat: wire schema scope discovery for all relational setup drivers * feat: add schema-first database scope picker * test: update setup prompt stubs for type-check * docs: document database scope picker fields * Fix database setup edit preservation --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-22 14:22:11 +02:00
Andrey Avtomonov	fd2ba62d92	docs(quickstart): redesign demo-warehouse callout with sticker icons (#202) * docs(quickstart): redesign demo-warehouse callout with sticker icons Replaces the plain warning-style callout with a two-column layout: text and a pill-shaped CTA on the left, a 2x2 cluster of rotated Postgres, Metabase, dbt, and Notion sticker tiles on the right. Adds the four connector SVGs under docs-site/public/icons/ to support it. * chore(docs-site): refresh auto-generated next-env.d.ts	2026-05-21 16:04:58 +02:00
Andrey Avtomonov	4d4296f397	docs: rewrite context-as-code as reviewing-context guide (#201) * docs: rewrite context-as-code as reviewing-context guide Move the page from Concepts to Guides and rebuild around an interactive review-loop diagram. Extract pan/zoom + fit-view controls into a shared FlowCanvas wrapper and adopt it across all three docs diagrams. * test: point examples-docs assertion at reviewing-context Update the doc smoke test that read context-as-code.mdx to read the new guides/reviewing-context.mdx path. The `ktx ingest --all --no-input` assertion still holds; the rename was the only break.	2026-05-21 15:42:50 +02:00
Andrey Avtomonov	5211a0317e	docs: add ktx.yaml configuration reference (#200) Adds a new Configuration section to the docs with a reference page that covers every top-level block of ktx.yaml: connections, setup, storage, llm, ingest, scan, agent, and memory. Each block lists fields, defaults, accepted values, and a short YAML example, with a leading schematic that groups blocks into inputs, compute, and persistence.	2026-05-21 15:29:20 +02:00
Andrey Avtomonov	2366b00301	chore(workspace): gate dead-code with knip production mode (#196) * refactor(workspace): relocate @ktx/llm source into packages/cli/src/llm * refactor(workspace): rewrite @ktx/llm imports to relative paths * refactor(workspace): fold internal packages into cli * chore(workspace): gate dead-code with knip production mode Turn on production-mode knip plus an autofix run in pre-commit and the `pnpm dead-code` script, document the `/** @internal */` convention for test-only exports in AGENTS.md, annotate test-only exports across the CLI with that JSDoc, and drop dead exports/wrappers the new gate surfaced (e.g. `cli-project.ts`, `lookerRuntimeSourceToFileAdapterSource`, `createLocalScanEnrichmentProvidersFromConfig`, `PGLITE_OWNER_PROCESS_BACKEND_CAPABILITIES`, stale type re-exports). Replace the loose `ignoreIssues` allowlist in `knip.json` with explicit production entries so cross-package barrel leaks are caught. refactor(cli): delete internal barrel index.ts files The 34 `index.ts` re-export barrels inside `packages/cli/src/` were holdovers from the pre-fold multi-workspace structure. Post-fold-in they served no production purpose: external consumers go through the single package main entry, and in-repo callers mostly imported through them only because the path was short. Internally, knip flagged most barrel re-exports as production-dead (only reached via tests). This change: - Deletes every internal barrel except `packages/cli/src/index.ts` (the published package entry). - Rewrites ~270 source/test files to import each name directly from the file that defines it. - Moves `tools/warehouse-verification/index.ts` to `create-warehouse-verification-tools.ts` (the function it defined locally) and updates its single consumer. - Renames `search/backend-conformance.ts` → `.test-utils.ts` to match the existing test-helper file convention. 
- Deletes 13 dead test-only chains (dbt-descriptions/, live-database/extracted-schema, live-database/structural-sync, relationship- feedback/review chain) plus their tests and a cascading orphan integration test. - Updates test mocks that pointed at deleted barrel paths (notion-client, connector barrels in scan/local-scan-connectors tests) to mock the source files instead. - Points the maintainer benchmark script (`scripts/relationship-benchmark-report.mjs`) at source files instead of `dist/context/scan/index.js`. - Drops the barrel `!` entries from `knip.json`; adds explicit production entries only for the benchmark code reached via dist by the maintainer script. Net: 413 files changed, ~1.2k insertions, ~9.4k deletions. `pnpm run dead-code` (Biome + knip default + knip production) and `pnpm run type-check` are clean; 2277 tests pass. * refactor(workspace): rename @ktx/cli to @kaelio/ktx and pack it directly Promote the CLI workspace package to the public name `@kaelio/ktx` and drop the separate `scripts/build-public-npm-package.mjs` wrapper. The CLI package is now publishable in place (`publishConfig.access: public`, `provenance: true`), so artifact packing uses `pnpm pack` against `packages/cli/` instead of assembling a parallel package tree. Updates all workspace filter invocations, docs, tests, and release readiness checks to reference the new package name, and folds the tarball-name helper into `scripts/public-npm-release-metadata.mjs`. * docs: align "agent clients" and "data agents" terminology Replace "client agents" with "agent clients" and "database agents" with "data agents" across AGENTS.md, README.md, the docs-site copy, and the matching setup-agents test description, matching the canonical vocabulary in docs/terminology.md. Also moves packages/cli/tsconfig.json's tsBuildInfoFile from node_modules/.cache/ to dist/.tsbuildinfo so incremental builds survive node_modules reinstalls. 
* refactor(release): single source of truth for package version Make packages/cli/package.json the single source of truth for the @kaelio/ktx version. publicNpmPackageVersion() now reads it directly, so artifact filenames, release-readiness checks, and the Python wheel version all derive from one field. The duplicate release-policy.json.publicNpmPackageVersion is removed. Previously the two fields could drift: tarballs were named kaelio-ktx-0.4.1.tgz while internally containing @kaelio/ktx@0.0.0-private. - update-public-release-version.mjs rewrites both Python pyproject.toml files (ktx-daemon, ktx-sl) alongside the npm package.jsons, normalizing the version for PEP 440 (e.g. 0.1.0-rc.2 -> 0.1.0rc2). - semantic-release-config.cjs adds the two pyproject.toml files to @semantic-release/git assets so the release commit back to main carries every version source in lockstep. - The six "?? '0.0.0-private'" fallback literals across the CLI are replaced with "?? getKtxCliPackageInfo().version", and createDefaultKtxMcpServer makes its version arg required. - docs/release.md describes the actual commit-back model: the dev tree always reflects the most recent release; no sentinel pin to maintain. Verified: pnpm run artifacts:build now produces kaelio-ktx-0.4.1.tgz and kaelio_ktx-0.4.1-py3-none-any.whl with @kaelio/ktx@0.4.1 inside. Full type-check, dead-code, and 2287 vitests + 173 script tests pass. * refactor(cli): inject embedding provider resolution and detect sentence-transformers runtime Make resolveProjectEmbeddingProvider and runtimeIo injectable in ingest and scan command entrypoints so tests can stub them, and teach resolvePublicIngestRuntimeRequirements to flag the local-embeddings runtime feature when ktx.yaml selects sentence-transformers. * chore(cli): mark buildLocalStatsStatus and LocalStatsStatus as @internal Both symbols are consumed only by status-project.test.ts. 
Annotating with /** @internal */ keeps knip's production-mode check clean without changing runtime behavior. * fix(cli): use real package metadata in print-command-tree The stubbed package name embedded a forbidden product identifier that tripped the boundary check in CI. Read the metadata from package.json instead — keeps the rendered tree unchanged and removes a duplicate source of truth. * feat(cli): show embedding coverage in `ktx status`, drop duplicate disk counts Inline `(N embedded)` next to the Wiki scope counts and Semantic-layer source counts, computed with `SUM(embedding_json IS NOT NULL)` over `knowledge_pages` and `local_sl_sources`. Rename the "Knowledge" label to "Wiki" (canonical per `docs/terminology.md`) and rename the matching `localStats.knowledgePages` field to `localStats.wikiPages`. Drop `wiki=N md` and `semantic-layer=N yaml` from the Disk row — those duplicated the per-surface rows above. Disk now reports only actual byte usage (db, cache, raw-sources). The unused `wikiGlobalMarkdownCount` / `semanticLayerYamlCount` fields, the `isMarkdownEntry` / `isYamlEntry` helpers, and the `filter` arg on `summarizeDir` are removed.	2026-05-21 15:28:58 +02:00
Andrey Avtomonov	a1cfb03d73	docs: add code-design principles and link from AGENTS.md (#199) Add docs/code-design.md with seven cross-cutting principles for avoiding overengineering (one way to say one thing, behavior follows from inputs, failures must reach a decision-maker, no seams without a second consumer, spec-and-behavior drift, verify the path you fixed, naming asymmetries). Reference it from AGENTS.md with the same weight as in-file MUST/MUST NOT rules, mirroring the docs/terminology.md pattern.	2026-05-21 14:33:27 +02:00
Andrey Avtomonov	1c7131c6c2	feat(cli): add --fast flag and Local data section to ktx status (#198) Add --fast to skip checks requiring external communication (Claude Code auth probe and Postgres pg_stat_statements probe); skipped checks render as `-` and carry `"status": "skipped"` in JSON output. Always show a new Local data section sourced from .ktx/db.sqlite (ingest run counts and last-completed per connection, knowledge page counts by scope, semantic layer source/dictionary value counts) plus on-disk sizes for .ktx/db.sqlite, .ktx/cache/, raw-sources/, wiki/global/, and semantic-layer/. Wrap the remaining slow probes in a @clack/prompts spinner when stdout is a TTY.	2026-05-21 14:13:03 +02:00
Andrey Avtomonov	50c7bbc957	docs: bold Claude Pro/Max subscription note in README	2026-05-21 13:09:41 +02:00
Andrey Avtomonov	d67cf0aab8	docs: add agent terminology rules and link from AGENTS.md (#197) Introduce `docs/terminology.md` with the canonical vocabulary coding agents should use across docs, code, comments, CLI strings, and error messages — including the disambiguation rule for the overloaded word `source` (semantic / primary / context / source of truth) and a converged-vs-banned table covering connectors, ingest modes, MCP naming, reconciliation, and supported-system orderings. Reference the new file from the `Product Naming` section of AGENTS.md so Claude, Codex, and Gemini all pick it up via the existing AGENTS.md symlinks.	2026-05-21 12:52:50 +02:00
Andrey Avtomonov	488b955024	fix: surface silent failures in SL, wiki, and embedding wiring (#195 ) * fix: surface silent failures in SL, wiki, and embedding wiring - require non-empty `vertex.location` in the project schema instead of defaulting to an empty string with a description that promised SDK fallback the resolver never honored - log YAML parse failures from `SemanticLayerService.loadSource` and `KnowledgeWikiService.readPage` so corrupted overlays aren't silently treated as "does not exist" by ingest/agent tools - push directory-listing errors in `loadAllSources` and `listPageKeys` into the load-error / log path instead of returning empty success - accept an `embeddingProvider` in `createLocalProjectMemoryIngest` and plumb the resolved CLI provider through `mcp-server-factory`; warn in both the memory and bundle runtimes when they fall back to `NoopEmbeddingPort` while the project config requests an active embedding backend - clarify `embeddings.dimensions` description as a placeholder valid only with `backend: none`, and tighten the sentence-transformers `base_url` description to call out that managed-daemon resolution is CLI-only * test: improve PR coverage	2026-05-21 10:38:23 +02:00
Andrey Avtomonov	9fc715ac6a	fix(cli): treat omitted sentence-transformers base_url as managed daemon (#194) After PR #184 and #192 moved managed-embeddings URL resolution to the CLI project boundary and made `ktx setup` persist `ktx.yaml` without a `base_url`, the status command still treated the empty value as misconfiguration and printed "no base_url configured", dragging the verdict down to "Partially ready — embedding credentials missing". Update `buildEmbeddingsStatus` to recognize the managed-daemon convention and report it as ok. Add a `status-project.test.ts` covering the explicit-url, omitted, empty-string, and openai-missing-key paths.	2026-05-21 02:38:34 +02:00
Andrey Avtomonov	0958bc03dc	fix: surface silent failures and drop unused dead-code paths (#193 ) Address overengineering audit findings across cli/context/connector packages: - F1 Snowflake `query`: drop bare catch that flattened all errors to empty result - F2 memory-agent: treat LLM `stopReason === 'error'` as crash (skip squash-merge) - F3 WikiSearchTool: description honest about token-only fallback vs sqlite-fts5 hybrid - F5 Scan enrichment provider resolution: return discriminated status and surface distinct `llm_unavailable` / `embedding_unavailable` warnings per failure mode - F6 Relationship validation budget: drop dead `tableCount === undefined → 'all'` branch; update tests to pass `tableCount` like production - F8 `ktx sql`: use canonical `resolveOutputMode` (now honors KTX_OUTPUT/CI/TTY) - F9 MCP stdio server: default `protocolIo.stderr` to `process.stderr` so memory_ingest startup failures are visible - F13/F14 Scan/setup JSON readers: distinguish ENOENT from corruption instead of silently treating both as missing - F15 `createKtxCliScanConnector`: throw config-shape error when driver matches but type guard rejects, instead of "no native connector" - F16 ContextEvidenceSearchTool: surface `embedding_unhealthy:<reason>` instead of silently dropping the semantic lane - F17 PromptService: default partials to `[]` (removes stale `clinical_policy` reference from a prior product) - F20 `contextBuildCommands`: drop unused `runId` parameter Dead-code removal: - F4 Delete `AgentRunnerService` (duplicated `RuntimeAgentRunner`, only test-used); migrate tests to exercise `AiSdkKtxLlmRuntime.runAgentLoop` directly - F7 Delete `KtxScanOrchestrator` and its test (no production callers; the inline pipeline in `runLocalScan` is the single source of truth) - F18 Delete `generateKtxText`/`generateKtxObject` pass-through helpers; inline the single `runtime.generateObject` call at its caller Plus a clarifying comment on the SQLite `resolveStringReference` `file:` carve-out 
(load-bearing for SQLite URI form, not a bug).	2026-05-21 02:38:18 +02:00
semantic-release-bot	7737ccaf1a	chore(release): 0.4.1 [skip ci] ## [0.4.1](https://github.com/Kaelio/ktx/compare/v0.4.0...v0.4.1) (2026-05-21) ### Bug Fixes * cli: resolve embedding provider explicitly and surface lane status in sl search ([#192](https://github.com/Kaelio/ktx/issues/192)) (`9d92c79988`) ### Documentation * concepts: add Wiki retrieval pillar page ([#191](https://github.com/Kaelio/ktx/issues/191)) (`ed2d2f9be0`) ### Other Changes * docs-site: add dev shortcut and fix hero heading clipping ([#190](https://github.com/Kaelio/ktx/issues/190)) (`56a967278a`)	2026-05-21 00:23:17 +00:00
Andrey Avtomonov	9d92c79988	fix(cli): resolve embedding provider explicitly and surface lane status in sl search (#192 ) * feat(cli): add tryUseManagedLocalEmbeddingsDaemon for read-only callers * feat(cli): add resolveProjectEmbeddingProvider helper * fix(cli): wire sl search through resolveProjectEmbeddingProvider so semantic lane works * fix(cli): wire wiki/knowledge search through resolveProjectEmbeddingProvider * feat(cli): surface embeddings-unavailable status when sl search returns empty * refactor(cli): route admin reindex through resolveProjectEmbeddingProvider * refactor: pass embeddingProvider into ingest/scan instead of resolving inside @ktx/context * refactor(mcp): resolve embedding provider in CLI factory, pass into context ports * refactor(context): delete MANAGED_SENTENCE_TRANSFORMERS_BASE_URL sentinel * refactor(cli): delete sentinel-based managed-embeddings indirection * chore: scrub stale managed-embeddings sentinel references from tests and smoke script * chore: unexport unused EmbeddingResolutionMode alias * fix(cli): force pathPrefix="" when targeting the managed embeddings daemon The managed daemon serves /embeddings/compute directly. The default pathPrefix in @ktx/llm is /api, so omitting sentenceTransformers from ktx.yaml produced /api/embeddings/compute -> 404. The resolver now sets pathPrefix='' explicitly when wiring the managed daemon URL, matching what the daemon actually exposes.	2026-05-21 02:21:22 +02:00
Andrey Avtomonov	56a967278a	chore(docs-site): add dev shortcut and fix hero heading clipping (#190 ) * chore(docs-site): add dev shortcut and fix hero heading clipping - Add `pnpm docs` script that frees port 3000 then runs the docs-site dev server, so the docs preview is one command away. - Bump hero heading line-height to 1.2 and add 0.15em bottom padding so the gradient text-clip no longer cuts off descenders. - Sync auto-generated next-env.d.ts to the current Next types path. * fix(ci): unblock CI on docs-font branch - Add lsof to knip ignoreBinaries so the new `pnpm docs` script (which uses `lsof -ti:3000` to free port 3000) does not trip the Unlisted binaries check. - Make CLI version assertions read @ktx/cli/package.json at runtime instead of hardcoding 0.0.0-private. The 0.4.0 release commit on main bumped the package version, breaking 18 hardcoded test cases in index.test.ts and admin-reindex.test.ts; reading the version dynamically keeps the suite green across future version bumps. * fix ci release version fixtures	2026-05-21 01:30:45 +02:00
Andrey Avtomonov	ed2d2f9be0	docs(concepts): add Wiki retrieval pillar page (#191 ) * docs(concepts): add Wiki retrieval pillar page Adds a dedicated concept page covering the wiki side of the context layer: the page contract, the hybrid retrieval pipeline (lexical, semantic, token lanes fused by RRF), the refs/sl_refs/[[wikilink]] graph, validation that keeps edges live, and where ingest sources pages. Wired into concepts nav and cross-linked from the-context-layer to mirror the existing Semantic querying link. * test: derive release versions in tests instead of hardcoding 0.1.0-rc.1 After @semantic-release/git started committing version bumps back to the branch, the 0.4.0 release rewrote package.json, packages/cli/package.json, and release-policy.json — but the script and CLI tests still pinned the pre-bump strings (0.0.0-private, 0.1.0-rc.1, 0.1.0rc1), so every new branch off main failed TypeScript checks and Coverage. Drive the version off the existing source of truth instead: read @ktx/cli/package.json via createRequire in the CLI tests, and reuse the already-imported PUBLIC_NPM_PACKAGE_VERSION / RUNTIME_WHEEL_PACKAGE_VERSION constants in the script tests. The two assertions that pinned those constants to specific values become semver shape checks.	2026-05-21 01:26:58 +02:00
semantic-release-bot	2f647d5c68	chore(release): 0.4.0 [skip ci] ## [0.4.0](https://github.com/Kaelio/ktx/compare/v0.3.0...v0.4.0) (2026-05-20) ### Features * release: one version everywhere via @semantic-release/git ([#186](https://github.com/Kaelio/ktx/issues/186)) (`2f70861a18`) ### Bug Fixes * correct repository URL casing to match canonical Kaelio org name ([#188](https://github.com/Kaelio/ktx/issues/188)) (`b43000f961`) ### Documentation * standardize ktx naming ([#187](https://github.com/Kaelio/ktx/issues/187)) (`17647a436a`) ### Continuous Integration * release: restore RELEASE_PAT for branch push ([#189](https://github.com/Kaelio/ktx/issues/189)) (`16f8a35bee`), closes [#188](https://github.com/Kaelio/ktx/issues/188)	2026-05-20 15:59:28 +00:00
Andrey Avtomonov	16f8a35bee	ci(release): restore RELEASE_PAT for branch push (#189) Re-applies the RELEASE_PAT wiring on top of the URL-casing fix in #188. The default GITHUB_TOKEN authenticates as github-actions[bot], which cannot be added to either restrictions or bypass_pull_request_allowances on a protected branch. With #188 removing the URL redirect, the PAT auth header now survives all the way to the protected-branch hook; since RELEASE_PAT belongs to andreybavt (verified via /user) and andreybavt is in the bypass list, the push should now be accepted.	2026-05-20 17:57:35 +02:00
Andrey Avtomonov	b43000f961	fix: correct repository URL casing to match canonical Kaelio org name (#188) semantic-release pushes the release commit to whatever repository.url holds, then GitHub 301-redirects the lowercase /kaelio/ktx.git to the canonical /Kaelio/ktx.git. The redirect causes branch-protection actor evaluation to misbehave (bypass list matches are lost). Pinning the correct case avoids the redirect entirely.	2026-05-20 17:55:34 +02:00
Andrey Avtomonov	17647a436a	docs: standardize ktx naming (#187) * docs: align KTX terminology * docs: standardize ktx naming	2026-05-20 17:33:38 +02:00
Andrey Avtomonov	2f70861a18	feat(release): one version everywhere via @semantic-release/git (#186) * feat(release): commit version files back to branch for one-version-everywhere Add @semantic-release/git to the release plugin chain so the bumped package.json, release-policy.json, and packages/cli/package.json land back on the release branch after publish. This keeps the published npm version and the in-repo version files in sync, so local builds from main report the released version (e.g. ktx --version and the daemon /health endpoint via KTX_DAEMON_VERSION). Also widens assertPublicNpmReleaseTag to accept branch-<sanitized> tags, unblocking branch RC publishes that pass through update-public-release-version.mjs. * test(release): pin GITHUB_REF_NAME in main-rc releaseTag assertion The bare releaseTag('rc') call defaulted to process.env.GITHUB_REF_NAME, which on PR CI is the merge ref (e.g. 186/merge) and yields 'branch-186-merge' instead of 'next'. Pass an explicit { GITHUB_REF_NAME: 'main' } so the test exercises the main-rc path regardless of CI env.	2026-05-20 17:01:26 +02:00
Andrey Avtomonov	2667952aa9	fix: recover snapshots and branch rc tags (#185)	2026-05-20 15:22:01 +02:00
Andrey Avtomonov	c24e07a115	fix(cli): resolve managed-embeddings daemon URL at project boundary (#184 ) A clean `ktx setup` was failing verification because the managed local-embeddings daemon URL was passed library-side through `process.env[KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL]`, and the setup flow never wrote that variable. With no resolved URL the embedding provider was null, the deep scan emitted `scan_enrichment_backend_not_configured`, descriptions + embeddings stayed `skipped`, and the agent-readiness check exited 1. Replace the env-var indirection with CLI-side substitution at the project-load boundary. New `loadKtxCliProject` wraps `loadKtxProject`, ensures the managed daemon when `managed:local-embeddings` is present in `config.ingest.embeddings` or `config.scan.enrichment.embeddings`, and substitutes the resolved baseUrl into the in-memory config. Runtime entry points (scan, ingest, public-ingest, admin-reindex) use the new loader; setup-time persistence paths keep raw `loadKtxProject` so the on-disk `ktx.yaml` keeps the portable sentinel. Cleanup follows from the new design: drop `MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV`, remove the env-var lookup branch in `resolveSentenceTransformersBaseUrl`, drop the `env` field from `ManagedLocalEmbeddingsDaemon`, and collapse the manual daemon-ensure dance in `admin-reindex.ts`.	2026-05-20 14:43:02 +02:00

403 commits