apunkt/ktx

mirror of https://github.com/Kaelio/ktx.git synced 2026-06-07 07:55:13 +02:00

Author	SHA1	Message	Date
Andrey Avtomonov	e70ae1e63b	feat(query-history): scope mining to modeled schemas by default (#258) * feat(query-history): structure SQL analysis table refs * feat(query-history): qualify SQL analysis table refs * feat(query-history): wire modeled scope floor through ingest * chore(query-history): verify scope floor * test(query-history): align daemon SQL batch endpoint contract * feat(query-history): build scope from same-run scan catalog * feat(query-history): fail open on scope-floor catalog failures * chore(query-history): verify scope-floor v1 closure * refactor(query-history): share scope membership * feat(setup): apply derived query history filters * docs: document derived query history filters * fix(query-history): redact filter picker LLM prompt SQL * fix(setup): run filter picker SQL analysis through managed daemon * chore(query-history): verify filter picker v1 closure * fix(query-history): fail open on partial service-account attribution * fix(query-history): aggregate BigQuery users by execution count * fix(query-history): aggregate Snowflake users by execution count * fix(query-history): use BigQuery query info hash	2026-06-03 17:19:42 +02:00
Andrey Avtomonov	494618ab14	feat: add codex llm backend for ktx runtime work (#253 ) * feat: add codex sdk runner foundation * feat: parse codex runtime events * feat: expose codex runtime mcp tools * feat: add codex llm runtime * feat: wire codex llm backend * test: avoid Array.fromAsync in codex runner test * docs: document codex llm backend * fix: tighten codex runtime config ownership * fix: use codex sdk env and thread options * fix: parse codex sdk event shapes * test: add codex backend live smoke * docs: clarify codex backend isolation * fix: drive codex loop metrics from mcp events * fix: enforce codex local step budget * docs: disclose codex isolation limits * fix: count all codex agent steps and stream step callbacks live The agent-loop step budget only counted completed mcp_tool_call items, so built-in command_execution steps (which the public Codex SDK/CLI surface can still expose) never decremented the budget, letting ingest/reconciliation run past stepBudget until Codex stopped on its own. onStepFinish was also replayed only after the whole stream drained, so live work_unit_step / reconciliation progress appeared stuck until the Codex process exited. collectEvents is now the single live step accumulator: it counts every completed agent-action item via a shared isCompletedAgentStep predicate (command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish as each step completes, and enforces the budget on that broader count. A no-tool turn still counts as one step. toolFailures stays MCP-specific, since a non-zero command exit is normal agent exploration, not a loop failure. * test: align ingest llm-guard assertions with codex backend The skip-llm ingest guard message now lists codex as a valid backend and mentions a Claude Code/Codex session plus a codex setup hint, but this slow suite test still asserted the pre-codex wording. 
Update it to match the production message (already covered by the local-bundle-runtime unit test) and add the codex setup-line assertion. * fix: treat codex error:null tool calls as success The Codex SDK serializes error: null on successful mcp_tool_call items, so the failure check (item.error !== undefined) flagged every successful tool call as failed with the empty-payload default "Codex turn failed". This killed every ingest work unit under the codex backend before it could produce a patch. Key on status === 'failed' (authoritative, always set) and only treat a populated error object as a failure. Add a regression test built from a verbatim real-SDK event capture. * fix: default codex backend to gpt-5.5 and report real probe errors The previous default gpt-5.3-codex is an API-key-only model that the OpenAI API rejects under ChatGPT-account (subscription) auth, so codex status/setup failed with a misleading "authentication is not usable" message even though auth was fine. - Default codex model is now gpt-5.5 (works on both subscription and API-key auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark). - runCodexAuthProbe now distinguishes "model not available" from an auth failure and surfaces the real API error: collectEvents retains stream events when the SDK throws on a non-zero exit, and the API error JSON envelope is unwrapped to its human-readable message. - The Codex isolation warning now renders inside the clack setup frame. - Docs updated to gpt-5.5 with a note that -codex ids require API-key auth. fix: require llm.models.default in status and match codex probe remediation Status reported a project ready when a non-none LLM backend was configured without llm.models.default, but the runtime (resolveModelSlots) hard-requires it, so ingest/scan/memory threw after `ktx status` said the project was usable. 
buildLlmStatus now fails for any non-none backend missing models.default and no longer invents a fallback model for claude-code/codex. Codex probe failures now carry a category-matched fix: a model-access failure steers the user at llm.models.default instead of the auth/install remediation. runCodexAuthProbe returns the fix and status consumes it; the message stays self-sufficient so setup output is unchanged. Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx states --llm-model only accepts codex/default or gpt-/codex- ids. Repaired four doctor fixtures that configured a backend without models.default (the now-correctly-blocked config) and added coverage for the new behavior.	2026-06-02 13:57:11 +02:00
Andrey Avtomonov	3f0d11e07d	feat(cli)!: remove fast mode; ktx ingest always builds enriched context (KLO-721) (#237) Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed. ktx ingest now always builds the full enriched ("deep") context. There is no structural fallback: a database connection without a configured model and embeddings fails the enrichment-readiness preflight before any work runs, with a 'Run ktx setup to configure a model and embeddings' hint. - Remove --fast/--deep flags, the per-connection context.depth field, and the ktx setup depth prompt (delete setup-database-context-depth.ts). - Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every database target. - Drop the database-context-depth telemetry step (Node + Python schema mirrors regenerated). - Update CLI, setup, context-build view, docs, the public ktx skill, and the release-smoke / artifacts scripts (now assert the no-LLM guard failure). ktx status --fast (a separate network-probe flag) is unchanged. Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore credentialed successful-ingest release smoke coverage).	2026-05-29 17:41:04 +02:00
Andrey Avtomonov	637891f030	fix(cli): align Notion setup credential to --source-auth-token-ref (#236) Notion's setup path read --source-api-key-ref while writing the auth_token_ref config field, so --source-auth-token-ref was silently dropped. Align Notion to the flag=field convention every other connector follows: it now reads --source-auth-token-ref, and --source-api-key-ref becomes Metabase-only. Also add validation rejecting any credential-ref flag not applicable to the chosen --source, with a pointer to the correct flag, closing the silent-drop class for all connectors. Update CLI-reference docs, the ktx skill Notion example, and tests. Fixes KLO-724.	2026-05-29 17:23:46 +02:00
Andrey Avtomonov	56985b7e09	test: split cli tests from source tree (#216 ) * feat(cli): define full warehouse dialect contract * test(cli): keep dialect edge tests focused * fix(cli): stabilize dialect contract foundation * refactor(connectors): own read-only query preparation * refactor(connectors): resolve dialects through registry * refactor(connectors): keep concrete dialect classes internal * chore(workspace): enforce dialect import boundary * refactor(cli): resolve relationship dialect at scan boundary * refactor(cli): use dialect display parsing for entity details * refactor(cli): use dialect display parsing for warehouse catalog * refactor(cli): use dialect SQL in relationship workflows * test(cli): verify solid dialect scan workflow closure * test: split cli tests from source tree * refactor(cli): standardize BigQuery scope listing * feat(sqlite): implement connector scope listing * test(connectors): cover required table listing * feat(cli): add warehouse driver registry * refactor(setup): route scope discovery through driver registry * refactor(cli): route local query execution through driver registry * refactor(historic-sql): route dialect support through driver registry * refactor(cli): test warehouse connections through driver registry * fix(cli): close driver registry type export gaps * Improve setup daemon diagnostics * refactor(setup): centralize rail-prefixed diagnostics + query-history fallback Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput into clack.ts so the setup wizard, managed daemons, and embedding/agent steps share one rail-formatted writer. setup-databases.ts also adds a "disable query history and retry" option when the schema-context build fails and query history is the likely culprit, surfaced via a new failed-query-history-unavailable status. 
* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match The setup picker's KtxTableListEntry was a 2-level { schema, name }, so qualifiedTableId always wrote db.name into enabled_tables. When BigQuery, Snowflake, or SQL Server later ran fast ingest, their introspect step filtered the scope set with scopedTableNames(scope, { catalog: projectId|database, db }); catalog was non-null on the introspect side but null in the scope refs, so every entry was rejected, the live-database adapter staged zero table files, and detect() failed with 'Adapter "live-database" did not recognize fetched source output'. Align the picker boundary with the canonical 3-level KtxTableRef: - Add catalog: string | null to KtxTableListEntry. - BigQuery/Snowflake/SQL Server listTables populate catalog from the resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null. - qualifiedTableId emits catalog.schema.name when catalog is non-null (resolveEnabledTables already accepts the 3-part shape) and schemasFromEnabledTables now goes through parseDottedTableEntry so it recovers the schema correctly from both 2-part and 3-part entries. - Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker reuse. Update listTables expectations in all seven connector tests and the setup / picker test fixtures. Add a picker regression test that covers the catalog-bearing round-trip (save + refine). * fix(cli): allow debug telemetry under opt-out env	2026-05-26 08:49:05 +02:00
Andrey Avtomonov	78b8a0c025	feat(connectors): generalize readiness and constraint handling (#212) * feat(connectors): add postgres maxConnections * feat(connectors): add mysql maxConnections * feat(connectors): add sqlserver maxConnections * feat(connectors): rename snowflake pool config * docs: document connector maxConnections * feat(scan): add constraint discovery warning helper * feat(scan): carry structural warnings through reports * feat(postgres): soft-fail denied constraint discovery * feat(mysql): soft-fail denied constraint discovery * feat(sqlserver): soft-fail denied constraint discovery * feat(bigquery): soft-fail denied primary key discovery * feat(snowflake): report denied primary key discovery * test(scan): verify constraint discovery warnings * feat(historic-sql): use shared readiness probes * docs: document query history readiness probes * test(historic-sql): verify readiness probe registry * test(ingest): account for live database warnings artifact * Add skip option for agent setup	2026-05-24 19:30:06 +02:00
Andrey Avtomonov	cfd1749ab9	feat(cli): skip-context-sources menu + clack-style tree picker UX (#213) * feat(cli): add 'skip context sources' option to database setup menu After databases are configured, the post-setup menu now offers a 'Skip context sources' choice equivalent to passing --skip-sources, which plumbs through KtxSetupDatabasesResult.skipSources to bypass the context-source step in the same run. * feat(cli): standardize tree picker UX after clack autocomplete-multiselect Search is always on (no '/' to enter): typed printable chars feed the query, Tab toggles selection on the focused node without leaving the search bar, and Space toggles only after arrow-key navigation (isNavigating); otherwise it is appended to the query. Esc clears a non-empty query before quitting, Ctrl+A and Ctrl+N replace bare-letter bulk bindings, and the cursor refocuses on the first match when the query change would hide it.	2026-05-24 19:29:37 +02:00
Andrey Avtomonov	c87d14a554	feat(cli): redesign database scope picker for searchable schema-first setup (#203) * feat: add searchable setup prompt pickers * fix: make snowflake scope discovery single query * fix: make bigquery table discovery schema scoped * fix: honor mysql and clickhouse database scope * feat: wire schema scope discovery for all relational setup drivers * feat: add schema-first database scope picker * test: update setup prompt stubs for type-check * docs: document database scope picker fields * Fix database setup edit preservation --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-22 14:22:11 +02:00
Andrey Avtomonov	17647a436a	docs: standardize ktx naming (#187) * docs: align KTX terminology * docs: standardize ktx naming	2026-05-20 17:33:38 +02:00
Andrey Avtomonov	6dbb0c8b3a	feat(cli): add ktx admin reindex (#160) * feat(cli): add admin reindex * fix: keep lexical-only reindex incremental	2026-05-20 01:36:54 +02:00
Andrey Avtomonov	620d6adbe6	docs: rewrite Semantic Querying concept with imperative-vs-declarative diagram (#156 ) * docs: rewrite Semantic Querying concept with imperative-vs-declarative diagram Reframe semantic-layer-internals.mdx around the contract the semantic layer offers an agent: declare what you want (a Semantic Query), KTX figures out how to compute it. Replaces the old "Context-Aware SQL" framing with a clear imperative-vs-declarative narrative. Adds a React Flow component (semantic-layer-flow.tsx) that contrasts a buggy 4-table agent-authored SQL (chasm trap, LEFT-JOIN-in-WHERE, hardcoded DATE_TRUNC) against the chasm-safe per-fact CTE SQL the planner actually emits, including the outer GROUP BY over the requested dimensions. Both lanes converge into a shared warehouse node and each SQL card now has parallel bullet notes (failures on the left, KTX behavior on the right). Side fixes bundled in: - include the /ktx basePath in the favicon metadata so the icon resolves under the production prefix - migrate docs-site/middleware.ts to docs-site/proxy.ts (Next 16 rename) - redirect / to /ktx/docs/getting-started/introduction so the apex docs URL works - add tests covering the apex redirect, the favicon basePath, and the middleware-to-proxy rename - propagate the Semantic Query terminology across the ktx-sl CLI reference, the context-layer concept page, and the agent-clients / primary-sources integration pages * Fix CI dead-code failures * docs-site: polish semantic-layer-internals code blocks and flow diagram - Make CodeBlock a server component so children traverse synchronously under React 19 RSC streaming; previously extractText returned "" in dev SSR, leaving code blocks empty. - Add custom JSON/YAML/SQL/code-like tokenizers with theme-aware token classes; drop the colored file-glyph dot and gradient tab-head. 
- Tighten tab-head: subtle grey background, smaller monospace filename in muted grey, smaller rectangular language pill placed to the left of the filename. - Polish the React Flow semantic-layer diagram (controls, fit-view padding, edge types). * docs-site: annotate imperative SQL, add section anchor, drop ClickHouse - Wire numbered red badges to each problematic span in the "Without KTX" SQL with hover sync between SQL gutter, lines, and the notes list. - Add #imperative-vs-declarative anchor on the flow section header so the eyebrow link is shareable; reveals a # glyph on hover/focus. - Align the compiled-SQL note dots to the first-line midpoint (mt-[6px] instead of mt-1) so 4px dots sit at y=8 in a 16px line. - Remove all ClickHouse references from docs-site (primary-sources, quickstart, ktx-setup, contributing, agents-setup, mechanics test, warehouse drivers in the flow diagram). * test: drop ClickHouse contributing-docs assertion Align the workspace-package mirror test with the ClickHouse removal from docs-site (`75907eb`). The connector-clickhouse package still exists in packages/, but contributing.mdx no longer lists it, so the test that mirrored docs against the workspace was failing.	2026-05-19 23:41:29 +02:00
Andrey Avtomonov	590dd5dddb	fix(cli): simplify setup flags and agents tty handling (#155) * fix(cli): simplify setup flags and agents tty handling * fix(context): update ingest setup guidance flag	2026-05-19 19:23:35 +02:00
Andrey Avtomonov	b42f418adc	fix: allow agent setup without context (#139) * fix: allow agent setup without context * docs: align readme command examples	2026-05-19 12:18:52 +02:00
Andrey Avtomonov	c89af7733a	fix: improve ingest runtime readiness (#124) * fix: improve ingest runtime readiness * fix(cli): mock runtime in slow setup tests * test(cli): isolate setup runtime status	2026-05-17 10:27:29 +02:00
Andrey Avtomonov	b565e44a22	feat: add claude-code llm backend with runtime port (#115 ) * docs: revise claude-code ingest backend spec * docs: keep claude-code spec focused on ingest * docs: expand claude-code spec to full llm parity * Refine claude-code backend spec after adversarial review iteration 1 * Refine claude-code backend spec after adversarial review iteration 2 * Refine claude-code backend spec after adversarial review iteration 3 * feat: recognize claude-code llm backend * feat: add ktx llm runtime port * feat: add claude-code llm runtime * feat: route non-agent llm calls through runtime * feat: run ingest agents through llm runtime * feat: support claude-code setup and status * test: verify claude-code backend runtime * docs: add claude-code backend v1 runtime plan * fix: close claude-code runtime isolation checks * fix: warn on claude-code prompt caching during setup * chore: verify claude-code v1 closure * docs: add claude-code backend v1 isolation closure plan * fix: update claude-code ingest setup guidance * docs: add claude-code backend v1 ingest guidance closure plan * docs: align claude-code isolation spec with sdk metadata * test: cover claude-code host discovery metadata * fix: tolerate claude-code host discovery metadata * docs: clarify claude-code host discovery metadata * docs: add claude-code auth-probe isolation fix plan * chore: prepare kaelio ktx rc1 release * chore: add semantic release workflow * fix: unblock ci checks * chore(release): 0.1.0-rc.1 * feat: add Claude Code model selection to setup * fix: keep git maintenance attached in local repos	2026-05-16 12:06:34 +02:00
Luca Martial	42b688e934	Align docs with current KTX behavior (#106 ) * docs: align docs with current KTX behavior * fix: generate valid agent sl query command * docs: clarify KTX product mechanics * fix: use <ol> for runtime pipeline steps in product mechanics The PipelineStep component renders <li> elements, but the RuntimeDiagram wrapper was a plain <div> instead of a list element. This produced invalid HTML and accessibility warnings. IngestionDiagram already used <ol>. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Add docs favicon * docs: add semantic layer internals concept * docs: refine documentation source label * docs: clarify company documentation examples --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 15:31:51 -04:00
Luca Martial	5cf2c89093	Revise CLI reference docs (#100) * docs: revise CLI reference * docs: sync CLI reference with current commands	2026-05-14 12:53:55 -04:00
Luca Martial	372c90b533	Polish documentation copy (#98)	2026-05-14 12:43:14 -04:00
Andrey Avtomonov	b00c1a11a9	feat: merge ingest and scan * docs: add CLI component reuse guidance * docs: add unified ingest ux design * Refine unified ingest UX design after adversarial review iteration 1 * Refine unified ingest UX design after adversarial review iteration 2 * Refine unified ingest UX design after adversarial review iteration 3 * feat(cli): route public connection ingest command * feat(cli): hide standalone scan from public help * feat(cli): plan public ingest depth and query history * feat(cli): execute public database ingest facets * feat(ingest): read connection query history config * fix(cli): use public ingest wording * fix(config): stop generating ingest adapter allow lists * docs: document public ingest command * test: align ingest surface expectations * docs: add unified ingest public CLI surface plan * feat(cli): preflight deep public ingest readiness * feat(setup): store query history in connection context * feat(setup): store database context depth * feat(setup): verify context readiness by database depth * fix(setup): keep context build foreground only * fix(config): reject reserved ingest connection ids * test: close unified ingest v1 expectations * docs: add unified ingest v1 closure plan * fix(ingest): bypass adapter allow-list for public source ingest * fix(ingest): honor query history window intent * fix(ingest): hide scan internals from public database ingest * feat(ingest): use foreground view for interactive public ingest * fix(setup): use schema context and query history wording * test(cli): verify unified ingest public output * docs: add unified ingest v1 public output closure plan * fix(setup): forward query history flags * fix(setup): prompt for postgres query history * fix(status): report query history readiness * fix(ingest): remove legacy public guidance * fix(ingest): polish foreground retry copy * docs(examples): use unified query history wording * chore(ingest): finish public query history cleanup * docs: add unified 
ingest v1 query history status cleanup plan * test(docs): cover unified ingest public docs * docs: align ingest CLI reference with unified UX * docs: update context build guides for unified ingest * docs: update setup and primary source ingest wording * docs: stop advertising adapter-backed example ingest * docs: close unified ingest public docs gaps * docs: add unified ingest v1 docs site closure plan * fix: render unified ingest foreground warnings * fix: explain query history schema order * fix: add public ingest retry guidance * fix: align setup next steps with unified ingest * fix: remove scan wording from demo progress * test: verify unified ingest ux closure * docs: add unified ingest v1 foreground and retry closure plan * fix(cli): preserve query-history pull config in public ingest * fix(cli): omit hidden commands from docs command tree * test(cli): close unified ingest final public surface checks * docs: add unified ingest v1 final public surface closure plan * fix(cli): use public source labels in ingest reports * fix(cli): suppress low-level public ingest output * test(cli): verify unified ingest public plain output * docs: add unified ingest v1 public plain output closure plan * fix(cli): add public ingest copy sanitizers * fix(cli): sanitize public ingest progress copy * fix(cli): rename setup schema scope prompt * docs(plan): add progress copy closure; test: align setup back-nav fixture Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires. 
* docs(plan): add final ux labels plan with narrowed label scans * fix(cli): aggregate unsupported query-history warnings * fix(cli): align setup database labels * test(cli): fix setup database test type-check * fix(cli): remove primary-source wording from setup output * test(cli): verify unified ingest setup closure * docs(plan): add unified ingest v1 verification copy closure plan * fix(cli): remove top-level scan command * fix(cli): remove legacy ingest and wiki commands * Merge scan into ingest flow * feat(cli): split ingest progress into per-phase rows, rename work units to tasks Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly. Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone. * Fix test harness failures * Fix CI smoke checks --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>	2026-05-14 01:43:06 +02:00
Luca Martial	c2750dd797	refactor(cli): hide internal setup options and remove dead flags (#79) Hide advanced/internal `ktx setup` options from --help output using .hideHelp() so the command surface is approachable for new users. Remove the --project, --agent-scope, and --skip-initial-source-ingest flags that are no longer needed. Update docs and tests to match. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-13 17:55:25 -04:00
Andrey Avtomonov	17a2fee69a	fix(cli): remove ktx setup subcommands (#42) * fix(cli): remove ktx setup subcommands * test(scripts): update setup-dev status expectation	2026-05-13 00:38:26 +02:00
Andrey Avtomonov	e15a4ebaec	feat(cli): clean up command surface	2026-05-12 23:51:46 +02:00
Luca Martial	885072d2a9	docs(docs-site): normalize CLI references for agents	2026-05-11 16:43:08 -07:00
Andrey Avtomonov	f3f6b36551	Merge remote-tracking branch 'origin/main' into andreybavt/historic-sql-redesign	2026-05-11 20:52:19 +02:00
Andrey Avtomonov	a46563bb01	chore: move docs site workspace	2026-05-11 16:53:42 +02:00

25 commits
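
The tool-call failure check described in commit 494618ab14 (treat `status === 'failed'` as authoritative, and stop counting a serialized `error: null` as a failure) can be sketched roughly as below. This is only an illustration of the logic stated in the commit message, not the repository's code: the `ToolCallItem` type, the function names, and the `message` field are hypothetical, and only the `status` and `error` fields and the `'failed'` value are taken from the message itself.

```typescript
// Hypothetical item shape, for illustration only. The commit message names
// the `status` and `error` fields; everything else here is an assumption.
interface ToolCallItem {
  status: string;                       // e.g. "completed" or "failed"
  error?: { message?: string } | null;  // serialized as null on success
}

// Old check as described in the commit message: any defined `error`,
// including null, was counted as a failure.
function isFailedOld(item: ToolCallItem): boolean {
  return item.error !== undefined;
}

// Revised check: `status` is authoritative; otherwise only a populated
// error object (non-null, with at least one key) counts as a failure.
function isFailedToolCall(item: ToolCallItem): boolean {
  if (item.status === "failed") return true;
  const err = item.error;
  return err !== null && err !== undefined && Object.keys(err).length > 0;
}

// A successful call whose error field was serialized as null.
const ok: ToolCallItem = { status: "completed", error: null };
console.log(isFailedOld(ok), isFailedToolCall(ok)); // prints: true false
```

Keying on `status` first means a missing, null, or empty `error` payload cannot by itself turn a successful call into a failure, which is the regression the commit message describes.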