Adds a new Configuration section to the docs with a reference page that
covers every top-level block of ktx.yaml: connections, setup, storage,
llm, ingest, scan, agent, and memory. Each block lists fields, defaults,
accepted values, and a short YAML example, with a leading schematic that
groups blocks into inputs, compute, and persistence.
Add docs/code-design.md with seven cross-cutting principles for
avoiding overengineering (one way to say one thing, behavior follows
from inputs, failures must reach a decision-maker, no seams without a
second consumer, spec-and-behavior drift, verify the path you fixed,
naming asymmetries). Reference it from AGENTS.md with the same weight
as in-file MUST/MUST NOT rules, mirroring the docs/terminology.md
pattern.
Add --fast to skip checks requiring external communication (Claude Code
auth probe and Postgres pg_stat_statements probe); skipped checks render
as `-` and carry `"status": "skipped"` in JSON output. Always show a new
Local data section sourced from .ktx/db.sqlite (ingest run counts and
last-completed per connection, knowledge page counts by scope, semantic
layer source/dictionary value counts) plus on-disk sizes for .ktx/db.sqlite,
.ktx/cache/, raw-sources/, wiki/global/, and semantic-layer/. Wrap the
remaining slow probes in a @clack/prompts spinner when stdout is a TTY.
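
A minimal sketch of how `--fast` could skip network-bound checks while still reporting them. The `StatusCheck` and `CheckResult` shapes and the `requiresNetwork` flag are assumptions for illustration, not the status command's real types, and probes are synchronous here only to keep the sketch short.

```typescript
// Hypothetical shapes; the real status command's types are not shown in this log.
type CheckStatus = "ok" | "fail" | "skipped";

interface StatusCheck {
  name: string;
  // True when the check has to talk to something outside the project directory.
  requiresNetwork: boolean;
  // Synchronous for brevity; real probes would be async.
  run: () => boolean;
}

interface CheckResult {
  name: string;
  status: CheckStatus;
}

function runChecks(checks: StatusCheck[], fast: boolean): CheckResult[] {
  return checks.map((check) => {
    if (fast && check.requiresNetwork) {
      // Skipped checks are reported, not dropped, so JSON consumers still see them.
      return { name: check.name, status: "skipped" as const };
    }
    return { name: check.name, status: check.run() ? ("ok" as const) : ("fail" as const) };
  });
}

// Text rendering: a skipped check shows as "-".
function renderStatus(status: CheckStatus): string {
  if (status === "skipped") return "-";
  return status === "ok" ? "ok" : "FAIL";
}
```

Keeping skipped checks in the result list means the text and JSON outputs always have the same rows, whichever flags were passed.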
Introduce `docs/terminology.md` with the canonical vocabulary coding
agents should use across docs, code, comments, CLI strings, and error
messages — including the disambiguation rule for the overloaded word
`source` (semantic / primary / context / source of truth) and a
converged-vs-banned table covering connectors, ingest modes, MCP
naming, reconciliation, and supported-system orderings. Reference the
new file from the `Product Naming` section of AGENTS.md so Claude,
Codex, and Gemini all pick it up via the existing AGENTS.md symlinks.
* fix: surface silent failures in SL, wiki, and embedding wiring
- require non-empty `vertex.location` in the project schema instead of defaulting
to an empty string with a description that promised SDK fallback the resolver
never honored
- log YAML parse failures from `SemanticLayerService.loadSource` and
`KnowledgeWikiService.readPage` so corrupted overlays aren't silently treated
as "does not exist" by ingest/agent tools
- push directory-listing errors in `loadAllSources` and `listPageKeys` into the
load-error / log path instead of returning empty success
- accept an `embeddingProvider` in `createLocalProjectMemoryIngest` and plumb the
resolved CLI provider through `mcp-server-factory`; warn in both the memory
and bundle runtimes when they fall back to `NoopEmbeddingPort` while the
project config requests an active embedding backend
- clarify `embeddings.dimensions` description as a placeholder valid only with
`backend: none`, and tighten the sentence-transformers `base_url` description
to call out that managed-daemon resolution is CLI-only
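
A sketch of the pattern behind the parse-failure bullets: keep "file is missing" and "file is corrupted" as separate outcomes and log the second. `loadOverlay`, its injected `read`/`parse`/`warn` callbacks, and the `LoadResult` type are hypothetical; the real `loadSource` and `readPage` signatures are not shown here.

```typescript
// Hypothetical result type: "missing" and "invalid" are never merged.
type LoadResult<T> =
  | { kind: "loaded"; value: T }
  | { kind: "missing" }
  | { kind: "invalid"; reason: string };

function loadOverlay<T>(
  read: () => string | undefined, // returns undefined when the file does not exist
  parse: (text: string) => T, // stand-in for the YAML parser
  warn: (message: string) => void,
): LoadResult<T> {
  const text = read();
  if (text === undefined) return { kind: "missing" };
  try {
    return { kind: "loaded", value: parse(text) };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    // The failure is logged so a corrupted overlay does not look like "does not exist".
    warn(`failed to parse overlay: ${reason}`);
    return { kind: "invalid", reason };
  }
}
```

Callers can then decide per outcome; only `missing` should be treated as "create a new file".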
* test: improve PR coverage
After PRs #184 and #192 moved managed-embeddings URL resolution to the
CLI project boundary and made `ktx setup` persist `ktx.yaml` without a
`base_url`, the status command still treated the empty value as
misconfiguration and printed "no base_url configured", dragging the
verdict down to "Partially ready — embedding credentials missing".
Update `buildEmbeddingsStatus` to recognize the managed-daemon
convention and report it as ok. Add a `status-project.test.ts` covering
the explicit-url, omitted, empty-string, and openai-missing-key paths.
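
The four test paths can be sketched roughly as follows. This is an illustration only: the `EmbeddingsConfig` and `EmbeddingsStatus` shapes, the field names, and the detail strings are assumptions, not the actual `buildEmbeddingsStatus` implementation.

```typescript
// Hypothetical config and result shapes.
interface EmbeddingsConfig {
  backend: "sentence-transformers" | "openai";
  baseUrl?: string;
  apiKey?: string;
}

interface EmbeddingsStatus {
  ok: boolean;
  detail: string;
}

function embeddingsStatus(config: EmbeddingsConfig): EmbeddingsStatus {
  if (config.backend === "openai") {
    return config.apiKey
      ? { ok: true, detail: "openai key configured" }
      : { ok: false, detail: "openai api key missing" };
  }
  // sentence-transformers: an omitted or empty base_url is the managed-daemon
  // convention, not a misconfiguration.
  const baseUrl = (config.baseUrl ?? "").trim();
  return baseUrl === ""
    ? { ok: true, detail: "managed daemon (URL resolved by the CLI at runtime)" }
    : { ok: true, detail: `explicit base_url ${baseUrl}` };
}
```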
Address overengineering audit findings across cli/context/connector packages:
- F1 Snowflake `query`: drop bare catch that flattened all errors to empty result
- F2 memory-agent: treat LLM `stopReason === 'error'` as crash (skip squash-merge)
- F3 WikiSearchTool: make the description honest about token-only fallback vs sqlite-fts5 hybrid
- F5 Scan enrichment provider resolution: return discriminated status and surface
distinct `llm_unavailable` / `embedding_unavailable` warnings per failure mode
- F6 Relationship validation budget: drop dead `tableCount === undefined → 'all'`
branch; update tests to pass `tableCount` like production
- F8 `ktx sql`: use canonical `resolveOutputMode` (now honors KTX_OUTPUT/CI/TTY)
- F9 MCP stdio server: default `protocolIo.stderr` to `process.stderr` so
memory_ingest startup failures are visible
- F13/F14 Scan/setup JSON readers: distinguish ENOENT from corruption instead of
silently treating both as missing
- F15 `createKtxCliScanConnector`: throw config-shape error when driver matches
but type guard rejects, instead of "no native connector"
- F16 ContextEvidenceSearchTool: surface `embedding_unhealthy:<reason>` instead
of silently dropping the semantic lane
- F17 PromptService: default partials to `[]` (removes stale `clinical_policy`
reference from a prior product)
- F20 `contextBuildCommands`: drop unused `runId` parameter
Dead-code removal:
- F4 Delete `AgentRunnerService` (duplicated `RuntimeAgentRunner`, only test-used);
migrate tests to exercise `AiSdkKtxLlmRuntime.runAgentLoop` directly
- F7 Delete `KtxScanOrchestrator` and its test (no production callers; the
inline pipeline in `runLocalScan` is the single source of truth)
- F18 Delete `generateKtxText`/`generateKtxObject` pass-through helpers; inline
the single `runtime.generateObject` call at its caller
Plus a clarifying comment on the SQLite `resolveStringReference` `file:` carve-out
(load-bearing for SQLite URI form, not a bug).
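
Several findings above (F5, F13/F14, F16) share one shape: return a discriminated status instead of an empty result, so the caller can report which failure mode occurred. A minimal sketch for the F5 case, where `ProviderResolution` and `enrichmentWarnings` are hypothetical names and only the `llm_unavailable` / `embedding_unavailable` codes come from the finding itself:

```typescript
// Hypothetical discriminated result for resolving one provider.
type ProviderResolution<T> =
  | { status: "ready"; provider: T }
  | { status: "unavailable"; reason: string };

// Turns each failure mode into its own warning instead of one generic message.
function enrichmentWarnings(
  llm: ProviderResolution<unknown>,
  embedding: ProviderResolution<unknown>,
): string[] {
  const warnings: string[] = [];
  if (llm.status === "unavailable") {
    warnings.push(`llm_unavailable: ${llm.reason}`);
  }
  if (embedding.status === "unavailable") {
    warnings.push(`embedding_unavailable: ${embedding.reason}`);
  }
  return warnings;
}
```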
* feat(cli): add tryUseManagedLocalEmbeddingsDaemon for read-only callers
* feat(cli): add resolveProjectEmbeddingProvider helper
* fix(cli): wire sl search through resolveProjectEmbeddingProvider so semantic lane works
* fix(cli): wire wiki/knowledge search through resolveProjectEmbeddingProvider
* feat(cli): surface embeddings-unavailable status when sl search returns empty
* refactor(cli): route admin reindex through resolveProjectEmbeddingProvider
* refactor: pass embeddingProvider into ingest/scan instead of resolving inside @ktx/context
* refactor(mcp): resolve embedding provider in CLI factory, pass into context ports
* refactor(context): delete MANAGED_SENTENCE_TRANSFORMERS_BASE_URL sentinel
* refactor(cli): delete sentinel-based managed-embeddings indirection
* chore: scrub stale managed-embeddings sentinel references from tests and smoke script
* chore: unexport unused EmbeddingResolutionMode alias
* fix(cli): force pathPrefix="" when targeting the managed embeddings daemon
The managed daemon serves /embeddings/compute directly. The default
pathPrefix in @ktx/llm is /api, so omitting sentenceTransformers from
ktx.yaml produced /api/embeddings/compute -> 404. The resolver now
sets pathPrefix='' explicitly when wiring the managed daemon URL,
matching what the daemon actually exposes.
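
To make the 404 concrete, here is a small sketch of the URL join. `embeddingsComputeUrl` is a hypothetical helper and the port is made up; only the `/api` default and the `/embeddings/compute` route come from the description above.

```typescript
// Hypothetical helper showing how the prefix changes the final URL.
function embeddingsComputeUrl(baseUrl: string, pathPrefix: string = "/api"): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}${pathPrefix}/embeddings/compute`;
}

// Default prefix: the managed daemon has no /api route, so this path 404s.
const withDefaultPrefix = embeddingsComputeUrl("http://127.0.0.1:8765");

// Explicit empty prefix: matches the route the daemon serves.
const withEmptyPrefix = embeddingsComputeUrl("http://127.0.0.1:8765", "");
```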
* chore(docs-site): add dev shortcut and fix hero heading clipping
- Add `pnpm docs` script that frees port 3000 then runs the docs-site
dev server, so the docs preview is one command away.
- Bump hero heading line-height to 1.2 and add 0.15em bottom padding
so the gradient text-clip no longer cuts off descenders.
- Sync auto-generated next-env.d.ts to the current Next types path.
* fix(ci): unblock CI on docs-font branch
- Add lsof to knip ignoreBinaries so the new `pnpm docs` script
(which uses `lsof -ti:3000` to free port 3000) does not trip
the Unlisted binaries check.
- Make CLI version assertions read @ktx/cli/package.json at runtime
instead of hardcoding 0.0.0-private. The 0.4.0 release commit on
main bumped the package version, breaking 18 hardcoded test cases
in index.test.ts and admin-reindex.test.ts; reading the version
dynamically keeps the suite green across future version bumps.
* fix ci release version fixtures
* docs(concepts): add Wiki retrieval pillar page
Adds a dedicated concept page covering the wiki side of the context
layer: the page contract, the hybrid retrieval pipeline (lexical,
semantic, token lanes fused by RRF), the refs/sl_refs/[[wikilink]]
graph, validation that keeps edges live, and where ingest sources
pages. Wired into concepts nav and cross-linked from the-context-layer
to mirror the existing Semantic querying link.
* test: derive release versions in tests instead of hardcoding 0.1.0-rc.1
After @semantic-release/git started committing version bumps back to the
branch, the 0.4.0 release rewrote package.json, packages/cli/package.json,
and release-policy.json — but the script and CLI tests still pinned the
pre-bump strings (0.0.0-private, 0.1.0-rc.1, 0.1.0rc1), so every new
branch off main failed TypeScript checks and Coverage.
Drive the version off the existing source of truth instead: read
@ktx/cli/package.json via createRequire in the CLI tests, and reuse the
already-imported PUBLIC_NPM_PACKAGE_VERSION / RUNTIME_WHEEL_PACKAGE_VERSION
constants in the script tests. The two assertions that pinned those
constants to specific values become semver shape checks.
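
A shape check along these lines keeps the assertion stable across version bumps. The pattern below is a simplified semver matcher written for illustration; it is an assumption, not the regex the tests actually use.

```typescript
// Simplified semver shape: MAJOR.MINOR.PATCH with optional -prerelease and +build.
const SEMVER_SHAPE = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

function isSemverShaped(version: string): boolean {
  return SEMVER_SHAPE.test(version);
}
```

With this pattern `0.4.0` and `0.1.0-rc.1` pass, while a Python-style `0.1.0rc1` does not, so a wheel version would need its own shape check.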
Re-applies the RELEASE_PAT wiring on top of the URL-casing fix in #188.
The default GITHUB_TOKEN authenticates as github-actions[bot], which
cannot be added to either restrictions or bypass_pull_request_allowances
on a protected branch. With #188 removing the URL redirect, the PAT
auth header now survives all the way to the protected-branch hook;
since RELEASE_PAT belongs to andreybavt (verified via /user) and
andreybavt is in the bypass list, the push should now be accepted.
semantic-release pushes the release commit to whatever repository.url
holds, then GitHub 301-redirects the lowercase /kaelio/ktx.git to the
canonical /Kaelio/ktx.git. The redirect causes branch-protection actor
evaluation to misbehave (bypass list matches are lost). Pinning the
correct case avoids the redirect entirely.
* feat(release): commit version files back to branch for one-version-everywhere
Add @semantic-release/git to the release plugin chain so the bumped
package.json, release-policy.json, and packages/cli/package.json land
back on the release branch after publish. This keeps the published npm
version and the in-repo version files in sync, so local builds from
main report the released version (e.g. ktx --version and the daemon
/health endpoint via KTX_DAEMON_VERSION).
Also widens assertPublicNpmReleaseTag to accept branch-<sanitized> tags,
unblocking branch RC publishes that pass through update-public-release-
version.mjs.
* test(release): pin GITHUB_REF_NAME in main-rc releaseTag assertion
The bare releaseTag('rc') call defaulted to process.env.GITHUB_REF_NAME,
which on PR CI is the merge ref (e.g. 186/merge) and yields
'branch-186-merge' instead of 'next'. Pass an explicit { GITHUB_REF_NAME:
'main' } so the test exercises the main-rc path regardless of CI env.
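
The behavior the test pins can be sketched as below. This is a reconstruction from the two observed outputs (`next` on main, `branch-186-merge` on the merge ref); the sanitization rule and the `latest` tag for stable releases are assumptions, and the real function reads `process.env` by default rather than requiring an explicit `env`.

```typescript
type ReleaseChannel = "rc" | "stable";

// Hypothetical reconstruction: env is passed explicitly so the result
// never depends on the ambient CI environment.
function releaseTag(channel: ReleaseChannel, env: { GITHUB_REF_NAME: string }): string {
  const ref = env.GITHUB_REF_NAME;
  if (ref !== "main") {
    // Assumed sanitization: any run of non-alphanumeric characters becomes "-".
    return `branch-${ref.replace(/[^0-9A-Za-z]+/g, "-")}`;
  }
  return channel === "rc" ? "next" : "latest";
}
```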
A clean `ktx setup` was failing verification because the managed
local-embeddings daemon URL was passed library-side through
`process.env[KTX_MANAGED_SENTENCE_TRANSFORMERS_BASE_URL]`, and the setup
flow never wrote that variable. With no resolved URL the embedding
provider was null, the deep scan emitted
`scan_enrichment_backend_not_configured`, descriptions + embeddings
stayed `skipped`, and the agent-readiness check exited 1.
Replace the env-var indirection with CLI-side substitution at the
project-load boundary. New `loadKtxCliProject` wraps `loadKtxProject`,
ensures the managed daemon when `managed:local-embeddings` is present in
`config.ingest.embeddings` or `config.scan.enrichment.embeddings`, and
substitutes the resolved baseUrl into the in-memory config. Runtime
entry points (scan, ingest, public-ingest, admin-reindex) use the new
loader; setup-time persistence paths keep raw `loadKtxProject` so the
on-disk `ktx.yaml` keeps the portable sentinel.
Cleanup follows from the new design: drop
`MANAGED_SENTENCE_TRANSFORMERS_BASE_URL_ENV`, remove the env-var lookup
branch in `resolveSentenceTransformersBaseUrl`, drop the `env` field
from `ManagedLocalEmbeddingsDaemon`, and collapse the manual
daemon-ensure dance in `admin-reindex.ts`.
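
A rough sketch of the substitution step, assuming the sentinel lives in a `baseUrl` field of each embeddings block. The `ProjectConfig` shape and both helper names are hypothetical; ensuring the daemon and obtaining the resolved URL are left out.

```typescript
const MANAGED_SENTINEL = "managed:local-embeddings";

// Hypothetical, trimmed-down config shape.
interface EmbeddingsBlock {
  baseUrl?: string;
}

interface ProjectConfig {
  ingest: { embeddings: EmbeddingsBlock };
  scan: { enrichment: { embeddings: EmbeddingsBlock } };
}

function usesManagedDaemon(config: ProjectConfig): boolean {
  return (
    config.ingest.embeddings.baseUrl === MANAGED_SENTINEL ||
    config.scan.enrichment.embeddings.baseUrl === MANAGED_SENTINEL
  );
}

function substitute(block: EmbeddingsBlock, resolvedBaseUrl: string): EmbeddingsBlock {
  return block.baseUrl === MANAGED_SENTINEL ? { ...block, baseUrl: resolvedBaseUrl } : block;
}

// Returns a new in-memory config; the input (what is on disk) is not mutated.
function withResolvedBaseUrl(config: ProjectConfig, resolvedBaseUrl: string): ProjectConfig {
  return {
    ...config,
    ingest: { ...config.ingest, embeddings: substitute(config.ingest.embeddings, resolvedBaseUrl) },
    scan: {
      ...config.scan,
      enrichment: {
        ...config.scan.enrichment,
        embeddings: substitute(config.scan.enrichment.embeddings, resolvedBaseUrl),
      },
    },
  };
}
```

Because the substitution returns a copy, setup-time code that writes `ktx.yaml` from the original object still persists the sentinel.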
Swap the small "committed" chip on the two-pillars figure for an inline
Git logo + "git" label so the source-of-truth signal reads at a glance.
Adds GitIcon component matching the existing GitHubIcon/SlackIcon
inline-SVG pattern. Also picks up a Next.js-regenerated next-env.d.ts
routes path.
* Refine adapter-owned ingest finalization design after adversarial review iteration 1
* Refine adapter-owned ingest finalization design after adversarial review iteration 2
* Refine adapter-owned ingest finalization design after adversarial review iteration 3
* Implement adapter-owned ingest finalization v1
Moves finalization from runner-owned post-processors into typed
SourceAdapter.finalize() contracts. Adds finalization report schema,
scope derivation, override replay context, and migrates historic-SQL
projection. Removes IngestBundlePostProcessorPort wiring and
HistoricSqlProjectionPostProcessor.
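
As a loose illustration of the ownership change, the runner asks each adapter to finalize itself and collects the reports. Every name below except `SourceAdapter.finalize()` is an assumption (`FinalizationReport` fields, `FinalizeContext`, `finalizeRun`); the real contract types are not reproduced in this log.

```typescript
// Hypothetical report and context shapes.
interface FinalizationReport {
  adapter: string;
  touchedScopes: string[];
  warnings: string[];
}

interface FinalizeContext {
  runId: string;
}

interface SourceAdapter {
  name: string;
  // Optional: adapters without post-ingest work simply omit it.
  finalize?(context: FinalizeContext): FinalizationReport;
}

// The runner no longer knows about specific post-processors; it only
// calls finalize() on adapters that define it and gathers the reports.
function finalizeRun(adapters: SourceAdapter[], context: FinalizeContext): FinalizationReport[] {
  const reports: FinalizationReport[] = [];
  for (const adapter of adapters) {
    if (adapter.finalize) {
      reports.push(adapter.finalize(context));
    }
  }
  return reports;
}
```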
* feat(ingest): export finalization adapter contract types
* test(ingest): exercise historic sql finalization locally
* docs(plans): add adapter-owned finalization v1 closure plan
* fix(setup): unblock clean Linux installs and add enabled_tables allowlist
- Pin managed Python runtime to 3.13 via `uv venv --python 3.13` so installs
don't pick the system 3.12 on Ubuntu 24.04 and fail at wheel install.
- Sanitize NO_PROXY/no_proxy for the daemon child process — drop IPv6 CIDR
entries that httpx rejects with InvalidURL (OrbStack injects these by
default).
- Add `enabled_tables` allowlist on warehouse connections (zod schema +
live-database introspection filter) to scope ingest to specific tables.
- Add `getting-started/troubleshooting-linux` docs page covering the Python
3.13 prerequisite, IPv6 proxy gotcha, and a minimal working recipe; link
it from the quickstart troubleshooting table and the llms-docs map.
- Make docs-site origin overridable via `KTX_DOCS_ORIGIN` so local builds
can serve under host.docker.internal.
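
The NO_PROXY sanitization can be sketched as a filter over the comma-separated entries. `isIpv6Cidr` and `sanitizeNoProxy` are hypothetical names, the detection rule (a colon in the address plus a numeric prefix length) is an assumption about what counts as an IPv6 CIDR, and the sample address is made up.

```typescript
// An entry counts as an IPv6 CIDR when it has a "/<digits>" suffix and the
// address part contains ":" (IPv4 CIDRs and host:port entries do not match).
function isIpv6Cidr(entry: string): boolean {
  const slash = entry.lastIndexOf("/");
  if (slash === -1) return false;
  const address = entry.slice(0, slash);
  const prefix = entry.slice(slash + 1);
  return address.includes(":") && /^\d{1,3}$/.test(prefix);
}

// Drops IPv6 CIDR entries and empty segments, keeps everything else in order.
function sanitizeNoProxy(value: string): string {
  return value
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry !== "" && !isIpv6Cidr(entry))
    .join(",");
}
```

For example, `sanitizeNoProxy("localhost,fd07:b51a:cc66::/64,10.0.0.0/8")` keeps `localhost` and `10.0.0.0/8` and drops the IPv6 range.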
* Move docs changes to specs repo
* fix(cli): keep managed runtime python version private
* Deduplicate enabled tables filtering
Restructure the-context-layer.mdx around two committed pillars (semantic
sources + wiki pages) with an inline anatomy card, replace the
semantic-layer-only comparison with a three-way matrix against company
brains and traditional semantic layers, and add a navigable-graph
explanation grounded in sl_refs/refs maintenance. Extend the docs-site
CodeBlock with a markdown highlighter that detects YAML frontmatter,
heading and list markers, and inline code so wiki examples render with
the same token colors as YAML/SQL blocks.
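
A line-level sketch of the detection, assuming frontmatter is a `---` fenced block at the very top of the snippet. `classifyMarkdownLines` and `LineKind` are hypothetical, and inline-code detection is left out; the real highlighter emits token spans rather than one label per line.

```typescript
type LineKind = "frontmatter" | "heading" | "list" | "text";

function classifyMarkdownLines(source: string): LineKind[] {
  const kinds: LineKind[] = [];
  let inFrontmatter = false;
  source.split("\n").forEach((line, index) => {
    if (index === 0 && line.trim() === "---") {
      // Opening fence: everything until the closing "---" is YAML frontmatter.
      inFrontmatter = true;
      kinds.push("frontmatter");
      return;
    }
    if (inFrontmatter) {
      kinds.push("frontmatter");
      if (line.trim() === "---") inFrontmatter = false; // closing fence
      return;
    }
    if (/^#{1,6}\s/.test(line)) {
      kinds.push("heading");
    } else if (/^\s*(?:[-*+]|\d+\.)\s/.test(line)) {
      kinds.push("list");
    } else {
      kinds.push("text");
    }
  });
  return kinds;
}
```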
* chore: standardize daemon naming on "KTX daemon"
Replace inconsistent names ("KTX Python daemon", "KTX local embeddings
daemon", "KTX managed daemon", "Python daemon") with the single name
"KTX daemon" in CLI output, errors, command descriptions, test
assertions, smoke scripts, docs, AGENTS.md, issue templates, and
codecov flags. The daemon is a portable compute server with endpoints
for SQL analysis, semantic layer, LookML, database introspection, and
embeddings; the previous labels misrepresented it as embeddings-only or
exposed implementation details ("Python", "managed").
The "KTX Python runtime" concept (installed interpreter + packages) is
deliberately left as-is — it is a separate concept from the daemon
process.
* refactor(release): drop release-policy.json runtime dep and next branch
Strips the release-policy.json fallback from release-version.ts so the CLI
reads its version straight from packages/cli/package.json. dev → 0.0.0-private,
installed @kaelio/ktx → the real semver baked into the published package.json.
KtxCliPackageInfo collapses to { name, version, contextPackageName }; /health
no longer depends on version files surviving past a CI run.
Replaces the dual-branch (main + next) semantic-release model with a single-
branch model on main. RCs and stable releases interleave on the same branch via
{ name: 'main', prerelease: 'rc', channel: 'next' } / ['main']. Drops
@semantic-release/git and @semantic-release/changelog (nothing is committed
back to the repo on any channel) and the workflow's "Prepare next prerelease
branch" step plus the KTX_PRERELEASE_BRANCH plumbing. The git tag plus the
published npm artifact carry the version forward.
Updates docs/release.md, removes the two now-unused devDeps, regenerates
pnpm-lock.yaml. 611/611 @ktx/cli tests, 173/173 script tests, type-check,
biome, knip all clean.
* fix(release): don't throw on non-main branches at config-load time
knip loads .releaserc.cjs on every PR run, where GITHUB_REF_NAME is the
merge ref (e.g. 180/merge). The previous version of releaseBranches threw
immediately when the branch wasn't main, which made knip fail to evaluate
the config and then mis-flag @semantic-release/exec as an unused dep.
semantic-release already refuses to publish when the current branch doesn't
match a configured release branch, so the explicit throw was redundant.
Drop it (and the unused currentBranch helper) and replace the
"rejects releases from non-main" assertion with one that exercises a CI-
shaped GITHUB_REF_NAME and confirms the config loads.
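
A sketch of a config helper that stays pure at load time. The two return values are the branch specs from the single-branch model; the function signature and the `channel` parameter are assumptions, and the point is only that nothing here inspects the current branch or throws.

```typescript
type BranchSpec = string | { name: string; prerelease: string; channel: string };

// Hypothetical signature. No branch check and no throw: semantic-release
// itself refuses to publish from a branch that is not listed here.
function releaseBranches(channel: "rc" | "stable"): BranchSpec[] {
  return channel === "rc"
    ? [{ name: "main", prerelease: "rc", channel: "next" }]
    : ["main"];
}
```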
Bumps font sizes across the in-app React Flow diagram and the README's
SVG so block titles, stage names, body copy, and chip labels are easier
to read. Widens stage cards and updates the SVG layout so every stage
body wraps to two lines, and resizes every badge rect to fit its text
with even 12px padding on both sides (notably the PostgreSQL chip).
Also includes a pre-existing README addition noting that KTX runs with
the user's own LLM keys or a Claude Pro/Max subscription.
Bare invocations now do the obvious thing instead of erroring out, and mode-as-subcommand patterns collapse into flags on the parent. No new top-level commands.
- `ktx ingest` (bare) ingests every configured connection. The `text` subcommand is gone; capture inline notes with `ktx ingest --text "..."` and files with `ktx ingest --file path` (use `-` for stdin). `--text`/`--file` reject a positional connection id; pass `--connection-id` to tag captured notes.
- `ktx connection` (bare) lists; `ktx connection test` (bare) tests every configured connection.
- `ktx wiki` and `ktx sl` flatten `list`/`search`: a bare invocation lists, and a `[query...]` positional searches (multi-word queries are joined with spaces). `sl validate` and `sl query` stay as distinct verbs and now read `--connection-id` from the parent.
- `ktx mcp` (bare) prints daemon status.
Adds a shared `resolveConnectionSelection` helper consumed by ingest and connection test. Updates README, docs-site cli-reference and guides, next-steps strings, agent SKILL templates, and all affected tests. Per-package type-check, unit tests (605), smoke tests, and dead-code checks all pass.
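
The "bare means every configured connection" rule might look roughly like this. The signature and the error message are assumptions for illustration; the real helper's inputs are not shown in this message.

```typescript
// Hypothetical signature: configured ids from ktx.yaml, plus an optional requested id.
function resolveConnectionSelection(configured: string[], requested?: string): string[] {
  if (requested === undefined) {
    // Bare invocation: operate on every configured connection.
    return [...configured];
  }
  if (!configured.includes(requested)) {
    throw new Error(`Unknown connection "${requested}". Configured: ${configured.join(", ")}`);
  }
  return [requested];
}
```

Sharing one helper keeps `ktx ingest` and `ktx connection test` from drifting apart on what a bare invocation means.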
* chore(community): rewards program, issue templates, and triage workflow
Adds the public-facing community engagement infrastructure.
CONTRIBUTING.md introduces a three-tier rewards program (sticker / t-shirt /
hoodie) gated on merged PRs, with explicit eligibility rules to keep the
program sustainable. Fulfillment is handled by emailing support@kaelio.com.
The .github/ISSUE_TEMPLATE/ forms give structure to bug reports and feature
requests, and config.yml routes questions to the KTX Slack instead of GitHub
Discussions (matching the routing established in docs-site/.../support.mdx).
The triage-issues workflow applies a needs-triage label only when the issue
author isn't OWNER, MEMBER, or COLLABORATOR — so internal issues stay clean
while external contributions get queued for maintainer review.
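
The labeling rule reduces to a small predicate over GitHub's `author_association` value. `needsTriage` is a hypothetical name; the workflow itself expresses this as a condition in YAML rather than TypeScript.

```typescript
// Associations treated as internal; everything else gets the needs-triage label.
const INTERNAL_ASSOCIATIONS = new Set(["OWNER", "MEMBER", "COLLABORATOR"]);

function needsTriage(authorAssociation: string): boolean {
  return !INTERNAL_ASSOCIATIONS.has(authorAssociation);
}
```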
The first 14 connector contribution issues (#161-174) have been filed using
these labels and reward tiers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(community): add SECURITY.md
Documents the private reporting channel (GitHub Security Advisories with
support@kaelio.com as fallback), what reporters should include, and the
supported-version policy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wheel events over the embedded ReactFlow diagram were captured by
zoomOnScroll + preventScrolling, blocking page scroll once the pointer
crossed into the diagram — even at min zoom. Disable wheel-zoom and let
the page handle scroll, keep Cmd/Ctrl + scroll as the zoom escape hatch
(default zoomActivationKeyCode), and remove the inaccessible Controls
that sat at the bottom of the 2340px-tall canvas. Hint badge updated.
* fix(ci): publish the pre-built tarball instead of re-packing
The release workflow built the tarball twice — once via pnpm pack in
artifacts:check (leaving it at dist/artifacts/npm/) and again inside
@semantic-release/npm's prepare step, which then tried to fs-extra
move npm pack's output into the same directory and crashed with
"dest already exists". On top of being a publish blocker, that meant
the published tarball was different from the one smoke-tested in
artifacts:check.
Drop @semantic-release/npm and publish the exact tarball that
artifacts:check verified via an exec publishCmd:
npm publish dist/artifacts/npm/kaelio-ktx-<v>.tgz \
--tag <next|latest> --access public --provenance
Auth uses OIDC trusted publishing — the workflow already grants
id-token: write and setup-node configures the registry, and
release-workflow.test.mjs asserts NODE_AUTH_TOKEN is not set.
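
A sketch of how the publish command could be assembled from the release version. `publishCommand` is a hypothetical helper, and picking `next` whenever the version has a prerelease suffix is an assumption; the path and flags are the ones quoted above.

```typescript
// Builds the publish command for the tarball that artifacts:check already verified.
function publishCommand(version: string): string {
  // Assumption: any "-" suffix (e.g. 0.5.0-rc.1) marks a prerelease.
  const distTag = version.includes("-") ? "next" : "latest";
  return [
    `npm publish dist/artifacts/npm/kaelio-ktx-${version}.tgz`,
    `--tag ${distTag}`,
    "--access public",
    "--provenance",
  ].join(" ");
}
```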
* fix(ci): allow @kaelio/ktx tarball name in semantic-release config
The new publishCmd added in the previous commit hardcodes the
dist/artifacts/npm/kaelio-ktx-<v>.tgz path, which trips the boundary
check that forbids the literal product name outside release-machinery
files. The release config is exactly such a release-machinery file —
its job is to bridge the generic ktx project to the @kaelio/ktx npm
package — so add it to identifierAllowPatterns alongside the existing
build-public-npm-package and public-npm-release-metadata entries.
* docs: add Slack community invite to README and docs
Adds a Slack badge and Community section to the README, a new
Community & Support page under docs-site/content/docs/community/,
and a Community section on the docs introduction page. Routes
chat/questions to Slack and bugs/features to GitHub Issues.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: add Slack icon link to docs navbar
Adds the Slack brand mark as an icon button in the Fumadocs navbar
alongside the existing GitHub link, pointing to the KTX Slack
community invite. Persistent across every docs page so users can
reach the community from anywhere.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: order navbar icons as GitHub then Slack
Moves the GitHub link out of githubUrl and into the explicit links
array so the navbar renders GitHub first, then Slack. Fumadocs
appends githubUrl after links, which previously put Slack first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: rewrite Semantic Querying concept with imperative-vs-declarative diagram
Reframe semantic-layer-internals.mdx around the contract the semantic
layer offers an agent: declare what you want (a Semantic Query), KTX
figures out how to compute it. Replaces the old "Context-Aware SQL"
framing with a clear imperative-vs-declarative narrative.
Adds a React Flow component (semantic-layer-flow.tsx) that contrasts a
buggy 4-table agent-authored SQL (chasm trap, LEFT-JOIN-in-WHERE,
hardcoded DATE_TRUNC) against the chasm-safe per-fact CTE SQL the
planner actually emits, including the outer GROUP BY over the requested
dimensions. Both lanes converge into a shared warehouse node and each
SQL card now has parallel bullet notes (failures on the left, KTX
behavior on the right).
Side fixes bundled in:
- include the /ktx basePath in the favicon metadata so the icon resolves
under the production prefix
- migrate docs-site/middleware.ts to docs-site/proxy.ts (Next 16 rename)
- redirect / to /ktx/docs/getting-started/introduction so the apex docs
URL works
- add tests covering the apex redirect, the favicon basePath, and the
middleware-to-proxy rename
- propagate the Semantic Query terminology across the ktx-sl CLI
reference, the context-layer concept page, and the agent-clients /
primary-sources integration pages
* Fix CI dead-code failures
* docs-site: polish semantic-layer-internals code blocks and flow diagram
- Make CodeBlock a server component so children traverse synchronously
under React 19 RSC streaming; previously extractText returned "" in
dev SSR, leaving code blocks empty.
- Add custom JSON/YAML/SQL/code-like tokenizers with theme-aware token
classes; drop the colored file-glyph dot and gradient tab-head.
- Tighten tab-head: subtle grey background, smaller monospace filename
in muted grey, smaller rectangular language pill placed to the left
of the filename.
- Polish the React Flow semantic-layer diagram (controls, fit-view
padding, edge types).
* docs-site: annotate imperative SQL, add section anchor, drop ClickHouse
- Wire numbered red badges to each problematic span in the "Without KTX"
SQL with hover sync between SQL gutter, lines, and the notes list.
- Add #imperative-vs-declarative anchor on the flow section header so
the eyebrow link is shareable; reveals a # glyph on hover/focus.
- Align the compiled-SQL note dots to the first-line midpoint
(mt-[6px] instead of mt-1) so 4px dots sit at y=8 in a 16px line.
- Remove all ClickHouse references from docs-site (primary-sources,
quickstart, ktx-setup, contributing, agents-setup, mechanics test,
warehouse drivers in the flow diagram).
* test: drop ClickHouse contributing-docs assertion
Align the workspace-package mirror test with the ClickHouse removal
from docs-site (75907eb). The connector-clickhouse package still
exists in packages/, but contributing.mdx no longer lists it, so the
test that mirrored docs against the workspace was failing.