Merge branch 'main' into python-dependency-updates

Andrey Avtomonov 2026-05-28 16:40:10 +02:00 committed by GitHub
commit fa7377ddd3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
18 changed files with 508 additions and 340 deletions


@ -323,6 +323,26 @@ use `PascalCase` without the suffix.
source-code identifier, package/API name, or other literal value that must
match the implementation.
### Product Category Naming
- **MUST**: Use **context layer** as the primary public category for **ktx**.
Preferred phrase: `context layer for data agents`.
- **MUST**: Use **context engine** only as the secondary mechanism term for the
active system that builds, reconciles, validates, searches, and serves the
context layer.
- **MUST**: Keep **semantic layer** as the narrower term for executable metric
definitions, semantic sources, joins, measures, and SQL compilation.
- **MUST NOT**: Replace every `semantic layer` occurrence with `context layer`;
the semantic layer is one pillar inside the broader context layer.
Preferred pattern:
```md
**ktx** is an open-source context layer for data agents. Its context engine
ingests warehouse metadata, BI definitions, query history, docs, and approved
metrics, then turns them into reviewable files agents can search and execute.
```
### Terminology
For canonical vocabulary used across docs, code, comments, CLI strings, and
@ -355,6 +375,22 @@ that do not change user-facing behavior. When you do update docs, follow the
warrants docs but you are out of scope, call it out in your final summary
rather than silently skipping it.
#### Monospace ligatures in `docs-site/`
- **MUST**: Disable monospace ligatures on every surface that uses the
`var(--font-mono)` family (Geist Mono). Geist Mono fuses `--` into an
em-dash glyph that visually eats the adjacent space, so prompts like
`npx skills add Kaelio/ktx --skill ktx` render as `Kaelio/ktx--skill ktx`.
- **MUST**: When adding a new container that renders user-visible monospace
text outside `<code>` / `<pre>` (e.g. a styled `<div className="font-mono">`
for a copyable prompt), verify the global ligature-off rule in
`docs-site/app/global.css` covers its selector. Either use Tailwind's
`font-mono` utility (already covered) or extend the rule to match the new
class — do not silently rely on Geist Mono's defaults.
- **SHOULD**: Prefer `<code>` / `<pre>` (or a `font-mono` wrapper) for any
string that contains CLI flags, paths, or other tokens with `--`, `->`,
`>=`, `!=`, `==`, `//` so ligatures never alter intent.
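The SHOULD rule above can be approximated with a small check. This is a minimal sketch, not code from `docs-site/`: the helper name `hasLigatureProneToken` is hypothetical, and the token list simply mirrors the tokens named in the rule.

```typescript
// Hypothetical helper, for illustration only.
// Tokens copied from the rule above; strings containing them should be rendered
// in <code>/<pre> or a `font-mono` wrapper so ligatures never alter intent.
const LIGATURE_PRONE_TOKENS = ["--", "->", ">=", "!=", "==", "//"] as const;

// Returns true when the text contains at least one of the listed tokens.
function hasLigatureProneToken(text: string): boolean {
  return LIGATURE_PRONE_TOKENS.some((token) => text.includes(token));
}
```

A reviewer could run such a check over new copyable prompt strings to decide which ones need a covered monospace container.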
## LLM and Prompt Development
When creating or modifying agent prompts, system prompts, tool descriptions, or


@ -119,9 +119,8 @@ Agent integration ready: yes (codex:project)
> your project directory:
>
> ```text
> Follow instructions from
> https://docs.kaelio.com/ktx/docs/agents-setup.md
> to install and configure ktx
> Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
> and configure ktx in this project.
> ```
> [!IMPORTANT]


@ -166,12 +166,16 @@ pre {
}
/* Disable monospace ligatures so `--flag` keeps a visible space and double
dashes don't fuse into an em-dash glyph. */
dashes don't fuse into an em-dash glyph. Covers every monospace surface:
raw <code>/<pre>, the ktx-code wrapper, Tailwind's `font-mono` utility,
and anything that opts in via the `var(--font-mono)` family directly. */
code,
pre,
pre code,
.ktx-code,
.ktx-code code {
.ktx-code code,
.font-mono,
[style*="--font-mono"] {
font-variant-ligatures: none !important;
font-feature-settings: "liga" 0, "calt" 0 !important;
}


@ -3,11 +3,6 @@ import {
getLlmDocsPages,
getPageMarkdown,
} from "@/lib/llm-docs";
import {
agentSetupSlug,
isAgentSetupSlug,
readAgentSetupMarkdown,
} from "@/lib/agent-setup-markdown";
export const dynamic = "force-static";
@ -16,14 +11,6 @@ export async function GET(
props: { params: Promise<{ slug?: string[] }> },
) {
const params = await props.params;
if (isAgentSetupSlug(params.slug)) {
return new Response(await readAgentSetupMarkdown(), {
headers: {
"Content-Type": "text/markdown; charset=utf-8",
},
});
}
const page = getLlmDocsPage(params.slug);
if (!page) {
return new Response("Documentation page not found.\n", {
@ -42,8 +29,5 @@ export async function GET(
}
export function generateStaticParams() {
return [
...getLlmDocsPages().map((page) => ({ slug: page.slug })),
{ slug: [...agentSetupSlug] },
];
return getLlmDocsPages().map((page) => ({ slug: page.slug }));
}


@ -1,201 +0,0 @@
# Goal
Set up **ktx** from scratch end-to-end as a fully autonomous, agent-driven replacement for the interactive `ktx setup` wizard. Detect the environment, install missing prerequisites, ask the user only for information you genuinely need (which connections to add, credentials), write a valid configuration, verify it works, and run a fast ingest. Keep the user updated throughout.
# Operating principles
- **Be autonomous.** Detect, decide, and act. Only ask the user when you need information that only they can provide: project location, which databases/sources to connect, credentials, and similar choices.
- **Stream short status updates.** Before each major phase ("Checking prerequisites…", "Installing uv…", "Configuring warehouse connection…", "Running fast ingest…") print a one-line update. Not chatty - just enough that the user can see what's happening.
- **Verify against docs, never guess.** CLI flags, config keys, and command names must come from the docs or from `ktx <command> --help`. If something looks wrong or missing, say so explicitly.
- **Print every command you run and its exit code.** Terse, not silent.
- **Fail loudly with cause + fix.** When a command fails: capture the exact error, identify the cause, change something, retry. Never retry an unchanged command. Exceptions for *known soft-failures* are listed in Phase 4 - handle those without retrying.
- **No LLM-based ingestion in this flow.** Only `--fast` ingest. The user can run `--deep` later.
- **Platform-agnostic.** Detect the host OS first and pick the right install commands / path syntax. Anything path- or shell-specific must branch on OS.
# Authoritative docs
**ktx** docs are served at `https://docs.kaelio.com/ktx/`. **Start by fetching `https://docs.kaelio.com/ktx/llms.txt`** to discover the docs map. Scan it for a "troubleshooting" entry - if one exists, read it **before** running install/setup so you can apply known fixes preemptively rather than after failing. If no troubleshooting page is listed (current state of the docs), proceed. Then fetch any other `.md` pages you need (setup, ingest, status, connection types). **Never invent CLI flags or config keys** - verify against the docs or `ktx --help` / `ktx <subcommand> --help`.
> **Note on the `ktx status` JSON example in the docs.** The docs page for `ktx status` shows an example shaped like `{"title": "...", "checks": [...]}`. That example is outdated. The real CLI output uses a top-level `verdict` field plus a `connections[]` array - see Phase 5 for the canonical success criteria. Trust the shape in this prompt over the docs example.
# Workflow
## Phase 1 - Detect environment
Determine the host OS (e.g. via `uname -s`, `process.platform`, or `$env:OS`). Use the right install commands per OS for the rest of this flow.
| Tool | macOS / Linux | Windows (PowerShell) |
|------|---------------|----------------------|
| `uv` | `curl -LsSf https://astral.sh/uv/install.sh \| sh` then re-source shell env | `irm https://astral.sh/uv/install.ps1 \| iex` |
| Node.js | use system / fnm / nvm - **do not** auto-install | use system / nvm-windows - **do not** auto-install |
| **ktx** CLI | `npm install -g …` (see Phase 2) | `npm install -g …` (see Phase 2) |
If Node.js is missing, **stop and ask the user** to install it (https://nodejs.org/). Do not attempt to auto-install Node.
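The OS branching in the table above can be sketched as a lookup. This is an illustrative sketch only: `uvInstallCommand` is a hypothetical helper, and it assumes the platform string comes from something like Node's `process.platform`. The commands are the ones listed in the table.

```typescript
// Hypothetical helper: returns the `uv` install command from the table above
// for a given platform identifier, or undefined when the table has no entry.
function uvInstallCommand(platform: string): string | undefined {
  switch (platform) {
    case "darwin": // macOS
    case "linux":
      return "curl -LsSf https://astral.sh/uv/install.sh | sh";
    case "win32": // Windows (PowerShell)
      return "irm https://astral.sh/uv/install.ps1 | iex";
    default:
      // Unknown platform: do not guess an install command.
      return undefined;
  }
}
```

Returning `undefined` for an unknown platform leaves the decision to stop and ask the user with the caller.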
## Phase 2 - Verify and install prerequisites
Check each tool in order; install only if missing.
1. **Node.js** - run `node --version`. Require >= 22. If missing or older, stop and instruct the user.
2. **`uv`** - run `uv --version`. If missing, run the OS-appropriate install command, then re-source the shell environment (`export PATH="$HOME/.local/bin:$PATH"` on Linux/macOS) so `uv` is on `PATH`.
3. **ktx CLI** - install with `npm install -g @kaelio/ktx`, then verify with `ktx --version`.
Print one status line per tool ("✓ uv 0.11.15 found", "Installing uv…", "✓ ktx 0.x.y installed").
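The Node.js version gate in step 1 can be sketched as follows. The helper names are hypothetical, and the sketch assumes `node --version` prints a string shaped like `v22.3.0`.

```typescript
// Minimum Node.js major version required by Phase 2.
const MIN_NODE_MAJOR = 22;

// Extracts the major version from `node --version` output such as "v22.3.0".
// Returns undefined when the output does not look like a version string.
function parseNodeMajor(versionOutput: string): number | undefined {
  const match = /^v?(\d+)\.\d+\.\d+/.exec(versionOutput.trim());
  return match ? Number(match[1]) : undefined;
}

// True only when a version was parsed and it meets the minimum.
function nodeVersionIsSupported(versionOutput: string): boolean {
  const major = parseNodeMajor(versionOutput);
  return major !== undefined && major >= MIN_NODE_MAJOR;
}
```

Missing or unparseable output is treated the same as an old version, which matches the instruction to stop and ask the user in both cases.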
## Phase 3 - Gather user choices
Ask the user (grouped if your harness supports it; otherwise sequentially):
1. **Project directory.** Default: current working directory. Confirm before continuing.
2. **LLM provider.** Default: `claude-code` with model `sonnet` (the user is already inside Claude Code; no extra API key needed). Offer `anthropic` (paste API key, stored as `env:` or `file:` ref) and `vertex` (GCP project + location) as alternatives. Skip if defaults are accepted.
3. **Embeddings backend.** Default: `sentence-transformers` (local, no API key, managed Python runtime). Offer `openai` only if the user has a key.
4. **Database connections.** Ask how many to add, then loop. For each, collect:
- Connection name (e.g. `warehouse`, `analytics`).
- Driver: one of `sqlite`, `postgres`, `mysql`, `sqlserver`, `bigquery`, `snowflake`.
- Connection URL/DSN (or service-account file for BigQuery). Accept `env:VAR_NAME` or `file:/abs/path` to avoid pasting raw secrets.
- **Heads-up for the user**: even if they paste a literal URL, **ktx** will silently relocate it into `<project>/.ktx/secrets/<connection>-url` and rewrite `ktx.yaml` to `url: file:…` - this is correct, secure behavior and not a bug.
- Schemas / datasets to include (postgres / sqlserver / snowflake / bigquery only).
- Optional `enabled_tables` allowlist if the user wants to scope ingest to specific tables.
5. **Context sources** (dbt, Metabase, Looker, LookML, MetricFlow, Notion). Default: none. Ask only if the user mentions them.
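The `env:` / `file:` convention for connection URLs can be illustrated with a small classifier. This is a sketch under assumptions: `classifySecretRef` and the `SecretRef` type are hypothetical names, and the prefix handling is a simplification, not how **ktx** itself parses references.

```typescript
// How a connection URL was supplied, following the `env:` / `file:` convention above.
type SecretRef =
  | { kind: "env"; name: string }
  | { kind: "file"; path: string }
  | { kind: "literal" };

// Hypothetical helper: classifies the input without returning a literal value,
// so a raw secret is not echoed back into status output.
function classifySecretRef(input: string): SecretRef {
  if (input.startsWith("env:")) return { kind: "env", name: input.slice("env:".length) };
  if (input.startsWith("file:")) return { kind: "file", path: input.slice("file:".length) };
  return { kind: "literal" };
}
```

An agent could use the `literal` case as the trigger for the heads-up about relocation into `.ktx/secrets/`.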
## Phase 4 - Configure the project
Drive the existing wizard non-interactively (verify exact flag names with `ktx setup --help` and the docs - the automation flags are hidden from help but accepted):
```
ktx setup \
--project-dir <path> \
--no-input --yes \
--llm-backend <claude-code|anthropic|vertex> --llm-model <model> \
[--anthropic-api-key-env ANTHROPIC_API_KEY | --anthropic-api-key-file <path>] \
[--vertex-project <p> --vertex-location <loc>] \
--embedding-backend <sentence-transformers|openai> \
[--embedding-api-key-env OPENAI_API_KEY] \
--skip-sources \
--database <driver> --database-connection-id <name> --database-url <url|env:VAR|file:/path> \
[--database-schema <schema> …]
```
Notes on the flags above:
- **Project creation is automatic with `--no-input --yes`.** When
`ktx.yaml` exists, setup resumes it. When it doesn't exist, setup creates it
at `--project-dir`.
- **`--database-connection-id` is dual-purpose.** With `--database` or
`--database-url`, it names the new connection. Without those flags, it
selects an existing connection id.
- **Configure one new database connection per setup command.** If the user
wants multiple new connections, run setup again for each connection.
- **You don't need `--skip-agents` in this flow.** The agent integration step
is opt-in: setup leaves it alone unless you pass `--agents --target
<target>`.
- **`--skip-sources`** is correct and is the documented way to leave context sources unconfigured.
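Assembling the command from the Phase 3 answers can be sketched as below. `SetupChoices` and `buildSetupArgs` are hypothetical names. Only flags shown in the command above are used, and the optional API-key, Vertex, and schema flags are left out on purpose. Confirm every flag against `ktx setup --help` and the docs before running it.

```typescript
// Answers gathered in Phase 3 for one run (one new connection per setup command).
interface SetupChoices {
  projectDir: string;
  llmBackend: "claude-code" | "anthropic" | "vertex";
  llmModel: string;
  embeddingBackend: "sentence-transformers" | "openai";
  database: { driver: string; connectionId: string; url: string };
}

// Hypothetical helper: builds the argument list for a non-interactive `ktx setup`.
function buildSetupArgs(c: SetupChoices): string[] {
  return [
    "setup",
    "--project-dir", c.projectDir,
    "--no-input", "--yes",
    "--llm-backend", c.llmBackend,
    "--llm-model", c.llmModel,
    "--embedding-backend", c.embeddingBackend,
    "--skip-sources", // leave context sources unconfigured
    "--database", c.database.driver,
    "--database-connection-id", c.database.connectionId,
    "--database-url", c.database.url, // prefer env:VAR or file:/path over a raw URL
  ];
}
```

Passing the result as an argument array to a process spawner, rather than joining it into a shell string, avoids quoting problems with paths and URLs.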
### Known soft-failure: `ktx setup` exits 1 after a successful fast build
When you select a configuration that only does fast ingest, `ktx setup`'s final readiness verification fails with:
```
ktx context build did not pass agent-readiness verification.
<connection>: deep database context has not completed.
```
This is **expected** and **does not mean setup failed**. Treat the exit code as a soft-failure **only if all of the following hold**:
- The build log shows the fast ingest reached `[100%] Scan completed` for every configured connection.
- `ktx connection test <name>` (run next) exits 0 for every connection.
- `ktx status --json --no-input` reports `verdict: "ready"`.
If those three conditions hold, proceed to Phase 5 without retrying setup, and **do not** switch to `--deep` to "fix" the readiness gate - deep ingest is explicitly out of scope. Mention this in the final report under "Docs / CLI gaps" so the user is aware.
If any of those three conditions do not hold, this is a real failure - capture the error, fetch the relevant docs page, fix the cause, retry.
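The soft-failure rule above reduces to a three-way decision. The sketch below uses hypothetical names (`SetupEvidence`, `classifySetupOutcome`) and assumes the agent has already collected each piece of evidence.

```typescript
// Evidence gathered after `ktx setup` returns.
interface SetupEvidence {
  setupExitCode: number;
  // Build log shows "[100%] Scan completed" for every configured connection.
  everyScanCompleted: boolean;
  // `ktx connection test <name>` exited 0 for every connection.
  everyConnectionTestPassed: boolean;
  // `verdict` field from `ktx status --json --no-input`, if it could be read.
  statusVerdict: string | undefined;
}

type SetupOutcome = "success" | "soft-failure" | "real-failure";

// A non-zero exit counts as a soft-failure only when all three conditions hold.
function classifySetupOutcome(e: SetupEvidence): SetupOutcome {
  if (e.setupExitCode === 0) return "success";
  const allHold =
    e.everyScanCompleted && e.everyConnectionTestPassed && e.statusVerdict === "ready";
  return allHold ? "soft-failure" : "real-failure";
}
```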
After `ktx setup` writes `ktx.yaml`, edit it directly for anything flags don't cover:
- Per-connection `enabled_tables` allowlist (snake_case, under `connections.<name>.enabled_tables`).
- Any advanced settings the user requested.
Use a YAML-aware editor (e.g. `uv run python -c "import yaml; …"`) - do not hand-edit blindly.
## Phase 5 - Verify
`ktx setup` already runs a fast ingest of every database connection it configures, so you do not need to re-ingest by default. For each configured connection:
```
ktx connection test <connection-name> # must exit 0
```
Only re-run ingest if setup's build log did **not** reach 100% for that connection:
```
ktx ingest <connection-name> --fast --no-input
```
**Mutex warning on `ktx ingest`**: passing both `--yes` and `--no-input` fails with `Choose only one runtime install mode: --yes or --no-input`. Setup already installed the managed Python runtime, so pass **only `--no-input`** to `ktx ingest`. (`--yes` is only needed when an ingest invocation has to install the runtime itself, which is not the case here.)
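A guard against the flag conflict described above might look like this. It is a sketch: `buildFastIngestArgs` is a hypothetical helper, and it rejects the combination before the CLI is invoked, reusing the error text quoted above.

```typescript
interface IngestOptions {
  connection: string;
  yes?: boolean;     // only needed when ingest must install the runtime itself
  noInput?: boolean; // the mode used in this flow
}

// Hypothetical helper: builds arguments for a fast ingest and refuses the
// `--yes` + `--no-input` combination that the CLI rejects.
function buildFastIngestArgs(o: IngestOptions): string[] {
  if (o.yes && o.noInput) {
    throw new Error("Choose only one runtime install mode: --yes or --no-input");
  }
  const args = ["ingest", o.connection, "--fast"];
  if (o.noInput) args.push("--no-input");
  if (o.yes) args.push("--yes");
  return args;
}
```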
Then run the global health check:
```
ktx status --json --no-input
```
Success requires (canonical shape - supersedes the example in the docs):
- `verdict: "ready"` at the top of the JSON.
- Every `connections[].status === "ok"`.
- `ktx connection test <name>` exited 0 for every connection.
Do **not** run `--deep` ingest in this flow - that requires LLM time and is out of scope.
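The JSON success criteria above can be checked with a short predicate. This is a sketch: the `KtxStatus` interface covers only the two fields named in the criteria and is an assumption about the minimal shape. The real output may carry more fields.

```typescript
// Minimal shape assumed from the success criteria above.
interface KtxStatus {
  verdict: string;
  connections: { status: string }[];
}

// True when the verdict is "ready" and every connection reports "ok".
function statusIsReady(status: KtxStatus): boolean {
  return (
    status.verdict === "ready" &&
    status.connections.every((connection) => connection.status === "ok")
  );
}
```

Note that an empty `connections` array passes the `every` check, so a caller that expects at least one connection should verify the array length separately. The `ktx connection test` exit codes remain a separate requirement.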
### Optional: directly probe the ktx daemon
If the user asks for stronger verification that `sentence-transformers` is actually serving (not just that setup said "ok"), do all of:
1. `ktx admin runtime status --json` → expect `"kind": "ready"` and `"features": [..., "local-embeddings"]`.
2. `pgrep -fa ktx-daemon` → expect a process running `ktx-daemon serve-http`.
3. `curl -sS http://127.0.0.1:<port>/health` → expect HTTP 200 with `{"status":"healthy",…}`.
4. `curl -sS -X POST http://127.0.0.1:<port>/embeddings/compute -H 'content-type: application/json' -d '{"text":"hello"}'` → expect `{"embedding": [...384 floats...]}`.
Discover the port from setup's log line `Started ktx daemon: http://127.0.0.1:<port>` or from the daemon's OpenAPI at `GET /openapi.json`. Note: the routes are `/health` and `/embeddings/compute` - not `/healthz` or `/embeddings`.
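Port discovery from the log line can be sketched as below. The helper names are hypothetical. The log format and the two routes are taken from the text above.

```typescript
// Extracts the port from a log line like
// "Started ktx daemon: http://127.0.0.1:<port>"; undefined when no such line exists.
function parseDaemonPort(logText: string): number | undefined {
  const match = /Started ktx daemon: http:\/\/127\.0\.0\.1:(\d+)/.exec(logText);
  return match ? Number(match[1]) : undefined;
}

// Builds the two probe URLs using the routes named above
// (`/health` and `/embeddings/compute`).
function daemonUrls(port: number): { health: string; embeddings: string } {
  const base = `http://127.0.0.1:${port}`;
  return { health: `${base}/health`, embeddings: `${base}/embeddings/compute` };
}
```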
## Phase 6 - Final report
Print a structured report:
```
ktx SETUP COMPLETE
Project: <path>
LLM: <backend> / <model>
Embeddings: <backend> / <model>
Runtime: managed Python ✓ (if the ktx daemon was started)
Connections:
- <name> (<driver>) status=ok schemas=[…] tables=<N>
- …
Sources: <list or "none">
Verdict: ready
```
Then **Next steps** (copy-pasteable):
1. Enrich with AI descriptions and embeddings: `ktx ingest <connection> --deep` (several minutes per connection).
2. Add more connections later by rerunning this setup or via `ktx setup --database … --database-connection-id …`.
3. Configure context sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) - see `ktx setup --help` for `--source …` flags.
4. Install agent integration: `ktx setup --agents --target <claude-code|claude-desktop|codex|cursor|opencode|universal>` (with optional `--global` for `claude-code`/`codex`).
5. Connect the agent / MCP: see docs at `https://docs.kaelio.com/ktx/`.
Under **Docs / CLI gaps to flag** include any of these that applied during your run:
- `ktx setup` exits non-zero after a successful fast build (deep-readiness gate); status reports ready.
- `ktx ingest` rejects `--yes` and `--no-input` together; docs don't note the conflict.
- `ktx status --json` real shape (`verdict`, `connections[]`) doesn't match the example in the docs page.
- The pasted DB URL was moved to `.ktx/secrets/<name>-url` automatically.
End with a single line: `RESULT: PASS` or `RESULT: FAIL - <one-line reason>`.
# Operating rules (recap)
- Print every command you run and its exit code. Status updates may be terse, but never silent.
- On failure: capture the error, fetch the relevant docs page, fix the cause, retry. Never retry an unchanged command.
- Known soft-failures (listed in Phase 4 and Phase 5) are not real failures - handle them as documented; do not retry or escalate.
- If you find a docs/CLI gap ("docs say X but CLI does Y"), call it out in the final report.
- Never commit credentials - **ktx** accepts `env:` and `file:` references; prefer those. **ktx** will also auto-relocate literal URLs into `.ktx/secrets/`, but that does not protect anyone who pasted the URL into chat history.


@ -14,7 +14,8 @@ Read https://docs.kaelio.com/ktx/llms.txt first. Then fetch only the ktx Markdow
## Set up a project
```text
Set up ktx in this repository. Start by reading /docs/ai-resources/agent-quickstart.md and /docs/getting-started/quickstart.md. Install the published CLI with npm; use pnpm only when working from a ktx source checkout. After setup, run ktx status and summarize which steps are complete, which files changed, and what still needs credentials or user input.
Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
and configure ktx in this project.
```
## Find a command


@ -98,21 +98,70 @@ If you're a coding assistant choosing a docs route, start with the
background: 'color-mix(in oklch, var(--color-fd-primary) 8%, transparent)',
}}
>
<div className="text-sm font-semibold text-fd-foreground">
Run setup from an agent
</div>
<div className="mt-2 text-sm leading-6 text-fd-muted-foreground">
You can ask an agent such as Claude Code, Codex, Cursor, or OpenCode to
install and configure **ktx** for you. The{' '}
<a href="/ktx/docs/agents-setup.md" className="font-medium underline">
agent setup Markdown prompt
</a>{' '}
tells the agent how to check prerequisites, ask only for credentials or
connection choices, run <code>ktx setup</code>, verify connections, and
report the result.
</div>
<div className="mt-3 text-sm leading-6 text-fd-muted-foreground">
Use a prompt like this from the project you want to configure:
<div className="flex flex-wrap items-center gap-x-3 gap-y-2">
<div className="text-sm font-semibold text-fd-foreground">
Or, ask an AI agent to install and configure **ktx** for you.
</div>
<div className="group relative ml-auto inline-flex">
<button
type="button"
aria-describedby="agent-setup-overlay"
className="inline-flex cursor-help items-center gap-1.5 rounded-full border border-fd-border bg-fd-background/70 px-2.5 py-1 text-xs font-medium text-fd-muted-foreground transition-colors hover:border-fd-primary/40 hover:text-fd-foreground focus:outline-none focus-visible:border-fd-primary/40 focus-visible:text-fd-foreground"
>
<svg
width="12"
height="12"
viewBox="0 0 24 24"
fill="none"
stroke="currentColor"
strokeWidth="2.4"
strokeLinecap="round"
strokeLinejoin="round"
aria-hidden="true"
>
<circle cx="12" cy="12" r="10" />
<path d="M9.09 9a3 3 0 0 1 5.83 1c0 2-3 3-3 3" />
<line x1="12" y1="17" x2="12.01" y2="17" />
</svg>
What does it do?
</button>
<div
id="agent-setup-overlay"
role="tooltip"
className="invisible absolute right-0 top-full z-20 translate-y-0.5 pt-2 opacity-0 transition-all duration-150 group-hover:visible group-hover:translate-y-0 group-hover:opacity-100 group-focus-within:visible group-focus-within:translate-y-0 group-focus-within:opacity-100"
>
<div className="w-[min(24rem,calc(100vw-2rem))] rounded-lg border border-fd-border bg-fd-popover p-3 text-sm leading-6 text-fd-popover-foreground shadow-xl">
<div className="text-xs font-semibold uppercase tracking-wide text-fd-muted-foreground">
The agent will
</div>
<ol className="mt-2 space-y-1.5 pl-0">
{[
<>Check prerequisites on your machine</>,
<>Ask only for credentials and connection choices</>,
<>Run <code className="whitespace-nowrap">ktx setup</code> in your project</>,
<>Verify each connection it configured</>,
<>Report what was installed and what is ready</>,
].map((item, index) => (
<li key={index} className="flex gap-2.5">
<span
className="mt-0.5 inline-flex h-5 w-5 shrink-0 items-center justify-center rounded-full text-[11px] font-bold tabular-nums"
style={{
background: 'color-mix(in oklch, var(--color-fd-primary) 18%, transparent)',
color: 'var(--color-fd-primary)',
}}
>
{index + 1}
</span>
<span className="leading-6">{item}</span>
</li>
))}
</ol>
<div className="mt-3 border-t border-fd-border pt-2 text-xs text-fd-muted-foreground">
Works with any AI coding agent.
</div>
</div>
</div>
</div>
</div>
<div className="mt-3 max-w-full overflow-hidden rounded-md border bg-fd-background">
<div className="flex items-center justify-between gap-2 border-b px-3 py-2">
@ -120,16 +169,15 @@ If you're a coding assistant choosing a docs route, start with the
Prompt
</span>
<CopyButton
text={`Follow instructions from
https://docs.kaelio.com/ktx/docs/agents-setup.md
to install and configure ktx`}
text={[
'Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill',
'to install and configure ktx',
].join(' ')}
className="-my-1"
/>
</div>
<div className="p-3 font-mono text-sm leading-6 text-fd-foreground">
<div>Follow instructions from</div>
<div className="break-all">https://docs.kaelio.com/ktx/docs/agents-setup.md</div>
<div>to install and configure ktx</div>
<div className="p-3 font-mono text-[13.5px] leading-6 text-fd-foreground">
Run {'`npx skills add Kaelio/ktx --skill ktx`'} and use the ktx skill to install and configure ktx
</div>
</div>
</div>


@ -1,12 +0,0 @@
import { readFile } from "node:fs/promises";
import { join } from "node:path";
export const agentSetupSlug = ["agents-setup"] as const;
export function isAgentSetupSlug(slug: string[] | undefined) {
return slug?.length === 1 && slug[0] === agentSetupSlug[0];
}
export function readAgentSetupMarkdown() {
return readFile(join(process.cwd(), "content/agents-setup.md"), "utf8");
}


@ -52,8 +52,9 @@ ktx provides semantic-layer files, warehouse scans, wiki pages, provenance, and
## Agent Entry Points
- Installable setup skill: run \`npx skills add Kaelio/ktx --skill ktx\` from
the project you want to configure.
${link("/docs/ai-resources/agent-quickstart", "Agent Quickstart", "Task-first route for coding assistants using ktx")}
${link("/docs/agents-setup", "Agent Setup", "Copy-pasteable prompt for agents installing and configuring ktx")}
${link("/docs/ai-resources/markdown-access", "Markdown Access", "Fetch ktx docs as llms.txt, llms-full.txt, or per-page Markdown")}
${link("/docs/ai-resources/agent-instructions", "Agent Instructions", "Suggested instructions for coding assistants that need to read and cite ktx docs")}


@ -6,12 +6,28 @@ const withMDX = createMDX();
const config = {
basePath: "/ktx",
async rewrites() {
return [
{
source: "/docs/:path*.md",
destination: "/llms.mdx/docs/:path*",
},
];
return {
beforeFiles: [
{
source: "/stars",
has: [{ type: "host", value: "ktx.sh" }],
destination: "https://ktx-stars.vercel.app/stars",
basePath: false,
},
{
source: "/stars/:path*",
has: [{ type: "host", value: "ktx.sh" }],
destination: "https://ktx-stars.vercel.app/stars/:path*",
basePath: false,
},
],
afterFiles: [
{
source: "/docs/:path*.md",
destination: "/llms.mdx/docs/:path*",
},
],
};
},
async redirects() {
return [
@ -43,9 +59,9 @@ const config = {
basePath: false,
},
{
source: "/:path*",
source: "/:path((?!stars(?:/|$)).*)",
has: [{ type: "host", value: "ktx.sh" }],
destination: "https://docs.kaelio.com/ktx/:path*",
destination: "https://docs.kaelio.com/ktx/:path",
permanent: true,
basePath: false,
},


@ -21,6 +21,41 @@ in prose when ambiguity is possible. Always qualify:
Bare `source` is allowed only inside a section that has already established its
referent (e.g., body of a `Semantic sources` page, or `sourceName` as a CLI arg).
## Context Layer and Context Engine
Use **context layer** as the primary category term for what **ktx** provides to
data agents.
Use **context engine** as the secondary mechanism term for how **ktx** builds,
maintains, validates, and serves that layer.
| Concept | Use | Do not use |
|---|---|---|
| The whole **ktx** product category | **context layer** / **context layer for data agents** | knowledge layer, agent memory |
| The active system that builds and maintains context | **context engine** | context layer when describing ingest/reconciliation internals |
| The durable reviewed surface agents use | **context layer** | context engine |
| The compiler pillar for executable metrics and joins | **semantic layer** | context layer when specifically discussing SQL compilation |
| Prose/business knowledge files | **wiki** / **wiki pages** | wiki context |
### Usage rules
- Use **context layer** in taglines, page titles, meta descriptions, docs
introductions, comparison pages, and first-paragraph definitions.
- Use **context engine** when describing active behavior: ingesting evidence,
reconciling changes, validating references, maintaining files, search, CLI,
and MCP serving.
- Keep **semantic layer** for the narrower YAML/compiler surface: semantic
sources, measures, joins, dimensions, filters, SQL compilation, and semantic
queries.
- Do not use **context engine** as the primary replacement for the whole
product. It sounds like runtime infrastructure; **context layer** better
describes the durable YAML and Markdown surface users review in git.
- Do not use **context layer** when the sentence is specifically about the
compiler. Example: write "the semantic layer compiles semantic queries to
SQL," not "the context layer compiles semantic queries to SQL."
- Default lowercase in prose: `context layer`, `context engine`, `semantic
layer`. Title case only in page titles, headings, nav labels, and UI labels.
## Canonical vocabulary
| Concept | Use | Do not use |
@ -31,7 +66,8 @@ referent (e.g., body of a `Semantic sources` page, or `sourceName` as a CLI arg)
| The connected database | **primary source** / **database connection** | data source |
| Analytics-tooling integration | **context source** / **context-source connection** | BI source, BI model, metadata source, source tool |
| YAML file describing a table | **semantic source** | semantic-layer source, model file, bare "source file" |
| The whole **ktx** surface | **context layer** (lowercase in prose) | "Context Layer" in prose |
| The whole **ktx** surface | **context layer** / **context layer for data agents** (lowercase in prose) | "Context Layer" in prose, knowledge layer, agent memory |
| The active system that builds and maintains context | **context engine** (lowercase in prose) | context layer when describing ingest/reconciliation internals |
| The compiler pillar | **semantic layer** (lowercase in prose) | "Semantic Layer" in prose |
| The query payload | **semantic query** (lowercase in prose) | "Semantic Query" |
| The MCP layer | **MCP server** (the server), **MCP tools** (the functions) | "ktx MCP" as a standalone noun |


@ -24,17 +24,12 @@ export interface KtxSetupProjectArgs {
allowBack?: boolean;
}
export type KtxSetupCreatedProjectCleanup =
| { kind: 'remove-project-dir'; projectDir: string }
| { kind: 'remove-ktx-scaffold'; projectDir: string };
export type KtxSetupProjectResult =
| {
status: 'ready';
projectDir: string;
project: KtxLocalProject;
confirmedCreation?: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'back'; projectDir: string }
| { status: 'cancelled'; projectDir: string }
@ -59,7 +54,6 @@ type PromptProjectDirResult =
status: 'selected';
projectDir: string;
confirmedCreation: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'cancelled'; projectDir: string }
| { status: 'missing-input'; projectDir: string }
@ -106,26 +100,12 @@ type ConfirmProjectDirResult =
| {
status: 'confirmed';
confirmedCreation: boolean;
createdProjectCleanup?: KtxSetupCreatedProjectCleanup;
}
| { status: 'choose-another' }
| { status: 'back' }
| { status: 'cancelled' }
| { status: 'not-directory' };
function cleanupForFolderState(
projectDir: string,
state: Awaited<ReturnType<typeof existingFolderState>>,
): KtxSetupCreatedProjectCleanup | undefined {
if (state === 'missing') {
return { kind: 'remove-project-dir', projectDir };
}
if (state === 'empty-directory') {
return { kind: 'remove-ktx-scaffold', projectDir };
}
return undefined;
}
async function confirmProjectDir(
selectedDir: string,
io: KtxCliIo,
@ -165,7 +145,7 @@ async function confirmProjectDir(
if (action === 'choose-another') return { status: 'choose-another' };
if (action === 'back') return { status: 'back' };
if (action !== 'create') return { status: 'cancelled' };
return { status: 'confirmed', confirmedCreation: true, createdProjectCleanup: cleanupForFolderState(selectedDir, state) };
return { status: 'confirmed', confirmedCreation: true };
}
async function normalizeSetupGitignore(projectDir: string): Promise<void> {
@ -252,24 +232,10 @@ async function promptForNewProjectDir(
status: 'selected',
projectDir: selectedDir,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}
}
async function createProjectWithCleanup(
projectDir: string,
deps: KtxSetupProjectDeps,
): Promise<{ project: KtxLocalProject; createdProjectCleanup?: KtxSetupCreatedProjectCleanup }> {
const state = await existingFolderState(projectDir);
const project = await createProject(projectDir, deps);
const createdProjectCleanup = cleanupForFolderState(projectDir, state);
return {
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
export async function runKtxSetupProjectStep(
args: KtxSetupProjectArgs,
io: KtxCliIo,
@ -307,7 +273,6 @@ export async function runKtxSetupProjectStep(
projectDir: selected.projectDir,
project,
confirmedCreation: selected.confirmedCreation,
...(selected.createdProjectCleanup ? { createdProjectCleanup: selected.createdProjectCleanup } : {}),
};
}
@ -322,13 +287,12 @@ export async function runKtxSetupProjectStep(
io.stderr.write('Missing setup choice: pass --yes to create a project in non-interactive setup.\n');
return { status: 'missing-input', projectDir };
}
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
const project = await createProject(projectDir, deps);
printProjectSummary(io, projectDir);
return {
status: 'ready',
projectDir,
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
@ -368,13 +332,12 @@ export async function runKtxSetupProjectStep(
}
if (choice === 'current') {
const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps);
const project = await createProject(projectDir, deps);
printProjectSummary(io, projectDir);
return {
status: 'ready',
projectDir,
project,
...(createdProjectCleanup ? { createdProjectCleanup } : {}),
};
}
@ -390,7 +353,6 @@ export async function runKtxSetupProjectStep(
projectDir: defaultProjectDir,
project,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}
@ -419,7 +381,6 @@ export async function runKtxSetupProjectStep(
projectDir: customDir,
project,
confirmedCreation: confirmed.confirmedCreation,
...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}),
};
}


@ -1,5 +1,4 @@
import { existsSync } from 'node:fs';
import { rm } from 'node:fs/promises';
import { basename, join, resolve } from 'node:path';
import { getLatestLocalIngestStatus } from './context/ingest/local-ingest.js';
import { savedMemoryCountsForReport } from './context/ingest/reports.js';
@ -32,11 +31,7 @@ import {
isKtxSetupLlmConfigReady,
runKtxSetupAnthropicModelStep,
} from './setup-models.js';
import {
type KtxSetupCreatedProjectCleanup,
type KtxSetupProjectDeps,
runKtxSetupProjectStep,
} from './setup-project.js';
import { type KtxSetupProjectDeps, runKtxSetupProjectStep } from './setup-project.js';
import {
isKtxPreAgentSetupReady,
isKtxSetupReady,
@ -556,23 +551,6 @@ async function commitSetupConfigChanges(projectDir: string): Promise<void> {
await project.git.commitFile('ktx.yaml', 'setup: update KTX project config', 'ktx setup', 'setup@ktx.local');
}
const KTX_SETUP_SCAFFOLD_PATHS = ['ktx.yaml', '.ktx', 'wiki', 'semantic-layer', 'raw-sources', '.git'];
async function cleanupCreatedProjectScaffold(cleanup: KtxSetupCreatedProjectCleanup | undefined): Promise<void> {
if (!cleanup) {
return;
}
if (cleanup.kind === 'remove-project-dir') {
await rm(cleanup.projectDir, { recursive: true, force: true });
return;
}
await Promise.all(
KTX_SETUP_SCAFFOLD_PATHS.map((relativePath) =>
rm(join(cleanup.projectDir, relativePath), { recursive: true, force: true }),
),
);
}
export async function runKtxSetup(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetupDeps = {}): Promise<number> {
try {
return await runKtxSetupInner(args, io, deps);
@ -869,7 +847,6 @@ async function runKtxSetupInner(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetup
});
if (stepResult.status === 'failed') {
await cleanupCreatedProjectScaffold(projectResult.createdProjectCleanup);
return 1;
}
if (stepResult.status === 'missing-input') {


@ -1,5 +1,5 @@
import { execFile } from 'node:child_process';
import { mkdir, mkdtemp, readFile, readdir, rm, stat, writeFile } from 'node:fs/promises';
import { mkdir, mkdtemp, readFile, rm, stat, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { promisify } from 'node:util';
@ -602,7 +602,7 @@ describe('setup status', () => {
expect(testIo.stderr()).toBe('');
});
it('removes a newly created missing project directory when a later runtime step fails', async () => {
it('preserves a newly created missing project directory when a later setup step fails', async () => {
const projectDir = join(tempDir, 'missing-project');
const testIo = makeIo();
@ -634,10 +634,12 @@ describe('setup status', () => {
),
).resolves.toBe(1);
await expect(stat(projectDir)).rejects.toThrow();
await expect(stat(projectDir)).resolves.toBeDefined();
await expect(stat(join(projectDir, 'ktx.yaml'))).resolves.toBeDefined();
await expect(stat(join(projectDir, '.ktx'))).resolves.toBeDefined();
});
it('removes KTX scaffold files from an initially empty project directory when runtime setup fails', async () => {
it('preserves KTX scaffold files in an initially empty project directory when setup fails', async () => {
const testIo = makeIo();
await expect(
@ -668,8 +670,59 @@ describe('setup status', () => {
),
).resolves.toBe(1);
await expect(stat(tempDir)).resolves.toBeDefined();
expect(await readdir(tempDir)).toEqual([]);
await expect(stat(join(tempDir, 'ktx.yaml'))).resolves.toBeDefined();
await expect(stat(join(tempDir, '.ktx'))).resolves.toBeDefined();
});
it('preserves partial context-build artifacts and resume state when the context step fails', async () => {
const projectDir = join(tempDir, 'partial-context');
const testIo = makeIo();
await expect(
runKtxSetup(
{
command: 'run',
projectDir,
mode: 'auto',
agents: false,
skipAgents: true,
inputMode: 'disabled',
yes: true,
cliVersion: '0.2.0',
skipLlm: true,
skipEmbeddings: true,
databaseSchemas: [],
skipDatabases: true,
skipSources: true,
},
testIo.io,
{
model: async () => ({ status: 'skipped', projectDir }),
embeddings: async () => ({ status: 'skipped', projectDir }),
databases: async () => ({ status: 'skipped', projectDir }),
sources: async () => ({ status: 'skipped', projectDir }),
runtime: async () => runtimeReady(projectDir),
context: async () => {
await mkdir(join(projectDir, '.ktx', 'setup'), { recursive: true });
await writeFile(
join(projectDir, '.ktx', 'setup', 'state.json'),
JSON.stringify({ status: 'failed', retryableFailedTargets: [{ source: 'metabase' }] }),
'utf-8',
);
await mkdir(join(projectDir, 'wiki'), { recursive: true });
await writeFile(join(projectDir, 'wiki', 'postgres-warehouse.md'), '# warehouse\n', 'utf-8');
await mkdir(join(projectDir, 'semantic-layer'), { recursive: true });
await writeFile(join(projectDir, 'semantic-layer', 'orders.yaml'), 'name: orders\n', 'utf-8');
return { status: 'failed', projectDir };
},
},
),
).resolves.toBe(1);
await expect(stat(join(projectDir, 'ktx.yaml'))).resolves.toBeDefined();
await expect(readFile(join(projectDir, '.ktx', 'setup', 'state.json'), 'utf-8')).resolves.toContain('"status":"failed"');
await expect(readFile(join(projectDir, 'wiki', 'postgres-warehouse.md'), 'utf-8')).resolves.toContain('warehouse');
await expect(readFile(join(projectDir, 'semantic-layer', 'orders.yaml'), 'utf-8')).resolves.toContain('orders');
});
it('preserves a pre-existing non-empty project directory when runtime setup fails', async () => {

11
skills.sh.json Normal file

@ -0,0 +1,11 @@
{
"$schema": "https://skills.sh/schemas/skills.sh.schema.json",
"notGrouped": "bottom",
"groupings": [
{
"title": "ktx",
"description": "Skills for installing, configuring, and operating ktx.",
"skills": ["ktx"]
}
]
}

168
skills/ktx/SKILL.md Normal file

@ -0,0 +1,168 @@
---
name: ktx
description: Installs and configures ktx, the open-source context layer for data agents — runs ktx setup non-interactively with hidden CLI flags, configures database connections and embeddings, installs agent integration, and verifies readiness. Use when the user asks an agent to add ktx to a project, connect data sources, install agent rules, ingest schema, or troubleshoot a local ktx install.
---
# ktx
Install and configure **ktx**, the open-source context layer for data agents.
Use this skill when a user wants an agent to add **ktx** to a project, connect
data sources, build initial context, install agent integration, or troubleshoot
a local **ktx** setup.
## Operating rules
- Act autonomously when the user asks you to install or configure **ktx**.
The non-interactive scripted flow below is the canonical path — bare
`ktx setup` is interactive (clack prompts) and an agent cannot drive it.
- Setup's non-interactive flags are intentionally hidden from `--help`. Use the
flags listed below; verify uncommon flags against the docs at
`https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
- Ask only for values you cannot infer: project directory, connection targets,
credentials, account identifiers, and source selections.
- Never ask the user to paste secrets when an `env:VAR_NAME` or `file:/path`
reference would work. Pasting a literal URL is also safe — `ktx setup`
auto-externalizes URLs into `.ktx/secrets/<id>-url` (see workflow step 2).
- Do not commit `.ktx/secrets/*`.
- Print each command you run and its result.
- If a command fails, identify the cause and change something before retrying.
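
The secret-reference rules above can be sketched as a small parser. This is a hedged illustration only: `parseSecretRef` and `externalizedUrlPath` are hypothetical names, not functions from the **ktx** codebase, and the real CLI may handle edge cases differently. Only the `env:` and `file:` prefixes and the `.ktx/secrets/<id>-url` location come from this skill.

```typescript
// Hypothetical helpers for illustration; not part of the ktx API.
type SecretRef =
  | { kind: 'env'; name: string }
  | { kind: 'file'; path: string }
  | { kind: 'literal'; value: string };

// Classify a value the way the rules above describe it:
// `env:VAR_NAME`, `file:/path`, or a literal (for example a pasted URL).
function parseSecretRef(raw: string): SecretRef {
  if (raw.startsWith('env:')) {
    return { kind: 'env', name: raw.slice('env:'.length) };
  }
  if (raw.startsWith('file:')) {
    return { kind: 'file', path: raw.slice('file:'.length) };
  }
  return { kind: 'literal', value: raw };
}

// Where this skill says a literal URL ends up after setup externalizes it.
function externalizedUrlPath(connectionId: string): string {
  return `.ktx/secrets/${connectionId}-url`;
}
```

An agent could use a check like this to decide whether a user-supplied value is already a reference or a literal that setup will relocate.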
## Gather inputs once
Before invoking `ktx setup`, collect in one round:
1. Project directory (default: current working directory).
2. LLM backend and key strategy. In `--no-input` mode the CLI defaults to
`anthropic` and **requires an API key**. When the user is inside Claude
Code, pass `--llm-backend claude-code` explicitly; otherwise pass
`--llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY`.
3. Embedding backend (`sentence-transformers` is the local default and needs
no key; use `openai` only if the user already has a key, then pass
`--embedding-api-key-env OPENAI_API_KEY`).
4. Database: driver, connection id, URL (or `env:` / `file:` ref), and one or
more schemas.
5. Optional context sources (dbt, Metabase, Looker, LookML, MetricFlow,
Notion). Skip with `--skip-sources` if the user has none.
Do not discover these inputs across multiple setup runs.
## Install workflow
1. **Detect the install path.** If the working directory contains
`packages/cli/dist/bin.js` or a `pnpm-workspace.yaml` referencing
`@kaelio/ktx`, you are inside the **ktx** monorepo: build and link the
local CLI with `pnpm` and do **not** run `npm install -g`. Otherwise:
```bash
node --version # require >= 22; stop and ask the user if older
ktx --version || npm install -g @kaelio/ktx
```
2. **Run scripted setup** (canonical path):
```bash
ktx setup --no-input --yes \
--project-dir <path> \
--llm-backend claude-code \
--embedding-backend sentence-transformers \
--database <driver> --database-connection-id <id> \
--database-url '<raw-url | env:NAME | file:/abs/path>' \
--database-schema <schema> \
--skip-sources
```
- Configure one new database connection per setup invocation. For multiple
connections, rerun setup once per connection.
- Pasting a literal `--database-url` is safe: the CLI relocates the URL
into `.ktx/secrets/<connection-id>-url` and rewrites `ktx.yaml` to a
`file:` ref automatically.
3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
project resumes its config. Use `--skip-llm`, `--skip-databases`,
`--skip-sources`, or `--skip-embeddings` to leave a slice unconfigured but
let the rest complete instead of aborting on the first failure. **When
resuming an existing project to change one slice (e.g. only LLM), still
pass the database flags from the previous run** — setup validates current
flags, not persisted `ktx.yaml` state.
4. **Run fast ingest** if setup did not already complete one:
```bash
ktx ingest <connection-id> --fast --no-input
```
Note: `ktx ingest` rejects `--yes` together with `--no-input`
(*Choose only one runtime install mode*); `ktx setup` accepts both. For
ingest, pass `--no-input` alone. Do not run `--deep` ingest unless the user
explicitly asks for LLM-backed enrichment.
5. **Install agent integration:**
```bash
ktx setup --agents --target <claude-code|claude-desktop|codex|cursor|opencode|universal>
ktx mcp start --project-dir <path>
```
Agent integration is **not usable until `ktx mcp start` is running**. The
`--agents` step prints this requirement as `Required before using agents`.
6. **Fall back to bare `ktx setup` only when a human is at the keyboard**;
it uses interactive prompts an agent cannot answer.
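
The scripted setup call in step 2 and the resume rule in step 3 can be sketched as one argument builder. This is a minimal sketch under stated assumptions: `ScriptedSetupOptions` and `buildSetupArgs` are hypothetical names, and repeating `--database-schema` once per schema is an assumption, since this skill only shows a single schema. The flag names themselves are the ones listed in this skill.

```typescript
// Hypothetical option shape for illustration; not a ktx type.
interface ScriptedSetupOptions {
  projectDir: string;
  llmBackend: 'claude-code' | 'anthropic';
  anthropicApiKeyEnv?: string; // required when llmBackend is 'anthropic'
  embeddingBackend: 'sentence-transformers' | 'openai';
  embeddingApiKeyEnv?: string; // only needed for 'openai'
  database: { driver: string; connectionId: string; url: string; schemas: string[] };
  skipSources: boolean;
}

// Build the argv for `ktx setup`. The database flags are always emitted,
// because setup validates the current flags, not persisted ktx.yaml state.
function buildSetupArgs(o: ScriptedSetupOptions): string[] {
  const args = ['setup', '--no-input', '--yes', '--project-dir', o.projectDir];
  args.push('--llm-backend', o.llmBackend);
  if (o.llmBackend === 'anthropic') {
    if (!o.anthropicApiKeyEnv) {
      throw new Error('anthropic backend needs --anthropic-api-key-env');
    }
    args.push('--anthropic-api-key-env', o.anthropicApiKeyEnv);
  }
  args.push('--embedding-backend', o.embeddingBackend);
  if (o.embeddingApiKeyEnv) {
    args.push('--embedding-api-key-env', o.embeddingApiKeyEnv);
  }
  args.push('--database', o.database.driver);
  args.push('--database-connection-id', o.database.connectionId);
  args.push('--database-url', o.database.url);
  for (const schema of o.database.schemas) {
    // Assumption: the flag may be repeated for several schemas.
    args.push('--database-schema', schema);
  }
  if (o.skipSources) {
    args.push('--skip-sources');
  }
  return args;
}
```

Keeping the database options mandatory in the type makes it hard to resume with only an LLM change and hit the missing-database failure.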
## Files to inspect
- `ktx.yaml`: project configuration.
- `.ktx/secrets/*`: local secret files. Never commit them.
- `semantic-layer/<connection-id>/*.yaml`: semantic sources for SQL
compilation.
- `wiki/**/*.md`: project context pages for agents.
- `.claude/skills/ktx/`, `.agents/skills/ktx/`, `.cursor/rules/ktx.mdc`, and
`.opencode/commands/ktx.md`: generated agent integration files.
## Verification
After setup, run:
```bash
ktx connection test <connection-id>
ktx status --json --no-input
```
**Judge readiness from `ktx status --json` fields, not the exit code.**
`ktx status` exits 1 whenever the LLM is `none`, even when embeddings and
every database connection are healthy. Treat success as:
- `verdict: "ready"` at the top of the JSON, and
- every `connections[].status === "ok"`, and
- every `ktx connection test <id>` exited 0.
A non-zero exit with only the LLM unconfigured is still a usable context
layer — report it as "ready, LLM optional" rather than retrying setup.
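
The readiness rule above can be sketched as a predicate over the parsed status output. This is a hedged sketch: `KtxStatusReport` models only the two fields named in this section (`verdict` and `connections[].status`), the rest of the JSON schema is not shown here, and `isReady` is a hypothetical name.

```typescript
// Only the fields this section names; the full schema is an unknown.
interface KtxStatusReport {
  verdict: string;
  connections: Array<{ status: string }>;
}

// Decide readiness from JSON fields and connection-test exit codes,
// deliberately ignoring the exit code of `ktx status` itself.
function isReady(report: KtxStatusReport, connectionTestExitCodes: number[]): boolean {
  const verdictReady = report.verdict === 'ready';
  const connectionsOk = report.connections.every((c) => c.status === 'ok');
  const testsPassed = connectionTestExitCodes.every((code) => code === 0);
  return verdictReady && connectionsOk && testsPassed;
}
```

The caller would pass `JSON.parse` of the `ktx status --json --no-input` output plus one exit code per `ktx connection test <id>` run.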
## Troubleshooting
For known failure signatures (`invalid ELF header`,
`Native CLI binary for <plat> not found`, `Missing Anthropic API key`,
`claude-code` probe failure, `KTX cannot work without a database` on resume),
see [troubleshooting.md](troubleshooting.md).
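
Matching a failed command against these signatures can be sketched as a lookup over its stderr text. This is an illustrative sketch: `KNOWN_FAILURES` and `matchKnownFailure` are hypothetical names, and the patterns cover only the literal strings quoted above. The `claude-code` probe failure is left out because this skill gives no exact error string for it.

```typescript
// Hypothetical lookup table built from the signatures quoted above.
const KNOWN_FAILURES: Array<{ pattern: RegExp; signature: string }> = [
  { pattern: /invalid ELF header/, signature: 'invalid ELF header' },
  { pattern: /Native CLI binary for .+ not found/, signature: 'Native CLI binary for <plat> not found' },
  { pattern: /Missing Anthropic API key/, signature: 'Missing Anthropic API key' },
  { pattern: /KTX cannot work without a database/, signature: 'KTX cannot work without a database' },
];

// Return the first matching signature, or undefined for an unknown failure.
function matchKnownFailure(stderr: string): string | undefined {
  for (const { pattern, signature } of KNOWN_FAILURES) {
    if (pattern.test(stderr)) {
      return signature;
    }
  }
  return undefined;
}
```

An `undefined` result means the failure is not one of the documented cases, so the agent should diagnose it directly instead of applying a canned fix.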
## Final report
End setup work with a concise report:
```text
ktx SETUP COMPLETE
Project: <path>
LLM: <backend> / <model>
Embeddings: <backend> / <model>
Connections: <name> (<driver>) status=<ok|warn|fail>
Sources: <list or none>
Verdict: <ready|needs action>
Next:
1. <copy-pasteable command or action>
2. <copy-pasteable command or action>
RESULT: PASS
```


@ -0,0 +1,7 @@
interface:
display_name: "ktx"
short_description: "Install and configure ktx for data agents"
default_prompt: "Use $ktx to install and configure ktx in this project."
policy:
allow_implicit_invocation: true


@ -0,0 +1,79 @@
# ktx setup troubleshooting
Known failure signatures hit by agent-driven `ktx setup` runs. Match the
error string against a section heading, then apply the fix listed under it.
## `Error: invalid ELF header` from `better-sqlite3`
Native module compiled for a different platform or architecture (e.g.
installed under Rosetta then run under native arm64).
Fix:
```bash
# Inside the ktx monorepo:
pnpm rebuild better-sqlite3
# Or for a global install:
npm rebuild --global better-sqlite3
```
## `Native CLI binary for <plat> not found`
The platform-specific optional dependency that ships the native CLI binary
was skipped during install (npm/pnpm "optional dep not for this platform").
Fix:
```bash
npm install -g @kaelio/ktx --force
```
## `Missing Anthropic API key: pass --anthropic-api-key-env or --anthropic-api-key-file`
`--no-input` mode defaulted the LLM backend to `anthropic` because no
`--llm-backend` flag was supplied. The CLI then required a key.
Fix — pick one:
```bash
# Inside Claude Code, prefer the local backend:
ktx setup --no-input --llm-backend claude-code ...other flags...
# Otherwise point at an existing env var:
ktx setup --no-input --llm-backend anthropic \
--anthropic-api-key-env ANTHROPIC_API_KEY ...other flags...
```
## `claude-code` LLM probe fails (auth or binary not found)
The `claude` CLI is not on the agent's `PATH`, or the user has not run
`claude` interactively at least once to log in.
Fix:
```bash
which claude # confirm the binary resolves
claude --version # confirm it runs
# If auth probe still fails, the user must run `claude` once interactively
# to complete login; agents cannot do this step.
```
If `claude-code` cannot be made to work, fall back to `--skip-llm` and let
the rest of setup complete; the project is still a usable context layer
without an LLM.
## `KTX cannot work without a database` when resuming setup
`ktx setup` validates the **current invocation's flags**, not the persisted
`ktx.yaml`. Resuming setup with only `--llm-backend …` fails even when the
project already has a healthy database connection.
Fix — re-pass the database flags from the original setup run, even when
only changing one slice:
```bash
ktx setup --no-input \
--database <driver> --database-connection-id <id> \
--llm-backend claude-code
```