diff --git a/AGENTS.md b/AGENTS.md
index 64ec2d4a..3d8c1725 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -323,6 +323,26 @@ use `PascalCase` without the suffix.
 source-code identifier, package/API name, or other literal value that must
 match the implementation.
 
+### Product Category Naming
+
+- **MUST**: Use **context layer** as the primary public category for **ktx**.
+  Preferred phrase: `context layer for data agents`.
+- **MUST**: Use **context engine** only as the secondary mechanism term for the
+  active system that builds, reconciles, validates, searches, and serves the
+  context layer.
+- **MUST**: Keep **semantic layer** as the narrower term for executable metric
+  definitions, semantic sources, joins, measures, and SQL compilation.
+- **MUST NOT**: Replace every `semantic layer` occurrence with `context layer`;
+  the semantic layer is one pillar inside the broader context layer.
+
+Preferred pattern:
+
+```md
+**ktx** is an open-source context layer for data agents. Its context engine
+ingests warehouse metadata, BI definitions, query history, docs, and approved
+metrics, then turns them into reviewable files agents can search and execute.
+```
+
 ### Terminology
 
 For canonical vocabulary used across docs, code, comments, CLI strings, and
@@ -355,6 +375,22 @@ that do not change user-facing behavior. When you do update docs, follow the
 warrants docs but you are out of scope, call it out in your final summary
 rather than silently skipping it.
 
+#### Monospace ligatures in `docs-site/`
+
+- **MUST**: Disable monospace ligatures on every surface that uses the
+  `var(--font-mono)` family (Geist Mono). Geist Mono fuses `--` into an
+  em-dash glyph that visually eats the adjacent space, so prompts like
+  `npx skills add Kaelio/ktx --skill ktx` render as `Kaelio/ktx--skill ktx`.
+- **MUST**: When adding a new container that renders user-visible monospace
+  text outside `<code>` / `<pre>` (e.g. a styled ``
+  for a copyable prompt), verify the global ligature-off rule in
+  `docs-site/app/global.css` covers its selector. Either use Tailwind's
+  `font-mono` utility (already covered) or extend the rule to match the new
+  class — do not silently rely on Geist Mono's defaults.
+- **SHOULD**: Prefer `<code>` / `<pre>` (or a `font-mono` wrapper) for any
+  string that contains CLI flags, paths, or other tokens with `--`, `->`,
+  `>=`, `!=`, `==`, `//` so ligatures never alter intent.
+
 ## LLM and Prompt Development
 
 When creating or modifying agent prompts, system prompts, tool descriptions, or
diff --git a/README.md b/README.md
index 2e034677..23b2fa0a 100644
--- a/README.md
+++ b/README.md
@@ -119,9 +119,8 @@ Agent integration ready: yes (codex:project)
 > your project directory:
 >
 > ```text
-> Follow instructions from
-> https://docs.kaelio.com/ktx/docs/agents-setup.md
-> to install and configure ktx
+> Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install
+> and configure ktx in this project.
 > ```
 
 > [!IMPORTANT]
diff --git a/docs-site/app/global.css b/docs-site/app/global.css
index a4cebc55..929e06b4 100644
--- a/docs-site/app/global.css
+++ b/docs-site/app/global.css
@@ -166,12 +166,16 @@ pre {
 }
 
 /* Disable monospace ligatures so `--flag` keeps a visible space and double
-   dashes don't fuse into an em-dash glyph. */
+   dashes don't fuse into an em-dash glyph. Covers every monospace surface:
+   raw <code>/<pre>, the ktx-code wrapper, Tailwind's `font-mono` utility,
+   and anything that opts in via the `var(--font-mono)` family directly. */
 code,
 pre,
 pre code,
 .ktx-code,
-.ktx-code code {
+.ktx-code code,
+.font-mono,
+[style*="--font-mono"] {
   font-variant-ligatures: none !important;
   font-feature-settings: "liga" 0, "calt" 0 !important;
 }
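
Reviewer note, offered as a rough sketch rather than code from `docs-site/`: the selector list above can be modeled as a small predicate for checking whether a new monospace container is covered. The `ElementLike` shape, the `isCoveredByLigatureRule` helper, and the sample elements are hypothetical names introduced only for illustration. The one CSS fact relied on is that `[style*="--font-mono"]` matches an inline `style` attribute containing that substring, so a custom class that sets `font-family: var(--font-mono)` in a stylesheet would not be matched by it.

```typescript
// Hypothetical model of an element; not part of docs-site.
interface ElementLike {
  tag: string; // lowercase tag name, e.g. "div"
  classes: string[]; // class tokens on the element
  inlineStyle?: string; // raw value of the style attribute, if any
}

// Mirrors the selector list in global.css:
//   code, pre, .ktx-code, .font-mono, [style*="--font-mono"]
// The descendant selectors (pre code, .ktx-code code) are implied by `code`.
function isCoveredByLigatureRule(el: ElementLike): boolean {
  return (
    el.tag === "code" ||
    el.tag === "pre" ||
    el.classes.includes("ktx-code") ||
    el.classes.includes("font-mono") ||
    (el.inlineStyle ?? "").includes("--font-mono")
  );
}

// Assumed example containers for a copyable prompt.
const tailwindPrompt: ElementLike = { tag: "div", classes: ["font-mono", "text-sm"] };
const inlineVarPrompt: ElementLike = {
  tag: "div",
  classes: [],
  inlineStyle: "font-family: var(--font-mono)",
};
const customClassPrompt: ElementLike = { tag: "div", classes: ["prompt-box"] };

console.log(isCoveredByLigatureRule(tailwindPrompt)); // true
console.log(isCoveredByLigatureRule(inlineVarPrompt)); // true
console.log(isCoveredByLigatureRule(customClassPrompt)); // false
```

The last case is the one the AGENTS.md rule warns about: a new class that picks up Geist Mono through a stylesheet needs either the `font-mono` utility or an extra selector in this rule.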
diff --git a/docs-site/app/llms.mdx/docs/[[...slug]]/route.ts b/docs-site/app/llms.mdx/docs/[[...slug]]/route.ts
index 87dcbd42..1372d556 100644
--- a/docs-site/app/llms.mdx/docs/[[...slug]]/route.ts
+++ b/docs-site/app/llms.mdx/docs/[[...slug]]/route.ts
@@ -3,11 +3,6 @@ import {
   getLlmDocsPages,
   getPageMarkdown,
 } from "@/lib/llm-docs";
-import {
-  agentSetupSlug,
-  isAgentSetupSlug,
-  readAgentSetupMarkdown,
-} from "@/lib/agent-setup-markdown";
 
 export const dynamic = "force-static";
 
@@ -16,14 +11,6 @@ export async function GET(
   props: { params: Promise<{ slug?: string[] }> },
 ) {
   const params = await props.params;
-  if (isAgentSetupSlug(params.slug)) {
-    return new Response(await readAgentSetupMarkdown(), {
-      headers: {
-        "Content-Type": "text/markdown; charset=utf-8",
-      },
-    });
-  }
-
   const page = getLlmDocsPage(params.slug);
   if (!page) {
     return new Response("Documentation page not found.\n", {
@@ -42,8 +29,5 @@ export async function GET(
 }
 
 export function generateStaticParams() {
-  return [
-    ...getLlmDocsPages().map((page) => ({ slug: page.slug })),
-    { slug: [...agentSetupSlug] },
-  ];
+  return getLlmDocsPages().map((page) => ({ slug: page.slug }));
 }
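
Reviewer note, as a hedged sketch: with the agent-setup special case removed, the static params reduce to the page slugs. The `LlmDocsPage` shape, the `staticParamsFor` wrapper, and the sample slugs below are assumptions for illustration; only the `.map((page) => ({ slug: page.slug }))` expression is taken from the diff.

```typescript
// Assumed minimal page shape; the real type lives in "@/lib/llm-docs".
interface LlmDocsPage {
  slug: string[];
}

// Same mapping as the new generateStaticParams, with the page list passed in
// so the behavior can be checked without the docs-site runtime.
function staticParamsFor(pages: LlmDocsPage[]): { slug: string[] }[] {
  return pages.map((page) => ({ slug: page.slug }));
}

// Hypothetical sample pages.
const samplePages: LlmDocsPage[] = [
  { slug: ["getting-started", "quickstart"] },
  { slug: ["ai-resources", "prompt-recipes"] },
];

const params = staticParamsFor(samplePages);
const hasAgentSetup = params.some(
  (p) => p.slug.length === 1 && p.slug[0] === "agents-setup",
);

console.log(params.length, hasAgentSetup); // 2 false
```

Under this sketch, `agents-setup` is emitted only if the docs page list itself contains it. What the old `/docs/agents-setup.md` URL returns afterwards depends on `getLlmDocsPage`, which this diff does not show.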
diff --git a/docs-site/content/agents-setup.md b/docs-site/content/agents-setup.md
deleted file mode 100644
index 4933ff10..00000000
--- a/docs-site/content/agents-setup.md
+++ /dev/null
@@ -1,201 +0,0 @@
-# Goal
-
-Set up **ktx** from scratch end-to-end as a fully autonomous, agent-driven replacement for the interactive `ktx setup` wizard. Detect the environment, install missing prerequisites, ask the user only for information you genuinely need (which connections to add, credentials), write a valid configuration, verify it works, and run a fast ingest. Keep the user updated throughout.
-
-# Operating principles
-
-- **Be autonomous.** Detect, decide, and act. Only ask the user when you need information that only they can provide: project location, which databases/sources to connect, credentials, and similar choices.
-- **Stream short status updates.** Before each major phase ("Checking prerequisites…", "Installing uv…", "Configuring warehouse connection…", "Running fast ingest…") print a one-line update. Not chatty - just enough that the user can see what's happening.
-- **Verify against docs, never guess.** CLI flags, config keys, and command names must come from the docs or from `ktx  --help`. If something looks wrong or missing, say so explicitly.
-- **Print every command you run and its exit code.** Terse, not silent.
-- **Fail loudly with cause + fix.** When a command fails: capture the exact error, identify the cause, change something, retry. Never retry an unchanged command. Exceptions for *known soft-failures* are listed in Phase 4 - handle those without retrying.
-- **No LLM-based ingestion in this flow.** Only `--fast` ingest. The user can run `--deep` later.
-- **Platform-agnostic.** Detect the host OS first and pick the right install commands / path syntax. Anything path- or shell-specific must branch on OS.
-
-# Authoritative docs
-
-**ktx** docs are served at `https://docs.kaelio.com/ktx/`. **Start by fetching `https://docs.kaelio.com/ktx/llms.txt`** to discover the docs map. Scan it for a "troubleshooting" entry - if one exists, read it **before** running install/setup so you can apply known fixes preemptively rather than after failing. If no troubleshooting page is listed (current state of the docs), proceed. Then fetch any other `.md` pages you need (setup, ingest, status, connection types). **Never invent CLI flags or config keys** - verify against the docs or `ktx --help` / `ktx  --help`.
-
-> **Note on the `ktx status` JSON example in the docs.** The docs page for `ktx status` shows an example shaped like `{"title": "...", "checks": [...]}`. That example is outdated. The real CLI output uses a top-level `verdict` field plus a `connections[]` array - see Phase 5 for the canonical success criteria. Trust the shape in this prompt over the docs example.
-
-# Workflow
-
-## Phase 1 - Detect environment
-
-Determine the host OS (e.g. via `uname -s`, `process.platform`, or `$env:OS`). Use the right install commands per OS for the rest of this flow.
-
-| Tool | macOS / Linux | Windows (PowerShell) |
-|------|---------------|----------------------|
-| `uv` | `curl -LsSf https://astral.sh/uv/install.sh \| sh` then re-source shell env | `irm https://astral.sh/uv/install.ps1 \| iex` |
-| Node.js | use system / fnm / nvm - **do not** auto-install | use system / nvm-windows - **do not** auto-install |
-| **ktx** CLI | `npm install -g …` (see Phase 2) | `npm install -g …` (see Phase 2) |
-
-If Node.js is missing, **stop and ask the user** to install it (https://nodejs.org/). Do not attempt to auto-install Node.
-
-## Phase 2 - Verify and install prerequisites
-
-Check each tool in order; install only if missing.
-
-1. **Node.js** - run `node --version`. Require >= 22. If missing or older, stop and instruct the user.
-2. **`uv`** - run `uv --version`. If missing, run the OS-appropriate install command, then re-source the shell environment (`export PATH="$HOME/.local/bin:$PATH"` on Linux/macOS) so `uv` is on `PATH`.
-3. **ktx CLI** -
-   - Install ktx with `npm install -g @kaelio/ktx`
-   - Verify with `ktx --version`.
-
-Print one status line per tool ("✓ uv 0.11.15 found", "Installing uv…", "✓ ktx 0.x.y installed").
-
-## Phase 3 - Gather user choices
-
-Ask the user (grouped if your harness supports it; otherwise sequentially):
-
-1. **Project directory.** Default: current working directory. Confirm before continuing.
-2. **LLM provider.** Default: `claude-code` with model `sonnet` (the user is already inside Claude Code; no extra API key needed). Offer `anthropic` (paste API key, stored as `env:` or `file:` ref) and `vertex` (GCP project + location) as alternatives. Skip if defaults are accepted.
-3. **Embeddings backend.** Default: `sentence-transformers` (local, no API key, managed Python runtime). Offer `openai` only if the user has a key.
-4. **Database connections.** Ask how many to add, then loop. For each, collect:
-   - Connection name (e.g. `warehouse`, `analytics`).
-   - Driver: one of `sqlite`, `postgres`, `mysql`, `sqlserver`, `bigquery`, `snowflake`.
-   - Connection URL/DSN (or service-account file for BigQuery). Accept `env:VAR_NAME` or `file:/abs/path` to avoid pasting raw secrets.
-     - **Heads-up for the user**: even if they paste a literal URL, **ktx** will silently relocate it into `/.ktx/secrets/-url` and rewrite `ktx.yaml` to `url: file:…` - this is correct, secure behavior and not a bug.
-   - Schemas / datasets to include (postgres / sqlserver / snowflake / bigquery only).
-   - Optional `enabled_tables` allowlist if the user wants to scope ingest to specific tables.
-5. **Context sources** (dbt, Metabase, Looker, LookML, MetricFlow, Notion). Default: none. Ask only if the user mentions them.
-
-## Phase 4 - Configure the project
-
-Drive the existing wizard non-interactively (verify exact flag names with `ktx setup --help` and the docs - the automation flags are hidden from help but accepted):
-
-```
-ktx setup \
-  --project-dir  \
-  --no-input --yes \
-  --llm-backend  --llm-model  \
-  [--anthropic-api-key-env ANTHROPIC_API_KEY | --anthropic-api-key-file ] \
-  [--vertex-project  --vertex-location ] \
-  --embedding-backend \
-  [--embedding-api-key-env OPENAI_API_KEY] \
-  --skip-sources \
-  --database --database-connection-id --database-url \
-  [--database-schema …]
-```
-
-Notes on the flags above:
-- **Project creation is automatic with `--no-input --yes`.** When
-  `ktx.yaml` exists, setup resumes it. When it doesn't exist, setup creates it
-  at `--project-dir`.
-- **`--database-connection-id` is dual-purpose.** With `--database` or
-  `--database-url`, it names the new connection. Without those flags, it
-  selects an existing connection id.
-- **Configure one new database connection per setup command.** If the user
-  wants multiple new connections, run setup again for each connection.
-- **You don't need `--skip-agents` in this flow.** The agent integration step
-  is opt-in: setup leaves it alone unless you pass `--agents --target
-  `.
-- **`--skip-sources`** is correct and is the documented way to leave context sources unconfigured.
-
-### Known soft-failure: `ktx setup` exits 1 after a successful fast build
-
-When you select a configuration that only does fast ingest, `ktx setup`'s final readiness verification fails with:
-
-```
-ktx context build did not pass agent-readiness verification.
-  : deep database context has not completed.
-```
-
-This is **expected** and **does not mean setup failed**. Treat the exit code as a soft-failure **only if all of the following hold**:
-
-- The build log shows the fast ingest reached `[100%] Scan completed` for every configured connection.
-- `ktx connection test ` (run next) exits 0 for every connection.
-- `ktx status --json --no-input` reports `verdict: "ready"`.
-
-If those three conditions hold, proceed to Phase 5 without retrying setup, and **do not** switch to `--deep` to "fix" the readiness gate - deep ingest is explicitly out of scope. Mention this in the final report under "Docs / CLI gaps" so the user is aware.
-
-If any of those three conditions do not hold, this is a real failure - capture the error, fetch the relevant docs page, fix the cause, retry.
-
-After `ktx setup` writes `ktx.yaml`, edit it directly for anything flags don't cover:
-- Per-connection `enabled_tables` allowlist (snake_case, under `connections..enabled_tables`).
-- Any advanced settings the user requested.
-
-Use a YAML-aware editor (e.g. `uv run python -c "import yaml; …"`) - do not hand-edit blindly.
-
-## Phase 5 - Verify
-
-`ktx setup` already runs a fast ingest of every database connection it configures, so you do not need to re-ingest by default. For each configured connection:
-
-```
-ktx connection test # must exit 0
-```
-
-Only re-run ingest if setup's build log did **not** reach 100% for that connection:
-
-```
-ktx ingest --fast --no-input
-```
-
-**Mutex warning on `ktx ingest`**: passing both `--yes` and `--no-input` fails with `Choose only one runtime install mode: --yes or --no-input`. Setup already installed the managed Python runtime, so pass **only `--no-input`** to `ktx ingest`. (`--yes` is only needed when an ingest invocation has to install the runtime itself, which is not the case here.)
-
-Then run the global health check:
-
-```
-ktx status --json --no-input
-```
-
-Success requires (canonical shape - supersedes the example in the docs):
-- `verdict: "ready"` at the top of the JSON.
-- Every `connections[].status === "ok"`.
-- `ktx connection test ` exited 0 for every connection.
-
-Do **not** run `--deep` ingest in this flow - that requires LLM time and is out of scope.
-
-### Optional: directly probe the ktx daemon
-
-If the user asks for stronger verification that `sentence-transformers` is actually serving (not just that setup said "ok"), do all of:
-
-1. `ktx admin runtime status --json` → expect `"kind": "ready"` and `"features": [..., "local-embeddings"]`.
-2. `pgrep -fa ktx-daemon` → expect a process running `ktx-daemon serve-http`.
-3. `curl -sS http://127.0.0.1:/health` → expect HTTP 200 with `{"status":"healthy",…}`.
-4. `curl -sS -X POST http://127.0.0.1:/embeddings/compute -H 'content-type: application/json' -d '{"text":"hello"}'` → expect `{"embedding": [...384 floats...]}`.
-
-Discover the port from setup's log line `Started ktx daemon: http://127.0.0.1:` or from the daemon's OpenAPI at `GET /openapi.json`. Note: the routes are `/health` and `/embeddings/compute` - not `/healthz` or `/embeddings`.
-
-## Phase 6 - Final report
-
-Print a structured report:
-
-```
-ktx SETUP COMPLETE
-
-Project:
-LLM: /
-Embeddings: /
-Runtime: managed Python ✓ (if the ktx daemon was started)
-
-Connections:
-  - () status=ok schemas=[…] tables=
-  - …
-
-Sources:
-Verdict: ready
-```
-
-Then **Next steps** (copy-pasteable):
-1. Enrich with AI descriptions and embeddings: `ktx ingest --deep` (several minutes per connection).
-2. Add more connections later by rerunning this setup or via `ktx setup --database … --database-connection-id …`.
-3. Configure context sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) - see `ktx setup --help` for `--source …` flags.
-4. Install agent integration: `ktx setup --agents --target ` (with optional `--global` for `claude-code`/`codex`).
-5. Connect the agent / MCP: see docs at `https://docs.kaelio.com/ktx/`.
-
-Under **Docs / CLI gaps to flag** include any of these that applied during your run:
-- `ktx setup` exits non-zero after a successful fast build (deep-readiness gate); status reports ready.
-- `ktx ingest` rejects `--yes` and `--no-input` together; docs don't note the conflict.
-- `ktx status --json` real shape (`verdict`, `connections[]`) doesn't match the example in the docs page.
-- The pasted DB URL was moved to `.ktx/secrets/-url` automatically.
-
-End with a single line: `RESULT: PASS` or `RESULT: FAIL - `.
-
-# Operating rules (recap)
-
-- Print every command you run and its exit code. Status updates may be terse, but never silent.
-- On failure: capture the error, fetch the relevant docs page, fix the cause, retry. Never retry an unchanged command. -- Known soft-failures (listed in Phase 4 and Phase 5) are not real failures - handle them as documented; do not retry or escalate. -- If you find a docs/CLI gap ("docs say X but CLI does Y"), call it out in the final report. -- Never commit credentials - **ktx** accepts `env:` and `file:` references; prefer those. **ktx** will also auto-relocate literal URLs into `.ktx/secrets/`, but that does not protect anyone who pasted the URL into chat history. diff --git a/docs-site/content/docs/ai-resources/prompt-recipes.mdx b/docs-site/content/docs/ai-resources/prompt-recipes.mdx index c2a9f282..9ba8e3b8 100644 --- a/docs-site/content/docs/ai-resources/prompt-recipes.mdx +++ b/docs-site/content/docs/ai-resources/prompt-recipes.mdx @@ -14,7 +14,8 @@ Read https://docs.kaelio.com/ktx/llms.txt first. Then fetch only the ktx Markdow ## Set up a project ```text -Set up ktx in this repository. Start by reading /docs/ai-resources/agent-quickstart.md and /docs/getting-started/quickstart.md. Install the published CLI with npm; use pnpm only when working from a ktx source checkout. After setup, run ktx status and summarize which steps are complete, which files changed, and what still needs credentials or user input. +Run npx skills add Kaelio/ktx --skill ktx and use the ktx skill to install +and configure ktx in this project. ``` ## Find a command diff --git a/docs-site/content/docs/getting-started/quickstart.mdx b/docs-site/content/docs/getting-started/quickstart.mdx index f2ef3c7c..6f65d6ec 100644 --- a/docs-site/content/docs/getting-started/quickstart.mdx +++ b/docs-site/content/docs/getting-started/quickstart.mdx @@ -98,21 +98,70 @@ If you're a coding assistant choosing a docs route, start with the background: 'color-mix(in oklch, var(--color-fd-primary) 8%, transparent)', }} > -

- Run setup from an agent -
-
- You can ask an agent such as Claude Code, Codex, Cursor, or OpenCode to - install and configure **ktx** for you. The{' '} - - agent setup Markdown prompt - {' '} - tells the agent how to check prerequisites, ask only for credentials or - connection choices, run ktx setup, verify connections, and - report the result. -
-
- Use a prompt like this from the project you want to configure: +
+
+ Or, ask an AI agent to install and configure **ktx** for you. +
+
+ + +
@@ -120,16 +169,15 @@ If you're a coding assistant choosing a docs route, start with the Prompt
-
-
Follow instructions from
-
https://docs.kaelio.com/ktx/docs/agents-setup.md
-
to install and configure ktx
+
+ Run {'`npx skills add Kaelio/ktx --skill ktx`'} and use the ktx skill to install and configure ktx
diff --git a/docs-site/lib/agent-setup-markdown.ts b/docs-site/lib/agent-setup-markdown.ts deleted file mode 100644 index 5a42ea1f..00000000 --- a/docs-site/lib/agent-setup-markdown.ts +++ /dev/null @@ -1,12 +0,0 @@ -import { readFile } from "node:fs/promises"; -import { join } from "node:path"; - -export const agentSetupSlug = ["agents-setup"] as const; - -export function isAgentSetupSlug(slug: string[] | undefined) { - return slug?.length === 1 && slug[0] === agentSetupSlug[0]; -} - -export function readAgentSetupMarkdown() { - return readFile(join(process.cwd(), "content/agents-setup.md"), "utf8"); -} diff --git a/docs-site/lib/llm-docs.ts b/docs-site/lib/llm-docs.ts index 7ed338a0..fd6c8dd1 100644 --- a/docs-site/lib/llm-docs.ts +++ b/docs-site/lib/llm-docs.ts @@ -52,8 +52,9 @@ ktx provides semantic-layer files, warehouse scans, wiki pages, provenance, and ## Agent Entry Points +- Installable setup skill: run \`npx skills add Kaelio/ktx --skill ktx\` from + the project you want to configure. 
${link("/docs/ai-resources/agent-quickstart", "Agent Quickstart", "Task-first route for coding assistants using ktx")} -${link("/docs/agents-setup", "Agent Setup", "Copy-pasteable prompt for agents installing and configuring ktx")} ${link("/docs/ai-resources/markdown-access", "Markdown Access", "Fetch ktx docs as llms.txt, llms-full.txt, or per-page Markdown")} ${link("/docs/ai-resources/agent-instructions", "Agent Instructions", "Suggested instructions for coding assistants that need to read and cite ktx docs")} diff --git a/docs-site/next.config.mjs b/docs-site/next.config.mjs index b82803be..380dba85 100644 --- a/docs-site/next.config.mjs +++ b/docs-site/next.config.mjs @@ -6,12 +6,28 @@ const withMDX = createMDX(); const config = { basePath: "/ktx", async rewrites() { - return [ - { - source: "/docs/:path*.md", - destination: "/llms.mdx/docs/:path*", - }, - ]; + return { + beforeFiles: [ + { + source: "/stars", + has: [{ type: "host", value: "ktx.sh" }], + destination: "https://ktx-stars.vercel.app/stars", + basePath: false, + }, + { + source: "/stars/:path*", + has: [{ type: "host", value: "ktx.sh" }], + destination: "https://ktx-stars.vercel.app/stars/:path*", + basePath: false, + }, + ], + afterFiles: [ + { + source: "/docs/:path*.md", + destination: "/llms.mdx/docs/:path*", + }, + ], + }; }, async redirects() { return [ @@ -43,9 +59,9 @@ const config = { basePath: false, }, { - source: "/:path*", + source: "/:path((?!stars(?:/|$)).*)", has: [{ type: "host", value: "ktx.sh" }], - destination: "https://docs.kaelio.com/ktx/:path*", + destination: "https://docs.kaelio.com/ktx/:path", permanent: true, basePath: false, }, diff --git a/docs/terminology.md b/docs/terminology.md index 00be75e6..9da59456 100644 --- a/docs/terminology.md +++ b/docs/terminology.md @@ -21,6 +21,41 @@ in prose when ambiguity is possible. 
Always qualify: Bare `source` is allowed only inside a section that has already established its referent (e.g., body of a `Semantic sources` page, or `sourceName` as a CLI arg). +## Context Layer and Context Engine + +Use **context layer** as the primary category term for what **ktx** provides to +data agents. + +Use **context engine** as the secondary mechanism term for how **ktx** builds, +maintains, validates, and serves that layer. + +| Concept | Use | Do not use | +|---|---|---| +| The whole **ktx** product category | **context layer** / **context layer for data agents** | knowledge layer, agent memory | +| The active system that builds and maintains context | **context engine** | context layer when describing ingest/reconciliation internals | +| The durable reviewed surface agents use | **context layer** | context engine | +| The compiler pillar for executable metrics and joins | **semantic layer** | context layer when specifically discussing SQL compilation | +| Prose/business knowledge files | **wiki** / **wiki pages** | wiki context | + +### Usage rules + +- Use **context layer** in taglines, page titles, meta descriptions, docs + introductions, comparison pages, and first-paragraph definitions. +- Use **context engine** when describing active behavior: ingesting evidence, + reconciling changes, validating references, maintaining files, search, CLI, + and MCP serving. +- Keep **semantic layer** for the narrower YAML/compiler surface: semantic + sources, measures, joins, dimensions, filters, SQL compilation, and semantic + queries. +- Do not use **context engine** as the primary replacement for the whole + product. It sounds like runtime infrastructure; **context layer** better + describes the durable YAML and Markdown surface users review in git. +- Do not use **context layer** when the sentence is specifically about the + compiler. 
Example: write "the semantic layer compiles semantic queries to + SQL," not "the context layer compiles semantic queries to SQL." +- Default lowercase in prose: `context layer`, `context engine`, `semantic + layer`. Title case only in page titles, headings, nav labels, and UI labels. + ## Canonical vocabulary | Concept | Use | Do not use | @@ -31,7 +66,8 @@ referent (e.g., body of a `Semantic sources` page, or `sourceName` as a CLI arg) | The connected database | **primary source** / **database connection** | data source | | Analytics-tooling integration | **context source** / **context-source connection** | BI source, BI model, metadata source, source tool | | YAML file describing a table | **semantic source** | semantic-layer source, model file, bare "source file" | -| The whole **ktx** surface | **context layer** (lowercase in prose) | "Context Layer" in prose | +| The whole **ktx** surface | **context layer** / **context layer for data agents** (lowercase in prose) | "Context Layer" in prose, knowledge layer, agent memory | +| The active system that builds and maintains context | **context engine** (lowercase in prose) | context layer when describing ingest/reconciliation internals | | The compiler pillar | **semantic layer** (lowercase in prose) | "Semantic Layer" in prose | | The query payload | **semantic query** (lowercase in prose) | "Semantic Query" | | The MCP layer | **MCP server** (the server), **MCP tools** (the functions) | "ktx MCP" as a standalone noun | diff --git a/packages/cli/src/setup-project.ts b/packages/cli/src/setup-project.ts index d7d189e1..08f935e6 100644 --- a/packages/cli/src/setup-project.ts +++ b/packages/cli/src/setup-project.ts @@ -24,17 +24,12 @@ export interface KtxSetupProjectArgs { allowBack?: boolean; } -export type KtxSetupCreatedProjectCleanup = - | { kind: 'remove-project-dir'; projectDir: string } - | { kind: 'remove-ktx-scaffold'; projectDir: string }; - export type KtxSetupProjectResult = | { status: 'ready'; 
projectDir: string; project: KtxLocalProject; confirmedCreation?: boolean; - createdProjectCleanup?: KtxSetupCreatedProjectCleanup; } | { status: 'back'; projectDir: string } | { status: 'cancelled'; projectDir: string } @@ -59,7 +54,6 @@ type PromptProjectDirResult = status: 'selected'; projectDir: string; confirmedCreation: boolean; - createdProjectCleanup?: KtxSetupCreatedProjectCleanup; } | { status: 'cancelled'; projectDir: string } | { status: 'missing-input'; projectDir: string } @@ -106,26 +100,12 @@ type ConfirmProjectDirResult = | { status: 'confirmed'; confirmedCreation: boolean; - createdProjectCleanup?: KtxSetupCreatedProjectCleanup; } | { status: 'choose-another' } | { status: 'back' } | { status: 'cancelled' } | { status: 'not-directory' }; -function cleanupForFolderState( - projectDir: string, - state: Awaited>, -): KtxSetupCreatedProjectCleanup | undefined { - if (state === 'missing') { - return { kind: 'remove-project-dir', projectDir }; - } - if (state === 'empty-directory') { - return { kind: 'remove-ktx-scaffold', projectDir }; - } - return undefined; -} - async function confirmProjectDir( selectedDir: string, io: KtxCliIo, @@ -165,7 +145,7 @@ async function confirmProjectDir( if (action === 'choose-another') return { status: 'choose-another' }; if (action === 'back') return { status: 'back' }; if (action !== 'create') return { status: 'cancelled' }; - return { status: 'confirmed', confirmedCreation: true, createdProjectCleanup: cleanupForFolderState(selectedDir, state) }; + return { status: 'confirmed', confirmedCreation: true }; } async function normalizeSetupGitignore(projectDir: string): Promise { @@ -252,24 +232,10 @@ async function promptForNewProjectDir( status: 'selected', projectDir: selectedDir, confirmedCreation: confirmed.confirmedCreation, - ...(confirmed.createdProjectCleanup ? 
{ createdProjectCleanup: confirmed.createdProjectCleanup } : {}), }; } } -async function createProjectWithCleanup( - projectDir: string, - deps: KtxSetupProjectDeps, -): Promise<{ project: KtxLocalProject; createdProjectCleanup?: KtxSetupCreatedProjectCleanup }> { - const state = await existingFolderState(projectDir); - const project = await createProject(projectDir, deps); - const createdProjectCleanup = cleanupForFolderState(projectDir, state); - return { - project, - ...(createdProjectCleanup ? { createdProjectCleanup } : {}), - }; -} - export async function runKtxSetupProjectStep( args: KtxSetupProjectArgs, io: KtxCliIo, @@ -307,7 +273,6 @@ export async function runKtxSetupProjectStep( projectDir: selected.projectDir, project, confirmedCreation: selected.confirmedCreation, - ...(selected.createdProjectCleanup ? { createdProjectCleanup: selected.createdProjectCleanup } : {}), }; } @@ -322,13 +287,12 @@ export async function runKtxSetupProjectStep( io.stderr.write('Missing setup choice: pass --yes to create a project in non-interactive setup.\n'); return { status: 'missing-input', projectDir }; } - const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps); + const project = await createProject(projectDir, deps); printProjectSummary(io, projectDir); return { status: 'ready', projectDir, project, - ...(createdProjectCleanup ? { createdProjectCleanup } : {}), }; } @@ -368,13 +332,12 @@ export async function runKtxSetupProjectStep( } if (choice === 'current') { - const { project, createdProjectCleanup } = await createProjectWithCleanup(projectDir, deps); + const project = await createProject(projectDir, deps); printProjectSummary(io, projectDir); return { status: 'ready', projectDir, project, - ...(createdProjectCleanup ? 
{ createdProjectCleanup } : {}), }; } @@ -390,7 +353,6 @@ export async function runKtxSetupProjectStep( projectDir: defaultProjectDir, project, confirmedCreation: confirmed.confirmedCreation, - ...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}), }; } @@ -419,7 +381,6 @@ export async function runKtxSetupProjectStep( projectDir: customDir, project, confirmedCreation: confirmed.confirmedCreation, - ...(confirmed.createdProjectCleanup ? { createdProjectCleanup: confirmed.createdProjectCleanup } : {}), }; } diff --git a/packages/cli/src/setup.ts b/packages/cli/src/setup.ts index 6b3442ca..74056542 100644 --- a/packages/cli/src/setup.ts +++ b/packages/cli/src/setup.ts @@ -1,5 +1,4 @@ import { existsSync } from 'node:fs'; -import { rm } from 'node:fs/promises'; import { basename, join, resolve } from 'node:path'; import { getLatestLocalIngestStatus } from './context/ingest/local-ingest.js'; import { savedMemoryCountsForReport } from './context/ingest/reports.js'; @@ -32,11 +31,7 @@ import { isKtxSetupLlmConfigReady, runKtxSetupAnthropicModelStep, } from './setup-models.js'; -import { - type KtxSetupCreatedProjectCleanup, - type KtxSetupProjectDeps, - runKtxSetupProjectStep, -} from './setup-project.js'; +import { type KtxSetupProjectDeps, runKtxSetupProjectStep } from './setup-project.js'; import { isKtxPreAgentSetupReady, isKtxSetupReady, @@ -556,23 +551,6 @@ async function commitSetupConfigChanges(projectDir: string): Promise { await project.git.commitFile('ktx.yaml', 'setup: update KTX project config', 'ktx setup', 'setup@ktx.local'); } -const KTX_SETUP_SCAFFOLD_PATHS = ['ktx.yaml', '.ktx', 'wiki', 'semantic-layer', 'raw-sources', '.git']; - -async function cleanupCreatedProjectScaffold(cleanup: KtxSetupCreatedProjectCleanup | undefined): Promise { - if (!cleanup) { - return; - } - if (cleanup.kind === 'remove-project-dir') { - await rm(cleanup.projectDir, { recursive: true, force: true }); - return; - } - await 
Promise.all(
-    KTX_SETUP_SCAFFOLD_PATHS.map((relativePath) =>
-      rm(join(cleanup.projectDir, relativePath), { recursive: true, force: true }),
-    ),
-  );
-}
-
 export async function runKtxSetup(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetupDeps = {}): Promise {
   try {
     return await runKtxSetupInner(args, io, deps);
@@ -869,7 +847,6 @@ async function runKtxSetupInner(args: KtxSetupArgs, io: KtxCliIo, deps: KtxSetup
   });

   if (stepResult.status === 'failed') {
-    await cleanupCreatedProjectScaffold(projectResult.createdProjectCleanup);
     return 1;
   }
   if (stepResult.status === 'missing-input') {
diff --git a/packages/cli/test/setup.test.ts b/packages/cli/test/setup.test.ts
index 6c928033..0bc00919 100644
--- a/packages/cli/test/setup.test.ts
+++ b/packages/cli/test/setup.test.ts
@@ -1,5 +1,5 @@
 import { execFile } from 'node:child_process';
-import { mkdir, mkdtemp, readFile, readdir, rm, stat, writeFile } from 'node:fs/promises';
+import { mkdir, mkdtemp, readFile, rm, stat, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { promisify } from 'node:util';
@@ -602,7 +602,7 @@ describe('setup status', () => {
     expect(testIo.stderr()).toBe('');
   });

-  it('removes a newly created missing project directory when a later runtime step fails', async () => {
+  it('preserves a newly created missing project directory when a later setup step fails', async () => {
     const projectDir = join(tempDir, 'missing-project');
     const testIo = makeIo();

@@ -634,10 +634,12 @@ describe('setup status', () => {
       ),
     ).resolves.toBe(1);

-    await expect(stat(projectDir)).rejects.toThrow();
+    await expect(stat(projectDir)).resolves.toBeDefined();
+    await expect(stat(join(projectDir, 'ktx.yaml'))).resolves.toBeDefined();
+    await expect(stat(join(projectDir, '.ktx'))).resolves.toBeDefined();
   });

-  it('removes KTX scaffold files from an initially empty project directory when runtime setup fails', async () => {
+  it('preserves KTX scaffold files in an
initially empty project directory when setup fails', async () => {
     const testIo = makeIo();

     await expect(
@@ -668,8 +670,59 @@ describe('setup status', () => {
       ),
     ).resolves.toBe(1);

-    await expect(stat(tempDir)).resolves.toBeDefined();
-    expect(await readdir(tempDir)).toEqual([]);
+    await expect(stat(join(tempDir, 'ktx.yaml'))).resolves.toBeDefined();
+    await expect(stat(join(tempDir, '.ktx'))).resolves.toBeDefined();
+  });
+
+  it('preserves partial context-build artifacts and resume state when the context step fails', async () => {
+    const projectDir = join(tempDir, 'partial-context');
+    const testIo = makeIo();
+
+    await expect(
+      runKtxSetup(
+        {
+          command: 'run',
+          projectDir,
+          mode: 'auto',
+          agents: false,
+          skipAgents: true,
+          inputMode: 'disabled',
+          yes: true,
+          cliVersion: '0.2.0',
+          skipLlm: true,
+          skipEmbeddings: true,
+          databaseSchemas: [],
+          skipDatabases: true,
+          skipSources: true,
+        },
+        testIo.io,
+        {
+          model: async () => ({ status: 'skipped', projectDir }),
+          embeddings: async () => ({ status: 'skipped', projectDir }),
+          databases: async () => ({ status: 'skipped', projectDir }),
+          sources: async () => ({ status: 'skipped', projectDir }),
+          runtime: async () => runtimeReady(projectDir),
+          context: async () => {
+            await mkdir(join(projectDir, '.ktx', 'setup'), { recursive: true });
+            await writeFile(
+              join(projectDir, '.ktx', 'setup', 'state.json'),
+              JSON.stringify({ status: 'failed', retryableFailedTargets: [{ source: 'metabase' }] }),
+              'utf-8',
+            );
+            await mkdir(join(projectDir, 'wiki'), { recursive: true });
+            await writeFile(join(projectDir, 'wiki', 'postgres-warehouse.md'), '# warehouse\n', 'utf-8');
+            await mkdir(join(projectDir, 'semantic-layer'), { recursive: true });
+            await writeFile(join(projectDir, 'semantic-layer', 'orders.yaml'), 'name: orders\n', 'utf-8');
+            return { status: 'failed', projectDir };
+          },
+        },
+      ),
+    ).resolves.toBe(1);
+
+    await expect(stat(join(projectDir, 'ktx.yaml'))).resolves.toBeDefined();
+    await
expect(readFile(join(projectDir, '.ktx', 'setup', 'state.json'), 'utf-8')).resolves.toContain('"status":"failed"');
+    await expect(readFile(join(projectDir, 'wiki', 'postgres-warehouse.md'), 'utf-8')).resolves.toContain('warehouse');
+    await expect(readFile(join(projectDir, 'semantic-layer', 'orders.yaml'), 'utf-8')).resolves.toContain('orders');
   });

   it('preserves a pre-existing non-empty project directory when runtime setup fails', async () => {
diff --git a/skills.sh.json b/skills.sh.json
new file mode 100644
index 00000000..6bc144ae
--- /dev/null
+++ b/skills.sh.json
@@ -0,0 +1,11 @@
+{
+  "$schema": "https://skills.sh/schemas/skills.sh.schema.json",
+  "notGrouped": "bottom",
+  "groupings": [
+    {
+      "title": "ktx",
+      "description": "Skills for installing, configuring, and operating ktx.",
+      "skills": ["ktx"]
+    }
+  ]
+}
diff --git a/skills/ktx/SKILL.md b/skills/ktx/SKILL.md
new file mode 100644
index 00000000..85028de7
--- /dev/null
+++ b/skills/ktx/SKILL.md
@@ -0,0 +1,168 @@
+---
+name: ktx
+description: Installs and configures ktx, the open-source context layer for data agents — runs ktx setup non-interactively with hidden CLI flags, configures database connections and embeddings, installs agent integration, and verifies readiness. Use when the user asks an agent to add ktx to a project, connect data sources, install agent rules, ingest schema, or troubleshoot a local ktx install.
+---
+
+# ktx
+
+Install and configure **ktx**, the open-source context layer for data agents.
+Use this skill when a user wants an agent to add **ktx** to a project, connect
+data sources, build initial context, install agent integration, or troubleshoot
+a local **ktx** setup.
+
+## Operating rules
+
+- Act autonomously when the user asks you to install or configure **ktx**.
+  The non-interactive scripted flow below is the canonical path — bare
+  `ktx setup` is interactive (clack prompts) and an agent cannot drive it.
+- Setup's non-interactive flags are intentionally hidden from `--help`. Use the
+  flags listed below; verify uncommon flags against the docs at
+  `https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
+- Ask only for values you cannot infer: project directory, connection targets,
+  credentials, account identifiers, and source selections.
+- Never ask the user to paste secrets when an `env:VAR_NAME` or `file:/path`
+  reference would work. Pasting a literal URL is also safe — `ktx setup`
+  auto-externalizes URLs into `.ktx/secrets/-url` (see workflow step 2).
+- Do not commit `.ktx/secrets/*`.
+- Print each command you run and its result.
+- If a command fails, identify the cause and change something before retrying.
+
+## Gather inputs once
+
+Before invoking `ktx setup`, collect in one round:
+
+1. Project directory (default: current working directory).
+2. LLM backend and key strategy. In `--no-input` mode the CLI defaults to
+   `anthropic` and **requires an API key**. When the user is inside Claude
+   Code, pass `--llm-backend claude-code` explicitly; otherwise pass
+   `--llm-backend anthropic --anthropic-api-key-env ANTHROPIC_API_KEY`.
+3. Embedding backend (`sentence-transformers` is the local default and needs
+   no key; use `openai` only if the user already has a key, then pass
+   `--embedding-api-key-env OPENAI_API_KEY`).
+4. Database: driver, connection id, URL (or `env:` / `file:` ref), and one or
+   more schemas.
+5. Optional context sources (dbt, Metabase, Looker, LookML, MetricFlow,
+   Notion). Skip with `--skip-sources` if the user has none.
+
+Do not discover these inputs across multiple setup runs.
+
+## Install workflow
+
+1. **Detect the install path.** If the working directory contains
+   `packages/cli/dist/bin.js` or `pnpm-workspace.yaml` referencing
+   `@kaelio/ktx` you are inside the **ktx** monorepo — build and link the
+   local CLI with `pnpm` and do **not** run `npm install -g`.
Otherwise:
+
+   ```bash
+   node --version # require >= 22; stop and ask the user if older
+   ktx --version || npm install -g @kaelio/ktx
+   ```
+
+2. **Run scripted setup** (canonical path):
+
+   ```bash
+   ktx setup --no-input --yes \
+     --project-dir \
+     --llm-backend claude-code \
+     --embedding-backend sentence-transformers \
+     --database --database-connection-id \
+     --database-url '' \
+     --database-schema \
+     --skip-sources
+   ```
+
+   - Configure one new database connection per setup invocation. For multiple
+     connections, rerun setup once per connection.
+   - Pasting a literal `--database-url` is safe: the CLI relocates the URL
+     into `.ktx/secrets/-url` and rewrites `ktx.yaml` to a
+     `file:` ref automatically.
+
+3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
+   project resumes its config. Use `--skip-llm`, `--skip-databases`,
+   `--skip-sources`, or `--skip-embeddings` to leave a slice unconfigured but
+   let the rest complete instead of aborting on the first failure. **When
+   resuming an existing project to change one slice (e.g. only LLM), still
+   pass the database flags from the previous run** — setup validates current
+   flags, not persisted `ktx.yaml` state.
+
+4. **Run fast ingest** if setup did not already complete one:
+
+   ```bash
+   ktx ingest --fast --no-input
+   ```
+
+   Note: `ktx ingest` rejects `--yes` together with `--no-input`
+   (*Choose only one runtime install mode*); `ktx setup` accepts both. Use
+   `--no-input` only for ingest. Do not run `--deep` ingest unless the user
+   explicitly asks for LLM-backed enrichment.
+
+5. **Install agent integration:**
+
+   ```bash
+   ktx setup --agents --target
+   ktx mcp start --project-dir
+   ```
+
+   Agent integration is **not usable until `ktx mcp start` is running**. The
+   `--agents` step prints this requirement as `Required before using agents`.
+
+6. **Fall back to bare `ktx setup` only when a human is at the keyboard** —
+   it uses interactive prompts an agent cannot answer.
+
+## Files to inspect
+
+- `ktx.yaml`: project configuration.
+- `.ktx/secrets/*`: local secret files. Never commit them.
+- `semantic-layer//*.yaml`: semantic sources for SQL
+  compilation.
+- `wiki/**/*.md`: project context pages for agents.
+- `.claude/skills/ktx/`, `.agents/skills/ktx/`, `.cursor/rules/ktx.mdc`, and
+  `.opencode/commands/ktx.md`: generated agent integration files.
+
+## Verification
+
+After setup, run:
+
+```bash
+ktx connection test
+ktx status --json --no-input
+```
+
+**Judge readiness from `ktx status --json` fields, not the exit code.**
+`ktx status` exits 1 whenever the LLM is `none`, even when embeddings and
+every database connection are healthy. Treat success as:
+
+- `verdict: "ready"` at the top of the JSON, and
+- every `connections[].status === "ok"`, and
+- every `ktx connection test ` exited 0.
+
+A non-zero exit with only the LLM unconfigured is still a usable context
+layer — report it as "ready, LLM optional" rather than retrying setup.
+
+## Troubleshooting
+
+For known failure signatures (`invalid ELF header`,
+`Native CLI binary for not found`, `Missing Anthropic API key`,
+`claude-code` probe failure, `KTX cannot work without a database` on resume),
+see [troubleshooting.md](troubleshooting.md).
+
+## Final report
+
+End setup work with a concise report:
+
+```text
+ktx SETUP COMPLETE
+
+Project:
+LLM: /
+Embeddings: /
+Connections: () status=
+Sources:
+Verdict:
+
+Next:
+1.
+2.
+
+RESULT: PASS
+```
diff --git a/skills/ktx/agents/openai.yaml b/skills/ktx/agents/openai.yaml
new file mode 100644
index 00000000..41eb75d2
--- /dev/null
+++ b/skills/ktx/agents/openai.yaml
@@ -0,0 +1,7 @@
+interface:
+  display_name: "ktx"
+  short_description: "Install and configure ktx for data agents"
+  default_prompt: "Use $ktx to install and configure ktx in this project."
+
+policy:
+  allow_implicit_invocation: true
diff --git a/skills/ktx/troubleshooting.md b/skills/ktx/troubleshooting.md
new file mode 100644
index 00000000..812b45fc
--- /dev/null
+++ b/skills/ktx/troubleshooting.md
@@ -0,0 +1,79 @@
+# ktx setup troubleshooting
+
+Known failure signatures hit by agent-driven `ktx setup` runs. Match the
+error string in the left column, apply the fix in the right column.
+
+## `Error: invalid ELF header` from `better-sqlite3`
+
+Native module compiled for a different platform or architecture (e.g.
+installed under Rosetta then run under native arm64).
+
+Fix:
+
+```bash
+# Inside the ktx monorepo:
+pnpm rebuild better-sqlite3
+
+# Or for a global install:
+npm rebuild --global better-sqlite3
+```
+
+## `Native CLI binary for not found`
+
+The platform-specific optional dependency that ships the native CLI binary
+was skipped during install (npm/pnpm "optional dep not for this platform").
+
+Fix:
+
+```bash
+npm install -g @kaelio/ktx --force
+```
+
+## `Missing Anthropic API key: pass --anthropic-api-key-env or --anthropic-api-key-file`
+
+`--no-input` mode defaulted the LLM backend to `anthropic` because no
+`--llm-backend` flag was supplied. The CLI then required a key.
+
+Fix — pick one:
+
+```bash
+# Inside Claude Code, prefer the local backend:
+ktx setup --no-input --llm-backend claude-code ...other flags...
+
+# Otherwise point at an existing env var:
+ktx setup --no-input --llm-backend anthropic \
+  --anthropic-api-key-env ANTHROPIC_API_KEY ...other flags...
+```
+
+## `claude-code` LLM probe fails (auth or binary not found)
+
+The `claude` CLI is not on the agent's `PATH`, or the user has not run
+`claude` interactively at least once to log in.
+
+Fix:
+
+```bash
+which claude # confirm the binary resolves
+claude --version # confirm it runs
+# If auth probe still fails, the user must run `claude` once interactively
+# to complete login; agents cannot do this step.
+```
+
+If `claude-code` cannot be made to work, fall back to `--skip-llm` and let
+the rest of setup complete; the project is still a usable context layer
+without an LLM.
+
+## `KTX cannot work without a database` when resuming setup
+
+`ktx setup` validates the **current invocation's flags**, not the persisted
+`ktx.yaml`. Resuming setup with only `--llm-backend …` fails even when the
+project already has a healthy database connection.
+
+Fix — re-pass the database flags from the original setup run, even when
+only changing one slice:
+
+```bash
+ktx setup --no-input \
+  --database --database-connection-id \
+  --llm-backend claude-code
+```
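Reviewer note (not part of the patch): the failure signatures in `troubleshooting.md` lend themselves to a small lookup that an agent wrapper could run over captured stderr. The sketch below is a hypothetical illustration, not code from this PR. `KNOWN_FAILURES` and `diagnose` are invented names, the hints paraphrase the fixes in the file, and the platform string used in the example is assumed. The `claude-code` probe failure is left out because the file gives no literal error string for it.

```typescript
// Hypothetical helper: maps stderr text from a failed `ktx setup` run to the
// fix described in troubleshooting.md. Names and hint wording are illustrative.
interface KnownFailure {
  pattern: RegExp; // signature to look for in stderr
  hint: string; // short paraphrase of the documented fix
}

const KNOWN_FAILURES: KnownFailure[] = [
  {
    pattern: /invalid ELF header/,
    hint: 'Rebuild better-sqlite3 for the current platform.',
  },
  {
    // The platform identifier between "for" and "not found" varies, so any text is allowed there.
    pattern: /Native CLI binary for .*not found/,
    hint: 'Reinstall @kaelio/ktx so the platform-specific optional dependency is installed.',
  },
  {
    pattern: /Missing Anthropic API key/,
    hint: 'Pass --llm-backend claude-code, or --anthropic-api-key-env with the anthropic backend.',
  },
  {
    pattern: /KTX cannot work without a database/,
    hint: 'Re-pass the database flags from the original setup run.',
  },
];

// Returns the hint for the first matching signature, or undefined when the
// output matches none of the documented failures.
function diagnose(stderr: string): string | undefined {
  return KNOWN_FAILURES.find((failure) => failure.pattern.test(stderr))?.hint;
}
```

A caller might log `diagnose(capturedStderr)` before deciding whether to retry, which fits the skill's rule to change something before re-running a failed command.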