docs(ktx skill): harden setup guidance from agent-driven demo run (#247)

Fold field-tested fixes into the ktx skill, verified against current CLI source:

- prefer file: secret refs over env: (env: re-resolves per-process and resolves empty in later ingest/mcp shells)
- pass --skip-agents on data-only setup runs; explain the trailing agent step's misleading exit 1 on otherwise-successful runs
- dbt ignores --source-warehouse-connection-id (maps by table name); required only for Metabase/Looker/LookML
- never go silent during slow setup/ingest: poll .ktx mtimes and post progress so a long run does not look stuck
- judge readiness from verdict, connections[].status, localStats.semanticLayer and wikiPages; perConnection under-reports
- add troubleshooting entries for the 'Run in a TTY' exit 1 and secrets that resolve empty only during ingest/mcp
2026-07-22 11:51:01 +02:00 · 2026-06-01 14:08:58 +02:00 · 5faa16b32c
commit 5faa16b32c
parent 1959f493d6
2 changed files with 120 additions and 25 deletions
--- a/skills/ktx/SKILL.md
+++ b/skills/ktx/SKILL.md
@@ -20,11 +20,25 @@ a local **ktx** setup.
  `https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
 - Ask only for values you cannot infer: project directory, connection targets,
  credentials, account identifiers, and source selections.
- Never ask the user to paste secrets when an `env:VAR_NAME` or `file:/path`
-  reference would work. Pasting a literal URL is also safe — `ktx setup`
-  auto-externalizes URLs into `.ktx/secrets/<id>-url` (see workflow step 2).
+- Prefer `file:/abs/path` secret refs over `env:VAR_NAME`. `env:` refs are
+  re-resolved against the process environment on **every** `ktx` run, so a var
+  exported only in the setup shell is gone when `ktx ingest` or `ktx mcp start`
+  runs later — the secret silently resolves to empty and the connection fails.
+  `file:` refs read from disk and survive across shells. The same caveat
+  applies to `--*-api-key-env` flags: the named var must be present in every
+  shell that runs `ktx`, including the `ktx mcp` daemon's environment.
+- A literal database URL is safe to pass — `ktx setup` auto-externalizes it
+  into `.ktx/secrets/<id>-url` and rewrites `ktx.yaml` to a `file:` ref (see
+  workflow step 2). Source credential refs are **not** auto-externalized: write
+  the secret to a file under `.ktx/secrets/` (`chmod 600`) and pass a `file:`
+  ref. Never ask the user to paste a secret when a `file:` or `env:` ref works.
 - Do not commit `.ktx/secrets/*`.
 - Print each command you run and its result.
+- Setup and ingest can run for many minutes (LLM-heavy source ingests take the
+  longest), and from the outside a slow step looks identical to a stuck one.
+  Don't go silent: say what's about to run and that it may take a while, then
+  post brief progress/liveness updates while it runs (see step 4) so the user
+  never has to wonder whether it stalled — otherwise they may kill it mid-run.
 - If a command fails, identify the cause and change something before retrying.

 ## Gather inputs once
@@ -68,9 +82,10 @@ Do not discover these inputs across multiple setup runs.
     --llm-backend claude-code \
     --embedding-backend sentence-transformers \
     --database <driver> --database-connection-id <id> \
-     --database-url '<raw-url | env:NAME | file:/abs/path>' \
+     --database-url '<raw-url | file:/abs/path>' \
     --database-schema <schema> \
-     --skip-sources
+     --skip-sources \
+     --skip-agents
   ```

   - Configure one new database connection per setup invocation. For multiple
@@ -78,6 +93,13 @@ Do not discover these inputs across multiple setup runs.
   - Pasting a literal `--database-url` is safe: the CLI relocates the URL
     into `.ktx/secrets/<connection-id>-url` and rewrites `ktx.yaml` to a
     `file:` ref automatically.
+   - `ktx setup` runs agent integration as its **last** step. In `--no-input`
+     mode with neither `--target` nor `--skip-agents`, that step has no input,
+     prints `Run in a TTY, or pass --target <target>.`, and the command exits
+     non-zero **even though every database/LLM/embedding step succeeded**. Pass
+     `--skip-agents` to defer agents to step 5 (as above), or `--target <agent>`
+     to install them inline and exit 0. Judge data-layer success from
+     `ktx status`, not from this exit code.

 3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
   project resumes its config. Use `--skip-llm`, `--skip-databases`,
@@ -99,6 +121,23 @@ Do not discover these inputs across multiple setup runs.
   together with `--no-input` (*Choose only one runtime install mode*);
   `ktx setup` accepts both. Use `--no-input` only for ingest.

+   Ingest one connection at a time. It can run for many minutes with **no
+   stdout** until it exits (LLM-heavy sources like Metabase are the slowest), so
+   don't assume it hung, and don't pipe it through `tail`/`head`: `tail` prints
+   nothing until the run ends and `head` can cut it short, so run it raw. Tell
+   the user up front that the step is slow, then keep them posted: run the
+   ingest in the background and poll for liveness every minute or so, reporting
+   a one-line update each time (which connection, roughly how long it's been
+   running, and that `.ktx` files are still changing) so it never looks stuck:
+
+   ```bash
+   find <path>/.ktx/worktrees <path>/.ktx/ingest-transcripts -type f -mmin -3
+   ```
+
+   On success, the `Ingest finished` summary table shows `done` in the
+   `Source ingest` and `Memory update` columns with no `Failed sources:`
+   section.
+
 5. **Install agent integration:**

   ```bash
@@ -117,29 +156,33 @@ Do not discover these inputs across multiple setup runs.
 Context sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) are added
 **one at a time** — `--source` is not repeatable, so run `ktx setup` once per
 source. Source setup is resumable against an existing project: pass
-`--skip-databases --skip-llm --skip-embeddings` so only the source is
-configured. Map warehouse-backed sources (dbt, Metabase, Looker) to an existing
-database connection with `--source-warehouse-connection-id <db-connection-id>`.
-Prefer `env:VAR` / `file:/abs/path` refs for keys and tokens over literals.
+`--skip-databases --skip-llm --skip-embeddings --skip-agents` so only the source
+is configured (the trailing agent step otherwise fails the run — see install
+step 2). Map Metabase, Looker, and LookML to an existing database connection
+with `--source-warehouse-connection-id <db-connection-id>` (required for those).
+**dbt ignores `--source-warehouse-connection-id`** — it maps to the warehouse by
+table name — so omit it for dbt. Use `file:/abs/path` refs for keys and tokens
+(see the secrets rule above); `env:` refs must be exported in every later `ktx`
+shell.

 ```bash
-# dbt — pick exactly one of --source-path (local) or --source-git-url (remote)
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+# dbt — pick exactly one of --source-path (local) or --source-git-url (remote).
+# No --source-warehouse-connection-id: dbt maps to the warehouse by table name.
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
  --source dbt --source-connection-id <id> \
-  --source-git-url <url> --source-branch <branch> \
-  --source-warehouse-connection-id <db-connection-id>
+  --source-git-url <url> --source-branch <branch>

 # Metabase
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
  --source metabase --source-connection-id <id> \
-  --source-url <url> --source-api-key-ref env:METABASE_API_KEY \
+  --source-url <url> --source-api-key-ref file:/abs/path/metabase-api-key \
  --source-warehouse-connection-id <db-connection-id> \
  --metabase-database-id <metabase-db-id>

 # Notion
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
  --source notion --source-connection-id <id> \
-  --source-auth-token-ref env:NOTION_TOKEN \
+  --source-auth-token-ref file:/abs/path/notion-token \
  --notion-crawl-mode selected_roots --notion-root-page-id <page-id>
 ```

@@ -171,25 +214,36 @@ After setup, run:
 ```bash
 ktx connection test <connection-id>
 ktx status --json --no-input
+ktx sl --output plain          # lists compiled semantic sources; `ktx sl` has no --no-input
 ```

 **Judge readiness from `ktx status --json` fields, not the exit code.**
-`ktx status` exits 1 whenever the LLM is `none`, even when embeddings and
-every database connection are healthy. Treat success as:
+`ktx status` exits 1 whenever the LLM is `none` (`verdict: "blocked"`), even
+when embeddings and every database connection are healthy. Treat success as:

 - `verdict: "ready"` at the top of the JSON, and
- every `connections[].status === "ok"`, and
- every `ktx connection test <id>` exited 0.
+- every `connections[].status === "ok"` (other levels: `warn`, `fail`,
+  `skipped`), and
+- every `ktx connection test <id>` exited 0, and
+- for each ingested source, `localStats.semanticLayer[].sourceCount > 0` and
+  `localStats.wikiPages[].count > 0` — these confirm the source actually
+  produced context. Do **not** rely on `localStats.ingest.perConnection` to
+  confirm source ingests: it reflects only completed warehouse ingest reports
+  and under-reports (often lists just the warehouse connection).

-A non-zero exit with only the LLM unconfigured is still a usable context
-layer — report it as "ready, LLM optional" rather than retrying setup.
+If the LLM is intentionally left unconfigured, `verdict` is `blocked` and the
+exit is non-zero by design — that is still a usable context layer, so report it
+as "ready, LLM optional" and judge the data layer by the connection and
+`localStats` fields above rather than retrying setup.

 ## Troubleshooting

 For known failure signatures (`invalid ELF header`,
 `Native CLI binary for <plat> not found`, `Missing Anthropic API key`,
-`claude-code` probe failure, `KTX cannot work without a database` on resume),
-see [troubleshooting.md](troubleshooting.md).
+`claude-code` probe failure, `KTX cannot work without a database` on resume,
+`Run in a TTY, or pass --target <target>.` with a misleading exit 1, and a
+secret that resolves empty only during `ktx ingest`/`ktx mcp`), see
+[troubleshooting.md](troubleshooting.md).

 ## Final report

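Outside the patch itself, the background-and-poll pattern that step 4 of SKILL.md describes could be sketched roughly as follows. `poll_liveness` is a hypothetical helper name, not part of the ktx CLI; the 3-minute `find` window mirrors the command in the skill, and the wording of the status line is an assumption.

```bash
# Sketch only: poll_liveness is a hypothetical helper, not a ktx command.
# Usage: poll_liveness <pid> <interval-seconds> <label> <dir>...
# While <pid> is alive, print one status line per interval reporting how long
# it has been running and how many files under the given directories changed
# in the last 3 minutes. Returns the exit code of <pid>.
poll_liveness() {
  local pid=$1 interval=$2 label=$3
  shift 3
  local start=$SECONDS changed
  while kill -0 "$pid" 2>/dev/null; do
    # Missing directories are tolerated; they simply contribute zero files.
    changed=$(find "$@" -type f -mmin -3 2>/dev/null | wc -l | tr -d ' ')
    echo "[$label] running for $((SECONDS - start))s, $changed file(s) changed in the last 3 min"
    sleep "$interval"
  done
  # The job must have been started by this same shell for wait to see it.
  wait "$pid"
}
```

To use it, start the ingest command in the background from the same shell, then call `poll_liveness "$!" 60 <connection-id> <path>/.ktx/worktrees <path>/.ktx/ingest-transcripts`. Calling it inside `$(...)` would not work, because `wait` can only collect jobs started by the current shell.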
--- a/skills/ktx/troubleshooting.md
+++ b/skills/ktx/troubleshooting.md
@@ -77,3 +77,44 @@ ktx setup --no-input \
  --database <driver> --database-connection-id <id> \
  --llm-backend claude-code
 ```
+
+## `Run in a TTY, or pass --target <target>.` and `ktx setup` exits 1
+
+`ktx setup` runs agent integration as its last step. In `--no-input` mode with
+neither `--target` nor `--skip-agents`, that step has no input and the whole
+command exits non-zero — even when every database, LLM, and embedding step
+already succeeded. The exit code is misleading here.
+
+Fix — pass one of these to the data-only setup runs:
+
+```bash
+# Defer agents; install them later with `ktx setup --agents --target <agent>`:
+ktx setup --no-input --yes ...other flags... --skip-agents
+
+# Or install agents inline and exit 0:
+ktx setup --no-input --yes ...other flags... --target claude-code
+```
+
+Either way, confirm the data work landed with `ktx status --json` rather than
+trusting the exit code.
+
+## A secret resolves empty only during `ktx ingest` or `ktx mcp`
+
+Setup succeeded, but a later `ktx ingest`/`ktx mcp start` fails to connect or
+authenticate. The connection used an `env:VAR_NAME` ref (or a `--*-api-key-env`
+flag) and the variable was exported only in the setup shell. `env:` refs are
+re-resolved against the process environment on every `ktx` run, so they resolve
+to empty wherever the var is absent — including the `ktx mcp` daemon.
+
+Fix — write the secret to a file and use a `file:` ref, which reads from disk
+and survives across shells:
+
+```bash
+mkdir -p "$PROJECT/.ktx/secrets"
+printf '%s\n' '<secret>' > "$PROJECT/.ktx/secrets/<id>-<name>"
+chmod 600 "$PROJECT/.ktx/secrets/"*
+# then pass: --source-api-key-ref file:$PROJECT/.ktx/secrets/<id>-<name>
+```
+
+Alternatively, ensure the var is exported in every shell that runs `ktx`,
+including the environment of the `ktx mcp` daemon.
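As a sketch outside the patch, the file-based fix above could be wrapped in a small shell function so the secret never appears on the command line. `write_secret_ref` is a hypothetical helper name, not part of the ktx CLI; it assumes the project directory is passed as an absolute path, since the skill uses `file:/abs/path` refs.

```bash
# Sketch only: write_secret_ref is a hypothetical helper, not a ktx command.
# Usage: <command-that-prints-the-secret> | write_secret_ref <abs-project-dir> <name>
# Reads the secret from stdin, stores it verbatim in
# <abs-project-dir>/.ktx/secrets/<name> with mode 600, and prints the
# matching file: ref on stdout.
write_secret_ref() {
  local dir="$1/.ktx/secrets" file
  file="$dir/$2"
  mkdir -p "$dir" || return 1
  # Create the file under a restrictive umask so it is never world-readable,
  # then set the mode explicitly in case the file already existed.
  ( umask 077 && cat > "$file" ) || return 1
  chmod 600 "$file" || return 1
  printf 'file:%s\n' "$file"
}
```

The printed ref can then be passed straight to a flag that appears in the skill, for example `--source-api-key-ref "$(printf '%s' "$KEY" | write_secret_ref "$PROJECT" <id>-<name>)"`, where `KEY` and `PROJECT` are illustrative variable names.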