Mirror of https://github.com/Kaelio/ktx.git, synced 2026-06-07 07:55:13 +02:00
docs(ktx skill): harden setup guidance from agent-driven demo run (#247)
Fold field-tested fixes into the ktx skill, verified against current CLI source:

- prefer file: secret refs over env: (env: re-resolves per-process and resolves empty in later ingest/mcp shells)
- pass --skip-agents on data-only setup runs; explain the trailing agent step's misleading exit 1 on otherwise-successful runs
- dbt ignores --source-warehouse-connection-id (maps by table name); required only for Metabase/Looker/LookML
- never go silent during slow setup/ingest: poll .ktx mtimes and post progress so a long run does not look stuck
- judge readiness from verdict, connections[].status, localStats.semanticLayer and wikiPages; perConnection under-reports
- add troubleshooting entries for the 'Run in a TTY' exit 1 and secrets that resolve empty only during ingest/mcp
parent 1959f493d6
commit 5faa16b32c
2 changed files with 120 additions and 25 deletions
@@ -20,11 +20,25 @@ a local **ktx** setup.
 `https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
 - Ask only for values you cannot infer: project directory, connection targets,
 credentials, account identifiers, and source selections.
-- Never ask the user to paste secrets when an `env:VAR_NAME` or `file:/path`
-reference would work. Pasting a literal URL is also safe — `ktx setup`
-auto-externalizes URLs into `.ktx/secrets/<id>-url` (see workflow step 2).
+- Prefer `file:/abs/path` secret refs over `env:VAR_NAME`. `env:` refs are
+re-resolved against the process environment on **every** `ktx` run, so a var
+exported only in the setup shell is gone when `ktx ingest` or `ktx mcp start`
+runs later — the secret silently resolves to empty and the connection fails.
+`file:` refs read from disk and survive across shells. The same caveat
+applies to `--*-api-key-env` flags: the named var must be present in every
+shell that runs `ktx`, including the `ktx mcp` daemon's environment.
+- A literal database URL is safe to pass — `ktx setup` auto-externalizes it
+into `.ktx/secrets/<id>-url` and rewrites `ktx.yaml` to a `file:` ref (see
+workflow step 2). Source credential refs are **not** auto-externalized: write
+the secret to a file under `.ktx/secrets/` (`chmod 600`) and pass a `file:`
+ref. Never ask the user to paste a secret when a `file:` or `env:` ref works.
 - Do not commit `.ktx/secrets/*`.
 - Print each command you run and its result.
+- Setup and ingest can run for many minutes (LLM-heavy source ingests take the
+longest), and from the outside a slow step looks identical to a stuck one.
+Don't go silent: say what's about to run and that it may take a while, then
+post brief progress/liveness updates while it runs (see step 4) so the user
+never has to wonder whether it stalled — otherwise they may kill it mid-run.
 - If a command fails, identify the cause and change something before retrying.
 
 ## Gather inputs once
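The difference between the two ref styles in the hunk above can be reproduced without `ktx` at all. The following is a minimal sketch, not ktx behavior captured from a real run: `DEMO_SECRET` and the temporary file are hypothetical names, and `env -i` merely stands in for a later shell (or daemon) that never saw the export.

```bash
# Illustration only: DEMO_SECRET and the temp file are hypothetical names.
secret_file="$(mktemp)"

# "Setup shell": the value exists both as an exported variable and on disk.
export DEMO_SECRET='s3cr3t'
printf '%s' "$DEMO_SECRET" > "$secret_file"
chmod 600 "$secret_file"

# "Later shell": env -i starts the child with an empty environment, standing in
# for a process launched where the variable was never exported.
from_env="$(env -i /bin/sh -c 'printf "%s" "${DEMO_SECRET-}"')"
from_file="$(env -i /bin/sh -c 'IFS= read -r v < "$1" || :; printf "%s" "$v"' sh "$secret_file")"

printf 'env-style lookup:  [%s]\n' "$from_env"    # empty brackets
printf 'file-style lookup: [%s]\n' "$from_file"   # the stored value

rm -f "$secret_file"
```

The environment lookup comes back empty while the file lookup still returns the value, which is the failure mode the new bullet describes for `env:` refs used outside the setup shell.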
@@ -68,9 +82,10 @@ Do not discover these inputs across multiple setup runs.
 --llm-backend claude-code \
 --embedding-backend sentence-transformers \
 --database <driver> --database-connection-id <id> \
---database-url '<raw-url | env:NAME | file:/abs/path>' \
+--database-url '<raw-url | file:/abs/path>' \
 --database-schema <schema> \
---skip-sources
+--skip-sources \
+--skip-agents
 ```
 
 - Configure one new database connection per setup invocation. For multiple
@@ -78,6 +93,13 @@ Do not discover these inputs across multiple setup runs.
 - Pasting a literal `--database-url` is safe: the CLI relocates the URL
 into `.ktx/secrets/<connection-id>-url` and rewrites `ktx.yaml` to a
 `file:` ref automatically.
+- `ktx setup` runs agent integration as its **last** step. In `--no-input`
+mode with neither `--target` nor `--skip-agents`, that step has no input,
+prints `Run in a TTY, or pass --target <target>.`, and the command exits
+non-zero **even though every database/LLM/embedding step succeeded**. Pass
+`--skip-agents` to defer agents to step 5 (as above), or `--target <agent>`
+to install them inline and exit 0. Judge data-layer success from
+`ktx status`, not from this exit code.
 
 3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
 project resumes its config. Use `--skip-llm`, `--skip-databases`,
@@ -99,6 +121,23 @@ Do not discover these inputs across multiple setup runs.
 together with `--no-input` (*Choose only one runtime install mode*);
 `ktx setup` accepts both. Use `--no-input` only for ingest.
 
+Ingest one connection at a time. It can run for many minutes with **no
+stdout** until it exits (LLM-heavy sources like Metabase are the slowest), so
+don't assume it hung, and don't pipe it through `tail`/`head` — that buffers
+all output to the end, so run it raw. Tell the user up front that the step is
+slow, then keep them posted instead of blocking silently: run the ingest in
+the background and poll for liveness every minute or so, reporting a one-line
+update each time (which connection, roughly how long it's been running, and
+that `.ktx` files are still changing) so a long run never looks stuck:
+
+```bash
+find <path>/.ktx/worktrees <path>/.ktx/ingest-transcripts -type f -mmin -3
+```
+
+On success, the `Ingest finished` summary table shows `done` in the
+`Source ingest` and `Memory update` columns with no `Failed sources:`
+section.
+
 5. **Install agent integration:**
 
 ```bash
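The background-and-poll pattern described in the hunk above could be wrapped roughly as follows. This is a sketch under stated assumptions: `count_recent` and `watch_job` are hypothetical helper names, and only the two watched directory names and the `find ... -mmin` test come from the skill text. The slow command is passed in as arguments, so no ingest flags are assumed here.

```bash
# Hypothetical helpers; not part of ktx.

# Count files under the watched .ktx directories modified in the last N minutes.
count_recent() {
  ktx_dir=$1
  minutes=${2:-3}
  # A directory that does not exist yet is skipped quietly (stderr discarded).
  find "$ktx_dir/worktrees" "$ktx_dir/ingest-transcripts" \
    -type f -mmin "-$minutes" 2>/dev/null | wc -l | tr -d ' '
}

# Run a slow command in the background and print one liveness line per interval.
# Usage: watch_job <ktx-dir> <interval-seconds> <command> [args...]
watch_job() {
  ktx_dir=$1
  interval=$2
  shift 2
  "$@" &                      # run raw, with no tail/head in the way
  job_pid=$!
  start=$(date +%s)
  while sleep "$interval" && kill -0 "$job_pid" 2>/dev/null; do
    elapsed=$(( $(date +%s) - start ))
    printf 'still running (%ss elapsed): %s file(s) changed recently\n' \
      "$elapsed" "$(count_recent "$ktx_dir")"
  done
  wait "$job_pid"             # hand back the command's own exit status
}
```

A call would look like `watch_job <path>/.ktx 60 <your ingest command>`. A recent-file count of 0 over several intervals is only a hint that the run may have stalled, not proof, since the skill text does not say how often those directories are written.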
@@ -117,29 +156,33 @@ Do not discover these inputs across multiple setup runs.
 Context sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) are added
 **one at a time** — `--source` is not repeatable, so run `ktx setup` once per
 source. Source setup is resumable against an existing project: pass
-`--skip-databases --skip-llm --skip-embeddings` so only the source is
-configured. Map warehouse-backed sources (dbt, Metabase, Looker) to an existing
-database connection with `--source-warehouse-connection-id <db-connection-id>`.
-Prefer `env:VAR` / `file:/abs/path` refs for keys and tokens over literals.
+`--skip-databases --skip-llm --skip-embeddings --skip-agents` so only the source
+is configured (the trailing agent step otherwise fails the run — see install
+step 2). Map Metabase, Looker, and LookML to an existing database connection
+with `--source-warehouse-connection-id <db-connection-id>` (required for those).
+**dbt ignores `--source-warehouse-connection-id`** — it maps to the warehouse by
+table name — so omit it for dbt. Use `file:/abs/path` refs for keys and tokens
+(see the secrets rule above); `env:` refs must be exported in every later `ktx`
+shell.
 
 ```bash
-# dbt — pick exactly one of --source-path (local) or --source-git-url (remote)
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+# dbt — pick exactly one of --source-path (local) or --source-git-url (remote).
+# No --source-warehouse-connection-id: dbt maps to the warehouse by table name.
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source dbt --source-connection-id <id> \
---source-git-url <url> --source-branch <branch> \
---source-warehouse-connection-id <db-connection-id>
+--source-git-url <url> --source-branch <branch>
 
 # Metabase
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source metabase --source-connection-id <id> \
---source-url <url> --source-api-key-ref env:METABASE_API_KEY \
+--source-url <url> --source-api-key-ref file:/abs/path/metabase-api-key \
 --source-warehouse-connection-id <db-connection-id> \
 --metabase-database-id <metabase-db-id>
 
 # Notion
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source notion --source-connection-id <id> \
---source-auth-token-ref env:NOTION_TOKEN \
+--source-auth-token-ref file:/abs/path/notion-token \
 --notion-crawl-mode selected_roots --notion-root-page-id <page-id>
 ```
 
@@ -171,25 +214,36 @@ After setup, run:
 ```bash
 ktx connection test <connection-id>
 ktx status --json --no-input
+ktx sl --output plain # lists compiled semantic sources; `ktx sl` has no --no-input
 ```
 
 **Judge readiness from `ktx status --json` fields, not the exit code.**
-`ktx status` exits 1 whenever the LLM is `none`, even when embeddings and
-every database connection are healthy. Treat success as:
+`ktx status` exits 1 whenever the LLM is `none` (`verdict: "blocked"`), even
+when embeddings and every database connection are healthy. Treat success as:
 
 - `verdict: "ready"` at the top of the JSON, and
-- every `connections[].status === "ok"`, and
-- every `ktx connection test <id>` exited 0.
+- every `connections[].status === "ok"` (other levels: `warn`, `fail`,
+`skipped`), and
+- every `ktx connection test <id>` exited 0, and
+- for each ingested source, `localStats.semanticLayer[].sourceCount > 0` and
+`localStats.wikiPages[].count > 0` — these confirm the source actually
+produced context. Do **not** rely on `localStats.ingest.perConnection` to
+confirm source ingests: it reflects only completed warehouse ingest reports
+and under-reports (often lists just the warehouse connection).
 
-A non-zero exit with only the LLM unconfigured is still a usable context
-layer — report it as "ready, LLM optional" rather than retrying setup.
+If the LLM is intentionally left unconfigured, `verdict` is `blocked` and the
+exit is non-zero by design — that is still a usable context layer, so report it
+as "ready, LLM optional" and judge the data layer by the connection and
+`localStats` fields above rather than retrying setup.
 
 ## Troubleshooting
 
 For known failure signatures (`invalid ELF header`,
 `Native CLI binary for <plat> not found`, `Missing Anthropic API key`,
-`claude-code` probe failure, `KTX cannot work without a database` on resume),
-see [troubleshooting.md](troubleshooting.md).
+`claude-code` probe failure, `KTX cannot work without a database` on resume,
+`Run in a TTY, or pass --target <target>.` with a misleading exit 1, and a
+secret that resolves empty only during `ktx ingest`/`ktx mcp`), see
+[troubleshooting.md](troubleshooting.md).
 
 ## Final report
 
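The field-based readiness rule in the hunk above can be turned into a small check. This is a sketch, assuming `jq` is installed and that `connections`, `localStats.semanticLayer` and `localStats.wikiPages` are arrays of objects carrying the named fields; the skill text gives the field paths but not the full JSON shape. `ktx_ready` is a hypothetical helper name, and the separate `ktx connection test <id>` exit codes are not covered by it.

```bash
# Hypothetical helper: reads `ktx status --json` output on stdin and returns 0
# only when the JSON-based checks hold. Requires jq.
# Assumption: the three collections are arrays of objects with these fields.
ktx_ready() {
  jq -e '
    (.verdict == "ready")
    and ([.connections[]?.status] | length > 0 and all(. == "ok"))
    and ([.localStats.semanticLayer[]?.sourceCount] | length > 0 and all(. > 0))
    and ([.localStats.wikiPages[]?.count] | length > 0 and all(. > 0))
  ' >/dev/null
}
```

Used as `ktx status --json --no-input | ktx_ready && echo ready`, the pipeline's status comes from `ktx_ready` rather than from `ktx status` itself (as long as `pipefail` is off), which matches the advice not to trust that exit code. For an intentionally LLM-less setup the `verdict` clause fails by design, so that case still needs the manual "ready, LLM optional" judgment.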
@@ -77,3 +77,44 @@ ktx setup --no-input \
 --database <driver> --database-connection-id <id> \
 --llm-backend claude-code
 ```
+
+## `Run in a TTY, or pass --target <target>.` and `ktx setup` exits 1
+
+`ktx setup` runs agent integration as its last step. In `--no-input` mode with
+neither `--target` nor `--skip-agents`, that step has no input and the whole
+command exits non-zero — even when every database, LLM, and embedding step
+already succeeded. The exit code is misleading here.
+
+Fix — pass one of these to the data-only setup runs:
+
+```bash
+# Defer agents; install them later with `ktx setup --agents --target <agent>`:
+ktx setup --no-input --yes ...other flags... --skip-agents
+
+# Or install agents inline and exit 0:
+ktx setup --no-input --yes ...other flags... --target claude-code
+```
+
+Either way, confirm the data work landed with `ktx status --json` rather than
+trusting the exit code.
+
+## A secret resolves empty only during `ktx ingest` or `ktx mcp`
+
+Setup succeeded, but a later `ktx ingest`/`ktx mcp start` fails to connect or
+authenticate. The connection used an `env:VAR_NAME` ref (or a `--*-api-key-env`
+flag) and the variable was exported only in the setup shell. `env:` refs are
+re-resolved against the process environment on every `ktx` run, so they resolve
+to empty wherever the var is absent — including the `ktx mcp` daemon.
+
+Fix — write the secret to a file and use a `file:` ref, which reads from disk
+and survives across shells:
+
+```bash
+mkdir -p "$PROJECT/.ktx/secrets"
+printf '%s\n' '<secret>' > "$PROJECT/.ktx/secrets/<id>-<name>"
+chmod 600 "$PROJECT/.ktx/secrets/"*
+# then pass: --source-api-key-ref file:$PROJECT/.ktx/secrets/<id>-<name>
+```
+
+Alternatively, ensure the var is exported in every shell that runs `ktx`,
+including the environment of the `ktx mcp` daemon.
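The file-based fix in the troubleshooting entry above creates the file first and tightens its mode afterwards. A variant that creates it with restrictive permissions from the start, and keeps the secret out of the command line, might look like the sketch below. `write_secret` is a hypothetical helper, not a ktx command; it handles single-line secrets only and keeps the trailing newline of the original recipe.

```bash
# Hypothetical helper: write_secret <absolute-project-dir> <file-name>
# Reads a single-line secret from stdin (so it never appears in argv),
# stores it under <project>/.ktx/secrets/, and prints the matching file: ref.
write_secret() {
  dir=$1/.ktx/secrets
  name=$2
  (
    umask 077                          # new files 0600, new directories 0700
    mkdir -p "$dir" || exit 1
    IFS= read -r secret || [ -n "$secret" ] || exit 1   # reject empty input
    printf '%s\n' "$secret" > "$dir/$name"
  ) || return 1
  chmod 600 "$dir/$name"               # also tighten a file that already existed
  printf 'file:%s\n' "$dir/$name"
}
```

For example, `write_secret "$PROJECT" metabase-api-key < /path/to/key.txt` prints a `file:` ref that can be passed to `--source-api-key-ref`. The project directory should be absolute, since the skill's examples use `file:/abs/path` refs.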