mirror of https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
docs(ktx skill): harden setup guidance from agent-driven demo run (#247)
Fold field-tested fixes into the ktx skill, verified against current CLI source:
- prefer file: secret refs over env: (env: re-resolves per-process and resolves empty in later ingest/mcp shells)
- pass --skip-agents on data-only setup runs; explain the trailing agent step's misleading exit 1 on otherwise-successful runs
- dbt ignores --source-warehouse-connection-id (maps by table name); required only for Metabase/Looker/LookML
- never go silent during slow setup/ingest: poll .ktx mtimes and post progress so a long run does not look stuck
- judge readiness from verdict, connections[].status, localStats.semanticLayer and wikiPages; perConnection under-reports
- add troubleshooting entries for the 'Run in a TTY' exit 1 and secrets that resolve empty only during ingest/mcp
parent 1959f493d6
commit 5faa16b32c
2 changed files with 120 additions and 25 deletions
@@ -20,11 +20,25 @@ a local **ktx** setup.
 `https://docs.kaelio.com/ktx/` or this skill — not against `--help` output.
 - Ask only for values you cannot infer: project directory, connection targets,
 credentials, account identifiers, and source selections.
-- Never ask the user to paste secrets when an `env:VAR_NAME` or `file:/path`
-reference would work. Pasting a literal URL is also safe — `ktx setup`
-auto-externalizes URLs into `.ktx/secrets/<id>-url` (see workflow step 2).
+- Prefer `file:/abs/path` secret refs over `env:VAR_NAME`. `env:` refs are
+re-resolved against the process environment on **every** `ktx` run, so a var
+exported only in the setup shell is gone when `ktx ingest` or `ktx mcp start`
+runs later — the secret silently resolves to empty and the connection fails.
+`file:` refs read from disk and survive across shells. The same caveat
+applies to `--*-api-key-env` flags: the named var must be present in every
+shell that runs `ktx`, including the `ktx mcp` daemon's environment.
+- A literal database URL is safe to pass — `ktx setup` auto-externalizes it
+into `.ktx/secrets/<id>-url` and rewrites `ktx.yaml` to a `file:` ref (see
+workflow step 2). Source credential refs are **not** auto-externalized: write
+the secret to a file under `.ktx/secrets/` (`chmod 600`) and pass a `file:`
+ref. Never ask the user to paste a secret when a `file:` or `env:` ref works.
 - Do not commit `.ktx/secrets/*`.
 - Print each command you run and its result.
+- Setup and ingest can run for many minutes (LLM-heavy source ingests take the
+longest), and from the outside a slow step looks identical to a stuck one.
+Don't go silent: say what's about to run and that it may take a while, then
+post brief progress/liveness updates while it runs (see step 4) so the user
+never has to wonder whether it stalled — otherwise they may kill it mid-run.
 - If a command fails, identify the cause and change something before retrying.
 
 ## Gather inputs once
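The `env:` versus `file:` caveat in the hunk above can be reproduced with plain shell, without invoking `ktx`. This is a minimal sketch under stated assumptions: `DEMO_SECRET`, the placeholder value, and the temp file are hypothetical, and `env -i /bin/sh -c` stands in for a later shell that never saw the export.

```bash
# Sketch only: ktx is not invoked; env -i mimics a later shell with a clean
# environment. DEMO_SECRET and the temp file are hypothetical names.

secret_file=$(mktemp)
printf '%s\n' 's3cr3t' > "$secret_file"   # placeholder value, not a real secret
chmod 600 "$secret_file"

# "Setup shell": the variable is exported here and nowhere else.
export DEMO_SECRET='s3cr3t'

# A later process with a clean environment sees nothing (the env: case).
from_env=$(env -i /bin/sh -c 'printf "%s" "${DEMO_SECRET:-}"')

# The same process can still read the value from disk (the file: case).
from_file=$(env -i /bin/sh -c 'IFS= read -r line < "$1"; printf "%s" "$line"' sh "$secret_file")

printf 'env: value in a later shell  -> [%s]\n' "$from_env"
printf 'file: value in a later shell -> [%s]\n' "$from_file"

rm -f "$secret_file"
```

The first bracket pair comes out empty and the second carries the value, which is the failure the bullet describes for a later `ktx ingest` or `ktx mcp start`.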
@@ -68,9 +82,10 @@ Do not discover these inputs across multiple setup runs.
 --llm-backend claude-code \
 --embedding-backend sentence-transformers \
 --database <driver> --database-connection-id <id> \
---database-url '<raw-url | env:NAME | file:/abs/path>' \
+--database-url '<raw-url | file:/abs/path>' \
 --database-schema <schema> \
---skip-sources
+--skip-sources \
+--skip-agents
 ```
 
 - Configure one new database connection per setup invocation. For multiple
@@ -78,6 +93,13 @@ Do not discover these inputs across multiple setup runs.
 - Pasting a literal `--database-url` is safe: the CLI relocates the URL
 into `.ktx/secrets/<connection-id>-url` and rewrites `ktx.yaml` to a
 `file:` ref automatically.
+- `ktx setup` runs agent integration as its **last** step. In `--no-input`
+mode with neither `--target` nor `--skip-agents`, that step has no input,
+prints `Run in a TTY, or pass --target <target>.`, and the command exits
+non-zero **even though every database/LLM/embedding step succeeded**. Pass
+`--skip-agents` to defer agents to step 5 (as above), or `--target <agent>`
+to install them inline and exit 0. Judge data-layer success from
+`ktx status`, not from this exit code.
 
 3. **Resumability and `--skip-*`.** Re-running `ktx setup` against an existing
 project resumes its config. Use `--skip-llm`, `--skip-databases`,
@@ -99,6 +121,23 @@ Do not discover these inputs across multiple setup runs.
 together with `--no-input` (*Choose only one runtime install mode*);
 `ktx setup` accepts both. Use `--no-input` only for ingest.
 
+Ingest one connection at a time. It can run for many minutes with **no
+stdout** until it exits (LLM-heavy sources like Metabase are the slowest), so
+don't assume it hung, and don't pipe it through `tail`/`head` — that buffers
+all output to the end, so run it raw. Tell the user up front that the step is
+slow, then keep them posted instead of blocking silently: run the ingest in
+the background and poll for liveness every minute or so, reporting a one-line
+update each time (which connection, roughly how long it's been running, and
+that `.ktx` files are still changing) so a long run never looks stuck:
+
+```bash
+find <path>/.ktx/worktrees <path>/.ktx/ingest-transcripts -type f -mmin -3
+```
+
+On success, the `Ingest finished` summary table shows `done` in the
+`Source ingest` and `Memory update` columns with no `Failed sources:`
+section.
+
 5. **Install agent integration:**
 
 ```bash
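The background-and-poll pattern added in this hunk could be wrapped as below. This is a rough sketch, not part of the skill: `recent_ktx_files` and `watch_ingest` are hypothetical helper names, and the only things taken from the text are the two `.ktx` directories and the `-mmin -3` window.

```bash
# Sketch only: both helpers are hypothetical, not ktx commands.

# List files under the two .ktx directories that changed in the last 3 minutes.
# A directory that does not exist yet is skipped quietly.
recent_ktx_files() {
  project=$1
  for dir in "$project/.ktx/worktrees" "$project/.ktx/ingest-transcripts"; do
    if [ -d "$dir" ]; then
      find "$dir" -type f -mmin -3
    fi
  done
}

# While the background job with PID $2 is alive, print one status line per
# interval (default 60 seconds): elapsed time and count of recently changed files.
watch_ingest() {
  project=$1
  pid=$2
  interval=${3:-60}
  start=$(date +%s)
  while kill -0 "$pid" 2>/dev/null; do
    changed=$(recent_ktx_files "$project" | wc -l | tr -d ' ')
    elapsed=$(( $(date +%s) - start ))
    printf 'still running: %ss elapsed, %s file(s) changed in the last 3 min\n' \
      "$elapsed" "$changed"
    sleep "$interval"
  done
}
```

A possible use is to start the ingest command for one connection with a trailing `&`, call `watch_ingest <path> "$!"`, and then `wait` on the same PID to collect the real exit status. A count that stays at zero for several intervals is a hint, not proof, that the run has stalled.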
@@ -117,29 +156,33 @@ Do not discover these inputs across multiple setup runs.
 Context sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) are added
 **one at a time** — `--source` is not repeatable, so run `ktx setup` once per
 source. Source setup is resumable against an existing project: pass
-`--skip-databases --skip-llm --skip-embeddings` so only the source is
-configured. Map warehouse-backed sources (dbt, Metabase, Looker) to an existing
-database connection with `--source-warehouse-connection-id <db-connection-id>`.
-Prefer `env:VAR` / `file:/abs/path` refs for keys and tokens over literals.
+`--skip-databases --skip-llm --skip-embeddings --skip-agents` so only the source
+is configured (the trailing agent step otherwise fails the run — see install
+step 2). Map Metabase, Looker, and LookML to an existing database connection
+with `--source-warehouse-connection-id <db-connection-id>` (required for those).
+**dbt ignores `--source-warehouse-connection-id`** — it maps to the warehouse by
+table name — so omit it for dbt. Use `file:/abs/path` refs for keys and tokens
+(see the secrets rule above); `env:` refs must be exported in every later `ktx`
+shell.
 
 ```bash
-# dbt — pick exactly one of --source-path (local) or --source-git-url (remote)
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+# dbt — pick exactly one of --source-path (local) or --source-git-url (remote).
+# No --source-warehouse-connection-id: dbt maps to the warehouse by table name.
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source dbt --source-connection-id <id> \
---source-git-url <url> --source-branch <branch> \
---source-warehouse-connection-id <db-connection-id>
+--source-git-url <url> --source-branch <branch>
 
 # Metabase
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source metabase --source-connection-id <id> \
---source-url <url> --source-api-key-ref env:METABASE_API_KEY \
+--source-url <url> --source-api-key-ref file:/abs/path/metabase-api-key \
 --source-warehouse-connection-id <db-connection-id> \
 --metabase-database-id <metabase-db-id>
 
 # Notion
-ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings \
+ktx setup --no-input --yes --skip-databases --skip-llm --skip-embeddings --skip-agents \
 --source notion --source-connection-id <id> \
---source-auth-token-ref env:NOTION_TOKEN \
+--source-auth-token-ref file:/abs/path/notion-token \
 --notion-crawl-mode selected_roots --notion-root-page-id <page-id>
 ```
 
@@ -171,25 +214,36 @@ After setup, run:
 ```bash
 ktx connection test <connection-id>
 ktx status --json --no-input
+ktx sl --output plain # lists compiled semantic sources; `ktx sl` has no --no-input
 ```
 
 **Judge readiness from `ktx status --json` fields, not the exit code.**
-`ktx status` exits 1 whenever the LLM is `none`, even when embeddings and
-every database connection are healthy. Treat success as:
+`ktx status` exits 1 whenever the LLM is `none` (`verdict: "blocked"`), even
+when embeddings and every database connection are healthy. Treat success as:
 
 - `verdict: "ready"` at the top of the JSON, and
-- every `connections[].status === "ok"`, and
-- every `ktx connection test <id>` exited 0.
+- every `connections[].status === "ok"` (other levels: `warn`, `fail`,
+`skipped`), and
+- every `ktx connection test <id>` exited 0, and
+- for each ingested source, `localStats.semanticLayer[].sourceCount > 0` and
+`localStats.wikiPages[].count > 0` — these confirm the source actually
+produced context. Do **not** rely on `localStats.ingest.perConnection` to
+confirm source ingests: it reflects only completed warehouse ingest reports
+and under-reports (often lists just the warehouse connection).
 
-A non-zero exit with only the LLM unconfigured is still a usable context
-layer — report it as "ready, LLM optional" rather than retrying setup.
+If the LLM is intentionally left unconfigured, `verdict` is `blocked` and the
+exit is non-zero by design — that is still a usable context layer, so report it
+as "ready, LLM optional" and judge the data layer by the connection and
+`localStats` fields above rather than retrying setup.
 
 ## Troubleshooting
 
 For known failure signatures (`invalid ELF header`,
 `Native CLI binary for <plat> not found`, `Missing Anthropic API key`,
-`claude-code` probe failure, `KTX cannot work without a database` on resume),
-see [troubleshooting.md](troubleshooting.md).
+`claude-code` probe failure, `KTX cannot work without a database` on resume,
+`Run in a TTY, or pass --target <target>.` with a misleading exit 1, and a
+secret that resolves empty only during `ktx ingest`/`ktx mcp`), see
+[troubleshooting.md](troubleshooting.md).
 
 ## Final report
 
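The field-based readiness rule in this hunk could be checked mechanically along these lines. This is a sketch under stated assumptions: `ktx_data_layer_ready` is a hypothetical helper, `jq` is assumed to be installed, and the JSON shape is inferred only from the field paths quoted in the text (`connections[].status`, `localStats.semanticLayer[].sourceCount`, `localStats.wikiPages[].count`). It also requires every listed entry to be non-zero, which may be stricter than "each ingested source".

```bash
# Sketch only: hypothetical helper; assumes jq is installed and that the status
# JSON has the field shapes named in the text above.

# Reads `ktx status --json` output on stdin. Returns 0 only when every
# connection is "ok" and both localStats lists are non-empty with non-zero
# counts. `verdict` is deliberately ignored so an LLM-less project can pass.
ktx_data_layer_ready() {
  jq -e '
    ((.connections // []) | length > 0 and all(.[]; .status == "ok"))
    and ((.localStats.semanticLayer // []) | length > 0 and all(.[]; .sourceCount > 0))
    and ((.localStats.wikiPages // []) | length > 0 and all(.[]; .count > 0))
  ' >/dev/null
}
```

Piping `ktx status --json --no-input` into the helper uses the helper's exit status rather than the one from `ktx status` (as long as `pipefail` is off), which matches the advice not to trust that exit code when the LLM is `none`.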
Second changed file (path not shown in this capture):
@@ -77,3 +77,44 @@ ktx setup --no-input \
 --database <driver> --database-connection-id <id> \
 --llm-backend claude-code
 ```
+
+## `Run in a TTY, or pass --target <target>.` and `ktx setup` exits 1
+
+`ktx setup` runs agent integration as its last step. In `--no-input` mode with
+neither `--target` nor `--skip-agents`, that step has no input and the whole
+command exits non-zero — even when every database, LLM, and embedding step
+already succeeded. The exit code is misleading here.
+
+Fix — pass one of these to the data-only setup runs:
+
+```bash
+# Defer agents; install them later with `ktx setup --agents --target <agent>`:
+ktx setup --no-input --yes ...other flags... --skip-agents
+
+# Or install agents inline and exit 0:
+ktx setup --no-input --yes ...other flags... --target claude-code
+```
+
+Either way, confirm the data work landed with `ktx status --json` rather than
+trusting the exit code.
+
+## A secret resolves empty only during `ktx ingest` or `ktx mcp`
+
+Setup succeeded, but a later `ktx ingest`/`ktx mcp start` fails to connect or
+authenticate. The connection used an `env:VAR_NAME` ref (or a `--*-api-key-env`
+flag) and the variable was exported only in the setup shell. `env:` refs are
+re-resolved against the process environment on every `ktx` run, so they resolve
+to empty wherever the var is absent — including the `ktx mcp` daemon.
+
+Fix — write the secret to a file and use a `file:` ref, which reads from disk
+and survives across shells:
+
+```bash
+mkdir -p "$PROJECT/.ktx/secrets"
+printf '%s\n' '<secret>' > "$PROJECT/.ktx/secrets/<id>-<name>"
+chmod 600 "$PROJECT/.ktx/secrets/"*
+# then pass: --source-api-key-ref file:$PROJECT/.ktx/secrets/<id>-<name>
+```
+
+Alternatively, ensure the var is exported in every shell that runs `ktx`,
+including the environment of the `ktx mcp` daemon.
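A variation on the fix above reads the secret from stdin, so it never appears in argv or shell history, and creates the file owner-only from the start. This is a sketch only: `write_secret_ref` is a hypothetical helper, not a `ktx` command, and it assumes the project directory already exists.

```bash
# Sketch only: write_secret_ref is a hypothetical helper, not part of ktx.

# Usage: write_secret_ref <project-dir> <file-name>   (secret arrives on stdin)
# Stores the secret under <project-dir>/.ktx/secrets/ and prints an absolute
# file: ref suitable for flags such as --source-api-key-ref.
write_secret_ref() {
  root=$(cd "$1" && pwd) || return 1     # resolve to an absolute path
  dir="$root/.ktx/secrets"
  (
    umask 077                            # new dir and file are owner-only at creation
    mkdir -p "$dir" && cat > "$dir/$2"
  ) || return 1
  chmod 600 "$dir/$2"                    # tighten perms if the file already existed
  printf 'file:%s\n' "$dir/$2"
}
```

For example, `printf '%s\n' "$token" | write_secret_ref "$PROJECT" metabase-api-key` prints a ref whose path part can be pasted after the flag; the file name here is illustrative.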