* feat(cli): block context build when a required connection fails its live test
A context build can take several minutes, so a connection that is
unreachable or misconfigured should stop the build up front instead of
failing partway through. Before the build starts, run a live connection
test for every primary- and context-source connection the build depends
on.
Each test's output is captured in a discarded buffer so raw error text
(and database paths) never reach the user; failures are surfaced only by
connection id and connector type, with a pointer to `ktx connection test
<id>` for the underlying error.
- Interactive setup lets the user fix the connection and retry without
restarting, re-resolving targets so an added/removed/reconfigured
connection is honored.
- `--no-input` exits non-zero and writes a failed context state with a
failureReason, so scripts stop early and setup never reads as ready.
Extract the buffered command IO helper out of setup-databases into
src/io/buffered-command-io.ts so both call sites share one implementation.
* feat(cli): use recovery primitive for database setup
* feat(cli): use recovery primitive for source setup
* docs: document setup connection recovery
* fix(cli): close database recovery gaps
* fix(cli): target failing project in gate hint and preserve missing-input
Address two review findings on the connection-recovery work:
- The connection-gate failure hint emitted `ktx connection test <id>` with no
--project-dir, so a setup run started with `--project-dir ./analytics` pointed
users at cwd/KTX_PROJECT_DIR instead of the project that just failed. Emit the
resolved project dir, matching the contextBuildCommands convention.
- The non-interactive database configure path returned `cancelled`, which the
recovery primitive collapses to `failed`. Sibling paths still report
`missing-input` for absent flags, so incomplete-flag runs were
indistinguishable from real connection failures. The database wrapper now
tracks the configure missing-input signal and restores the `missing-input`
step status; the shared primitive keeps its four outcomes.
* fix(ingest): recover textual-conflict gate failures; fix query-history adapter
Two latent gaps in the isolated-diff local-ingest pipeline that can abort an
otherwise-successful ingest:
- Metabase: when a work-unit patch hit both a textual conflict and a post-merge
dangling sl_ref, the after-textual-resolution branch returned a hard
semantic_conflict and rolled back the whole job. It now runs the same
repairGateFailure recovery the clean-apply branch already uses (re-validate,
then commit the union of resolved + repaired paths), reaching parity.
- Query history: the historic-sql adapter was registered only when ktx.yaml had
context.queryHistory.enabled=true, so `--query-history` threw "Adapter not
available for local ingest". Registration now resolves the dialect from driver
capability, since the explicit --query-history request is itself the opt-in;
the config-gated helper is unchanged for status/setup/probes.
Adds the previously-missing tests for both paths.
* chore: sync uv.lock to 0.8.0 (regenerated with pinned uv 0.11.11)
* fix(ingest): drop ktx's own scan probes and dedup tables in query history
Query history (historic-sql) mined two kinds of noise back into context:
- ktx's own warehouse scan emits relationship- and column-profiling probes
(the relationship_profile_values aggregation and the child_values/parent_values
FK-overlap CTEs) into pg_stat_statements. shouldDropBySql now filters these
ktx-owned, dialect-stable signatures so ktx introspection is not ingested as
usage history.
- The same physical table appears both bare (accounts, via search_path) and
schema-qualified (orbit_raw.accounts), producing duplicate per-table work
units. canonicalizeTableIdentifiers collapses a bare name into its unique
qualified form before work-unit keying; ambiguous names are left untouched.
On the orbit demo this removes ~35% of sampled query templates (ktx self-probes)
and ~45 duplicate per-table work units.
* docs(agents): add Design Reasoning Defaults section
Re-running setup was the dominant action for installs that completed setup but never ingested. Classify completion (incomplete | needs-context | needs-agents | ready) and drive one obvious next action per state: route a config-complete project straight to the build, point unbuilt-context users at `ktx ingest` instead of re-running setup or dropping to a bare shell, and confirm readiness for fully-set-up projects rather than reopening the edit menu.
Three reliability gaps surfaced while auditing why PostHog numbers were
untrustworthy:
1. Interrupted commands lost their events. capture() is fire-and-forget and the
only flush guarantee lived in a finally block, which SIGINT/SIGTERM skip — so
Ctrl-C'ing a long ingest or an MCP client killing 'ktx mcp stdio' dropped the
command event and any queued events. Add SIGINT/SIGTERM handlers (real-process
entry only; never under test/programmatic io) that mark the active command
span aborted, emit it, drain the emitter, then exit. Idempotent with the
normal finally path via the single-consume command span.
2. Headless-first installs were invisible. loadTelemetryIdentity refused to mint
an installId unless stdout was a TTY, so a machine whose first run was an
IDE-launched MCP server or a script emitted nothing, ever. Mint on first run
regardless of surface (still honoring CI/DO_NOT_TRACK/KTX_TELEMETRY_DISABLED),
writing the one-time notice to stderr — safe under the MCP stdio protocol,
which reserves stdout. Drop the now-unused stdoutIsTTY option.
3. No guard against silent emit regressions (the 0.7.0 scan_completed blackout).
Add tests: the shared executePublicIngestTarget chokepoint emits exactly one
ingest_completed on success and on the preflight-failure branch, and a
database target invokes the scan that emits scan_completed; plus coverage for
the aborted-flush helper.
Identity is unchanged otherwise: every event still attributes to the installId
in ~/.ktx/telemetry.json. No event/field changes, so Node<->Python schema parity
is untouched. Docs updated to reflect first-run-on-any-surface activation.
emitIngestCompleted was called only in runKtxPublicIngest's plain/json loop,
so the foreground 'ktx ingest' view and all of 'ktx setup' — which delegate to
runContextBuild -> executePublicIngestTarget — never emitted the event. That
left ingest_completed near-useless for measuring ingestion.
Move the emit into executePublicIngestTarget, the single per-target chokepoint
every entrypoint funnels through: a thin wrapper now captures timing, runs the
existing steps (extracted to runIngestTargetSteps), and emits exactly once. The
telemetry echo targets deps.runtimeIo (the real user stream) so a capture
buffer used for step output doesn't swallow it. Thread project through the
context-build call site. No schema/field changes, so Node<->Python telemetry
parity is unaffected.
Add tests: the shared chokepoint emits exactly one ingest_completed for any
caller, and a multi-target run emits one per target with no double-emit.
* feat: add codex sdk runner foundation
* feat: parse codex runtime events
* feat: expose codex runtime mcp tools
* feat: add codex llm runtime
* feat: wire codex llm backend
* test: avoid Array.fromAsync in codex runner test
* docs: document codex llm backend
* fix: tighten codex runtime config ownership
* fix: use codex sdk env and thread options
* fix: parse codex sdk event shapes
* test: add codex backend live smoke
* docs: clarify codex backend isolation
* fix: drive codex loop metrics from mcp events
* fix: enforce codex local step budget
* docs: disclose codex isolation limits
* fix: count all codex agent steps and stream step callbacks live
The agent-loop step budget only counted completed mcp_tool_call items, so
built-in command_execution steps (which the public Codex SDK/CLI surface can
still expose) never decremented the budget, letting ingest/reconciliation run
past stepBudget until Codex stopped on its own. onStepFinish was also replayed
only after the whole stream drained, so live work_unit_step / reconciliation
progress appeared stuck until the Codex process exited.
collectEvents is now the single live step accumulator: it counts every
completed agent-action item via a shared isCompletedAgentStep predicate
(command_execution, mcp_tool_call, file_change, web_search), fires onStepFinish
as each step completes, and enforces the budget on that broader count. A
no-tool turn still counts as one step. toolFailures stays MCP-specific, since a
non-zero command exit is normal agent exploration, not a loop failure.
* test: align ingest llm-guard assertions with codex backend
The skip-llm ingest guard message now lists codex as a valid backend and
mentions a Claude Code/Codex session plus a codex setup hint, but this slow
suite test still asserted the pre-codex wording. Update it to match the
production message (already covered by the local-bundle-runtime unit test) and
add the codex setup-line assertion.
* fix: treat codex error:null tool calls as success
The Codex SDK serializes error: null on successful mcp_tool_call items, so
the failure check (item.error !== undefined) flagged every successful tool
call as failed with the empty-payload default "Codex turn failed". This
killed every ingest work unit under the codex backend before it could
produce a patch.
Key on status === 'failed' (authoritative, always set) and only treat a
populated error object as a failure. Add a regression test built from a
verbatim real-SDK event capture.
* fix: default codex backend to gpt-5.5 and report real probe errors
The previous default gpt-5.3-codex is an API-key-only model that the OpenAI
API rejects under ChatGPT-account (subscription) auth, so codex status/setup
failed with a misleading "authentication is not usable" message even though
auth was fine.
- Default codex model is now gpt-5.5 (works on both subscription and API-key
auth); the curated setup picker offers gpt-5.5 / gpt-5.4 / gpt-5.4-mini and
keeps free-form entry for account-specific ids (e.g. gpt-5.3-codex-spark).
- runCodexAuthProbe now distinguishes "model not available" from an auth
failure and surfaces the real API error: collectEvents retains stream
events when the SDK throws on a non-zero exit, and the API error JSON
envelope is unwrapped to its human-readable message.
- The Codex isolation warning now renders inside the clack setup frame.
- Docs updated to gpt-5.5 with a note that *-codex ids require API-key auth.
* fix: require llm.models.default in status and match codex probe remediation
Status reported a project ready when a non-none LLM backend was configured
without llm.models.default, but the runtime (resolveModelSlots) hard-requires
it, so ingest/scan/memory threw after `ktx status` said the project was usable.
buildLlmStatus now fails for any non-none backend missing models.default and no
longer invents a fallback model for claude-code/codex.
Codex probe failures now carry a category-matched fix: a model-access failure
steers the user at llm.models.default instead of the auth/install remediation.
runCodexAuthProbe returns the fix and status consumes it; the message stays
self-sufficient so setup output is unchanged.
Docs: README now lists the codex backend and local Codex auth; ktx-setup.mdx
states --llm-model only accepts codex/default or gpt-*/codex-* ids.
Repaired four doctor fixtures that configured a backend without models.default
(the now-correctly-blocked config) and added coverage for the new behavior.
The GitHub repo was renamed back from Kaelio/ktx-ai-data-agents-context to Kaelio/ktx, reverting the URL changes from #250 across package metadata, CI (codecov + star-history slugs), issue/security templates, the release runbook, and docs/install commands.
Also removes the rename-resilience machinery #250 added: semantic-release now reads the repository URL straight from package.json (Kaelio/ktx) again, so the repositoryUrl() derivation in scripts/semantic-release-config.cjs, its tests, and the rename note in docs/release.md are no longer needed.
* feat(cli): share public ingest progress adapter
* feat(cli): stream plain public ingest progress
* test(cli): update plain ingest progress assertions
* chore(cli): satisfy plain ingest progress checks
* fix(artifacts): expect plain ingest stderr progress in installed-CLI smoke
* ci(coverage): make Codecov upload non-fatal and fix repo slug
The Coverage job failed because the Codecov upload returned
'Repository not found' while fail_ci_if_error was true, turning a
Codecov-side issue into a hard CI failure even though all tests pass.
- Set fail_ci_if_error: false on both uploads so Codecov outages or an
unlinked repo no longer break CI (upload stays best-effort).
- Correct the stale slug Kaelio/ktx -> Kaelio/ktx-ai-data-agents-context
to match the actual GitHub repo (aligns with main).
* fix(cli): isolate query-history failure capture from scan output
The plain public-ingest progress path passes one captured IO as the
target-level `io`. With progress deps set, both the schema scan and the
query-history ingest resolved their capture to that same shared buffer,
so a non-actionable query-history failure surfaced leftover scan report
text (e.g. "Mode: enriched") as the skipped-facet detail instead of the
real query-history message.
Give the query-history ingest a phase-local capture while preserving the
flow-to-io branch the foreground context-build view relies on.
---------
Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
* fix(release): point repository URLs at renamed GitHub repo
The GitHub repo was renamed from Kaelio/ktx to
Kaelio/ktx-ai-data-agents-context. semantic-release reads repositoryUrl
from package.json's repository field and the @semantic-release/github
plugin failed verifyConditions with EMISMATCHGITHUBURL because it no
longer matched the live clone URL.
Update every Kaelio/ktx reference to the renamed repo: package metadata
(root + CLI repository/bugs/homepage), the codecov upload slugs and
star-history slug in CI, the issue-template and security-advisory links,
the release runbook, and all docs/install commands.
* fix(release): derive semantic-release repositoryUrl from the CI repo
@semantic-release/github exact-matches repositoryUrl against the live
GitHub clone_url (no redirect following), so any repo rename re-breaks the
release when repositoryUrl is the static package.json value.
Derive repositoryUrl from the runner's GITHUB_REPOSITORY/GITHUB_SERVER_URL
so it always tracks the current repo name. A future rename (including back
to Kaelio/ktx) now resolves with no code change. Outside CI the option is
omitted, so semantic-release falls back to package.json as documented.
The package.json repository field stays ktx-ai-data-agents-context as
npm-display metadata, decoupled from the release-time match.
* feat(cli): profile ingest runs to find where wall-clock time goes
Add opt-in profiling for `ktx ingest`. Each timed phase, work unit, and
agent loop now records durationMs / step count / token usage in the
trace, and a post-run aggregator rolls them up into a "where did the
time go" report printed to stderr.
Enable per run with KTX_PROFILE_INGEST (1/true -> human table, json ->
raw structured profile) or persistently via `ingest.profile` in
ktx.yaml. The json form emits raw milliseconds, token counts, and a
summary.headline one-line diagnosis so coding agents can parse it
directly; json wins when both env and config request profiling.
- runtime-port: RunLoopMetrics (totalMs, usage, stepCount,
stepBoundariesMs) plus onMetrics callbacks on text/object generation
- ai-sdk + claude-code runtimes: capture per-loop timing and token usage
- work-unit-executor and stages 3/4: thread metrics into trace events
- ingest-bundle.runner: time worktree / triage / clustering / index /
reconcile / squash phases and emit the profile in a finally block
(best-effort; never affects the run outcome)
- ingest-profile: new trace+transcript aggregator with table/json formatters
- config: ingest.profile flag; docs: profiling section in ktx-ingest.mdx
* fix(cli): flush tool-call logs before reading ingest profile
Tool transcripts are appended fire-and-forget so the agent hot path never
blocks on logging. The ingest profiler read them before the writes settled,
so per-work-unit toolMs (and the model-vs-tool split derived from it) could
be incomplete. Track in-flight appends and expose flushToolCallLogs() —
bounded by a timeout so it can never hang — and flush before the profiler
reads the transcript.
Add a clickable launch-video poster (linking to YouTube) directly after
the intro note and before the architecture diagrams. GitHub Markdown can
not embed a YouTube player, so the poster image links out instead.
Fold field-tested fixes into the ktx skill, verified against current CLI source:
- prefer file: secret refs over env: (env: re-resolves per-process and resolves
empty in later ingest/mcp shells)
- pass --skip-agents on data-only setup runs; explain the trailing agent step's
misleading exit 1 on otherwise-successful runs
- dbt ignores --source-warehouse-connection-id (maps by table name); required
only for Metabase/Looker/LookML
- never go silent during slow setup/ingest: poll .ktx mtimes and post progress
so a long run does not look stuck
- judge readiness from verdict, connections[].status, localStats.semanticLayer
and wikiPages; perConnection under-reports
- add troubleshooting entries for the 'Run in a TTY' exit 1 and secrets that
resolve empty only during ingest/mcp
Replace the tall portrait README ingestion SVG with two landscape
diagrams — "1 · Ingestion" (build the context layer) and "2 · Serving"
(agents query it through MCP) — wired in as transparent 2x PNGs that
read on GitHub light and dark.
Add docs-site/diagram-studio: a static React Flow page with custom
themed nodes and the inlined ktx mascot that renders both diagrams and
exports them to PNG via html-to-image (the diagrams' reproducible
source). Remove the superseded ingestion-flow SVGs.
* feat(completion): complete known argument values
* fix(completion): hide Commander-hidden subcommands from completions
Replace the `__`-prefix name heuristic with Commander's `_hidden` flag so
internal subcommands registered with { hidden: true } (e.g. `mcp serve-internal`)
are excluded from completions, mirroring `ktx --help`.
* test: cover wiki and sl read command routing
* test: cover raw wiki and sl reads
* feat: add wiki read command
* feat: add sl read command
* feat: complete read command entity names
* docs: document wiki and sl read commands
* test: include read commands in command tree
* feat(sl): read and validate unique sources by name
* feat(sl): make read and validate connection id optional
* fix(completion): dedupe semantic source names
* docs(sl): document connection-optional read and validate
* fix(sl): require connection id for query command
* docs(sl): clarify query connection requirement
* fix(completion): don't resolve option values as subcommands
resolveCommand skipped flag tokens but not the value consumed by a
value-taking option in the `--flag value` form, so a connection id like
`query` was matched as the `sl query` subcommand and yielded no `sl`
completions. Track value-taking options and skip their consumed value
before matching subcommands.
* test(telemetry): assert first-run notice via TELEMETRY_NOTICE constant
CI (which tests this branch merged with main) failed because #243 changed
the first-run notice wording in identity.ts (dropped "anonymous") but left
this test grepping for the old literal 'ktx collects anonymous usage data',
so indexOf returned -1. Assert against the exported TELEMETRY_NOTICE
constant instead so the test tracks the source of truth and cannot drift
when the notice text changes again.
Set disableGeoip: false on the CLI telemetry client so events are enriched with approximate, IP-based location at ingest. Update the first-run notice, public telemetry docs, and the AGENTS telemetry policy to drop the prior "anonymous" wording to match.
The star-history refresh workflow committed the API's SVG verbatim, but the
response has no trailing newline. Because the refresh commit uses [skip ci],
the file never ran end-of-file-fixer at commit time, so pre-commit's
`--all-files` run failed end-of-file-fixer on every open PR (e.g. #240), even
PRs that never touched the file.
Normalize the downloaded SVG to exactly one trailing newline in the workflow
(idempotent, so the "unchanged" guard still works), and fix the currently
committed file so open PRs go green now.
The scheduled star-history workflow checked out with the default
GITHUB_TOKEN, so its git push to main was rejected by the branch
protection hook (GH006). Check out with RELEASE_PAT instead, matching
release.yml, whose semantic-release step already pushes to the protected
main branch with the same token.
Point the README chart at a committed assets/star-history.svg instead of
the star-history API URL so GitHub serves it directly and bypasses the Camo
proxy cache. A scheduled workflow regenerates the SVG at 06:00/18:00 UTC,
busting star-history's server-side cache, and commits it when it changes.
* fix(cli): derive ingest outcomes from saved artifacts
* fix(cli): treat artifact-producing ingests with failures as partial
* fix(cli): route memory-flow run status through shared ingest outcome
* fix(cli): treat partial ingest as saved context in setup status
* test(cli): align memory-flow replay expectations with partial ingests
Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed.
ktx ingest now always builds the full enriched ("deep") context. There is no
structural fallback: a database connection without a configured model and
embeddings fails the enrichment-readiness preflight before any work runs, with
a 'Run ktx setup to configure a model and embeddings' hint.
- Remove --fast/--deep flags, the per-connection context.depth field, and the
ktx setup depth prompt (delete setup-database-context-depth.ts).
- Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan
mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every
database target.
- Drop the database-context-depth telemetry step (Node + Python schema mirrors
regenerated).
- Update CLI, setup, context-build view, docs, the public ktx skill, and the
release-smoke / artifacts scripts (now assert the no-LLM guard failure).
ktx status --fast (a separate network-probe flag) is unchanged.
Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore
credentialed successful-ingest release smoke coverage).
Notion's setup path read --source-api-key-ref while writing the auth_token_ref
config field, so --source-auth-token-ref was silently dropped. Align Notion to
the flag=field convention every other connector follows: it now reads
--source-auth-token-ref, and --source-api-key-ref becomes Metabase-only.
Also add validation rejecting any credential-ref flag not applicable to the
chosen --source, with a pointer to the correct flag, closing the silent-drop
class for all connectors.
Update CLI-reference docs, the ktx skill Notion example, and tests.
Fixes KLO-724.
The pre-commit job failed because tombi-format reformats uv.lock to a
layout uv does not produce, so once CI's uv sync re-resolved the stale
lock (workspace members still at 0.6.0) and rewrote it, tombi rewrote it
back and the hook reported a modified file.
Exclude uv.lock from tombi-format so uv stays authoritative for its
generated lockfile, and bump the workspace members to 0.7.0 so the lock
is current and uv stops re-resolving it (uv lock --check now passes).
- skills/ktx/SKILL.md: add an "Add context sources" section with the generic
`ktx setup --source ...` flags per connector (dbt, Metabase, Notion, ...),
warehouse mapping, the --metabase-database-id discovery note, and the
`ktx ingest` follow-up. The skill previously only documented database setup
with --skip-sources, so agents couldn't wire up dbt/Metabase/Notion (KLO-723).
- docs-site quickstart: the kaelio.com/start callout now points at the
"copy agent setup" one-shot prompt that installs the full four-source demo.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
An external agent ran the skill end-to-end against `ktx setup` and reported
seven concrete failures, all verified against the CLI source:
- All useful setup flags are `.hideHelp()`, so the skill's "verify with
--help" rule led the agent to conclude its own examples were wrong
(setup-commands.ts:208-332).
- The non-interactive LLM default is `anthropic` (and requires a key), not
`claude-code` as the skill claimed (setup-models.ts:505-507).
- `ktx status` exits 1 whenever the LLM is `none`, even with healthy
embeddings and connections (status-project.ts:204-211, doctor.ts:647).
- `ktx ingest` rejects `--yes`+`--no-input` while `ktx setup` accepts both
(managed-python-command.ts:23-24).
- `--database-url <raw>` auto-externalizes to `.ktx/secrets/<id>-url` —
worth telling the agent (setup-databases.ts:671-683).
- Resuming setup with only `--llm-backend` fails on missing DB flags even
when `ktx.yaml` already has one (setup-databases.ts:1778-1782).
- The `--agents` step prints `Required before using agents: ktx mcp start`
but the skill never told agents to run it (setup-agents.ts:989,1227).
Rewrite SKILL.md to: lead with the scripted (non-interactive) path; add a
single "gather inputs once" checklist; correct the LLM default; document
`--skip-*` flags and resumability; warn that `status` exit code ≠
readiness; fix the `ktx ingest` example to use `--no-input` only; require
`ktx mcp start` after `--agents`; add a ktx-monorepo branch that avoids
`npm install -g`.
Add skills/ktx/troubleshooting.md (one level deep, per Anthropic's
progressive-disclosure guidance) covering the five real failure signatures
the agent hit: invalid ELF header, missing native CLI binary, missing
Anthropic key, claude-code probe failure, and the resume-without-DB error.
Description rewritten to combine what + when per the official skill
authoring guidelines.
ktx setup wiped ktx.yaml, .ktx/setup/state.json, wiki/, semantic-layer/,
raw-sources/, and .git/ — or removed the entire project dir — whenever any
single source in the context-build step failed, destroying hours of ingest
work and the persisted resume state. The cleanup hint was designed for an
"early abort, leave no trace" semantic but was applied indiscriminately to
every later step failure, in direct conflict with the .ktx/setup/state.json
resume mechanism.
Drop the cleanup mechanism entirely (KtxSetupCreatedProjectCleanup,
cleanupForFolderState, createProjectWithCleanup, cleanupCreatedProjectScaffold,
and the createdProjectCleanup plumbing through KtxSetupProjectResult). Step
failures now return non-zero without touching the filesystem, so re-running
ktx setup continues from completed steps and only re-attempts failed sources.
Rewrites the two tests that documented the wipe behavior to assert
preservation, and adds a regression test that simulates partial context-build
artifacts (state.json, wiki/, semantic-layer/) and verifies all survive a
failed context step.
Refs KLO-719
Geist Mono fuses `--` into an em-dash glyph that visually swallows the
adjacent space, so prompts like `npx skills add Kaelio/ktx --skill ktx`
rendered as `Kaelio/ktx--skill ktx` on the quickstart page. The existing
ligature-off rule only covered <code>/<pre> and the .ktx-code wrapper —
quickstart.mdx puts the prompt in a plain <div className="font-mono">,
so the rule didn't apply. Extend the selector to also match the
.font-mono Tailwind utility and any inline-style opt-in via the mono
font CSS variable.
Document the convention in AGENTS.md so future docs additions keep
ligatures off on any new monospace container.
The Claude Code runtime counted every SDKAssistantMessage with
parent_tool_use_id === null as a step, but the SDK emits extra messages
within a single num_turns round-trip — `stop_reason: 'pause_turn'`
continuations and errored partials it retries internally. The local
counter then outran maxTurns and the ingest HUD rendered confusing
ratios like `step 69/40`.
Filter both cases in collectResult so stepIndex tracks num_turns and
stays bounded by the work-unit stepBudget.
Add a host-scoped redirect for /slack on ktx.sh before the existing
catch-all so the path resolves to the community invite link instead of
docs.kaelio.com/ktx/slack.