ktx/docs-site/content/docs/community/telemetry.mdx
Andrey Avtomonov cb6a67c2d7 Make telemetry reliable across interrupts and headless installs
Three reliability gaps surfaced while auditing why PostHog numbers were
untrustworthy:

1. Interrupted commands lost their events. capture() is fire-and-forget and the
   only flush guarantee lived in a finally block, which SIGINT/SIGTERM skip — so
   Ctrl-C'ing a long ingest or an MCP client killing 'ktx mcp stdio' dropped the
   command event and any queued events. Add SIGINT/SIGTERM handlers (real-process
   entry only; never under test/programmatic io) that mark the active command
   span aborted, emit it, drain the emitter, then exit. Idempotent with the
   normal finally path via the single-consume command span.

2. Headless-first installs were invisible. loadTelemetryIdentity refused to mint
   an installId unless stdout was a TTY, so a machine whose first run was an
   IDE-launched MCP server or a script emitted nothing, ever. Mint on first run
   regardless of surface (still honoring CI/DO_NOT_TRACK/KTX_TELEMETRY_DISABLED),
   writing the one-time notice to stderr — safe under the MCP stdio protocol,
   which reserves stdout. Drop the now-unused stdoutIsTTY option.

3. No guard against silent emit regressions (the 0.7.0 scan_completed blackout).
   Add tests: the shared executePublicIngestTarget chokepoint emits exactly one
   ingest_completed on success and on the preflight-failure branch, and a
   database target invokes the scan that emits scan_completed; plus coverage for
   the aborted-flush helper.

Identity is unchanged otherwise: every event still attributes to the installId
in ~/.ktx/telemetry.json. No event/field changes, so Node<->Python schema parity
is untouched. Docs updated to reflect first-run-on-any-surface activation.
2026-06-02 23:19:37 +02:00


---
title: Telemetry
description: Understand what usage telemetry ktx collects and how to opt out.
---

**ktx** collects aggregated usage telemetry so maintainers can see
which commands work, where setup fails, and which parts of the data-agent
workflow need improvement. Telemetry is opt-out: it turns on the first time you
run **ktx** in any way — an interactive command, a script, or an
agent-launched MCP server — and prints a one-time notice (to the terminal when
there is one, otherwise to standard error). It stays disabled in CI and whenever
an opt-out is set.
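
The activation rule above can be read as a short decision function. This is a minimal sketch, not the **ktx** implementation: the function name, the `TelemetryConfig` shape, the order of the checks, and the rule for what counts as a "set" variable are all assumptions. Only the variable names and the `enabled` key come from this page.

```typescript
// Hypothetical shape of ~/.ktx/telemetry.json; only "enabled" is documented here.
interface TelemetryConfig {
  enabled?: boolean;
}

type Env = Record<string, string | undefined>;

// Assumption: a variable counts as set when it is present, non-empty,
// and not "0" or "false". The real CLI may use a different rule.
function isSet(value: string | undefined): boolean {
  if (value === undefined) return false;
  const v = value.trim().toLowerCase();
  return v !== "" && v !== "0" && v !== "false";
}

// Returns the name of the opt-out that applies, or null when telemetry may run.
function telemetryDisabledReason(env: Env, config: TelemetryConfig): string | null {
  if (isSet(env.KTX_TELEMETRY_DISABLED)) return "KTX_TELEMETRY_DISABLED";
  if (isSet(env.DO_NOT_TRACK)) return "DO_NOT_TRACK";
  if (isSet(env.CI)) return "CI";
  if (config.enabled === false) return "telemetry.json";
  return null;
}
```

Called as `telemetryDisabledReason(process.env, config)`, the sketch returns `null` only when no opt-out is in effect.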

## Opt out

Use any of these mechanisms to disable telemetry:

| Mechanism | Effect |
|-----------|--------|
| `export KTX_TELEMETRY_DISABLED=1` | Disables telemetry for the shell and child processes |
| `export DO_NOT_TRACK=1` | Standard do-not-track environment variable |
| `CI=1` | Applied automatically in CI, where the `CI` variable is typically already set |
| Edit `~/.ktx/telemetry.json` and set `"enabled": false` | Persistent for the machine, including the MCP server |
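
For the last row, the file edit can be scripted rather than done by hand. The sketch below is illustrative only: `withTelemetryDisabled` and `disableTelemetryFile` are hypothetical helpers, not part of **ktx**. Writing a fresh file when none exists is also an assumption about what the CLI will accept. The sketch keeps every other key in the file untouched and changes only `enabled`.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { homedir } from "node:os";
import { dirname, join } from "node:path";

// Return the file contents with "enabled" forced to false, keeping all other keys.
// `existing` is the current file text, or null when the file does not exist yet.
function withTelemetryDisabled(existing: string | null): string {
  const current: Record<string, unknown> =
    existing === null ? {} : JSON.parse(existing);
  return JSON.stringify({ ...current, enabled: false }, null, 2) + "\n";
}

// Hypothetical helper: apply the change to ~/.ktx/telemetry.json
// (the path is the one named in the table above).
function disableTelemetryFile(
  file: string = join(homedir(), ".ktx", "telemetry.json"),
): void {
  const existing = existsSync(file) ? readFileSync(file, "utf8") : null;
  mkdirSync(dirname(file), { recursive: true });
  writeFileSync(file, withTelemetryDisabled(existing));
}
```

Setting `KTX_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1` needs no file change at all.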

## What we collect

High-level signals: which commands run, how long they take, whether they
succeed or fail, and basic environment metadata (CLI version, Node version, OS
platform). When an operation fails, we also include diagnostic detail about the
error so we can debug it. For project-level analysis, **ktx** sends a salted
hash of the project directory to group events.
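
To illustrate what a salted hash of a project directory looks like, here is a minimal sketch using Node's built-in `crypto` module. The hash algorithm (SHA-256), the source of the salt, and the function name `projectGroupId` are assumptions for illustration; this page does not specify how **ktx** derives the value. The same directory and salt always yield the same identifier, so events can be grouped without the path itself being sent.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical helper: derive a stable, non-reversible group id for a project.
// A separator byte between salt and path keeps the two inputs unambiguous.
function projectGroupId(projectDir: string, salt: string): string {
  const absolute = resolve(projectDir); // normalize relative paths first
  return createHash("sha256")
    .update(salt)
    .update("\0")
    .update(absolute)
    .digest("hex");
}
```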

When an agent reaches **ktx** through MCP, we also record the connecting client
tool's self-reported name and version (for example Claude Desktop, Cursor, or
Cline) so we can see which agents people use **ktx** with. That describes the
tool, never you or your data.
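
In the Model Context Protocol, a client identifies itself in the `clientInfo` object of its `initialize` request. The sketch below shows how a server could read that self-reported name and version from an incoming JSON-RPC message. The helper name `clientInfoFromInitialize` is hypothetical, and the sketch makes no claim about how **ktx** stores or names these values in its events.

```typescript
interface ClientInfo {
  name: string;
  version: string;
}

// Hypothetical helper: pull the self-reported client name and version out of
// an MCP "initialize" request. Returns null for any other or malformed message.
function clientInfoFromInitialize(message: unknown): ClientInfo | null {
  if (typeof message !== "object" || message === null) return null;
  const msg = message as {
    method?: unknown;
    params?: { clientInfo?: { name?: unknown; version?: unknown } };
  };
  if (msg.method !== "initialize") return null;
  const info = msg.params?.clientInfo;
  if (!info) return null;
  const { name, version } = info;
  if (typeof name !== "string" || typeof version !== "string") return null;
  return { name, version };
}
```

Because the values are self-reported, they describe whatever the connecting tool chooses to send.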

## What we never collect

We build telemetry around counts and coarse signals, not the contents of your
data or configuration. We don't deliberately collect your `ktx.yaml`, query
results, passwords, API keys, or access tokens.

The one place environment-specific text can appear is failure diagnostics: when
an operation errors, the detail we record is the error as your tools reported
it, which can include identifiers from your setup. If you'd rather send nothing
at all, turn telemetry off using any of the options above.

## Storage and retention

Telemetry is sent to PostHog, a third-party product-analytics service used by
the **ktx** maintainers. Raw event data is retained for 90 days. Aggregated
counts may be retained indefinitely.