ktx/docs-site/content/docs/community/telemetry.mdx
2026-05-22 16:19:27 +02:00


---
title: Telemetry
description: Understand what anonymous usage telemetry ktx collects and how to opt out.
---
**ktx** collects anonymous product-usage telemetry from interactive CLI runs so
maintainers can understand which commands work, where setup fails, and which
parts of the data-agent workflow need improvement.
## Opt out
Telemetry is enabled by default for interactive runs and is disabled
automatically in CI and non-interactive CLI runs. Any one of these mechanisms
disables it:
| Mechanism | Effect |
|-----------|--------|
| `export KTX_TELEMETRY_DISABLED=1` | Disables telemetry for the shell and child processes |
| `export DO_NOT_TRACK=1` | Disables telemetry using the standard do-not-track environment variable |
| `CI=1` | Disables telemetry automatically in CI |
| Non-TTY output | Disables telemetry automatically for pipes and scripts |
| Edit `~/.ktx/telemetry.json` and set `"enabled": false` | Disables telemetry persistently for the machine |
There is no `ktx telemetry` command. The first interactive run that can emit
telemetry prints this one-line notice to stderr:
```text
ktx collects anonymous usage data to improve the product. Opt out: set KTX_TELEMETRY_DISABLED=1.
```
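The opt-out mechanisms in the table can be read as a single check that returns the first reason telemetry is off. The sketch below is illustrative only: `telemetryDisabledReason`, `isSet`, the order of the checks, and the rule for what counts as a "set" variable are assumptions, not the **ktx** implementation. Only the variable names and the `enabled` key come from this page.

```typescript
// Hypothetical sketch, not ktx source code.
type Env = Record<string, string | undefined>;

interface TelemetryConfig {
  // Mirrors the "enabled" key in ~/.ktx/telemetry.json.
  enabled?: boolean;
}

// Assumption: a variable counts as set unless it is missing, empty, "0", or "false".
function isSet(value: string | undefined): boolean {
  if (value === undefined) return false;
  const normalized = value.trim().toLowerCase();
  return normalized !== "" && normalized !== "0" && normalized !== "false";
}

// Returns the first matching opt-out reason, or null when telemetry may be emitted.
function telemetryDisabledReason(
  env: Env,
  isTty: boolean,
  config: TelemetryConfig,
): string | null {
  if (isSet(env["KTX_TELEMETRY_DISABLED"])) return "KTX_TELEMETRY_DISABLED";
  if (isSet(env["DO_NOT_TRACK"])) return "DO_NOT_TRACK";
  if (isSet(env["CI"])) return "CI";
  if (!isTty) return "non-tty";
  if (config.enabled === false) return "config";
  return null;
}
```

Passing the environment and TTY state in as arguments, rather than reading `process.env` directly, keeps the check easy to test for each row of the table.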
## Identity and grouping
**ktx** stores a random install ID in `~/.ktx/telemetry.json`. This ID is the
PostHog `distinctId` and is not tied to your name, email, Git identity, or
account.
For project-level analysis, **ktx** sends a salted SHA-256 project ID derived
from the install ID and absolute project directory. The raw project path is not
sent.
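As a rough illustration of how a salted project ID keeps the raw path off the wire, the sketch below hashes the install ID together with the absolute project directory. The function name `hashedProjectId`, the NUL separator, and the hex encoding are assumptions; this page does not specify the exact input format **ktx** uses.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical helper, not ktx source code.
// Assumption: the install ID acts as the salt and is joined to the path with a NUL byte.
function hashedProjectId(installId: string, projectDir: string): string {
  const absoluteDir = resolve(projectDir); // normalize to an absolute path
  return createHash("sha256")
    .update(installId)
    .update("\0") // separator so ("ab", "c") and ("a", "bc") hash differently
    .update(absoluteDir)
    .digest("hex"); // 64 hex characters; the path itself is never included
}
```

Because the install ID is part of the input, the same directory produces different project IDs on different installs, so IDs cannot be correlated across machines by path alone.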
## Events
**ktx** emits these events:
| Event | When it fires | Fields |
|-------|---------------|--------|
| `install_first_run` | Once when `~/.ktx/telemetry.json` is created | Common envelope only |
| `command` | Once per invocation, when a registered Commander action reaches the action hook | `commandPath`, `durationMs`, `outcome`, `errorClass`, `flagsPresent`, `hasProject`, `projectGroupAttached` |
| `setup_step` | At the end of each setup step | `step`, `outcome`, `durationMs` |
| `connection_added` | When setup writes a database, source, or demo connection | `driver`, `isDemoConnection` |
| `connection_test` | Every `ktx connection test` run | `driver`, `isDemoConnection`, `outcome`, `errorClass`, `durationMs`, `serverVersion` |
| `project_stack_snapshot` | Once per process after `setup`, `ingest`, or project `status` | `connectors`, `connectionCount`, `hasSl`, `hasWiki`, `hasMcp`, `hasManagedRuntime` |
| `ingest_completed` | At the end of each public ingest target | `driver`, `isDemoConnection`, `schemaCount`, `tableCount`, `columnCount`, `rowsBucket`, `durationMs`, `outcome`, `errorClass` |
| `scan_completed` | At the end of a schema scan or relationship inference | `driver`, `tableCount`, `columnCount`, `inferredFkCount`, `declaredFkCount`, `durationMs`, `outcome`, `errorClass` |
| `sl_validate_completed` | `ktx sl validate` | `sourceCount`, `modelCount`, `validationErrorCount`, `outcome`, `errorClass`, `durationMs` |
| `sl_query_completed` | `ktx sl query` | `mode`, `referencedSourceCount`, `referencedDimensionCount`, `referencedMeasureCount`, `durationMs`, `outcome`, `errorClass` |
| `sql_completed` | `ktx sql` | `driver`, `isDemoConnection`, `queryVerb`, `referencedTableCount`, `durationMs`, `outcome`, `errorClass` |
| `wiki_query_completed` | `ktx wiki <query>` | `queryLength`, `resultCount`, `durationMs`, `outcome` |
| `mcp_request_completed` | Sampled MCP tool invocations | `toolName`, `outcome`, `durationMs`, `errorClass`, `sampleRate` |
| `daemon_started` | The long-lived `ktx-daemon serve-http` server starts | `daemonVersion`, `pythonVersion`, `runtimeVersion`, `startupDurationMs` |
| `daemon_stopped` | The long-lived `ktx-daemon serve-http` server shuts down | `reason`, `uptimeMs` |
| `sl_plan_completed` | A daemon semantic-layer planning pass completes | `outcome`, `stage`, `errorClass`, `durationMs`, `sourceCount`, `joinCount` |
| `sql_gen_completed` | A daemon SQL generation pass completes | `outcome`, `dialect`, `errorClass`, `durationMs` |
Common envelope fields are `cliVersion`, `nodeVersion`, `osPlatform`,
`osRelease`, `arch`, `runtime`, and `isCi`.
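To make the envelope concrete, here is a minimal sketch of how those fields could be gathered in a Node process. The field names come from this page; the `CommonEnvelope` interface, the `buildEnvelope` function, and the sources chosen for each value (`process.version`, `os.platform()`, and so on) are assumptions about a plausible implementation.

```typescript
import * as os from "node:os";

// Hypothetical shape, not ktx source code; field names are taken from this page.
interface CommonEnvelope {
  cliVersion: string;
  nodeVersion: string;
  osPlatform: string;
  osRelease: string;
  arch: string;
  runtime: string;
  isCi: boolean;
}

// Assumption: cliVersion, runtime, and isCi are supplied by the caller,
// while the remaining fields are read from the running Node process.
function buildEnvelope(cliVersion: string, runtime: string, isCi: boolean): CommonEnvelope {
  return {
    cliVersion,
    nodeVersion: process.version,
    osPlatform: os.platform(),
    osRelease: os.release(),
    arch: os.arch(),
    runtime,
    isCi,
  };
}
```

None of these values identifies a user or a project; they describe only the tool version and the platform it runs on.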
Daemon events use `runtime: "daemon-py"`. The Python daemon reads the same
install ID file as the Node CLI and receives only the already-hashed project ID
for semantic-layer query events.
`mcp_request_completed` is sampled at 10% with a sticky per-process sampling
decision. If a process is sampled in, every MCP tool invocation in that process
emits the event; if it is sampled out, none do.
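A sticky per-process decision of this kind can be sketched as a sampler that draws one random number on first use and then reuses the result. The names `createStickySampler` and `shouldEmitMcpEvent` are hypothetical, and the injectable `random` parameter exists only to make the sketch testable; the actual **ktx** sampling code may differ.

```typescript
// Hypothetical sketch of sticky per-process sampling, not ktx source code.
const MCP_SAMPLE_RATE = 0.1; // 10%, as stated on this page

function createStickySampler(
  rate: number,
  random: () => number = Math.random,
): () => boolean {
  let decision: boolean | undefined; // undefined until the first call
  return (): boolean => {
    if (decision === undefined) {
      decision = random() < rate; // drawn once, then reused for the process lifetime
    }
    return decision;
  };
}

// One sampler per process: every MCP tool invocation consults the same decision.
const shouldEmitMcpEvent = createStickySampler(MCP_SAMPLE_RATE);
```

Compared with sampling each invocation independently, this keeps the full sequence of tool calls for a sampled process, at the cost of more variance in event volume between processes.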
## Never collected
**ktx** telemetry never collects:
- Argv values, file paths, hostnames, or environment variable values
- `ktx.yaml` contents, connection passwords, API keys, or tokens
- Schema names, table names, column names, SQL text, or query results
- Error messages or stack traces
- Git remote URLs, Git user email, or OS user name
## Storage and retention
Telemetry is sent to the **ktx** PostHog project. Raw event data is retained
for 90 days in PostHog. Aggregated counts may be retained indefinitely.