From bf163029d308832ffcf75954e188900f56266496 Mon Sep 17 00:00:00 2001
From: Andrey Avtomonov
Date: Wed, 20 May 2026 00:43:07 +0200
Subject: [PATCH] Normalize docs punctuation

---
 docs-site/content/agents-setup.md             | 52 +++++++++----------
 .../concepts/semantic-layer-internals.mdx     |  8 +--
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/docs-site/content/agents-setup.md b/docs-site/content/agents-setup.md
index c7709639..e7307c13 100644
--- a/docs-site/content/agents-setup.md
+++ b/docs-site/content/agents-setup.md
@@ -5,46 +5,46 @@ Set up KTX from scratch end-to-end as a fully autonomous, agent-driven replaceme
 # Operating principles
 
 - **Be autonomous.** Detect, decide, and act. Only ask the user when you need information that only they can provide: project location, which databases/sources to connect, credentials, and similar choices.
-- **Stream short status updates.** Before each major phase ("Checking prerequisites…", "Installing uv…", "Configuring warehouse connection…", "Running fast ingest…") print a one-line update. Not chatty — just enough that the user can see what's happening.
+- **Stream short status updates.** Before each major phase ("Checking prerequisites…", "Installing uv…", "Configuring warehouse connection…", "Running fast ingest…") print a one-line update. Not chatty - just enough that the user can see what's happening.
 - **Verify against docs, never guess.** CLI flags, config keys, and command names must come from the docs or from `ktx --help`. If something looks wrong or missing, say so explicitly.
 - **Print every command you run and its exit code.** Terse, not silent.
-- **Fail loudly with cause + fix.** When a command fails: capture the exact error, identify the cause, change something, retry. Never retry an unchanged command. Exceptions for *known soft-failures* are listed in Phase 4 — handle those without retrying.
+- **Fail loudly with cause + fix.** When a command fails: capture the exact error, identify the cause, change something, retry. Never retry an unchanged command. Exceptions for *known soft-failures* are listed in Phase 4 - handle those without retrying.
 - **No LLM-based ingestion in this flow.** Only `--fast` ingest (schema-only). The user can run `--deep` later.
 - **Platform-agnostic.** Detect the host OS first and pick the right install commands / path syntax. Anything path- or shell-specific must branch on OS.
 
 # Authoritative docs
 
-KTX docs are served at `https://docs.kaelio.com/ktx/`. **Start by fetching `https://docs.kaelio.com/ktx/llms.txt`** to discover the docs map. Scan it for a "troubleshooting" entry — if one exists, read it **before** running install/setup so you can apply known fixes preemptively rather than after failing. If no troubleshooting page is listed (current state of the docs), proceed. Then fetch any other `.md` pages you need (setup, ingest, status, connection types). **Never invent CLI flags or config keys** — verify against the docs or `ktx --help` / `ktx --help`.
+KTX docs are served at `https://docs.kaelio.com/ktx/`. **Start by fetching `https://docs.kaelio.com/ktx/llms.txt`** to discover the docs map. Scan it for a "troubleshooting" entry - if one exists, read it **before** running install/setup so you can apply known fixes preemptively rather than after failing. If no troubleshooting page is listed (current state of the docs), proceed. Then fetch any other `.md` pages you need (setup, ingest, status, connection types). **Never invent CLI flags or config keys** - verify against the docs or `ktx --help` / `ktx --help`.
 
-> **Note on the `ktx status` JSON example in the docs.** The docs page for `ktx status` shows an example shaped like `{"title": "...", "checks": [...]}`. That example is outdated. The real CLI output uses a top-level `verdict` field plus a `connections[]` array — see Phase 5 for the canonical success criteria.
Trust the shape in this prompt over the docs example.
+> **Note on the `ktx status` JSON example in the docs.** The docs page for `ktx status` shows an example shaped like `{"title": "...", "checks": [...]}`. That example is outdated. The real CLI output uses a top-level `verdict` field plus a `connections[]` array - see Phase 5 for the canonical success criteria. Trust the shape in this prompt over the docs example.
 
 # Workflow
 
-## Phase 1 — Detect environment
+## Phase 1 - Detect environment
 
 Determine the host OS (e.g. via `uname -s`, `process.platform`, or `$env:OS`). Use the right install commands per OS for the rest of this flow.
 
 | Tool | macOS / Linux | Windows (PowerShell) |
 |------|---------------|----------------------|
 | `uv` | `curl -LsSf https://astral.sh/uv/install.sh \| sh` then re-source shell env | `irm https://astral.sh/uv/install.ps1 \| iex` |
-| Node.js | use system / fnm / nvm — **do not** auto-install | use system / nvm-windows — **do not** auto-install |
+| Node.js | use system / fnm / nvm - **do not** auto-install | use system / nvm-windows - **do not** auto-install |
 | KTX CLI | `npm install -g …` (see Phase 2) | `npm install -g …` (see Phase 2) |
 
 If Node.js is missing, **stop and ask the user** to install it (https://nodejs.org/). Do not attempt to auto-install Node.
 
-## Phase 2 — Verify and install prerequisites
+## Phase 2 - Verify and install prerequisites
 
 Check each tool in order; install only if missing.
 
-1. **Node.js** — run `node --version`. Require >= 22. If missing or older, stop and instruct the user.
-2. **`uv`** — run `uv --version`. If missing, run the OS-appropriate install command, then re-source the shell environment (`export PATH="$HOME/.local/bin:$PATH"` on Linux/macOS) so `uv` is on `PATH`.
-3. **KTX CLI** —
+1. **Node.js** - run `node --version`. Require >= 22. If missing or older, stop and instruct the user.
+2. **`uv`** - run `uv --version`.
If missing, run the OS-appropriate install command, then re-source the shell environment (`export PATH="$HOME/.local/bin:$PATH"` on Linux/macOS) so `uv` is on `PATH`.
+3. **KTX CLI** -
    - Install ktx with `npm install -g @kaelio/ktx`
    - Verify with `ktx --version`.
 
 Print one status line per tool ("✓ uv 0.11.15 found", "Installing uv…", "✓ ktx 0.x.y installed").
 
-## Phase 3 — Gather user choices
+## Phase 3 - Gather user choices
 
 Ask the user (grouped if your harness supports it; otherwise sequentially):
 
@@ -55,14 +55,14 @@ Ask the user (grouped if your harness supports it; otherwise sequentially):
    - Connection name (e.g. `warehouse`, `analytics`).
    - Driver: one of `sqlite`, `postgres`, `mysql`, `sqlserver`, `bigquery`, `snowflake`.
    - Connection URL/DSN (or service-account file for BigQuery). Accept `env:VAR_NAME` or `file:/abs/path` to avoid pasting raw secrets.
-   - **Heads-up for the user**: even if they paste a literal URL, KTX will silently relocate it into `/.ktx/secrets/-url` and rewrite `ktx.yaml` to `url: file:…` — this is correct, secure behavior and not a bug.
+   - **Heads-up for the user**: even if they paste a literal URL, KTX will silently relocate it into `/.ktx/secrets/-url` and rewrite `ktx.yaml` to `url: file:…` - this is correct, secure behavior and not a bug.
    - Schemas / datasets to include (postgres / sqlserver / snowflake / bigquery only).
    - Optional `enabled_tables` allowlist if the user wants to scope ingest to specific tables.
 5. **BI / metadata sources** (dbt, Metabase, Looker, LookML, MetricFlow, Notion). Default: none. Ask only if the user mentions them.
-## Phase 4 — Configure the project +## Phase 4 - Configure the project -Drive the existing wizard non-interactively (verify exact flag names with `ktx setup --help` and the docs — the automation flags are hidden from help but accepted): +Drive the existing wizard non-interactively (verify exact flag names with `ktx setup --help` and the docs - the automation flags are hidden from help but accepted): ``` ktx setup \ @@ -107,17 +107,17 @@ This is **expected** and **does not mean setup failed**. Treat the exit code as - `ktx connection test ` (run next) exits 0 for every connection. - `ktx status --json --no-input` reports `verdict: "ready"`. -If those three conditions hold, proceed to Phase 5 without retrying setup, and **do not** switch to `--deep` to "fix" the readiness gate — deep ingest is explicitly out of scope. Mention this in the final report under "Docs / CLI gaps" so the user is aware. +If those three conditions hold, proceed to Phase 5 without retrying setup, and **do not** switch to `--deep` to "fix" the readiness gate - deep ingest is explicitly out of scope. Mention this in the final report under "Docs / CLI gaps" so the user is aware. -If any of those three conditions do not hold, this is a real failure — capture the error, fetch the relevant docs page, fix the cause, retry. +If any of those three conditions do not hold, this is a real failure - capture the error, fetch the relevant docs page, fix the cause, retry. After `ktx setup` writes `ktx.yaml`, edit it directly for anything flags don't cover: - Per-connection `enabled_tables` allowlist (snake_case, under `connections..enabled_tables`). - Any advanced settings the user requested. -Use a YAML-aware editor (e.g. `uv run python -c "import yaml; …"`) — do not hand-edit blindly. +Use a YAML-aware editor (e.g. `uv run python -c "import yaml; …"`) - do not hand-edit blindly. 
-## Phase 5 — Verify +## Phase 5 - Verify `ktx setup` already runs a fast schema ingest of every database connection it configures, so you do not need to re-ingest by default. For each configured connection: @@ -139,12 +139,12 @@ Then run the global health check: ktx status --json --no-input ``` -Success requires (canonical shape — supersedes the example in the docs): +Success requires (canonical shape - supersedes the example in the docs): - `verdict: "ready"` at the top of the JSON. - Every `connections[].status === "ok"`. - `ktx connection test ` exited 0 for every connection. -Do **not** run `--deep` ingest in this flow — that requires LLM time and is out of scope. +Do **not** run `--deep` ingest in this flow - that requires LLM time and is out of scope. ### Optional: directly probe the embeddings daemon @@ -155,9 +155,9 @@ If the user asks for stronger verification that `sentence-transformers` is actua 3. `curl -sS http://127.0.0.1:/health` → expect HTTP 200 with `{"status":"healthy",…}`. 4. `curl -sS -X POST http://127.0.0.1:/embeddings/compute -H 'content-type: application/json' -d '{"text":"hello"}'` → expect `{"embedding": [...384 floats...]}`. -Discover the port from setup's log line `Started KTX local embeddings daemon: http://127.0.0.1:` or from the daemon's OpenAPI at `GET /openapi.json`. Note: the routes are `/health` and `/embeddings/compute` — not `/healthz` or `/embeddings`. +Discover the port from setup's log line `Started KTX local embeddings daemon: http://127.0.0.1:` or from the daemon's OpenAPI at `GET /openapi.json`. Note: the routes are `/health` and `/embeddings/compute` - not `/healthz` or `/embeddings`. -## Phase 6 — Final report +## Phase 6 - Final report Print a structured report: @@ -180,7 +180,7 @@ Verdict: ready Then **Next steps** (copy-pasteable): 1. Enrich with AI descriptions and embeddings: `ktx ingest --deep` (several minutes per connection). 2. 
Add more connections later by rerunning this setup or via `ktx setup --database … --database-connection-id …`. -3. Configure BI sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) — see `ktx setup --help` for `--source …` flags. +3. Configure BI sources (dbt, Metabase, Looker, LookML, MetricFlow, Notion) - see `ktx setup --help` for `--source …` flags. 4. Install agent integration: `ktx setup --agents --target ` (with optional `--global` for `claude-code`/`codex`). 5. Connect the agent / MCP: see docs at `https://docs.kaelio.com/ktx/`. @@ -190,12 +190,12 @@ Under **Docs / CLI gaps to flag** include any of these that applied during your - `ktx status --json` real shape (`verdict`, `connections[]`) doesn't match the example in the docs page. - The pasted DB URL was moved to `.ktx/secrets/-url` automatically. -End with a single line: `RESULT: PASS` or `RESULT: FAIL — `. +End with a single line: `RESULT: PASS` or `RESULT: FAIL - `. # Operating rules (recap) - Print every command you run and its exit code. Status updates may be terse, but never silent. - On failure: capture the error, fetch the relevant docs page, fix the cause, retry. Never retry an unchanged command. -- Known soft-failures (listed in Phase 4 and Phase 5) are not real failures — handle them as documented; do not retry or escalate. +- Known soft-failures (listed in Phase 4 and Phase 5) are not real failures - handle them as documented; do not retry or escalate. - If you find a docs/CLI gap ("docs say X but CLI does Y"), call it out in the final report. -- Never commit credentials — KTX accepts `env:` and `file:` references; prefer those. KTX will also auto-relocate literal URLs into `.ktx/secrets/`, but that does not protect anyone who pasted the URL into chat history. +- Never commit credentials - KTX accepts `env:` and `file:` references; prefer those. KTX will also auto-relocate literal URLs into `.ktx/secrets/`, but that does not protect anyone who pasted the URL into chat history. 
diff --git a/docs-site/content/docs/concepts/semantic-layer-internals.mdx b/docs-site/content/docs/concepts/semantic-layer-internals.mdx
index a011d1cb..f64836fe 100644
--- a/docs-site/content/docs/concepts/semantic-layer-internals.mdx
+++ b/docs-site/content/docs/concepts/semantic-layer-internals.mdx
@@ -6,7 +6,7 @@ description: How KTX compiles a short Semantic Query into safe, dialect-correct
 import { SemanticLayerFlow } from "@/components/semantic-layer-flow";
 
 KTX's semantic layer is a compiler that turns intent into SQL. The agent
-declares _what_ it wants — measures, dimensions, filters — in a small
+declares _what_ it wants - measures, dimensions, filters - in a small
 Semantic Query. KTX figures out the _how_: which tables to join, what
 grain to aggregate at, how to keep fan-out from inflating measures, and
 what dialect the warehouse speaks.
@@ -21,8 +21,8 @@ This page covers four mechanics:
 ## Imperative SQL vs declarative Semantic Querying
 
 Writing analytics SQL is imperative work. Every question forces the
-agent to hold two things in mind at once: _what_ it wants — a measure, a
-slice, a filter — and _how_ to compute it: which tables to join, which
+agent to hold two things in mind at once: _what_ it wants - a measure, a
+slice, a filter - and _how_ to compute it: which tables to join, which
 key links them, what grain to aggregate at, how to keep one fact from
 inflating another, and what dialect the warehouse speaks. Plumbing on
 top of intent, every query.
@@ -30,7 +30,7 @@ top of intent, every query.
 KTX's semantic layer separates those concerns:
 
 - **You and KTX maintain the how.** Sources, joins, grain, measures, and
-  segments live in reviewable YAML — the analytical contract the team
+  segments live in reviewable YAML - the analytical contract the team
   agrees on, version-controlled.
 - **The agent declares the what.** It sends a Semantic Query and trusts
   the compiler to produce safe SQL.
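
Reviewer note on the Phase 5 success criteria in `agents-setup.md`: the check on `ktx status --json --no-input` output could be sketched roughly as below. This is only an illustration under stated assumptions: the `verdict` and `connections[].status` fields come from the prompt text, while the function name `is_ready`, the `name` field, and the sample payload are hypothetical and not taken from the real CLI output.

```python
import json


def is_ready(status_json: str) -> bool:
    """Return True when a status report meets the Phase 5 criteria.

    Assumes the shape described in the prompt: a top-level `verdict`
    string plus a `connections` array whose items carry a `status`.
    """
    report = json.loads(status_json)
    connections = report.get("connections", [])
    # Success needs verdict == "ready" and every connection status == "ok".
    return report.get("verdict") == "ready" and all(
        conn.get("status") == "ok" for conn in connections
    )


# Hypothetical sample payload; the `name` field is illustrative only.
sample = '{"verdict": "ready", "connections": [{"name": "warehouse", "status": "ok"}]}'
print(is_ready(sample))
```

The sketch treats a report with no connections as passing the per-connection check, so a stricter caller may also want to require at least one entry. It does not cover the third criterion, the exit code of `ktx connection test` for each connection.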