chore: move docs site workspace

This commit is contained in:
Andrey Avtomonov 2026-05-11 16:53:42 +02:00
parent 0ae9b6effd
commit a46563bb01
52 changed files with 3 additions and 3 deletions

View file

@ -0,0 +1,59 @@
---
title: Introduction
description: What KTX is and who it's for.
---
Data agents can write SQL. The hard part is making sure they write the SQL your analytics team would have written.
KTX is the agent-native context layer for analytics engineering. At its core is a semantic layer: YAML sources that define tables, columns, measures, joins, grain, filters, segments, and computed fields. Around that core, KTX adds the context analytics agents need to work safely: warehouse scans, knowledge pages, ingestion from existing tools, provenance, validation, and MCP access.
KTX projects are plain files — YAML, Markdown, and SQLite — that you commit to git and review in PRs, just like dbt models. Agents can read them, edit them, validate them, query through them, and leave behind a diff your team can review.
## Who KTX is for
KTX is built for analytics engineers and data teams who want data agents to work on real analytics systems, not just generate one-off SQL.
Use KTX when you want agents to:
- Generate SQL from approved measures, dimensions, and joins
- Repair or extend semantic definitions through reviewable git diffs
- Explain where a metric definition came from and what business rules shape it
- Use warehouse scans and relationship evidence instead of guessing join paths
- Work alongside **dbt**, **LookML**, **MetricFlow**, **Looker**, **Metabase**, **Notion**, and BI platforms
- Work with warehouses like **PostgreSQL**, **Snowflake**, **BigQuery**, **ClickHouse**, **MySQL**, or **SQL Server**
If you've ever watched an agent confidently generate a query that joins on the wrong key or invents a metric that doesn't exist, KTX is the fix.
## What KTX gives agents
- **A semantic layer they can edit** — plain YAML sources with measures, dimensions, joins, grain, segments, filters, and computed columns
- **Safe query planning** — grain-aware SQL generation, fan-out detection, chasm-trap handling, and dialect transpilation
- **Business context** — Markdown knowledge pages for definitions, rules, exceptions, and data quality notes
- **Schema evidence** — warehouse scans with table metadata, column stats, constraints, and inferred relationships
- **Provenance** — ingest transcripts and replay metadata that explain where context came from and why it changed
- **An agent-facing API** — MCP and CLI tools for reading, writing, validating, searching, and querying context
## How these docs are organized
<Cards>
<Card title="Quickstart" href="/docs/getting-started/quickstart">
Set up KTX and build your first context in under 10 minutes.
</Card>
<Card title="Concepts" href="/docs/concepts/the-context-layer">
Understand what a context layer is, why agents need one, and how KTX compares to other semantic layers.
</Card>
<Card title="Guides" href="/docs/guides/building-context">
Hands-on workflows for scanning, ingesting, writing semantic sources, and serving agents.
</Card>
<Card title="Integrations" href="/docs/integrations/primary-sources">
Setup details for every supported database, context source, and agent client.
</Card>
<Card title="CLI Reference" href="/docs/cli-reference/ktx-setup">
Exhaustive flag and subcommand reference for every KTX command.
</Card>
</Cards>
## Next steps
- **Get hands-on** — follow the [Quickstart](/docs/getting-started/quickstart) to set up KTX with your own database in under 10 minutes.
- **Understand the theory** — read [The Context Layer](/docs/concepts/the-context-layer) to learn why schema access alone breaks on real analytics and how KTX addresses it.

View file

@ -0,0 +1,5 @@
{
"title": "Getting Started",
"defaultOpen": true,
"pages": ["introduction", "quickstart"]
}

View file

@ -0,0 +1,255 @@
---
title: Quickstart
description: Set up KTX and build your first context in under 10 minutes.
---
This guide walks you through `ktx setup` — an interactive wizard that configures your LLM provider, connects your database, optionally ingests from your existing tools, builds context, and installs agent integration.
## Prerequisites
- **Node.js 22+** and **pnpm**
- An **Anthropic API key** for LLM-powered enrichment and ingestion
- A **database connection** — PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite
- Optionally, a **dbt project**, **LookML repo**, **Metabase instance**, or other context source
## Install and run setup
KTX is currently used from a local checkout or linked workspace CLI. Build and link the CLI first:
```bash
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
pnpm run setup:dev
pnpm run link:dev
```
Then run the setup wizard in the directory where you want your KTX project:
```bash
ktx setup
```
The wizard walks through six steps. You can go back at any point, and if you exit early, running `ktx setup` again resumes where you left off.
## Step 1: Configure LLM
KTX uses an Anthropic model to enrich schema descriptions, generate semantic sources during ingestion, and reconcile metadata from your tools.
The wizard asks how to find your API key:
```
◆ How should KTX find your Anthropic API key?
│ ○ Use ANTHROPIC_API_KEY from the environment
│ ○ Paste a key and save it as a local secret file
```
If you choose to paste a key, KTX saves it in `.ktx/secrets/anthropic-api-key` with local file permissions. Your `ktx.yaml` stores a `file:` reference, never the raw key.
Next, choose a model:
```
◆ Which Anthropic model should KTX use?
│ ○ Claude Sonnet 4.6 (recommended)
│ ○ Claude Opus 4.6
│ ○ Claude Haiku 4.5
│ ○ Enter a model ID manually
```
KTX runs a health check to verify your key and model work before saving.
## Step 2: Configure embeddings
KTX uses embeddings for semantic search over sources, wiki content, schema metadata, and relationship evidence.
```
◆ Which embedding option should KTX use?
│ ○ Local sentence-transformers embeddings
│ ○ OpenAI embeddings (recommended)
```
**OpenAI embeddings** use `text-embedding-3-small` (1536 dimensions) and require an `OPENAI_API_KEY`.
**Local embeddings** use `all-MiniLM-L6-v2` (384 dimensions) via the KTX Python daemon. No API key is needed. If you run the daemon as a long-lived HTTP service, start it with:
```bash
ktx-daemon serve-http --host 127.0.0.1 --port 8765
```
## Step 3: Connect a database
Select one or more databases for KTX to scan. The wizard supports SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server, BigQuery, and Snowflake.
For PostgreSQL, you can enter connection details field by field or paste a connection URL:
```
◆ How do you want to connect to PostgreSQL?
│ ○ Enter connection details (host, port, database, user)
│ ○ Paste a connection URL
```
If your URL contains credentials, KTX saves it to `.ktx/secrets/` and writes a `file:` reference in `ktx.yaml`. You can also use `env:DATABASE_URL` to reference an environment variable.
After connecting, KTX automatically runs a connection test and a structural scan:
```
◇ Testing postgres-warehouse
│ ✓ Connection test passed
│ Driver: PostgreSQL · Tables: 42
◇ Scanning postgres-warehouse
│ ✓ Structural scan completed
│ Changes: 42 new tables
◇ Primary source ready
│ postgres-warehouse · PostgreSQL · structural scan complete
```
For Snowflake and BigQuery, the wizard offers **Historic SQL** configuration for query history views. For PostgreSQL, enable Historic SQL with `--enable-historic-sql` when `pg_stat_statements` is configured.
## Step 4: Add context sources
Context sources let KTX ingest metadata from your existing analytics tools. This step is optional — you can skip it and add sources later.
```
◆ Which context sources should KTX ingest?
│ ◻ dbt
│ ◻ MetricFlow
│ ◻ Metabase
│ ◻ Looker
│ ◻ LookML
│ ◻ Notion
```
For **dbt**, point KTX at a local path or git URL. KTX reads your `dbt_project.yml` and schema files to extract model metadata:
```
◆ dbt source location
│ ○ Local path
│ ○ Git URL
```
For **Metabase** and **Looker**, you provide an API URL and credentials. KTX maps BI databases to your KTX primary source connections so it knows which warehouse tables the BI metadata refers to.
Context sources are saved to `ktx.yaml` and built during the next step.
## Step 5: Build context
This is where KTX does the heavy lifting. It runs an enriched scan of your database (generating AI-powered column and table descriptions) and ingests metadata from any configured context sources.
```
◆ Build KTX context for agents?
│ ○ Build context now (recommended)
│ ○ Leave context unbuilt and exit setup
```
The build scans each primary source with LLM enrichment, detects table relationships, and runs ingestion agents that reconcile metadata from your context sources into semantic-layer YAML files and knowledge pages.
For a small database (under 50 tables), this takes a few minutes. Larger warehouses can take longer. You can press <kbd>d</kbd> to detach and let it run in the background:
```
KTX context build
Run: setup-context-local-abc123
Project: /home/user/analytics
Detach: press d to leave this running.
Resume: ktx setup context watch setup-context-local-abc123
Status: ktx setup context status setup-context-local-abc123
```
When the build completes, KTX verifies that agent-ready context was produced:
```
KTX context is ready for agents.
Primary sources:
postgres-warehouse: enriched scan complete
Context sources:
dbt-main: memory update complete
Verification:
Agent context: ready
Semantic search: ready
```
## Step 6: Install agent integration
The final step connects KTX to your coding agent. Choose how agents should access the project:
```
◆ How should agents use this KTX project?
│ ○ CLI tools and skills
│ ○ MCP server config
│ ○ Both
```
Then select which agents to install for:
```
◆ Which agent targets should KTX install?
│ ◻ Claude Code
│ ◻ Codex
│ ◻ Cursor
│ ◻ OpenCode
```
**CLI mode** writes a skill file (e.g., `.claude/skills/ktx/SKILL.md`) that teaches the agent to call KTX commands directly.
**MCP mode** writes an MCP server configuration (e.g., `.mcp.json`) that lets the agent call KTX tools like `sl_query`, `knowledge_search`, and `sl_write_source` over the Model Context Protocol.
## Verify it worked
Check your project status:
```bash
ktx status
```
```
KTX project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Primary sources configured: yes (postgres-warehouse)
Context sources configured: yes (dbt-main)
KTX context built: yes
Agent integration ready: yes (claude-code:project)
```
List your semantic sources:
```bash
ktx sl list
```
Query through the semantic layer:
```bash
ktx sl query \
--connection-id postgres-warehouse \
--measure orders.total_revenue \
--dimension orders.status \
--order-by orders.total_revenue:desc \
--limit 5 \
--format sql
```
This outputs the generated SQL. Add `--execute` to run it against your warehouse:
```bash
ktx sl query \
--connection-id postgres-warehouse \
--measure orders.total_revenue \
--dimension orders.status \
--order-by orders.total_revenue:desc \
--limit 5 \
--execute --max-rows 10
```
## Next steps
- **Build more context** — learn about [scanning](/docs/guides/building-context), relationship detection, and ingestion workflows in the Building Context guide.
- **Refine your semantic layer** — the [Writing Context](/docs/guides/writing-context) guide covers source YAML, measures, joins, and knowledge pages.
- **Understand the architecture** — read [The Context Layer](/docs/concepts/the-context-layer) to learn why a context layer is more than a semantic layer.
- **Connect more agents** — see the [Agent Clients](/docs/integrations/agent-clients) integration page for per-tool setup details.