ktx/docs-site/content/docs/getting-started/quickstart.mdx

287 lines
10 KiB
Text
Raw Permalink Normal View History

---
title: Quickstart
2026-05-14 15:06:48 -07:00
description: Set up KTX, build local context, and connect your coding agent.
---
2026-05-14 15:06:48 -07:00
This guide gets a local analytics project ready for KTX. You will install the
CLI, run the setup wizard, connect a database, build context, and install agent
rules that teach your coding assistant which KTX commands to run.
2026-05-14 15:06:48 -07:00
If you are a coding assistant choosing a docs route, start with the
[Agent Quickstart](/docs/ai-resources/agent-quickstart). This page is the
human setup walkthrough.
2026-05-14 15:06:48 -07:00
## What setup does
2026-05-14 15:06:48 -07:00
`ktx setup` is the main project workflow. It can create or resume `ktx.yaml`,
configure model and embedding providers, add database connections, add optional
context sources, build the first context artifacts, and install agent
integration.
2026-05-14 15:06:48 -07:00
When you run bare `ktx` in an interactive terminal outside a KTX project, the
CLI opens the same setup experience. Inside an existing project, `ktx setup`
resumes incomplete work or opens a menu for changing setup, connecting an
agent, checking status, or exploring a demo project.
2026-05-14 15:06:48 -07:00
## Install the CLI
2026-05-14 15:06:48 -07:00
Install the published `@kaelio/ktx` package:
```bash
npm install -g @kaelio/ktx
```
2026-05-14 15:06:48 -07:00
Then run setup from the analytics project directory:
```bash
ktx setup
```
2026-05-14 15:06:48 -07:00
The local checkout workflow is only for KTX contributors. See
[Contributing](/docs/community/contributing) for that path.
2026-05-14 15:06:48 -07:00
## Step 1: Choose the project
2026-05-14 15:06:48 -07:00
In an interactive terminal, setup can create a new KTX project or resume the
nearest existing project. The main project file is `ktx.yaml`.
2026-05-14 15:06:48 -07:00
For scripted setup, pass the project directory explicitly:
2026-05-14 15:06:48 -07:00
```bash
ktx setup --project-dir ./analytics
```
2026-05-14 15:06:48 -07:00
If setup exits early, rerun `ktx setup` in the same directory. KTX tracks
completed setup steps and resumes from the remaining work.
2026-05-14 15:06:48 -07:00
## Step 2: Configure the LLM
2026-05-14 15:06:48 -07:00
KTX uses a Claude model for ingest agents that turn schemas, SQL, BI metadata,
and documents into semantic-layer sources and wiki context.
2026-05-14 15:06:48 -07:00
Setup supports two LLM provider paths:
2026-05-14 15:06:48 -07:00
| Provider | Use when | Credential model |
|----------|----------|------------------|
| Anthropic API | You have an Anthropic API key | `ANTHROPIC_API_KEY` or a local `file:` secret |
| Google Vertex AI for Anthropic Claude | Your organization runs Claude through Google Cloud | Application Default Credentials plus Vertex project and location |
2026-05-14 15:06:48 -07:00
For Anthropic API, setup can read the key from the environment or save a pasted
key to `.ktx/secrets/anthropic-api-key`. `ktx.yaml` stores an `env:` or `file:`
reference, not the raw key.
2026-05-14 15:06:48 -07:00
For Vertex AI, setup uses Google Application Default Credentials. It can read
your active `gcloud` project, list visible projects, or accept explicit
`--vertex-project` and `--vertex-location` values.
Setup checks the selected model before saving. Anthropic API setup fetches live
Claude model choices when possible and falls back to bundled defaults if model
discovery is unavailable.
## Step 3: Configure embeddings
2026-05-14 15:06:48 -07:00
KTX uses embeddings for semantic search over semantic-layer sources, wiki
context, schema metadata, and relationship evidence.
2026-05-14 15:06:48 -07:00
| Backend | Default model | Notes |
|---------|---------------|-------|
| OpenAI | `text-embedding-3-small` | Recommended for hosted embeddings. Requires an OpenAI API key. |
| Local sentence-transformers | `all-MiniLM-L6-v2` | Runs through the KTX-managed Python runtime. No hosted embedding key is required. |
OpenAI setup reads `OPENAI_API_KEY` or saves a local secret file. Local
sentence-transformers setup can install and start the managed runtime during
setup. To prepare that runtime before setup, run:
```bash
2026-05-12 23:51:46 +02:00
ktx dev runtime install --feature local-embeddings --yes
ktx dev runtime start --feature local-embeddings
```
2026-05-14 15:06:48 -07:00
## Step 4: Add a database
2026-05-14 15:06:48 -07:00
KTX needs at least one primary database connection before it can build database
context. The wizard supports SQLite, PostgreSQL, MySQL, ClickHouse, SQL Server,
BigQuery, and Snowflake.
2026-05-14 15:06:48 -07:00
You can usually enter connection fields interactively or provide a URL. Secret
URLs can be stored as local files under `.ktx/secrets/` or referenced with
`env:NAME` in `ktx.yaml`.
2026-05-14 15:06:48 -07:00
After saving a connection, setup tests it and builds fast schema context:
2026-05-14 15:06:48 -07:00
```text
Testing warehouse
feat: merge ingest and scan * docs: add CLI component reuse guidance * docs: add unified ingest ux design * Refine unified ingest UX design after adversarial review iteration 1 * Refine unified ingest UX design after adversarial review iteration 2 * Refine unified ingest UX design after adversarial review iteration 3 * feat(cli): route public connection ingest command * feat(cli): hide standalone scan from public help * feat(cli): plan public ingest depth and query history * feat(cli): execute public database ingest facets * feat(ingest): read connection query history config * fix(cli): use public ingest wording * fix(config): stop generating ingest adapter allow lists * docs: document public ingest command * test: align ingest surface expectations * docs: add unified ingest public CLI surface plan * feat(cli): preflight deep public ingest readiness * feat(setup): store query history in connection context * feat(setup): store database context depth * feat(setup): verify context readiness by database depth * fix(setup): keep context build foreground only * fix(config): reject reserved ingest connection ids * test: close unified ingest v1 expectations * docs: add unified ingest v1 closure plan * fix(ingest): bypass adapter allow-list for public source ingest * fix(ingest): honor query history window intent * fix(ingest): hide scan internals from public database ingest * feat(ingest): use foreground view for interactive public ingest * fix(setup): use schema context and query history wording * test(cli): verify unified ingest public output * docs: add unified ingest v1 public output closure plan * fix(setup): forward query history flags * fix(setup): prompt for postgres query history * fix(status): report query history readiness * fix(ingest): remove legacy public guidance * fix(ingest): polish foreground retry copy * docs(examples): use unified query history wording * chore(ingest): finish public query history cleanup * docs: add unified ingest v1 query history status cleanup plan * test(docs): cover unified ingest public docs * docs: align ingest CLI reference with unified UX * docs: update context build guides for unified ingest * docs: update setup and primary source ingest wording * docs: stop advertising adapter-backed example ingest * docs: close unified ingest public docs gaps * docs: add unified ingest v1 docs site closure plan * fix: render unified ingest foreground warnings * fix: explain query history schema order * fix: add public ingest retry guidance * fix: align setup next steps with unified ingest * fix: remove scan wording from demo progress * test: verify unified ingest ux closure * docs: add unified ingest v1 foreground and retry closure plan * fix(cli): preserve query-history pull config in public ingest * fix(cli): omit hidden commands from docs command tree * test(cli): close unified ingest final public surface checks * docs: add unified ingest v1 final public surface closure plan * fix(cli): use public source labels in ingest reports * fix(cli): suppress low-level public ingest output * test(cli): verify unified ingest public plain output * docs: add unified ingest v1 public plain output closure plan * fix(cli): add public ingest copy sanitizers * fix(cli): sanitize public ingest progress copy * fix(cli): rename setup schema scope prompt * docs(plan): add progress copy closure; test: align setup back-nav fixture Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires. * docs(plan): add final ux labels plan with narrowed label scans * fix(cli): aggregate unsupported query-history warnings * fix(cli): align setup database labels * test(cli): fix setup database test type-check * fix(cli): remove primary-source wording from setup output * test(cli): verify unified ingest setup closure * docs(plan): add unified ingest v1 verification copy closure plan * fix(cli): remove top-level scan command * fix(cli): remove legacy ingest and wiki commands * Merge scan into ingest flow * feat(cli): split ingest progress into per-phase rows, rename work units to tasks Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly. Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone. * Fix test harness failures * Fix CI smoke checks --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-14 01:43:06 +02:00
Connection test passed
2026-05-14 15:06:48 -07:00
Building schema context for warehouse
feat: merge ingest and scan * docs: add CLI component reuse guidance * docs: add unified ingest ux design * Refine unified ingest UX design after adversarial review iteration 1 * Refine unified ingest UX design after adversarial review iteration 2 * Refine unified ingest UX design after adversarial review iteration 3 * feat(cli): route public connection ingest command * feat(cli): hide standalone scan from public help * feat(cli): plan public ingest depth and query history * feat(cli): execute public database ingest facets * feat(ingest): read connection query history config * fix(cli): use public ingest wording * fix(config): stop generating ingest adapter allow lists * docs: document public ingest command * test: align ingest surface expectations * docs: add unified ingest public CLI surface plan * feat(cli): preflight deep public ingest readiness * feat(setup): store query history in connection context * feat(setup): store database context depth * feat(setup): verify context readiness by database depth * fix(setup): keep context build foreground only * fix(config): reject reserved ingest connection ids * test: close unified ingest v1 expectations * docs: add unified ingest v1 closure plan * fix(ingest): bypass adapter allow-list for public source ingest * fix(ingest): honor query history window intent * fix(ingest): hide scan internals from public database ingest * feat(ingest): use foreground view for interactive public ingest * fix(setup): use schema context and query history wording * test(cli): verify unified ingest public output * docs: add unified ingest v1 public output closure plan * fix(setup): forward query history flags * fix(setup): prompt for postgres query history * fix(status): report query history readiness * fix(ingest): remove legacy public guidance * fix(ingest): polish foreground retry copy * docs(examples): use unified query history wording * chore(ingest): finish public query history cleanup * docs: add unified ingest v1 query history status cleanup plan * test(docs): cover unified ingest public docs * docs: align ingest CLI reference with unified UX * docs: update context build guides for unified ingest * docs: update setup and primary source ingest wording * docs: stop advertising adapter-backed example ingest * docs: close unified ingest public docs gaps * docs: add unified ingest v1 docs site closure plan * fix: render unified ingest foreground warnings * fix: explain query history schema order * fix: add public ingest retry guidance * fix: align setup next steps with unified ingest * fix: remove scan wording from demo progress * test: verify unified ingest ux closure * docs: add unified ingest v1 foreground and retry closure plan * fix(cli): preserve query-history pull config in public ingest * fix(cli): omit hidden commands from docs command tree * test(cli): close unified ingest final public surface checks * docs: add unified ingest v1 final public surface closure plan * fix(cli): use public source labels in ingest reports * fix(cli): suppress low-level public ingest output * test(cli): verify unified ingest public plain output * docs: add unified ingest v1 public plain output closure plan * fix(cli): add public ingest copy sanitizers * fix(cli): sanitize public ingest progress copy * fix(cli): rename setup schema scope prompt * docs(plan): add progress copy closure; test: align setup back-nav fixture Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires. * docs(plan): add final ux labels plan with narrowed label scans * fix(cli): aggregate unsupported query-history warnings * fix(cli): align setup database labels * test(cli): fix setup database test type-check * fix(cli): remove primary-source wording from setup output * test(cli): verify unified ingest setup closure * docs(plan): add unified ingest v1 verification copy closure plan * fix(cli): remove top-level scan command * fix(cli): remove legacy ingest and wiki commands * Merge scan into ingest flow * feat(cli): split ingest progress into per-phase rows, rename work units to tasks Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly. Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone. * Fix test harness failures * Fix CI smoke checks --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-14 01:43:06 +02:00
Running fast database ingest
Database ready
2026-05-14 15:06:48 -07:00
warehouse - PostgreSQL - schema context complete
```
2026-05-14 15:06:48 -07:00
PostgreSQL, BigQuery, and Snowflake can also enable query-history ingest. Query
history helps KTX learn common query patterns, joins, service-account filters,
and warehouse-specific usage.
2026-05-14 15:06:48 -07:00
## Step 5: Add context sources
2026-05-14 15:06:48 -07:00
Context sources are optional, but they make the first context layer much richer.
Setup can add:
2026-05-14 15:06:48 -07:00
| Source | Typical input | What KTX learns |
|--------|---------------|-----------------|
| dbt | Local project or Git repo | Models, columns, tests, descriptions, tags |
| MetricFlow | Local project or Git repo | Semantic models, metrics, dimensions, entities |
| LookML | Local files or Git repo | Views, explores, dimensions, measures, joins |
| Looker | API URL and credentials | Explores, looks, dashboards, model metadata |
| Metabase | API URL and key | Questions, dashboards, BI database mappings |
| Notion | Integration token and crawl settings | Business docs and knowledge pages |
2026-05-14 15:06:48 -07:00
Setup maps BI and source metadata back to your primary warehouse connection so
generated context points at the right tables.
2026-05-14 15:06:48 -07:00
You can skip this step and add sources later by rerunning `ktx setup`.
2026-05-14 15:06:48 -07:00
## Step 6: Build context
2026-05-14 15:06:48 -07:00
The context build turns configured databases and sources into local artifacts
agents can read. It runs database ingest first, then source ingest and memory
updates.
2026-05-14 15:06:48 -07:00
Fast database ingest records deterministic schema grounding. Deep ingest adds
AI-enriched descriptions, embeddings, relationship evidence, and query-history
context when configured.
2026-05-14 15:06:48 -07:00
When the build finishes, setup verifies that agent-ready context exists:
2026-05-14 15:06:48 -07:00
```text
KTX context is ready for agents.
feat: merge ingest and scan * docs: add CLI component reuse guidance * docs: add unified ingest ux design * Refine unified ingest UX design after adversarial review iteration 1 * Refine unified ingest UX design after adversarial review iteration 2 * Refine unified ingest UX design after adversarial review iteration 3 * feat(cli): route public connection ingest command * feat(cli): hide standalone scan from public help * feat(cli): plan public ingest depth and query history * feat(cli): execute public database ingest facets * feat(ingest): read connection query history config * fix(cli): use public ingest wording * fix(config): stop generating ingest adapter allow lists * docs: document public ingest command * test: align ingest surface expectations * docs: add unified ingest public CLI surface plan * feat(cli): preflight deep public ingest readiness * feat(setup): store query history in connection context * feat(setup): store database context depth * feat(setup): verify context readiness by database depth * fix(setup): keep context build foreground only * fix(config): reject reserved ingest connection ids * test: close unified ingest v1 expectations * docs: add unified ingest v1 closure plan * fix(ingest): bypass adapter allow-list for public source ingest * fix(ingest): honor query history window intent * fix(ingest): hide scan internals from public database ingest * feat(ingest): use foreground view for interactive public ingest * fix(setup): use schema context and query history wording * test(cli): verify unified ingest public output * docs: add unified ingest v1 public output closure plan * fix(setup): forward query history flags * fix(setup): prompt for postgres query history * fix(status): report query history readiness * fix(ingest): remove legacy public guidance * fix(ingest): polish foreground retry copy * docs(examples): use unified query history wording * chore(ingest): finish public query history cleanup * docs: add unified ingest v1 query history status cleanup plan * test(docs): cover unified ingest public docs * docs: align ingest CLI reference with unified UX * docs: update context build guides for unified ingest * docs: update setup and primary source ingest wording * docs: stop advertising adapter-backed example ingest * docs: close unified ingest public docs gaps * docs: add unified ingest v1 docs site closure plan * fix: render unified ingest foreground warnings * fix: explain query history schema order * fix: add public ingest retry guidance * fix: align setup next steps with unified ingest * fix: remove scan wording from demo progress * test: verify unified ingest ux closure * docs: add unified ingest v1 foreground and retry closure plan * fix(cli): preserve query-history pull config in public ingest * fix(cli): omit hidden commands from docs command tree * test(cli): close unified ingest final public surface checks * docs: add unified ingest v1 final public surface closure plan * fix(cli): use public source labels in ingest reports * fix(cli): suppress low-level public ingest output * test(cli): verify unified ingest public plain output * docs: add unified ingest v1 public plain output closure plan * fix(cli): add public ingest copy sanitizers * fix(cli): sanitize public ingest progress copy * fix(cli): rename setup schema scope prompt * docs(plan): add progress copy closure; test: align setup back-nav fixture Adds the iter9 plan and updates the setup back-navigation test fixture to pass disableQueryHistory plus listSchemas/listTables stubs that the unified ingest setup step now requires. * docs(plan): add final ux labels plan with narrowed label scans * fix(cli): aggregate unsupported query-history warnings * fix(cli): align setup database labels * test(cli): fix setup database test type-check * fix(cli): remove primary-source wording from setup output * test(cli): verify unified ingest setup closure * docs(plan): add unified ingest v1 verification copy closure plan * fix(cli): remove top-level scan command * fix(cli): remove legacy ingest and wiki commands * Merge scan into ingest flow * feat(cli): split ingest progress into per-phase rows, rename work units to tasks Each database target in the unified ingest dashboard now renders one row per real subprocess (Schema, then Query history when enabled) instead of a single combined bar. Each phase has its own monotonic 0-100% bar so the progress never snaps back to zero when historic-sql starts after scan completes. Completed phases keep their final bar, summary, and elapsed time visible as an inline audit trail; queued and skipped phases are shown explicitly. Also rename user-facing "work units" / "Failed work units" to "tasks" / "Failed tasks" in ingest output and parseIngestSummary. The parser still accepts the legacy "Work units:" wording in captured output for backward compat. Internal memory-flow event names and type fields are left alone. * Fix test harness failures * Fix CI smoke checks --------- Co-authored-by: Andrey Avtomonov <7889985+andreybavt@users.noreply.github.com>
2026-05-14 01:43:06 +02:00
Databases:
2026-05-14 15:06:48 -07:00
warehouse: deep context complete
Context sources:
2026-05-14 15:06:48 -07:00
dbt_main: memory update complete
Verification:
Agent context: ready
Semantic search: ready
```
2026-05-14 15:06:48 -07:00
If a foreground build is interrupted, rerun `ktx setup` or build the same target
with `ktx ingest <connectionId>`.
2026-05-14 15:06:48 -07:00
## Step 7: Install agent integration
2026-05-14 15:06:48 -07:00
The final setup step installs project-local rules for your coding assistant.
Supported targets are Claude Code, Codex, Cursor, OpenCode, and universal
`.agents`.
2026-05-14 15:06:48 -07:00
You can also run this step later:
2026-05-14 15:06:48 -07:00
```bash
ktx setup --agents --target codex
```
2026-05-14 15:06:48 -07:00
Claude Code and Codex also support global installs:
2026-05-14 15:06:48 -07:00
```bash
ktx setup --agents --target codex --global
```
Agent rules are CLI-based. They point agents at the KTX CLI path that created
the file, so agents do not need a separate `ktx` binary in `PATH`. If the CLI
path changes after reinstalling or moving a checkout, rerun `ktx setup --agents`.
## Generated files
2026-05-14 15:06:48 -07:00
KTX writes plain files so people and agents can inspect changes in git.
2026-05-14 15:06:48 -07:00
| Path | Purpose |
|------|---------|
| `ktx.yaml` | Project configuration for LLMs, embeddings, connections, context sources, and setup state |
| `.ktx/secrets/*` | Local secret files referenced from `ktx.yaml`; do not commit these |
| `.ktx/setup/*` | Local setup and context-build state |
| `.ktx/agents/install-manifest.json` | Manifest used to manage installed agent files |
| `semantic-layer/<connection-id>/*.yaml` | Semantic source definitions used for SQL generation |
| `wiki/global/*.md` | Shared business context and metric definitions |
| `wiki/user/<user-id>/*.md` | User-scoped notes and local context |
| `.claude/skills/ktx/SKILL.md` | Claude Code project skill |
| `.agents/skills/ktx/SKILL.md` | Codex or universal project skill |
| `.cursor/rules/ktx.mdc` | Cursor project rule |
| `.opencode/commands/ktx.md` | OpenCode project command |
2026-05-14 15:06:48 -07:00
## Verify setup
2026-05-14 15:06:48 -07:00
Run:
```bash
ktx status
```
2026-05-14 15:06:48 -07:00
Example output:
```text
KTX project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
2026-05-14 15:06:48 -07:00
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
KTX context built: yes
2026-05-14 15:06:48 -07:00
Agent integration ready: yes (codex:project)
```
2026-05-14 15:06:48 -07:00
Use JSON when an agent or script needs a structured readiness check:
```bash
ktx status --json
```
## Scripted setup example
Use non-interactive setup when creating repeatable fixtures or automation:
```bash
ktx setup \
--project-dir ./analytics \
--no-input \
--skip-llm \
--skip-embeddings \
--database postgres \
--new-database-connection-id warehouse \
--database-url env:DATABASE_URL \
--database-schema public
```
Then build context:
```bash
ktx ingest warehouse --fast
```
See [ktx setup](/docs/cli-reference/ktx-setup) for the full automation flag
surface.
## Common errors
2026-05-14 15:06:48 -07:00
| Symptom | Likely cause | Recovery |
|---------|--------------|----------|
| `ktx: command not found` | The global package is not installed or your shell cannot find it | Reinstall `@kaelio/ktx` and open a new shell |
| Setup resumes the wrong project | `KTX_PROJECT_DIR` or the nearest `ktx.yaml` points somewhere else | Pass `--project-dir <path>` |
| Anthropic health check fails | API key, model id, or access is invalid | Fix `ANTHROPIC_API_KEY` or rerun setup with a different key or model |
| Vertex AI health check fails | Vertex API, Claude access, project, location, or IAM permissions are missing | Check the project, location, Application Default Credentials, and Vertex AI permissions |
| OpenAI embeddings fail | `OPENAI_API_KEY` is missing or invalid | Export the key or choose local sentence-transformers embeddings |
| Local embeddings fail | Managed Python runtime cannot install or start | Run `ktx dev runtime status`, then install the local embeddings runtime |
| Database test fails | Credentials, network access, database, warehouse, or schema is wrong | Test the same values with the database's native client, then rerun setup |
| Context is not built | Setup saved configuration but skipped or interrupted the build | Run `ktx setup` or `ktx ingest --all` |
| Agent integration is incomplete | Setup skipped the agents step or installed a different target | Run `ktx setup --agents --target <target>` |
## Next steps
2026-05-14 15:06:48 -07:00
- Build and refresh context with [Building Context](/docs/guides/building-context).
- Edit semantic sources and wiki pages with [Writing Context](/docs/guides/writing-context).
- Connect more tools with [Agent Clients](/docs/integrations/agent-clients).
- Read [The Context Layer](/docs/concepts/the-context-layer) to understand the architecture.