docs: document database scope picker fields

Andrey Avtomonov 2026-05-21 20:08:22 +02:00
parent c275748eda
commit 359d46e230
3 changed files with 94 additions and 7 deletions


@@ -103,7 +103,7 @@ runtime features are missing.
| Flag | Description |
|------|-------------|
| `--database <driver>` | Database driver to configure; repeatable. Choices: `sqlite`, `postgres`, `mysql`, `sqlserver`, `bigquery`, `snowflake` |
| `--database <driver>` | Database driver to configure; repeatable. Choices: `sqlite`, `postgres`, `mysql`, `clickhouse`, `sqlserver`, `bigquery`, `snowflake` |
| `--database-connection-id <id>` | Existing selected connection id; repeatable. With `--database` or `--database-url`, connection id for the new connection. |
| `--database-url <url>` | URL, `env:NAME`, or `file:/path` for one new URL-style database connection; also used as the SQLite path |
| `--database-schema <schema>` | Database schema or dataset to include; repeatable |
@@ -113,6 +113,10 @@ runtime features are missing.
context. Use `--skip-databases` only when intentionally leaving the project
incomplete.
`--database-schema` maps to the driver's scope field: `schemas` for PostgreSQL,
MySQL, and SQL Server; `schema_names` for Snowflake; `dataset_ids` for
BigQuery; and `databases` for ClickHouse.
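The flag-to-field mapping described above can be sketched as a small lookup. This is a minimal illustration only: `SCOPE_FIELD_BY_DRIVER` and `apply_database_schema` are hypothetical names, not ktx internals, and rejecting drivers outside the documented list (such as `sqlite`) is an assumption.

```python
# Hypothetical lookup built from the mapping stated in the docs above;
# this is not ktx's actual implementation.
SCOPE_FIELD_BY_DRIVER = {
    "postgres": "schemas",
    "mysql": "schemas",
    "sqlserver": "schemas",
    "snowflake": "schema_names",
    "bigquery": "dataset_ids",
    "clickhouse": "databases",
}


def apply_database_schema(driver: str, values: list[str]) -> dict[str, list[str]]:
    """Return the config fragment that repeated --database-schema values map to."""
    try:
        field = SCOPE_FIELD_BY_DRIVER[driver]
    except KeyError:
        # Assumption: drivers without a documented scope field are rejected.
        raise ValueError(f"driver {driver!r} has no documented scope field") from None
    return {field: list(values)}


# Two --database-schema flags for a ClickHouse connection:
fragment = apply_database_schema("clickhouse", ["analytics", "mart"])
# fragment == {"databases": ["analytics", "mart"]}
```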
### Query History
| Flag | Description |


@@ -109,9 +109,9 @@ context-source drivers share the map.
| `mysql` | Warehouse | `driver` | `url`, `enabled_tables` |
| `sqlite` | Warehouse | `driver` | `url` or `path`, `enabled_tables` |
| `sqlserver` | Warehouse | `driver` | `url`, `enabled_tables` |
| `bigquery` | Warehouse | `driver` | `url`, `enabled_tables`, `historicSql` |
| `snowflake` | Warehouse | `driver` | `url`, `enabled_tables`, `historicSql` |
| `clickhouse` | Warehouse | `driver` | `url`, `enabled_tables` |
| `bigquery` | Warehouse | `driver` | `credentials_json`, `dataset_ids`, `enabled_tables`, `historicSql` |
| `snowflake` | Warehouse | `driver` | `schema_names`, `enabled_tables`, `historicSql` |
| `clickhouse` | Warehouse | `driver` | `url`, `database`, `databases`, `enabled_tables` |
| `metabase` | Context source | `driver`, `api_url` | `api_key_ref`, `mappings` |
| `looker` | Context source | `driver`, `base_url`, `client_id` | `client_secret_ref`, `mappings` |
| `lookml` | Context source | `driver`, `repoUrl` | `branch`, `path`, `auth_token_ref`, `mappings` |
@@ -136,6 +136,27 @@ connections:
- public.customers
```
Connector-specific scope fields let setup and scan use the same warehouse
boundary:
```yaml
connections:
  mysql-warehouse:
    driver: mysql
    url: env:MYSQL_URL
    schemas: [analytics, mart]
  clickhouse-warehouse:
    driver: clickhouse
    url: env:CLICKHOUSE_URL
    database: analytics
    databases: [analytics, mart]
  bigquery-warehouse:
    driver: bigquery
    credentials_json: file:./service-account.json
    location: US
    dataset_ids: [analytics, mart]
```
For Postgres, BigQuery, and Snowflake, `historicSql` and `context.queryHistory`
toggle query-history ingest. The shape is connector-specific; the setup wizard
writes these fields when you pass `--enable-query-history`.


@@ -1,6 +1,6 @@
---
title: Primary Sources
description: Connect ktx to PostgreSQL, Snowflake, BigQuery, MySQL, SQL Server, or SQLite.
description: Connect ktx to PostgreSQL, Snowflake, BigQuery, MySQL, ClickHouse, SQL Server, or SQLite.
---
**ktx** connects to your data warehouse or database to build schema context,
@@ -26,7 +26,7 @@ Agents should prefer environment or file references over literal secrets.
| Field | Required | Applies to | Description |
|-------|----------|------------|-------------|
| `driver` | Yes | all connections | Connector driver such as `postgres`, `snowflake`, `bigquery`, `mysql`, `sqlserver`, or `sqlite` |
| `driver` | Yes | all connections | Connector driver such as `postgres`, `snowflake`, `bigquery`, `mysql`, `clickhouse`, `sqlserver`, or `sqlite` |
| `url` | One of the connection methods | URL-style connectors | Database URL, `env:NAME`, or `file:/path/to/secret` |
| `host`, `port`, `database`, `username`, `password` | One of the connection methods | PostgreSQL, MySQL, SQL Server | Field-by-field connection values |
| `schema` or `schemas` | No | schema-aware warehouses | Single schema or list of schemas to scan |
@@ -216,6 +216,10 @@ For multiple datasets:
- finance
```
BigQuery dataset scope is stored in `connections.<id>.dataset_ids`. Interactive
setup discovers datasets from credentials plus location, then writes the chosen
dataset ids as the scan scope.
### Authentication
| Method | Config |
@@ -282,6 +286,10 @@ connections:
url: env:MYSQL_DATABASE_URL
```
MySQL supports selecting one or more databases during `ktx setup`. The selected
database scope is stored in `connections.<id>.schemas`, and `ktx scan` reads
exactly those databases.
Or with individual fields:
```yaml title="ktx.yaml"
@@ -320,12 +328,66 @@ connections:
- Parameter binding uses positional `?` placeholders
- Uses `LIMIT X OFFSET Y` for pagination
- Single database per connection (no multi-schema)
- Multi-database scanning uses `schemas` as the selected database list
- Supports 20+ MySQL types including `enum`, `json`, `datetime`, `decimal`
- Table comments extracted with InnoDB metadata prefix stripping
---
## ClickHouse
Connects to ClickHouse over HTTP. Supports table and column introspection across
one or more selected databases.
### Connection config
```yaml title="ktx.yaml"
connections:
  my-clickhouse:
    driver: clickhouse
    url: env:CLICKHOUSE_DATABASE_URL
    database: analytics
```
For multiple databases:
```yaml
databases:
- analytics
- mart
```
ClickHouse supports selecting one or more databases during `ktx setup`. The
selected scan scope is stored in `connections.<id>.databases`. The single
`database` field remains the connection default for raw SQL and `ktx sql`.
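The split between `database` and `databases` can be illustrated with a short sketch. `clickhouse_scope` is a hypothetical helper, not part of ktx, and the fallback to the single `database` when `databases` is absent is an assumption that the docs above do not confirm.

```python
from typing import Optional


def clickhouse_scope(connection: dict) -> tuple[Optional[str], list[str]]:
    """Split a ClickHouse connection entry into (default database, scan scope)."""
    # `database` is the default for raw SQL and `ktx sql`, per the docs above.
    default_db = connection.get("database")
    # `databases` is the scan scope written by `ktx setup`.
    scan_scope = list(connection.get("databases") or [])
    if not scan_scope and default_db:
        # Assumption (not confirmed by the docs): with no `databases` list,
        # scanning falls back to the single default database.
        scan_scope = [default_db]
    return default_db, scan_scope


default_db, scan_scope = clickhouse_scope(
    {"driver": "clickhouse", "database": "analytics", "databases": ["analytics", "mart"]}
)
# default_db == "analytics"; scan_scope == ["analytics", "mart"]
```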
### Authentication
| Method | Config |
|--------|--------|
| URL | `url: env:CLICKHOUSE_DATABASE_URL` |
| Password | `password: env:CLICKHOUSE_PASSWORD` or `password: file:/path/to/secret` |
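The `env:NAME` and `file:/path` forms in the table above are references rather than literal secrets. A minimal sketch of how such a reference could be resolved follows; `resolve_reference` is a hypothetical helper, and both the whitespace stripping for file contents and the error on a missing variable are assumptions, not documented ktx behavior.

```python
import os
from pathlib import Path


def resolve_reference(value: str) -> str:
    """Resolve an `env:NAME` or `file:/path` reference; return other values unchanged."""
    if value.startswith("env:"):
        name = value[len("env:"):]
        if name not in os.environ:
            # Assumption: a missing variable is an error, not an empty string.
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]
    if value.startswith("file:"):
        path = Path(value[len("file:"):])
        # Assumption: surrounding whitespace (e.g. a trailing newline) is dropped.
        return path.read_text(encoding="utf-8").strip()
    # Anything else is treated as a literal value, such as a plain URL.
    return value
```

Keeping the reference in `ktx.yaml` and resolving it only at connection time means the config file itself never contains the secret.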
### Features
| Feature | Supported | Notes |
|---------|-----------|-------|
| Tables & views | Yes | Via `system.tables` |
| Primary keys | No | Not exposed as relational constraints |
| Foreign keys | No | Not available in ClickHouse |
| Row count estimates | Yes | From ClickHouse metadata where available |
| Column statistics | No | - |
| Query history | No | - |
| Table sampling | Yes | Uses ClickHouse sampling syntax when supported |
### Dialect notes
- Parameter binding uses named placeholders
- The `database` field sets the default database for SQL execution
- The `databases` array controls the scan scope
---
## SQL Server
Connects to Microsoft SQL Server and Azure SQL. Supports multi-schema scanning with `dbo` as the default schema.