From 3f022148c4e3adc7bd4da7e3ff9887ebab0a0995 Mon Sep 17 00:00:00 2001
From: Luca Martial
Date: Thu, 14 May 2026 17:11:47 -0700
Subject: [PATCH] docs: align docs with current KTX behavior

---
 .../content/docs/cli-reference/ktx-ingest.mdx |  7 ++--
 .../content/docs/cli-reference/ktx-setup.mdx  |  9 +++--
 .../docs/concepts/the-context-layer.mdx       |  2 +-
 .../docs/getting-started/introduction.mdx     |  3 +-
 .../docs/getting-started/quickstart.mdx       |  9 ++---
 .../content/docs/guides/building-context.mdx  |  4 ++-
 .../content/docs/guides/writing-context.mdx   | 35 ++++++++++++-------
 .../docs/integrations/agent-clients.mdx       |  4 +--
 docs-site/content/docs/integrations/index.mdx |  3 +-
 .../docs/integrations/primary-sources.mdx     |  4 +--
 10 files changed, 51 insertions(+), 29 deletions(-)

diff --git a/docs-site/content/docs/cli-reference/ktx-ingest.mdx b/docs-site/content/docs/cli-reference/ktx-ingest.mdx
index a0bca58f..ab907992 100644
--- a/docs-site/content/docs/cli-reference/ktx-ingest.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-ingest.mdx
@@ -29,14 +29,16 @@ connections when you use `--all`.
 | `--deep` | Use AI-enriched database ingest | Stored connection default, or `fast` |
 | `--query-history` | Include database query-history usage patterns | Stored connection default |
 | `--no-query-history` | Skip database query-history usage patterns for this run | Stored connection default |
-| `--query-history-window-days ` | Query-history lookback window for this run | Stored connection default |
+| `--query-history-window-days ` | BigQuery/Snowflake query-history lookback window for this run | Stored connection default |
 | `--plain` | Print plain text output | `true` |
 | `--json` | Print JSON output | `false` |
 | `--no-input` | Disable interactive terminal input | — |
 
 `--fast` and `--deep` are mutually exclusive. Depth flags apply only to database
 connections. Query-history flags apply only to database connections
-that support query history. Query-history ingest runs after schema ingest and
+that support query history. The window flag applies to BigQuery and Snowflake;
+Postgres reads the current `pg_stat_statements` aggregate data instead of a
+time-windowed history table. Query-history ingest runs after schema ingest and
 requires deep ingest readiness.
 
 When `--all` selects both databases and context sources, database ingest runs
@@ -70,6 +72,7 @@ ktx ingest warehouse --deep
 # Include query-history usage patterns
 ktx ingest warehouse --deep --query-history
 
+# Set the lookback window for BigQuery or Snowflake query history
 ktx ingest warehouse --query-history-window-days 30
 
 # Build a source connection
diff --git a/docs-site/content/docs/cli-reference/ktx-setup.mdx b/docs-site/content/docs/cli-reference/ktx-setup.mdx
index 4de40ecb..90d0b175 100644
--- a/docs-site/content/docs/cli-reference/ktx-setup.mdx
+++ b/docs-site/content/docs/cli-reference/ktx-setup.mdx
@@ -96,13 +96,16 @@ incomplete.
 |------|-------------|
 | `--enable-query-history` | Enable query-history ingest when the selected database supports it |
 | `--disable-query-history` | Disable query-history ingest for the selected database |
-| `--query-history-window-days ` | Query-history lookback window |
+| `--query-history-window-days ` | BigQuery/Snowflake query-history lookback window |
 | `--query-history-min-executions ` | Minimum executions for a query-history template |
 | `--query-history-service-account-pattern ` | Query-history service-account regex; repeatable |
 | `--query-history-redaction-pattern ` | Query-history SQL-literal redaction regex; repeatable |
 
-Query history setup is supported for Postgres, BigQuery, and Snowflake. Enabling
-query history makes deep ingest readiness matter for later `ktx ingest` runs.
+Query history setup is supported for Postgres, BigQuery, and Snowflake. The
+window flag applies to BigQuery and Snowflake; Postgres reads the current
+`pg_stat_statements` aggregate data instead of a time-windowed history table.
+Enabling query history makes deep ingest readiness matter for later
+`ktx ingest` runs.
 
 ### Context Sources
 
diff --git a/docs-site/content/docs/concepts/the-context-layer.mdx b/docs-site/content/docs/concepts/the-context-layer.mdx
index cb03b7c0..af304cf4 100644
--- a/docs-site/content/docs/concepts/the-context-layer.mdx
+++ b/docs-site/content/docs/concepts/the-context-layer.mdx
@@ -289,7 +289,7 @@ my-project/
 │   └── data-quality-notes.md
 ├── raw-sources/
 │   └── warehouse/
-│       └── database-ingest/   # Schema ingest artifacts and reports
+│       └── live-database/     # Schema ingest artifacts and reports
 └── .ktx/
     ├── db.sqlite              # Local state (git-ignored)
     └── cache/                 # Runtime cache (git-ignored)
diff --git a/docs-site/content/docs/getting-started/introduction.mdx b/docs-site/content/docs/getting-started/introduction.mdx
index cb8ac0dd..d87d8af3 100644
--- a/docs-site/content/docs/getting-started/introduction.mdx
+++ b/docs-site/content/docs/getting-started/introduction.mdx
@@ -60,7 +60,8 @@ Use KTX when you want agents to:
 - **Explain metric provenance** with warehouse evidence
 - **Work alongside** dbt, LookML, MetricFlow, Looker, Metabase, and modern BI platforms
 
-Works with PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, and SQL Server.
+Works with SQLite, PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, and SQL
+Server.
 
 ## Explore the docs
 
diff --git a/docs-site/content/docs/getting-started/quickstart.mdx b/docs-site/content/docs/getting-started/quickstart.mdx
index 335aedfa..84bf4611 100644
--- a/docs-site/content/docs/getting-started/quickstart.mdx
+++ b/docs-site/content/docs/getting-started/quickstart.mdx
@@ -51,8 +51,8 @@ For scripted setup, pass the project directory explicitly:
 ktx setup --project-dir ./analytics
 ```
 
-If setup exits early, rerun `ktx setup` in the same directory. KTX tracks
-completed setup steps and resumes from the remaining work.
+If setup exits early, rerun `ktx setup` in the same directory. KTX keeps local
+setup progress under `.ktx/setup/` and resumes from the remaining work.
 
 ## Step 2: Configure the LLM
 
@@ -122,7 +122,8 @@ Database ready
 
 PostgreSQL, BigQuery, and Snowflake can also enable query-history ingest. Query
 history helps KTX learn common query patterns, joins, service-account filters,
-and warehouse-specific usage.
+and warehouse-specific usage. BigQuery and Snowflake support a lookback window;
+Postgres reads the current `pg_stat_statements` aggregate data instead.
 
 ## Step 5: Add context sources
 
@@ -200,7 +201,7 @@ KTX writes plain files so people and agents can inspect changes in git.
 
 | Path | Purpose |
 |------|---------|
-| `ktx.yaml` | Project configuration for LLMs, embeddings, connections, context sources, and setup state |
+| `ktx.yaml` | Project configuration for LLMs, embeddings, connections, context sources, and query-history settings |
 | `.ktx/secrets/*` | Local secret files referenced from `ktx.yaml`; do not commit these |
 | `.ktx/setup/*` | Local setup and context-build state |
 | `.ktx/agents/install-manifest.json` | Manifest used to manage installed agent files |
diff --git a/docs-site/content/docs/guides/building-context.mdx b/docs-site/content/docs/guides/building-context.mdx
index c21b7921..5fd288a6 100644
--- a/docs-site/content/docs/guides/building-context.mdx
+++ b/docs-site/content/docs/guides/building-context.mdx
@@ -62,13 +62,15 @@ configured, run `ktx setup` or use `--fast`.
 
 PostgreSQL, BigQuery, and Snowflake can add query-history context. This helps
 KTX learn common joins, filters, service-account patterns, redaction rules, and
-usage-heavy query templates.
+usage-heavy query templates. BigQuery and Snowflake support a lookback window;
+Postgres reads the current `pg_stat_statements` aggregate data instead.
 Enable it during setup, store it under `connections..context.queryHistory`, or
 request it for one run:
 
 ```bash
 ktx ingest warehouse --deep --query-history
 
+# Set the lookback window for BigQuery or Snowflake query history
 ktx ingest warehouse --query-history-window-days 30
 ```
 
diff --git a/docs-site/content/docs/guides/writing-context.mdx b/docs-site/content/docs/guides/writing-context.mdx
index fe9d3fdb..b68960bf 100644
--- a/docs-site/content/docs/guides/writing-context.mdx
+++ b/docs-site/content/docs/guides/writing-context.mdx
@@ -60,21 +60,25 @@ semantic-layer//.yaml
 
 ```yaml
 name: orders
-description: Customer orders with booked revenue.
+descriptions:
+  user: Customer orders with booked revenue.
 table: public.orders
 grain:
   - order_id
 columns:
   - name: order_id
     type: string
-    description: Unique order identifier.
+    descriptions:
+      user: Unique order identifier.
   - name: order_date
     type: time
     role: time
-    description: Date the order was placed.
+    descriptions:
+      user: Date the order was placed.
   - name: total_amount
     type: number
-    description: Booked order value in USD.
+    descriptions:
+      user: Booked order value in USD.
 measures:
   - name: total_revenue
     expr: SUM(total_amount)
@@ -85,7 +89,8 @@ measures:
 
 ```yaml
 name: orders
-description: Customer orders with line-item totals.
+descriptions:
+  user: Customer orders with line-item totals.
 table: public.orders
 grain:
   - order_id
@@ -93,26 +98,31 @@ grain:
 columns:
   - name: order_id
     type: string
-    description: Unique order identifier.
+    descriptions:
+      user: Unique order identifier.
 
   - name: order_date
     type: time
     role: time
-    description: Date the order was placed.
+    descriptions:
+      user: Date the order was placed.
 
   - name: status
     type: string
     visibility: public
-    description: Current order status.
+    descriptions:
+      user: Current order status.
 
   - name: _etl_loaded_at
     type: time
     visibility: hidden
-    description: Internal load timestamp.
+    descriptions:
+      user: Internal load timestamp.
 
   - name: total_amount
     type: number
-    description: Order total in USD.
+    descriptions:
+      user: Order total in USD.
 
 measures:
   - name: total_revenue
@@ -149,9 +159,10 @@ joins:
 | Field | Required | Description |
 |-------|----------|-------------|
 | `name` | Yes | Source identifier. Use lowercase words and underscores. |
+| `descriptions` | No | Description map keyed by source, such as `user`, `dbt`, or `ai`. |
 | `table` or `sql` | Yes | Database table or custom SQL expression. Use exactly one. |
 | `grain` | Yes | Columns that uniquely identify a row at the source grain. |
-| `columns` | No | Column definitions with type, role, visibility, and descriptions. |
+| `columns` | Yes | Non-empty column definitions with type, role, visibility, and descriptions. |
 | `measures` | No | Aggregation expressions such as `SUM`, `COUNT`, and `AVG`. |
 | `segments` | No | Named predicates agents can reuse. |
 | `joins` | No | Relationships to other semantic sources. |
@@ -165,7 +176,7 @@ joins:
 | Column | `type` | Yes | Agent-facing type: `string`, `number`, `time`, or `boolean`. |
 | Column | `role` | No | Special role such as `time` for default time dimensions. |
 | Column | `visibility` | No | `public`, `internal`, or `hidden`. |
-| Column | `description` | Strongly recommended | Business meaning and usage notes. |
+| Column | `descriptions` | Strongly recommended | Description map keyed by source, such as `user`, `dbt`, or `ai`. |
 | Measure | `name` | Yes | Queryable metric name. |
 | Measure | `expr` | Yes | SQL aggregation expression at the source grain. |
 | Measure | `filter` | No | SQL predicate applied only to this measure. |
diff --git a/docs-site/content/docs/integrations/agent-clients.mdx b/docs-site/content/docs/integrations/agent-clients.mdx
index de628197..01cbbca5 100644
--- a/docs-site/content/docs/integrations/agent-clients.mdx
+++ b/docs-site/content/docs/integrations/agent-clients.mdx
@@ -75,7 +75,7 @@ Available commands:
 - `ktx status --json --project-dir /path/to/project`
 - `ktx sl list --json --project-dir /path/to/project`
 - `ktx sl search '' --json --project-dir /path/to/project --connection-id ''`
-- `ktx sl query --json --project-dir /path/to/project --connection-id '' --query-file '' --execute --max-rows 100`
+- `ktx sl query --project-dir /path/to/project --connection-id '' --query-file '' --format json --execute --max-rows 100`
 - `ktx wiki search '' --json --project-dir /path/to/project --limit 10`
 ```
 
@@ -172,7 +172,7 @@ All supported agent clients call the same KTX CLI commands:
 | `ktx sl list --json` | List semantic-layer sources |
 | `ktx sl search --json` | Search semantic-layer sources |
 | `ktx sl validate --connection-id ` | Validate semantic source definitions |
-| `ktx sl query --json` | Execute a semantic-layer query when semantic compute is configured |
+| `ktx sl query --format json` | Execute a semantic-layer query when semantic compute is configured |
 
 ### Security constraints
 
diff --git a/docs-site/content/docs/integrations/index.mdx b/docs-site/content/docs/integrations/index.mdx
index 8f77a624..92a677aa 100644
--- a/docs-site/content/docs/integrations/index.mdx
+++ b/docs-site/content/docs/integrations/index.mdx
@@ -34,8 +34,9 @@ automation flags documented in [`ktx setup`](/docs/cli-reference/ktx-setup).
 
 | Path | Purpose |
 |------|---------|
-| `ktx.yaml` | Main project configuration for providers, embeddings, connections, source mappings, query history, and setup state |
+| `ktx.yaml` | Main project configuration for providers, embeddings, connections, source mappings, and query history |
 | `.ktx/secrets/*` | Local file-backed secrets when you choose file references during setup |
+| `.ktx/setup/*` | Local setup progress and context-build state |
 | `semantic-layer//` | YAML semantic sources generated by database and source ingestion |
 | `wiki/` | Markdown business context, definitions, and ingested knowledge |
 | `.ktx/agents/install-manifest.json` | Manifest of agent integration files installed by `ktx setup --agents` |
diff --git a/docs-site/content/docs/integrations/primary-sources.mdx b/docs-site/content/docs/integrations/primary-sources.mdx
index a3d4db29..00cc39aa 100644
--- a/docs-site/content/docs/integrations/primary-sources.mdx
+++ b/docs-site/content/docs/integrations/primary-sources.mdx
@@ -228,7 +228,7 @@ mapping metadata. The BigQuery connector still authenticates with the
 | Feature | Supported | Notes |
 |---------|-----------|-------|
 | Tables & views | Yes | Including materialized views and external tables |
-| Primary keys | No | - |
+| Primary keys | Yes | Via `INFORMATION_SCHEMA` table constraints when declared |
 | Foreign keys | No | Not available in BigQuery |
 | Row count estimates | Yes | From table metadata |
 | Column statistics | No | - |
@@ -500,7 +500,7 @@ No authentication required - SQLite is file-based. The file must be readable by
 - Uses `LIMIT X OFFSET Y` for pagination
 - SQLite type affinity system: `TEXT`, `NUMERIC`, `INTEGER`, `REAL`, `BLOB`
 - Foreign key enforcement requires explicit `PRAGMA foreign_keys = ON`
-- In-memory databases supported with `path: ":memory:"` (for testing)
+- Database file must exist before `ktx connection test` or ingest runs
 
 ## Common errors