mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
* docs: rewrite Semantic Querying concept with imperative-vs-declarative diagram
Reframe semantic-layer-internals.mdx around the contract the semantic
layer offers an agent: declare what you want (a Semantic Query), KTX
figures out how to compute it. Replaces the old "Context-Aware SQL"
framing with a clear imperative-vs-declarative narrative.
Adds a React Flow component (semantic-layer-flow.tsx) that contrasts a
buggy 4-table agent-authored SQL (chasm trap, LEFT-JOIN-in-WHERE,
hardcoded DATE_TRUNC) against the chasm-safe per-fact CTE SQL the
planner actually emits, including the outer GROUP BY over the requested
dimensions. Both lanes converge into a shared warehouse node and each
SQL card now has parallel bullet notes (failures on the left, KTX
behavior on the right).
Side fixes bundled in:
- include the /ktx basePath in the favicon metadata so the icon resolves
under the production prefix
- migrate docs-site/middleware.ts to docs-site/proxy.ts (Next 16 rename)
- redirect / to /ktx/docs/getting-started/introduction so the apex docs
URL works
- add tests covering the apex redirect, the favicon basePath, and the
middleware-to-proxy rename
- propagate the Semantic Query terminology across the ktx-sl CLI
reference, the context-layer concept page, and the agent-clients /
primary-sources integration pages
* Fix CI dead-code failures
* docs-site: polish semantic-layer-internals code blocks and flow diagram
- Make CodeBlock a server component so children traverse synchronously
under React 19 RSC streaming; previously extractText returned "" in
dev SSR, leaving code blocks empty.
- Add custom JSON/YAML/SQL/code-like tokenizers with theme-aware token
classes; drop the colored file-glyph dot and gradient tab-head.
- Tighten tab-head: subtle grey background, smaller monospace filename
in muted grey, smaller rectangular language pill placed to the left
of the filename.
- Polish the React Flow semantic-layer diagram (controls, fit-view
padding, edge types).
* docs-site: annotate imperative SQL, add section anchor, drop ClickHouse
- Wire numbered red badges to each problematic span in the "Without KTX"
SQL with hover sync between SQL gutter, lines, and the notes list.
- Add #imperative-vs-declarative anchor on the flow section header so
the eyebrow link is shareable; reveals a # glyph on hover/focus.
- Align the compiled-SQL note dots to the first-line midpoint
(mt-[6px] instead of mt-1) so 4px dots sit at y=8 in a 16px line.
- Remove all ClickHouse references from docs-site (primary-sources,
quickstart, ktx-setup, contributing, agents-setup, mechanics test,
warehouse drivers in the flow diagram).
* test: drop ClickHouse contributing-docs assertion
Align the workspace-package mirror test with the ClickHouse removal
from docs-site (75907eb). The connector-clickhouse package still
exists in packages/, but contributing.mdx no longer lists it, so the
test that mirrored docs against the workspace was failing.
148 lines
6.1 KiB
Text
---
title: The Context Layer
description: What a context layer is, why agents need one, and how KTX compares to other semantic layers.
---

## Why agents need context

Database access lets an agent generate SQL. It does not tell the agent which
tables matter, which joins are safe, which metrics are canonical, or what your
team means by "enterprise", "net revenue", or "active customer".

That missing business context is where plausible SQL becomes wrong SQL:

- `orders.amount` may include refunds unless filtered.
- `customers.id` may not be the right join key for every source.
- `legacy_segments` may be stale even though it still exists.
- A metric may have a board-approved definition that is not obvious from
  column names.

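
The first bullet above can be reproduced in a few lines. This is a minimal sketch, assuming a hypothetical `orders` table with an `amount` and a `status` column held in an in-memory SQLite database; the rows are made up for illustration and nothing here is KTX's own code.

```python
import sqlite3

# Hypothetical data: three orders, one of which was refunded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders (id, amount, status) VALUES (?, ?, ?)",
    [(1, 100.0, "paid"), (2, 50.0, "refunded"), (3, 25.0, "paid")],
)

# Plausible query an agent might write from the schema alone.
naive_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Same metric once the business rule "exclude refunds" is known.
net_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE status != 'refunded'"
).fetchone()[0]

print(naive_total, net_total)  # 175.0 125.0
```

Both queries run without error and both look reasonable; only the business rule tells you which number is the revenue figure.
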
## Three waves of AI analytics

| Wave | What it gives agents | Where it breaks |
|------|----------------------|-----------------|
| **Database access** | Tables, columns, and query execution | Agents guess joins, filters, and metric logic |
| **Semantic layers** | Modeled metrics, dimensions, joins, and SQL generation | They often miss operating context: anomalies, caveats, ownership, and review history |
| **Agentic context** | Semantic definitions plus wiki knowledge, scans, provenance, and edit workflows | Requires context to be kept current and reviewable |

KTX is built for the third wave: agents that generate SQL, maintain semantic
files, write docs, propose tests, and leave reviewable diffs.

## What KTX adds

A context layer is the trusted knowledge surface between analytics systems and
agents. The semantic layer is the core, but agents also need business rules,
schema evidence, provenance, and a safe way to update files.

```text
Warehouses + dbt + BI + docs
            |
            v
        ktx ingest
            |
            v
semantic-layer/ + wiki/ + raw-sources/ + provenance
            |
            v
Agents search, query, explain, validate, and patch context
```

| Pillar | Format | What it answers |
|--------|--------|-----------------|
| **Semantic sources** | `semantic-layer/**/*.yaml` | How do agents query a source safely? |
| **Wiki pages** | `wiki/**/*.md` | What does the business mean, and what caveats matter? |
| **Scan artifacts** | `raw-sources/**` | What did KTX observe in the warehouse or source tool? |
| **Provenance** | Ingest transcripts and run state | Why was this context created or changed? |

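
The path patterns in the pillar table are enough to sort project files by pillar. The sketch below is illustrative only: `pillar_for` is a hypothetical helper, not a KTX command or API, and the example file names are assumptions. Provenance is left out because the table gives no path pattern for it.

```python
from fnmatch import fnmatchcase

# Patterns copied from the pillar table; the mapping itself is an assumption.
PILLAR_PATTERNS = {
    "Semantic sources": "semantic-layer/**/*.yaml",
    "Wiki pages": "wiki/**/*.md",
    "Scan artifacts": "raw-sources/**",
}


def pillar_for(path):
    """Return the pillar whose pattern matches a project-relative POSIX path.

    fnmatch lets `*` cross `/`, so this is looser than strict glob semantics.
    """
    for pillar, pattern in PILLAR_PATTERNS.items():
        if fnmatchcase(path, pattern):
            return pillar
    return None


for example in [
    "semantic-layer/warehouse/orders.yaml",
    "wiki/global/revenue.md",
    "raw-sources/warehouse/tables.json",  # hypothetical scan file name
    "ktx.yaml",
]:
    print(example, "->", pillar_for(example))
```
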
## Semantic sources

Semantic sources describe data in terms agents can reason about: row grain,
typed columns, valid joins, named measures, filters, and segments.

```yaml
name: orders
table: public.orders
grain: [id]
joins:
  - to: customers
    "on": customer_id = customers.id
    relationship: many_to_one
measures:
  - name: revenue
    expr: sum(amount)
    filter: "status != 'refunded'"
```

For join graphs, fan-out handling, and execution mechanics, read
[Semantic Querying](/docs/concepts/semantic-layer-internals).

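
To make the idea of a named measure concrete, here is a rough sketch of how a definition like `revenue` could be turned into SQL. It is not KTX's planner: `render_measure_sql` is a hypothetical function, the source is mirrored as a Python dict instead of being parsed from YAML, and placing the filter in `WHERE` only works for a single-measure query.

```python
# Python mirror of the `orders` source shown above (joins omitted for brevity).
orders_source = {
    "name": "orders",
    "table": "public.orders",
    "measures": [
        {"name": "revenue", "expr": "sum(amount)", "filter": "status != 'refunded'"},
    ],
}


def render_measure_sql(source, measure_name):
    """Build a single-measure SELECT from a source definition (illustrative)."""
    measure = next(m for m in source["measures"] if m["name"] == measure_name)
    sql = f"SELECT {measure['expr']} AS {measure['name']} FROM {source['table']}"
    if measure.get("filter"):
        # Simplification: a real planner would scope the filter per measure.
        sql += f" WHERE {measure['filter']}"
    return sql


print(render_measure_sql(orders_source, "revenue"))
# SELECT sum(amount) AS revenue FROM public.orders WHERE status != 'refunded'
```

The point of the sketch is that the refund rule lives in the definition once, so an agent asking for `revenue` does not have to rediscover it.
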
## Wiki pages

Wiki pages capture the context that does not belong in a measure formula:
business definitions, reporting policy, known data issues, metric caveats, and
links back to semantic sources.

| Put it in YAML | Put it in Markdown |
|----------------|--------------------|
| `sum(amount)` | "Net revenue excludes successful refunds." |
| `many_to_one` join metadata | "Use contract segment for board reporting." |
| Row grain and column types | "February had a one-time refund anomaly." |
| Default time dimension | "Finance owns ARR definitions." |

## How KTX compares

KTX overlaps with semantic layers, but the product boundary is broader: it gives
agents a reviewable context workspace, not only a metric runtime.

| Dimension | KTX | MetricFlow / Cube / Malloy |
|-----------|-----|-----------------------------|
| **Primary surface** | Plain YAML and Markdown files | Modeling language, project runtime, or API surface |
| **Models** | Sources, joins, grain, measures, filters, wiki refs, and provenance | Metrics, dimensions, joins, queries, and generated SQL |
| **Agent edit loop** | First-class: patch files, validate, inspect SQL, and review git diffs | Possible, but usually tied to the tool's modeling workflow |
| **Surrounding context** | Built in through wiki pages, scans, transcripts, and source evidence | Usually descriptions, annotations, metadata, or app-specific context |
| **Best fit** | Agents maintaining analytics context and SQL-facing definitions | Teams standardizing metrics, BI APIs, semantic runtimes, or exploratory modeling |

If you already use MetricFlow, LookML, dbt, or BI tools, KTX can ingest that
context and turn it into agent-readable files. You do not need to replace your
serving layer to give agents a better working surface.

## Plain files

A KTX project is a directory of readable files. Semantic sources and wiki pages
are committed to git; local indexes and caches stay under `.ktx/`.

```text
my-project/
├── ktx.yaml
├── semantic-layer/
│   └── warehouse/
│       ├── orders.yaml
│       └── customers.yaml
├── wiki/
│   └── global/
│       ├── revenue.md
│       └── segment-classification.md
├── raw-sources/
│   └── warehouse/
└── .ktx/                  # local state, git-ignored
```

This keeps analytics context close to the code review workflow:

- branch context changes;
- review YAML and Markdown diffs;
- merge accepted definitions;
- let agents read the updated source of truth.

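
Because the context is plain text, a change to a definition shows up as an ordinary line diff. The sketch below uses Python's standard `difflib` to show what a reviewer might see when a refund filter is added to a measure; the before and after snippets are hypothetical and the file path is taken from the example tree above.

```python
import difflib

# Hypothetical "before" and "after" versions of a measure definition.
before = """\
measures:
  - name: revenue
    expr: sum(amount)
""".splitlines(keepends=True)

after = """\
measures:
  - name: revenue
    expr: sum(amount)
    filter: "status != 'refunded'"
""".splitlines(keepends=True)

# Unified diff in the same shape a git review would present.
diff = list(
    difflib.unified_diff(
        before,
        after,
        fromfile="a/semantic-layer/warehouse/orders.yaml",
        tofile="b/semantic-layer/warehouse/orders.yaml",
    )
)
print("".join(diff), end="")
```

The one added `filter` line is the whole change under review, which is what makes a definition update easy to approve or reject.
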
## Agent usage notes

Use this page when an agent needs to explain why KTX exists, why schema-only
database access is not enough, or how KTX differs from traditional semantic
layers.

| Agent task | Relevant section | Next page |
|------------|------------------|-----------|
| Explain why a database agent wrote a plausible but wrong query | Why agents need context | [Writing Context](/docs/guides/writing-context) |
| Decide whether a fact belongs in YAML or Markdown | Semantic sources / Wiki pages | [Writing Context](/docs/guides/writing-context) |
| Compare KTX to another semantic layer | How KTX compares | [Primary Sources](/docs/integrations/primary-sources) |
| Explain reviewability and source of truth | Plain files | [Context as Code](/docs/concepts/context-as-code) |