ktx/docs-site/content/docs/concepts/the-context-layer.mdx

---
title: The Context Layer
description: What a context layer is, why agents need one, and the YAML and Markdown surfaces KTX writes to disk.
---

A context layer is the trusted knowledge surface that sits between your data
stack and the agents that query it. It holds the things a database connection
can't tell an agent on its own: which metrics are canonical, which joins are
safe, what your team means by "active customer", and where every definition
came from.

KTX builds that layer as plain files - YAML, Markdown, and JSON - that agents
can search and humans can review. This page covers what's in it, why agents
need it, and how it compares to other semantic tooling.

## Database access isn't enough

Hand an agent a database connection and it can run SQL. It still has to guess
the part that matters: which table is the source of truth, which join is the
one analysts actually use, and what definition the business agreed on. Plausible
SQL becomes wrong SQL fast.

| Schema-only access gives the agent | What it still doesn't know |
|------------------------------------|----------------------------|
| Tables, columns, and types | Which table is canonical for revenue |
| Primary and foreign keys | Which join is safe and which fans out measures |
| Sample rows | Which rows are test accounts the team excludes |
| `orders.amount` exists | That `amount` includes refunds unless filtered |
| A `customers.segment` column | That `legacy_segments` is stale even though it exists |
| Column comments, sometimes | The board-approved definition of ARR |

Schema is a starting point, not a contract. The context layer is the contract.
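
To make the `orders.amount` row concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The three rows and the `paid` status value are made up for illustration; only the `status != 'refunded'` filter mirrors the measure defined later on this page.

```python
import sqlite3

# In-memory table with made-up rows: two paid orders and one refunded order.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'paid', 100.0),
        (2, 'paid', 50.0),
        (3, 'refunded', 40.0);
    """
)

# What a schema-only agent might plausibly write.
plausible = conn.execute("SELECT sum(amount) FROM orders").fetchone()[0]

# What the agreed definition requires.
agreed = conn.execute(
    "SELECT sum(amount) FROM orders WHERE status != 'refunded'"
).fetchone()[0]

print(plausible, agreed)  # 190.0 150.0
```

Both queries are valid SQL against the same schema. Only the second matches the definition the team agreed on, and nothing in the schema says which one that is.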

## The two pillars

A KTX project has two committed surfaces, each tuned for a different question.
Structured data lives where it can be compiled. Prose lives where it can be
searched. Wiki pages cross-reference semantic sources by name, so every metric
caveat stays anchored to the definition it explains.

<figure
  className="not-prose my-10 overflow-hidden rounded-lg border border-fd-border bg-fd-card shadow-sm"
  aria-label="The two committed pillars of a KTX context layer"
>
  <div className="border-b border-fd-border bg-fd-muted/35 px-5 py-4">
    <p className="text-[11px] font-semibold uppercase tracking-[0.08em] text-fd-primary">
      {"Anatomy of a context layer"}
    </p>
    <h3
      className="mt-1 text-base font-semibold tracking-normal text-fd-foreground sm:text-lg"
      style={{ fontFamily: "var(--font-display)" }}
    >
      {"Two files, two jobs"}
    </h3>
    <p className="mt-2 max-w-3xl text-xs leading-5 text-fd-muted-foreground">
      {"YAML for what the warehouse can execute. Markdown for what the team needs to interpret it. Both are committed to git and reviewed like code."}
    </p>
  </div>

  <div className="grid gap-px bg-fd-border md:grid-cols-2">
    <div className="bg-fd-card p-6" style={{ borderTop: "3px solid #3b82f6" }}>
      <div className="flex items-center justify-between gap-2">
        <p className="font-mono text-[14px] font-semibold tracking-tight" style={{ color: "#3b82f6" }}>
          {"semantic-layer/**/*.yaml"}
        </p>
        <span className="rounded border border-fd-border bg-fd-background px-1.5 py-0.5 text-[10px] font-semibold uppercase tracking-[0.08em] text-fd-muted-foreground">
          {"committed"}
        </span>
      </div>
      <p className="mt-3 text-[19px] font-semibold leading-7 text-fd-foreground" style={{ fontFamily: "var(--font-display)" }}>
        {"Semantic sources"}
      </p>
      <div className="mt-2 flex flex-wrap gap-1.5">
        <span className="rounded border border-fd-border bg-fd-background px-2 py-0.5 text-[11.5px] text-fd-muted-foreground">{"structured"}</span>
        <span className="rounded border border-fd-border bg-fd-background px-2 py-0.5 text-[11.5px] text-fd-muted-foreground">{"executable"}</span>
      </div>
      <p className="mt-3.5 text-[13.5px] leading-6 text-fd-muted-foreground">
        {"Tables, grain, joins, measures, dimensions, filters, and segments. The compiler turns these into dialect-correct SQL."}
      </p>
      <p className="mt-4 text-[11px] uppercase tracking-[0.08em] text-fd-muted-foreground">
        <span className="text-fd-foreground">{"Answers: "}</span>
        {"how do I query this safely?"}
      </p>
    </div>

    <div className="bg-fd-card p-6" style={{ borderTop: "3px solid #10b981" }}>
      <div className="flex items-center justify-between gap-2">
        <p className="font-mono text-[14px] font-semibold tracking-tight" style={{ color: "#10b981" }}>
          {"wiki/**/*.md"}
        </p>
        <span className="rounded border border-fd-border bg-fd-background px-1.5 py-0.5 text-[10px] font-semibold uppercase tracking-[0.08em] text-fd-muted-foreground">
          {"committed"}
        </span>
      </div>
      <p className="mt-3 text-[19px] font-semibold leading-7 text-fd-foreground" style={{ fontFamily: "var(--font-display)" }}>
        {"Wiki pages"}
      </p>
      <div className="mt-2 flex flex-wrap gap-1.5">
        <span className="rounded border border-fd-border bg-fd-background px-2 py-0.5 text-[11.5px] text-fd-muted-foreground">{"free-form"}</span>
        <span className="rounded border border-fd-border bg-fd-background px-2 py-0.5 text-[11.5px] text-fd-muted-foreground">{"searchable"}</span>
      </div>
      <p className="mt-3.5 text-[13.5px] leading-6 text-fd-muted-foreground">
        {"Definitions, caveats, policies, and decisions. Frontmatter links each page back to the semantic sources it explains."}
      </p>
      <p className="mt-4 text-[11px] uppercase tracking-[0.08em] text-fd-muted-foreground">
        <span className="text-fd-foreground">{"Answers: "}</span>
        {"what does this mean to the business?"}
      </p>
    </div>
  </div>

  <figcaption className="border-t border-fd-border bg-fd-muted/25 px-5 py-3 text-[11.5px] leading-5 text-fd-muted-foreground">
    <span className="font-medium text-fd-foreground">{"Behind the scenes. "}</span>
    {"KTX also keeps scan snapshots and a per-run event log locally so every committed change is traceable to its evidence. You don't read or edit these files yourself - see "}
    <a href="/docs/concepts/context-as-code" className="font-medium underline">{"Context as Code"}</a>
    {" for how that audit trail flows into review."}
  </figcaption>
</figure>

## Semantic sources

Semantic sources describe a table the way an agent can reason about it: row
grain, typed columns, named measures, valid joins, filters, and segments. The
planner compiles these declarations, and only these, into SQL.

```yaml
# semantic-layer/warehouse/orders.yaml
name: orders
table: public.orders
grain: [id]
columns:
  - name: id
    type: number
  - name: status
    type: string
  - name: amount
    type: number
  - name: customer_id   # join key used below; type assumed
    type: number
measures:
  - name: total_revenue
    expr: sum(amount)
    filter: "status != 'refunded'"
joins:
  - to: customers
    "on": customer_id = customers.id
    relationship: many_to_one
```
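
As a rough illustration of what "compiling" a declared measure can mean, the sketch below renders `total_revenue` from an in-memory copy of the source above. The `compile_measure` function and the dictionary shape are hypothetical and are not KTX's compiler or API; they only show that the SQL is derived from the declaration rather than guessed.

```python
# Hypothetical in-memory form of semantic-layer/warehouse/orders.yaml.
ORDERS = {
    "name": "orders",
    "table": "public.orders",
    "measures": [
        {
            "name": "total_revenue",
            "expr": "sum(amount)",
            "filter": "status != 'refunded'",
        },
    ],
}


def compile_measure(source: dict, measure_name: str) -> str:
    """Render one declared measure as a SELECT statement.

    Undeclared measure names raise KeyError instead of producing SQL.
    """
    measures = {m["name"]: m for m in source["measures"]}
    measure = measures[measure_name]
    sql = f'SELECT {measure["expr"]} AS {measure["name"]} FROM {source["table"]}'
    if measure.get("filter"):
        sql += f' WHERE {measure["filter"]}'
    return sql


sql = compile_measure(ORDERS, "total_revenue")
print(sql)
# SELECT sum(amount) AS total_revenue FROM public.orders WHERE status != 'refunded'
```

Asking this sketch for a measure that was never declared fails loudly, which is the property that matters: an agent cannot get SQL for a definition nobody reviewed.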

For how the compiler walks the join graph, handles fan-out, and transpiles
dialects, read [Semantic Querying](/docs/concepts/semantic-layer-internals).
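
Fan-out is easiest to see with numbers. The sketch below uses Python's built-in `sqlite3` module and a hypothetical `shipments` table (not part of the example project) where one order can have several rows. The last query shows one common way to avoid the problem, which is to keep the measure at its own grain; it is not a claim about how KTX's compiler handles it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
    -- Hypothetical one-to-many child table: order 1 ships in two parcels.
    CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
    """
)

# Baseline: the measure at the grain of orders.
true_total = conn.execute("SELECT sum(amount) FROM orders").fetchone()[0]

# Naive join: order 1 appears twice, so its amount is counted twice.
fanned_out = conn.execute(
    """
    SELECT sum(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.id
    """
).fetchone()[0]

# Keep orders at their own grain and use shipments only as a filter.
safe_total = conn.execute(
    """
    SELECT sum(amount)
    FROM orders
    WHERE id IN (SELECT order_id FROM shipments)
    """
).fetchone()[0]

print(true_total, fanned_out, safe_total)  # 150.0 250.0 150.0
```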

## Wiki pages

Wiki pages hold the context that doesn't belong in a formula: business
definitions, reporting policy, anomalies, and metric caveats. Each page links
back to the semantic sources it explains through frontmatter.

```markdown
# wiki/global/revenue.md
---
summary: Paid order value after refunds
tags: [finance, orders]
sl_refs: [warehouse.orders]
refs: [segment-classification]
usage_mode: auto
---

Revenue is paid order amount after refund adjustments.

Use `orders.total_revenue` for recognized order value and
`orders.order_count` for paid order volume.
```
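
To show how little machinery the frontmatter link needs, here is a minimal sketch that pulls `sl_refs` and `refs` out of a page. The `read_frontmatter` helper is hypothetical and handles only the flat `key: value` and inline-list forms used above; a real project would more likely use a full YAML parser.

```python
PAGE = """---
summary: Paid order value after refunds
tags: [finance, orders]
sl_refs: [warehouse.orders]
refs: [segment-classification]
usage_mode: auto
---

Revenue is paid order amount after refund adjustments.
"""


def read_frontmatter(text: str) -> tuple:
    """Split a page into (metadata dict, body text).

    Supports only flat keys with scalar or inline-list values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = lines.index("---", 1)  # raises ValueError if the block is unterminated
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = read_frontmatter(PAGE)
print(meta["sl_refs"], meta["refs"])
```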

### A navigable graph

Those two reference fields - `sl_refs` from a wiki page to a semantic source,
and `refs` from a wiki page to other wiki pages - turn the context layer into
a graph agents traverse. An agent that finds this page while searching for
"revenue" follows `sl_refs` straight to `orders.total_revenue` for the
executable definition, then walks `refs` to related policies without rerunning
search.
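
The traversal described above can be sketched as a breadth-first walk over `refs`, collecting `sl_refs` along the way. The `PAGES` dictionary and the `walk` function are hypothetical, and the fields on the `segment-classification` page are assumed for the example; this is not how KTX stores or serves pages.

```python
from collections import deque

# Hypothetical in-memory index of wiki frontmatter, keyed by page name.
PAGES = {
    "revenue": {
        "sl_refs": ["warehouse.orders"],
        "refs": ["segment-classification"],
    },
    # Assumed fields: this page's frontmatter is not shown in the docs.
    "segment-classification": {
        "sl_refs": ["warehouse.customers"],
        "refs": ["revenue"],
    },
}


def walk(start: str, pages: dict) -> tuple:
    """Follow refs from one page; return (pages visited, sources reached)."""
    seen, sources = [], []
    queue = deque([start])
    while queue:
        name = queue.popleft()
        if name in seen or name not in pages:
            continue  # skip cycles and references to unknown pages
        seen.append(name)
        for source in pages[name]["sl_refs"]:
            if source not in sources:
                sources.append(source)
        queue.extend(pages[name]["refs"])
    return seen, sources


visited, reached = walk("revenue", PAGES)
print(visited)  # ['revenue', 'segment-classification']
print(reached)  # ['warehouse.orders', 'warehouse.customers']
```

The two pages reference each other, so the `seen` check is what keeps the walk from looping.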

The graph only helps if the edges stay live. KTX validates references when
wiki pages are written and prunes `sl_refs` during ingest when their target
sources are deleted or their measures are renamed - so a stale page can never
quietly route an agent to a definition that no longer exists.
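
A minimal sketch of the pruning idea, assuming an in-memory index of page frontmatter and a set of source names that currently exist. The `prune_sl_refs` function and the `churn` page are hypothetical; KTX's actual ingest logic is not shown on this page.

```python
def prune_sl_refs(pages: dict, known_sources: set) -> dict:
    """Remove sl_refs whose target source no longer exists.

    Mutates `pages` in place and returns the dropped refs per page,
    so the change can be surfaced for review.
    """
    dropped = {}
    for name, meta in pages.items():
        dead = [ref for ref in meta.get("sl_refs", []) if ref not in known_sources]
        if dead:
            meta["sl_refs"] = [ref for ref in meta["sl_refs"] if ref in known_sources]
            dropped[name] = dead
    return dropped


pages = {
    "revenue": {"sl_refs": ["warehouse.orders"]},
    # Hypothetical page pointing at a source that has since been deleted.
    "churn": {"sl_refs": ["warehouse.orders", "warehouse.legacy_segments"]},
}
known = {"warehouse.orders", "warehouse.customers"}

dropped = prune_sl_refs(pages, known)
print(dropped)  # {'churn': ['warehouse.legacy_segments']}
```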

The split between the two pillars is sharp:

| Put it in YAML | Put it in Markdown |
|----------------|--------------------|
| `sum(amount)` | "Net revenue excludes successful refunds." |
| `many_to_one` join metadata | "Use the contract segment for board reporting." |
| Row grain and column types | "February had a one-time refund anomaly." |
| Default time dimension | "Finance owns ARR definitions." |

If a fact changes how the SQL runs, it goes in YAML. If a human needs it to
trust the answer, it goes in Markdown.

## How KTX compares

Two adjacent product categories each cover part of this problem, and each
leaves a different gap.

**Company brains** (Glean, Notion AI, the search-over-everything tools) index
your wikis, docs, and chats so an agent can find context fast. They aren't
built for data stacks: there's no join graph, no canonical metrics, and no way
to compile a question into safe SQL. An agent reading them still has to guess
how to query the warehouse.

**Traditional semantic layers** (MetricFlow, Cube, Malloy) solve that side.
They give agents reviewable metric definitions and a compiler that produces
correct SQL. The cost is maintenance - models, joins, and dimensions are
hand-written, and the layer doesn't learn from the warehouse, BI tools, or
query history that surround it. The business context that explains *why* a
definition exists usually lives somewhere else.

KTX bundles both surfaces - wiki for business context, semantic layer for
queryable definitions - and keeps them current by reading the data stack and
reconciling new evidence with the reviewed files. You get the breadth of a
knowledge tool and the SQL safety of a semantic layer, without rewriting
models every time the warehouse changes.

| Capability | Company brain | Semantic layer | KTX |
|------------|---------------|----------------|-----|
| **Surface** | Indexed docs and chats | Modeling language or runtime | YAML and Markdown files |
| **Data-stack awareness** | None - treats data tools as text | High for declared metrics, none for the surrounding warehouse | Built in: scans schemas, dbt, BI tools, and query history |
| **Maintenance** | Manual page authoring | Manual modeling, model-per-change | Auto-maintained: reconciles evidence with accepted files |
| **SQL safety** | None - generates plausible text | Compiled, dialect-correct | Compiled with join-graph and fan-out handling |
| **Agent edit loop** | Text-only | Tied to the modeling workflow | First-class: patch files, validate, review diffs |

If you already use MetricFlow, LookML, dbt, or BI tools, KTX can ingest that
context and turn it into agent-readable files. You don't need to replace your
serving layer to give agents a better working surface.

## A KTX project on disk

A KTX project is a directory of readable files. Semantic sources and wiki
pages are committed to git; everything else KTX needs at runtime stays local
and out of the repo.

```text
my-project/
├── ktx.yaml                              # project config and connections
├── semantic-layer/
│   └── warehouse/
│       ├── orders.yaml
│       └── customers.yaml
├── wiki/
│   └── global/
│       ├── revenue.md
│       └── segment-classification.md
└── .ktx/                                 # local runtime state, git-ignored
```

This keeps analytics context close to the code review workflow: branch context
changes, review YAML and Markdown diffs, merge accepted definitions, and let
agents read the updated source of truth.
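
Because the layer is plain files, ordinary tooling can enumerate it. The sketch below rebuilds the tree above in a temporary directory and lists the two committed surfaces with `pathlib` globs. The `committed_surfaces` helper and the `.ktx/state.json` file name are hypothetical stand-ins.

```python
import tempfile
from pathlib import Path

LAYOUT = [
    "ktx.yaml",
    "semantic-layer/warehouse/orders.yaml",
    "semantic-layer/warehouse/customers.yaml",
    "wiki/global/revenue.md",
    "wiki/global/segment-classification.md",
    ".ktx/state.json",  # hypothetical name standing in for local runtime state
]


def committed_surfaces(root: Path) -> dict:
    """List semantic sources and wiki pages, relative to the project root."""

    def rel(pattern: str) -> list:
        return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))

    return {
        "semantic": rel("semantic-layer/**/*.yaml"),
        "wiki": rel("wiki/**/*.md"),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in LAYOUT:
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    surfaces = committed_surfaces(root)

print(surfaces)
```

The two glob patterns are the same ones shown in the figure near the top of this page, and nothing under `.ktx/` matches either of them.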

## Agent usage notes

Use this page when an agent needs to explain why KTX exists, why schema-only
database access isn't enough, or how KTX differs from traditional semantic
layers.

| Agent task | Relevant section | Next page |
|------------|------------------|-----------|
| Explain why a database agent wrote a plausible but wrong query | Database access isn't enough | [Writing Context](/docs/guides/writing-context) |
| Decide whether a fact belongs in YAML or Markdown | Semantic sources / Wiki pages | [Writing Context](/docs/guides/writing-context) |
| Compare KTX to another semantic layer | How KTX compares | [Primary Sources](/docs/integrations/primary-sources) |
| Explain reviewability and source of truth | A KTX project on disk | [Context as Code](/docs/concepts/context-as-code) |