---
title: Semantic Layer Internals
description: How KTX uses join graphs, grain, and relationship metadata to turn context into safe SQL.
---

KTX is a context layer for agents. This page focuses on one internal subsystem: the semantic execution layer that turns reviewed context into safe SQL.

The semantic layer is important, but it is not the whole product. KTX also handles schema evidence, wiki context, provenance, validation, and agent workflows around those files.

Read the page as a pipeline:

- context inputs feed the semantic engine;
- evidence becomes a join graph with grain and relationship metadata;
- review and corrections keep that graph current;
- the execution engine uses the graph to avoid fan-out and ambiguous joins.

## Where the semantic layer fits

The semantic layer is not a separate product category inside KTX. It is the engine that makes the rest of the context actionable for SQL generation.

- Context inputs
  - `semantic-layer/`: source YAML, measures, joins, grain
  - `wiki/`: business rules, definitions, caveats
  - `raw-sources/`: schema scans, keys, imported metadata
  - provenance: ingest decisions and review history
- Semantic layer engine (safe query planning before SQL is generated)
  - Join graph: sources as nodes, joins as typed edges
  - Grain: row identity before aggregation
  - Measures: verified formulas and filters
  - Relationships: `many_to_one`, `one_to_many`, `one_to_one`
- Agent workflows
  - Search sources and wiki pages
  - Compile trusted SQL
  - Explain metrics and provenance
  - Patch files and validate review

## The join graph KTX builds

A semantic source is a node. A join is an edge with a join condition and a relationship type. The graph lets KTX choose valid paths, reject unsafe paths, and reason about whether a join preserves or multiplies rows before SQL is generated.

- `many_to_one` paths are usually safe for adding dimensions.
- `one_to_many` paths can multiply fact rows and trigger fan-out handling.
- Equal-cost paths can be ambiguous, so aliases and explicit joins matter.

Example graph:

- `customers` (grain: `customer_id`)
- `orders` (grain: `order_id`)
- `order_items` (grain: `order_id`, `line_id`)
- `orders -> customers`: `many_to_one`
- `orders -> order_items`: `one_to_many`

Example: `refunds` joins to `orders`. Used carefully, it explains net revenue. Joined naively, it can duplicate order-level measures.
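The example graph can be sketched in a few lines of Python. This is a minimal illustration, not KTX's data model: the `Join` record, the `build_graph` helper, and the join-condition column names are all assumptions made for the example. Each join is stored as a typed edge, and the reverse edge is derived with the inverted relationship type.

```python
from dataclasses import dataclass

# Hypothetical types for illustration; not KTX's real data model.
INVERSE = {
    "many_to_one": "one_to_many",
    "one_to_many": "many_to_one",
    "one_to_one": "one_to_one",
}


@dataclass(frozen=True)
class Join:
    left: str          # source the edge starts from
    right: str         # source the edge points to
    on: str            # join condition as SQL text (column names assumed)
    relationship: str  # many_to_one, one_to_many, or one_to_one

    def reverse(self) -> "Join":
        """Same join seen from the other side, with the inverted type."""
        return Join(self.right, self.left, self.on, INVERSE[self.relationship])


def build_graph(joins):
    """Map each source to its outgoing edges, adding the reverse edges."""
    graph = {}
    for join in joins:
        for edge in (join, join.reverse()):
            graph.setdefault(edge.left, []).append(edge)
    return graph


JOINS = [
    Join("orders", "customers",
         "orders.customer_id = customers.customer_id", "many_to_one"),
    Join("orders", "order_items",
         "orders.order_id = order_items.order_id", "one_to_many"),
]

graph = build_graph(JOINS)
for source in sorted(graph):
    for edge in graph[source]:
        print(f"{source} -> {edge.right}: {edge.relationship}")
```

Keeping the relationship type on the edge, rather than on the pair of sources, is what lets a planner tell a row-preserving hop from a row-multiplying one in either direction.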

The graph is bidirectional for planning. If `orders -> customers` is `many_to_one`, the reverse path is `one_to_many`; KTX keeps that distinction instead of treating every join as a neutral edge.

## How KTX builds the graph

KTX starts from evidence, not a blank modeling canvas. Database scans and analytics-tool imports create source definitions that an analyst can review.

| Evidence | What it contributes |
|---|---|
| Declared primary keys | Initial row grain for each source |
| Declared foreign keys | Formal join candidates and relationship direction |
| Inferred relationships | Useful edges when warehouses lack constraints |
| dbt, MetricFlow, and LookML imports | Existing metrics, dimensions, entities, explores, and joins |
| Query history | Real join and filter patterns agents should respect |
| Analyst review | The final authority before context is merged |

Generated YAML is intentionally reviewable. KTX can draft joins and measures, but the accepted semantic layer is still the plain-file diff your team approves.

## How KTX keeps the graph current

The semantic layer changes as schemas, metrics, and business rules change. KTX keeps that loop explicit instead of hiding it behind a remote runtime.

Semantic maintenance loop:

1. Ingest evidence: scan schemas, imports, and accepted files.
2. YAML diff: draft source, join, grain, and measure changes.
3. Validation: check relationships, syntax, and unsafe query shapes.
4. Analyst review: accept, edit, or reject generated context.
5. Agent use: serve context to search, explain, and query.
6. Corrections: agent and analyst fixes become new evidence.

The loop produces reviewed context. Every accepted correction becomes input to the next graph build, and the accepted graph becomes the starting point for the next build.
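As one illustration of what "check relationships" in the validation step could mean, the sketch below tests a declared `many_to_one` join against sample rows. This page does not describe KTX's actual validation rules, so the `check_many_to_one` function and its row format are hypothetical. The check relies on a general property: a `many_to_one` join from left to right only holds if the right-side key is unique.

```python
from collections import Counter


def check_many_to_one(right_rows, right_key):
    """Return problems that contradict a declared many_to_one join.

    Hypothetical helper, not a KTX API. If the right-side key repeats,
    one left row can match several right rows and the join fans out.
    """
    counts = Counter(row[right_key] for row in right_rows)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    if duplicates:
        return [f"{right_key} is not unique on the right side: {duplicates}"]
    return []


customers = [{"customer_id": 1}, {"customer_id": 2}]
print(check_many_to_one(customers, "customer_id"))  # []

# A duplicated customer row would make orders -> customers multiply rows.
customers_with_dup = customers + [{"customer_id": 2}]
print(check_many_to_one(customers_with_dup, "customer_id"))
```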

This matters because semantic correctness is not static. If a source gains a new key, a metric changes definition, or an analyst corrects a relationship, the next agent gets that reviewed context.

## The modeling problem the graph solves

Fan-out is the classic failure mode. If an order-level measure is joined to line-item rows before aggregation, one order can become many rows and revenue can be counted more than once.

| Problem | What happens | How KTX avoids it |
|---|---|---|
| Order measure joins to `order_items` | `orders.revenue` repeats once per item | Detect the `one_to_many` path and pre-aggregate the order measure |
| Two independent fact sources share `customers` | Measures from each fact table multiply across the shared dimension | Treat it as a chasm trap and use aggregate-locality planning |
| Filter lives only across a `one_to_many` path | Filtering after the join changes the measure grain | Reject or localize the filter instead of silently producing unsafe SQL |
| Multiple equal-cost paths connect the same sources | The join path is ambiguous | Prefer safer paths and use aliases to disambiguate repeated joins |

Many-to-many questions usually show up as multiple one-to-many paths or independent fact sources. KTX treats those shapes as fan-out or chasm risks unless the query can be planned at a safe grain.

## How the execution engine uses the graph

The planner resolves the sources in a semantic query, chooses a join tree, and checks whether any requested dimension or filter crosses a row-multiplying edge. The SQL generator then chooses the simple path or the aggregate-locality path.

| Naive SQL shape | Semantic-layer SQL shape |
|---|---|
| Join facts and dimensions first, then aggregate | Aggregate each fact source at its own grain, then join the results |
| Put every filter in one outer `WHERE` clause | Keep measure filters with the measure source when locality is needed |
| Trust the shortest textual join path | Prefer safe relationship paths and reject disconnected sources |
| Let dimension grain differ across facts | Raise when asymmetric dimensions would fan out another measure |
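The planning decision described in this section can be sketched as a small graph search. This is an assumption-laden illustration rather than KTX's planner: the edge list, `find_path`, and `plan` are hypothetical names, and a real planner would also handle filters, path cost, and aliases. The sketch finds a join path from the measure source to each requested dimension, rejects disconnected sources, and picks the aggregate-locality shape whenever a path crosses a `one_to_many` edge.

```python
from collections import deque

# Hypothetical edge list: (from_source, to_source, relationship).
EDGES = [
    ("orders", "customers", "many_to_one"),
    ("orders", "order_items", "one_to_many"),
]
INVERSE = {
    "many_to_one": "one_to_many",
    "one_to_many": "many_to_one",
    "one_to_one": "one_to_one",
}


def neighbors(source):
    """Yield (next_source, relationship) for edges in both directions."""
    for left, right, rel in EDGES:
        if left == source:
            yield right, rel
        elif right == source:
            yield left, INVERSE[rel]


def find_path(start, goal):
    """Breadth-first search; returns (source, relationship) hops or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, rel in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(nxt, rel)]))
    return None


def plan(measure_source, dimension_sources):
    """Pick a SQL shape for one measure source and the sources it must reach."""
    for dim in dimension_sources:
        path = find_path(measure_source, dim)
        if path is None:
            raise ValueError(f"{dim} is not connected to {measure_source}")
        if any(rel == "one_to_many" for _, rel in path):
            return "aggregate_locality"
    return "simple"


print(plan("orders", ["customers"]))                 # simple
print(plan("orders", ["customers", "order_items"]))  # aggregate_locality
print(plan("order_items", ["customers"]))            # simple
```

The third call shows why direction matters: from `order_items`, every hop toward `customers` is `many_to_one`, so a line-item measure can pick up customer dimensions without pre-aggregation.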

{"Unsafe shape"}

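{`select customers.segment, sum(orders.amount)
from orders
  join order_items on order_items.order_id = orders.order_id
  join customers on customers.customer_id = orders.customer_id
group by customers.segment`}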

{"The order measure is exposed to line-item fan-out before aggregation."}

{"KTX shape"}

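{`with orders_agg as (
  select customer_id, sum(amount) as revenue
  from orders
  group by customer_id
)
select customers.segment, sum(orders_agg.revenue) as revenue
from orders_agg
join customers on customers.customer_id = orders_agg.customer_id
group by customers.segment`}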

{"KTX pre-aggregates fact measures at their own grain before joining dimensions."}
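The difference between the two shapes is easy to reproduce with Python's built-in `sqlite3` module. The sketch below uses made-up rows and assumed column names (`customer_id`, `order_id`, `line_id`, `segment`, `amount`); it runs hand-written SQL and does not call KTX. One order with three line items is enough to triple its revenue in the unsafe shape.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table customers (customer_id integer primary key, segment text);
create table orders (order_id integer primary key, customer_id integer, amount real);
create table order_items (order_id integer, line_id integer,
                          primary key (order_id, line_id));
insert into customers values (1, 'enterprise'), (2, 'smb');
insert into orders values (10, 1, 100.0), (11, 2, 50.0);
insert into order_items values (10, 1), (10, 2), (10, 3), (11, 1);
""")

# Unsafe shape: the order measure is joined to line items before aggregation.
unsafe = con.execute("""
select customers.segment, sum(orders.amount)
from orders
join order_items on order_items.order_id = orders.order_id
join customers on customers.customer_id = orders.customer_id
group by customers.segment
order by customers.segment
""").fetchall()

# Pre-aggregated shape: aggregate the fact first, then join the dimension.
safe = con.execute("""
with orders_agg as (
  select customer_id, sum(amount) as revenue
  from orders
  group by customer_id
)
select customers.segment, sum(orders_agg.revenue)
from orders_agg
join customers on customers.customer_id = orders_agg.customer_id
group by customers.segment
order by customers.segment
""").fetchall()

print(unsafe)  # [('enterprise', 300.0), ('smb', 50.0)]
print(safe)    # [('enterprise', 100.0), ('smb', 50.0)]
```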

The result is not magic. It is structured planning: validated sources, typed relationships, graph search, fan-out detection, aggregate locality, and final dialect transpilation.

## What this means for agents

KTX gives agents a semantic surface they can inspect and improve, not just a folder of notes.

- Search semantic sources and related wiki pages before writing SQL.
- Compile SQL through `ktx sl query` instead of guessing joins.
- Validate semantic-layer changes before review.
- Patch YAML and Markdown files in git.
- Explain metric meaning and provenance from the same accepted context.

Next, read [Writing Context](/docs/guides/writing-context) for the YAML editing workflow or [ktx sl](/docs/cli-reference/ktx-sl) for the command reference.