--- title: Writing Context description: Edit semantic sources and wiki pages so agents use your business logic. --- KTX context is meant to be edited. Ingest gives you a grounded first draft, then you refine source YAML and wiki Markdown until agents can answer data questions with the same definitions your team uses. Use this guide when you are adding measures, fixing joins, documenting business rules, or reviewing context changes made by an agent. ## Editing workflow Use this order for most context changes: 1. Discover existing context. ```bash ktx sl list --json ktx sl search "revenue" --json ktx wiki search "revenue recognition" --json --limit 10 ``` 2. Edit the smallest relevant files under `semantic-layer//` or `wiki/`. 3. Validate semantic source changes. ```bash ktx sl validate orders --connection-id warehouse ``` 4. Compile a representative query before executing it. ```bash ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.created_date \ --format sql ``` 5. Search again using likely user wording to confirm the new context is discoverable. ## Semantic sources Semantic sources are YAML files that describe queryable entities. A source is usually a table, but it can also point at a custom SQL expression. Sources define the vocabulary agents use for measures, dimensions, segments, joins, and grain-aware query planning. Source files live at: ```text semantic-layer//.yaml ``` ### Minimal source ```yaml name: orders descriptions: user: Customer orders with booked revenue. table: public.orders grain: - order_id columns: - name: order_id type: string descriptions: user: Unique order identifier. - name: order_date type: time role: time descriptions: user: Date the order was placed. - name: total_amount type: number descriptions: user: Booked order value in USD. measures: - name: total_revenue expr: SUM(total_amount) description: Sum of booked order value before refunds. ``` ### Full source shape ```yaml name: orders descriptions: user: Customer orders with line-item totals. table: public.orders grain: - order_id columns: - name: order_id type: string descriptions: user: Unique order identifier. - name: order_date type: time role: time descriptions: user: Date the order was placed. - name: status type: string visibility: public descriptions: user: Current order status. - name: _etl_loaded_at type: time visibility: hidden descriptions: user: Internal load timestamp. - name: total_amount type: number descriptions: user: Order total in USD. measures: - name: total_revenue expr: SUM(total_amount) description: Sum of all order values. - name: order_count expr: COUNT(DISTINCT order_id) description: Number of distinct orders. - name: avg_order_value expr: AVG(total_amount) description: Average booked order value. - name: high_value_revenue expr: SUM(total_amount) filter: total_amount > 100 description: Revenue from orders over $100. segments: - name: completed_orders expr: status = 'completed' description: Orders that completed fulfillment. joins: - to: customers on: orders.customer_id = customers.customer_id relationship: many_to_one - to: order_items on: orders.order_id = order_items.order_id relationship: one_to_many alias: items ``` ### Source fields | Field | Required | Description | |-------|----------|-------------| | `name` | Yes | Source identifier. Use lowercase words and underscores. | | `descriptions` | No | Description map keyed by source, such as `user`, `dbt`, or `ai`. | | `table` or `sql` | Yes | Database table or custom SQL expression. Use exactly one. | | `grain` | Yes | Columns that uniquely identify a row at the source grain. | | `columns` | Yes | Non-empty column definitions with type, role, visibility, and descriptions. | | `measures` | No | Aggregation expressions such as `SUM`, `COUNT`, and `AVG`. | | `segments` | No | Named predicates agents can reuse. | | `joins` | No | Relationships to other semantic sources. | | `inherits_columns_from` | No | Inherit column metadata from a manifest entry. | ### Component fields | Component | Field | Required | Description | |-----------|-------|----------|-------------| | Column | `name` | Yes | Column identifier used in SQL expressions. | | Column | `type` | Yes | Agent-facing type: `string`, `number`, `time`, or `boolean`. | | Column | `role` | No | Special role such as `time` for default time dimensions. | | Column | `visibility` | No | `public`, `internal`, or `hidden`. | | Column | `descriptions` | Strongly recommended | Description map keyed by source, such as `user`, `dbt`, or `ai`. | | Measure | `name` | Yes | Queryable metric name. | | Measure | `expr` | Yes | SQL aggregation expression at the source grain. | | Measure | `filter` | No | SQL predicate applied only to this measure. | | Measure | `description` | Strongly recommended | Definition agents can cite and compare. | | Segment | `name` | Yes | Reusable filter name. | | Segment | `expr` | Yes | SQL predicate for the segment. | | Join | `to` | Yes | Target semantic source name. | | Join | `on` | Yes | SQL join condition using source names or aliases. | | Join | `relationship` | Yes | `many_to_one`, `one_to_many`, or `one_to_one`. | | Join | `alias` | No | Query alias for repeated or clearer joins. | ### Visibility | Visibility | Agent behavior | |------------|----------------| | `public` | Included in listings and available for agent queries. | | `internal` | Available for joins and measures, but not highlighted to agents. | | `hidden` | Excluded from agent-facing context. Use for ETL fields and sensitive internals. | ## Measures Good measures have precise names, SQL expressions at the correct grain, and descriptions that say what is included and excluded. ```yaml measures: - name: net_revenue expr: SUM(total_amount - refunded_amount) filter: status = 'completed' description: Completed order revenue after refunds, excluding cancelled orders. ``` Prefer one canonical measure plus wiki synonyms over several nearly identical measures. If your team uses multiple definitions, document the distinction in a wiki page and link it with `sl_refs`. ## Joins and grain `grain` and `relationship` prevent agents from producing double-counted SQL. State the row grain even when it seems obvious. ```yaml grain: - order_id joins: - to: customers on: orders.customer_id = customers.customer_id relationship: many_to_one ``` Use `many_to_one` for dimensions such as customer, account, product, or plan. Use `one_to_many` only when the target can fan out the source rows, such as orders to order items. ## Validate and query Validation checks source YAML against the live database schema: ```bash ktx sl validate orders --connection-id warehouse ``` It catches missing columns, invalid join targets, and table-reference problems before an agent relies on the source. Compile a query to inspect generated SQL: ```bash ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.order_date \ --filter "orders.status = 'completed'" \ --order-by orders.order_date:desc \ --limit 10 \ --format sql ``` Execute only when you need live rows: ```bash ktx sl query \ --connection-id warehouse \ --measure orders.total_revenue \ --dimension orders.status \ --execute \ --max-rows 100 ``` ## Wiki pages Wiki pages capture business context that does not belong in a single source file: metric policies, dashboard caveats, company vocabulary, data freshness, known issues, and source-of-truth notes. Wiki files live under: ```text wiki/ global/ user// ``` Use global pages for shared business rules. Use user-scoped pages for local notes, personal conventions, or context that should not be shared broadly. ### Wiki page example ```markdown --- summary: Revenue recognition rules for finance reporting. tags: [revenue, finance, reporting] sl_refs: [orders] external_refs: - type: notion id: finance-revenue-policy --- ## Recognized Revenue Recognized revenue includes completed orders after refunds. It excludes cancelled orders, test orders, implementation fees, and tax. Finance reporting uses order completion date, not invoice creation date. ``` Useful frontmatter: | Field | Required | Description | |-------|----------|-------------| | `summary` | Yes | Short text shown in search results. | | `tags` | No | Business terms and synonyms that improve search. | | `sl_refs` | No | Semantic source names the page explains or constrains. | | `external_refs` | No | Source-of-truth system links or ids. | ## Add searchable business context 1. Search first. ```bash ktx wiki search "active customer definition" --json --limit 10 ``` 2. If no page covers the rule, create or edit a Markdown file under `wiki/global/`. 3. Write a compact `summary` with the wording users are likely to ask. 4. Add tags for synonyms and related business areas. 5. Add `sl_refs` for relevant semantic sources. 6. Search again with a user-like phrase. ## Review context changes Before accepting agent-written context: ```bash git diff -- semantic-layer wiki ktx sl validate orders --connection-id warehouse ktx sl search "revenue" --json ktx wiki search "revenue recognition" --json --limit 10 ``` Check that definitions are specific, hidden columns stay hidden, joins have explicit relationships, and measures compile into the expected SQL. ## Common errors | Symptom | Likely cause | Recovery | |---------|--------------|----------| | `ktx sl validate` reports a missing column | YAML references a column absent from the scanned table | Refresh database context or update the YAML | | Query compilation double-counts a measure | `grain` or join `relationship` is missing or wrong | Add explicit grain and relationship values, then recompile | | Agent cannot find a metric | Measure name and description do not match business terminology | Add a clearer measure description and a wiki page with synonyms | | Wiki search misses a page | Summary, tags, or content do not match user wording | Rewrite the summary and add likely synonyms | | Context diff is hard to review | One edit changed too many concepts | Split the change into focused source and wiki edits |