ktx/packages/context/skills/dbt_ingest/SKILL.md
2026-05-10 23:51:24 +02:00

name: dbt_ingest
description: Map dbt `schema.yml` / `properties.yml` models and sources into KTX semantic-layer overlays and column notes. Covers `sources:` vs `models:`, column `data_tests` (not_null, unique, accepted_values, relationships), and how bundle-time writes complement manifest backfill from git sync. Load when the WorkUnit's `skillNames` includes `dbt_ingest` or when raw files are dbt YAML under `models/` / `sources/`.
callers: memory_agent

dbt → KTX (bundle ingest)

Use this skill for uploaded dbt projects: dbt_project.yml at the stage root, plus dbt YAML under models/** and sources/** (schema.yml, properties.yml). v1 has no fetch(), so scheduled dbt parse runs and manifest.json pulls are out of scope. A host-provided dbt sync may still backfill structured test metadata into _schema on the next sync.
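As a rough illustration of the scope above, the check below decides whether a list of uploaded paths looks like a dbt project. The function name and the assumption that paths are POSIX-style strings relative to the stage root are hypothetical and not part of KTX.

```python
from pathlib import PurePosixPath


def looks_like_dbt_upload(paths):
    """Heuristic: dbt_project.yml at the stage root plus YAML under models/ or sources/.

    `paths` is assumed to be an iterable of stage-relative POSIX paths.
    """
    parts = [PurePosixPath(p).parts for p in paths]
    # The project file must sit at the root, not in a subdirectory.
    has_project = ("dbt_project.yml",) in parts
    # At least one YAML file must live under models/ or sources/.
    has_yaml = any(
        len(p) > 1
        and p[0] in {"models", "sources"}
        and p[-1].endswith((".yml", ".yaml"))
        for p in parts
    )
    return has_project and has_yaml
```

The check reads file names only; it never looks for target/ or manifest.json, which the upload is not expected to contain.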

Mapping (models / sources → SL)

| dbt | KTX | Notes |
| --- | --- | --- |
| `models:` entry with `columns:` | Overlay on the manifest table with the same name (after wiki_sl_search / sl_describe_table) | One SL source per physical table; model name may differ from DB name, so resolve with read_raw_file + warehouse context. |
| `sources:` `tables:` | Same as models; use `identifier` when present instead of logical name. | Schema + name must match how the connection sees tables. |
| Column `description` | descriptions.user or merged descriptions map on the column | Do not overwrite dbt description keys from sync. |
| `data_tests:` not_null / unique | Short hint in column descriptions or notes: “dbt: not null”, “dbt: unique” | Full structured metadata lands in manifest via sync; the skill keeps bundle-time SL text useful for the agent. |
| accepted_values | Add a brief line in the column description: allowed values (truncate long lists) | Also mention enum-like use in wiki_sl_search / filters. |
| relationships | Add or confirm `joins:` on the overlay only when `to` resolves to a real table via read_raw_file + wiki_sl_search / sl_describe_table | If the ref cannot be resolved, capture the intent in a wiki page instead. |
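
The source and relationships rows can be sketched as follows. This is a minimal illustration that works on already-parsed YAML dicts. The helper names, the `known_tables` set (standing in for whatever wiki_sl_search / sl_describe_table returned), and the returned dict shapes are assumptions made for the example, not KTX APIs.

```python
import re

# Matches the single-argument form ref('model') and source('src', 'table').
_REF = re.compile(r"""^\s*ref\(\s*['"]([^'"]+)['"]\s*\)\s*$""")
_SOURCE = re.compile(
    r"""^\s*source\(\s*['"]([^'"]+)['"]\s*,\s*['"]([^'"]+)['"]\s*\)\s*$"""
)


def source_table_names(source):
    """Return (schema, physical_table) pairs for one entry under `sources:`.

    dbt falls back to the source name when `schema` is omitted, and
    `identifier` overrides the logical table name when present.
    """
    schema = source.get("schema", source["name"])
    return [
        (schema, table.get("identifier", table["name"]))
        for table in source.get("tables", [])
    ]


def relationship_target(to):
    """Extract the referenced name from a relationships `to:` value, else None."""
    match = _REF.match(to)
    if match:
        return match.group(1)
    match = _SOURCE.match(to)
    if match:
        return match.group(2)
    return None  # e.g. two-argument ref(); treated as unresolved here


def plan_relationship(to, known_tables):
    """Emit a join only when the target is a known table; otherwise a wiki note."""
    target = relationship_target(to)
    if target is not None and target in known_tables:
        return {"action": "join", "table": target}
    return {"action": "wiki_note", "to": to}
```

Anything the pattern cannot parse falls through to the wiki note, which keeps the sketch on the conservative side of the "do not invent joins" rule.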

1.1 test hints (descriptions / meta)

When the YAML shows accepted_values or not_null, add short hints to columns[].descriptions (e.g. under user) or to freeform column notes, so that chat and validation see the intent before the next git sync refreshes constraints / enum_values in _schema. Keep each hint to a few words when possible.
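
One way to derive such hints from a column's parsed `data_tests` list is sketched below. The function name, the `MAX_VALUES` cutoff, and the exact wording of the accepted-values hint are assumptions. The sketch accepts both the bare-string form (`- not_null`) and the mapping form (`- accepted_values: {values: [...]}`), and also looks under an `arguments:` key, which some newer dbt versions use.

```python
MAX_VALUES = 5  # assumed cutoff for truncating long accepted_values lists


def column_hints(data_tests):
    """Turn a column's parsed `data_tests` entries into short description hints."""
    hints = []
    for test in data_tests:
        if isinstance(test, str):
            name, args = test, {}
        else:
            # Mapping form: a single key naming the test, with its arguments.
            name, args = next(iter(test.items()))
            args = args or {}
        # Arguments may be nested one level down under `arguments:`.
        args = args.get("arguments", args)

        if name == "not_null":
            hints.append("dbt: not null")
        elif name == "unique":
            hints.append("dbt: unique")
        elif name == "accepted_values":
            values = [str(v) for v in args.get("values", [])]
            shown = ", ".join(values[:MAX_VALUES])
            if len(values) > MAX_VALUES:
                shown += ", ..."
            hints.append(f"dbt: allowed values: {shown}")
        # relationships and custom tests produce no hint in this sketch.
    return hints
```

For example, `column_hints(["not_null", {"accepted_values": {"values": ["open", "closed"]}}])` returns `["dbt: not null", "dbt: allowed values: open, closed"]`.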

Overlap with MetricFlow

If the same bundle also has MetricFlow semantic_models: / metrics:, the metricflow_ingest skill owns semantic/metric shapes. This skill focuses on raw dbt schema YAML (models, sources, tests). If both apply, load metricflow_ingest first when the file is clearly MetricFlow; otherwise use dbt_ingest for schema.yml without semantic_models.
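
The split between the two skills can be pictured as a check on the top-level keys of a parsed YAML file. This is only a sketch: the function name is hypothetical, and the real routing is driven by the WorkUnit's `skillNames`, not by this helper.

```python
def skills_for_file(parsed):
    """Suggest skill names, in load order, from a parsed YAML file's top-level keys."""
    keys = set(parsed)
    skills = []
    # MetricFlow shapes are owned by metricflow_ingest and come first.
    if keys & {"semantic_models", "metrics"}:
        skills.append("metricflow_ingest")
    # Raw schema YAML (models, sources, tests) is owned by dbt_ingest.
    if keys & {"models", "sources"}:
        skills.append("dbt_ingest")
    return skills
```

A plain schema.yml with only `models:` yields `["dbt_ingest"]`, while a file that mixes `semantic_models:` with `models:` yields both skills with metricflow_ingest first.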

Do not

  • Do not run dbt CLI or assume target/ / manifest.json exists in the upload.
  • Do not invent joins from relationships tests if the target model/table is not found in SL or the warehouse.
  • Do not read peerFileIndex paths — use read_raw_file only on rawFiles and dependencyPaths from the WorkUnit.
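
The last rule amounts to an allow-list check before any read_raw_file call. The sketch below assumes the WorkUnit is available as a dict whose `rawFiles` and `dependencyPaths` are lists of path strings. That shape and the helper name are guesses made for illustration and may not match the real WorkUnit.

```python
def can_read(work_unit, path):
    """Allow a read only for paths listed in rawFiles or dependencyPaths.

    peerFileIndex entries are deliberately ignored, so a path that appears
    only there is rejected.
    """
    allowed = set(work_unit.get("rawFiles", []))
    allowed |= set(work_unit.get("dependencyPaths", []))
    return path in allowed
```

With a WorkUnit whose `rawFiles` is `["models/schema.yml"]` and whose `peerFileIndex` lists `models/other.yml`, the first path passes the check and the second does not.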