apunkt/ktx

mirror of https://github.com/Kaelio/ktx.git synced 2026-06-07 07:55:13 +02:00

Andrey Avtomonov 3f0d11e07d

feat(cli)!: remove fast mode; ktx ingest always builds enriched context (KLO-721) (#237 )

Fast mode (the ktx ingest --fast/--deep database-ingest depth toggle) is removed.
ktx ingest now always builds the full enriched ("deep") context. There is no
structural fallback: a database connection without a configured model and
embeddings fails the enrichment-readiness preflight before any work runs, with
a 'Run ktx setup to configure a model and embeddings' hint.

- Remove --fast/--deep flags, the per-connection context.depth field, and the
  ktx setup depth prompt (delete setup-database-context-depth.ts).
- Rename ingest-depth.ts -> connection-drivers.ts; ingest always requests scan
  mode 'enriched'; readiness gate (enrichmentReadinessGaps) runs for every
  database target.
- Drop the database-context-depth telemetry step (Node + Python schema mirrors
  regenerated).
- Update CLI, setup, context-build view, docs, the public ktx skill, and the
  release-smoke / artifacts scripts (now assert the no-LLM guard failure).

ktx status --fast (a separate network-probe flag) is unchanged.

Follow-ups: KLO-726 (live progress for ktx ingest --all), KLO-727 (restore
credentialed successful-ingest release smoke coverage).

2026-05-29 17:41:04 +02:00

7.3 KiB

Raw Blame History

ktx Terminology Rules

Canonical vocabulary for coding agents working on this repository. Applies to docs prose, code comments, identifiers, CLI strings, error messages, log lines, and example output.

For product-name capitalization rules (ktx vs **ktx** vs code font), see the Product Naming section of AGENTS.md — those rules take precedence over anything below when they conflict.

The "source" rule

source does four different jobs in this codebase. Never write bare source in prose when ambiguity is possible. Always qualify:

semantic source — the YAML file that describes a table
primary source — the connected database
context source — the analytics-tooling integration (dbt, Metabase, etc.)
source of truth — the canonical place a fact lives

Bare source is allowed only inside a section that has already established its referent (e.g., body of a Semantic sources page, or sourceName as a CLI arg).

Context Layer and Context Engine

Use context layer as the primary category term for what ktx provides to data agents.

Use context engine as the secondary mechanism term for how ktx builds, maintains, validates, and serves that layer.

Concept	Use	Do not use
The whole ktx product category	context layer / context layer for data agents	knowledge layer, agent memory
The active system that builds and maintains context	context engine	context layer when describing ingest/reconciliation internals
The durable reviewed surface agents use	context layer	context engine
The compiler pillar for executable metrics and joins	semantic layer	context layer when specifically discussing SQL compilation
Prose/business knowledge files	wiki / wiki pages	wiki context

Usage rules

Use context layer in taglines, page titles, meta descriptions, docs introductions, comparison pages, and first-paragraph definitions.
Use context engine when describing active behavior: ingesting evidence, reconciling changes, validating references, maintaining files, search, CLI, and MCP serving.
Keep semantic layer for the narrower YAML/compiler surface: semantic sources, measures, joins, dimensions, filters, SQL compilation, and semantic queries.
Do not use context engine as the primary replacement for the whole product. It sounds like runtime infrastructure; context layer better describes the durable YAML and Markdown surface users review in git.
Do not use context layer when the sentence is specifically about the compiler. Example: write "the semantic layer compiles semantic queries to SQL," not "the context layer compiles semantic queries to SQL."
Default lowercase in prose: context layer, context engine, semantic layer. Title case only in page titles, headings, nav labels, and UI labels.

Canonical vocabulary

Concept	Use	Do not use
AI consumer (general prose)	data agent	analytics agent, database agent, client agent
AI consumer (Integrations nav)	agent client	client agent
Coding-tool framing (user-facing)	coding agent	—
The connected database	primary source / database connection	data source
Analytics-tooling integration	context source / context-source connection	BI source, BI model, metadata source, source tool
YAML file describing a table	semantic source	semantic-layer source, model file, bare "source file"
The whole ktx surface	context layer / context layer for data agents (lowercase in prose)	"Context Layer" in prose, knowledge layer, agent memory
The active system that builds and maintains context	context engine (lowercase in prose)	context layer when describing ingest/reconciliation internals
The compiler pillar	semantic layer (lowercase in prose)	"Semantic Layer" in prose
The query payload	semantic query (lowercase in prose)	"Semantic Query"
The MCP layer	MCP server (the server), MCP tools (the functions)	"ktx MCP" as a standalone noun
The plugin/implementation	connector (prefix with primary or context when contrasting)	adapter, driver-as-noun
Config field value	`driver` (code font only)	`driver` as a generic noun
Merge step	reconcile / reconciliation / reconciliation agent	"merge intelligently", bare "LLM agent"
Connection ref in prose	connection id (lowercase, two words)	"connection ID"
CLI arg/flag literal	`connectionId` (code font)	—
File path placeholder	`<connection-id>` (code font)	—
Ingest of a primary connection	database ingest	—
Ingest of a context-source connection	context-source ingest	bare "source ingest"
Wiki capture	text ingest	—
Query-history sub-mode	query-history ingest	—
SQL compilation	compile / the compiler / SQL compilation	"SQL generation"
Internal stage inside compilation	planner / planning (only in semantic-layer-internals)	—
Setup flow noun	setup wizard	"the wizard" (bare)
Setup flow contrast	interactive setup (vs non-interactive / flag-driven)	"interactive command"
The whole project	ktx project	"KTX project" (all caps)
The filesystem path	project directory	"project dir"
Wiki surface as a whole	wiki	"wiki context"
A single Markdown file	wiki page	—
YAML vs Markdown contrast	wiki Markdown (only when contrasting with semantic source YAML)	—
Joins multiplying rows (generic)	fanout	—
The two named patterns	chasm trap / fan trap	—
Casual gloss in user prose	double-count	(avoid in technical/internals prose)

Prose rules

Article + ktx. Treat ktx as a bare proper noun, no article: ktx is..., in ktx. Articles attach to the following noun, not to ktx: the **ktx** MCP server, the **ktx** project.
Capitalization. Default lowercase for context layer, semantic layer, semantic query. Title case only inside literal page titles or H1 headings.
Code font. Reserve code font for the CLI command, binary, paths, config field values (e.g. driver: postgres), CLI arg/flag literals (connectionId, --project-dir), and path placeholders (<connection-id>). Do not use code font for prose nouns like connector or reconciliation.
driver is never a prose noun. Always driver: postgres (code font, as a config field value). For the noun, use connector.

Canonical lists

Use these orderings verbatim when listing supported systems:

Primary sources: PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, SQLite
Context sources: dbt, MetricFlow, LookML, Looker, Metabase, Notion

If a doc or string omits or reorders members of either list, treat that as a bug unless the surrounding text justifies the change.

When updating this file

Add a new row to the canonical vocabulary table; do not introduce a parallel glossary elsewhere.
If you rename a converged term, search the workspace for the previous form and update call sites in the same change.
When deprecating a term, add it to the Do not use column with a one-line reason in the surrounding prose, not just in the table.

7.3 KiB Raw Blame History