mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-07 07:55:13 +02:00
* feat(cli): define full warehouse dialect contract
* test(cli): keep dialect edge tests focused
* fix(cli): stabilize dialect contract foundation
* refactor(connectors): own read-only query preparation
* refactor(connectors): resolve dialects through registry
* refactor(connectors): keep concrete dialect classes internal
* chore(workspace): enforce dialect import boundary
* refactor(cli): resolve relationship dialect at scan boundary
* refactor(cli): use dialect display parsing for entity details
* refactor(cli): use dialect display parsing for warehouse catalog
* refactor(cli): use dialect SQL in relationship workflows
* test(cli): verify solid dialect scan workflow closure
* test: split cli tests from source tree
* refactor(cli): standardize BigQuery scope listing
* feat(sqlite): implement connector scope listing
* test(connectors): cover required table listing
* feat(cli): add warehouse driver registry
* refactor(setup): route scope discovery through driver registry
* refactor(cli): route local query execution through driver registry
* refactor(historic-sql): route dialect support through driver registry
* refactor(cli): test warehouse connections through driver registry
* fix(cli): close driver registry type export gaps
* Improve setup daemon diagnostics
* refactor(setup): centralize rail-prefixed diagnostics + query-history fallback
Extract errorMessage, writePrefixedLines, and flushPrefixedBufferedCommandOutput
into clack.ts so the setup wizard, managed daemons, and embedding/agent steps
share one rail-formatted writer. setup-databases.ts also adds a
"disable query history and retry" option when the schema-context build fails
and query history is the likely culprit, surfaced via a new
failed-query-history-unavailable status.
* fix(cli): carry catalog through the picker so BigQuery/Snowflake/SQL Server scope filters match
The setup picker's KtxTableListEntry was a 2-level { schema, name }, so
qualifiedTableId always wrote db.name into enabled_tables. When BigQuery,
Snowflake, or SQL Server later ran fast ingest, their introspect step filtered
the scope set with scopedTableNames(scope, { catalog: projectId|database, db })
— catalog was non-null on the introspect side but null in the scope refs, so
every entry was rejected, the live-database adapter staged zero table files,
and detect() failed with 'Adapter "live-database" did not recognize fetched
source output'.
Align the picker boundary with the canonical 3-level KtxTableRef:
- Add catalog: string | null to KtxTableListEntry.
- BigQuery/Snowflake/SQL Server listTables populate catalog from the
resolved projectId / database; Postgres/MySQL/ClickHouse/SQLite set null.
- qualifiedTableId emits catalog.schema.name when catalog is non-null
(resolveEnabledTables already accepts the 3-part shape) and
schemasFromEnabledTables now goes through parseDottedTableEntry so it
recovers the schema correctly from both 2-part and 3-part entries.
- Export parseDottedTableEntry from enabled-tables.ts (@internal) for picker
reuse.
Update listTables expectations in all seven connector tests and the setup /
picker test fixtures. Add a picker regression test that covers the
catalog-bearing round-trip (save + refine).
* fix(cli): allow debug telemetry under opt-out env
129 lines
6.4 KiB
Markdown
129 lines
6.4 KiB
Markdown
# ktx Code-Design Principles
|
|
|
|
Principles agents must apply when writing or changing behavior in this
|
|
repository. These rules carry the same weight as the `MUST` / `MUST NOT`
|
|
rules in `AGENTS.md`.
|
|
|
|
Overengineering rarely looks like over-engineering at the line level. It
|
|
shows up as small, locally-reasonable choices that combine into a system
|
|
where features fail silently and bug fixes have to be applied N times. The
|
|
principles below are the lessons; if a piece of code violates one, that is
|
|
enough reason to fix it even when the local code "works."
|
|
|
|
## One way to say one thing
|
|
|
|
- **MUST NOT**: Accept two spellings of the same intent — e.g. a magic
|
|
sentinel value AND absence-of-field both meaning "use the default". Pick
|
|
one and reject the other.
|
|
- **MUST NOT**: Maintain two entry points that load/construct/resolve the
|
|
same thing where one does strictly more work than the other. Callers
|
|
will pick the wrong one. Unify them, or encode the difference as a
|
|
required argument on a single entry point.
|
|
- **MUST NOT**: Let each consumer write its own private wrapper around a
|
|
shared helper to make it usable. If three callers each prepend the same
|
|
three lines, those three lines belong in the helper.
|
|
|
|
## Behavior follows from inputs, not from which path the caller took
|
|
|
|
- **MUST**: A function's result must depend on its arguments, not on
|
|
which sibling function the caller happened to invoke first. If "did
|
|
setup step S run?" determines correctness, S belongs INSIDE the function
|
|
that needs it, or its absence must be a hard error — not a silent
|
|
degradation.
|
|
- **SHOULD**: When a value on disk requires runtime resolution (start a
|
|
process, read state, hit a service), the resolution belongs in ONE
|
|
place that every consumer goes through. If some consumers get the
|
|
resolved form and some get the raw form, the abstraction is broken.
|
|
|
|
## Failures must reach a decision-maker
|
|
|
|
- **MUST NOT**: Catch an error, log it through a logger that may be a
|
|
no-op, and continue with a null/empty result. The error reaches no one.
|
|
Either surface the failure to the caller (return type, status field,
|
|
stderr line), or throw.
|
|
- **MUST**: A caller that receives "no result" must be able to
|
|
distinguish "the input legitimately produced nothing" from "a
|
|
dependency was unavailable" from "the operation was skipped." If those
|
|
three look the same to the user, the system is hiding bugs — including
|
|
this one.
|
|
- **MUST**: When a function returns `T | null` (or a "skipped" status),
|
|
at least one caller in the codebase must branch on the negative case
|
|
and surface it. If every caller treats absence as success, the function
|
|
is laundering errors.
|
|
|
|
## Don't build seams without a second piece on the other side
|
|
|
|
- **MUST NOT**: Introduce an interface, abstract type, or "port" boundary
|
|
with exactly one implementation and no concrete plan for a second.
|
|
Abstractions are paid for with indirection; pay only when you collect.
|
|
- **MUST NOT**: Add an optional dep-injection slot (`deps.X ?? defaultX`)
|
|
unless at least one test exercises the production default. If every
|
|
test injects a fake, the production codepath is type-checked and
|
|
untested.
|
|
- **MUST NOT**: Add a wrapper "in case" callers later need to extend it.
|
|
Add the wrapper when the second caller arrives.
|
|
|
|
## Specification and behavior are one artifact
|
|
|
|
- **MUST**: When a schema, doc comment, or config description states a
|
|
default or a meaning, the code MUST enforce it. Drift between
|
|
"what the field claims" and "what the code does" is a contract bug
|
|
even if both compile.
|
|
- **MUST**: When you change behavior, update the schema description, the
|
|
doc, AND the example in the same change. Not later.
|
|
|
|
## Verify the path you claim to have fixed
|
|
|
|
- **MUST**: Before claiming a feature works, run a command that actually
|
|
exercises it end-to-end and observe the side effect — the file
|
|
written, the process contacted, the row stored. Type-check passing is
|
|
necessary, not sufficient. A test passing against a fake is not
|
|
evidence the real path works.
|
|
- **MUST**: Before declaring a bug fixed, grep for the same shape
|
|
elsewhere. Bugs of the kind described in this section repeat. Fix the
|
|
class, not just the instance.
|
|
|
|
## Naming asymmetries are bugs in waiting
|
|
|
|
- **SHOULD**: When two related identifiers have non-parallel names
|
|
(`loadX` vs `loadHigherX`, `createY` vs `createDefaultY`, `xClient`
|
|
vs `xService`), assume callers will pick the wrong one. Unify, or
|
|
document inline why both must exist.
|
|
|
|
## Dispatch and contract leaks across per-variant layers
|
|
|
|
Layers with multiple per-variant implementations (warehouse drivers,
|
|
dialects, LLM providers, ingest adapters, historic-SQL probes) drift
|
|
toward parallel switches and informal contracts. The patterns below
|
|
look locally reasonable per file but multiply with the number of
|
|
variants times the number of consumers — every fix has to be applied
|
|
N times, and silent drift between variants is invisible until a user
|
|
hits it.
|
|
|
|
- **MUST NOT**: Maintain two or more files that switch on the same
|
|
enum or string union to dispatch to per-variant behavior. Promote
|
|
the dispatch to a single registry table keyed by the union, exposed
|
|
through one resolution function. If you find yourself writing the
|
|
third such switch, the second one was already a bug.
|
|
- **MUST**: When every variant of an abstraction implements the same
|
|
method, the method belongs on the shared interface. An informal
|
|
contract that every implementation happens to satisfy is a leak
|
|
waiting to happen — callers will reach for the concrete class
|
|
instead of the contract, and the next variant added will silently
|
|
forget to implement it.
|
|
- **MUST**: When a layer has both a thin shared interface and rich
|
|
per-variant concrete classes, they must agree. Either widen the
|
|
interface so callers never need the concrete class, or make the
|
|
concrete class private (test-only `/** @internal */` JSDoc plus a
|
|
boundary check in `scripts/check-boundaries.mjs`). A class that is
|
|
public AND has methods the interface does not expose is the exact
|
|
configuration that produces leaks.
|
|
|
|
The warehouse driver / dialect layer in
|
|
`packages/cli/src/connectors/<driver>/` plus
|
|
`packages/cli/src/context/connections/{dialects,drivers}.ts` is the
|
|
canonical worked example: per-driver dialect classes carry
|
|
`/** @internal */`, `scripts/check-boundaries.mjs` enforces the import
|
|
boundary, and dispatch lives in the two registry files. Apply the
|
|
same shape to any other per-variant layer that grows beyond two
|
|
implementations.
|