---
title: Contributing
description: How to contribute to KTX.
---

KTX is an open-source project and welcomes contributions - bug fixes, new connectors, documentation improvements, and feature proposals. This page covers how to set up a development environment, navigate the repository, run tests, and submit changes.

## Development setup

This page is for contributors working on the KTX repository. To install KTX for an analytics project, use the published [`@kaelio/ktx`](https://www.npmjs.com/package/@kaelio/ktx) package in the [Quickstart](/docs/getting-started/quickstart).

### Prerequisites

- **Node.js 22+** and **pnpm** - for the TypeScript workspace
- **Python 3.11+** and **uv** - for the Python semantic layer and daemon
- **Git** - for version control

### Clone and install

```bash
git clone https://github.com/kaelio/ktx.git
cd ktx
pnpm install
uv sync --all-groups
```

`pnpm install` sets up all TypeScript packages in the workspace. `uv sync --all-groups` installs Python dependencies for the semantic layer and daemon, including dev and test groups.

### Build

```bash
pnpm run build
```

This builds all TypeScript packages. You can also build individual packages:

```bash
pnpm --filter @ktx/cli run build
pnpm --filter @ktx/context run build
```

### Link the CLI for local testing

```bash
pnpm run setup:dev
pnpm run link:dev
```

This makes the `ktx-dev` command available globally, pointing at your local build. Use this development binary when you need to test unpublished repository changes.

## Repository structure

KTX is a pnpm + uv workspace. TypeScript packages live in `packages/`, Python projects in `python/`.

```text
packages/
  cli/                   # CLI entry point and commands
  context/               # Core context engine (scan, ingest, MCP, semantic layer)
  llm/                   # LLM client abstraction
  connector-postgres/    # PostgreSQL connector
  connector-snowflake/   # Snowflake connector
  connector-bigquery/    # BigQuery connector
  connector-clickhouse/  # ClickHouse connector
  connector-mysql/       # MySQL connector
  connector-sqlserver/   # SQL Server connector
  connector-sqlite/      # SQLite connector
  connector-posthog/     # PostHog connector
python/
  ktx-sl/                # Semantic layer - grain-aware query planning and SQL generation
  ktx-daemon/            # Daemon - portable API server around the semantic layer
examples/                # Example projects and fixtures
scripts/                 # Workspace scripts (benchmarks, verification, release)
docs/                    # Documentation site (Fumadocs)
```

All TypeScript packages are ESM (`"type": "module"`) and use `NodeNext` module resolution. The Python projects use `pyproject.toml` for dependency management.

## Running tests

### TypeScript

```bash
# Run all tests
pnpm run test

# Run tests for a specific package
pnpm --filter @ktx/cli run test
pnpm --filter @ktx/context run test

# Type-check all packages
pnpm run type-check

# Type-check a specific package
pnpm --filter @ktx/context run type-check

# CLI smoke test
pnpm --filter @ktx/cli run smoke
```

### Python

```bash
# Run all Python tests
uv run pytest -q

# Semantic layer tests
uv run pytest python/ktx-sl/tests -q

# Daemon tests
uv run pytest python/ktx-daemon/tests -q
```

### Pre-commit checks

After modifying Python files, run pre-commit on the changed files:

```bash
uv run pre-commit run --files python/ktx-sl/src/changed_file.py
```

### Full verification

For cross-cutting changes that affect package exports or shared contracts:

```bash
pnpm run build
pnpm run type-check
pnpm run test
uv run pytest -q
```

## Adding a connector

Database connectors live in `packages/connector-/`. Each connector implements the `KtxScanConnector` interface from `@ktx/context`.

### Step 1: Scaffold the package

Create a new directory at `packages/connector-/` with:

```text
packages/connector-/
  package.json
  tsconfig.json
  src/
    index.ts       # Public exports
    connector.ts   # KtxScanConnector implementation
    dialect.ts     # SQL dialect handling
```

The `package.json` should follow the pattern of existing connectors:

```json
{
  "name": "@ktx/connector-",
  "private": true,
  "type": "module",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js"
    }
  },
  "dependencies": {
    "@ktx/context": "workspace:*"
  }
}
```

### Step 2: Implement the connector

Your connector class must implement `KtxScanConnector`, which requires:

- **`id`** - a string identifier, typically `":"`
- **`driver`** - the `KtxConnectionDriver` value for your database
- **`capabilities`** - a `KtxConnectorCapabilities` object declaring what your connector supports: `tableSampling`, `columnSampling`, `columnStats`, `readOnlySql`, `nestedAnalysis`, `eventStreamDiscovery`, `formalForeignKeys`, `estimatedRowCounts`
- **`introspect()`** - discovers tables, columns, types, and constraints, returning a `KtxSchemaSnapshot`

Optional methods for richer scanning:

- **`sampleColumn()`** - sample values from a specific column
- **`sampleTable()`** - sample rows from a table
- **`columnStats()`** - compute column statistics
- **`executeReadOnly()`** - execute arbitrary read-only SQL

### Step 3: Add a dialect

The dialect class handles database-specific concerns: identifier quoting, type mapping (native types to normalized types), and query generation for sampling and statistics.

### Step 4: Wire it up

Register the new connector driver in `packages/context` so the CLI and scan engine can instantiate it. Look at how existing connectors are registered for the pattern.

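The connector and dialect responsibilities described in Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the actual KTX code: the type shapes below are local stand-ins written for this example (the real `KtxScanConnector`, `KtxConnectorCapabilities`, and `KtxSchemaSnapshot` are exported by `@ktx/context` and may differ), and `ExampleDialect`, `ExampleConnector`, the in-memory catalog, and the method signatures are all hypothetical.

```typescript
// Hypothetical stand-ins for types that really come from @ktx/context.
// Their exact shapes are assumptions made for this sketch.
type KtxConnectorCapabilities = {
  tableSampling: boolean;
  columnSampling: boolean;
  columnStats: boolean;
  readOnlySql: boolean;
  nestedAnalysis: boolean;
  eventStreamDiscovery: boolean;
  formalForeignKeys: boolean;
  estimatedRowCounts: boolean;
};

type KtxSchemaSnapshot = {
  tables: { name: string; columns: { name: string; type: string }[] }[];
};

interface KtxScanConnector {
  id: string;
  driver: string; // stands in for KtxConnectionDriver
  capabilities: KtxConnectorCapabilities;
  introspect(): Promise<KtxSchemaSnapshot>;
}

// Dialect: identifier quoting, type mapping, and sampling query generation.
class ExampleDialect {
  private readonly typeMap = new Map<string, string>([
    ["int4", "integer"],
    ["varchar", "string"],
    ["bool", "boolean"],
  ]);

  // ANSI-style quoting: double any embedded quote, then wrap in quotes.
  quoteIdentifier(name: string): string {
    return `"${name.split('"').join('""')}"`;
  }

  // Map a native type name to a normalized one; unmapped types fall through.
  normalizeType(nativeType: string): string {
    return this.typeMap.get(nativeType.toLowerCase()) ?? "unknown";
  }

  sampleTableSql(table: string, limit: number): string {
    return `SELECT * FROM ${this.quoteIdentifier(table)} LIMIT ${Math.trunc(limit)}`;
  }
}

// Connector backed by an in-memory catalog (table -> column -> native type),
// used here in place of a real database connection.
class ExampleConnector implements KtxScanConnector {
  readonly id: string;
  readonly driver = "example";
  readonly capabilities: KtxConnectorCapabilities = {
    tableSampling: true,
    columnSampling: false,
    columnStats: false,
    readOnlySql: false,
    nestedAnalysis: false,
    eventStreamDiscovery: false,
    formalForeignKeys: false,
    estimatedRowCounts: false,
  };
  private readonly dialect = new ExampleDialect();
  private readonly catalog: Record<string, Record<string, string>>;

  constructor(id: string, catalog: Record<string, Record<string, string>>) {
    this.id = id;
    this.catalog = catalog;
  }

  async introspect(): Promise<KtxSchemaSnapshot> {
    const tables = Object.entries(this.catalog).map(([name, cols]) => ({
      name,
      columns: Object.entries(cols).map(([colName, nativeType]) => ({
        name: colName,
        type: this.dialect.normalizeType(nativeType),
      })),
    }));
    return { tables };
  }
}

const exampleConnector = new ExampleConnector("example-connection", {
  orders: { id: "int4", note: "varchar" },
});
```

Keeping quoting and type mapping in the dialect leaves the connector class concerned only with discovery and declared capabilities. In a real package the stand-in types would be replaced by imports from `@ktx/context`, and the existing connectors remain the authoritative reference for exact signatures.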
### Step 5: Test

```bash
pnpm --filter @ktx/connector- run build
pnpm --filter @ktx/connector- run type-check
pnpm --filter @ktx/connector- run test
```

Use `packages/connector-sqlite/` as a minimal reference and `packages/connector-postgres/` as a full-featured one.

## Code conventions

- **TypeScript**: strict types, no `any`, no `as unknown as`. Use `zod` schemas for runtime validation at CLI and config boundaries. Follow the `camelCaseSchema` / `PascalCaseType` naming convention for Zod schemas and inferred types.
- **Python**: type hints on all new code, `pathlib` over `os.path`, explicit exception types over broad `except Exception`, `logger.exception()` for caught exceptions. Use `sqlglot` for SQL parsing - never regex.
- **Dependencies**: `pnpm` for Node packages (never `npm` or `bun`), `uv` for Python (never `pip`).
- **Dead code**: remove it. Don't leave commented-out code, unused wrappers, or empty directories.

## PR guidelines

Before submitting a pull request:

1. **Run the relevant checks** - at minimum, `pnpm run type-check` and `pnpm run test` for TypeScript changes, `uv run pytest -q` and `uv run pre-commit run --files [FILES]` for Python changes.
2. **Build if you changed exports** - run `pnpm run build` to verify package exports and `dist/` expectations still align.
3. **Keep changes focused** - one logical change per PR. Don't bundle unrelated refactors.
4. **Follow existing patterns** - match the style and conventions of surrounding code. The codebase favors explicit over clever.
5. **Don't commit artifacts** - `node_modules/`, `.venv/`, `dist/`, coverage output, and local databases should not be committed.

For larger features or architectural changes, open an issue first to discuss the approach.

## Agent usage notes

Use this page when an agent is modifying the KTX repository itself rather than using KTX in an analytics project.

| Agent task | Command or section |
|------------|--------------------|
| Prepare the workspace | `pnpm install`, `pnpm run setup:dev`, `uv sync --all-groups` |
| Verify TypeScript changes | `pnpm run type-check`, `pnpm run test`, or package-filtered equivalents |
| Verify Python changes | `uv run pytest -q` and `uv run pre-commit run --files ` |
| Add a connector | Adding a connector |
| Check style expectations | Code conventions |

Common recovery path: if a check fails because generated files or local runtimes are missing, run the setup commands first. If a check fails because of a real type, lint, or test error, fix the source file and rerun the smallest failing check before broadening verification.