---
title: "ktx scan"
description: "Run or inspect database scans."
---
Discover your database schema: tables, columns, types, constraints, and relationships. Scanning is the first step in building context, because KTX needs to understand your warehouse structure before it can build semantic sources.

Scan commands live under `ktx dev scan`. See also the [Building Context](/docs/guides/building-context) guide for a walkthrough.
## Command signature
```bash
ktx dev scan <connectionId> [options]
ktx dev scan <subcommand> [options]
```
## Subcommands
| Subcommand | Description |
|-----------|-------------|
| `status <runId>` | Print status for a local scan run |
| `report <runId>` | Print a local scan report |
| `relationships <runId>` | Print relationship artifacts for a local scan run |
| `relationship-apply <runId>` | Apply accepted relationship review decisions as manual manifest joins |
| `relationship-feedback` | Export persisted relationship review decisions as calibration labels |
| `relationship-calibration` | Summarize relationship feedback labels against current score thresholds |
| `relationship-thresholds` | Evaluate relationship feedback labels for offline threshold advice |
## Options
### `scan` (run)
| Flag | Description | Default |
|------|-------------|---------|
| `--mode <mode>` | Scan mode: `structural`, `enriched`, or `relationships` | `structural` |
| `--dry-run` | Run without writing scan results | `false` |
| `--database-introspection-url <url>` | Daemon URL for live-database introspection | — |
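
The `--mode` and `--dry-run` flags can be combined to preview a scan before writing results. The following is a minimal sketch, not part of KTX: `scan_with_preview` is a hypothetical helper name, and it assumes `ktx dev scan` exits non-zero when a scan fails.

```bash
# Hypothetical helper: dry-run a scan first, then run it for real.
# Assumes `ktx dev scan` returns a non-zero exit code on failure.
scan_with_preview() {
  local connection_id="$1"
  local mode="${2:-structural}"   # structural is the documented default

  # Reject anything other than the three documented scan modes.
  case "$mode" in
    structural|enriched|relationships) ;;
    *) echo "unknown scan mode: $mode" >&2; return 2 ;;
  esac

  # The dry run writes no scan results; stop here if it fails.
  ktx dev scan "$connection_id" --mode "$mode" --dry-run || return 1
  ktx dev scan "$connection_id" --mode "$mode"
}
```

For example, `scan_with_preview my-warehouse enriched` would run the enriched scan twice: once as a dry run, once for real.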
### `scan report`
| Flag | Description | Default |
|------|-------------|---------|
| `--json` | Print the raw scan report JSON | `false` |
### `scan relationships`
| Flag | Description | Default |
|------|-------------|---------|
| `--status <status>` | Filter by status: `accepted`, `review`, `rejected`, `skipped`, or `all` | `review` |
| `--limit <count>` | Maximum relationships to print per status | `25` |
| `--accept <candidateId>` | Record an accepted decision for a relationship candidate | — |
| `--reject <candidateId>` | Record a rejected decision for a relationship candidate | — |
| `--note <text>` | Attach a note when recording a relationship review decision | — |
| `--reviewer <name>` | Reviewer name for a relationship review decision | — |
| `--json` | Print relationship artifacts as JSON | `false` |
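
A review decision can carry both a reviewer name and a note. The sketch below wraps the flags from the table above in a hypothetical `review_candidate` helper; it assumes `--reviewer` and `--note` may be passed together with `--accept` or `--reject` in a single invocation.

```bash
# Hypothetical helper: record one relationship review decision.
# Arguments: runId, decision (accept|reject), candidateId, reviewer, note.
review_candidate() {
  local run_id="$1" decision="$2" candidate_id="$3" reviewer="$4" note="$5"
  local flag

  # Map the decision word onto the documented flag.
  case "$decision" in
    accept) flag="--accept" ;;
    reject) flag="--reject" ;;
    *) echo "decision must be 'accept' or 'reject'" >&2; return 2 ;;
  esac

  ktx dev scan relationships "$run_id" "$flag" "$candidate_id" \
    --reviewer "$reviewer" --note "$note"
}
```

A call such as `review_candidate run-abc123 reject candidate-xyz alice "false positive"` would record a rejection with the reviewer and note attached.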
### `scan relationship-apply`
| Flag | Description | Default |
|------|-------------|---------|
| `--all-accepted` | Apply all accepted relationship review decisions for the scan run | `false` |
| `--candidate <candidateId>` | Apply one accepted relationship review decision; repeatable | — |
| `--dry-run` | Preview relationships that would be written without rewriting manifest shards | `false` |
| `--json` | Print the apply result as JSON | `false` |
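
Because `--candidate` is repeatable, several accepted decisions can be applied in one call. This sketch uses a hypothetical `apply_candidates` helper that previews the change with `--dry-run` before applying it, and assumes the command exits non-zero on failure.

```bash
# Hypothetical helper: preview, then apply, a specific set of accepted candidates.
# Arguments: runId followed by one or more candidate ids.
apply_candidates() {
  local run_id="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "no candidate ids given" >&2
    return 2
  fi

  # Rewrite the positional parameters as: --candidate id1 --candidate id2 ...
  local remaining="$#"
  while [ "$remaining" -gt 0 ]; do
    set -- "$@" --candidate "$1"
    shift
    remaining=$((remaining - 1))
  done

  # The dry run previews without rewriting manifest shards; stop if it fails.
  ktx dev scan relationship-apply "$run_id" "$@" --dry-run || return 1
  ktx dev scan relationship-apply "$run_id" "$@"
}
```

In an interactive setting you would likely inspect the dry-run output before running the second command rather than chaining the two automatically.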
### `scan relationship-feedback`
| Flag | Description | Default |
|------|-------------|---------|
| `--connection <connectionId>` | Only export labels for one KTX connection | — |
| `--decision <decision>` | Filter: `accepted`, `rejected`, or `all` | `all` |
| `--json` | Print the export as JSON | `false` |
| `--jsonl` | Print labels as newline-delimited JSON | `false` |
### `scan relationship-calibration`
| Flag | Description | Default |
|------|-------------|---------|
| `--connection <connectionId>` | Only calibrate labels for one KTX connection | — |
| `--decision <decision>` | Filter: `accepted`, `rejected`, or `all` | `all` |
| `--accept-threshold <value>` | Score threshold treated as predicted accepted (0 to 1) | `0.85` |
| `--review-threshold <value>` | Score threshold treated as predicted review (0 to 1) | `0.55` |
| `--json` | Print the calibration report as JSON | `false` |
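
To make the two thresholds concrete, the sketch below maps a score to a predicted label. It is an illustration only: the inclusive `>=` comparisons and the `below-review` label for scores under the review threshold are assumptions, and `predicted_label` is a hypothetical name rather than anything KTX exposes.

```bash
# Hypothetical illustration of how a score falls into a predicted bucket.
# Usage: predicted_label SCORE [ACCEPT_THRESHOLD] [REVIEW_THRESHOLD]
# Defaults mirror the documented defaults (0.85 and 0.55).
predicted_label() {
  awk -v s="$1" -v a="${2:-0.85}" -v r="${3:-0.55}" 'BEGIN {
    # Adding 0 forces numeric comparison of the values.
    if (s + 0 >= a + 0) {
      print "accepted"
    } else if (s + 0 >= r + 0) {
      print "review"
    } else {
      print "below-review"   # assumed label; not defined by this reference
    }
  }'
}
```

Raising `--accept-threshold` (for example to `0.9`) moves borderline scores such as `0.88` from the accepted bucket into review.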
### `scan relationship-thresholds`
| Flag | Description | Default |
|------|-------------|---------|
| `--connection <connectionId>` | Only evaluate labels for one KTX connection | — |
| `--min-total-labels <count>` | Minimum scored labels before advice can be ready | `20` |
| `--min-accepted-labels <count>` | Minimum accepted labels before advice can be ready | `5` |
| `--min-rejected-labels <count>` | Minimum rejected labels before advice can be ready | `5` |
| `--json` | Print the threshold advice report as JSON | `false` |
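
The three minimums act as a gate on when advice can be ready. As a rough sketch, assuming all three must be met at once, a hypothetical `advice_ready` check over label counts could look like this:

```bash
# Hypothetical check: are there enough labels for threshold advice?
# Usage: advice_ready TOTAL ACCEPTED REJECTED [MIN_TOTAL] [MIN_ACCEPTED] [MIN_REJECTED]
# Defaults mirror the documented defaults (20, 5, 5).
advice_ready() {
  local total="$1" accepted="$2" rejected="$3"
  local min_total="${4:-20}" min_accepted="${5:-5}" min_rejected="${6:-5}"

  # Assumption: every minimum has to be satisfied for advice to be ready.
  [ "$total" -ge "$min_total" ] &&
    [ "$accepted" -ge "$min_accepted" ] &&
    [ "$rejected" -ge "$min_rejected" ]
}
```

With the defaults, 25 labels split into 22 accepted and 3 rejected would not pass, because the rejected count is below 5.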
## Examples
```bash
# Run a structural scan of a connection
ktx dev scan my-warehouse
# Run a scan with LLM enrichment
ktx dev scan my-warehouse --mode enriched
# Run a scan with relationship detection
ktx dev scan my-warehouse --mode relationships
# Dry-run a scan (don't write results)
ktx dev scan my-warehouse --dry-run
# Check the status of a scan run
ktx dev scan status run-abc123
# View the scan report
ktx dev scan report run-abc123
# View scan report as JSON
ktx dev scan report run-abc123 --json
# List relationship candidates pending review
ktx dev scan relationships run-abc123
# List all relationships regardless of status
ktx dev scan relationships run-abc123 --status all
# Accept a relationship candidate
ktx dev scan relationships run-abc123 --accept candidate-xyz
# Reject a relationship candidate with a note
ktx dev scan relationships run-abc123 --reject candidate-xyz --note "false positive"
# Apply all accepted relationships to the manifest
ktx dev scan relationship-apply run-abc123 --all-accepted
# Preview what would be applied
ktx dev scan relationship-apply run-abc123 --all-accepted --dry-run
# Export relationship feedback as calibration labels
ktx dev scan relationship-feedback --json
# Calibrate relationship detection thresholds
ktx dev scan relationship-calibration --accept-threshold 0.9 --review-threshold 0.6
# Get threshold advice based on review decisions
ktx dev scan relationship-thresholds
```
## Output
Scan commands write scan artifacts under the KTX project directory and print status or report summaries. Use `--json` on report and relationship commands when an agent needs structured output.
```json
{
  "runId": "scan-local-abc123",
  "status": "completed",
  "mode": "structural",
  "changes": {
    "tablesAdded": 42
  }
}
```
```
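
A script or agent can branch on the structured output. The sketch below assumes `jq` is installed and that the JSON carries a top-level `status` field as in the sample above; `report_completed` is a hypothetical helper name.

```bash
# Hypothetical helper: read scan JSON on stdin and succeed only when the
# status is "completed". Requires jq.
report_completed() {
  local status
  # Fall back to "unknown" when the field is absent; fail on invalid JSON.
  status="$(jq -r '.status // "unknown"')" || return 2
  echo "scan status: $status"
  [ "$status" = "completed" ]
}

# Intended usage (not executed here):
#   ktx dev scan report run-abc123 --json | report_completed
```

The exit code makes the helper usable in conditionals, for example to skip relationship review when a scan did not complete.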
## Common errors
| Error | Cause | Recovery |
|-------|-------|----------|
| Scan cannot connect | Connection credentials are invalid or the database is not reachable over the network | Run `ktx connection test <connectionId>` and update the connection before scanning |
| Enriched scan cannot describe columns | LLM credentials are missing or invalid | Complete LLM setup with `ktx setup` before enriched scans |
| Relationship apply writes nothing | No accepted candidates match the provided run id or candidate ids | Inspect `ktx dev scan relationships <runId> --status accepted` first |
| Calibration is not ready | Too few reviewed relationship labels exist | Review and accept/reject more candidates, then rerun calibration |
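
The first recovery step can be folded into a pre-flight check. This is a minimal sketch that assumes `ktx connection test` exits non-zero when the connection fails; `scan_after_connection_test` is a hypothetical helper name.

```bash
# Hypothetical helper: test the connection before starting a scan.
# Assumes `ktx connection test` returns a non-zero exit code on failure.
scan_after_connection_test() {
  local connection_id="$1"

  if ! ktx connection test "$connection_id"; then
    echo "connection test failed for $connection_id; update the connection before scanning" >&2
    return 1
  fi

  ktx dev scan "$connection_id"
}
```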