mirror of
https://github.com/Kaelio/ktx.git
synced 2026-06-10 08:05:14 +02:00
Stabilize parallel ingest concurrency
parent
e64da5a85d
commit
1db8a6debd
19 changed files with 1370 additions and 40 deletions
@ -45,6 +45,23 @@ requires deep ingest readiness.
When `--all` selects both databases and context sources, database ingest runs
first, then source ingest and memory updates run for source connections.

`ktx ingest --all` runs one target at a time by default. Configure source
concurrency in `ktx.yaml` when independent connections can run in parallel:

```yaml title="ktx.yaml"
ingest:
  sources:
    maxConcurrency: 4
  workUnits:
    maxConcurrency: 6
    resolverConcurrency: 3
```

`ingest.sources.maxConcurrency` controls top-level `--all` target dispatch.
`ingest.workUnits.maxConcurrency` controls work units inside one source ingest.
`ingest.workUnits.resolverConcurrency` controls concurrent textual conflict
repairs for disjoint files. Each value must be between `1` and `8`.
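
As a rough illustration of what a concurrency limit such as `ingest.sources.maxConcurrency` means for dispatch, the sketch below caps the number of simultaneously running targets with a semaphore. This is not KTX's implementation: `run_bounded` and `ingest_one` are hypothetical names, and the targets are plain strings standing in for connections.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def run_bounded(
    targets: Iterable[str],
    ingest_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 1,
) -> List[str]:
    """Run ingest_one for every target, at most max_concurrency at a time.

    Hypothetical helper for illustration only. With max_concurrency=1 the
    targets run strictly one after another, mirroring a sequential default.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(target: str) -> str:
        # Each task waits here until one of the max_concurrency slots is free.
        async with semaphore:
            return await ingest_one(target)

    # gather returns results in the same order as the input targets.
    return await asyncio.gather(*(guarded(t) for t in targets))
```

Raising the limit only helps when the targets are independent, which is the condition the text above places on parallel runs.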

Some ingest paths use the managed KTX Python runtime. Query-history ingest uses
it for SQL analysis, and Looker source ingest uses it for Looker identifier
parsing. In an interactive terminal, `ktx ingest` prompts before installing the

@ -121,6 +121,38 @@ Source ingest extracts metadata, reconciles it with existing local context, and
writes semantic-layer YAML plus wiki Markdown. It merges rather than blindly
overwriting local edits.

## Ingest concurrency

KTX keeps ingest sequential by default so first runs are predictable. Increase
concurrency when your configured sources are independent and your local LLM
backend can handle more simultaneous agent sessions.

```yaml title="ktx.yaml"
ingest:
  sources:
    maxConcurrency: 4
  workUnits:
    maxConcurrency: 6
    resolverConcurrency: 3
```

Use these settings together:

| Setting | Applies to | Default |
|---------|------------|---------|
| `ingest.sources.maxConcurrency` | Top-level `ktx ingest --all` targets | `1` |
| `ingest.workUnits.maxConcurrency` | Work units inside one source ingest | `1` |
| `ingest.workUnits.resolverConcurrency` | Textual conflict repair for disjoint files | Same as `workUnits.maxConcurrency` |
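
The defaults in the table and the `1` to `8` range can be sketched as a small resolver over an already-parsed `ktx.yaml` mapping. This is an assumption-laden illustration, not KTX's own validation code: `resolve_ingest_concurrency`, `IngestConcurrency`, and the error messages are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Bounds taken from the documented range for every concurrency value.
MIN_CONCURRENCY = 1
MAX_CONCURRENCY = 8


@dataclass(frozen=True)
class IngestConcurrency:
    sources_max: int
    work_units_max: int
    resolver: int


def _checked(name: str, value: Any) -> int:
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer, got {value!r}")
    if not MIN_CONCURRENCY <= value <= MAX_CONCURRENCY:
        raise ValueError(
            f"{name} must be between {MIN_CONCURRENCY} and {MAX_CONCURRENCY}, got {value}"
        )
    return value


def resolve_ingest_concurrency(config: Mapping[str, Any]) -> IngestConcurrency:
    """Apply the documented defaults to a parsed config (hypothetical helper)."""
    ingest = config.get("ingest") or {}
    sources = ingest.get("sources") or {}
    work_units = ingest.get("workUnits") or {}

    sources_max = _checked(
        "ingest.sources.maxConcurrency", sources.get("maxConcurrency", 1)
    )
    work_units_max = _checked(
        "ingest.workUnits.maxConcurrency", work_units.get("maxConcurrency", 1)
    )
    # resolverConcurrency falls back to workUnits.maxConcurrency when unset.
    resolver = _checked(
        "ingest.workUnits.resolverConcurrency",
        work_units.get("resolverConcurrency", work_units_max),
    )
    return IngestConcurrency(sources_max, work_units_max, resolver)
```

With only `workUnits.maxConcurrency: 6` set, this sketch resolves the resolver limit to `6` as well, which is the fallback the table describes.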

Evidence-only adapters, such as query-history ingest that emits historic SQL
evidence, can usually tolerate higher work-unit concurrency because their
patches are often no-ops. Source adapters that rewrite the same semantic-layer
or wiki files need lower values to reduce conflict repair work.

Each concurrency value must be between `1` and `8`. Higher values create more
temporary Git worktrees and more concurrent LLM sessions, so raise them in
small steps and check `.ktx/ingest-traces/` when a run fails.

## Text ingest

Use `ktx ingest text` for notes, Markdown files, runbooks, Slack exports, or