* test(engine): regression tests for #283 camelCase property filters Red against current code. A query (or chained mutation) that filters on a camelCase schema field lints and plans cleanly but fails at run time with "No field named reponame" because the identifier's case is destroyed at the engine->Lance boundary. Coverage added: - query.rs unit: ir_filter_to_expr on a camelCase property must emit an Expr::Column named `repoName`, not `reponame` (red); plus a green coercion guard that a camelCase int column still gets a coerced literal. - mutation.rs unit: predicate_to_sql must emit the column UNQUOTED and case-preserved (green guard documenting the committed-scan contract). - literal_filters.rs e2e: a camelCase @index field with an inline-binding pushdown filter returns the seeded row (red — read pushdown). - writes.rs e2e: an update+delete on a camelCase predicate, and a chained update that re-reads the pending side of scan_with_pending by the same camelCase predicate (red — pending MemTable scan). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7 * fix(engine): preserve identifier case in filter pushdown (#283) Two engine->Lance boundaries lowercased camelCase column identifiers, breaking any filter on a camelCase schema field even though the IR, compiler, projection, and in-memory filtering all preserve case. Read pushdown (exec/query.rs, ir_expr_to_expr): build the column reference with datafusion::prelude::ident() instead of col(). col() routes through SQL identifier normalization and lowercases an unquoted identifier (`repoName` -> `reponame`); ident() builds an unqualified, case-preserved Column. Property refs here are always bare column names, so there is no qualified-name handling to lose. No-op for the lowercase columns that work today. Pending mutation scan (table_store.rs, scan_pending_batches): the committed-scan consumer (Lance Scanner::filter(&str)) preserves an unquoted identifier's case but treats a double-quoted "col" as a string literal, so predicate_to_sql must keep the column unquoted. The pending side splices that same unquoted predicate into a DataFusion `SELECT ... WHERE`, which would lowercase it. Make that path case-preserving by disabling sql_parser.enable_ident_normalization on its SessionContext rather than quoting (quoting would match zero committed rows). predicate_to_sql gains only a clarifying comment; its emitted string is unchanged. Full engine suite green (579 tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7 * docs(dev): case study for #283 camelCase filter bug Record the root cause, the two-boundary fix (read pushdown col→ident; pending mutation scan ident-normalization off), and why the obvious symmetric "quote the column" fix is wrong (Lance reads a double-quoted column as a string literal and silently matches zero committed rows). Linked from a new "Case Studies" section in the dev index so the link check passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01FQ1Hf4eXLsJmeLUkTYBEw7 --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
6.1 KiB
Developer Docs
Audience: contributors, maintainers, and coding agents
This is the contributor-facing entry point. These docs explain architecture, invariants, implementation contracts, test ownership, and upstream Lance constraints. User-facing behavior should still be documented through docs/user/index.md and the relevant public reference docs.
Required For Every Non-Trivial Change
| Need | Read |
|---|---|
| Architectural rules, known gaps, deny-list | invariants.md |
| Upstream Lance source-of-truth index | lance.md |
| Existing test coverage and test placement | testing.md |
Architecture And Storage
| Area | Read |
|---|---|
| System structure, L1/L2 framing, component diagrams | architecture.md |
| On-disk layout, manifest schema, URI behavior | storage.md |
| Direct-publish writes, D2, staged writes, recovery sidecars | writes.md |
| Query execution, mutation execution, loader flow | execution.md |
| Index lifecycle and graph topology indexes | indexes.md |
| Branch and commit internals | branches-commits.md |
| Three-way merge implementation and conflicts | merge.md |
| Diff/change-feed implementation | changes.md |
| Branch protection policy | branch-protection.md |
Language, Runtime, And Boundaries
| Area | Read |
|---|---|
| Schema grammar, catalog, migration planner | schema-language.md |
| Query grammar, IR, lints, mutation restrictions | query-language.md |
Embedding client and @embed integration |
embeddings.md |
| Cedar policy surface and server gating | policy.md |
| Server auth, OpenAPI, endpoint handlers | server.md |
| Error taxonomy and serialization | errors.md |
| Constants and tunables | constants.md |
| Transaction model public contract | transactions.md |
Project Operations
| Area | Read |
|---|---|
| CI and release workflows | ci.md |
| Install and deployment packaging | install.md, deployment.md |
| Release history | releases/ |
Contribution & Governance
| Area | Read |
|---|---|
| How to contribute (external) | CONTRIBUTING.md |
| Governance model, roles, decision authority | GOVERNANCE.md |
| Public contribution RFC track | rfcs/ |
The docs/rfcs/ track is the public, externally-authorable RFC process. The
maintainer/internal RFCs below (rfc-00N-*.md) are a separate, team-owned
track; don't conflate the two.
Case Studies
Worked write-ups of specific bugs — root cause, fix, and the reasoning that ruled out the tempting-but-wrong alternatives. Read these for the debugging pattern, not just the outcome.
| Area | Read |
|---|---|
| camelCase property filters lowercased at runtime (#283) — two engine→Lance boundaries, two different fixes | bug-case-fix.md |
Active Implementation Plans
Working documents for in-flight feature work. Removed when the work lands.
| Area | Read |
|---|---|
Schema-lint chassis v1 (MR-694) — --allow-data-loss, soft/hard drops |
schema-lint-v1-plan.md |
| Inline + stored queries, request/response envelope, MCP (MR-656 / MR-976 / MR-969) | rfc-001-queries-envelope-mcp.md |
| Config & CLI architecture — layered config, client targeting, file naming (MR-973 / MR-974 / MR-981) | rfc-002-config-cli-architecture.md |
| MCP server surface — full tool parity, stored queries, modular auth (MR-969 / MR-956 / MR-974) | rfc-003-mcp-server-surface.md |
| Future cluster control plane — declarative as-code config, JSON state ledger, reconciler | cluster-config-specs.md, cluster-axioms.md, cluster-config-implementation-spec.md |
| Cluster graph & schema apply — Phase 4 sidecars, roll-forward recovery, approval artifacts | rfc-004-cluster-graph-schema-apply.md |
| Server boots from cluster state — Phase 5 mode switch, applied-revision serving | rfc-005-server-cluster-boot.md |
Per-operator config — ~/.omnigraph/ identity, keyed credentials, named servers (the operator slice of RFC-002) |
rfc-007-operator-config.md |
Deprecate omnigraph.yaml — one concern per config surface; key-by-key migration map and staged retirement |
rfc-008-deprecate-omnigraph-yaml.md |
Unify CLI embedded/remote access paths — parity referee, shared wire-DTO crate, GraphClient trait, declared plane capabilities |
rfc-009-unify-access-paths.md |
| Restructure the CLI around explicit planes — one graph-addressing model, declared capability surface, plane-grouped help (expands RFC-009 Phase 4) | rfc-010-cli-planes-restructure.md |
CLI refactoring — one addressing & config model post-omnigraph.yaml: scope + --graph + derived access path, served-default / privileged-direct, profiles, named queries, capability classifier (completes RFC-008) |
rfc-011-cli-refactoring.md |
Provider-independent embedding configuration — one resolved EmbeddingConfig + sealed provider enum (Gemini/OpenAI/Mock), identity recorded in the schema IR, query-time same-space validation, NFR floor |
rfc-012-embedding-provider-config.md |
Boundary
Developer docs may mention implementation details, stale gaps, upstream Lance blockers, and review rules. User docs should not require that context unless the detail changes the public contract.