omnigraph/docs/user/search/index.md
Andrew Altshuler 612741b387
docs(user): split language/branching pages + add front-door pages (Phase 2) (#225)
Content build-out on top of the Phase 1 topic move. No behavior changes.

Splits (existing content relocated, cross-linked):
- queries/index.md → mutations/index.md (insert/update/delete + the
  inserts-vs-deletes rule) and search/index.md (the multi-modal search
  functions + a hybrid-ranking overview tying nearest/bm25/rrf together).
  queries/index.md now covers the read shape and points at both.
- branching/index.md → branching/time-travel.md (snapshots/time travel) and
  branching/merge.md (three-way merge + the 7 conflict kinds, verified against
  error.rs MergeConflictKind).

New pages (written from the code, user-facing):
- quickstart.md — init → load → query → branch, with verified CLI flags.
- concepts/index.md — what OmniGraph is + the L1/L2 (Lance/OmniGraph) framing.

Expanded operations/audit.md from a 7-line struct dump into a real
actor-tracking page (server token-resolved vs CLI --as chain; reading the
trail; the omnigraph:recovery reserved actor).

Index wiring: docs/user/index.md and AGENTS.md's topic table link every new
page; also normalized AGENTS.md's docs/user link display text to match the
Phase 1 retargeted paths.

Verified: zero broken .md links; check-agents-md.sh green (57 links, 54 docs).

Deferred to Phase 3: de-dev polish (grammar paths, IR internals still in
queries/branching), guides/, and a possible reference/config.md split (the
config schema is already coherent in cli/reference.md).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 13:53:46 +03:00


# Search
OmniGraph runs vector, full-text, and hybrid search in the same runtime as graph
traversal — a single [query](../queries/index.md) can combine a vector `nearest`,
a `bm25` text score, and an `Expand` traversal. Search functions are used inside
`match` (to filter), or as expressions inside `return` / `order` (to score and
rank).
## Functions
| Function | Purpose | Backing index |
|---|---|---|
| `nearest($x.vec, $q)` | k-NN vector search (cosine) | vector index (IVF / HNSW) |
| `search(field, q)` | Generic full-text search | inverted (FTS) index |
| `fuzzy(field, q [, max_edits])` | Levenshtein-tolerant text search | inverted index |
| `match_text(field, q)` | Pattern match | inverted index |
| `bm25(field, q)` | BM25 relevance scoring | inverted index |
| `rrf(rank_a, rank_b [, k])` | Reciprocal Rank Fusion of two rankings (default `k=60`) | none (fuses two already-scored rankings) |
- `nearest()` requires a `limit`. The query vector is resolved from the param map,
or embedded from a text input at runtime via the configured
[embedding client](embeddings.md).
- Scores and ranks propagate as ordinary columns, so you can `return` a score and
`order` by it.
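
To make the role of `limit` concrete, here is a minimal Python sketch of what a cosine k-NN search computes conceptually: score every candidate against the query vector, then keep the top `limit`. This is an illustration only. The function names and the brute-force scan are assumptions made for the example; OmniGraph answers `nearest` from a vector index (IVF / HNSW) rather than by scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_bruteforce(vectors, query, limit):
    """Hypothetical helper: return the `limit` most similar (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(vec, query)) for doc_id, vec in vectors.items()]
    # Highest similarity first; break ties by id so the order is deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:limit]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = nearest_bruteforce(docs, [1.0, 0.1], limit=2)
```

Without a `limit`, a nearest-neighbor search has no natural cutoff, because every row has some similarity to the query; that is one plausible reading of why `nearest()` requires it.
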
## Hybrid ranking with `rrf`
Reciprocal Rank Fusion combines two independent rankings (typically one vector and
one text) into a single fused ranking, without needing the two score scales to be
comparable. Rank each retrieval separately, then fuse:
```gq
query hybrid($q: String) {
  match { $d: Document { } }
  return {
    $d,
    rrf( nearest($d.embedding, $q), bm25($d.body, $q) ) as score
  }
  order { score desc }
  limit 10
}
```
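
For intuition about why the two score scales need not be comparable, the commonly used RRF formula gives each item a fused score of `1 / (k + rank)` summed over the rankings it appears in, so only rank positions matter. The Python sketch below assumes that standard formulation with 1-based ranks; the helper name `rrf_fuse` and the sample ids are hypothetical, and this page does not confirm OmniGraph's exact rank base or tie-breaking.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with Reciprocal Rank Fusion (1-based ranks assumed)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); raw scores are never compared.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties by id so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_ranking = ["d3", "d1", "d2"]  # e.g. ordered by vector similarity
text_ranking = ["d1", "d4", "d3"]    # e.g. ordered by BM25 score
fused = rrf_fuse([vector_ranking, text_ranking])
```

In this sample, `d1` ranks first after fusion because it sits near the top of both lists, while `d2` and `d4` each appear in only one list and score lower.
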
## Indexes and embeddings
Search functions only work when the backing index exists — see
[indexes](indexes.md) for building vector and inverted indexes, and
[embeddings](embeddings.md) for generating the vectors `nearest` searches over.