mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-18 02:24:27 +02:00

Ragnor Comerford e7361dce49

docs(embeddings): provider-independent config surface (RFC-012 Phase 2)

Rewrite docs/user/search/embeddings.md for the resolved Provider model: the provider table, the OMNIGRAPH_EMBED_* env surface (default OpenRouter), the deadline/observability/reuse behavior, and a no-back-compat migration note for gemini-preview graphs.

2026-06-15 17:28:03 +02:00

4.3 KiB

Raw Blame History

Embeddings

OmniGraph embeds text through a single, provider-independent client resolved from one EmbeddingConfig { provider, model, base_url, api_key }. The same resolved config is used by the query-time auto-embed of a string in nearest($v, "string") and by the offline omnigraph embed file pipeline, so query vectors and document vectors share one model and one vector space.

Providers

`provider`	Wire shape	Use it for
`openai-compatible` (default)	`POST {base}/embeddings`, bearer auth, `{model, input, dimensions}`	OpenRouter (the default gateway — one key for many models), OpenAI direct, or a self-hosted endpoint (vLLM / Ollama / LM Studio)
`gemini`	`POST {base}/models/{model}:embedContent`, `x-goog-api-key`, with `RETRIEVAL_QUERY` / `RETRIEVAL_DOCUMENT` task types	Reaching Google's `generativelanguage` API directly
`mock`	none — deterministic offline vectors	Tests and local dev without a key

Vectors are stored L2-normalized as FixedSizeList(Float32, dim); the requested output dimension is driven by the target column width and sent as Gemini outputDimensionality / OpenAI dimensions.

Configuration (environment)

Variable	Meaning
`OMNIGRAPH_EMBED_PROVIDER`	`openai-compatible` (default) \| `gemini` \| `mock`
`OMNIGRAPH_EMBED_BASE_URL`	endpoint base; default `https://openrouter.ai/api/v1` (openai-compatible) or `https://generativelanguage.googleapis.com/v1beta` (gemini)
`OMNIGRAPH_EMBED_MODEL`	model id; default `openai/text-embedding-3-large` (openai-compatible) or `gemini-embedding-2` (gemini)
`OPENROUTER_API_KEY` / `OPENAI_API_KEY`	api key for `openai-compatible` (OpenRouter preferred)
`GEMINI_API_KEY`	api key for `gemini`
`OMNIGRAPH_EMBED_QUERY_DEADLINE_MS`	total wall-clock budget for one embed call across all retries (default `60000`; `0` = unbounded)
`OMNIGRAPH_EMBED_TIMEOUT_MS`	per-request HTTP timeout (default `30000`)
`OMNIGRAPH_EMBED_RETRY_ATTEMPTS` / `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS`	retry policy (defaults `4` / `200`)
`OMNIGRAPH_EMBEDDINGS_MOCK`	set truthy to force the deterministic mock provider

The default zero-config path is OpenRouter: set OPENROUTER_API_KEY and run. Reaching Gemini takes OMNIGRAPH_EMBED_PROVIDER=gemini plus GEMINI_API_KEY.

Behavior notes

Bounded latency. Each embed call is wrapped in OMNIGRAPH_EMBED_QUERY_DEADLINE_MS, so a degraded provider cannot hang a read for the full retry envelope.
Reuse. The query path builds the client once per graph handle (on the first nearest($v, "string") that needs embedding) and reuses it, keeping the provider connection pool warm. A graph that never embeds needs no provider key.
Observability. Embed calls emit tracing events under target = "omnigraph::embedding" (provider, model, dim, attempt, elapsed, outcome).

`@embed` schema annotation

Mark a Vector property with @embed("source_text_property"). Today this is a catalog annotation consumed by the query typechecker and linter: it records which String property is the embedding source and lets nearest($v, "string") auto-embed a query string for comparison against that vector column.

It does not embed at ingest. Stored vectors are supplied directly in your load data, or pre-filled by the offline omnigraph embed pipeline below. (Ingest-time execution of @embed is a planned enhancement.)

CLI `omnigraph embed` (offline file pipeline)

Operates on JSONL files (not on a graph), using the same resolved provider config. Three modes (mutually exclusive):

(default) fill_missing — only embed rows whose target field is empty
--reembed-all — overwrite all
--clean — strip embeddings

Inputs are either a single seed manifest YAML or --input/--output/--spec. Selectors --type T, --select T:field=value filter rows. Streams JSONL → JSONL.

Migration

This release has no backwards-compatibility shim (pre-release). The default provider is now OpenRouter, and the legacy OMNIGRAPH_GEMINI_BASE_URL is removed. A graph whose vectors were produced with gemini-embedding-2-preview should either re-embed, or pin the query-time embedder to match by setting OMNIGRAPH_EMBED_PROVIDER=gemini and OMNIGRAPH_EMBED_MODEL=gemini-embedding-2-preview (the stored and query vectors must come from the same model to be comparable).

4.3 KiB Raw Blame History