fix(embedding): address PR review feedback (RFC-012 Phase 2)

openai-alias host (Cursor): OMNIGRAPH_EMBED_PROVIDER=openai now defaults its base URL to https://api.openai.com/v1 (model text-embedding-3-large), while openai-compatible/unset keep the OpenRouter gateway default. The default is derived from the alias rather than the Provider enum, so an operator's stated intent can no longer be silently routed to OpenRouter; an explicit OMNIGRAPH_EMBED_BASE_URL still wins. New test from_env_openai_alias_uses_openai_host_not_openrouter.

single model source of truth (Cursor): remove the EmbedSpec.model field. The provider config is authoritative for the model, so a spec can no longer declare a model that is silently ignored while the API uses another (the wrong-space-vectors footgun); the embed summary reports the model actually resolved. Correct by construction rather than a truthful-echo patch.

stale @embed docs (Codex): docs/user/schema/index.md and docs/dev/execution.md still claimed @embed embeds at ingest; corrected to the real contract (catalog annotation; vectors supplied or pre-filled by 'omnigraph embed'). Also documented the openai-vs-OpenRouter base default in embeddings.md.

Greptile's RFC-status note is declined: the repo lifecycle keeps an RFC Status: Proposed while its PR is open and flips to Accepted on merge.
This commit is contained in:
Ragnor Comerford 2026-06-15 18:37:34 +02:00
parent e7361dce49
commit 30377c453b
No known key found for this signature in database
5 changed files with 61 additions and 43 deletions

View file

@ -83,8 +83,6 @@ impl EmbedMode {
#[derive(Debug, Clone, Deserialize)] #[derive(Debug, Clone, Deserialize)]
struct EmbedSpec { struct EmbedSpec {
#[serde(default)]
model: String,
dimension: usize, dimension: usize,
types: BTreeMap<String, EmbedTypeSpec>, types: BTreeMap<String, EmbedTypeSpec>,
} }
@ -296,7 +294,14 @@ pub(crate) async fn run_embed_job(job: &EmbedJob) -> Result<EmbedOutput> {
cleaned_rows, cleaned_rows,
mode: job.mode.as_str(!job.selectors.is_empty()), mode: job.mode.as_str(!job.selectors.is_empty()),
dimension: job.spec.dimension, dimension: job.spec.dimension,
model: job.spec.model.clone(), // The embedding model is resolved solely from the provider config; the
// spec carries no model field, so there is no second source of truth to
// silently disagree with the API. Report what was actually used (empty
// for `--clean`, which builds no client).
model: client
.as_ref()
.map(|c| c.config().model.clone())
.unwrap_or_default(),
}) })
} }

View file

@ -8,8 +8,10 @@ use tokio::time::sleep;
use crate::error::{OmniError, Result}; use crate::error::{OmniError, Result};
const DEFAULT_OPENAI_BASE_URL: &str = "https://openrouter.ai/api/v1"; const DEFAULT_OPENROUTER_BASE_URL: &str = "https://openrouter.ai/api/v1";
const DEFAULT_OPENAI_MODEL: &str = "openai/text-embedding-3-large"; const DEFAULT_OPENROUTER_MODEL: &str = "openai/text-embedding-3-large";
const DEFAULT_OPENAI_BASE_URL: &str = "https://api.openai.com/v1";
const DEFAULT_OPENAI_MODEL: &str = "text-embedding-3-large";
const DEFAULT_GEMINI_BASE_URL: &str = "https://generativelanguage.googleapis.com/v1beta"; const DEFAULT_GEMINI_BASE_URL: &str = "https://generativelanguage.googleapis.com/v1beta";
const DEFAULT_GEMINI_MODEL: &str = "gemini-embedding-2"; const DEFAULT_GEMINI_MODEL: &str = "gemini-embedding-2";
const DEFAULT_TIMEOUT_MS: u64 = 30_000; const DEFAULT_TIMEOUT_MS: u64 = 30_000;
@ -35,24 +37,6 @@ pub enum Provider {
Mock, Mock,
} }
impl Provider {
fn default_base_url(self) -> &'static str {
match self {
Provider::OpenAiCompatible => DEFAULT_OPENAI_BASE_URL,
Provider::Gemini => DEFAULT_GEMINI_BASE_URL,
Provider::Mock => "",
}
}
fn default_model(self) -> &'static str {
match self {
Provider::OpenAiCompatible => DEFAULT_OPENAI_MODEL,
Provider::Gemini => DEFAULT_GEMINI_MODEL,
Provider::Mock => "",
}
}
}
/// Whether the text being embedded is a search query or a stored document. /// Whether the text being embedded is a search query or a stored document.
/// Only Gemini distinguishes these (`RETRIEVAL_QUERY` vs `RETRIEVAL_DOCUMENT`); /// Only Gemini distinguishes these (`RETRIEVAL_QUERY` vs `RETRIEVAL_DOCUMENT`);
/// OpenAI-compatible providers and Mock produce the identical request for both, /// OpenAI-compatible providers and Mock produce the identical request for both,
@ -89,24 +73,39 @@ impl EmbeddingConfig {
return Ok(Self::mock()); return Ok(Self::mock());
} }
let provider = match env_string("OMNIGRAPH_EMBED_PROVIDER").as_deref() { // The default base URL and model depend on the provider *alias*, not just
None | Some("openai-compatible") | Some("openai") => Provider::OpenAiCompatible, // the wire shape: `openai-compatible` (and the unset default) point at the
Some("gemini") => Provider::Gemini, // OpenRouter gateway, while `openai` points at OpenAI's own host.
Some("mock") => return Ok(Self::mock()), let (provider, default_base, default_model) =
Some(other) => { match env_string("OMNIGRAPH_EMBED_PROVIDER").as_deref() {
return Err(OmniError::manifest_internal(format!( None | Some("openai-compatible") => (
"unknown OMNIGRAPH_EMBED_PROVIDER '{}' (expected openai-compatible|gemini|mock)", Provider::OpenAiCompatible,
other DEFAULT_OPENROUTER_BASE_URL,
))); DEFAULT_OPENROUTER_MODEL,
} ),
}; Some("openai") => (
Provider::OpenAiCompatible,
DEFAULT_OPENAI_BASE_URL,
DEFAULT_OPENAI_MODEL,
),
Some("gemini") => {
(Provider::Gemini, DEFAULT_GEMINI_BASE_URL, DEFAULT_GEMINI_MODEL)
}
Some("mock") => return Ok(Self::mock()),
Some(other) => {
return Err(OmniError::manifest_internal(format!(
"unknown OMNIGRAPH_EMBED_PROVIDER '{}' (expected openai-compatible|openai|gemini|mock)",
other
)));
}
};
let base_url = env_string("OMNIGRAPH_EMBED_BASE_URL") let base_url = env_string("OMNIGRAPH_EMBED_BASE_URL")
.unwrap_or_else(|| provider.default_base_url().to_string()) .unwrap_or_else(|| default_base.to_string())
.trim_end_matches('/') .trim_end_matches('/')
.to_string(); .to_string();
let model = let model =
env_string("OMNIGRAPH_EMBED_MODEL").unwrap_or_else(|| provider.default_model().to_string()); env_string("OMNIGRAPH_EMBED_MODEL").unwrap_or_else(|| default_model.to_string());
let api_key = match provider { let api_key = match provider {
Provider::OpenAiCompatible => env_string("OPENROUTER_API_KEY") Provider::OpenAiCompatible => env_string("OPENROUTER_API_KEY")
@ -827,11 +826,25 @@ mod tests {
let _guard = cleared_env(&[("OPENROUTER_API_KEY", Some("sk-test"))]); let _guard = cleared_env(&[("OPENROUTER_API_KEY", Some("sk-test"))]);
let config = EmbeddingConfig::from_env().unwrap(); let config = EmbeddingConfig::from_env().unwrap();
assert_eq!(config.provider, Provider::OpenAiCompatible); assert_eq!(config.provider, Provider::OpenAiCompatible);
assert_eq!(config.base_url, DEFAULT_OPENAI_BASE_URL); assert_eq!(config.base_url, DEFAULT_OPENROUTER_BASE_URL);
assert_eq!(config.model, DEFAULT_OPENAI_MODEL); assert_eq!(config.model, DEFAULT_OPENROUTER_MODEL);
assert_eq!(config.api_key, "sk-test"); assert_eq!(config.api_key, "sk-test");
} }
#[test]
#[serial]
fn from_env_openai_alias_uses_openai_host_not_openrouter() {
let _guard = cleared_env(&[
("OMNIGRAPH_EMBED_PROVIDER", Some("openai")),
("OPENAI_API_KEY", Some("k")),
]);
let config = EmbeddingConfig::from_env().unwrap();
assert_eq!(config.provider, Provider::OpenAiCompatible);
assert_eq!(config.base_url, DEFAULT_OPENAI_BASE_URL); // api.openai.com, not OpenRouter
assert_eq!(config.model, DEFAULT_OPENAI_MODEL); // text-embedding-3-large, no openai/ prefix
assert_eq!(config.api_key, "k");
}
#[test] #[test]
#[serial] #[serial]
fn from_env_openai_compatible_prefers_openrouter_key() { fn from_env_openai_compatible_prefers_openrouter_key() {

View file

@ -177,4 +177,4 @@ For all three modes, a mid-load failure (RI / cardinality violation, validation
## Embeddings during load ## Embeddings during load
If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](../user/search/embeddings.md). The loader does **not** embed `@embed` properties at load time. `@embed` is a catalog annotation consumed by query typecheck/lint; vectors are supplied directly in the load data, or pre-filled by the offline `omnigraph embed` pipeline. Query-time `nearest($v, "string")` auto-embeds the query string via the provider-independent embedding client. See [embeddings.md](../user/search/embeddings.md). (Ingest-time `@embed` execution is a planned RFC-012 phase.)

View file

@ -45,7 +45,7 @@ Edge bodies only allow `@unique` and `@index`.
- `@<ident>` or `@<ident>(<literal>)` on any declaration or property. - `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
- Known annotations: - Known annotations:
- `@embed` on a Vector property — names the *source* property whose text gets embedded into this vector at ingest. - `@embed("source_property")` on a Vector property — records which String property is the embedding source for query-time `nearest($v, "string")` auto-embedding. It is a catalog annotation; it does **not** populate the vector at ingest (supply vectors in load data, or pre-fill via the offline `omnigraph embed` pipeline).
- `@description("…")`, `@instruction("…")` on query declarations (carried through to clients). - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation. - Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.

View file

@ -20,9 +20,9 @@ the target column width and sent as Gemini `outputDimensionality` / OpenAI `dime
| Variable | Meaning | | Variable | Meaning |
|---|---| |---|---|
| `OMNIGRAPH_EMBED_PROVIDER` | `openai-compatible` (default) \| `gemini` \| `mock` | | `OMNIGRAPH_EMBED_PROVIDER` | `openai-compatible` (default, → OpenRouter) \| `openai` (→ OpenAI's own host) \| `gemini` \| `mock` |
| `OMNIGRAPH_EMBED_BASE_URL` | endpoint base; default `https://openrouter.ai/api/v1` (openai-compatible) or `https://generativelanguage.googleapis.com/v1beta` (gemini) | | `OMNIGRAPH_EMBED_BASE_URL` | endpoint base; defaults `https://openrouter.ai/api/v1` (`openai-compatible`/unset), `https://api.openai.com/v1` (`openai`), `https://generativelanguage.googleapis.com/v1beta` (`gemini`) |
| `OMNIGRAPH_EMBED_MODEL` | model id; default `openai/text-embedding-3-large` (openai-compatible) or `gemini-embedding-2` (gemini) | | `OMNIGRAPH_EMBED_MODEL` | model id; defaults `openai/text-embedding-3-large` (OpenRouter), `text-embedding-3-large` (`openai`), `gemini-embedding-2` (`gemini`) |
| `OPENROUTER_API_KEY` / `OPENAI_API_KEY` | api key for `openai-compatible` (OpenRouter preferred) | | `OPENROUTER_API_KEY` / `OPENAI_API_KEY` | api key for `openai-compatible` (OpenRouter preferred) |
| `GEMINI_API_KEY` | api key for `gemini` | | `GEMINI_API_KEY` | api key for `gemini` |
| `OMNIGRAPH_EMBED_QUERY_DEADLINE_MS` | total wall-clock budget for one embed call across all retries (default `60000`; `0` = unbounded) | | `OMNIGRAPH_EMBED_QUERY_DEADLINE_MS` | total wall-clock budget for one embed call across all retries (default `60000`; `0` = unbounded) |