fix: replace deprecated Gemini embedding model (text-embedding-004 → gemini-embedding-001) #174
The Gemini text-embedding-004 model is no longer served by Google's
Generative Language API — calls to /v1beta/models/text-embedding-004:*
return 404 for newly-issued API keys (verified 2026-05-09). Likewise,
the older models/embedding-001 referenced in get_defaults() is
deprecated.
Replace both with the current GA model gemini-embedding-001:
- packages/openmemory-js/src/memory/embed.ts: hardcoded URL + request
body model field (the gemini path does not yet read from models.yml
via get_model() — that's left as a follow-up).
- models.yml: gemini entry for all 5 sectors + the documentation
comment block at the bottom.
- packages/openmemory-js/src/core/models.ts: get_defaults() fallback
for all 5 sectors.
The new model exposes the same batchEmbedContents endpoint with the
same request body shape (model, content, taskType) and returns vectors
that are resized to env.vec_dim via the existing resize_vec helper, so
no caller-side changes are needed.
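As a reference for the shape described above, here is a minimal TypeScript sketch. The field names (`model`, `content`, `taskType`) follow the batchEmbedContents request body named in this commit message; the pad-or-truncate `resizeVec` below is a hypothetical stand-in for the real `resize_vec` helper, whose actual strategy may differ.

```typescript
// Request body shape for batchEmbedContents, per the commit message above.
type EmbedRequest = {
  requests: {
    model: string;                        // e.g. "models/gemini-embedding-001"
    content: { parts: { text: string }[] };
    taskType?: string;                    // e.g. "RETRIEVAL_QUERY"
  }[];
};

function buildBatchBody(model: string, texts: string[], taskType: string): EmbedRequest {
  return {
    requests: texts.map((text) => ({
      model: `models/${model}`,
      content: { parts: [{ text }] },
      taskType,
    })),
  };
}

// Hypothetical resize: truncate when too long, zero-pad when too short.
// The real resize_vec in openmemory-js may do something more sophisticated.
function resizeVec(v: number[], dim: number): number[] {
  if (v.length === dim) return v;
  if (v.length > dim) return v.slice(0, dim);
  return [...v, ...new Array(dim - v.length).fill(0)];
}

const body = buildBatchBody("gemini-embedding-001", ["hello"], "RETRIEVAL_DOCUMENT");
body.requests[0].model;       // → "models/gemini-embedding-001"
resizeVec([1, 2, 3], 5);      // → [1, 2, 3, 0, 0]
```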
Verified on a smart-tier deployment (159 memories): queries that were
falling through to synthetic fallback ("[EMBED] gemini failed:
Gemini failed after 3 attempts: Gemini: 404") now return semantically-
ranked matches in <1s.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 007e3cb72d
```diff
   if (!env.gemini_key) throw new Error("Gemini key missing");
   const prom = gem_q.then(async () => {
-    const url = `https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContents?key=${env.gemini_key}`;
+    const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContents?key=${env.gemini_key}`;
```
Re-embed vectors when switching Gemini model
When upgrading a database that already contains Gemini vectors produced by text-embedding-004/embedding-001, this line changes query and newly-written embeddings to gemini-embedding-001 without invalidating the old stored vectors. The vector table stores only v/dim and no model identity (packages/openmemory-js/src/core/db.ts:161), while search compares the new query vector against every stored vector in the sector (packages/openmemory-js/src/core/vector/postgres.ts:58), so existing memories remain in the old embedding space and semantic ranking becomes unreliable until they are re-embedded. Please add a rebuild/migration path or a model-version guard before silently switching the default.
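One minimal shape for the guard the reviewer asks for is sketched below: stamp the database with the model that produced its vectors, and detect a mismatch before trusting search. This is a hypothetical design, not existing openmemory-js code — the `MetaStore` interface and the `embed_model` key are invented for illustration, since (per the comment above) the vector table currently stores only `v`/`dim`.

```typescript
// Sketch of a model-version guard: record which embedding model produced the
// stored vectors, and flag when the active model no longer matches.
// "MetaStore" and the "embed_model" key are hypothetical, not real APIs.
interface MetaStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Returns true when stored vectors must be re-embedded before search is trusted.
function needsReembed(meta: MetaStore, activeModel: string): boolean {
  const stored = meta.get("embed_model");
  if (stored === undefined) {
    // Fresh database: stamp the active model and proceed.
    meta.set("embed_model", activeModel);
    return false;
  }
  return stored !== activeModel;
}

// Example with an in-memory store:
const m = new Map<string, string>();
const store: MetaStore = { get: (k) => m.get(k), set: (k, v) => { m.set(k, v); } };
needsReembed(store, "text-embedding-004");    // false: fresh DB, stamps the model
needsReembed(store, "gemini-embedding-001");  // true: old vectors need re-embedding
```

On mismatch, the caller could either refuse to start, or kick off a background re-embed of every stored memory and re-stamp `embed_model` when done.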
Problem

`text-embedding-004` is no longer served by Google's Generative Language API. Calls to `/v1beta/models/text-embedding-004:batchEmbedContents` (and the single-content variant) return HTTP 404 for newly-issued API keys. This means a fresh deployment of OpenMemory configured with `OM_EMBEDDINGS=gemini` falls back to synthetic embeddings on every call. Logs look like:

```
[EMBED] gemini failed: Gemini failed after 3 attempts: Gemini: 404
```

The fallback is silent at the API surface (queries still return), but recall quality drops to the synthetic baseline — semantic search effectively stops working.

Verified against the live API on 2026-05-09 with a fresh AI Studio key:

The older `embedding-001` (which `get_defaults()` in `core/models.ts` falls back to) is also deprecated.

Fix

Replace both deprecated model names with `gemini-embedding-001` in three places:

- `packages/openmemory-js/src/memory/embed.ts`: the `model` field in `emb_gemini()` (lines 310, 314), and the `getEmbeddingInfo()` self-report (line 708).
- `models.yml`: the `gemini:` entry for all 5 sectors + the documentation comment block at the bottom.
- `packages/openmemory-js/src/core/models.ts`: the `get_defaults()` fallback for all 5 sectors.

`gemini-embedding-001` exposes the same `batchEmbedContents` endpoint with the same request body shape (`model`, `content`, `taskType`); the response vectors get resized to `env.vec_dim` via the existing `resize_vec` helper, so no caller-side changes are needed.

Verification
Tested on a smart-tier deployment with 159 existing memories: queries with no shared keywords (e.g. "travel plans next week" → matches a "Pre-Miami plan on track" memory) confirm real semantic recall, not lexical fallback.

Out of scope (follow-up)
The `emb_gemini()` function does not call `get_model(s, "gemini")` like every other provider — it hardcodes the model name in the URL. That means `models.yml`'s `gemini:` entries are loaded by `models.ts` but never read by the Gemini path. A separate PR should refactor `emb_gemini` to use `get_model` (and ideally also support a per-call `outputDimensionality` query param). I left that out of this PR to keep the scope to a model-name correction.

Also worth considering: an `OM_GEMINI_MODEL` env override (mirroring `OM_OPENAI_MODEL` / `OM_OLLAMA_MODEL`) so users can opt into `gemini-embedding-2` without code changes.
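Such an override could be very small. The sketch below assumes the shape proposed above — `OM_GEMINI_MODEL` is a name suggested in this PR's follow-up notes, not an existing openmemory-js variable, and the URL template mirrors the one changed in this diff.

```typescript
// Sketch of the proposed OM_GEMINI_MODEL override. The variable name is a
// proposal from this PR's follow-up notes, not an existing openmemory-js knob.
type Env = Record<string, string | undefined>;

function geminiModel(env: Env): string {
  // Fall back to the GA default when the override is unset or blank.
  return env.OM_GEMINI_MODEL?.trim() || "gemini-embedding-001";
}

function geminiEmbedUrl(key: string, env: Env): string {
  const model = geminiModel(env);
  return `https://generativelanguage.googleapis.com/v1beta/models/${model}:batchEmbedContents?key=${key}`;
}

geminiEmbedUrl("KEY", {});                                        // uses gemini-embedding-001
geminiEmbedUrl("KEY", { OM_GEMINI_MODEL: "gemini-embedding-2" }); // uses the override
```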