
fix: replace deprecated Gemini embedding model (text-embedding-004 → gemini-embedding-001)#174

Open
vincenzopalazzo wants to merge 1 commit into CaviraOSS:main from vincenzopalazzo:fix/gemini-embedding-model

Conversation

@vincenzopalazzo (Contributor)

Problem

text-embedding-004 is no longer served by Google's Generative Language API. Calls to /v1beta/models/text-embedding-004:batchEmbedContents (and the single-content embedContent variant) return HTTP 404 for newly issued API keys.

This means a fresh deployment of OpenMemory configured with OM_EMBEDDINGS=gemini falls back to synthetic embeddings on every call. Logs look like:

[EMBED] Provider: gemini, Tier: smart, Sector: semantic
[EMBED] Gemini error (1/3): Gemini: 404
[EMBED] Gemini error (2/3): Gemini: 404
[EMBED] Gemini batch failed, falling back to sequential: Error: Gemini failed after 3 attempts: Gemini: 404
[EMBED] gemini failed: Gemini failed after 3 attempts: Gemini: 404, trying synthetic
[EMBED] Fallback to synthetic succeeded for sector: semantic

The fallback is silent at the API surface (queries still return), but recall quality drops to the synthetic baseline — semantic search effectively stops working.

Verified against the live API on 2026-05-09 with a fresh AI Studio key:

GET  /v1beta/models?key=…
→ models/gemini-embedding-001        ✅ supports embedContent, asyncBatchEmbedContent
   models/gemini-embedding-2-preview ✅
   models/gemini-embedding-2         ✅
   (text-embedding-004 / embedding-001 are not listed)

POST /v1beta/models/text-embedding-004:embedContent     → 404
POST /v1beta/models/gemini-embedding-001:embedContent   → 200, returns vector
POST /v1beta/models/gemini-embedding-001:batchEmbedContents → 200

The older embedding-001 (which get_defaults() in core/models.ts falls back to) is also deprecated.

Fix

Replace both deprecated model names with gemini-embedding-001 in three places:

| File | Change |
| --- | --- |
| packages/openmemory-js/src/memory/embed.ts | Hardcoded URL + request-body model field in emb_gemini() (lines 310, 314), and the getEmbeddingInfo() self-report (line 708). |
| models.yml | gemini: entry for all 5 sectors + the documentation comment block at the bottom. |
| packages/openmemory-js/src/core/models.ts | get_defaults() fallback for all 5 sectors. |

gemini-embedding-001 exposes the same batchEmbedContents endpoint with the same request body shape (model, content, taskType); the response vectors get resized to env.vec_dim via the existing resize_vec helper, so no caller-side changes are needed.
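The resize step matters because gemini-embedding-001 defaults to a larger output dimensionality than text-embedding-004's 768. The real helper lives in the repo; a minimal sketch of what truncate-or-pad resizing looks like, assuming those semantics (function name and behavior are assumptions, not the actual resize_vec implementation):

```typescript
// Hypothetical sketch of a resize_vec-style helper: truncate vectors that
// are too long, zero-pad vectors that are too short. The real resize_vec
// in openmemory-js may differ (e.g. it might renormalize after truncation).
function resizeVec(v: number[], dim: number): number[] {
  if (v.length === dim) return v;
  if (v.length > dim) return v.slice(0, dim); // drop trailing dimensions
  return v.concat(new Array(dim - v.length).fill(0)); // pad with zeros
}
```

Because this happens on every embedding before storage and query, both old and new models produce vectors of env.vec_dim, which is why callers need no changes.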

Verification

Tested on a smart-tier deployment with 159 existing memories:

  • Pre-fix: every Gemini call 404s, queries fall through to synthetic.
  • Post-fix: queries return semantically ranked matches in <1s, with no errors in logs. Paraphrase queries ("travel plans next week" matching a "Pre-Miami plan on track" memory with no shared keywords) confirm real semantic recall, not lexical fallback.

Out of scope (follow-up)

The emb_gemini() function does not call get_model(s, "gemini") like every other provider — it hardcodes the model name in the URL. That means models.yml's gemini: entries are loaded by models.ts but never read by the Gemini path. A separate PR should refactor emb_gemini to use get_model (and ideally also support a per-call outputDimensionality query param). I left that out of this PR to keep the scope to a model-name correction.
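The follow-up refactor described above could look roughly like this. A sketch only: geminiEmbedUrl is a hypothetical helper, and the commented get_model call assumes the signature implied by the surrounding text rather than the actual one in core/models.ts:

```typescript
// Hypothetical sketch: derive the Gemini endpoint from a configured model
// name instead of hardcoding it in emb_gemini().
const GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta/models";

function geminiEmbedUrl(model: string, key: string, batch: boolean): string {
  const method = batch ? "batchEmbedContents" : "embedContent";
  return `${GEMINI_BASE}/${model}:${method}?key=${key}`;
}

// Inside emb_gemini(), roughly:
//   const model = get_model(s, "gemini");           // would read models.yml
//   const url = geminiEmbedUrl(model, env.gemini_key, true);
```

That would make the models.yml gemini: entries actually take effect, matching how the other providers resolve their model names.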

Also worth considering: an OM_GEMINI_MODEL env override (mirroring OM_OPENAI_MODEL / OM_OLLAMA_MODEL) so users can opt into gemini-embedding-2 without code changes.
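Such an override could be very small. A sketch, taking the OM_GEMINI_MODEL name from the suggestion above and this PR's model as the default (the helper itself is hypothetical):

```typescript
// Hypothetical env override, mirroring OM_OPENAI_MODEL / OM_OLLAMA_MODEL.
function geminiModel(env: Record<string, string | undefined> = process.env): string {
  return env.OM_GEMINI_MODEL ?? "gemini-embedding-001";
}
```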

…i-embedding-001

The Gemini text-embedding-004 model is no longer served by Google's
Generative Language API — calls to /v1beta/models/text-embedding-004:*
return 404 for newly-issued API keys (verified 2026-05-09). Likewise,
the older models/embedding-001 referenced in get_defaults() is
deprecated.

Replace both with the current GA model gemini-embedding-001:

- packages/openmemory-js/src/memory/embed.ts: hardcoded URL + request
  body model field (the gemini path does not yet read from models.yml
  via get_model() — that's left as a follow-up).
- models.yml: gemini entry for all 5 sectors + the documentation
  comment block at the bottom.
- packages/openmemory-js/src/core/models.ts: get_defaults() fallback
  for all 5 sectors.

The new model exposes the same batchEmbedContents endpoint with the
same request body shape (model, content, taskType) and returns vectors
that are resized to env.vec_dim via the existing resize_vec helper, so
no caller-side changes are needed.

Verified on a smart-tier deployment (159 memories): queries that were
falling through to synthetic fallback ("[EMBED] gemini failed:
Gemini failed after 3 attempts: Gemini: 404") now return semantically-
ranked matches in <1s.

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 007e3cb72d


  if (!env.gemini_key) throw new Error("Gemini key missing");
  const prom = gem_q.then(async () => {
-   const url = `https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContents?key=${env.gemini_key}`;
+   const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContents?key=${env.gemini_key}`;

P1: Re-embed vectors when switching Gemini model

When upgrading a database that already contains Gemini vectors produced by text-embedding-004/embedding-001, this line changes query and newly-written embeddings to gemini-embedding-001 without invalidating the old stored vectors. The vector table stores only v/dim and no model identity (packages/openmemory-js/src/core/db.ts:161), while search compares the new query vector against every stored vector in the sector (packages/openmemory-js/src/core/vector/postgres.ts:58), so existing memories remain in the old embedding space and semantic ranking becomes unreliable until they are re-embedded. Please add a rebuild/migration path or a model-version guard before silently switching the default.
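One shape such a model-version guard could take, sketched here under stated assumptions: the codebase does not currently store a model identity per vector (as the comment notes), so the SectorMeta field and function names below are hypothetical, not existing APIs:

```typescript
// Hypothetical guard: detect when stored vectors came from a different
// embedding model than the one currently configured, and flag a re-embed
// instead of silently mixing embedding spaces.
interface SectorMeta {
  embedModel?: string; // model that produced the stored vectors, if recorded
}

function checkEmbedModel(meta: SectorMeta, current: string): "ok" | "reembed" {
  if (!meta.embedModel) return "reembed"; // legacy rows: provenance unknown
  return meta.embedModel === current ? "ok" : "reembed";
}
```

A migration path could then re-embed flagged sectors in the background, or refuse to start with a clear error, rather than degrading ranking quality silently.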

