fix(gemini): native session IDs and process lifecycle reliability#690
fix(gemini): native session IDs and process lifecycle reliability#690menny wants to merge 5 commits intositeboon:mainfrom
Conversation
…g requests SessionManager loads all persisted Gemini sessions from ~/.gemini/sessions/*.json asynchronously in its constructor via `this.ready = this.init()`. Previously, `sessionManager.ready` was never awaited anywhere, so the server would start accepting WebSocket connections before loadSessions() had completed. This created a race condition on the very first request after a server restart: if a client tried to resume an existing Gemini session, `sessionManager.getSession(sessionId)` would return undefined (the session hadn't been loaded from disk yet), causing the `--resume <cliSessionId>` flag to be silently omitted when spawning the Gemini CLI process. The CLI would then start a blank new session instead of continuing the prior conversation, with no error or indication to the user. The fix adds `await sessionManager.ready` in startServer() immediately after `await initializeDatabase()`, following the same pattern already established for database initialization. This guarantees the session store is fully populated before any request handler can call getSession(), createSession(), or addMessage(). No other providers are affected: Claude, Cursor, and Codex session providers are stateless per-request (they read from disk or SQLite on demand) and have no equivalent eager-loading singleton. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fetchHistory used the UI's sessionId (e.g., gemini_<timestamp>) when falling back to getGeminiCliSessionMessages. However, the Gemini CLI stores its history keyed by its native cliSessionId. This caused the fallback to fail silently, preventing the restoration of conversation history from disk for existing sessions. The fix extracts the session object from sessionManager and uses session.cliSessionId if available, ensuring history is correctly retrieved using the CLI's native identifier.
This fix addresses several interconnected issues in Gemini session management: 1. Prevent Orphaned Processes (Issue #1): now checks for and aborts any existing process for a session before starting a new one. This prevents multiple CLI processes from running concurrently for the same session. 2. Defensive Session Readiness (Issue siteboon#3): Added in to ensure session data is fully loaded from disk before attempting to look up . 3. Improved SIGKILL Logic (Issue siteboon#11): now verifies the specific process object is still active before sending SIGKILL, preventing stale key issues and accidental killing of replacement processes. 4. Diagnostic Logging: Added warnings in when a session is found but its is missing, making it easier to diagnose CLI initialization failures. These changes ensure the --resume flag is reliably passed and system resources are properly managed during rapid interactions.
This refactoring eliminates duplicate Gemini sessions in the sidebar by
aligning the session ID lifecycle with the Claude SDK implementation:
1. Native ID Capture: Shifted session initialization from the raw stdout handler
to the structured 'onInit' event. The system now waits for the Gemini CLI
to emit its native 'session_id' (e.g., chat_...) before creating the session
in the session manager.
2. Unified Identifier: By using the native CLI ID as the primary key for both
the UI and disk storage, we ensure that projects.js can correctly merge UI
and CLI sessions based on a single, shared ID.
3. Fallback Logic: Maintained random-suffixed fallback ID generation for
unstructured plain-text mode to ensure robustness.
These changes root out the cause of duplicate sidebar entries and simplify
session management across the Gemini provider.
Implemented a 'settled' flag and 'settleOnce' helper in spawnGemini to ensure the process promise only resolves or rejects once. This prevents duplicate error messages from being sent to the client when both 'error' and 'close' events fire, and ensures that the execution timeout is properly cleared when the process finishes or fails.
📝 WalkthroughWalkthroughThe pull request improves Gemini CLI process lifecycle management and server initialization sequencing. The server now waits for the session manager to load before accepting requests. The CLI process spawner now waits for session manager readiness, prevents orphaned processes by aborting existing instances, defers session creation until Gemini provides a session ID, and ensures lifecycle operations resolve only once. Session history retrieval now correctly maps internal session IDs to CLI session identifiers. Changes
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/gemini-cli.js (1)
414-463:⚠️ Potential issue | 🟠 MajorClose handler can remove the replacement process from
activeGeminiProcessesafter an abort+respawn.With the new "abort existing process before spawning" logic (lines 23‑27), the sequence becomes:
abortGeminiSession(sessionId)sends SIGTERM to P1 but leaves it in the map.- The new process P2 is registered at line 240, overwriting the map entry for
sessionId.- P1 eventually emits
close; its handler runs with its own closure wherefinalSessionId === sessionId.- Line 425
activeGeminiProcesses.delete(finalSessionId)now removes P2 from the map, soisGeminiSessionActive,getActiveGeminiSessions, and subsequentabortGeminiSessioncalls silently fail on the live session.Additionally, line 432 sends
kind: 'complete'for P1 over the shared writer while P2 is streaming, which can prematurely terminate the client UI for the active run.Mirror the identity check you already added in
abortGeminiSession(lines 504‑514) so the close handler only mutates the map/ws when it still owns the entry.🔒 Proposed guard
- // Clean up process reference - const finalSessionId = capturedSessionId || sessionId || processKey; - activeGeminiProcesses.delete(finalSessionId); - - // Save assistant response to session if we have one - if (finalSessionId && assistantBlocks.length > 0) { - sessionManager.addMessage(finalSessionId, 'assistant', assistantBlocks); - } - - ws.send(createNormalizedMessage({ kind: 'complete', exitCode: code, isNewSession: !sessionId && !!command, sessionId: finalSessionId, provider: 'gemini' })); + // Clean up process reference only if this process still owns the entry. + // A prior spawn may have been aborted and replaced under the same key. + const finalSessionId = capturedSessionId || sessionId || processKey; + const stillOwned = activeGeminiProcesses.get(finalSessionId) === geminiProcess; + if (stillOwned) { + activeGeminiProcesses.delete(finalSessionId); + } + + // Save assistant response to session if we have one + if (finalSessionId && assistantBlocks.length > 0) { + sessionManager.addMessage(finalSessionId, 'assistant', assistantBlocks); + } + + if (stillOwned) { + ws.send(createNormalizedMessage({ kind: 'complete', exitCode: code, isNewSession: !sessionId && !!command, sessionId: finalSessionId, provider: 'gemini' })); + }The same
stillOwnedguard should be applied at line 469 in theerrorhandler for symmetry.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/gemini-cli.js` around lines 414 - 463, The close handler currently unconditionally deletes activeGeminiProcesses and sends completion messages for any process with the same session id, which can remove a newly spawned replacement; change the handler to first verify ownership by comparing the activeGeminiProcesses entry for finalSessionId against this geminiProcess (e.g., get the map entry and ensure it === geminiProcess or matches a stored pid/unique id) and only then delete the map entry, send the 'complete' message and clean up temp files; apply the same ownership guard in the error handler so it likewise only mutates activeGeminiProcesses and sends error messages if the closed/errored process is still the registered owner (referencing activeGeminiProcesses, finalSessionId, geminiProcess, ws.send, abortGeminiSession).
🧹 Nitpick comments (1)
server/modules/providers/list/gemini/gemini-sessions.provider.ts (1)
125-127: Logic is correct; consider typingcliSessionIdon the session shape.Functionally fine: in the new native-ID flow
sessionIdis already the CLI identifier, and legacy sessions carrycliSessionId, so the fallback chain works. Theas anycast is the only rough edge — adding an optionalcliSessionId?: stringto theSession/SessionRecordtype returned bysessionManager.getSessionwould drop the cast and document the native-ID relationship at the type level.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/modules/providers/list/gemini/gemini-sessions.provider.ts` around lines 125 - 127, The current code uses a cast (session as any)?.cliSessionId to access cliSessionId; instead, add an optional property cliSessionId?: string to the Session or SessionRecord type returned by sessionManager.getSession so the shape documents that legacy sessions may have cliSessionId while native sessions use sessionId, then update the usage in gemini-sessions.provider.ts (where sessionManager.getSession is called and cliSessionId is computed) to remove the any cast and read (session.cliSessionId || sessionId), keeping the fallback to getGeminiCliSessionMessages(cliSessionId) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/gemini-cli.js`:
- Line 461: The line calling settleOnce currently has one too many closing
parentheses which causes a syntax error; update the call to settleOnce so it
wraps the arrow function that calls reject(new Error(...)) with the correct
number of closing parentheses and terminate the statement with a semicolon —
i.e., ensure settleOnce( ... ) closes only once after the arrow function and
that reject(...) and new Error(...) are properly closed; adjust the invocation
around settleOnce, reject, new Error and the code variable accordingly.
---
Outside diff comments:
In `@server/gemini-cli.js`:
- Around line 414-463: The close handler currently unconditionally deletes
activeGeminiProcesses and sends completion messages for any process with the
same session id, which can remove a newly spawned replacement; change the
handler to first verify ownership by comparing the activeGeminiProcesses entry
for finalSessionId against this geminiProcess (e.g., get the map entry and
ensure it === geminiProcess or matches a stored pid/unique id) and only then
delete the map entry, send the 'complete' message and clean up temp files; apply
the same ownership guard in the error handler so it likewise only mutates
activeGeminiProcesses and sends error messages if the closed/errored process is
still the registered owner (referencing activeGeminiProcesses, finalSessionId,
geminiProcess, ws.send, abortGeminiSession).
---
Nitpick comments:
In `@server/modules/providers/list/gemini/gemini-sessions.provider.ts`:
- Around line 125-127: The current code uses a cast (session as
any)?.cliSessionId to access cliSessionId; instead, add an optional property
cliSessionId?: string to the Session or SessionRecord type returned by
sessionManager.getSession so the shape documents that legacy sessions may have
cliSessionId while native sessions use sessionId, then update the usage in
gemini-sessions.provider.ts (where sessionManager.getSession is called and
cliSessionId is computed) to remove the any cast and read (session.cliSessionId
|| sessionId), keeping the fallback to getGeminiCliSessionMessages(cliSessionId)
intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 16e1dde7-4fed-42fc-bfca-4c0f89250f3b
📒 Files selected for processing (3)
server/gemini-cli.jsserver/index.jsserver/modules/providers/list/gemini/gemini-sessions.provider.ts
| error: code === null ? 'Gemini CLI process was terminated or timed out' : null | ||
| }); | ||
| reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`)); | ||
| settleOnce(() => reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`)))); |
There was a problem hiding this comment.
Syntax error — extra closing parenthesis breaks the module.
There are four ) after the template literal but only three opens (settleOnce(, reject(, new Error(). Biome flagged this (Expected a semicolon or an implicit semicolon after a statement, but found none). As written, this file will fail to parse and the entire Gemini provider becomes unusable.
🐛 Proposed fix
- settleOnce(() => reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`))));
+ settleOnce(() => reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`)));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| settleOnce(() => reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`)))); | |
| settleOnce(() => reject(new Error(code === null ? 'Gemini CLI process was terminated or timed out' : `Gemini CLI exited with code ${code}`))); |
🧰 Tools
🪛 Biome (2.4.11)
[error] 461-461: Expected a semicolon or an implicit semicolon after a statement, but found none
(parse)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@server/gemini-cli.js` at line 461, The line calling settleOnce currently has
one too many closing parentheses which causes a syntax error; update the call to
settleOnce so it wraps the arrow function that calls reject(new Error(...)) with
the correct number of closing parentheses and terminate the statement with a
semicolon — i.e., ensure settleOnce( ... ) closes only once after the arrow
function and that reject(...) and new Error(...) are properly closed; adjust the
invocation around settleOnce, reject, new Error and the code variable
accordingly.
Summary
This Pull Request significantly improves the Gemini provider by adopting the native session ID architecture and fixing several process management bugs.
Key Changes
session_id(e.g.,chat_...). This aligns Gemini with the Claude SDK implementation and natively resolves the "duplicate sessions in sidebar" issue.spawnGemininow proactively checks for and aborts any existing process for a session before starting a new one, preventing background resource leaks.--resumeflag is always passed for existing conversations.settleOncepattern to ensure the process promise only resolves/rejects once, preventing duplicate error messages to the client.Impact
Summary by CodeRabbit
Bug Fixes
Improvements