Skip to content

fix(sessionManager): implement write queue and LRU eviction#689

Open
menny wants to merge 3 commits intositeboon:mainfrom
menny:fix/session-manager-resilience
Open

fix(sessionManager): implement write queue and LRU eviction#689
menny wants to merge 3 commits intositeboon:mainfrom
menny:fix/session-manager-resilience

Conversation

@menny
Copy link
Copy Markdown
Contributor

@menny menny commented Apr 23, 2026

Summary

This Pull Request hardens the Gemini session management layer to prevent data loss and improve memory efficiency.

Key Changes

  1. Per-Session Write Queue: Added a Promise-based queue to saveSession() to ensure overlapping file writes for the same session are executed sequentially. This prevents race conditions where messages could be lost during rapid interactions.
  2. LRU Eviction Logic: Upgraded the in-memory session cache from a simple FIFO (First-In-First-Out) to a Least Recently Used (LRU) policy. Sessions are now evicted based on their lastActivity timestamp rather than Map insertion order.
  3. Collision Prevention: Switched from raw Date.now() timestamps to random-suffixed identifiers for session IDs and process keys, eliminating potential state overwrites during concurrent session starts.

Impact

  • Improves the reliability of conversation history persistence.
  • Ensures active old sessions stay in memory while truly inactive ones are pruned.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed server startup synchronization to ensure session data is fully loaded before accepting requests
    • Enhanced WebSocket connection handling with improved null safety checks
  • Refactor

    • Improved session identification with randomized cryptographic suffixes to reduce collisions
    • Updated session memory management to use LRU (Least Recently Used) eviction policy
    • Optimized concurrent session writes through write-queue serialization

menny and others added 3 commits April 22, 2026 18:57
…g requests

SessionManager loads all persisted Gemini sessions from ~/.gemini/sessions/*.json
asynchronously in its constructor via `this.ready = this.init()`. Previously,
`sessionManager.ready` was never awaited anywhere, so the server would start
accepting WebSocket connections before loadSessions() had completed.

This created a race condition on the very first request after a server restart:
if a client tried to resume an existing Gemini session, `sessionManager.getSession(sessionId)`
would return undefined (the session hadn't been loaded from disk yet), causing
the `--resume <cliSessionId>` flag to be silently omitted when spawning the
Gemini CLI process. The CLI would then start a blank new session instead of
continuing the prior conversation, with no error or indication to the user.

The fix adds `await sessionManager.ready` in startServer() immediately after
`await initializeDatabase()`, following the same pattern already established for
database initialization. This guarantees the session store is fully populated
before any request handler can call getSession(), createSession(), or addMessage().

No other providers are affected: Claude, Cursor, and Codex session providers are
stateless per-request (they read from disk or SQLite on demand) and have no
equivalent eager-loading singleton.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. Concurrent Writes (Issue siteboon#5): Added a per-session Promise-based write queue to
   saveSession() to ensure that overlapping save operations for the same session
   are executed sequentially, preventing data loss.
2. LRU Eviction (Issue siteboon#7, siteboon#8): Replaced FIFO eviction with Least Recently Used (LRU)
   logic based on the 'lastActivity' timestamp. This ensures that frequently used
   old sessions aren't prematurely evicted from memory. Updated both the run-time
   eviction and the post-load cleanup logic.
Replaced raw Date.now() based identifiers with random-suffixed strings in
gemini-cli.js, cursor-cli.js, and openai-codex.js. This ensures that even if
multiple sessions or processes are started within the same millisecond, their
IDs remain unique, preventing silent state overwrites (Issue #2).
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 23, 2026

📝 Walkthrough

Walkthrough

Changes introduce cryptographic randomization to session identifiers across CLI processes, implement per-session write-queue serialization in the session manager, switch to LRU-based session eviction, and ensure server startup waits for session manager readiness before proceeding.

Changes

Cohort / File(s) Summary
Session ID Randomization
server/cursor-cli.js, server/gemini-cli.js, server/openai-codex.js
Add crypto.randomBytes() to process-tracking keys and session IDs to reduce collision likelihood when session IDs are not yet available.
Session Manager Improvements
server/sessionManager.js
Implement per-session write-queuing to serialize filesystem operations and switch session eviction from FIFO to LRU policy based on lastActivity timestamps.
Export Refactoring
server/openai-codex.js
Convert queryCodex, abortCodexSession, isCodexSessionActive, and getActiveCodexSessions from direct exported declarations to internal functions with a named export block; harden sendMessage with null/undefined WebSocket checks and readyState === 1 validation.
Startup Synchronization
server/index.js
Add await for sessionManager.ready after database initialization to ensure Gemini session store loads from disk before continuing startup.

Poem

🐰 Whisker-whiskers, the IDs now shine,
With crypto-random, they'll ne'er misalign,
Sessions queue up in perfect order,
LRU keeps them at the border,
Startup waits patient, no need to rush,
A symphony of managers, hush hush! 🎭

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main changes: write queue and LRU eviction implementation in sessionManager, which align with the primary objectives of preventing data loss and improving memory efficiency.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/sessionManager.js`:
- Around line 149-174: saveSession currently captures the session object and
enqueues a write that may run after deleteSession has removed the session and
file, causing deleted sessions to be recreated; modify saveSession (and
symmetric logic near lines 220-225) to re-check that the session still exists in
this.sessions (or a deletion flag) immediately before writing and skip the write
if missing, and update deleteSession to either await/cancel the pending Promise
in saveQueues for that sessionId or mark the session as deleted (e.g., remove
from this.sessions and delete any queue entry in this.saveQueues) so queued
writes cannot recreate the file; reference saveSession, deleteSession,
this.saveQueues, this.sessions and _safeFilePath when making the changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cdf05348-d61c-4e92-a416-d702f1d7b5a9

📥 Commits

Reviewing files that changed from the base of the PR and between f6200e3 and 909cc9d.

📒 Files selected for processing (5)
  • server/cursor-cli.js
  • server/gemini-cli.js
  • server/index.js
  • server/openai-codex.js
  • server/sessionManager.js

Comment thread server/sessionManager.js
Comment on lines +149 to +174
// Save session to disk (Issue #5 - added queuing to prevent concurrent writes)
async saveSession(sessionId) {
const session = this.sessions.get(sessionId);
if (!session) return;

try {
const filePath = this._safeFilePath(sessionId);
await fs.writeFile(filePath, JSON.stringify(session, null, 2));
} catch (error) {
// console.error('Error saving session:', error);
}
// Get current queue for this session or create new one
let currentQueue = this.saveQueues.get(sessionId) || Promise.resolve();

// Chain the new write operation
const nextQueue = currentQueue.then(async () => {
try {
const filePath = this._safeFilePath(sessionId);
const data = JSON.stringify(session, null, 2);
await fs.writeFile(filePath, data);
} catch (error) {
// console.error('Error saving session:', error);
}
}).finally(() => {
// If no other operations are pending, remove the queue reference
if (this.saveQueues.get(sessionId) === nextQueue) {
this.saveQueues.delete(sessionId);
}
});

this.saveQueues.set(sessionId, nextQueue);
return nextQueue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent queued saves from recreating deleted sessions.

saveSession() captures session before the queued write runs, while deleteSession() does not wait for or cancel pending saves. A queued write can therefore run after fs.unlink() and recreate the deleted JSON file.

Proposed fix
 async saveSession(sessionId) {
-  const session = this.sessions.get(sessionId);
-  if (!session) return;
+  if (!this.sessions.has(sessionId)) return;

   // Get current queue for this session or create new one
   let currentQueue = this.saveQueues.get(sessionId) || Promise.resolve();

   // Chain the new write operation
-  const nextQueue = currentQueue.then(async () => {
+  const nextQueue = currentQueue.catch(() => {}).then(async () => {
     try {
+      const session = this.sessions.get(sessionId);
+      if (!session) return;
+
       const filePath = this._safeFilePath(sessionId);
       const data = JSON.stringify(session, null, 2);
       await fs.writeFile(filePath, data);
     } catch (error) {
       // console.error('Error saving session:', error);
@@
 async deleteSession(sessionId) {
   this.sessions.delete(sessionId);
+
+  const pendingSave = this.saveQueues.get(sessionId);
+  if (pendingSave) {
+    await pendingSave.catch(() => {});
+    this.saveQueues.delete(sessionId);
+  }

   try {
     const filePath = this._safeFilePath(sessionId);
     await fs.unlink(filePath);

Also applies to: 220-225

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/sessionManager.js` around lines 149 - 174, saveSession currently
captures the session object and enqueues a write that may run after
deleteSession has removed the session and file, causing deleted sessions to be
recreated; modify saveSession (and symmetric logic near lines 220-225) to
re-check that the session still exists in this.sessions (or a deletion flag)
immediately before writing and skip the write if missing, and update
deleteSession to either await/cancel the pending Promise in saveQueues for that
sessionId or mark the session as deleted (e.g., remove from this.sessions and
delete any queue entry in this.saveQueues) so queued writes cannot recreate the
file; reference saveSession, deleteSession, this.saveQueues, this.sessions and
_safeFilePath when making the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant