Summary
Heavy active local client sessions still retain significantly more memory than expected because the same conversation can remain resident in multiple full in-memory representations at once:
- Session.messages as persisted StoredMessage history
- App.messages as provider-ready Message history
- display_messages as UI-rendered history
We already landed meaningful memory wins for:
- remote/startup transcript retention
- idle restored local sessions via lazy provider hydration
- reload propagation/reexec behavior
But heavy active local sessions remain expensive.
Evidence
Live profiling on heavy sessions like cat showed roughly:
- PSS: ~126 to 131 MB
- private heap: ~112 to 130 MB
- persisted session file: only ~2.3 MB
- tool result text: only ~1.7 MB
So the remaining footprint is much larger than raw transcript payload size and is likely dominated by duplicated transcript structures, copied content blocks/strings, and rendered display copies.
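To put the gap in proportion, the figures above can be compared directly. This is only arithmetic on the numbers already listed; the variable names are illustrative.

```python
# Profiling figures quoted above, in MB.
private_heap_mb = (112, 130)
session_file_mb = 2.3


def heap_to_payload_ratio(heap_mb: float, payload_mb: float) -> float:
    """How many times larger the private heap is than the persisted payload."""
    return heap_mb / payload_mb


low, high = (heap_to_payload_ratio(h, session_file_mb) for h in private_heap_mb)
print(f"private heap is about {low:.0f}x to {high:.0f}x the persisted session file")
```

On these numbers the private heap is roughly 49x to 57x the size of the persisted session file, which is why duplicated structures rather than raw payload look like the main suspect.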
Current architecture problem
Today we intentionally keep different transcript forms for different jobs, but for large active sessions the overlap is too expensive:
- Session.messages is the canonical persisted history and contains session-only metadata such as id, display_role, and token_usage
- App.messages is the live provider/runtime transcript used for turn execution, compaction, and injected context
- display_messages is the UI-facing rendered transcript
This separation is useful, but for large tool-heavy sessions it means old content can effectively be retained multiple times, once per representation.
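A minimal sketch of how one large tool result can be counted once per representation. It is written in Python purely for illustration: the field layout of StoredMessage and Message, the role mapping in to_provider, and the rendering in to_display are all assumptions, not the actual types. Python shares immutable strings by reference, so the sketch counts logical text bytes per layer instead of measuring real allocations.

```python
from dataclasses import dataclass


@dataclass
class StoredMessage:
    """Persisted form; carries session-only metadata (assumed layout)."""
    id: str
    display_role: str
    token_usage: int
    text: str


@dataclass
class Message:
    """Provider-ready form (assumed layout)."""
    role: str
    text: str


def to_provider(stored: StoredMessage) -> Message:
    # Hypothetical conversion: drops session-only metadata, keeps the full text.
    return Message(role=stored.display_role, text=stored.text)


def to_display(stored: StoredMessage) -> str:
    # Hypothetical rendering: builds a new string holding the full text again.
    return f"{stored.display_role}: {stored.text}"


def logical_text_bytes(stored_messages):
    """Return (session, provider, display) text bytes if each layer owns its copy."""
    session = sum(len(m.text.encode("utf-8")) for m in stored_messages)
    provider = sum(len(to_provider(m).text.encode("utf-8")) for m in stored_messages)
    display = sum(len(to_display(m).encode("utf-8")) for m in stored_messages)
    return session, provider, display
```

With a single 1000-byte tool result, each of the three layers accounts for about 1000 bytes, so the same content is counted roughly three times before any per-object overhead.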
Goal
Keep all three product requirements:
- stable persisted session logs
- low-latency provider/model interaction
- low-latency UI rendering
while reducing duplicate in-memory transcript state for heavy active local sessions.
Proposed direction
- Keep Session.messages as the canonical transcript source of truth.
- Make App.messages more clearly a derived/cache view rather than a permanent second full transcript.
- Make display_messages lighter for large historical tool outputs:
  - prefer previews/truncation/lazy expansion for large old tool results
  - avoid eagerly copying full large payloads when not needed
- Add memory attribution for:
  - canonical transcript bytes
  - provider cache bytes
  - display cache bytes
  - retained large tool output bytes
Likely implementation stages
Stage 1
Target large historical tool outputs first.
- Canonical full data stays in Session.messages
- Provider/display layers avoid eagerly retaining another full owned copy unless needed
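One possible shape for Stage 1 on the display side is sketched below: old, large tool results keep only a short preview and fetch the full text from the canonical transcript when expanded. The names DisplayEntry and make_display_entry, the thresholds, and the use of a plain list of strings as a stand-in for Session.messages are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

PREVIEW_CHARS = 200   # hypothetical preview length
KEEP_RECENT = 20      # hypothetical count of recent entries kept in full


@dataclass
class DisplayEntry:
    preview: str                     # text held by the display layer
    truncated: bool                  # True when preview is shorter than the source
    load_full: Callable[[], str]     # fetches full text from the canonical store

    def expand(self) -> str:
        """Return the full text, loading it lazily only when truncated."""
        return self.load_full() if self.truncated else self.preview


def make_display_entry(session_texts: List[str], index: int,
                       limit: int = PREVIEW_CHARS,
                       keep_recent: int = KEEP_RECENT) -> DisplayEntry:
    text = session_texts[index]
    is_old = index < len(session_texts) - keep_recent

    def load_full() -> str:
        # Look the text up by index instead of capturing it, so the display
        # entry does not keep its own reference to the large payload.
        return session_texts[index]

    if is_old and len(text) > limit:
        return DisplayEntry(text[:limit] + "...", True, load_full)
    return DisplayEntry(text, False, load_full)
```

The lookup-by-index design assumes canonical entries are stable while a display entry exists; if indices can shift (for example after compaction), a stable message id would be a safer key.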
Stage 2
Reduce active local duplication between Session.messages and App.messages.
- Prefer incremental hydration / cache invalidation over maintaining two full authoritative copies
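Incremental hydration with explicit invalidation could look roughly like the sketch below. ProviderView and its methods are hypothetical names; the real conversion from StoredMessage to Message is represented by an arbitrary convert callable, and the canonical transcript by a plain list.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")
P = TypeVar("P")


class ProviderView:
    """Derived, rebuildable view over the canonical transcript (sketch)."""

    def __init__(self, source: List[S], convert: Callable[[S], P]) -> None:
        self._source = source      # canonical transcript; not copied
        self._convert = convert
        self._cache: List[P] = []  # converted prefix of the source

    def messages(self) -> List[P]:
        # Incremental hydration: convert only entries appended since the
        # last call. Non-append changes must call invalidate_from() first.
        for stored in self._source[len(self._cache):]:
            self._cache.append(self._convert(stored))
        return self._cache

    def invalidate_from(self, index: int) -> None:
        # Drop cached entries at and after `index`, e.g. after an edit or
        # compaction changed the canonical transcript from that point on.
        del self._cache[index:]

    def release(self) -> None:
        # Drop the whole derived view; it can be rebuilt from the source.
        self._cache.clear()
```

Because the view can always be rebuilt from the canonical transcript, it never has to be treated as a second authoritative copy, and release() gives idle or memory-pressured sessions a cheap way to shed it.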
Stage 3
Add diagnostics so we can measure wins and catch regressions.
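As a rough illustration of what such diagnostics might report, the sketch below totals UTF-8 text bytes for the four attribution categories listed under Proposed direction. MemoryAttribution, attribute, and the 64 KiB threshold are hypothetical, and text bytes are only a proxy: real heap usage also includes allocator and per-object overhead.

```python
from dataclasses import dataclass
from typing import Iterable

LARGE_TOOL_OUTPUT_BYTES = 64 * 1024  # hypothetical "large" threshold


def utf8_len(text: str) -> int:
    return len(text.encode("utf-8"))


@dataclass
class MemoryAttribution:
    canonical_bytes: int
    provider_cache_bytes: int
    display_cache_bytes: int
    large_tool_output_bytes: int

    def duplication_ratio(self) -> float:
        """Total transcript bytes across layers divided by canonical bytes."""
        if self.canonical_bytes == 0:
            return 0.0
        total = (self.canonical_bytes + self.provider_cache_bytes
                 + self.display_cache_bytes)
        return total / self.canonical_bytes


def attribute(session_texts: Iterable[str], provider_texts: Iterable[str],
              display_texts: Iterable[str], tool_output_texts: Iterable[str],
              large_threshold: int = LARGE_TOOL_OUTPUT_BYTES) -> MemoryAttribution:
    # tool_output_texts: every tool output copy retained in any layer.
    sizes = [utf8_len(t) for t in tool_output_texts]
    return MemoryAttribution(
        canonical_bytes=sum(utf8_len(t) for t in session_texts),
        provider_cache_bytes=sum(utf8_len(t) for t in provider_texts),
        display_cache_bytes=sum(utf8_len(t) for t in display_texts),
        large_tool_output_bytes=sum(n for n in sizes if n >= large_threshold),
    )
```

A duplication ratio close to 3 would indicate three near-full copies, while a value approaching 1 would suggest the provider and display layers are holding little beyond the canonical transcript; tracking it over time is one way to catch regressions.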
Expected win
Best estimate from current profiling:
- heavy local sessions: ~40 to 70 MB reduction each
- medium-heavy sessions: ~20 to 40 MB reduction each
Not all of the heap in current heavy sessions is reclaimable, because some hot provider/UI state is still necessary, but the remaining duplication looks large enough to be worth the refactor.
Notes
This issue is specifically about active local sessions. The remote/startup and idle restore cases have already improved and should not be conflated with the remaining heavy-session problem.