Commit ae189c0

fix(main): capture native/renderer/GPU crashes that left zero trace
User-reported: DevDeck sometimes closes with no apparent reason, taking every cockpit session down with it.

Investigated via systematic-debugging before touching any code:

- devdeck-errors.log (the v1.12.0 uncaughtException/unhandledRejection trap) doesn't exist: that guard has never fired.
- Windows Application log has no Event 1000 ("Application Error") for DevDeck.exe/electron.exe, ever.
- Windows System log shows only clean, scheduled restarts (no Kernel-Power 41 dirty shutdown) around the relevant period.
- No WER report or crash dump for the running app (CrashDumps only has entries for the OLD v0.3.0 NSIS *installer*, unrelated, from 2026-06-07).
- Only two app.quit() call sites in the codebase, both intentional (tray Quit, second-instance lock); no hidden quit path.

This rules out a JS-catchable exception, an OS-level dirty shutdown, a classic OS-recorded native crash, and a rogue quit call. But the remaining candidates (a native crash swallowed by Electron's Crashpad handler since crashReporter was never configured, an external TerminateProcess, a renderer/GPU crash, a V8 OOM abort) all share one trait: they leave NO trace anywhere with the app's previous instrumentation. There was nothing left to investigate without adding observability first.

Added, with user's explicit "don't restart the app I'm using right now" constraint respected throughout (build/test/QA only; the running instance was never touched):

- crashReporter.start({ uploadToServer: false }): local-only minidumps for native crashes (node-pty/conpty, Chromium) that previously vanished into Crashpad with no trace. Nothing is ever uploaded.
- New installAppCrashHandlers (errorGuard.ts) wires `render-process-gone` / `child-process-gone`, Electron `app` events (not `process` events, so the existing guard couldn't see them) covering renderer/GPU/utility crashes, previously completely unmonitored. Skips reason:'clean-exit' (a normal window close/reload) to avoid noise.
- Every diagnostic log line now includes a process.memoryUsage() snapshot, so a V8 OOM pattern (long-running cockpit sessions accumulating buffered output) is visible in hindsight even though the OOM abort itself can't be caught directly.

340 tests (+5, TDD).

If DevDeck closes unexpectedly again, %APPDATA%\DevDeck\devdeck-errors.log and app.getPath('crashDumps') should now hold real evidence instead of nothing.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
1 parent e4e3695 commit ae189c0

5 files changed

Lines changed: 95 additions & 9 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ See every repo's state at a glance — git status, how long it's been neglected,
 ![License](https://img.shields.io/badge/license-MIT-blue)
 ![Platform](https://img.shields.io/badge/platform-Windows%20%7C%20macOS%20%7C%20Linux-0078D6)
 ![Built with Electron](https://img.shields.io/badge/Electron-31-47848F)
-![Tests](https://img.shields.io/badge/tests-302%20passing-3fb950)
+![Tests](https://img.shields.io/badge/tests-340%20passing-3fb950)
 ![CI](https://github.com/writingdeveloper/devdeck/actions/workflows/ci.yml/badge.svg)
 
 <img src="docs/demo/demo.gif" width="820" alt="DevDeck demo" />

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "devdeck",
-  "version": "1.12.0",
+  "version": "1.12.1",
   "description": "Project command deck — at-a-glance state + claude -c resume",
   "main": "dist/main/main.js",
   "type": "commonjs",

src/main/errorGuard.test.ts

Lines changed: 44 additions & 1 deletion
@@ -1,6 +1,6 @@
 import { describe, it, expect } from 'vitest';
 import { EventEmitter } from 'node:events';
-import { installGlobalErrorHandlers } from './errorGuard';
+import { installGlobalErrorHandlers, installAppCrashHandlers } from './errorGuard';
 
 describe('installGlobalErrorHandlers', () => {
   it('forwards an uncaughtException to onError instead of letting it terminate the process', () => {
@@ -27,3 +27,46 @@ describe('installGlobalErrorHandlers', () => {
     expect(() => proc.emit('uncaughtException', new Error('boom'))).not.toThrow();
   });
 });
+
+describe('installAppCrashHandlers', () => {
+  // render-process-gone / child-process-gone are Electron `app` events (not `process` events) that
+  // installGlobalErrorHandlers cannot see — they cover renderer/GPU/utility-process crashes, which
+  // previously left zero trace anywhere (no devdeck-errors.log entry, no Windows crash event).
+  it('reports an abnormal render-process-gone (crashed/killed/oom) with reason + exitCode', () => {
+    const seen: unknown[] = [];
+    const appLike = new EventEmitter();
+    installAppCrashHandlers(appLike, (kind, detail) => seen.push({ kind, detail }));
+    appLike.emit('render-process-gone', {}, { id: 1 }, { reason: 'crashed', exitCode: 11 });
+    expect(seen).toEqual([{ kind: 'render-process-gone', detail: { reason: 'crashed', exitCode: 11 } }]);
+  });
+
+  it('does NOT report a clean-exit render-process-gone — that is a normal window close/reload, not a crash', () => {
+    const seen: unknown[] = [];
+    const appLike = new EventEmitter();
+    installAppCrashHandlers(appLike, (kind, detail) => seen.push({ kind, detail }));
+    appLike.emit('render-process-gone', {}, { id: 1 }, { reason: 'clean-exit', exitCode: 0 });
+    expect(seen).toEqual([]);
+  });
+
+  it('reports an abnormal child-process-gone (e.g. GPU crash) with its process type', () => {
+    const seen: unknown[] = [];
+    const appLike = new EventEmitter();
+    installAppCrashHandlers(appLike, (kind, detail) => seen.push({ kind, detail }));
+    appLike.emit('child-process-gone', {}, { type: 'GPU', reason: 'crashed', exitCode: 139 });
+    expect(seen).toEqual([{ kind: 'child-process-gone', detail: { type: 'GPU', reason: 'crashed', exitCode: 139 } }]);
+  });
+
+  it('does not report a clean-exit child-process-gone', () => {
+    const seen: unknown[] = [];
+    const appLike = new EventEmitter();
+    installAppCrashHandlers(appLike, (kind, detail) => seen.push({ kind, detail }));
+    appLike.emit('child-process-gone', {}, { type: 'Utility', reason: 'clean-exit', exitCode: 0 });
+    expect(seen).toEqual([]);
+  });
+
+  it('keeps running when onEvent itself throws', () => {
+    const appLike = new EventEmitter();
+    installAppCrashHandlers(appLike, () => { throw new Error('logger broke'); });
+    expect(() => appLike.emit('render-process-gone', {}, {}, { reason: 'crashed', exitCode: 1 })).not.toThrow();
+  });
+});

src/main/errorGuard.ts

Lines changed: 23 additions & 0 deletions
@@ -24,3 +24,26 @@ export function installGlobalErrorHandlers(
   proc.on('uncaughtException', guard('uncaughtException'));
   proc.on('unhandledRejection', guard('unhandledRejection'));
 }
+
+export type AppCrashKind = 'render-process-gone' | 'child-process-gone';
+export interface AppCrashDetail { reason: string; exitCode: number; type?: string; }
+
+/**
+ * Electron's `render-process-gone` (renderer/GPU-tab crash) and `child-process-gone` (GPU/utility
+ * process crash) fire on `app`, not `process` — installGlobalErrorHandlers can't see them. Before
+ * this, a renderer or GPU crash left ZERO trace anywhere (no devdeck-errors.log entry, no Windows
+ * crash event), one of the leading suspects for DevDeck "closing for no reason". `reason ===
+ * 'clean-exit'` is a normal window close/reload, not a crash — skipped so it doesn't spam the log.
+ */
+export function installAppCrashHandlers(
+  appLike: NodeJS.EventEmitter,
+  onEvent: (kind: AppCrashKind, detail: AppCrashDetail) => void,
+): void {
+  const guard = (kind: AppCrashKind) => (...args: unknown[]) => {
+    const detail = args[args.length - 1] as AppCrashDetail;
+    if (detail?.reason === 'clean-exit') return;
+    try { onEvent(kind, detail); } catch { /* swallow — logging is best-effort */ }
+  };
+  appLike.on('render-process-gone', guard('render-process-gone'));
+  appLike.on('child-process-gone', guard('child-process-gone'));
+}
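
The guard reads the last argument rather than a fixed position because the two events differ in arity: Electron emits `render-process-gone` on `app` as `(event, webContents, details)` and `child-process-gone` as `(event, details)`. The following is a minimal, self-contained sketch of that idea, using a plain `EventEmitter` as a stand-in for `app`. `watchGone` and `GoneDetail` are hypothetical names used only for illustration; this is not the commit's implementation, and unlike it, the sketch also ignores events that arrive with no detail object at all.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical stand-in for the detail object; mirrors the fields used in the diff.
interface GoneDetail { reason: string; exitCode: number; type?: string }

// Illustrative helper (an assumption, not the commit's function): subscribes to both
// events and forwards every non-clean exit. The detail object is always the LAST
// argument, whether the event carries (event, webContents, details) or (event, details).
function watchGone(
  emitter: EventEmitter,
  report: (kind: string, detail: GoneDetail) => void,
): void {
  for (const kind of ['render-process-gone', 'child-process-gone']) {
    emitter.on(kind, (...args: unknown[]) => {
      const detail = args[args.length - 1] as GoneDetail | undefined;
      if (!detail || detail.reason === 'clean-exit') return; // normal close/reload
      try { report(kind, detail); } catch { /* reporting is best-effort */ }
    });
  }
}

// A plain emitter plays the role of Electron's `app` here.
const fakeApp = new EventEmitter();
const reported: string[] = [];
watchGone(fakeApp, (kind, d) => reported.push(`${kind}:${d.type ?? 'renderer'}:${d.reason}`));

fakeApp.emit('render-process-gone', {}, { id: 1 }, { reason: 'oom', exitCode: 1 });
fakeApp.emit('child-process-gone', {}, { type: 'GPU', reason: 'crashed', exitCode: 139 });
fakeApp.emit('child-process-gone', {}, { type: 'Utility', reason: 'clean-exit', exitCode: 0 });
```

After the three emits, `reported` holds two entries: the renderer OOM and the GPU crash. The clean utility-process exit is dropped, which is the noise filter the doc comment describes.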

src/main/main.ts

Lines changed: 26 additions & 6 deletions
@@ -1,4 +1,4 @@
-import { app, BrowserWindow, globalShortcut } from 'electron';
+import { app, BrowserWindow, globalShortcut, crashReporter } from 'electron';
 import * as path from 'node:path';
 import { appendFileSync } from 'node:fs';
 import * as nodePty from '@homebridge/node-pty-prebuilt-multiarch';
@@ -8,7 +8,13 @@ import { PtyHost, type PtySpawn } from './ptyHost';
 import { setupTray } from './tray';
 import { registerUpdater } from './updater';
 import { applyOpenAtLogin } from './autostart';
-import { installGlobalErrorHandlers } from './errorGuard';
+import { installGlobalErrorHandlers, installAppCrashHandlers } from './errorGuard';
+
+// Local-only crash capture (no upload — nothing is ever sent anywhere) so a NATIVE crash (a fault
+// inside node-pty/conpty or Chromium itself) writes an inspectable minidump instead of vanishing —
+// by default Electron's bundled Crashpad handler swallows unconfigured native crashes silently,
+// leaving no Windows Event Log entry and no trace in our own JS-level error guard below.
+crashReporter.start({ uploadToServer: false, compress: true });
 
 const realSpawn: PtySpawn = (file, args, opts) => {
   const p = nodePty.spawn(file, args, { name: 'xterm-256color', cwd: opts.cwd, cols: opts.cols, rows: opts.rows });
@@ -61,14 +67,28 @@ if (!gotLock) {
 
 app.whenReady().then(() => {
   const userData = app.getPath('userData');
+  // Every diagnostic line includes a memory snapshot: if DevDeck is dying to a V8
+  // "JavaScript heap out of memory" abort (long cockpit sessions accumulating buffered output),
+  // that abort itself bypasses uncaughtException — but a rising rss/heapUsed trend across
+  // whatever DID get logged before it is the only way to notice the pattern after the fact.
+  const logLine = (line: string): void => {
+    const m = process.memoryUsage();
+    const withMem = `${line} | rss=${Math.round(m.rss / 1048576)}MB heapUsed=${Math.round(m.heapUsed / 1048576)}MB`;
+    console.error('DevDeck', withMem);
+    try { appendFileSync(path.join(userData, 'devdeck-errors.log'), `${new Date().toISOString()} ${withMem}\n`); } catch { /* logging is best-effort */ }
+  };
   // Last-resort trap: keep the main process alive when an async callback (pty data/exit, the
   // PtyBatcher flush timer, a git spawn, a stray IPC reject) throws. Before this, such a throw
-  // closed DevDeck "out of nowhere" and took every cockpit terminal with it. We also append the
-  // stack to a log file so a future occurrence is diagnosable (console output is lost when packaged).
+  // closed DevDeck "out of nowhere" and took every cockpit terminal with it.
   installGlobalErrorHandlers((kind, err) => {
     const detail = err instanceof Error ? (err.stack ?? err.message) : String(err);
-    console.error(`DevDeck main ${kind}:`, detail);
-    try { appendFileSync(path.join(userData, 'devdeck-errors.log'), `${new Date().toISOString()} [${kind}] ${detail}\n`); } catch { /* logging is best-effort */ }
+    logLine(`[${kind}] ${detail}`);
+  });
+  // render-process-gone / child-process-gone fire on `app`, not `process` — a renderer or GPU
+  // crash previously left zero trace anywhere (no log entry, no Windows crash event either,
+  // since Crashpad intercepts it before the OS's own crash reporting sees it).
+  installAppCrashHandlers(app, (kind, detail) => {
+    logLine(`[${kind}] ${JSON.stringify(detail)}`);
   });
   // Match the installer shortcut's AppUserModelID (electron-builder sets it to
   // the appId) so Windows shows the DevDeck taskbar icon and groups windows
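
Since the OOM abort itself can't be caught, the memory suffix only pays off if the trend is later read back out of the log. Below is a rough sketch of that after-the-fact check, assuming the suffix format written by `logLine` in this diff (`| rss=<n>MB heapUsed=<n>MB` at the end of each entry). `parseMemSample`, `memSamples`, and `rssIsRising` are hypothetical helpers that are not part of the commit, and the sample log lines are invented for illustration.

```typescript
interface MemSample { rssMB: number; heapUsedMB: number }

// Reads the memory suffix from one physical log line, or returns null if the line
// has none. For a multi-line stack trace only the final line carries the suffix,
// so the earlier lines of that entry yield null and are skipped.
function parseMemSample(line: string): MemSample | null {
  const m = /\| rss=(\d+)MB heapUsed=(\d+)MB$/.exec(line);
  return m ? { rssMB: Number(m[1]), heapUsedMB: Number(m[2]) } : null;
}

// Collects the samples from a whole log file's text, in file order.
function memSamples(logText: string): MemSample[] {
  return logText
    .split(/\r?\n/)
    .map((line) => parseMemSample(line))
    .filter((s): s is MemSample => s !== null);
}

// True when rss never decreases between consecutive samples and ends above where it started.
function rssIsRising(samples: MemSample[]): boolean {
  if (samples.length < 2) return false;
  for (let i = 1; i < samples.length; i++) {
    if (samples[i].rssMB < samples[i - 1].rssMB) return false;
  }
  return samples[samples.length - 1].rssMB > samples[0].rssMB;
}

// Invented sample data in the assumed format (not real DevDeck output).
const sampleLog = [
  '2026-01-01T10:00:00.000Z [render-process-gone] {"reason":"crashed","exitCode":11} | rss=310MB heapUsed=120MB',
  '2026-01-01T12:00:00.000Z [unhandledRejection] Error: boom',
  '    at flush (batcher.js:10:5) | rss=640MB heapUsed=410MB',
  '2026-01-01T14:00:00.000Z [child-process-gone] {"type":"GPU","reason":"crashed","exitCode":139} | rss=980MB heapUsed=760MB',
].join('\n');

const samples = memSamples(sampleLog);
const rising = rssIsRising(samples);
```

A strictly monotonic check like `rssIsRising` is deliberately crude. With only a handful of diagnostic lines per session it is a hint to go look at the crash dumps, not proof of a leak.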
