fix(render): isolate Chrome --user-data-dir per worker (fixes #54) #65

Open
dex0shubham wants to merge 1 commit into StarTrail-org:main from dex0shubham:fix/cdp-user-data-dir-isolation

@dex0shubham

Problem

pixelshot hangs and fails on Windows/macOS when the configured Chrome is a normal
system install and the user already has Chrome open (the common desktop case). Every
URL fails with done=0 failed=1 after a ~180s hang (an asyncio.TimeoutError on
ws.recv()):

WARNING pixelrag_render.backends.cdp: [w0] FAIL https://example.com:
INFO    pixelrag_render.backends.cdp: Batch complete: done=0 failed=1

This is the second of the two root causes described in #54. The first — _connect_cdp
selecting a non-page CDP target — was already fixed in #56. This PR fixes the second.
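
For readers unfamiliar with how the hang surfaces, here is a minimal sketch of a receive call bounded by a timeout. It is not the backend's code: recv_with_timeout is a hypothetical helper, and an empty asyncio.Queue stands in for a CDP websocket that never replies.

```python
import asyncio


async def recv_with_timeout(recv, timeout: float):
    """Await recv() but give up after `timeout` seconds (hypothetical helper)."""
    try:
        return await asyncio.wait_for(recv(), timeout=timeout)
    except asyncio.TimeoutError:
        # This is the failure the report describes, after ~180s in the real backend.
        return None


async def _demo():
    # Stand-in for ws.recv() on a connection where the reply never arrives.
    silent = asyncio.Queue()
    return await recv_with_timeout(silent.get, timeout=0.05)


result = asyncio.run(_demo())
```

The point is only that the connection itself is healthy; the awaited reply never arrives, so the timeout is the first visible symptom.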

Root cause

Both CDP backends launch Chrome with --remote-debugging-port but no --user-data-dir:

  • render/src/pixelrag_render/backends/cdp.py → _worker (standard path)
  • render/src/pixelrag_render/backends/fast_cdp.py → _launch_chrome (turbo path)

Without an explicit profile dir, a new chrome --remote-debugging-port=… invocation
forwards to an already-running Chrome instance on the default profile instead of
starting its own headless renderer. The CDP endpoint comes up and trivial
Runtime.evaluate works, but Page.navigate never commits and Page.captureScreenshot
hangs forever.

It's masked on Linux/CI because the bundled headless_shell launches fresh with no
competing instance, so the default profile is never contended.

Fix

Give each worker an isolated, throwaway profile via tempfile.mkdtemp, pass it as
--user-data-dir, and remove it on teardown (including the connect-failure path in the
turbo backend). A unique profile per worker also prevents parallel workers from colliding
on a single profile.
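
A minimal sketch of the approach described above, not the code in this PR. The names isolated_profile and chrome_args are hypothetical, and a real launch would pass more flags than the two named in this description.

```python
import contextlib
import os
import shutil
import tempfile
from typing import Iterator, List


@contextlib.contextmanager
def isolated_profile(worker_id: int) -> Iterator[str]:
    """Yield a throwaway Chrome profile dir and remove it on exit (hypothetical helper)."""
    profile_dir = tempfile.mkdtemp(prefix=f"chrome-profile-w{worker_id}-")
    try:
        yield profile_dir
    finally:
        # ignore_errors=True: a profile dir that is still locked must not raise here.
        shutil.rmtree(profile_dir, ignore_errors=True)


def chrome_args(chrome_path: str, port: int, profile_dir: str) -> List[str]:
    """Build the launch command with an explicit per-worker profile (hypothetical helper)."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
    ]
```

Wrapping the launch in the context manager means cleanup also runs when connecting to the CDP endpoint fails, which corresponds to the connect-failure path mentioned above.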

Also documents cross-platform Chrome resolution in the README: the bundled turbo
headless_shell auto-installs on linux-x64 only; Windows/macOS use auto-detected
system Chrome/Chromium (or CHROME_PATH).
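
The resolution order can be pictured roughly as follows. This is a sketch under assumptions, not the project's resolver: resolve_chrome and the candidate executable names are hypothetical, and giving CHROME_PATH priority over auto-detection is assumed rather than stated in the README text.

```python
import os
import shutil
from typing import Mapping, Optional, Sequence

# Hypothetical candidate names; the real detection list may differ.
CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def resolve_chrome(
    env: Mapping[str, str] = os.environ,
    candidates: Sequence[str] = CANDIDATES,
) -> Optional[str]:
    """Return a Chrome executable path, or None if nothing is found."""
    # Assumption: an explicit CHROME_PATH override wins over auto-detection.
    override = env.get("CHROME_PATH")
    if override:
        return override
    # Otherwise search the process PATH for a system Chrome/Chromium.
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None
```

Note that shutil.which searches the PATH of the current process, so the env argument only affects the CHROME_PATH lookup.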

Verification

On Windows 11 with system Chrome 149 (CHROME_PATH set):

  • Before: pixelshot https://example.com → done=0 failed=1 after ~180s.
  • After: done=1 failed=0, valid tile_0000.jpg produced; temp profile dirs are created
    per worker and cleaned up afterward (no leftovers under the temp dir).
  • ruff check / ruff format --check: clean.
  • Existing render + chrome tests pass (tests/test_render.py, tests/test_chrome_paths.py).

Risk

Low. On a freshly launched Chrome (Linux/CI), adding --user-data-dir only relocates the
profile and does not otherwise change behavior; it is the standard way to isolate Chrome
instances. The fix changes behavior only in the contended-profile case the bug describes.
Teardown uses rmtree(..., ignore_errors=True), so a still-locked profile dir never raises.

Closes #54.

🤖 Generated with Claude Code

…il-org#54)

Both CDP backends launched Chrome with --remote-debugging-port but no
--user-data-dir. On a machine that already has Chrome open (normal on a
desktop), the new invocation forwards to the running instance on the default
profile instead of starting its own headless renderer: the CDP endpoint comes
up and trivial Runtime.evaluate works, but Page.navigate never commits and
Page.captureScreenshot hangs forever — every URL fails with done=0 after a
~180s timeout. This is the second of the two root causes in StarTrail-org#54 (the first,
picking a non-page CDP target, was fixed in StarTrail-org#56). It's masked on Linux/CI
because the bundled headless_shell launches fresh with no competing instance.

Give each worker an isolated, throwaway profile via tempfile.mkdtemp, pass it
as --user-data-dir, and remove it on teardown (including the connect-failure
path in the turbo backend). A unique profile per worker also prevents parallel
workers from colliding on one profile.

Also document cross-platform Chrome resolution in the README (headless_shell is
linux-x64 only; Windows/macOS use auto-detected system Chrome or CHROME_PATH).

Verified on Windows 11 with system Chrome 149: pixelshot https://example.com now
returns done=1 failed=0 with a valid tile (was a ~180s failure); temp profiles
are cleaned up; existing render/chrome tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 22, 2026

@dex0shubham is attempting to deploy a commit to the andylizf's projects Team on Vercel.

A member of the Team first needs to authorize it.

Development

Successfully merging this pull request may close these issues.

pixelshot CDP backend hangs on Windows: connects to non-page target[0] and lacks --user-data-dir

1 participant