Fix: cdp connections spawn isolated contexts instead of reusing existing profiles #281
Ewal11 wants to merge 1 commit into
Hey @Ewal11, this fix targets a real problem, but it creates more issues than it solves.

First, the PR is against […]

More importantly, reusing the user's existing context breaks teardown: […] It also silently drops the configured […]

This needs an ownership flag, so that teardown only closes contexts Scrapling created, and context options are only applied when Scrapling owns the context.

Also, no tests were added, and all CDP paths are currently marked […]

Not a maintainer, this is just to inform you!
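For illustration, the ownership flag suggested in this comment could look roughly like the sketch below. It is not code from the PR or from Scrapling: `ContextHandle`, `acquire_context`, `release_context`, and the `reuse_existing` parameter are hypothetical names, and the only Playwright members assumed are `browser.contexts`, `browser.new_context(**options)`, and `context.close()`.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ContextHandle:
    """Pairs a browser context with a flag saying who created it (hypothetical helper)."""
    context: Any
    owned: bool  # True only when this code created the context


def acquire_context(browser, reuse_existing, **context_options):
    """Reuse the browser's first existing context, or create a new one that we own."""
    if reuse_existing and browser.contexts:
        # Borrowed context: context_options are deliberately not applied,
        # because the context belongs to the user's running browser.
        return ContextHandle(browser.contexts[0], owned=False)
    # No context to reuse: create one with the configured options and own it.
    return ContextHandle(browser.new_context(**context_options), owned=True)


def release_context(handle):
    """Close the context only if it was created here; leave borrowed ones open."""
    if handle.owned:
        handle.context.close()
```

With this shape, teardown never closes the user's own browsing context, and a caller can warn when options were configured but the context is borrowed.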
What does this PR do?
This PR modifies the default behavior of `cdp_url` connections across all Fetcher variants (Sync and Async, Dynamic and Stealthy).

Currently, when a `cdp_url` is passed to connect to an existing, headful Chrome browser, Scrapling unconditionally calls `browser.new_context()`. Because of how Playwright operates, this creates a completely isolated, incognito-like browsing context. As a result, the active logins, installed extensions, and session cookies from the user's main profile are not accessible to the Fetcher.

After a successful CDP connection, this PR checks whether `browser.contexts` contains a context. If the default context exists, it is assigned to `self.context` instead of spawning a new one.

Why is this needed?
The primary use case for connecting to an active browser via a debug port (`cdp_url`) is to leverage the existing, authenticated state of that browser (e.g., bypassing logins or using active extensions). Spawning an isolated context defeats the purpose of connecting to a pre-configured user-data directory.

Impact
`cdp_url` connections will now inherit the browser's active tabs, cookies, and extensions.
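The context selection described under "What does this PR do?" can be sketched as follows. This is a minimal illustration rather than the PR's actual diff: `select_cdp_context` and `open_cdp_context` are hypothetical names, and the sketch assumes Playwright's sync API (`sync_playwright`, `chromium.connect_over_cdp`, `browser.contexts`, `browser.new_context`).

```python
def select_cdp_context(browser):
    """Return the browser's default context if one exists, else a fresh one."""
    contexts = browser.contexts  # list of contexts already open in the browser
    if contexts:
        # Reuse the existing (default) context so cookies, logins and
        # extensions from the running profile stay visible.
        return contexts[0]
    # Nothing to reuse: fall back to an isolated context, as before the change.
    return browser.new_context()


def open_cdp_context(cdp_url):
    """Connect to a running browser over CDP and pick a context (illustrative)."""
    # Imported lazily so the selection logic above can be used without Playwright.
    from playwright.sync_api import sync_playwright

    pw = sync_playwright().start()
    browser = pw.chromium.connect_over_cdp(cdp_url)
    return pw, browser, select_cdp_context(browser)
```

A caller would pass the debug endpoint of the running browser as `cdp_url`, and is responsible for calling `pw.stop()` when finished.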