Fix auth-aware request deduplication #314

Open
3em0 wants to merge 1 commit into D4Vinci:main from 3em0:fix/scheduler-auth-context

Conversation

3em0 commented May 31, 2026

Summary

  • include authentication-related request context in the default scheduler fingerprint
  • keep non-auth headers behind fp_include_headers to preserve normal deduplication behavior
  • add regression tests for Authorization and Cookie headers, extra_headers, and explicit cookies in the request context
  • update scheduler deduplication docs
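
To illustrate the idea behind the first two bullets, here is a minimal, self-contained sketch of an auth-aware fingerprint. It is not the code in this PR and not Scrapling's API: the function `fingerprint`, the constant `AUTH_HEADERS`, the `include_headers` flag (a stand-in for `fp_include_headers`), and the choice of SHA-1 over a JSON payload are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Mapping, Optional

# Hypothetical: the header names treated as authentication context.
AUTH_HEADERS = frozenset({"authorization", "cookie"})


def fingerprint(
    url: str,
    method: str = "GET",
    headers: Optional[Mapping[str, str]] = None,
    cookies: Optional[Mapping[str, str]] = None,
    include_headers: bool = False,  # stand-in for fp_include_headers
) -> str:
    """Return a stable hex digest used to decide whether two requests are duplicates."""
    # Header names are case-insensitive, so normalise them before filtering.
    normalised = {name.lower(): value for name, value in (headers or {}).items()}
    if include_headers:
        kept = normalised  # opt-in: every header affects the fingerprint
    else:
        # Default: only auth-related headers affect the fingerprint.
        kept = {k: v for k, v in normalised.items() if k in AUTH_HEADERS}
    payload = {
        "method": method.upper(),
        "url": url,
        "headers": sorted(kept.items()),
        "cookies": sorted((cookies or {}).items()),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()


url = "https://example.com/account"
anonymous = fingerprint(url)
alice = fingerprint(url, headers={"Authorization": "Bearer alice"})
bob = fingerprint(url, headers={"Authorization": "Bearer bob"})
with_user_agent = fingerprint(url, headers={"User-Agent": "demo"})

# Different credentials give different fingerprints, so neither request is dropped.
assert len({anonymous, alice, bob}) == 3
# A non-auth header does not change the default fingerprint.
assert with_user_agent == anonymous
```

Under these assumptions, requests that differ only in credentials are scheduled separately, while requests that differ only in non-auth headers still collapse into one unless the caller opts in.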

Closes #313

Tests

  • pytest tests/spiders/test_request.py tests/spiders/test_scheduler.py
  • ruff check scrapling/spiders/request.py tests/spiders/test_request.py tests/spiders/test_scheduler.py

Development

Successfully merging this pull request may close these issues.

Authenticated requests can be incorrectly deduplicated when headers/cookies are omitted from scheduler fingerprint
