[Feature Request]: Allow Multiple Proxies in CrawlerRunConfig via Docker API #1315
Closed · duartemvix started this conversation in Feature requests · Replies: 0 comments
What needs to be done?
Currently, it's only possible to set one proxy via `BrowserConfig` or `CrawlerRunConfig`. While that is useful, it requires chaining together multiple API calls to the Docker endpoint, which makes all crawling much slower. My suggestion is to find a way to pass a list of proxies (I get mine from another API) as an array in either `proxy` or `proxy_config`, in both `BrowserConfig` and `CrawlerRunConfig`. The only supported way to set up multiple proxies today is via environment variables, by setting a `PROXIES` var on startup. Since I get the list from an API, that would require quite a workaround to make it work, but I think a lot of other people could benefit from this as well. Here's my config for crawling just one page via the API:
{ "urls": ["https://example.com/"], "browser_config": { "type": "BrowserConfig", "params": { "headless": true, "light_mode": true, "text_mode": true, "user_agent_mode": "random", "verbose": true, "use_persistent_context": true, "extra_args": [ "--disable-extensions", "--disable-gpu", "--disable-dev-shm-usage", "--no-sandbox" ] } }, "crawler_config": { "type": "CrawlerRunConfig", "params": { "cache_mode": "bypass", "remove_forms": true, "override_navigator": true, "only_text": true, "exclude_external_images": true, "exclude_all_images": true, "page_timeout": 10000, "wait_until": "domcontentloaded", "wait_for": "body", "stream": false, "verbose" : true, "mean_delay": 0.3, "magic": true, "delay_before_return_html": 1, "simulate_user": true, "remove_overlay_elements": true, "semaphore_count": 3, "proxy_config": { "server": "127.0.0.1:3000" // <- There should be a way to add multiple proxies here }, "markdown_generator": { "type": "DefaultMarkdownGenerator", "params": { "content_filter": { "type": "PruningContentFilter", "params": { "threshold_type": "dynamic", "min_word_threshold": 3 } } } }, "deep_crawl_strategy": { "type": "BestFirstCrawlingStrategy", "params": { "max_depth": 1, "max_pages": 10, "include_external": false, "filter_chain": { "type": "FilterChain", "params": { "filters": [ { "type": "URLPatternFilter", "params": { "patterns": ["*login*", "*terms*", "*privacy*", "*contact*"], "reverse": true } } ] } }, "url_scorer": { "type": "CompositeScorer", "params": { "scorers": [ { "type": "KeywordRelevanceScorer", "params": { "weight": 1.0, "keywords": [ "growth", "business", "market", "product", "team", "people", "news", "about", "pricing", "company", "how it works" ] } } ] } } } } } } }What problem does this solve?
Increases crawling efficiency and removes an annoying bottleneck: having to spin up new browsers and start new crawls just to change proxies.
Target users/beneficiaries
The whole community.
Current alternatives/workarounds
There are ways to work around this, but they're neither fast nor production-ready. My suggestion would make Crawl4AI more robust. Today the rotation has to happen client-side, as in the sketch below.
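For reference, a minimal Python sketch of that client-side rotation, assuming the Docker server listens on the default port 11235 and accepts the payload shape shown above; the proxy list itself is a hypothetical stand-in for what I fetch from the other API:

```python
import itertools
import requests

# Hypothetical proxy list (in practice fetched from another API).
PROXIES = ["127.0.0.1:3000", "127.0.0.1:3001", "127.0.0.1:3002"]
proxy_pool = itertools.cycle(PROXIES)

def crawl_with_next_proxy(urls):
    """Send one /crawl request using the next proxy in the rotation."""
    payload = {
        "urls": urls,
        "crawler_config": {
            "type": "CrawlerRunConfig",
            "params": {
                "cache_mode": "bypass",
                # Only one proxy fits here today, so every proxy change
                # costs a whole new request to the Docker endpoint.
                "proxy_config": {"server": next(proxy_pool)},
            },
        },
    }
    resp = requests.post("http://localhost:11235/crawl", json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()

# One request per proxy rotation -- exactly the overhead this request targets.
results = [crawl_with_next_proxy([u]) for u in ["https://example.com/"]]
```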
Proposed approach
Changing the `server` parameter to take either a string (current setup) or an array (new setup), as sketched below.
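For illustration only, a request payload under the proposed change might look like this. Nothing here exists yet, and round-robin is just one possible rotation policy:

```python
# Hypothetical payload shape if "server" accepted a string OR a list.
# This sketches the requested behavior; it is not an existing API.
payload = {
    "urls": ["https://example.com/"],
    "crawler_config": {
        "type": "CrawlerRunConfig",
        "params": {
            "proxy_config": {
                # Proposed: the server rotates across these proxies for the
                # pages of a single deep crawl, instead of one per request.
                "server": [
                    "127.0.0.1:3000",
                    "127.0.0.1:3001",
                    "127.0.0.1:3002",
                ]
            }
        },
    },
}
```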