Proxy Scraper is an open-source tool that scrapes public proxies from multiple sources and checks them asynchronously.
pip install -U git+https://github.com/boxxello/Proxy-Scraper-pkg.git
- Supported protocols: HTTP(S)
- Automatically removes duplicate proxies
- Validates proxies supplied via an input file
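Duplicate removal is conceptually simple; as an illustrative sketch (not the project's actual implementation), an order-preserving dedup of proxy strings might look like:

```python
def dedupe_proxies(proxies):
    """Remove duplicate proxy entries while preserving order.

    Illustrative sketch only -- the real logic inside
    Proxy-Scraper may differ.
    """
    seen = set()
    unique = []
    for proxy in proxies:
        # Normalize so "1.2.3.4:8080 " and "1.2.3.4:8080" count as one entry
        key = proxy.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


print(dedupe_proxies(["1.2.3.4:8080", "1.2.3.4:8080", "5.6.7.8:3128"]))
# → ['1.2.3.4:8080', '5.6.7.8:3128']
```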
Python 3.8+
All the requirements are installed by setup.py, or you can install them manually by running:
pip install -r requirements.txt
from proxy_scraper.getproxy import GetProxy
proxy_scraper = GetProxy()
proxy_scraper.init()
proxy_scraper.load_input_proxies()
proxy_scraper.validate_input_proxies()
proxy_scraper.load_plugins()
proxy_scraper.grab_web_proxies()
proxy_scraper.validate_web_proxies()
proxy_scraper.save_proxies()

Alternatively, you can run it from a console.
Usage:
python -m proxy_scraper [--input name_of_the_file.txt] [--output name_of_the_file2.txt]
[--debug]
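The file passed via --input is assumed here (based on common proxy-list conventions, not confirmed by the project's docs) to contain one proxy per line in host:port form:

```text
1.2.3.4:8080
5.6.7.8:3128
```

Validated proxies are then written to the file given by --output.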
Make sure you are using the latest version:
pip install -U git+https://github.com/boxxello/Proxy-Scraper-pkg.git
- Save the currently scraped IPs in a database.
- Provide an API to retrieve the latest scraped IPs without having to run the tool on your own machine.
- Turn every plugin into a child class that inherits its properties from a common parent.
- Fork it: https://github.com/boxxello/Proxy-Scraper-pkg/fork
- Create your feature branch:
git checkout -b MY-NEW-FEATURE
- Commit your changes:
git commit -am 'Add some feature'
- Push to the branch:
git push origin MY-NEW-FEATURE
- Submit a pull request!
Licensed under the Apache License, Version 2.0
- This product includes GeoLite2 data created by MaxMind, available from http://www.maxmind.com.
- This product includes Retrying, available from the Retrying GitHub page.
- This product includes Gevent, available from the Gevent GitHub page.
- This product includes fake-user-agent, available from the fake-user-agent GitHub page.
- This product includes portions of code from GetProxy.
- This product includes portions of code from ProxyBroker2.
Disclaimer & WARNINGS:
- Use this tool for educational purposes ONLY! By using this code you agree that I am not responsible for any trouble it causes.
- Make sure web scraping is legal in your region.
