Is your feature request related to a problem? Please describe.
Currently the application, which serves as a crawler/extractor, only supports connecting to either the clearnet or the Tor network through a SOCKS5 proxy on the localhost address, which gives it a single fixed address to communicate over. To improve anonymity and the service the project provides, the ability to rotate IP addresses ought to be supported.
Describe the solution you'd like
Refactor the implementation of the connect_tor() method to support Privoxy and proxy rotation.
Implement proxy rotation support for clearnet crawling.
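As a starting point for discussion, here is a minimal sketch of what round-robin rotation could look like. `ProxyRotator` and `next_proxies()` are hypothetical names, not part of the current code base, and the two pool addresses simply assume Privoxy and Tor are running locally on their default ports (8118 and 9050).

```python
import itertools
from typing import Dict, Iterable, Iterator


class ProxyRotator:
    """Round-robin over a fixed list of proxy URLs (hypothetical helper)."""

    def __init__(self, proxy_urls: Iterable[str]) -> None:
        urls = list(proxy_urls)
        if not urls:
            raise ValueError("at least one proxy URL is required")
        # itertools.cycle repeats the list endlessly, in order.
        self._cycle: Iterator[str] = itertools.cycle(urls)

    def next_proxies(self) -> Dict[str, str]:
        """Return the next proxy as a requests-style ``proxies`` mapping."""
        url = next(self._cycle)
        return {"http": url, "https": url}


# Example pool (assumed addresses, adjust to the local setup):
# Privoxy's default listener and Tor's default SOCKS port.
rotator = ProxyRotator([
    "http://127.0.0.1:8118",
    "socks5h://127.0.0.1:9050",
])
```

Each request could then pick up a fresh mapping, for example `requests.get(url, proxies=rotator.next_proxies())`. Note that the `socks5h://` scheme needs the optional SOCKS support for requests (`requests[socks]`). Round-robin is only one possible policy; random selection or dropping proxies that fail would fit behind the same method.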
Additional context
Modern web applications also tend to be protected by Web Application Firewalls (WAFs) and other technologies which can detect crawlers and bots and deter or block access to the site. By rotating IPs we deliberately evade these detection and mitigation controls so that the application's core service is not disrupted.