Add SOCKS Proxy support #682
@adbar Please let me know if I can help with documenting the feature.
9433a17 to 97332c6
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #682   +/-   ##
=======================================
  Coverage   98.54%   98.54%
=======================================
  Files          21       21
  Lines        3505     3517   +12
=======================================
+ Hits         3454     3466   +12
  Misses         51       51
@gremid Could you do something about test coverage? We can see the docs later.
97332c6 to a732852
I moved the proxy-related tests from the GitHub Action setup to a parametrized pytest fixture. Thus, the coverage of both code paths in the HTTP fetch routines, with and without proxy setup, should be recorded now. The import and parsing of the environment variable |
a732852 to b740098
Wrestling with GitHub Actions and environment settings … Now the variable |
Thanks!
I haven't had time to look at Trafilatura for a while, but this PR caught my attention because I had previously written a patch to add proxies as well: https://github.com/adbar/trafilatura/pull/332/files
@gremid Congratulations on the tests; at the time, I couldn't give it the attention it deserved (sorry @adbar). A few considerations:
My PR linked above might offer some ideas to further improve this functionality. These are just suggestions. Again, congratulations on the PR.
Thanks for the feedback. We won't work on this right now, but I added your idea to the list (#697).
This PR adds SOCKS proxy support in trafilatura.downloads. It works for requests based on urllib3 as well as pycurl, is configured via the environment variable http_proxy, and is tested with SOCKS5 proxies (with and without BASIC authentication).
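
To illustrate the configuration described above, here is a minimal sketch of how a proxy URL could be read from the http_proxy environment variable and checked for a SOCKS scheme. It is not the code from this PR: the helper name read_proxy_settings, the SOCKS_SCHEMES set, and the example proxy address are assumptions made for illustration, and only the standard library is used.

```python
import os
from urllib.parse import urlsplit

# Hypothetical set of schemes treated as SOCKS proxies (an assumption,
# not taken from trafilatura.downloads).
SOCKS_SCHEMES = {"socks4", "socks4a", "socks5", "socks5h"}


def read_proxy_settings(environ=None):
    """Describe the proxy configured in ``http_proxy``, or return None if unset.

    ``environ`` defaults to ``os.environ``; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("http_proxy")
    if not raw:
        return None

    parts = urlsplit(raw)
    return {
        "url": raw,
        # urlsplit already lowercases the scheme
        "is_socks": parts.scheme in SOCKS_SCHEMES,
        "host": parts.hostname,
        "port": parts.port,
        # Credentials are None when the URL carries no user info
        "username": parts.username,
        "password": parts.password,
    }


if __name__ == "__main__":
    # Example value only; any SOCKS5 proxy URL would do.
    example = {"http_proxy": "socks5://user:secret@localhost:1080"}
    print(read_proxy_settings(example))
```

In a download routine, a URL recognized this way could then be handed to the HTTP backend, for example urllib3's SOCKSProxyManager (from urllib3.contrib.socks, which needs PySocks installed) or pycurl's PROXY option. How the PR actually wires this up is not shown in this conversation.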