Implement Net::HTTP to resolve rate limiting #280
Awesome, working fine! But I'm interested in why using Net::HTTP overcomes the rate limiting. Do you have any idea what the initial problem was?
Essentially, we're using the same persistent HTTP session to download everything (both snapshots and pages) and keeping it open until the download is complete, rather than opening and closing several sessions, which the Wayback Machine doesn't like (even if you're using a legitimate browser!).
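To illustrate the idea, here is a minimal sketch of reusing one connection for many requests with Ruby's standard `Net::HTTP`. It is not the code from this PR: `fetch_all`, its parameters, and the `http_factory` argument (there only so the sketch can be exercised without network access) are hypothetical names chosen for the example.

```ruby
require "net/http"

# Hypothetical helper (not from this PR): request every path over a single
# persistent connection instead of opening a new one per request.
def fetch_all(host, paths, port: 443, use_ssl: true, http_factory: Net::HTTP)
  statuses = {}
  # Net::HTTP.start with a block opens the connection once, yields it,
  # and closes it when the block returns.
  http_factory.start(host, port, use_ssl: use_ssl) do |http|
    paths.each do |path|
      response = http.request(Net::HTTP::Get.new(path))
      statuses[path] = response.code
    end
  end
  statuses
end
```

Calling something like `fetch_all("web.archive.org", ["/web/…", "/web/…"])` would then issue all requests inside the same session; the paths here are placeholders.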
Until this gets merged and released, can you provide instructions for a non-Ruby person to run this branch?
There are instructions in #281.
Because this project has had no updates for the last 3 years, I've written a replacement in Python for my needs... seems dead.
Finished a 3,000,000 snapshot download thanks to this. Much appreciated.
This needs to be merged; otherwise, we get a 400 BAD REQUEST.
@bitdruid Can you please share the replacement?
This is all based on #267 (comment) and @ee3e's work.
This resolves all rate limiting issues without the need for any delays/sleeps.
I am not sure that the `http.finish()` line in `get_raw_list_from_api` is in the correct place, so any code review would be helpful. Regardless, I thought I'd submit this to try to resolve several of the issues that have come up lately.
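For reviewers weighing where the `finish` call belongs, one common Ruby pattern is to close the connection in an `ensure` clause so it is released even when a request raises. The sketch below only illustrates that pattern; `with_connection` is a hypothetical helper, not something this PR defines, and it assumes it is handed an object that behaves like a `Net::HTTP` instance.

```ruby
require "net/http"

# Hypothetical helper (not from this PR): open the connection if needed,
# yield it to the caller, and always close it afterwards.
def with_connection(http)
  http.start unless http.started?
  yield http
ensure
  # Net::HTTP#finish raises IOError on an unopened connection, so guard it.
  http.finish if http.started?
end
```

With this shape, every request made inside the block shares the one connection, and `finish` runs exactly once after the last of them.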
Legitimately, all credit should go to @ee3e for their solution. This helped me download a ridiculously large backup without issue (452,831 files).
(Issues) Resolves #277, resolves #275, resolves #273, resolves #269, resolves #267
(Pull requests) Resolves #268, resolves #266, resolves #262 (at least according to comments)