DO NOT USE unless you have a means of rate limiting yourself #281
Comments
See ShiftaDeband's fork (which contains the fixes mentioned in his PR) as well as issues #273 and #275.
Sorry to bother, I'm pretty new to this. How can I actually use this fork instead of the master branch?
@Elmagenta: You'll need to have Ruby installed, then you can just download ShiftaDeband's fork as a ZIP file, unzip it, and run
@tinyapps I'm also pretty new to this, and I couldn't follow your instructions. I have Ruby installed, and I had also installed the "original" wayback_machine_downloader via the macOS Terminal. Following your instructions, I downloaded the ZIP file and simply tried to run the binary file, but I get this error message:
/Users/flag/Downloads/wayback-machine-downloader-feature-httpGet/bin/wayback_machine_downloader:3:in `require_relative': cannot load such file -- /Users/flag/Downloads/wayback-machine-downloader-feature-httpGet/lib/wayback_machine_downloader (LoadError)
Could you give more details on how to proceed?
@flag-br: Sounds like you might've deleted (or not extracted) the included
@tinyapps Thank you very much, it worked! It ran normally, but the result is practically the same as what I was getting before with the master branch version. The folder structure was apparently reproduced correctly on my machine, but only 15 htm files were downloaded. To check, I ran wayback_machine_downloader with the --list option, which reports 1116 htm files. The command I'm using (after cd to the bin folder) is:
wayback_machine_downloader https://jazzdiscogcorner.pagesperso-orange.fr/
This site is quite simple, just text and practically no images. Am I doing something wrong?
@flag-br: Glad to hear it worked out. As for issues with a specific site, I'd recommend checking out the documentation and searching through the open and closed issues before posting a new issue.
I'm being stupid here, but trying to run wayback_machine_downloader (type - file) in the bin directory gave me "not recognized as an internal or external command, operable program or batch file". Fresh Ruby install. I had to
It would be great to have rate limiting added to this software. P.S. It is good that there is a fork with fixes. Just wishing that the main repo of this software had those fixes too.
This patched version worked beautifully. For those who are on Windows and are not sure how to do it:
gem install wayback_machine_downloader
Then replace the bin and lib folders with those from the patched version in:
C:\Ruby33-x64\lib\ruby\gems\3.3.0\gems\wayback_machine_downloader-2.3.1
Doesn't seem to work anymore... gives
including extra config settings, a proper rate limit, and a logger. Fixes: hartator#307 hartator#291 hartator#281 hartator#269 and probably others too
Hi, I do not want to install Ruby. Is there a Docker image with a rate limiter? Thanks.
The Wayback Machine is (rightfully) blocking bulk downloads that use too much bandwidth or make too many requests per second. As far as I can tell, this tool does no rate limiting of its own, at least not by default and not in any of the README examples. As a result, the Internet Archive will soft-ban your IP address if you use this script on a website of any significant size.
It's irresponsible to leave this repository up without at least a warning in the documentation.
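For anyone wondering what "a means of rate limiting yourself" could look like, here is a minimal sketch in Ruby of a fixed minimum delay between requests. It is not code from this repository or from any fork: the class name `MinIntervalLimiter`, its `throttle` method, and the 2-second interval in the usage comment are all hypothetical, and I don't know what request rate the Internet Archive actually tolerates.

```ruby
# Hypothetical helper, NOT part of wayback_machine_downloader.
# Enforces a minimum number of seconds between the start of successive calls.
class MinIntervalLimiter
  # min_interval: seconds to leave between calls (an assumed value; tune it).
  # clock/sleeper are injectable so the logic can be checked without waiting.
  def initialize(min_interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = min_interval
    @clock = clock
    @sleeper = sleeper
    @last = nil # start time of the previous call, nil before the first one
  end

  # Waits until min_interval has elapsed since the previous call,
  # then runs the block and returns its result.
  def throttle
    now = @clock.call
    if @last
      wait = @min_interval - (now - @last)
      if wait > 0
        @sleeper.call(wait)
        now = @clock.call
      end
    end
    @last = now
    yield if block_given?
  end
end

# Usage sketch (commented out so nothing hits the network):
#   require "net/http"
#   limiter = MinIntervalLimiter.new(2.0) # assumed: one request every 2 seconds
#   urls.each do |url|
#     body = limiter.throttle { Net::HTTP.get(URI(url)) }
#   end
```

A fixed delay is the bluntest option. It keeps a single-threaded loop polite, but it does not react to error responses, so backing off when the server starts refusing requests would still need to be added on top.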