[FEATURE] How do I set up a skip file for excluding specific file types? #23
Comments
hey :) thank you for your issue. the skipset does filter by one approach for you could be to remove the snapshots from the .cdx file. for this you have to keep that by
Awesome! Thanks! Just to get this right: I'd download the cdxbackup file, remove the snapshots I don't want and then reinsert the cdx file via
yes, that's right. the cdx file contains the pure json response from the server, and thus only the snapshots it contains will be downloaded. if you use the in the long term maybe it would be an idea to add some kind of filter... is there a specific type or path you want to be removed?
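The manual edit described above could also be scripted. The following is only a minimal sketch, assuming the cdx backup is a JSON list of rows whose first row is a header containing an "original" column (the layout the Wayback CDX server returns for JSON output). Whether the downloader's file matches that layout exactly is an assumption, and the function names and file names are hypothetical, not part of the tool.

```python
import json
from urllib.parse import urlsplit


def filter_cdx_rows(rows, excluded_extensions):
    """Keep the header row and drop snapshot rows whose URL path ends
    with one of the excluded extensions (case-insensitive)."""
    if not rows:
        return []
    header, snapshots = rows[0], rows[1:]
    # Assumption: the header names the snapshot URL column "original".
    url_index = header.index("original")
    excluded = tuple(ext.lower() for ext in excluded_extensions)
    kept = [
        row for row in snapshots
        # Compare only the path, so query strings do not hide the extension.
        if not urlsplit(row[url_index]).path.lower().endswith(excluded)
    ]
    return [header] + kept


def filter_cdx_file(source_path, target_path, excluded_extensions):
    """Read a cdx JSON file, write the filtered copy, and return the
    number of snapshot rows that were removed."""
    with open(source_path, encoding="utf-8") as handle:
        rows = json.load(handle)
    filtered = filter_cdx_rows(rows, excluded_extensions)
    with open(target_path, "w", encoding="utf-8") as handle:
        json.dump(filtered, handle)
    return len(rows) - len(filtered)
```

A call such as filter_cdx_file("backup.cdx", "filtered.cdx", [".jpg", ".png"]) (both file names are placeholders) would write a copy without image snapshots. Writing to a separate file keeps the original backup intact in case the filter removes too much.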
I came by two occasions where this can be super useful:
The --skip parameter works great for interrupted downloads.
However, the other day I wanted to download only specific files and exclude others. I couldn't figure out how to set up a CSV file on my own.
Also, it didn't work when I tried to amend waybackup_<sanitized_url>.csv, created by the downloader. I tried to add the links I didn't want to download to the row url_origin, but it didn't skip the links added. Any advice? Thanks!!