This command-line tool extracts URLs from a PDF file and archives them using the Wayback Machine.
You can build and install the tool using Cargo:
cargo install archive-pdf-urls
The tool takes the path to a PDF file as its argument, extracts the URLs found in that file, and archives each one using the Wayback Machine.
Example usage:
archive-pdf-urls file.pdf --exclude https://some.pattern/\*
Alternatively, run the tool with Docker:
docker run --rm -v ./file.pdf:/file.pdf ghcr.io/thoth-pub/archive-pdf-urls file.pdf
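To illustrate the flow described above (collect URLs, skip excluded ones, submit the rest to the Wayback Machine), here is a minimal sketch in Rust using only the standard library. It is not the tool's actual implementation. All function names are hypothetical, the URLs are found in already-extracted plain text rather than in a real PDF, and the rule that a trailing `*` in an exclude pattern means "any URL starting with this prefix" is an assumption made for illustration. The only external fact relied on is that the Wayback Machine's Save Page Now endpoint lives at `https://web.archive.org/save/` followed by the target URL.

```rust
// Illustrative sketch only: these helpers are hypothetical and are not
// taken from the archive-pdf-urls source code.

/// Collects whitespace-separated tokens that look like HTTP(S) URLs,
/// trimming trailing sentence punctuation ('.', ',' and ';').
fn find_urls(text: &str) -> Vec<String> {
    text.split_whitespace()
        .filter(|token| token.starts_with("http://") || token.starts_with("https://"))
        .map(|token| {
            token
                .trim_end_matches(|c: char| matches!(c, '.' | ',' | ';'))
                .to_string()
        })
        .collect()
}

/// Assumed matching rule: a pattern ending in `*` matches any URL that
/// starts with the text before the `*`; any other pattern must match exactly.
fn is_excluded(url: &str, pattern: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => url.starts_with(prefix),
        None => url == pattern,
    }
}

/// Builds the Wayback Machine "Save Page Now" address for a URL.
fn save_page_url(url: &str) -> String {
    format!("https://web.archive.org/save/{}", url)
}

/// Returns one save request address for every non-excluded URL in `text`.
/// No network request is made here; sending them is left to an HTTP client.
fn plan_archive_requests(text: &str, exclude: &[&str]) -> Vec<String> {
    find_urls(text)
        .into_iter()
        .filter(|url| !exclude.iter().any(|pattern| is_excluded(url, pattern)))
        .map(|url| save_page_url(&url))
        .collect()
}
```

Calling `plan_archive_requests` with some text and the pattern `https://some.pattern/*` yields save addresses only for URLs outside that prefix. Parsing a real PDF and sending the requests would need a PDF library and an HTTP client, which are left out to keep the sketch free of dependencies.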