kedorlaomer/slor

SLOR: read off-line sites

This is a set of scripts for downloading a web site together with its images and stylesheets. The downloaded site is rewritten so that

  • hyperlinks point to their original targets on the internet,
  • links to images and stylesheets point to the downloaded copies, and
  • scripts and iframes are removed without a trace.

Great, huh?
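
To illustrate what "pointing to the right position" means for a relative link, here is a minimal sketch in POSIX shell. It is not taken from slor.awk; the function name `absolutize` is made up for this example, and it assumes the base URL includes a scheme such as `http://`. It ignores `../` segments, queries, fragments and schemes like `mailto:`.

```shell
# Hypothetical helper, not taken from slor.awk: resolve a link found in a
# page against the URL the page was downloaded from.
# Assumes the base URL ($1) starts with a scheme such as http://.
absolutize() {
    base=$1
    href=$2
    scheme=${base%%://*}          # e.g. "http"
    rest=${base#*://}             # e.g. "www.abc.de/a/b.html"
    host=${rest%%/*}              # e.g. "www.abc.de"
    case $rest in
        */*) path=/${rest#*/} ;;  # e.g. "/a/b.html"
        *)   path=/ ;;            # base URL without a path
    esac
    case $href in
        *://*) printf '%s\n' "$href" ;;                           # already absolute
        //*)   printf '%s:%s\n' "$scheme" "$href" ;;              # scheme-relative
        /*)    printf '%s://%s%s\n' "$scheme" "$host" "$href" ;;  # relative to the host
        *)     printf '%s://%s%s/%s\n' "$scheme" "$host" "${path%/*}" "$href" ;;  # relative to the page's directory
    esac
}
```

For example, `absolutize http://www.abc.de/a/b.html c.html` prints `http://www.abc.de/a/c.html`.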

This is done by an awkward awk script, slor.awk, which has an ugly interface. That's why there is a wrapper script, wrapper.sh, which can be used as follows:

sh wrapper.sh www.abc.de

Somewhat more practical is listener.sh, which runs wrapper.sh in a loop, repeatedly prompting for URLs.
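
A prompt loop of this kind could be sketched as follows. This is not the actual listener.sh; the function name `process_urls` and the `URL: ` prompt are assumptions made for illustration.

```shell
# Hypothetical sketch of a prompt loop; not the actual listener.sh.
# Reads one URL per line from standard input and runs the command given
# as arguments once per URL, with the URL appended as the last argument.
process_urls() {
    # The prompt goes to stderr so that it does not mix with the output.
    while printf 'URL: ' >&2 && IFS= read -r url; do
        # Ignore empty lines.
        [ -n "$url" ] || continue
        # Keep the command from consuming the remaining input lines.
        "$@" "$url" </dev/null
    done
}
```

With this definition, `process_urls sh wrapper.sh` would keep asking for URLs until end of input (Ctrl-D).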

The downloaded files are saved in directories with such memorable names as cfd140df628db7480213704ae76d85a5; the HTML file itself is saved as cfd140df628db7480213704ae76d85a5/cfd140df628db7480213704ae76d85a5.html.
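
Since the HTML file carries the same name as its directory, the saved pages can be found without remembering those names. The following is a small sketch based only on the naming scheme described above; the helper names `html_path` and `list_saved_pages` are made up for this example.

```shell
# Hypothetical helpers built on the NAME/NAME.html layout described above.

# Print the path of the saved HTML page for a download directory.
html_path() {
    dir=${1%/}          # drop a trailing slash, if any
    name=${dir##*/}     # last path component, i.e. the directory name
    printf '%s/%s.html\n' "$dir" "$name"
}

# Print the saved pages found below a base directory (default: current one).
list_saved_pages() {
    base=${1:-.}
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue
        page=$(html_path "$dir")
        if [ -f "$page" ]; then
            printf '%s\n' "$page"
        fi
    done
    return 0
}
```

Running `list_saved_pages` in the directory where wrapper.sh was called prints one path per saved page, ready to be handed to a browser.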

Requirements

  • POSIX shell
  • POSIX awk
  • curl

Note that the sh and awk of busybox fulfill these requirements. With a few small changes to wrapper.sh and slor.awk, wget can be used instead of curl (even the wget of busybox).
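
How the two tools correspond can be sketched like this. The function `fetch` is hypothetical, and wrapper.sh and slor.awk may well call curl with other options, so this only shows roughly equivalent invocations and not the actual change needed.

```shell
# Hypothetical helper; wrapper.sh and slor.awk may call curl differently.
# Download the URL in $1 into the file $2 with whichever tool is installed.
fetch() {
    if command -v curl >/dev/null 2>&1; then
        curl -s -L -o "$2" "$1"    # -s: quiet, -L: follow redirects, -o: output file
    elif command -v wget >/dev/null 2>&1; then
        wget -q -O "$2" "$1"       # -q: quiet, -O: output file
    else
        echo 'neither curl nor wget found' >&2
        return 127
    fi
}
```

Usage would be `fetch www.abc.de page.html`.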
