README v1.0 / 2015-08-17
We needed a crawler that finds files on our website that are not one of the "usual" file types (pictures, slides and office documents). Anything that is not "usual" and might cause "problems" should be found and put on a list for review. That is exactly what this crawler does.
Special thanks to Johannes Lorenz for allowing us to reuse his code.
crawler$ groovy src/de/fau/rrze/pp/crawler/Crawler.groovy
Issue a pull request. It will be evaluated and in all likelihood merged.
Currently there is no help available beyond your own knowledge and understanding ... ☹
git clone https://github.com/RRZE-PP/crawler.git
Edit the entries of the seedUrls list, i.e. the seedUrls.add("") calls,
in src/de/fau/rrze/pp/crawler/Crawler.groovy
(starting at line 43), so that they contain the URLs to crawl.
Make sure every entry is a proper URL!
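A filled-in seed list might look roughly like the sketch below. It assumes seedUrls is an ordinary Groovy list, as the seedUrls.add("") calls suggest. The example.org addresses are placeholders, and the isProperUrl helper is a hypothetical check that is not part of the crawler; here "proper" is taken to mean an absolute http or https URL with a host name.

```groovy
// Placeholder seed list; replace the example.org addresses with your own site.
def seedUrls = []
seedUrls.add("https://www.example.org/")
seedUrls.add("https://www.example.org/downloads/")

// Hypothetical helper (not part of the crawler): accepts only absolute
// http(s) URLs that carry a host name.
def isProperUrl = { String candidate ->
    try {
        def uri = new URI(candidate)
        return uri.scheme in ['http', 'https'] && uri.host
    } catch (URISyntaxException ignored) {
        return false
    }
}

// Collect the entries that fail the check and stop early if there are any.
def invalid = seedUrls.findAll { !isProperUrl(it) }
assert invalid.isEmpty() : "Fix these seed URLs: ${invalid}"
```

Run against an entry such as "www.example.org" (no scheme), the check above would flag it before the crawl starts.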
- Thanks to Johannes Lorenz for the original code and for the permission to publish it
- This template was taken from https://opensource.com/business/15/6/template-starting-project-documentation
This project is licensed under the GNU GPL v3. See LICENSE for details.