This repository has been archived by the owner on Feb 23, 2025. It is now read-only.

Optimization when scraping the same page for multiple languages #32

Open
C0rn3j opened this issue Jul 6, 2018 · 0 comments

C0rn3j commented Jul 6, 2018

At the moment I have this simple scraper - https://haste.c0rn3j.com/ahiyofahuf.py

It takes a word and scrapes it in two languages. This, however, seems to send two requests to Wiktionary instead of just one (it is, after all, requesting the same page).

Is there a way I can scrape both languages in one request, so as to make the process faster and the load on Wiktionary smaller?
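For illustration, the single-request idea could look like this. This is a hypothetical sketch, not WiktionaryParser's actual API: `fetch_page` and `parse_language` are invented names, and the download is stubbed out. The point is that memoizing the raw page means a second language lookup reuses the first fetch instead of hitting Wiktionary again.

```python
# Hypothetical sketch: memoize the raw page download so parsing a
# second language reuses the first fetch. Not WiktionaryParser's API.
from functools import lru_cache

@lru_cache(maxsize=None)
def fetch_page(word):
    # One network request per word; stubbed out here for illustration.
    return "<html>Wiktionary page for %s</html>" % word

def parse_language(word, language):
    html = fetch_page(word)  # cached after the first call for this word
    # A real parser would extract the section for `language` from html.
    return (language, len(html))

parse_language("kot", "english")
parse_language("kot", "czech")  # reuses the cached page, no second fetch
```

With this shape, scraping the same word in N languages costs one request instead of N.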

EDIT: Assuming this is not currently implemented.

The parser could save the whole pages to /tmp/WiktionaryParser/. On every decent distro, /tmp/ gets cleaned on reboot, and on most distros it is a tmpfs (RAM-backed storage).

The parser would then check /tmp for an existing file that is not older than, say, 24 hours (user-configurable?), and act accordingly.

I think this should be user-configurable behavior, since scraping XXk pages could take a lot of memory.

If implemented, it should be mentioned in the README.
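The proposed on-disk cache could be sketched roughly like this. Everything here is an assumption for illustration: `cached_fetch` and the caller-supplied `download` callable are invented names, and the TTL constant stands in for the user-configurable setting mentioned above.

```python
# Hypothetical sketch of the proposed cache: pages saved under
# /tmp/WiktionaryParser/ and reused if younger than a configurable TTL.
import hashlib
import os
import tempfile
import time

CACHE_DIR = os.path.join(tempfile.gettempdir(), "WiktionaryParser")
TTL = 24 * 60 * 60  # 24 hours; would be user-configurable

def cached_fetch(word, download):
    """Return the page for `word`, downloading only on a cache miss."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    # Hash the word so any Unicode title maps to a safe filename.
    path = os.path.join(CACHE_DIR, hashlib.sha1(word.encode("utf-8")).hexdigest())
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < TTL:
        with open(path, encoding="utf-8") as f:
            return f.read()
    html = download(word)  # caller-supplied downloader (network request)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

Because the cache lives in /tmp, a reboot wipes it automatically; the TTL check handles staleness between reboots.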
