Framework for the extraction of features from Wikipedia XML dumps.
This project has been tested with Python 3.5.0, but should also work with Python 3.4.3.
Install the dependencies first:
pip install -r requirements.txt
Then download the Wikipedia dumps:
./download.sh
Then run the extractor:
python -m wikidump FILE [FILE ...] OUTPUT_DIR
It will take some time... RAM will not suffer, I promise.
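For readers wondering how a multi-gigabyte dump can be processed without exhausting memory, here is a minimal sketch of streaming XML parsing with the standard library. It is not this project's actual code: the function name `iter_page_titles` and the inline sample are hypothetical, and the sketch only assumes that dumps follow the usual MediaWiki layout of `<page>` elements containing a `<title>`.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(stream):
    """Yield page titles one at a time from a MediaWiki-style XML stream.

    Hypothetical helper for illustration; not part of the wikidump package.
    """
    # iterparse reads the stream incrementally instead of building
    # the whole document tree up front.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) == "page":
            title = next(
                (child.text for child in elem if local_name(child.tag) == "title"),
                None,
            )
            yield title
            # Discard the page's children and text so that the content
            # of already-processed pages does not accumulate in memory.
            elem.clear()


# Tiny in-memory sample standing in for a real dump file (assumption).
SAMPLE = b"""<mediawiki>
  <page><title>Alpha</title><revision><text>a</text></revision></page>
  <page><title>Beta</title><revision><text>b</text></revision></page>
</mediawiki>"""

titles = list(iter_page_titles(io.BytesIO(SAMPLE)))
print(titles)
```

With a real dump, the same generator could be fed an open file object (for example one returned by `bz2.open` for a compressed file), so only one page is held in memory at a time.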