CCeX extracts bibliographic information from XML files and uses it to create new RDF files in Turtle format. It is part of the Semantic Lancet Project; for more information, visit [the Semantic Lancet Project's home page](http://www.semanticlancet.eu/).
To install the requirements, run pip in the project's folder:
pip install -r requirements.txt
You also need to install the NLTK data; see the official NLTK documentation for instructions.
Put all XML files in input_dir_name, and keep ccex.py in the same directory as input_dir_name. Then run:
python ccex.py input_dir_name output_dir_name
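
To give a rough idea of what a directory-to-directory conversion like this involves, here is a minimal sketch using only the Python standard library. It is not the actual ccex.py code: the `<title>` and `<author>` element names, the `http://example.org/paper/` base IRI, the choice of Dublin Core terms, and the function names are all assumptions made for illustration.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def escape_literal(text):
    """Escape a string for use inside a double-quoted Turtle literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def xml_to_turtle(xml_data, doc_id):
    """Build a small Turtle document from one XML record.

    Assumes a hypothetical schema with <title> and <author> elements;
    the real input files may be structured differently.
    """
    root = ET.fromstring(xml_data)
    title = root.findtext(".//title", default="").strip()
    authors = [a.text.strip() for a in root.iter("author")
               if a.text and a.text.strip()]
    # Hypothetical subject IRI; doc_id is assumed to be IRI-safe.
    subject = f"<http://example.org/paper/{doc_id}>"
    props = [f'dcterms:title "{escape_literal(title)}"']
    props += [f'dcterms:creator "{escape_literal(a)}"' for a in authors]
    header = "@prefix dcterms: <http://purl.org/dc/terms/> .\n\n"
    return header + subject + " " + " ;\n    ".join(props) + " .\n"


def convert_directory(input_dir, output_dir):
    """Write one .ttl file per .xml file found in input_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for xml_path in sorted(Path(input_dir).glob("*.xml")):
        # Read bytes so the parser can honor the XML encoding declaration.
        ttl = xml_to_turtle(xml_path.read_bytes(), xml_path.stem)
        target = out / (xml_path.stem + ".ttl")
        target.write_text(ttl, encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__" and len(sys.argv) == 3:
    convert_directory(sys.argv[1], sys.argv[2])
```

Saved under a name of your choice, the sketch takes the same two positional arguments as the command above: an input directory and an output directory.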
You can also run ccex_mp.py in either single- or multi-processing mode, using the -mp option followed by an integer (the number of processes). This number is bounded by your CPU capacity in any case:
python ccex_mp.py input_dir_name output_dir_name -mp <processes_number>
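
One way to bound a requested process count by the machine's CPU count is sketched below. This is an illustration under stated assumptions, not the actual ccex_mp.py logic: `bounded_process_count`, `convert_one`, and `run` are hypothetical names, and `convert_one` is only a placeholder for the real per-file work.

```python
import os
from multiprocessing import Pool


def bounded_process_count(requested, cpu_count=None):
    """Clamp the requested number of workers to the range [1, cpu_count]."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cpu_count))


def convert_one(path):
    """Placeholder for the per-file conversion: returns the output file name."""
    return os.path.splitext(path)[0] + ".ttl"


def run(paths, requested):
    """Process paths serially for one worker, otherwise with a process pool."""
    workers = bounded_process_count(requested)
    if workers == 1:
        return [convert_one(p) for p in paths]
    with Pool(processes=workers) as pool:
        return pool.map(convert_one, paths)
```

With this approach, asking for 64 processes on a 4-core machine would start only 4 workers. On platforms that spawn rather than fork worker processes, `run` should be called from under an `if __name__ == "__main__":` guard.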