Citation Context Extractor is a tool, written in pure Python, that converts XML files to RDF.

sheshkovsky/CCeX

CitationContExtractor

CCeX extracts bibliographic information from XML files and uses it to create new RDF files in Turtle format. It is part of the Semantic Lancet Project; for more information, visit [the Semantic Lancet Project's home page](http://www.semanticlancet.eu/).
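To make the XML-to-Turtle idea concrete, here is a minimal sketch using only the standard library. It is not CCeX's actual code: the XML element names (`reference`, `title`, `year`), the `ex:` namespace, and the choice of Dublin Core properties are all assumptions made for illustration, and the real tool may use a different input schema and vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the real CCeX input schema is not described in this README.
SAMPLE_XML = """<article>
  <title>Example Paper</title>
  <bibliography>
    <reference id="b1"><title>First Cited Work</title><year>2010</year></reference>
    <reference id="b2"><title>Second "Quoted" Work</title><year>2012</year></reference>
  </bibliography>
</article>"""

BASE = "http://example.org/"  # placeholder namespace, assumed for this sketch


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def xml_to_turtle(xml_text):
    """Emit one Turtle resource per <reference> element found in the XML."""
    root = ET.fromstring(xml_text)
    lines = [
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        f"@prefix ex: <{BASE}> .",
        "",
    ]
    for ref in root.iter("reference"):
        ref_id = ref.get("id")
        title = ref.findtext("title", default="")
        year = ref.findtext("year", default="")
        lines.append(f'ex:{ref_id} dcterms:title "{escape_literal(title)}" ;')
        lines.append(f'    dcterms:issued "{escape_literal(year)}" .')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(xml_to_turtle(SAMPLE_XML))
```

Building the Turtle text by hand keeps the sketch dependency-free; a library such as rdflib would normally handle serialization and escaping.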

Requirements

To install the requirements, run pip in the project's folder:

pip install -r requirements.txt

You also need to install the NLTK data; see the official NLTK documentation for instructions.

Usage

Put all XML files in input_dir_name and keep ccex.py in the same directory as input_dir_name. Then run:

python ccex.py input_dir_name output_dir_name
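As a rough illustration of the input/output directory layout, the sketch below pairs each XML file in an input directory with an output path. It is an assumption-laden example, not what ccex.py does internally: the helper name `plan_outputs` and the `.ttl` output suffix are hypothetical.

```python
import tempfile
from pathlib import Path


def plan_outputs(input_dir, output_dir, suffix=".ttl"):
    """Return (input_path, output_path) pairs for every *.xml file in input_dir.

    The ".ttl" suffix is an assumption; the README only says the output is Turtle.
    """
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    return [
        (path, output_dir / (path.stem + suffix))
        for path in sorted(input_dir.glob("*.xml"))
    ]


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        in_dir = Path(tmp) / "input_dir_name"
        in_dir.mkdir()
        (in_dir / "paper1.xml").write_text("<article/>")
        (in_dir / "notes.txt").write_text("ignored")
        for src, dst in plan_outputs(in_dir, Path(tmp) / "output_dir_name"):
            print(src.name, "->", dst.name)
```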

You can also run ccex_mp.py in either single- or multi-processing mode, using the -mp option followed by an integer. In any case, the number of processes is bounded by your CPU capacity:

python ccex_mp.py input_dir_name output_dir_name -mp <processes_number>
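The bounding behaviour described above can be sketched as follows. This is only one plausible reading, not the code in ccex_mp.py: the helper names, the default of a single process when -mp is omitted, and the use of the logical CPU count as the upper bound are all assumptions.

```python
import argparse
import multiprocessing


def bounded_processes(requested, cpu_total=None):
    """Clamp the requested process count to the range [1, cpu_total]."""
    if cpu_total is None:
        cpu_total = multiprocessing.cpu_count()
    return max(1, min(requested, cpu_total))


def parse_args(argv):
    """Parse a command line shaped like the one shown in this README."""
    parser = argparse.ArgumentParser(description="Sketch of a ccex_mp-style CLI")
    parser.add_argument("input_dir")
    parser.add_argument("output_dir")
    # Assumed default: run in a single process when -mp is not given.
    parser.add_argument("-mp", type=int, default=1, metavar="N",
                        help="number of worker processes to request")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["input_dir_name", "output_dir_name", "-mp", "64"])
    print("requested:", args.mp, "used:", bounded_processes(args.mp))
```

Under these assumptions, requesting more processes than the machine has CPUs falls back to the CPU count rather than failing.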
