(re)creating unarXive

The code in the src/ directory can be used to re-create or update unarXive.

Prerequisites
Usage
  1. Prepare arXiv metadata with: utility_scripts/generate_metadata_db.py
  2. Prepare OpenAlex DB with: utility_scripts/generate_openalex_db.py
  3. Parse arXiv sources with: prepare.py (or normalize_arxiv_dump.py + parse_latex_tralics.py)
  4. Match reference items with: match_references_openalex.py
  5. Extend matched data with: extend_matched.py (adds arXiv IDs to matched references and discipline information)
  6. Verify and analyze result with: utility_scripts/calc_stats.py
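
The order of the steps above can be captured in a small driver script. The following is only a sketch: the script paths come from the list above, but `PIPELINE_STEPS`, `build_commands`, and `missing_scripts` are hypothetical helper names, and no command-line arguments are passed because this README does not list them. Check each script for the arguments it expects before running anything.

```python
import sys
from pathlib import Path

# Script paths are taken from the usage list above, relative to src/.
# Step 3 uses prepare.py; the two-script alternative named in that step
# is not modeled here.
PIPELINE_STEPS = [
    ("Prepare arXiv metadata", "utility_scripts/generate_metadata_db.py"),
    ("Prepare OpenAlex DB", "utility_scripts/generate_openalex_db.py"),
    ("Parse arXiv sources", "prepare.py"),
    ("Match reference items", "match_references_openalex.py"),
    ("Extend matched data", "extend_matched.py"),
    ("Verify and analyze result", "utility_scripts/calc_stats.py"),
]


def build_commands(src_dir="src", python=sys.executable):
    """Return (description, argv) pairs in pipeline order.

    Assumption: each script is started with the Python interpreter and
    no arguments. Real runs will most likely need arguments added.
    """
    src = Path(src_dir)
    return [(desc, [python, str(src / script)]) for desc, script in PIPELINE_STEPS]


def missing_scripts(src_dir="src"):
    """Return the scripts from the list that are not present under src_dir."""
    src = Path(src_dir)
    return [script for _, script in PIPELINE_STEPS if not (src / script).is_file()]


if __name__ == "__main__":
    # Dry run: print the commands in order instead of executing them.
    for number, (desc, argv) in enumerate(build_commands(), start=1):
        print(f"{number}. {desc}: {' '.join(argv)}")
    absent = missing_scripts()
    if absent:
        print("Not found under src/:", ", ".join(absent))
```

The driver deliberately only prints the commands. Each step depends on the output of the previous one, so it is safer to run the steps one at a time and inspect the results in between.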