The project on software citation in the Digital Humanities is conducted by members of different universities and aims to measure current practice in the use and citation of software in academic papers (abstracts of the ADHO Digital Humanities conference). By taking a detailed look at current citation practice and making the results transparent to the community, the project wants to foster a better understanding of the current situation and to encourage the sustainable and scholarly use of software in the humanities and in academia in general.
We provide the following datasets in the directory data/ (in TEI/XML format):
- Papers from the DH conferences of 2015–2020, originally published on ADHO's GitHub page: https://github.com/ADHO/ (licenced under CC BY). The TEI versions used here were, however, retrieved from ToolXtractor's GitHub page: https://github.com/lehkost/ToolXtractor (licenced under Apache 2.0). Quote from ToolXtractor's Readme: "for papers of DH2017 we used the tool Grobid to create XML-TEI files (however, Grobid failed to convert all files properly, some of these conversions contain only parts of the PDF versions" (https://github.com/lehkost/ToolXtractor#datasets).
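
Because some of the Grobid conversions are incomplete, it can help to check how much body text each TEI file actually contains before analysing it. The following is a minimal sketch, not part of the project's own tooling: the function names `body_text` and `summarise` are hypothetical, and it assumes that the files under data/ use the `.xml` extension and the standard TEI namespace.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard TEI namespace; assumed to be used by the files in data/.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def body_text(tei_source):
    """Return the whitespace-normalised text of the TEI <body>, or "" if absent."""
    root = ET.fromstring(tei_source)
    body = root.find(".//tei:body", TEI_NS)
    if body is None:
        return ""
    return " ".join("".join(body.itertext()).split())


def summarise(directory="data"):
    """Map each *.xml file under `directory` to its body length in characters.

    Files that are not well-formed XML are recorded with a length of -1.
    """
    lengths = {}
    for path in sorted(Path(directory).rglob("*.xml")):
        try:
            lengths[str(path)] = len(body_text(path.read_bytes()))
        except ET.ParseError:
            lengths[str(path)] = -1
    return lengths


if __name__ == "__main__":
    # List the files with the shortest bodies first; these are the most
    # likely candidates for partial conversions.
    for name, length in sorted(summarise().items(), key=lambda item: item[1]):
        print(f"{length:>8}  {name}")
```

Files with a very short or empty body are worth comparing against the original PDF versions; what counts as "too short" depends on the conference year and is left to the reader.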