Softcite is a project to improve the visibility of research software. We produce datasets, software, and papers.
- (archived) Softcite Dataset v1
Go from a folder of PDFs to extracted full text in XML, annotated with software mentions.
- Softcite Mention Extractor Example Notebook
- Softcite Mention Extractor client
- Softcite Mention Extractor Server
We also maintain infrastructure for building a website that lets users browse a database created from software mention extractions.
A demonstration, populated with a small set of extractions, is available at: https://cloud.science-miner.com/software_kb/frontend/index.html
- Du, C., Cohoon, J., Lopez, P., & Howison, J. (2022). Understanding progress in software citation: a study of software citation in the CORD-19 corpus. PeerJ Computer Science, 8, e1022. https://doi.org/10.7717/peerj-cs.1022
- Lopez, P., Du, C., Cohoon, J., Ram, K., & Howison, J. (2021). Mining Software Entities in Scientific Literature: Document-level NER for an Extremely Imbalance and Large-scale Task. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3986–3995. https://doi.org/10.1145/3459637.3481936
- Du, C., Cohoon, J., Lopez, P., & Howison, J. (2021). Softcite dataset: A dataset of software mentions in biomedical and economic research publications. Journal of the Association for Information Science and Technology, 72(7), 870–884. https://doi.org/10.1002/asi.24454
- Bassinet, A., Bracco, L., L’Hôte, A., Jeangirard, E., Lopez, P., & Romary, L. (2023). Monitoring the production and the openness of research data and software in France: Large-scale Machine-Learning analysis of scientific PDF. https://github.com/Barometre-de-la-Science-Ouverte/bso3-techdoc/blob/master/methodology/bso3.pdf
- Nesbitt, A., Veytsman, B., Mietchen, D., Brown, E. M., Howison, J., Pimentel, J. F., Hébert-Dufresne, L., & Druskat, S. (2024). Biomedical Open Source Software: Crucial Packages and Hidden Heroes. arXiv. https://doi.org/10.48550/arXiv.2404.06672