Code and data produced in 2022/2023 by Isaac Dunford as part of a Digital Humanities Internship funded by the School of Humanities at the University of Southampton.

Southampton-Digital-Humanities/2023_Catalogue-Entry-Detection

Catalogue Entry Detection

This repository contains data and code produced in 2022/2023 by Isaac Dunford as part of a Digital Humanities Internship funded by the School of Humanities at the University of Southampton.

It documents a project whose purpose was to investigate and implement different methods for detecting catalogue entries within printed catalogues. Printed catalogues are easy enough to digitise and convert into machine-readable data, but dividing that data by catalogue entry requires translating the visual signifiers of divisions between entries (gaps in the printed page, large or upper-case headers, catalogue references) into machine-readable information.

Isaac describes the work in his post on the British Library Digital Scholarship blog.

To test these methods, we worked with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum. The project was undertaken in support of Rossitza Atanassova's AHRC-RLUK Professional Practice Fellowship.
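As a rough illustration of the general idea of turning visual signifiers into entry boundaries, the sketch below groups lines of text into entries whenever it sees a large vertical gap or an upper-case header. It is not the method used in this repository: the Line record, the split_entries function, the gap_threshold value and the sample page are all hypothetical, and the input is a simple list of positioned lines rather than the project's XML.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One line of recognised text with its vertical position (hypothetical format)."""
    text: str
    top: float     # y-coordinate of the top of the line
    bottom: float  # y-coordinate of the bottom of the line


def split_entries(lines, gap_threshold=20.0):
    """Group lines into entries.

    A new entry starts when the vertical gap to the previous line exceeds
    gap_threshold (an assumed value) or when the line is entirely upper-case.
    """
    entries = []
    current = []
    previous = None
    for line in lines:
        gap = line.top - previous.bottom if previous is not None else 0.0
        is_header = line.text.isupper()
        if current and (gap > gap_threshold or is_header):
            entries.append(current)
            current = []
        current.append(line.text)
        previous = line
    if current:
        entries.append(current)
    return entries


# Invented sample page, for illustration only.
page = [
    Line("FIRST HEADING", 0, 10),
    Line("description of the first item", 12, 22),
    Line("continued description", 24, 34),
    Line("a second item after a wide gap", 70, 80),
    Line("SECOND HEADING", 84, 94),
    Line("description of the third item", 96, 106),
]

entries = split_entries(page)
for entry in entries:
    print(entry)
```

A rule-based split like this is easy to inspect, but real pages would likely need thresholds tuned per volume and extra signals such as catalogue references.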

This project continues to be maintained at https://github.com/britishlibrary/Incunabula-Catalogue-Entry-Detection.

License

All data provided by the British Library: text data CC0 1.0 Universal Public Domain; images CC-BY 4.0 International. Code is released under the MIT License.
