This is the home page of the medieval Latin biblical manuscripts relational database project, developed within the Artificial Intelligence for Cultural Heritage (AI4CH) center and the Data Science and Automatic Verification Laboratory at the University of Udine, Italy. The database aims to provide an integrated and generalizable framework to store, manage, and leverage information about the text and paratext of medieval Latin biblical manuscripts.
The current repository includes:
- an Excel file with the raw data about the biblical manuscripts: TO-DO
- the code to set up the database within a Postgres database instance
- the code to import the raw Excel file into the database: TO-DO
- the code of some queries that show how to use the database
The following picture reports the overall relational schema of the database:
Repository is the entity that represents the current physical location of the manuscript, typically a library or a conservation institution. Its key is wd_code, a unique alphanumeric code taken from Wikidata. The other attributes are name, city, and country, which give the name of the institution and the city and country where it is located.
Manuscript represents the physical manuscript, identified by the siglum, a unique identifier within this database. A manuscript is described by the following attributes:
- City of origin: The city where the manuscript is believed to have been written, which may differ from its current location. Alternatively, the most ancient location that can be tracked for the manuscript. Optional.
- Initial year and final year: The time range during which the manuscript was produced. Optional.
- Date attribution notes: Text notes providing details on the manuscript’s date attribution. Optional.
- Decorations: A boolean attribute indicating whether the manuscript contains decorations.
- Writing style: Specifies the manuscript’s writing style.
- Musical notation: A boolean attribute indicating the presence of musical notation in the text.
- Collection: Refers to the collection the manuscript belongs to or the language it was written in. If unknown, the default designation “MS” (Manuscript) is used.
- Number: Works with the collection attribute to identify a specific manuscript within a collection. Represented as an integer or alphanumeric string.
- Digital edition: A hyperlink to a digital reproduction of the manuscript, if available. Optional.
- MS identifier: A unique identifier derived by combining the repository's WD code, the collection, and the number. Because this identifier is long, the siglum is used as the key instead, for brevity.
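
As an illustration of how the MS identifier could be derived from its three components, here is a minimal Python sketch. The class name, the field names, the example Wikidata code, and the space separator are all assumptions made for this example; the actual database may compose and format the identifier differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Manuscript:
    """Hypothetical in-memory view of a manuscript row (names are illustrative)."""
    siglum: str                  # short key used throughout the database
    repository_wd_code: str      # Wikidata code of the holding repository
    number: Union[int, str]      # integer or alphanumeric shelf number
    collection: str = "MS"       # default designation when the collection is unknown

    def ms_identifier(self, sep: str = " ") -> str:
        # Combine repository WD code, collection, and number.
        # The separator is an assumption, not taken from the project.
        return sep.join([self.repository_wd_code, self.collection, str(self.number)])


# Usage with made-up values:
example = Manuscript(siglum="A", repository_wd_code="Q0000", number=42)
print(example.ms_identifier())
```

Keeping the siglum as the key and computing the longer identifier on demand mirrors the design choice described above.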
Book represents a specific biblical book within a manuscript (e.g., Genesis or Exodus, tracked by the attribute book_type). It has the attribute sequence number, which indicates the book’s order within the manuscript. Since the sequence order depends on the manuscript, this entity is subordinate to Manuscript. Note that the relationship between Book and Manuscript is many-to-one: a specific book, as transcribed (including its unique errors and variations), belongs to one manuscript only, while a manuscript can contain multiple books. A book acts as a sort of "container" for the book elements.
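
The many-to-one relationship between Book and Manuscript can be sketched as follows. This is only an illustration: it uses an in-memory SQLite database so that it runs without a server (the project itself targets Postgres), and the table names, column names, and the composite key on (siglum, sequence_number) are assumptions, not the repository's actual schema.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny, hypothetical Manuscript/Book schema and load sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(
        """
        CREATE TABLE manuscript (
            siglum TEXT PRIMARY KEY
        );
        -- A book is identified by its manuscript plus its position in it,
        -- which models Book as subordinate to Manuscript.
        CREATE TABLE book (
            siglum          TEXT    NOT NULL REFERENCES manuscript(siglum),
            sequence_number INTEGER NOT NULL,
            book_type       TEXT    NOT NULL,
            PRIMARY KEY (siglum, sequence_number)
        );
        """
    )
    conn.execute("INSERT INTO manuscript (siglum) VALUES ('A')")
    conn.executemany(
        "INSERT INTO book (siglum, sequence_number, book_type) VALUES (?, ?, ?)",
        [("A", 1, "Genesis"), ("A", 2, "Exodus")],
    )
    return conn


conn = build_demo()
# One manuscript, several books, listed in manuscript order.
for row in conn.execute(
    "SELECT sequence_number, book_type FROM book WHERE siglum = 'A' ORDER BY sequence_number"
):
    print(row)
```

With this layout, inserting a book whose siglum has no matching manuscript is rejected by the foreign key, so a book can never exist outside a manuscript.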
Book Element represents a "generic" component of a book, which can be one of three types (attribute element_type): prologue, summary, or text. The structure is as follows:
- A book can contain up to five prologues, at most one summary, and exactly one text
- Books can exist without prologues and/or summaries

Each Book Element has a unique ID, which serves as the primary key within the database. The idea is that the ID ties the book element to a particular text (e.g., a specific prologue known in the domain). Thus, for example, two distinct prologues for the book of Genesis will have different IDs to distinguish them from each other. Conversely, a prologue with a given ID can appear in more than one book, where it represents the same shared textual content. The IDs used are those proposed in the Robert Weber and Roger Gryson edition of the Vulgate. Note that Book Element is a generic, abstract concept used to differentiate between element types and their defining textual content.
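
The cardinality rules listed above (up to five prologues, at most one summary, exactly one text) can be expressed as a small check. The following Python sketch is an assumption about how such a validation might look; the function name and the error messages are hypothetical, and the project may enforce these rules differently (for example, with database constraints).

```python
from collections import Counter
from typing import Iterable, List

MAX_PROLOGUES = 5
ELEMENT_TYPES = {"prologue", "summary", "text"}


def check_book_elements(element_types: Iterable[str]) -> List[str]:
    """Return the list of violated rules; an empty list means the book is well formed."""
    counts = Counter(element_types)
    problems = []
    unknown = set(counts) - ELEMENT_TYPES
    if unknown:
        problems.append(f"unknown element types: {sorted(unknown)}")
    if counts["prologue"] > MAX_PROLOGUES:
        problems.append("more than five prologues")
    if counts["summary"] > 1:
        problems.append("more than one summary")
    if counts["text"] != 1:
        problems.append("exactly one text is required")
    return problems


# A book with two prologues, one summary, and its text passes the check.
print(check_book_elements(["prologue", "prologue", "summary", "text"]))
```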
Includes tracks the many-to-many relationship between a specific manuscript’s book and its elements. While the Book Element entity represents generic elements of a book, characterized by their type and uniquely identified by their text (via the ID attribute), linking a Book Element to a Book “materializes” it. This connection describes the specific physical instance of the Book Element as it appears within the particular manuscript’s book. These physical characteristics, including the paratext, are described by the following attributes:
- Element sequence order: Indicates the order of the element within the book. The text is always the last element, while prologues and summaries are interchangeable.
- Initial sheet and final sheet: Specify the starting and ending sheet of the element in the manuscript, given as a page number and a column (e.g., “ra” for recto-column a, “vb” for verso-column b). Values such as “om.” (omitted) or “lac.” (lacuna) are used for missing or damaged portions.
- Initial heading and final heading: Represent the starting and ending headings of the element. These mix uppercase and lowercase letters to expand abbreviations, aiding text search. These attributes can also have “om.” or “lac.” values and are optional.
- Running title: An optional attribute for the abbreviated book title in the page margin.
- Decorated initial letter: An optional attribute indicating the presence of a decorated initial letter, which can also have “om.” or “lac.” values.
- Stichometry: An optional attribute recording the number of lines declared at the end of some texts.
- Junction: Indicates whether the element ends at a gathering’s junction. Values include: “(x)” caesura with blank spaces, “?” possible caesura, “(?)” possible caesura with blank spaces, “/” confirmed caesura that includes the following incipit, “/?” possible caesura that includes the following incipit, “(/)” caesura that includes the following incipit and blank spaces, “(/?)” possible caesura that includes the following incipit and blank spaces, and “B” anomalous blank spaces at the end of a text.
- Notes: An optional textual field for additional information about the element.
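
To make the sheet and junction conventions above concrete, here is a minimal Python sketch that parses a sheet reference and looks up the meaning of a junction code. The junction meanings and the “om.” / “lac.” markers are taken from the list above; the exact stored format of a sheet reference (a number immediately followed by the side and the column, such as `12ra`) is an assumption, as are all function and type names.

```python
import re
from typing import NamedTuple, Union

# Markers used for missing or damaged portions.
MISSING_MARKERS = {"om.": "omitted", "lac.": "lacuna"}

# Junction codes and their meanings, as listed in the attribute description.
JUNCTION_MEANINGS = {
    "(x)": "caesura with blank spaces",
    "?": "possible caesura",
    "(?)": "possible caesura with blank spaces",
    "/": "confirmed caesura that includes the following incipit",
    "/?": "possible caesura that includes the following incipit",
    "(/)": "caesura that includes the following incipit and blank spaces",
    "(/?)": "possible caesura that includes the following incipit and blank spaces",
    "B": "anomalous blank spaces at the end of a text",
}


class SheetRef(NamedTuple):
    number: int   # page number
    side: str     # "r" (recto) or "v" (verso)
    column: str   # column letter, e.g. "a" or "b"


# Assumed layout: digits, then side, then column (e.g. "12ra").
_SHEET_RE = re.compile(r"^(\d+)([rv])([a-z])$")


def parse_sheet(value: str) -> Union[SheetRef, str]:
    """Parse a sheet reference, or return 'omitted'/'lacuna' for the special markers."""
    value = value.strip()
    if value in MISSING_MARKERS:
        return MISSING_MARKERS[value]
    match = _SHEET_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized sheet reference: {value!r}")
    return SheetRef(int(match.group(1)), match.group(2), match.group(3))


print(parse_sheet("12ra"))
print(parse_sheet("lac."))
print(JUNCTION_MEANINGS["(/?)"])
```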
The system can be accessed at http://158.110.146.222:8080/. Upon connection, users will find a pgAdmin web interface that asks for login credentials.
A read-only user, with privileges to perform SELECT operations on the public schema of the database biblical, has been provided with the following credentials:
username = tester_biblical@ai4ch.uniud.it
password = UXftJGM5eNMdPGZ