UAlbanyArchives/arclight_integration_project

Arclight Integration Project

Integrating Arclight with Digital Content, IIIF, and ArchivesSpace

IIIFlow

This is a Python package that uses the directory structure defined in the Digital Object Discovery Storage Specification for a IIIF pipeline.

iiiflow functions create pyramidal TIFFs, thumbnails, HOCR and text transcriptions, and combine them all into a IIIF v3 manifest.

Setup

Prerequisites

In addition to the Python dependencies in requirements.txt, there are some OS-level dependencies:

  • Pyramidal TIFFs require vips
  • Thumbnail generation requires ImageMagick (should probably be changed to vips)
  • HOCR requires tesseract
  • A/V transcriptions require whisper
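
Before running the pipeline, it can help to confirm that these tools are installed. The following is a minimal sketch, not part of iiiflow: it assumes the tools are reachable as command-line executables on your PATH, and the executable names (`vips`, `magick` or `convert`, `tesseract`, `whisper`) are assumptions, since this README names the tools but not the commands. The helper `missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names; the README names the tools, not the commands.
# ImageMagick ships as `magick` (v7) or `convert` (v6), so either one counts.
REQUIRED_TOOLS = {
    "vips": ["vips"],
    "ImageMagick": ["magick", "convert"],
    "tesseract": ["tesseract"],
    "whisper": ["whisper"],
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools with no candidate executable on PATH."""
    return [
        name
        for name, commands in required.items()
        if not any(which(command) for command in commands)
    ]
```

Calling `missing_tools()` returns an empty list when every tool is found. If iiiflow uses Python bindings rather than the command-line tools for some of these, this check will not reflect that.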

Config

iiiflow expects a .iiiflow.yml config file in your home directory (~) that defines the path to the root of your Digital Object Discovery Storage, the path to an error log, and the base URL where your images are hosted.

---
discovery_storage_root: /path/to/digital_object_root
manifest_url_root: https://my.server.org
error_log_file: /path/to/errors.log
audio_thumbnail_file: ./fixtures/thumbnail.jpg

Optionally, you can pass the path to any .yml file as the last argument of any iiiflow function:

create_ptif("collection1", "object1", "path/to/config.yml")

For audio thumbnails and the tests to work, set audio_thumbnail_file to either a local path or an accessible URL to an image file.
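
To catch a misconfigured file early, you could check it for the keys shown in the sample config. This is a rough sketch under stated assumptions, not how iiiflow reads its config: `read_flat_config` and `missing_keys` are hypothetical helpers, the parser only handles flat `key: value` lines (a real YAML library would be the better choice), and treating all four keys as required is an assumption.

```python
from pathlib import Path

# Keys taken from the sample config; treating all of them as required
# is an assumption, not documented iiiflow behavior.
EXPECTED_KEYS = (
    "discovery_storage_root",
    "manifest_url_root",
    "error_log_file",
    "audio_thumbnail_file",
)


def read_flat_config(path):
    """Parse flat `key: value` lines; a stand-in for a real YAML parser."""
    config = {}
    text = Path(path).expanduser().read_text(encoding="utf-8")
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and the `---` document marker.
        if not line or line.startswith("#") or line == "---":
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, separator, value = line.partition(":")
        if separator:
            config[key.strip()] = value.strip()
    return config


def missing_keys(config, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not config.get(key)]
```

For example, `missing_keys(read_flat_config("~/.iiiflow.yml"))` lists any expected keys that the file leaves out.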

Create thumbnails

Creates a 300x300 thumbnail.jpg

from iiiflow import make_thumbnail

make_thumbnail("collection1", "object1")

Create pyramidal TIFFs

Uses the .ptif extension to distinguish from traditional TIFFs.

from iiiflow import create_ptif

create_ptif("collection1", "object1")

Recognize text and create .hocr

from iiiflow import create_hocr

create_hocr("collection1", "object1")

Create A/V transcription

from iiiflow import create_transcription

create_transcription("collection1", "object1")

Validate metadata.yml

Validates metadata.yml using rules defined in the Digital Object Discovery Storage Specification.

from iiiflow import validate_metadata

validate_metadata("collection1", "object1")

Create manifest

from iiiflow import create_manifest

create_manifest("collection1", "object1")
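
The functions above all take a collection ID and an object ID, so they can be chained for a single object. Below is a minimal sketch of one way to do that. `run_steps` is a hypothetical helper, not part of iiiflow. The sketch assumes that each function raises an exception on failure, and which steps apply to a given object, and in what order, is also an assumption.

```python
def run_steps(steps, collection_id, object_id):
    """Call each step with (collection_id, object_id) and collect failures.

    Returns a list of (step_name, exception) pairs for steps that raised.
    """
    failures = []
    for step in steps:
        try:
            step(collection_id, object_id)
        except Exception as exc:
            # Keep going so that one failing step does not hide the others.
            failures.append((step.__name__, exc))
    return failures


# Hypothetical usage with the functions documented above (image object):
# from iiiflow import make_thumbnail, create_ptif, create_hocr, create_manifest
# failures = run_steps(
#     [make_thumbnail, create_ptif, create_hocr, create_manifest],
#     "collection1",
#     "object1",
# )
```

An empty return value means every step completed without raising. If a later step depends on an earlier one, you may prefer to stop at the first failure instead of continuing.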

Tests

This runs the tests with all dependencies:

docker-compose run test
