BagIt Importer
Bulkrax can import valid BagIt bags, either individually or as multiple bags in a single folder. The bag, or folder of bags, may be supplied as a zip file.
Bulkrax assumes that each bag contains one or more works, each described by a single metadata file and accompanied by one or more data files.
This single bag containing two images and one metadata file would be imported as a single Work with two files attached. The metadata file (my_metadata.csv) can be at the top level as it is here, or it can be in the data folder.
my_bag
    data
        my_image.tif
        my_other_image.jpg
    my_metadata.csv
    (bagit files)
Importing this folder would create a separate work for each bag, three works in total.
folder
    my_bag
        data
            my_image.tif
            more_metadata.csv
        (bagit files)
    my_second_bag
        (structured as per my_bag)
    my_third_bag
        (ditto)
This bag would be unpacked to create three works, one per metadata file.
my_bag
    data
        work1
            my_image1.tif
            my_metadata.csv
        work2
            my_image2.tif
            my_metadata.csv
        work3
            my_image3.tif
            my_metadata.csv
    (bagit files)
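
To make the "one work per metadata file" rule concrete, here is a minimal sketch that walks an unpacked bag and lists the data files sitting beside each metadata file. It is not Bulkrax's own parser: `works_in_bag` and its `metadata_name` parameter are hypothetical names, and the sketch assumes the metadata file lives under `data`, co-located with the files it describes.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of Bulkrax): map each directory under data/
# that holds a metadata file to the data files found beside that file.
def works_in_bag(bag_path, metadata_name = 'my_metadata.csv')
  data_dir = File.join(bag_path, 'data')
  # Paths are returned relative to data_dir because of the base: option.
  Dir.glob(File.join('**', metadata_name), base: data_dir).sort.to_h do |relative|
    work_dir = File.dirname(relative) # "." when the file sits directly in data/
    absolute = File.join(data_dir, work_dir)
    files = Dir.children(absolute).sort.select do |name|
      name != metadata_name && File.file?(File.join(absolute, name))
    end
    [work_dir, files]
  end
end

# Rebuild the three-work layout shown above in a temporary directory.
Dir.mktmpdir do |root|
  bag = File.join(root, 'my_bag')
  %w[work1 work2 work3].each_with_index do |work, index|
    dir = File.join(bag, 'data', work)
    FileUtils.mkdir_p(dir)
    FileUtils.touch(File.join(dir, "my_image#{index + 1}.tif"))
    FileUtils.touch(File.join(dir, 'my_metadata.csv'))
  end
  works_in_bag(bag).each { |dir, files| puts "#{dir}: #{files.join(', ')}" }
end
```

Each key in the returned hash corresponds to one work that an importer following this rule would create.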
- If the metadata is supplied as CSV, file sets and collections can also be imported from the same CSV file. For example, to import a work, a file set and a collection together:

| model | title | parents | source_identifier |
| --- | --- | --- | --- |
| Work | My work | my_collection | my_work |
| Collection | My collection | | my_collection |
| FileSet | My file set | my_work | my_file |

- If there are multiple bags, or multiple works, each metadata file MUST have the same filename and MUST be co-located with the data files (as per the example above).
- Metadata can be supplied as RDF or CSV.
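
The combined work, file set and collection rows described above could be written out with Ruby's standard `csv` library. This is only a sketch: the `HEADERS` and `ROWS` constants and the `metadata_csv` helper are names invented for illustration, and the values are copied from the example rows.

```ruby
require 'csv'

# Column names and rows copied from the example; the Collection row has no
# parent, so its parents cell is left empty (nil).
HEADERS = %w[model title parents source_identifier].freeze
ROWS = [
  ['Work', 'My work', 'my_collection', 'my_work'],
  ['Collection', 'My collection', nil, 'my_collection'],
  ['FileSet', 'My file set', 'my_work', 'my_file']
].freeze

# Hypothetical helper: serialise the header and rows to a CSV string.
def metadata_csv(headers = HEADERS, rows = ROWS)
  CSV.generate do |csv|
    csv << headers
    rows.each { |row| csv << row }
  end
end

puts metadata_csv
```

The resulting string can be saved with `File.write` under whatever metadata filename the bag uses, for example `my_metadata.csv`.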
There are various tools for creating BagIt bags. For example, using the Ruby 'bagit' gem in an irb console:
gem install bagit
irb
> require 'bagit'
> # make a new bag from existing files
> bag = BagIt::Bag.new path_to_files
> # e.g.: bag = BagIt::Bag.new '/Users/computer_name/Work/my_bag'
> # generate the manifest using SHA-256 checksums
> bag.manifest!(algo: 'sha256') # ref: https://www.geeksforgeeks.org/difference-between-sha1-and-sha256/
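
For readers curious about what a SHA-256 payload manifest contains, the sketch below builds the equivalent lines using only Ruby's standard library. It is an illustration of the BagIt manifest format (a checksum followed by the file path relative to the bag), not the gem's implementation; `sha256_manifest_lines` is a hypothetical helper name.

```ruby
require 'digest'

# Hypothetical helper, not part of the bagit gem: return one
# "<sha256 checksum> data/<relative path>" line per payload file.
# Note that this glob pattern skips dotfiles.
def sha256_manifest_lines(bag_path)
  data_dir = File.join(bag_path, 'data')
  Dir.glob('**/*', base: data_dir).sort.filter_map do |relative|
    absolute = File.join(data_dir, relative)
    next unless File.file?(absolute) # ignore directories

    "#{Digest::SHA256.file(absolute).hexdigest} data/#{relative}"
  end
end

# Usage (path taken from the example above):
# puts sha256_manifest_lines('/Users/computer_name/Work/my_bag')
```

Comparing these lines with the `manifest-sha256.txt` written into the bag is a quick way to spot a file that changed after the bag was created.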