Bulkrax imports
TODO: Adapt instructions from https://github.com/gwu-libraries/scholarspace-hyrax/pull/555
Run the `ingest_bulkrax_prep` rake task, either inside the container or from outside using `docker exec`. The task requires one argument: the path to the directory containing the ProQuest zips you wish to include in the ingest. For example, if the ETDs are in `/opt/scholarspace/scholarspace-ingest/etd-zips`, run `bundle exec rails gwss:ingest_pq_etds['/opt/scholarspace/scholarspace-ingest/etd-zips']`.
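When running from outside the container, the call can be wrapped in a small bash function. This is a minimal sketch, not the project's own tooling: the function name `run_ingest` and the container name in the usage note are hypothetical, and it assumes the container's default working directory is the Rails application root.

```shell
# Hypothetical helper: run the ETD prep task inside a running container.
#   $1 = container name or ID (check `docker ps` for the real one)
#   $2 = directory, as seen inside the container, holding the ProQuest zips
run_ingest() {
  local container="$1" zip_dir="$2"
  # The task argument is double-quoted so the shell does not treat the
  # square brackets as a glob pattern; rake receives task[path].
  docker exec "$container" bundle exec rails "gwss:ingest_pq_etds[${zip_dir}]"
}
```

For example, `run_ingest my-app-container /opt/scholarspace/scholarspace-ingest/etd-zips`, where `my-app-container` stands in for the actual container name. The shell strips the single quotes used in the one-line form, so the rake task receives the same argument either way.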
The Bulkrax manifest will be written to a `bulkrax_zip` directory inside the directory given by the `TEMP_FILE_BASE` environment variable (typically set in `.env`). The manifest contains:
- a `metadata.csv` Bulkrax-compliant manifest file
- a `files` directory, containing a directory for each ETD zip, which itself contains:
  - the ProQuest XML file
  - the main ETD PDF
  - optionally, a folder containing additional attachments for the ETD
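Before creating the importer, it can help to confirm that the output matches this layout. The following bash sketch is one way to do that. The function name `check_manifest` is hypothetical, and it assumes the XML and PDF sit at the top level of each ETD directory with `.xml` and `.pdf` extensions.

```shell
# Hypothetical helper: verify the layout of the generated manifest directory.
#   $1 = the manifest directory (the one that holds metadata.csv)
check_manifest() {
  local manifest_dir="$1" etd_dir ext status=0

  [ -f "$manifest_dir/metadata.csv" ] || { echo "missing: $manifest_dir/metadata.csv" >&2; return 1; }
  [ -d "$manifest_dir/files" ] || { echo "missing: $manifest_dir/files" >&2; return 1; }

  for etd_dir in "$manifest_dir"/files/*/; do
    [ -d "$etd_dir" ] || continue   # the glob matched nothing
    for ext in xml pdf; do
      # Report any ETD directory lacking an XML or PDF at its top level.
      if ! find "$etd_dir" -maxdepth 1 -type f -iname "*.$ext" | grep -q .; then
        echo "no .$ext file in $etd_dir" >&2
        status=1
      fi
    done
  done
  return "$status"
}
```

Call it with the manifest directory, for example `check_manifest "$TEMP_FILE_BASE/bulkrax_zip"`. It returns a non-zero status if `metadata.csv` or `files` is missing, or if any ETD directory lacks an XML or PDF file.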
Within the GW ScholarSpace web application, log in as an administrative user. On the Dashboard, click on Importers. Create a New importer with the following values:
- Name = any name
- Administrative Set = ETDs
- Frequency = Once (on save)
- Limit = leave blank
- Parser = CSV - Comma Separated Values
- Visibility = Public
- Rights Statement = leave blank
- Add CSV File to Import: Specify a Path on the Server. Import file path = `{TEMP_FILE_BASE}/bulkrax_zip/metadata.csv`
Before starting the import, open a tab to the Sidekiq administrator (at `/sidekiq`) so that you can watch the progress of the queues and monitor for any problems.
Then proceed and click Create and Import.
If you wish to re-run the task to regenerate the Bulkrax-ready metadata and files, first clear out the results of the previous run: `rm -r {TEMP_FILE_BASE}/bulkrax_zip`
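Because the cleanup command is built from an environment variable, a guarded form is safer in scripts. The sketch below is a suggestion rather than part of the project: the function name `clear_bulkrax_output` is hypothetical, and it assumes `TEMP_FILE_BASE` is already exported in the current shell.

```shell
# Hypothetical helper: remove the output of a previous prep run.
clear_bulkrax_output() {
  # ${VAR:?} aborts with an error if TEMP_FILE_BASE is unset or empty,
  # so the path can never expand to /bulkrax_zip.
  local target="${TEMP_FILE_BASE:?TEMP_FILE_BASE is not set}/bulkrax_zip"
  if [ -d "$target" ]; then
    rm -r -- "$target"
  else
    echo "nothing to remove at $target"
  fi
}
```

Calling `clear_bulkrax_output` removes only the `bulkrax_zip` directory and leaves the rest of `TEMP_FILE_BASE` untouched.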
TODO
Q: When I create an importer, the administrative set that I wish to import to isn't showing up in the dropdown list.
A: This can occur when your user has the `admin` role, and can therefore access `/importers`, but does not have the `contentadmin` role; `contentadmin`s can import to any admin set. Try adding the `contentadmin` role to your administrative user.