This document covers some examples of using the cait command line utilities to export content from a production ArchivesSpace deployment to a local development ArchivesSpace deployment. The most recent version of this document can be found at https://github.com/caltechlibrary/cait.
The easiest way to export content from a production ArchivesSpace deployment is using the cait utility.
- Set your environment variables
- Use the archivesspace export option to create a local dump
- CAIT_USERNAME admin
- CAIT_PASSWORD admin
- CAIT_API_URL (for your production system) https://archives.example.edu/api
- CAIT_DATASET dataset
The following environment variables are not used in the export process
- CAIT_SITE_URL
- CAIT_HTDOCS
- CAIT_HTDOCS_INDEX
- CAIT_TEMPLATES
I am also assuming the cait utilities (e.g. cait) are installed in your path.
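Before running the export it can help to confirm that the four variables listed above are actually set. The following is a minimal sketch, not part of cait itself; the function name missing_cait_vars is hypothetical and was chosen only for this illustration.

```shell
# Hypothetical helper (not shipped with cait): prints the name of every
# export-related variable that is unset or empty, one per line.
# Prints nothing when all four are set.
missing_cait_vars() {
  for name in CAIT_API_URL CAIT_USERNAME CAIT_PASSWORD CAIT_DATASET; do
    # Indirect lookup via eval; the names are a fixed list, so this is safe.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

Calling missing_cait_vars with no arguments and getting empty output means the environment is ready for the export step.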
#!/bin/bash
export CAIT_API_URL=https://archives.example.edu/api
export CAIT_USERNAME=admin
export CAIT_PASSWORD=admin
export CAIT_DATASET=dataset
cait archivesspace export
unset CAIT_USERNAME
unset CAIT_PASSWORD
unset CAIT_API_URL
This will take a while, but it will create a local dump of the content in the directory named by CAIT_DATASET (dataset in this example). Each file is a JSON blob. Since you don't want to accidentally disturb your production system, it is a good idea to unset the environment variables when the export is complete.
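One way to avoid leaving production credentials in your shell, offered here as an assumption rather than something the cait project prescribes, is to set the variables inside a subshell. The variables disappear when the subshell exits, so nothing needs to be unset by hand. The function name with_production_env is hypothetical; the values are the example values used above.

```shell
# Hypothetical wrapper: runs the given command with the production
# settings exported inside a subshell. The parentheses create the
# subshell, so the caller's environment is left unchanged.
with_production_env() {
  (
    export CAIT_API_URL="https://archives.example.edu/api"
    export CAIT_USERNAME="admin"
    export CAIT_PASSWORD="admin"
    export CAIT_DATASET="dataset"
    "$@"
  )
}
```

With this in place the export would be run as: with_production_env cait archivesspace export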
In this example we're assuming your data directory is already populated, you are using the Bash shell, and the cait utilities are installed in your path.
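The assumption that the export directory is already populated can be checked before any import is attempted. This is a small sketch using only standard tools; dataset_is_populated is a hypothetical name, and the check only confirms that at least one file exists, not that the files are valid JSON.

```shell
# Hypothetical helper: succeeds if the directory exists and contains
# at least one regular file somewhere beneath it.
dataset_is_populated() {
  dir=$1
  [ -d "$dir" ] || return 1
  # List files and keep only the first line; empty output means no files.
  [ -n "$(find "$dir" -type f | head -n 1)" ]
}
```

For the layout used in this document the call would be: dataset_is_populated dataset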
The basic steps are
- Bring up an empty ArchivesSpace (follow the instructions at http://archivesspace.org)
- Create a repository (this usually gets created as Repo ID 2)
- Create any custom controlled vocabularies you need (e.g. extent types)
- Load the Agents (in this example I am assuming you are only interested in the people)
- Load the Subjects
- Load the Accessions
- Load the Digital Objects
- CAIT_API_URL http://localhost:8089
- CAIT_USERNAME admin
- CAIT_PASSWORD admin
- CAIT_DATASET dataset
The following environment variables are not used in the import process
- CAIT_SITE_URL
- CAIT_HTDOCS
- CAIT_HTDOCS_INDEX
- CAIT_TEMPLATES
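Because the import writes to whatever CAIT_API_URL points at, a guard that refuses anything other than a local address can prevent an accidental import into production. This is a sketch under the assumption that your development instance is reached through localhost or 127.0.0.1 over plain http, as in the example value above; is_local_api_url is a hypothetical name.

```shell
# Hypothetical guard: succeeds only for http URLs whose host is
# localhost or 127.0.0.1 (with an optional port or path).
is_local_api_url() {
  case "$1" in
    http://localhost|http://localhost:*|http://localhost/*) return 0 ;;
    http://127.0.0.1|http://127.0.0.1:*|http://127.0.0.1/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script could call is_local_api_url "$CAIT_API_URL" and stop if it fails, before any create command runs.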
Here are the steps to populate your local development ArchivesSpace. In this example I am assuming you're importing into repository id of 2.
If you have any non-default extent_extent_type values, create them before proceeding
- Log in to ArchivesSpace as admin
- click on System
- click Manage Controlled Value Lists
- Select (from List Name) Extent Extent Type (extent_extent_type)
- Add your additional values.
export CAIT_API_URL=http://localhost:8089
export CAIT_USERNAME=admin
export CAIT_PASSWORD=admin
export CAIT_DATASET=dataset
# If you have non-default extent extent types, create them before proceeding
# e.g. Multimedia, ProRes Master file, DVD
# Create the repository first so it exists before its records are loaded
cait repository create -input dataset/repositories/2.json
# Load each exported JSON file; quoting "$ITEM" keeps paths with spaces intact
find dataset/subjects -type f | while IFS= read -r ITEM; do cait subject create -input "$ITEM"; done
find dataset/agents/people -type f | while IFS= read -r ITEM; do cait agent create -input "$ITEM"; done
find dataset/agents/corporate_entities -type f | while IFS= read -r ITEM; do cait agent create -input "$ITEM"; done
find dataset/repositories/2/digital_objects -type f | while IFS= read -r ITEM; do cait digital_object create -input "$ITEM"; done
find dataset/repositories/2/accessions -type f | while IFS= read -r ITEM; do cait accession create -input "$ITEM"; done
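The five find loops above share one shape, so they can be folded into a single helper. The following is a sketch, not part of cait; import_each is a hypothetical name. It appends -input and the file path to whatever command it is given, which matches how the create commands are invoked in this document, and it reports any file whose command fails instead of stopping silently.

```shell
# Hypothetical helper: for every file under the directory given as the
# first argument, run the remaining arguments as a command followed by
# "-input <file>". Files are processed in sorted order.
import_each() {
  dir=$1
  shift
  find "$dir" -type f | sort | while IFS= read -r item; do
    # stdin is redirected so the command cannot consume the file list.
    "$@" -input "$item" < /dev/null || printf 'failed: %s\n' "$item" >&2
  done
}
```

For example, the subjects line would become: import_each dataset/subjects cait subject create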
You can import content from one ArchivesSpace deployment into another using a combination of the cait utility and basic shell scripting.
The basic steps I take after having set up ArchivesSpace for development and loaded it with data are as follows. The instructions assume you're in your cait repository directory and that all the cait tools are compiled and installed in your path.
- CAIT_DATASET
- CAIT_HTDOCS
- CAIT_HTDOCS_INDEX
- CAIT_SITE_URL
- CAIT_TEMPLATES
- Make sure the CAIT_ environment variables are set and cait utilities are installed in your path.
- Build the website with
cait-genpages
- Create/update the sitemap, using sitemapper from the mkpage project, with
sitemapper $CAIT_HTDOCS $CAIT_HTDOCS/sitemap.xml $CAIT_SITE_URL
- Index the site (this takes a while on my machine)
cait-indexpages
- Launch
cait-servepages
and test with your web browser
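The first step above, making sure the tools are installed in your path, can be checked with a short helper. This is a sketch using only the standard command -v lookup; missing_tools is a hypothetical name.

```shell
# Hypothetical helper: prints the name of every argument that cannot be
# found as a command in the current PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For the steps in this section the call would be: missing_tools cait-genpages sitemapper cait-indexpages cait-servepages. Empty output means all four were found.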