Commit
* Mapping DSIDs to Media Use terms
* Adding in PDFA
* Adding linked agents. Pulling from solr
* Does basic and large image, audio, video, and pdf from 7.x. Extracts metadata and linked agents from MODS.
* Documentation
* Adding images for README
* Update README.md
* Renaming some images
* Pointing to new images in README?
* Add files via upload
* Add files via upload
* Add files via upload
* Update README.md
* Updating images
* Adding sample objects
* Pointing at basic image solution pack by default
* Update README.md
* Updating pictures
* Now working with AUDIT datastreams
* Adding PID to field_identifier
* Migrating PID field separately
Showing 37 changed files with 1,083 additions and 239 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,81 +1,139 @@ | ||
## Introduction | ||
This module contains plugins and some example migrations to import data from a Fedora 3 Islandora instance | ||
into an Islandora CLAW instance. | ||
|
||
This is a base setup; it requires adjustments to the default Repository Object and configuration changes | ||
for your setup. | ||
|
||
## Required changes | ||
The default Repository Object provided with Islandora CLAW requires one additional field to allow for | ||
these migrations (or you can comment out these field migrations). | ||
|
||
1. A large text field called `field_mods_text`, this will store the MODS datastream from the source object. | ||
|
||
This is defined in the `config/install/migrate_plus.migration.islandora_basic_image.yml` and | ||
can be commented out there. | ||
|
||
## Example usage | ||
To use this migration, clone this repo into your Drupal 8 instance `modules/contrib` directory. | ||
|
||
DO NOT INSTALL THE MODULE YET!!! | ||
|
||
You will need to edit the 3 `migrate_plus.migration.islandora_basic_image*` files in the `config/install` directory. | ||
|
||
At a minimum you'll need to set: | ||
1. `solr_base_url: http://10.0.2.2:9080/solr` to your Solr instance | ||
1. `fedora_base_url: &fedora_base_url http://10.0.2.2:9080/fedora` to your Fedora, please leave the `&fedora_base_url` | ||
this is a placeholder and saves re-typing this value in other locations. | ||
1. The `username` and `password` in the block | ||
This module contains plugins to import data from a Fedora 3 Islandora instance | ||
into an Islandora CLAW instance. It also contains a feature as a submodule | ||
that contains some example migrations. The example migrations are based on forms from vanilla Islandora 7.x solution | ||
packs, and are meant to work with the fields defined in `islandora_demo`. If you customized your MODS forms, then you | ||
will also need to customize the example migration and `islandora_demo`. | ||
|
||
Currently, the following content models can be migrated over with full functionality: | ||
|
||
- Collection | ||
- Basic Image | ||
- Large Image | ||
- Audio | ||
- Video | ||
- Binary | ||
|
||
If you want some sample Basic Image objects with metadata made from stock forms, check out [this zip | ||
file](docs/examples/sample_objects.zip) that you can use with `islandora_zip_batch_importer`. All the images were | ||
obtained from [Pexels](https://www.pexels.com/) and are free to use for personal or business purposes, with the | ||
original photographers attributed in the MODS. | ||
|
||
## Installation | ||
|
||
Download this module, its feature, and its dependencies with composer | ||
|
||
``` | ||
composer require islandora/migrate_7x_claw | ||
``` | ||
|
||
Install the module and example migrations at the same time using drush | ||
|
||
``` | ||
drush en islandora_migrate_7x_claw_feature | ||
``` | ||
|
||
## Configuration | ||
|
||
By default, the migrations are configured to work with an `islandora_vagrant` instance running on the same host as a | ||
`claw-playbook` instance, which is convenient for development and testing. But for your Islandora 7.x instance, the | ||
following config will need to be set the same way on the source plugin of each migration (except for the | ||
"7.x Tags Migration from CSV" migration): | ||
|
||
- `solr_base_url` should point to your Islandora 7.x Solr instance (e.g. `http://example.org:8080/solr`) | ||
- `fedora_base_url` should point to your Fedora 3 instance (e.g. `http://example.org:8080/fedora`) | ||
- The `username` and `password` for your Fedora 3 instance in the block | ||
``` | ||
authentication: &fedora_auth | ||
  plugin: basic | ||
  username: fedoraAdmin | ||
  password: fedoraAdmin | ||
``` | ||
- `q` is used to define a Solr query that selects which objects get migrated. From a fresh clone, the | ||
migrations are configured to look for `islandora:sp_basic_image_collection` and all its children with the following query: | ||
``` | ||
RELS_EXT_isMemberOfCollection_uri_ms:"info:fedora/islandora:sp_basic_image_collection" OR PID:"islandora:sp_basic_image_collection" | ||
``` | ||
You can easily import a collection of your own by changing the PID in the above query, or you can provide your own | ||
query to migrate over objects in other ways (such as per content model, in order by date created, etc...). If you can write a Solr select query for it, you can migrate it into CLAW. Omitting `q` from configuration will default to `*:*` | ||
for the Solr query. | ||
|
||
Once you've updated the configuration, you need to re-import the feature to load your changes. You can do this with `drush`: | ||
``` | ||
drush -y fim islandora_migrate_7x_claw_feature | ||
``` | ||
|
||
You can also use the UI to import the feature if you go to `admin/config/development/features` and click on the `Changed` link next to "Migrate 7x Claw Feature". | ||
|
||
![Changed Link](docs/images/feature_click_changed.png) | ||
|
||
You may also need (or want) to alter the content model field name in Solr. | ||
`content_model_field: RELS_EXT_hasModel_uri_ms` | ||
and the content model to migrate. | ||
`content_model: islandora:sp_basic_image` | ||
From there, you can select all changes and click "Import Changes". | ||
|
||
These changes need to be made in all 3 migration configuration files. | ||
![Import Changes](docs/images/feature_import_changes.png) | ||
|
||
Now you can install the `migrate_7x_claw` module. | ||
## Running the migrations | ||
|
||
If you have installed the `migrate_ui` module you can review the process in the `Admin -> Structure -> Migrations`. | ||
You can quickly run all migrations using `drush`: | ||
``` | ||
drush -y mim --group islandora_7x | ||
``` | ||
|
||
You can then see. | ||
![List of Migrations](docs/images/migrations.jpg) | ||
If you want to go through the UI, you can visit `admin/structure/migrate` to see a list of migration groups. The migrations provided by this module have the machine name `islandora_7x`. | ||
|
||
If you click **List Migrations** you will see 3 migrations. | ||
![Migrations Groups](docs/images/migrate_groups.png) | ||
|
||
![Migration](docs/images/migrate1.jpg) | ||
You will see 8 migrations. _The "7.x Tags Migration from CSV" needs to be run first_. | ||
|
||
The _Basic Image Objects OBJ Media_ migration requires the other two be completed first, if you try to run this one it | ||
will run the other two first. | ||
![Migrations](docs/images/migrations.png) | ||
|
||
Clicking **Execute** on the _Basic Image Objects_ displays a page like. | ||
Clicking **Execute** on "7.x Tags Migration from CSV" migration displays a page like | ||
|
||
![Migration Execute](docs/images/migrate2.jpg) | ||
![Execute Migration](docs/images/execute_migration.png) | ||
|
||
The operations you can run are | ||
* **Import** - import the objects | ||
The operations you can run for a migration are | ||
* **Import** - import un-migrated objects (check the "Update" checkbox to re-run previously migrated objects) | ||
* **Rollback** - delete all the objects (if any) previously imported | ||
* **Stop** - stop a long running import. | ||
* **Reset** - reset an import that might have failed. | ||
|
||
With _Import_ selected press **Execute**. | ||
If you select "Import", and then click "Execute", it will run the migration. It should process 5 items. | ||
|
||
Then you can run the "Islandora Media" migration, which depends on the remaining migrations. Running it effectively | ||
runs the entire group of migrations other than the "7.x Tags Migration from CSV" migration. After they're all done, | ||
you should be able to navigate to the home page of your CLAW instance and see your content brought over from | ||
Islandora 7.x! | ||
|
||
When complete, you should see something like below (your number will be different). | ||
![Content in CLAW](docs/images/content_in_claw.png) | ||
|
||
![Migration result](docs/images/migrate_result1.jpg) | ||
If you click on any node you should see all its metadata, which has been extracted from its MODS and Solr documents. | ||
Here's the original object in Islandora 7.x: | ||
|
||
Once you have completed all 3 | ||
![Free Smells in 7x](docs/images/free_smells_in_7x.png) | ||
|
||
And here it is in Islandora CLAW: | ||
|
||
![Free Smells in CLAW](docs/images/free_smells_in_claw.png) | ||
|
||
Clicking on the Media tab will reveal all of the datastreams migrated over from 7.x, which you can now manage through CLAW. Here are the original datastreams in Islandora 7.x: | ||
|
||
![Free Smells Datastreams](docs/images/free_smells_datastreams.png) | ||
|
||
And here they are in Islandora CLAW as Media: | ||
|
||
![Free Smells Media](docs/images/free_smells_media.png) | ||
|
||
You can also check out the collection itself, which should have its "Members" block populated: | ||
|
||
![Collection in CLAW](docs/images/collection_in_claw.png) | ||
|
||
## How this migration works | ||
To allow for the magic Danny content modelling overhaul. | ||
You provide a query, as `q` in the source plugin configuration, that defines which objects get migrated. For each | ||
result in the query, you can choose to use either the Solr doc for an object, the FOXML file for an object, or | ||
a particular datastream for an object by setting the `url_type` configuration. The migrations for subjects, geographics, and agents all target the MODS file of an object. The migration for datastreams uses FOXML, and the migration for the objects themselves uses the Solr doc. | ||
|
||
All datastreams are migrated over as-is, regardless of what data is extracted by the migrations and applied as fields. | ||
|
||
Collection hierarchy is preserved so long as all the collections are in the `q` query results. | ||
|
||
1. The migration searches Solr for all of the content models specified. | ||
1. Each is migrated to a new node in Drupal. | ||
Then it creates a file for the OBJ datastream | ||
of each of these objects. Lastly it creates a media object that links the file to the node. | ||
Subject, geographic, and person/corporate agents from MODS all get transformed into taxonomy terms, and content | ||
is tagged with these terms. |
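As a rough illustration of the `q` setting described in the README above, the following Python sketch builds the default collection query for a given PID, plus a URL for previewing what that query matches in Solr. The helper names `collection_query` and `solr_select_url`, the `/select` handler path, and the `fl`, `rows`, and `wt` parameters are assumptions made for this example; how the source plugin itself issues its Solr request is not shown in this diff.

```python
from urllib.parse import urlencode


def collection_query(pid: str) -> str:
    """Build a `q` value matching a collection and its direct members.

    Mirrors the default query from the README, with the PID swapped in.
    """
    return (
        f'RELS_EXT_isMemberOfCollection_uri_ms:"info:fedora/{pid}"'
        f' OR PID:"{pid}"'
    )


def solr_select_url(solr_base_url: str, q: str = "*:*", rows: int = 10) -> str:
    """Return a URL for previewing which PIDs `q` matches.

    Assumption: the select handler lives at `<solr_base_url>/select`.
    Adjust the path if your Solr install uses a named core.
    """
    params = urlencode({"q": q, "fl": "PID", "rows": rows, "wt": "json"})
    return f"{solr_base_url.rstrip('/')}/select?{params}"


if __name__ == "__main__":
    q = collection_query("islandora:sp_basic_image_collection")
    print(q)
    print(solr_select_url("http://example.org:8080/solr", q))
```

Opening the printed URL in a browser (or with `curl`) before running a migration is one way to check that a custom query selects the objects you expect.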
85 changes: 85 additions & 0 deletions
...ra_migrate_7x_claw_feature/config/install/migrate_plus.migration.islandora_audit_file.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
langcode: en | ||
status: true | ||
dependencies: | ||
  enforced: | ||
    module: | ||
      - migrate_7x_claw | ||
      - migrate_plus | ||
      - islandora | ||
id: islandora_audit_file | ||
class: null | ||
field_plugin_method: null | ||
cck_plugin_method: null | ||
migration_tags: null | ||
migration_group: islandora_7x | ||
label: 'AUDIT File' | ||
source: | ||
  plugin: islandora | ||
  solr_base_url: 'http://97.107.189.65:8080/solr' | ||
  q: 'RELS_EXT_isMemberOfCollection_uri_ms:"info:fedora/islandora:sp_basic_image_collection" OR PID:"islandora:sp_basic_image_collection"' | ||
  fedora_base_url: 'http://97.107.189.65:8080/fedora' | ||
  data_fetcher_plugin: http | ||
  authentication: | ||
    plugin: basic | ||
    username: fedoraAdmin | ||
    password: fedoraAdmin | ||
  data_parser_plugin: authenticated_xml | ||
  item_selector: '/foxml:digitalObject' | ||
  constants: | ||
    destination_directory: 'fedora://masters' | ||
    mimetype: application/xml | ||
    extension: xml | ||
    dsid: AUDIT | ||
    creator_uid: 1 | ||
  fields: | ||
    - | ||
      name: PID | ||
      label: PID | ||
      selector: '@PID' | ||
    - | ||
      name: audit_ds | ||
      label: Audit Datastream | ||
      selector: 'foxml:datastream[@ID = "AUDIT"]/foxml:datastreamVersion/foxml:xmlContent/*' | ||
  ids: | ||
    PID: | ||
      type: string | ||
process: | ||
  digital_id: | ||
    - | ||
      plugin: concat | ||
      delimiter: _ | ||
      source: | ||
        - PID | ||
        - constants/dsid | ||
    - | ||
      plugin: str_replace | ||
      search: ':' | ||
      replace: _ | ||
  filemime: constants/mimetype | ||
  uid: constants/creator_uid | ||
  filename: | ||
    plugin: concat | ||
    delimiter: . | ||
    source: | ||
      - '@digital_id' | ||
      - constants/extension | ||
  destination: | ||
    plugin: concat | ||
    delimiter: / | ||
    source: | ||
      - constants/destination_directory | ||
      - '@filename' | ||
  uri: | ||
    - | ||
      plugin: flatten | ||
      source: | ||
        - '@destination' | ||
        - audit_ds | ||
    - | ||
      plugin: file_blob | ||
destination: | ||
  plugin: 'entity:file' | ||
  default_bundle: file | ||
migration_dependencies: | ||
  required: { } | ||
  optional: { } |
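To make the `fields` selectors and the `process` pipeline in this file easier to follow, here is a minimal Python sketch of what they appear to compute for one object. It is not the module's code: the sample FOXML document, the `audit:auditTrail` payload, and the function names `extract_fields` and `destination_uri` are assumptions for illustration, and the actual file write done by the `file_blob` plugin is left out.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by the selectors in the migration config.
NS = {"foxml": "info:fedora/fedora-system:def/foxml#"}

# Hypothetical, heavily trimmed FOXML document for one object.
SAMPLE_FOXML = """\
<foxml:digitalObject xmlns:foxml="info:fedora/fedora-system:def/foxml#"
                     xmlns:audit="info:fedora/fedora-system:def/audit#"
                     PID="islandora:1">
  <foxml:datastream ID="AUDIT">
    <foxml:datastreamVersion ID="AUDIT.0">
      <foxml:xmlContent>
        <audit:auditTrail/>
      </foxml:xmlContent>
    </foxml:datastreamVersion>
  </foxml:datastream>
</foxml:digitalObject>
"""


def extract_fields(foxml: str) -> dict:
    """Apply the `@PID` and `audit_ds` selectors to one FOXML document."""
    root = ET.fromstring(foxml)  # item_selector: /foxml:digitalObject
    audit_nodes = root.findall(
        "foxml:datastream[@ID='AUDIT']"
        "/foxml:datastreamVersion/foxml:xmlContent/*",
        NS,
    )
    return {
        "PID": root.get("PID"),
        "audit_ds": [ET.tostring(n, encoding="unicode") for n in audit_nodes],
    }


def destination_uri(pid: str, dsid: str = "AUDIT", extension: str = "xml",
                    directory: str = "fedora://masters") -> str:
    """Mirror the digital_id -> filename -> destination steps."""
    digital_id = f"{pid}_{dsid}".replace(":", "_")  # concat, then str_replace
    filename = f"{digital_id}.{extension}"          # concat with "."
    return f"{directory}/{filename}"                # concat with "/"


if __name__ == "__main__":
    fields = extract_fields(SAMPLE_FOXML)
    print(fields["PID"], len(fields["audit_ds"]))
    print(destination_uri(fields["PID"]))
```

Replacing the colon in the PID keeps the generated filename free of a character that is awkward in file paths, which is presumably why the config chains `str_replace` after `concat`.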
87 changes: 87 additions & 0 deletions
...a_migrate_7x_claw_feature/config/install/migrate_plus.migration.islandora_audit_media.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
langcode: en | ||
status: true | ||
dependencies: | ||
  enforced: | ||
    module: | ||
      - migrate_7x_claw | ||
      - migrate_plus | ||
      - islandora | ||
id: islandora_audit_media | ||
class: null | ||
field_plugin_method: null | ||
cck_plugin_method: null | ||
migration_tags: null | ||
migration_group: islandora_7x | ||
label: 'AUDIT Media' | ||
source: | ||
  plugin: islandora | ||
  solr_base_url: 'http://97.107.189.65:8080/solr' | ||
  fedora_base_url: 'http://97.107.189.65:8080/fedora' | ||
  data_fetcher_plugin: http | ||
  authentication: | ||
    plugin: basic | ||
    username: fedoraAdmin | ||
    password: fedoraAdmin | ||
  content_model_field: RELS_EXT_hasModel_uri_ms | ||
  content_model: 'islandora:sp_basic_image' | ||
  data_parser_plugin: authenticated_xml | ||
  item_selector: '/foxml:digitalObject' | ||
  constants: | ||
    destination_directory: 'fedora://masters' | ||
    mimetype: application/xml | ||
    extension: xml | ||
    dsid: AUDIT | ||
    fedora_base_url: 'http://97.107.189.65:8080/fedora' | ||
    creator_uid: 1 | ||
    audit_url: http://islandora.ca/audit-trail | ||
  fields: | ||
    - | ||
      name: PID | ||
      label: PID | ||
      selector: '@PID' | ||
  ids: | ||
    PID: | ||
      type: string | ||
process: | ||
  digital_id: | ||
    - | ||
      plugin: concat | ||
      delimiter: _ | ||
      source: | ||
        - PID | ||
        - constants/dsid | ||
    - | ||
      plugin: str_replace | ||
      search: ':' | ||
      replace: _ | ||
  name: | ||
    plugin: concat | ||
    delimiter: . | ||
    source: | ||
      - '@digital_id' | ||
      - constants/extension | ||
  field_media_use: | ||
    plugin: migration_lookup | ||
    migration: islandora_7x_tags | ||
    source: constants/audit_url | ||
    no_stub: true | ||
  field_media_file: | ||
    plugin: migration_lookup | ||
    migration: islandora_audit_file | ||
    source: PID | ||
    no_stub: true | ||
  field_media_of: | ||
    plugin: migration_lookup | ||
    migration: islandora_objects | ||
    source: PID | ||
    no_stub: true | ||
  uid: constants/creator_uid | ||
destination: | ||
  plugin: 'entity:media' | ||
  default_bundle: file | ||
migration_dependencies: | ||
  required: | ||
    - migrate_plus.migration.islandora_objects | ||
    - migrate_plus.migration.islandora_audit_file | ||
    - migrate_plus.migration.islandora_7x_tags | ||
  optional: { } |
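The `migration_dependencies` block above implies a run order: the media migration can only look up its file, node, and tag once those migrations have run. The Python sketch below derives one valid order from the `required` lists of the two AUDIT migrations shown in this diff. It is illustrative only: the names `GRAPH`, `migration_id`, and `run_order` are made up for the example, it requires Python 3.9+ for `graphlib` and `str.removeprefix`, and the dependencies of `islandora_objects` and `islandora_7x_tags` are not visible in this excerpt, so they are treated as having none.

```python
from graphlib import TopologicalSorter

PREFIX = "migrate_plus.migration."

# `required` entries copied from islandora_audit_media above.
AUDIT_MEDIA_REQUIRED = [
    "migrate_plus.migration.islandora_objects",
    "migrate_plus.migration.islandora_audit_file",
    "migrate_plus.migration.islandora_7x_tags",
]


def migration_id(config_name: str) -> str:
    """Strip the config-name prefix to get the bare migration id."""
    return config_name.removeprefix(PREFIX)


# Maps each migration to the set of migrations it requires.
# Assumption: migrations not listed as keys have no requirements of their own.
GRAPH = {
    "islandora_audit_media": {migration_id(n) for n in AUDIT_MEDIA_REQUIRED},
    "islandora_audit_file": set(),  # required: { }
}


def run_order(graph: dict) -> list:
    """Return the migrations ordered so requirements always come first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in run_order(GRAPH):
        print(name)
```

Because every lookup in `islandora_audit_media` sets `no_stub: true`, a missing prerequisite would leave the field empty rather than create a placeholder entity, so running the migrations in dependency order matters.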