Using migrate plus to import xml as nodes #819
Nothing better than going to make a ticket only to find @Natkeeran has already done so, and with more info/detail than I could. 🤗 |
Ok, so looking at sandbox's MODS form for basic image, the only thing that allows multiple nested entries is |
This times ten. |
@Natkeeran So how do I actually run the migration in https://github.com/Natkeeran/islandora_migrate_mods? All I get is a button to "Add a migration" that whitescreens. Am I missing something? |
@dannylamb |
@Natkeeran Seems to have installed properly. I'm getting this error from the Drupal logs:
Ever run into that? |
Different Drupal core versions? @Natkeeran what version are you running? I guess Danny is on 8.5.0 right? |
@DiegoPino Yeah, about time to do the version check dance. @Natkeeran I'm on 8.4.5 for Drupal. |
Not sure what you mean by Add a migration. Please go to localhost:8000/admin/structure/migrate/manage/islandora_mods/migrations, and you will see the available migration there. |
@Natkeeran Yep, I can see it. But I've got an action button above the migration that whitescreens. I guess the How are you kicking off the migration? From the README in the migration example module, it says you have to use drush to run the migration, but I have no migrate commands when I check what's available with drush. The drupal console does appear to have some functionality, though it seems to be geared towards migrating from an earlier Drupal? |
I've been playing with the Migrate API a lot lately and have used the drush commands from migrate_tools v.4 almost exclusively. I would double-check your drush and migrate_tools install. |
@Natkeeran @dannylamb @seth-shaw-unlv @whikloj @jonathangreen @mjordan I just had an idea. What if we migrate from Solr? |
After lots of exploration, yes, this will do just fine. We can even stage multiple migrations that are interdependent, and the I've tested by using XML files on the filesystem, so if we want to pull from an actual Islandora 7.x site we'll need a source plugin (probably Solr, as @DiegoPino is suggesting) that will get us the list to migrate, and then we can start requesting individual datastreams using |
Islandora REST provides a list of all datastreams on an object:
{
"pid":"alping:756",
"label":"Mt. [Mount] Baker ice school, July 15, 1951",
"owner":"admin",
"models":[
"islandora:sp_large_image_cmodel",
"fedora-system:FedoraObject-3.0"
],
"state":"A",
"created":"2016-06-07T14:09:40.056Z",
"modified":"2016-06-07T17:44:02.068Z",
"datastreams":[
{
"dsid":"RELS-EXT",
"label":"Fedora Object to Object Relationship Metadata.",
"state":"A",
"size":553,
"mimeType":"application\/rdf+xml",
"controlGroup":"X",
"created":"2016-06-07T14:09:40.056Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"MODS",
"label":"MODS Record",
"state":"A",
"size":4561,
"mimeType":"application\/xml",
"controlGroup":"M",
"created":"2016-06-07T14:09:40.056Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"DC",
"label":"DC Record",
"state":"A",
"size":2117,
"mimeType":"application\/xml",
"controlGroup":"M",
"created":"2016-06-07T14:09:40.056Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"OBJ",
"label":"OBJ Datastream",
"state":"A",
"size":1651496,
"mimeType":"image\/jp2",
"controlGroup":"M",
"created":"2016-06-07T14:09:40.056Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"TECHMD",
"label":"TECHMD",
"state":"A",
"size":6725,
"mimeType":"application\/xml",
"controlGroup":"M",
"created":"2016-06-07T17:43:47.724Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"TN",
"label":"Thumbnail",
"state":"A",
"size":5527,
"mimeType":"image\/jpeg",
"controlGroup":"M",
"created":"2016-06-07T17:43:52.991Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"JPG",
"label":"Medium sized JPEG",
"state":"A",
"size":33129,
"mimeType":"image\/jpeg",
"controlGroup":"M",
"created":"2016-06-07T17:43:59.265Z",
"versionable":true,
"versions":[
]
},
{
"dsid":"JP2",
"label":"JPEG 2000",
"state":"A",
"size":1651496,
"mimeType":"image\/jp2",
"controlGroup":"M",
"created":"2016-06-07T17:44:02.068Z",
"versionable":true,
"versions":[
]
}
]
}
Even custom datastreams are included in this list, so we wouldn't need to rely on content models to determine the list of datastreams. If we want a list of objects to migrate, an option would be to use OAI-PMH to get the objects. |
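For illustration only, here is a minimal Python sketch of how a migration source might pull the datastream list out of a response like the one above. The sample is a trimmed copy of that JSON, and `list_datastreams` is a hypothetical helper name, not part of Islandora REST or the Migrate API.

```python
import json

# Trimmed copy of the Islandora REST object response shown above.
SAMPLE_RESPONSE = """
{
  "pid": "alping:756",
  "models": ["islandora:sp_large_image_cmodel", "fedora-system:FedoraObject-3.0"],
  "datastreams": [
    {"dsid": "RELS-EXT", "mimeType": "application/rdf+xml", "size": 553},
    {"dsid": "MODS", "mimeType": "application/xml", "size": 4561},
    {"dsid": "OBJ", "mimeType": "image/jp2", "size": 1651496}
  ]
}
"""


def list_datastreams(response_text):
    """Return (dsid, mimeType) pairs for every datastream on the object.

    Hypothetical helper: it reads the list straight from the response,
    so custom datastreams are picked up without consulting content models.
    """
    obj = json.loads(response_text)
    return [(ds["dsid"], ds["mimeType"]) for ds in obj.get("datastreams", [])]


if __name__ == "__main__":
    for dsid, mime in list_datastreams(SAMPLE_RESPONSE):
        print(dsid, mime)
```

Because the list comes from the object itself, the same loop would work for any content model, which is the point made above.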
Being slow to this party, you might have covered this but in the example code I read this.
I'm wondering if we could export this mapping table after the fact, as we will need a way to redirect users from the old PID URIs to the new Drupal URIs. |
@whikloj I don't know if you can programmatically, but you certainly could run a SQL query against the database to get it. I used SQL queries against the migration mapping and message tables several times while troubleshooting my migration development. |
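A rough sketch of that SQL approach, in Python for illustration. It assumes the default SQL ID map, which (as far as I know) stores rows in a table named `migrate_map_<migration id>` with `sourceid1` and `destid1` columns, and it assumes the source ID is the PID and the destination ID is a node ID. An in-memory SQLite table stands in for the real Drupal database, and the second PID is made up.

```python
import re
import sqlite3


def build_redirects(conn, migration_id):
    """Map old 7.x object paths to new node paths from a migrate map table.

    Assumption: sourceid1 holds the PID and destid1 the Drupal node ID.
    """
    # Table names cannot be bound as parameters, so validate the ID first.
    if not re.fullmatch(r"[a-z0-9_]+", migration_id):
        raise ValueError("unexpected migration id: %r" % migration_id)
    table = f"migrate_map_{migration_id}"
    rows = conn.execute(
        f"SELECT sourceid1, destid1 FROM {table} "
        "WHERE destid1 IS NOT NULL ORDER BY sourceid1"
    ).fetchall()
    return {f"/islandora/object/{pid}": f"/node/{nid}" for pid, nid in rows}


def demo_connection():
    """Build a stand-in map table; a real site would query its Drupal DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE migrate_map_import_mods (sourceid1 TEXT, destid1 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO migrate_map_import_mods VALUES (?, ?)",
        [("alping:756", 12), ("alping:757", None)],  # second row: not yet imported
    )
    return conn


if __name__ == "__main__":
    for old, new in build_redirects(demo_connection(), "import_mods").items():
        print(old, "->", new)
```

Rows without a destination ID are skipped, since there is nothing to redirect to until the item has actually been imported.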
Oooooo we can make our own ID mapping code. So we could store it wherever we want, like a Redis cache or a text file. https://cgit.drupalcode.org/drupal/tree/core/modules/migrate/src/Plugin/MigrateIdMapInterface.php |
This is a work-in-progress, but it does query a remote Solr instance for the PIDs of items of a specific content-model and then use a modified XML data fetcher to grab the objectXML straight from Fedora. |
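To make the two steps concrete, here is a hedged Python sketch of the URLs involved; it is not the plugin code itself. It assumes the Solr index exposes content models in a `RELS_EXT_hasModel_uri_ms` field and PIDs in a `PID` field (common in 7.x installs, but check your own schema), and that Fedora 3 serves object XML at `/objects/{pid}/objectXML`. The base URLs and both function names are placeholders.

```python
from urllib.parse import quote, urlencode


def solr_pid_query_url(solr_base, content_model, rows=100, start=0,
                       model_field="RELS_EXT_hasModel_uri_ms"):
    """Build a Solr select URL listing PIDs for one content model.

    Assumption: model_field and the PID field match the site's Solr schema.
    """
    params = {
        "q": f'{model_field}:"info:fedora/{content_model}"',
        "fl": "PID",
        "rows": rows,
        "start": start,  # page through results by increasing start
        "wt": "json",
    }
    return f"{solr_base.rstrip('/')}/select?{urlencode(params)}"


def object_xml_url(fedora_base, pid):
    """Build the Fedora 3 URL for an object's XML (needs API-M access)."""
    return f"{fedora_base.rstrip('/')}/objects/{quote(pid, safe=':')}/objectXML"


if __name__ == "__main__":
    # Placeholder hosts; substitute the real Solr and Fedora endpoints.
    print(solr_pid_query_url("http://localhost:8080/solr",
                             "islandora:sp_large_image_cmodel"))
    print(object_xml_url("http://10.0.2.2:8080/fedora", "alping:756"))
```

The first URL yields the list of PIDs to migrate; each PID is then fed to the second URL, which is where the authenticated fetcher comes in.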
@whikloj this is awesome, but what if a site has its Solr and Fedora firewalled off (like we do)? Ima use your code as the basis for similar functionality via the 7.x REST module. |
The more the merrier! |
Maybe at next week's CLAW call we can focus on migrations? @dannylamb any objections? |
@mjordan Depends on the firewall: if you can't access one machine from the other, then obviously there is nothing you can do. The HTTP data fetcher plugin allows for authentication, and I am using that for accessing Fedora (as you need API-M access to get the objectXML). What I have determined here is that I could be re-using the data fetcher plugin so long as Solr doesn't need different credentials. |
Basically right now I am testing this by running a 7.x vagrant and a CLAW playbook on my laptop and having one harvest the other. Hence the 10.0.2.2 Fedora URL. |
You can run both vagrants at the same time? I'm jealous. 💚 |
@whikloj I got a simple migration working using D8's built-in JSON source plugin. It requires installing the REST module on the source 7.x. I've put the configuration up at https://github.com/mjordan/7x_claw_migration_over_REST. |
@mjordan No objections to focusing on migration at CLAW calls. It seems to be the next big frontier for us. |
Do we consider this ticket as closed? I think we have established that this is a viable migration framework and while there is a lot of work to be done, short of writing your own 7.x module to use the new CLAW REST endpoints and push to them (which is also a viable solution) this is the path. |
On the March 21, 2018 CLAW Call some people wondered if it would be possible to import MODS xml into Drupal. This ticket is to explore this further.
Importing XML (e.g., MODS) into Drupal 8 is a reasonably straightforward process. Please see this module here for an example: islandora_migrate_mods. The XML parser supports XPaths. Example: https://github.com/Natkeeran/islandora_migrate_mods/blob/master/config/install/migrate_plus.migration.import_mods.yml#L16.
In that example, MODS is combined into one data file. However, multiple XML files can be imported as well, without combining.
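As a language-neutral illustration of the XPath-style field selection (the actual migration is configured in YAML, not Python), here is a small sketch that reads a combined MODS file and pulls one value per field. The sample records, the chosen fields, and `extract_records` are all made up for the example.

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace, bound to a prefix so the path expressions can use it.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical combined data file holding two MODS records.
SAMPLE_MODS = """<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <titleInfo><title>Sample title one</title></titleInfo>
    <identifier type="local">sample:1</identifier>
  </mods>
  <mods>
    <titleInfo><title>Sample title two</title></titleInfo>
    <identifier type="local">sample:2</identifier>
  </mods>
</modsCollection>"""


def extract_records(xml_text):
    """Return one dict per <mods> record, selecting fields by path."""
    root = ET.fromstring(xml_text)
    records = []
    # The item selector: one migration row per <mods> element.
    for mods in root.findall("mods:mods", MODS_NS):
        records.append({
            "title": mods.findtext("mods:titleInfo/mods:title",
                                   default="", namespaces=MODS_NS),
            "identifier": mods.findtext("mods:identifier",
                                        default="", namespaces=MODS_NS),
        })
    return records


if __name__ == "__main__":
    for record in extract_records(SAMPLE_MODS):
        print(record)
```

The same split applies in the migration config: one selector picks out each record, and a per-field path picks out each value within it.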
Drupal Migrate has a UI and is quite flexible. We can import files, inline entity nodes, etc. as well.
Once in Drupal, we can create tooling to generate MODS using Twig templates.