Initial OAI-PMH Endpoint (issues 498 & 1192) #4
@seth-shaw-unlv I'll review and test this.
Works as advertised, although I hit a snag when I tried creating new sets (see below). Here's the output from http://localhost:8000/oai/request?verb=ListRecords&metadataPrefix=oai_dc:
<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd" name="OAI-PMH">
<responseDate>2019-09-05T13:56:46Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc">http://localhost:8000/oai/request</request>
<ListRecords>
<resumptionToken/>
<record>
<header>
<identifier>oai:localhost:node-6</identifier>
<datestamp>2019-08-23T14:13:08Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Greenwhich.jpg</dc:title>
<dc:description>Object ingested by: mark@mark-ThinkPad-X1-Carbon-6th.
Basic technical metadata: /home/mark/Pictures/pics/Greenwhich.jpg: JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=10, description= , manufacturer=Canon, model=Canon PowerShot G12, orientation=upper-left, xresolution=204, yresolution=212, resolutionunit=2, datetime=2013:07:13 06:56:08], baseline, precision 8, 3648x2432, frames 3
SHA256 hash: a1c9c00867ce71bcc63ef7e715b1deadaf7251f43ec6f4f92f02985345ec4a60</dc:description>
<dc:extent>1 item</dc:extent>
<dc:identifier>/home/mark/Pictures/pics/Greenwhich.jpg</dc:identifier>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>oai:localhost:node-7</identifier>
<datestamp>2019-09-02T19:12:47Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>My test Islandora object</dc:title>
<dc:description>Object ingested by: mark@mark-ThinkPad-X1-Carbon-6th.
Basic technical metadata: /home/mark/Downloads/replace_media.png: PNG image data, 833 x 437, 8-bit/color RGBA, non-interlaced
SHA256 hash: 048caaecfd9abab788afd0becdf657b8bb66ab74381b38e0aadf33b6d5a814bc</dc:description>
<dc:extent>1 item</dc:extent>
<dc:identifier>/home/mark/Downloads/replace_media.png</dc:identifier>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>oai:localhost:node-9</identifier>
<datestamp>2019-09-05T13:49:13Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Small boats in Havana Harbour</dc:title>
<dc:description>Taken on vacation in Cuba.</dc:description>
<dc:extent>1 item</dc:extent>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>oai:localhost:node-10</identifier>
<datestamp>2019-09-05T13:49:22Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Manhatten Island</dc:title>
<dc:description>Taken from the ferry from downtown New York to Highlands, NJ. Weather was windy.</dc:description>
<dc:extent>1 item</dc:extent>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>oai:localhost:node-12</identifier>
<datestamp>2019-09-05T13:49:33Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Amsterdam waterfront</dc:title>
<dc:description>Amsterdam waterfront on an overcast day.</dc:description>
<dc:extent>1 item</dc:extent>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>oai:localhost:node-13</identifier>
<datestamp>2019-09-05T13:49:42Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Alcatraz Island</dc:title>
<dc:description>Taken from Fisherman's Wharf, San Francisco.</dc:description>
<dc:extent>1 item</dc:extent>
</oai_dc:dc>
</metadata>
</record>
</ListRecords>
</OAI-PMH>
Nice! Only glitch I ran into, and it might be a PEBCAK error, is I can't get sets to work. I've added an Entity Reference display to an existing View (Taxonomy Term), checked it in the OAI-PMH "What to expose to OAI-PMH" settings, and rebuilt my OAI-PMH index. ListSets is coming up empty:
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd" name="OAI-PMH"><responseDate>2019-09-05T14:17:27Z</responseDate><request verb="ListSets">http://localhost:8000/oai/request</request><ListSets/></OAI-PMH>
When I uncheck my new set in the "What to expose" list and rebuild, ListSets is returning the default set:
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd" name="OAI-PMH"><responseDate>2019-09-05T14:25:48Z</responseDate><request verb="ListSets">http://localhost:8000/oai/request</request><ListSets><set><setSpec>oai_pmh:all_repository_items</setSpec><setName>All Repository Items</setName></set></ListSets></OAI-PMH>
Am I missing something?
As far as the sets not showing: it looks like you're using a taxonomy vocab as a contextual filter. I'm guessing you'd want each term in the vocab to be treated as a set? And that you have an RDF mapping (or some other form of metadata mapping) for the vocab? I'm not 100% sure if entity types other than nodes are available for the sets feature. I'll look into that tomorrow.
@joecorall it occurred to me that the contextual filter might be the problem here too, since it's essentially empty. I'll try with a specific view and report back.
Removing the contextual filter and replacing it with a specific filter using a taxonomy term makes the set show up in the OAI-PMH settings. So based on my testing, filters can't be contextual. Now that I've got my view as a set, I'm seeing some unexpected behavior with the contents of the OAI requests, but at this point I'm still narrowing down what's going on. I'll continue over the weekend and report back here.
Got sets persisting:
<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd" name="OAI-PMH">
<responseDate>2019-09-08T21:17:25Z</responseDate>
<request verb="ListSets">http://localhost:8000/oai/request</request>
<ListSets>
<set>
<setSpec>oai_pmh:all_repository_items</setSpec>
<setName>All Repository Items</setName>
</set>
</ListSets>
<ListSets>
<set>
<setSpec>taxonomy_term:entity_reference_1</setSpec>
<setName>Entity Reference</setName>
</set>
</ListSets>
</OAI-PMH>
But a request for a set returns items that aren't in that set. For example, this request (there are 2 nodes in the "taxonomy_term:entity_reference_1" view display):
Returns results that contain records not in that set, e.g.:
<record>
<header>
<identifier>oai:localhost:node-13</identifier>
<datestamp>2019-09-05T13:49:42Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
</record>
Records that should be in the set are:
<record>
<header>
<identifier>oai:localhost:node-12</identifier>
<datestamp>2019-09-07T17:32:02Z</datestamp>
<setSpec>taxonomy_term:entity_reference_1</setSpec>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header>
</record>
Rebuilding my OAI index has no effect on this.
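For anyone reproducing this, the membership check can be scripted. The following is only a sketch: the function name `records_outside_set` and the trimmed sample response are assumptions made for illustration, built from the two headers quoted in this comment. It lists the identifiers of records whose header does not carry the requested setSpec.

```python
import xml.etree.ElementTree as ET

# Namespace used by every OAI-PMH 2.0 response element.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def records_outside_set(response_xml, set_spec):
    """Return identifiers of records whose header lacks the given setSpec."""
    root = ET.fromstring(response_xml)
    outside = []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        specs = [el.text for el in header.findall("oai:setSpec", OAI_NS)]
        if set_spec not in specs:
            outside.append(header.findtext("oai:identifier", namespaces=OAI_NS))
    return outside


# Trimmed sample assembled from the headers quoted in this thread.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
<ListRecords>
<record><header>
<identifier>oai:localhost:node-13</identifier>
<datestamp>2019-09-05T13:49:42Z</datestamp>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header></record>
<record><header>
<identifier>oai:localhost:node-12</identifier>
<datestamp>2019-09-07T17:32:02Z</datestamp>
<setSpec>taxonomy_term:entity_reference_1</setSpec>
<setSpec>oai_pmh:all_repository_items</setSpec>
</header></record>
</ListRecords>
</OAI-PMH>"""

stray = records_outside_set(SAMPLE, "taxonomy_term:entity_reference_1")
print(stray)
```

An empty list would mean every returned record belongs to the requested set.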
I'll see what I can do.
Of course, I say that 4 hours ago and only now get around to it...
@mjordan Can I get your new view config so I can test it locally? Thanks.
@seth-shaw-unlv here is my dumped config. It's for the "Taxonomy Term" View, in particular the "Entity Reference" display.
Hmm... the view isn't pulling up any of my taxonomy terms. Does the view's preview pane work on yours, @mjordan? I'm wondering what is different.
Yes, for that display the preview shows the expected nodes.
@seth-shaw-unlv I added my own terms to the vocabulary that the view is using, so you won't have them on your VM. You should be able to adjust it to use any term however.
@mjordan, I noticed that. Adding a term actually has the effect of displaying the node several (4) times over. Do you have repeating results? FWIW I wouldn't use the taxonomy_term view as my base. I would use the content view as my base and add the filters to that.
Okay, it was repeated for every taxonomy term it had associated with it (Islandora Model, Islandora Access, Linked Agent, and Subject). Removing one of the node's associated terms dropped the number of times it appeared.
Did you get it to work as an OAI set?
No, I was just trying to get the view working so I could debug the rest. In any case, I do see the bug you mentioned. I'm wondering if there was a regression in the rest_oai_pmh module since I submitted the PR, because I know this worked before. I will dig into this some more.
@mjordan the OAI URL appeared to be incorrect. Try |
According to the docs, we use the 'set' argument with the value of a 'setSpec'. |
So, really, this 'works as designed' rather than a bug. |
Using |
'setSpec' is the name of the value, but for an HTTP ListRecords query, the argument is 'set'. |
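To make the distinction concrete, here is a minimal sketch of how such a request could be assembled. It is not the exact URL from the comments above; the helper name `list_records_url` is made up for illustration, and the endpoint and setSpec value are the ones quoted elsewhere in this thread. The query argument is `set`, and its value is a setSpec.

```python
from urllib.parse import urlencode

# Endpoint used throughout this thread (local test instance).
BASE_URL = "http://localhost:8000/oai/request"


def list_records_url(set_spec, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build a ListRecords URL limited to one set.

    The HTTP argument is named `set`; `setSpec` is only the name of the
    value (and of the element in the response header).
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    # safe=":" keeps the colon in the setSpec readable instead of %3A.
    return f"{base_url}?{urlencode(params, safe=':')}"


url = list_records_url("taxonomy_term:entity_reference_1")
print(url)
```

Passing `setSpec=...` as the argument name instead would not be recognised as a set filter, which matches the behaviour reported earlier in the thread.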
Wow, which argument is correct is pretty unclear, particularly given the lack of examples. Approving and merging. Sorry for the rabbit hole.
@mjordan, no worries! Better to run down the hole. I'd rather be safe than sorry! |
Nice work @joecorall and @seth-shaw-unlv on this very important feature! |
GitHub Issue: Islandora/documentation#1192
This PR represents the 'short term' solution described in the issue thread.
What does this Pull Request do?
Adds an islandora_oaipmh submodule using rest_oai_pmh to enable an OAI-PMH endpoint. Includes a README providing details about how it works.
What's new?
(ie. Regeneration activity, etc.)? No.
How should this be tested?
composer require drupal/rest_oai_pmh
drush en -y islandora_oaipmh
http://localhost:8000/oai/request?verb=ListRecords&metadataPrefix=oai_dc
Bonus:
Additional Notes:
The linked agents field complicates the mapping to Dublin Core. This module makes some assumptions in islandora_oaipmh_preprocess_rest_oai_pmh_record() (in islandora_oaipmh.module) about which MARC relators are creators and assumes the rest are contributors. If someone could specifically review that list (e.g. @rosiel), that would be great.
@joecorall indicated the rest_oai_pmh module is still in alpha pending handling of deleted content. I personally don't think that should stop us from making it available. Although, I should probably add a note about its experimental status in the README... I can also close the request if we think this is too soon.
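For reviewers of the relator list, the creator/contributor split described in the first note can be pictured roughly as follows. This is a language-neutral sketch in Python, not the module's PHP: the relator set `CREATOR_RELATORS`, the bare-code format, and the function name `split_agents` are all assumptions, and the real list in islandora_oaipmh_preprocess_rest_oai_pmh_record() may differ.

```python
# Hypothetical set of MARC relator codes treated as creators.
# The module's actual list may be longer or different.
CREATOR_RELATORS = {"aut", "cre"}


def split_agents(linked_agents):
    """Sort (relator_code, name) pairs into creator and contributor lists.

    Any relator not listed as a creator falls through to contributor,
    mirroring the "assume the rest are contributors" rule.
    """
    creators, contributors = [], []
    for relator, name in linked_agents:
        target = creators if relator in CREATOR_RELATORS else contributors
        target.append(name)
    return {"creator": creators, "contributor": contributors}


agents = [("aut", "Author A"), ("pht", "Photographer B"), ("cre", "Creator C")]
mapped = split_agents(agents)
print(mapped)
```

The point of the default branch is that an unlisted relator is never dropped; it simply lands in dc:contributor.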
Interested parties
@Islandora-CLAW/committers