Adding new Vocabularies/Taxonomies as per Metadata Interest Group mappings #35
I am not sure why or where Travis is failing! |
Looks like a network issue. I'll restart it. |
I've tested this PR and it appears to work as expected. Moving on to testing the related PR Islandora/islandora_defaults#5. |
@Natkeeran this needs the new services thing to make Travis start MySQL. See Islandora/documentation#155 (comment) |
@whikloj |
Please remove the UUIDs.
modules/controlled_access_terms_defaults/config/install/taxonomy.vocabulary.country.yml
modules/controlled_access_terms_defaults/config/install/taxonomy.vocabulary.form.yml
modules/controlled_access_terms_defaults/config/install/taxonomy.vocabulary.genre.yml
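The UUID removal requested in this review could be scripted rather than done by hand. The following is a minimal sketch, assuming the exported config files carry a single top-level `uuid:` key and live in one directory; the function names `strip_uuid` and `strip_uuids_in_dir` are hypothetical helpers, not part of Drupal or of this module.

```python
import re
from pathlib import Path

# Matches a top-level `uuid:` key (no indentation) in an exported config file.
# Indented `uuid:` keys nested inside other structures are left alone.
UUID_LINE = re.compile(r"^uuid:\s*\S+\s*$")


def strip_uuid(text: str) -> str:
    """Return the YAML text with any top-level `uuid:` line removed."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not UUID_LINE.match(line)
    ]
    return "".join(kept)


def strip_uuids_in_dir(config_dir: Path) -> list[Path]:
    """Rewrite every *.yml file in config_dir; return the files that changed."""
    changed = []
    for path in sorted(config_dir.glob("*.yml")):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_uuid(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed.append(path)
    return changed


# Illustrative input only; the UUID and field values are placeholders.
sample = (
    "uuid: 00000000-0000-0000-0000-000000000000\n"
    "langcode: en\n"
    "status: true\n"
    "name: Genre\n"
    "vid: genre\n"
)
print(strip_uuid(sample))
```

To apply it to the files named in this review, one would call `strip_uuids_in_dir(Path("modules/controlled_access_terms_defaults/config/install"))` from the repository root and check the resulting diff before committing.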
Considering the Islandora 8 Tech Call earlier today, I would like @rosiel to weigh in on the PR before we merge it. For the genre and language vocabularies it probably makes sense to use the external_uri field instead of the authority_link one. Granted, that field isn't available without the islandora_core_feature module, but we can put the field definition inside of Also, the MIG mapping for "Place Published" simply says it is a reference, but doesn't specify which one. This PR makes a country vocabulary, but should we use the existing geo_location vocabulary instead? Finally, could genre and form be combined into a single vocabulary? |
I assume this has something to do with the magical property of the specific field external_uri to swap out values. However, I have not been able to document this behaviour. Please advise.
So is the entire vocabulary added only if islandora_core_feature is installed? Or is only the field that contains the URI added if islandora_core_feature is installed? Either way, I wonder about this module's re-use value outside of Islandora... I know it's been part of the "selling point" that the modules we make are "not Islandora specific", but they kind of are. I guess part of the problem is that I don't understand what this module does as a module vs the content that its default feature installs. In short, my opinion is that this is a confusing setup.
If you scroll one cell to the right, it says "Import MARC Country vocabulary". In MARC metadata, which a lot of us use, there are two places where the place of publication is entered.
When we refer to places as subjects, we use a different vocabulary, which is (sometimes) reconcilable against something with coordinates. That makes more sense to keep as the Geo_location one. Is this module the one that populates the default values? I can't figure out the mechanism for that.
I'll have to get back to you on that, sorry. |
Side note on module reusability: I do reuse the module for our ArchivesSpace/Drupal integration project. True, we are also using it with Islandora 8, but a repository wouldn't need to. |
@Natkeeran & @rosiel; I'd like to get this settled in advance of a coming Islandora 8.x-1.1 release. @Natkeeran, are you available to pull out those UUIDs and potentially do a few small updates (pending below), or would you rather I fork your branch and issue a new PR? @rosiel, I'm thinking that Genre and Form should be kept separate, as both MODS and Dublin Core (as indicated by the selected predicates) indicate they are two very different things; although I would prefer we change 'Form' to 'Physical Form'. Also, I think that Genre overlaps the existing Resource Types vocabulary (and we are mapping them with the same DC predicate), so it makes sense to me to keep them as a single vocabulary. As for the country codes v. geographic locations, I still prefer them to be combined, as we can associate a term with any controlled vocabulary we want, including multiple vocabularies or none at all. I feel that the fewer terms we duplicate across multiple vocabularies, the more useful our linked data will be. All that stated, I'll approve/merge this if the MIG wants to keep them separate. (Tagging @mbolam & @rtilla1.) |
Genre/Form: Yes, separate, please.
This is the list of the known Vocabularies that might apply to genre/form: https://www.loc.gov/standards/sourcelist/genre-form.html The suggestion to arbitrarily, or based on string matching, merge terms from existing vocabularies is NOT advised. As @rtilla1 and I kind of touched on in our talk, authority work is the maintenance of these vocabularies, and it is careful, painstaking work that declares something to be authoritatively true in a concrete knowledge system. That's what controlled vocabs (i.e. authority files) are - someone making a decision that "this is the way the world is" - these were made in the days when not everybody could say anything about any topic, and these authority files/vocabularies still hold weight as "official" manifestations of institutions. So that's the philosophical side. Here's an example of why this matters: https://www.pbs.org/newshour/politics/gop-reinstates-usage-of-illegal-alien-in-library-of-congress-records So no, none of us get to say "London, as defined as a place by Geonames, is exactly the same as London, defined as a place by Wikidata". Using fewer controlled vocabularies in our predicates is probably a good idea for ease of harvesting, but I don't think using fewer controlled vocabularies in our values gives us any benefit at all. |
Perhaps to clarify, I'm not saying I would arbitrarily or string-match similar terms. Any association between related terms ought to be made by the metadata creators (or, more specifically, someone tasked with approving these relations suggested by metadata creators) after careful review. That stated, any term we create is essentially a local authority record, and until we have a magical 'swap my URI for an external one' feature, every new taxonomy term we add creates a new local authority record. While it is true that "London, as defined as a place by Geonames" is not exactly the same as "London, defined as a place by Wikidata", they are similar and represent to our casual users the same conceptual place. Therefore they could both be associated with our local record. If, however, I create a taxonomy term for both Londons, then I will have two local authority records for London linking back to their respective sources. They will also both come up in our site searches unless you suppress one or both of them, which can lead to a confusing end-user experience. I don't think users want to see multiple terms across various vocabularies that express generally the same concept. I would much rather create a single London local authority record in my Geographic Locations vocabulary that includes links to Geonames and Wikidata. Creating new Drupal vocabularies for Geonames and Wikidata to link to entries in a Geographic Locations vocabulary because they aren't exactly the same, as well as a separate countries vocabulary, strikes me as introducing unnecessarily complicated bloat. Getting back to the Countries list: if I already have a country in my Geographic Locations vocabulary from Geonames and I add a second Countries list for the MARC authority, it will result in two entities on my site, representing two local records that, to most users, also represent the same concept.
OR I can simply add the MARC URI to my existing Geographic Locations terms resulting in a single local record related to both of those other authority record sources. I can still split these all apart if/when I start exporting records. Need the MARC code for a country I linked to from my record? Fine, do a look-up on the term for the URI that matches the MARC URL pattern. Want to know what the LOC says about it while navigating the linked data? Follow the LOC URL. Now, if we get the 'magically swap my local URL for the URL in one of my fields' feature setup, then creating whole vocabularies and terms might make more sense. Until then, I want to limit the number of local records I create. |
Oh, speaking to the LOC illegal-alien issue, that is exactly why we want our local records. We have materials related to many indigenous groups where we don't want the LOC authorized heading to be used when displaying records, although we do want to indicate that we are conceptually talking about the same group. So, our local records use the name the indigenous group prefers, but we also link to the LOC record as well as (potentially) Wikidata or other sources. That is also why we use schema:sameAs. The schema:sameAs predicate allows a broader representation of 'same identity' than other ontologies (the example includes an official website in addition to a Wikipedia page, which can be very different representations of the same 'identity'). |
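The single-local-record approach described above, with a look-up for the URI that matches the MARC URL pattern, might be sketched as follows. This is an illustration under stated assumptions, not how Controlled Access Terms stores data: the `LocalTerm` class, the `same_as` field name, the helper functions, and the URI prefix are all hypothetical, and the example URIs are illustrative values rather than verified links.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed URI prefix for MARC country codes; adjust to whatever pattern
# your own records actually use.
MARC_COUNTRIES_PREFIX = "http://id.loc.gov/vocabulary/countries/"


@dataclass
class LocalTerm:
    """A local authority record linked to zero or more external authorities."""

    label: str  # the locally preferred display name
    same_as: list[str] = field(default_factory=list)  # external authority URIs


def find_uri(term: LocalTerm, prefix: str) -> Optional[str]:
    """Return the first linked URI that starts with `prefix`, or None."""
    for uri in term.same_as:
        if uri.startswith(prefix):
            return uri
    return None


def marc_country_code(term: LocalTerm) -> Optional[str]:
    """Extract the MARC country code from the term's MARC URI, if it has one."""
    uri = find_uri(term, MARC_COUNTRIES_PREFIX)
    if uri is None:
        return None
    return uri[len(MARC_COUNTRIES_PREFIX):].strip("/")


# One local record, several external sources (illustrative URIs).
canada = LocalTerm(
    label="Canada",
    same_as=[
        "https://sws.geonames.org/6251999/",
        "http://www.wikidata.org/entity/Q16",
        "http://id.loc.gov/vocabulary/countries/xxc",
    ],
)

print(marc_country_code(canada))
```

The trade-off is the one debated in this thread: a single record keeps search results and local displays uncluttered, while anything that needs one specific authority has to filter the links by pattern at export time.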
Yesterday in the MIG meeting it was determined that the MARC Country Codes should probably be added as a separate vocabulary. I would like to propose, as a compromise, that we include it as a sub-module. As was noted in the meeting, this PR is impacting a module that is 'required' by default installations of Islandora 8, and I would rather this vocabulary not be. Yes, it is useful for those migrating from Islandora 7 and we should make it easy to include; but not all of us are coming from MARC or MODS and we 'greenfield' users may not need it. We can make a sub-module called |
I kinda like that. However, it even sounds like maybe MARC Country codes should be in Islandora Defaults? It's part of "the default metadata profile", and I'm unclear why we have to add all the taxonomies to support the Repository Item content type here in a different module. Is it because Defaults is a Feature, and Features can't include the Migration steps that allow us to populate content entities (as vocabularies are config but terms are content)? |
This division stems back to when I initially wrote Controlled Access Terms to create common vocabularies shared by Islandora and my ArchivesSpace/Drupal integration. (The initial set was a reflection of what I was importing from ArchivesSpace.) I wanted both projects to be able to use them independently but also coexist without duplicating these common vocabularies; ergo, a separate module both could use. I'm fine with the submodule living in either location. I don't have any plans for supporting MARC country codes in the ArchivesSpace integration project. |
To respond directly to this: no, we don't have any restrictions on what we include in a module that is also a Feature. Indeed, I have a local module dedicated to our migrations that is also a Feature. I rely on Features heavily while developing those migrations. (migrate_islandora_csv is likewise a module/Feature dedicated to migrating content entities.) Also, technically, terms and nodes are both content entities. The Migrate API can theoretically migrate any entity (content OR config), although I haven't seen any examples of config migrations. |
@Natkeeran, in summary, please:
Does that look right, @rosiel? |
I've removed the UUIDs and renamed the form to physical form. Should we move the Country into a submodule in islandora_defaults called islandora_marc_fields or islandora_marc_extended? |
That works if you think there will be several other MARC-specific items. I figured we would call the submodule simply 'marc_countries', or 'islandora_marc_countries' if we want to be insistent on keeping the islandora prefix for all submodules. Also, if you want to remove the countries vocabulary from this PR, we can go ahead and merge it, since an islandora_defaults submodule is a separate PR anyway. |
👍 I want to spin it up again before approval/merge, but that might be tomorrow due to meetings today. |
I've removed the country vocab from here and put it into a submodule in islandora_defaults. This PR should be good to go. |
I found a few minutes today! Looks good. No errors on load.
GitHub Issue: (link)
What does this Pull Request do?
Currently, this PR adds the following vocabularies. It would be easy to test a set of fields in a batch. I plan to add all the straightforward ones first.
How should this be tested?
Interested parties
Tag (@ mention) interested parties or, if unsure, @Islandora-CLAW/committers