TG4 - Vocabularies needed for the Tests and Assertions #172
@ArthurChapman, @Jegelewicz, @tucotuco One of the vocabularies needed for the Tests and Assertions is dwc:continent. Would any of these be a good candidate for building a vocab? It would be very interesting to see what the marine folks think about this. (Gwen @gwemon, Mary-on email) |
I think this could be a really good place to start. Without a clear definition of what should go into this field, we are all creating our own grand partitioning of the Earth. The either/or of Getty/ISO 3166 isn't helping. As long as we recommend both, we will cause problems.
Going back to the example in Darwin Core Continent and Water Body Getty places this locality in:
ISO 3166 places this locality in:
AN = Antarctica
Either is potentially workable, but we need to pick one so that we don't end up with conflicting information. This doesn't mean everyone HAS to use the chosen source, just that everyone understands which vocabulary the aggregators will be using. Personally, I would choose the ISO because it is driven by an international standards group. |
A problem I see is that it is difficult to look at continents in isolation from countries and even lower levels. A lot of the material in our collections is historic, and ISO3166 does not (as far as I know) include historic country names. The Getty TGN, on the other hand, does include historic country names. As far as using continent as a SKOS-based vocabulary as an exemplar for TG4 - it depends on how we look at it. If we are likely to just recommend an external standard (ISO3166 or Getty TGN), then we are not recommending an exemplar for our methodologies. The alternative is that we create our own - and with something like continent, I don't think that makes any sense. There is definite value in being able to reference an external source rather than attempting to develop another system for our own use. We may be better off working with Getty (or ISO) to cater for our separate needs (at the country level). Water Bodies is another, more difficult issue, and it is important that OBIS and Arctos have input into what is adopted. I am not saying continent is not an issue for us, and it needs discussion on how we deal with it; however, I don't think it can be done in isolation from countries. But as an exemplar for TG2 - I think there may be better options. |
Agree - we shouldn't have to create a vocabulary from scratch for continent, so for the purposes of TG2, see #171 |
I have been giving the comment by @Jegelewicz some more thought. Higher Geography is very similar to higher Taxonomy. We don't tell people what Higher Taxonomic Classification to use but we can create a Vocabulary that includes acceptable values at various levels in the hierarchy. Similarly with Higher Geography - we should not dictate that you follow the hierarchy of GETTY TGN or ISO3166. We should, however, have a vocabulary that includes the terms available at, say, continent level. Thus in the example above - both "South America" and "Antarctica" are valid names for continents in both thesauri. So if someone wants to follow TGN and place South Georgia in continent "South America" and someone else wants to follow ISO3166 and place it in "Antarctica" they have a right to do so (as long as they document it). But from a vocabulary point of view, as "continent" both are valid and acceptable values. As far as Darwin Core goes, though, they could make a recommendation that TGN be followed or that ISO3166 be followed for dwc:higherGeography. In the Tests and Assertions - see #139 and #129 - we have handled this by making the tests Parameterized, so that when you run the test you will be asked to add a Parameter - for example TGN or ISO3166, etc. - and the test will then report on records that are not Compliant with that test as Parameterized. |
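For anyone who wants to see what a Parameterized test could look like in practice, here is a rough sketch, not the actual TG2 implementation. The function name `continent_not_compliant`, the parameter keys `EXAMPLE_A` and `EXAMPLE_B`, and the value sets are all hypothetical placeholders; they do not reproduce the contents of TGN or any other thesaurus.

```python
from typing import Dict, List, Set

# Placeholder value sets keyed by the test parameter.
# These are illustrative only and NOT the real contents of any thesaurus.
VOCABULARIES: Dict[str, Set[str]] = {
    "EXAMPLE_A": {"Africa", "Antarctica", "Asia", "Europe",
                  "North America", "Oceania", "South America"},
    "EXAMPLE_B": {"Africa", "Americas", "Antarctica", "Asia",
                  "Europe", "Oceania"},
}


def continent_not_compliant(records: List[dict], source_authority: str) -> List[dict]:
    """Return the records whose continent value is not in the chosen vocabulary.

    The vocabulary is selected by the source_authority parameter, so the same
    test gives different results depending on how it is parameterized.
    Records with an empty or missing continent are skipped, not reported.
    """
    try:
        allowed = VOCABULARIES[source_authority]
    except KeyError:
        raise ValueError(f"unknown source authority: {source_authority!r}")
    return [r for r in records
            if r.get("continent") and r["continent"] not in allowed]


records = [
    {"id": 1, "continent": "South America"},
    {"id": 2, "continent": "Americas"},
    {"id": 3, "continent": ""},
]
flagged_a = continent_not_compliant(records, "EXAMPLE_A")
flagged_b = continent_not_compliant(records, "EXAMPLE_B")
```

The same three records give a different report depending on the parameter, which is the point: the test does not say which vocabulary is right, only whether a record is compliant with the one that was chosen.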
As an exemplar, taxonRank may be a good one (see #170). We have an excellent starting point with the GBIF Vocabulary (http://rs.gbif.org/vocabulary/gbif/rank.xml) for Taxon Rank. This would also fit well with the Tests #162 and #163. We also have the advantage with Taxon Rank that, to some extent, Ranks follow the various codes (but only to some extent). Further comments under #170 |
A couple of points of information on continents, because it may matter for
the decision-making process here.
First, ISO 3666 is not for geography at all, it is for Viscosity of Water -
ISO 3166 is for Countries.
Second, ISO does not provide a standard for continents, nor does it include
them in ISO 3166.
Third, Darwin Core does not mention ISO as a source for continent values -
the recommendation is "Recommended best practice is to use a controlled
vocabulary such as the Getty Thesaurus of Geographic Names." The reference
to ISO Continent codes is from "https://terms.tdwg.org/wiki/dwc:continent",
an independent and no longer maintained commentary on Darwin Core, and is
in error.
Given these points, I think it might still be quite viable to have a
controlled vocabulary for dwc:continent as an exemplar, especially since
the terms already have URIs in TGN.
|
Sorry John, my error - will edit to fix comments |
@ArthurChapman and @baskaufs, there are arguments for (in this issue) and against (#168) using dwc:taxonRank as an exemplar. Can those differences be resolved? If not we are looking at very few candidate terms for exemplar vocabularies if trying to satisfy the condition that it also serve for a TG2 test and assertion. The complete list of terms having tests or assertions in TG2's core list is currently: dc:type |
I'm not opposed in principle to using dwc:taxonRank as an exemplar. I was just afraid that if this task group put work into developing the controlled vocab and then the task group working on TCS 2.0 somehow changed the term, the work would be for naught. However, I suspect that this group will be working at a faster rate than that group, so presumably we would be done with the exemplar vocabulary before TCS 2.0 was finished anyway. If dwc:occurrenceStatus is already a work in progress, it might be a good option. |
Is there a current and agreed-upon Darwin Core field definition list? |
Yes, the definitions found on the Quick Reference Guide are produced directly from the canonical Darwin Core Standard, which is now managed in a single CSV document at https://github.com/tdwg/dwc/blob/master/vocabulary/term_versions.csv. |
Sigh. This is not a very user-friendly format, nor is it very accessible. I will speak for all of the overworked collection managers who don't have time to sift through a bunch of text to figure out what they need to know. I realize that you all have skills and knowledge that I don't have, but I am supposed to be the one using this to make my data better. Can we make this available to Joe Collection Manager in a way he/she can understand and use it? Right now, if I am interested and I google "Darwin Core Terms", the first result is the "independent and no longer maintained commentary on Darwin Core" that is "in error" with no real way to know that is what it is (It looks very official with the TDWG logo and says nothing about being out of date). I would venture to guess that others are relying upon it as well. |
I totally understand, and the fix is in progress. We are trying to release the new version of the Darwin Core web site, which has been two years in the making (volunteer time). The Quick Reference Guide (which will be easier to navigate in the new version) is where we expect Joe Collection Manager to go for the definitions, and from there to link out to commentaries such as the Darwin Core Questions and Answers site (https://github.com/tdwg/dwc-qa/wiki) and to recommended vocabularies when those are figured out. We will no longer link to the MediaWiki site from the Darwin Core definitions. It is definitely a problem if it is not consistent with the standard, and maintaining it is clearly an issue. |
Thanks! I know the volunteer time thing well.... |
Just a note, but in doing other research I downloaded a template for getting data into the Atlas of Living Australia and every field includes a link to a definition at http://rs.tdwg.org/dwc/terms/index.htm |
Excellent. That is exactly the right place to point to. |
Following are the comments regarding building a vocabulary needed for the Tests and Assertions that have been provided to the group.
Arthur Chapman (@ArthurChapman):
I would like to see us develop the simple SKOS-based vocabulary on one of the terms/vocabularies needed for the Tests coming out of Task Group 2 on Tests and Assertions. I think (from memory) there are about 23 tests that rely on a vocabulary. Not all will be simple ones, but if we can pick one, then we solve several problems at the same time.
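To give a feel for how small a simple SKOS-based vocabulary can be, here is a minimal sketch that writes one out as Turtle using only the Python standard library. The base URI `http://example.org/continent/`, the scheme name, and the helper `vocabulary_to_turtle` are assumptions made for illustration; a real vocabulary would use agreed identifiers and would probably be built with an RDF library.

```python
from typing import Iterable

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def vocabulary_to_turtle(base: str, scheme_name: str, labels: Iterable[str]) -> str:
    """Serialize a flat list of labels as a SKOS concept scheme in Turtle.

    Each label becomes one skos:Concept with an English prefLabel.
    The concept URI is the base plus the label with spaces removed.
    """
    scheme = f"{base}{scheme_name}"
    parts = [
        f"@prefix skos: <{SKOS_NS}> .",
        "",
        f"<{scheme}> a skos:ConceptScheme .",
    ]
    for label in labels:
        local_name = label.replace(" ", "")
        parts.append("")
        parts.append(f"<{base}{local_name}> a skos:Concept ;")
        parts.append(f'    skos:prefLabel "{label}"@en ;')
        parts.append(f"    skos:inScheme <{scheme}> .")
    return "\n".join(parts) + "\n"


# Hypothetical base URI and a two-term example, for illustration only.
turtle = vocabulary_to_turtle(
    "http://example.org/continent/", "scheme", ["South America", "Antarctica"]
)
print(turtle)
```

A flat list like this is enough for a test that only needs to check whether a value is in the vocabulary. Hierarchy, alternative labels, and links to external thesauri could be layered on later without changing the basic shape.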