TG2-VALIDATION_GEOGRAPHY_STANDARD #139

Closed
chicoreus opened this issue Feb 7, 2018 · 27 comments
Labels
Conformance DO NOT IMPLEMENT A potential test that it is not recommended be implemented Parameterized Test requires a parameter SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 Validation VOCABULARY

Comments

@chicoreus
Collaborator

chicoreus commented Feb 7, 2018

| TestField | Value |
| --- | --- |
| GUID | 9d6f53c0-775b-4579-b7a4-5e5f093aa512 |
| Label | VALIDATION_GEOGRAPHY_STANDARD |
| Description | Can the individual values of the terms dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality be unambiguously resolved from bdq:sourceAuthority? |
| TestType | Validation |
| Darwin Core Class | Location |
| Information Elements ActedUpon | dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if all of the terms dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality are bdq:Empty; COMPLIANT if the individual values of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality are unambiguously resolved in the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | GEOGRAPHY_STANDARD |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" [https://www.getty.edu/research/tools/vocabularies/tgn/index.html] |
| Specification Last Updated | 2023-09-22 |
| Examples | [dwc:continent="Oceania", dwc:country="Australia", dwc:stateProvince="Victoria": Response.status=COMPLIANT, response.result="", Response.comment="The geographic terms are in agreement with the source authority"] |
| | [dwc:continent="", dwc:country="Australia", dwc:stateProvince="Virgina": Response.status=NOT_COMPLIANT, response.result="", Response.comment="The geographic terms are ambiguous in the source authority"] |
| Source | VertNet, Kurator |
| References | |
| Example Implementations (Mechanisms) | Kurator |
| Link to Specification Source Code | https://github.com/VertNet/toolkit, https://github.com/kurator-org/kurator-validation/blob/master/packages/kurator_dwca/workflows/dwca_geography_assessor.yaml |
| Notes | This test only checks whether the values of the Information Elements can be found in the Source Authority. Test #95 tests the combination of these values. Only the administrative terms are considered. Terms for waterBody and islands are not included in this test. |
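
As a rough illustration of the Expected Response above, the following Python sketch checks each non-empty term on its own against a source authority. The `AUTHORITY` dictionary is a hypothetical in-memory stand-in for bdq:sourceAuthority (it is not the Getty TGN interface), and the function name, the match counts, and the reduction of the response to a bare status string are all assumptions made for this example.

```python
from typing import Mapping, Optional

# Darwin Core terms acted upon by the test (namespace prefix omitted).
TERMS = ("continent", "country", "countryCode",
         "stateProvince", "county", "municipality")

# Hypothetical stand-in for bdq:sourceAuthority: for each term, a map from
# a value to the number of distinct authority entries matching that value.
AUTHORITY = {
    "continent": {"Oceania": 1},
    "country": {"Australia": 1},
    "stateProvince": {"Victoria": 1, "WA": 2},
}


def validation_geography_standard(
    record: Mapping[str, Optional[str]],
    authority: Optional[Mapping[str, Mapping[str, int]]],
) -> str:
    """Return a response status for one record (illustrative only)."""
    if authority is None:
        # The source authority could not be consulted.
        return "EXTERNAL_PREREQUISITES_NOT_MET"

    # Keep only the terms that carry a non-empty value.
    supplied = {}
    for term in TERMS:
        value = (record.get(term) or "").strip()
        if value:
            supplied[term] = value
    if not supplied:
        return "INTERNAL_PREREQUISITES_NOT_MET"

    # Each supplied value must match exactly one entry for its own term.
    for term, value in supplied.items():
        if authority.get(term, {}).get(value, 0) != 1:
            return "NOT_COMPLIANT"
    return "COMPLIANT"
```

With this toy authority, the first example in the table evaluates to COMPLIANT and the second to NOT_COMPLIANT, because "Virgina" is not listed as a dwc:stateProvince value.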
@chicoreus
Collaborator Author

The discussion in #118 suggests the need for this validation. Added here for discussion.

@ArthurChapman
Collaborator

@chicoreus - does this mean that all higher geography terms need to be unambiguously resolvable? What if you only have data down to country and nothing lower?

@Tasilee Tasilee added Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT VOCABULARY and removed NEEDS WORK labels Mar 23, 2018
@ArthurChapman ArthurChapman added the Parameterized Test requires a parameter label Aug 29, 2018
@Tasilee Tasilee added Test and removed Test labels Sep 25, 2018
@ArthurChapman
Collaborator

@chicoreus - the question has not been answered. Is it COMPLIANT if some of the terms can be unambiguously resolved, or do all of them have to be unambiguously resolvable?

@tucotuco
Member

I feel strongly that this test must be that the entire higher geography combination is unambiguous, as it is stated. It is not really functional to do a piece-wise validation of the geographic terms and posit compliance. If you want to go the route of checking to, for example, country, I believe this test would have to change to add another parameter to say how many levels down you want to be checked. Even so, the test would not check continent, then country, then stateProvince independently - it is only the combination that makes sense to be compliant.

@Tasilee
Collaborator

Tasilee commented May 12, 2019

@tucotuco - Are you saying that the test would return NOT_COMPLIANT if the non-EMPTY terms among dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county and dwc:municipality resulted in ambiguity? If so, it raises an interesting point: does the ambiguity need to be reported?

@tucotuco
Member

The test isn't about ambiguity; it is about being non-standard against a given authority. This is a NOTSTANDARD test, not an INCONSISTENT test. In NOTSTANDARD tests we don't say why it is not standard. Indeed, we don't even make that the job of consistency tests (e.g., #67) except to test for something very specific (e.g., #62).

@Tasilee
Collaborator

Tasilee commented May 15, 2019

@tucotuco ok. tare

@Tasilee Tasilee changed the title from TG2-VALIDATION_GEOGRAPHY_NOTSTANDARD to TG2-VALIDATION_GEOGRAPHY_STANDARD Mar 22, 2022
@ArthurChapman
Collaborator

I just noticed that the wording of this test is virtually identical to #95 - they appear to be duplicates!

#139
COMPLIANT if the combination of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality can be unambiguously resolved from the bdq:sourceAuthority;

#95
COMPLIANT if the combination of values of administrative geographic terms (dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality) can be unambiguously resolved by the bdq:sourceAuthority;

@Tasilee
Collaborator

Tasilee commented Mar 25, 2022

I was going to use "administrative geographic terms" in the Description of one of these tests and decided to just be explicit, as we have opted for elsewhere.

@ArthurChapman
Collaborator

But are the tests the same, and can one be deleted?

@Tasilee
Collaborator

Tasilee commented Mar 25, 2022

I think you are right. They are close enough.

@ArthurChapman
Collaborator

@tucotuco Can you check those two tests and see if there is any reason we have two tests (#139 and #95)?

@Tasilee
Collaborator

Tasilee commented Mar 27, 2022

#139: COMPLIANT if the combination of [values] of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality can be unambiguously resolved from the bdq:sourceAuthority;

#95: COMPLIANT if the combination of values of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality can be unambiguously resolved by the bdq:sourceAuthority

So... they are equal in intent. I suggest we CLOSE #95 and I'll edit this one to add "values of".

Votes please?

@tucotuco
Member

Sorry folks. These two tests are not supposed to be the same. This one is not about ambiguity, while #95 is. The geography can be not standard and unambiguous.

I could see having only one of these two tests, but I think #95 is actually the useful one in that case.

@Tasilee
Collaborator

Tasilee commented Mar 29, 2022

Thanks @tucotuco. Your call. I guess the questions are then

  • Are the potential combinations of STANDARD/NOT_STANDARD | AMBIGUOUS/UNAMBIGUOUS informative?
  • If they are, then the Expected Responses need to more clearly differentiate the tests.
  • If not (which you imply by suggesting that TG2-VALIDATION_GEOGRAPHY_STANDARD #139 may be dropped), fine.

@chicoreus
Collaborator Author

The examples seem informative. For #95, dwc:stateProvince="WA" is standard but ambiguous: other terms would be needed to disambiguate which WA. For #139, dwc:continent="Oceania", dwc:country="Australia", dwc:stateProvince="Virginia" is not standard: there is no Virginia within Oceania:Australia, so no such higher geography exists, while at least two WA higher geographies exist.

The whiteboard diagram from Gainesville on #95 seems to suggest that #95 was being thought of as encompassing both of these at that time.
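
The two examples in this comment can be sketched in code by counting how many higher geographies in an authority agree with the supplied values. This is only a sketch under stated assumptions: `HIGHER_GEOGRAPHIES` is a hypothetical three-entry authority, and the labels returned by `classify` are invented for the illustration. Read loosely, "no match" corresponds to the not-standard case discussed for #139 and "multiple matches" to the ambiguous case discussed for #95.

```python
# Hypothetical authority: each entry is one known higher geography.
HIGHER_GEOGRAPHIES = [
    {"continent": "Oceania", "country": "Australia", "stateProvince": "WA"},
    {"continent": "North America", "country": "United States", "stateProvince": "WA"},
    {"continent": "North America", "country": "United States", "stateProvince": "Virginia"},
]


def matching_geographies(record):
    """Return the authority entries that agree with every non-empty value."""
    supplied = {term: value for term, value in record.items() if value}
    return [
        entry for entry in HIGHER_GEOGRAPHIES
        if all(entry.get(term) == value for term, value in supplied.items())
    ]


def classify(record):
    """Label a record by the number of authority entries it matches."""
    count = len(matching_geographies(record))
    if count == 0:
        return "no match"          # no such higher geography exists
    if count == 1:
        return "unique match"      # resolves to exactly one entry
    return "multiple matches"      # more terms are needed to disambiguate


print(classify({"stateProvince": "WA"}))
print(classify({"continent": "Oceania", "country": "Australia",
                "stateProvince": "Virginia"}))
```

Under this toy authority the first call reports "multiple matches" and the second "no match". Adding dwc:country="Australia" to the WA record narrows it to a unique match.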

@tucotuco
Member

tucotuco commented Mar 29, 2022 via email

@ArthurChapman
Collaborator

Is there a way we can reword one of them to cover both cases? It is obvious that the two tests, which were once distinct, have converged over time. As @chicoreus suggests, the examples are helpful.

@tucotuco
Member

I think they are very different and an attempt should not be made to merge them. The fact that the Expected Responses look the same is an accident. I think the correct Expected Response for this step should be something more akin to:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if all of the terms dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality are EMPTY; COMPLIANT if each of the NOT_EMPTY values of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality is a STANDARD value for that term and is consistent with all of the other values in the listed fields according to the bdq:sourceAuthority; otherwise NOT_COMPLIANT

There are problems inherent in what standard means here that make me think this test is not as useful as #95.

I think I am convincing myself to drop this one.

@Tasilee
Collaborator

Tasilee commented Mar 31, 2022

We need to make a decision on #139 and #95. Hold till the next Zoom or 'bite (bight? :) the bullet'?

@tucotuco
Member

tucotuco commented Apr 2, 2022

I remain fine with omitting #139.

@Tasilee
Collaborator

Tasilee commented Apr 2, 2022

The thumbs up suggests that #139 does not add significantly to #95, but for the deletion of a test, I'd value @chicoreus's 'vote'.

@ArthurChapman
Collaborator

Because of the confusion between this test and #95, I suggest rewording as

INTERNAL_PREREQUISITES_NOT_MET if all of the terms dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality are EMPTY; COMPLIANT if the values of dwc:continent, dwc:country, dwc:countryCode, dwc:stateProvince, dwc:county, dwc:municipality can be individually resolved from the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@ArthurChapman
Collaborator

There has been discussion across Issues #95 and #139 - the wording converged over time such that the two tests appeared to be testing for the same thing. Discussion on Zoom has resulted in separating the two.

#139 is testing individual terms for validity at that level - it looks at only one level in the hierarchy at a time and checks the validity of what is there at that level. Hence the suggested change in wording above.
#95 is testing for inconsistencies between levels in the hierarchy - for example Western Australia (WA) as a State, and USA as a country - i.e. one is wrong and thus ambiguous.

@Tasilee Tasilee removed the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Aug 28, 2022
@Tasilee
Collaborator

Tasilee commented Aug 29, 2022

The Zoom discussion with @ArthurChapman, @tucotuco and @chicoreus today concluded that tests #95, #139 and #118 were going to be very difficult to implement properly, given the lack of a consistent geographic terms hierarchy by comparison with the taxonomic terms. Note the issues arising from the table above, for example. We will therefore remove these tests from CORE.

In their place, we will

  1. Add a test for dwc:stateProvince found, to complement TG2-VALIDATION_COUNTRY_FOUND #21 (which we will rename)
  2. Add a test that the dwc:country dwc:stateProvince combination exists at least once in the bdq:sourceAuthority (country-state/province consistent)
  3. Add a test that the dwc:country dwc:stateProvince combination exists exactly once in the bdq:sourceAuthority (country-state/province unambiguous)
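
The difference between the three proposed replacement tests can be sketched as follows. This is a minimal illustration, not the eventual specification: `AUTHORITY_ENTRIES` is a hypothetical list of (identifier, country, stateProvince) rows, "Exampleland" and "Northshire" are fictional names included only to show a pair that occurs twice, and the function names are invented for the example.

```python
# Hypothetical authority rows: (entry id, country, stateProvince).
AUTHORITY_ENTRIES = [
    (1, "Australia", "Victoria"),
    (2, "United States", "Virginia"),
    (3, "Exampleland", "Northshire"),
    (4, "Exampleland", "Northshire"),  # fictional duplicate pair
]


def state_province_found(state_province):
    """Test 1: the stateProvince value occurs somewhere in the authority."""
    return any(sp == state_province for _, _, sp in AUTHORITY_ENTRIES)


def pair_count(country, state_province):
    """Number of authority entries with this country/stateProvince pair."""
    return sum(
        1 for _, c, sp in AUTHORITY_ENTRIES
        if c == country and sp == state_province
    )


def country_state_consistent(country, state_province):
    """Test 2: the pair exists at least once."""
    return pair_count(country, state_province) >= 1


def country_state_unambiguous(country, state_province):
    """Test 3: the pair exists exactly once."""
    return pair_count(country, state_province) == 1
```

In this toy data, ("Australia", "Virginia") passes the first test on dwc:stateProvince alone but fails the second, while ("Exampleland", "Northshire") passes the second and fails the third.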

@Tasilee Tasilee added the Supplementary Tests supplementary to the core test suite. These are tests that the team regarded as not CORE. label Sep 13, 2022
@chicoreus chicoreus added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Sep 18, 2023
@ArthurChapman
Collaborator

Splitting Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Changed "Field" to "TestField", "Output Type" to "TestType", deleted "Warning Type" and added TestField "Specification Last Updated"

@ArthurChapman ArthurChapman added DO NOT IMPLEMENT A potential test that it is not recommended be implemented and removed Supplementary Tests supplementary to the core test suite. These are tests that the team regarded as not CORE. labels Jan 14, 2024
@Tasilee Tasilee closed this as completed Feb 6, 2024
@Tasilee
Collaborator

Tasilee commented Feb 22, 2024

Specifications updated to align with the current template

4 participants