TG2-AMENDMENT_BASISOFRECORD_STANDARDIZED #63

Open
iDigBioBot opened this issue Jan 5, 2018 · 21 comments
Labels
Amendment Conformance CORE TG2 CORE tests OTHER Parameterized Test requires a parameter Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
| --- | --- |
| GUID | 07c28ace-561a-476e-a9b9-3d5ad6e35933 |
| Label | AMENDMENT_BASISOFRECORD_STANDARDIZED |
| Description | Proposes an amendment to the value of dwc:basisOfRecord using the bdq:sourceAuthority. |
| TestType | Amendment |
| Darwin Core Class | Record-level |
| Information Elements ActedUpon | dwc:basisOfRecord |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:basisOfRecord is bdq:Empty; AMENDED the value of dwc:basisOfRecord if it could be unambiguously interpreted as a value in the bdq:sourceAuthority; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | BASISOFRECORD_STANDARDIZED |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "Darwin Core basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]} {dwc:basisOfRecord vocabulary [https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml]} |
| Specification Last Updated | 2024-07-24 |
| Examples | [dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="HumanObservation", Response.comment="dwc:basisOfRecord contains interpretable value"] [dwc:basisOfRecord="FossilSpecimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in the bdq:sourceAuthority so NOT_AMENDED"] |
| Source | VertNet |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | The term dwc:basisOfRecord has the comment "Recommended best practice is to use a controlled vocabulary such as the set of local names of the identifiers for classes in Darwin Core." The list of these values can be determined by searching https://github.com/tdwg/dwc/blob/master/vocabulary/term_versions.csv for rows with status="recommended" and rdf_type="http://www.w3.org/2000/01/rdf-schema#Class". For example, the term http://rs.tdwg.org/dwc/terms/PreservedSpecimen has a local name PreservedSpecimen. For tests against a dwc:Occurrence record, the set of valid terms is more limited and embodied in the resource found at https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml, which contains the local name for the identifier, as well as preferred and alternate labels from which to standardize values. |
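
The Expected Response above can be read as a short decision sequence. What follows is only a minimal sketch of that sequence, not the implementation referenced later in this thread. It makes several assumptions: `VOCABULARY` is a hypothetical, hard-coded stand-in for the bdq:sourceAuthority, the alternate spellings in it are invented for illustration, bdq:Empty is treated as "missing or whitespace only", and "unambiguously interpreted" is reduced to "exactly one vocabulary entry matches".

```python
from typing import Optional

# Hypothetical stand-in for the bdq:sourceAuthority vocabulary.
# Keys are local names; the lowercase alternates are invented for illustration.
VOCABULARY = {
    "HumanObservation": ["human observation", "human obs"],
    "FossilSpecimen": ["fossil specimen", "fossil"],
    "PreservedSpecimen": ["preserved specimen"],
}


def _response(status: str, result, comment: str) -> dict:
    return {"status": status, "result": result, "comment": comment}


def amend_basis_of_record(value: Optional[str],
                          vocabulary: Optional[dict] = VOCABULARY) -> dict:
    """Sketch of AMENDMENT_BASISOFRECORD_STANDARDIZED for one record."""
    if vocabulary is None:
        return _response("EXTERNAL_PREREQUISITES_NOT_MET", "",
                         "bdq:sourceAuthority is not available")
    # Assumption: bdq:Empty means missing or whitespace only.
    if value is None or value.strip() == "":
        return _response("INTERNAL_PREREQUISITES_NOT_MET", "",
                         "dwc:basisOfRecord is bdq:Empty")
    if value in vocabulary:
        return _response("NOT_AMENDED", "",
                         "dwc:basisOfRecord contains match in the "
                         "bdq:sourceAuthority so NOT_AMENDED")
    # Collapse whitespace and ignore case before looking for a single match.
    key = " ".join(value.split()).lower()
    matches = {name for name, alternates in vocabulary.items()
               if key == name.lower() or key in alternates}
    if len(matches) == 1:
        return _response("AMENDED", {"dwc:basisOfRecord": matches.pop()},
                         "dwc:basisOfRecord contains interpretable value")
    return _response("NOT_AMENDED", "",
                     "dwc:basisOfRecord could not be unambiguously interpreted")
```

With this stand-in vocabulary, `amend_basis_of_record("Human obs")` proposes `HumanObservation`, and `amend_basis_of_record("FossilSpecimen")` returns NOT_AMENDED, mirroring the two Examples in the table.
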
@iDigBioBot
Collaborator Author

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet:
Should follow on after Line 57

@ArthurChapman ArthurChapman added VOCABULARY Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT labels Jan 17, 2018
@tucotuco tucotuco added the Parameterized Test requires a parameter label Nov 5, 2018
@ArthurChapman
Collaborator

What is the case if an institution has its whole collection as one type of "dwc:basisOfRecord" (for example, everything is a "FossilSpecimen")? Is there a case, then, that if the field is EMPTY it can be populated from a source authority that might have just the one value, "FossilSpecimen", for that institution? If so, we would leave EMPTY out of INTERNAL_PREREQUISITES_NOT_MET.

@tucotuco
Member

I would be a hard-ass. If every row is of the same type, it is trivial to provide the value. This is a record-level test, and we cannot rely on metadata to get the information.

@ArthurChapman
Collaborator

I wasn't thinking of using metadata, but of an example where an institution running the tests could set their Parameter to just one value. Otherwise, why is it Parameterized? But I am happy either way.

@tucotuco
Member

It is currently parametrized to provide a source authority against which to check.

@Tasilee
Collaborator

Tasilee commented May 12, 2019

We have two levels related to 'source authority' - the authority itself (Parameter required) and the terms it contains (VOCABULARY)?

Except for #75, all tests that have 'VOCABULARY' also have 'Parameterized'. VOCABULARY is either Darwin Core, which I'd call internal as the tests have this as a foundation, or an external authority. Maybe we need to be explicit here, as we are with the full specifications of the Expected Responses for annotations even if they have a corresponding validation. That is, do we need to specify Darwin Core as the source authority where relevant?

Am I rambling? It wouldn't be the first time.

@tucotuco
Member

I do not see that issue #75 is or ever was parametrized.

Yes, the tests are designed to be used against concepts that match the definitions of the Darwin Core terms they reference, and so we should not have Darwin Core as an authority in any of our extant tests. However, "vocabularies of values" designed for use with Darwin Core (or indeed recommended to be used from the Darwin Core side) are not Darwin Core. I would say that these authorities always should be parametrized to decouple the tests from content that is much more mutable over time than the definitions of the Darwin Core terms.

@Tasilee
Collaborator

Tasilee commented Jun 30, 2023

Changed Source Authority from

bdq:sourceAuthority default = "Darwin Core Terms" [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]

to

bdq:sourceAuthority default = {Darwin Core} {Basis of record [https://dwc.tdwg.org/terms/#dwc:basisOfRecord] }

and removed bdq:sourceAuthority from Parameters (I presume, as there is no alternative vocab)?

@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Amended Source Authority values to align with @chicoreus syntax

bdq:sourceAuthority default = {Darwin Core} {Basis of record [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core dwc:basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

@Tasilee
Collaborator

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "Darwin Core dwc:basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core" {https://dwc.tdwg.org/} {dwc:basisOfRecord [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

@Tasilee
Collaborator

Tasilee commented Jul 16, 2023

Due to recent discussions, changed Source Authority from

bdq:sourceAuthority default = "Darwin Core" {[https://dwc.tdwg.org/]} {dwc:basisOfRecord [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]} {Basis of record vocabulary [https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml]}

Notes by @tucotuco required
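
The Source Authority syntax settled on in the comments above appears to be a quoted name followed by one or more `{label [URL]}` groups, where the label may be empty. The sketch below pulls those parts out with regular expressions. The pattern is inferred from the strings quoted in this thread rather than from any formal grammar, and `parse_source_authority` is a hypothetical helper name.

```python
import re


def parse_source_authority(text: str):
    """Split a Source Authority string into (name, [(label, url), ...])."""
    # The quoted authority name that follows "default =".
    name_match = re.search(r'default\s*=\s*"([^"]*)"', text)
    name = name_match.group(1) if name_match else None
    # Each {optional label [url]} group; the label may be empty.
    groups = re.findall(r'\{([^{}\[\]]*)\[([^\]]*)\]\s*\}', text)
    resources = [(label.strip(), url.strip()) for label, url in groups]
    return name, resources


SOURCE_AUTHORITY = (
    'bdq:sourceAuthority default = "Darwin Core basisOfRecord" '
    '{[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]} '
    '{Basis of record vocabulary '
    '[https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml]}'
)

name, resources = parse_source_authority(SOURCE_AUTHORITY)
```

Here `name` is the quoted authority name and `resources` holds one `(label, url)` pair per braced group, with an empty label for the first group.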

@Tasilee
Collaborator

Tasilee commented Jul 17, 2023

I missed the Parameter(s) (added) and the syntax on the vocabulary in Source Authority (done)

@tucotuco
Member

Updated comment from blank to

"The term dwc:basisOfRecord has the comment "Recommended best practice is to use the standard label of one of the Darwin Core classes." The list of these values can be determined by searching https://github.com/tdwg/dwc/blob/master/vocabulary/term_versions.csv for rows with status="recommended" and rdf_type="http://www.w3.org/2000/01/rdf-schema#Class". For tests against a dwc:Occurrence record, the set of valid terms is more limited and embodied in the resource found at https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml, which contains both preferred labels and alternate labels from which to standardize values. This test will fail if there is leading or trailing whitespace or there are leading or trailing non-printing characters."

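The row filter described in that note (status="recommended" and rdf_type equal to the RDFS Class IRI) can be sketched as follows. The sample rows are illustrative only and are not copied from term_versions.csv. The `status` and `rdf_type` column names come from the note, while the `term_localName` column name and the `ExampleOldClass` row are assumptions made for the example.

```python
import csv
import io

RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"

# Illustrative rows only; not the contents of the real term_versions.csv.
# The term_localName column name is an assumption.
SAMPLE_CSV = """term_localName,status,rdf_type
PreservedSpecimen,recommended,http://www.w3.org/2000/01/rdf-schema#Class
FossilSpecimen,recommended,http://www.w3.org/2000/01/rdf-schema#Class
basisOfRecord,recommended,http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
ExampleOldClass,superseded,http://www.w3.org/2000/01/rdf-schema#Class
"""


def recommended_class_names(csv_text: str) -> list:
    """Return sorted local names of rows that are recommended classes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sorted({
        row["term_localName"]
        for row in reader
        if row["status"] == "recommended" and row["rdf_type"] == RDFS_CLASS
    })
```

Against the sample rows, only `FossilSpecimen` and `PreservedSpecimen` pass both conditions; the property row and the superseded row are dropped.
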
@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@chicoreus
Collaborator

Updated note to remove evident copy/paste error of fail on whitespace text. Leading or trailing whitespace is one condition this amendment should be able to propose a correction for.
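
One way to handle the whitespace case is to strip leading and trailing whitespace and non-printing characters before comparing against the vocabulary. The sketch below is an assumption about how that could be done, not the approach taken in the referenced commits. It treats Unicode categories Cc (control) and Cf (format) as "non-printing", which is a choice made for this example.

```python
import unicodedata


def _is_noise(ch: str) -> bool:
    # Whitespace, control characters (Cc) and format characters (Cf).
    return ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf")


def strip_outer_noise(value: str) -> str:
    """Remove leading/trailing whitespace and non-printing characters."""
    start, end = 0, len(value)
    while start < end and _is_noise(value[start]):
        start += 1
    while end > start and _is_noise(value[end - 1]):
        end -= 1
    return value[start:end]
```

Interior characters are left alone, so a value such as " Human obs " becomes "Human obs" and can still be matched against alternate labels afterwards.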

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Jul 9, 2024
…DIZED using a hard coded list of the current labels.
@chicoreus
Collaborator

Note that the labels contain spaces, e.g. "Preserved Specimen", not "PreservedSpecimen".

Updating the examples from:

[dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="HumanObservation", Response.comment="dwc:basisOfRecord contains interpretable value"]

[dwc:basisOfRecord="FossilSpecimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in bdq:sourceAuthority so NOT_AMENDED"]

to

[dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="Human Observation", Response.comment="dwc:basisOfRecord contains interpretable value"]

[dwc:basisOfRecord="Fossil Specimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in bdq:sourceAuthority so NOT_AMENDED"]

Validation data for dataID rows 438, 439, 440, 441, 442, 443, 444, 445, and 446 need to be examined, and at least 443-446 need to be corrected to reflect spaces in the labels.

@chicoreus
Collaborator

Added an example to the note.

Needs Work label currently applies to the validation data rather than the specification.

@tucotuco
Member

I don't agree with this one. The term names are the standard (HumanObservation), not their labels. From https://dwc.tdwg.org/terms/#dwc:basisOfRecord:
"Recommended best practice is to use a controlled vocabulary such as the set of local names of the identifiers for classes in Darwin Core."
Examples: HumanObservation

@chicoreus
Collaborator

@tucotuco Good. I like local names better. Feels like it fits better with more people's practices. Looks like the Darwin Core term recommendation for best practice has changed. On July 16, 2023, you had added the note with the text: "The term dwc:basisOfRecord has the comment "Recommended best practice is to use the standard label of one of the Darwin Core classes.""

I'd be very much in favor of changing the test note and examples and keeping the validation data with the local names (without spaces).
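
The difference between a label ("Human Observation") and a local name ("HumanObservation") in the examples discussed here is only the internal spaces. A minimal sketch of mapping one to the other follows; `LOCAL_NAMES` is an illustrative subset rather than the full vocabulary, and `label_to_local_name` is a hypothetical helper.

```python
from typing import Optional

# Illustrative subset of local names; not the complete vocabulary.
LOCAL_NAMES = {"HumanObservation", "FossilSpecimen", "PreservedSpecimen"}


def label_to_local_name(label: str) -> Optional[str]:
    """Drop whitespace from a label and accept it only if it is a known local name."""
    candidate = "".join(label.split())
    return candidate if candidate in LOCAL_NAMES else None
```

A value that is not simply a spaced label, such as "Human obs", returns `None` here and would need the alternate labels in the source authority to be interpreted.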

@chicoreus
Collaborator

Updated comment and examples accordingly.

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Jul 24, 2024
…se of local name instead of label, also adding support for source authority detection and error handling.
@Tasilee
Collaborator

Tasilee commented Aug 2, 2024

I have changed the relevant Test Data records and added a new one. Is NEEDS WORK still needed on this?

@Tasilee Tasilee removed the NEEDS WORK label Aug 3, 2024
chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Aug 25, 2024
chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Nov 11, 2024