Highlight RA-funded vs. Non RA-funded datasets in Catalog #49

Open
mwengren opened this issue Jul 14, 2017 · 3 comments
Comments

@mwengren
Member

Many RAs aggregate lots of datasets that they don't necessarily fund or operate themselves.

In order to allow some/all of these data to be submitted to the Catalog, we need to have some way to highlight certain datasets funded or directly managed by the RA vs. more general data they aggregate and serve.

May require a metadata flag or some other manual means (in Harvest Registry?) to highlight datasets.

@benjwadams
Contributor

There are several issues at play here:

  1. As you mentioned, just because an RA has a dataset on their WAF or THREDDS server does not necessarily mean they are involved with the production of the dataset.

  2. I've often seen that the institution and publisher information is not standardized. As a result, some metadata can be reliably associated with a dataset and some cannot. We need a reliable way to identify that an RA is associated with a particular dataset. We could use the info from CKAN about the organization the dataset is harvested from, but then we run into the problem in 1), which complicates things.

@benjwadams
Contributor

Related: #41

@mwengren mwengren modified the milestone: Release 1.3: Attribution/Search Enhancements Sep 5, 2017
@benjwadams benjwadams changed the title Hightlight RA-funded vs. Non RA-funded datasets in Catalog Highlight RA-funded vs. Non RA-funded datasets in Catalog Jan 29, 2019
@mwengren
Member Author

mwengren commented Jan 5, 2021

@benjwadams If we came up with an approved list of RA names (and variations/abbreviations on those names) that we could publish somewhere, and we had specific metadata fields defined to match against in code, could we add some UI or ingest code to call those out, both in the dataset list view (https://data.ioos.us/dataset) and somehow on the dataset detail page (e.g. https://data.ioos.us/dataset/peace-river-at-zolfo-springs-fl)?

Say we have an ra_names list of all the name variations to match against. We could then use the following as attribution matching rules:

  • creator_name in ra_names
  • creator_institution in ra_names
  • contributor_name in ra_names where contributor_role in author|coAuthor|collaborator|contributor|funder|originator|owner|principalInvestigator|publisher|sponsor

For the full list of contributor_role(s), see: https://vocab.nerc.ac.uk/collection/G04/current/.
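
The rules above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Catalog's actual ingest code: the function name is_ra_attributed and its helpers are hypothetical, the attributes are assumed to arrive as a plain dict of strings, and contributor_name / contributor_role are assumed to be comma-separated lists in matching order (names that themselves contain commas would need different handling).

```python
# Roles from the rule above (a subset of the NERC G04 vocabulary).
ATTRIBUTION_ROLES = {
    "author", "coAuthor", "collaborator", "contributor", "funder",
    "originator", "owner", "principalInvestigator", "publisher", "sponsor",
}


def _normalize(name):
    """Collapse whitespace and ignore case so minor variations still match."""
    return " ".join(name.split()).casefold()


def _split(value):
    """Split a comma-separated attribute into stripped items (assumed format)."""
    if not value:
        return []
    return [item.strip() for item in value.split(",")]


def is_ra_attributed(attrs, ra_names):
    """Return True if any of the three matching rules applies to attrs.

    attrs: dict of dataset global attributes (hypothetical input shape).
    ra_names: iterable of approved RA names and their variations.
    """
    known = {_normalize(n) for n in ra_names}

    # Rules 1 and 2: creator_name or creator_institution in ra_names.
    for key in ("creator_name", "creator_institution"):
        if _normalize(attrs.get(key) or "") in known:
            return True

    # Rule 3: contributor_name in ra_names with a qualifying contributor_role.
    names = _split(attrs.get("contributor_name"))
    roles = _split(attrs.get("contributor_role"))
    for name, role in zip(names, roles):
        if role in ATTRIBUTION_ROLES and _normalize(name) in known:
            return True

    return False
```

A call such as is_ra_attributed({"contributor_name": "USGS, AOOS", "contributor_role": "originator, funder"}, ["AOOS"]) returns True, because AOOS is paired with the funder role. The boolean result is the kind of true/false value that could be stored with the dataset at ingest time.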

Presumably, this would only work for ERDDAP datasets where we are ingesting metadata directly, and not via ISO XML file content.

I would guess this would need to be handled as part of the ingest processing with some sort of true/false flag, as it's too complicated to implement directly in the UI code.

May need to wait for the next milestone since 1.6 is due at the end of Jan, just brainstorming ideas.

Projects
Status: Backlog

2 participants