New Task: Establish an ATN WAF of ISO19115 metadata records for IOOS Data Catalog to harvest #52
Comments
Copying my comments from ioos/ckanext-ioos-theme#237 (comment) below: AFAIK IOOS is required to furnish ISO XML metadata (or perhaps DCAT JSON, not 100% sure on that alternative) to NOAA for inclusion in NOAA's enterprise data inventories for all of our publicly-available data/services. For all of IOOS' non-bio data, it's been fairly straightforward to do this as most of the software we use has been developed to be able to output an ISO XML metadata representation of the datasets they serve. Since that isn't the case for OBIS, MBON, or ATN (I believe), that's something we'll need to address both for including those data in IOOS Catalog given its current capabilities, and also for sending up the chain to NOAA to meet requirements. It may be that leveraging IOOS Catalog and converting the various bio data formats to ISO XML format isn't the best approach to meeting NOAA data inventory requirements. If there are better, simpler ways to furnish these metadata to NOAA that I'm not aware of, we should consider those options. Catalog has been our solution to date, but primarily because of the pre-existing metadata format support and compatibility. Ideally, we can have a comprehensive inventory of 100% of IOOS' data in Catalog, and I think we should still aim for that goal, but we need to understand better what the challenges for that might be wrt ATN, MBON, or other bio/Marine Life data. |
Thanks @mwengren. For ATN, at some point, we hope to add non-embargoed data to an ATN ERDDAP which could be an easy pathway for that observing method. See #44 For MBON, we are encouraging the MBON projects to work with RAs to host the raw data on an RA ERDDAP (or other web service as applicable). Most of the RA ERDDAPs are already being harvested, hence the push for that collaboration. Below is an example:
Another wrinkle in the whole pipeline is that OBIS-USA is being archived at NCEI on a quarterly basis. Part of our guidance is to submit data to OBIS-USA. While that metadata record is not available through the IOOS Catalog, it is available through the various NOAA and higher Catalogs. So, does that meet our NOAA data inventory requirements?? See links below:
The data flow diagram might help illustrate all the nuances https://ioos.github.io/mbon-docs/mbon-data-flow.html |
@MathewBiddle That makes sense on the data flow and connection in with the RA ERDDAPs, I recall that plan now... thanks for adding the example. I think the OBIS-USA/NCEI archive probably does meet the NOAA data publishing/open data requirements for those data, at least from what I understand. I think our goal should be to include both access points (NCEI archive and RA ERDDAP) at the NOAA Catalog level (i.e. OneStop). The IOOS Catalog should include all data access services provided by the RAs, or other IOOS DACs, that are funded and supported by IOOS. Having two separate metadata records for the same dataset should be OK as well, as they'll be describing different endpoints to access the same data, presumably. Ideally there would be a way to relate each metadata record to the other within the NOAA Catalog, but I'm not sure that is technically possible at present. That might be a good requirement to share with the OneStop team though. I guess the one scenario that seems to be a potential gap where IOOS-funded bio data might not be represented in IOOS Catalog is if a provider is not serving their data via RA ERDDAP, but is aligning them to Darwin Core and submitting to NCEI. Ideally, we could also represent those raw data access points, whatever they might be, in IOOS Catalog as well, even if they would technically be meeting the NOAA open data publishing guidelines via the OBIS/NCEI archive pathway. I don't know how much of a priority or how common this is... it may provide justification to encourage those providers to work with an RA to publish to ERDDAP, however. |
@mwengren Is there a reason you couldn't share the RA ERDDAP link as another data access link in the collection metadata record at NCEI? It doesn't seem ideal to have two collection records for the same dataset in OneStop. Here is an example: https://data.noaa.gov/onestop/collections/details/573b7dc1-7d06-4fdc-a134-056c112c2260 |
I think this might be more common with cross funded efforts, like MBON. Some projects use EDI and Arctic Data Center as their repositories (maybe BCO-DMO too). |
This task has evolved a little bit since the last revisit in June. This activity should initially focus on how to get ATN "data" into the IOOS Data Catalog. I put quotes around "data" because we should define what we mean by that term in the context of ATN. For now, MBON datasets are making it to the IOOS Data Catalog via RA ERDDAPs (which are being harvested into the catalog). So, there is no effort required to make MBON datasets appear in the IOOS Data Catalog. Next steps:
edit: data -> metadata (the IOOS Data Catalog only harvests metadata) |
Updating the title to be more reflective of the activity. |
I'm understanding this task better as we continue to have conversations with @mwengren about the IOOS Data Catalog. The requirement for the IOOS Data Catalog is:
More details:
|
clarified the title to what the activity is. |
related #44 Need to identify if we go
|
Who is requesting this?
@ioos/marine-life
What is being requested?
Connect ATN and MBON into IOOS DMAC. Coordinate with IOOS Catalog developers (POC: @mwengren) on how ATN and/or MBON portals could be harvested for data.ioos.us. Guidance for the process to add records is documented at https://ioos.github.io/catalog/
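For anyone unfamiliar with the term in the title: a WAF (web accessible folder) is, in practice, a plain HTTP directory listing of metadata XML files that a harvester walks. The snippet below is only a rough sketch of that idea, not the IOOS Catalog harvester itself. The sample index page, the file names, and the `https://example.org/atn/waf/` URL are all made up for illustration; the helper names (`list_waf_records`, `_LinkCollector`) are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect href values from <a> tags in a directory-index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_waf_records(index_html, base_url):
    """Return absolute URLs of the .xml files linked from a WAF index page."""
    parser = _LinkCollector()
    parser.feed(index_html)
    return [
        urljoin(base_url, href)
        for href in parser.hrefs
        if href.lower().endswith(".xml")
    ]


# Hypothetical index page; a real WAF listing would be fetched over HTTP.
sample_index = """
<html><body>
<a href="../">Parent Directory</a>
<a href="atn_deployment_001.xml">atn_deployment_001.xml</a>
<a href="atn_deployment_002.xml">atn_deployment_002.xml</a>
<a href="README.txt">README.txt</a>
</body></html>
"""

# Assumed base URL, used only to resolve the relative links above.
records = list_waf_records(sample_index, "https://example.org/atn/waf/")
for url in records:
    print(url)
```

Each URL returned would then be fetched and read as one ISO 19115 record; that step is left out here because it depends on how the ATN records end up being published.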
What is the requested deadline and why?
No response
What is the current status quo (i.e., what happens if this does not get done)?
ATN and MBON datasets won't show up in data.ioos.us.
Marine Life will not be meeting IOOS DMAC requirements, since its data will not be discoverable in data.ioos.us.
What indicates this is done (i.e., how do we know this is complete)?
Provide a description or any other important information.
xref: