Publish Hazard Datasets calculated by ZAMG as Open Data #9

Closed
p-a-s-c-a-l opened this issue Nov 23, 2018 · 19 comments

@p-a-s-c-a-l commented Nov 23, 2018

According to the status presentation, ZAMG calculates datasets for

  • 25 Indices
  • 16 GCM/RCM climate model combinations (daily data) from EURO-CORDEX
  • 4 time periods (1971-2000, 2011-2040, 2041-2070, 2071-2100)
  • 3 RCP scenarios (2.6, 4.5, 8.5)

= 3325 unique datasets. By the way, why 3325 datasets and not 4800 (25 × 16 × 4 × 3)?
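For reference, the expected full cross-product is pure arithmetic; the shortfall is explained later in this thread:

```python
# Full cross-product of the dimensions listed above.
indices, models, periods, rcps = 25, 16, 4, 3
full = indices * models * periods * rcps
print(full)         # 4800
print(full - 3325)  # 1475 -- combinations missing from the full grid
```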

An example Heatwave Duration Hazard NetCDF file can be found in this issue.

Note: This data has to be "rasterised" to a 500 m GeoTIFF grid (example for the same dataset here), and then the local effects are taken into account to generate derived datasets. The complete process chain will eventually be documented here. So in the end, we would possibly calculate 3 × 3325 datasets that have to be published as open data according to the H2020 Open Access Guidelines. However, it is up to the @clarity-h2020/data-processing-team and @clarity-h2020/mathematical-models-implementation-team to discuss and decide whether we really need that many derived datasets. But this is better addressed in this issue, along with other HC, HC-LE and EE related questions I'm going to ask soon.
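For illustration only, a minimal sketch of such a rasterisation step using rioxarray; the file name, variable name, dimension names and target CRS are all assumptions, not the project's actual tool chain:

```python
import xarray as xr
import rioxarray  # noqa: F401 -- registers the .rio accessor on xarray objects
from rasterio.enums import Resampling

ds = xr.open_dataset("heatwave_duration_rcp85_2041-2070.nc")  # hypothetical file
da = ds["heatwave_duration"].isel(time=0)                     # hypothetical variable; one period for illustration

da = (da.rio.set_spatial_dims(x_dim="lon", y_dim="lat")       # dimension names are assumptions
        .rio.write_crs("EPSG:4326"))                          # assuming a regular lat/lon input grid

raster = da.rio.reproject(
    "EPSG:3035",                 # an equal-area European CRS, chosen here for illustration
    resolution=500,              # 500 m target cell size
    resampling=Resampling.bilinear,
)
raster.rio.to_raster("heatwave_duration_rcp85_2041-2070_500m.tif")
```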

@p-a-s-c-a-l commented Nov 23, 2018

For the Data Management Plan it is currently only relevant to consider how the datasets can be made publicly available for re-use by other interested parties (this is also a dissemination issue). Here we concentrate first on releasing the original datasets produced by ZAMG as open data and address derived datasets (those taking the local effects into account) in a separate issue.

Since we are talking about 3325 datasets, the publication process (Example) must be automated:

  1. deposit the dataset and associated meta-data in a research data repository, e.g. Zenodo, unless ZAMG wants to release it on data.ccca.ac.at
  2. register the dataset meta-data (including a link to the actual data resource stored in Zenodo) in our CKAN instance (the 'living' Data Management Plan).

Both Zenodo and CKAN offer APIs, so we can develop some simple scripts that automate this process; a sketch follows below. Theoretically, it would also be possible to configure CKAN to automatically harvest the meta-data from Zenodo.
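A minimal sketch of what such a script could look like, using the documented Zenodo REST API (https://developers.zenodo.org/) and the CKAN Action API; the CKAN URL, tokens and metadata below are placeholders, and error handling is omitted:

```python
from pathlib import Path
import requests

ZENODO_API = "https://zenodo.org/api"
CKAN_API = "https://ckan.example.org/api/3/action"  # placeholder for the CLARITY CKAN instance
ZENODO_TOKEN = "..."  # read from the environment in practice
CKAN_API_KEY = "..."

def publish_dataset(path: str, title: str, description: str) -> None:
    # 1. Deposit the dataset and its meta-data on Zenodo.
    dep = requests.post(f"{ZENODO_API}/deposit/depositions",
                        params={"access_token": ZENODO_TOKEN}, json={}).json()
    with open(path, "rb") as fp:  # upload the file into the deposition's bucket
        requests.put(f"{dep['links']['bucket']}/{Path(path).name}",
                     data=fp, params={"access_token": ZENODO_TOKEN})
    metadata = {"metadata": {"title": title,
                             "upload_type": "dataset",
                             "description": description,
                             "creators": [{"name": "ZAMG"}]}}
    requests.put(f"{ZENODO_API}/deposit/depositions/{dep['id']}",
                 params={"access_token": ZENODO_TOKEN}, json=metadata)
    record = requests.post(f"{ZENODO_API}/deposit/depositions/{dep['id']}/actions/publish",
                           params={"access_token": ZENODO_TOKEN}).json()

    # 2. Register the meta-data, with a link to the Zenodo record, in CKAN.
    requests.post(f"{CKAN_API}/package_create",
                  headers={"Authorization": CKAN_API_KEY},
                  json={"name": title.lower().replace(" ", "-"),
                        "title": title,
                        "notes": description,
                        "resources": [{"url": record["links"]["record_html"],
                                       "format": "NetCDF"}]})
```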

Questions for @clarity-h2020/science-support-team:

  1. Why 3325 datasets and not 4800 (25 × 16 × 4 × 3)?
  2. Why 16 different GCM/RCM combinations? Do we really need to consider all of them in the impact calculation, as discussed in this issue, or do we select one mean/ensemble scenario?

@claudiahahn

The original data sets produced by ZAMG will be stored on a server of the CCCA and, after all licenses have been checked, will be released on data.ccca.ac.at.
Question 1: Robert listed 3325 rather than 4800 data sets because the RCP 2.6 scenario was not available for all GCM/RCM combinations.
Question 2: There is no need to consider all GCM/RCM combinations. We will provide the ensemble mean (and the max/min or some percentiles to assess uncertainty). Thus, for each index we have one ensemble mean value for each time period (4) and each RCP scenario (3). That makes 12 ensemble mean values per index, plus e.g. the respective min/max (see the sketch below).
All CLARITY partners can work with that data, but before data based on the EURO-CORDEX data can be made publicly available, the licenses have to be checked. That means the institutions that provide the EURO-CORDEX data need to be contacted.
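For illustration, a minimal sketch of how the ensemble statistics could be derived with xarray, assuming the per-model index files share a common grid (file pattern, variable name and percentile choices are hypothetical):

```python
import glob
import xarray as xr

# One index / RCP / period at a time; the per-model files are stacked
# along a new "model" dimension.
files = sorted(glob.glob("heatwave_duration_rcp45_2041-2070_*.nc"))
ens = xr.concat([xr.open_dataset(f)["heatwave_duration"] for f in files],
                dim="model")

stats = xr.Dataset({
    "ensemble_mean": ens.mean(dim="model"),
    "ensemble_min": ens.min(dim="model"),
    "ensemble_max": ens.max(dim="model"),
    # 10th/90th percentiles as one way to express the ensemble spread
    "p10": ens.quantile(0.10, dim="model").drop_vars("quantile"),
    "p90": ens.quantile(0.90, dim="model").drop_vars("quantile"),
})
stats.to_netcdf("heatwave_duration_rcp45_2041-2070_ensemble.nc")
```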

@DenoBeno

In today's telco, a decision was made that Louis (Meteogrid) will draft a letter to the data owners requesting permission to use their data. This is needed mainly for the EURO-CORDEX data, as far as I understand.

@p-a-s-c-a-l commented Nov 27, 2018

The original data sets produced by ZAMG will be stored on a server of the CCCA and, after all licenses have been checked, will be released on data.ccca.ac.at.

O.K. In practical terms that means that we

  • don't need to upload these datasets to Zenodo, as they can be downloaded from data.ccca.ac.at. This also makes Data Management Example: Heatwave Duration Hazard obsolete.
  • don't need to register these datasets in CLARITY's CKAN, since the meta-data can be viewed in data.ccca.ac.at's CKAN

In the Data Management Plan we can then refer directly to data.ccca.ac.at. Perfect.
@claudiahahn Assuming that we are allowed to publish the data (see #9 (comment)), when will it be made available on data.ccca.ac.at? D7.9 Data Management Plan v2 is due by the end of January 2019.

Where and how to publish derived hazard datasets (+ local effects), in terms of Data Management rather than CSIS WMS/WCS publication, is another story and has to be discussed with the @clarity-h2020/data-processing-team.

@p-a-s-c-a-l

As soon as the bias correction is finished, we can calculate the indices using the bias corrected EURO-CORDEX data and make it available.

OK, so the implications are

  • In the D7.9 Data Management Plan v2 we will just announce that indices based on bias-corrected EURO-CORDEX data will be made available as open data. In D7.9 Data Management Plan v3 (end of 2019) we can then provide the links to the actual data at data.ccca.ac.at. Fine.
  • @clarity-h2020/data-processing-team must be aware that the data that is now made available internally (uploaded to sFTP) contains initial/draft hazard indices and has to be re-processed once the bias-corrected indices have been calculated.

@p-a-s-c-a-l

Any progress to be reported here?

@claudiahahn

The data sets are not yet published on CCCA.

Regarding the license issue: according to the following list, to which Lena has directed us,
http://is-enes-data.github.io/CORDEX_RCMs_info.html,
the use of the EURO-CORDEX data from which we calculate the climate indices is not restricted. Therefore, the climate indices can be made publicly available without restrictions.

@p-a-s-c-a-l

This isn't valid any more, right?

The original data sets produced by ZAMG will be stored on a server of the CCCA and, after all licenses have been checked, will be released on data.ccca.ac.at.

All datasets will be made available on Zenodo instead?

@RobAndGo commented Feb 7, 2020

Yes, this is correct. When that statement was initially made, I was not aware of Zenodo. When I later compared uploading the data to both, I found it much easier to upload via Zenodo than via CCCA.

@p-a-s-c-a-l

All datasets are now available on Zenodo, right? If so, we can close this issue.

@RobAndGo commented May 4, 2020

Yes

@p-a-s-c-a-l

Thanks, Robert!
