Challenge #12 - Machine learning for predicting extreme weather hazards #14

Closed
jwagemann opened this issue Jan 18, 2019 · 17 comments

@jwagemann
Contributor

jwagemann commented Jan 18, 2019

Challenge 12

Machine learning for predicting extreme weather hazards

Goal: To use ECMWF/Copernicus open datasets to evaluate machine learning (ML) techniques for better predicting one specific kind of extreme weather event, e.g. drought or hurricanes, and to provide templates for future ML work.


Mentors: @cvitolo, @StephanSiemen, @jwagemann

Skills required:

  • Data Science
  • Experience in building machine learning algorithms
  • Knowledge of meteorological and climate data and formats desirable
  • Knowledge of extreme weather hazards desirable

Challenge description

This challenge is exploratory. The aim is to better understand the feasibility, accuracy and challenges of using ECMWF/Copernicus open datasets to improve the prediction of extreme weather events.

Possible datasets available:

One open dataset available from ECMWF / Copernicus is the climate reanalysis product ERA5. It extends back to 1979, has global coverage and an hourly temporal resolution.
We also have data on fire risk, air quality and floods.

Possible approach

A possible approach could be:

  • Select an extreme weather hazard you would like to predict, e.g. hurricanes or drought, etc.
  • Select one or more suitable machine learning algorithms
  • Select suitable datasets from the ECMWF and Copernicus open data catalogues and prepare the datasets as required
  • Set up the machine learning model and evaluate the results, e.g. based on a set of extreme weather hazards in the past (e.g. ECMWF has a database with past extreme weather events)
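
The steps above can be sketched as a small template. This is only a minimal illustration under stated assumptions, not a recommended method: the data is a synthetic autocorrelated series standing in for a real variable (no ECMWF/Copernicus data is read), the hazard is defined as exceedance of a training-period percentile, and the "model" is a single warning threshold. All function names (`percentile`, `csi`, `make_series`, `run_template`) are hypothetical.

```python
import random


def percentile(values, q):
    """Return the sorted value nearest to position q/100 * (n - 1)."""
    ordered = sorted(values)
    return ordered[round(q / 100 * (len(ordered) - 1))]


def csi(pred, obs):
    """Critical success index: hits / (hits + false alarms + misses)."""
    hits = sum(1 for p, o in zip(pred, obs) if p and o)
    false_alarms = sum(1 for p, o in zip(pred, obs) if p and not o)
    misses = sum(1 for p, o in zip(pred, obs) if o and not p)
    total = hits + false_alarms + misses
    return hits / total if total else 0.0


def make_series(n, seed=0):
    """Synthetic autocorrelated series (assumption: stands in for real data)."""
    rng = random.Random(seed)
    value, out = 0.0, []
    for _ in range(n):
        value = 0.8 * value + rng.gauss(0, 1)
        out.append(value)
    return out


def run_template(series, train_fraction=0.7, event_percentile=95):
    # Data preparation: chronological split, so the test period follows training.
    split = int(train_fraction * len(series))
    train, test = series[:split], series[split:]

    # Hazard definition: exceedance of a level derived from training data only.
    event_level = percentile(train, event_percentile)

    def to_pairs(chunk):
        # Predictor: today's value. Target: does tomorrow exceed the event level?
        return chunk[:-1], [v > event_level for v in chunk[1:]]

    x_train, y_train = to_pairs(train)
    x_test, y_test = to_pairs(test)

    # Model: one warning threshold, tuned on training data to maximise CSI.
    candidates = [percentile(x_train, q) for q in range(50, 100, 5)]
    threshold = max(
        candidates, key=lambda t: csi([x > t for x in x_train], y_train)
    )

    # Evaluation: score the tuned rule on the held-out period.
    test_pred = [x > threshold for x in x_test]
    return {
        "event_level": event_level,
        "threshold": threshold,
        "test_csi": csi(test_pred, y_test),
        "n_test": len(y_test),
    }


result = run_template(make_series(2000))
print(result)
```

In a real project, `make_series` would be replaced by a loader for the chosen dataset, and the percentile-based labels by a catalogue of observed past events.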

Depending on the extreme weather hazard chosen and the algorithm, there are different possible outcomes, e.g.

  • we could do a comparison study of the performance and robustness of two or more different machine learning algorithms
  • we could focus on preparing templates regarding data preparation and machine learning models for future ML applications
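
A comparison study of this kind could be organised roughly as below. This is a hedged sketch, assuming synthetic data, a fixed hypothetical hazard level of 2.5, and two deliberately simple threshold rules in place of real ML algorithms. Each rule is refit on all but one contiguous block and scored on the held-out block, so the mean score indicates performance and the spread across blocks gives a rough idea of robustness. The names `fit_fixed_quantile`, `fit_tuned_threshold`, `blocked_cv` and `compare` are illustrative only.

```python
import random
import statistics


def csi(pred, obs):
    """Critical success index: hits / (hits + false alarms + misses)."""
    hits = sum(1 for p, o in zip(pred, obs) if p and o)
    false_alarms = sum(1 for p, o in zip(pred, obs) if p and not o)
    misses = sum(1 for p, o in zip(pred, obs) if o and not p)
    total = hits + false_alarms + misses
    return hits / total if total else 0.0


def quantile(values, q):
    """Return the sorted value nearest to position q * (n - 1), q in [0, 1]."""
    ordered = sorted(values)
    return ordered[round(q * (len(ordered) - 1))]


def fit_fixed_quantile(x, y):
    """Rule A: warn whenever the predictor exceeds its 90th training percentile."""
    level = quantile(x, 0.90)
    return lambda value: value > level


def fit_tuned_threshold(x, y):
    """Rule B: warn above the candidate threshold with the best training CSI."""
    candidates = [quantile(x, q / 100) for q in range(50, 100, 5)]
    best = max(candidates, key=lambda t: csi([v > t for v in x], y))
    return lambda value: value > best


def blocked_cv(fit, x, y, k=5):
    """Refit on k-1 contiguous blocks, score on the held-out block."""
    size = len(x) // k
    scores = []
    for i in range(k):
        a = i * size
        b = len(x) if i == k - 1 else a + size
        model = fit(x[:a] + x[b:], y[:a] + y[b:])
        scores.append(csi([model(v) for v in x[a:b]], y[a:b]))
    return statistics.mean(scores), statistics.stdev(scores)


def compare(fitters, x, y, k=5):
    """Return {name: (mean CSI, standard deviation of CSI across blocks)}."""
    return {name: blocked_cv(fit, x, y, k) for name, fit in fitters.items()}


# Synthetic autocorrelated series (assumption: stands in for a real variable).
rng = random.Random(1)
value, series = 0.0, []
for _ in range(2001):
    value = 0.8 * value + rng.gauss(0, 1)
    series.append(value)

x = series[:-1]                      # predictor: today's value
y = [v > 2.5 for v in series[1:]]    # target: tomorrow exceeds the hazard level

results = compare(
    {"fixed_quantile": fit_fixed_quantile, "tuned_threshold": fit_tuned_threshold},
    x,
    y,
)
for name, (mean_csi, spread) in results.items():
    print(f"{name}: mean CSI = {mean_csi:.3f}, spread = {spread:.3f}")
```

Contiguous blocks are used instead of random shuffling because weather and climate series are autocorrelated, and shuffled folds would let near-identical neighbouring time steps appear in both training and test data.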

Since this challenge is very exploratory, we would like detailed documentation of each step taken. It would also be valuable to have a detailed description of how datasets should be prepared. We would like to better understand the current challenges and limitations of machine learning with weather and climate data.

Potential questions that can be explored:

  • Can we predict naturally-occurring wildfires by mapping lightning occurrence?
  • How well can we predict vector-borne disease outbreaks in Africa (e.g. malaria) just by looking at temperature and other weather variables?
  • How well can we predict flash floods and/or landslides from extreme precipitation?
@lkugler

lkugler commented Feb 13, 2019

Hi,
what a nice challenge! We'd apply for this one but to prepare a detailed plan we would need some details concerning the observational data for wildfires and floods, which is new to us, unlike ERA5. For example: what kind of database/dataset is it, how is one given access to it, how many years of data are there? Thanks!

@cvitolo
Contributor

cvitolo commented Feb 13, 2019 via email

@jwagemann
Contributor Author

Hi,
to add information to the flood data:
The flood data can be provided from the Global Flood Awareness System (GloFAS). Here is an overview of the data that is available on request. We will be able to make the data available and provide FTP access to it. The data is available in NetCDF.

@masterflorin

masterflorin commented Mar 28, 2019

Hi! Wow and congrats to ECMWF for organizing a summer of code. I took part in a similar competition several years ago (GSoC) under the OSGeo umbrella. I enjoyed it a lot and it gave me a good start in terms of open source software development. 👍

I wonder whether there is a comprehensive repository or database that links to various climate/weather datasets? I'd be interested in attempting a solution that uses multiple data sources, something like data fusion, but I have hardly any experience doing that. I also need to get re-acquainted with NetCDF and Sentinel data, as I haven't touched them recently. 🔨

One of the things that I appreciate in your message above is the nature of the challenge: there are so many ways to go about it, which makes it so intriguing.

@jwagemann
Contributor Author

Hi @masterflorin ,
thanks for your interest in this challenge.
Open climate/meteorological datasets are available via

One great dataset is the ERA5 reanalysis available from the Copernicus Climate Data Store.
Let us know if you have specific questions about the data. We are happy to help.

See links to Flood and Fire data in the thread above.
HTH,
Julia

@masterflorin

Thank you for that lightning-fast response @jwagemann.

Great! I'll check those out.

-FlorinC

@tommylees112

Hi Julia, we would be very interested in exploring the possibility of predicting drought. One thing I have questions about is the problem of defining drought, because there are many variables that drive and respond to drought conditions. Would this be something you would like us to explore in the application form? Thank you so much for setting these up; we are extremely excited about getting involved! Tommy

@jwagemann
Contributor Author

Hi @tommylees112 ,
yes please. Drought prediction with machine learning is exactly within the scope of this challenge. We are looking forward to your proposal.

@tommylees112

Hi @jwagemann

The description states:

Set up the machine learning model and evaluate the results, e.g. based on a set of extreme weather hazards in the past (e.g. ECMWF has a database with past extreme weather events)

Is it possible to see the database with past extreme weather events or at least view the metadata so we can see the kind of information about each past event that you have?

Thanks so much for your help!
Tommy

@jwagemann
Contributor Author

jwagemann commented Apr 15, 2019

Hi @tommylees112 ,
unfortunately it is not possible to make the database publicly available. However, we have:

  1. a severe weather events database
  2. a severe event catalogue, and
  3. a fire events database

Regarding 1:
This database holds information on the geographical area of the event, the nature of the event (e.g. data outage, threshold exceedance, etc.) and the severity of the event (e.g. severe).

Regarding 2:
This is a collection of past severe events, e.g. the heatwave in Europe in 2018. It gives a more detailed description with an analysis of the data in order to better replicate and understand the event. From this catalogue, we could pick some past drought events (as an example, but also floods, etc.) to set up and validate the machine learning model.

Regarding 3:
If fire data is of interest, we have a collection of severe fire events since 2018. Here, we also have the geographical area of the event and a description of the event, as well as the threshold surpassed.

I hope this helps.
Julia

@jwagemann
Contributor Author

REMINDER: The deadline to register and submit your proposal is this coming Sunday, 21 April at 23:59 GMT!

The application is a two-step process:

  • You have to first register here, and then
  • Submit a proposal under "Call for abstract" here.

Applications without a submitted proposal will not be considered!
We are looking forward to your proposal!

@ppalmes

ppalmes commented Apr 18, 2019

Assuming the proposal is accepted, is it possible to publish a paper out of this work? If yes, are there any requirements?

@jwagemann
Contributor Author

Hi @ppalmes ,
yes, of course. We even encourage it, as this challenge is very exploratory and more research is needed on this topic. The only requirement is to use open data from Copernicus / ECMWF.
HTH,
Julia

@itsmohitanand

itsmohitanand commented May 29, 2019

Dear Assignees

I came to know about this challenge today. Could I have some guidance on flood forecasting? I want to do catchment-level flood forecasting and plan to use Sentinel-2 data and DEMs for the modelling. What kind of data is available as flood labels? I will try to use four Sentinel-2 bands (B2, B3, B4 and B8), a DEM and radar precipitation data [from some source] to make flood predictions using machine learning.

In short, what kind of flood data can I access without being a participant in the challenge?
As the competition deadline has passed, is it possible to take this project on as my Master's thesis? I am pursuing an MSc in Environmental Engineering at ETH Zurich, Switzerland.

P.S. ETH provides an excellent opportunity to do a Master's thesis in association with organisations/institutions.

@Yusuf-Oluwatoki

Hi @jwagemann,

I am very interested in this research project and would like to know whether it is still possible to submit proposals. I have just concluded my Master's essay on modeling extreme climate events for impact studies. As a next step, I want to use state-of-the-art techniques to validate results from statistical models such as GEV and GPD. Let me know if the challenge is still open, or whether I can proceed with some personal research related to this challenge.

@jwagemann
Contributor Author

Hi @gapton76 and @melioristic ,
thanks a lot for your interest in this ESoWC challenge. Unfortunately, applications for this year are now closed and we have already identified three projects that will work on predicting drought, fire and floods with different machine learning approaches.
Please watch this space and follow us on Twitter to stay up to date with all future announcements.

You can specifically follow the machine learning projects on GitHub (drought, fire and flood) and also get in touch with the teams to discuss their work.

Cheers,
Julia

@itsmohitanand

@jwagemann Thanks for the information. Will surely get in touch with the team and see what I can learn from them and how I can contribute to this field.

9 participants