Import historical raw GRIB files #3

Open
markkvdb opened this issue Feb 5, 2024 · 7 comments

Comments

@markkvdb

markkvdb commented Feb 5, 2024

What is the easiest way to run a custom import script for a large number of historical GRIB files from MeteoFrance? These GRIB files are stored in an S3 bucket, and I would like to sync them to an Open-Meteo API server manually. Is this already supported? If not, can you point me to how I could add this functionality to the main Open-Meteo codebase?

I'd like to contribute to open-meteo and this might be a good first issue?

@markkvdb
Author

markkvdb commented Feb 5, 2024

Do you have some kind of chat (forum) where devs can ask each other questions? While getting the Open-Meteo server to work for the use cases I encounter, I'd like to share the experience with devs and other people setting up their servers!

@patrick-zippenfenig
Member

There is no generic ingestion for GRIB files. For every weather service, a specialised downloader is developed. Although most national weather services use GRIB, there are still so many differences that a unified ingestion process is not possible.

In addition, all downloaders are optimised for performance and efficiency so that updates are ingested as fast as possible. Most downloaders now use parallel multipart downloads and concurrent processing. Certain domains like CMA GRAPES download and process 220 GB of GRIB files per run, every 6 hours. With 8 cores in parallel, this is feasible within 1 hour. This level of optimisation does not make it easy for developers to integrate new data sources.
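For readers unfamiliar with the term, a parallel multipart download can be sketched roughly as follows. This is not Open-Meteo's implementation; `split_ranges`, `download_parallel` and the `fetch_part` callback are hypothetical names for illustration. It only shows the general idea: split a file into byte ranges, fetch them concurrently, and reassemble them in order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(total_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Split a file of total_size bytes into inclusive (start, end) ranges."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (start, min(start + part_size, total_size) - 1)
        for start in range(0, total_size, part_size)
    ]


def download_parallel(
    fetch_part: Callable[[int, int], bytes],  # hypothetical: fetches one byte range
    total_size: int,
    part_size: int,
    workers: int = 8,
) -> bytes:
    """Fetch all parts concurrently and join them in their original order."""
    ranges = split_ranges(total_size, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, even if parts finish out of order.
        parts = pool.map(lambda r: fetch_part(*r), ranges)
        return b"".join(parts)
```

In practice `fetch_part` would issue an HTTP request with a `Range` header; passing it in as a callback keeps the splitting and reassembly logic independent of any particular server.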

Open-Meteo is not designed to be a universal framework or database for GRIB files. The initial focus is still the API endpoint, but I can understand that more and more users might want to ingest other data sources. Right now, it would be a very rough start to write a downloader.

Which kind of MeteoFrance GRIB files do you want to integrate? Is the S3 bucket publicly accessible? GRIB files from MeteoFrance labelled SP1, HP1, etc. are quite a mess to ingest. Support for them was dropped a month ago with the integration of the new MeteoFrance API.

Do you have some kind of chat (forum) where devs can ask each other questions? While getting the Open-Meteo server to work for the use cases I encounter, I'd like to share the experience with devs and other people setting up their servers!

I was considering setting up a Discord channel. A drawback is that Discord chats are not easily indexed by search engines. Using GitHub Tickets and Discussions is better in this regard. What is your take on that?

@markkvdb
Author

markkvdb commented Feb 5, 2024

I was considering setting up a Discord channel. A drawback is that Discord chats are not easily indexed by search engines. Using GitHub Tickets and Discussions is better in this regard. What is your take on that?

I agree that issues and questions relevant to a wider audience should be shared on GitHub so that they are publicly accessible.

I do think there's also a place for more fast-paced conversations and small "coffee machine" chat that might otherwise clutter GitHub!

@markkvdb
Author

Small update: I will come back to you about sharing the MeteoFrance data no sooner than next week. But I haven't forgotten!

@patrick-zippenfenig
Member

No worries. I am pretty busy with open-meteo/open-meteo#206 right now.

@kikocorreoso

GRIB2 files can be read piecewise with HTTP range requests, because each message in the file is self-contained. See here to check how Herbie does this.

Here you can read about MeteoFrance models on AWS S3.

An example of downloading only one field, TMP:35 m above ground:anl, from a MeteoFrance GRIB2 file would be:

curl -o outFile.grib2 --range 254045-508079 https://mf-nwp-models.s3.amazonaws.com/arpege-europe/v1/2024-06-27/06/HP1/00H12H.grib2

The outFile.grib2 file can be opened using Panoply without issues.
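The same request can be sketched in Python with only the standard library. The URL and byte offsets are the ones from the curl command above; `build_range_request` and `fetch_range` are hypothetical helper names, and the sketch assumes the server answers range requests with status 206 (Partial Content).

```python
import urllib.request

# URL and byte offsets taken from the curl example above.
URL = (
    "https://mf-nwp-models.s3.amazonaws.com/"
    "arpege-europe/v1/2024-06-27/06/HP1/00H12H.grib2"
)


def build_range_request(url: str, start: int, end: int) -> urllib.request.Request:
    """Build a GET request for the inclusive byte range start..end."""
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end}")
    return req


def fetch_range(url: str, start: int, end: int, timeout: float = 60.0) -> bytes:
    """Download one byte range; fail if the server ignores the Range header."""
    req = build_range_request(url, start, end)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        if resp.status != 206:
            raise RuntimeError(f"expected 206 Partial Content, got {resp.status}")
        return resp.read()


# Usage (performs a network request, so it is left commented out):
# data = fetch_range(URL, 254045, 508079)
# with open("outFile.grib2", "wb") as f:
#     f.write(data)
```

Checking for status 206 matters because a server that does not support ranges may return the whole file with status 200 instead.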

I hope it helps.

@patrick-zippenfenig
Member

Hi, the MeteoFrance distribution on AWS does not offer the highest resolution, nor all time steps and weather variables. MeteoFrance now has an open data distribution for ARPEGE 0.25, ARPEGE 0.1, AROME 0.025 and AROME 0.01. They offer a similar S3 interface. There are still some missing files, but I am trying to report all issues to MeteoFrance and hope they get fixed in the coming weeks.

The AROME PI models, with updates every hour and 15-minutely data, are only available through the MeteoFrance API.

All those distributions are already implemented in Open-Meteo and use HTTP RANGE calls if possible (most open-data servers do not offer an index file).
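Where a server does publish an index file, byte ranges for individual fields can be derived from it. The sketch below is not Open-Meteo's code. It assumes a wgrib2-style inventory in which each line has the form `number:offset:d=date:variable:level:forecast:`, and the sample lines are made up for illustration (the second offset reuses the range from the curl example earlier in this thread). `parse_inventory` and `range_header` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Message(NamedTuple):
    number: str          # kept as text, since inventories may use "1.1" for submessages
    start: int           # first byte of the message
    end: Optional[int]   # last byte (inclusive), or None for the final message
    variable: str
    level: str
    forecast: str


def parse_inventory(text: str) -> List[Message]:
    """Parse wgrib2-style inventory lines into messages with byte ranges."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(":")
        rows.append((parts[0], int(parts[1]), parts[3], parts[4], parts[5]))
    messages = []
    for i, (number, start, variable, level, forecast) in enumerate(rows):
        # A message ends one byte before the next message starts.
        end = rows[i + 1][1] - 1 if i + 1 < len(rows) else None
        messages.append(Message(number, start, end, variable, level, forecast))
    return messages


def range_header(msg: Message) -> str:
    """Value for the HTTP Range header that selects exactly this message."""
    if msg.end is None:
        return f"bytes={msg.start}-"  # open-ended range up to end of file
    return f"bytes={msg.start}-{msg.end}"


# Illustrative inventory only, not taken from a real index file.
SAMPLE = """\
1:0:d=2024062706:TMP:2 m above ground:anl:
2:254045:d=2024062706:TMP:35 m above ground:anl:
3:508080:d=2024062706:RH:2 m above ground:anl:
"""
```

Without an index file, the offsets have to come from elsewhere, for example by reading the GRIB message headers directly, which this sketch does not cover.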


3 participants