Very slow opening a GRIB file with 100.000 messages #142

Closed
matteodefelice opened this issue May 27, 2020 · 13 comments
Labels
enhancement New feature or request

Comments

@matteodefelice

I have downloaded an ERA5 hourly file of ~68 MB (you can download it here).
If I use the following command to open it:

ed1 = xr.open_dataset('../data/nl-era5-1981-1993-t2m.grib', engine = 'cfgrib')

Python is not able to open it: after 10 minutes it is still loading, at which point I kill the process.
This happens on Jupyter with Python 3.7.6, xarray 0.15.1 and cfgrib 0.9.8.2.

Is this behaviour expected?

@alexamici
Contributor

alexamici commented May 27, 2020

@matteodefelice no, it is not expected, but the file is rather unusual, with a tiny spatial extent (21, 9) and a huge number of time steps (113957).

Every time step is in a different GRIB message and apparently ecCodes takes quite some time in decoding:

$ time grib_ls nl-era5-1981-1993-t2m.grib | wc        
  113957 1139545 14814018

real	1m47.465s
user	1m19.928s
sys	0m21.212s

Then cfgrib performs a few operations per message via a slow C-interface.

So yes, cfgrib is not optimised to handle files with a huge number of fields even if they are small. Supporting a similar use case will need some work.

@alexamici changed the title from "Very slow opening a GRIB file" to "Very slow opening a GRIB file with 100.000 messages" on May 27, 2020

@matteodefelice
Author

Thanks for the quick reply. Actually, after converting the data to NetCDF everything was rather quick. Good to know, thanks!

@alexamici
Contributor

I'll leave the issue open for reference and to keep track of the enhancement request!

@alexamici reopened this on May 28, 2020
@alexamici added the "enhancement" (New feature or request) label on May 28, 2020

@shahramn
Collaborator

shahramn commented May 28, 2020

I just tried that file with various timings on my Linux machine with the file stored on my local disk (/tmp). Using ecCodes 2.17.0 (GNU/7.3.0)
Timings:
% grib_ls $file - 44 seconds
% grib_ls -p count,step $file - 23 seconds

Note: grib_ls by default shows the shortName key.
By excluding that key, you get a speed up as that one is more costly to calculate. But as you can see calculating the step is not slow.

I also ran a very simple Python script that loops through each message, printing the message number (count) and step:
% python3 grib_get_keys.py $file - 33 seconds
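
The grib_get_keys.py script itself is not shown in this thread. A minimal sketch of such a loop, assuming the eccodes-python package is installed, could look like the following; the function name `print_count_and_step` is hypothetical, and the counter is kept in Python rather than read from a GRIB key.

```python
def print_count_and_step(path):
    """Loop over every GRIB message in `path`, printing its position and step."""
    # Imported lazily so this module can be loaded where eccodes-python is absent.
    import eccodes

    count = 0
    with open(path, "rb") as grib_file:
        while True:
            # codes_grib_new_from_file returns None at the end of the file.
            handle = eccodes.codes_grib_new_from_file(grib_file)
            if handle is None:
                break
            try:
                count += 1
                print(count, eccodes.codes_get(handle, "step"))
            finally:
                # Always release the handle so the decoded message is freed.
                eccodes.codes_release(handle)
    return count
```

Calling `print_count_and_step("nl-era5-1981-1993-t2m.grib")` would walk the file one message at a time, which is the access pattern being timed here.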

Perhaps you have very slow disk access.

@alexamici
Contributor

@shahramn the default install of cfgrib (still) uses the internal bindings in ABI mode, so every call to an ecCodes function incurs the CFFI overhead.

So most probably the performance issue with many messages is more on the cfgrib side than on ecCodes.

@Plantain

Plantain commented Jun 5, 2020

the default install of cfgrib uses the internal bindings
Is there an alternative/non-default way? Performance is quite slow for our use cases.

@alexamici
Contributor

alexamici commented Jun 5, 2020

@Plantain an undocumented feature is to set the CFGRIB_USE_EXTERNAL_ECCODES_BINDINGS environment variable to 1 to switch to using eccodes-python, but not all the features are supported at the moment. YMMV.
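
A minimal sketch of setting that variable from Python rather than from the shell. It assumes cfgrib reads the variable when it is first imported, so the variable has to be set before `import cfgrib`; the helper name `use_external_eccodes_bindings` is hypothetical.

```python
import os


def use_external_eccodes_bindings(enabled=True):
    """Set the switch named in this thread; call this before importing cfgrib."""
    # Assumption: cfgrib inspects this variable at import time.
    value = "1" if enabled else "0"
    os.environ["CFGRIB_USE_EXTERNAL_ECCODES_BINDINGS"] = value
    return value


use_external_eccodes_bindings()
# import cfgrib  # import only after the variable has been set
```

The shell equivalent is to export the variable before starting Python or Jupyter.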

@matteodefelice
Author

Ciao @alexamici, are you still using the internal bindings in 0.9.9, or is the switch something we might expect for 1.0?

@alexamici
Contributor

alexamici commented Dec 10, 2020

@matteodefelice in fact master already uses external bindings by default and I just merged a PR to improve performance.

Unfortunately, just today I tried your example again and it still takes several minutes to open. But at least cfgrib now opens it :)

I'll do a new release soon.

@matteodefelice
Author

You are right, it opens it, but once converted to NetCDF it loads in less than a second. I know that most users work with large grids rather than long time series, but the user should at least be warned about this limitation. Don't you think?

@alexamici
Contributor

alexamici commented Dec 12, 2020

@aurghs and I are on a performance optimisation spree, and we have identified and fixed a number of bottlenecks.

At the moment I'm tackling the issues with your file, which is in fact the worst of the worst cases: I found a couple of code sections that scale quadratically in the number of values in a dimension.
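
To illustrate what quadratic scaling in the number of values can look like, here is a generic sketch. It is not the actual cfgrib code; both function names are hypothetical. Collecting the distinct values of a dimension with a list membership test costs one scan per value, while a dict-based version does a constant-time lookup per value.

```python
def unique_values_quadratic(values):
    """Collect distinct values in first-seen order with a list scan per value."""
    seen = []
    for value in values:
        # `value not in seen` scans the whole list, so n values cost O(n^2).
        if value not in seen:
            seen.append(value)
    return seen


def unique_values_linear(values):
    """Same result, but each membership check is a hash lookup."""
    # dict preserves insertion order, so the first-seen order is kept.
    return list(dict.fromkeys(values))
```

With roughly 114,000 time steps, as in the sample file, the first version performs billions of comparisons where the second performs about 114,000 lookups.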

One fix is already in stable/0.9.9.x, hopefully a second one is coming.

@alexamici
Contributor

@matteodefelice and that was faster than I expected. stable/0.9.9.x has a second fix and now it can open the above file in less than 2 minutes the first time you access it (when it builds the index), and almost instantly for all subsequent times.
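
The slow first access followed by fast later accesses is the usual behaviour of an on-disk index cache. A generic sketch of that pattern follows, under stated assumptions: it is not cfgrib's implementation, and the `load_or_build_index` helper, the `build_index` callback and the `.idx` suffix are all hypothetical.

```python
import os
import pickle


def load_or_build_index(data_path, build_index, index_suffix=".idx"):
    """Return a cached index for `data_path`, building and saving it if needed."""
    index_path = data_path + index_suffix
    # Reuse the cached index only if it is at least as recent as the data file.
    if os.path.exists(index_path) and (
        os.path.getmtime(index_path) >= os.path.getmtime(data_path)
    ):
        with open(index_path, "rb") as index_file:
            return pickle.load(index_file)
    # Slow path: scan the data once, then store the result for next time.
    index = build_index(data_path)
    with open(index_path, "wb") as index_file:
        pickle.dump(index, index_file)
    return index
```

The expensive scan runs only on the first call; later calls pay just for reading the pickle.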

I'll do a proper release soon.

@alexamici
Contributor

alexamici commented Jan 4, 2021

This is actually fixed in master, and now cfgrib to_netcdf converts your sample file in two and a half minutes, much faster than ecCodes grib_to_netcdf: 2 minutes to create the index and 30 seconds to actually perform the conversion.
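
For reference, the same conversion can be sketched from Python with xarray, assuming both xarray and cfgrib are installed. The helper names `netcdf_path_for` and `convert_grib_to_netcdf` are hypothetical, and this is not necessarily what the cfgrib to_netcdf command does internally.

```python
from pathlib import Path


def netcdf_path_for(grib_path):
    """Return the output path: the input path with its suffix replaced by .nc."""
    return Path(grib_path).with_suffix(".nc")


def convert_grib_to_netcdf(grib_path):
    """Open a GRIB file through the cfgrib engine and write it out as NetCDF."""
    # Imported lazily so this module can be loaded where xarray is absent.
    import xarray as xr

    target = netcdf_path_for(grib_path)
    with xr.open_dataset(grib_path, engine="cfgrib") as dataset:
        dataset.to_netcdf(target)
    return target
```

Once the NetCDF file exists, `xr.open_dataset` on it avoids the per-message GRIB decoding entirely, which matches the sub-second load time reported earlier in the thread.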

@matteodefelice thanks for reporting the issue with the very best sample file and sorry that it took a bit.

Also, note that this work was sponsored by the CDS.
