improve speed of cmems ini file generation

Generating a CMEMS inifile is very slow if the source folder contains 1000+ netCDF files. This is due to `xr.open_mfdataset()`.

Reproducible code (it is much faster when the year is added to the pattern, which selects only 25% of the files):
```python
import os
import dfm_tools as dfmt
from dfm_tools.modelbuilder import get_ncvarname
import xarray as xr

# user input
model_name = 'DCSM-FM' # the name cannot contain a space
date_min = '2012-01-01'
dir_output_data_cmems = r'p:/11211535-004-dcsm-fm/data/CMEMS/'

# convert downloaded CMEMS data to initial fields
# dir_pattern = os.path.join(dir_output_data_cmems,'cmems_{ncvarname}_' f'{date_min[0:4]}*.nc')
dir_pattern = os.path.join(dir_output_data_cmems,'cmems_{ncvarname}_*.nc')

xr_kwargs = {"join":"exact", "data_vars":"minimal"}

conversion_dict = dfmt.get_conversion_dict()

quan_bnd = "salinitybnd"
ncvarname = get_ncvarname(
    quantity=quan_bnd,
    conversion_dict=conversion_dict,
    )
dir_pattern_one = dir_pattern.format(ncvarname=ncvarname)

data_xr = xr.open_mfdataset(dir_pattern_one, **xr_kwargs)
```

Todo:
- [x] check if the performance can be improved with additional arguments: `parallel=True, coords="minimal", compat="equals"` >> not much effect
- [x] Related xarray issue (7 years old but still active): https://github.com/pydata/xarray/issues/1385 >> fixed in https://github.com/pydata/xarray/pull/10062, but it will take some time before the new defaults apply to everyone without explicitly opting in >> this does not increase the speed either, there are simply too many files.
- [x] alternatively, allow the user to provide a hardcoded list (a selection) of files, this improves the speed dramatically
- [x] update whatsnew
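
A minimal sketch of the file-list alternative, not the actual dfm_tools implementation. It assumes the year directly follows `cmems_{ncvarname}_` in the file names, as the commented-out pattern in the reproducible code suggests. The helper `select_files` and its arguments are hypothetical. `xr.open_mfdataset()` accepts a list of paths as well as a glob string, so the selection can be passed to it directly.

```python
import glob
import os


def select_files(dir_data, ncvarname, years):
    """Return the sorted CMEMS files of one variable, limited to the given years.

    Hypothetical helper: assumes file names like cmems_{ncvarname}_{year}*.nc
    """
    file_list = []
    for year in years:
        # same pattern as in the reproducible code, with the year included
        pattern = os.path.join(dir_data, f"cmems_{ncvarname}_{year}*.nc")
        file_list.extend(glob.glob(pattern))
    return sorted(file_list)


# usage with the variables from the reproducible code above:
# file_list = select_files(dir_output_data_cmems, ncvarname, years=[2012])
# data_xr = xr.open_mfdataset(file_list, **xr_kwargs)
```

Selecting the files before opening them keeps the number of datasets that xarray has to open and align small, which appears to be the main cost here.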
