Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support mixed data formats #416

Merged
merged 2 commits into from
Dec 15, 2021
Merged

Support mixed data formats #416

merged 2 commits into from
Dec 15, 2021

Conversation

aulemahal
Copy link
Contributor

@aulemahal aulemahal commented Dec 14, 2021

Change Summary

  • Made Assets.format optional
  • Added format_column_name to ESMDataSource's init
  • Format information is added a column on ESMDataSource's internal df. xarray_open_kwargs are generated independently for each assets (instead of only once).

Related issue number

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable

To make this PR simple, I decided to use an hardcoded name for the new column (_data_format_). Is this dangerous? It could be a column name stored in another variable, or the series could be independent from the main dataframe? The latter seems dangerous too.

I added a sample collection that has 4 assets of CMIP6. 2 assets are from Google Storage (zarr) and the 2 others are local files. This has the bonus to also test that to_dataset_dict also works with remote and local assets within the same dataset.

@andersy005
Copy link
Member

andersy005 commented Dec 14, 2021

To make this PR simple, I decided to use an hardcoded name for the new column (data_format). Is this dangerous?

I think this is good enough (It's similar to what we used to do before the re-write in #368).

I added a sample collection that has 4 assets of CMIP6. 2 assets are from Google Storage (zarr) and the 2 others are local files.

👍🏽 you forgot to commit the file though.

@aulemahal
Copy link
Contributor Author

aulemahal commented Dec 15, 2021

Woops! And on the day I am working from home with the other machine shut down... I'll push them tomorrow.
EDIT I realized the machine was indeed reachable and not shut down!

Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great 👍🏽

@andersy005 andersy005 merged commit 40fe3a7 into intake:main Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mixed formats in catalog
2 participants