[REQUEST]: MPI-ESM1-2-HR historical #116
Comments
Hi @kareed1, thanks for raising an issue here! I assume you are still using the 'old' catalog file here. Can you provide some more information (a small code snippet) on how you are accessing the data currently? The current catalog (more info on how to access it) does not seem to have that iid:
def zstore_to_iid(zstore: str):
    # this is a bit whacky to account for the different way of storing old/new stores
    return '.'.join(zstore.replace('gs://','').replace('.zarr','').replace('.','/').split('/')[-11:-1])
iids_requested = [
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.Amon.tas.gn.v20190710',
]
import intake
# uncomment/comment lines to swap catalogs
url = "https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog.json"
col = intake.open_esm_datastore(url)
iids_all= [zstore_to_iid(z) for z in col.df['zstore'].tolist()]
iids_uploaded = [iid for iid in iids_all if iid in iids_requested]
iids_uploaded gives an empty list. I will add this to the ingestion and see what we get. |
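To make the path-to-iid conversion above easier to follow, here is a minimal sketch that runs the same string logic on two made-up store paths. The bucket and prefix names are hypothetical and only stand in for the "old" (slash-separated) and "new" (dot-separated, `.zarr`) layouts; the real store paths may differ. Note that the `[-11:-1]` slice assumes the path ends with a trailing slash.

```python
def zstore_to_iid(zstore: str) -> str:
    """Same string logic as the helper quoted above."""
    parts = zstore.replace('gs://', '').replace('.zarr', '').replace('.', '/').split('/')
    # Keep the ten facets before the final (empty) path component.
    return '.'.join(parts[-11:-1])

# Hypothetical store paths, invented for illustration only.
old_style = 'gs://example-bucket/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/historical/r1i1p1f1/Amon/tas/gn/v20190710/'
new_style = ('gs://example-bucket/some-prefix/'
             'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.Amon.tas.gn.v20190710.zarr/')

expected = 'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.Amon.tas.gn.v20190710'
iid_old = zstore_to_iid(old_style)
iid_new = zstore_to_iid(new_style)
print(iid_old == expected, iid_new == expected)
```

Both layouts reduce to the same ten-facet iid, which is why a simple membership test against the requested iids works across catalogs.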
See my comments in #119: unfortunately, this seems to require some deeper debugging. We'll get to the bottom of this eventually! |
Hi @jbusecke, thank you for the updates and your assistance on this. It probably is from the old catalog. I had found some code online to get started, so I'm not sure how old that code was. Below is an example of the Python code I'm using.
|
Cool, thanks for that info. That all looks good, but I recommend using |
I have high hopes that a solution to jbusecke/pangeo-forge-esgf#42 will address this issue too. |
* Add requested data for #116
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Update iids_pr.yaml
* Update iids_pr.yaml
* Update iids.yaml
* Update iids_pr.yaml
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
OK, the dataset was ingested, but it ended up in our non-qc catalog. I did some digging:
Let's load the store and run our tests (failing these causes a store to be put in our non-qc catalog):
import zarr
from pangeo_forge_esgf.utils import facets_from_iid
from leap_data_management_utils.cmip_testing import test_all
import intake
# uncomment/comment lines to swap catalogs
url = "https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog_noqc.json" # Only stores that fail current
col = intake.open_esm_datastore(url)
iid = 'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.Amon.tas.gn.v20190710'
facets = facets_from_iid(iid)
del facets['mip_era']
cat = col.search(**facets)
store = zarr.storage.FSStore(cat.df['zstore'].tolist()[0])
test_all(store, iid) gives
so the time is not continuous! We can confirm that:
import matplotlib.pyplot as plt
import xarray as xr
ds = xr.open_dataset(store, engine='zarr')
# Note: do not use the built-in plot here. It plots time against itself rather than
# against the array index, so the time axis would look continuous.
plt.plot(ds.time)
Yeah, that's not great... but it's fixable!
plt.plot(ds.sortby('time').time)
So @kareed1, you can use the above to work with the dataset for now. I want to understand how this happened, though...
from pangeo_forge_esgf.client import ESGFClient
iid = 'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.Amon.tas.gn.v20190710'
client = ESGFClient()
dataset_id = client.get_instance_id_input([iid])[iid]['id']
file_dict = client.get_recipe_inputs_from_dataset_ids([dataset_id])
list(file_dict[iid].keys())
This seems fine.
My first suspicion was that the files are not correctly concatenated, but that might not be it. |
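As a library-free illustration of that suspicion, the sketch below builds a monthly time axis from made-up chunks concatenated in the wrong order and checks it against its position in the array. The chunk boundaries are invented, not taken from the actual store. Sorting restores a monotonic axis, which mirrors what `sortby('time')` does above, but sorting alone cannot reveal or fill missing periods.

```python
from datetime import date

def is_monotonic(times):
    """True if every time stamp is strictly later than the one before it."""
    return all(a < b for a, b in zip(times, times[1:]))

def monthly(start_year, end_year):
    """First-of-month stamps for every month from start_year through end_year."""
    return [date(y, m, 1) for y in range(start_year, end_year + 1) for m in range(1, 13)]

# Made-up chunks, deliberately concatenated out of order.
chunks = [monthly(1915, 1919), monthly(1925, 1929), monthly(1920, 1924)]
time = [t for chunk in chunks for t in chunk]

print(is_monotonic(time))          # False: the axis jumps back from 1929 to 1920
print(is_monotonic(sorted(time)))  # True: sorting restores the order
```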
Oh wait, this is not a complete set of files! How strange. |
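One way to spot an incomplete file set like this is to parse the time range that CMIP6 file names carry as a `_YYYYMM-YYYYMM.nc` suffix (for monthly data) and look for holes between consecutive files. The sketch below is only an illustration: the file names are invented examples in that naming pattern, and `find_gaps` is a hypothetical helper, not part of pangeo-forge-esgf.

```python
import re

def time_range(filename):
    """Parse ((start_year, start_month), (end_year, end_month)) from a
    file name ending in _YYYYMM-YYYYMM.nc."""
    m = re.search(r'_(\d{4})(\d{2})-(\d{4})(\d{2})\.nc$', filename)
    if m is None:
        raise ValueError(f'no time range found in {filename!r}')
    y0, m0, y1, m1 = map(int, m.groups())
    return (y0, m0), (y1, m1)

def next_month(ym):
    """Return the (year, month) that follows ym."""
    year, month = ym
    return (year + 1, 1) if month == 12 else (year, month + 1)

def find_gaps(filenames):
    """Return (first_missing_month, start_of_next_file) for every hole
    between consecutive files, after sorting the files by start date."""
    ranges = sorted(time_range(f) for f in filenames)
    gaps = []
    for (_, prev_end), (start, _) in zip(ranges, ranges[1:]):
        expected_start = next_month(prev_end)
        if start > expected_start:  # tuples compare as (year, month)
            gaps.append((expected_start, start))
    return gaps

# Invented file names for illustration; 1925-1929 is deliberately left out.
prefix = 'tas_Amon_MPI-ESM1-2-HR_historical_r1i1p1f1_gn_'
files = [
    prefix + '191501-191912.nc',
    prefix + '193001-193412.nc',
    prefix + '192001-192412.nc',
]
gaps = find_gaps(files)
print(gaps)  # [((1925, 1), (1930, 1))]
```

This only detects holes between files. Periods missing before the first file or after the last one would need a comparison against the expected span of the experiment.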
I'll move the discussion over to jbusecke/pangeo-forge-esgf#46, but will close this for now. Feel free to use the non-qc data in the meantime, but proceed with caution, @kareed1. |
List of requested iids
Description
Hello,
On both Google and AWS, the above-noted dataset appears to contain only the years 1915-1959. I'm not sure if this was on purpose. I'd like to request that data for Jan 1985-Dec 2014 be added to the repositories. Thank you for making CMIP6 data easier to access!