More support for pop data #240
I don't know how to address this issue for (2D) dimension coordinates, unfortunately. Ccing @dcherian, who may have ideas/recommendations.
Re: ds_subset = ds_subset.where(ds[time_dim_name].dt.season == 'SON', drop=True)
it's a lot more efficient to do ds_subset.cf.sel(time=ds.cf["time"].dt.season == "SON")
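To make the comparison concrete, here is a minimal sketch of the two selection styles. It uses plain xarray `sel` rather than the cf_xarray `.cf` accessor from the suggestion, and it assumes the time coordinate is literally named `time`. The dataset, the variable name `sst`, and the date range are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series covering one year (illustration only).
times = pd.date_range("2000-01-01", "2000-12-31", freq="D")
ds = xr.Dataset(
    {"sst": ("time", np.arange(times.size, dtype=float))},
    coords={"time": times},
)

# Boolean mask that is True for September, October and November.
is_son = ds["time"].dt.season == "SON"

# where(..., drop=True) masks every variable, then drops the masked labels.
via_where = ds.where(is_son, drop=True)

# Boolean indexing along time picks the matching labels directly.
via_sel = ds.sel(time=is_son)
```

Both results contain the same time steps; the second form skips the intermediate masking of the data variables.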
mlon = ds_subset[lon_coord_name].compute()
# euclidean dist for now....
di = np.sqrt(np.square(ad_lon - mlon) + np.square(lat - mlat))
index = np.where(di == np.min(di))
Suggested change:
- index = np.where(di == np.min(di))
+ index = np.argmin(di)
good to know - thank you!
So I tried this suggestion, but don't understand the error. Using the simple example from below again:
import numpy as np
import pandas as pd
import xarray as xr
mylat = np.array([[1., 2., 3., 4.], [1.1, 2.1, 3.1, 3.9], [1, 1.9, 2.9, 3.9]])
mylon = np.array([[3., 4., 5., 6.], [3.1, 4.1, 5.1, 5.9], [3, 3.9, 4.9, 5.9]])
mytimes = pd.date_range('2000-01-01', periods=5)
pop_data = np.random.uniform(low=0.0, high=100.0, size=(3,4,5))
ds_subset = xr.DataArray(
    pop_data,
    coords={
        "time": mytimes,
        "TLAT": (("nlat", "nlon"), mylat, {'standard_name': 'latitude', 'units': 'degrees_north'}),
        "TLON": (("nlat", "nlon"), mylon, {'standard_name': 'longitude', 'units': 'degrees_east'}),
    },
    dims=['nlat', 'nlon', 'time'],
    attrs={'long_name': 'Surface Potential'},
)
Then I do this calculation:
mlat = ds_subset["TLAT"].compute()
mlon = ds_subset["TLON"].compute()
ad_lat = 3.1
ad_lon = 4
di = np.sqrt(np.square(ad_lon - mlon) + np.square(ad_lat - mlat))
index = np.argmin(di)
Then I get this error:
"ValueError: dimensions ('nlat', 'nlon') must have the same length as the number of data dimensions, ndim=0 "
(doing 'index = np.where(di == np.min(di))' does not give an error)
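For reference, a minimal sketch of two ways to get a 2D (nlat, nlon) position for the minimum of the distance array above. It does not diagnose the error. It only assumes that a (row, column) pair is what the later indexing needs, whereas `np.argmin` with no axis is defined on the flattened array. The first option works on the raw NumPy values; the second passes a list of dimension names to `DataArray.argmin`, which needs a reasonably recent xarray.

```python
import numpy as np
import xarray as xr

# Same toy 2D coordinates as in the example above.
mylat = np.array([[1., 2., 3., 4.], [1.1, 2.1, 3.1, 3.9], [1, 1.9, 2.9, 3.9]])
mylon = np.array([[3., 4., 5., 6.], [3.1, 4.1, 5.1, 5.9], [3, 3.9, 4.9, 5.9]])
mlat = xr.DataArray(mylat, dims=("nlat", "nlon"))
mlon = xr.DataArray(mylon, dims=("nlat", "nlon"))

ad_lat = 3.1
ad_lon = 4.0
di = np.sqrt(np.square(ad_lon - mlon) + np.square(ad_lat - mlat))

# Option 1: flat argmin on the underlying array, then unravel to (row, col).
flat_index = np.argmin(di.values)
i, j = np.unravel_index(flat_index, di.shape)

# Option 2: let xarray return one index per named dimension.
idx = di.argmin(dim=["nlat", "nlon"])
i2, j2 = int(idx["nlat"]), int(idx["nlon"])
```

Either pair can then be passed to `isel(nlat=..., nlon=...)`.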
I can't tell; can you add a simple example please?
Yes - I will make a simple example tomorrow - thanks for your help.
@dcherian Here is my simple example:
So in the above, the coords TLAT and TLON have dims (nlat, nlon). Now I do some calculations and want the data at just a single grid point:
Now the dims (nlat, nlon) are missing from the TLAT and TLON coordinates. My question is: how can I put them back? (I need them for a further calculation.)
Use [...]. The list preserves the dimension.
Thanks so much. I can't tell you how long I spent trying to figure that out....
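The snippet from the answer did not survive in this thread. One reading, offered here as an assumption rather than the exact code that was posted, is positional indexing with a one-element list instead of a scalar. The sketch below rebuilds the toy array from the earlier example and compares the two forms; the grid indices `2, 2` are arbitrary.

```python
import numpy as np
import pandas as pd
import xarray as xr

mylat = np.array([[1., 2., 3., 4.], [1.1, 2.1, 3.1, 3.9], [1, 1.9, 2.9, 3.9]])
mylon = np.array([[3., 4., 5., 6.], [3.1, 4.1, 5.1, 5.9], [3, 3.9, 4.9, 5.9]])
mytimes = pd.date_range("2000-01-01", periods=5)
pop_data = np.random.uniform(low=0.0, high=100.0, size=(3, 4, 5))

da = xr.DataArray(
    pop_data,
    coords={
        "time": mytimes,
        "TLAT": (("nlat", "nlon"), mylat),
        "TLON": (("nlat", "nlon"), mylon),
    },
    dims=["nlat", "nlon", "time"],
)

# Scalar indexers drop nlat and nlon, so TLAT/TLON become 0-dimensional.
point_scalar = da.isel(nlat=2, nlon=2)

# One-element lists keep nlat and nlon as length-1 dimensions.
point_list = da.isel(nlat=[2], nlon=[2])

print(point_scalar["TLAT"].dims)  # no dimensions left
print(point_list["TLAT"].dims)    # nlat and nlon are still present
```

With the list form, TLAT and TLON stay 2D (shape 1 x 1), so later code that looks up their dimension names keeps working.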
    lon_dim_name = ds_subset.cf['longitude'].dims[0]
elif latdim == 2:
    lat_dim_name = dd[0]
    lon_dim_name = dd[1]
same as above
:((((( It isn't even documented; we need some kind of indexing cheatsheet.
This fixes a number of issues for POP data. Now we are very close to having everything working!
@andersy005 - could you help me with one bit that is broken? It is demonstrated in the last cell of the POPData notebook. The issue is that when I grab POP data at a fixed lat and lon, the TLAT and TLON coords no longer have the dimensions (nlat, nlon). This happens in subset_data() in util.py; I added a comment at the end of that function. I spent several hours on Friday trying to figure out how to add the dimensions back to these coordinates, but to no avail :). Any help is appreciated. (Note that the CAM data is unaffected because its lat and lon are one-dimensional, so I can grab them independently with isel(). For POP, I had to figure out which grid point was closest from the 2D lat and lon data.)