Best way to find data variables by standard_name #567

Closed
rsignell-usgs opened this issue Sep 9, 2015 · 6 comments

Comments

@rsignell-usgs

Is there a way to return the data variables that match a specified standard_name?

I came up with this, but maybe the functionality already exists or there is a better way.

def get_std_name_vars(ds, std_name):
    # Substring match on the standard_name attribute; .items() works on Python 2 and 3
    return {k: v for k, v in ds.data_vars.items() if 'standard_name' in v.attrs and std_name in v.attrs['standard_name']}

as in this example:
http://nbviewer.ipython.org/gist/rsignell-usgs/5b263906e92ce47bf05e

@shoyer
Member

shoyer commented Sep 9, 2015

I would probably make this return an xray.Dataset object instead of a plain dict, but otherwise this looks about right:

def get_std_name_vars(ds, std_name):
    # Index the dataset with the list of matching names to get a Dataset back
    return ds[[k for k, v in ds.data_vars.items() if 'standard_name' in v.attrs and std_name in v.attrs['standard_name']]]

I don't think there's a better way to do this currently. Hypothetically, we could add this to a CF-specific API for xray, e.g., as discussed in #461.
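One detail of the snippets above is that `std_name in ...` is a substring test, not an equality test. The sketch below illustrates the difference. It assumes only that each variable exposes an `attrs` mapping, as xray variables do; `match_standard_name`, its `exact` flag, and the `SimpleNamespace` stand-ins are hypothetical names used for illustration and are not part of xray.

```python
from types import SimpleNamespace


def match_standard_name(variables, std_name, exact=False):
    """Return the names of variables whose standard_name matches std_name.

    `variables` is any mapping of name -> object with an `attrs` dict.
    With exact=False the test is a substring match, as in the thread.
    """
    names = []
    for name, var in variables.items():
        value = var.attrs.get("standard_name")
        if value is None:
            continue  # variable has no standard_name attribute
        matched = (value == std_name) if exact else (std_name in value)
        if matched:
            names.append(name)
    return names


# Stand-in variables (hypothetical data, not a real dataset)
variables = {
    "temp": SimpleNamespace(attrs={"standard_name": "sea_water_temperature"}),
    "tair": SimpleNamespace(attrs={"standard_name": "air_temperature"}),
    "lon": SimpleNamespace(attrs={}),
}

print(match_standard_name(variables, "temperature"))                  # ['temp', 'tair']
print(match_standard_name(variables, "air_temperature", exact=True))  # ['tair']
```

Whether substring or exact matching is preferable depends on the use case; the thread does not settle this.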

@rsignell-usgs
Author

I was thinking that the data variables that matched a specified standard_name would be a subset of the variables in the data_vars object.

@ocefpaf
Contributor

ocefpaf commented Sep 10, 2015

How about the function below? (Stolen from pyaxiom.)

def get_variables_by_attributes(ds, strict=False, **kwargs):
    """
    Returns variables that match specific conditions.
    * Can pass in key=value parameters; variables are returned that
      match all of them.
      ex.:
          vs = nc.get_variables_by_attributes(axis='X')
    * Can pass in key=callable parameters; variables are returned if the
      callable returns True.  The callable should accept a single parameter,
      the attribute value.  None is passed as the attribute value when the
      attribute does not exist on the variable.
      ex.:
          # Get Axis variables
          vs = nc.get_variables_by_attributes(axis=lambda v: v in ['X', 'Y', 'Z', 'T'])
          # Get variables that don't have an "axis" attribute
          vs = nc.get_variables_by_attributes(axis=lambda v: v is None)
          # Get variables that have a "grid_mapping" attribute
          vs = nc.get_variables_by_attributes(grid_mapping=lambda v: v is not None)

    * strict : True/False
        If True, return the single matching variable and raise ValueError
        unless exactly one variable matches. Default is False.
    """
    vs = []

    has_value_flag = False
    for vname, var in ds.items():
        for k, v in kwargs.items():
            if callable(v):
                has_value_flag = v(getattr(var, k, None))
                if has_value_flag is False:
                    break
            elif hasattr(var, k) and getattr(var, k) == v:
                has_value_flag = True
            else:
                has_value_flag = False
                break

        if has_value_flag is True:
            vs.append(ds[vname])

    if strict:
        if len(vs) == 1:
            vs = vs[0]
        else:
            msg = "Expected only one variable.  Got {!r}".format
            raise ValueError(msg(vs))
    return vs

See it in action here.
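As a rough illustration of the two matching modes described in the docstring (key=value and key=callable), here is a condensed sketch. It is not the pyaxiom code: it returns variable names rather than variable objects, drops the `strict` flag, and uses `SimpleNamespace` objects as hypothetical stand-ins for variables whose attributes are readable with `getattr`.

```python
from types import SimpleNamespace


def get_variables_by_attributes(variables, **kwargs):
    """Return names of variables that satisfy every key=value or key=callable test."""
    matches = []
    for name, var in variables.items():
        for key, expected in kwargs.items():
            if callable(expected):
                # The callable receives None when the attribute is missing
                ok = expected(getattr(var, key, None))
            else:
                ok = hasattr(var, key) and getattr(var, key) == expected
            if not ok:
                break
        else:
            # No test failed for this variable
            matches.append(name)
    return matches


# Stand-in variables (hypothetical data)
variables = {
    "lon": SimpleNamespace(axis="X", units="degrees_east"),
    "lat": SimpleNamespace(axis="Y", units="degrees_north"),
    "temp": SimpleNamespace(standard_name="sea_water_temperature"),
}

print(get_variables_by_attributes(variables, axis="X"))                       # ['lon']
print(get_variables_by_attributes(variables, axis=lambda v: v in ("X", "Y"))) # ['lon', 'lat']
print(get_variables_by_attributes(variables, axis=lambda v: v is None))       # ['temp']
```

Note that in this sketch a call with no keyword arguments returns every variable, because no test can fail.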

@ocefpaf
Contributor

ocefpaf commented May 4, 2016

@shoyer this made it into netcdf4, and some people in my group would like to have it in xarray too. If you think it is worth it, I can put a PR together for this.

@rsignell-usgs
Author

👍 -- I think this would be super-useful general functionality for the xarray community that doesn't come with any downside.

@shoyer
Member

shoyer commented May 4, 2016

get_variables_by_attributes does seem generic enough that we can safely add it to xarray. For our version, I would make this a Dataset method and always return another Dataset (no strict flag). Pulling out the single variable of a Dataset as a DataArray should be a separate method, e.g., .item() like the NumPy method.
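The design proposed here (the filter always returns a container of the same kind, and extracting a lone variable is a separate step) can be sketched with plain dicts. This is only an illustration under stated assumptions: `filter_by_attributes` and `single_item` are hypothetical names, a dict stands in for a Dataset, and `SimpleNamespace` objects with an `attrs` dict stand in for variables.

```python
from types import SimpleNamespace


def filter_by_attributes(variables, **kwargs):
    """Return a dict subset of `variables`, even when zero or one entry matches."""
    def matches(var):
        for key, expected in kwargs.items():
            if callable(expected):
                # Missing attributes are passed to the callable as None
                if not expected(var.attrs.get(key)):
                    return False
            elif key not in var.attrs or var.attrs[key] != expected:
                return False
        return True

    return {name: var for name, var in variables.items() if matches(var)}


def single_item(subset):
    """Return the only value in `subset`; raise ValueError otherwise."""
    if len(subset) != 1:
        raise ValueError("Expected exactly one variable, got {}".format(len(subset)))
    return next(iter(subset.values()))


# Stand-in variables (hypothetical data)
variables = {
    "lon": SimpleNamespace(attrs={"axis": "X"}),
    "lat": SimpleNamespace(attrs={"axis": "Y"}),
    "temp": SimpleNamespace(attrs={"standard_name": "sea_water_temperature"}),
}

subset = filter_by_attributes(variables, axis="X")  # still a dict, with one entry
lon = single_item(subset)                           # explicit, separate extraction
```

Keeping the return type fixed means callers never have to check whether they got a container or a bare variable, which is the role the `strict` flag played in the earlier function.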
