allow specifying a fill value per variable #4165

Closed
@keewis

While working on #4163 I noticed that the fill_value parameter of align (and possibly also reindex, concat, merge and combine_*?) is applied to all variables except dimension coordinates, which is not ideal when working with quantities (a single fill value cannot carry the right units for every variable). Would it make sense to optionally allow fill_value to be a dict that maps variable names to fill values?

Consider this:

In [2]: a = xr.Dataset( 
   ...:     data_vars={"a": ("x", [12, 14, 13, 10, 8])}, 
   ...:     coords={"x": [-2, -1, 0, 1, 2], "u": ("x", [-20, -10, 0, 10, 20])},  
   ...: ) 
   ...: b = xr.Dataset( 
   ...:     data_vars={"b": ("x", [7, 9, 3])}, 
   ...:     coords={"x": [0, 3, 4], "u": ("x", [0, 30, 40])}, 
   ...:  
   ...: ) 
   ...:  
   ...: xr.align(a, b, join="outer", fill_value=-50)
Out[2]: 
(<xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -20 -10 0 10 20 -50 -50
 Data variables:
     a        (x) int64 12 14 13 10 8 -50 -50,
 <xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -50 -50 0 -50 -50 30 40
 Data variables:
     b        (x) int64 -50 -50 7 -50 -50 9 3)

I'd like to be able to do something like this instead:

In [3]: xr.align(a, b, join="outer", fill_value={"a": -30, "b": -40, "u": -50})
Out[3]: 
(<xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -20 -10 0 10 20 -50 -50
 Data variables:
     a        (x) int64 12 14 13 10 8 -30 -30,
 <xarray.Dataset>
 Dimensions:  (x: 7)
 Coordinates:
   * x        (x) int64 -2 -1 0 1 2 3 4
     u        (x) int64 -50 -50 0 -50 -50 30 40
 Data variables:
     b        (x) int64 -40 -40 7 -40 -40 9 3)

I could get there by passing the default (dtypes.NA) and then using fillna, but fillna only seems to work with data variables, so coordinates would need to go through a reset_coords / set_coords cycle. That approach also changes the dtype to float, so the result would have to be cast back.
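For reference, the workaround described above could look roughly like the sketch below. The helper name fill_per_variable and its dtype argument are made up for illustration and are not part of xarray. It assumes every missing value is covered by an entry in the dict; otherwise the cast back to an integer dtype would not be meaningful.

```python
import xarray as xr

a = xr.Dataset(
    data_vars={"a": ("x", [12, 14, 13, 10, 8])},
    coords={"x": [-2, -1, 0, 1, 2], "u": ("x", [-20, -10, 0, 10, 20])},
)
b = xr.Dataset(
    data_vars={"b": ("x", [7, 9, 3])},
    coords={"x": [0, 3, 4], "u": ("x", [0, 30, 40])},
)

fill_values = {"a": -30, "b": -40, "u": -50}


def fill_per_variable(ds, fill_values, dtype="int64"):
    """Hypothetical helper: fill NaNs per variable, including non-dimension coords."""
    # fillna only touches data variables, so demote non-dimension coordinates first
    coord_names = [name for name in ds.coords if name not in ds.dims]
    demoted = ds.reset_coords(coord_names)
    # fillna rejects dict keys that are not data variables of the dataset
    values = {k: v for k, v in fill_values.items() if k in demoted.data_vars}
    # aligning with the default fill value promoted the ints to float; cast back
    filled = demoted.fillna(values).astype(dtype)
    # restore the coordinates that were demoted above
    return filled.set_coords(coord_names)


# align with the default fill value (NaN), then fill each variable separately
aligned_a, aligned_b = xr.align(a, b, join="outer")
filled_a = fill_per_variable(aligned_a, fill_values)
filled_b = fill_per_variable(aligned_b, fill_values)
```

This needs one pass per dataset plus an explicit cast, which is the extra work a dict-valued fill_value would avoid.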
