Custom indexes and coordinate (re)ordering #7002

Open
benbovy opened this issue Sep 7, 2022 · 2 comments · May be fixed by #8111

Comments

@benbovy
Member

benbovy commented Sep 7, 2022

What is your issue?

(From #5647 (comment)).

The current alignment logic (as refactored in #5692) requires that two compatible indexes (i.e., of the same type) relate to one or more coordinates with matching names, and also that those coordinates appear in the same order.

For some multi-coordinate indexes like PandasMultiIndex this makes sense. However, for other multi-coordinate indexes (e.g., staggered grid indexes) the order of the coordinates doesn't matter much.

Possible options:

  1. Setting new Xarray indexes may reorder the coordinate variables, possibly via Index.create_variables(), to ensure consistent order
  2. Xarray indexes must implement an Index.matching_key abstract property in order to support re-indexing and alignment.
  3. Take care of coordinate order (and maybe other things) inside Index.join and Index.equals, e.g., for PandasMultiIndex maybe reorder the levels beforehand.
    • pros: more flexible
    • cons: not great to implicitly reorder levels if it's a costly operation?
  4. Find matching indexes using a two-pass approach: (1) group all indexes by dimension name and (2) check compatibility between the indexes listed in each group.
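
A rough sketch of what the two-pass approach in option 4 could look like, written in plain Python rather than against the actual Aligner class. `FakeIndex`, `is_compatible` and `group_matching_indexes` are hypothetical names used only for illustration; they are not part of xarray's API, and the compatibility rule shown is an assumption for an order-insensitive index.

```python
from collections import defaultdict
from collections.abc import Hashable


class FakeIndex:
    """Hypothetical stand-in for an index; not xarray's Index class."""

    def __init__(self, coord_names: tuple[Hashable, ...], dims: tuple[Hashable, ...]):
        self.coord_names = coord_names
        self.dims = dims

    def is_compatible(self, other: "FakeIndex") -> bool:
        # Assumed rule for an order-insensitive index: same type and the
        # same set of coordinate names, in any order.
        return type(self) is type(other) and set(self.coord_names) == set(other.coord_names)


def group_matching_indexes(objects):
    """Two-pass matching over a list of per-object index lists."""
    # Pass 1: group all indexes by the dimension names they span.
    by_dims = defaultdict(list)
    for indexes in objects:
        for idx in indexes:
            by_dims[frozenset(idx.dims)].append(idx)

    # Pass 2: within each group, let the indexes decide compatibility.
    result = {}
    for dims, group in by_dims.items():
        first = group[0]
        result[dims] = all(first.is_compatible(other) for other in group[1:])
    return result


# Same coordinates in a different order: compatible under the assumed rule.
same = group_matching_indexes(
    [[FakeIndex(("a", "b"), ("x",))], [FakeIndex(("b", "a"), ("x",))]]
)
# Different coordinate names on the same dimension: not compatible.
different = group_matching_indexes(
    [[FakeIndex(("a", "b"), ("x",))], [FakeIndex(("a", "c"), ("x",))]]
)
```

With this shape, an order-sensitive index would override the compatibility check to compare coordinate names as an ordered tuple instead of a set.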
benbovy added the needs triage label Sep 7, 2022
benbovy added the topic-indexing label and removed the needs triage label Sep 7, 2022
@shoyer
Member

shoyer commented Sep 13, 2022

I like option (4). If a multi-coordinate index needs to care about order, it can implement that logic itself.

@benbovy
Member Author

benbovy commented Aug 23, 2023

> If a multi-coordinate index needs to care about order, it can implement that logic itself.

Agreed.

Option 4 would be nice indeed but it might be difficult to implement in the current Aligner class.

Another (easier) option would be to sort the coordinate names of each unique index before using them as the hash key for finding the list of indexes to compare together, i.e.,

SortedCoordNamesAndDims = tuple[tuple[Hashable, tuple[Hashable, ...]], ...]
MatchingIndexKey = tuple[SortedCoordNamesAndDims, type[Index]]

instead of

CoordNamesAndDims = tuple[tuple[Hashable, tuple[Hashable, ...]], ...]
MatchingIndexKey = tuple[CoordNamesAndDims, type[Index]]
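
The effect of sorting can be illustrated in plain Python, without xarray. The helper `matching_key` and the stand-in class `StaggeredGridIndex` below are hypothetical names for illustration only; the sketch assumes the key is built from (coordinate name, dims) pairs plus the index type, as in the aliases above.

```python
from collections.abc import Hashable, Mapping

Dims = tuple[Hashable, ...]
CoordNamesAndDims = tuple[tuple[Hashable, Dims], ...]


class StaggeredGridIndex:
    """Hypothetical stand-in for an order-insensitive multi-coordinate index."""


def matching_key(
    coord_dims: Mapping[Hashable, Dims], index_type: type, sort: bool = True
) -> tuple[CoordNamesAndDims, type]:
    """Build a hashable key from an index's coordinates and its type."""
    items = list(coord_dims.items())
    if sort:
        # Sort by coordinate name (compared as strings, so mixed hashable
        # names still order) so that insertion order no longer matters.
        items.sort(key=lambda item: str(item[0]))
    return tuple(items), index_type


# The same two coordinates, declared in opposite orders.
a = {"x_center": ("x_center",), "x_edge": ("x_edge",)}
b = {"x_edge": ("x_edge",), "x_center": ("x_center",)}

unsorted_match = (
    matching_key(a, StaggeredGridIndex, sort=False)
    == matching_key(b, StaggeredGridIndex, sort=False)
)
sorted_match = matching_key(a, StaggeredGridIndex) == matching_key(b, StaggeredGridIndex)
```

Here the unsorted keys differ while the sorted keys are equal, so both indexes would land in the same bucket. An index for which order does matter would then have to check it itself, for example in Index.equals or Index.join.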

benbovy linked a pull request Aug 24, 2023 that will close this issue