Skip to content

Proof of concept: MeshIndexSet #6014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

stephenworsley
Copy link
Contributor

🚀 Pull Request

Description

Implements a possible way to subset a Mesh while retaining a reference to the original Mesh. A MeshIndexSet would be effectively an implementation of the UGRID concept of a location index set. This particular implementation has the benefit of appearing like a regular mesh for most opperations and is capable of functioning with iris-esmf-regrid, for example. The benefit of such a set over a strictly smaller Mesh is that it opens the option for introducing sophisticated functions which can "stitch" a collection of MeshIndexSets back together into their original Mesh. One use case which this may be useful in is when calling expensive functions, like in iris-esmf-regrid, which a full Mesh would be too large to process as a whole so it must be processed as a series of submeshes. It may well be that such cases are specific enough that it is better to solve each on their own terms, but if such problems are general enough, there may be a case to introduce a class like this one.

Copy link

codecov bot commented Jun 19, 2024

Codecov Report

Attention: Patch coverage is 25.47771% with 117 lines in your changes missing coverage. Please review.

Project coverage is 89.35%. Comparing base (e92e36a) to head (7c05019).
Report is 5 commits behind head on main.

Files Patch % Lines
lib/iris/experimental/ugrid/mesh.py 25.47% 117 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6014      +/-   ##
==========================================
- Coverage   89.75%   89.35%   -0.41%     
==========================================
  Files          90       90              
  Lines       22929    23102     +173     
  Branches     5020     5062      +42     
==========================================
+ Hits        20580    20642      +62     
- Misses       1618     1729     +111     
  Partials      731      731              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to get my head into this space as I think it's a valuable and exciting addition to the API. Here are some interim comments. Ideally I could propose some actual code changes via a PR soon but I can't guarantee it.

return self.indices
elif self.location == "edge":
connectivity = self.mesh.edge_node_connectivity[self.indices]
node_set = list(set(connectivity.indices.compressed()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compressed() is only available for masked arrays. We currently allow non-masked arrays too. This is a similar problem to SciTools/iris-esmf-regrid#323.

if self.location == "node":
return self.indices
elif self.location == "edge":
connectivity = self.mesh.edge_node_connectivity[self.indices]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you're doing here is completely valid, but it might take some getting used to - need more developer education to avoid people like me getting confused!

A Connectivity also has indices. I'm used to those exclusively being used to index node_coordinates - i.e. the Nodes, so I found it confusing that here they are instead indexing Edges, and lower down indexing Faces.

HOWEVER: that is only because we always work with edge_node_ and face_node_connectivity. If we were more used to working with other examples such as face_face_connectivity then there would be no confusion.

We might consider some different variable naming but that's probably overdoing it IMO.

Comment on lines 2212 to 2214
connectivity_indices = np.vectorize(self.node_index_dict.get)(
connectivity.indices
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bow to your wisdom on matters like this, but I would assume it would be more efficient to store a 1D array of integers, where position in the array is equivalent to new_index and the value is equivalent to old_index? Pretty sure there would be a way to use that in the generation of connectivity_indices, although it's not immediately obvious to me what that would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps your existing approach is easier to get working lazily as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the problem with storing this as an array where the one set of indices is infered by the position is that the indices aren't guaranteed to be of the form [0,1,2,3,...] but may have gaps in them. It ought to be possible to rewrite this to be of the form connectivity_indices = index_array[connectivity.indices], but this would require building index_array to be about as large as the coordinates on the original mesh rather than scaling with the size of the provided indices. It's probably still worth checking how well this performs, because this may well be the more natural way to use numpy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants