Proof of concept: MeshIndexSet #6014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Closed

stephenworsley wants to merge 10 commits into SciTools:main from stephenworsley:index_set_experiments

Contributor

stephenworsley commented Jun 19, 2024

🚀 Pull Request

Description

Implements a possible way to subset a Mesh while retaining a reference to the original Mesh. A MeshIndexSet would be effectively an implementation of the UGRID concept of a location index set. This particular implementation has the benefit of appearing like a regular mesh for most opperations and is capable of functioning with iris-esmf-regrid, for example. The benefit of such a set over a strictly smaller Mesh is that it opens the option for introducing sophisticated functions which can "stitch" a collection of MeshIndexSets back together into their original Mesh. One use case which this may be useful in is when calling expensive functions, like in iris-esmf-regrid, which a full Mesh would be too large to process as a whole so it must be processed as a series of submeshes. It may well be that such cases are specific enough that it is better to solve each on their own terms, but if such problems are general enough, there may be a case to introduce a class like this one.

stephenworsley and others added 3 commits

June 13, 2024 16:35


          add MeshIndexSet

91fe92c


          improve MeshIndexSet

b2ba7ff


          [pre-commit.ci] auto fixes from pre-commit.com hooks

7c05019

for more information, see https://pre-commit.ci

codecov bot commented Jun 19, 2024

Codecov Report

Attention: Patch coverage is 25.47771% with 117 lines in your changes missing coverage. Please review.

Project coverage is 89.35%. Comparing base (e92e36a) to head (7c05019).
Report is 5 commits behind head on main.

Files	Patch %	Lines
lib/iris/experimental/ugrid/mesh.py	25.47%	117 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6014      +/-   ##
==========================================
- Coverage   89.75%   89.35%   -0.41%     
==========================================
  Files          90       90              
  Lines       22929    23102     +173     
  Branches     5020     5062      +42     
==========================================
+ Hits        20580    20642      +62     
- Misses       1618     1729     +111     
  Partials      731      731

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

stephenworsley mentioned this pull request

Investigate: subsetting a MeshCoord should return a MeshCoord? #5991

Closed

trexfeathers mentioned this pull request

Constrained load on MeshCoord bypasses callback #5939

Open

trexfeathers self-assigned this

trexfeathers reviewed

View reviewed changes

Contributor

trexfeathers left a comment

I wanted to get my head into this space as I think it's a valuable and exciting addition to the API. Here are some interim comments. Ideally I could propose some actual code changes via a PR soon but I can't guarantee it.

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

lib/iris/experimental/ugrid/mesh.py Outdated

+                          return self.indices
+                      elif self.location == "edge":
+                          connectivity = self.mesh.edge_node_connectivity[self.indices]
+                          node_set = list(set(connectivity.indices.compressed()))

Contributor

trexfeathers Aug 7, 2024

compressed() is only available for masked arrays. We currently allow non-masked arrays too. This is a similar problem to SciTools/iris-esmf-regrid#323.

lib/iris/experimental/ugrid/mesh.py Outdated

+                      if self.location == "node":
+                          return self.indices
+                      elif self.location == "edge":
+                          connectivity = self.mesh.edge_node_connectivity[self.indices]

Contributor

trexfeathers Aug 7, 2024

What you're doing here is completely valid, but it might take some getting used to - need more developer education to avoid people like me getting confused!

A Connectivity also has indices. I'm used to those exclusively being used to index node_coordinates - i.e. the Nodes, so I found it confusing that here they are instead indexing Edges, and lower down indexing Faces.

HOWEVER: that is only because we always work with edge_node_ and face_node_connectivity. If we were more used to working with other examples such as face_face_connectivity then there would be no confusion.

We might consider some different variable naming but that's probably overdoing it IMO.

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

lib/iris/experimental/ugrid/mesh.py Outdated

Comment on lines 2212 to 2214

+                          connectivity_indices = np.vectorize(self.node_index_dict.get)(
+                              connectivity.indices
+                          )

Contributor

trexfeathers Aug 7, 2024

I bow to your wisdom on matters like this, but I would assume it would be more efficient to store a 1D array of integers, where position in the array is equivalent to new_index and the value is equivalent to old_index? Pretty sure there would be a way to use that in the generation of connectivity_indices, although it's not immediately obvious to me what that would be.

Contributor

trexfeathers Aug 7, 2024

Perhaps your existing approach is easier to get working lazily as well?

Contributor Author

stephenworsley Aug 15, 2024

So I think the problem with storing this as an array where the one set of indices is infered by the position is that the indices aren't guaranteed to be of the form [0,1,2,3,...] but may have gaps in them. It ought to be possible to rewrite this to be of the form connectivity_indices = index_array[connectivity.indices], but this would require building index_array to be about as large as the coordinates on the original mesh rather than scaling with the size of the provided indices. It's probably still worth checking how well this performs, because this may well be the more natural way to use numpy.

lib/iris/experimental/ugrid/mesh.py Outdated Show resolved Hide resolved

trexfeathers added 5 commits

August 9, 2024 13:29


          Simpler MeshMetadata generation.

800ad2d


          Common handling of face and edge connectivities.

ac48bd6


          Consistent return pattern with rest of Iris.

ab93e7a


          Alternative for managing indexed meshes.

a1d38c4


          Don't need indices_by_location.

6b24489

trexfeathers mentioned this pull request

Some suggestions for SciTools/iris#6014 stephenworsley/iris#11

Merged

trexfeathers assigned stephenworsley


          Use MeshIndexSet.from_mesh method.

63702e5

This was referenced Aug 15, 2024

Implement the UGRID Location Index Set #6120

Open

Create MeshIndexSet class #6121

Open


          Merge pull request #11 from trexfeathers/index_set_trexfeathers

73685c1

Some suggestions for SciTools#6014

trexfeathers closed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet