TypeError and Points missing in VertexOnlyMesh #2178
Comments
I'm not able to recreate your error but I see the missing points you mention. The trouble is that this is happening silently (see #2035). Mathematically it's of course hard to decide what should happen when a point is on a cell boundary: you have to pick one cell or another, and making that decision robustly is tricky. I'll need to have a think about what the best way of dealing with this is. Certainly dealing with #2035 is the first step so that this is less unexpected.
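For concreteness, a minimal sketch of the kind of silent drop being discussed, assuming (as noted later in the thread) that the problem shows up for points lying exactly on interior cell boundaries of a quadrilateral parent mesh:

```python
from firedrake import *

mesh = UnitSquareMesh(2, 2, quadrilateral=True)

# (0.5, 0.5) lies exactly on an interior cell boundary of this mesh;
# (0.25, 0.25) is safely inside a cell.
points = [[0.5, 0.5], [0.25, 0.25]]
vm = VertexOnlyMesh(mesh, points)

# If the boundary point is silently dropped, only one coordinate survives here.
print(vm.coordinates.dat.data_ro)
```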
This is expected behaviour - the points that make it into a parallel distributed VertexOnlyMesh are only those that lie in the locally owned part of the parent mesh. |
On my laptop, I don't have any problem with

```python
from firedrake import *
import numpy as np

mesh = RectangleMesh(100, 200, 1.0, 2.0)
num_rec = 10
δs = np.linspace(0.4, 0.9, num_rec)
X, Y = np.meshgrid(δs*0.999, 0.5*0.999)
print(X, Y)
xs = np.vstack((X.flatten(), Y.flatten())).T
point_cloud = VertexOnlyMesh(mesh, xs)
```

However, I had a problem running the same test with the same Firedrake version on another computer. The error was:
I ran a print at the relevant line and everything works. So, my question is: why does this error not appear on my laptop? Maybe it is related to a library installation? |
@lrtfm note I may be wrong - I am investigating nonetheless! |
@ReubenHill I think the points which are not located locally should be searched for in other processes. If I would like to compute a norm involving two functions defined on two different meshes, I need to interpolate one function into a space on the other mesh. In parallel, the vertices of the first mesh may not be found locally in the second mesh. |
To follow up on this, it seems that PETSc is dropping the out-of-mesh points when firedrake calls the wrapper at line 2119 in 9e7c358, which is a wrapper for DMSwarmSetPointCoordinates. For one reason or another, it's only deciding that the on-the-edge points are outside the parent mesh topology for a parent mesh made of quads. Ideally there would be a way of passing a tolerance to this function but it doesn't currently appear to be in the API. I suppose we could manually add such points in cases like this (maybe via an optional kwarg to VertexOnlyMesh).

EDIT: maybe we could also figure out a way to warn a user when a point coordinate is on the parent mesh boundary too. |
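As an interim workaround rather than a fix, one can do what the example earlier in the thread does with its 0.999 factor and nudge the points slightly so they are far less likely to sit exactly on a cell boundary. A minimal sketch with an arbitrary illustrative scaling factor (not a Firedrake API):

```python
import numpy as np
from firedrake import *

mesh = RectangleMesh(100, 200, 1.0, 2.0)

# Points along the line y = 0.5; with this grid some of them can land
# exactly on cell edges and risk being silently dropped.
xs = np.column_stack((np.linspace(0.4, 0.9, 10), np.full(10, 0.5)))

# Scale towards the origin by a tiny amount (the same trick as the 0.999
# factor used above) so points are unlikely to sit exactly on an edge.
xs_nudged = xs * (1.0 - 1e-10)

point_cloud = VertexOnlyMesh(mesh, xs_nudged)
```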
I think I understand what you're trying to get at but, to make sure I don't go barking up the wrong tree, do you have an MFE (minimal failing example) that demonstrates what you think is wrong? |
Okay, rereading what you've said, it sounds like you're trying to use VertexOnlyMesh to interpolate a function from one mesh onto another in parallel. Sound about right? |
Yes, please consider the following example:
This will not work in parallel.
|
Note to self: email PETSc users asking for the ability to specify a tolerance for mesh edges and to ask if DMSwarmSetPointCoordinates is guaranteed to pick a cell when the point is on cell boundaries. |
@lrtfm Fundamentally what you want to do is not possible with the current VertexOnlyMesh implementation because, for now at least, these meshes respect the parallel distribution of their parent meshes (and, I think, have good reason to do so to allow interpolation from their parent mesh).

It seems the real goal here is interpolation of a function from one mesh to another. This can already be done to some extent using multigrid and supermeshing and is on my radar for the future (e.g. going from a parent mesh to, say, a mesh of disconnected lines or vertices). The general case is non-trivial, particularly in parallel, as you have discovered. Very open to thoughts and ideas on how we might deal with this going forwards and please do correct me if I have misunderstood.

Regarding the issue about points missing: see previous messages about mathematical difficulty (made extra hard thanks to floating point arithmetic as @wence- pointed out to me earlier). I will, in the meantime, contact the PETSc mailing list to find out about adding a tolerance argument to DMSwarmSetPointCoordinates. |
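For reference, a minimal sketch of the supermesh-based cross-mesh projection route mentioned above, assuming a serial run with two simplex meshes covering the same domain (the mesh resolutions and expression are illustrative only):

```python
from firedrake import *

source_mesh = UnitSquareMesh(20, 20)
target_mesh = UnitSquareMesh(31, 31)  # different, non-nested resolution

V_src = FunctionSpace(source_mesh, "CG", 1)
V_tgt = FunctionSpace(target_mesh, "CG", 1)

x, y = SpatialCoordinate(source_mesh)
u_src = Function(V_src).interpolate(sin(pi * x) * cos(pi * y))

# Galerkin projection onto a space defined on a *different* mesh; Firedrake
# uses supermeshing under the hood for this kind of cross-mesh projection.
u_tgt = project(u_src, V_tgt)
```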
@ReubenHill Interpolation between meshes in parallel is something I am interested in, too. Would be good to discuss what would be required at some point. |
I don't get what you mean by "have good reason to do so". Could you explain this? Thanks. For the interpolation in parallel, could we insert the missing points (which are not located locally) into the corresponding rank's swarm? |
I assume that you have two meshes that cover the same domain (potentially with different parallel distributions). Let's call them source_mesh and target_mesh. So on the target mesh, your target function defines a set of points at which you'd like to evaluate the source function. So we do something like:

```python
source_mesh = Mesh(...)
target_mesh = Mesh(...)
source_function = Function(..., source_mesh)
target_function = Function(..., target_mesh)
point_locations = target_function.node_locations
vmesh = VertexOnlyMesh(source_mesh, point_locations)
source_values = interpolate(source_function, FunctionSpace(vmesh, "DG", 0))
```

So now we have all the correct values corresponding to the target nodes, but they potentially live on the wrong process, so updating the target function means sending those values back to the ranks that own the corresponding target nodes. In fact, my line point_locations = target_function.node_locations is itself only schematic. So in pseudo-code you need to do something like the sketch below.
|
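A hedged pseudo-code sketch of the redistribution step described in the comment above, using mpi4py directly. Here locate_locally and local_bbox are hypothetical helpers (not Firedrake or PETSc API) standing in for local point location and this rank's partition bounding box, and the "someone votes to own the point" step for points found by more than one rank is omitted for brevity:

```python
from mpi4py import MPI
import numpy as np

def redistribute_points(comm, points, locate_locally, local_bbox):
    """Return the points this rank should own, given the points it cooked up locally."""
    found = locate_locally(points)                    # boolean mask: in a local cell?
    kept = [p for p, f in zip(points, found) if f]
    missing = [p for p, f in zip(points, found) if not f]

    # Every rank learns every rank's bounding box (O(nranks), fine at modest scale).
    bboxes = comm.allgather(local_bbox)

    # Offer each missing point to every rank whose bounding box contains it.
    outgoing = [[] for _ in range(comm.size)]
    for p in missing:
        for rank, (lo, hi) in enumerate(bboxes):
            if rank != comm.rank and np.all(p >= lo) and np.all(p <= hi):
                outgoing[rank].append(p)

    # Exchange candidate points and accept those that really lie in a local cell.
    incoming = comm.alltoall(outgoing)
    for plist in incoming:
        for p in plist:
            if locate_locally(np.asarray([p]))[0]:
                kept.append(p)
    return np.asarray(kept)
```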
To be clear: this line (line 2119 in 9e7c358), in parallel, will place each of the supplied points into the swarm only if it lies in the locally owned part of source_mesh, which it does by looking at the rank-local coordinates field of source_mesh's DMPlex and filtering as necessary. It therefore does work correctly given what you have asked it to do. |
I guess I meant that you might hope that points not located locally are migrated to the correct rank.

Are you assuming that the same list of points is passed in on every rank, so that rank 0 only locates the points in its part of the mesh and rank 1 only locates the points in its part? |
Yes I am - clearly that's the source of confusion here. I guess that each rank might be given a different set of points. I'll need to check whether points that are not located locally are simply dropped. |
Even if they are dropped you could get around this in a hacky way by, for example, gathering all the points onto every process and passing the same full list to VertexOnlyMesh on each rank (as sketched below).
|
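A minimal sketch of that gather-everything workaround, with toy per-rank points (run under mpiexec on a few ranks); note the memory-scalability caveat raised later in the thread, since every rank ends up holding every point:

```python
from firedrake import *
import numpy as np

mesh = UnitSquareMesh(8, 8)
comm = mesh.comm

# Each rank cooks up its own points (toy example: a few random points per rank).
rng = np.random.default_rng(seed=comm.rank)
local_points = rng.uniform(0.05, 0.95, size=(4, 2))

# Give every rank the full, identical list of points so that each point is
# offered to whichever rank owns its cell, rather than silently disappearing.
all_points = np.concatenate(comm.allgather(local_points))

vm = VertexOnlyMesh(mesh, all_points)
print(comm.rank, len(vm.coordinates.dat.data_ro))
```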
It can do, but I think that's a minor detail. We can instead just think, "each rank has cooked up some set of points it wishes to evaluate at, but doesn't actually know which rank 'owns' the point location" |
These are reasonable interim approaches, I think, though they require that we gather all points on all processes, which is not scalable in memory cost if I have a constant number of points per process. |
Really useful discussion, thanks! |
Can confirm:

```python
from firedrake import *
from mpi4py import MPI

mesh = UnitSquareMesh(1, 1)
comm = MPI.COMM_WORLD

a = [0.1, 0.1]  # on rank 0
b = [0.9, 0.9]  # on rank 1
c = [0.2, 0.2]  # on rank 0
d = [0.8, 0.8]  # on rank 1

if comm.rank == 0:
    points = [a, b]  # b should go to rank 1
elif comm.rank == 1:
    points = [c, d]  # c should go to rank 0

vm = VertexOnlyMesh(mesh, points)
print(comm.rank, vm.coordinates.dat.data_ro)  # should get 0 [a, c] and 1 [b, d]
```
|
This could be done without needing to build an SF at swarm/ |
Another thought: at the moment when I call the DMSwarmSetPointCoordinates wrapper (line 2119 in 9e7c358) it places points everywhere in the local DMPlex, including in the ghost cells. At the moment I just remove such points from the ghost cells (lines 2121 to 2122 in 9e7c358); however, presumably I could look at which ranks the ghost cells correspond to and, before removing them, send the points in those ghost cells to the relevant ranks to be added to the local DMPlex. I presume there's some fancy PETSc SF way of doing this (everything about SFs left my brain some time ago). Thoughts? |
This would work if the provided points are in the halo region of the local process (if not, they'll still disappear). |
My reading of |
Well if you first expunged any out-of-mesh points then you could do it iteratively until all points are no longer in ghost cells (i.e. keep passing them around as necessary). |
I guess you could bypass DMSwarmSetPointCoordinates altogether and do the point location and insertion manually, but that sounds like more faff than it's worth and we'd be better off supplementing DMSwarmSetPointCoordinates to deal with this case. |
Oh, you mean just try and put all your points in the swarm, and the ones that didn't make it, you send to someone else until everything is accounted for? I guess that works. The kd-tree approach is O(log N) in the number of processes and the local number of cells, which is asymptotically optimal for this setup I think. |
Yep. Is this the same as the kd-tree approach? I think your original suggestion

1. build a kd-tree for the bounding boxes of each subdomain (this way you can identify candidate processes for each point)
2. try local point location (using a kd-tree for your local part of the mesh)
3. for any points you don't find locally, send to candidates, they locate and someone votes to own the point
4. build an SF for the migration of the points

along with first asking the PETSc mailing list for their collective thoughts is probably the way to go though. |
Yes, effectively you're using a tree to speed up finding candidate processes. The number of bounding boxes for the processes is N, so you can intersect a point with matching bounding boxes in log N time if you use a spatial tree lookup. It might turn out @knepley has already thought about this and worked something out. I don't think anything is there in the petsc code right now. |
I have an implementation of this right now, minus the tree among processes. Since each process only has a bounding box, rather than a more sophisticated boundary, the O(N) box lookup is still pretty trivial for 1M processes.

Thanks,

Matt
|
Where do we look for the implementation? There's some grid hashing in various places but seemingly only coded for 2D? I think the rendezvous to determine point owners is probably similar to some of the mesh distribution, but I don't quite see it. |
I am fixing it this week for 3D. Almost done.
Matt
|
This is the preliminary implementation. It passes simple tests I wrote: https://gitlab.com/petsc/petsc/-/merge_requests/4346 |
This introduces a global index numbering and rank labelling for DMSwarm points. Note that we can't use the built-in DMSwarm numbering in the field named 'DMSwarmField_pid' because there's a bug in petsc4py which makes it inaccessible (it uses PETSC_INT64, which is not in petsc4py). At the moment we leave halos empty to match existing behaviour and fix the case where we were getting point duplication. This also adds point redistribution, which fixes #2178. See the code for how the algorithm works. Also includes tests.
This has now been fixed - see #2773 |
When I use the VertexOnlyMesh, there are some errors and sometimes points are missing. See the example below. The errors are listed below:

After changing the default tolerance value of locate_cell_and_reference_coordinate to 1e-6, the error is gone, but some points are still missing.

By the way, another question here is that VertexOnlyMesh seems to only find the points locally when using it in parallel. It seems related to the parameters of DMSwarmSetPointCoordinates in PETSc.