-
Notifications
You must be signed in to change notification settings - Fork 288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dantzig LCP solver nan values #892
Comments
Here is an example using DART without Gazebo: collideBoxes.zip |
I see that pull request #881 relates to contact constraints and nan values as well. I checked the normals of the of the contacts at this point. They all appear to be of approximately length 1. There are a number of contact constraints (10) and they point in many different directions. |
Digging further into While the contact normals look ok, some of the contact locations are causing the inconsistent equations. The GIF below shows the contacts visualized in Gazebo and the simulation failure. If I apply the diff below to ignore the invalid contact with the lower z position, the simulation continues for more steps. However, the other contacts begin to drift and the simulation fails eventually as shown below.
|
Thank you for investigating this issue. I'm currently on vacation so might not be able to take a look at this for the next few weeks. However, any update would be much appreciated. |
I can help look into this. Enjoy your vacation, JS! |
I see two ways to keep the simulation from failing.
Option 2 may be preferable since invalid contacts points are not good. However, option 1 could be more general if inconsistent equations crop up in other contact patterns. |
@scpeters FYI From the gif that Peter posted, it is already clear that there is contact within the body itself, which would fail the simulation if and only if the solver returns a contact force close to 0 at this invalid contact. For PGS, I suppose most of the case, initial iteration, or initial guess is 0, which is actually helpful in this case. Now after investigation, we should change the title of this issue, because this is a collision detection issue, not solver. I tried with the following combinations: The solvers in case (2) and (4) are exactly the same since DART uses the implementation from ODE as well. Here are the simulation results: Therefore, (2) and (4) are sufficient proof that the PCL returned invalid contacts, which results in the NaN in the Dantzig solution. I think it would not worth the effort to fix the solver at this time. But rather improve the FCL. |
It does sound like the problem is coming from FCL, but it also seems that the Dantzig solver is less robust to bad inputs. Is there a way to detect when invalid contact points are given? Maybe we could compute the signed distance from a contact point to each collision shape? It could be a useful utility for debugging collision detection issues. FYI @JenniferBuehler this issue may be interesting to you |
This is an interesting issue. I will only have more time to help look into this in October, but from what you are writing above I'd also say that it would make sense to fix the contact points as Peter suggests and as Ying has shown in the test (good proof btw). However making the solver more robust (at the very least with assertions to catch critical inputs) would be helpful as well. IMO generated NaN values should always be caught somewhere (ideally as close to the cause as possible), before they cause follow-up issues. If this is not already solved by October when I will get back to work, I can help look into this as well. |
Hi, I'm also running into the problem of nonsensical contact points using the Bullet collision detector. Situation is: two cylinders colliding with each other at 90 degrees (in the shape of an X); should return one contact point roughly in the middle of the X. Most of the time this goes OK, but eventually Bullet collision detector returns one or more points that are outside the intersection of the AABBs of the colliding bodies. So clearly nonsensical. My codebase is fairly big so it's hard to make a minimal working example, but I can try if this would help. First idea is to compute the AABB intersection between the two colliding bodies, and cull all contact points that lie outside it. But clearly it's just a workaround and not really a solution. Phenomenon occurs with Bullet collision margin of 0 as well as 1E-3, 1E-6. Any ideas how this might happen and how we can address it properly? |
I've had that off-shape contact point with bullet in the collision_benchmark tests. It would be next on my TODO list to start debugging the tests which fail with the collision_benchmark. If I find anything I will comment on it here. |
Hi, some more observations. I printed all collision shapes and their transformations after getting the collision result back from Bullet, and they all work out, so the problem is not there. (Anyway, the code that interfaces DART to the Bullet collision detector is pretty small and simple.) The Bullet collision detector is very sensitive though; there are several epsilons defined in Bullet's GJK code. Earlier I had decreased some of these margins; recompiling with a fresh clone of the Bullet repository leads to the error sometimes not occurring, sometimes later in the simulation. Some more insight from Erwin Coumans (also note: GJK penetration depth is only valid for small penetrations, i.e. less than the margins):
So, it might not be a bug per se, just a shortcoming of the GJK/EPA algorithms? (Any experts on those subjects?) |
It seems the root causes are: (1) may be the same with other collision detectors as well. What FCL makes the behavior worse is that it returns more contact points. There are two possible solutions on top of my head now. (a) Let the Dantzig solver falls back to using PGS solver when there is no solution, which can be detected by observing the In addition to that, (b) will allow us to use the convexity based collision algorithms for primitive shapes that is more efficient than using meshes. The main reason we don't use the FCL's primitive shapes (that uses the convexity based collision algorithms) is that they only return a single contact at a time. Contact manifold will aggregate contact pointes over time so that they can be used in dynamic simulation. |
#1265 is a partial solution to this issue. I'll leave this issue open because I think the Dantzig solver shouldn't return true when the solution includes NAN values. Also, we need to investigate why FCL returns contact points in the middle of the boxes. |
### Problem Dantzig LCP solver sometimes returns inaccurate solution or even NAN values when the LCP doesn't have a solution. More importantly, `BoxedLCPConstraintSolver` uses the solution without any sanity checks, which easily leads to unstable simulation. This is reported in #892. ### Solution Let `BoxedLCPConstraintSolver` to use more robust LCP solver such as PGS when it fails to solve with the primary LCP solver. In addition to that, as a final resort, it discards NAN value solution when the secondary LCP solver failed as well. ### Testing Added the regression test provided by @pchorak. Thanks! *** **Before creating a pull request** - [x] Document new methods and classes - [x] Format new code files using `clang-format` **Before merging a pull request** - [x] Set version target by selecting a milestone on the right side - [x] Summarize this change in `CHANGELOG.md` - [x] Add unit test(s) for this change
Hello,
I have come across an issue while using DART with Gazebo in which the Dantzig LCP solver returns nans, causing the simulation to fail. I am using the Gazebo default branch and DART master branch. Here is a Gazebo world file (sdf) to reproduce the issue. In the example, two boxes anchored to the world with parallel revolute joints collide. matrices.txt contains the debug printout from
DantzigLCPSolver..cpp
. If I replace the boxes with spheres, do not align the revolute axes, or use the projected Gauss-Seidel solver then the issue does not occur.The text was updated successfully, but these errors were encountered: