Treematch issues in master #4303
We're looking into it, but I can't replicate it on any of my machines. The execution path in treematch depends on the local architecture of the platform. Can we get a description of the setup where this assert triggered? |
I'm unable to replicate this issue on our machines; the test program works fine. However, it fails when I try your topology (local.txt), with the error "topology discovery failed". I need to investigate this further. |
Here's a little more info from a failure on AWS:
|
Here's a gist (https://gist.github.com/jsquyres/098f256cead9d20d2ad1c3aea0e6b0be) showing:
The test doesn't actually fail for me, but it does print roughly 21K lines like this:
And yes, I mean approximately twenty-one thousand lines like this. Here's the exact mpirun command I used (inside a SLURM allocation containing these 2 nodes):
|
For that AWS error report? It was from the v3.0 branch, and simply:
Here's another one, this from the v3.1.x branch:
And this is from master:
|
@rhc54: the issue on master is resolved: this assert should not have been there in the first place. |
When I compile ompi with |
I have created PR #4644 to address the issue highlighted here. |
Thanks George! |
Per 2018-01-09 teleconf, @bwbarrett will follow up with @bosilca on this one. |
Reading the history in this ticket, clearly a fix is needed for v3.1.x, and it looks like it was never pulled. @bosilca, can you file a PR to merge the changes into v3.1.x? I'm not seeing the failure on v3.0.x, so I believe it's not needed there? |
The treematch topology component is segfaulting in master when running MTT: