OSHMEM yoda spml failures: need to update to BTL v3.0 #2028
I also see these kinds of errors:
Am I running these tests wrong? In most cases, the tests are run with 32 procs across 2 nodes (16 cores each); each node has 128GB RAM.
@artpol84 @jladd-mlnx @igor-ivanov Any advice here?
I'll do some testing; I'd like to see if yoda is busted over other BTLs, like gni.
Since OpenSHMEM 1.3 compliance is going to be one of the major features of the 2.1 release, we want oshmem to work on as many system configs as possible. Marking this as a blocker.
@alex-mikheev, could you please comment?
@hppritcha @jsquyres We don't maintain Yoda at all, and it's very likely that many of the tests fail because (some of) the BTLs can't make asynchronous progress or do true one-sided RDMA. MXM and UCX have all of these features. I would suggest we replace the Yoda SPML entirely and move to UCX. It supports multiple transports and eliminates the BTL mess altogether. UCX will be in a near-GA state come late October. We also proposed rolling UCX into OMPI some time ago; perhaps this provides further motivation to do so.
One of the requirements for OSHMEM to come into the Open MPI code base was that it needs to be able to handle all network types. AFAIK, UCX does not handle all network types (e.g., Portals, usNIC). As such, Yoda needs to be fixed before v2.1.0 can be released.
I don't know if it's possible to fix it for the TCP BTL; we have no knowledge here. This is not a regression.
How does this look on 1.10.3?
I did some spot checking on a Cray XE and get "registration errors" for what looks to be the BSS export portion if I try to use 8 or more PEs:
I don't think this is a BTL-specific issue. EDIT: added verbatim output.
@hppritcha, can you try on 1.10.3? We don't have access to a Cray.
I've not tried to use the 1.10.x series on Cray in forever and am not sure how to configure for it. But I did a little more digging. Actually, the registration error I'm seeing is specific to GNI and memory registration limitations on the XE system. If I use the tcp BTL, I'm not seeing the registration error. I did some more testing using fewer PEs so there is sufficient GART space to register the tests' BSS. I did more checking in the openshmem-release-1.0d/feature_tests/C directory. There definitely is a bug with shmem_collect(32/64). I also saw a similar segfault for collect32_performance.x using both the tcp and ugni BTLs. I'll try on the UH system later this week with the 1.10 release.
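(As a back-of-the-envelope illustration: the 270532608-byte per-PE memheap figure quoted later in this thread is exactly 258 MiB, so the amount of memory that must be registered on a node grows quickly with the PE count. The short C loop below just tabulates that; whether the total exceeds the GART aperture depends on the node configuration.)

```c
#include <stdio.h>

int main(void)
{
    /* Per-PE memheap segment size reported by mca_memheap_base_alloc_init(). */
    const double seg_bytes = 270532608.0;   /* exactly 258 MiB */

    for (int pes = 2; pes <= 16; pes *= 2) {
        printf("%2d PEs/node -> %7.1f MiB of memheap to register\n",
               pes, pes * seg_bytes / (1024.0 * 1024.0));
    }
    return 0;
}
```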
@jsquyres I'm noticing that your setup is hitting an error different from mine:
@hppritcha The "no userspace device-specific driver" warnings can be ignored. It means libibverbs didn't find a driver for my device (which is actually expected).
After discussions with MLNX, there is no guarantee that BTLs that don't support true one-sided operations will be able to run OpenSHMEM tests successfully. There will probably be a subset of tests that may work with, for example, the tcp BTL, but others likely will not. I think we should document in the README for 2.1 which BTLs we think can support the yoda spml.
Actually, it complains about not being able to register the memheap:
[nid00060:07656] Error spml_yoda.c:439 - mca_spml_yoda_register() ugni: failed to register source memory: addr: 0xff000000, size: 270532608
(In reply to Howard Pritchard's earlier comment: "I did some spot checking on a Cray XE and get 'registration errors' for what looks to be the BSS export portion if I try to use 8 or more PEs: [nid00060:07656] base/memheap_base_alloc.c:38 - mca_memheap_base_alloc_init() Memheap alloc memory: 270532608 byte(s), 1 segments by method: 2 ... I don't think this is a BTL specific issue.")
FWIW: on v1.10, most oshmem tests pass on my TCP-only cluster. The ones that fail are of the following form:
or
So it looks to me like v2.x has some bug fixes that didn't go back into v1.10, but has some new problems as well.
FWIW: on 1.10.3 AND 2.0.1 nightly, all of the Houston OSHMEM feature tests complete successfully with Yoda using the TCP, SM, openib, and vader BTLs on up to 16 processes. At this time, I can't reproduce your results @jsquyres.
Here you go: nearly 2000 failures on 2.0.1 with MTT:
Can you try with 32 processes across at least 2 machines? That's what I'm running.
So, I see part of the issue. It seems someone isn't fragmenting correctly. I'm not sure if it's Yoda or the TCP BTL, but given that Yoda has been virtually untouched for three years and there have been significant changes to the BTL structure between 1.10 and 2.0, I'm inclined to point my sniffer at the BTL. It's dying if the message can't fit into one BTL frag. If I set:
Then I make it until the test gets to the 500K message size and hit the OOM error again. This flow works in 1.10.3. I'll keep digging.
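(As an aside: a minimal sketch of the kind of chunking an SPML needs to do when a message exceeds one BTL fragment. The bounded-put helper spml_put_chunk() and the put_limit argument are hypothetical stand-ins, not the actual yoda or BTL API.)

```c
#include <stddef.h>

/* Hypothetical helper: issue a single bounded put of at most 'len' bytes. */
void spml_put_chunk(void *dst, const void *src, size_t len, int pe);

/* Split a large put into pieces no bigger than the BTL's per-operation
 * limit instead of handing the whole message to the BTL at once. */
static void put_fragmented(void *dst, const void *src, size_t size,
                           int pe, size_t put_limit)
{
    char *d = (char *) dst;
    const char *s = (const char *) src;

    while (size > 0) {
        size_t chunk = (size < put_limit) ? size : put_limit;
        spml_put_chunk(d, s, chunk, pe);
        d += chunk;
        s += chunk;
        size -= chunk;
    }
}
```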
Yep. Frag size is garbage. It's
I guess when the BTLs moved into OPAL, this field went by the wayside? BTL gurus, what's the correct way to get this info now? |
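(For readers landing here: as noted further down in this thread, BTL 3.0 carries the per-put maximum on the module itself as btl_put_limit. A hedged sketch of clamping against it, assuming the field name matches what is in opal/mca/btl/btl.h:)

```c
#include <stddef.h>
#include "opal/mca/btl/btl.h"   /* mca_btl_base_module_t */

/* BTL 3.0 exposes the per-put maximum on the module (btl_put_limit),
 * so an SPML can clamp its fragment size to it. */
static size_t yoda_put_frag_size(mca_btl_base_module_t *btl, size_t requested)
{
    size_t limit = btl->btl_put_limit;   /* a size_t; see the discussion below */
    return (requested < limit) ? requested : limit;
}
```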
Something is weird here; if I use the
I honestly did the minimum necessary to translate yoda from BTL 2.0 -> BTL 3.0. Looks like more work is needed to finish the job.
Not on my priority list at all. Do not assign to me.
I can give pointers on how BTL 3.0 works if needed, but I really will have no time beyond that.
@hjelmn You touched it. You need to test it, Nathan. Offending commit:
btl_put_limit is a size_t and, in the snippet of code above, frag_size is a uint32_t. There is a clear mismatch that can lead to unexpected fragmentation. @jladd-mlnx, can you print btl_put_limit instead of *frag_size?
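(A standalone sketch of the truncation being described, not the yoda code itself; the 4 GiB value is just an example:)

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* A put limit that doesn't fit in 32 bits gets silently truncated
     * when copied into a 32-bit fragment size. */
    size_t   btl_put_limit = (size_t) 4 * 1024 * 1024 * 1024;  /* 4 GiB (LP64) */
    uint32_t frag_size     = (uint32_t) btl_put_limit;         /* truncates to 0 */

    /* Print the size_t itself (%zu), not the truncated copy. */
    printf("btl_put_limit = %zu, frag_size = %u\n", btl_put_limit, frag_size);
    return 0;
}
```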
Please update the code to be BTL 3.0 compliant. I am generally available to answer questions on the BTL interface M-F 9-4 MDT, except federal holidays.
FWIW, I added OSHMEM testing to the v2.x branch -- just in case the mempool updates on master are causing issues: https://mtt.open-mpi.org/index.php?do_redir=2354 Short version: I'm seeing similar issues on the v2.x branch:
Per lots of discussion on the 2016-09-20 and 2016-09-13 weekly teleconfs, assigning this issue to Mellanox. |
Fixed the shmem OOM error which is referenced on open-mpi#2028 Signed-off-by: Boris Karasev <karasev.b@gmail.com>
Fixed the shmem OOM error which is referenced on open-mpi#2028 Signed-off-by: Boris Karasev <karasev.b@gmail.com> (cherry picked from commit 68b5acd)
Even with @karasevb's 68b5acd, I'm getting segv's when running with tcp,vader,self:
I get a lot of failures in the OpenSHMEM test suite like this that all seem to have the same signature: the |
Thank you, we will check! |
@jsquyres, is this on master or 2.x? Even without the patch, I have no issues running the
If, however, I try with master (after rebuilding the benchmark), I get:
And the performance is actually significantly improved over the numbers I collected with 1.10.2. |
@jladd-mlnx This is on master. Here's how I configured Open MPI:
Copying a bunch of your params, here's how I ran that individual test (although many more fail in the same way):
vic20 is a 10G ethernet interface. Looking at the corefile that was emitted from the above run, it shows the same symptom: |
I confirm that on v2.0.x and v2.x, these initial tests seem to work fine with vader,tcp,self. Now that those fixes are merged into these branches, let's see how it does tonight on MTT. |
FWIW, I see a bunch of ptmalloc messages like this in the oshmem tests (in the v2.0.x branch):
We'll see more after MTT runs tonight. |
I think the fixes for this particular issue are now done; I'm still seeing some OSHMEM failures in MTT testing, but let's open up a new issue to track those (i.e., they seem to be unrelated to the BTL 3.0 updates).
Cisco just added OSHMEM testing to its MTT 2 weeks ago (at the Dallas engineering meeting).
We're seeing a large failure rate on v2.x with OSHMEM testing using TCP,vader,self. For example: https://mtt.open-mpi.org/index.php?do_redir=2347
This shows 1,624 failures and 6,546 passes. I.e., a nearly 20% failure rate. 😱
Many of the failures show this kind of error message:
Does anyone know what this means?
@artpol84 @jladd-mlnx