WeeklyTelcon_20210525
Geoffrey Paulsen edited this page May 25, 2021
- Austen Lauria (IBM)
- Brendan Cunningham (Cornelis Networks)
- Brian Barrett (AWS)
- Edgar Gabriel (UH)
- Geoffrey Paulsen (IBM)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (NVIDIA)
- Howard Pritchard (LANL)
- Jeff Squyres (Cisco)
- Joseph Schuchart (HLRS)
- Josh Hursey (IBM)
- Marisa Roman (Cornelis Networks)
- Matthew Dosanjh (Sandia)
- Sam Gutierrez (LANL)
- Todd Kordenbrock (Sandia)
- William Zhang (AWS)
- Akshay Venkatesh (NVIDIA)
- Artem Polyakov (NVIDIA)
- Aurelien Bouteiller (UTK)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- Christoph Niethammer (HLRS)
- David Bernholdt (ORNL)
- Erik Zeiske (HPE)
- Geoffroy Vallee (ARM)
- George Bosilca (UTK)
- Joshua Ladd (NVIDIA)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Michael Heinz (Cornelis Networks)
- Nathan Hjelm (Google)
- Naughton III, Thomas (ORNL)
- Noah Evans (Sandia)
- Raghu Raja (secret startup)
- Ralph Castain (Intel)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Tomislav Janjusic (NVIDIA)
- Xin Zhao (NVIDIA)
- Pretty close to syncing up with master.
- Open a DRAFT PR in next few weeks (Before June)
- Confidence is high on this implementation.
- In particular, confidence is high that it will not break MPI_Init / MPI_Finalize.
- Intern writing more tests on sessions.
- Code prototype should be okay, but PMIx is more up in the air.
- Implements everything in current MPI 4.0 proposal.
- need to rip out some extra stuff in branch
- Text is in standard
- Will have a new minimum PMIx version.
- YES. Might need PMIx v5
- Will need to have some conversations about what to do with older PMIxes.
- Also includes making frameworks refcounted
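To illustrate what "refcounted frameworks" means in the note above, here is a minimal sketch in C. It assumes a simple open/close pair guarded by a counter, so that several initializations (for example, more than one session) can share a framework and only the last close tears it down. The type `framework_t` and the functions `framework_open` / `framework_close` are hypothetical names for illustration and are not Open MPI's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a framework; not Open MPI's real type. */
typedef struct {
    const char *name;
    int refcount;   /* number of outstanding opens */
    int is_open;    /* 1 while the underlying resources are live */
} framework_t;

/* Open: set up resources on the first call only, then count the user. */
static int framework_open(framework_t *fw)
{
    assert(fw != NULL);
    if (fw->refcount == 0) {
        fw->is_open = 1;   /* real code would acquire resources here */
    }
    fw->refcount++;
    return 0;
}

/* Close: tear down only when the last user is done.
 * Returns -1 on an unbalanced close. */
static int framework_close(framework_t *fw)
{
    assert(fw != NULL);
    if (fw->refcount == 0) {
        return -1;
    }
    fw->refcount--;
    if (fw->refcount == 0) {
        fw->is_open = 0;   /* real code would release resources here */
    }
    return 0;
}
```

With this shape, two opens followed by one close leave the framework open; it only shuts down after the second close.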
- 8982 - MPI_Sendrecv btl/tcp doesn't block an RC
- We'll do one more RC, and then get a final v4.0.6 out.
- Where are we on pack/unpack with long and long double?
- only external32
- This worked before, but not sure
- 8918 - pack/unpack with external32
- 8818 - checking if
- Brian thinks Issue 8990 would also apply to v4.0.x
- With --with-libevent=/usr (as Debian packaging does), we add a -L/usr to the wrapper output and put all of the -L flags for finding deps before the -L for libmpi.so, and if there is an ompi in /usr/lib as well,
- https://www.mail-archive.com/devel@lists.open-mpi.org//msg21289.html
- Jeff replied saying use same License terms as source code
- Fortran bindings for MPI_Ialltoallw and neighbor version
- We create an array, pass it into the C binding, and then free it before the C side has completed.
- Another issue discovered:
- 4 byte C ints, and 8 byte Fortran integers.
- This has been in code "forever"
- Howard thought we discussed this before and that we added a configury check to disallow this.
- George thinks he has an elegant solution.
- Won't be as invasive as originally thought.
- Marked as critical, but not a blocker as it's been in the product forever.
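The two problems described above (a temporary array freed before the nonblocking operation completes, and 8-byte Fortran integers passed where 4-byte C ints are expected) can be sketched together in C. This is not the fix discussed in the meeting; it is only an illustration under stated assumptions. The names `pending_op_t`, `convert_counts`, `op_start`, and `op_complete` are hypothetical, and a real binding would attach the converted array to the MPI request rather than to this stand-in struct.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pending nonblocking operation that owns the converted array. */
typedef struct {
    int *c_counts;   /* converted copy, kept alive until completion */
    size_t n;
    int completed;
} pending_op_t;

/* Convert n 8-byte Fortran integers to C ints.
 * Returns NULL if a value does not fit in an int or allocation fails. */
static int *convert_counts(const int64_t *f_counts, size_t n)
{
    int *c = malloc((n ? n : 1) * sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        if (f_counts[i] > INT_MAX || f_counts[i] < INT_MIN) {
            free(c);
            return NULL;
        }
        c[i] = (int)f_counts[i];
    }
    return c;
}

/* Start: convert the counts and keep the copy attached to the operation
 * instead of freeing it when the binding returns. */
static int op_start(pending_op_t *op, const int64_t *f_counts, size_t n)
{
    op->c_counts = convert_counts(f_counts, n);
    if (op->c_counts == NULL) {
        return -1;
    }
    op->n = n;
    op->completed = 0;
    return 0;
}

/* Complete: only at this point is it safe to release the converted array. */
static void op_complete(pending_op_t *op)
{
    free(op->c_counts);
    op->c_counts = NULL;
    op->completed = 1;
}
```

The design point is ownership: the converted array lives as long as the operation that reads it, and out-of-range values are rejected instead of silently truncated.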
- No driver to rush, so now just in bugfix phase.
- Need some configury changes in before we RC.
- Issue 8850, 8990 and more
- Brian will file 3-ish issues
- One is configure pmix
- Unscheduled RC
- Dynamic Windows fix in for UCX.
- Any update on debugger support?
- Need some documentation that Open MPI v5.0 supports PMIx based debuggers, and that if
- MPIR Shim - pushed up fixes, and enabled CI.
- Could add it to some more CI, to ensure that PMIx doesn't break
- IBM is working on some CI testing with MPIR (typically very brittle)
- Need some guidance on pmix version.
- Right now, this is probably not a big deal, but in perhaps 2 years, when we have 3 release branches with different PMIx versions, it might make sense to do Open MPI CI testing.
- Shouldn't be too much work to do.
- UCC coll component is being updated to be set as the default when UCX is selected. PR 8969
- Intent is that this will eventually replace hcoll.
- PR 8998 - MPIPy -
- In the shift to PRRTE, --oversubscribe is NOT being handled. If you have more procs than slots on a node, the internal oversubscribe variable is not yet being set.
- Mellanox hasn't been reporting for a while. Tommi will follow up.
- Austen filed a couple of issues from MTT.
- No discussion
- No update
- No discussion.