WeeklyTelcon_20180508
Geoffrey Paulsen edited this page Jan 15, 2019
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Dan Topa (LANL)
- Jeff Squyres
- Akvenkatesh
- Brian
- Joshua Ladd
- Howard Pritchard
- Nathan Hjelm
- Ralph
- Todd Kordenbrock
- Josh Hursey
- Xin Zhao
- David Bernholdt
- Geoffroy Vallee
Review All Open Blockers
Review v2.x Milestones v2.1.4
- v2.1.4 - Targeting Oct 15th.
- Lower priority than v3.0 and v3.1.
- No new news on v2.1.x
Review v3.0.x Milestones v3.0.2
- Schedule:
- Quick turnaround on this, shooting for May 1st.
- Going to try to post v3.0.2 RC tomorrow
- A few PRs ready to go, and a few that need review.
Review v3.1.x Milestones v3.1.0
- Shipped v3.1.0 last night.
- Long outstanding list of PR for v3.1.x branch.
- 4 or 5 need review. One has Geoff tagged as reviewer (done).
- Will hold off about a week in case we need to do a quick turnaround "oops" release.
- Schedule
- Would like to hold to a one-month turnaround.
- Will cut a release candidate in two weeks.
- For v3.1.1 - Want to get UCX OSC at a higher priority.
- Issue 5048
- Looking pretty good. Howard brought up some issues on a single node with xpmem.
- UCX bug.
- Xin will rerun and see where we stand.
- xpmem can be disabled via env var.
- Issue with ConnectX-3 atomic support. UCX limitation.
- For v3.1.1, some want fallbacks or errors, but it must not segv.
- For v4.0
- Mellanox is planning to do emulation on the CPU if the IB card can't do HCA atomics.
- Still need a check in OMPI, in case they're running with an old UCX.
- As a heads up ULFM support may require PMIx v3.0
- Schedule: mid-July branch, mid-Sept release.
- Moving to drop openib (except for iWarp)
- More support for setting the priority of openib to 0, copying openib to iWarp, and only using it for iWarp.
- Rename openib BTL to iWarp - because if UCX is the preferred way for Mellanox, go whole hog on Mellanox.
- And change the logic so that it only works on iWarp devices.
- If someone (Broadcom or Chelsio?) steps up for MTT testing.
- Jeff will get contact info to v4.0 release managers who will reach out to iWarp providers to request MTT testing.
- Removing items in Open MPI v4.0 that were removed from the MPI standard
- Nathan sent out email about PR 5127 - to remove all MPI2.x standard items.
- A little weird to be able to pull back MPI1 removed items.
- Let's remove these too at the same time.
- C++ bindings are a separate pull request, PR 5128. Goal is to have these removed as well.
- Sent poll out - 12 of 41 responses. 1 of 12 has said they're using the C++ bindings, but not sure if it's accurate.
- If the C++ bindings are sufficiently isolated, we could move them to a separate repo.
- But if no one is really using them, let's just remove it all.
- Let's turn off more building by default.
- Forum didn't REMOVE everything that was deprecated in MPI v3.0 standard.
- PMIx v3.0 - finishing up debugger support
- Open MPI needs someone to step up to pull in PMIx v3.0 changes
- PRTE is not trying to be a production quality runtime.
- It's a prototype layer.
- Start with Open MPI having some runtime people.
- If you go to PMIx v3.0 - need to at least update the ORTE/PMIX directory
- Biggest thing would be for DVM mode; additional changes would be needed to keep that working.
- Open MPI needs a runtime plan
- Open MPI needs runtime developers.
- Contention because code-bases are diverging?
- Two separate projects. In PRTE, no requirement
- PRTE used to be ORTE (parts of opal) used by PMIx as a reference library
- Possible options are:
- Continue maintaining ORTE - but someone has to maintain it.
- Or move to PRTE (with slightly different goals) - but need developers for this as well.
- The scope of ORTE changes needed to move to PMIx v3.0 is significantly smaller than it was for the move to PMIx v2.0.
- The earlier move (ORTE changes needed to go from PMIx v1.2 -> PMIx v2.0) was quite significant.
- Most of code changes in ORTE server area.
- If you want to take advantage of PMIx Tooling, some changes in MPI-IO area.
- If we had a statement of work, we could SWAG an estimate of the work, which could help when requesting resources.
- Unfortunately we don't have anyone on the team that could step up.
- PMIx said that in Open MPI v4.0 we were going to put a deprecation warning in the news.
- ORTE has bit rotted a fair amount relative to PRRTE.
- Some issues found in ORTE
- Right now Open MPI's only debugger interface is MPIR.
- To upgrade to PMIx debugger interface, ORTE would need to upgrade to PMIx v3.0 + some ORTE changes.
Review Master Pull Requests
- Last week: OSHMEM v1.4 - not sure if we have to drop the deprecated APIs; curious whether OMPI is dropping deprecated APIs...
- Only remove things removed from the OSHMEM standard, not things that are deprecated, as "deprecated" means it will be removed from a future version of the standard. If some APIs were removed from the standard, then ask the oshmem email list for their thoughts.
- Xin should be able to push first version of OSHMEM v1.4 changes to master next week or so.
- Got compiler licenses for the NAG and Absoft compilers.
- Both Fortran
- Get copy of perl JSON, and put it on MTT.
Review Master MTT testing
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon
- Cisco, ORNL, UTK, NVIDIA