UCC Virtual F2F Meeting Information
Manjunath Gorentla Venkata edited this page May 11, 2020 · 35 revisions
Please fill in the form here
Meeting Notes
Time | Topic | Telecon |
---|---|---|
7:00 - 7:30 AM PT | Kickoff and Opening Remarks (Gilad Shainer) | |
7:30 - 8:15 AM PT | Highlights of UCC API (Review) (Manju) | |
8:15 - 8:30 AM PT | Break | |
8:30 - 9:30 AM PT | Teams API (Manju; All/Discussion) | |
9:30 - 9:45 AM PT | Break | |
9:45 - 11:00 AM PT | Endpoints / Collective Operations (Manju; All/Discussion) | |
- Manjunath Gorentla Venkata
- Alex Margolin
- Sergey Lebedev
- Valentin Petrov
- Rami Nudelman
- Baker, Matthew
- Tony
- Gilad Shainer
- James S Dinan
- Chambreau, Chris
- Gil Bloch
- Dmitry Gladkov
- Arturo
- Pavel Shamis
- Ravi, Naveen
- Raffenetti, Kenneth J.
- Akshay Venkatesh
Initialization
- Have a flexible infrastructure for initialization and selection of library functionality
- Discuss final options during the component architecture discussion
- UCC config interface to follow the UCS config
- Rename ucc_config to ucc_params to reflect UCX style
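To illustrate what "UCX style" parameters usually mean, here is a minimal sketch of the field-mask pattern that UCX parameter structs follow: the caller sets a bit for each field it filled in, and the library falls back to defaults for the rest. Every name below (`example_params_t`, `example_init`, the field flags, and the defaults) is hypothetical and chosen for illustration only. This is not the UCC API, and the notes do not record which fields `ucc_params` would carry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of a UCX-style parameter struct; not the UCC API. */

/* One bit per optional field in example_params_t. */
enum example_params_field {
    EXAMPLE_PARAM_FIELD_THREAD_MODE = 1u << 0,
    EXAMPLE_PARAM_FIELD_COLL_TYPES  = 1u << 1
};

typedef struct example_params {
    uint64_t field_mask;  /* which of the fields below the caller set */
    int      thread_mode; /* valid only if THREAD_MODE bit is set     */
    uint64_t coll_types;  /* valid only if COLL_TYPES bit is set      */
} example_params_t;

/* Resolved library settings after defaults are applied. */
typedef struct example_lib {
    int      thread_mode;
    uint64_t coll_types;
} example_lib_t;

#define EXAMPLE_DEFAULT_THREAD_MODE 0
#define EXAMPLE_DEFAULT_COLL_TYPES  UINT64_MAX

/* Read only the fields whose bit is set; use defaults for the others.
 * This lets new fields be added later without breaking old callers. */
void example_init(const example_params_t *params, example_lib_t *lib)
{
    lib->thread_mode =
        (params->field_mask & EXAMPLE_PARAM_FIELD_THREAD_MODE)
            ? params->thread_mode
            : EXAMPLE_DEFAULT_THREAD_MODE;

    lib->coll_types =
        (params->field_mask & EXAMPLE_PARAM_FIELD_COLL_TYPES)
            ? params->coll_types
            : EXAMPLE_DEFAULT_COLL_TYPES;
}
```

A caller that only cares about the thread mode would set `field_mask = EXAMPLE_PARAM_FIELD_THREAD_MODE` and leave the remaining fields untouched; anything not flagged in the mask is ignored by `example_init`.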
Context
- Do we need a sync model config on context creation?
  - Yes, for enabling RDMA-based implementations
  - The drawback: might have to create more contexts (sync and non-sync)
  - Yes, might require multiple objects but not necessarily multiple resources
- Explore an explicit device abstraction and the ability to express affinity, and propose it to the WG
Team Creation
- Need to revisit endpoints (as this seems to be implementation specific) after the presentation from Alex
- Can we hide endpoints from the interface and enable an agnostic way of creating teams?
Collective Operations
- Need to define the mapping of the programming model's (src, dst) to UCC's (src, dst) for cases like MPI broadcast, which has only one set of buffers.
- Is there a need for multiple outstanding persistent collective operations of the same type? No use case yet.
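The (src, dst) mapping question above can be made concrete with a small sketch. MPI broadcast passes a single buffer on every rank, while a (src, dst) style interface asks for two. One possible convention, shown here purely as an assumption and not as a decision recorded in these notes, treats the buffer as the source on the root and as the destination everywhere else. The type `example_coll_args_t` and the function `example_map_bcast` are hypothetical names, not part of UCC or MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical (src, dst) argument pair; not the UCC API. */
typedef struct example_coll_args {
    const void *src;   /* data this rank contributes, or NULL */
    void       *dst;   /* where this rank receives, or NULL   */
    size_t      count; /* number of elements in the buffer    */
} example_coll_args_t;

/* Map a single broadcast buffer onto a (src, dst) pair.
 * Assumed convention: the root only sends, so its buffer is src;
 * every other rank only receives, so its buffer is dst. */
example_coll_args_t example_map_bcast(void *buffer, size_t count,
                                      int my_rank, int root)
{
    example_coll_args_t args;

    args.count = count;
    if (my_rank == root) {
        args.src = buffer;
        args.dst = NULL;
    } else {
        args.src = NULL;
        args.dst = buffer;
    }
    return args;
}
```

Other conventions are equally plausible, for example setting both `src` and `dst` to the same buffer on every rank; the sketch only shows why the interface needs to pick one and document it.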
- Join the Meeting
- +1 425-659-5232 United States, Seattle (Toll)
- (844) 612-0969 United States (Toll-free)
- Conference ID: 997 771 404#
Time | Topic | Telecon |
---|---|---|
7:00 - 7:45 AM PT | Topology Aware Collectives (Sameh) | |
7:45 - 8:00 AM PT | Break | |
8:00 - 8:45 AM PT | Collectives API - the Reactive alternative (Alex) | |
8:45 - 9:00 AM PT | Break | |
9:00 - 11:00 AM PT | Task and Plan API Discussion | |
- Join the Meeting
- +1 425-659-5232 United States, Seattle (Toll)
- (844) 612-0969 United States (Toll-free)
- Conference ID: 874 275 202#
Time | Topic | Telecon |
---|---|---|
7:00 - 7:45 AM PT | GPUs/DL (TBD) | |
7:45 - 9:00 AM PT | API Discussion | |
9:15 - 11:00 AM PT | API Discussion | |
Time | Topic | Telecon |
---|---|---|
7:00 - 7:45 AM PT | OMPI-X / ADAPT (George Bosilca/Talk) | |
7:45 - 9:00 AM PT | Component Architecture (Review for non-WG participants) (Alex/Val/Discussion) / Algorithm Selection / Memory Registration | |
Time | Topic | Telecon |
---|---|---|
7:00 am - 11:00 PT |
(Laundry List)
- Kickoff (Gilad)
- Highlights of UCC API (Review for non-WG participants) (Manju)
- OMPI-X / ADAPT (George Bosilca/Talk)
- Requirements from the AI Users/Deep Learning/GPUs (NVIDIA; All)
- API Discussion (in case not completed in WG)
- Library Initialization
- Resource Abstraction (Contexts)
- Teams API (Manju; All/Discussion)
- Endpoints (Manju; All/Discussion)
- Collective Operations (Manju; All/Discussion)
- Task API (Manju; All/Discussion)
- Alternative Control-path API (Initialization and communicator creation) (Alex; All/Discussion)
- Alternative Data-path API (Starting and progressing collectives) (Alex; All/Discussion)
- Component Architecture (Review for non-WG participants)(Alex/Val/Discussion)
- Flesh out UCC.H Header (All)
- Unit tests and CI infrastructure (?)
- Documentation (doxygen ?)(?)
- Multirail Support (Sergey)
- Topology-aware collectives (Sameh/Talk)
- Memory registration (Discussion)
- Algorithm selection (Discussion)