Integrate HICR as hicr engine into LPF #23

Open
wants to merge 44 commits into
base: master

Conversation

KADichev
Collaborator

No description provided.

KADichev and others added 30 commits September 20, 2023 11:38
…so that these could be assigned to different LPF functions (e.g., trigger send early by moving ibv_post_send calls into IBVerbs::put)
…(hopefully) through integrating BSC changes to enable both local and remote completion queues, which is key if we want to read the number of messages received or posted.
…slot. This is currently done via the imm_data field, which the sender sets to the memory slot ID of the destination before the RDMA write. When a poll finds that a message has been received, the imm_data entry is read and used as the key into a hash table whose value is the number of receives for that slot (incremented on each receive). The query at the receiver is then just a lookup in this hash table. There is currently a problem around line 840 of mesgqueue.cpp, where the destination ID is reset to zero. This needs to be solved.
…dified slot ID if edge buffer is used. The original slot ID is then only used as a key for hashtable with key = slot ID and value = number of received messages
…ut directly calls IBVerbs put, and LPF sync only waits on the local completion of IBVerbs put (by polling until the message has been sent; there is no confirmation that the message has been received). I still keep one barrier in IBVerbs::sync for synchronization, but this barrier should be removed in the future.
…within LPF as I need it. 2) Add get_rcvd_msg_cnt_per_slot besides the more general get_rcvd_msg_cnt, as the counts should be per memory slot. 3) Add a flush_send_sync function, which checks only on sender side that messages are not just posted, but also polled for. But I think this functionality is probably going away again.
…s without (b), finalization crashes. But in the near future, both of these will be removed from the sync for efficiency reasons.
… as this leads to additional data being allreduced in each sync. When the user issues runtime.abort(), the allreduce call is still made to check if everyone has called the abort.
…uce in sync. This is tricky though -- it means all parties synchronously call resize themselves, otherwise a deadlock might occur?
… all messages queued to be sent (via ibv_post_send) are sent out (via ibv_poll_cq). This is a requirement from the HiCR Channels library
Comment out the post-install scripts, as they fail to run on this branch.
…n call with expected sent and expected received messages as parameters. The tagged synchronization call without expected sent and expected received messages is not implemented yet. More testing needed on tagged sync.
…rk and is used by HiCR's fence(tag,key,sent_msgs,recvd_msgs) call. The tagged sync, which relies on syncPerSlot, is currently not finalized. This version only waits on the locally outstanding sends/receives for the slot, which does not mean any synchronization with other peers.
…pment. Now set to 7 / 7 for infinite polling, if needed.
…ky for HiCR, which then needs to do sync explicitly before checking these counters.
…all in a few missing cases. Also remove tryLock/tryUnlock in this version, as it is not used yet.
…iated sends and initiated receives. Now replace with a counter only for initiated sends. This counter is checked (initiated sends == completed sends) for the sync phase ending with a barrier.
…s. It is added as a functional test to LPF (tests/func_lpf_compare_and_swap.ibverbs.c), with implementation directly added to the backend in src/MPI/ibverbs.cpp, which employs IB Verbs atomics
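The per-slot bookkeeping described in the commits above (a receive counter keyed by the slot ID carried in imm_data, and a sync check that initiated sends equal completed sends) could be sketched roughly as follows. This is only an illustration under assumptions: `Completion`, `RecvCounters`, and `SendTracker` are hypothetical names that do not come from the LPF sources, and the sketch deliberately avoids the real IB Verbs types so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the part of a polled work completion that
// matters here; a real backend would read this field from ibv_wc.
struct Completion {
    std::uint32_t imm_data;  // destination slot ID set by the sender
};

// Per-slot receive counters: key = slot ID, value = messages received.
class RecvCounters {
public:
    // Called once for each completion polled from the receive side.
    void onReceive(const Completion& wc) { ++counts_[wc.imm_data]; }

    // Count for one slot; slots that never received anything report 0.
    std::size_t receivedFor(std::uint32_t slot) const {
        auto it = counts_.find(slot);
        return it == counts_.end() ? 0 : it->second;
    }

    // Total over all slots.
    std::size_t receivedTotal() const {
        std::size_t total = 0;
        for (const auto& kv : counts_) total += kv.second;
        return total;
    }

private:
    std::unordered_map<std::uint32_t, std::size_t> counts_;
};

// Sender-side tracking: sync may proceed once every posted send
// has a matching completion.
class SendTracker {
public:
    void onPost() { ++initiated_; }

    void onSendCompletion() {
        assert(completed_ < initiated_ && "completion without a matching post");
        ++completed_;
    }

    bool drained() const { return initiated_ == completed_; }

private:
    std::size_t initiated_ = 0;
    std::size_t completed_ = 0;
};
```

In this sketch a sync would poll the send completion queue until `drained()` returns true, while a per-slot query on the receiver is a single `receivedFor(slot)` lookup.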
KADichev and others added 14 commits March 1, 2024 15:09
Merge compare-and-swap example and functionality into HICR branch
…all wait_completion. wait_completion is now extended to return the ibv_wc_opcode list, so that callers can check whether events are atomic compare-and-swap. Such events are currently excluded from the counters. Also, a bug in IBVerbs::get is fixed: the srcSlot counter was associated with a get, but it should be the dstSlot counter. Finally, a known bug in the allgatherv collective is fixed: if a process has no messages to send, it has no associated global slot registered, so it should not even try to call put/get.
…asically either Op::SEND or Op::GET (put or get - both sends). Still lots of debug output
…d flush receive queues. This is important to expose to external applications, as they might need to flush either send or receive queues. E.g. channels have producers or consumers, respectively
…emote process issuing a put, or a local process issuing a get (and the ability to differentiate the two). Without it, e.g., the fencing on a received count was broken for get messages. Now it is fixed.
…, which significantly improves over the ordered map implementation. Currently, it has a fixed size of 1000. This should be improved to handle the case where the array overruns.
…t to add macros for LPF_CORE_MPI_USES - without it, standalone ibverbs tests will compile incorrectly.
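One of the commits above replaces an ordered map with a fixed-size array of 1000 entries and notes that array overruns still need handling. A minimal sketch of such a counter table with an explicit bounds check might look like this. The class name `FixedSlotCounters` and its interface are assumptions made for illustration only; just the capacity of 1000 is taken from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Fixed-capacity counter table indexed directly by slot ID.
// Direct indexing avoids the tree lookups of an ordered map,
// at the cost of a hard upper bound on slot IDs.
template <std::size_t Capacity = 1000>
class FixedSlotCounters {
    static_assert(Capacity > 0, "capacity must be positive");

public:
    // std::array::at throws std::out_of_range when slot >= Capacity,
    // so an overrun is reported instead of corrupting memory.
    void increment(std::size_t slot) { ++counts_.at(slot); }

    std::size_t get(std::size_t slot) const { return counts_.at(slot); }

    void reset() { counts_.fill(0); }

private:
    std::array<std::size_t, Capacity> counts_{};  // zero-initialized
};
```

Throwing on overrun is only one possible policy; growing the storage or falling back to a map for large slot IDs would be alternatives with different performance trade-offs.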
1 participant