Integrate HICR as hicr engine into LPF #23
Open: KADichev wants to merge 44 commits into master from hicr_engine
…so that these could be assigned to different LPF functions (e.g., trigger send early by moving ibv_post_send calls into IBVerbs::put)
…(hopefully) through integrating BSC changes to enable both local and remote completion queues, which is key if we want to read the number of messages received or posted.
… us to notice new reads/writes too late.
…slot. This is currently done via the imm_data field, which carries the memory slot ID of the destination and is set at the sender before the RDMA write. After a poll finds that a message has been received, the imm_data entry is read and used as a key into a hash table, where the value is the number of receives (incremented at each receive for that key). The lookup at the receiver is then just a lookup in this hash table. There is currently a problem around line 840 of mesgqueue.cpp, where the destination ID is reset to zero. This needs to be solved.
…dified slot ID if an edge buffer is used. The original slot ID is then only used as the key of a hash table with key = slot ID and value = number of received messages
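The per-slot receive counting described in the two commit messages above can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: `Completion` and `RecvCounter` are hypothetical names, and `Completion` merely stands in for a polled work completion whose immediate data carries the destination slot ID.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a polled work completion. In the real backend the
// slot ID would come from the immediate data of the completion entry.
struct Completion {
    std::uint32_t imm_data;  // destination memory slot ID set by the sender
};

// Hypothetical helper: counts received messages per memory slot.
class RecvCounter {
public:
    // Record one received message for the slot named in the immediate data.
    void onReceive(const Completion& wc) { ++counts_[wc.imm_data]; }

    // Messages received so far on one slot (0 if the slot was never seen).
    std::size_t receivedOn(std::uint32_t slot) const {
        auto it = counts_.find(slot);
        return it == counts_.end() ? 0 : it->second;
    }

    // Messages received so far over all slots.
    std::size_t total() const {
        std::size_t sum = 0;
        for (const auto& entry : counts_) sum += entry.second;
        return sum;
    }

private:
    std::unordered_map<std::uint32_t, std::size_t> counts_;  // slot ID -> count
};
```

The receiver-side query is then a single hash-table lookup, which is the property the commit message relies on.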
…ut directly calls IBVerbs put, and LPF sync only waits on the local completion of IBVerbs put (by polling until the message has been sent; there is no confirmation that the message has been received). I still keep one barrier in IBVerbs::sync for synchronicity, but this barrier should be removed in the future.
…within LPF as I need it. 2) Add get_rcvd_msg_cnt_per_slot alongside the more general get_rcvd_msg_cnt, as the counts should be per memory slot. 3) Add a flush_send_sync function, which checks, on the sender side only, that messages are not just posted but also polled for. But I think this functionality is probably going away again.
…s without (b), finalization crashes. But in the near future, both of these will be removed from the sync for efficiency reasons.
… as this leads to additional data being allreduced in each sync. When the user issues runtime.abort(), the allreduce call is still made to check if everyone has called the abort.
…uce in sync. This is tricky though -- it means all parties synchronously call resize themselves, otherwise a deadlock might occur?
… all messages queued to be sent (via ibv_post_send) are sent out (via ibv_poll_cq). This is a requirement from the HiCR Channels library
Comment out the post-install scripts, as they fail when run on this branch.
…n call with expected sent and expected received messages as parameters. The tagged synchronization call without expected sent and expected received messages is not implemented yet. More testing needed on tagged sync.
…rk and is used by HiCR's fence(tag,key,sent_msgs,recvd_msgs) call. The tagged sync, which relies on syncPerSlot, is currently not finalized. This version only waits on the locally outstanding sends/receives for the slot, which does not mean any synchronization with other peers.
…pment. Now set to 7 / 7 for infinite polling, if needed.
…ky for HiCR, which then needs to do sync explicitly before checking these counters.
…all in a few missing cases. Also remove tryLock/tryUnlock in this version, as it is not used yet.
…iated sends and initiated receives. Now replace with a counter only for initiated sends. This counter is checked (initiated sends == completed sends) for the sync phase ending with a barrier.
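The check "initiated sends == completed sends" mentioned above could look roughly like this. It is a sketch under assumptions: `SendTracker` and `waitForLocalCompletion` are hypothetical names, and the `poll` callable stands in for whatever polls the completion queue and reports how many sends completed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: tracks sends that were posted versus sends that completed.
class SendTracker {
public:
    // Call once for every send that is posted.
    void onPost() { ++initiated_; }

    // Call with the number of send completions returned by one poll.
    void onCompletions(std::size_t n) {
        assert(completed_ + n <= initiated_);  // cannot complete more than was posted
        completed_ += n;
    }

    // True once every posted send has completed locally.
    bool drained() const { return completed_ == initiated_; }

    std::size_t outstanding() const { return initiated_ - completed_; }

private:
    std::size_t initiated_ = 0;
    std::size_t completed_ = 0;
};

// Poll until all posted sends have completed. `poll` is an assumed callable
// returning the number of newly completed sends (possibly zero).
template <typename PollFn>
void waitForLocalCompletion(SendTracker& tracker, PollFn poll) {
    while (!tracker.drained()) {
        tracker.onCompletions(poll());
    }
}
```

Note that this only establishes local completion; as the commit message says, the sync phase still ends with a barrier.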
…s. It is added as a functional test to LPF (tests/func_lpf_compare_and_swap.ibverbs.c), with implementation directly added to the backend in src/MPI/ibverbs.cpp, which employs IB Verbs atomics
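For readers unfamiliar with the primitive, compare-and-swap semantics can be illustrated locally with `std::atomic`. This is not the IB Verbs implementation from src/MPI/ibverbs.cpp; it swaps in an in-process atomic purely to show the behaviour, and `compareAndSwap` is a hypothetical helper name.

```cpp
#include <atomic>
#include <cstdint>

// Atomically replace the value in `word` with `desired` if and only if it
// currently equals `expected`. Returns true when the swap took place.
inline bool compareAndSwap(std::atomic<std::uint64_t>& word,
                           std::uint64_t expected,
                           std::uint64_t desired) {
    // compare_exchange_strong overwrites its first argument with the observed
    // value on failure; `expected` is a by-value copy, so the caller is unaffected.
    return word.compare_exchange_strong(expected, desired);
}
```

With a word initialised to 0, a first call `compareAndSwap(word, 0, 1)` succeeds and a second identical call fails, which is the basis for lock-like constructs built on this operation.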
Merge compare-and-swap example and functionality into HICR branch
…all wait_completion. wait_completion is now extended to return the ibv_wc_opcode list, to check whether events are atomic compare-and-swap. Such events are currently excluded from the counters. Also, there was a bug in IBVerbs::get, where the srcSlot counter was associated with a get when it should have been the dstSlot. Also, a known bug in the allgatherv collective is fixed: if a process has no messages to send, it does not have an associated global slot registered, so it shouldn't even try to call put/get.
…asically either Op::SEND or Op::GET (put or get - both sends). Still lots of debug output
…d flush receive queues. This is important to expose to external applications, as they might need to flush either send or receive queues. E.g. channels have producers or consumers, respectively
Refactor flushing
…emote process issuing a put, or a local process issuing a get (and the ability to differentiate between the two). Without it, e.g., the fencing on a received count was broken for get messages. Now it is fixed.
…well as sends of a get into local queue.
…, which significantly improves on the ordered map implementation. Currently, it has a fixed size of 1000. This should be improved to guard against array overruns.
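One way to guard the fixed-size array against overruns is bounds-checked access. The sketch below is an assumption about how that could be done, not what this branch does: `SlotCounters` and `kMaxSlots` are hypothetical names, and only the size 1000 is taken from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

constexpr std::size_t kMaxSlots = 1000;  // fixed size quoted in the commit message

// Hypothetical helper: per-slot counters in a flat array indexed by slot ID.
class SlotCounters {
public:
    // std::array::at throws std::out_of_range for slot >= kMaxSlots
    // instead of silently writing past the end of the array.
    void increment(std::size_t slot) { counts_.at(slot) += 1; }

    std::size_t get(std::size_t slot) const { return counts_.at(slot); }

private:
    std::array<std::size_t, kMaxSlots> counts_{};  // zero-initialised
};
```

The bounds check costs one comparison per access; a growable container would be an alternative if slot IDs can legitimately exceed the fixed size.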
…ne, instead of replacing existing engines
…t to add macros for LPF_CORE_MPI_USES - without it, standalone ibverbs tests will compile incorrectly.
No description provided.