[flang][OpenMP] Handle usage of array elements in loop-control expressions #128
Dominik suggested converting this to a smoke test instead. I will do that in a separate PR.
The PR fixes the build error observed with flang-new with array elements in loop bounds in a target region.
9e00764 to b50e1d1
…sions

Extends the fix-up logic for `trip_count` calculation in `target` regions. Previously, if an array element was used to compute any of the loop bounds, the trip-count calculation ops would extract array elements from the mapped declaration of the array inside the target region. This commit handles that situation.
b50e1d1 to ad1cf4c
This patch does 3 things:

1. Add support for optimizing the address mode of HVX load/store instructions.

2. Reduce the value of Add instruction immediates by replacing them with the difference from other Addi instructions that share a common base. For example, if we have the below sequence of instructions:

   r1 = add(r2,# 1024)
   ...
   r3 = add(r2,# 1152)
   ...
   r4 = add(r2,# 1280)

   where the register r2 has the same reaching definition, they get modified to the below sequence:

   r1 = add(r2,# 1024)
   ...
   r3 = add(r1,# 128)
   ...
   r4 = add(r1,# 256)

3. Fixes a bug in the pass where the addi instructions were modified based on a predicated register definition, leading to incorrect output. E.g.:

   INST-1: if (p0) r2 = add(r13,# 128)
   INST-2: r1 = add(r2,# 1024)
   INST-3: r3 = add(r2,# 1152)
   INST-4: r5 = add(r2,# 1280)

   In the above case, since r2's definition is predicated, we do not want to modify the uses of r2 in INST-3/INST-4 with add(r1,#128/256).

4. Fixes a corner case. It looks like we never check whether the offset register is actually live (not clobbered) at the optimization site. Add a check that it is live at MBB entrance. The rest should have already been verified.

5. Fixes a bad codegen. For whatever reason we do the transformation without checking whether the value in the register actually reaches the user. This is the second identical fix for this pass.

Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com>
Co-authored-by: Sergei Larin <slarin@quicinc.com>
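The immediate-rebasing idea in items 2 and 3 can be illustrated with a small toy model. This is only a sketch in Python over a straight-line list of instructions, not the actual LLVM pass: the `Add` record and `rebase_immediates` function are hypothetical names, and the real pass reasons about reaching definitions and liveness across machine basic blocks rather than a simple linear scan.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Add:
    """Toy model of `dst = add(base, #imm)`, optionally guarded by a predicate."""
    dst: str
    base: str
    imm: int
    pred: Optional[str] = None  # e.g. "p0" for `if (p0) dst = add(...)`


def rebase_immediates(block: List[Add]) -> List[Add]:
    """Rewrite later adds on a shared base to use the first add's result.

    Hypothetical helper: an add is rewritten only when its base register was
    not last written by a predicated instruction, mirroring item 3 above.
    """
    anchors: Dict[str, Tuple[str, int]] = {}  # base -> (anchor dst, anchor imm)
    unsafe: Set[str] = set()  # registers whose latest definition is predicated
    out: List[Add] = []
    for inst in block:
        base_ok = inst.base not in unsafe
        new = inst
        if base_ok and inst.base in anchors:
            anchor_dst, anchor_imm = anchors[inst.base]
            # Same value, smaller immediate: base + imm == anchor + (imm - anchor_imm).
            new = Add(inst.dst, anchor_dst, inst.imm - anchor_imm, inst.pred)
        out.append(new)

        # inst.dst is redefined here: drop anchors that depended on its old value.
        anchors = {
            b: a for b, a in anchors.items()
            if b != inst.dst and a[0] != inst.dst
        }
        if inst.pred is not None:
            unsafe.add(inst.dst)
        else:
            unsafe.discard(inst.dst)
            if base_ok and inst.base != inst.dst and inst.base not in anchors:
                anchors[inst.base] = (inst.dst, inst.imm)
    return out
```

Feeding it the three adds on `r2` from item 2 yields `add(r1, 128)` and `add(r1, 256)` for the second and third instructions, while the predicated sequence from item 3 comes back unchanged because `r2` is marked unsafe.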
llvm#128662) …471)" Reland llvm#128471 The Passes library was not linked in earlier.
…PM (#128…" (llvm#128819) Reverts llvm#128662 Still a link error.
Extends the fix-up logic for `trip_count` calculation in `target` regions. Previously, if an array element was used to compute any of the loop bounds, the trip-count calculation ops would extract array elements from the mapped declaration of the array inside the target region. This commit handles that situation.

Hopefully, fixes https://ontrack-internal.amd.com/browse/SWDEV-476122
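For readers unfamiliar with why the loop bounds matter here, the following Python sketch shows the kind of trip-count arithmetic involved. It is an illustration under assumptions, not the flang lowering itself: `trip_count` is a hypothetical helper that follows the Fortran DO-loop iteration-count rule, and the `bounds` list stands in for an array whose elements serve as loop bounds.

```python
def trip_count(lb: int, ub: int, step: int) -> int:
    """Iteration count of `do i = lb, ub, step`, per the Fortran rule
    max(int((ub - lb + step) / step), 0) with truncating division."""
    if step == 0:
        raise ValueError("loop step must be non-zero")
    n = ub - lb + step
    q = abs(n) // abs(step)      # magnitude of the truncated quotient
    if (n < 0) != (step < 0):    # operands of opposite sign give a negative quotient
        q = -q
    return max(q, 0)


# Assumed scenario: the bounds come from array elements, e.g. `do i = bounds(1), bounds(2)`.
bounds = [1, 10]
count = trip_count(bounds[0], bounds[1], 1)
```

The point of the sketch is that the inputs to this computation are plain element values. As the description above puts it, the problem arose when the ops that read those elements referred to the mapped declaration of the array inside the target region.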