Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dev/wwolff/llvm cleanup 20240710 #100

Draft
wants to merge 17 commits into
base: upmem_release_120
Choose a base branch
from

Conversation

wwolff42
Copy link

No description provided.

@wwolff42 wwolff42 self-assigned this Jul 10, 2024
@wwolff42 wwolff42 force-pushed the dev/wwolff/llvm_cleanup_20240710 branch 5 times, most recently from 37c19fd to 4d7d03f Compare July 12, 2024 18:14
wwolff42 added 17 commits July 29, 2024 09:50
The idea is to generate correct SSA form in EmitInstrWithCustomInserter
Currently, we introduce arith+comp+branch, which seems to be wrong for SSA ala LLVM
Loads of passes during reg_alloc/phi_elim morph def to use in those instruction
it also put some stuff after the terminator for stack frame management
and potentially put COPY before the def

So this is a wip to actually keep correct SSA form at this stage.
Todo so, we do the actual arith+cmp+branch with simple instruction,
and add enough information to keep together/adjacent those instructions
as we know they will be merge back together at later stage.
Try to use metadata and implement the tweaker for LLVM internal passes.
I first moved the trick stuff from DPUMergeComboInstrPass to a new DPUPostRAFusion
because I wanted to have possible optim between RA and PreEmit::MergeComboInstr be effective.

Along the way, I simply discover that we lose consistency of our def/use
during analyzeBranch process for those arith+comp+jump:
insertBranch and buildConditionalBranch was changing the instruction in a bad way.

So, in fact we may don't even need to tweak shouldSink and don't need DPUPostRAFusion.
And now I believe we don't kill really the SSA form, we just wasn't doing the right way.

It was probably just bad branch reconstruction ... We always learn :)
…GE cleanup now ...

While at it, I implemented some quick optimization.

- Implemented `(1 << n) - 1` == `lslx lneg n`
#9

- Moved some 64-bit operation earlier in the pipeline,
from ResolveMacroInstrPass to expandPostRA.
I tried even earlier, but SUBC is not well defined and get moved around during MergeSink
for critical edge split. The computation is then wrong because require to be packed for the
particular use I identified.
I leave my experiment here. I will check to fully define it and move it preRA, or will leave it
like that again for a while, fulfilling my other duty.

- Fixed a few easy Def/Use when BuildMI

- Fixed lose of MachineInstr correctness
Our arithmetic+comp+branch was destroyed during analyzeBranch/removeBranch/insertBranch

- temporarily removed fusion of any instruction + JUMPi in MergeComboInstr
The problem is that at this stage (PreEmit):
- machine CFG is done.
- JUMPi is unconditional jump
- arithmetic + cond + branch; with cond as True/False is conditional
-- even if we know that cond with true/false is unconditional, the instruction have the
-- property of being conditional by its definition.
---- To fix that, I will create other PseudoInstruction to have them set correctly.
---- Also, if those arith+cond+branch do have pattern, maybe they could be selected far earlier
---- and the machine CFG would be correctly formed at the first place probably.

- issue with ThinLTO fixed
-- some code construction ended up in SELECTrr, which is not common for us
---- this is lowered to TmpJcc
---- and TmpJcc is kind of wrong
------ and finally, MergeComboInstr was combining even more wrong.
--> I removed TmpJcc, and use simply our well defined JEQrii

- issue whith ThinLTO fixed
-- another was present but undetected on Release build
-- we use multiple address spacees (IRAM, WRAM, MRAM)
-- there was an assertion with ThinLTO when populating GV out of multiple modules
---- it's fixed in llvm13, but we are on llvm12
------ I reproduced the patch (not cherry-picked) llvm@60c60dd
------ just for now. will do that correctly when cleaning up
------- so when I will upgrade our LLVM, it will be mergeable easily
@wwolff42 wwolff42 force-pushed the dev/wwolff/llvm_cleanup_20240710 branch from bd40fcb to 45abde9 Compare July 29, 2024 07:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant