Dev/wwolff/llvm cleanup 20240710 #100

wwolff42 · 2024-07-10T21:45:22Z

No description provided.

…king of change made

The idea is to generate correct SSA form in EmitInstrWithCustomInserter Currently, we introduce arith+comp+branch, which seems to be wrong for SSA ala LLVM Loads of passes during reg_alloc/phi_elim morph def to use in those instruction it also put some stuff after the terminator for stack frame management and potentially put COPY before the def So this is a wip to actually keep correct SSA form at this stage. Todo so, we do the actual arith+cmp+branch with simple instruction, and add enough information to keep together/adjacent those instructions as we know they will be merge back together at later stage. Try to use metadata and implement the tweaker for LLVM internal passes.

… generic and handle naturally all cases

I first moved the trick stuff from DPUMergeComboInstrPass to a new DPUPostRAFusion because I wanted to have possible optim between RA and PreEmit::MergeComboInstr be effective. Along the way, I simply discover that we lose consistency of our def/use during analyzeBranch process for those arith+comp+jump: insertBranch and buildConditionalBranch was changing the instruction in a bad way. So, in fact we may don't even need to tweak shouldSink and don't need DPUPostRAFusion. And now I believe we don't kill really the SSA form, we just wasn't doing the right way. It was probably just bad branch reconstruction ... We always learn :)

…itionalBranch

…propagation and other optimizer

…GE cleanup now ... While at it, I implemented some quick optimization. - Implemented `(1 << n) - 1` == `lslx lneg n` #9 - Moved some 64-bit operation earlier in the pipeline, from ResolveMacroInstrPass to expandPostRA. I tried even earlier, but SUBC is not well defined and get moved around during MergeSink for critical edge split. The computation is then wrong because require to be packed for the particular use I identified. I leave my experiment here. I will check to fully define it and move it preRA, or will leave it like that again for a while, fulfilling my other duty. - Fixed a few easy Def/Use when BuildMI - Fixed lose of MachineInstr correctness Our arithmetic+comp+branch was destroyed during analyzeBranch/removeBranch/insertBranch - temporarily removed fusion of any instruction + JUMPi in MergeComboInstr The problem is that at this stage (PreEmit): - machine CFG is done. - JUMPi is unconditional jump - arithmetic + cond + branch; with cond as True/False is conditional -- even if we know that cond with true/false is unconditional, the instruction have the -- property of being conditional by its definition. ---- To fix that, I will create other PseudoInstruction to have them set correctly. ---- Also, if those arith+cond+branch do have pattern, maybe they could be selected far earlier ---- and the machine CFG would be correctly formed at the first place probably. - issue with ThinLTO fixed -- some code construction ended up in SELECTrr, which is not common for us ---- this is lowered to TmpJcc ---- and TmpJcc is kind of wrong ------ and finally, MergeComboInstr was combining even more wrong. --> I removed TmpJcc, and use simply our well defined JEQrii - issue whith ThinLTO fixed -- another was present but undetected on Release build -- we use multiple address spacees (IRAM, WRAM, MRAM) -- there was an assertion with ThinLTO when populating GV out of multiple modules ---- it's fixed in llvm13, but we are on llvm12 ------ I reproduced the patch (not cherry-picked) llvm@60c60dd ------ just for now. will do that correctly when cleaning up ------- so when I will upgrade our LLVM, it will be mergeable easily

wwolff42 self-assigned this Jul 10, 2024

wwolff42 force-pushed the dev/wwolff/llvm_cleanup_20240710 branch 5 times, most recently from 37c19fd to 4d7d03f Compare July 12, 2024 18:14

wwolff42 added 17 commits July 29, 2024 09:50

dpu: llvm: DPUTargetLowering::LowerOperation: cleaner default case

d13dab1

dpu: llvm: DPUMergeComboInstrPass::runOnMachineFunction: improve trac…

7a5ebcb

…king of change made

dpu: llvm: add constant register for lowering

210bb44

dpu: llvm: correcting IR generation

ad6f57a

dpu: llvm: register simplify and remove lneg,mneg

cd1553f

dpu: llvm: correct some def

45e4cc8

dpu: llvm: DPUResolveMacroInstrPass: correct some

81385dc

wip: optimize ls{l,r}x + and + jeq

310f790

wip: don't lose metadata during analyzeBranch

df78969

wip: do mul16 and tailored logic to metadata, will rework later to be…

0af56bb

… generic and handle naturally all cases

wip: have a go without the metadata now, and keep simple in buildCond…

f4b565d

…itionalBranch

wip: allow this buggy one because CI/CD test it ...

2d24393

wip: have a go with renamable phy register and normal coalescer/copy_…

bdbdeec

…propagation and other optimizer

dpu: compiler-rt: add support

bbb5895

wwolff42 force-pushed the dev/wwolff/llvm_cleanup_20240710 branch from bd40fcb to 45abde9 Compare July 29, 2024 07:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dev/wwolff/llvm cleanup 20240710 #100

Dev/wwolff/llvm cleanup 20240710 #100

wwolff42 commented Jul 10, 2024

Dev/wwolff/llvm cleanup 20240710 #100

Are you sure you want to change the base?

Dev/wwolff/llvm cleanup 20240710 #100

Conversation

wwolff42 commented Jul 10, 2024