Skip to content

[CodeGen][CodeLayout] Fix segfault on access to deleted block in MBP. #142357

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ivafanas
Copy link
Contributor

@ivafanas ivafanas commented Jun 2, 2025

Problem 1: There is a typo which reassigns BlockWorkList to EHPadWorkList on attempt to remove RemBB from work lists.

Problem 2: Chain->UnscheduledPredecessors == 0 is an incorrect way to check whether RemBB is enqueued or not. The root cause is a postponed deletion of WorkList from already scheduled blocks in selectBestCandidateBlock. Bug happens in the following scenario:

  • FunctionChain is being processed with non-zero UnscheduledPredecessors
  • Block B' is added to the BlockWorkList
  • Block B' is chosen as the best successor (selectBestSuccessor) for some another block and added into Chain
  • Block B' is removed by tail duplicator.

RemovalCallback erroneously won't erase B' from BlockWorkList, because UnscheduledPredecessors value of FunctionChain is not zero (and it is allowed to be non-zero).

Proposed solution is to always cleanup worklists on block deletion by tail duplicator.

@ivafanas
Copy link
Contributor Author

ivafanas commented Jun 2, 2025

We have caught the bug on the custom out-of-tree backend in a ~200Kb function of ~70 basic blocks with several loops, switch instructions etc.. Any attempt to reduce the test case results in bug disappearing.

I'm afraid, I'm not able to provide the reproducer / regression test for any in-tree backend in a reasonable time.

@ivafanas
Copy link
Contributor Author

ivafanas commented Jun 2, 2025

@arsenm , @weiguozhi

Could you please review the PR?

@weiguozhi
Copy link
Contributor

RemovalCallback erroneously won't erase B' from BlockWorkList, because UnscheduledPredecessors value of FunctionChain is not zero (and it is allowed to be non-zero).

From function markBlockSuccessors, a BB can be added to BlockWorkList only when UnscheduledPredecessors reaches 0. So how was B added to BlockWorkList in the first place?

@ivafanas
Copy link
Contributor Author

ivafanas commented Jun 2, 2025

From function markBlockSuccessors, a BB can be added to BlockWorkList only when UnscheduledPredecessors reaches 0. So how was B added to BlockWorkList in the first place?

Yes, it is a bit tricky:

  • When block B' is added to the work list, it belongs to the chain with zero UnscheduledPredecessors
  • When block B' is chosen as the best successor, it is merged into FunctionChain inside BlockChain::merge method. FunctionChain has non-zero UnscheduledPredecessor.

Work lists might contain blocks from chains with non-zero UnscheduledPredecessor. It happens for scheduled blocks which should be removed from work lists soon. However, postponed removal comes to play.

@ivafanas
Copy link
Contributor Author

ivafanas commented Jun 3, 2025

Some test in flang failed. Need to be investigated.

@ivafanas ivafanas force-pushed the dev/fix-segfault-in-block-placement branch from 501954d to 034458b Compare June 5, 2025 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants