
Fix possible deadlock in async packet processing #2545

Merged

Conversation

@Ingrim4 commented Sep 27, 2023

I recently noticed some weird behavior in ProtocolLib when using async packet listeners for chunk packets (in Orebfuscator). Sometimes, especially on older versions of the game, the chunks wouldn't get sent to the client.

Being the author of Orebfuscator, I tried to fix it, but with no success. During my debugging I thankfully found out that the packets never reached the packet listener. It turns out ProtocolLib's processing queue has a hard limit on parallel processing per packet type (SERVER/CLIENT), which isn't necessarily a problem in and of itself. The real problem is this: if the server sends, for example, 100 packets of type MAP_CHUNK (e.g. on player join) and doesn't send any more packets of a whitelisted type (added to ListeningWhitelist), then the first 32 packets (the hard limit in the queue) get processed (enqueued into the packet listener) and the rest are stuck until the server tries to send more packets (PacketProcessingQueue#L107). This behavior is similar to a deadlock, but not as severe, since sending more packets of the whitelisted type will slowly drain the queue, 32 packets at a time. (This only happens if the packet listener increases the processing delay of all packets.)
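To make the failure mode concrete, here is a minimal toy sketch (my own illustration under simplified assumptions, not ProtocolLib's actual code) of a queue that only claims processing slots when something is enqueued:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

/**
 * Toy illustration of the stall described above: processing slots are only
 * claimed when a new packet is enqueued, never when a packet finishes, so a
 * burst larger than the slot limit leaves the tail stuck in the queue.
 */
public class StallingQueue {
    private static final int MAX_CONCURRENCY = 32; // the hard limit in question

    private final Semaphore slots = new Semaphore(MAX_CONCURRENCY);
    private final Queue<Runnable> pending = new ArrayDeque<>();

    public synchronized void enqueue(Runnable packetJob) {
        pending.add(packetJob);
        signalBeginProcessing(); // the ONLY place the queue is ever triggered
    }

    private synchronized void signalBeginProcessing() {
        while (slots.tryAcquire()) {
            Runnable job = pending.poll();
            if (job == null) {
                slots.release(); // nothing left to do, give the slot back
                return;
            }
            new Thread(() -> {
                job.run();
                slots.release();
                // BUG: nobody re-triggers the queue here, so packets beyond
                // the first 32 wait for the *next enqueue* to arrive.
            }).start();
        }
    }
}
```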

There are two possible solutions that I found for the problem; the first is this PR. Simply starting the processing of the next packet after a packet is done processing should solve this. The other solution would be to call PacketProcessingQueue::signalBeginProcessing every tick (AsyncFilterManager#L482), but I discarded it since the first solution seems to do the job.
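Applied to the toy queue above, the first solution amounts to re-triggering the queue from the completion path as well (again only an illustration, not the actual patch):

```java
// Fix sketched against the toy StallingQueue above: free the slot and
// re-trigger the queue when a packet finishes, instead of waiting for
// the next enqueue to do it.
private synchronized void signalBeginProcessing() {
    while (slots.tryAcquire()) {
        Runnable job = pending.poll();
        if (job == null) {
            slots.release();
            return;
        }
        new Thread(() -> {
            try {
                job.run();
            } finally {
                slots.release();
                signalBeginProcessing(); // drain the backlog immediately
            }
        }).start();
    }
}
```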

@Ingrim4 commented Sep 29, 2023

Just encountered another, closely related problem, so I'm just going to put this here for discussion:

Since 1.20.2, chunks get sent in batches. A batch always starts with a CHUNK_BATCH_START packet, followed by one or more chunk packets (MAP_CHUNK) depending on the batch size, and ends with a CHUNK_BATCH_FINISHED packet. After the client has received a whole batch, it sends a CHUNK_BATCH_RECEIVED packet, which tells the server the desired chunk batch size for future batches.

In Orebfuscator I would like to delay the CHUNK_BATCH_START packet until all chunks of the current batch have been processed (which is quite slow and resource intensive), because the client will decrease the batch size if the time between CHUNK_BATCH_START and CHUNK_BATCH_RECEIVED increases. To achieve that, I have to increment the processing delay of all CHUNK_BATCH_START and MAP_CHUNK packets, since CHUNK_BATCH_START has to wait for MAP_CHUNK, which in turn waits to be obfuscated. This is currently possible with the API, but it could result in a permanent deadlock of the above-mentioned processing queue, since only 32 packets can be processed at a time (delaying a packet also counts as processing), and with enough players moving and joining, the number of pending CHUNK_BATCH_START packets could quickly grow to 32 or higher.
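For reference, something like the following is what I mean. This is only a sketch: it assumes ProtocolLib exposes the CHUNK_BATCH_START/MAP_CHUNK packet type constants for the running version, and trackForLater() is a hypothetical helper on my side, not ProtocolLib API:

```java
import com.comphenix.protocol.AsynchronousManager;
import com.comphenix.protocol.PacketType;
import com.comphenix.protocol.ProtocolLibrary;
import com.comphenix.protocol.events.PacketAdapter;
import com.comphenix.protocol.events.PacketEvent;
import org.bukkit.plugin.Plugin;

/**
 * Sketch of the delay scheme described above. Assumes the packet type
 * constants CHUNK_BATCH_START/MAP_CHUNK exist for the running server version;
 * trackForLater() is a hypothetical Orebfuscator-side helper, not ProtocolLib API.
 */
public class ChunkBatchDelayListener {

    public static void register(Plugin plugin) {
        AsynchronousManager asyncManager =
                ProtocolLibrary.getProtocolManager().getAsynchronousManager();

        asyncManager.registerAsyncHandler(new PacketAdapter(plugin,
                PacketType.Play.Server.CHUNK_BATCH_START,
                PacketType.Play.Server.MAP_CHUNK) {
            @Override
            public void onPacketSending(PacketEvent event) {
                // Delaying also counts as "processing", so every packet held
                // here occupies one of the (currently 32) processing slots
                // until it is released again - the deadlock risk in question.
                event.getAsyncMarker().incrementProcessingDelay();
                trackForLater(asyncManager, event);
            }
        }).start();
    }

    private static void trackForLater(AsynchronousManager asyncManager, PacketEvent event) {
        // Hypothetical bookkeeping: once the whole batch is obfuscated, call
        // asyncManager.signalPacketTransmission(event) to release the packet.
    }
}
```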

So here's my question/proposal: Why is the processing slot limit of 32 needed, and could it be removed in future releases? If not, would it be possible to not count packets against said processing slots if their processing delay was incremented manually?

@dmulloy2 commented Oct 25, 2023

> Why is the processing slot limit of 32 needed, and could it be removed in future releases?

Looks like that limit was added 11 years ago in 93468b5, which was before my time. My guess is that it's arbitrary (who doesn't love a good power of 2?) but probably has something to do with making sure the queue doesn't get too out of hand. I don't see why it couldn't at the very least be higher.

Honestly I'd say up it to 64 or 128 and give it a shot locally

@dmulloy2 commented:
Paging @derklaro on this as well as he's familiar with the fancy new injector

@Ingrim4 commented Oct 28, 2023

> Why is the processing slot limit of 32 needed, and could it be removed in future releases?
>
> Looks like that limit was added 11 years ago in 93468b5, which was before my time. My guess is that it's arbitrary (who doesn't love a good power of 2?) but probably has something to do with making sure the queue doesn't get too out of hand. I don't see why it couldn't at the very least be higher.
>
> Honestly I'd say up it to 64 or 128 and give it a shot locally

Regarding the initial problem this PR is trying to fix: increasing the concurrent processing count feels like a band-aid fix to me, which is why I would still prefer the solution I initially proposed. This would also fix an old bug in Orebfuscator (#60) which is currently "fixed" by processing the packet sync, then cancelling it, and finally requeuing it. That isn't ideal, since other plugins/packet listeners won't get a chance to process the packet that way. The issue gets reported from time to time, and the only currently working solution is my "fix" with the sync-listener requeuing strategy, sketched below.
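For context, the workaround roughly looks like this (a sketch assuming ProtocolLib 5's API; obfuscateAsync() is a hypothetical stand-in for Orebfuscator's actual off-thread processing):

```java
import com.comphenix.protocol.PacketType;
import com.comphenix.protocol.ProtocolLibrary;
import com.comphenix.protocol.ProtocolManager;
import com.comphenix.protocol.events.ListenerPriority;
import com.comphenix.protocol.events.PacketAdapter;
import com.comphenix.protocol.events.PacketEvent;
import org.bukkit.plugin.Plugin;

/**
 * Sketch of the "cancel and requeue" workaround described above.
 */
public class CancelAndRequeueListener extends PacketAdapter {

    private final ProtocolManager manager = ProtocolLibrary.getProtocolManager();

    public CancelAndRequeueListener(Plugin plugin) {
        super(plugin, ListenerPriority.HIGHEST, PacketType.Play.Server.MAP_CHUNK);
    }

    @Override
    public void onPacketSending(PacketEvent event) {
        event.setCancelled(true); // swallow the original chunk packet
        obfuscateAsync(event, () ->
                // Resend with filters=false so listeners are skipped - which is
                // exactly why other plugins never get to see the packet.
                manager.sendServerPacket(event.getPlayer(), event.getPacket(), false));
    }

    private void obfuscateAsync(PacketEvent event, Runnable whenDone) {
        // Hypothetical: obfuscate off-thread, then run whenDone.
    }
}
```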

With regard to my second problem: for now I simply "disabled" the CHUNK_BATCH_RECEIVED packet (I always set the chunks per second to the default value to recreate the old behavior), which so far works fine for me, so there is no immediate need to do anything about the concurrent processing count.
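Roughly like this; a sketch that assumes the CHUNK_BATCH_RECEIVED client packet carries a single float field, and where the pinned value is a made-up placeholder rather than the vanilla default:

```java
import com.comphenix.protocol.PacketType;
import com.comphenix.protocol.ProtocolLibrary;
import com.comphenix.protocol.events.PacketAdapter;
import com.comphenix.protocol.events.PacketEvent;
import org.bukkit.plugin.Plugin;

/**
 * Sketch of the workaround described above: pin the client's reported
 * chunks-per-tick to a fixed value so the server never shrinks the batches.
 */
public class PinChunkBatchSizeListener {

    private static final float PINNED_CHUNKS_PER_TICK = 9.0F; // placeholder value

    public static void register(Plugin plugin) {
        ProtocolLibrary.getProtocolManager().addPacketListener(new PacketAdapter(
                plugin, PacketType.Play.Client.CHUNK_BATCH_RECEIVED) {
            @Override
            public void onPacketReceiving(PacketEvent event) {
                // Overwrite the client's measured value with a constant,
                // effectively recreating the pre-1.20.2 behavior.
                event.getPacket().getFloat().write(0, PINNED_CHUNKS_PER_TICK);
            }
        });
    }
}
```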

@Moritzoni commented:

Any updates on this?
Having the same issues.

@Ingrim4 mentioned this pull request Apr 24, 2024
@Ingrim4 commented May 5, 2024

Any reason this was closed? 'Cause I still need the fix.

@dmulloy2 commented May 6, 2024

ah i think GitHub auto-closed it since it was mentioned in that PR. whoops

@dmulloy2 reopened this May 6, 2024
@dmulloy2 merged commit a2f0265 into dmulloy2:master May 6, 2024
4 of 5 checks passed
@Ingrim4 deleted the fix-async-processing-deadlock branch June 6, 2024 17:05