Packet losses in reliable RPCs #79487
Comments
cc @Faless
I am baffled. Today the uploaded (M)RP shows no packet loss for me across ~40 reproductions. Anyway, I suspect the RPC channels are conflicting with each other: my original project kept hitting this bug today, but after converting all RPCs to channel 0 I could not replicate it. I will playtest on a real network to see if a desync happens, and I will update if I see packet loss again on this MRP or on my original project with all RPCs on channel 0.
Keep in mind that reliable RPC order is enforced per channel, i.e. sending reliable message A on channel 1 and reliable message B on channel 2 might result in B being received before A (that's the whole point of channels). So if your game assumes messages are received in order even across different channels, you will see unexpected behavior. It's hard to tell whether this is the case given the size of the posted reproduction project, but I can see it uses multiple channels, so it might be something to look into. It would be really strange for ENet to drop reliable packets, as that is supposed to cause a disconnection, and I have honestly never seen that happen.
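A minimal sketch of the caveat described above, assuming Godot 4.x `@rpc` annotations (the function names are purely illustrative, not taken from the project):

```gdscript
# Two reliable RPCs declared on different ENet channels. Each one is
# delivered reliably, but ordering is only enforced *within* a channel,
# so draw_card() (channel 2) may arrive before end_turn() (channel 1)
# even though end_turn() was sent first.
@rpc("authority", "call_remote", "reliable", 1)
func end_turn() -> void:
    pass

@rpc("authority", "call_remote", "reliable", 2)
func draw_card(card_id: int) -> void:
    pass

func notify_client(peer_id: int) -> void:
    end_turn.rpc_id(peer_id)       # sent on channel 1
    draw_card.rpc_id(peer_id, 42)  # sent on channel 2, may overtake end_turn()
```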
I made the channels like this: there are no RPCs on channel 0, because this is a turn-based game and all core game logic requires reliable RPCs, whereas channel 0 is a hybrid of reliable and unreliable from what I know. Now, the curious thing is that the project's multiplayer peer has a channel variable with 0 as the default (it never changes), so why did my project work and send RPCs without problems when no RPCs are on channel 0? Shouldn't a disconnect happen, or the RPC fail to be sent?

That said, the bug shown in the videos above definitely shows a packet drop, so assuming the channels are the cause, they somehow conflict. I also have a MultiplayerSynchronizer somewhere in there, which I recently removed in my original project (whose RPCs I converted all to channel 0), so that may be the culprit too.

I have playtested my original project for 1 hour on a real network without any packet losses. Before the aforementioned changes (RPC channels to 0, and removing the MultiplayerSynchronizer under HandCard), packet losses occurred roughly every 5 minutes on a real network. I want to do 1 more playtest to confirm that the RPC channels were somehow the cause of the packet losses. I should manage this by the weekend and confirm.
The (M)RP attached here is my original project stripped down heavily: all art is removed and most systems are gone. Obviously, it had the aforementioned packet loss bug. By moving all RPCs to channel 0 (instead of using 3 channels), it works perfectly with no packet losses. I tested my original project (which had packet losses every ~3 minutes) on a real network for two 1-hour sessions, so it's safe to say the bug was caused by using 3 channels. Theoretically, this issue can be closed, but to be satisfied I would like to learn about the ENet channels, since I want to use them yet cannot... What is the recommended setup? When you use more than 1 channel, should you spawn another peer? (since the peer has a channel variable which defaults to 0) Off-topic: When an RPC function runs, e.g.
For ENet specifically, refer to the official ENet documentation: http://enet.bespin.org/Features.html You can read about multiplayer channels in Godot in the official documentation: https://docs.godotengine.org/en/latest/classes/class_multiplayerpeer.html#class-multiplayerpeer-property-transfer-channel In general, as stated in the Godot documentation, the default channel (
They are all on the default channel (
No, it contains an ID of up to 16 bits. This is why both clients and the server must define the same RPCs.
This is the World node (it exists in the uploaded MRP of the OP). Some feedback on the above: in the inspector, the peer should expose sub-properties for the RPC channels, kind of like an array or dictionary shows its items, so you can have multiple channels instead of a single hardcoded one (I don't know if it's just an inspector bug or also in the ENet binder). As for feedback related to the OP: my RPCs using channels 1, 2, 3 should not have been sent at all, and there should at least be a warning. I genuinely have no idea how they were even sent (perhaps the existing multiplayer spawner/synchronizer, which uses channel 0, somehow affected them? I don't know, but I can make an MRP if it's really a mystery).
I am aware of it, but that is just the function API; there are no best practices or anything. For example, from the link you refer to, https://docs.godotengine.org/en/latest/classes/class_multiplayerpeer.html#class-multiplayerpeer-property-transfer-channel , anyone reading it (me included) is instantly convinced to spread messages across multiple channels (like I did in the OP).
I don't think they should change either; 0 is good for that. It's a very niche use-case in a real game, in an already complex system. If someone puts one synchronizer on channel 0 and another on channel 1, who can guarantee bugs will not come up? Especially given that channels other than 0 support exclusively one ordering type (e.g. unreliable), which means a lot more development and polishing for an extremely niche use-case. And that is aside from the miscellaneous UI work to alert/notify the developer accordingly. If you feel like it, open a proposal, but I think that while the high-level multiplayer docs are still unfinished, it would be wasteful to put effort into such a niche use-case.
Is there any documentation or link I can read on how this ID is generated? For example, does this ID refer to the node, the function, or a combination of both? Especially given that RPCs are based on a node path which can change at runtime, this seems interesting (e.g. if you select a single node at the root of all your scenes and only that node receives RPCs, could it be optimized? Alternatively, when nodes constantly move around the tree hierarchy, how does this ID stay within 16 bits?)
I ran into the same issue and created a minimal reproduction project: Demo.zip There are two shell scripts in the project folder that you can use on Linux to simulate certain network conditions. I set it to 10% packet loss. This way the issue is always reproducible.
Outstanding. You made an actual MRP, deterministic and all, out of nowhere, and it's simple and clean. I didn't need to use either of the shell scripts. I expected automating this bug's reproduction to be a lot harder (I would have tried via timestamps lol), hence I kept delaying it. And if Faless had answered me about RPC channels, I would have closed this issue, thinking the packet loss was a result of bugged channels... With the above MRP, the "Needs Testing" tag no longer applies.
An interesting observation I made with my MRP is that
So basically, when I use channel 1, all RPCs behave as "unreliable_ordered" no matter which transfer mode I actually specified.
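A rough sketch of the kind of setup that could be used to observe this, assuming Godot 4.x `@rpc` syntax (hypothetical names, not the actual MRP code):

```gdscript
# Both RPCs are declared on channel 1 but with different transfer modes.
# Under simulated packet loss, both appear to behave like
# "unreliable_ordered": some calls never arrive, yet ordering is preserved.
@rpc("any_peer", "call_remote", "reliable", 1)
func reliable_on_channel_1(seq: int) -> void:
    print("reliable_on_channel_1 received ", seq)

@rpc("any_peer", "call_remote", "unreliable_ordered", 1)
func unreliable_ordered_on_channel_1(seq: int) -> void:
    print("unreliable_ordered_on_channel_1 received ", seq)
```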
Godot version
4.1 stable
System information
Godot v4.1.stable - Artix Linux #1 SMP PREEMPT_DYNAMIC Thu, 03 Nov 2022 21:10:08 +0000 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 1050 Ti (nvidia; 520.56.06) - AMD Ryzen 5 2600 Six-Core Processor (12 Threads)
Issue description
I apologize in advance for the wall of text; you will lose 10 minutes of your life just understanding what's going on.
Context: I was making a fully turn-based online tactical card game (no timers).
Over the past months I have noticed occasional packet losses on reliable RPCs. This is a critical bug because it causes a desync I cannot catch (or I must build a bloated wrapper to detect desyncs and resync, wasting months of work). These packet drops were not rare either: it took less than 4 minutes of gameplay to hit 1 packet loss (1 move desynced, ggwp), as shown in the screen recordings below.
So, the server sends an RPC to a client, but the message never arrives at the client, and of course a desync between server and client(s) is the result. I would love to debug it, but the 2 debugging methods I tried (console log printing + watchdog variables) are both bugged, and I am also a pleb at Wireshark.
Now, I tried to strip down my project heavily to get it into MRP form. I did my best and focused exclusively on the scenario at the very start of the game, where one player reveals their second deck to the other player and the cards are placed on the appropriate deck slot. (I apologize for the boring context, but it's required to follow the screen recordings of the bug.)
The annoying thing, and why I am writing this wall of text, is that this bug's MRP is not deterministic. It only happens about once in 40 runs with this MRP, so I doubt anyone will figure out the root of the problem, but the following screen recordings at least prove that packets do get dropped on reliable RPCs.
See start and skip to 2:40
Skip to 3:21
(The print messages shown above are out of order because of another bug with RPCs.)
Ideally, I would have provided an MRP or use-case scenario where I automate RPCs via a timer and count how many were sent from the client and how many arrived at the server, but I thought of this too late and did it all by hand.
Note that I am using localhost, without any packet-mangling tool, so it makes no sense for packets to drop. Of course, I have also tested on a real network via UPnP/port forwarding, and packets drop there too.
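For reference, a rough sketch of what such an automated test could look like, assuming Godot 4.x and a non-zero channel like the ones in the original project (all names are illustrative, not from the MRP):

```gdscript
extends Node

# The client sends a reliable RPC with an incrementing sequence number on a
# timer; the server counts what arrives. A gap in the sequence means a
# reliable packet was dropped.
var sent := 0
var received := 0

func _ready() -> void:
    if not multiplayer.is_server():
        var timer := Timer.new()
        timer.wait_time = 0.1
        timer.timeout.connect(_send_probe)
        add_child(timer)
        timer.start()

func _send_probe() -> void:
    sent += 1
    probe.rpc_id(1, sent)  # peer 1 is always the server

@rpc("any_peer", "call_remote", "reliable", 1)
func probe(seq: int) -> void:
    received += 1
    if seq != received:
        push_error("Packet loss: expected sequence %d, got %d" % [received, seq])
```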
Steps to reproduce
See both videos above, and be prepared to run the scenario below around 40 times to get a visible packet loss. Spam F5/F8.
At this step, the other player should have the cards of the opponent's second deck on the board. If the slot where these cards go is empty, that's a packet loss, as shown by the print messages.
Minimal reproduction project
somewhat-rare-packet-loss.zip