-
-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added diff states (no binary serialization, no static typing) - includes fix for the first-state problem. #278
base: main
Are you sure you want to change the base?
Conversation
…piler hits error if function doesnt exist, so typos are caught with static RPC functions)
get: | ||
return ProjectSettings.get_setting("netfox/rollback/enable_state_diffs", true) | ||
set(v): | ||
enable_state_diffs = v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the setter needless? I am not familiar with Godot's setters/getters.
Also are project settings' values cached on runtime, or does it search project settings' values each time I get()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for this flag to be writable during runtime? I'd just go with a push_error
message.
ProjectSettings.get_setting
boils down to a Node.get
call under the hood, so I'd say it's fine as it is. But if you want to be proactive about perf, feel free to introduce a variable with an underscore prefix that's initialized in _ready
and use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it push_error
as you suggested. As for the second part of this, I opened #304
…RED) regardless of full or diff. The server always starts by sending full states. Once the client receives any (full) state, he sends an ack to the server to start sending diffs.
I really hope the author reads this |
@olivatooo getting there 🙂 |
Cleaned up the PR a bit so we can focus on the actual changes. In the future, please keep in mind the following for PRs:
Thanks! |
I apologize for this slipping through, it pops up in each PR I make by default, I gotta manually exclude this, and it seems in this PR I missed it.
This PR still contains the input changes which are unrelated to diff states. I will move them to a different PR. Freely close that input refactor PR. Btw on the removal of static typing, I gotta say its harder to read this PR without types, because you cant tell if a |
@@ -28,6 +27,5 @@ static func parse(root: Node, path: String) -> PropertyEntry: | |||
var result = PropertyEntry.new() | |||
result.node = root.get_node(NodePath(path)) | |||
result.property = path.erase(0, path.find(":") + 1) | |||
result.type = typeof(result.get_value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I make a PR for caching the type of the property entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, that would need a separate discussion on why that would be beneficial.
for property in properties: | ||
result[property.to_string()] = property.get_value() | ||
result.make_read_only() | ||
return result | ||
|
||
static func apply(properties: Dictionary, cache: PropertyCache): | ||
for property in properties: | ||
var pe: PropertyEntry = cache.get_entry(property) | ||
var pe = cache.get_entry(property) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no static typing here, at least a rename is required of pe
to property_entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but either in a different PR, or later in this PR's lifecycle
var earliest: int = buffer.keys().min() | ||
var latest: int = buffer.keys().max() | ||
var earliest = buffer.keys().min() | ||
var latest = buffer.keys().max() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static typing in places such as this help greatly in readability. Without the int
(hinting to tick), if you land suddenly on this codeblock, you dont know what buffer.keys()
can be, could be variant in the reader's head
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed earlier, convention is to not use static typing inside methods. Renaming variables to be more expressive of what they do is a good idea though.
The latter can be done either in a separate PR intended for readability, or later in this PR's lifecycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the input refactor so this PR is exclusively on state diffs. I found a weird bug though on Godot 4.1 (haven't tested other versions) It sends state ticks in order (e.g. 45, 46, 47) but sometimes they arrive out of order (e.g. 45, 47, 46) which shouldn't happen because the RPC is "unreliable_ordered". I want to test this with other godot versions because until Godot 4.4 dev3 some RPC bugs are fixed, and this may include this. Nonetheless, this made me realize that even if Godot doesn't bug, packet loss and packets being shuffled are a possibility, so it's good to cover those edge scenarios. So I made a condition block for this scenario (see the The above works and has no bugs but there is still that rare jittering bug where you run 0 ping but there is a jittering at some player once in 10 boots or something (which existed before this PR), which I suspect is caused by this "unreliable_ordered" failure. I will open an issue on this once I confirm its root and can reproduce it consistently. Anyway, I left the debug comments and stuff inside, turning this into a draft for now until I remove the debug comments and look deeper into this bug (e.g. test other godot versions) I will open a PR for the input refactor later, gotta do tasks irl will take a few days. |
So I pause this PR to investigate this bug #300 |
@@ -148,27 +151,63 @@ func _process_tick(tick: int): | |||
|
|||
func _record_tick(tick: int): | |||
# Broadcast state we own | |||
if not _auth_state_props.is_empty(): | |||
var broadcast = {} | |||
if not _auth_state_props.is_empty(): #This is always the server, as the server owns all avatars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment is misleading, this is not a requirement. Though uncommon, other peers can own state too, and this is determined per node. So I'd prefer not to specify this, even as comment, in netfox core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
var diff_state_to_broadcast: Dictionary = {} | ||
for picked_property_path in full_state_to_broadcast: | ||
#If previous tick doesnt have a property, this means its a new property added in runtime, so we must add it. | ||
if (_states[tick - 1].has(picked_property_path) == false): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create a dedicated variable to the previous state and add a comment that it will always have a value due to the check on line 169.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
diff_state_to_broadcast[picked_property] = full_state_to_broadcast[picked_property] | ||
|
||
for picked_peer_id in multiplayer.get_peers(): | ||
if (NetworkTime._synced_clients.has(picked_peer_id) == false): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation behind this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkTime.tick is not synced yet for the client, so sending states can cause issues because before the sync for example the client can be tick 54 then jump to tick 60 and then start incrementing the tick by 1, so there is a skip in states
For a practical example, in the game im using netfox for, when a client joins, it syncs to the exact NetworkTime.tick of the server
This check can be removed, it shouldn't cause a bug because to un-synced clients, diff states are never sent, only full states, but I would need to test first before removing this from draft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, and if the state jumps from 54 to 60, we can't do diff states because we don't have tick 59, right? If so, I'd refer to my other comment on the approach. Ideally we shouldn't need a contiguous state history.
var _earliest_input: int | ||
|
||
var _sent_full_state_to_peer_ids: Array[int] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you plan to use this as a set of peers, I'd recommend a dictionary instead of an array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
_logger.warning("At tick %s skipped peer id %s" % [tick, picked_peer_id]) | ||
continue | ||
|
||
if (_sent_full_state_to_peer_ids.count(picked_peer_id) > 1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this trigger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a remnant from debugging, I will remove it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
properties_to_include.append(picked_property_path) | ||
|
||
#Set property values which are different from previous state sent | ||
for picked_property in properties_to_include: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have this loop separately? Why not just add things to the diff state as we go in the previous loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It was a remnant from the olde PackedByteArray
serialization, where I first got the properties to include then serialized them in a different loop.
I will add that line to the above/identical loop 👍
else: | ||
var properties_to_include: Array[String] = [] | ||
var diff_state_to_broadcast: Dictionary = {} | ||
for picked_property_path in full_state_to_broadcast: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be best to create a PropertySnapshot.diff(a, b)
method instead, and call that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still true for the new commit, aka this code?
if (sanitized.has(picked_property_path) == false): | ||
if (_states.has(tick - 1)): | ||
if (_states[tick-1].has(picked_property_path)): | ||
sanitized[picked_property_path] = _states[tick - 1][picked_property_path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just PropertySnapshot.merge
the incoming state and the previous tick's state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look into that function, I didn't know it existed
_logger.error("Received a state without any properties from %s, but diff states which should make it possible, are disabled!" % multiplayer.get_remote_sender_id()) | ||
breakpoint | ||
|
||
func set_received_state(received_state: Dictionary, tick: int, sender_id: int): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that a piece of logic was extracted here, but I don't think set_received_state
is the best name. Let's get back to this once the rest of the comments are addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Let's find a proper name once I push a commit, and have addressed the comments 👍
set_received_state(received_state, tick, multiplayer.get_remote_sender_id()) | ||
elif (NetworkRollback.enable_state_diffs): | ||
if (_states.has(tick - 1)): | ||
_states[tick] = _states[tick - 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do this? At this point, _states[tick]
should already be set by set_received_state
, which this then overrides fully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because if a diff contains 0 properties (that means there was no change in the properties from previous tick), the new state must simply be a duplicate of the previous one.
Ok so, I have a few big concerns regarding this implementation. The way I understand it:
Let me know if I missed something. The third point is very problematic imo - client ack's the first full state, but nothing guarantees they'll have full info for the rest of the states. Diffs might get lost, as they're sent unreliably. So once a diff is lost, the whole chain breaks. I think diff states should include a reference tick number, i.e. saying "this state describes the diff from this given reference tick". This way, clients will only need to have full info for the reference tick and none other. This was discussed under the original issue. I'd also prefer if full states weren't sent once at the beginning, but at regular intervals, to make sure that even if something goes wrong with the diffs, we still get the occasional full state to course-correct. Similar to how video codecs have keyframes. Lastly, I think it would benefit the code if either the full-state and diff-state code flows were more visibly separated ( e.g. different RPC's to handle them ), or collapsed into a single flow, with as little branching as possible. I feel the current version has way too many checks to comfortably manage the two flows in one. |
Correct overview 👍
It is true, but it should correct itself once the diff changes again. For example, for position and rotation since they change so often, losing a tick doesn't matter because it will be updated in another diff tick. But to play devil's advocate and support your argument, if the different property is
The issue above, from what I understand, says that a diff should reference a tick with full state, but there are no full states here once diff states begin.
This is one of the 2 solutions to this problem. The other solution being akin to The first solution is a hybrid of full state and diff state, and the second solution is doubling down on the diff state. There are pros and cons for both, for example, the second solution is ideal if you have 200+ dropped items in a game world whose position doesn't change, so they simply don't send their position (or state) and they consume no bandwidth. For its implementation, I would add a Which solution would you like? I can code both, and either solves the problem.
I will push a commit today or tomorrow, give me a confirmation if you want me to split in 2 RPCs. Personally I don't mind. Thank you for the detailed code review. I honestly didn't expect it, and I feel kinda guilty that I have paused the commit. Its just 2 codeblocks I have to debug/confirm they work as intended and I have prioritized other tasks irl, I didn't expect such rigorous review. I will push the above commit since its just 2 codeblocks (ignore it freely) which smoothens the packet loss, then I will do a followup commit which addresses all the code review comments. |
I pushed, packet loss is fixed/addressed. It is good enough to remove from draft, though I keep the debug comments until everything is determined to be perfect. I addressed most of the coding review comments in these commits. I will test this with the latest master merges and probably push a commit tomorrow because of that @elementbound I request to confirm with the current code, whether to split the After the above confirmations, I will implement them on this PR. |
# Also fixes packet loss (e.g. a state missing) | ||
for picked_tick in range(_latest_state_tick, received_tick): | ||
#Check for the very first latest_state_tick which doesn't have a corresponding state | ||
if (_states.has(picked_tick)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if check is required because I found a bug with _latest_state_tick
which exists prior to this PR as well, even with the starting order fix of #284 and I will open an issue for that, but until then this if check saves the day (consider it a hack until that issue is resolved)
To synopsize the issue: _latest_state_tick
is initialized inside process_settings()
or ready()
, but wherever, its value is always NetworkTime.tick - 1
. I changed this to NetworkTime.tick
as it makes more sense. But even this value is untrue because the latest state is set only if we are the authority (after _record_tick
) or after _submit_state
if we are not the authority. So it should have the value -1 instead of NetworkTime.tick
and let this variable tick value be set after its actual state is set.
By setting the value to -1 as it should be, causes some other problems, so yeah I opted for this if check 😅
All states are sent
UNRELIABLE_ORDERED
regardless if they are a full state (contains all properties) or a diff state (contains only the properties with new values)Once the client receives any state, he sends a
RELIABLE
rpc to the server that he has indeed received a full state, and so it can start receiving diff states. Once thisRELIABLE
rpc is received by the server, all future states sent to that specific peer are diff states, instead of full states. Diff states are states whose property/values included are only those which have changed from the previous tick.This implementation also solves the diff states problem mentioned in #264 (this is the 2nd way, and I think its the cleanest and most performant)
I have tested it and found no issues.
Solves #157
As for the bandwidth it saves, see forest-brawl example: "Displaceable:mass", "Displaceable:impulse", ":speed" which change only once via powerup, so 3/5 states are skipped with diff states. Assuming the variables were of same size, with diff states, there is 60% less bandwidth.
Requires #277 which is just static typing additions (so this PR is easy to read, instead of having double the size)