-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add config option to specify json float serialization precision #905
base: develop
Are you sure you want to change the base?
Add config option to specify json float serialization precision #905
Conversation
bf7c3aa
to
86d642a
Compare
|
Hey, nice work! I think it makes sense to use the same precision for everything, so those tests probably need to be adjusted to have higher tolerance for error. The second one is interesting because it means there will be some edge cases where the rating change is so small that it doesn’t appear to change at all. I think that’s ok though since the full change history will still be available through the replay details page and the rating graph. |
One thing we have to keep in mind is that if we’re messing with the json encoding, we need to be very careful of the performance impact since this is one of the hottest functions in the entire code base. It would be amazing if we could just override the float serialization code and leave the rest of the implementation untouched. I found a little snippet that does this although it uses the https://gist.github.com/Sukonnik-Illia/ed9b2bec1821cad437d1b8adb17406a3 |
It looks like we can't do much without creating custom python package that will use compiled C code. patching (example given above): 107.800
current (this pr's code): 28.629 Default serialization simplejson: 6.981
ujson: 3.503
stdlib json: 4.500 Round floats approach (https://stackoverflow.com/a/53798633): simplejson: 14.196
ujson: 10.481
stdlib json: 11.109 Overriding 15.991 Compiling custom mix of 6.589 Time is in seconds, the data for encoding is: {
"command": "game_info",
"visibility": "public",
"password_protected": True,
"uid": 13,
"title": "someone's game",
"state": "playing",
"game_type": "custom",
"featured_mod": "faf",
"sim_mods": {},
"mapname": "scmp_009",
"map_file_path": "maps/scmp_009.zip",
"host": "Foo",
"num_players": 2,
"launched_at": 1111111111.1112312312312,
"rating_type": "faf",
"rating_min": None,
"rating_max": None,
"enforce_rating_range": False,
"team_ids": [
{
"team_id": 1,
"player_ids": [1],
},
{
"team_id": 2,
"player_ids": [2],
},
],
"teams": {
1: ["Foo"],
2: ["Bar"],
},
} There is also a package called 0.624 Otherwise, using "round floats" approach in which we also convert 8.457 |
|
||
class CustomJSONEncoder(json.JSONEncoder): | ||
# taken from https://stackoverflow.com/a/53798633 | ||
def encode(self, o): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this way the best out of all the ones proposed so far. Have you measured the performance difference for this one? I imagine it would be noticeable especially for dicts since it makes a copy of the entire data structure, but maybe it's not too bad.
I think the other way that would have merit would be to make a wrapper type around float
and use default
to format it. That would require us to explicitly set the float rounding everywhere, but it would also give more control if we wanted to round rating to 3 or 4 places but timestamps only to 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, I see this is the 'round floats' approach you mentioned in your other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand what do you mean by a wrapper, because if it requires us to call this wrapper every time we do some math, then why don't just use round
?
Rewriting representation of the float won't help, the C code will operate with value and encode all the digits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something similar to this, but using our own class instead of Decimal: https://stackoverflow.com/a/3885198
class PFloat:
def __init__(self, value: float, precision: int):
self.value = value
self.precision = precision
And then in the to_dict
method you'd have to wrap the floats in this class:
def to_dict(self):
return {
"some_float": PFloat(self.some_float, 2)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can make a class, but I think it is unnecessary complication, because we already have built-in round
function which effectively does the same thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, we have 2 options:
- Rewrite json's
encode
method to change encoding of all floats - Rewrite every
to_dict
method where we will round every float with its own precision (the "more control" mentioned above)
server/protocol/protocol.py
Outdated
if isinstance(o, (list, tuple)): | ||
return [round_floats(x) for x in o] | ||
return o | ||
return super().encode(round_floats(o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should add a config option to disable this feature just in case it turns out to be too costly in production. Although, I think even if it doubles the json encoding time that will still be fine judging by the profiling information I've collected from the server in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sure, what would you call it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think JSON_ROUND_FLOATS
would be good. And then rename the other one to JSON_ROUND_FLOATS_MAX_DIGITS
or JSON_ROUND_FLOATS_PRECISION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should mocking be also conditional in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's too important to have tests for the config values. They're generally simple enough that it doesn't matter
2a356ce
to
c61d227
Compare
Closes #875