Reduce complexity overhead (and avoid one copy) when obtaining character SSDQs in SpriteTextDrawNode
#5883
Would like @smoogipoo's take on the invalidation stuff if possible.
{
    int partCount = Source.characters.Count;

    parts ??= new List<ScreenSpaceCharacterPart>(partCount);
This is the optimisation, correct? The lazy list init and the `partCount` spec?
It's not what I would call "avoid one copy" exactly because it's not directly avoiding copies, it's avoiding list reallocs due to having to expand the collection to match incoming count (which in turn will cause struct copies due to how list-of-struct works), which makes me mildly confused. Strictly speaking it would save to the order of O(n) copies if I'm not mistaken?
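To make the realloc point above concrete, here is a minimal sketch that counts how often a `List<T>` has to grow its backing array, with and without an initial capacity. `Part` is a hypothetical stand-in for `ScreenSpaceCharacterPart` (its real fields are not shown in this thread), and `CountReallocations` is an illustrative helper, not framework code.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthSketch
{
    // Hypothetical stand-in for ScreenSpaceCharacterPart (assumed fields).
    public struct Part
    {
        public float X, Y, Width, Height;
    }

    // Adds `count` items one at a time and returns how many times the
    // backing array was reallocated, observed as a change in Capacity.
    // Each reallocation copies every struct already stored in the list.
    public static int CountReallocations(List<Part> list, int count)
    {
        int reallocations = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(new Part { X = i });

            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }

        return reallocations;
    }
}
```

Calling `CountReallocations(new List<ListGrowthSketch.Part>(), 100)` should report several growth steps, while `CountReallocations(new List<ListGrowthSketch.Part>(100), 100)` should report none, which is the saving the `partCount` argument buys on first allocation.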
The copy I meant was the fact that items were previously added once to a list in `SpriteText`, then subsequently copied to a second list in `DrawNode`.
The main goal was to remove the weird `InsertRange` overhead, and this was enough to do that.
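As a rough illustration of the two shapes being compared, the sketch below contrasts building an intermediate list and copying it with `AddRange` against filling the destination list directly. All names here (`Part`, `ViaIntermediateList`, `Direct`) are hypothetical and only mirror the description in this comment, not the actual `SpriteText` code.

```csharp
using System;
using System.Collections.Generic;

public static class CopyPathSketch
{
    // Hypothetical stand-in for the per-character draw data.
    public struct Part
    {
        public float X;
    }

    // Previous shape (as described above): the source builds its own list,
    // then the draw node copies that list into a second one.
    public static List<Part> ViaIntermediateList(int characterCount, List<Part> drawNodeParts)
    {
        var sourceParts = new List<Part>(characterCount);
        for (int i = 0; i < characterCount; i++)
            sourceParts.Add(new Part { X = i });

        drawNodeParts.Clear();
        drawNodeParts.AddRange(sourceParts); // second copy of every struct
        return drawNodeParts;
    }

    // New shape: the draw node fills its own list directly, so each
    // struct is written once and no range copy is needed.
    public static List<Part> Direct(int characterCount, List<Part> drawNodeParts)
    {
        drawNodeParts.Clear();
        for (int i = 0; i < characterCount; i++)
            drawNodeParts.Add(new Part { X = i });
        return drawNodeParts;
    }
}
```

Both methods leave the destination list with the same contents; the difference is only in how many times each element is written along the way.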
I double-checked with him IRL on the invalidation part when making this change, so he should be on board with it 😄
With
private readonly LayoutValue parentScreenSpaceCache = new LayoutValue(Invalidation.DrawSize | Invalidation.Presence | Invalidation.DrawInfo, InvalidationSource.Parent);
private readonly LayoutValue localScreenSpaceCache = new LayoutValue(Invalidation.MiscGeometry, InvalidationSource.Self);
I suppose one of the advantages of this is that the overhead only occurred once per invalidation, rather than 3 times (once for each DrawNode) per invalidation.
The lookup was already lazy, so I'm not sure if this would ever be the case, but maybe.
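The trade-off discussed in the two comments above can be sketched with a small lazily recomputed cache: the value is rebuilt at most once per invalidation, no matter how many consumers ask for it. `LazyCacheSketch` and its members are hypothetical and do not reproduce the framework's `LayoutValue` API.

```csharp
using System;

public sealed class LazyCacheSketch
{
    private bool isValid;
    private int[] cached = Array.Empty<int>();

    // Number of times the cached value has actually been recomputed.
    public int ComputeCount { get; private set; }

    // Marks the cached value as stale; nothing is recomputed here.
    public void Invalidate() => isValid = false;

    // Recomputes only if the value is stale, otherwise returns the cached copy.
    public int[] Get()
    {
        if (!isValid)
        {
            cached = new[] { 1, 2, 3 }; // placeholder for the real computation
            ComputeCount++;
            isValid = true;
        }

        return cached;
    }
}
```

If three draw nodes each call `Get()` after a single `Invalidate()`, `ComputeCount` goes up by one rather than three. Whether that matters in practice depends on how often the value is invalidated, which is the open question in this thread.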
Of note, this doesn't show a huge improvement in benchmarks, but in an edge case noticed in osu!, the `AddRange` operation can add a perceivable overhead. We believe this may be a dotnet runtime quirk.
Even though this change can't be shown in isolated benchmarks, I'd argue the code quality improvement is worth it.
Of note, the caching of the SSDQs at `SpriteText` was a bit redundant as it was being invalidated by basically everything. So it makes sense to just fetch it every time, regardless, in the draw node itself.
The only potential saving we could obtain with the previous logic would be to shift the load to the `Update` frame. But this wasn't being done. Rather than investigating whether that has any benefits, I'd rather focus on getting `SpriteText` rewritten to use an optimised shader.
[Benchmark results not preserved. Runs compared: master; this pr (pooled at 2967626); this pr (list at 90a2e28)]