-
Notifications
You must be signed in to change notification settings - Fork 3.7k
drill picking performance #12916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
drill picking performance #12916
Conversation
Thank you for the pull request, @Beilinson! ✅ We can confirm we have a CLA on file for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Beilinson
Thanks for the contribution! I left a number of comments, but they're mostly minor. The performance improvement seems huge. I was particularly surprised the Map
change gave a big difference, since the plain object {}
is also backed by a hash table. (In contrast to #12896, where we were iterating over an array. Map had a huge impact there).
I just have a few questions, but they're pretty important:
- Paraphrasing a comment I left on
PickFramebuffer.js
, can you elaborate on why not returning early fromPickFramebuffer.end
speeds up picking? Seems counterintuitive to me. - Following from that, does this performance gain come at the cost of some other use case of picking? (like, does it speed up drill picking at the cost of slowing down regular picking?)
- In the example sandcastle you linked under the testing section, it only picks 40 pins (compared to the 220 in your measurements). What do I change to increase it to 220?
while (defined(pickedResults) && pickedResults.length > 0 && !shouldBreak) { | ||
for (const pickedResult of pickedResults) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely we don't need a for loop inside a while loop, right? I haven't fully processed this code, but what is the while loop doing, exactly? Can we just... remove it, and only use the for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note- if we do really need this while loop, I think I'd rather see the inner for loop as its own function (to be frank, I'd prefer that the existing code had been written as several smaller functions).
Then, you would no longer need shouldBreak
- you could just say if !myForLoopFunction(...) break
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH i would've liked to use a label on the outer while loop rather than the shouldBreak
flag, but I see your point. I''m not sure refactoring it out into a separate function would be great, this is the bread and butter of this method and it would need to get multiple parameters to refactor. thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the double loops:
while
loops as long as there are entities (and hasn't reached the limit) and redoes thepickCallback
for
loop to actually iterate over all the entities and add/break once hit the limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you were to refactor the inner for loop into its own function, the args would bepickedResults
, results
, and pickedPrimitives
- did I miss anything? That doesn't sound too bad to me.
Unrelated, is defined(pickedResults)
a necessary while loop condition? Isn't pickedResults
guaranteed to be defined, it just may be 0-length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getRayIntersection
in Picking.js
returns object[] | undefined
, which is used as the drillPick callback in getRayIntersections
. Do you prefer me to change it to return Frozen.EMPTY_ARRAY
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored the for loop to an external function, it does take quite a few arguments but overall looks cleaner to me
Hey @mzschwartz5 , thanks for the review!
The drill picking implemented by Cesium is similar to Depth Peeling. Each iteration, entities are rendered to some offscreen buffer, that buffer is then read into memory and then we iterate over the pixels (spirally from the center, this is primarily used for picking the entities "closest" to the pointer, not really useful when picking an actual rectangle). The old algorithm would return the moment one primitive was found. The sandcastle demonstrates a scenario where picking isn't directly of a 3x3 pixel rectangle at the pointer, rather a significant portion of the screen. This is also a use case we have in our enterprise system, where we support "drag selecting" multiple entities. The important thing is depth peeling is critical when entities are rendered 100% directly on top of another and you are picking a very small (3x3) area. In this use case however, the entities dont have a significant amount of overlap, meaning each iteration we could find more entities rendered, but instead we exit early to re-render everything and re-read everything. By letting the
No, regular picking still picks a (3x3 by default) rectangle, and stops after finding the first entity (
Probably has to do with screen size, you could try resizing the viewer size or alter the sandcastle so the pickRectangle is the entire canvas, but that would be even more unfair |
Essentially this PR has two separate (not necessarily exclusive) improvements:
To be precise, 2 speeds up from O(N) to O(N2), where N2 is the maximum # of fully overlapping entities at the given zoom and render resolution. |
Got it. So the behavior of regular picking is exactly the same as before this PR, because it uses limit=1. Perfect. I guess the natural follow up is then, does this slow down the performance of regular 3x3 drill picking, where we most likely do want 1 result at a time? (I need to go over the PR again, and maybe I'll figure out the answer myself, but good to document it anyway). Took another pass, still interested in your response to the above question, but otherwise I think this is looking pretty good barring those few style-related discussions. |
It's the exact same code flow as previously because it returns early after finding one result |
No, I don't think this is true. Take this example: I'm doing a 10x10 drill pick, on a region of the screen where 5 entities are overlapped, and I've set the limit to 5. On each pick pass, now, even once we've found 1 entity, Essentially, this PR is a tradeoff - better performance for large pick rectangles when entities are non-overlapping, at the same depth, but worse performance when they are overlapping at various depths. Is the tradeoff worth it? We might need to discuss. The primary use-case of drill picking is to pick objects that are overlapping each other at various depth layers. So iff this has a significant performance impact on large-rectangle drill picks containing overlapping objects, we may want to consider other options. (like maybe a new picking API specifically for this type of picking?). |
Also, won't this change which entities get picked? The old method prioritizes entities closest to the mouse position, depth-first if you will. The new method prioritizes breadth-first. This is probably more important of a question than the performance one above, as I assume that, overall, performance is probably better all-around given the other changes (map, rgb conversion) |
Ah sorry I misread your question. Yes you're right, this PR causes breadth-first instead of depth-first search. Regarding the fact that you iterate over empty pixels after already finding an entity rather than early return, I took the liberty to benchmark the iteration speed before and after my map+byte conversion improvements: New sandbox In the new example, I click directly on the entities, which are all 100% overlapping Main: Color + Map improvements: Color+Map+Iterate over all pixels: I benchmarked all these several times, and the numbers weren't 100% consistent obviously but this was the average first click time. As a user of both scenarios, I feel a 10ms slowdown (which isn't felt because this pr also gives 100ms speedup) for a pick of 400 entities, while getting a O(N) improvement over non-overlapping entities is worth it. Iterating over 100 pixels really isn't all that slow when all the iteration does is check a map |
Agreed this is 100% a more important question. On any sized rectangle, given no limit, the result will be the same (different order, but for large picks thats an improvement). On small picks (3x3) it is most likely guranteed to be the same as before, because I still iterate centrally outwards, meaning if any outer entities pixels exist they are are probably of entities hidden by the central entity. Given a large rectangle, and a small limit, and a mix of overlapping and non-overlapping entities, then the results will be different - drillPick will now prioritize visible entities over completely hidden ones. From my point of view this is a matter of intention: as a user, we use drillPick with the default 3x3 to pick things directly under the mouse. The use of drillPick with a large rectangle is to drag select, in which case if we have a limit (which we do) we prefer to highlight the visible entities over the hidden ones. If there is a use-case for drill picking a large rectangle and still preferring a depth-first search, I am open to adding a flag which will early return after 1 entity is found. I do think breadth-first should be the default, because of its improved performance. If needed, I could add a warning in the changelog |
That's a pretty fair assessment on both fronts. I agree about the performance changes being a non-issue. I'm going to seek input from my teammates on the selection ordering; I think it could be OK, and may just warrant a note in the change log as you said. |
Found this #3018 randomly looking through issues related to performance/memory leaks, realized that this PR implements the transition to Map as mentioned there, so this could close that I think |
Description
The goal of this PR is to improve performance with
scene.drillPick
, without changing the algorithm significantly.The first commit changing the internal data structure of
_pickObjects
toMap
was inspired by the performance boosts seen in #12896, and led to a surprising 75% decrease in time.I tried further reducing unneeded code by removing some funny
byte-float-byte
transformations and reusing a scratch array for pixels inContext.readPixels
, which also led to around ~10% improvements over the previous iteration.Further, I realized that in sandcastles such as the one described in #9660, where may non-fully overlapping entities may be drawn, and the drill pick rectangle may be particularly large, instead of
PickFramebuffer.end
exiting early after one entity was found, it could continue counting entities and iterate over the entire pickPolygon. This led to the most significant performance boost, the attached sandcastle now takes ~30ms down from ~3s on my machine, since the slowest part of picking is the need to rerender for every single entity in the pick rectangle currently (i.e, 220 entities = 220 renders + 220 buffer reads + 220 iterations over bigger and bigger parts of the image)Issue number and link
#9660
Testing plan
Sandcastle example modified from #9660:
local
main
Author checklist
CONTRIBUTORS.md
CHANGES.md
with a short summary of my change