Welcome to GPUSorting Discussions! #8
Replies: 3 comments 7 replies
-
Hi b0nes, outside of the readings listed on the GitHub, are there any video lectures, PowerPoints, or visualizations of the different parallel prefix sum algorithms that could be used to better understand them?
-
Hi, I'm Morten. I work with JangaFX on VFX authoring tools for games and film, such as EmberGen and LiquiGen.

My primary interests in terms of sorting are sorting smallish input sizes relative to most GPUs (i.e. between 2^10 and 2^18 or so key-values, barely enough to saturate the highest-end GPUs) with as little overhead as possible, as well as segmented sorts with segments in a similar size range within larger datasets. Doing a global sort of relatively small datasets (for the GPU) and doing smallish segmented sorts for larger datasets have some interesting similarities, but also very different tradeoffs, which I find interesting and somewhat underexplored. Add in the different characteristics of different GPU architectures and it gets even more interesting.

To be more specific: I'm currently working on a sparse fluid simulation where the number of tiles is typically on the order of a few thousand, and worst case on the order of a few tens of thousands. I need to sort these by distance to the camera for rendering purposes, and I also want to try sorting them by spatial locality using some kind of Z-order curve. I'd like these sorts to take up as little time as possible, preferably close to the bare cost of dispatching a compute shader. If someone wants to integrate what I am working on into an actual video game where there's a 1-2ms budget, then spending 100µs per sort isn't going to be quite acceptable.

My interest in GPUSorting in particular probably stops at using it as a great reference implementation and reference benchmark of OneSweep so far, where I'm applying it to a range of inputs it's not really designed for at all. I'll probably look into the SplitSort aspect as well at some point in the future, but for now I'm mostly focusing on my own implementation. I only have a limited selection of GPUs available for testing locally, so I'm also interested in more comprehensive benchmarks across different architectures (e.g. 1000 series and onwards for NV, and RDNA1 and onward for AMD).
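For readers unfamiliar with the Z-order curve mentioned above, here is a minimal CPU-side sketch of how a 3D tile coordinate could be turned into a sort key. It is not taken from this post or from GPUSorting: the function names, the 10 bits per axis, and the example tile coordinates are all assumptions for illustration. The bit-spreading masks are the commonly used ones for 30-bit 3D Morton codes.

```python
def part1by2(x: int) -> int:
    """Spread the low 10 bits of x so that two zero bits follow each original bit."""
    x &= 0x000003FF                      # keep 10 bits (assumed per-axis range 0..1023)
    x = (x ^ (x << 16)) & 0xFF0000FF
    x = (x ^ (x << 8)) & 0x0300F00F
    x = (x ^ (x << 4)) & 0x030C30C3
    x = (x ^ (x << 2)) & 0x09249249
    return x


def morton3d(x: int, y: int, z: int) -> int:
    """Interleave three 10-bit coordinates into one 30-bit Z-order (Morton) key."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)


def sort_tiles_by_morton(tiles):
    """Sort hypothetical (x, y, z) tile coordinates along the Z-order curve."""
    return sorted(tiles, key=lambda t: morton3d(*t))


if __name__ == "__main__":
    # Hypothetical tile coordinates, not data from the simulation described above.
    tiles = [(3, 1, 0), (0, 0, 0), (1, 0, 0), (0, 1, 0)]
    print(sort_tiles_by_morton(tiles))
```

On the GPU the same key would typically be computed per tile in a shader and fed to a key-value radix sort. With 30-bit keys, fewer radix passes may be needed than for full 32-bit keys, depending on the digit width of the implementation.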
-
Could a discussions section be created for the PrefixSums repository? I mostly have questions about parallel prefix sums, but I feel those questions would be better addressed in the relevant repository.
-
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you:
build together 💪.
To get started, comment below with an introduction of yourself and tell us about what you do with this community.