Use caching allocators in Thrust #835

stephenswat · 2025-02-03T13:40:00Z

I was very pleasantly surprised to learn that using our vecmem memory resources in Thrust is trivial! You simply create a polymorphic allocator from them and Thrust will accept it as-is. This should significantly reduce the number of memory allocations throughout traccc.

With this change, I observe a 10-15% improvement of the full-application throughput of traccc. 🥳
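To illustrate why routing temporary buffers through a caching resource cuts down on allocations, here is a minimal standard-library sketch. It does not use Thrust or vecmem: `counting_resource`, `caching_resource`, and `upstream_allocations` are hypothetical names invented for this example, and the naive free list is only a stand-in for a real caching memory resource.

```cpp
#include <cstddef>
#include <map>
#include <memory_resource>
#include <utility>
#include <vector>

// Hypothetical upstream resource that counts how often it is asked for memory.
// It stands in for an expensive allocator (for example one backed by cudaMalloc).
class counting_resource : public std::pmr::memory_resource {
public:
    std::size_t n_allocations = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++n_allocations;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

// Hypothetical, deliberately naive caching resource: freed blocks are parked in
// a free list keyed by (size, alignment) and reused on the next matching request.
class caching_resource : public std::pmr::memory_resource {
public:
    explicit caching_resource(std::pmr::memory_resource& upstream)
        : m_upstream(upstream) {}

    ~caching_resource() override {
        // Hand every cached block back to the upstream resource.
        for (auto& [key, blocks] : m_free) {
            for (void* p : blocks) {
                m_upstream.deallocate(p, key.first, key.second);
            }
        }
    }

private:
    using key_t = std::pair<std::size_t, std::size_t>;

    void* do_allocate(std::size_t bytes, std::size_t align) override {
        auto& blocks = m_free[key_t{bytes, align}];
        if (!blocks.empty()) {
            void* p = blocks.back();
            blocks.pop_back();
            return p;  // Cache hit: the upstream resource is not touched.
        }
        return m_upstream.allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        m_free[key_t{bytes, align}].push_back(p);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource& m_upstream;
    std::map<key_t, std::vector<void*>> m_free;
};

// Simulates n_calls algorithm invocations that each need one 1 KiB scratch
// buffer, and returns how many requests reached the upstream resource.
std::size_t upstream_allocations(bool use_cache, int n_calls) {
    counting_resource upstream;
    caching_resource cache(upstream);
    std::pmr::memory_resource* mr = &upstream;
    if (use_cache) {
        mr = &cache;
    }
    std::pmr::polymorphic_allocator<std::byte> alloc(mr);
    for (int i = 0; i < n_calls; ++i) {
        std::byte* scratch = alloc.allocate(1024);
        alloc.deallocate(scratch, 1024);
    }
    return upstream.n_allocations;
}
```

In this sketch, `upstream_allocations(false, 100)` sends all 100 requests upstream, while `upstream_allocations(true, 100)` sends only the first one. A real caching resource would also have to deal with differing sizes, fragmentation, and stream ordering, none of which is modelled here.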

sonarqubecloud · 2025-02-03T16:14:06Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

krasznaa

I'm also pretty sure that we could avoid creating these policy objects on every single call separately. 🤔 Again, something to try.

I'm very happy though about the performance improvement that you found!

krasznaa · 2025-02-03T19:43:56Z

device/cuda/src/clusterization/measurement_sorting_algorithm.cu

-                 measurements_view.ptr() + n_measurements,
-                 measurement_sort_comp());
+    thrust::sort(
+        thrust::cuda::par(std::pmr::polymorphic_allocator(&(m_mr.main)))


Strange that you needed to be so very explicit. 🤔 std::pmr::polymorphic_allocator definitely doesn't mark its constructor as explicit. That's how we can create vecmem::vector objects without always writing out the full allocator name in their constructors.

I guess I'll try thrust::cuda::par(&(m_mr.main)).on(stream) tomorrow, to see if that would also work...
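The implicit conversion mentioned above can be checked with the standard library alone. The sketch below also shows one possible reason (an assumption, not verified against Thrust's implementation) why a bare pointer might not be enough at a call site: when a function template deduces its argument type, no implicit conversion is applied. `policy` and `make_policy` are hypothetical stand-ins, not Thrust API.

```cpp
#include <memory_resource>
#include <type_traits>
#include <utility>
#include <vector>

// The constructor taking a memory_resource* is not explicit, so a plain
// resource pointer converts implicitly to an allocator.
static_assert(std::is_convertible_v<std::pmr::memory_resource*,
                                    std::pmr::polymorphic_allocator<int>>);

// Hypothetical stand-in for a policy factory that deduces its argument type.
template <typename Allocator>
struct policy {
    using allocator_type = Allocator;
};

template <typename Allocator>
policy<Allocator> make_policy(Allocator) {
    return {};
}

// Deduction keeps the raw pointer type; no conversion to an allocator happens.
static_assert(std::is_same_v<
    decltype(make_policy(std::declval<std::pmr::memory_resource*>()))::allocator_type,
    std::pmr::memory_resource*>);

bool implicit_conversion_works() {
    std::pmr::monotonic_buffer_resource mr;
    // Copy-initialization from a pointer compiles only because the
    // converting constructor is non-explicit.
    std::pmr::polymorphic_allocator<int> alloc = &mr;
    // The same mechanism lets a container take just the resource pointer.
    std::pmr::vector<int> values(&mr);
    values.push_back(42);
    return alloc.resource() == &mr &&
           values.get_allocator().resource() == &mr;
}
```

Whether `thrust::cuda::par(&(m_mr.main))` compiles depends on which overloads Thrust provides for pointer arguments, so the experiment proposed above is still the way to find out.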

beomki-yeo · 2025-02-03T19:52:41Z

Yeah - What a free lunch..

stephenswat added the cuda (Changes related to CUDA), improvement (Improve an existing feature), and performance (Performance-relevant changes) labels Feb 3, 2025

stephenswat requested review from krasznaa, niermann999 and beomki-yeo February 3, 2025 13:40

niermann999 approved these changes Feb 3, 2025

stephenswat force-pushed the feat/thrust_cache branch from bc4d5e5 to 13cd6af Compare February 3, 2025 16:13

stephenswat merged commit 86ac4c9 into acts-project:main Feb 3, 2025
29 checks passed

krasznaa reviewed Feb 3, 2025

krasznaa mentioned this pull request Feb 5, 2025

Asynchronous Thrust, main branch (2025.02.05.) #843

Merged
