Index tracks by `TrackSlotId` #678

esseivaju · 2023-03-10T20:47:35Z

As mentioned in #675, this is an initial implementation of indexing tracks by TrackSlotId instead of ThreadId. Sorting should help adjacent threads work on similar tracks (alive, same particle type, energy, ...).

I added a Collection in CoreStateData that does the thread-to-track-slot lookup. No sorting is done yet: for now the mapping is a one-to-one correspondence.
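As a rough illustration of the lookup described above, here is a minimal host-only sketch. The `ThreadId` and `TrackSlotId` structs and the `TrackSlotLookup` class are hypothetical stand-ins written for this example; they are not the Celeritas `OpaqueId`, `Collection`, or `CoreStateData` types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the ID types: each wraps an index so that
// thread indices and track slot indices cannot be mixed up by accident.
struct ThreadId
{
    std::size_t value;
};
struct TrackSlotId
{
    std::size_t value;
};

// Hypothetical stand-in for the lookup stored in the core state
class TrackSlotLookup
{
  public:
    // Start with the one-to-one mapping: thread i handles track slot i
    explicit TrackSlotLookup(std::size_t num_slots)
    {
        slots_.reserve(num_slots);
        for (std::size_t i = 0; i < num_slots; ++i)
        {
            slots_.push_back(TrackSlotId{i});
        }
    }

    // Single indirection from a thread index to a track slot index
    TrackSlotId operator[](ThreadId tid) const
    {
        assert(tid.value < slots_.size());
        return slots_[tid.value];
    }

    std::size_t size() const { return slots_.size(); }

  private:
    std::vector<TrackSlotId> slots_;
};
```

With this layout, a later change can reorder the stored slots without touching the code that asks "which slot does this thread handle?".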

src/celeritas/global/CoreTrackView.hh

sethrj · 2023-03-16T13:43:46Z

@esseivaju I wonder if a good "stopping point" for this PR is to simply allocate and use the (unsorted) track slot IDs? That might make it easier to tag and experiment with the performance differences between "track ID is implicitly track slot ID", "track ID requires a single (new) indirection", "track IDs are randomized", "track IDs are sorted by alive/not", etc. Or do you think we should do all of those experiments on a single branch and then only merge once we see a consistent performance boost?

esseivaju · 2023-03-16T17:00:57Z

I think we can merge once we use the unsorted track slot IDs, as you suggest. The TrackSlotId abstraction is good to have regardless of whether the experiments show a performance gain.

src/celeritas/global/CoreTrackData.hh

sethrj · 2023-03-20T19:08:30Z

src/corecel/OpaqueId.hh

+//! Multiply an opaque ID by a factor
+template<class V, class S>
+inline CELER_FUNCTION OpaqueId<V, S>
+operator*(OpaqueId<V, S> id, std::make_signed_t<S> factor)


I don't think the semantics here make sense: the best analog to OpaqueId is probably a pointer, so it makes sense to subtract and add and increment and operator[], but not to multiply.
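To make the pointer analogy concrete, here is a small sketch of an ID type that supports only pointer-like operations. `SketchId`, `TrackSlotTag`, and `SketchTrackSlotId` are hypothetical names for this example and do not reproduce the actual `OpaqueId` interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified ID type for illustration (not the real OpaqueId)
template<class Tag>
class SketchId
{
  public:
    using size_type = std::size_t;

    explicit SketchId(size_type value) : value_(value) {}
    size_type get() const { return value_; }

    // Like ++ptr: advance to the next element
    SketchId& operator++()
    {
        ++value_;
        return *this;
    }

  private:
    size_type value_;
};

// Like ptr + n: offset an ID by a signed number of elements
template<class Tag>
SketchId<Tag> operator+(SketchId<Tag> id, std::ptrdiff_t offset)
{
    std::ptrdiff_t result = static_cast<std::ptrdiff_t>(id.get()) + offset;
    assert(result >= 0);
    return SketchId<Tag>{static_cast<std::size_t>(result)};
}

// Like ptr_a - ptr_b: signed distance between two IDs
template<class Tag>
std::ptrdiff_t operator-(SketchId<Tag> a, SketchId<Tag> b)
{
    return static_cast<std::ptrdiff_t>(a.get())
           - static_cast<std::ptrdiff_t>(b.get());
}

// Deliberately absent: id * factor and id + id, which have no pointer analog

// Tag type and alias used only in this sketch
struct TrackSlotTag;
using SketchTrackSlotId = SketchId<TrackSlotTag>;
```

Under these assumptions, `SketchTrackSlotId{2} + 3` yields an ID with value 5, while an expression such as `id * 2` fails to compile.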

esseivaju · 2023-03-20

I agree that if OpaqueId has pointer semantics, these operations don't make sense. If the semantics are more specific (an index into a collection), they could be justified, e.g. to divide the work on a range into chunks for parallel processing, though they still would not be mandatory.

The reason for having them is that the collection CoreStateData::track_slots is an array of OpaqueId, and thrust::sequence does arithmetic on the values to fill the array because it expects numbers:

    // Compile error if we don't have the * and + operators
    // /opt/cuda/11.8.0/include/thrust/system/detail/generic/sequence.inl(65): error: no operator "+" matches these operands
    //   operand types are: const celeritas::TrackSlotId + celeritas::OpaqueId<celeritas::TrackSlot, celeritas::size_type>

    // sequence.inl:61-66
    __thrust_exec_check_disable__
    __host__ __device__
    T operator()(std::size_t i) const
    {
        return init + step * i;
    }

So if we don't want operator*(OpaqueId, <signed_numeric>) and operator+(OpaqueId, OpaqueId) I suppose we can use 989007d
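The requirement can be reproduced without CUDA. The sketch below uses a C++17 detection idiom to check whether a type supports the expression `init + step * i` from the functor quoted above. The trait `supports_sequence_arithmetic` and the type `PointerLikeId` are hypothetical names for this example, and the check only mirrors that one expression, not the rest of Thrust's implementation.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Primary template: by default, assume the expression is not valid
template<class T, class = void>
struct supports_sequence_arithmetic : std::false_type
{
};

// Specialization chosen only when `init + step * i` compiles for
// `const T` operands and a std::size_t index
template<class T>
struct supports_sequence_arithmetic<
    T,
    std::void_t<decltype(std::declval<T const&>()
                         + std::declval<T const&>()
                               * std::declval<std::size_t>())>>
    : std::true_type
{
};

// Hypothetical ID type with pointer-like arithmetic only
struct PointerLikeId
{
    unsigned int value;
};

// Offsetting by an integer is allowed, as with a pointer
inline PointerLikeId operator+(PointerLikeId id, int offset)
{
    return PointerLikeId{id.value + static_cast<unsigned int>(offset)};
}

// Plain integers satisfy the requirement; the pointer-like ID does not,
// because `step * i` has no matching operator
static_assert(supports_sequence_arithmetic<unsigned int>::value,
              "integers support init + step * i");
static_assert(!supports_sequence_arithmetic<PointerLikeId>::value,
              "pointer-like IDs do not support init + step * i");
```

This is why filling an array of ID objects with a generic sequence algorithm needs either extra operators on the ID type or a plain integer element type.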

sethrj · 2023-03-20

OK, I see. I understand the problem now, and it looks like the thrust specification of sequence is incomplete. The implementation at

https://github.com/NVIDIA/thrust/blob/949ac649afc37e9f92d946f267c2390c6ab3daa7/thrust/system/detail/generic/sequence.inl#L57-L69

implicitly requires addition of another instance of the value type and multiplication of the index by a "step".

I'll think about this.

esseivaju · 2023-03-20

One solution could be to define CoreStateData::track_slots as Collection<TrackSlotId::size_type, W, M, ThreadId>, do the sorting/filling on plain numbers, and wrap the value in a TrackSlotId when returning from CoreTrackView::track_slot_id.
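A minimal host-side sketch of this idea follows, assuming a plain `std::vector` in place of the `Collection` and using `std::iota` as the host counterpart of `thrust::sequence`. `StateSketch`, `resize`, and `track_slot_id` are hypothetical simplifications, not the code from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical strongly typed IDs (stand-ins for ThreadId and TrackSlotId)
struct ThreadId
{
    unsigned int value;
};
struct TrackSlotId
{
    using size_type = unsigned int;
    size_type value;
};

// Hypothetical stand-in for the state: plain integers indexed by thread
struct StateSketch
{
    std::vector<TrackSlotId::size_type> track_slots;
};

// Fill with 0, 1, 2, ...; this works because the elements are plain integers
inline void resize(StateSketch& state, std::size_t size)
{
    state.track_slots.resize(size);
    std::iota(state.track_slots.begin(),
              state.track_slots.end(),
              TrackSlotId::size_type{0});
}

// Wrap the stored integer in the ID type only when handing it to the caller
inline TrackSlotId track_slot_id(StateSketch const& state, ThreadId tid)
{
    assert(tid.value < state.track_slots.size());
    return TrackSlotId{state.track_slots[tid.value]};
}
```

Callers still receive a typed ID, so the type safety of the interface is kept while generic numeric algorithms operate on the underlying integers.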

esseivaju · 2023-03-20

This idea is implemented in 0fc32cc. We could also remove the pre/post-increment operators, but I'd say it still makes sense to keep them if we don't care about OpaqueId being immutable.

sethrj · 2023-03-21

Actually, I like your idea and implementation. It might even be necessary when we try to use more complex algorithms (sorting). Let's stick with that.

As an alternative, I was thinking of just using a kernel to set TrackSlotId independently for each thread, but yours is better.

pcanal · 2023-03-21

+1
I guess another way to see it is that the sorting is now done via an iterator type which is distinct from the value type track_slots.
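As an illustration of reordering on plain indices, the sketch below groups slots with alive tracks first, one of the experiments proposed earlier in this conversation. It is not part of this PR, which does no sorting. The function `partition_by_alive` and the per-slot `alive` flags are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using size_type = unsigned int;

// Reorder track slot indices so that slots holding alive tracks come first.
// Only the indices move; the per-slot data (here just `alive`) stays in place.
inline void partition_by_alive(std::vector<size_type>& track_slots,
                               std::vector<bool> const& alive)
{
    // Every stored index must refer to a valid slot
    assert(std::all_of(track_slots.begin(),
                       track_slots.end(),
                       [&alive](size_type slot) { return slot < alive.size(); }));

    // Stable partition keeps the relative order within each group
    std::stable_partition(track_slots.begin(),
                          track_slots.end(),
                          [&alive](size_type slot) { return alive[slot]; });
}
```

Because the algorithm sees only integers, no arithmetic operators are needed on the ID type itself.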

src/corecel/OpaqueId.hh

The collection now maps ThreadId to TrackSlotId::size_type instead of to TrackSlotId. When returning a track slot ID, the CoreTrackView wraps the index in a TrackSlotId. The advantage of this variant is that we no longer need operator+(OpaqueId, OpaqueId) and operator*(OpaqueId, int), which don't make sense for the pointer semantics of OpaqueId.

src/celeritas/global/CoreTrackData.hh

src/corecel/OpaqueId.hh

test/corecel/OpaqueId.test.cc

sethrj

Nice! With this in, we should be able to start experimenting 😄 Good job!

esseivaju · 2023-03-21T17:22:21Z

I added some last-minute formatting. So should we close that PR and open a new one for starting with experimentation?

sethrj · 2023-03-21T20:20:59Z

@esseivaju I've converted #675 into a checklist so we can track the multiple "sorting" aspects together. So let's just proceed from the existing issue :)

Add a track_slots collection in CoreStateData to map ThreadId to `TrackSlotId`

bfe6c3a

esseivaju self-assigned this Mar 10, 2023

esseivaju linked an issue Mar 10, 2023 that may be closed by this pull request

Study performance impact of sorting tracks #675

sethrj changed the title ~~Sorting and indexing tracks by TrackSlotId~~ Indexing tracks by TrackSlotId Mar 16, 2023

sethrj changed the title ~~Indexing tracks by TrackSlotId~~ Index tracks by TrackSlotId Mar 16, 2023

sethrj reviewed Mar 16, 2023

src/celeritas/global/CoreTrackView.hh (outdated, resolved)

Moved initialization of track_slots into resize instead of when the view is created

989007d

esseivaju commented Mar 16, 2023

src/celeritas/global/CoreTrackData.hh (outdated, resolved)

Initialize track_slots using std::iota / thrust::sequence.

58687fd

esseivaju marked this pull request as ready for review March 16, 2023 21:35

esseivaju added 3 commits March 16, 2023 14:48

Add template parameter to inline definition

ef2b182

removed name for unused parameter

b791928

Remove unnecessary header

b7d54a9

sethrj changed the base branch from develop to temp March 18, 2023 12:25

sethrj changed the base branch from temp to develop March 18, 2023 12:25

sethrj reviewed Mar 20, 2023

esseivaju added 4 commits March 20, 2023 14:47

Merge branch 'develop' into thread-to-trackslot

23aaec4

add postfix increment for OpaqueId, updated OpaqueId tests

48cc9d6

Use TEST_EQ

5d57957

sethrj reviewed Mar 21, 2023

src/celeritas/global/CoreTrackData.hh (outdated, resolved)

src/corecel/OpaqueId.hh (resolved)

test/corecel/OpaqueId.test.cc (resolved)

doc, update OpaqueId iota test, typename for track_slots

c245779

sethrj approved these changes Mar 21, 2023

formatting

934d4c1

sethrj added the core (Software engineering infrastructure) and minor (Minor internal changes or fixes) labels Mar 21, 2023

sethrj mentioned this pull request Mar 21, 2023

Study performance impact of sorting tracks #675

sethrj removed a link to an issue Mar 21, 2023

Study performance impact of sorting tracks #675

sethrj merged commit 5e3c593 into celeritas-project:develop Mar 21, 2023

esseivaju deleted the thread-to-trackslot branch May 12, 2023 14:27

sethrj added the performance (Changes for performance optimization) label and removed the core (Software engineering infrastructure) label Nov 14, 2023
