
Make CKF propagation a 1:1 kernel #732

Merged
merged 1 commit into acts-project:main on Oct 15, 2024

Conversation

@stephenswat (Member) commented:

Right now, we have two big kernels in the CKF. The first is `find_tracks`, which takes $N$ tracks and turns them into $M$ tracks, where $M$ can be smaller than, equal to, or greater than $N$. The second is the propagation kernel, which takes $N$ tracks and turns them into $M$ tracks, where $M$ is equal to or smaller than $N$, but never greater.

The problem with kernels where the number of outputs differs from the number of inputs is that you need to insert the elements into the output atomically, and in doing so you lose the ordering of your inputs.
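
To make that concrete, the pattern being replaced looks roughly like the sketch below. This is not the actual traccc code; `track_state`, `propagate_step`, and the kernel itself are illustrative stand-ins.

```cuda
// Illustrative placeholder type; the real traccc types are more involved.
struct track_state {
    float params[6];  // bound track parameters (placeholder)
};

// Stand-in for the actual propagation step; returns whether the track is
// still alive afterwards. Trivial body just so the sketch compiles.
__device__ bool propagate_step(track_state& trk) {
    trk.params[0] += 1.f;
    return trk.params[0] < 100.f;
}

// N -> M pattern: every surviving track reserves an output slot through an
// atomic counter, so its position in the output depends on thread
// scheduling rather than on its position in the input.
__global__ void propagate_compacting(const track_state* in_tracks,
                                     unsigned int n_in,
                                     track_state* out_tracks,
                                     unsigned int* out_count) {
    const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_in) {
        return;
    }

    track_state trk = in_tracks[i];
    if (propagate_step(trk)) {
        // Correct, but the slot index has no relation to i: the input
        // ordering is lost and can differ from run to run.
        const unsigned int slot = atomicAdd(out_count, 1u);
        out_tracks[slot] = trk;
    }
}
```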

In this PR, I make propagation a 1:1 kernel, i.e. the size of the output is always equal to the size of the input. I do this by keeping track of which tracks are still alive.
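
Schematically, the 1:1 version writes each output at the same index as its input and records liveness in a separate flag instead of compacting. Again a rough sketch, reusing the illustrative placeholders from above; `alive_flags` is an assumed name.

```cuda
// 1:1 pattern: output i always corresponds to input i; a per-track flag
// records whether the track is still alive. Downstream kernels skip dead
// entries, or a later (order-preserving) compaction can remove them.
__global__ void propagate_one_to_one(const track_state* in_tracks,
                                     unsigned int n_in,
                                     track_state* out_tracks,
                                     char* alive_flags) {
    const unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_in) {
        return;
    }

    track_state trk = in_tracks[i];
    const bool alive = propagate_step(trk);

    out_tracks[i] = trk;             // same index as the input
    alive_flags[i] = alive ? 1 : 0;  // liveness replaces compaction
}
```

Because index `i` never changes, the ordering of the inputs survives the kernel, which is what point 3 below relies on.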

This has the following benefits:

  1. It makes the code easier to understand, as there is a bit less indirection.
  2. It makes the code faster, although only by a small ~5% margin.
  3. It promotes future optimization by preserving the order of tracks, i.e. we can perhaps try to do sorting only at the start.
  4. This model will be useful for the proposed non-combinatorial KF.

@stephenswat added the `refactor` (Change the structure of the code) and `cuda` (Changes related to CUDA) labels on Oct 7, 2024
@beomki-yeo (Contributor) left a comment:

Looks good to me, but I do not get how this works in the apply interaction kernel.

@beomki-yeo (Contributor) left a comment:

I think this should go in for the marginal performance gain.

@krasznaa (Member) left a comment:

As far as I can tell... 🤔

@stephenswat merged commit 8b96143 into acts-project:main on Oct 15, 2024
23 checks passed