Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

traccc::sycl::full_chain_algorithm Fixes, main branch (2024.11.12.) #769

Merged

Conversation

krasznaa
Copy link
Member

As reported by @flg, the SYCL throughput applications have been crashing since some time. 😦

With SYCL it is paramount that the underlying ::sycl::queue object would be deleted after all objects have stopped using it. (Most importantly, after all allocated memory has been freed.) So now I made this super explicit.

This reminded me that the clusterization algorithm also manages some memory itself. So that algorithm needs to be very manually deleted as well. 🤔

At the same time did some general maintenance as well, to make the code just a little bit better.

@krasznaa krasznaa added bug Something isn't working sycl Changes related to SYCL labels Nov 12, 2024
@flg
Copy link
Contributor

flg commented Nov 12, 2024

Hi Attila,

Since you've also done some general maintenance I have a small question/suggestion.

If I understand correctly a local sycl::queue is created in full_chain_algorithm constructor that is never used:

::sycl::queue q(handle_async_error);

If I'm not wrong it should be removed and more importantly the following should wait on the m_data->m_queue not this local one since m_copy will use m_data->m_queue:

Or is this local queue really used for something?

Copy link
Member

@stephenswat stephenswat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is not the way to do this... We walked into this in #664; the solution here is to remove the destructors, not make them more complicated. The C++ standard guarantees that members are destroyed in reverse declaration order anyway, so I don't see what these destructors bring us.

@krasznaa krasznaa force-pushed the SYCLFullChainFixes-main-20241112 branch from 481bad1 to 576daaa Compare November 12, 2024 12:00
@krasznaa
Copy link
Member Author

The code could indeed be simplified. 🤔

In a separate PR I'll add unit tests that would run some of our example applications. As we should really test it in our CI that the applications work to some basic level at least... 🤔

@krasznaa
Copy link
Member Author

Hi Attila,

Since you've also done some general maintenance I have a small question/suggestion.

If I understand correctly a local sycl::queue is created in full_chain_algorithm constructor that is never used:

::sycl::queue q(handle_async_error);

If I'm not wrong it should be removed and more importantly the following should wait on the m_data->m_queue not this local one since m_copy will use m_data->m_queue:

Or is this local queue really used for something?

Yepp, I saw that too. Didn't search the history for how that code happened, but indeed it was doing some silly things. The PR now removes the erroneous lines.

Copy link
Member

@stephenswat stephenswat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

With SYCL it is paramount that the underlying ::sycl::queue object
would be deleted after all objects have stopped using it. (Most
importantly, after all allocated memory has been freed.)
@krasznaa krasznaa force-pushed the SYCLFullChainFixes-main-20241112 branch from 576daaa to 0e123e5 Compare November 12, 2024 14:04
@stephenswat stephenswat merged commit 3f571a6 into acts-project:main Nov 12, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working sycl Changes related to SYCL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants