Fix description of `sycl::free` #758

Pennycook · 2025-02-27T14:38:27Z

This PR makes three changes, which I've split into three commits to simplify review.

The description of sycl::free is migrated to the new format.
An unnecessary statement about "waiting" is removed. This statement was already covered by (and in fact, inconsistent with!) sycl::free having undefined behavior in the case where sycl::free is called while a command is in-progress. Whether an implementation waits for in-progress commands or not doesn't matter, because that usage is invalid.
I added a non-normative note to highlight that sycl::free is not guaranteed to be blocking or non-blocking. The old wording implied that sycl::free was always guaranteed to be non-blocking, but this isn't the case for existing implementations.

The previous wording that memory is deallocated "without waiting for commands operating on it to be completed" is unnecessary, because calling free with such a command in-progress is already defined as undefined behavior.

The previous wording about "not waiting" could have been read as a guarantee that calls to sycl::free are always non-blocking. In practice, calls to sycl::free may need to block until the device can satisfy the deallocation request (depending on implementation details tied to specific backends). Although the new wording cannot be read this way, adding a non-normative note may help developers to understand that they should not rely on the behavior of specific implementations.

TApplencourt · 2025-02-27T15:10:42Z

Thanks from moving to the new format!

The old wording implied that sycl::free was always guaranteed to be non-blocking, but this isn't the case for existing implementations.

What do you mean?

cudaFreeis blocking for example, so I suspect all the cuda back to be blocking.
And the fact that DPCPP may use zeMemFree and not the new zeMemFreeExt with the ZE_DRIVER_MEMORY_FREE_POLICY_EXT_FLAG_BLOCKING_FREE is an implementation detail.

So I'm not really sure buy removing the non-blocking. It seems to be a breaking change; who may break codes (at least
conceptually)

Pennycook · 2025-02-27T15:18:46Z

What do you mean?

I mean that the specification didn't actually say "sycl::free is non-blocking" -- it just said that calling sycl::free while a command was executing is undefined behavior.

So I'm not really sure buy removing the non-blocking. It seems to be a breaking change; who may break codes (at least conceptually)

I'm not sure what you mean.

You said that existing CUDA backends are probably blocking (and I agree). That's what I meant about the behavior of existing implementations: there are existing implementations of SYCL that implement sycl::free as blocking.

TApplencourt · 2025-02-27T15:24:49Z

Nevermind... I misread; I was thinking before the spec said it was always blocking,but we implied the opposite!

Which like you said, is totally incorrect. Some implementations are blocking (and making them nonblocking was more or less impossible previously).

So,UB is a good clarification.

We can discuss if we want that to be Blocking in another PR.

Sorry!

This patch fixes undefined behavior when freeing memory that might still be in use. Using regular zeMemFree (as was done before this change) is unsafe. L0 spec says that for zeMemFree, the application must ensure the device is not referencing memory before it is freed. SYCL sets 'indirect access' flag for every kernel. This means that each kernel can potentially access any memory and hence it is unsafe to free any memory allocation during kernel execution (if that kernel was submitted at some point in time when that allocation was alive). This replaces the 'indirect access tracking' mechanism used in the legacy adapter (see https://github.com/intel/llvm/blob/4c9b19bdd8b4b95b865522d583b6252bda301d98/unified-runtime/source/adapters/level_zero/context.hpp#L153) Related clarification in the SYCL spec: KhronosGroup/SYCL-Docs#758

Pennycook added 3 commits February 27, 2025 14:10

Migrate sycl::free table to new apidef format

75b39a0

Remove unnecessary statement about free "waiting"

0464d59

The previous wording that memory is deallocated "without waiting for commands operating on it to be completed" is unnecessary, because calling free with such a command in-progress is already defined as undefined behavior.

Pennycook added the bug Something isn't working label Feb 27, 2025

Pennycook added this to the SYCL 2020 milestone Feb 27, 2025

Pennycook requested a review from gmlueck February 27, 2025 14:38

sycl-issue-bot bot mentioned this pull request Feb 27, 2025

[Spec change] Fix description of sycl::free KhronosGroup/SYCL-CTS#1060

Open

TApplencourt approved these changes Feb 27, 2025

View reviewed changes

bashbaug mentioned this pull request Feb 28, 2025

add cl_khr_unified_svm extension KhronosGroup/OpenCL-Docs#1282

Open

gmlueck approved these changes Feb 28, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix description of `sycl::free` #758

Fix description of `sycl::free` #758

Pennycook commented Feb 27, 2025 •

edited

Loading

TApplencourt commented Feb 27, 2025 •

edited

Loading

Pennycook commented Feb 27, 2025

TApplencourt commented Feb 27, 2025

Fix description of sycl::free #758

Are you sure you want to change the base?

Fix description of sycl::free #758

Conversation

Pennycook commented Feb 27, 2025 • edited Loading

TApplencourt commented Feb 27, 2025 • edited Loading

Pennycook commented Feb 27, 2025

TApplencourt commented Feb 27, 2025

Fix description of `sycl::free` #758

Fix description of `sycl::free` #758

Pennycook commented Feb 27, 2025 •

edited

Loading

TApplencourt commented Feb 27, 2025 •

edited

Loading