Skip to content

[OpenMP][Offload] Update the Logic for Configuring Auto Zero-Copy #143638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 11, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 7 additions & 7 deletions offload/libomptarget/PluginManager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -286,16 +286,16 @@ void PluginManager::registerLib(__tgt_bin_desc *Desc) {
}
PM->RTLsMtx.unlock();

bool UseAutoZeroCopy = Plugins.size() > 0;
bool UseAutoZeroCopy = false;

auto ExclusiveDevicesAccessor = getExclusiveDevicesAccessor();
for (const auto &Device : *ExclusiveDevicesAccessor)
UseAutoZeroCopy &= Device->useAutoZeroCopy();
// APUs are homogeneous set of GPUs. Check the first device for
// configuring Auto Zero-Copy.
if (ExclusiveDevicesAccessor->size() > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird, this is all devices right? What if someone had another GPU on it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since an APU is a homogeneous set of GPUs, so another GPU should also be same kind by design. so we can enable zero-copy if the first device suggest to do so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but this is all devices. It could be an NVIDIA GPU on some external PCI-e thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with Kewen in that we do not have a case where one of the devices is an APU and the others could be something else. You could build it yourself perhaps, but it doesn't exist today as a product. If time comes when there will be such a case, we can revisit this. Reminder: MI300A is a very special case where each socket GPU devices appear as a single GPU and all sockets are identical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate this scenario? IIUC, GPU on PCI-e is dGPU?

auto &Device = *(*ExclusiveDevicesAccessor)[0];
UseAutoZeroCopy = Device.useAutoZeroCopy();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to check for APU?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plug-in determines whether this is a case of auto zero copy and in case you f the amdgpu plugin it does so by verifying that the node is an APU. We do not abstract the concept of APU, which is vendor specific, to the top level libomptarget.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think useAutoZeroCopy is sufficient to return the correct info we need here (even if it is not an apu).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you have a system that has both AMD and GPU cards on it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what mentioned in Carlo's reply. We'd be happy to revisit in the future if needed. Also, this change would work better for our future changes on device initialization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd be happy to revisit in the future if needed.

This doesn't sound like a good idea TBH but it is probably fine, since this piece of code will only affect AMDGPU's plugin if I understand correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not have a case where one of the devices is an APU and the others could be something else. You could build it yourself perhaps, but it doesn't exist today as a product. If time comes when there will be such a case, we can revisit this.

We could have a system with mix of GPUs in theory, but we didn't see that in product in real life. Perhaps we will see a produce of that design in the future :) We could make update accordingly when it happens. Thanks for the comments!

}

// Auto Zero-Copy can only be currently triggered when the system is an
// homogeneous APU architecture without attached discrete GPUs.
// If all devices suggest to use it, change requirement flags to trigger
// zero-copy behavior when mapping memory.
if (UseAutoZeroCopy)
addRequirements(OMPX_REQ_AUTO_ZERO_COPY);

Expand Down
Loading