[UR][CUDA] Fix device to device copies in... #21277

Draft
kswiecicki wants to merge 1 commit into intel:sycl from kswiecicki:ur-cuda-urUSMContextMemcpyExp-known-failure
@kswiecicki
Contributor

urUSMContextMemcpyExp function. According to the CUDA documentation, cuMemcpy doesn't synchronize with the host for device-to-device copies.

@kswiecicki
Contributor Author

kswiecicki commented Feb 12, 2026

This is an attempt to fix issue #19688, which causes sporadic CI failures, e.g. https://github.com/intel/llvm/actions/runs/21866529301/job/63258410809?pr=21251.

@kswiecicki kswiecicki force-pushed the ur-cuda-urUSMContextMemcpyExp-known-failure branch from a8396ce to f58242e Compare February 16, 2026 09:05
void *pDst,
const void *pSrc,
size_t Size) {
UR_APIEXPORT ur_result_t UR_APICALL urUSMContextMemcpyExp(
Contributor

This is not a performance-critical function. Can we just always synchronize at the end? Would make the code simpler.
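
For illustration, the "always synchronize at the end" shape could look roughly like the sketch below. This is not the PR's code: `DriverHooks` and `contextMemcpyAlwaysSync` are hypothetical names, and the two hooks stand in for the driver calls so the control flow can be followed without a GPU. In the adapter, the copy hook would presumably wrap `cuMemcpy` and the synchronize hook a host-blocking call such as `cuCtxSynchronize`; which call is appropriate is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-ins for the driver calls. In the adapter, Copy would
// wrap cuMemcpy and Synchronize would wrap a host-blocking call such as
// cuCtxSynchronize (assumption: the PR's actual choice is not shown here).
struct DriverHooks {
  std::function<int(void *, const void *, std::size_t)> Copy;
  std::function<int()> Synchronize;
};

// Issues the copy, then unconditionally calls Synchronize, without checking
// whether the pointers refer to host or device memory.
// Returns 0 on success or the first nonzero error code.
int contextMemcpyAlwaysSync(const DriverHooks &Hooks, void *Dst,
                            const void *Src, std::size_t Size) {
  assert(Hooks.Copy && Hooks.Synchronize && "both hooks must be set");
  if (int Err = Hooks.Copy(Dst, Src, Size))
    return Err; // Skip the synchronization if the copy itself failed.
  return Hooks.Synchronize();
}
```

The trade-off is one extra host-side wait on paths where the copy is already synchronous, in exchange for dropping the pointer-type check entirely.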

2 participants