
[SYCL][CUDA] Fixes active context when creating base event #1447


Merged: 1 commit, Apr 2, 2020


steffenlarsen
Contributor

The top of the context stack was being restored before the base event was created, so the event could be created using the wrong context. This PR fixes this problem.

Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
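
The ordering problem described above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: the diff is not shown in this conversation, so `ContextStack`, `ScopedContext`, `Event`, `createEvent`, and both `makeBaseEvent*` functions are hypothetical stand-ins rather than the plugin's actual code, and plain integers stand in for CUDA contexts.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: an int identifies a context, and a vector models the
// per-thread context stack. None of these names come from the real plugin.
using Context = int;

struct ContextStack {
  std::vector<Context> stack;
  void push(Context c) { stack.push_back(c); }
  void pop() { stack.pop_back(); }
  // Requires a non-empty stack.
  Context current() const { return stack.back(); }
};

// An event remembers whichever context was active when it was created.
struct Event {
  Context owner;
};

Event createEvent(const ContextStack &s) { return Event{s.current()}; }

// RAII guard: makes `c` the active context for the guard's lifetime and
// restores the previous top of the stack on destruction.
class ScopedContext {
public:
  ScopedContext(ContextStack &s, Context c) : stack_(s) { stack_.push(c); }
  ~ScopedContext() { stack_.pop(); }
  ScopedContext(const ScopedContext &) = delete;
  ScopedContext &operator=(const ScopedContext &) = delete;

private:
  ContextStack &stack_;
};

// Problematic ordering: the previous context is restored first, so the
// base event is created while the old context is active.
Event makeBaseEventBuggy(ContextStack &s, Context newCtx) {
  {
    ScopedContext active(s, newCtx);
    // ... other setup that needs newCtx would go here ...
  } // previous top of the stack is restored here
  return createEvent(s);
}

// Corrected ordering: the base event is created while newCtx is still
// active; the previous context is restored only afterwards.
Event makeBaseEventFixed(ContextStack &s, Context newCtx) {
  ScopedContext active(s, newCtx);
  return createEvent(s);
}
```

In this model, calling `makeBaseEventBuggy(s, 2)` on a stack whose top is context 1 yields an event owned by context 1, while `makeBaseEventFixed(s, 2)` yields one owned by context 2. In both cases the stack is left with context 1 on top.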

@bader added the cuda (CUDA back-end) label Apr 1, 2020
@bader requested a review from Ruyk April 1, 2020 13:49
Contributor

@Ruyk left a comment


Minor comment to clarify the code, thanks!

Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
@steffenlarsen force-pushed the steffen/context-base-event-fix branch from f849437 to 0894af7 April 1, 2020 15:26
@bader merged commit 6788713 into intel:sycl Apr 2, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 6, 2020
…_private_api

* origin/sycl: (614 commits)
  [SYCL][Doc] Update prerequisites in GetStartedGuide (intel#1466)
  [SYCL][USM] Remove vestigial dead code (intel#1474)
  [SYCL-PTX] Fix __spirv_GroupAsyncCopy stride computation (intel#1451)
  [Driver][SYCL] Emit an error if c compilation is forced (intel#1438)
  [SYCL] Fix sycl-post-link when no split and symbols are requested. (intel#1454)
  [SYCL] Change priority of devices in default_selector (intel#1264)
  [CI] Update CODEOWNERS matching rules order (intel#1468)
  [SYCL] Share PFWG lambda object through shared memory (intel#1455)
  [CI] Fix CODEOWNERS file syntax (intel#1464)
  [SYCL][CUDA] Fix active context when creating base event (intel#1447)
  [SYCL] Diagnose implicit declaration of kernel function type (intel#1450)
  [BuildBot] Modify configure script (intel#1421)
  [SYCL] Resolve min/max conflict (intel#1339)
  [CI][BuildBot] Fix configure parameter to turn on/off assertions (intel#1449)
  [SYCL] XFAIL LIT test due to duplicate diagnostic
  [SYCL] Remove explicit sycl_device attribute requirement
  Apply more suggestions
  Apply suggestions
  Translate new set of Intel FPGA Loop Controls
  Translate Intel FPGA force_pow2_depth memory attribute
  ...
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 15, 2020
…c_abi_checks

* origin/sycl: (625 commits)
  [SYCL][Test] Disable spec_const_redefine.cpp on all devices but HOST (intel#1488)
  [SYCL] Only export public API (intel#1456)
  [SYCL][CUDA] Fix selected_binary argument in piextDeviceSelectBinary (intel#1475)
  [SYCL] Enable LIT testing with CUDA BE (intel#1458)
  [SYCL] Fix float to half-type conversion (intel#1395)
  [NFC] Cleanup unneded macro from builtins implementation (intel#1445)
  Enable cfg-printer LLVM lit tests only if LLVM linked statically (intel#1479)
  [SYCL][NFC] Reflect the "allowlist" renaming in the code (intel#1480)
  [SYCL][Doc] Update prerequisites in GetStartedGuide (intel#1466)
  [SYCL][USM] Remove vestigial dead code (intel#1474)
  [SYCL-PTX] Fix __spirv_GroupAsyncCopy stride computation (intel#1451)
  [Driver][SYCL] Emit an error if c compilation is forced (intel#1438)
  [SYCL] Fix sycl-post-link when no split and symbols are requested. (intel#1454)
  [SYCL] Change priority of devices in default_selector (intel#1264)
  [CI] Update CODEOWNERS matching rules order (intel#1468)
  [SYCL] Share PFWG lambda object through shared memory (intel#1455)
  [CI] Fix CODEOWNERS file syntax (intel#1464)
  [SYCL][CUDA] Fix active context when creating base event (intel#1447)
  [SYCL] Diagnose implicit declaration of kernel function type (intel#1450)
  [BuildBot] Modify configure script (intel#1421)
  ...
vmaksimo added a commit to vmaksimo/llvm that referenced this pull request Mar 22, 2022
…tel#1447)

It is done in order to avoid llvm.memcpy -> llvm.memset transformation which can
be an incorrect assumption.

SPIR-V does not have a memset instruction, there is a direct mapping only for
`llvm.memcpy` to `OpCopyMemory*`.
To handle `llvm.memset`, the input LLVM IR is lowered by `SPIRVRegularizeLLVM`
pass. The mentioned optimization tried to restore the memset instruction but it
surely does not cover all the possible cases (e.g., memcpy is not a memset inst).

Possibly, the number of follow-up fixes for different corner cases can prove
that the optimization is not absolutely correct, and it's better to rely on
regularize pass.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@cebaf79