Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 15 additions & 10 deletions clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -945,24 +945,29 @@ static void addBackendOptions(const ArgList &Args,
if (IsCPU) {
OptC.split(CmdArgs, " ", /*MaxSplit=*/-1, /*KeepEmpty=*/false);
} else {
// ocloc -options args need to be comma separated, e.g. `-options
// "-g,-cl-opt-disable"`. Otherwise, only the first arg is processed by
// ocloc as an arg for -options, and the rest are processed as standalone
// flags, possibly leading to errors.
// ocloc -options takes arguments in the form of '-options "-g
// -cl-opt-disable"' where each argument is separated with spaces.
// split function here returns a pair with everything before the separator
// ("-options") in the first member of the pair, and everything after the
// separator in the second part of the pair. The separator is not included
// in any of them.
auto [BeforeOptions, AfterOptions] = OptC.split("-options ");
// Only add if not empty, an empty arg can lead to ocloc errors.
if (!BeforeOptions.empty())
CmdArgs.push_back(BeforeOptions);
if (!BeforeOptions.empty()) {
SmallVector<StringRef, 8> BeforeArgs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also just wondering, i see one of your commits is titled says 'revert nick's PR' but i don't remember writing that code so if it is referring to me do you mind linking the PR being reverted, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely! I previously applied the changes directly from your PR #15121 (this PR is currently closed) to enable the new offloading model and verify the CI test results. After confirming that both fp64-conv-emu-1.cpp and fp64-conv-emu-2.cpp passed in CI with the new offloading model enabled, I reverted the changes. If my understanding is correct, I think we would enable the new offloading model changes after the other SYCL E2E test failures are resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, resolving all SYCL E2E test failures with new offloading model is necessary to enable new model by default, but it is not the only condition.
I think as soon as SYCL E2E pass, we want to enable regular testing (pre-commit? post-commit? nightly?) with new offloading model to make sure no regressions.
Or we might want to even enable it now and mark not-passing tests as XFAIL.
Thoughts?

Copy link
Contributor

@sarnex sarnex Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we should add it to the nightly, but yeah IMO we can do it now if someone has bandwidth do to it. i definitely do not but im happy to help. We need the job to pass though, so we ideally would have a LIT variable so we can XFAIL the failing ones, or more crudely we coukd have some file with a list of either passing or failing tests, in that case we could use LIT_FILTER or LIT_FILTER_OUT

btw that pr isn't from me but doesnt matter either way :P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will ask someone to add it to nightly. I think we should do it similar to how spirv_backend is being tested, likely with a LIT variable.

BeforeOptions.split(BeforeArgs, " ", /*MaxSplit=*/-1,
/*KeepEmpty=*/false);
for (const auto &string : BeforeArgs) {
CmdArgs.push_back(string);
}
}
if (!AfterOptions.empty()) {
// Separator not included by the split function, so explicitly added here.
CmdArgs.push_back("-options");
std::string Replace = AfterOptions.str();
std::replace(Replace.begin(), Replace.end(), ' ', ',');
CmdArgs.push_back(Args.MakeArgString(Replace));
// Split the options string by spaces and rejoin to normalize whitespace
SmallVector<StringRef, 8> AfterArgs;
AfterOptions.split(AfterArgs, " ", /*MaxSplit=*/-1, /*KeepEmpty=*/false);
std::string JoinedOptions = llvm::join(AfterArgs, " ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this pass all tests? It was similar to this before and I had to add the , because it was causing trouble with ocloc. I added a specific test for that, so if it is passing, then I'm good.

Copy link
Contributor Author

@YixingZhang007 YixingZhang007 Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests that were previously failing for the clang-linker-wrapper issue (fp64-conv-emu-1.cpp and fp64-conv-emu-2.cpp) are now passing. The tests currently failing in CI (such as https://github.com/intel/llvm/actions/runs/18854382403/job/53800150157) for the new offloading model are unrelated to this issue and will be addressed in future PRs. I am wondering which one is the test you mentioned adding for this issue? I'm not sure if you're referring to https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/SYCLBIN/simple_kernel_aot; this test is passing with the current changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YixingZhang007 , did you run entire SYCL E2E with new offloading model and your changes? Comparing with Justin's result, any regressions, or only 2 more passes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Yury! Yes, I have run the entire SYCL E2E tests with new offloading model and my change (the result can also found at https://github.com/intel/llvm/actions/runs/18854382403/job/53800150157 , this is the CI before removing the new offloading model changes from this PR). Comparing with Justin's result, I haven't seen any regression and the 2 tests (fp64-conv-emu-1.cpp and fp64-conv-emu-2.cpp) are passing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, that addresses my concern.

CmdArgs.push_back(Args.MakeArgString(JoinedOptions));
}
}
StringRef OptL =
Expand Down
8 changes: 7 additions & 1 deletion sycl/test-e2e/OptionalKernelFeatures/fp64-conv-emu-1.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,13 @@
// UNSUPPORTED: cuda, hip
// UNSUPPORTED-INTENDED: FP64 emulation is an Intel specific feature.

// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu %O0 %s -o %t.out
// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --no-offload-new-driver %O0 %s -o %t.out
// RUN: %{run} %t.out
Comment on lines +20 to +21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: do we want to modify this test that way by adding 2 command lines with and without offload new driver? It looks like that way we would modify a lot of tests introducing command lines like that and then when we switch to new offload driver by default, we would need to clean up all that.
I'm thinking would not it be better to just run entire SYCL E2E with new offload driver enabled, and mark currently failing tests as XFAIL? and when we are ready, we would just flip the switch and stop new offload driver testing (since it would become a default)
That way we would not need to modify all tests like that...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for:

  • adding entire SYCL E2E with new offload driver enabled as nightly and marking failing tests as XFAIL
  • remove modification of tests in this PR, which are adding new-offload-driver/old-offload-driver behavior
  • keep -g addition, but using default offload driver (which is old currently)


// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --offload-new-driver %O0 %s -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also try with -g? The code you're changing used to have issues with -g, so just to be on the safe side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes work with '-g' and I have also added a new test command in this file for running the test with -g. Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continue my previous comment: we can keep -g test, but just removing --offload-new-driver.

Copy link
Contributor Author

@YixingZhang007 YixingZhang007 Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestion!
This test is currently testing three cases:

  1. old offloading model
  2. new offloading model
  3. new offloading model and -g

Do you mean that we should replace the third case to be "old offloading model and -g"?

Copy link
Contributor

@YuriPlyakhin YuriPlyakhin Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original test was testing 1 case:

  1. default offloading model

I suggest to add one more case, so result will be:

  1. default offloading model
  2. default offloading model and -g

We should not distinguish between new and old offloading model in this test. Instead, we should just add new entire SYCL E2E testing with new offloading model enabled by option for all tests and of course it should be separate PR.

If you still have questions, feel free to ping me in teams and we can discuss.

// RUN: %{run} %t.out

// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --offload-new-driver -g %O0 %s -o %t.out
// RUN: %{run} %t.out

// Tests that aspect::fp64 is not emitted correctly when -fsycl-fp64-conv-emu
Expand Down
20 changes: 17 additions & 3 deletions sycl/test-e2e/OptionalKernelFeatures/fp64-conv-emu-2.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,13 @@
// UNSUPPORTED: cuda, hip
// UNSUPPORTED-INTENDED: FP64 emulation is an Intel specific feature.

// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu %O0 %s -o %t.out
// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --no-offload-new-driver %O0 %s -o %t.out
// RUN: %{run} %t.out

// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --offload-new-driver %O0 %s -o %t.out
// RUN: %{run} %t.out

// RUN: %clangxx -fsycl -fsycl-targets=intel_gpu_dg2_g10,intel_gpu_dg2_g11,intel_gpu_dg2_g12,intel_gpu_pvc,intel_gpu_mtl_h,intel_gpu_mtl_u -fsycl-fp64-conv-emu --offload-new-driver -g %O0 %s -o %t.out
// RUN: %{run} %t.out

#include <sycl/detail/core.hpp>
Expand Down Expand Up @@ -61,8 +67,16 @@ int main() {
nfail += test<Increment<long>>(q);
nfail += test<Increment<float>>(q);

if (q.get_device().has(aspect::fp64))
nfail += test<Increment<double>>(q);
// This test is currently disabled because it requires the -ze-fp64-gen-emu
// IGC option to run FP64 arithmetic operations. The -fsycl-fp64-conv-emu flag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry why are we seeing this failure now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out! We are seeing the following failure with this test when we enable the new offloading model (after applying changes from #15121). The error generated is as below.

Double arithmetic operation is not supported on this platform with FP64 conversion emulation mode (poison FP64 kernels is disabled).
in kernel: 'typeinfo name for int test<Increment<double>>(sycl::_V1::queue&)::'lambda'(sycl::_V1::handler&)::operator()(sycl::_V1::handler&) const::'lambda'()'

I think this error occurs because this test performs arithmetic operations with double-precision variables, which should only be permitted when the -ze-fp64-gen-emu IGC option is not enabled. The currently enabled -ze-fp64-gen-conv-emu option (activated by -fsycl-fp64-conv-emu flag that we passed into clang) should only allow FP64 conversions, not FP64 computations.

Copy link
Contributor

@YuriPlyakhin YuriPlyakhin Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I have the same question, and I still do not understand: how does this test pass with old offloading model then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to figure out how the old offloading model is able to pass this, but I have checked that the old offloading model only has -ze-fp64-gen-conv-emu passed into ocloc which should not be sufficient. I think this failure for the old offloading model did not get captured for some issue that would need more investigation (I can definitely look more into this if needed). But I have tried passing -ze-fp64-gen-emu into the ocloc command while running the test with the new offloading model, and this helps solve the issue, and also according to the previous documentation and PRs (such as #13912), I think the option -ze-fp64-gen-emu is required for this test to have arithmetic operations on double numbers. Thank you :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to figure out how the test is passing, before disabling anything in the code.
Currently the test is passing, but with your change, this scenario will be excluded from testing for the old offloading model, which doesn't look good to me.

Copy link
Contributor

@sarnex sarnex Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new/old offloading model shouldnt change the content of any IR, it should just change the tools called (which shouldn't effect IR content either) and the arguments to the tools, so it may be easy to see if there is some difference in args to ocloc or different image compile/link flags or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! For sure! I will investigate why errors are not raised when double-precision arithmetic operations are executed with only the fp64-gen-conv-emu option enabled.

// only enables the -ze-fp64-gen-conv-emu IGC option, which provides partial
// FP64 emulation limited to kernels with FP64 conversions but no FP64
// computations.
// TODO: Implement support for a new flag, -fsycl-fp64-gen-emu, which will
// enable the use of the -ze-fp64-gen-emu IGC option.
// if (q.get_device().has(aspect::fp64)) {
// nfail += test<Increment<double>>(q);
// }

nfail += test<IntCastThenIncrement<double>>(q);

Expand Down
Loading