Verification results differ across vendors' GPUs #16636
Comments
Hi @jinz2014, thanks for the report. Could you please also attach
Reproduced, but on another Intel GPU I got different results:
I assume that the ported program in SYCL matches the CUDA/HIP programs. I tried Syclomatic (Intel(R) DPC++ Compatibility Tool version 2025.0.0), but building the migrated files was not successful.
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Silver 4410T OpenCL 3.0 (Build 0) [2024.18.12.0.05_160000]
Platforms: 2
I found that the verification results differ when a value falls at a rounding midpoint. For example, the result before rounding evaluates to 63.5... on the host and to 63.499996 on the device. After rounding we get 64 vs 63, and the test fails. Floating-point math has different precision on the host (precise) than on the device. The device satisfies the following requirements: https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#relative-error-as-ulps. We recently added a new compiler option that can improve fdiv accuracy on the device (#15836). Without this option I see failing tests on Intel GPU hardware:
With the -foffload-fp32-prec-div option I see passing results on the same system:
@jinz2014, could you please verify whether this new feature helps with the problem on your side?
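The midpoint effect described above can be reproduced on the host alone. The following is a minimal sketch, not code from quantVLLM: `round_to_int8` and `one_ulp_below` are hypothetical helpers, and round-half-away-from-zero with saturation to the int8 range is an assumed quantization rule. It shows that a quotient only one ULP below 63.5 already rounds to a different integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: round a quotient to the nearest integer
// (halves away from zero) and saturate to the int8 range.
inline std::int8_t round_to_int8(float quotient) {
    float r = std::round(quotient);
    r = std::fmin(127.0f, std::fmax(-128.0f, r));
    return static_cast<std::int8_t>(r);
}

// Hypothetical helper: the largest float strictly below v.
// This stands in for a device division that is one ULP less
// accurate than the host division.
inline float one_ulp_below(float v) {
    assert(std::isfinite(v));
    return std::nextafter(v, -std::numeric_limits<float>::infinity());
}
```

Calling `round_to_int8(63.5f)` and `round_to_int8(one_ulp_below(63.5f))` gives two different integers, although both inputs are within the accuracy the OpenCL specification allows for single-precision division.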
Thank you very much for your answer.
icpx 2025.0 with NVIDIA/AMD plugins:
The verification of the SYCL program in https://github.com/zjin-lcf/HeCBench/tree/master/src/quantVLLM-sycl may show some issues.
Intel Max GPU:
./main 4096 5137 1000
Input type is FP16
PASS
Input type is BF16
FAIL
Input type is FP32
PASS
NVIDIA/AMD GPU:
FAIL for all three data types
The CUDA and HIP programs run successfully on the NVIDIA and AMD GPUs.
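If a verification step compares quantized integer outputs element by element (an assumption about how the check works, not confirmed in this thread), one possible way to make it robust to off-by-one rounding differences between host and device is to allow a small tolerance. The sketch below uses a hypothetical function `count_mismatches` with a hypothetical parameter `tol`. It is not the benchmark's actual verification code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical verifier: count the elements whose host and device
// quantized values differ by more than `tol` quantization steps.
inline std::size_t count_mismatches(const std::vector<std::int8_t>& host,
                                    const std::vector<std::int8_t>& device,
                                    int tol = 1) {
    assert(host.size() == device.size());
    std::size_t bad = 0;
    for (std::size_t i = 0; i < host.size(); ++i) {
        // Widen to int so the subtraction cannot overflow int8.
        int diff = std::abs(static_cast<int>(host[i]) -
                            static_cast<int>(device[i]));
        if (diff > tol) ++bad;
    }
    return bad;
}
```

With `tol = 0` this reduces to an exact comparison. With `tol = 1` it accepts results that differ only because a quotient landed on opposite sides of a rounding midpoint. Whether such a tolerance is acceptable depends on the application.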