15 changes: 12 additions & 3 deletions sycl/include/sycl/ext/oneapi/experimental/clock.hpp
@@ -26,9 +26,18 @@ enum class clock_scope : int {
namespace detail {
template <clock_scope Scope> inline uint64_t clock_impl() {
#ifdef __SYCL_DEVICE_ONLY__
#if defined(__NVPTX__) || defined(__AMDGCN__)
// Currently clock() is not supported on NVPTX and AMDGCN.
return 0;
// here note that __builtin_readcyclecounter is used as fallback.
// this is due to potential higher overhead compared to a native API call
// see : https://github.com/ROCm/ROCm/issues/1288
#if defined(__NVPTX__)
if constexpr (Scope == work_group || Scope == sub_group) {
Contributor

@KornevNikita, Feb 16, 2026


Suggested change
- if constexpr (Scope == work_group || Scope == sub_group) {
+ if constexpr (Scope == clock_scope::work_group || Scope == clock_scope::sub_group) {

Note: do not apply this as is. clang-format will fail because lines must be at most 80 characters long.

Contributor


Probably something like this, wrapped to fit the 80-character limit:

if constexpr (Scope == clock_scope::work_group ||
              Scope == clock_scope::sub_group) {
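
For context on why the qualification matters: `clock_scope` is an `enum class`, so its enumerators are only reachable as `clock_scope::work_group` and so on, and the unqualified form should fail to compile. Below is a minimal host-side sketch of the same compile-time dispatch. The two `*_stub` functions and the `clock_impl_sketch` name are hypothetical stand-ins for the device intrinsics and for `clock_impl`, and the enumerator values are assumed for illustration rather than copied from the header.

```cpp
#include <cstdint>

// Stand-in for the scoped enum in clock.hpp. The enumerator values are an
// assumption for illustration; check the header for the real ones.
enum class clock_scope : int { device = 1, work_group = 2, sub_group = 3 };

// Hypothetical stubs replacing the device-only intrinsics so the sketch
// compiles on a host compiler. They return fixed tags, not real cycle counts.
constexpr std::uint64_t native_counter_stub() { return 1; }
constexpr std::uint64_t fallback_counter_stub() { return 2; }

template <clock_scope Scope> constexpr std::uint64_t clock_impl_sketch() {
  // Unqualified `work_group` would not compile here: enumerators of an
  // `enum class` are only visible through the enum's name.
  if constexpr (Scope == clock_scope::work_group ||
                Scope == clock_scope::sub_group) {
    return native_counter_stub();
  } else {
    return fallback_counter_stub();
  }
}

// Compile-time checks that each scope selects the expected branch.
static_assert(clock_impl_sketch<clock_scope::work_group>() == 1);
static_assert(clock_impl_sketch<clock_scope::sub_group>() == 1);
static_assert(clock_impl_sketch<clock_scope::device>() == 2);
```

The sketch needs C++17 for `if constexpr`. Because the branch is chosen at compile time, only the selected `return` is instantiated for a given `Scope`.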

return __nvvm_read_ptx_sreg_clock64();
} else {
return __builtin_readcyclecounter();
}
#elif defined(__AMDGCN__)
// No direct variant of clock() is currently implemented for AMDGCN
return __builtin_readcyclecounter();
#else
return __spirv_ReadClockKHR(static_cast<int>(Scope));
#endif // defined(__NVPTX__) || defined(__AMDGCN__)