[SYCL] __grid_constant__  #7998

Closed
@zjin-lcf

Is this CUDA optimization feature available for SYCL?

If the address of a __global__ function parameter is taken, the compiler will ordinarily make a copy of the kernel parameter in thread local memory and use the address of the copy, to partially support C++ semantics, which allow each thread to modify its own local copy of function parameters. Annotating a __global__ function parameter with __grid_constant__ ensures that the compiler will not create a copy of the kernel parameter in thread local memory, but will instead use the generic address of the parameter itself. Avoiding the local copy may result in improved performance.

#include <cuda.h>

struct S {
  char4 a;
  int x;
};

__device__ void unknown_function(S const&) {}

__global__ void kernel(const __grid_constant__ S s) {
   // s.x += threadIdx.x;  // Undefined Behavior: tried to modify read-only memory

   // Compiler will _not_ create a per-thread thread local copy of "s":
   unknown_function(s);
}

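int main() {
  S car{};                  // value-initialize so the kernel argument is well-defined
  kernel<<<1,1>>>(car);
  cudaDeviceSynchronize();  // wait for the kernel to finish before exiting
  return 0;
}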
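
For contrast, here is a minimal sketch that places the same kernel body with and without the annotation side by side. The names `read_x`, `kernel_plain`, and `kernel_grid_constant` are hypothetical and not part of the original example. It assumes a device of compute capability 7.0 or newer and a build along the lines of `nvcc -arch=sm_70`. Both kernels should compute the same value; the only expected difference is whether the compiler makes a thread-local copy of `s`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct S {
  char4 a;
  int x;
};

// Hypothetical helper: it takes the parameter by reference, so the caller
// has to provide an address for it. __noinline__ keeps the call from being
// optimized away.
__device__ __noinline__ int read_x(S const& s) { return s.x; }

// Without the annotation: the compiler may copy "s" into thread-local
// memory and pass the address of that copy.
__global__ void kernel_plain(const S s, int* out) {
  out[threadIdx.x] = read_x(s) + threadIdx.x;
}

// With the annotation: the generic address of the kernel parameter itself
// is passed, and "s" must be treated as read-only.
__global__ void kernel_grid_constant(const __grid_constant__ S s, int* out) {
  out[threadIdx.x] = read_x(s) + threadIdx.x;
}

int main() {
  S s{};
  s.x = 40;

  // Managed memory keeps the sketch short; one result slot per kernel.
  int* out = nullptr;
  if (cudaMallocManaged(&out, 2 * sizeof(int)) != cudaSuccess) return 1;

  kernel_plain<<<1, 1>>>(s, out);
  kernel_grid_constant<<<1, 1>>>(s, out + 1);
  if (cudaDeviceSynchronize() != cudaSuccess) return 1;

  std::printf("plain: %d, grid_constant: %d\n", out[0], out[1]);
  cudaFree(out);
  return 0;
}
```

Comparing the PTX generated for the two kernels (for example with `nvcc -ptx`) is one way to check whether the local copy is actually avoided on a given toolchain.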

Reference
https://docs.nvidia.com/cuda/cuda-c-programming-guide/#grid-constant

Labels: cuda (CUDA back-end), enhancement (New feature or request)