Skip to content
This repository was archived by the owner on Mar 21, 2024. It is now read-only.

CUDA Tuple Structured Binding Declaration Broken #316

Closed
jdwapman opened this issue Sep 20, 2022 · 4 comments · Fixed by #317
Closed

CUDA Tuple Structured Binding Declaration Broken #316

jdwapman opened this issue Sep 20, 2022 · 4 comments · Fixed by #317

Comments

@jdwapman
Copy link

I'm running into issues where cuda::std::tuple does not seem to support structured binding declarations. Is this a feature that should work but is broken? Is this unsupported for cuda::std::tuple specifically?

Note that the normal std::tuple supports this cpp17 feature, even in device code. If I comment out the cuda_tuple_kernel and its calls in the main() function, the code compiles without issues.

Code:

#include <cuda/std/tuple>
#include <tuple>

#define CHECK_CUDA(cmd)                                  \
  do {                                                   \
    cudaError_t res = (cmd);                             \
    if (res != cudaSuccess) {                            \
      fprintf(stderr, "CUDA: %s = %d (%s)\n", #cmd, res, \
              cudaGetErrorString(res));                  \
      abort();                                           \
    }                                                    \
  } while (0)

// This works
__global__ void std_tuple_kernel()
{
  std::tuple<bool, float> my_tup = std::make_tuple(true, 1.0f);

  printf("Direct access: %d, %f\n", std::get<0>(my_tup), std::get<1>(my_tup));

  auto [first, second] = my_tup;

  printf("Structured binding: %d, %f\n", first, second);
}

// This fails to compile
__global__ void cuda_tuple_kernel()
{

  cuda::std::tuple<bool, float> my_tup = cuda::std::make_tuple(true, 1.0f);

  printf("Direct access: %d, %f\n", cuda::std::get<0>(my_tup), cuda::std::get<1>(my_tup));

  auto [first, second] = my_tup;

  printf("Structured binding: %d, %f\n", first, second);
}

int main()
{

  std_tuple_kernel<<<1, 1>>>();
  CHECK_CUDA(cudaDeviceSynchronize());

  cuda_tuple_kernel<<<1, 1>>>();
  CHECK_CUDA(cudaDeviceSynchronize());
}

Compile command:

nvcc tuple_issues.cu --expt-relaxed-constexpr -std=c++17 -gencode=arch=compute_80,code=compute_80

Compile error:

tuple_issues.cu(22): error: cannot bind to non-public member "cuda::std::__4::tuple<_Tp...>::__base_ [with _Tp=<__nv_bool, float>]"

System Info:
GPU: A100
nvcc: 11.7.64
g++: 9.4.0
OS: Ubuntu 20 LTS

@jrhemstad
Copy link
Collaborator

Hm, confirmed: https://godbolt.org/z/YPdYdaGTW

@wmaxey is this a known problem? It looks like our structured binding tests for tuple are commented out:

__host__ __device__ void test_decomp_tuple() {
typedef cuda::std::tuple<int> T;
// Possible compiler bug?
/*
{
T s{99};
auto [m1] = s;
auto& [r1] = s;
assert(m1 == 99);
assert(&r1 == &cuda::std::get<0>(s));
}
{
T const s{99};
auto [m1] = s;
auto& [r1] = s;
assert(m1 == 99);
assert(&r1 == &cuda::std::get<0>(s));
}
*/

The comment indicates a possible compiler bug as well.

@miscco you've been looking at tuple stuff lately, so you may be able to help look into this as well.

@jdwapman
Copy link
Author

jdwapman commented Sep 20, 2022

Interesting, I've been running into general std::tuple compile errors in nvcc as well. For example:

template <typename T>
class TestTupleMember
{

public:
    // remove constexpr to fail compile on nvcc >= 11.7)
    constexpr __host__ __device__ TestTupleMember(T _data)
    {
        data = _data;
    }

    T data;
};

__global__ void make_tuple_kernel()
{
    auto my_tup = std::make_tuple(TestTupleMember(1), TestTupleMember(2));

    auto [first, second] = my_tup;

    printf("Structured binding: %d, %d\n", first.data, second.data);
}

The above code fails to compile unless I add a constexpr in front of the __host__ statement, but only for 11.7 and up. 11.6.2 compiles successfully. I know this might be out of scope for libcudacxx, but maybe it helps track down the compiler issue.

@jdwapman
Copy link
Author

Note that this succeeds if I use std::make_tuple with a basic type. It's only a class or struct that fails to compile.

@miscco
Copy link
Collaborator

miscco commented Sep 21, 2022

Hm, confirmed: https://godbolt.org/z/YPdYdaGTW

@wmaxey is this a known problem? It looks like our structured binding tests for tuple are commented out:

__host__ __device__ void test_decomp_tuple() {
typedef cuda::std::tuple<int> T;
// Possible compiler bug?
/*
{
T s{99};
auto [m1] = s;
auto& [r1] = s;
assert(m1 == 99);
assert(&r1 == &cuda::std::get<0>(s));
}
{
T const s{99};
auto [m1] = s;
auto& [r1] = s;
assert(m1 == 99);
assert(&r1 == &cuda::std::get<0>(s));
}
*/

The comment indicates a possible compiler bug as well.

@miscco you've been looking at tuple stuff lately, so you may be able to help look into this as well.

So this is a "I should consider gardening" moment.

The issue at hand is that structured bindings only work when the respective tuple machinery is declared in namespace std.

You can have a look at this here https://godbolt.org/z/h438MMfhM

miscco added a commit to miscco/libcudacxx that referenced this issue Sep 21, 2022
Currently structured bindings for `cuda::std::tuple` and `cuda::std::array` was broken.

The reason for that is that the standard requires, that the specializations of `tuple_size` and `tuple_element` reside in namespace std. whereas our specializations resided in namespace `cuda::std`

Work around that by pulling those specializations into namespace std too.

Fixes CUDA Tuple Structured Binding Declaration Broken NVIDIA#316
@miscco miscco linked a pull request Sep 21, 2022 that will close this issue
miscco added a commit to miscco/libcudacxx that referenced this issue Sep 22, 2022
Currently structured bindings for `cuda::std::tuple` and `cuda::std::array` was broken.

The reason for that is that the standard requires, that the specializations of `tuple_size` and `tuple_element` reside in namespace std. whereas our specializations resided in namespace `cuda::std`

Work around that by pulling those specializations into namespace std too.

Fixes CUDA Tuple Structured Binding Declaration Broken NVIDIA#316
miscco added a commit to miscco/libcudacxx that referenced this issue Sep 22, 2022
Currently structured bindings for `cuda::std::tuple` and `cuda::std::array` was broken.

The reason for that is that the standard requires, that the specializations of `tuple_size` and `tuple_element` reside in namespace std. whereas our specializations resided in namespace `cuda::std`

Work around that by pulling those specializations into namespace std too.

Fixes CUDA Tuple Structured Binding Declaration Broken NVIDIA#316
miscco added a commit to miscco/libcudacxx that referenced this issue Oct 14, 2022
Currently structured bindings for `cuda::std::tuple` and `cuda::std::array` was broken.

The reason for that is that the standard requires, that the specializations of `tuple_size` and `tuple_element` reside in namespace std. whereas our specializations resided in namespace `cuda::std`

Work around that by pulling those specializations into namespace std too.

Fixes CUDA Tuple Structured Binding Declaration Broken NVIDIA#316
@miscco miscco moved this to Done in CCCL Nov 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants