This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Expose support for <iterator> and <array> #243

Merged
merged 28 commits from feature/array into main
Feb 19, 2022

Conversation

wmaxey
Member

@wmaxey wmaxey commented Jan 21, 2022

<cuda/std/array> + <cuda/std/iterator> - Statically sized arrays and iterators.

Adds support for cuda::std::array and cuda::std::iterator.

Details

Motivation

  • Users often need statically sized arrays in CUDA and frequently implement their own array<T, N> class.
  • --expt-relaxed-constexpr is required to use std::array in device code.
  • Users on NVRTC have even fewer options.

Impact

  • Exposes <cuda/std/array> and <cuda/std/iterator> headers.
  • Silently exposes parts of <algorithm> as an implementation detail; full support will come later.
  • No test coverage impact.
  • Exceptions are disabled; invalid accesses abort on host and trap on device.

Checklists

Testing

  • Bug fixes have regression tests that would reproduce the bug.
  • New features should have correctness tests to validate behavior.
  • Benchmarks have been added to monitor performance of new features. (not required)

@wmaxey wmaxey added this to the 1.8.0 milestone Jan 21, 2022
@wmaxey wmaxey requested review from griwes and jrhemstad January 21, 2022 01:50
@wmaxey wmaxey self-assigned this Jan 21, 2022
@wmaxey
Member Author

wmaxey commented Jan 21, 2022

Closes #51

@wmaxey
Member Author

wmaxey commented Jan 21, 2022

@griwes / @jrhemstad If anyone wants to take a particular look at the changes for initializer_list: I think the current solution is brutally hacky. Potentially the best we can do is add an implicit constructor for cuda::std::initializer_list, but I'm uncertain whether we want to add something so confusing. It's honestly the same issue as tuple_size: we can't avoid conflicting with the host standard library.

@wmaxey
Member Author

wmaxey commented Jan 21, 2022

@wmaxey wmaxey added the "testing: internal ci in progress" label (Currently testing on internal NVIDIA CI (DVS)) Jan 21, 2022
@jrhemstad
Collaborator

@griwes / @jrhemstad If anyone wants to take a particular look at changes for initializer_list I think the current solution is brutally hacky.

Can you summarize the problem/solution? I'm guessing something to do with std::initializer_list gets special treatment from the compiler that a "user-defined" cuda::std::initializer_list won't get?

#include "iterator"
#include "utility"

#include <initializer_list>
Member Author


@jrhemstad This is mostly what I was talking about. We can't use cuda::std::initializer_list due to the implicit construction of a std::initializer_list.

The compiler may insert something like the code below, and if we are really set on using a cuda::std::initializer_list we would need to add an implicit conversion, which makes 'our' initializer_list behave weirdly: https://gcc.godbolt.org/z/EG739eEdP

http://eel.is/c++draft/dcl.init.list#5

Collaborator


Yeah, I don't see how we're going to get around this. initializer_list is a type with special rules that the compiler follows, which we can't really emulate with cuda::std::initializer_list.

@@ -1749,14 +1749,14 @@ reverse_iterator<_Tp*> rend(_Tp (&__array)[_Np])

template <class _Ep>
_LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_CONSTEXPR_AFTER_CXX14
reverse_iterator<const _Ep*> rbegin(initializer_list<_Ep> __il)
reverse_iterator<const _Ep*> rbegin(::std::initializer_list<_Ep> __il)
Member Author


@jrhemstad hacks here.

Collaborator


This seems fine to me.

@wmaxey wmaxey added the "testing: internal ci passed" label (Passed internal NVIDIA CI (DVS)) and removed the "testing: internal ci in progress" label Feb 3, 2022
@wmaxey wmaxey requested a review from jrhemstad February 3, 2022 07:40
@wmaxey wmaxey changed the title WIP: Expose support for <iterator> and <array> Expose support for <iterator> and <array> Feb 3, 2022
@wmaxey wmaxey force-pushed the feature/array branch 2 times, most recently from cfa48ad to 312aaf0 Compare February 8, 2022 05:33
@@ -7,7 +7,7 @@ root_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)
stdlib_headers=$(<"${root_dir}/maintenance/stdlib-headers")
header_replacements=$(echo "${stdlib_headers}" | sed 's#<\(.*\)>#-e s:<\1>:<cuda/std/\1>:g#')

find "${root_dir}/test" -name "*.cpp" |
find "${root_dir}/test/std/iterators" -name "*.cpp" |
Collaborator


I don't think you meant to keep this ;>

Member Author


This should be fixed now.

@@ -13,7 +13,7 @@
/*
algorithm synopsis

#include <initializer_list>
#include <::std::initializer_list>
Collaborator


Tell me you've just run search+replace without telling me... :D

Member Author


Also fixed. >.>

@@ -3302,6 +3336,7 @@ partition_copy(_InputIterator __first, _InputIterator __last,
// partition_point

template<class _ForwardIterator, class _Predicate>
_LIBCUDACXX_INLINE_VISIBILITY
Collaborator


We probably also want _LIBCUDACXX_EXECUTION_SPACE_SPECIFIER for the ones here and above? I assume this is where you invented the new macro.

Member Author


I'm wary of doing that: when building source files in libcxx there are places where some extra ABI flags are set that may matter. The ones I've marked are mainly internal and inline.

Putting _LIBCUDACXX_INLINE_VISIBILITY on those causes them to fail to build due to the linkage flags.

e.g.

#if __has_attribute(internal_linkage)
#  define _LIBCUDACXX_INTERNAL_LINKAGE __attribute__ ((internal_linkage))
#else
#  define _LIBCUDACXX_INTERNAL_LINKAGE _LIBCUDACXX_ALWAYS_INLINE
#endif


_Tp __elems_[_Size];

// No explicit construct/copy/destroy for aggregate type
_LIBCUDACXX_INLINE_VISIBILITY void fill(const value_type& __u) {
_CUDA_VSTD::fill_n(__elems_, _Size, __u);
fill_n(__elems_, _Size, __u);
Collaborator


This (and swap_ranges below) are now ADL calls, we need to keep them qualified.

Member Author


I've restored the _CUDA_VSTD:: qualification on this and below.

@@ -714,7 +714,7 @@ public:
_LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_CONSTEXPR_AFTER_CXX14
reference operator*() const {_Iter __tmp = current; return *--__tmp;}
_LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_CONSTEXPR_AFTER_CXX14
pointer operator->() const {return _CUDA_VSTD::addressof(operator*());}
pointer operator->() const {return addressof(operator*());}
Collaborator


Needs to be qualified to avoid ADL.

@@ -1168,7 +1168,7 @@ public:
_LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_CONSTEXPR_AFTER_CXX14
move_iterator& operator-=(difference_type __n) {__i -= __n; return *this;}
_LIBCUDACXX_INLINE_VISIBILITY _LIBCUDACXX_CONSTEXPR_AFTER_CXX14
reference operator[](difference_type __n) const { return static_cast<reference>(__i[__n]); }
reference operator[](difference_type __n) const { return move(__i[__n]); }
Collaborator


Ditto.


Comment on lines 234 to 238
NV_IF_ELSE_TARGET(
NV_IS_DEVICE,
__trap();,
_CUDA_VSTD::abort();
)
Collaborator


Somewhere above you've made a macro for this, though using if/else on __CUDA_ARCH__ instead of if-target. Should we replace that with what's in here, and just use the macro here to avoid repeating this construct?

Member Author


Which macro was this?

Member Author


Replaced with _LIBCUDACXX_UNREACHABLE

@wmaxey wmaxey force-pushed the feature/array branch 3 times, most recently from 8804caf to fe8f048 Compare February 18, 2022 19:58
@wmaxey wmaxey merged commit b4762a7 into main Feb 19, 2022
@wmaxey wmaxey deleted the feature/array branch February 19, 2022 00:40