A few improvements for internal macro documentation (#3554)

fbusato · web-flow · commit cdd94ad21598 · 2025-01-28T09:24:08.000+01:00
* add MSVC 2017 deprecation
* add ARM64 emulation note
* document NVRTC behavior for OS macros
* add FP8 identification macro
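
For readers unfamiliar with the function-like query style these docs describe (``_CCCL_OS(LINUX)``, ``_CCCL_HAS_FP8()``), the following is a minimal, self-contained sketch of how such a macro can be built. The ``DEMO_*`` names are hypothetical stand-ins invented for illustration; this is not CCCL's implementation, and the OS detection here is deliberately simplified.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only. The real _CCCL_* macros live
// in CCCL's internal headers and may be implemented differently.
#if defined(_WIN32)
#  define DEMO_OS_WINDOWS_() 1
#  define DEMO_OS_LINUX_() 0
#elif defined(__linux__)
#  define DEMO_OS_WINDOWS_() 0
#  define DEMO_OS_LINUX_() 1
#else
#  define DEMO_OS_WINDOWS_() 0
#  define DEMO_OS_LINUX_() 0
#endif

// Function-like query: DEMO_OS(LINUX) is token-pasted into DEMO_OS_LINUX_().
#define DEMO_OS(X) DEMO_OS_##X##_()

// The query works inside preprocessor conditionals...
inline const char* demo_os_name()
{
#if DEMO_OS(WINDOWS)
  return "windows";
#elif DEMO_OS(LINUX)
  return "linux";
#else
  return "other";
#endif
}

// ...and in ordinary constant expressions.
constexpr bool demo_is_windows = DEMO_OS(WINDOWS);
constexpr bool demo_is_linux   = DEMO_OS(LINUX);
static_assert(!(demo_is_windows && demo_is_linux), "at most one OS matches");

// Runtime consistency check between the two ways of using the query.
inline void demo_check()
{
  assert(demo_os_name()[0] == (demo_is_windows ? 'w' : demo_is_linux ? 'l' : 'o'));
}
```

A side effect of this pattern is that a misspelled query in an ``#if`` (for example ``DEMO_OS(LINUXX)``) fails to preprocess instead of silently evaluating to 0, which is what an undefined object-like macro would do.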
diff --git a/docs/cccl_development/macro.rst b/docs/cccl_development/macro.rst
@@ -12,21 +12,21 @@ Compiler Macros
 
 **Host compiler macros**:
 
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(CLANG)``    | Clang                          |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(GCC)``      | GCC                            |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(NVHPC)``    | Nvidia HPC compiler            |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(MSVC)``     | Microsoft Visual Studio        |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(MSVC2017)`` | Microsoft Visual Studio 2017   |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(MSVC2019)`` | Microsoft Visual Studio 2019   |
-+------------------------------+--------------------------------+
-| ``_CCCL_COMPILER(MSVC2022)`` | Microsoft Visual Studio 2022   |
-+------------------------------+--------------------------------+
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(CLANG)``    | Clang                                       |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(GCC)``      | GCC                                         |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(NVHPC)``    | Nvidia HPC compiler                         |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(MSVC)``     | Microsoft Visual Studio                     |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(MSVC2017)`` | Microsoft Visual Studio 2017 (deprecated)   |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(MSVC2019)`` | Microsoft Visual Studio 2019                |
++------------------------------+---------------------------------------------+
+| ``_CCCL_COMPILER(MSVC2022)`` | Microsoft Visual Studio 2022                |
++------------------------------+---------------------------------------------+
 
 The ``_CCCL_COMPILER`` function-like macro can also be used to check the version of a compiler.
 
@@ -68,11 +68,11 @@ The ``_CCCL_CUDA_COMPILER`` function-like macro can also be used to check the ve
 
 **PTX macros**:
 
-+-------------------------+-------------------------------------------------------------------------------------------------------------------+
-| ``_CCCL_PTX_ARCH``      | Alias of ``__CUDA_ARCH__`` with value equal to 0 if cuda compiler is not available                                |
-+-------------------------+-------------------------------------------------------------------------------------------------------------------+
-| ``__cccl_ptx_isa``      | PTX ISA version available with the current CUDA compiler, e.g. PTX ISA 8.4 (``840``) is available from CUDA 12.4  |
-+-------------------------+-------------------------------------------------------------------------------------------------------------------+
++--------------------+-------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_PTX_ARCH`` | Alias of ``__CUDA_ARCH__`` with value equal to 0 if cuda compiler is not available                                |
++--------------------+-------------------------------------------------------------------------------------------------------------------+
+| ``__cccl_ptx_isa`` | PTX ISA version available with the current CUDA compiler, e.g. PTX ISA 8.4 (``840``) is available from CUDA 12.4  |
++--------------------+-------------------------------------------------------------------------------------------------------------------+
 
 ----
 
@@ -81,26 +81,26 @@ Architecture Macros
 
 The following macros are used to check the target architecture. They comply with the compiler supported by the CUDA toolkit. Compilers outside the CUDA toolkit may define such macros in a different way.
 
-+-------------------------+-------------------------------------+
-| ``_CCCL_ARCH(ARM64)``   |  ARM 64-bit                         |
-+-------------------------+-------------------------------------+
-| ``_CCCL_ARCH(X86_64)``  |  X86 64-bit                         |
-+-------------------------+-------------------------------------+
++-------------------------+---------------------------------------------------+
+| ``_CCCL_ARCH(ARM64)``   |  ARM 64-bit, including MSVC emulation             |
++-------------------------+---------------------------------------------------+
+| ``_CCCL_ARCH(X86_64)``  |  X86 64-bit. False on ARM 64-bit MSVC emulation   |
++-------------------------+---------------------------------------------------+
 
 ----
 
 OS Macros
 ---------
 
-+-----------------------+---------+
-| ``_CCCL_OS(WINDOWS)`` | Windows |
-+-----------------------+---------+
-| ``_CCCL_OS(LINUX)``   | Linux   |
-+-----------------------+---------+
-| ``_CCCL_OS(ANDROID)`` | Android |
-+-----------------------+---------+
-| ``_CCCL_OS(QNX)``     | QNX     |
-+-----------------------+---------+
++-----------------------+---------------------------------+
+| ``_CCCL_OS(WINDOWS)`` | Windows, including NVRTC LLP64  |
++-----------------------+---------------------------------+
+| ``_CCCL_OS(LINUX)``   | Linux, including NVRTC LP64     |
++-----------------------+---------------------------------+
+| ``_CCCL_OS(ANDROID)`` | Android                         |
++-----------------------+---------------------------------+
+| ``_CCCL_OS(QNX)``     | QNX                             |
++-----------------------+---------------------------------+
 
 ----
 
@@ -131,6 +131,8 @@ In addition, ``_CCCL_EXEC_CHECK_DISABLE`` disables the execution space check for
 
 Possible ``TARGET`` values:
 
++---------------------------+-------------------------------------------------------------------+
+| ``NV_ANY_TARGET``         | Any target                                                        |
 +---------------------------+-------------------------------------------------------------------+
 | ``NV_IS_HOST``            | Host-code target                                                  |
 +---------------------------+-------------------------------------------------------------------+
@@ -159,30 +161,41 @@ Usage example:
 
 ----
 
-CUDA Extension Macros
----------------------
-
-**CUDA attributes**:
+CUDA attributes
+---------------
 
 +------------------------------+----------------------------------------------------------+
 | ``_CCCL_GRID_CONSTANT``      | Grid constant kernel parameter                           |
 +------------------------------+----------------------------------------------------------+
 | ``_CCCL_GLOBAL_CONSTANT``    | Host/device global scope constant (``inline constexpr``) |
 +------------------------------+----------------------------------------------------------+
 
-**Extended floating-point types**:
+----
 
-+------------------------------+-----------------------------------------------------------------------------------------------------------------+
-| ``_CCCL_HAS_NVFP16``         | `__half/__half2` data types are supported and enabled. Prefer over ``__CUDA_FP16_TYPES_EXIST__``                |
-+------------------------------+-----------------------------------------------------------------------------------------------------------------+
-| ``_CCCL_HAS_NVBF16``         | `__nv_bfloat16/__nv_bfloat162` data types are supported and enabled.  Prefer over ``__CUDA_BF16_TYPES_EXIST__`` |
-+------------------------------+-----------------------------------------------------------------------------------------------------------------+
+Non-standard Types Support
+--------------------------
 
-+------------------------------+----------------------------------------------------------------+
-| ``_LIBCUDACXX_HAS_NVFP16``   | `__half/__half2` host/device support  (CUDA 12.2)              |
-+------------------------------+----------------------------------------------------------------+
-| ``_LIBCUDACXX_HAS_NVBF16``   | `__nv_bfloat16/__nv_bfloat162` host/device support (CUDA 12.2) |
-+------------------------------+----------------------------------------------------------------+
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_HAS_INT128()``       | ``__int128`` and ``__uint128_t`` for 128-bit integer are supported and enabled                                                |
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_HAS_FLOAT128()``     | ``__float128`` for 128-bit floating-point are supported and enabled                                                           |
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_HAS_NVFP16``         | ``__half/__half2`` data types are supported and enabled. Prefer over ``__CUDA_FP16_TYPES_EXIST__``                            |
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_HAS_NVBF16``         | ``__nv_bfloat16/__nv_bfloat162`` data types are supported and enabled.  Prefer over ``__CUDA_BF16_TYPES_EXIST__``             |
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+| ``_CCCL_HAS_FP8()``          | ``__nv_fp8_e5m2/__nv_fp8_e4m3/__nv_fp8_e8m0`` data types are supported and enabled.  Prefer over ``__CUDA_FP8_TYPES_EXIST__`` |
++------------------------------+-------------------------------------------------------------------------------------------------------------------------------+
+
++------------------------------+-------------------------------------------------------------------------+
+| ``_CCCL_DISABLE_INT128``     | Disable ``__int128/__uint128_t`` support                                |
++------------------------------+-------------------------------------------------------------------------+
+| ``_CCCL_DISABLE_FLOAT128``   | Disable ``__float128`` support                                          |
++------------------------------+-------------------------------------------------------------------------+
+| ``_LIBCUDACXX_HAS_NVFP16``   | ``__half/__half2`` host/device are supported  (CUDA 12.2+)              |
++------------------------------+-------------------------------------------------------------------------+
+| ``_LIBCUDACXX_HAS_NVBF16``   | ``__nv_bfloat16/__nv_bfloat162`` host/device are supported (CUDA 12.2+) |
++------------------------------+-------------------------------------------------------------------------+
 
 ----
 
@@ -244,6 +257,8 @@ Usage example:
 
 **Portable attributes**:
 
++----------------------------------+------------------------------------------------------------------------------+
+| ``_CCCL_ASSUME()``               | Portable ``[[assume]]`` attribute (before C++23)                             |
 +----------------------------------+------------------------------------------------------------------------------+
 | ``_CCCL_FALLTHROUGH()``          | Portable ``[[fallthrough]]`` attribute (before C++17)                        |
 +----------------------------------+------------------------------------------------------------------------------+
diff --git a/docs/cpp.rst b/docs/cpp.rst
@@ -11,7 +11,7 @@ CUDA C++ Core Libraries
    CUB <https://nvidia.github.io/cccl/cub/>
    Thrust <https://nvidia.github.io/cccl/thrust/>
    Cuda Experimental <https://nvidia.github.io/cccl/cudax/>
-   CCCL development <cccl_development/index>
+   CCCL Development <cccl_development/index>
 
 Welcome to the CUDA Core Compute Libraries (CCCL) libraries for C++.