CUDA nvcc error in func_common.inl(442) #530

Squelsh · 2016-07-24T22:21:57Z

Hi!
After updating my Ubuntu, I get an nvcc error when compiling my code that uses GLM:

../detail/func_common.inl(442): error: expected a field name

The problem can be solved by introducing an additional temporary variable, i.e. by replacing

    // mod
    template <typename genType>
    GLM_FUNC_QUALIFIER genType mod(genType x, genType y)
    {
        return mod(tvec1<genType>(x), y).x;
    }

with

    // mod
    template <typename genType>
    GLM_FUNC_QUALIFIER genType mod(genType x, genType y)
    {
        tvec1<genType> ret_vec(mod(tvec1<genType>(x), y));
        return ret_vec.x;
    }
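
For readers without GLM at hand, the two forms can be compared in a stand-alone sketch. The `vec1` struct and the function names `mod_direct` and `mod_workaround` are hypothetical stand-ins introduced only for illustration; they are not GLM's definitions, and the sketch does not claim to reproduce the nvcc error. It assumes the GLSL-style definition of mod, `x - y * floor(x / y)`.

```cpp
// Stand-alone illustration. vec1, mod_direct and mod_workaround are
// hypothetical stand-ins, not GLM's actual definitions.
#include <cmath>

template <typename T>
struct vec1 {
    T x;
    explicit vec1(T v) : x(v) {}
};

// Vector/scalar overload: GLSL-style mod, x - y * floor(x / y).
template <typename T>
vec1<T> mod(vec1<T> const& v, T y) {
    return vec1<T>(v.x - y * std::floor(v.x / y));
}

// Original form: member access directly on the returned temporary.
// This is valid C++; it mirrors the line nvcc 7.5.17 rejected in GLM.
template <typename T>
T mod_direct(T x, T y) {
    return mod(vec1<T>(x), y).x;
}

// Workaround form: bind the result to a named variable first,
// then read the member from that variable.
template <typename T>
T mod_workaround(T x, T y) {
    vec1<T> ret_vec(mod(vec1<T>(x), y));
    return ret_vec.x;
}
```

On a conforming compiler both functions compute the same value, so the workaround only changes what the compiler front end has to parse, not the result.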

Any suggestions how to make it compile without patching GLM?

My system:
Ubuntu 16.04.1
CUDA 7.5 with nvcc V7.5.17
Ubuntu glm package (libglm-dev:i386 (= 0.9.7.2-1))
The problem also occurs with the latest code from master.

Thanks,
Andreas


Groovounet · 2016-08-06T17:42:19Z

This issue is fixed with your proposed workaround... another annoying Cuda compiler bug!

Thanks for contributing,
Christophe

API breaking changes:
- GpuVoxels is now a singleton and has to be initialized
- BitVoxelMeaning enum changed! 0 = Free, 1 = Occupied. More SV IDs.
- Added as() operator to cast general maps into specific maps
- Map-Offsets may now be negative, so Voxel-pointers changed datatype.
- Shifted main API from general map type to specific implementations: Many functions can no longer be called on basic map types but only on specific maps. As not all map types offer all interfaces, this allowed us to remove unimplemented function stubs (thanks to Herbert Pietrzyk)
- RPY rotation order changed to ROS standards: First rotated around roll, then pitch, then yaw

Major changes:
- Unified map-locking for all map types to guarantee thread-safety
- Added new Pointcloud class for single clouds (thanks to Herbert Pietrzyk)
- Added Octree-API function: insertPointCloudWithFreespaceCalculation to trigger raycasting
- Added option to interpret unknown cells of an octree as obstacles when checking collisions
- Added tfHelper class to interact easily with ROS tf
- New Math functions:
  - Added host function to invert matrices. Code thankfully copied from Maxim Singer
  - Function to convert Mat4 to Roll, Pitch, Yaw together with Boost tests
  - Vector3f now offers: apprx_equal, length, normalize, dot, cross
  - angleBetween two vectors, orientationMatrixDiff between two matrices
  - Matrix4f now offers: equality, approximate equality and subtract together with Boost tests

Minor changes:
- Added visualizer config file and a python generator for random swept volume colors (this time for real)
- New Boost testcases for Pointclouds and MetaPointClouds
- Simplified sensor code for Raycasting in Octree
- Restructured keyboard shortcuts in visualizer:
  - Added "command mode" to switch between data types so all Function keys can toggle maps of the selected kind.
  - Using ALT+digit to set decimal preposition of SweptVol IDs
- Right-Click available for more datatypes in Visualizer to print voxel information
- Fixed updates of subclouds in MetaPointClouds
- Added sanity check in computeLinearLoad
- Added Getter functions for GVL parameters
- Added some general HTML pages to Doxygen docu (thanks to Darius Pietsch)
- Unified probability type in all maps
- Fixed memory leaks in MetaPointCloud
- Added Kernel for GPU memory comparison
- Unified geometric transformation kernels
- Clarified signed and unsigned voxel indices (thanks to Christian Juelg)
- GPU Voxels main class now checks for Compute Capability at init
- Added PointCloud constructor to load file

Other changes:
- Compiles with Ubuntu 16.04
- Added CMake macro to remove VTK defines
- Required lib glm fix: g-truc/glm#530
- Added enlarged UR5 model
- Updated list of contributors

…ing example and better C++11, Ubuntu 16.04 support

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- the GLM in Ubuntu 16.04 has to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- Cuda 9.1
  - many incompatibilities are fixed, but there are still failing tests in the voxellist and octree test-suites

API breaking changes:
- GpuVoxels
  - addMap returns null-initialized shared_ptr if map already exists
  - lockSelf, lockBoth & Co removed, replaced by exception-safe lock_guard constructs to improve debugging
- TinyXML
  - use system version from APT package libtinyxml-dev

Major changes:
- Changes in CUDA CMake setup
  - uncomment SET(CMAKE_CXX_STANDARD 11) at the top of CMakeLists.txt to activate C++11 mode
  - set -maxrregcount=63 to avoid errors on desktop GPUs with 1024 threads per block
  - always use ICMAKER_CUDA_CPPDEFINES to pass parameters to nvcc
- Added OMPL planning example gvl_ompl_planning (Thanks to Andreas Hermann)
  - requires C++11
  - incompatible with GPU-Voxels built with PCL 1.7
- Added model "ur10_coarse" voxelized at 9mm
- Added CountingVoxelList to offer pointcloud density filtering (Thanks to Herbert Pietrzyk)
  - Use remove_underpopulated(minimum_count) to remove outliers
  - Use subtractFromCountingVoxelList to remove the robot and static objects
- Added BitVoxelMap collision with ProbVoxelmap
- ProbVoxelMap
  - insert() parses BitVoxelMeaning to allow freeing single voxels
- SVCollider checks for noneButEmpty instead of isZero
- Fix CUDA 9 incompatibilities, issue 63
  - version macro
  - cub namespace
  - __ballot vs __ballot_sync

Minor changes:
- DistanceVoxelTest uses double buffering of DistanceVoxelMap to avoid flickering
- ProbVoxelMap: Speed up the clearing of ProbVoxelMap by using memset instead of ctor calls for every voxel
- C++11 fixes
  - icl_core_logging operator<< stringstream bug
  - various fixes collected by dybedal in issue 55
- VoxelList collision
  - added test cases and examples
- fixed memory leaks in TemplateVoxelList
- fixed PCL dependency of examples and helpers, see issue 61
- fix computeLinearLoad returning grid and block size 0
- added many checks after kernel launches to improve error discovery

Other changes:
- Documentation updates

Fix Cuda 9.1 linking problem that broke voxellist code using Thrust CUB: always use shared cudart library

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- the GLM in Ubuntu 16.04 has to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch

Minor changes:
- added synchronization points after some voxellist Thrust calls
- removed obsolete explicit Kinect.cpp references in src/examples
- define GLM_ENABLE_EXPERIMENTAL in Primitive.h to fix issue #55

… VoxelList bounds checking

Thanks to Andreas Hermann for these additions:
- Implemented self collision checks. Also added an example and a testcase. Fixes #67 and resolves #68
- Improvements and fixes for prob voxels. Resolves #69
- Added example and vis config file
- Moved conversion from float to Probability to ProbabilisticVoxel
- Fixed the updates of probabilistic voxels.
- Added example about constructing a distance map from a probabilistic map
- Added a raycasting example. Resolves #70
- Rewrote the insertSensorData() function to resemble the current interfaces.
- Deleted old sensor / environment map code.
- Moved cSENSOR_MODEL_FREE and cSENSOR_MODEL_OCCUPIED to the common defines.
- Removed unnecessary meaning checks in ProbabilisticVoxelMap
- Added a missing template instance for the sensor model.
- Corrected all VisConfig files from the examples.
- Fixes in the visualizer for probability voxels and SWEPT_VOLUME_END

Possibly breaking changes:
- Added TemplateVoxelList::remove_out_of_bounds functionality.
- Always check for VoxelList map dimensions when inserting PointClouds into VoxelLists. Old behavior was not to check at all.
- Added mapToVoxels(linear_id, coords) to VoxelMapOperations. Available for host code too.
- Removed "linearIndexToCoordinates"
- Changed Logging in computeLinearLoad:
  - Inputting 0 items generates a warning now and leads to one block with one thread.
  - More than 64M items create an error message and only cMAX_NR_OF_BLOCKS blocks.

Additional changes:
- Added DVM::get(Squared)Distances(ToHost) functions: extract distance value for voxels at selected indexes
- Implemented TemplateVoxelList.merge(CountingVoxelList)
- Added TemplateVoxelMap::gatherVoxelsByIndex as basis for getDistances
- Added TemplateVoxelList::copyCoordsToHost
- Added test countingvoxellist_merge_into_bitvectorvoxellist_minimal
- Added test distance_extraction
- Fixed memory management for URDF RobotLink

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking
- the GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch


…ometry generation with integer coordinates.

Important Bugfixes:
- Octree issues on Pascal GPUs have been fixed. Thanks to Florian!
- Icmaker updated to avoid issues on CMake 3.6.2 and newer
- Fix Boost 1.65.1 compatibility issues for Ubuntu 18.04
- Fix CUBIN compilation in code=sm_** settings. Thanks to @r2b0 and Florian!

Known Issues:
- the GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer
  - produces runtime error "PTX JIT compilation failed"
- Eigen 3.3.4 and 3.3.5 issue with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x".
  - Can be fixed by using latest unstable version of Eigen
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking

Possibly breaking changes:
- TemplateVoxelList::getDimensions returns (size,1,1) instead of (size,0,0)
  - Important: TemplateVoxelList::getDimensions does not depend on m_ref_map_dim, unlike TemplateVoxelList::getMetricDimensions!

Additional changes:
- New example: GeometryGeneration showcases
  - New function GpuVoxelsMap::insertCoordinateList and geometry generation with integer coordinates
  - Example: geometry_generation::createBoxOfPoints(corner_min, corner_max, delta, side_length);
  - Available for maps, lists and octrees. Can create boxes, spheres and cylinders. Thanks to Herbert for his contribution!
- GpuVoxels::insertBoxIntoMap now uses the integer coordinate based creation and insertion functionality, avoiding discretization issues
- Remove warnings regarding logging and visualizer initialization
- Fix visualizer warning by always creating the shared memory segment on initialization
- Removed unused shader loading code
- Updated URDF code for compatibility with new ROS releases

Uses C++11 by default. Adds ROS-connected DistanceROSDemo example. Improved Visualizer Shared Memory error handling.

Major changes:
- Added DistanceROSDemo to show-case a ROS node subscribing to PCL point-clouds.
- Implemented additional merge functions
  - Added TemplateVoxelMap.merge(TemplateVoxelList)
  - Added functions to merge TemplateVoxelList into BitVoxelList and CountingVoxelList
- Added TemplateVoxelMap::clone and TemplateVoxelList::clone, was previously only available in DistanceVoxelMap
- Added functions that implement the morphological operations Closing, Dilation and Erosion
- Added function getCenterOfMass to TemplateVoxelList and TemplateVoxelMap
- Added BitVoxelList::copyCoordsToHostBvmBounded: allows the selective copy of voxels with BitVoxelMeanings in a given range
- Updated icmaker and icl_core to fix issues with new Boost and CMake versions

Possibly breaking changes:
- C++11 support is active by default. The new "indigo" branch has C++11 deactivated by default, is identical otherwise
- changed BitVoxelList::findMatchingVoxels signature:
  - Implicitly set argument list1 to 'this'.
  - Add option omit_coords: if true, output VoxelLists will only contain voxel IDs and data but the coord_list will be empty (default). If set to false, voxel coordinates will also be copied to the output lists.

Additional changes:
- gvl_ompl_planner:
  - fixed collision check threshold value
  - updated comments and logging output
- gvl_ompl_planner: added comment regarding LD_LIBRARY_PATH
- fix MetaPointCloud name unknown for robot links without geometry
- updated stb_image.h to version 2.19
- fix ROS&Urdf CMake discovery issue
- add tests for TemplateVoxelList and TemplateVoxelMap clone
- add hollie_from_pointcloud2.pcd for example DistanceVoxelTest.cpp
- fix a CountingVoxelList visualization issue

Known Issues:
- Eigen 3 issues: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
  + on Ubuntu 18.04 with CUDA 10.0: "math_functions.hpp not found"
  + Eigen 3.3.4 and 3.3.5 with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x"
  + see http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
- If the ROS dependency was found, but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer ("PTX JIT compilation failed").
  + Easy fix: use Cuda 10 with a current 410 or newer driver version. Cuda 10 is also available for Ubuntu 14.04 and 16.04.
- The GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking

Groovounet added a commit that referenced this issue Aug 6, 2016

Tentative CUDA workaround #530

cd50d4a

Groovounet added the bug label Aug 6, 2016

Groovounet added this to the GLM 0.9.8 milestone Aug 6, 2016

Groovounet self-assigned this Aug 6, 2016

Groovounet added a commit that referenced this issue Aug 6, 2016

Workaround Cuda compiler bug #530

dcdc966

Groovounet closed this as completed Aug 6, 2016

Squelsh mentioned this issue Sep 1, 2017

Compile issue on Jetson TX2 fzi-forschungszentrum-informatik/gpu-voxels#55

Closed

kne1p mentioned this issue Nov 29, 2017

No examples built after make, no ros or urdf, ubuntu 16.04 fzi-forschungszentrum-informatik/gpu-voxels#59

Closed
