CUDA nvcc error in func_common.inl(442) #530

Closed
Squelsh opened this issue Jul 24, 2016 · 1 comment

Squelsh commented Jul 24, 2016

Hi!
After updating my Ubuntu, I get an nvcc error when compiling my code that uses GLM:

../detail/func_common.inl(442): error: expected a field name

The problem can be solved by introducing an additional temporary variable.
Replacing

    // mod
    template <typename genType>
    GLM_FUNC_QUALIFIER genType mod(genType x, genType y)
    {
        return mod(tvec1<genType>(x), y).x;
    }

with

    // mod
    template <typename genType>
    GLM_FUNC_QUALIFIER genType mod(genType x, genType y)
    {
        tvec1<genType> ret_vec(mod(tvec1<genType>(x), y));
        return ret_vec.x;
    }

Any suggestions on how to make it compile without patching GLM?

My system:
Ubuntu 16.04.1
CUDA 7.5 with nvcc V7.5.17
Ubuntu glm package (libglm-dev:i386 (= 0.9.7.2-1))
But the problem also occurs with the latest code from master.

Thanks,
Andreas
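
For readers who want to try the pattern outside of GLM, the workaround from the comment above can be reproduced in a small standalone sketch. The `vec1` struct and the names `mod_direct` and `mod_workaround` are hypothetical stand-ins, not GLM's real definitions. Both scalar versions are valid standard C++ and should return the same value; the only difference is whether `.x` is read from the returned temporary or from a named variable, which is the form nvcc V7.5.17 accepted according to the report.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <cmath>

namespace sketch {

// Hypothetical stand-in for GLM's tvec1<T>; NOT the library's real definition.
template <typename T>
struct vec1 {
    T x;
    explicit vec1(T v) : x(v) {}
};

// Vector overload: x - y * floor(x / y), the GLSL definition of mod.
template <typename T>
vec1<T> mod(vec1<T> const& x, T y) {
    return vec1<T>(x.x - y * std::floor(x.x / y));
}

// Original style: member access directly on the temporary returned by
// the vector overload. This is the form that triggered
// "expected a field name" in the report.
template <typename T>
T mod_direct(T x, T y) {
    return mod(vec1<T>(x), y).x;
}

// Workaround style: bind the result to a named variable first,
// then read the member from that variable.
template <typename T>
T mod_workaround(T x, T y) {
    vec1<T> ret_vec(mod(vec1<T>(x), y));
    return ret_vec.x;
}

}  // namespace sketch
```

With a host compiler, `sketch::mod_direct(5.5f, 2.0f)` and `sketch::mod_workaround(5.5f, 2.0f)` can be compared with `assert` to confirm that the rewrite does not change the result.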

Groovounet added a commit that referenced this issue Aug 6, 2016
@Groovounet Groovounet added the bug label Aug 6, 2016
@Groovounet Groovounet added this to the GLM 0.9.8 milestone Aug 6, 2016
@Groovounet Groovounet self-assigned this Aug 6, 2016
Groovounet added a commit that referenced this issue Aug 6, 2016
@Groovounet
Member

This issue is fixed with your proposed workaround... another annoying CUDA compiler bug!

Thanks for contributing,
Christophe

Squelsh pushed a commit to fzi-forschungszentrum-informatik/gpu-voxels that referenced this issue Aug 21, 2016
API breaking changes:
- GpuVoxels is now a singleton and has to be initialized
- BitVoxelMeaning enum changed! 0 = Free, 1 = Occupied. More SV IDs.
- Added as() operator to cast general maps into specific maps
- Map-Offsets may now be negative, so Voxel-pointers changed datatype.
- Shifted main API from general map type to specific implementations:
  Many functions can no longer be called on basic map types but only on
  specific maps. As not all map types offer all interfaces, this allowed
  us to remove unimplemented function stubs (thanks to Herbert Pietrzyk)
- RPY rotation order changed to ROS standards:
  First rotated around roll, then pitch, then yaw
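
To illustrate the rotation order named in the last item: assuming fixed axes, rotating first about roll (X), then pitch (Y), then yaw (Z) corresponds to the matrix product R = Rz(yaw) * Ry(pitch) * Rx(roll). The sketch below uses a hypothetical `Mat3` type and helper names; it is not the GPU-Voxels Matrix4f API, whose implementation may differ.

```cpp
#include <array>
#include <cassert>  // used by the checks in the usage note
#include <cmath>

namespace rpy_sketch {

// Hypothetical 3x3 matrix type, row-major: m[row][col].
using Mat3 = std::array<std::array<double, 3>, 3>;

Mat3 identity() {
    Mat3 m{};  // zero-initialized
    m[0][0] = m[1][1] = m[2][2] = 1.0;
    return m;
}

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Elementary rotation about axis 0 (X), 1 (Y) or 2 (Z).
Mat3 rotation(int axis, double angle) {
    const int u = (axis + 1) % 3;  // the two axes spanning the
    const int v = (axis + 2) % 3;  // plane of rotation
    Mat3 m = identity();
    m[u][u] = std::cos(angle);
    m[u][v] = -std::sin(angle);
    m[v][u] = std::sin(angle);
    m[v][v] = std::cos(angle);
    return m;
}

// Roll is applied first, so it is the rightmost factor.
Mat3 fromRPY(double roll, double pitch, double yaw) {
    return multiply(rotation(2, yaw),
                    multiply(rotation(1, pitch), rotation(0, roll)));
}

}  // namespace rpy_sketch
```

As a quick check of the order, with roll = yaw = 90 degrees and pitch = 0, the Y unit vector is first rolled onto Z and then left unchanged by the yaw, so the second column of the result is (0, 0, 1). Applying yaw first would instead send it to (-1, 0, 0).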

Major changes:
- Unified map-locking for all map types to guarantee thread-safety
- Added new Pointcloud class for single clouds (thanks to Herbert Pietrzyk)
- Added Octree-API function: insertPointCloudWithFreespaceCalculation
  to trigger raycasting
- Added option to interpret unknown cells of an octree as obstacles
  when checking collisions
- Added tfHelper class to interact easily with ROS tf
- New Math functions:
  - Added host function to invert matrices.
    Code thankfully copied from Maxim Singer
  - Function to convert Mat4 to Roll, Pitch, Yaw
    together with Boost tests
  - Vector3f now offers: apprx_equal, length, normalize, dot, cross
  - angleBetween two vectors, orientationMatrixDiff between two matrices
  - Matrix4f now offers: equality, approximate equality and subtract
    together with Boost tests

Minor changes:
- Added visualizer config file and a python generator for random swept
  volume colors (this time for real)
- New Boost testcases for Pointclouds and MetaPointClouds
- Simplified sensor code for Raycasting in Octree
- Restructured keyboard shortcuts in visualizer:
  - Added "command mode" to switch between data types so
    all Function keys can toggle maps of the selected kind.
  - Using ALT+digit to set decimal preposition of SweptVol IDs
- Right-Click available for more datatypes in Visualizer to
  print voxel information
- Fixed updates of subclouds in MetaPointClouds
- Added sanity check in computeLinearLoad
- Added Getter functions for GVL parameters
- Added some general HTML pages to Doxygen docu (thanks to Darius Pietsch)
- Unified probability type in all maps
- Fixed memory leaks in MetaPointCloud
- Added Kernel for GPU memory comparison
- Unified geometric transformation kernels
- Clarified signed and unsigned voxel indices (thanks to Christian Juelg)
- GPU Voxels main class now checks for Compute Capability at init
- Added PointCloud constructor to load file

Other changes:
- Compiles with Ubuntu 16.04
  - Added CMake macro to remove VTK defines
  - Required lib glm fix: g-truc/glm#530
- Added enlarged UR5 model
- Updated list of contributors

cjue added a commit to cjue/gpu-voxels that referenced this issue Feb 5, 2018
…ing example and better C++11, Ubuntu 16.04 support

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- the GLM in Ubuntu 16.04 has to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- Cuda 9.1
  - many incompatibilities are fixed, but there are still failing tests in the voxellist and octree test-suites

API breaking changes:
- GpuVoxels
  - addMap returns null-initialized shared_ptr if map already exists
  - lockSelf, lockBoth & Co removed, replaced by exception-safe lock_guard
  constructs to improve debugging
- TinyXML
  - use system version from APT package libtinyxml-dev
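
The move from explicit lock and unlock calls to lock_guard constructs can be sketched as follows. This is a generic illustration using `std::mutex` and `std::lock_guard` from the C++11 standard library; the names `map_mutex`, `completed_updates` and `updateMapOrThrow` are hypothetical, and the locking primitives actually used inside GPU-Voxels may differ.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <mutex>
#include <stdexcept>

namespace lock_sketch {

std::mutex map_mutex;       // hypothetical mutex protecting a map
int completed_updates = 0;  // hypothetical shared state

// The guard locks the mutex in its constructor and unlocks it in its
// destructor, so the mutex is released on normal return AND when an
// exception propagates out of the function.
void updateMapOrThrow(bool fail) {
    std::lock_guard<std::mutex> guard(map_mutex);
    if (fail) {
        throw std::runtime_error("update failed");
    }
    ++completed_updates;
}

}  // namespace lock_sketch
```

If `updateMapOrThrow(true)` throws, a later call to `updateMapOrThrow(false)` still acquires the mutex and completes. With a manual lock and unlock pair, the unlock would be skipped by the exception and the second call would block.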

Major changes:
- Changes in CUDA CMake setup
  - uncomment SET(CMAKE_CXX_STANDARD 11) at the top of CMakeLists.txt to activate C++11 mode
  - set -maxrregcount=63 to avoid errors on desktop GPUs with 1024 threads per block
  - always use ICMAKER_CUDA_CPPDEFINES to pass parameters to nvcc
- Added OMPL planning example gvl_ompl_planning  (Thanks to Andreas Hermann)
  - requires C++11
  - incompatible with GPU-Voxels built with PCL 1.7
- Added model "ur10_coarse" voxelized at 9mm
- Added CountingVoxelList to offer pointcloud density filtering  (Thanks to Herbert Pietrzyk)
  - Use remove_underpopulated(minimum_count) to remove outliers
  - Use subtractFromCountingVoxelList to remove the robot and static objects
- Added BitVoxelMap collision with ProbVoxelmap
- ProbVoxelMap
  - insert() parses BitVoxelMeaning to allow freeing single voxels
  - SVCollider checks for noneButEmpty instead of isZero
- Fix CUDA 9 incompatibilities, issue 63
  - version macro
  - cub namespace
  - __ballot vs __ballot_sync

Minor changes:
- DistanceVoxelTest uses double buffering of DistanceVoxelMap to avoid flickering
- ProbVoxelMap: Speed up the clearing of ProbVoxelMap by using memset instead of ctor calls for every voxel
- C++11 fixes
  - icl_core_logging operator<< stringstream bug
  - various fixes collected by dybedal in issue 55
- VoxelList collision
  - added test cases and examples
- fixed memory leaks in TemplateVoxelList
- fixed PCL dependency of examples and helpers, see issue 61
- fix computeLinearLoad returning grid and block size 0
- added many checks after kernel launches to improve error discovery

Other changes:
- Documentation updates

cjue added a commit to fzi-forschungszentrum-informatik/gpu-voxels that referenced this issue Feb 8, 2018
Fix Cuda 9.1 linking problem that broke voxellist code using Thrust CUB: always use shared cudart library

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- the GLM in Ubuntu 16.04 has to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch

Minor changes:
- added synchronization points after some voxellist Thrust calls
- removed obsolete explicit Kinect.cpp references in src/examples
- define GLM_ENABLE_EXPERIMENTAL in Primitive.h to fix issue #55

cjue added a commit to fzi-forschungszentrum-informatik/gpu-voxels that referenced this issue May 1, 2018
… VoxelList bounds checking

Thanks to Andreas Hermann for these additions:
- Implemented self collision checks. Also added an example and a testcase. Fixes #67 and resolves #68
- Improvements and fixes for prob voxels. Resolves #69
- Added example and vis config file
- Moved conversion from float to Probability to ProbabilisticVoxel
- Fixed the updates of probabilistic voxels.
- Added example about constructing a distance map from a probabilistic map
- Added a raycasting example. Resolves #70
- Rewrote the insertSensorData() function to resemble the current interfaces.
- Deleted old sensor / environment map code.
- Moved cSENSOR_MODEL_FREE and cSENSOR_MODEL_OCCUPIED to the common defines.
- Removed unnecessary meaning checks in ProbabilisticVoxelMap
- Added a missing template instance for the sensor model.
- Corrected all VisConfig files from the examples.
- Fixes in the visualizer for probability voxels and SWEPT_VOLUME_END

Possibly breaking changes:
- Added TemplateVoxelList::remove_out_of_bounds functionality.
- Always check for VoxelList map dimensions when inserting PointClouds into VoxelLists. Old behavior was not to check at all.
- Added mapToVoxels(linear_id, coords) to VoxelMapOperations. Available for host code too.
- Removed "linearIndexToCoordinates"
- Changed Logging in computeLinearLoad:
  - Inputting 0 items generates a warning now and leads to one block with one thread.
  - More than 64M items create an error message and only cMAX_NR_OF_BLOCKS blocks.
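
The behavior described for computeLinearLoad can be sketched as a plain host function. This is not the GPU-Voxels implementation: the function name, the return struct and the assumed limit of 1024 threads per block are hypothetical. Only the block limit of 65535 is taken from the known issues of this commit message. Under these assumptions the capacity is 65535 * 1024 items, which is roughly the 64M mentioned above.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <cstdint>

namespace load_sketch {

const uint32_t kMaxBlocks = 65535;          // cMAX_NR_OF_BLOCKS per the known issues
const uint32_t kMaxThreadsPerBlock = 1024;  // assumption, not from the source

struct LinearLoad {
    uint32_t blocks;
    uint32_t threads_per_block;
    bool covers_all_items;  // false when the block limit was hit
};

// Hypothetical sketch of splitting nr_of_items over a 1D launch grid.
LinearLoad computeLinearLoadSketch(uint64_t nr_of_items) {
    if (nr_of_items == 0) {
        // Warning case: fall back to one block with one thread.
        return {1, 1, true};
    }
    if (nr_of_items <= kMaxThreadsPerBlock) {
        return {1, static_cast<uint32_t>(nr_of_items), true};
    }
    // Round up so that the last, partially filled block is included.
    const uint64_t blocks =
        (nr_of_items + kMaxThreadsPerBlock - 1) / kMaxThreadsPerBlock;
    if (blocks > kMaxBlocks) {
        // Error case: clamp to the maximum number of blocks.
        return {kMaxBlocks, kMaxThreadsPerBlock, false};
    }
    return {static_cast<uint32_t>(blocks), kMaxThreadsPerBlock, true};
}

}  // namespace load_sketch
```

In this sketch, 0 items yield one block with one thread, 2048 items yield two full blocks, and 2^27 items exceed the capacity and are clamped to 65535 blocks.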

Additional changes:
- Added DVM::get(Squared)Distances(ToHost) functions: extract distance value for voxels at selected indexes
- Implemented TemplateVoxelList.merge(CountingVoxelList)
- Added TemplateVoxelMap::gatherVoxelsByIndex as basis for getDistances
- Added TemplateVoxelList::copyCoordsToHost
- Added test countingvoxellist_merge_into_bitvectorvoxellist_minimal
- Added test distance_extraction
- Fixed memory management for URDF RobotLink

Known issues:
- Octrees are broken on Pascal GPUs
  - confirmed on Titan Xp and GTX 1080 Ti
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking
- the GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch

cjue added a commit to fzi-forschungszentrum-informatik/gpu-voxels that referenced this issue Aug 28, 2018
…ometry generation with integer coordinates.

Important Bugfixes:
- Octrees issues on Pascal GPUs have been fixed. Thanks to Florian!
- Icmaker updated to avoid issues on CMake 3.6.2 and newer
- Fix Boost 1.65.1 compatibility issues for Ubuntu 18.04
- Fix CUBIN compilation in code=sm_** settings. Thanks to @r2b0 and Florian!

Known Issues:
- the GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer
  - produces runtime error "PTX JIT compilation failed"
- Eigen 3.3.4 and 3.3.5 issue with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x".
  - Can be fixed by using latest unstable version of Eigen
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking

Possibly breaking changes:
- TemplateVoxelList::getDimensions returns (size,1,1) instead of (size,0,0)
  - Important: TemplateVoxelList::getDimensions does not depend on m_ref_map_dim, unlike TemplateVoxelList::getMetricDimensions!

Additional changes:
- New example: GeometryGeneration showcases
- New function GpuVoxelsMap::insertCoordinateList and geometry generation with integer coordinates
  - Example: geometry_generation::createBoxOfPoints(corner_min, corner_max, delta, side_length);
  - Available for maps, lists and octrees. Can create boxes, spheres and cylinders. Thanks to Herbert for his contribution!
- GpuVoxels::insertBoxIntoMap now uses the integer coordinate based creation and insertion functionality, avoiding discretization issues
- Remove warnings regarding logging and visualizer initialization
- Fix visualizer warning by always creating the shared memory segment on initialization
- Removed unused shader loading code
- Updated URDF code for compatibility with new ROS releases

cjue added a commit to fzi-forschungszentrum-informatik/gpu-voxels that referenced this issue Jan 10, 2019
Uses C++11 by default. Adds ROS-connected DistanceROSDemo example. Improved Visualizer Shared Memory error handling.

Major changes:
- Added DistanceROSDemo to show-case a ROS node subscribing to PCL point-clouds.
- Implemented additional merge functions
  - Added TemplateVoxelMap.merge(TemplateVoxelList)
  - Added functions to merge TemplateVoxelList into BitVoxelList and CountingVoxelList;
- Added TemplateVoxelMap::clone and TemplateVoxelList::clone, was previously
only available in DistanceVoxelMap
- Added functions that implement the morphological operations Closing, Dilation and Erosion
- Added function getCenterOfMass to TemplateVoxelList and TemplateVoxelMap
- Added BitVoxelList::copyCoordsToHostBvmBounded: allows the selective copy of voxels with BitVoxelMeanings in a given range
- Updated icmaker and icl_core to fix issues with new Boost and CMake versions
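
For readers unfamiliar with the morphological operations named above, the following CPU sketch shows dilation, erosion and closing on a small binary voxel grid. It assumes a 3x3x3 cube as structuring element and treats cells outside the grid as free. The `Grid` type and all function names are hypothetical; the GPU-Voxels functions run on the GPU and may use a different neighborhood and border handling.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <cstddef>
#include <vector>

namespace morph_sketch {

// Hypothetical dense binary voxel grid.
struct Grid {
    int nx, ny, nz;
    std::vector<unsigned char> occ;
    Grid(int x, int y, int z)
        : nx(x), ny(y), nz(z), occ(static_cast<std::size_t>(x) * y * z, 0) {}
    bool inside(int x, int y, int z) const {
        return x >= 0 && x < nx && y >= 0 && y < ny && z >= 0 && z < nz;
    }
    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * ny + y) * nx + x;
    }
    // Cells outside the grid are reported as free.
    bool get(int x, int y, int z) const {
        return inside(x, y, z) && occ[index(x, y, z)] != 0;
    }
    void set(int x, int y, int z, bool v) { occ[index(x, y, z)] = v ? 1 : 0; }
};

// One pass over the grid with a 3x3x3 cube. Dilation marks a voxel if
// ANY cell in the cube is occupied; erosion keeps it only if ALL are.
Grid morph(const Grid& in, bool dilate) {
    Grid out(in.nx, in.ny, in.nz);
    for (int z = 0; z < in.nz; ++z)
        for (int y = 0; y < in.ny; ++y)
            for (int x = 0; x < in.nx; ++x) {
                bool any = false, all = true;
                for (int dz = -1; dz <= 1; ++dz)
                    for (int dy = -1; dy <= 1; ++dy)
                        for (int dx = -1; dx <= 1; ++dx) {
                            const bool o = in.get(x + dx, y + dy, z + dz);
                            any = any || o;
                            all = all && o;
                        }
                out.set(x, y, z, dilate ? any : all);
            }
    return out;
}

Grid dilation(const Grid& g) { return morph(g, true); }
Grid erosion(const Grid& g) { return morph(g, false); }
// Closing: dilation followed by erosion; fills small gaps.
Grid closing(const Grid& g) { return erosion(dilation(g)); }

std::size_t countOccupied(const Grid& g) {
    std::size_t n = 0;
    for (unsigned char c : g.occ) n += c;
    return n;
}

}  // namespace morph_sketch
```

As an example, in a 5x5x5 grid with occupied voxels at (1,2,2) and (3,2,2), closing fills the one-voxel gap at (2,2,2), while erosion alone removes both isolated voxels.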

Possibly breaking changes:
- C++11 support is active by default. The new "indigo" branch has C++11 deactivated by default, is identical otherwise
- changed BitVoxelList::findMatchingVoxels signature:
  - Implicitly set argument list1 to 'this'.
  - Add option omit_coords: if true, output VoxelLists will only contain voxel IDs and data but the coord_list will be empty (default). If set to false, voxel coordinates will also be copied to the output lists.

Additional changes:
- gvl_ompl_planner:
  - fixed collision check threshold value
  - updated comments and logging output
- gvl_ompl_planner: added comment regarding LD_LIBRARY_PATH
- fix MetaPointCloud name unknown for robot links without geometry
- updated stb_image.h to version 2.19
- fix ROS&Urdf CMake discovery issue
- add tests for TemplateVoxelList and TemplateVoxelMap clone
- add hollie_from_pointcloud2.pcd for example DistanceVoxelTest.cpp
- fix a CountingVoxelList visualization issue

Known Issues:
- Eigen 3 issues: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
  + on Ubuntu 18.04 with CUDA 10.0: "math_functions.hpp not found"
  + Eigen 3.3.4 and 3.3.5 with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x"
  + see http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
- If the ROS dependency was found, but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer ("PTX JIT compilation failed").
  + Easy fix: use Cuda 10 with a current 410 or newer driver version. Cuda 10 is also available for Ubuntu 14.04 and 16.04.
- The GLM headers provided by Ubuntu 16.04 have to be patched to allow usage of the visualizer.
  - see g-truc/glm#530
  - patch for /usr/include/glm/detail/func_common.inl in packages/gpu_voxels/doc/glm_fix_issu530.patch
- The constant cMAX_NR_OF_BLOCKS is currently limited to 65535, while current CUDA devices support over 2 billion blocks
  - relevant during computeLinearLoad calls and collision checking