Can't determine number of cores. Unknown SM version 5.0! #880

zhangxaochen · 2014-09-01T15:18:08Z

I'm trying to build and run the project pcl_kinfu_app. The build succeeds, but running it gives me the following error:

E:\AC++\l\p-m-6d0343d1b7\_b\bin> pcl_kinfu_app_debug.exe
[pcl::gpu::printShortCudaDeviceInfo] : Device 0:  "GeForce GTX 750 Ti"  2048Mb
Can't determine number of cores. Unknown SM version 5.0!
, sm_50, 0 cores, Driver/Runtime ver.6.50/6.50
Error: invalid device function  E:/ABOUT C++/libs/pcl-master-6d0343d1b7/gpu/kinfu/src
/cuda/tsdf_volume.cu:76

I googled but found no useful posts on this issue. Is it caused by my CUDA installation, or by my GTX 750 Ti being a Maxwell-architecture GPU?

I've found the code here: https://github.com/PointCloudLibrary/pcl/blob/master/gpu/containers/src/initialization.cpp

inline int convertSMVer2Cores(int major, int minor)
{
    // Defines for GPU Architecture types (using the SM version to determine the # of cores per SM
    typedef struct {
        int SM; // 0xMm (hexidecimal notation), M = SM Major version, and m = SM minor version
        int Cores;
    } SMtoCores;

    SMtoCores gpuArchCoresPerSM[] =  { { 0x10,  8 }, { 0x11,  8 }, { 0x12,  8 }, { 0x13,  8 }, { 0x20, 32 }, { 0x21, 48 }, {0x30, 192}, {0x35, 192}, { -1, -1 }  };

    int index = 0;
    while (gpuArchCoresPerSM[index].SM != -1) 
    {
        if (gpuArchCoresPerSM[index].SM == ((major << 4) + minor) ) 
            return gpuArchCoresPerSM[index].Cores;
        index++;
    }
    printf("\nCan't determine number of cores. Unknown SM version %d.%d!\n", major, minor);
    return 0;
}

Does that mean I should reinstall some other versions of CUDA?

VictorLamoine · 2014-09-01T19:00:41Z

Hello,

Here is a link describing Maxwell architecture compatibility with CUDA:
http://docs.nvidia.com/cuda/maxwell-compatibility-guide/index.html

The code was probably written to handle architectures up to sm_3x and nothing newer.
It is not the version of CUDA but the architecture of your GPU that is a "problem" here.

You need to tweak:
if (gpuArchCoresPerSM[index].SM == ((major << 4) + minor) )

If you find a solution to this problem, it would be nice to share it by sending a pull request.

Bye

zhangxaochen · 2014-09-05T02:08:08Z

@VictorLamoine I found that my GTX 750 Ti has 5 SMs, each with 128 cores, so I added {0x50, 128} ahead of {-1, -1}. The "Unknown SM version 5.0" error is now gone, but the "invalid device function" error is still there:

E:\AC++\l\p-m-6d0343d1b7\_b\bin> pcl_kinfu_app_debug.exe
[pcl::gpu::printShortCudaDeviceInfo] : Device 0:  "GeForce GTX 750 Ti"  2048Mb, sm_50
, 640 cores, Driver/Runtime ver.6.50/6.50
Error: invalid device function  E:/ABOUT C++/libs/pcl-master-6d0343d1b7/gpu/kinfu/src
/cuda/tsdf_volume.cu:76

SteveSmithStyku · 2014-09-25T23:33:54Z

Has there been any progress on this issue? I'm running into the same "invalid device function" issue in tsdf_volume.cu:76

I was previously able to compile in Visual Studio 2010 with CUDA 5.0 using the following cmake options:

CUDA_ARCH_BIN - 2.0 2.1(2.0) 3.0
CUDA_ARCH_PTX - 3.0

This allowed me to run kinfu on both Kepler and Maxwell GPUs.

However, this no longer works for me under Visual Studio 2013 and CUDA 6.5: I get the run-time error "invalid device function", but only when running on a Maxwell GPU. It works fine on Kepler.

If I find out anything more I'll write it here. Thanks.

tshimba · 2014-10-24T21:49:55Z

Hi,

Has there been any progress on this problem?
I have the same problem.

However, if I build in Debug mode, the program works correctly; I only get the error message 'Error: invalid device function' when I run the program built in Release mode.

This looks like strange behavior. I'm looking for differences between the Release and Debug settings (e.g. in the CMake lists), but haven't found any yet.

I'm using
Windows 8.1
Visual Studio 2013
CUDA 6.5 with NVIDIA Quadro 4000
VTK 5.10
Boost 1.56.0
Eigen 3.2.2
FLANN 1.8.4
QHull 2012.1 for Windows

Thanks,

MichaelKorn · 2014-11-08T13:30:00Z

I got it working with Ubuntu 14.04 and a GTX 970:

set CUDA_ARCH_BIN = 5.2
set CUDA_ARCH_PTX = 5.2
add {0x52, 128} in convertSMVer2Cores
Version 5.2 uses up to 255 registers per thread. Because of this I had to reduce the number of threads in pcl::device::initVolume (dim3 block (16, 16);), otherwise I got "Error: too many resources requested for launch".

ddetone · 2014-11-30T17:20:39Z

I got KinFu working following MichaelKorn's steps with Ubuntu 14.04, GTX 980, CUDA 6.5 (using the special drivers for GTX9xx).

Grandgarfield · 2015-02-25T18:26:15Z

Hi,

In case it helps: after a long search I managed to run the kinfu application with CUDA 6.5.
My OS is Windows 7, and I'm using a laptop with an NVIDIA 640M GPU.
I had the same error in Release mode, with the application working fine in Debug mode, as f2um2326 described.
The problem seems to come from the /GL flag used during compilation.
To make it work, after generating the project files with CMake I manually removed the /GL flag from all .cmake files in $(BUILD_DIR)\gpu\kinfu\CMakeFiles\cuda_compile.dir\src\cuda. Each of them has a line setting CMAKE_HOST_FLAGS_RELEASE, and that is where I removed the flag.
This is not an optimal solution, since I guess the code is then not fully optimized, but it may help some of you.

QinZiwen · 2017-06-19T14:19:11Z

I have a GTX 1070 with CUDA 8.0 and have the same problem:

Device 0:  "GeForce GTX 1070"  8106Mb
Can't determine number of cores. Unknown SM version 6.1!
, sm_61, 0 cores, Driver/Runtime ver.8.0/8.0

SergioRAgostinho · 2017-06-26T10:00:31Z

Which version of PCL are you using exactly? The current master should have no issue with this.

haueck · 2017-07-05T13:27:58Z

Hi,

I have a similar problem when using the pcl-1.8.1rc1 release on an NVIDIA Tegra X1. When I try to estimate normals I get an error: Error: invalid device function pcl-pcl-1.8.0/gpu/octree/src/cuda/octree_host.cu:64.

Build command: cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_GPU=true -DBUILD_gpu_surface=true -DBUILD_gpu_kinfu=false -DBUILD_gpu_kinfu_large_scale=false -DCUDA_ARCH_BIN="5.3" -DCUDA_ARCH_PTX="5.3" && make

pcl::gpu::printCudaDeviceInfo() output:

*** CUDA Device Query (Runtime API) version (CUDART static linking) *** 

Device count: 1

Can't determine number of cores. Unknown SM version 5.3!

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3995 MBytes (4188778496 bytes)
  ( 2) Multiprocessors x ( 0) CUDA Cores/MP:     0 CUDA Cores
  GPU Clock Speed:                               0.07 GHz
  Memory Clock rate:                             0.00 Mhz
  Memory Bus Width:                              0-bit
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) 

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1

Thanks.

taketwo · 2017-07-05T13:35:19Z

#1824 added the number of CUDA cores per SM for Pascal GPUs. However, I don't see an entry for "5.3"; could this be the problem?

SergioRAgostinho · 2017-07-05T14:03:57Z

It's exactly that.

Edit: pcl_find_cuda.cmake also needs to be updated accordingly.

haueck · 2017-07-05T14:16:38Z

I will test it, and I can create a pull request for this change.

SergioRAgostinho · 2017-07-05T14:25:35Z

Thanks

According to http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capability-5-x
the number of cores for compute capability 5.x should be 128 as well.

Also according to this https://en.wikipedia.org/wiki/CUDA#GPUs_supported (not really official I know), our compute capabilities per cuda toolkit version are not exactly right
https://github.com/PointCloudLibrary/pcl/blob/master/cmake/pcl_find_cuda.cmake#L46-L58

haueck · 2017-07-06T09:34:56Z

I am not sure if there is a correlation between capability version and number of cores:

Cards with capability version 5.0:
GeForce GTX 750 Ti - 640 CUDA Cores
GeForce GTX 750 - 512 CUDA Cores

Cards with capability version 5.2:
GeForce GTX 980 Ti - 2816 CUDA Cores
GeForce GTX 980 - 2048 CUDA Cores
GeForce GTX 970 - 1664 CUDA Cores

Cards with capability version 5.3:
Tegra X1 - 256 CUDA Cores

SergioRAgostinho · 2017-07-06T10:18:11Z

It's CUDA cores per multiprocessor, which means the total number of cores on a card will always be a multiple of that number. For compute capability 5.x, each multiprocessor has 128 cores. All the cards you listed have a number of cores that is a multiple of 128.

MichaelKorn added a commit to MichaelKorn/pcl that referenced this issue Dec 3, 2014

improve compatibility with new nvidia GPUs (resolves PointCloudLibrar…

9d777f8

…y#880)

jspricke closed this as completed in 1337f66 Dec 4, 2014

jspricke added a commit that referenced this issue Dec 4, 2014

Merge pull request #1030 from MichaelKorn/cuda

63b0b25

improve compatibility with new nvidia GPUs (resolves #880)

SteveSmithStyku mentioned this issue Jan 27, 2015

Runtime Error: invalid device function, tsdf_volume.cu:76 (Kinfu bug on windows) #1113

Closed

nh2 pushed a commit to nh2/pcl that referenced this issue May 23, 2015

improve compatibility with new nvidia GPUs (resolves PointCloudLibrar…

59d2d8f

…y#880)

prclibo pushed a commit to prclibo/pcl that referenced this issue Aug 22, 2016

improve compatibility with new nvidia GPUs (resolves PointCloudLibrar…

fdb6951

…y#880)

haueck mentioned this issue Jul 7, 2017

Added CUDA compute capability 5.3 #1929

Merged
