[MacOS 10.13.6] compiled and ran predictions with OPENCL GPU_MODE #818

Closed
ghost opened this issue Sep 6, 2018 · 20 comments

@ghost

ghost commented Sep 6, 2018

Posting rules

  1. Fill the Your System Configuration section (all of it!) if you have some kind of error or performance question.
  2. No questions about training. OpenPose only implements testing.
  3. No questions about 3rd party libraries.
    • Caffe errors/issues, check Caffe documentation.
    • CUDA check failed errors: They are usually fixed by re-installing CUDA, then re-installing the proper cuDNN version, and then re-compiling (or re-installing) OpenPose. Otherwise, check for help in CUDA forums.
    • OpenCV errors: Install the default/pre-compiled OpenCV or check for online help.
  4. No duplicated posts.
  5. No posts about questions already answered / clearly explained in the documentation (e.g. no more low-speed nor out-of-memory questions).
  6. Set a proper issue title: add the Ubuntu/Windows word and be specific (e.g. do not simply call it: Compile error).
  7. Only English comments.
    Issues/comments which do not follow these rules will be ignored or removed with no further clarification.

Issue Summary

Compiled and ran predictions on the AMD GPU version in MacOS 10.13.6 successfully (with a minor mutex lock error at the end, but the output is still written successfully).

You can use a GPU version in MacOS with the following steps:

  1. Install viennacl using brew install viennacl
  2. Compile in CMake with OPENCL GPU_MODE
  3. When running openpose.bin, make sure to run with --num_gpu_start 1 to avoid Error: ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE

Used this command (also works for images):

Executed Command (if any)

  • ./build/examples/openpose/openpose.bin --video examples/media/video.mp4 --num_gpu_start 1
    (video.mp4 is my own 2-second video)

OpenPose Output (if any)

Starting OpenPose demo...
Auto-detecting all available GPUs... Detected 2 GPU(s), using 1 of them starting at GPU 1.
Starting thread(s)...
Kernel: resizeAndMergeFullKernel Type: float GPU: 1 built successfully
Kernel: resizeAndMergeKernel Type: float GPU: 1 built successfully
Kernel: resizeAndAddKernel Type: float GPU: 1 built successfully
Kernel: resizeAndAverageKernel Type: float GPU: 1 built successfully
Kernel: zeroBufferKernel Type: float GPU: 1 built successfully
Kernel: copyBufferKernel Type: float GPU: 1 built successfully
Kernel: nmsRegisterKernel Type: float GPU: 1 built successfully
Kernel: nmsWriteKernel Type: float GPU: 1 built successfully
OpenPose demo successfully finished. Total time: 481.331713 seconds.
libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
Abort trap: 6

Type of Issue

  • Test result report

Your System Configuration

  1. OpenPose version: Latest GitHub code (v1.4.0)

  2. General configuration:

    • Installation mode: Standard installation steps for MacOS (except stated OPENCL GPU_MODE)
    • Operating system: MacOS 10.13.6
    • Release or Debug mode?: release
    • Compiler (GCC):
      • Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
      • Apple LLVM version 9.1.0 (clang-902.0.39.2)
      • Target: x86_64-apple-darwin17.7.0
      • Thread model: posix
  3. Non-default settings:

    • 3-D Reconstruction module added?: no
    • Any other custom CMake configuration with respect to the default version?: GPU_MODE=OPENCL
  4. 3rd-party software:

    • Caffe version: Default from OpenPose
    • CMake version: 3.12.1
    • OpenCV version: 3.4.2
  5. If GPU mode issue:

    • CUDA version: NA
    • cuDNN version: NA
    • GPU model: Intel UHD Graphics 630 (Using Radeon Pro 560X causes INVALID_WORK_GROUP_SIZE error in ViennaCL)
  6. If CPU-only mode issue:

    • CPU brand & model: Intel Core i9-8950HK
    • Total RAM memory available: 32GB
  7. If Python API:

    • Python version: 3.7
    • Numpy version: 1.15.1
  8. If Windows system:

    • NA
  9. If speed performance issue:

    • NA

Not sure if this should be a GitHub issue, but posting it here to share.

@soulslicer
Collaborator

soulslicer commented Sep 7, 2018

Hmm..interesting

We were waiting for somebody with a Radeon graphics card on OSX to test it, but no one came forward, and we did not have such a machine.

So does it actually work? What was the frame rate, and what was the GPU utilization like?

You can run it with the display on a longer video to see the frame rate. Startup will be slow because it has to compile the CL kernels.

@tmanh

This comment has been minimized.

@ghost
Author

ghost commented Sep 7, 2018

@soulslicer I don't remember seeing the video while running it; I just waited for it to finish in the command line to output the .json. It did output the .json successfully, though.

I'll run with a longer video later and benchmark the frame rate and GPU utilisation when I get home and report back the findings.

@ghost
Author

ghost commented Sep 7, 2018

@tmanh this is the step where you have to run the CMake GUI, right? I was having issues with that and cloned caffe directly into the 3rdparty directory because of an error. I followed this comment

ghost changed the title from "[MacOS 10.13.6] Successfully compiled and ran predictions with OPENCL GPU_MODE" to "[MacOS 10.13.6] compiled and ran predictions with OPENCL GPU_MODE" on Sep 7, 2018

@ghost
Author

ghost commented Sep 7, 2018

@soulslicer I am so sorry! I just found out the command was using the non-AMD, Intel UHD Graphics 630 GPU. I incorrectly assumed it was the opposite, meaning I had to specify --num_gpu_start 1 to use the AMD one.

[Screenshot: 2018-09-07 19 38 56]

I would first have to debug why using --num_gpu_start 0 (the AMD Radeon Pro 560X GPU) causes this error:

ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204

Abort trap: 6

This may take some time

@gineshidalgo99
Member

gineshidalgo99 commented Sep 7, 2018

My best guess about the error is that it is trying to use both the Intel HD Graphics GPU and the AMD one, and maybe OpenCL cannot run on both of them simultaneously because they are GPUs from different vendors.

Could you try and let me know if it works with these extra 2 flags? --num_gpu_start 0 --num_gpu 1 (this should ONLY use the Intel HD one)

And to recap, in your case, we know that --num_gpu_start 1 --num_gpu 1 works (this only uses the AMD one) and that using both together fails (i.e., not adding any flag).
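For anyone following along, here is a minimal sketch of how these two flags appear to select devices. It is an assumption inferred from the log line "Detected 2 GPU(s), using 1 of them starting at GPU 1" in this thread, not a copy of the OpenPose source: a negative --num_gpu is taken to mean "every detected device from --num_gpu_start onward". The function name selected_gpus is hypothetical, and the sketch says nothing about which physical GPU ends up at which index.

```python
def selected_gpus(total_detected, num_gpu_start=0, num_gpu=-1):
    """Return the device indices a run would use (illustrative assumption only)."""
    if not 0 <= num_gpu_start < total_detected:
        raise ValueError("num_gpu_start must point at a detected device")
    if num_gpu < 0:
        # Assumed meaning of a negative value: use all remaining devices.
        num_gpu = total_detected - num_gpu_start
    if num_gpu_start + num_gpu > total_detected:
        raise ValueError("not enough devices after num_gpu_start")
    return list(range(num_gpu_start, num_gpu_start + num_gpu))


print(selected_gpus(2))                              # no flags
print(selected_gpus(2, num_gpu_start=1))             # --num_gpu_start 1
print(selected_gpus(2, num_gpu_start=0, num_gpu=1))  # --num_gpu_start 0 --num_gpu 1
```

Under this reading, each of the two single-GPU commands touches exactly one device index, so comparing them isolates one GPU at a time.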

@ghost
Author

ghost commented Sep 7, 2018

@gineshidalgo99 thanks for the quick reply! I'd like to clarify that in my case, this command worked:

./build/examples/openpose/openpose.bin --video examples/media/video.avi --num_gpu_start 1

and based on the GPU usage history, it only used the Intel UHD Graphics 630 GPU, not the AMD one.

However, when I try to use this command which specifies to use the AMD one:

./build/examples/openpose/openpose.bin --video examples/media/video.avi --num_gpu_start 0

it outputs the error in my previous comment.


I tried the 2 flags you sent, but unfortunately, it still doesn't seem to work. Command used:
./build/examples/openpose/openpose.bin --video examples/media/video.avi --num_gpu_start 0 --num_gpu 1

Stacktrace:

(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --video examples/media/video.avi --num_gpu_start 0 --num_gpu 1
Starting OpenPose demo...
Starting thread(s)...
ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204

Abort trap: 6

I'm not too familiar with the libraries, but my guess is if some constant for the WORK_GROUP_SIZE is tweaked, it may work. I'm just guessing here but would this make sense?
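To make the guess a bit more concrete, here is a small sketch of the two OpenCL 1.x conditions I know of that lead to CL_INVALID_WORK_GROUP_SIZE when a kernel is enqueued: the global size is not evenly divisible by the local (work group) size in some dimension, or the number of work items in one group exceeds the limit for that kernel on that device. On a real device the limit would be queried through clGetKernelWorkGroupInfo with CL_KERNEL_WORK_GROUP_SIZE, and it can be lower than the device-wide maximum. The function name and all numbers below are hypothetical; nothing here was measured on the Radeon Pro 560X.

```python
def work_group_size_problems(global_size, local_size, max_work_group_size):
    """List reasons a launch could be rejected with CL_INVALID_WORK_GROUP_SIZE.

    Only checks the two OpenCL 1.x rules described above.
    """
    if len(global_size) != len(local_size):
        raise ValueError("global_size and local_size need the same dimensions")

    problems = []
    items_per_group = 1
    for dim, (global_items, local_items) in enumerate(zip(global_size, local_size)):
        if local_items <= 0:
            raise ValueError("local sizes must be positive")
        items_per_group *= local_items
        if global_items % local_items != 0:
            problems.append(
                f"dimension {dim}: global size {global_items} is not "
                f"divisible by local size {local_items}"
            )
    if items_per_group > max_work_group_size:
        problems.append(
            f"{items_per_group} work items per group exceeds the limit "
            f"of {max_work_group_size}"
        )
    return problems


# Hypothetical launch configurations and limits.
print(work_group_size_problems((512, 512), (16, 16), max_work_group_size=256))
print(work_group_size_problems((512, 512), (16, 16), max_work_group_size=64))
print(work_group_size_problems((500, 512), (16, 16), max_work_group_size=256))
```

Whether the conv_forward kernel actually trips one of these checks on this machine is not something I have confirmed.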

I can provide more info or even screencast if needed.

@soulslicer
Collaborator

soulslicer commented Sep 7, 2018

Can you try an image whose resolution is a multiple of 64 or 128 (try a square image too), and also disable multithreading?

Also, Gines may be right: it may be trying to use both GPUs somehow.

@soulslicer
Collaborator

Also, was OpenPose any faster when using the Intel GPU? (I would presume slower.)

@ghost
Author

ghost commented Sep 7, 2018

@soulslicer sure! I tried the following:

cropped 450x300 image:

[Image: test]

and ran the following command:

./build/examples/openpose/openpose.bin --image_dir 2x3_image/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread

stacktrace was still similar:

(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --image_dir 2x3_image/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread
Starting OpenPose demo...
The default dynamic `--net_resolution` is not supported in MKL (MKL CPU Caffe) and OpenCL Caffe versions. Please, use a static `net_resolution` (recommended `--net_resolution 656x368`) or use the Caffe CUDA master branch when processing images and/or when using your custom image reader. OpenPose has automatically set the resolution to 656x368.
Starting thread(s)...
ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372

Abort trap: 6

square image (300x300):

[Image: test]

Command:

./build/examples/openpose/openpose.bin --image_dir square/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread

Stacktrace:

(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --image_dir square/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread
Starting OpenPose demo...
The default dynamic `--net_resolution` is not supported in MKL (MKL CPU Caffe) and OpenCL Caffe versions. Please, use a static `net_resolution` (recommended `--net_resolution 656x368`) or use the Caffe CUDA master branch when processing images and/or when using your custom image reader. OpenPose has automatically set the resolution to 656x368.
Starting thread(s)...
ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372

Abort trap: 6
(env) computer:openpose computer$ 

64/128 multiple image (256x256):

[Image: test]

Command:

./build/examples/openpose/openpose.bin --image_dir 64/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread

Stacktrace:

(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --image_dir 64/ --num_gpu_start 0 -num_gpu 1 --disable_multi_thread
Starting OpenPose demo...
The default dynamic `--net_resolution` is not supported in MKL (MKL CPU Caffe) and OpenCL Caffe versions. Please, use a static `net_resolution` (recommended `--net_resolution 656x368`) or use the Caffe CUDA master branch when processing images and/or when using your custom image reader. OpenPose has automatically set the resolution to 656x368.
Starting thread(s)...

ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractor.cpp:forwardPass():52
- /Users/computer/Desktop/dev/openpose/include/openpose/pose/wPoseExtractor.hpp:work():100
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThread.hpp:workTWorkers():135
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/subThreadQueueInOut.hpp:work():96
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:threadFunction():204
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/thread.hpp:exec():129
- /Users/computer/Desktop/dev/openpose/include/openpose/thread/threadManager.hpp:exec():186
- /Users/computer/Desktop/dev/openpose/include/openpose/wrapper/wrapper.hpp:exec():1072
- /Users/computer/Desktop/dev/openpose/examples/openpose/openpose.cpp:openPoseDemo():372

Abort trap: 6

@ghost
Author

ghost commented Sep 7, 2018

I haven't benchmarked against the CPU version on the i9 processor yet, but as far as I remember, running predictions on the example video.avi with the Intel GPU took ~1500 seconds (~25 minutes).

@soulslicer
Collaborator

soulslicer commented Sep 7, 2018

Sorry, you have to run with --net_resolution 256x256.

Please run this command instead of the OpenPose demo:

build/examples/tutorial_pose/1_extract_from_image.bin --image XX.jpg --net_resolution 256x256
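The reason the image size alone did not matter: the logs above show OpenPose overriding the network resolution to 656x368 regardless of the input image. A quick sketch to check whether a given --net_resolution value is a multiple of 64 or 128 in both dimensions (the helper names are made up for this example, and it only handles plain WIDTHxHEIGHT strings):

```python
def parse_resolution(text):
    """Split a 'WIDTHxHEIGHT' string into two integers."""
    width, height = (int(part) for part in text.lower().split("x"))
    return width, height


def is_multiple_of(resolution, base):
    """True if both width and height are multiples of base."""
    width, height = parse_resolution(resolution)
    return width % base == 0 and height % base == 0


for resolution in ("656x368", "256x256"):
    checks = {base: is_multiple_of(resolution, base) for base in (64, 128)}
    print(resolution, checks)
```

656x368 fails both checks, while 256x256 passes both, so only the explicit flag exercises the multiple-of-64 case.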

@soulslicer
Collaborator

soulslicer commented Sep 7, 2018

That is the wrong command, you need to run

build/examples/tutorial_pose/1_extract_from_image.bin --image XX.jpg --net_resolution 256x256

But I don't think it matters, it seems to be a weird bug on the Caffe side:

BVLC/caffe#6239

I think it is an AMD OSX driver issue. We have tested the AMD RX Vega cards on Ubuntu and Windows with no problems. On the other hand, Apple has deprecated OpenCL on its systems, so I would not expect a fix.

@ghost
Author

ghost commented Sep 7, 2018

It's still the same (I used --image_path, --image returns invalid flag):

Command:

./build/examples/tutorial_pose/1_extract_from_image.bin --image_path 256x256.png --net_resolution 256x256

Stacktrace:

(env) computer:openpose computer$ ./build/examples/tutorial_pose/1_extract_from_image.bin --image_path 256x256.png --net_resolution 256x256
Starting OpenPose demo...
ViennaCL: FATAL ERROR: Kernel start failed for 'conv_forward'.

Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/examples/tutorial_pose/1_extract_from_image.cpp:openPoseTutorialPose1():145
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: 
Error:
ViennaCL: FATAL ERROR: CL_INVALID_WORK_GROUP_SIZE 
 The supplied work group size is invalid. If you have set this value manually, please reconsider your choice.
If you think that this is a bug in ViennaCL, please report it at viennacl-support@lists.sourceforge.net and supply at least the following information:
 * Operating System
 * Which OpenCL implementation (AMD, NVIDIA, etc.)
 * ViennaCL version
Many thanks in advance!

Coming from:
- /Users/computer/Desktop/dev/openpose/src/openpose/net/netCaffe.cpp:forwardPass():227
- /Users/computer/Desktop/dev/openpose/src/openpose/pose/poseExtractorCaffe.cpp:forwardPass():315
- /Users/computer/Desktop/dev/openpose/examples/tutorial_pose/1_extract_from_image.cpp:openPoseTutorialPose1():145

Abort trap: 6

@ghost
Author

ghost commented Sep 7, 2018

I see. At least we were able to document this. Thanks for all your help! @soulslicer @gineshidalgo99

@soulslicer
Collaborator

soulslicer commented Sep 7, 2018

It seems specific to OSX

viennacl/viennacl-dev#233

naibaf7/libdnn#27

Unfortunately, it looks like AMD won't be working in OSX anytime soon.

@ghost
Author

ghost commented Sep 7, 2018

That's unfortunate.

It's all good, though; the CPU_ONLY version is already great in my use case, and people can simply rent a box for heavier predictions. Thanks for the support with this ticket!

@soulslicer
Collaborator

Okay, but can you tell me your benchmarks when using the Intel GPU? Is it faster than the CPU version?

@ghost
Author

ghost commented Sep 8, 2018

@soulslicer when using the Intel GPU, the FPS was more or less the same, and the total running time was ~1 minute shorter than with the i9 CPU. Details of both runs:

Constants

  • Sample video: examples/media/video.avi
  • Flags: --video examples/media/video.avi --write_json out/
  • Usage: MacBook was not used while running predictions
  • Opened applications: PyCharm, Sublime Text (4 tabs), CMake, iMessage, iStat Menus

Intel UHD Graphics 630 GPU (built with GPU_MODE=OPENCL)

  • Command: ./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_json out/ --num_gpu_start 1
  • Running time: 1518.763496 seconds
  • FPS: 0.1-0.2
  • GPU usage: ~80-100%

Stacktrace:
(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_json out/ --num_gpu_start 1
Starting OpenPose demo...
Auto-detecting all available GPUs... Detected 2 GPU(s), using 1 of them starting at GPU 1.
Starting thread(s)...
Kernel: resizeAndMergeFullKernel Type: float GPU: 1 built successfully
Kernel: resizeAndMergeKernel Type: float GPU: 1 built successfully
Kernel: resizeAndAddKernel Type: float GPU: 1 built successfully
Kernel: resizeAndAverageKernel Type: float GPU: 1 built successfully
Kernel: zeroBufferKernel Type: float GPU: 1 built successfully
Kernel: copyBufferKernel Type: float GPU: 1 built successfully
Kernel: nmsRegisterKernel Type: float GPU: 1 built successfully
Kernel: nmsWriteKernel Type: float GPU: 1 built successfully
OpenPose demo successfully finished. Total time: 1518.763496 seconds.
libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
Abort trap: 6

Intel Core i9-8950HK CPU (built with GPU_MODE=CPU_ONLY)

  • Command: ./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_json out/
  • Running time: 1572.387431 seconds
  • FPS: 0.1-0.2
  • CPU usage: ~60-75%

Stacktrace:
(env) computer:openpose computer$ ./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_json out/
Starting OpenPose demo...
Starting thread(s)...
OpenPose demo successfully finished. Total time: 1572.387431 seconds.
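To put the two totals side by side, a short calculation using only the "Total time" values from the logs above (single runs each, so this is a rough comparison rather than a proper benchmark):

```python
# Total times reported by the two runs in this comment, in seconds.
gpu_seconds = 1518.763496  # Intel GPU, GPU_MODE=OPENCL
cpu_seconds = 1572.387431  # Intel Core i9-8950HK, GPU_MODE=CPU_ONLY

saved = cpu_seconds - gpu_seconds
percent = 100.0 * saved / cpu_seconds

print(f"Time saved: {saved:.1f} s ({percent:.1f}% of the CPU run)")
# prints "Time saved: 53.6 s (3.4% of the CPU run)"
```

So the Intel GPU run was under a minute faster on a roughly 26-minute job, which is small enough that run-to-run variation could account for much of it.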

@stale

stale bot commented Nov 7, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale/old label Nov 7, 2018
@stale stale bot closed this as completed Nov 14, 2018