This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

omp::vector does not use OpenMP unless THRUST_DEVICE_SYSTEM=...OMP? #383

Closed
slayoo opened this issue Jul 16, 2013 · 5 comments
Labels
type: bug: functional Does not work as intended.

Comments

@slayoo
Contributor

slayoo commented Jul 16, 2013

Hi,

The following code:

$ cat test.cpp
#include <thrust/system/omp/vector.h>
#include <iostream>
#include <omp.h>

int main() {
  thrust::omp::vector<int> v(10);
  struct {
    void operator()(int) {
      std::cerr << omp_get_thread_num() << std::endl;
    }
  } test;
  thrust::for_each(v.begin(), v.end(), test);
}

when compiled with no THRUST_DEVICE_SYSTEM specified:

$ g++ -std=c++11 -fopenmp test.cpp

runs on a single thread even though an omp::vector was used:

$ ./a.out
0
0
0
0
0
0
0
0
0
0

The same code compiled with THRUST_DEVICE_SYSTEM set to OpenMP:

$ g++ -std=c++11 -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP test.cpp

does run using multiple threads:

$ ./a.out
15
2
2
4

1
30
0

3 

How can I force it to use OpenMP even without defining THRUST_DEVICE_SYSTEM?
(context: https://groups.google.com/forum/#!msg/thrust-users/r_aorLOXnpI/wLjQmFKMOFcJ)

Thanks for the help,
Sylwester

@jaredhoberock
Contributor

Hi Sylwester,
That's a rather embarrassing bug. Thanks for reporting it.

I think you can work around it by adding #include <thrust/system/omp/execution_policy.h>:

#include <thrust/system/omp/execution_policy.h>
#include <thrust/system/omp/vector.h>
#include <thrust/for_each.h>
#include <iostream>
#include <omp.h>

int main() {
  thrust::omp::vector<int> v(10);
  struct {
    void operator()(int) {
      std::cerr << omp_get_thread_num() << std::endl;
    }
  } test;
  thrust::for_each(v.begin(), v.end(), test);
}

Also, don't forget #include <thrust/for_each.h>.

@jaredhoberock
Contributor

To fix this, we need to change this line to #include <thrust/system/omp/execution_policy.h> and ensure that all the vector headers do the same thing.

@slayoo
Contributor Author

slayoo commented Jun 1, 2014

Hi,

I've come across a similar issue again and remembered the discussion here. Trying to rerun the example above, I'm now getting:

$ g++ -std=c++11 -fopenmp test.cpp
In file included from /usr/include/thrust/detail/config/config.h:31:0,
                 from /usr/include/thrust/detail/config.h:22,
                 from /usr/include/thrust/system/omp/vector.h:24,
                 from test.cpp:1:
/usr/include/thrust/detail/config/host_device.h:27:26: fatal error: host_defines.h: No such file or directory
 #include <host_defines.h>

with the Thrust 1.7 packaged in Debian, and:

$ g++ -std=c++11 -fopenmp test.cpp
In file included from /usr/include/thrust/system/cuda/detail/bulk/future.hpp:20:0,
                 from /usr/include/thrust/system/cuda/detail/bulk/execution_policy.hpp:19,
                 from /usr/include/thrust/system/cuda/detail/bulk/bulk.hpp:20,
                 from /usr/include/thrust/system/cuda/detail/bulk.h:44,
                 from /usr/include/thrust/system/cuda/detail/for_each.inl:25,
                 from /usr/include/thrust/system/cuda/detail/for_each.h:59,
                 from /usr/include/thrust/system/detail/adl/for_each.h:32,
                 from /usr/include/thrust/detail/for_each.inl:27,
                 from /usr/include/thrust/for_each.h:279,
                 from /usr/include/thrust/system/detail/generic/transform.inl:19,
                 from /usr/include/thrust/system/detail/generic/transform.h:105,
                 from /usr/include/thrust/detail/transform.inl:25,
                 from /usr/include/thrust/transform.h:724,
                 from /usr/include/thrust/system/detail/generic/gather.inl:21,
                 from /usr/include/thrust/system/detail/generic/gather.h:80,
                 from /usr/include/thrust/detail/gather.inl:25,
                 from /usr/include/thrust/gather.h:440,
                 from /usr/include/thrust/system/cuda/detail/adjacent_difference.inl:19,
                 from /usr/include/thrust/system/cuda/detail/adjacent_difference.h:50,
                 from /usr/include/thrust/system/detail/adl/adjacent_difference.h:32,
                 from /usr/include/thrust/detail/adjacent_difference.inl:25,
                 from /usr/include/thrust/adjacent_difference.h:245,
                 from /usr/include/thrust/system/detail/generic/adjacent_difference.inl:19,
                 from /usr/include/thrust/system/detail/generic/adjacent_difference.h:57,
                 from /usr/include/thrust/system/omp/detail/adjacent_difference.h:21,
                 from /usr/include/thrust/system/omp/execution_policy.h:33,
                 from /usr/include/thrust/system/omp/memory.h:24,
                 from /usr/include/thrust/system/omp/vector.h:25,
                 from test.cpp:1:
/usr/include/thrust/system/cuda/detail/bulk/detail/guarded_cuda_runtime_api.hpp:40:30: fatal error: cuda_runtime_api.h: No such file or directory
 #include <cuda_runtime_api.h>

with the current github version.
Any hints welcome!
Thanks,
Sylwester

@jaredhoberock
Contributor

Hi Sylwester,
Thrust is set up to use CUDA by default.

To fix this, either install the CUDA runtime so that cuda_runtime_api.h is available, or define THRUST_DEVICE_SYSTEM to something like THRUST_DEVICE_SYSTEM_OMP.

@slayoo
Contributor Author

slayoo commented Jun 6, 2014

Hi,

First, thanks for the quick answer.

Second, just for the record, the reason I was getting the same symptoms was that the Thrust calls ended up inside another OpenMP parallel block:

// workaround Thrust issue #382 (missing CUDA includes)
#define THRUST_DEVICE_SYSTEM THRUST_DEVICE_SYSTEM_CPP 

// workaround Thrust issue #383 (for Thrust < 1.8)
#include <thrust/system/omp/execution_policy.h> 

#include <thrust/system/omp/vector.h>
#include <thrust/for_each.h>
#include <iostream>
#include <omp.h>

void test() {
  thrust::omp::vector<int> v(10);
  struct {
    void operator()(int) {
      std::cerr << omp_get_thread_num() << std::endl;
    }
  } test;
  thrust::for_each(v.begin(), v.end(), test);
}

int main()
{
  test();

  std::cerr << "---------------\n";

#pragma omp parallel for
  for (int i=0; i < 1; ++i) test();
}

is giving:

$ g++ -std=c++11 -fopenmp test.cpp 
$ ./a.out 
0512
2
4


03
1


3
---------------
0
0
0
0
0
0
0
0
0
0

It seems obvious, but it was hard to track down in a longer program.
Again, thanks.
Sylwester
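
The effect described in the last comment can be reproduced without Thrust. The following is a minimal sketch in plain OpenMP, not code from the thread: probe_nested_team_sizes and TeamSizes are hypothetical names, and the call to omp_set_max_active_levels(1) is an assumption that makes the "no nested parallelism" setting explicit instead of relying on the runtime default. The _OPENMP guards let it compile without -fopenmp, in which case both sizes are simply 1.

```cpp
#include <cassert>
#include <iostream>
#ifdef _OPENMP
#include <omp.h>
#endif

// Team sizes observed by an outer parallel region and by a region nested in it.
struct TeamSizes {
  int outer;  // threads in the enclosing parallel region
  int inner;  // threads in a parallel region opened from inside it
};

// Hypothetical helper (not part of Thrust or OpenMP): opens a parallel
// region inside another one and records both team sizes.
TeamSizes probe_nested_team_sizes() {
  TeamSizes sizes = {1, 1};
#ifdef _OPENMP
  // Assumption: permit a single active parallel level, so any region
  // nested inside an active one gets a team of exactly one thread.
  omp_set_max_active_levels(1);

  #pragma omp parallel num_threads(2)
  {
    // Only thread 0 of the outer team records values, so there is no race.
    if (omp_get_thread_num() == 0) {
      sizes.outer = omp_get_num_threads();

      // This plays the role of the Thrust call made from inside an
      // enclosing "#pragma omp parallel for".
      #pragma omp parallel
      {
        if (omp_get_thread_num() == 0) {
          sizes.inner = omp_get_num_threads();
        }
      }
    }
  }
#endif
  return sizes;
}

// Prints both team sizes to the given stream.
void report_team_sizes(std::ostream& out) {
  const TeamSizes sizes = probe_nested_team_sizes();
  out << "outer team: " << sizes.outer
      << ", nested team: " << sizes.inner << "\n";
}
```

Calling report_team_sizes(std::cerr) from main on a multi-core machine, built with -fopenmp, should typically report an outer team of 2 and a nested team of 1, which matches the all-zero thread numbers below the dashed line in the output above. Raising the limit with omp_set_max_active_levels(2) would presumably let the nested region get its own team, at the risk of oversubscribing the cores.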
