[HPC] [NWays-challenge] stdpar implementation issue #95

Open
mozhgan-kch opened this issue Feb 17, 2022 · 2 comments

@mozhgan-kch
Contributor

Error reported for the stdpar implementation:

nvc++ -std=c++17 -stdpar=gpu -lm -I/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/cuda/11.2/include -L/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/cuda/11.2/lib64 -lnvToolsExt -c jacobi.cpp
"/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/cuda/11.2/include/thrust/system/detail/generic/for_each.h", line 48: error: static assertion failed with "unimplemented for this system"
    THRUST_STATIC_ASSERT_MSG(
    ^
          detected during:
            instantiation of "InputIterator thrust::system::detail::generic::for_each(thrust::execution_policy<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::detail::execute_with_allocator<thrust::mr::allocator<char, thrust::mr::disjoint_unsynchronized_pool_resource<thrust::device_memory_resource, thrust::mr::new_delete_resource>>, thrust::cuda_cub::execute_on_stream_base>, InputIterator=thrust::counting_iterator<unsigned int, thrust::use_default, thrust::use_default, thrust::use_default>, UnaryFunction=lambda [](unsigned int)->void]" at line 44 of "/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/cuda/11.2/include/thrust/detail/for_each.inl"
            instantiation of "InputIterator thrust::for_each(const thrust::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::detail::execute_with_allocator<thrust::mr::allocator<char, thrust::mr::disjoint_unsynchronized_pool_resource<thrust::device_memory_resource, thrust::mr::new_delete_resource>>, thrust::cuda_cub::execute_on_stream_base>, InputIterator=thrust::counting_iterator<unsigned int, thrust::use_default, thrust::use_default, thrust::use_default>, UnaryFunction=lambda [](unsigned int)->void]" at line 1035 of "/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/compilers/include/nvhpc/algorithm_execution.hpp"
            instantiation of "void std::__pstl::__algorithm_wrapper_struct<true>::for_each(_FIt, _FIt, _UF) [with _FIt=thrust::counting_iterator<unsigned int, thrust::use_default, thrust::use_default, thrust::use_default>, _UF=lambda [](unsigned int)->void]" at line 2136 of "/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/compilers/include/nvhpc/algorithm_execution.hpp"
            instantiation of "std::__pstl::__enable_if_EP<_EP, void> std::for_each(_EP &&, _FIt, _FIt, _UF) [with _EP=const std::execution::parallel_policy &, _FIt=thrust::counting_iterator<unsigned int, thrust::use_default, thrust::use_default, thrust::use_default>, _UF=lambda [](unsigned int)->void]" at line 11 of "jacobi.cpp"

1 error detected in the compilation of "jacobi.cpp".
make: *** [Makefile:41: jacobi.o] Error 2

The error appears to come from the following function:

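// Headers this excerpt relies on
#include <algorithm>
#include <execution>
#include <thrust/iterator/counting_iterator.h>

void jacobistep(double *psinew, double *psi, int m, int n)
{
  // Note: the range [1, m) is half-open, so i runs from 1 to m-1;
  // pass m + 1 as the end if row m should be updated too.
  std::for_each(std::execution::par,
                thrust::counting_iterator<unsigned int>(1u),
                thrust::counting_iterator<unsigned int>(m),
                [psinew, psi, m, n](unsigned int i) {
                  for (int j = 1; j <= n; j++)
                  {
                    psinew[i*(m+2)+j] = 0.25*(psi[(i-1)*(m+2)+j] + psi[(i+1)*(m+2)+j]
                                            + psi[i*(m+2)+j-1] + psi[i*(m+2)+j+1]);
                  }
                });
}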
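
For comparison, here is a minimal sketch of the same update written without thrust::counting_iterator, the iterator type that appears in the failing instantiation in the log. It is an assumption, not a confirmed fix for this issue: the function name jacobistep_rows, the std::vector of row indices, and the USE_STDPAR macro are all hypothetical. It also loops over rows 1..m inclusive, which is a guess at the intended range.

```cpp
#include <algorithm>  // std::for_each
#include <numeric>    // std::iota
#include <vector>
#ifdef USE_STDPAR     // hypothetical macro, not part of the project
#include <execution>  // std::execution::par
#endif

// Sketch only: one Jacobi sweep over the interior rows 1..m,
// using the same i*(m+2)+j indexing as the original function.
void jacobistep_rows(double *psinew, const double *psi, int m, int n)
{
    // Row indices 1, 2, ..., m held in an ordinary standard container.
    std::vector<int> rows(m);
    std::iota(rows.begin(), rows.end(), 1);

    auto update_row = [psinew, psi, m, n](int i) {
        for (int j = 1; j <= n; j++) {
            psinew[i*(m+2)+j] = 0.25 * (psi[(i-1)*(m+2)+j] + psi[(i+1)*(m+2)+j]
                                      + psi[i*(m+2)+j-1] + psi[i*(m+2)+j+1]);
        }
    };

#ifdef USE_STDPAR
    // Parallel overload; this is the call a stdpar build would offload.
    std::for_each(std::execution::par, rows.begin(), rows.end(), update_row);
#else
    // Serial fallback so the sketch builds with any C++17 compiler.
    std::for_each(rows.begin(), rows.end(), update_row);
#endif
}
```

Without USE_STDPAR defined this builds as plain serial C++, which makes it easy to compare results against the original loop before trying a -stdpar=gpu build.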

This needs investigation, starting with reproducing the issue.

@mozhgan-kch mozhgan-kch added the bug Something isn't working label Feb 17, 2022
@mozhgan-kch mozhgan-kch changed the title from "[HPC] [NWays] stdpar implementation issue" to "[HPC] [NWays-challenge] stdpar implementation issue" Mar 9, 2022
@mozhgan-kch
Contributor Author

It looks like adding the -cuda flag to the nvc++ compile line resolves this; see https://forums.developer.nvidia.com/t/device-code-generated-from-stdpar-versus-thrust/196172/9

We still need to check that the container used here is the same as the one in the nways lab.

@mozhgan-kch mozhgan-kch self-assigned this Apr 14, 2022
@mozhgan-kch
Contributor Author

Check whether the deltasq calculation is wrong. The source file may also be missing several includes (unless we put them in jacobi.h?).
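
On the deltasq question: deltasq is not shown in this issue, so the following is only a sketch. It assumes deltasq is the sum of squared differences between the new and old interior values, using the same i*(m+2)+j indexing as jacobistep. The name deltasq_sketch and the row-index vector are hypothetical. The include list shows the standard headers such a reduction needs, which may help with the missing-includes check.

```cpp
#include <functional> // std::plus
#include <numeric>    // std::iota, std::transform_reduce
#include <vector>

// Sketch only: sum of squared differences over interior cells
// (rows 1..m, columns 1..n). The real deltasq may differ.
double deltasq_sketch(const double *newarr, const double *oldarr, int m, int n)
{
    // Row indices 1, 2, ..., m.
    std::vector<int> rows(m);
    std::iota(rows.begin(), rows.end(), 1);

    // Each row contributes a partial sum; the partial sums are added up.
    return std::transform_reduce(
        rows.begin(), rows.end(), 0.0, std::plus<double>(),
        [newarr, oldarr, m, n](int i) {
            double row_sum = 0.0;
            for (int j = 1; j <= n; j++) {
                const double diff = newarr[i*(m+2)+j] - oldarr[i*(m+2)+j];
                row_sum += diff * diff;
            }
            return row_sum;
        });
}
```

This uses the serial overload of std::transform_reduce. The parallel overload takes an execution policy such as std::execution::par as its first argument and additionally needs the <execution> header.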
