Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy_v_cs_from_GPU(Utils::Range<ParticleIterator<Cell**, Particle> >) seems to make trouble with hip #2413

Closed
KaiSzuttor opened this issue Dec 12, 2018 · 7 comments

Comments

@KaiSzuttor
Copy link
Member

see https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/-/jobs/56529

@mkuron
Copy link
Member

mkuron commented Dec 12, 2018

What pull request did this start with? It worked fine last week.

@mkuron
Copy link
Member

mkuron commented Dec 13, 2018

I think #2390 fixes it. https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/-/jobs/57294 did not crash.

@KaiSzuttor
Copy link
Member Author

@mkuron
Copy link
Member

mkuron commented Dec 13, 2018

Now it only occurs in engine_lb, which makes some sense. It didn't occur in #2390, which means that it was broken by some pull request in the past two weeks.

@hmenke
Copy link
Member

hmenke commented Dec 13, 2018

It might be that HIP does not like the strided memcpy we are performing in copy_v_cs_from_GPU.

#if defined(ENGINE) && defined(LB_GPU)
// setup and call kernel to copy v_cs to host
void copy_v_cs_from_GPU(ParticleRange particles) {
if (global_part_vars_host.communication_enabled == 1 &&
global_part_vars_host.number_of_particles) {
// Copy result from device memory to host memory
if (this_node == 0) {
cuda_safe_mem(cudaMemcpy2D(
host_v_cs, sizeof(CUDA_v_cs), particle_data_device,
sizeof(CUDA_particle_data), sizeof(CUDA_v_cs),
global_part_vars_host.number_of_particles, cudaMemcpyDeviceToHost));
}
cuda_mpi_send_v_cs(particles, host_v_cs);
}
}
#endif

What we are doing here is, instead of calling memcpy for each individual particle we are calling memcpy of all particles at once, but in strides of the first few bits. That is also the reason why v_cs must remain the first member in the CUDA_ParticleParametersSwimming structure.

// Parameters for swimmers
#ifdef ENGINE
typedef struct {
// v_cs has to stay in the front for memmove reasons
float v_cs[6];
float v_swim;
float f_swim;
float director[3];
int push_pull;
float dipole_length;
bool swimming;
} CUDA_ParticleParametersSwimming;
#endif

@mkuron
Copy link
Member

mkuron commented Dec 13, 2018

That should be fine. Also it used to work, some recent change must have broken it. I won‘t have time to debug it before next week though.

@mkuron
Copy link
Member

mkuron commented Dec 17, 2018

It's mysteriously fixed on the current master: https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/-/jobs/57998 . I also couldn't reproduce it on my own machine. Please close this issue for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants