[edited] GPU: keep_interpolated_fields #759

Open
weiy-me opened this issue Oct 30, 2024 · 4 comments
Labels
feature-request something that could be added to the code

Comments


weiy-me commented Oct 30, 2024

With GPU computation, the particle IDs in the track-particle output seem to be reassigned according to the current position at each timestep. With CPU computation, particle IDs are inherited from one timestep to the next, which allows tracking.

Description

Please refer to the attached image and code.

I attempted to perform a TNSA simulation and tracked some particles that moved to the back of the simulation area (x > 50) using their IDs, as shown in the figure. It can be seen that the results from the CPU and GPU are completely different. In the GPU version, these IDs seem to be reassigned based on position at each step, so they always appear together. In the final time step, they suddenly moved to the back of the simulation area.
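To make the comparison concrete, here is a minimal sketch of the check described above. It is plain Python and does not read Smilei output: `follow_ids`, `largest_jump`, and the sample data are hypothetical names and values used only for illustration. If IDs are inherited, looking particles up by ID should give smooth trajectories even when the storage order changes between outputs; a large jump in a single step points to IDs being reassigned.

```python
def follow_ids(snapshots, wanted_ids):
    """Collect the x position of each wanted ID at every snapshot.

    snapshots: list of (ids, xs) pairs, one per output time; the order
    of particles inside a pair is arbitrary.
    Returns {id: [x at snapshot 0, x at snapshot 1, ...]}, with None
    where the ID is absent from a snapshot.
    """
    tracks = {pid: [] for pid in wanted_ids}
    for ids, xs in snapshots:
        lookup = dict(zip(ids, xs))  # match by ID, never by array index
        for pid in wanted_ids:
            tracks[pid].append(lookup.get(pid))
    return tracks


def largest_jump(track):
    """Largest displacement between consecutive snapshots where the ID is present."""
    steps = [
        abs(b - a)
        for a, b in zip(track, track[1:])
        if a is not None and b is not None
    ]
    return max(steps, default=0.0)


# Made-up data: the second snapshot stores the same particles in a different order.
snapshots = [
    ([1, 2, 3], [0.0, 1.0, 2.0]),
    ([3, 1, 2], [2.5, 0.5, 1.5]),
]
tracks = follow_ids(snapshots, [1, 3])
```

Running `largest_jump` on each track and comparing it with the distance a particle can plausibly travel between two outputs is one way to spot the behaviour seen in the GPU figure.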

[Figure: CPU result]

[Figure: GPU result]

input_gpu.py.txt
input_cpu.py.txt
plot.py.txt

Parameters

  • Smilei Version : 5.0-206-g456444957-master
  • HDF5 version 1.14.2
  • Python version 3.10.15
  • GPU: A100-PCIE-40GB
  • CUDA Version: 12.1

weiy-me added the bug label Oct 30, 2024

Contributor

mccoys commented Oct 31, 2024

Thank you for reporting this bug. Are you able to try the latest version from GitHub to see if this was previously resolved?

Author

weiy-me commented Nov 4, 2024

Thank you. After upgrading to the latest version (5.1-55-ga31451771-master), particle ID tracking works normally, and the sorting process is much faster than before. However, after adding "keep_interpolated_fields"=["Ex", "Ey", "Ez", "Bx", "By", "Bz"] to the input file, the run fails with the following error around step 4700 of 51199.

--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:           l12gpu25
  Local device:         mlx5_0
  Local port:           1
  CPCs attempted:       rdmacm, udcm
--------------------------------------------------------------------------
Stack trace (most recent call last):
#8    Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in 
#7    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0x48eb6d, in _start
#6    Object "/lib64/libc.so.6", at 0x7f3968789d84, in __libc_start_main
#5    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0xd93ccb, in main
#4    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0xb3eb41, in VectorPatch::dynamics(Params&, SmileiMPI*, SimWindow*, RadiationTables&, MultiphotonBreitWheelerTables&, double, Timers&, int)
#3    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0xb3ee4b, in VectorPatch::dynamicsWithoutTasks(Params&, SmileiMPI*, SimWindow*, RadiationTables&, MultiphotonBreitWheelerTables&, double, Timers&, int)
#2    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0xdcd035, in Species::dynamics(double, unsigned int, ElectroMagn*, Params&, bool, PartWalls*, Patch*, SmileiMPI*, RadiationTables&, MultiphotonBreitWheelerTables&)
#1    Object "/lustre/home/2000011366/Smilei_5.1/Smilei/smilei", at 0xa9a0e5, in Particles::copyInterpolatedFields(double*, double*, std::vector<std::vector<double, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >&, unsigned long, unsigned long, unsigned long, double)
#0    Object "/lib64/libc.so.6", at 0x7f396881f39f, in 
Segmentation fault (Invalid permissions for mapped object [0x7f383a000000])
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 514218 on node l12gpu25 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

I tried several times, but the same error still occurs.

These files might be useful.

Contributor

mccoys commented Nov 5, 2024

Indeed, keep_interpolated_fields has not been ported to GPU yet. We are porting features one by one, but there is a lot more work remaining.

Let me change the title of this issue to mark it as a feature request.
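Until the feature is ported, one possible stopgap is to request the interpolated fields only for CPU runs. The sketch below is plain Python and makes several assumptions: `species_options` and the `use_gpu` flag are hypothetical names, and it assumes keep_interpolated_fields is passed as a keyword option in the input file, as in the snippet quoted above. It only builds a dictionary of options and does not call Smilei.

```python
# Fields listed in the report above.
FIELDS = ["Ex", "Ey", "Ez", "Bx", "By", "Bz"]


def species_options(base_options, use_gpu):
    """Return a copy of base_options, adding keep_interpolated_fields only for CPU runs.

    base_options: dict of keyword options for one species (hypothetical layout).
    use_gpu: user-defined flag saying whether the run targets a GPU.
    """
    options = dict(base_options)  # leave the caller's dict untouched
    if not use_gpu:
        options["keep_interpolated_fields"] = list(FIELDS)
    return options


cpu_options = species_options({"name": "electron"}, use_gpu=False)
gpu_options = species_options({"name": "electron"}, use_gpu=True)
```

The resulting dictionary could then be unpacked into the species block of the input file, so that the same file serves both the CPU and GPU runs.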

mccoys added the feature-request label and removed the bug label Nov 5, 2024
mccoys changed the title from "Unable to track particles by ID when using GPU" to "GPU: keep_interpolated_fields" Nov 5, 2024
mccoys changed the title from "GPU: keep_interpolated_fields" to "[edited] GPU: keep_interpolated_fields" Nov 5, 2024

Author

weiy-me commented Nov 6, 2024

Okay, thank you very much!

weiy-me closed this as completed Nov 11, 2024
weiy-me reopened this Nov 11, 2024