Fix a bug in tensornet backend scratch pad allocation in multi-GPU mode #2516

Merged: 4 commits, Jan 17, 2025

Conversation

1tnguyen
Collaborator

Description

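ScratchDeviceMem allocates its memory at construction time, sized according to the memory available at that moment. This is incompatible with the multi-GPU code path (MPI execution), where the CUDA device is selected in the simulator constructor: the scratch pad is a member variable of the simulator class, so it is constructed, and its memory allocated, before the constructor body selects the device. The allocation therefore has to be deferred until the device has been selected.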

Fixed by adding a separate allocate method, which is called once in the simulator backend constructor after device selection.
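The deferred-allocation pattern described above can be sketched as follows. This is a minimal illustration and not the actual CUDA-Q code: `fake_runtime`, `EagerScratch`, `DeferredScratch`, `Simulator`, and the `rank % numDevices` device choice are all hypothetical stand-ins, and a plain integer replaces the real CUDA "current device" so the sketch compiles without a GPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the CUDA runtime's "current device" state.
// A real backend would call cudaSetDevice / cudaGetDevice instead.
namespace fake_runtime {
inline int &currentDevice() {
  static int device = 0; // device 0 is current until one is selected
  return device;
}
inline void setDevice(int id) { currentDevice() = id; }
} // namespace fake_runtime

// Eager variant (the problematic pattern): "allocates" in its constructor,
// so it binds to whichever device is current when the member is constructed.
struct EagerScratch {
  int device;
  EagerScratch() : device(fake_runtime::currentDevice()) {}
};

// Deferred variant (the pattern of the fix): construction does nothing;
// the owner calls allocate() explicitly once the device has been selected.
class DeferredScratch {
public:
  void allocate(std::size_t bytes) {
    assert(!allocated && "allocate() is expected to be called only once");
    device = fake_runtime::currentDevice(); // bind to the selected device
    size = bytes;
    allocated = true;
  }
  int deviceId() const { return device; }
  std::size_t sizeBytes() const { return size; }
  bool isAllocated() const { return allocated; }

private:
  int device = -1; // -1 means "not allocated yet"
  std::size_t size = 0;
  bool allocated = false;
};

// Hypothetical simulator. C++ constructs members before the constructor
// body runs, so `eager` is already bound before setDevice() is reached.
struct Simulator {
  EagerScratch eager;
  DeferredScratch deferred;

  Simulator(int rank, int numDevices) {
    // Assumed device-selection rule, for illustration only.
    fake_runtime::setDevice(rank % numDevices);
    deferred.allocate(1024); // happens after device selection
  }
};
```

With device 0 current by default, constructing `Simulator sim(3, 4)` leaves `sim.eager` bound to device 0 while `sim.deferred` is bound to device 3, which mirrors the mismatch this PR addresses.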

This bug was introduced in #1865, which changed the scratch pad to be allocated once (as a member variable of the simulator class) rather than on demand, in order to improve performance.

Added a unit test for this case, which is executed when multiple GPUs are available.

…r we've set the device

Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>
1tnguyen added the "bug fix" label (To be listed under Bug Fixes in the release notes) on Jan 17, 2025

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

github-actions bot pushed a commit that referenced this pull request Jan 17, 2025
Collaborator

bmhowe23 left a comment

LGTM

1tnguyen merged commit 9e0b590 into NVIDIA:main on Jan 17, 2025
213 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 17, 2025
Labels
bug fix To be listed under Bug Fixes in the release notes

2 participants