[BUG] FailureCallbackResourceAdaptor pytests fail on GPUs with more than 100GiB of device memory #1733

Closed
harrism opened this issue Nov 19, 2024 · 0 comments · Fixed by #1734
Assignees: harrism
Labels: bug (Something isn't working), Python (Related to RMM Python API), tests (Related to unit tests)

Comments

harrism commented Nov 19, 2024

Describe the bug
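Two FailureCallbackResourceAdaptor pytests expect an exception when allocating 100GB of memory (1e11 bytes). On a GPU with more device memory than that, the allocation can succeed, so no exception is raised and the tests fail.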

Steps/Code to reproduce bug
Run RMM pytests on a >100GiB GPU, e.g. GH200.

Expected behavior
Tests should pass.

harrism added the bug, Python, and tests labels Nov 19, 2024
harrism self-assigned this Nov 19, 2024
rapids-bot bot pushed a commit that referenced this issue Nov 20, 2024
Fixes #1733 by querying total device memory and using twice as much in tests that are expected to fail allocation.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1734
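
For illustration, here is a minimal, hardware-free sketch of why a fixed 1e11-byte request stops failing on large GPUs and why sizing the request from total device memory avoids that. It is not the code from #1734: `FakeDevice`, `oversized_request`, `allocation_fails`, and the 80 GiB and 144 GiB capacities are all hypothetical stand-ins, and in the real tests the total would come from a device memory query rather than a constant.

```python
GIB = 1024**3

# The fixed request size the tests used before the fix (100 GB, about 93 GiB).
FIXED_REQUEST = int(1e11)


class FakeDevice:
    """Hypothetical stand-in for a GPU: a request larger than total memory fails."""

    def __init__(self, total_bytes: int) -> None:
        self.total_bytes = total_bytes

    def allocate(self, nbytes: int) -> int:
        if nbytes > self.total_bytes:
            raise MemoryError(f"cannot allocate {nbytes} bytes")
        return nbytes


def oversized_request(total_bytes: int) -> int:
    """Return a size that exceeds any device's capacity: twice its total memory."""
    return 2 * total_bytes


def allocation_fails(device: FakeDevice, nbytes: int) -> bool:
    """Return True if the allocation raises MemoryError, as the tests expect."""
    try:
        device.allocate(nbytes)
    except MemoryError:
        return True
    return False


small_gpu = FakeDevice(80 * GIB)   # assumed capacity below 1e11 bytes
large_gpu = FakeDevice(144 * GIB)  # assumed capacity above 100 GiB

# A fixed request only fails on the smaller device.
print(allocation_fails(small_gpu, FIXED_REQUEST))
print(allocation_fails(large_gpu, FIXED_REQUEST))

# A request derived from total memory fails on both devices.
print(allocation_fails(small_gpu, oversized_request(small_gpu.total_bytes)))
print(allocation_fails(large_gpu, oversized_request(large_gpu.total_bytes)))
```

Deriving the size from the device keeps the expected failure independent of how much memory future GPUs have, which a hard-coded constant cannot do.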
github-project-automation bot moved this from Todo to Done in RMM Project Board Nov 20, 2024