Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New argument compute_on_cpu #867

Merged
merged 16 commits into from
Apr 11, 2022
Merged

New argument compute_on_cpu #867

merged 16 commits into from
Apr 11, 2022

Conversation

SkafteNicki
Copy link
Member

@SkafteNicki SkafteNicki commented Feb 28, 2022

What does this PR do?

Rework of #851
Fixes #848
General solution that allows all metrics where the metric state is a list to store the state on CPU instead of GPU through a new keyword argument compute_on_cpu=True/False (default False). This can beneficial in situations where the user do not have that much VRAM. Consequence of enabling this feature is that code in compute will be executed on CPU (code in update will still be executed on GPU).

Fixes #866
Related as the MAP metric currently moves all metric states to CPU for faster computations. As the issue points out this can overload the CPU. Feature in this PR essentially also fixes this by making it an option to move to CPU by setting MeanAveragePrecision(compute_on_cpu=True).

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@SkafteNicki SkafteNicki added the enhancement New feature or request label Feb 28, 2022
@SkafteNicki SkafteNicki added this to the v0.8 milestone Feb 28, 2022
@codecov
Copy link

codecov bot commented Feb 28, 2022

Codecov Report

Merging #867 (7da164c) into master (d587be9) will increase coverage by 25%.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #867     +/-   ##
=======================================
+ Coverage      70%    95%    +25%     
=======================================
  Files         175    175             
  Lines        7468   7476      +8     
=======================================
+ Hits         5249   7102   +1853     
+ Misses       2219    374   -1845     

@mergify mergify bot added the ready label Feb 28, 2022
@maximsch2
Copy link
Contributor

A few questions:

  • If I'm reading code correctly, self.device will still be gpu with this during compute? I'm wondering if this will break some metrics.
  • The implicit assumption here is that a List state is only appended during update as otherwise we'll get device mismatches, right?
  • This doesn't help with other metrics that can occupy significant memory (e.g. binned curve metrics) on GPU as those are allocated as fixed size tensors. In that case though, I guess, a user can manually place them on CPU and transfer data to CPU before update. Is that what we would be recommending people?

@maximsch2 maximsch2 mentioned this pull request Feb 28, 2022
4 tasks
@justusschock
Copy link
Member

@maximsch2 I think the device should indicate the input and output device of a metric. E.g. even if the metic states are on cpu, it would require the devices of the inputs to be cuda and handle the device-transfer logic accordingly and the output would also be on gpu.

@mergify mergify bot added the has conflicts label Mar 1, 2022
@mergify mergify bot removed the has conflicts label Mar 1, 2022
@Borda
Copy link
Member

Borda commented Mar 20, 2022

@SkafteNicki mind resolve conflicts 🐰
@justusschock mind review? 🐟

@Borda
Copy link
Member

Borda commented Mar 28, 2022

@SkafteNicki @justusschock pls ^^ 🐰

@mergify mergify bot removed the has conflicts label Apr 11, 2022
@Borda
Copy link
Member

Borda commented Apr 11, 2022

seems there is a DDP issue... :/

@SkafteNicki
Copy link
Member Author

@Borda all unrelated to the PR, trying to run tests again :]

@SkafteNicki SkafteNicki enabled auto-merge (squash) April 11, 2022 14:36
@SkafteNicki SkafteNicki merged commit 15fb10b into master Apr 11, 2022
@SkafteNicki SkafteNicki deleted the features/store_on_cpu branch April 11, 2022 14:42
@Borda Borda added this to the v0.8 milestone May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request ready
Projects
None yet
4 participants