
Allow caching on CPU when computing some metrics. #848

Closed · kynk94 opened this issue Feb 19, 2022 · 4 comments · Fixed by #867
Labels: enhancement (New feature or request)

Comments

@kynk94

kynk94 commented Feb 19, 2022

🚀 Feature

Currently, some metrics like FID and KID are computed entirely on the GPU.
Since all extracted features are cached in GPU memory, the available hardware puts a hard limit on dataset size.
It may be helpful to add a compute_on_cpu option to these metrics, like below.

def update(self, imgs: Tensor, real: bool) -> None:  # type: ignore
    """Update the state with extracted features.

    Args:
        imgs: tensor with images fed to the feature extractor
        real: bool indicating if imgs belong to the real or the fake distribution
    """
    features = self.inception(imgs)

    if self.compute_on_cpu:
        # offload the cached features to CPU memory, so the GPU only
        # holds the feature extractor and the current batch
        features = features.detach().cpu()

    if real:
        self.real_features.append(features)
    else:
        self.fake_features.append(features)
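
From the caller's side, usage could look like the sketch below. Note that compute_on_cpu is the hypothetical flag requested in this issue, not an existing argument, and the loader is assumed to yield uint8 image tensors as FrechetInceptionDistance expects:

import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# hypothetical flag: run the extractor on GPU, cache features on CPU
fid = FrechetInceptionDistance(feature=2048, compute_on_cpu=True).to("cuda")

for real_batch, fake_batch in loader:
    fid.update(real_batch.to("cuda"), real=True)   # features cached on CPU
    fid.update(fake_batch.to("cuda"), real=False)
score = fid.compute()  # statistics computed from the CPU-cached features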

Motivation

I can compute FID on a cloud system but not on my local machine, because of this memory issue.
Currently there is only a warning about the large memory footprint (#468).

kynk94 added the enhancement (New feature or request) label Feb 19, 2022
@github-actions

Hi! Thanks for your contribution, great first issue!

@SkafteNicki
Member

Hi @kynk94,
I do not completely understand the problem here. If you initialize a metric, it will by default be on the CPU:
[screenshot: a newly initialized metric reports device cpu]

Do you want to keep the overall metric on the GPU, meaning that features = self.inception(imgs) is still calculated on the GPU, but when you store the features for later you want them to be on the CPU?
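
To make the default placement concrete, here is a minimal sketch (assuming the torchmetrics FrechetInceptionDistance class; Metric exposes a device attribute reflecting where its states live):

import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)
print(fid.device)  # cpu -- metrics start on the CPU by default

fid = fid.to("cuda")  # moving the metric also moves its cached state
print(fid.device)  # cuda:0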

@Borda
Member

Borda commented Feb 21, 2022

My understanding is that if you run training and use all of the GPU memory, it could be interesting to force/offload the metric computation to the CPU... is that correct @kynk94?

@kynk94
Author

kynk94 commented Feb 21, 2022

Yes, that's right.
When training on a GPU, the Inception features are stored in GPU memory until the metric is computed.
If you have a lot of GPUs, there is enough GPU memory.
But most local desktops do not have enough GPU memory, and CPU memory capacity is usually much larger than GPU memory.

In this case, to avoid out-of-memory errors, the features need to be cached in CPU memory and the metric computed on the CPU.
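
As a rough illustration of the cache size involved (my own back-of-the-envelope numbers, assuming the default 2048-dim Inception features stored as float32):

# feature cache size per distribution (illustrative numbers)
n_images = 100_000          # images per distribution (assumed)
feature_dim = 2048          # Inception pool3 feature size
bytes_per_float = 4         # float32

gib = n_images * feature_dim * bytes_per_float / 1024**3
print(f"{gib:.2f} GiB per distribution")  # ~0.76 GiB, ~1.5 GiB for real + fake

On a consumer GPU that is also holding the model and activations, this cache can tip a run into out-of-memory, while the same tensors fit comfortably in typical system RAM.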

SkafteNicki added this to the v0.8 milestone Mar 23, 2022
Borda added this to the v0.8 milestone May 5, 2022