
[RPG] Regarding the accuracy measurement #140

Open
Acasia opened this issue Jun 26, 2024 · 4 comments
@Acasia

Acasia commented Jun 26, 2024

#139 (comment)

I created a new issue because I cannot reopen the existing one.

Hello. As mentioned in the link above, I successfully reproduced the CIFAR-10 results using the best accuracy, and I confirmed that it matches the accuracy reported in the paper's table. However, isn't the best-accuracy checkpoint saved at a sparsity different from the target sparsity? If using the best accuracy is correct, could you explain why it was used instead of the last-epoch accuracy?

@YuchuanTian
Collaborator

Thank you very much for raising this question. Let me check... My intuition was that ImageNet accuracy fluctuates by around 0.1% over the last several epochs, so I wrote code that simply reads the value of the 'best_prec1' key from the saved checkpoint as the result, and I followed the same convention for CIFAR-10. This worked perfectly well for ImageNet, but it did not occur to me that the practice could cause problems on CIFAR.

I will re-implement these experiments and try to figure the problem out.
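(For readers following along: the pitfall discussed above can be sketched in a few lines. This is a hypothetical illustration with made-up numbers, not the RPG codebase. Under gradual pruning, the network is denser in early epochs, so its validation accuracy can exceed that of the final fully sparse model; picking the checkpoint with the highest accuracy over all epochs can then report a model at the wrong sparsity.)

```python
# Each record is (epoch, current_sparsity, val_accuracy) -- illustrative values only.
log = [
    (10, 0.50, 92.0),    # still fairly dense, so accuracy is high
    (50, 0.95, 90.5),
    (100, 0.995, 88.4),  # target sparsity reached here
    (150, 0.995, 88.9),
    (160, 0.995, 88.7),  # last epoch
]

target = 0.995

# Best over all epochs: picks an under-sparsified model.
best_any = max(log, key=lambda r: r[2])

# Safer: best among epochs that have reached the target sparsity.
best_at_target = max((r for r in log if r[1] >= target), key=lambda r: r[2])

# Alternative convention: simply report the last epoch.
last = log[-1]

print(best_any)        # (10, 0.5, 92.0)   -- wrong sparsity
print(best_at_target)  # (150, 0.995, 88.9)
print(last)            # (160, 0.995, 88.7)
```

The gap between `best_any` and the other two values is exactly the discrepancy raised in this issue: at moderate sparsities the best epoch usually occurs after the target sparsity is reached, so the bug is invisible, but at extreme sparsities (99.5%, 99.9%) the dense early epochs can dominate.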

@YuchuanTian
Collaborator

I checked the experiments as follows:
CIFAR-10 VGG19 experiments are correct at all sparsities;
CIFAR-10 ResNet32 experiments are problematic at sparsity 99.5% and 99.9%;
ImageNet ResNet-50 experiments are correct at all sparsities.

The following modifications will be made in the following couple of days:

  1. Update the code to fix this bug;
  2. Update ResNet-32 experiment results at sparsity 99.5% and 99.9%.

I really appreciate your effort in figuring this bug out!

@Acasia
Author

Acasia commented Jul 1, 2024

Thank you for answering my question and confirming it with your own experiments.
I will look forward to your code update.
Thank you.

@YuchuanTian
Collaborator

YuchuanTian commented Jul 17, 2024

Hi Acasia,

Sorry for the late reply! Here are some updates after a thorough check:

  1. The code has been updated; a patch has been added to resolve the problem.
  2. The latest table after correction is as follows:
| ResNet-32 Sparsity | 99% | 99.5% | 99.9% |
| --- | --- | --- | --- |
| ProbMask (Official) | 91.79 | 89.34 | 76.87 |
| ProbMask (Our replication) | 91.45 | 88.44 | 76.41 |
| AC/DC (Our replication) | 90.86 | 87.58 | 16.70 |
| RPG (Ours) | 91.61 | 89.13 | 71.09 |

Here are some additional comments:

  1. We are having difficulty replicating the ProbMask CIFAR-10 results. We used the official ProbMask code and tried various settings (batchsize=[128,256]; with or without pretrained weight loading), but a small gap with the official results remains. I guess the gap could be attributed to different device types (we use an Nvidia Titan Xp for CIFAR).
  2. We also updated the AC/DC results, because they were replicated with the same codebase and were affected by the same bug.
  3. RPG outperforms ProbMask at most sparsities, except at the extreme sparsity of 99.9% on ResNet-32. One hypothesis is that a learnable soft mask can avoid weight/channel collapse when very few weights are retained. (It is also notable that ProbMask's training is 20 epochs longer than ours, because it includes an additional 20-epoch finetuning stage.)
  4. The manuscripts will be updated shortly.

Thanks again for raising this issue! I hope my response helps.
