Is the CNN version code trained well? #2
Hi
Thanks for the kind comments. The model I have trains correctly but slowly. Did
you alter any of the parameters, including the batch size?
Also, the quality of the generated images is poor without training for long
(200 epochs in the original paper), even though visual inspection of some of
the dimensions seems correct.
Jarrel
…On Fri., 17 May 2019, 12:38 shimazing, ***@***.***> wrote:
Your code is really helpful to understand how iResNet works.
Thanks for writing this code.
However, when I tried to run the CNN version of the code in the Jupyter notebook,
it gave me the wrong result: after a few iterations, the model
cannot even reconstruct the inputs, and the latent standard of the test data
diverges.
Did you get the right result?
Thanks in advance for your reply
|
My conjecture is that the optimization step makes the spectral norm larger than 1, and your code uses the sigma calculated in the training phase to normalize it. This changes the weight in the test phase, which I do not think is correct. One more question: does this code really do an in-place update for u? And what do you mean by "v" in the comment under Hajin Shim |
Actually, you are on the right path. I have just checked the code, and it uses an older, incorrect version of the spectral normalization by Gouk. Specifically, it underestimates the largest singular value because I use too small an x_i (https://arxiv.org/pdf/1804.04368.pdf, for my future reference). I actually made this change a while back but did not upload the corrected file, so my apologies.

P.S. You can ignore the comment under compute_weight; it was copied from an earlier implementation of Miyato's spectral norm that uses u and v vectors.

As to whether using the sigma calculated in the training phase is valid in the testing phase, that is a good point. In theory the weight should not change during the testing phase (since no weight update is performed), and sigma depends solely on the weight, so it should not change either. In practice sigma is somewhat variable, as the power iteration method only gives a bounded estimate, so I am unclear whether recalculating sigma during the testing phase will change the result. Try the updated version first and see whether it works. |
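For reference, here is a minimal NumPy sketch of the idea discussed above: estimating the largest singular value by power iteration and rescaling the weight only when the estimate exceeds a bound. It is not the repository's code. The function names `estimate_sigma` and `clamp_spectral_norm` are hypothetical, and reading `magnitude` as the target upper bound on the spectral norm is an assumption. Because `u` and `v` are unit vectors, the estimate `u @ W @ v` can never exceed the true largest singular value, which is why too few iterations underestimate sigma.

```python
import numpy as np


def estimate_sigma(weight, n_power_iterations=5, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(weight.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_power_iterations):
        # Alternate between the left (u) and right (v) singular vector estimates.
        u = weight @ v
        u /= np.linalg.norm(u) + 1e-12
        v = weight.T @ u
        v /= np.linalg.norm(v) + 1e-12
    # With unit u and v this value is a lower bound on the true sigma.
    return float(u @ weight @ v)


def clamp_spectral_norm(weight, magnitude=0.9, n_power_iterations=5):
    """Rescale weight only if its estimated spectral norm exceeds `magnitude` (assumed meaning)."""
    sigma = estimate_sigma(weight, n_power_iterations)
    factor = max(1.0, sigma / magnitude)
    return weight / factor, sigma


rng = np.random.default_rng(1)
W = rng.standard_normal((10, 10))
exact = np.linalg.svd(W, compute_uv=False)[0]
for n in (1, 5, 50):
    # The estimate approaches the exact value from below as n grows.
    print(n, estimate_sigma(W, n), exact)
```

Since the estimate is a lower bound, a weight rescaled with too few iterations can still have a spectral norm slightly above the intended bound.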
Thanks for updating!! :) However, I still have a problem and have a question. Thanks again for your fast reply :) |
weight_orig is the original weight and the actual parameter that undergoes gradient descent. This is the same approach used in the PyTorch implementation of Miyato's spectral_norm (in fact it is shamelessly copied, including comments...): https://pytorch.org/docs/stable/_modules/torch/nn/utils/spectral_norm.html So when the Conv2d runs, it requests module.weight, which is the recomputed tensor. When gradient descent runs and weight_orig is altered, weight is recomputed by estimating the sigma of weight_orig and dividing weight_orig by sigma if sigma is larger than 1. |
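The `weight_orig` / `weight` split can be observed directly with PyTorch's built-in `torch.nn.utils.spectral_norm`, which the comment above says the repository's code was modelled on. This is a sketch using the built-in (Miyato-style) wrapper, not the repository's `SpectralNormGouk`: the built-in always divides by sigma, whereas the Gouk-style variant described above divides only when sigma exceeds 1.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

torch.manual_seed(0)
layer = spectral_norm(nn.Linear(10, 10, bias=False), n_power_iterations=5)

# Only weight_orig is a trainable parameter; weight is recomputed before each forward.
param_names = [name for name, _ in layer.named_parameters()]
buffer_names = [name for name, _ in layer.named_buffers()]
print(param_names, buffer_names)

x = torch.ones(4, 10)
layer.train()
for _ in range(10):
    layer(x)  # each training-mode forward runs power iteration and rebuilds weight

# Largest singular value of the normalized weight; expected to be close to 1.
top_singular = torch.linalg.svdvals(layer.weight.detach())[0].item()
weight_before = layer.weight.detach().clone()

# One optimizer step changes weight_orig; the next forward rebuilds weight from it.
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = (layer(x) - 2 * x).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()
layer(x)
weight_after = layer.weight.detach().clone()
print(top_singular, torch.equal(weight_before, weight_after))
```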
I ran into a similar problem. I am writing classification code based on the "SpectralNormGouk.py" file. However, the test loss increased and the test accuracy dropped to around 10%, while the training loss and accuracy looked fine. Besides, when I checked the trained model (i.e., loaded the state dict), the loss differed a lot from the values printed during training. |
@lingzenan Do you run the code with DataParallel?? |
@jarrelscy I still have a problem even with the updated version. Have you run the code with DataParallel? |
@shimazing yes |
I did not use DataParallel. I'm not sure how the actnorm would behave with
DataParallel; you may have to run a test batch before copying the model to
other GPUs, or the batch statistics may be wrong.
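To illustrate the suggestion above, here is a minimal sketch of data-dependent initialization followed by a warm-up batch before any multi-GPU wrapping. The `ActNorm` class below is a hypothetical, simplified stand-in and not the repository's implementation; the `initialized` buffer and the `warmup` batch are likewise assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ActNorm(nn.Module):
    """Hypothetical minimal ActNorm: per-feature scale and bias set from the first batch."""

    def __init__(self, num_features):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("initialized", torch.tensor(False))

    def forward(self, x):
        if not self.initialized:
            # Data-dependent init: make the first batch zero-mean, unit-variance.
            with torch.no_grad():
                mean = x.mean(dim=0)
                std = x.std(dim=0) + 1e-6
                self.bias.copy_(-mean / std)
                self.log_scale.copy_(-std.log())
                self.initialized.fill_(True)
        return x * self.log_scale.exp() + self.bias


torch.manual_seed(0)
model = nn.Sequential(ActNorm(10), nn.Linear(10, 10))

# Run one warm-up batch on the plain model so the statistics come from a full batch.
warmup = torch.randn(64, 10) * 3 + 2
with torch.no_grad():
    model(warmup)

# Only wrap after initialization, so replicas start from the initialized values.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model.cuda())
```

Without the warm-up pass, each replica would see only its own shard of the first batch when initializing, so the statistics could differ from those of the full batch.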
|
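@jarrelscy The problem still exists without DataParallel. Here is a toy example (the full script is quoted in the reply below): a spectral-normalized nn.Linear is trained, saved with torch.save, evaluated, and then reloaded into a fresh model. It ends with:
"AttributeError: 'Linear' object has no attribute 'sigma' " |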
Hi Zenan,
Saving and loading is not implemented yet.
Jarrel
…On Mon., 20 May 2019, 07:15 Zenan Ling, ***@***.***> wrote:
@jarrelscy The problem still exists without DataParallel. Here is a toy example.

```python
import torch
import torch.nn as nn
from SpectralNormGouk1 import *
from torch.optim import *


class toy(nn.Module):
    def __init__(self):
        super(toy, self).__init__()
        # Single spectral-normalized linear layer
        self.f = spectral_norm(nn.Linear(10, 10, bias=False), magnitude=0.9,
                               n_power_iterations=5)

    def forward(self, x):
        x = self.f(x)
        return x


if __name__ == "__main__":
    net = toy()
    opt = Adam(net.parameters(), lr=0.01)
    criterion = nn.MSELoss()

    # Train the layer to reproduce its input
    for i in range(1000):
        net.train()
        opt.zero_grad()
        inputs = torch.ones(32, 10)
        y = net(inputs)
        loss = criterion(y, inputs)
        loss.backward()
        opt.step()
        print(loss.item())
    torch.save(net.state_dict(), 'check.pkl')

    # Evaluate the trained model in eval mode
    print("########eval###########")
    net.eval()
    with torch.no_grad():
        inputs = torch.ones(32, 10)
        y = net(inputs)
        loss = criterion(y, inputs)
        print(loss.item())

    # Reload the saved state dict into a fresh model and evaluate again
    print("########eval_check###########")
    net_ = toy()
    state = torch.load('check.pkl')
    net_.load_state_dict(state)
    net_.eval()
    with torch.no_grad():
        inputs = torch.ones(32, 10)
        y = net_(inputs)
        loss = criterion(y, inputs)
        print(loss.item())
```

"AttributeError: 'Linear' object has no attribute 'sigma' "
|
@jarrelscy Thanks for your reply. |
@jarrelscy The test loss and accuracy seem to be normal if I use "net.train()" and "with torch.no_grad()" during the test phase. |
Interesting. Maybe we should be recalculating sigma during test time, then.
|
I also found it runs correctly with a single gpu and net.train() under torch.no_grad(). |
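The workaround reported above (training mode combined with `torch.no_grad()`) can be wrapped in a small helper. This is a sketch only: `evaluate_with_fresh_sigma` is a hypothetical name, and the demonstration uses PyTorch's built-in `torch.nn.utils.spectral_norm`, where power iteration runs only when `module.training` is true. Whether the repository's `SpectralNormGouk` behaves the same way is an assumption.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


def evaluate_with_fresh_sigma(model, inputs):
    """Forward pass with power iteration enabled but gradients disabled."""
    was_training = model.training
    model.train()  # enables the power-iteration update in the forward pre-hook
    try:
        with torch.no_grad():
            return model(inputs)
    finally:
        model.train(was_training)  # restore the previous mode


torch.manual_seed(0)
layer = spectral_norm(nn.Linear(10, 10, bias=False), n_power_iterations=1)
x = torch.ones(4, 10)

# In eval mode the stored u vector is reused as-is.
layer.eval()
u_initial = layer.weight_u.clone()
with torch.no_grad():
    layer(x)
u_after_eval = layer.weight_u.clone()

# With the helper, u (and therefore the sigma estimate) is refreshed.
out = evaluate_with_fresh_sigma(layer, x)
u_after_helper = layer.weight_u.clone()
print(torch.equal(u_initial, u_after_eval), torch.equal(u_initial, u_after_helper))
```

Note that `model.train()` also switches layers such as dropout and batch normalization to their training behaviour, so this helper is only safe for models where that side effect is acceptable.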
@shimazing @jarrelscy Did you train the classification model? The authors released the code in the latest version of the paper, but the link returns a 404 now. |
My classification net doesn't work on a single GPU; the loss explodes. |
I have yet to train the classification model.
|
Your code is really helpful to understand how iResNet works.
Thanks for writing this code.
However, when I tried to run the CNN version of the code in the Jupyter notebook,
it gave me the wrong result in the evaluation phase (when activating evaluation mode with net.eval()): after a few iterations, the model cannot even reconstruct the inputs, and the latent standard of the test data diverges. (I am using DataParallel. Do you think the problem comes from this?)
Did you get the right result?
Thanks in advance for your reply