Which model will I get if I run training with python main.py --train? #16

Open
liqilei opened this issue Sep 11, 2020 · 8 comments


liqilei commented Sep 11, 2020

Hi, thanks for your meaningful work.
If I do the following, which model will I get?

  1. Create the training set with MainSR to get train_MZSR.tfrecord (7.09 GB);
  2. Run python main.py --train --gpu 0 --trial 0 --step 0.

I trained for 27522 iterations, but when I tested the trained model on Input/g20/Set5/, I got PSNR=33.6947 and SSIM=0.9265.

Can you give me some suggestions?

Author

liqilei commented Sep 12, 2020

I noticed that you have released four models:

  1. Bicubicx2;
  2. Directx2;
  3. Directx4;
  4. Multi-scale.

If I want to train the network to reproduce each of these models, could you please tell me what I should modify?

Owner

JWSoh commented Sep 15, 2020

With the default settings, you will get model 2 (Directx2).

  1. To obtain the Bicubic model, change line 30 of dataGenerator.py by adding the bicubic downscaling option, as below.
    clean_img_LR=imresize(img_HR,scale=1./scale, kernel=Kernel, ds_method='bicubic')

  2. For Directx4, change the scaling factor from 2 to 4 in config.py.

  3. For the Multi-scale model, change SCALE_LIST in config.py.

For the second issue, try training until it converges. I don't think 27522 iterations is enough for convergence.

Author

liqilei commented Sep 15, 2020

Thanks for your email. I also trained a Bicubic x2 model by adding ds_method='bicubic' and ran it for 100000 iterations, but the performance is still below the reported numbers. Could the dataset be the cause? I generated the DIV2K training set with the MainSR code, and the file size is 7.09 GB. Is that right?

Author

liqilei commented Sep 15, 2020

Here is the config for x2 Bicubic with 100000 iterations.

clean_img_LR=imresize(img_HR,scale=1./scale, kernel=Kernel, ds_method='bicubic')

python main.py --train --gpu 0 --trial 0 --step 0

After training for 100000 iterations, I tested the trained model on ‘g13_bic’ with the corresponding kernel for x2. I got the following results (PSNR / SSIM):

Mine:
1 update: 33.7713/ 0.9234
10 updates: 36.4782/ 0.9489

Official:
1 update: 35.18/ 0.9430
10 updates: 36.64/ 0.9498
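
To make the comparison concrete, the gap between the two sets of numbers can be computed directly. This is only an arithmetic sketch over the values quoted above; the function and variable names are illustrative and not part of the repository.

```python
# Values copied from the comment above: (PSNR in dB, SSIM) on g13_bic, x2.
mine = {1: (33.7713, 0.9234), 10: (36.4782, 0.9489)}
official = {1: (35.18, 0.9430), 10: (36.64, 0.9498)}


def gaps(reproduced, reference):
    """Return {updates: (psnr_gap, ssim_gap)}, computed as reference minus reproduced."""
    return {
        k: (
            round(reference[k][0] - reproduced[k][0], 4),
            round(reference[k][1] - reproduced[k][1], 4),
        )
        for k in reproduced
    }


result = gaps(mine, official)
for updates, (d_psnr, d_ssim) in result.items():
    print(f"{updates} update(s): PSNR gap {d_psnr:.2f} dB, SSIM gap {d_ssim:.4f}")
```

The shortfall is concentrated at 1 update (about 1.41 dB) and mostly closes after 10 updates (about 0.16 dB).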

Owner

JWSoh commented Sep 15, 2020

We used slightly different options to generate our training dataset, which is 3.37 GB.

I found that our previous code used the condition
if gradients(patch_l.astype(np.float64)/255.) >= 0.005 and np.var(patch_l.astype(np.float64)/255.) >= 0.03:
when generating the tfrecord dataset.

We changed the condition after extensive experiments in which the new one improved performance.

I don't think this is what causes the 2 dB PSNR gap.
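
For readers unfamiliar with this kind of patch filter, here is a minimal sketch of how such a condition rejects flat patches. The thread does not show how `gradients` is defined, so the version below (mean absolute finite difference) is a hypothetical stand-in, and `keep_patch` is an illustrative name rather than a function from the repository. Only the two thresholds come from the comment above.

```python
import numpy as np


def gradients(x):
    """Hypothetical stand-in: mean absolute finite difference over both axes."""
    gy = np.abs(np.diff(x, axis=0)).mean()
    gx = np.abs(np.diff(x, axis=1)).mean()
    return (gx + gy) / 2.0


def keep_patch(patch_l, grad_thresh=0.005, var_thresh=0.03):
    """Keep a patch only if it has enough edge content and enough variance."""
    p = patch_l.astype(np.float64) / 255.
    return bool(gradients(p) >= grad_thresh and np.var(p) >= var_thresh)


# A constant patch has zero gradient and zero variance, so it is rejected.
flat = np.full((48, 48), 128, dtype=np.uint8)
# A checkerboard has strong gradients and high variance, so it is kept.
checker = ((np.indices((48, 48)).sum(axis=0) % 2) * 255).astype(np.uint8)

print(keep_patch(flat), keep_patch(checker))
```

A stricter filter of this kind discards more patches, which is consistent with the smaller dataset size mentioned above.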

Author

liqilei commented Sep 15, 2020

Thanks for your comment. With the condition if gradients(patch_l.astype(np.float64)/255.) >= 0.005 and np.var(patch_l.astype(np.float64)/255.) >= 0.03, I do get a 3.37 GB dataset. I will train the model on this dataset, check the performance, and share the results with you.

@BassantTolba1234

@liqilei, I'm running into a problem when loading the pretrained model, specifically when it reads the checkpoint.
This is the error. How did you solve it?

NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]

@BassantTolba1234

@JWSoh, I'm running into a problem when loading the pretrained model, specifically when it reads the checkpoint.
This is the error. How can I solve it?

NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]
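
The thread does not contain an answer to this error. One common cause of a missing key like MODEL/conv7/kernel/Adam_3 is that the graph being restored contains Adam optimizer variables that were never saved in the checkpoint. Under that assumption, a possible workaround is to restore only the model weights. The helper below is a hypothetical sketch (its name and the marker list are not from the repository); it only separates variable names, and the TensorFlow 1.x calls are shown as comments.

```python
def split_optimizer_vars(var_names, markers=("Adam", "beta1_power", "beta2_power")):
    """Split variable names into (model_vars, optimizer_vars).

    A name counts as an optimizer variable when its last path component
    starts with one of the markers, e.g. 'MODEL/conv7/kernel/Adam_3'.
    """
    model, optimizer = [], []
    for name in var_names:
        leaf = name.split(":")[0].rsplit("/", 1)[-1]
        if any(leaf.startswith(m) for m in markers):
            optimizer.append(name)
        else:
            model.append(name)
    return model, optimizer


names = [
    "MODEL/conv7/kernel:0",
    "MODEL/conv7/kernel/Adam:0",
    "MODEL/conv7/kernel/Adam_3:0",
    "MODEL/conv7/bias:0",
    "beta1_power:0",
]
model_names, optimizer_names = split_optimizer_vars(names)
print(model_names)

# With TensorFlow 1.x graph-mode APIs (an assumption, not shown in the thread):
#   keep = set(model_names)
#   model_vars = [v for v in tf.global_variables() if v.name in keep]
#   saver = tf.train.Saver(var_list=model_vars)
#   saver.restore(sess, checkpoint_path)  # checkpoint_path is a placeholder
```

Whether this applies depends on how the restoring script builds its graph, so it is worth listing the keys actually stored in the checkpoint first.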
