Which model will get if I run train python main.py --train #16

liqilei · 2020-09-11T13:04:33Z

Hi, thanks for your meaningful work.
I want to ask: if I do the following operations, which model will I have?

I create the training set with MainSR to get train_MZSR.tfrecord (7.09 GB);
run python main.py --train --gpu 0 --trial 0 --step 0.

I trained for 27522 iters, but when I test the trained model on Input/g20/Set5/, I get PSNR=33.6947 and SSIM=0.9265.

Can you give me some suggestions?

liqilei · 2020-09-12T02:02:52Z

I noticed that you have released four models:

Bicubicx2;
Directx2;
Directx4;
Multi-scale.

If I want to train the network to get these models, could you please tell me what I should modify?

JWSoh · 2020-09-15T07:04:26Z

With the default settings, you will get model 2, Directx2.

To obtain the Bicubic model, change line 30 of dataGenerator.py by adding the bicubic downscaling option as below.
clean_img_LR=imresize(img_HR,scale=1./scale, kernel=Kernel, ds_method='bicubic')
For Directx4, change the scaling factors from 2 to 4 in config.py.
For the Multi-scale model, change SCALE_LIST in config.py.

For the second issue, try training until it converges. I don't think 27522 iterations are enough for convergence.

liqilei · 2020-09-15T07:08:15Z

Thanks for your email. I also tried to train another Bicubic x2 model by adding ds_method='bicubic' and trained for 100000 iters, but the performance is still not as reported. I wonder if this is because of the dataset? I generated the DIV2K training set using the MainSR code, and the file size is 7.09 GB. Is this right?

liqilei · 2020-09-15T07:37:12Z

Here is the config for x2 Bicubic with 100000 iters.

clean_img_LR=imresize(img_HR,scale=1./scale, kernel=Kernel, ds_method='bicubic')

python main.py --train --gpu 0 --trial 0 --step 0

After training for 100000 iters, I tested the trained model on 'g13_bic' with the corresponding kernel for x2. I got the following results (PSNR/SSIM):

Mine:
1 update: 33.7713/ 0.9234
10 updates: 36.4782/ 0.9489

Official:
1 update: 35.18/ 0.9430
10 updates: 36.64/ 0.9498
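
To put the numbers above in perspective, the gap to the official results can be computed directly. The sketch below only redoes the arithmetic on the figures quoted in this comment. The variable and function names are illustrative, and the conversion from a PSNR gap to an MSE ratio assumes the standard definition PSNR = 10 log10(MAX^2 / MSE).

```python
# Results quoted above as (PSNR in dB, SSIM).
mine = {"1 update": (33.7713, 0.9234), "10 updates": (36.4782, 0.9489)}
official = {"1 update": (35.18, 0.9430), "10 updates": (36.64, 0.9498)}


def gaps(official, mine):
    """Return {setting: (PSNR gap in dB, SSIM gap, implied MSE ratio)}."""
    out = {}
    for key in official:
        d_psnr = official[key][0] - mine[key][0]
        d_ssim = official[key][1] - mine[key][1]
        # With PSNR = 10*log10(MAX^2 / MSE), a gap of d dB means
        # MSE_mine / MSE_official = 10 ** (d / 10).
        mse_ratio = 10 ** (d_psnr / 10)
        out[key] = (d_psnr, d_ssim, mse_ratio)
    return out


result = gaps(official, mine)
for key, (d_psnr, d_ssim, ratio) in result.items():
    print(f"{key}: PSNR gap {d_psnr:.4f} dB, SSIM gap {d_ssim:.4f}, MSE ratio {ratio:.3f}")
```

On these figures the PSNR gap is about 1.41 dB after 1 update and about 0.16 dB after 10 updates, so the difference is much larger after 1 update than after 10.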

JWSoh · 2020-09-15T07:43:37Z

Our training dataset was generated with slightly different options and is 3.37 GB.

I found that, in our previous code, we used the condition
if gradients(patch_l.astype(np.float64)/255.) >= 0.005 and np.var(patch_l.astype(np.float64)/255.) >= 0.03:
when generating the tfrecord dataset.

We changed the option because, in our extensive experiments, the new option improved performance.

I don't think this is what causes the PSNR gap of 2 dB.
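
The patch-selection condition quoted above can be illustrated in isolation. This is a minimal sketch, not the repository's code: the `gradients` helper below is an assumed definition (mean squared finite difference), and `keep_patch` and the two demo patches are hypothetical. Only the two thresholds and the division by 255 come from the comment.

```python
import numpy as np


def gradients(x):
    """Assumed definition: mean squared forward difference along both axes."""
    dy = x[:-1, :-1] - x[1:, :-1]
    dx = x[:-1, :-1] - x[:-1, 1:]
    return np.mean(dy ** 2 + dx ** 2)


def keep_patch(patch_l, grad_thresh=0.005, var_thresh=0.03):
    """Apply the quoted condition to a uint8 patch (hypothetical wrapper)."""
    p = patch_l.astype(np.float64) / 255.
    return bool(gradients(p) >= grad_thresh and np.var(p) >= var_thresh)


# A flat gray patch: zero gradient and zero variance.
flat = np.full((16, 16, 3), 128, dtype=np.uint8)

# A black/white checkerboard patch: strong gradients and high variance.
ys, xs = np.indices((16, 16))
checker = (((ys + xs) % 2) * 255).astype(np.uint8)
checker = np.repeat(checker[:, :, None], 3, axis=2)

print(keep_patch(flat), keep_patch(checker))
```

Under these assumptions the flat patch is rejected and the textured one is kept, which is the kind of filtering that changes how many patches end up in the tfrecord file.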

liqilei · 2020-09-15T07:58:25Z

Thanks for your comment. With the condition if gradients(patch_l.astype(np.float64)/255.) >= 0.005 and np.var(patch_l.astype(np.float64)/255.) >= 0.03, I do get a 3.37 GB dataset. I will try to train the model using this dataset, see how it performs, and share the results with you.

BassantTolba1234 · 2020-12-21T09:02:00Z

@liqilei, I'm facing a problem when I load the pretrained model, specifically when it reads the checkpoint.
This is the error. How did you solve it, please?

NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]

BassantTolba1234 · 2020-12-21T09:02:29Z

@JWSoh, I'm facing a problem when I load the pretrained model, specifically when it reads the checkpoint.
This is the error. How did you solve it, please?

NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]
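
The error above says the restore step is asking the checkpoint for a variable named MODEL/conv7/kernel/Adam_3, which looks like an Adam optimizer slot, and the checkpoint does not contain it. One common workaround, offered here as an assumption rather than as the authors' fix, is to restore only the variables that the checkpoint actually contains. The sketch below shows just the name-matching step in plain Python; `split_restorable` and every example name other than the one in the error are hypothetical. In TensorFlow 1.x the checkpoint names could come from tf.train.list_variables, and the matched variables could be passed to tf.train.Saver through its var_list argument.

```python
def split_restorable(graph_names, checkpoint_names):
    """Split graph variable names into those found in the checkpoint and those missing."""
    available = set(checkpoint_names)
    restorable = [n for n in graph_names if n in available]
    missing = [n for n in graph_names if n not in available]
    return restorable, missing


# Hypothetical name lists for illustration; only the Adam_3 key is taken from the error.
graph_names = [
    "MODEL/conv7/kernel",
    "MODEL/conv7/bias",
    "MODEL/conv7/kernel/Adam_3",
]
checkpoint_names = ["MODEL/conv7/kernel", "MODEL/conv7/bias"]

restorable, missing = split_restorable(graph_names, checkpoint_names)
print("restore from checkpoint:", restorable)
print("skip and initialize instead:", missing)
```

Whether this applies depends on how the graph and the saver are built in the script being run, so the missing names are worth checking before changing anything.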
