about the image size #4

Open
TobyTongg opened this issue May 28, 2022 · 8 comments

Comments

@TobyTongg

Hi, is the resolution of all input images 352x352?

@thograce
Owner

> Hi, is the resolution of all input images 352x352?

The images in the training and testing sets have different resolutions, but after loading they are all resized to 352x352, so the model input has shape Bx3x352x352.
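As a rough check of what that input size means deeper in the network, here is a minimal sketch of the spatial size after each downsampling step. It assumes the backbone follows the standard ResNet layout (7x7 stride-2 stem convolution, 3x3 stride-2 max pool, then one stride-2 step in each of layer2 to layer4); the function names `conv_out` and `resnet_stage_sizes` are illustrative and not taken from this repository.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one spatial axis of a conv/pool layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_stage_sizes(size: int) -> dict:
    """Spatial size after each stage, assuming a standard ResNet layout."""
    sizes = {}
    size = conv_out(size, kernel=7, stride=2, padding=3)  # stem convolution
    sizes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)  # max pool
    sizes["layer1"] = size  # layer1 keeps the resolution of the pooled map
    for name in ("layer2", "layer3", "layer4"):
        # Each later stage halves the resolution once (assumed stride-2 3x3 conv).
        size = conv_out(size, kernel=3, stride=2, padding=1)
        sizes[name] = size
    return sizes


print(resnet_stage_sizes(352))
# {'conv1': 176, 'layer1': 88, 'layer2': 44, 'layer3': 22, 'layer4': 11}
```

Under these assumptions a 352x352 input gives an 11x11 map at layer4, i.e. an overall stride of 32.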

@TobyTongg
Author

But why does the resolution of the image change during training? For example, x4 (the output of resnet_layer4) should always have shape (batchsize, 2048, 11, 11), but during training I also saw x4 with shape (batchsize, 2048, 8, 8), and I didn't change any parameters.

@TobyTongg
Author

OK, thanks for your help!

@thograce
Owner

> OK, thanks for your help!

My last reply was wrong; I just remembered the reason. I use multi-scale input instead of data augmentation, which is also explained in the paper. Each image is scaled by three ratios, [0.75, 1, 1.25], and the resulting size is kept an integer multiple of 32, so in one epoch each image is trained at three sizes: [256x256, 352x352, 448x448].

@thograce
Owner

> OK, thanks for your help!

During the testing stage, all input images are 352x352.

@TobyTongg
Author

I understand, thanks a lot!

@thograce
Owner

> OK, thanks for your help!

> My last reply was wrong; I just remembered the reason. I use multi-scale input instead of data augmentation, which is also explained in the paper. Each image is scaled by three ratios, [0.75, 1, 1.25], and the resulting size is kept an integer multiple of 32, so in one epoch each image is trained at three sizes: [256x256, 352x352, 448x448].

Correction: the three training sizes are [256x256, 352x352, 416x416].
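The arithmetic behind those sizes can be sketched as follows. This is an illustration only: it assumes the scaled size is rounded down to a multiple of 32, which reproduces the corrected sizes (rounding to the nearest multiple would give 448 for the 1.25 ratio), and that layer4 has an overall stride of 32. The names `scaled_size`, `BASE_SIZE`, `SCALES` and `MULTIPLE` are hypothetical and not taken from the training code.

```python
BASE_SIZE = 352              # resolution images are resized to after loading
SCALES = (0.75, 1.0, 1.25)   # the three scaling ratios from the reply above
MULTIPLE = 32                # sizes are kept an integer multiple of this


def scaled_size(base: int, scale: float, multiple: int = MULTIPLE) -> int:
    """Scale `base`, then round down to a multiple of `multiple` (assumed rounding)."""
    return int(base * scale) // multiple * multiple


train_sizes = [scaled_size(BASE_SIZE, s) for s in SCALES]
# Side length of the layer4 feature map, assuming an overall stride of 32.
layer4_sizes = [size // MULTIPLE for size in train_sizes]

print(train_sizes)   # [256, 352, 416]
print(layer4_sizes)  # [8, 11, 13]
```

Under these assumptions, x4 is 8x8 for the 0.75 scale, 11x11 for the original scale, and 13x13 for the 1.25 scale, which matches the (batchsize, 2048, 8, 8) shape reported above.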

@thograce
Owner

> I understand, thanks a lot!

You're welcome! It's a good question.
