About training #36

Open
long123524 opened this issue Jul 3, 2021 · 12 comments
Comments

@long123524

I trained on my own dataset (a binary classification task) and got the result shown below. How can I improve it?
image

@xwjabc
Owner

xwjabc commented Jul 5, 2021

It looks a bit weird. Did you use the VGG pre-trained weights?

@long123524
Author

I didn't use them. Do I have to use pre-trained weights? My input images have 6 bands, so loading the pre-trained weights raises an error.

@xwjabc
Owner

xwjabc commented Jul 5, 2021

> I didn't use them. Do I have to use pre-trained weights? My input images have 6 bands, so loading the pre-trained weights raises an error.

Hi @long123524,
By 6 bands, do you mean your input has 6 channels? If so, you can try loading the pre-trained weights for every layer of the VGG backbone except the first.
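A minimal sketch of what "load everything except the first layer" could look like, assuming the checkpoint is a plain name-to-array mapping. The key names (`conv1_1.weight` and so on) and the helper `filter_pretrained` are hypothetical and may not match the actual weight file; NumPy arrays stand in for real tensors.

```python
import numpy as np

def filter_pretrained(pretrained, model_shapes, skip=("conv1_1.weight",)):
    """Keep only the pre-trained entries that the model can accept.

    An entry is dropped when its name is listed in `skip`, when the model
    has no parameter with that name, or when the shapes disagree.
    """
    kept, dropped = {}, []
    for name, value in pretrained.items():
        if name in skip or model_shapes.get(name) != value.shape:
            dropped.append(name)
        else:
            kept[name] = value
    return kept, dropped

# Toy stand-ins (hypothetical names) for a 3-channel checkpoint
# and the parameter shapes of a 6-channel model.
pretrained = {
    "conv1_1.weight": np.zeros((64, 3, 3, 3)),
    "conv1_1.bias": np.zeros(64),
    "conv1_2.weight": np.zeros((64, 64, 3, 3)),
}
model_shapes = {
    "conv1_1.weight": (64, 6, 3, 3),
    "conv1_1.bias": (64,),
    "conv1_2.weight": (64, 64, 3, 3),
}

kept, dropped = filter_pretrained(pretrained, model_shapes)
print(sorted(kept))  # ['conv1_1.bias', 'conv1_2.weight']
print(dropped)       # ['conv1_1.weight']
```

In PyTorch, a filtered mapping like `kept` could then be passed to `model.load_state_dict(kept, strict=False)`, which leaves the skipped first-layer kernel at its random initialization.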

@xwjabc xwjabc mentioned this issue Jul 5, 2021
@long123524
Author

I tried loading the weights for the VGG backbone except the first layer, but the result is not very satisfactory.
image
I trained for 150 epochs; the picture shows the result at the 150th epoch. What else do I need to do to get good results?

@long123524
Author

I also tried loading the weights for the VGG backbone including the first layer, training on BSDS. The result is clearly different from what I get on my own dataset, so the first-layer VGG weights seem to matter a lot. How should I modify the pre-trained first-layer weights so that the network can read a 6-channel dataset?
image

@xwjabc
Owner

xwjabc commented Jul 7, 2021

Hi @long123524, could you try loading the weights for the VGG backbone excluding the first layer and training on the BSDS dataset? This would help check whether the parameters in the first layer are indeed essential.

@long123524
Author

On BSDS, training does not give good results either when the first-layer VGG weights are left out or when no pre-trained VGG weights are used at all, as shown in the picture.
image

Therefore, I think the VGG pre-trained weights are necessary, including those of the first layer. My idea is to change the first layer of VGG to 6 channels, but how do I modify the pre-trained weight file to do that? Where does this weight file come from? Can we pre-train a VGG ourselves?

@xwjabc
Owner

xwjabc commented Jul 9, 2021

Hi @long123524, the pre-trained weight file comes from the original HED repo, and I think it was pre-trained on ImageNet classification (this may need double-checking). Your visualization suggests that the first layer is necessary. However, I am not sure whether there is an easy way to convert the current 3-channel input layer to 6 channels. In addition, the 6-channel input of your dataset may have a different value distribution from the RGB images used in HED. Could you tell me what each channel in your data means? We can design a conversion based on the properties of your data.

@long123524
Author

Thank you very much for your reply. My research area is remote sensing. The 6 input channels are the 6 bands of a remote sensing image, and each band carries its own spectral information.

@xwjabc
Owner

xwjabc commented Jul 11, 2021

Gotcha. I would suggest computing statistics for the 6 channels (e.g. max/min/std/avg) and choosing a suitable normalization. Afterward, you can try copying the 3-channel input-layer weights twice to build 6-channel input weights and see whether that helps training.
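The two steps above could be sketched roughly as follows with NumPy. This is only an illustration under assumptions: the `(N, C, H, W)` array layout, the helper names, the random stand-in data, and the optional `rescale` flag are not from the repo. Which RGB kernel ends up paired with which spectral band is arbitrary here.

```python
import numpy as np

def channel_stats(images):
    """Per-channel min/max/mean/std for an array shaped (N, C, H, W)."""
    axes = (0, 2, 3)
    return {
        "min": images.min(axis=axes),
        "max": images.max(axis=axes),
        "mean": images.mean(axis=axes),
        "std": images.std(axis=axes),
    }

def normalize(images, stats, eps=1e-8):
    """Standardize each channel to roughly zero mean and unit variance."""
    mean = stats["mean"][None, :, None, None]
    std = stats["std"][None, :, None, None]
    return (images - mean) / (std + eps)

def expand_first_conv(weight_rgb, copies=2, rescale=False):
    """Tile an (out, 3, kH, kW) kernel along the input-channel axis.

    rescale=True (an optional assumption, not part of the suggestion)
    divides the tiled kernel by `copies` so the summed response stays
    close in magnitude to that of the original 3-channel layer.
    """
    tiled = np.concatenate([weight_rgb] * copies, axis=1)
    return tiled / copies if rescale else tiled

rng = np.random.default_rng(0)
images = rng.uniform(0, 10000, size=(4, 6, 8, 8))  # fake 6-band tiles
stats = channel_stats(images)
normed = normalize(images, stats)

w3 = rng.normal(size=(64, 3, 3, 3))  # stand-in for the pre-trained kernel
w6 = expand_first_conv(w3)
print(w6.shape)  # (64, 6, 3, 3)
```

The statistics should be computed over the training set only, and the same mean and std reused at test time.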

@long123524
Author

image
I got the code to run, but the loss is very large, already in the tens of thousands. Why is this?

@xwjabc
Owner

xwjabc commented Jul 11, 2021

I think you could try lowering the learning rate and testing several values.
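As a toy illustration of why trying several values can matter (not HED training itself), the sketch below runs plain gradient descent on a one-dimensional quadratic with a few candidate learning rates. The function, the curvature, and the candidate list are all assumptions chosen for the example.

```python
def final_loss(lr, steps=50, curvature=10.0, w0=1.0):
    """Run gradient descent on f(w) = 0.5 * curvature * w**2."""
    w = w0
    for _ in range(steps):
        grad = curvature * w  # derivative of f at w
        w -= lr * grad
    return 0.5 * curvature * w * w

# Candidate learning rates spaced roughly evenly on a log scale.
candidates = [1.0, 0.3, 0.03, 0.003]
losses = {lr: final_loss(lr) for lr in candidates}
best = min(losses, key=losses.get)

for lr in candidates:
    print(f"lr={lr:<6} final loss={losses[lr]:.3e}")
print("best:", best)
```

In this toy case the two largest rates make the loss grow well beyond its starting value, while the smaller ones shrink it; in a real run the same sweep would compare training or validation loss after a fixed number of iterations.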

@xwjabc xwjabc mentioned this issue Jul 26, 2021