Loss can't drop #12

QiushiYang · 2021-01-13T13:46:51Z

Thank you so much for sharing your code. I am trying to use ViT as the encoder, followed by a common decoder, to build a segmentation network. I train it from scratch, but the loss does not drop from the very beginning of training, and the results stay near 0. Are there any tricks for training ViT correctly? Is it important to load the pre-trained model and fine-tune it?
Here is my configuration:
patch_size=16, hidden_size=16*16*3, mlp_dim=3072, dropout_rate=0.1, num_heads=12, num_layers=12, lr=3e-4, opt=Adam, weight_decay=0.0
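
For readers following along, here is a minimal sketch of the derived sizes this configuration implies. It is not the repository's code: the `ViTConfig` class, the `in_channels` field, and the 224-pixel input size are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ViTConfig:
    """Hypothetical container for the settings listed above."""
    patch_size: int = 16
    in_channels: int = 3        # assumption: RGB input, matching the "*3" in hidden_size
    mlp_dim: int = 3072
    dropout_rate: float = 0.1
    num_heads: int = 12
    num_layers: int = 12
    lr: float = 3e-4
    weight_decay: float = 0.0

    @property
    def hidden_size(self) -> int:
        # hidden_size = 16 * 16 * 3, as written in the configuration
        return self.patch_size * self.patch_size * self.in_channels

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits hidden_size evenly across the heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads

    def num_patches(self, image_size: int) -> int:
        # Number of non-overlapping patches for a square image.
        if image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        return (image_size // self.patch_size) ** 2


cfg = ViTConfig()
print(cfg.hidden_size)       # 768
print(cfg.head_dim)          # 64
print(cfg.num_patches(224))  # 196 (224 is an assumed input size)
```

With these values the hidden size is 768 and each of the 12 heads gets 64 dimensions, so the shapes themselves are consistent.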

jeonsworld · 2021-02-03T06:19:50Z

Pre-training and fine-tuning use different hyperparameter settings.
Also, if you are training from scratch, you need to choose hyperparameters that fit that setup.
The hyperparameters for pre-training and fine-tuning can be found in the paper.

QiushiYang · 2021-02-08T13:08:12Z

Thanks a lot for your reply. It was a coding bug, and I have fixed the problem. Many thanks.

jeonsworld closed this as completed Feb 8, 2021

superxiaoying mentioned this issue Feb 22, 2021

Errors when use custom data to retrain the Vit-transformer #17

Open
