Trying to train a VGG16 model for localizing text in natural images, using the MSRA-TD500 dataset #20
Comments
Hi @nikstar802, I'm glad you like it.
Hi, actually the loss is exploding suddenly. From the second epoch itself the loss increases, and then it stays constant after the third epoch. I am unable to understand why the network is not training properly.
@nikstar802 Please first make sure you are working with the newest master branch, because I previously forgot to include the "softmax" activation, which causes sudden weight/loss explosions. Other than that, it can also be possible that:
...
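For reference, the missing "softmax" refers to a final per-pixel activation on the score map. Below is a minimal sketch of where it sits; a single conv stands in for the full VGG16/FCN stack, and the layer names and shapes are illustrative, not the library's exact code:

```python
from keras.layers import Input, Conv2D, Activation
from keras.models import Model

n_classes = 2
inp = Input(shape=(500, 500, 3))
# a single conv stands in for the VGG16 backbone and the upsampling stages
feat = Conv2D(64, (3, 3), padding='same', activation='relu')(inp)
score = Conv2D(n_classes, (1, 1))(feat)   # per-pixel class scores (logits)
out = Activation('softmax')(score)        # the previously missing per-pixel activation
model = Model(inp, out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Without that activation, the categorical cross-entropy is computed on raw logits, which is one way the loss can blow up in the manner described above.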
Hi, thanks for the reply.
Epoch 00000: val_loss improved from inf to 6.38728, saving model to /tmp/fcn_vgg16_weights.h5
I have a few questions to ask:
...
Hi, so now I am cropping my training images randomly into 500x500 segments, and I crop each image 20 times, so I get 20 sub-images of 500x500 from a single training image. This is the training set I am feeding in. Here is my model summary:

Layer (type)                       Output Shape           Param #      Connected to
input_1 (InputLayer)               (None, 500, 500, 3)    0
block1_conv1 (Conv2D)              (None, 500, 500, 64)   1792         input_1[0][0]
block1_conv2 (Conv2D)              (None, 500, 500, 64)   36928        block1_conv1[0][0]
block1_pool (MaxPooling2D)         (None, 250, 250, 64)   0            block1_conv2[0][0]
block2_conv1 (Conv2D)              (None, 250, 250, 128)  73856        block1_pool[0][0]
block2_conv2 (Conv2D)              (None, 250, 250, 128)  147584       block2_conv1[0][0]
block2_pool (MaxPooling2D)         (None, 125, 125, 128)  0            block2_conv2[0][0]
block3_conv1 (Conv2D)              (None, 125, 125, 256)  295168       block2_pool[0][0]
block3_conv2 (Conv2D)              (None, 125, 125, 256)  590080       block3_conv1[0][0]
block3_conv3 (Conv2D)              (None, 125, 125, 256)  590080       block3_conv2[0][0]
block3_pool (MaxPooling2D)         (None, 63, 63, 256)    0            block3_conv3[0][0]
block4_conv1 (Conv2D)              (None, 63, 63, 512)    1180160      block3_pool[0][0]
block4_conv2 (Conv2D)              (None, 63, 63, 512)    2359808      block4_conv1[0][0]
block4_conv3 (Conv2D)              (None, 63, 63, 512)    2359808      block4_conv2[0][0]
block4_pool (MaxPooling2D)         (None, 32, 32, 512)    0            block4_conv3[0][0]
block5_conv1 (Conv2D)              (None, 32, 32, 512)    2359808      block4_pool[0][0]
block5_conv2 (Conv2D)              (None, 32, 32, 512)    2359808      block5_conv1[0][0]
block5_conv3 (Conv2D)              (None, 32, 32, 512)    2359808      block5_conv2[0][0]
block5_pool (MaxPooling2D)         (None, 16, 16, 512)    0            block5_conv3[0][0]
block5_fc6 (Conv2D)                (None, 16, 16, 4096)   102764544    block5_pool[0][0]
dropout_1 (Dropout)                (None, 16, 16, 4096)   0            block5_fc6[0][0]
block5_fc7 (Conv2D)                (None, 16, 16, 4096)   16781312     dropout_1[0][0]
dropout_2 (Dropout)                (None, 16, 16, 4096)   0            block5_fc7[0][0]
score_feat1 (Conv2D)               (None, 16, 16, 1)      4097         dropout_2[0][0]
score_feat2 (Conv2D)               (None, 32, 32, 1)      513          block4_pool[0][0]
upscore_feat1 (BilinearUpSampling) (None, 32, 32, 1)      0            score_feat1[0][0]
scale_feat2 (Lambda)               (None, 32, 32, 1)      0            score_feat2[0][0]
add_1 (Add)                        (None, 32, 32, 1)      0            upscore_feat1[0][0]
score_feat3 (Conv2D)               (None, 63, 63, 1)      257          block3_pool[0][0]
upscore_feat2 (BilinearUpSampling) (None, 63, 63, 1)      0            add_1[0][0]
scale_feat3 (Lambda)               (None, 63, 63, 1)      0            score_feat3[0][0]
add_2 (Add)                        (None, 63, 63, 1)      0            upscore_feat2[0][0]
upscore_feat3 (BilinearUpSampling) (None, 500, 500, 1)    0            add_2[0][0]
activation_1 (Activation)          (None, 500, 500, 1)    0            upscore_feat3[0][0]
Total params: 134,265,411

Now, with this model, not even one epoch is moving forward. My system hangs with this.
...
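For reference, the random cropping described above can be done with plain NumPy roughly as follows. This is a sketch only: the function name is made up here, and it assumes every image and its label mask are at least 500 px on each side.

```python
import numpy as np

def random_crops(image, mask, size=500, n_crops=20, seed=None):
    """Take n_crops random size x size patches from an image and its label mask."""
    rng = np.random.RandomState(seed)
    h, w = image.shape[:2]  # assumes h >= size and w >= size
    crops, labels = [], []
    for _ in range(n_crops):
        y = rng.randint(0, h - size + 1)
        x = rng.randint(0, w - size + 1)
        crops.append(image[y:y + size, x:x + size])
        labels.append(mask[y:y + size, x:x + size])
    return np.stack(crops), np.stack(labels)
```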
@nikstar802
Hi,
First of all, I want to say this library is awesome.
I am trying to localize text in natural images. I am training on a single image from the MSRA-TD500 dataset using the VGG16 network you provide, but unfortunately the model is not converging as expected.
As a sanity check, I just want to train the network on a single image and test on that same image, but even that is not happening.
I am using the 'Adam' optimizer with 'categorical cross-entropy' loss, and 2 classes to separate text and non-text areas.
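For context, categorical cross-entropy with 2 classes expects the per-pixel ground truth as a one-hot map, so a binary text/non-text mask has to be expanded along a channel axis. A minimal sketch of that conversion (the function name is illustrative, not from the thread):

```python
import numpy as np

def mask_to_onehot(mask, n_classes=2):
    """Convert an (H, W) integer mask of class ids into an (H, W, n_classes) one-hot map."""
    onehot = np.zeros(mask.shape + (n_classes,), dtype=np.float32)
    for c in range(n_classes):
        onehot[..., c] = (mask == c)  # booleans cast to 0.0/1.0
    return onehot
```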
For pre-processing, I subtract the mean pixel value from the original image and then divide by the standard deviation (see the NumPy sketch after the training log below). This is how the training goes:
Epoch 1/10
1/1 [==============================] - 64s - loss: 0.7233 - acc: 0.4443
Epoch 2/10
1/1 [==============================] - 51s - loss: 3.2022 - acc: 0.8014
Epoch 3/10
1/1 [==============================] - 52s - loss: 3.2022 - acc: 0.8014
Epoch 4/10
1/1 [==============================] - 52s - loss: 3.2022 - acc: 0.8014
Epoch 5/10
1/1 [==============================] - 52s - loss: 3.2022 - acc: 0.8014
Epoch 6/10
1/1 [==============================] - 51s - loss: 3.2022 - acc: 0.8014
Epoch 7/10
1/1 [==============================] - 52s - loss: 3.2022 - acc: 0.8014
Epoch 8/10
1/1 [==============================] - 51s - loss: 3.2022 - acc: 0.8014
Epoch 9/10
1/1 [==============================] - 51s - loss: 3.2022 - acc: 0.8014
Epoch 10/10
1/1 [==============================] - 51s - loss: 3.2021 - acc: 0.8014
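For reference, the pre-processing described before the log can be sketched in NumPy like this. Whether the mean and standard deviation are taken per image or per channel is not stated in the thread, so per-image statistics are assumed here:

```python
import numpy as np

def standardize(image):
    # per-image standardization: subtract the mean pixel value, divide by the std
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + 1e-7)
```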
Can you suggest something on this issue...
Thanks ...