
Help Needed #10

Closed
SibtainRazaJamali opened this issue May 2, 2019 · 7 comments
Labels
question Further information is requested

Comments

@SibtainRazaJamali

I am training a CRNN model in PyTorch with:
max_seq_length = 99
number_of_alphabets = 96
batch_size = 16
output = CRNN(image)
What should the expected shape of output be?
Secondly, should we apply softmax in the CRNN after the fully connected layer?
Any help would be appreciated. Thanks.

@zhiqwang
Owner

zhiqwang commented May 2, 2019

It seems that your images are a different size. The network assumes the image height is 32 and that the width is a multiple of 8 by default. The CNN backbone compresses the image width by 1/4 with arch densenet121 and by 1/8 with arch densenet_cifar.

So if your image width is 80, the output size is (20, 16, 97) with --arch densenet121 (97 = 96 + 1, since encoder code 0 is reserved for the CTC blank token).
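The width arithmetic can be sketched as a small helper. This is only a sketch: the `expected_seq_length` name is made up, and the compression factors are taken from the numbers quoted in this thread, not from the repo's code.

```python
# Hypothetical helper: compute the time dimension T of the CRNN output
# from the image width. Compression factors follow the numbers above
# (assumptions, not the repo's actual constants).
COMPRESSION = {"densenet121": 4, "densenet_cifar": 8}

def expected_seq_length(image_width: int, arch: str = "densenet121") -> int:
    if image_width % 8 != 0:
        raise ValueError("image width should be a multiple of 8 by default")
    return image_width // COMPRESSION[arch]

# Width 80 with densenet121 gives T = 20, matching the (20, 16, 97)
# output shape for batch_size=16 and 96 + 1 classes.
```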

Alternatively, you can train the network with the --keep-ratio option to keep the image's aspect ratio.

The nn.CTCLoss implementation assumes log_softmax has already been applied to its input; you can refer to the code snippet in the CRNN network here and the CTC loss implementation here.
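In code, pairing log_softmax with nn.CTCLoss might look like this minimal sketch. The shapes follow the example above; the random tensors are stand-ins, not the repo's actual training loop.

```python
import torch
import torch.nn.functional as F

# Sketch: nn.CTCLoss expects log-probabilities of shape (T, N, C), so
# apply log_softmax over the class dimension before computing the loss.
T, N, C = 20, 16, 97                      # seq length, batch, 96 chars + blank
logits = torch.randn(T, N, C)             # stand-in for the CRNN output
log_probs = F.log_softmax(logits, dim=2)

ctc_loss = torch.nn.CTCLoss(blank=0)      # code 0 is the CTC blank token
targets = torch.randint(1, C, (N, 10), dtype=torch.long)  # dummy labels
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
```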

@SibtainRazaJamali
Author

Thanks for your quick response.
I am confused, because we pass the input to CTC loss as
Width x BatchSize x NumberOfClasses.
Am I right?
My output size is
99 x 16 x 97.

If I decode this prediction,
how many characters should be predicted?
My decoded output is always 16 characters, equal to the batch size.
Why?
The sequence length can be up to 97, but it always predicts 16 characters.
Am I doing something wrong?

@zhiqwang
Owner

zhiqwang commented May 3, 2019

Do you mean the sequence length can be up to 99?

CTC loss introduces a blank token ϵ to get around not knowing the alignment between the input and the output. When you infer an image's content, you should collapse repeats and remove the ϵ tokens, so the decoded output size is not fixed; it depends on your input image and your trained network. You can refer to Awni's article here.
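A greedy (best-path) decode that collapses repeats and drops blanks, as described above, might look like this sketch (`greedy_ctc_decode` is a hypothetical helper, not a function from the repo). Note that the argmax must be taken over the class dimension, dim=2 of a (T, N, C) tensor; iterating over the wrong dimension would always yield batch_size values, which matches the symptom described above.

```python
import torch

# Sketch of greedy (best-path) CTC decoding: argmax over classes per
# time step, then collapse repeated labels and drop the blank (index 0).
def greedy_ctc_decode(log_probs: torch.Tensor, blank: int = 0):
    """log_probs: (T, N, C). Returns a list of N decoded label sequences."""
    best = log_probs.argmax(dim=2)                # (T, N): best class per step
    sequences = []
    for n in range(best.size(1)):                 # one sequence per batch item
        prev, decoded = blank, []
        for label in best[:, n].tolist():
            if label != blank and label != prev:  # collapse repeats, drop blanks
                decoded.append(label)
            prev = label
        sequences.append(decoded)
    return sequences
```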

If your dataset is small, the mean and std of the dataset matter a lot, and the --arch parameter also depends on your image dataset (choose densenet_cifar or densenet121). You can also add an RNN part; refer to issue #6.
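The RNN part mentioned above (the combination discussed in issue #6) could be sketched as follows; the `SequenceHead` name and the layer sizes are illustrative assumptions, not the repo's code.

```python
import torch
import torch.nn as nn

# Sketch: a bidirectional LSTM over the per-column CNN features,
# projected to per-time-step class scores for CTC.
class SequenceHead(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, num_classes=97):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):            # x: (T, N, feat_dim)
        out, _ = self.rnn(x)         # (T, N, 2 * hidden)
        return self.fc(out)          # (T, N, num_classes)

head = SequenceHead()
scores = head(torch.randn(20, 16, 512))   # shape: (20, 16, 97)
```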

@ronghui19

ronghui19 commented May 16, 2019

Is there any particular reason to use softmax? I didn't see the original paper mention it.

Never mind, I got it. It serves as the input for CTCLoss.

@zhiqwang
Owner

zhiqwang commented May 16, 2019

@ronghui19 Yes. I don't know what the result would be if the CTC loss were removed from the CRNN network, I mean using only the log softmax in the back-propagation procedure. I'm testing this.

@ronghui19

@zhiqwang In the file crnn.py there is an init_network function. As far as I can tell, you may have forgotten to freeze the pretrained network. There should be something like this:
for param in model.parameters(): param.requires_grad = False
I could be wrong.
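If a pretrained backbone were in use, the freeze suggested above could be combined with an optimizer that only receives trainable parameters. This is a sketch with a made-up two-layer model, not the repo's actual init_network.

```python
import torch
import torch.nn as nn

# Sketch: freeze a (hypothetical) pretrained backbone and optimize only
# the remaining head.
backbone = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU())
head = nn.Linear(8, 97)

for param in backbone.parameters():
    param.requires_grad = False          # freeze the pretrained part

# Pass only the still-trainable parameters to the optimizer.
trainable = [p for p in list(backbone.parameters()) + list(head.parameters())
             if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```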

@zhiqwang
Owner

@ronghui19 Currently, there is no pre-trained model used in the CNN backbone, so I did not set a separate learning rate for that part.

@zhiqwang zhiqwang added the question Further information is requested label Jun 13, 2019