Some problems about this code #8
Comments
@980044579 , thanks for sharing your observations and experience.
Just change the code between CNN -> RNN in cnn_lstm_otc_ocr.py, and make sure the input to the RNN has shape [batch_size, max_stepsize, num_features].
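A minimal, framework-free sketch of the reshaping described above, assuming the CNN output is a nested list indexed as [batch_size][height][width][channels]. The helper name `to_rnn_input` and the toy sizes are hypothetical and not taken from the repository; in TensorFlow the same idea would typically be a transpose followed by a reshape.

```python
def to_rnn_input(batch):
    """Turn CNN feature maps into RNN input.

    batch: nested lists shaped [batch_size][height][width][channels].
    Returns nested lists shaped [batch_size][max_stepsize][num_features],
    where max_stepsize == width and num_features == height * channels.
    """
    rnn_input = []
    for feature_map in batch:
        height = len(feature_map)
        width = len(feature_map[0])
        steps = []
        for w in range(width):
            # One time step per image column: stack every row and channel of that column.
            column = [value for h in range(height) for value in feature_map[h][w]]
            steps.append(column)
        rnn_input.append(steps)
    return rnn_input


# Toy feature map (assumed sizes): batch of 2, height 4, width 12, 3 channels.
toy = [[[[float(c) for c in range(3)] for _ in range(12)] for _ in range(4)] for _ in range(2)]
out = to_rnn_input(toy)
shape = (len(out), len(out[0]), len(out[0][0]))
print(shape)
```

The point of the sketch is that the time axis is the image width, so each time step carries all rows and channels of one vertical slice.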
Hi @980044579, thanks a lot for your kind reply. I made the code changes yesterday as well and found that the model reaches 0.999 accuracy at the 12th epoch, so it converges faster and performs better after fixing this bug. For those who are interested, here are my code changes.
Good job~
I am getting an error: Failed precondition: sequence_length(0) <= 12

For inference, I had already trained the model up to model_checkpoint_path: "ocr-model-21001" on the 80000 train and 20 val images provided in the dataset. I took a few images from the val set and created a folder infer (40 images named 1.png .. 40.png), then tried to run inference using the command given in the README.

INFO:tensorflow:Restoring parameters from ./checkpoint/ocr-model-20001
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Caused by op 'CTCBeamSearchDecoder', defined at:
FailedPreconditionError (see above for traceback): sequence_length(0) <= 12
@anubhavrohatgi Make sure the maximum label length in your dataset is <= max_stepsize.
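A small sketch of the check suggested above, assuming the labels have already been loaded as plain strings. The function name `labels_too_long` and the sample labels are made up for illustration; how labels are actually stored in the dataset is not shown in this thread.

```python
def labels_too_long(labels, max_stepsize):
    """Return the labels whose length exceeds the number of RNN time steps."""
    return [label for label in labels if len(label) > max_stepsize]


# Hypothetical labels; 12 is the step count discussed in this thread.
sample_labels = ["12345", "0123456789012", "42"]
offenders = labels_too_long(sample_labels, max_stepsize=12)
print(offenders)
```

An empty result means every label fits within the available time steps.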
@980044579 Please explain a bit more, I am quite new to this in Python. What is the max length of a label? Currently I am using the dataset provided in the link given in the repo. max_stepsize = 64, I guess, as stated in utils.py. All images are 180x60. The error occurs somewhere here:
Are you talking about labels.txt? Correct me if I am wrong: by "infer" we mean testing on our own real-time data, is that right?
@anubhavrohatgi @980044579 Hello, I ran into the same problem, but I inspected the labels and found that the max label length is not greater than maxT in [maxT, batch_size, num_char]. Have you solved it? I don't know how to fix it.
@anubhavrohatgi @kstys Make sure you understand how the "CNN + RNN + CTC" framework works; there are some bugs in this code. You should change not only "maxsteps" in utils.py but also the code between CNN -> RNN in cnn_lstm_otc_ocr.py.
I have a question. In cnn_lstm_otc_ocr.py, after the CNN, is x.set_shape([FLAGS.batch_size, filters[3], 24]) right? The time sequence fed to the LSTM should run along the width, but the code uses the number of channels.
I changed the code as @LevinJ did, but I got the error "tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found." I set max_step to 128 and my input image is 32*192.
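One common cause of "No valid path found" is that a sample has fewer time steps than CTC needs to spell its label: at least one step per character, plus one blank between each pair of identical adjacent characters. The sketch below checks that condition; the function names are hypothetical, and it assumes the step count passed in is the real sequence length reaching the CTC loss, which may differ from the configured max_step.

```python
def min_ctc_steps(label):
    """Fewest time steps for which CTC has at least one valid alignment."""
    # Identical neighbours (e.g. "ll") need a blank between them.
    repeats = sum(1 for prev, cur in zip(label, label[1:]) if prev == cur)
    return len(label) + repeats


def has_valid_path(label, num_steps):
    """True if a label can be aligned to num_steps time steps."""
    return num_steps >= min_ctc_steps(label)


print(min_ctc_steps("hello"))      # 5 characters + 1 repeated pair
print(has_valid_path("hello", 5))
```

Running this over every (label, sequence length) pair in a batch is a quick way to spot samples that would make the loss fail.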
In fact, "max_stepsize" in this code shouldn't be 64. It should be 12, because the original image width (180) is halved four times: 180/2/2/2/2 ≈ 12 (11.25, rounded up). Remember that the core idea of CRNN+CTC is to split the image vertically into many slices, predict the class of each slice, and finally use CTC to decode the predicted sequence into the final result. For example, "aaa_bb_c_" and "a__b_ccc" both correspond to the same label "abc". You can also read the paper for more details.
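The two ideas above can be sketched in a few lines. This is only an illustration: the helper names are hypothetical, and rounding up at each halving is an assumption (it is what padded stride-2 pooling would give, and it matches the 12 quoted here), not something confirmed from the repository's code.

```python
import math
from itertools import groupby


def steps_after_pooling(image_width, num_halvings=4):
    """Time steps left after halving the width num_halvings times (assumes rounding up)."""
    width = image_width
    for _ in range(num_halvings):
        width = math.ceil(width / 2)
    return width


def ctc_collapse(path, blank="_"):
    """Collapse a per-slice prediction: merge repeated symbols, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


print(steps_after_pooling(180))   # 180 -> 90 -> 45 -> 23 -> 12
print(ctc_collapse("aaa_bb_c_"))
print(ctc_collapse("a__b_ccc"))
```

Both example paths collapse to the same label, which is why CTC can train without knowing which slice each character falls in.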
However, when I ran the wrong code on the author's dataset I got 98% accuracy, while I got a bad result on the VGGWord dataset. I finally got a good result after changing the code.
So why does this code work in your situation? I am very curious about this. Thank you.