ctc_loss gets inf values and Unknow chars #25

Open
RuijieJ opened this issue Mar 28, 2019 · 8 comments

@RuijieJ

RuijieJ commented Mar 28, 2019

When I continue training from the pre-trained model you provide on the MLT-2019 dataset, ctc_loss is inf at most steps. Is something wrong?

Also, some characters seem not to be included in the dictionary. Will this affect training performance?

The console output looks like this:

epoch 12[205000], loss: 8.969, bbox_loss: 5.751, seg_loss: -0.604, ang_loss: 3.349, ctc_loss: 9.002, rec: 0.00000 in -0.000
Unknown char: 铣
Unknown char: 綦
epoch 12[205100], loss: 5.823, bbox_loss: 5.086, seg_loss: -0.910, ang_loss: 2.095, ctc_loss: inf, rec: 0.02500 in 0.000
epoch 12[205200], loss: 2.607, bbox_loss: 3.810, seg_loss: -1.047, ang_loss: 0.875, ctc_loss: inf, rec: 0.02069 in -0.000
Unknown char: 铣
epoch 12[205300], loss: 2.268, bbox_loss: 3.619, seg_loss: -0.995, ang_loss: 0.727, ctc_loss: inf, rec: 0.03540 in -0.003
Unknown char: 桷
Unknown char: 灏
Unknown char: 綦
epoch 12[205400], loss: 1.631, bbox_loss: 3.175, seg_loss: -1.080, ang_loss: 0.562, ctc_loss: inf, rec: 0.00775 in -0.000
epoch 12[205500], loss: 1.843, bbox_loss: 3.320, seg_loss: -1.135, ang_loss: 0.659, ctc_loss: inf, rec: 0.02113 in 0.000
Unknown char: 捌
epoch 12[205600], loss: 1.517, bbox_loss: 2.975, seg_loss: -1.124, ang_loss: 0.577, ctc_loss: 7.738, rec: 0.05634 in 0.000
epoch 12[205700], loss: 1.242, bbox_loss: 2.867, seg_loss: -1.183, ang_loss: 0.496, ctc_loss: inf, rec: 0.01439 in 0.000
Unknown char: 綦
epoch 12[205800], loss: 1.263, bbox_loss: 2.826, seg_loss: -1.203, ang_loss: 0.527, ctc_loss: inf, rec: 0.02899 in 0.001
Unknown char: 灏
epoch 12[205900], loss: 1.187, bbox_loss: 2.795, seg_loss: -1.212, ang_loss: 0.501, ctc_loss: inf, rec: 0.05825 in 0.000
epoch 12[206000], loss: 1.216, bbox_loss: 2.815, seg_loss: -1.169, ang_loss: 0.489, ctc_loss: inf, rec: 0.03731 in -0.001
Unknown char: 螃
Unknown char: 捌
Unknown char: 閩
epoch 12[206100], loss: 1.194, bbox_loss: 2.801, seg_loss: -1.191, ang_loss: 0.492, ctc_loss: inf, rec: 0.04032 in -0.006
epoch 12[206200], loss: 0.760, bbox_loss: 2.559, seg_loss: -1.231, ang_loss: 0.356, ctc_loss: inf, rec: 0.04487 in 0.024

@MichalBusta
Owner

When I continue training from the pre-trained model you provide on the MLT-2019 dataset, ctc_loss is inf at most steps. Is something wrong?

See #4 (in short: the gradients are OK, but the loss can reach inf).
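
As general background on how a CTC loss can reach inf (not confirmed as the cause in this issue): the loss is the negative log-probability of the target, so it is infinite whenever no valid alignment exists, for example when the feature sequence has fewer time steps than the target needs. The sketch below checks that condition in plain Python; the function names are hypothetical and are not part of this repository.

```python
def min_ctc_input_length(target):
    """Smallest number of time steps that can emit `target` under CTC.

    CTC needs one step per label, plus one blank step between every
    pair of identical adjacent labels.
    """
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


def is_ctc_feasible(num_time_steps, target):
    """True if at least one valid alignment exists, i.e. the loss is finite."""
    return num_time_steps >= min_ctc_input_length(target)


# "hello" has 5 labels and one repeated pair ("ll"), so it needs 6 steps.
print(is_ctc_feasible(4, "hello"))  # False: the loss would be inf
print(is_ctc_feasible(6, "hello"))  # True
```

A check like this can be run on each (crop, transcription) pair before training to see whether the inf values come from samples that are too short for their labels.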

Also, some characters seem not to be included in the dictionary. Will this affect training performance?

For testing, yes: you should add the missing characters to the codec.
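
A minimal sketch of collecting the missing characters, assuming the codec can be treated as a plain string of known characters and the ground-truth transcriptions are available as a list of strings. The helper name, the stand-in codec, and the sample labels are hypothetical; how the project actually stores its codec is not shown in this thread.

```python
def find_missing_chars(codec, transcriptions):
    """Return characters used in the labels but absent from the codec,
    in the order they are first seen."""
    known = set(codec)
    missing = []
    for text in transcriptions:
        for ch in text:
            if ch not in known:
                known.add(ch)
                missing.append(ch)
    return missing


codec = "abc铣"                    # stand-in for the real codec contents
labels = ["铣床", "綦江", "ab"]    # made-up transcriptions
missing = find_missing_chars(codec, labels)
extended_codec = codec + "".join(missing)
print(missing)         # ['床', '綦', '江']
print(extended_codec)  # abc铣床綦江
```

Appending new characters at the end keeps the indices of the existing ones unchanged. The recognition output layer usually has to be resized to match the longer codec.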

@MichalBusta
Owner

MichalBusta commented Mar 29, 2019 via email

@RuijieJ
Author

RuijieJ commented Mar 29, 2019

Thanks. I finally found my problem: the original configuration assumes vertical text polygons (height > width) for every dataset except ICDAR-2015. After I modified this line in "data_gen.py":
if is_icdar:
    pts = np.roll(pts, 2)
so that the coordinates are rolled for all polygons, training succeeded.

Thanks a lot to the author for the timely reply. I have one last question, though:

The loss value easily becomes negative. Is this normal? Would a negative loss update the network's parameters in a harmful way?

@MichalBusta
Owner

MichalBusta commented Mar 29, 2019 via email

@kei6

kei6 commented May 28, 2019

Thanks. I finally found my problem: the original configuration assumes vertical text polygons (height > width) for every dataset except ICDAR-2015. After I modified this line in "data_gen.py":
if is_icdar:
    pts = np.roll(pts, 2)
so that the coordinates are rolled for all polygons, training succeeded.

Hi @RuijieJ, how did you modify it? Can you share your code?
Thanks,

@RuijieJ
Author

RuijieJ commented May 29, 2019

@kei6 Of course. In data_gen.py, line 122:

if is_icdar: pts = np.roll(pts,2)

The texts in the sample image "sample_train_data/MLT/done/img_5407.jpg" are vertical, so the original code doesn't roll its points. However, if most of your samples contain only horizontal text, you just need to remove the if condition:

pts = np.roll(pts,2)
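
To illustrate what this line does, here is a small sketch. It assumes pts holds the four corners of a quadrilateral as a 4x2 array of (x, y) pairs; the real layout in data_gen.py may differ, and the coordinates are made up. Without an axis argument, np.roll flattens the array, shifts it by two elements, and restores the shape, so the corner order rotates by one point. It also returns a new array, so the result has to be assigned back to pts.

```python
import numpy as np

# Hypothetical quadrilateral: four (x, y) corners of a horizontal text box.
quad = np.array([[0, 0], [10, 0], [10, 4], [0, 4]])

# Calling np.roll without assigning the result leaves `quad` untouched.
np.roll(quad, 2)
unchanged = quad.tolist()

# Shifting the flattened array by 2 elements moves every corner one slot.
rolled = np.roll(quad, 2)
print(rolled.tolist())  # [[0, 4], [0, 0], [10, 0], [10, 4]]

# For a 4x2 array this equals rolling by one row along axis 0.
same_as_row_roll = np.array_equal(rolled, np.roll(quad, 1, axis=0))
print(same_as_row_roll)  # True
```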

@duxiangcheng

@RuijieJ May I ask you a question? What does the "-ocr_feed_list" option in train.py do? And where can I get the cropped images? Thanks

@AniketGurav

AniketGurav commented Aug 4, 2022

@kei6 Of course. In data_gen.py, line 122:

if is_icdar: pts = np.roll(pts,2)

The texts in the sample image "sample_train_data/MLT/done/img_5407.jpg" are vertical, so the original code doesn't roll its points. However, if most of your samples contain only horizontal text, you just need to remove the if condition:

pts = np.roll(pts,2)

Hi @RuijieJ,
I have images with only horizontal text, and it's not ICDAR data.
What am I expected to do in the code?
Should I always execute the line "pts = np.roll(pts,2)" and remove the condition "if is_icdar:",
like below?

# if is_icdar:          # condition commented out
pts = np.roll(pts,2)    # this line will always be executed
