the problem about process_boxes #70

duxiangcheng opened this issue May 2, 2020 · 4 comments

Comments

@duxiangcheng

Hi, thanks for sharing your amazing code!
I have some questions; can you help me?

  1. I don't understand what process_boxes does in train.py.
    if step > 10000 or True:  # this is just extra augumentation step ... in early stage just slows down training
        ctcl, gt_b_good, gt_b_all = process_boxes(images, im_data, seg_pred[0], roi_pred[0], angle_pred[0], score_maps, gt_idxs, gtso, lbso, features, net, ctc_loss, opts, debug=opts.debug)
        ctc_loss_val += ctcl.data.cpu().numpy()[0]
        loss = loss + ctcl
        gt_all += gt_b_all
        good_all += gt_b_good

  2. As shown in the code above, the ctc_loss appears to be a validation loss, yet I notice that it is backpropagated. As far as I know, backward() should not be called on a validation loss. Can you explain this?

Thanks!

@MichalBusta
Owner

Hi,

    • see:
      def process_boxes(images, im_data, iou_pred, roi_pred, angle_pred, score_maps, gt_idxs, gtso, lbso, features, net, ctc_loss, opts, debug = False):

The function just feeds the actual proposals from the detector to the OCR module.

    • We do not use any validation loss. The strategy for selecting a model is: run several checkpoints on your validation dataset and pick the one with the best end-to-end score.

Hope it helps, Michal
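
The checkpoint-selection strategy described in this reply can be sketched roughly as follows. This is only an illustration, not code from the repository: `pick_best_checkpoint`, the `evaluate` callback, and the file names are all hypothetical, and the scores are made up. The real evaluation would load each checkpoint and compute an end-to-end score on the validation set.

```python
from typing import Callable, Iterable, Tuple


def pick_best_checkpoint(
    checkpoints: Iterable[str],
    evaluate: Callable[[str], float],
) -> Tuple[str, float]:
    """Return (path, score) of the checkpoint with the highest score.

    `evaluate` is a hypothetical callback: it is assumed to load the
    checkpoint, run detection plus recognition on the validation set,
    and return one number where higher is better.
    """
    scored = [(evaluate(path), path) for path in checkpoints]
    if not scored:
        raise ValueError("no checkpoints to evaluate")
    best_score, best_path = max(scored)
    return best_path, best_score


if __name__ == "__main__":
    # Made-up scores standing in for real validation runs.
    fake_scores = {"ckpt_a.h5": 0.41, "ckpt_b.h5": 0.47, "ckpt_c.h5": 0.45}
    print(pick_best_checkpoint(fake_scores, fake_scores.__getitem__))
```

No loss is computed on the validation data here; the only thing compared across checkpoints is the final end-to-end score.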

@duxiangcheng
Author

Thank you for your reply.
I also used train_ocr.py to pre-train the OCR net, but the CTC loss is unstable and the loss curve looks strange.
[Image: "Capture" (screenshot of the loss curve)]

@MichalBusta
Owner

What is your batch size? My guess is that it is too small; try something > 64.
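
As a rough illustration of why a small batch could give a noisy curve: the loss reported per step is an average over the batch, and the spread of an average shrinks roughly as 1/sqrt(batch size). The toy simulation below is not the repository's training code. `batch_loss_std` is a hypothetical helper and the per-sample losses are drawn from a made-up distribution, so it shows only the general statistical effect and does not confirm that batch size is the cause in this particular run.

```python
import random
import statistics


def batch_loss_std(batch_size: int, steps: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the batch-mean loss over many simulated steps.

    Per-sample losses come from a made-up distribution (exponential with
    mean 1.0); only the relative spread between batch sizes is meaningful.
    """
    rng = random.Random(seed)
    batch_means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(batch_size)])
        for _ in range(steps)
    ]
    return statistics.pstdev(batch_means)


if __name__ == "__main__":
    for b in (4, 16, 64, 256):
        print(b, round(batch_loss_std(b), 3))
```

In this toy setup the step-to-step spread drops as the batch grows, which is consistent with the suggestion to try a batch size above 64.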

@AniketGurav

Hi, in the function process_boxes, net.forward_ocr is called 3 times, and I am not clear why.
The calls are on lines 270, 276, and 381 of train.py.

From reading the paper, my understanding is that process_boxes runs OCR on the crops extracted by the Localization Module (LM).
Those crops are built from (1) the bounding box coordinates predicted by the LM and (2) a feature map from one of the LM's layers.

But I am not clear about the third OCR call, on line 381.

I referred to Fig. 3 of your paper https://arxiv.org/pdf/1801.09919.pdf for understanding.
