[Docs] Add Chinese Guidance on How to Add New Datasets to Dataset Preparer #1506
Codecov Report
Base: 88.16% // Head: 85.85% // Decreases project coverage by 2.31%.

Additional details and impacted files

@@             Coverage Diff              @@
##           dev-1.x    #1506       +/-   ##
===========================================
- Coverage    88.16%   85.85%    -2.31%
===========================================
  Files          147      158       +11
  Lines         9249     9881      +632
  Branches      1268     1368      +100
===========================================
+ Hits          8154     8483      +329
- Misses         863     1156      +293
- Partials       232      242       +10
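The percentages in the report can be reproduced from the raw counts. The sketch below assumes that coverage is computed as hits divided by total lines, with misses and partials both counted as not covered; the function name `coverage_percent` is made up for illustration and is not part of Codecov.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return coverage as a percentage rounded to two decimals.

    Assumption: coverage = hits / lines, where misses and partials
    both count as not covered.
    """
    return round(100.0 * hits / lines, 2)


# Counts copied from the coverage diff above.
base = {"lines": 9249, "hits": 8154, "misses": 863, "partials": 232}
head = {"lines": 9881, "hits": 8483, "misses": 1156, "partials": 242}

# Sanity check: hits + misses + partials should add up to the line count.
for report in (base, head):
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

base_cov = coverage_percent(base["hits"], base["lines"])
head_cov = coverage_percent(head["hits"], head["lines"])
delta = round(head_cov - base_cov, 2)

if __name__ == "__main__":
    print(f"base={base_cov}% head={head_cov}% delta={delta}%")
```

Under that assumption, the drop comes from new lines being added faster than new hits: 632 lines were added but only 329 of them are covered.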
Tested CRNN on the IC13 test split generated by the dataset preparer and got 82.65 instead of 87.39 (https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#id5). The same issue also appears on IC15. I'll investigate the cause of this difference.
The dataset preparer currently generates 1095 test images without the post-filtering that is usually applied (https://arxiv.org/pdf/1904.01906.pdf).
It might be easy to develop a post-filtering script for IC13, but IC15 was filtered manually, so its filtered version may not be reproducible by a script. Shall we allow users to download existing annotations for these special cases?
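A minimal sketch of such a post-filtering step for IC13 might look like the following. It assumes the rule described in the linked paper, which prunes words containing non-alphanumeric characters, and it assumes a hypothetical annotation layout in which each sample is a dict with a `text` key. Neither the helper names nor the sample format come from MMOCR, and this sketch has not been checked against the 1015-image annotation file.

```python
import re
from typing import Dict, List

# Assumed rule: keep a word only if it consists entirely of ASCII letters
# and digits. This follows the pruning described in the linked paper.
_ALNUM_ONLY = re.compile(r"[A-Za-z0-9]+")


def keep_sample(text: str) -> bool:
    """Return True if the transcription is non-empty and purely alphanumeric."""
    return _ALNUM_ONLY.fullmatch(text) is not None


def post_filter(samples: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop samples whose 'text' field fails the alphanumeric check.

    The 'text' key is a hypothetical field name used for illustration.
    """
    return [sample for sample in samples if keep_sample(sample["text"])]


if __name__ == "__main__":
    demo = [
        {"img": "word_1.png", "text": "Tiredness"},
        {"img": "word_2.png", "text": "R&D"},
        {"img": "word_3.png", "text": "2013"},
    ]
    for sample in post_filter(demo):
        print(sample["img"], sample["text"])
```

A rule-based filter like this only covers cases where the pruning criterion is mechanical; a manually curated subset such as the filtered IC15 split would still need its annotation file to be downloaded.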
We could comment out the original URL and add the specific version of the annotation, just like the old converter.
I can raise a new PR to fix this issue.
@xinke-wang Yes, we need to download annotations by default for the IC13 and IC15 textrecog datasets. I can upload the filtered annotations for IC13 and IC15 if needed. I won't merge this PR until you get the new PR ready.
I've added the 1015 version for IC13. However, the link provided in the docs (https://mmocr.readthedocs.io/en/latest/datasets/recog.html#icdar-2015) suggests that the 2077 version of IC15 was used. Can you check whether MMOCR models were tested on IC15 2077 or IC15 1811?
This PR could be split into several parts so that it does not block other PRs.
2077 for the recognition test.
OK, so we do not need to fix the IC15 preparer, since the current one already produces the 2077 version.
Add guidance on how to add new datasets to the Dataset Preparer (Chinese version only), using the ICDAR 2013 dataset as an example. This PR also adds the IC13 dataset to the dataset preparer.