[Feature] Adding test & train API to be used directly in code #1138

wybryan · 2022-07-05T08:07:09Z

This change adds the capability to invoke testing and training directly
in code, e.g., inside a Jupyter notebook cell.

The change also ensures that the original command-line usage remains the
same.

Motivation

Data scientists very commonly use Jupyter notebooks to author modelling work. It is desirable that experiments such as training and/or testing can be launched from code instead of being typed into a command-line terminal. This PR adds that capability without affecting existing command-line usage.

Modification

The modification is made with minimal changes in mind: it just adds a class that assembles the arguments as a list of strings and passes that list to the argument parser. The design ensures the following:

the existing command-line use case remains as is.
when a user initiates training/testing from code, the parameter parsing is IDENTICAL to the command-line use case.
an example of how to use the new API is shown below:

from mmocr.tools.train import TrainArg, parse_args, run_train_cmd
args = TrainArg(config='/path/to/config.py')
args.add_arg('--work-dir', '/path/to/dir')
args = parse_args(args.arg_list)
run_train_cmd(args)
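
For readers who want to see the mechanism, here is a minimal sketch of the argument-list approach described under Modification. It is not the PR's actual code: the TrainArg body and the two-option parser below are assumptions made for illustration, and the real train.py defines many more options.

```python
import argparse


class TrainArg:
    """Hypothetical stand-in for the PR's TrainArg: collects CLI-style tokens."""

    def __init__(self, config):
        # The config path is the positional argument, so it goes first.
        self.arg_list = [str(config)]

    def add_arg(self, flag, value=None):
        # Append a flag, plus its value when one is given.
        self.arg_list.append(flag)
        if value is not None:
            self.arg_list.append(str(value))


def parse_args(arg_list=None):
    # Illustrative parser with only two options (an assumption, not the real one).
    parser = argparse.ArgumentParser(description='Train a model')
    parser.add_argument('config', help='train config file path')
    parser.add_argument('--work-dir', help='directory for logs and checkpoints')
    # With arg_list=None, argparse reads sys.argv[1:], so a script that calls
    # parse_args() with no argument behaves exactly as before on the CLI.
    return parser.parse_args(arg_list)


args = TrainArg(config='/path/to/config.py')
args.add_arg('--work-dir', '/path/to/dir')
parsed = parse_args(args.arg_list)
print(parsed.config, parsed.work_dir)
```

Because the same parser handles both paths, any validation or default defined for the command line also applies to the in-code call.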

BC-breaking (Optional)

No. It remains 100% backward compatible.

Use cases (Optional)

Allows training and testing experiments to be started directly in code, such as in a Jupyter notebook.

Checklist

Before PR:

I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
New functionalities are covered by complete unit tests. If not, please add more unit test to ensure the correctness.
The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

If the modification has potential influence on downstream or other related projects, this PR should be tested with some of those projects.
CLA has been signed and all committers have signed the CLA in this PR.

CLAassistant · 2022-07-05T09:11:59Z

All committers have signed the CLA.

gaotongxiao · 2022-07-06T16:46:50Z

Thanks for your contribution! It is an appealing feature and is likely to be applied to all OpenMMLab projects. I'm looping in other colleagues to have some discussion on it.

gaotongxiao · 2022-07-06T16:49:21Z

And, please install precommit hooks following https://github.com/open-mmlab/mmocr/blob/main/.github/CONTRIBUTING.md#installing-pre-commit-hooks and format your code with pre-commit run --all-files to pass our lint tests.

wybryan · 2022-07-07T04:02:40Z

thanks for the info, I've fixed the linting issue now.

wybryan

lint has passed.

bug fix for 4-point polygon, sort the points in clock-wise order

gaotongxiao · 2022-07-18T02:48:49Z

This is a nice suggestion, but the design still has some room to improve.

This design is not straightforward for multiple arguments. Users would have to call add_arg() multiple times.
It unnecessarily exposes an intermediate step (parse_args) to users and requires them to make at least three calls in sequence to train/test a model.

Therefore, this design might compromise some user-friendliness. Referring to MMOCR(), where the CLI and the object share exactly the same group of arguments, we can tidy up the interface a bit. Consider this example:

from mmocr.tools.train import Trainer
trainer = Trainer(config='xxx', work_dir='xxx', no_validate=True)  # accept same group of arguments as CLI
trainer.add_args(launcher='pytorch', diff_seed=100)  # Optional, but useful sometimes
trainer.train()

Does it look better?
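
One way such a Trainer could keep sharing the CLI's parser is to translate its keyword arguments back into command-line tokens. The following is only a sketch under that assumption; kwargs_to_argv is a hypothetical helper, not an existing MMOCR function, and it assumes boolean options are store_true-style flags.

```python
def kwargs_to_argv(config, **kwargs):
    """Turn keyword arguments into CLI-style tokens (hypothetical helper)."""
    argv = [str(config)]
    for name, value in kwargs.items():
        # work_dir -> --work-dir
        flag = '--' + name.replace('_', '-')
        if value is True:
            # Assumed to be a store_true-style flag: emit the flag only.
            argv.append(flag)
        elif value is False or value is None:
            # Leave the option out so the parser's default applies.
            continue
        else:
            argv.extend([flag, str(value)])
    return argv


argv = kwargs_to_argv('xxx', work_dir='xxx', no_validate=True, launcher='pytorch')
print(argv)
```

The resulting list could then be passed to the existing argument parser, so users never see the intermediate parsing step.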

gaotongxiao · 2022-07-18T02:58:05Z

BTW, this PR is inspiring and prompts us to design a better interface for notebook users. I'd like to share a potentially even better, unified proposal for feedback, though it's a bit outside this PR's scope. Currently the MMOCR() object is just for demonstration. How about integrating every fundamental API into it:

from mmocr import MMOCR
mmocr = MMOCR(...)
mmocr.train(...)
mmocr.test(...)
mmocr.inference(...)  # alias of readtext()

Also seems applicable to all other OpenMMLab projects.
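
To make the shape of that proposal concrete, here is a toy sketch. MMOCRSketch is a hypothetical class written for this illustration, not the actual MMOCR object, and its methods only return placeholders; the point is how inference can be a plain alias of readtext.

```python
class MMOCRSketch:
    """Hypothetical facade for illustration; not the actual MMOCR class."""

    def __init__(self, config):
        self.config = config

    def train(self, **kwargs):
        # Placeholder: a real implementation would launch training here.
        return ('train', self.config, kwargs)

    def test(self, **kwargs):
        # Placeholder: a real implementation would launch evaluation here.
        return ('test', self.config, kwargs)

    def readtext(self, img):
        # Placeholder: a real implementation would run OCR on img.
        return ('readtext', self.config, img)

    # Class-level alias: both names refer to the same function object.
    inference = readtext


ocr = MMOCRSketch(config='xxx')
print(ocr.inference('demo.jpg') == ocr.readtext('demo.jpg'))
```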

wybryan · 2022-07-18T13:15:54Z

I agree this is a better API.

wybryan · 2022-07-18T13:22:49Z

Hopefully I can contribute to this proposal. I don't know how the MMLab projects coordinate with each other; maybe we can implement this API in MMOCR first as a 'pilot' example.

Alternatively, a grander design could implement such an API in the mmcv project, which is the parent project of all the other MMLab sub-projects, but I guess this would need more coordination among the sub-projects so that they all conform to the API.

gaotongxiao · 2022-07-19T11:19:03Z

Great to hear that! Could you send an email to mmocr@openmmlab.com to join our Slack group? We can discuss more details there.

wybryan · 2022-07-21T14:49:39Z

cool, email sent, cheers.

gaotongxiao · 2022-09-23T07:26:15Z

Hi, sorry for coming back late. Now we finally have time to proceed with this PR after the release of 1.0.0rc0. Could you clean up your code a little bit and leave only the train&test API part in this PR?

wybryan · 2022-09-26T06:57:04Z

What do you mean? Do you mean keeping only the changes made in test.py & train.py?

gaotongxiao · 2022-09-26T11:02:38Z

@wybryan Right, the changes of a PR should be kept within the scope as claimed in the title.

wybryan · 2022-09-28T04:50:40Z

sure, I'll revert other changes, just keeping changes in train.py & test.py.

yaqi0510 · 2023-04-03T11:32:01Z

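Hello wybryan! The PR you submitted to the MMOCR project is very important. Thank you for giving your personal time to help improve an open-source project; we believe many developers will benefit from your PR.
We very much look forward to continuing to work with you. OpenMMLab has set up a dedicated contributor organization, MMSIG, which offers contributors open-source certificates, an honor system, and exclusive gifts. You can reach us by adding us on WeChat: openmmlabwx (please include "mmsig + GitHub id" as a remark). We sincerely hope you will join!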

Hi @wybryan! First of all, we want to express our gratitude for your significant PR in the MMOCR project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR.

We would also like to invite you to join our Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel, and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/raweFPmdzG

If you have a WeChat account, you are welcome to join our community on WeChat. You can add our assistant: openmmlabwx. Please add "mmsig + GitHub ID" as a remark when adding friends :)
Thank you again for your contribution ❤

[Feature] Adding test & train API

d58209b

This changes added capability for test & training to be directly invoked in code, e.g., inside a Jupyter notebook cell. The change also ensures the original command-line usage remains the same.

wybryan and others added 4 commits July 7, 2022 11:32

Merge branch 'open-mmlab:main' into main

9c0ba25

Merge github.com:open-mmlab/mmocr

131f5ec

fix linting issue

089944b

Merge branch 'main' of github.com:wybryan/mmocr

141de19

wybryan commented Jul 7, 2022

wybryan and others added 3 commits July 10, 2022 02:45

Merge branch 'open-mmlab:main' into main

b71ddcd

[bug fix] fix polygon points ordering

fe0aaf7

bug fix for 4-point polygon, sort the points in clock-wise order

Merge branch 'main' of github.com:wybryan/mmocr

0b67b38

revert changes for PR open-mmlab#1138

9f38f81

gaotongxiao approved these changes Sep 29, 2022

Merge branch 'main' into wy_main

0cdfd77

gaotongxiao merged commit b422ded into open-mmlab:main Sep 29, 2022
