Ner task #148

Merged: 78 commits, May 18, 2021

@HqWei (Contributor) commented May 1, 2021

new version

CLAassistant commented May 1, 2021

CLA assistant check
All committers have signed the CLA.

codecov bot commented May 1, 2021

Codecov Report

Merging #148 (75c5e5f) into main (94cd52d) will increase coverage by 0.23%.
The diff coverage is 84.92%.

❗ Current head 75c5e5f differs from pull request most recent head 6a0b06b. Consider uploading reports for the commit 6a0b06b to get more accurate results

@@            Coverage Diff             @@
##             main     #148      +/-   ##
==========================================
+ Coverage   82.12%   82.35%   +0.23%     
==========================================
  Files         111      123      +12     
  Lines        7194     7738     +544     
  Branches     1060     1138      +78     
==========================================
+ Hits         5908     6373     +465     
- Misses       1031     1081      +50     
- Partials      255      284      +29     
| Flag | Coverage Δ |
| :-------: | :--------------------------: |
| unittests | 82.35% <84.92%> (+0.23%) ⬆️ |


| Impacted Files | Coverage Δ |
| :------------- | :--------- |
| mmocr/apis/inference.py | 77.92% <7.14%> (-15.73%) ⬇️ |
| mmocr/models/ner/encoder/bert_encoder.py | 75.00% <75.00%> (ø) |
| mmocr/models/ner/convertors/ner_convertor.py | 82.29% <82.29%> (ø) |
| mmocr/models/ner/utils/bert.py | 82.82% <82.82%> (ø) |
| mmocr/models/ner/decoder/fc_decoder.py | 85.18% <85.18%> (ø) |
| mmocr/models/ner/loss/masked_cross_entropy_loss.py | 89.47% <89.47%> (ø) |
| mmocr/models/ner/loss/masked_focal_loss.py | 89.47% <89.47%> (ø) |
| mmocr/models/ner/classifer/ner_classifier.py | 90.00% <90.00%> (ø) |
| mmocr/core/evaluation/ner_metric.py | 100.00% <100.00%> (ø) |
| mmocr/datasets/ner_dataset.py | 100.00% <100.00%> (ø) |

... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 94cd52d...6a0b06b.

output_attentions=False,
output_hidden_states=False,
num_attention_heads=12,
attention_probs_dropout_prob=0.1,

I don't understand the name attention_probs_dropout_prob. Why are there two "probs"?
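For context on the question above: in BERT-style configs the first "probs" refers to the attention probabilities (the softmax output over attention scores), and the trailing "prob" is the dropout probability applied to them. The following is a minimal pure-Python sketch of that idea, not the code from this PR; the helper names `softmax`, `dropout`, and `attention_weights` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw attention scores into attention probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, drop_prob, rng, training=True):
    """Inverted dropout: zero each value with probability drop_prob
    and rescale the survivors by 1 / (1 - drop_prob)."""
    if not training or drop_prob == 0.0:
        return list(values)
    keep = 1.0 - drop_prob
    return [v / keep if rng.random() >= drop_prob else 0.0 for v in values]


def attention_weights(scores, attention_probs_dropout_prob, rng, training=True):
    # "attention_probs": the softmax output.
    attention_probs = softmax(scores)
    # "dropout_prob": the probability of dropping each attention probability.
    return dropout(attention_probs, attention_probs_dropout_prob, rng, training)


rng = random.Random(0)
scores = [2.0, 1.0, 0.1, -1.0]  # illustrative scores for one query
eval_weights = attention_weights(scores, 0.1, rng, training=False)
train_weights = attention_weights(scores, 0.1, rng, training=True)
```

At evaluation time the weights are the plain softmax output; during training each weight is either zeroed or scaled up by 1 / 0.9.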


# self.LayerNorm is not snake-cased to stick with
# TensorFlow model variable name and be able to load
# any TensorFlow checkpoint file

We don't need to load TensorFlow checkpoints here, so rename LayerNorm to layer_norm.
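One consequence of such a rename is that parameter names in saved checkpoints follow the attribute name, so weights stored under `LayerNorm` keys would need remapping before loading. Below is a hedged sketch of that remapping on a plain dictionary; the function `rename_layer_norm_keys` and the example keys are hypothetical and are not taken from this PR.

```python
def rename_layer_norm_keys(state_dict):
    """Return a copy of state_dict with every 'LayerNorm' path
    component replaced by 'layer_norm'."""
    renamed = {}
    for key, value in state_dict.items():
        parts = key.split(".")
        new_key = ".".join(
            "layer_norm" if part == "LayerNorm" else part for part in parts
        )
        renamed[new_key] = value
    return renamed


# Hypothetical checkpoint keys, for illustration only.
old_checkpoint = {
    "embeddings.LayerNorm.weight": [1.0],
    "embeddings.LayerNorm.bias": [0.0],
    "encoder.dense.weight": [0.5],
}
new_checkpoint = rename_layer_norm_keys(old_checkpoint)
```

Only whole path components are replaced, so unrelated keys pass through unchanged.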


| Method |Pretrain| Precision | Recall | F1-Score | Download |
| :--------------------------------------------------------------------: |:-----------:|:-----------:| :--------:| :-------: | :-------------------------------------: |
| [bert_softmax](/configs/ner/bert_softmax/bert_softmax_cluener_18e.py)| [pretrain](https://download.openmmlab.com/mmocr/ner/bert_softmax/bert_pretrain.pth) |0.7885 | 0.7998 | 0.7941 | [model](https://download.openmmlab.com/mmocr/ner/bert_softmax/bert_softmax_cluener-eea70ea2.pth) \| [log](https://download.openmmlab.com/mmocr/ner/bert_softmax/20210514_172645.log.json) |

Please add a note on the source of the pretrained model.
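As a quick sanity check on the table row above, the reported F1-Score is consistent with the harmonic mean of the reported precision and recall. This sketch assumes the standard F1 definition; how the PR's metric code aggregates over entity types is not shown here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the bert_softmax row of the results table.
precision, recall = 0.7885, 0.7998
f1 = f1_score(precision, recall)
print(f"{f1:.4f}")  # prints 0.7941
```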

@jeffreykuang jeffreykuang self-requested a review May 18, 2021 02:58
@cuhk-hbsun cuhk-hbsun merged commit 24c590b into open-mmlab:main May 18, 2021
gaotongxiao pushed a commit to gaotongxiao/mmocr that referenced this pull request Jul 15, 2022
* update ner standard code format

* add pytest

* fix pre-commit

* Annotate the dataset section

* fix pre-commit for dataset

* rm big files and add comments in dataset

* rename configs for ner task

* minor changes if metric

* Note modification

* fix pre-commit

* detail modification

* rm transform

* rm magic number

* fix warnings in pylint

* fix pre-commit

* correct help info

* rename model files

* rename err fixed

* 428_tag

* Adjust to more general pipline

* update unit test rate

* update

* Unit test coverage over 90% and add Readme

* modify details

* fix precommit

* update

* fix pre-commit

* update

* update

* update

* update result

* update readme

* update baseline config

* update config and small minor changes

* minor changes in readme and etc.

* back to original

* update toy config

* upload model and log

* fix pytest

* Modify the notes.

* fix readme

* Delete Chinese punctuation

* add demo and fix some logic and naming problems

* add To_tensor transformer for ner and load pretrained model in config

* delete extra lines

* split ner loss to MaskedCrossEntropyLoss and MaskedFocalLoss

* update config

* fix err

* updata

* modify noqa

* update new model report

* fix err in ner demo

* Update ner_dataset.py

* Update test_ner_dataset.py

* Update ner_dataset.py

* Update ner_transforms.py

* rm toy config and data

* add comment

* add empty

* fix conflict

* fix precommit

* fix pytest

* fix pytest err

* Update ner_dataset.py

* change dataset name to cluener2020

* move the postprocess in metric to convertor

* rm __init__ etc.

* precommit

* add discription in loss

* add auto download

* add http

* update

* remove some 'issert'

* replace unsqueeze

* update config

* update doc and bert.py

* update

* update demo code

Co-authored-by: weihuaqiang <weihuaqiang@sensetime.com>
Co-authored-by: Hongbin Sun <hongbin306@gmail.com>
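One of the commits listed above splits the NER loss into MaskedCrossEntropyLoss and MaskedFocalLoss. As a rough illustration of what "masked" usually means for token-level losses, the pure-Python sketch below averages cross-entropy only over positions whose mask is set, so padding tokens do not contribute. It is an assumption about the general technique, not the implementation merged in this PR, and the function name `masked_cross_entropy` is hypothetical.

```python
import math


def masked_cross_entropy(logits, labels, mask):
    """Mean cross-entropy over tokens where mask is truthy.

    logits: list of per-token score lists (one score per class)
    labels: list of gold class indices, one per token
    mask:   list of 0/1 flags; 0 marks padding to be ignored
    """
    total = 0.0
    count = 0
    for token_logits, label, keep in zip(logits, labels, mask):
        if not keep:
            continue  # padding position: contributes nothing
        m = max(token_logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in token_logits))
        total += log_sum_exp - token_logits[label]
        count += 1
    return total / count if count else 0.0


# Three tokens, two classes; the last token is padding.
logits = [[0.0, 0.0], [5.0, -5.0], [100.0, -100.0]]
labels = [0, 0, 1]
mask = [1, 1, 0]
loss = masked_cross_entropy(logits, labels, mask)
```

Changing the logits or label at the masked position leaves `loss` unchanged, which is the property the mask is meant to guarantee.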
@OpenMMLab-Assistant-007

Hi!
@HqWei
First of all, we want to express our gratitude for your significant PR in the OpenMMLab project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR.

We would also like to invite you to join our Special Interest Group (SIG) private channel on Discord, where you can share your experiences and ideas and build connections with like-minded peers. To join the SIG channel, simply message the moderator, OpenMMLab, on Discord, or briefly share your open-source contributions in the #introductions channel and we will assist you. We look forward to seeing you there! Join us: https://discord.gg/UjgXkPWNqA

If you have a WeChat account, you are welcome to join our community on WeChat. You can add our assistant: openmmlabwx. Please add "mmsig + GitHub ID" as a remark when adding friends :)
Thank you again for your contribution❤

7 participants