Add paddlemodel #45
Closed
2 commits
@@ -0,0 +1,142 @@
## Install Requirements:

- PaddlePaddle 2.1
- 64-bit OS
- Python 3 (3.5.1+/3.6/3.7/3.8/3.9), 64-bit
- pip/pip3 (9.0.1+), 64-bit
- CUDA >= 10.1
- cuDNN >= 7.6

## Install PaddleDetection

```bash
# Clone PaddleDetection repository
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git

cd PaddleDetection
# Install other dependencies
pip install -r requirements.txt
```

## Prepare Dataset

Download [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet):

```bash
cd PaddleDetection/dataset/
mkdir publaynet
# Download the dataset; quote the URL and name the output file explicitly,
# otherwise the query string ends up in the saved file name
wget -O publaynet.tar.gz "https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz?_ga=2.104193024.1076900768.1622560733-649911202.1622560733"

tar -xvf publaynet.tar.gz
```

Folder structure:

| File or Folder | Description                                      | Number of images |
| :------------- | :----------------------------------------------- | ---------------- |
| `train/`       | Images in the training subset                    | 335,703          |
| `val/`         | Images in the validation subset                  | 11,245           |
| `test/`        | Images in the testing subset                     | 11,405           |
| `train.json`   | Annotations for training images                  |                  |
| `val.json`     | Annotations for validation images                |                  |
| `LICENSE.txt`  | Plaintext version of the CDLA-Permissive license |                  |
| `README.txt`   | Text file with the file names and description    |                  |

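
The PubLayNet annotation files use the COCO JSON layout (top-level `images`, `annotations`, and `categories` lists), so one quick sanity check after extraction is to count the annotated regions per category. The sketch below assumes that layout; the helper name `count_regions_per_category` and the tiny in-memory sample are illustrative and not part of PubLayNet or PaddleDetection.

```python
import json
from collections import Counter


def count_regions_per_category(coco):
    """Return {category name: number of annotated regions} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


# Tiny hand-made sample standing in for the real (much larger) val.json.
sample = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 600, "height": 800}],
    "categories": [{"id": 1, "name": "text"}, {"id": 4, "name": "table"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [50, 60, 400, 120]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 200, 400, 90]},
        {"id": 12, "image_id": 1, "category_id": 4, "bbox": [50, 320, 500, 300]},
    ],
}
print(count_regions_per_category(sample))

# For the real file (path assumed from the download steps):
# with open("publaynet/val.json") as f:
#     print(count_regions_per_category(json.load(f)))
```
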

## Modify Config Files

Use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` config for training:

<div align='center'>
  <img src='../../examples/data/PaddleDetection_config.png' width='600px'/>
</div>

As the figure above shows, `ppyolov2_r50vd_dcn_365e_coco.yml` depends on other config files:

```
coco_detection.yml: sets the paths of the training and validation data.

runtime.yml: describes common runtime parameters, such as whether to use a GPU and how many epochs to run between checkpoint saves.

optimizer_365e.yml: sets the learning rate and the optimizer.

ppyolov2_r50vd_dcn.yml: describes the model and its backbone network.

ppyolov2_reader.yml: configures the data reader, such as the batch size and the number of concurrent loading subprocesses, and the post-read preprocessing operations, such as resizing and data augmentation.
```

Modify these configuration files to match your own setup, for example the dataset paths in `coco_detection.yml`.
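
PaddleDetection configs pull in their dependencies through a `_BASE_` list, and values set in the top-level file take precedence over those from the base files. The following is a simplified sketch of that layering idea using plain dictionaries; `merge_config` and all sample values are illustrative assumptions, not PaddleDetection's actual loader or defaults.

```python
def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical stand-ins for the contents of two base config files.
runtime = {"use_gpu": True, "snapshot_epoch": 1}
reader = {"TrainReader": {"batch_size": 12, "shuffle": True}}

# Local changes, e.g. a smaller batch size to fit GPU memory.
overrides = {"TrainReader": {"batch_size": 4}}

config = {}
for part in (runtime, reader, overrides):
    config = merge_config(config, part)

print(config)
```

Only the overridden key changes; sibling keys such as `shuffle` are inherited from the base dictionary.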

## Train

* Perform evaluation in training

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
```

Note: If you encounter an "`Out of memory error`", try reducing the batch size in the `ppyolov2_reader.yml` file.

* Fine-tune on another task

When fine-tuning a pre-trained model on another task, `pretrain_weights` can be used directly. Parameters whose shapes differ are ignored automatically. For example:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
# If the shape of a parameter in the program differs from the one in pretrain_weights,
# PaddleDetection will not load that parameter.
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml \
                         -o pretrain_weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
```

## Inference

- Specify the output directory and set the threshold

```bash
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml \
                    --infer_img=demo/000000570688.jpg \
                    --output_dir=infer_output/ \
                    --draw_threshold=0.5 \
                    -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final \
                    --use_vdl=True
```

`--draw_threshold` is an optional argument; the default is 0.5. Different thresholds will produce different results depending on the calculation of [NMS](https://ieeexplore.ieee.org/document/1699659).
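
The effect of a score threshold such as `--draw_threshold` can be pictured as a simple filter over the detections. The detection dictionaries and the `filter_by_score` helper below are hypothetical and only illustrate the idea; they are not PaddleDetection's internal representation.

```python
def filter_by_score(detections, threshold=0.5):
    """Keep detections whose confidence score is at least `threshold`."""
    return [det for det in detections if det["score"] >= threshold]


# Made-up detections for illustration.
detections = [
    {"label": "Text", "score": 0.92, "bbox": [34, 50, 410, 180]},
    {"label": "Table", "score": 0.48, "bbox": [30, 200, 420, 520]},
    {"label": "Figure", "score": 0.71, "bbox": [40, 540, 400, 760]},
]

kept = filter_by_score(detections, threshold=0.5)
print([det["label"] for det in kept])
```

Raising the threshold trades recall for fewer low-confidence boxes in the visualized output.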

## Inference and deployment

### Export model for inference

```bash
python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference \
 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
```

* `-c`: config file
* `--output_dir`: directory in which the exported model is saved

The inference model is exported to the directory `inference/ppyolov2_r50vd_dcn_365e_coco`, which contains `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.

More info: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md
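
Before moving on to deployment, it can help to confirm that the export step produced all four files listed above. This is a minimal sketch using only the standard library; the helper `missing_export_files` is a hypothetical name, and the demonstration runs against a temporary directory rather than a real export.

```python
from pathlib import Path
import tempfile

# File names taken from the export description above.
EXPECTED_FILES = ("infer_cfg.yml", "model.pdiparams", "model.pdiparams.info", "model.pdmodel")


def missing_export_files(model_dir):
    """Return the expected export files that are absent from `model_dir`."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


# Demonstration on a temporary directory that mimics an incomplete export.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("infer_cfg.yml", "model.pdmodel"):
        (Path(tmp) / name).touch()
    demo_missing = missing_export_files(tmp)

print(demo_missing)
```

On a real export, the call would be `missing_export_files("inference/ppyolov2_r50vd_dcn_365e_coco")`, and an empty list means nothing is missing.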

### Python inference

```bash
python deploy/python/infer.py --model_dir=./inference/ppyolov2_r50vd_dcn_365e_coco --image_file=./demo/road554.png --use_gpu=True
```

* `--model_dir`: directory of the model exported in the previous step
* `--image_file`: path of the image to run inference on
* `--use_gpu`: whether to use the GPU

More info: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/deploy/python

C++ inference: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/cpp

@@ -2,7 +2,7 @@

We provide a spectrum of pre-trained models on different datasets.

## Example Usage:
## Example Usage using Detectron2:

```python
import layoutparser as lp

@@ -14,22 +14,59 @@ model = lp.Detectron2LayoutModel(
model.detect(image)
```

## Example Usage using PaddleDetection:

```python
import layoutparser as lp
model = lp.PaddleDetectionLayoutModel(
    config_path="lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config",  # In model catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model `label_map`
    threshold=0.5,  # Optional
)
model.detect(image)
```
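
The `label_map` argument translates the numeric class IDs predicted by the detector into readable block types. A minimal sketch of that lookup, independent of layoutparser, follows; the `raw_predictions` list and the `name_predictions` helper are made up for illustration, and leaving unknown IDs numeric is a choice of this sketch rather than documented library behavior.

```python
label_map = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}


def name_predictions(raw_predictions, label_map):
    """Attach a readable type to each (class_id, score) pair; unknown IDs stay numeric."""
    return [(label_map.get(class_id, class_id), score) for class_id, score in raw_predictions]


# Hypothetical (class_id, score) pairs as a detector might emit them.
raw_predictions = [(0, 0.97), (3, 0.88), (7, 0.51)]
print(name_predictions(raw_predictions, label_map))
```
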

## Model Catalog

| Dataset | Model | Config Path | Eval Result (mAP) |
|---|---|---|---|
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/j4yseny2u0hn22r/config.yml?dl=1) | lp://HJDataset/faster_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/4jmr3xanmxmjcf8/config.yml?dl=1) | lp://HJDataset/mask_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [retinanet_R_50_FPN_3x](https://www.dropbox.com/s/z8a8ywozuyc5c2x/config.yml?dl=1) | lp://HJDataset/retinanet_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/f3b12qc4hc0yh4m/config.yml?dl=1) | lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/u9wbsfwz4y0ziki/config.yml?dl=1) | lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_X_101_32x8d_FPN_3x](https://www.dropbox.com/s/nau5ut6zgthunil/config.yaml?dl=1) | lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config | 88.98 [eval.csv](https://www.dropbox.com/s/15ytg3fzmc6l59x/eval.csv?dl=0) |
| [PrimaLayout](https://www.primaresearch.org/dataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/yc92x97k50abynt/config.yaml?dl=1) | lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config | 69.35 [eval.csv](https://www.dropbox.com/s/9uuql57uedvb9mo/eval.csv?dl=0) |
| [NewspaperNavigator](https://news-navigator.labs.loc.gov/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/wnido8pk4oubyzr/config.yml?dl=1) | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/7cqle02do7ah7k4/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 [eval.csv](https://www.dropbox.com/s/1uwnz58hxf96iw2/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_101_FPN_3x](https://www.dropbox.com/s/h63n6nv51kfl923/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 [eval.csv](https://www.dropbox.com/s/e1kq8thkj2id1li/eval.csv?dl=0) |

| Dataset | Model | Config Path | Eval Result (mAP) |
| --- | --- | --- | --- |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/j4yseny2u0hn22r/config.yml?dl=1) | lp://HJDataset/faster_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/4jmr3xanmxmjcf8/config.yml?dl=1) | lp://HJDataset/mask_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [retinanet_R_50_FPN_3x](https://www.dropbox.com/s/z8a8ywozuyc5c2x/config.yml?dl=1) | lp://HJDataset/retinanet_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/f3b12qc4hc0yh4m/config.yml?dl=1) | lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/u9wbsfwz4y0ziki/config.yml?dl=1) | lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_X_101_32x8d_FPN_3x](https://www.dropbox.com/s/nau5ut6zgthunil/config.yaml?dl=1) | lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config | 88.98 [eval.csv](https://www.dropbox.com/s/15ytg3fzmc6l59x/eval.csv?dl=0) |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [ppyolov2_r50vd_dcn_365e_publaynet](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) | lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config | 93.6 [eval.csv](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/eval_publaynet.csv) |
| [PrimaLayout](https://www.primaresearch.org/dataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/yc92x97k50abynt/config.yaml?dl=1) | lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config | 69.35 [eval.csv](https://www.dropbox.com/s/9uuql57uedvb9mo/eval.csv?dl=0) |
| [NewspaperNavigator](https://news-navigator.labs.loc.gov/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/wnido8pk4oubyzr/config.yml?dl=1) | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/7cqle02do7ah7k4/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 [eval.csv](https://www.dropbox.com/s/1uwnz58hxf96iw2/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_101_FPN_3x](https://www.dropbox.com/s/h63n6nv51kfl923/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 [eval.csv](https://www.dropbox.com/s/e1kq8thkj2id1li/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [ppyolov2_r50vd_dcn_365e_tableBank_word](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_word/config | 96.2 [eval.csv](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/eval_tablebank.csv) |

* For PubLayNet models, we suggest using the `mask_rcnn_X_101_32x8d_FPN_3x` model, as it is trained on the whole training set, while the others are trained only on the validation set (which is only around 1/50 the size). You can expect a 15% AP improvement using the `mask_rcnn_X_101_32x8d_FPN_3x` model.
* []()

Review comment on the line above: If this line is not needed, please delete it.

* Compare the time cost of **Detectron2** and **PaddleDetection** (the `ppyolov2_*` models in the table above):

PubLayNet Dataset:

| Model           | CPU time cost | GPU time cost |
| --------------- | ------------- | ------------- |
| Detectron2      | 16545.5ms     | 209.5ms       |
| PaddleDetection | 1713.7ms      | 66.6ms        |

TableBank Dataset:

| Model           | CPU time cost | GPU time cost |
| --------------- | ------------- | ------------- |
| Detectron2      | 7623.2ms      | 104.2ms       |
| PaddleDetection | 1968.4ms      | 65.1ms        |

**Environment:**

**GPU:** a single NVIDIA Tesla P40

**CPU:** Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 24 cores
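
To make the comparison easier to read, the reported timings can be turned into speedup ratios. The snippet below is a small sketch that only restates the numbers from the two time-cost tables; the `speedups` helper is an illustrative name.

```python
# Time costs in milliseconds as (CPU, GPU), copied from the time-cost tables.
time_cost_ms = {
    "PubLayNet": {"Detectron2": (16545.5, 209.5), "PaddleDetection": (1713.7, 66.6)},
    "TableBank": {"Detectron2": (7623.2, 104.2), "PaddleDetection": (1968.4, 65.1)},
}


def speedups(costs):
    """Return (CPU, GPU) speedup of PaddleDetection over Detectron2 per dataset."""
    result = {}
    for dataset, models in costs.items():
        d2_cpu, d2_gpu = models["Detectron2"]
        pd_cpu, pd_gpu = models["PaddleDetection"]
        result[dataset] = (d2_cpu / pd_cpu, d2_gpu / pd_gpu)
    return result


for dataset, (cpu, gpu) in speedups(time_cost_ms).items():
    print(f"{dataset}: CPU {cpu:.2f}x, GPU {gpu:.2f}x")
```

On these numbers, PaddleDetection is roughly 9.7x faster on CPU and 3.1x faster on GPU for PubLayNet, and roughly 3.9x and 1.6x faster for TableBank.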

## Model `label_map`

Review comment: Do you have external documentation for training the Paddle models? We can add a link to that, but the training instructions should not be included in the Layout Parser documentation.