# Add paddlemodel #45


Status: Closed (wants to merge 2 commits)
30 changes: 29 additions & 1 deletion README.md
@@ -39,13 +39,30 @@ pip install layoutparser[ocr]

**For Windows Users:** Please read [installation.md](installation.md) for details about installing Detectron2.

## Recent Updates

2021.6.8: Added new detection models (PaddleDetection) and OCR models (PaddleOCR).

```bash
# Install PaddlePaddle
# CUDA10.1
python -m pip install paddlepaddle-gpu==2.1.0.post101 -f https://paddlepaddle.org.cn/whl/mkl/stable.html
# CPU
python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

# Install the PaddleOCR components if needed
pip install layoutparser[paddleocr]
```

For quick installation with other CUDA versions or environments, please refer to the [PaddlePaddle Quick Installation document](https://www.paddlepaddle.org.cn/install/quick).
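
After installing, you can verify that PaddlePaddle works; a minimal check using PaddlePaddle's built-in `run_check` utility (available in Paddle 2.x):

```python
# Sanity-check the PaddlePaddle installation; run_check() verifies that
# Paddle can run a small program on the available device (CPU or GPU).
import paddle

paddle.utils.run_check()
print(paddle.__version__)  # e.g. 2.1.0
```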

## Quick Start

We provide a series of examples to help you start using the layout parser library:

1. [Table OCR and Results Parsing](https://github.com/Layout-Parser/layout-parser/blob/master/examples/OCR%20Tables%20and%20Parse%20the%20Output.ipynb): `layoutparser` can be used to conveniently OCR documents and convert the output into structured data.

2. [Deep Layout Parsing Example](https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb): With the help of deep learning, `layoutparser` supports the analysis of very complex documents and can process the hierarchical structure in their layouts.
3. [Deep Layout Parsing using Paddle](examples/Deep%20Layout%20Parsing%20using%20Paddle.ipynb): the same analysis of complex documents and their hierarchical layout structure, using Paddle models.


## DL Assisted Layout Prediction Example

@@ -63,6 +80,17 @@

With only 4 lines of code in `layoutparser`, you can unlock the information from complex documents:

```python
# (the first lines of this example are collapsed in the diff view)
>>> lp.draw_box(image, layout,) # With extra configurations
```

Use a PaddleDetection model:

```python
>>> import layoutparser as lp
>>> model = lp.PaddleDetectionLayoutModel('lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations
```

If you want to train a PaddleDetection model yourself, please refer to [Train_PaddleDetection_model.md](docs/notes/Train_PaddleDetection_model.md).

## Contributing

We encourage you to contribute to Layout Parser! Please check out the [Contributing guidelines](.github/CONTRIBUTING.md) for guidelines about how to proceed. Join us!
4 changes: 3 additions & 1 deletion dev-requirements.txt
@@ -10,4 +10,6 @@ sphinx_rtd_theme
google-cloud-vision==1
pytesseract
pycocotools
git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2
paddlepaddle==2.1.0
paddleocr>=2.0.1
142 changes: 142 additions & 0 deletions docs/notes/Train_PaddleDetection_model.md
@@ -0,0 +1,142 @@
## Install Requirements

> **@lolipopshock** (Member, Jun 8, 2021): Do you have external documentation for training the Paddle models? We can add a link to that, but this should not be included in the layout parser documentation.


- PaddlePaddle 2.1
- 64-bit OS
- Python 3 (3.5.1+/3.6/3.7/3.8/3.9), 64 bit
- pip/pip3 (9.0.1+), 64 bit
- CUDA >= 10.1
- cuDNN >= 7.6

## Install PaddleDetection

```bash
# Clone PaddleDetection repository
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git

cd PaddleDetection
# Install other dependencies
pip install -r requirements.txt
```

## Prepare Dataset

Download [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet):

```bash
cd PaddleDetection/dataset/
mkdir publaynet
# Download the dataset (quote the URL and name the output file explicitly,
# since the URL carries query parameters)
wget -O publaynet.tar.gz "https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz?_ga=2.104193024.1076900768.1622560733-649911202.1622560733"

tar -xvf publaynet.tar.gz
```

Folder structure:

| File or Folder | Description | Count |
| :------------- | :----------------------------------------------- | ------: |
| `train/` | Images in the training subset | 335,703 |
| `val/` | Images in the validation subset | 11,245 |
| `test/` | Images in the testing subset | 11,405 |
| `train.json` | Annotations for training images | |
| `val.json` | Annotations for validation images | |
| `LICENSE.txt` | Plaintext version of the CDLA-Permissive license | |
| `README.txt` | Text file with the file names and description | |
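
The annotation files use the standard COCO schema, so a quick sanity check of the download is straightforward; a minimal sketch, assuming the archive unpacked into `dataset/publaynet/` as above:

```python
# Inspect the PubLayNet annotations (standard COCO format:
# "images", "annotations", "categories").
import json

with open("dataset/publaynet/train.json") as f:
    coco = json.load(f)

print(len(coco["images"]))                      # expected: 335,703
print(len(coco["annotations"]))                 # layout instances
print([c["name"] for c in coco["categories"]])  # the 5 layout classes
```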

## Modify Config Files

Use the `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` config for training:

<div align='center'>
<img src='../../examples/data/PaddleDetection_config.png' width='600px'/>
</div>


As the figure above shows, the `ppyolov2_r50vd_dcn_365e_coco.yml` config depends on several other config files:

- `coco_detection.yml`: mainly defines the paths of the training and validation data.
- `runtime.yml`: defines common runtime parameters, such as whether to use a GPU and how often (in epochs) to save checkpoints.
- `optimizer_365e.yml`: mainly defines the learning rate and the optimizer.
- `ppyolov2_r50vd_dcn.yml`: mainly defines the model and the backbone network.
- `ppyolov2_reader.yml`: mainly defines the data reader configuration, such as the batch size and the number of concurrent loading subprocesses, as well as post-read preprocessing such as resizing and data augmentation.

You will need to modify these configuration files to match your dataset and environment.
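
For example, pointing the dataset config at PubLayNet could look like the sketch below. This is an illustration only: the exact keys and file layout depend on your PaddleDetection version, so compare against the `coco_detection.yml` shipped with the repository.

```yaml
# configs/datasets/coco_detection.yml (adapted for PubLayNet; illustrative sketch)
metric: COCO
num_classes: 5          # text, title, list, table, figure

TrainDataset:
  !COCODataSet
    image_dir: train
    anno_path: train.json
    dataset_dir: dataset/publaynet

EvalDataset:
  !COCODataSet
    image_dir: val
    anno_path: val.json
    dataset_dir: dataset/publaynet
```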

## Train

* Run evaluation during training:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
```

Note: if you encounter an "`Out of memory`" error, try reducing the batch size in the `ppyolov2_reader.yml` file.
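
For reference, the batch size is set in the reader config; lowering it might look like this (an illustrative excerpt; the exact path and default value depend on your PaddleDetection version):

```yaml
# ppyolov2_reader.yml (excerpt)
TrainReader:
  batch_size: 4   # reduced from the default to avoid out-of-memory errors
```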

* Fine-tuning on another task

When using a pre-trained model to fine-tune on another task, `pretrain_weights` can be set directly; parameters with mismatched shapes are ignored automatically. For example:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
# If the shape of parameters in program is different from pretrain_weights,
# then PaddleDetection will not use such parameters.
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml \
    -o pretrain_weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
```

## Inference

- Write results to a specified output directory and set the detection threshold:

```bash
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml \
--infer_img=demo/000000570688.jpg \
--output_dir=infer_output/ \
--draw_threshold=0.5 \
-o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final \
                      --use_vdl=True
```

`--draw_threshold` is an optional argument; the default is 0.5. Different thresholds produce different results, depending on the [NMS](https://ieeexplore.ieee.org/document/1699659) calculation.

## Inference and Deployment

### Export model for inference

```bash
python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference \
-o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
```

* `-c`: config file
* `--output_dir`: directory where the exported model is saved

The inference model is exported to the directory `inference/ppyolov2_r50vd_dcn_365e_coco` and consists of `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.

More info: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md
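
The exported files can also be loaded directly with the `paddle.inference` API. The snippet below is a minimal sketch only: it feeds dummy inputs and skips the preprocessing and box decoding that `deploy/python/infer.py` (next section) performs, and the input names and shapes are assumptions based on typical PP-YOLOv2 exports.

```python
import numpy as np
from paddle.inference import Config, create_predictor

# Load the exported model files from the previous step.
config = Config(
    "inference/ppyolov2_r50vd_dcn_365e_coco/model.pdmodel",
    "inference/ppyolov2_r50vd_dcn_365e_coco/model.pdiparams",
)
predictor = create_predictor(config)

# PP-YOLOv2 exports typically expect several inputs (e.g. image,
# im_shape, scale_factor); feed dummy values for all of them here.
for name in predictor.get_input_names():
    handle = predictor.get_input_handle(name)
    if name == "image":
        handle.copy_from_cpu(np.zeros((1, 3, 640, 640), dtype=np.float32))
    else:
        handle.copy_from_cpu(np.ones((1, 2), dtype=np.float32))

predictor.run()
out = predictor.get_output_handle(predictor.get_output_names()[0])
print(out.copy_to_cpu().shape)  # raw detections, e.g. (N, 6): class, score, box
```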

### Python inference

```bash
python deploy/python/infer.py --model_dir=./inference/ppyolov2_r50vd_dcn_365e_coco --image_file=./demo/road554.png --use_gpu=True
```

* `--model_dir`: the model directory exported in the previous step
* `--image_file`: the image to run inference on
* `--use_gpu`: whether to use the GPU

More info: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1/deploy/python

C++ inference: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/cpp



63 changes: 50 additions & 13 deletions docs/notes/modelzoo.md
@@ -2,7 +2,7 @@

We provide a spectrum of pre-trained models on different datasets.

## Example Usage with Detectron2:

```python
import layoutparser as lp
# (middle of this example collapsed in the diff view: @@ -14,22 +14,59 @@)
model = lp.Detectron2LayoutModel(
    ...
)
model.detect(image)
```

## Example Usage with PaddleDetection:

```python
import layoutparser as lp
model = lp.PaddleDetectionLayoutModel(
    config_path="lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config",  # in the model catalog
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # in the model's `label_map`
    threshold=0.5,  # optional
)
model.detect(image)
```

## Model Catalog

| Dataset | Model | Config Path | Eval Result (mAP) |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/j4yseny2u0hn22r/config.yml?dl=1) | lp://HJDataset/faster_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/4jmr3xanmxmjcf8/config.yml?dl=1) | lp://HJDataset/mask_rcnn_R_50_FPN_3x/config | |
| [HJDataset](https://dell-research-harvard.github.io/HJDataset/) | [retinanet_R_50_FPN_3x](https://www.dropbox.com/s/z8a8ywozuyc5c2x/config.yml?dl=1) | lp://HJDataset/retinanet_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/f3b12qc4hc0yh4m/config.yml?dl=1) | lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/u9wbsfwz4y0ziki/config.yml?dl=1) | lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config | |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [mask_rcnn_X_101_32x8d_FPN_3x](https://www.dropbox.com/s/nau5ut6zgthunil/config.yaml?dl=1) | lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config | 88.98 [eval.csv](https://www.dropbox.com/s/15ytg3fzmc6l59x/eval.csv?dl=0) |
| [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet) | [ppyolov2_r50vd_dcn_365e_publaynet](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) | lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config | 93.6 [eval.csv](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/eval_publaynet.csv) |
| [PrimaLayout](https://www.primaresearch.org/dataset/) | [mask_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/yc92x97k50abynt/config.yaml?dl=1) | lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config | 69.35 [eval.csv](https://www.dropbox.com/s/9uuql57uedvb9mo/eval.csv?dl=0) |
| [NewspaperNavigator](https://news-navigator.labs.loc.gov/) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/wnido8pk4oubyzr/config.yml?dl=1) | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_50_FPN_3x](https://www.dropbox.com/s/7cqle02do7ah7k4/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 [eval.csv](https://www.dropbox.com/s/1uwnz58hxf96iw2/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [faster_rcnn_R_101_FPN_3x](https://www.dropbox.com/s/h63n6nv51kfl923/config.yaml?dl=1) | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 [eval.csv](https://www.dropbox.com/s/e1kq8thkj2id1li/eval.csv?dl=0) |
| [TableBank](https://doc-analysis.github.io/tablebank-page/index.html) | [ppyolov2_r50vd_dcn_365e_tableBank_word](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_word/config | 96.2 [eval.csv](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/eval_tablebank.csv) |

* For PubLayNet models, we suggest using the `mask_rcnn_X_101_32x8d_FPN_3x` model, as it is trained on the whole training set, while the others are trained only on the validation set (around 1/50 the size). You can expect roughly a 15% AP improvement with `mask_rcnn_X_101_32x8d_FPN_3x`.
* []()

> **Review comment** (translated from Chinese): If this line isn't needed, just delete it.
* Comparison of the inference time cost of **Detectron2** and **PaddleDetection** (the ppyolov2_* models in the table above):

PubLayNet Dataset:

| Model | CPU time cost | GPU time cost |
| --------------- | ------------- | ------------- |
| Detectron2 | 16545.5ms | 209.5ms |
| PaddleDetection | 1713.7ms | 66.6ms |

TableBank Dataset:

| Model | CPU time cost | GPU time cost |
| --------------- | ------------- | ------------- |
| Detectron2      | 7623.2ms      | 104.2ms       |
| PaddleDetection | 1968.4ms | 65.1ms |

**Environment:**

**GPU:** a single NVIDIA Tesla P40

**CPU:** Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 24 cores
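
The PR does not include the benchmarking script itself; below is a minimal sketch of how such per-image latencies could be measured with `layoutparser` (warm-up runs excluded from the average, which matters especially on GPU; the image file name is hypothetical):

```python
import time
import cv2
import layoutparser as lp

model = lp.PaddleDetectionLayoutModel(
    "lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config"
)
image = cv2.imread("sample_page.png")  # hypothetical test image

for _ in range(5):              # warm-up runs (not timed)
    model.detect(image)

n = 50
start = time.perf_counter()
for _ in range(n):
    model.detect(image)
elapsed_ms = (time.perf_counter() - start) / n * 1000
print(f"average detect time: {elapsed_ms:.1f} ms")
```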

## Model `label_map`
