Merge pull request #6 from PaddlePaddle/develop

merge paddleocr
PaddlePaddle · Aug 31, 2020 · 10f7e51
2 parents ee05c91 + 31e62cb
commit 10f7e51
Showing 6 changed files with 158 additions and 611 deletions.
diff --git a/docker/hubserving/README.md b/docker/hubserving/README.md
@@ -0,0 +1,58 @@
+English | [简体中文](README_cn.md)
+
+## Introduction
+Many users want to package the PaddleOCR service into a Docker image so that it can be quickly released and used in a Docker or k8s environment.
+
+This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project as a callable RESTful API service through the following steps. (At present, only deployment based on the HubServing mode is implemented; the author plans to add deployment based on the PaddleServing mode in the future.)
+
+## 1. Prerequisites
+
+You need to install the following basic components first:
+a. Docker
+b. Graphics driver and CUDA 10.0+ (GPU)
+c. NVIDIA Container Toolkit (GPU; Docker 19.03+ can skip this)
+d. cuDNN 7.6+ (GPU)
+
+## 2. Build Image
+a. Download the PaddleOCR source code
+```
+git clone https://github.com/PaddlePaddle/PaddleOCR.git
+```
+b. Go to the Dockerfile directory (note: the CPU and GPU versions are separate; the following uses the CPU version as an example, and for the GPU version you need to replace the keyword accordingly)
+```
+cd docker/cpu
+```
+c. Build image
+```
+docker build -t paddleocr:cpu . 
+```
+
+## 3. Start container
+a. CPU version
+```
+sudo docker run -dp 8866:8866 --name paddle_ocr paddleocr:cpu
+```
+b. GPU version (based on NVIDIA Container Toolkit)
+```
+sudo nvidia-docker run -dp 8866:8866 --name paddle_ocr paddleocr:gpu
+```
+c. GPU version (Docker 19.03+)
+```
+sudo docker run -dp 8866:8866 --gpus all --name paddle_ocr paddleocr:gpu
+```
+d. Check the service status (the service has started successfully if the log shows both: Successfully installed ocr_system and Running on http://0.0.0.0:8866/)
+```
+docker logs -f paddle_ocr
+```
+
+## 4. Test
+a. Compute the Base64 encoding of the image to be recognized (for a quick test, you can use a free online tool such as https://freeonlinetools24.com/base64-image/)
+b. Send a service request (see sample_request.txt for a sample request)
+
+```
+curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Base64-encoded image (remove the 'data:image/jpg;base64,' prefix)\"]}" http://localhost:8866/predict/ocr_system
+```
+c. Get the response (if the call succeeds, the following result is returned)
+```
+{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
+```
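The test step above (Base64-encode the image, drop any `data:image/jpg;base64,` header, POST the JSON body) can also be scripted. The following is a minimal sketch rather than part of this commit: the helper names `strip_data_uri_prefix`, `build_request_body`, and `post_ocr` are hypothetical, and only the `images` field and the URL are taken from the curl command in the README.

```python
import base64
import json
import urllib.request


def strip_data_uri_prefix(encoded: str) -> str:
    """Drop a leading 'data:image/...;base64,' header if one is present."""
    if encoded.startswith("data:") and "," in encoded:
        return encoded.split(",", 1)[1]
    return encoded


def build_request_body(image_bytes: bytes) -> str:
    """Return the JSON body with the image as a Base64 string under 'images'."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"images": [strip_data_uri_prefix(encoded)]})


def post_ocr(body: str, url: str = "http://localhost:8866/predict/ocr_system") -> str:
    """POST the body to the service (hypothetical helper; needs a running container)."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# In-memory demo: the first three bytes of a JPEG file instead of a real image.
demo_body = build_request_body(b"\xff\xd8\xff")
```

With a container listening on port 8866, `post_ocr(build_request_body(open(path, "rb").read()))` would send a real image, where `path` is whatever file you want recognized.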
diff --git a/docker/hubserving/README_cn.md b/docker/hubserving/README_cn.md
@@ -0,0 +1,57 @@
+[English](README.md) | 简体中文
+
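+## Dockerized deployment service
+In day-to-day projects, most people want to use Docker to package the PaddleOCR service into an image so that it can be quickly released and put into use in a Docker or k8s environment.
+
+This document provides some standardized code to achieve this goal. With the following steps, you can quickly publish the PaddleOCR project as a callable RESTful API service. (At present, only deployment based on the HubServing mode is implemented; the author plans to add deployment based on the PaddleServing mode later.)
+
+## 1. Prerequisites
+
+You need to install the following basic components first:
+a. Docker environment
+b. Graphics driver and CUDA 10.0+ (GPU)
+c. NVIDIA Container Toolkit (GPU; this step can be skipped with Docker 19.03 or later)
+d. cuDNN 7.6+ (GPU)
+
+## 2. Build the image
+a. Download the PaddleOCR project code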
+```
+git clone https://github.com/PaddlePaddle/PaddleOCR.git
+```
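+b. Switch to the Dockerfile directory (note: the CPU and GPU versions are separate; the following uses the CPU version as an example, and for the GPU version you only need to replace the keyword)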
+```
+cd docker/cpu
+```
+c. Build the image
+```
+docker build -t paddleocr:cpu . 
+```
+
+## 3. Start the Docker container
+a. CPU version
+```
+sudo docker run -dp 8866:8866 --name paddle_ocr paddleocr:cpu
+```
+b. GPU version (via NVIDIA Container Toolkit)
+```
+sudo nvidia-docker run -dp 8866:8866 --name paddle_ocr paddleocr:gpu
+```
+c. GPU version (with Docker 19.03 or later, you can use the following command directly)
+```
+sudo docker run -dp 8866:8866 --gpus all --name paddle_ocr paddleocr:gpu
+```
+d. Check that the service is running (messages such as Successfully installed ocr_system and Running on http://0.0.0.0:8866/ indicate that it started successfully)
+```
+docker logs -f paddle_ocr
+```
+
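+## 4. Test the service
+a. Compute the Base64 encoding of the image to be recognized (for a quick test, you can use a free online tool such as http://tool.chinaz.com/tools/imgtobase/)
+b. Send a service request (see the value in sample_request.txt)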
+```
+curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Base64-encoded image (remove the 'data:image/jpg;base64,' prefix)\"]}" http://localhost:8866/predict/ocr_system
+```
+c. Response (if the call succeeds, the following result is returned)
+```
+{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
+```
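The sample response in the README nests one list of detections per input image under `results`, each with `text`, `confidence`, and a four-point `text_region`. A minimal sketch of reading it follows; the function name `extract_text` and the `min_confidence` parameter are hypothetical, and treating `"status": "0"` as success is an assumption inferred from the sample, not a documented contract.

```python
import json

# Sample response copied from the README in this commit.
SAMPLE_RESPONSE = (
    '{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定",'
    '"text_region":[[345,377],[641,390],[634,540],[339,528]]},'
    '{"confidence":0.8131805658340454,"text":"最终相遇",'
    '"text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}'
)


def extract_text(response_body: str, min_confidence: float = 0.0):
    """Return (text, confidence) pairs from every image in the response."""
    payload = json.loads(response_body)
    # Assumption: "0" marks success, as in the sample above.
    if payload.get("status") != "0":
        raise RuntimeError(f"OCR request failed: {payload.get('msg')!r}")
    pairs = []
    for image_result in payload["results"]:   # one entry per input image
        for item in image_result:             # one entry per detected text line
            if item["confidence"] >= min_confidence:
                pairs.append((item["text"], item["confidence"]))
    return pairs


recognized = [text for text, _ in extract_text(SAMPLE_RESPONSE)]
```

Passing a threshold such as `min_confidence=0.82` keeps only the first of the two detections in the sample.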
diff --git a/docker/hubserving/readme.md b/docker/hubserving/readme.md
@@ -1,55 +1,58 @@
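-# Dockerized deployment service
-In day-to-day projects, most people want to use Docker to package the PaddleOCR service into an image so that it can be quickly released and put into use in a Docker or k8s environment.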
+English | [简体中文](README_cn.md)
 
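-This document provides some standardized code to achieve this goal. With the following steps, you can quickly publish the PaddleOCR project as a callable RESTful API service. (At present, only deployment based on the HubServing mode is implemented; the author plans to add deployment based on the PaddleServing mode later.)
+## Introduction
+Many users want to package the PaddleOCR service into a Docker image so that it can be quickly released and used in a Docker or k8s environment.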
 
-## 1. Prerequisites
+This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project as a callable RESTful API service through the following steps. (At present, only deployment based on the HubServing mode is implemented; the author plans to add deployment based on the PaddleServing mode in the future.)
 
-You need to install the following basic components first:
-a. Docker environment
-b. Graphics driver and CUDA 10.0+ (GPU)
-c. NVIDIA Container Toolkit (GPU; this step can be skipped with Docker 19.03 or later)
+## 1. Prerequisites
+
+You need to install the following basic components first:
+a. Docker
+b. Graphics driver and CUDA 10.0+ (GPU)
+c. NVIDIA Container Toolkit (GPU; Docker 19.03+ can skip this)
 d. cuDNN 7.6+ (GPU)
 
-## 2. Build the image
-a. Download the PaddleOCR project code
+## 2. Build Image
+a. Download the PaddleOCR source code
 ```
 git clone https://github.com/PaddlePaddle/PaddleOCR.git
 ```
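-b. Switch to the Dockerfile directory (note: the CPU and GPU versions are separate; the following uses the CPU version as an example, and for the GPU version you only need to replace the keyword)
+b. Go to the Dockerfile directory (note: the CPU and GPU versions are separate; the following uses the CPU version as an example, and for the GPU version you need to replace the keyword accordingly)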
 ```
 cd docker/cpu
 ```
-c. Build the image
+c. Build image
 ```
 docker build -t paddleocr:cpu . 
 ```
 
-## 3. Start the Docker container
-a. CPU version
+## 3. Start container
+a. CPU version
 ```
 sudo docker run -dp 8866:8866 --name paddle_ocr paddleocr:cpu
 ```
-b. GPU version (via NVIDIA Container Toolkit)
+b. GPU version (based on NVIDIA Container Toolkit)
 ```
 sudo nvidia-docker run -dp 8866:8866 --name paddle_ocr paddleocr:gpu
 ```
-c. GPU version (with Docker 19.03 or later, you can use the following command directly)
+c. GPU version (Docker 19.03+)
 ```
 sudo docker run -dp 8866:8866 --gpus all --name paddle_ocr paddleocr:gpu
 ```
-d. Check that the service is running (messages such as Successfully installed ocr_system and Running on http://0.0.0.0:8866/ indicate that it started successfully)
+d. Check the service status (the service has started successfully if the log shows both: Successfully installed ocr_system and Running on http://0.0.0.0:8866/)
 ```
 docker logs -f paddle_ocr
 ```
 
-## 4. Test the service
-a. Compute the Base64 encoding of the image to be recognized (for a quick test, you can use a free online tool such as http://tool.chinaz.com/tools/imgtobase/)
-b. Send a service request (see the value in sample_request.txt)
+## 4. Test
+a. Compute the Base64 encoding of the image to be recognized (for a quick test, you can use a free online tool such as https://freeonlinetools24.com/base64-image/)
+b. Send a service request (see sample_request.txt for a sample request)
+
 ```
-curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Fill in the image's Base64 encoding (remove 'data:image/jpg;base64,')\"]}" http://localhost:8866/predict/ocr_system
+curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Base64-encoded image (remove the 'data:image/jpg;base64,' prefix)\"]}" http://localhost:8866/predict/ocr_system
 ```
-c. Response (if the call succeeds, the following result is returned)
+c. Get the response (if the call succeeds, the following result is returned)
 ```
 {"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
 ```
diff --git a/paddleocr.py b/paddleocr.py
@@ -129,6 +129,7 @@ def str2bool(v):
 
     parser.add_argument("--det", type=str2bool, default=True)
     parser.add_argument("--rec", type=str2bool, default=True)
+    parser.add_argument("--use_zero_copy_run", type=str2bool, default=False)
     return parser.parse_args()
 
 
@@ -209,4 +210,4 @@ def main():
         print(img_path)
         result = ocr_engine.ocr(img_path, det=args.det, rec=args.rec)
         for line in result:
-            print(line)
+            print(line)
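A note on boolean flags in this parser: `--det` and `--rec` go through a `str2bool` converter rather than `type=bool`, because argparse calls the `type` callable on the raw string and `bool("False")` is `True` for any non-empty string. The sketch below illustrates the difference; the `str2bool` body here is a hypothetical stand-in (the real helper in `paddleocr.py` is not shown in this diff), and `--flag` is a placeholder option name.

```python
import argparse


def str2bool(v: str) -> bool:
    # Hypothetical stand-in for the str2bool helper referenced in paddleocr.py.
    return v.lower() in ("true", "t", "1")


# Parser using the built-in bool as the converter.
naive = argparse.ArgumentParser()
naive.add_argument("--flag", type=bool, default=False)

# Parser using an explicit string-to-bool converter.
fixed = argparse.ArgumentParser()
fixed.add_argument("--flag", type=str2bool, default=False)

# bool("False") is True because the string is non-empty.
naive_value = naive.parse_args(["--flag", "False"]).flag
# str2bool("False") is False, which is what the user asked for.
fixed_value = fixed.parse_args(["--flag", "False"]).flag
```

With `type=bool`, the only way to get `False` from the command line is to omit the flag or pass an empty string, which is rarely what a user intends.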
diff --git a/ppocr/data/rec/dataset_traversal.py b/ppocr/data/rec/dataset_traversal.py
@@ -257,6 +257,7 @@ def sample_iter_reader():
                         norm_img = process_image_srn(
                             img=img,
                             image_shape=self.image_shape,
+                            char_ops=self.char_ops,
                             num_heads=self.num_heads,
                             max_text_length=self.max_text_length)
                     else: