
In Pipeline mode with the thread model (is_thread_op: True), predict cannot use all CPU cores when there are multiple concurrent requests #985

Closed
zhfkt opened this issue Jan 21, 2021 · 0 comments
Assignees
Labels
question Further information is requested

Comments


zhfkt commented Jan 21, 2021

I ran the Serving pipeline OCR demo directly, following this official example:

https://github.com/PaddlePaddle/Serving/blob/v0.4.0/python/examples/pipeline/ocr/README_CN.md

With multiple concurrent requests, the process model (is_thread_op: False in config.yml) can use all CPU cores for predict. But after switching to the thread model (is_thread_op: True), predict cannot use all CPU cores; it only uses as many cores as the local predictor's thread_num.

Steps to reproduce:

  1. Set up the demo following https://github.com/PaddlePaddle/Serving/blob/v0.4.0/python/examples/pipeline/ocr/README_CN.md.

  2. Set the following fields in config.yml:
    a. worker_num: 10
    b. concurrency: 8 under both the det and rec ops

  3. From https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/models_list.md, download the large model https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar (the large model is used because predicting a single image takes about 30 seconds).

  4. Convert the model to the Serving format with the script https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/pdserving/inference_to_serving.py.

  5. In config.yml, set the rec model path (model_config under op -> rec) to the path of the newly converted Serving-format large model:
    model_config: inference/ch_ppocr_server_v1.1_rec_infer/serving_server_dir

  6. In process mode (is_thread_op: False in config.yml), run python pipeline_http_client.py simultaneously in two separate terminals to simulate two concurrent requests. The CPU uses 4 cores (the local predictor's thread_num defaults to 2, and the two requests spawn two processes -> 2 processes * 2 threads = 4 CPU cores).
    The screenshot below shows htop with the multi-process mode saturating 4 CPU cores at 100%:

[screenshot: htop, 4 cores at 100%]

  7. In thread mode (is_thread_op: True in config.yml), run python pipeline_http_client.py simultaneously in two separate terminals, again simulating two concurrent requests. The CPU now uses only 2 cores.
    The screenshot below shows htop with the multi-thread mode saturating only 2 CPU cores (thread_num's default in the local predictor) at 100%:

[screenshot: htop, 2 cores at 100%]
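Putting the settings from steps 2 and 5 together, the relevant part of config.yml might look like the sketch below. This is an assumption based on the fields named in this issue, not a complete or verified config; the exact nesting (e.g. whether is_thread_op sits under a dag section, or model_config under a local_service_conf section) can differ between Serving versions.

```yaml
# Sketch of the config.yml fields touched in this issue (not complete).
worker_num: 10            # step 2a
is_thread_op: false       # process model; set to true for the thread model
op:
  det:
    concurrency: 8        # step 2b
  rec:
    concurrency: 8        # step 2b
    # step 5: path of the converted Serving-format large model
    model_config: inference/ch_ppocr_server_v1.1_rec_infer/serving_server_dir
    # thread_num: 2       # local predictor's default intra-op thread count
```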

To summarize the observation: in thread mode, predict can only use thread_num (from the local predictor) CPU cores. With multiple concurrent requests (sending the same request simultaneously, here by running python pipeline_http_client.py in two terminals at once), it cannot use all CPU cores. It is as if the concurrency and worker_num parameters had no effect.
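The core-count arithmetic reported in steps 6 and 7 can be sketched as a toy model. This is only an illustration of the observed behavior, not Serving's actual scheduling code; the function name and the assumption that each predictor instance saturates exactly thread_num cores are mine.

```python
def cores_saturated(num_predictors, thread_num=2):
    """Toy model of the observed behavior: each local predictor instance
    runs thread_num intra-op compute threads, so the number of saturated
    CPU cores is num_predictors * thread_num. thread_num=2 is the local
    predictor's default mentioned in this issue."""
    return num_predictors * thread_num

# Process model: two concurrent requests spawn two predictor processes.
assert cores_saturated(num_predictors=2) == 4  # matches step 6: 4 cores busy

# Thread model (as reported): all requests share one predictor instance.
assert cores_saturated(num_predictors=1) == 2  # matches step 7: 2 cores busy
```

The mismatch the issue describes is that in thread mode the effective num_predictors appears stuck at 1 regardless of concurrency or worker_num.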

Please review whether this is a bug.

Thank you !

@TeslaZhao TeslaZhao self-assigned this Jan 22, 2021
@TeslaZhao TeslaZhao added the question Further information is requested label Jan 22, 2021
@paddle-bot paddle-bot bot closed this as completed Apr 16, 2024