Parallel inference on multiple images with a single GPU #9544

Closed
catchy666 opened this issue Mar 26, 2023 · 6 comments

@catchy666

  • System Environment: Ubuntu 18.04

  • Version: Paddle 2.4.2.post112, PaddleOCR 2.6.1.3

  • We usually feed a single image to the PaddleOCR model, like this:

from paddleocr import PaddleOCR

model = PaddleOCR(use_angle_cls=True, lang='ch', show_log=False, use_gpu=True)
result = model.ocr(image, det=True, rec=True, cls=False)
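  • However, when there are multiple images, this one-by-one approach is not very efficient.
  • So I have two ideas:
    1. Does PaddleOCR support batch input? (From what I found, it does not seem to.)
    2. Use Python multiprocessing to create N processes, each holding its own independent PaddleOCR model, and pass the images to these processes as arguments for inference. Below is a brief implementation and the error it produces: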
from multiprocessing import Pool


def recognizer_in_subproc(model, images):
    text_results = []
    for image in images:
        # Run paddleocr
        result = model.ocr(image, cls=False)[0]

        # Collect the recognized text of every line in this image
        txts = [line[1][0] for line in result]
        text_results.append(txts)
    # Return after the loop so that every image in the group is processed
    return text_results


# Build ocr models
models = []
for _ in range(4):
    _ppocr = PaddleOCR(use_angle_cls=True, lang='ch', show_log=False, use_gpu=True)
    models.append(_ppocr)

# Create & run multi-processing
text_results, processes = {}, []
pool = Pool(processes=4)
for i in range(len(images_list)):
    # Passing models[i] as an argument requires pickling it, which raises the TypeError shown below
    p = pool.apply_async(func=recognizer_in_subproc, args=(models[i], images_list[i]))
    processes.append(p)

# Wait for all sub-processes to complete
pool.close()
pool.join()
for p in processes:
    temp = p.get()
    …

But it seems a Paddle model cannot simply be passed to a subprocess as an argument? The error is as follows:

  File "code/inference.py", line 218, in text_recognizer_async
    temp = p.get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 537, in _handle_tasks
    put(task)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle 'paddle.fluid.libpaddle.PaddleInferPredictor' object

Are there any good suggestions for parallel inference on multiple images with a single GPU? Thanks!

@MissPenguin
Collaborator

@BaoBaoJianqiang

> Multi-process inference is already supported; see: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/FAQ.md#q-%E5%A6%82%E4%BD%95%E5%A4%9A%E8%BF%9B%E7%A8%8B%E9%A2%84%E6%B5%8B

It simply doesn't work well; it's even slower.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@dengbuqi

dengbuqi commented Sep 5, 2023

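I have the same problem. I've read through many people's questions; this was already being asked almost two years ago, and there is still no suitable solution! ㅠ_ㅠ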

@FarV-Ma

FarV-Ma commented Sep 13, 2023

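> Multi-process inference is already supported; see: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/FAQ.md#q-%E5%A6%82%E4%BD%95%E5%A4%9A%E8%BF%9B%E7%A8%8B%E9%A2%84%E6%B5%8B

The improvement is very limited: it only gives a fluctuating, non-sustained speedup of less than 20%...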
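
To check a figure like this on one's own hardware, a small timing helper can compare images per second for two inference strategies. This is a minimal sketch: `images_per_second` and `speedup` are hypothetical helpers, and `run_fn` stands for any callable that processes a list of images (for example a sequential loop or a multi-process runner).

```python
import time


def images_per_second(run_fn, images, warmup=1, repeats=3):
    """Return the best observed throughput of run_fn(images), in images per second."""
    for _ in range(warmup):
        run_fn(images)  # warm-up runs keep one-time costs such as model loading out of the timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_fn(images)
        best = min(best, time.perf_counter() - start)
    return len(images) / max(best, 1e-12)  # guard against a zero elapsed time


def speedup(baseline_fn, candidate_fn, images, **kwargs):
    """Ratio of candidate throughput to baseline throughput; values above 1 mean faster."""
    baseline = images_per_second(baseline_fn, images, **kwargs)
    candidate = images_per_second(candidate_fn, images, **kwargs)
    return candidate / baseline
```

Taking the best of several repeats reduces the effect of run-to-run fluctuation, which matters when the expected gain is small.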

@lilqz66

lilqz66 commented Oct 16, 2024

> > Multi-process inference is already supported; see: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/FAQ.md#q-%E5%A6%82%E4%BD%95%E5%A4%9A%E8%BF%9B%E7%A8%8B%E9%A2%84%E6%B5%8B
>
> It simply doesn't work well; it's even slower.

It really is slower 🥹
