Exception at a third party lib importing line (from paddleocr import PaddleOCR) #450

Closed
@purplesword

Problem Description

When using pdoc to generate documentation for a module that contains the line from paddleocr import PaddleOCR, the following error occurs:

Traceback (most recent call last):
  File "venv/lib/python3.10/site-packages/pdoc/extract.py", line 211, in load_module
    return importlib.import_module(module)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "./debug.py", line 1, in <module>
    from paddleocr import PaddleOCR
  File "venv/lib/python3.10/site-packages/paddleocr/__init__.py", line 14, in <module>
    from .paddleocr import *
  File "venv/lib/python3.10/site-packages/paddleocr/paddleocr.py", line 21, in <module>
    import paddle
  File "venv/lib/python3.10/site-packages/paddle/__init__.py", line 25, in <module>
    from .framework import monkey_patch_variable
  File "venv/lib/python3.10/site-packages/paddle/framework/__init__.py", line 17, in <module>
    from . import random  # noqa: F401
  File "venv/lib/python3.10/site-packages/paddle/framework/random.py", line 16, in <module>
    import paddle.fluid as fluid
  File "venv/lib/python3.10/site-packages/paddle/fluid/__init__.py", line 36, in <module>
    from . import framework
  File "venv/lib/python3.10/site-packages/paddle/fluid/framework.py", line 37, in <module>
    from . import core
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 247, in <module>
    if libc_type == 'glibc' and less_than_ver(libc_ver, '2.23'):
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 234, in less_than_ver
    return operator.lt(to_list(a), to_list(b))
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 232, in to_list
    return [int(x) for x in s.split('.')]
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 232, in <listcomp>
    return [int(x) for x in s.split('.')]
ValueError: invalid literal for int() with base 10: "-c ldd --version | awk '/ldd/{print $NF}'"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "venv/bin/pdoc", line 8, in <module>
    sys.exit(cli())
  File "venv/lib/python3.10/site-packages/pdoc/__main__.py", line 186, in cli
    pdoc.pdoc(
  File "venv/lib/python3.10/site-packages/pdoc/__init__.py", line 493, in pdoc
    all_modules[module_name] = doc.Module.from_name(module_name)
  File "venv/lib/python3.10/site-packages/pdoc/doc.py", line 388, in from_name
    return cls(extract.load_module(name))
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "venv/lib/python3.10/site-packages/pdoc/extract.py", line 213, in load_module
    raise RuntimeError(f"Error importing {module}") from e
RuntimeError: Error importing debug

The issue seems to lie in pdoc's restriction on subprocess.Popen. The import fails when paddle tries to run ldd via subprocess.Popen to get the glibc version.
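The failing step can be reproduced in isolation. The sketch below copies the two helpers as they appear in the traceback (to_list and less_than_ver from paddle/fluid/core.py) and feeds them the string from the error message. It assumes that paddle received its own command line back instead of the output of ldd; how that string was produced is not shown in the traceback.

```python
import operator


def to_list(s):
    # Same parsing as shown in the traceback: split on dots, convert to int.
    return [int(x) for x in s.split('.')]


def less_than_ver(a, b):
    # Compare two dotted version strings component by component.
    return operator.lt(to_list(a), to_list(b))


# With a real glibc version the comparison works as intended.
normal_result = less_than_ver("2.35", "2.23")
print(normal_result)

# Assumption: this is what paddle got back instead of a version number.
echoed = "-c ldd --version | awk '/ldd/{print $NF}'"

try:
    less_than_ver(echoed, "2.23")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The string contains no dots, so it reaches int() as a single element and raises the ValueError seen above.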

Steps to reproduce the behavior:

  1. Create a Python file (e.g. debug.py) containing the single line from paddleocr import PaddleOCR
  2. Run pdoc -o docs/api ./debug.py
  3. The error above is raised.

System Information

pdoc: 12.2.0
Python: 3.10.6, 3.9.12, etc
Platform: Linux-with-glibc2.35, Linux-with-glibc2.31, etc

Extra:

Even if I bypass get_libc_ver, there are still plenty of other errors, such as:

ImportError:  venv/lib/python3.10/site-packages/paddle/fluid/core_avx.so: undefined symbol: _dl_sym, version GLIBC_PRIVATE

Is there a way to build the API docs while ignoring all these external imports? After all, they are not that relevant to the final generated HTML.
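One general Python technique for this, not something pdoc is confirmed here to provide, is to register placeholder modules in sys.modules before the documented module is imported. A minimal sketch follows; stub_modules is a hypothetical helper name, and it only helps if the documented code does not need the real package at import time.

```python
import sys
from unittest import mock


def stub_modules(*names):
    """Register MagicMock placeholders so importing these names skips the real packages."""
    # Hypothetical helper, not part of pdoc or paddleocr.
    for name in names:
        sys.modules[name] = mock.MagicMock(name=name)


stub_modules("paddleocr")

# The import now resolves against the placeholder, so paddle is never loaded.
from paddleocr import PaddleOCR

print(type(PaddleOCR).__name__)
```

The stubs must be registered in the same Python process that later imports debug.py, so this would have to run from a small driver script that then calls pdoc.pdoc(...), the entry point visible in the traceback, rather than through the pdoc command line.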
