Exception at a third party lib importing line (from paddleocr import PaddleOCR) #450

Closed
purplesword opened this issue Nov 4, 2022 · 1 comment
@purplesword

Problem Description

When using pdoc to generate documentation for a module that contains the line from paddleocr import PaddleOCR, the following error occurs:

Traceback (most recent call last):
  File "venv/lib/python3.10/site-packages/pdoc/extract.py", line 211, in load_module
    return importlib.import_module(module)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "./debug.py", line 1, in <module>
    from paddleocr import PaddleOCR
  File "venv/lib/python3.10/site-packages/paddleocr/__init__.py", line 14, in <module>
    from .paddleocr import *
  File "venv/lib/python3.10/site-packages/paddleocr/paddleocr.py", line 21, in <module>
    import paddle
  File "venv/lib/python3.10/site-packages/paddle/__init__.py", line 25, in <module>
    from .framework import monkey_patch_variable
  File "venv/lib/python3.10/site-packages/paddle/framework/__init__.py", line 17, in <module>
    from . import random  # noqa: F401
  File "venv/lib/python3.10/site-packages/paddle/framework/random.py", line 16, in <module>
    import paddle.fluid as fluid
  File "venv/lib/python3.10/site-packages/paddle/fluid/__init__.py", line 36, in <module>
    from . import framework
  File "venv/lib/python3.10/site-packages/paddle/fluid/framework.py", line 37, in <module>
    from . import core
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 247, in <module>
    if libc_type == 'glibc' and less_than_ver(libc_ver, '2.23'):
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 234, in less_than_ver
    return operator.lt(to_list(a), to_list(b))
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 232, in to_list
    return [int(x) for x in s.split('.')]
  File "venv/lib/python3.10/site-packages/paddle/fluid/core.py", line 232, in <listcomp>
    return [int(x) for x in s.split('.')]
ValueError: invalid literal for int() with base 10: "-c ldd --version | awk '/ldd/{print $NF}'"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "venv/bin/pdoc", line 8, in <module>
    sys.exit(cli())
  File "venv/lib/python3.10/site-packages/pdoc/__main__.py", line 186, in cli
    pdoc.pdoc(
  File "venv/lib/python3.10/site-packages/pdoc/__init__.py", line 493, in pdoc
    all_modules[module_name] = doc.Module.from_name(module_name)
  File "venv/lib/python3.10/site-packages/pdoc/doc.py", line 388, in from_name
    return cls(extract.load_module(name))
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "venv/lib/python3.10/site-packages/pdoc/extract.py", line 213, in load_module
    raise RuntimeError(f"Error importing {module}") from e
RuntimeError: Error importing debug

The issue seems to lie in pdoc's subprocess.Popen restriction: the import fails when paddle tries to use ldd to get the glibc version via subprocess.Popen.
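
The failure can be reproduced in isolation. The sketch below copies the to_list helper shown in the traceback and feeds it first a plausible ldd version string, then the string from the ValueError. That the restricted subprocess.Popen call is what handed back this string is an inference from the traceback, not something verified against pdoc's source.

```python
def to_list(s):
    # Same body as paddle/fluid/core.py line 232 in the traceback above.
    return [int(x) for x in s.split('.')]


# A normal glibc version string parses into a list of integers.
parsed = to_list("2.35")

# The string from the ValueError above contains no version number,
# so int() rejects it.
blocked_output = "-c ldd --version | awk '/ldd/{print $NF}'"
try:
    to_list(blocked_output)
    failed = False
except ValueError:
    failed = True
```

Here parsed holds the two version components and failed ends up True, which matches the first exception in the traceback.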

Steps to reproduce the behavior:

  1. Create a Python file (e.g. debug.py) containing the single line from paddleocr import PaddleOCR
  2. Run pdoc -o docs/api ./debug.py
  3. The error above is reproduced

System Information

pdoc: 12.2.0
Python: 3.10.6, 3.9.12, etc
Platform: Linux-with-glibc2.35, Linux-with-glibc2.31, etc

Extra:

Even if I bypass get_libc_ver, there are still plenty of errors, such as:

ImportError:  venv/lib/python3.10/site-packages/paddle/fluid/core_avx.so: undefined symbol: _dl_sym, version GLIBC_PRIVATE

Is there a way to build the API docs while ignoring all these external imports? After all, they are not that relevant to the final generated HTML docs.

@purplesword purplesword added the bug label Nov 4, 2022
@purplesword purplesword changed the title exception at a third party lib importing line (from paddleocr import PaddleOCR) Exception at a third party lib importing line (from paddleocr import PaddleOCR) Nov 4, 2022
mhils added a commit to mhils/pdoc that referenced this issue Nov 5, 2022
@mhils
Member

mhils commented Nov 5, 2022

Thanks for the clear report! I think we do want to keep blocking subprocess execution on startup; there's just a lot of code that has very unintended side effects on import. Two ways forward here:

First, you can use pdoc as a library and import the problematic module yourself before pdoc starts blocking subprocess execution.
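
A minimal sketch of this first option, assuming that a module already cached in sys.modules is not re-executed when pdoc later imports the documented file. The helper names preload and build_docs are hypothetical; pdoc.pdoc with an output_directory argument is the call visible in the traceback's pdoc/__init__.py frame. Whether this is enough for paddleocr specifically is untested here, since the report mentions further ImportErrors.

```python
import importlib
import sys
from pathlib import Path


def preload(module_names):
    """Import modules up front so they are already cached in sys.modules."""
    for name in module_names:
        importlib.import_module(name)
    return [name for name in module_names if name in sys.modules]


def build_docs(module_spec, output_dir, preload_names=()):
    """Preload problematic imports, then let pdoc render the docs."""
    preload(preload_names)
    # Imported here rather than at the top so this sketch can be loaded
    # without pdoc installed.
    import pdoc

    pdoc.pdoc(module_spec, output_directory=Path(output_dir))


# Usage (requires pdoc and paddleocr to be installed):
# build_docs("./debug.py", "docs/api", preload_names=["paddleocr"])
```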

Second, I've just pushed #451, which adds a PDOC_ALLOW_EXEC environment variable to allow all subprocess execution. Running PDOC_ALLOW_EXEC=1 pdoc -o docs/api ./debug.py works for me as expected!
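
When pdoc is invoked from a build script rather than a shell, the same variable can be set programmatically. This is a sketch assuming a pdoc version that includes #451 and a pdoc executable on PATH; pdoc_env and run_pdoc are hypothetical helper names.

```python
import os
import subprocess


def pdoc_env(base_env=None):
    """Return a copy of the environment with PDOC_ALLOW_EXEC=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PDOC_ALLOW_EXEC"] = "1"
    return env


def run_pdoc(module_spec="./debug.py", output_dir="docs/api"):
    """Run the pdoc CLI with subprocess execution allowed."""
    cmd = ["pdoc", "-o", output_dir, module_spec]
    return subprocess.run(cmd, env=pdoc_env(), check=True)
```

Copying the environment instead of mutating os.environ keeps the setting scoped to the pdoc child process.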

@mhils mhils closed this as completed in 5f78c54 Nov 5, 2022
adigitoleo added a commit to seismic-anisotropy/PyDRex that referenced this issue Mar 22, 2024
Introduced in mitmproxy/pdoc#451, to mitigate
the issue reported in mitmproxy/pdoc#450.
I'm guessing the gmsh python module does some similar nonsense to that
library, i.e. executing subprocess.Popen upon import or something like
that, because before adding the gmsh dep this was not required.