Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: staticly import Python modules in C++ global scope #4263

Closed
2 of 3 tasks
XuehaiPan opened this issue Oct 20, 2022 · 8 comments
Closed
2 of 3 tasks

[BUG]: staticly import Python modules in C++ global scope #4263

XuehaiPan opened this issue Oct 20, 2022 · 8 comments
Labels
triage New bug, unverified

Comments

@XuehaiPan
Copy link
Contributor

XuehaiPan commented Oct 20, 2022

Required prerequisites

Problem description

Failed to load a .pyd library if it contains global imports with Python 3.8-3.10 (Python 3.6-3.7 works fine). See https://github.com/metaopt/optree/actions/runs/3288272968/jobs/5418431116 for more details.

  + python -c "import optree"
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "C:\Users\runneradmin\AppData\Local\Temp\cibw-run-in9p445l\cp38-win_amd64\venv-test\lib\site-packages\optree\__init__.py", line 17, in <module>
      from optree.ops import (
    File "C:\Users\runneradmin\AppData\Local\Temp\cibw-run-in9p445l\cp38-win_amd64\venv-test\lib\site-packages\optree\ops.py", line 24, in <module>
      import optree._C as _C
  ImportError: DLL load failed while importing _C: A dynamic link library (DLL) initialization routine failed.

Related code:

https://github.com/metaopt/optree/blob/fa1bb70f2e92f5a3c110c00fc1790dd20d36b72d/include/utils.h#L51-L56

For Python 3.8-3.10, the import fails if there are C++ global-level imports:

namespace py = pybind11;

const py::module_ collections_module = py::module_::import("collections");

void foo(py::handle) {
    const py::object OrderedDict = collections_module.attr("OrderedDict");
}

Everything works fine if I change this into a function static variable:

namespace py = pybind11;

void foo(py::handle) {
    static const py::module_ collections_module = py::module_::import("collections");
    const py::object OrderedDict = collections_module.attr("OrderedDict");
}

Reproducible example code

Directory structure:

.
├── CMakeLists.txt
├── test.cpp
└── third-party
    └── pybind11
        ├── include
        ...

Source files:

// test.cpp

#include <pybind11/pybind11.h>

namespace test {

namespace py = pybind11;

const py::module_ collections_module = py::module_::import("collections");

void BuildModule(py::module& mod) {}

}  // namespace test

PYBIND11_MODULE(_C, mod) { test::BuildModule(mod); }
# CMakeLists.txt

cmake_minimum_required(VERSION 3.4)
project(test LANGUAGES CXX)

if(NOT CMAKE_BUILD_TYPE)
    set(CMAKE_BUILD_TYPE Release)
endif()

set(CMAKE_CXX_STANDARD 14)

find_package(Threads REQUIRED)           # -pthread
find_package(OpenMP REQUIRED)            # -Xpreprocessor -fopenmp
set(CMAKE_POSITION_INDEPENDENT_CODE ON)  # -fPIC
set(CMAKE_CXX_VISIBILITY_PRESET hidden)  # -fvisibility=hidden

if(MSVC)
    string(APPEND CMAKE_CXX_FLAGS " /Wall")
    string(APPEND CMAKE_CXX_FLAGS_DEBUG " /Zi")
    string(APPEND CMAKE_CXX_FLAGS_RELEASE " /O2 /Ob2")
else()
    string(APPEND CMAKE_CXX_FLAGS " -Wall")
    string(APPEND CMAKE_CXX_FLAGS_DEBUG " -g -Og")
    string(APPEND CMAKE_CXX_FLAGS_RELEASE " -O3")
endif()

if(NOT DEFINED PYTHON_EXECUTABLE)
    if(WIN32)
        set(PYTHON_EXECUTABLE "python.exe")
    else()
        set(PYTHON_EXECUTABLE "python")
    endif()
endif()

add_subdirectory(third_party/pybind11)
include_directories(third_party/pybind11/include)

include_directories("${CMAKE_SOURCE_DIR}")
set(csrc test.cpp)

pybind11_add_module(_C THIN_LTO "${csrc}")

target_link_libraries(_C PUBLIC)

Command-line and results:

Python 3.7:

$ conda create -n py37 python=3.7 -y
$ conda activate py37
$ mkdir build-py37
$ cd build-py37

$ cmake .. -DCMAKE_BUILD_TYPE=DEBUG
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19044.
-- The CXX compiler identification is MSVC 19.33.31629.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.33.31629/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE  
-- Found OpenMP_CXX: -openmp (found version "2.0") 
-- Found OpenMP: TRUE (found version "2.0")  
-- pybind11 v2.11.0 dev1
-- Found PythonInterp: python.exe (found suitable version "3.7.13", minimum required is "3.6") 
-- Found PythonLibs: C:/Users/PanXuehai/Miniconda3/envs/py37/libs/python37.lib
-- Performing Test HAS_MSVC_GL_LTCG
-- Performing Test HAS_MSVC_GL_LTCG - Success
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/PanXuehai/Projects/test/build-py37

$ cmake --build .
exit with code 0

$ cd Debug
$ ls

   Directory: C:\Users\PanXuehai\Projects\test\build-py37\Debug

_C.cp37-win_amd64.pyd  _C.exp                 _C.lib                 _C.pdb

$ python -c 'import collections; import _C'
exit with code 0

Python 3.10:

$ conda create -n py310 python=3.10 -y
$ conda activate py310
$ mkdir build-py310
$ cd build-py310

$ cmake .. -DCMAKE_BUILD_TYPE=DEBUG
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19044.
-- The CXX compiler identification is MSVC 19.33.31629.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.33.31629/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE  
-- Found OpenMP_CXX: -openmp (found version "2.0") 
-- Found OpenMP: TRUE (found version "2.0")  
-- pybind11 v2.11.0 dev1
-- Found PythonInterp: python.exe (found suitable version "3.10.6", minimum required is "3.6") 
-- Found PythonLibs: C:/Users/PanXuehai/Miniconda3/envs/py38/libs/python310.lib
-- Performing Test HAS_MSVC_GL_LTCG
-- Performing Test HAS_MSVC_GL_LTCG - Success
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/PanXuehai/Projects/test/build-py310

$ cmake --build .
exit with code 0

$ cd Debug
$ ls

   Directory: C:\Users\PanXuehai\Projects\test\build-py310\Debug

_C.cp310-win_amd64.pyd  _C.exp                 _C.lib                 _C.pdb

$ python -c 'import collections; import _C'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: DLL load failed while importing _C: A dynamic link library (DLL) initialization routine failed.
@XuehaiPan XuehaiPan added the triage New bug, unverified label Oct 20, 2022
@XuehaiPan
Copy link
Contributor Author

I found a workaround (metaopt/optree@b0c4212). It works fine for Python 3.6-3.10 on Windows (https://github.com/metaopt/optree/actions/runs/3290405945/jobs/5423476779).

Change:

// header.h
static const py::module_ PyCollectionsModule = py::module_::import("collections");
static const py::object PyOrderedDictTypeObject = py::getattr(PyCollectionsModule, "OrderedDict");
static const py::object PyDefaultDictTypeObject = py::getattr(PyCollectionsModule, "defaultdict");
static const py::object PyDequeTypeObject = py::getattr(PyCollectionsModule, "deque");

to

// header.h

#ifdef _WIN32  // Windows
const py::object& ImportOrderedDict();
const py::object& ImportDefaultDict();
const py::object& ImportDeque();
#define PyOrderedDictTypeObject ImportOrderedDict()
#define PyDefaultDictTypeObject ImportDefaultDict()
#define PyDequeTypeObject ImportDeque()
#else  // UNIX
static const py::module_ PyCollectionsModule = py::module_::import("collections");
static const py::object PyOrderedDictTypeObject = py::getattr(PyCollectionsModule, "OrderedDict");
static const py::object PyDefaultDictTypeObject = py::getattr(PyCollectionsModule, "defaultdict");
static const py::object PyDequeTypeObject = py::getattr(PyCollectionsModule, "deque");
#endif
// impl.h

#ifdef _WIN32  // Windows
const py::object &ImportOrderedDict() {
    static const py::module_ collections = py::module_::import("collections");
    static const py::object object = py::getattr(collections, "OrderedDict");
    return object;
}

const py::object &ImportDefaultDict() {
    static const py::module_ collections = py::module_::import("collections");
    static const py::object object = py::getattr(collections, "defaultdict");
    return object;
}

const py::object &ImportDeque() {
    static const py::module_ collections = py::module_::import("collections");
    static const py::object object = py::getattr(collections, "deque");
    return object;
}
#endif

@EthanSteinberg
Copy link
Collaborator

EthanSteinberg commented Oct 22, 2022

I'm surprised your original code works on Linux at all. Calling cpython functions like import in a static context (outside of your module init functions) is generally a really bad idea because the Python interpreter might be in an inconsistent state. You original code was broken, not pybind11.

Also, your fix (to function statics) would be greatly improved by switching to py::module_ pointers instead of direct py::module_s to avoid similar issues with running destructors in a bad context.

const py::object &ImportDeque() {
    static const py::module_* collections = new py::module_(py::module_::import("collections"));
    static const py::object* object = new py::object(py::getattr(*collections, "deque"));
    return *object;
}

As a general rule in C++, never use a class with a constructor or destructor as a global static due to initialization and deallocation issues.

Thanks for the report though, pybind11 should definitely have better error reporting here. I'll see if I can write a PR with better error messages. At the very least, documentation should probably be improved.

@EthanSteinberg
Copy link
Collaborator

EthanSteinberg commented Oct 22, 2022

#4246 will probably fix the lack of good error messaging here

@XuehaiPan
Copy link
Contributor Author

XuehaiPan commented Nov 25, 2022

Fixed with function static variables:

#define PyCollectionsModule (*ImportCollections())
#define PyOrderedDictTypeObject (*ImportOrderedDict())
#define PyDefaultDictTypeObject (*ImportDefaultDict())
#define PyDequeTypeObject (*ImportDeque())

inline py::module_* ImportCollections() {
    static auto collectionsUptr = std::make_unique<py::module_>(
        py::reinterpret_borrow<py::module_>(py::module_::import("collections")));
    return collectionsUptr.get();
}

inline py::object* ImportOrderedDict() {
    static auto OrderedDictUptr = std::make_unique<py::object>(
        py::reinterpret_borrow<py::object>(py::getattr(PyCollectionsModule, "OrderedDict")));
    return OrderedDictUptr.get();
}

inline py::object* ImportDefaultDict() {
    static auto defaultdictUptr = std::make_unique<py::object>(
        py::reinterpret_borrow<py::object>(py::getattr(PyCollectionsModule, "defaultdict")));
    return defaultdictUptr.get();
}

inline py::object* ImportDeque() {
    static auto dequeUptr = std::make_unique<py::object>(
        py::reinterpret_borrow<py::object>(py::getattr(PyCollectionsModule, "deque")));
    return dequeUptr.get();
}

@EthanSteinberg
Copy link
Collaborator

@XuehaiPan That code is still not right. You absolutely should not be using unique_ptrs for those variables as you are not supposed to deallocate them in the global scope.

Use simply raw pointers that you intentionally leak.

@XuehaiPan
Copy link
Contributor Author

XuehaiPan commented Nov 25, 2022

That code is still not right. You absolutely should not be using unique_ptrs for those variables as you are not supposed to deallocate them in the global scope.

@lalaland Thanks for the advice. To my best knowledge, as the references are explicitly borrowed (py::module_::import and py::getattr return stolen references), the PyObject is alive on std::unique_ptr deallocation. And the py::object only holds an m_ptr member and the deallocation overhead is small.

Use simply raw pointers that you intentionally leak.

Both std::unique_ptr + borrow reference (in #4263 (comment)) and raw pointer + steal reference (in #4263 (comment)) works fine for me:

#define PyCollectionsModule (ImportCollections())
#define PyOrderedDictTypeObject (ImportOrderedDict())
#define PyDefaultDictTypeObject (ImportDefaultDict())
#define PyDequeTypeObject (ImportDeque())

inline const py::module_& ImportCollections() {
    static const py::module_* ptr = new py::module_{py::module_::import("collections")};
    return *ptr;
}
inline const py::object& ImportOrderedDict() {
    static const py::object* ptr = new py::object{py::getattr(PyCollectionsModule, "OrderedDict")};
    return *ptr;
}
inline const py::object& ImportDefaultDict() {
    static const py::object* ptr = new py::object{py::getattr(PyCollectionsModule, "defaultdict")};
    return *ptr;
}
inline const py::object& ImportDeque() {
    static const py::object* ptr = new py::object{py::getattr(PyCollectionsModule, "deque")};
    return *ptr;
}

The raw pointer version looks cleaner and I refactored my code in metaopt/optree#16.

@EthanSteinberg
Copy link
Collaborator

To my best knowledge, as the references are explicitly borrowed (py::module_::import and py::getattr return stolen references), the PyObject is alive on std::unique_ptr deallocation.

This is incorrect and might indicate a documentation issue on our side. Borrowed references do not mean that the PyObject is alive on std::unique_ptr deallocation. The borrowed vs stolen reference distinction is about whether the reference count for the PyObject gets increased when the py::object C++ wrapper is created. Reference counts are always decreased when py::objects are deallocated, whether it was originally from a stolen or borrowed reference.

And when that reference count is decreased, we call a CPython function to do that. And that CPython function cannot be called in a global static context. Thus you cannot deallocate any py::objects in a global static context. So you cannot use unique_ptr here. (Not to mention that if the reference count actually drops to zero, it will call a ton of CPython functions to actually free that PyObject ...)

Both std::unique_ptr + borrow reference (in #4263 (comment)) and raw pointer + steal reference (in #4263 (comment)) works fine for me:

Your code might appear to work work, but it's undefined behavior. It might be silently corrupting data or change behavior whenever you change compilers, CPython versions, OSes, etc, etc.

@EthanSteinberg
Copy link
Collaborator

(Also, as a PS, that library for optimized PyTrees looks really nice. I use JAX quite frequently and I'll have to check it out!)

@XuehaiPan XuehaiPan changed the title [BUG]: Dynamic link library (DLL) initialization routine failed on Windows if the code has global imports with Python 3.8-3.10 [BUG]: staticly import Python modules in C++ global scope Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triage New bug, unverified
Projects
None yet
Development

No branches or pull requests

2 participants