[BUG]: PyCapsule_SetName segfault #4420
Comments
For convenience, here's some information ported over from scipy/scipy#17644 (comment) and similar. Scipy has a compiled module With Pybind 2.10.2, on Windows only, the compiled Tyler's investigation revealed this was due to a new missing dependency. Direct import generates the following:
>>> import scipy.fft
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\treddy\python310_venv_windows\lib\site-packages\scipy\fft\__init__.py", line 92, in <module>
from ._helper import next_fast_len
File "C:\Users\treddy\python310_venv_windows\lib\site-packages\scipy\fft\_helper.py", line 3, in <module>
from ._pocketfft import helper as _helper
File "C:\Users\treddy\python310_venv_windows\lib\site-packages\scipy\fft\_pocketfft\__init__.py", line 3, in <module>
from .basic import *
File "C:\Users\treddy\python310_venv_windows\lib\site-packages\scipy\fft\_pocketfft\basic.py", line 6, in <module>
from . import pypocketfft as pfft
ImportError: DLL load failed while importing pypocketfft: A dynamic link library (DLL) initialization routine failed.
Further investigation revealed that there was a difference in declared imports between Pybind 2.10.1 and 2.10.2 - the 2.10.2 binary no longer imported @rgommers wondered whether this might be due to #4254. |
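For anyone trying to localize a failure like the one above, here is a minimal sketch (not something used in this thread) that imports each module in a child interpreter, so that a hard crash in one import does not take down the checker itself. The helper name `check_import` is hypothetical; the module names in the example loop are taken from the traceback and the issue description.

```python
import subprocess
import sys


def check_import(module_name):
    """Import module_name in a child interpreter and return (ok, detail).

    Running the import in a separate process means a crash inside a
    compiled extension shows up as a non-zero exit code here instead of
    killing the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return True, "ok"
    # Keep only the last stderr line, which for a Python exception is the
    # "ExceptionType: message" summary; a crash may leave stderr empty.
    stderr_lines = proc.stderr.strip().splitlines()
    summary = stderr_lines[-1] if stderr_lines else "no stderr output"
    return False, f"exit code {proc.returncode}: {summary}"


if __name__ == "__main__":
    # Module names as they appear in the traceback and issue text.
    for name in ("scipy.fft._pocketfft.pypocketfft", "scipy.fft", "scipy.stats"):
        ok, detail = check_import(name)
        print(f"{name}: {detail}")
```

Checking the innermost extension module first should show whether the failure sits in the compiled module itself or in one of the Python packages that import it.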
I left a comment here: scipy/scipy#17644 (comment) |
I excised the reproducer from SciPy so that you now have a standalone project with a single C++ source file, 1 C++ header file, and a build file: https://github.com/tylerjereddy/pybind_repro The At the very least, this should completely uncouple your debugging cycles from SciPy, which is going to save a ton of time most likely. |
This turned out not to be a pybind11 issue; it was a bug within SciPy. See scipy/scipy#17662. We should have better error logging within pybind11 though. See #4426 |
Thanks a lot for getting to the bottom of this @lalaland!
Given that this went into a bug fix rather than a minor release, and caused issues in SciPy and PyTorch already, it may still be wise to yank this release? You can argue that going from "working in practice but UB" to "hard to debug segfault" is not a pybind11 bug technically, but it's still pretty impactful. It even defeated SciPy's fairly careful pinning for the last already released version ( |
@rgommers Yeah, I think I agree with you. This is not acceptable for a bug fix change and it should probably be reverted and added back with better error reporting in the next minor release. Super sorry for all the work this caused on your part (especially during this time of the year). |
I'm sorry, too, that the failure reporting is so unobvious on some platforms. But what is "yanking"? I was thinking we make a 2.10.3 release with the better error reporting asap. Should we do something else? |
I forgot to add: only debug builds are affected. |
Yanking is a process you can do on PyPI to remove the release so it does not appear as a candidate for installation. |
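As a rough illustration of how a yanked release can be spotted programmatically: the sketch below assumes the layout of PyPI's JSON API, where each file listed under "releases" carries "yanked" and "yanked_reason" fields. The helper name `yanked_versions` and the sample data are made up for illustration and are not real PyPI output.

```python
def yanked_versions(pypi_json):
    """Return {version: reason} for releases whose files are all yanked.

    pypi_json is assumed to be the already-decoded JSON document for a
    project (a dict with a "releases" mapping of version -> list of files).
    """
    result = {}
    for version, files in pypi_json.get("releases", {}).items():
        if files and all(f.get("yanked", False) for f in files):
            reasons = [f["yanked_reason"] for f in files if f.get("yanked_reason")]
            result[version] = reasons[0] if reasons else None
    return result


# Hand-written sample in the assumed layout (illustrative only).
sample = {
    "releases": {
        "2.10.1": [
            {"filename": "example-2.10.1.tar.gz", "yanked": False, "yanked_reason": None}
        ],
        "2.10.2": [
            {"filename": "example-2.10.2.tar.gz", "yanked": True, "yanked_reason": "illustrative reason"}
        ],
    }
}

print(yanked_versions(sample))  # {'2.10.2': 'illustrative reason'}
```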
Thanks! Sounds great to me.
@henryiii do you know how to do that?
|
Okay, I've yanked it. Let's get a 2.10.3 out ASAP since anyone relying on the Windows ARM support added in 2.10.2 is now broken (unless they request >=2.10.2, which is not likely just to add a single modern platform). |
From what I understand, we revert the UB -> error behavior for 2.10.3, and make it an error with better error reporting in 2.11? |
No worries at all, and thanks for your help in resolving the issue! |
I wrote down my thoughts here: #4432 (comment) |
Could you verify the UB was actually working as intended in SciPy (and/or PyTorch)? (The |
@henryiii I highly doubt the UB in SciPy did anything wrong. Calling inc_ref on a none object is unlikely to break anything. Still worth fixing though. |
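One way to see why an extra reference on None is unlikely to matter in practice: None is a singleton that the interpreter already holds a very large number of references to. The snippet below is only a Python-level illustration of that point, not a reproduction of the pybind11 code path discussed here, and the exact counts it prints vary by Python version.

```python
import sys

# Reference count of the None singleton as reported by the interpreter.
before = sys.getrefcount(None)

# Hold 1000 additional references to None in a list.
extra_refs = [None] * 1000
after = sys.getrefcount(None)

# On some Python versions the count grows by the number of new references;
# on others None is treated specially and the reported count does not move.
print(before, after)
```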
Agreed - and no bug reports in ~3 years related to this for heavily used functionality in |
Required prerequisites
What version (or hash if on master) of pybind11 are you using?
2.10.2
Problem description
A detailed analysis of the situation that caused a hold-up of the SciPy release process on Windows is in this issue: scipy/scipy#17644 (comment)
Our test suite can't even start with pybind11 2.10.2 -- it literally segfaults during the pytest collection phase, and more specifically import scipy.stats segfaults, and scipy.fft import will fail as described here: scipy/scipy#17644 (comment)
Reproducible example code
Unfortunately, I probably don't have the bandwidth to excise an isolated reproducer, but I bet the pybind11 folks will benefit from my DLL analysis over here: scipy/scipy#17644 (comment)
Is this a regression? Put the last known working version here if it is.
2.10.1