Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_Py_wfopen no longer exported #127350

Open
PlanetCNC opened this issue Nov 27, 2024 · 21 comments
Open

_Py_wfopen no longer exported #127350

PlanetCNC opened this issue Nov 27, 2024 · 21 comments
Labels
OS-windows topic-C-API type-bug An unexpected behavior, bug, or error

Comments

@PlanetCNC
Copy link

PlanetCNC commented Nov 27, 2024

Bug report

Bug description:

_Py_wfopen in no longer exported since 3.13.
I'm using embed version and I can not use fopen or _wfopen.
Please reconsider decision to remove _Py_wfopen since it is only way to open file when used in embed mode and fopen/_wfopen is not available. Without it PyRun_FileExFlags is useless to me and my application can no longer call external scripts.

CPython versions tested on:

3.13

Operating systems tested on:

Windows

@PlanetCNC PlanetCNC added the type-bug An unexpected behavior, bug, or error label Nov 27, 2024
@skirpichev
Copy link
Member

Hidden by #107213

CC @vstinner

@vstinner
Copy link
Member

Hi,

I'm using embed version and I can not use fopen or _wfopen.

Why can't you use fopen() or _wfopen()?

@PlanetCNC
Copy link
Author

In my case PyRun_FileExFlags crashes if I use FILE* from fopen() or _wfopen().

I suspect this is because different toolset versions are used to compile python313.dll and our software. That is why we don't mix them.

With _Py_wfopen everything works flawlessly.

We integrated Python into our CNC machine control software and it was working great for lots of users until now. Here is an example which is no longer possible to run:
https://www.youtube.com/watch?v=MruGmi1JoCc

Here is our module documentation:
https://cnc.zone/sdk/python/python

@vstinner
Copy link
Member

In my case PyRun_FileExFlags crashes if I use FILE* from fopen() or _wfopen(). I suspect this is because different toolset versions are used to compile python313.dll and our software. That is why we don't mix them.

The FILE* is not NULL? That's strange.

You may give a try to _Py_fopen_obj() function which is the last "open" function which returns a FILE* in the Python C API. It's private, you're not supposed to use it.

@PlanetCNC
Copy link
Author

No, it is not NULL.

I tried _Py_fopen_obj() but I was unable to make it work. It is "private" anyway so I expect to be removed in next version.
How about making new, nonprivate function with same implementation as _Py_wfopen ?

@PlanetCNC
Copy link
Author

I managed to make _Py_fopen_obj() work. Please do not remove my last option to open file.

@ZeroIntensity
Copy link
Member

In my case PyRun_FileExFlags crashes if I use FILE* from fopen() or _wfopen().

That sounds like a bug, could you file a new issue with a reproducer?

But overall, I think we just need a public API if this is the only way to do something. You shouldn't ever rely on anything prefixed with _Py as a user.

@vstinner
Copy link
Member

Yeah, you should be able to call fopen() and pass the result to a Python API which expect a FILE* file.

@PlanetCNC
Copy link
Author

This is not correct. FILE* should not be passed across a DLL boundary which is case when using embed Python.
Here is Microsoft article explaining this in detail.
https://learn.microsoft.com/en-us/cpp/c-runtime-library/potential-errors-passing-crt-objects-across-dll-boundaries?view=msvc-170

@asvetlov
Copy link
Contributor

I read the article as:

  1. 10 years ago you should be careful and use the same Visual Studio version and the same multithreaded (/MT vs /MD) mode as it was used for compiling Python itself. It was always a good advice. Now CRT is cross-compatible which helps people a lot; good to know.
  2. Release the memory by the same dynamic library that was used for allocation. For example, call fclose() in your dll for files opened with fopen(). Also it sounds reasonable.

@ZeroIntensity
Copy link
Member

As far as I can tell, _Py_wfopen is just a wrapper over fopen and _wfopen anyway. It shouldn't matter where the FILE * comes from.

@PlanetCNC
Copy link
Author

As far as I can tell, _Py_wfopen is just a wrapper over fopen and _wfopen anyway. It shouldn't matter where the FILE * comes from.

This is not true. Microsoft clearly states that passing passing CRT objects across DLL boundaries causes potential errors.

I spend a lot of time trying to "fix" this issue and is simply not possible. And I'm not a novice - I have more than 30 years experience in programming very complex stuff that runs on all popular operating systems. It is not that I don't know how to correctly use 'fopen' and '_wfopen'.

If there is such a problem keeping '_Py_wfopen' then functions that accept FILE* should also expect filename as PyObject and then internally call _Py_fopen_obj(). This change should be fairly easy to make.

@ZeroIntensity
Copy link
Member

I'm not doubting your expertise, but note that the PyRun APIs have existed for years; if they weren't usable without the private API on Windows, I would hope that we would have noticed by now. Could this be a matter of some incorrect linker flags?

cc @zooba

@PlanetCNC
Copy link
Author

PlanetCNC commented Nov 30, 2024

I use the /MT option, which is most likely the cause of the access violation. Unfortunately, switching to /MD is not feasible for my project. This is not a simple application—it consists of 2,714,742 lines of C++ code, runs on Windows, Linux, Mac, and Raspberry Pi, and interacts with dedicated hardware.

Python has been an part of the project for almost 10 years, starting with version 3.5, offering users a way to script their own commands. If the fopen/_wfopen issue means Python can no longer be used, then so be it. However, it certainly feels like an unnecessarily restrictive and frustrating reason for such a limitation.

@ZeroIntensity
Copy link
Member

I'll wait and see if Steve has anything to add, but it sounds like we should expose a public API for it.

@zooba
Copy link
Member

zooba commented Dec 2, 2024

I'd prefer to totally drop FILE * from our public API and if necessary, add an incremental parser API. Or add an API that takes the path and handles reading internally (don't we have this one?).

The best thing to do here to be portable is to open and read the file yourself, and then pass the contents to our parser. If your files are too big to read into memory, an incremental parser API would let you pass in a chunk at a time (we should have this interface internally already, as it's how we handle stdin, but it's not public).

FILE * and file descriptors are terrible values to pass across boundaries. The emulation (on Windows) is flakey at best, but even worse when you insist on static linking (which makes duplicate copies of a lot of libc-equivalent state and doesn't share them).

The other alternative which may be a good option here is to also compile Python yourself, so that you can ensure it matches (and possibly even shares) the version of the CRT. In theory passing a FILE * across statically linked C runtimes should be okay, provided the versions match, though it depends whether they try to validate anything (which they probably do).1 Our builds aren't especially special, and it's very easy to build on Windows.

Footnotes

  1. File descriptors are, of course, right out, because they will index into different FILE * arrays.

@PlanetCNC
Copy link
Author

I'd prefer to totally drop FILE * from our public API and if necessary, add an incremental parser API. Or add an API that takes the path and handles reading internally (don't we have this one?).

Python does not have API that takes the path and handles reading internally. For me this solution seems perfect.

The best thing to do here to be portable is to open and read the file yourself, and then pass the contents to our parser. If your files are too big to read into memory, an incremental parser API would let you pass in a chunk at a time (we should have this interface internally already, as it's how we handle stdin, but it's not public).

Incremental parser API should be fine. However to me it seems to be quite some work to make it public. Specially because all heap allocations should be done in Python because of same boundary issues. I will check how incremental parser works.

The other alternative which may be a good option here is to also compile Python yourself, so that you can ensure it matches (and possibly even shares) the version of the CRT. In theory passing a FILE * across statically linked C runtimes should be okay, provided the versions match, though it depends whether they try to validate anything (which they probably do).1 Our builds aren't especially special, and it's very easy to build on Windows.

This can cause issues with dependencies and maintenance. Perhaps even licensing.

@ZeroIntensity
Copy link
Member

Thanks, Steve. I'm now convinced this is an issue, but I'm not sure how we didn't notice it for so long. Is it just that nobody uses the PyRun* file APIs? (Some code searches are saying so.)

IMO, an incremental parser sounds good, but I think we should keep it simple for users (as in, we don't need to expose the fact that it's an incremental parser; just let them pass a file path and it should just work). Something like Py_CompilePath or PyRun_FilePath should work. Luckily, none of the current FILE* APIs are stable, so we can drop them in a few versions with a deprecation. I'm happy to temporarily expose a public Py_wfopen as a bandaid, too. In the meantime @PlanetCNC, does reading the file yourself and passing them to one of the compiling APIs work?

@zooba
Copy link
Member

zooba commented Dec 2, 2024

This can cause issues with dependencies and maintenance. Perhaps even licensing.

It shouldn't create any new issues on top of what you're already doing. CPython is very flexibly licensed, despite not being one of the typical short licenses, it has basically the same requirements (i.e. none). Further, the license should apply the same if you're using our build vs. using your own, so the only difference is going to be if you make users do the install themselves (which I also don't recommend).

Python does not have API that takes the path and handles reading internally. For me this solution seems perfect.

Okay, let's add this. The implementation is really just going to _Py_wfopen and pass the FILE * into the existing API, so it shouldn't be complicated, but it will get the non-portable types out of the public API.

@PlanetCNC
Copy link
Author

All PyRun_... functions that accept a FILE* parameter already include a corresponding filename parameter. Currently, if the FILE* is NULL, these functions crash. A potential solution could be to modify the implementation so that when FILE* is NULL, the filename is used internally to call _Py_fopen or _Py_wfopen. This approach would resolve the issue without requiring any changes to the existing API definitions.

@zooba
Copy link
Member

zooba commented Dec 3, 2024

A potential solution could be to modify the implementation so that when FILE* is NULL, the filename is used internally to call _Py_fopen or _Py_wfopen.

Sounds good to me.

Anyone want to open a PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OS-windows topic-C-API type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants