Adding Windows support #2
OpenBLAS for mingwpy is built with the help of a fork (https://github.com/mingwpy/OpenBLAS) of the original repository (https://github.com/xianyi/OpenBLAS), containing some patches that address problems found with Windows numpy/scipy usage. Prebuilt binaries are available: https://bitbucket.org/carlkl/mingw-w64-for-python/downloads. Typically the mingwpy compiler toolchain is used within msys2 rather than the mingw-w64 packages supplied by msys2; see e.g. https://anaconda.org/carlkl/mingwpy for a binary download. OpenBLAS built with mingwpy is always statically linked, so no further gcc runtime DLLs are needed. See https://github.com/mingwpy/OpenBLAS/blob/mingwpy-dev/mingwpybuild.amd64 and https://github.com/mingwpy/OpenBLAS/blob/mingwpy-dev/mingwpybuild.win32 for the build commands.
If you have a package manager that can handle binary dependencies, then you don't really need to do static linking. This won't be the only library with a dependency on the gcc runtimes, and if you start building any C++ libraries that may throw exceptions, then statically linking libstdc++ hurts more than it helps.
This has been discussed so often ... Unfortunately there is no silver bullet. If you use a package manager, you can handle dependency management with it. If not, you are better off statically linking the gcc runtime code. The problem with C++ exception handling across DLL boundaries is known, but on the other hand there is also the weird problem of missing runtime DLLs if you use non-static builds.
I disagree that this is ever beneficial, unless you impose a constraint that your end product must be entirely contained in a single file. You have one library DLL to deal with instead of multiple; the system loader knows what to do with libraries that are linked to one another. The compiler runtime is not a duplicated component of every library you build, it's a dependency. If you've chosen a standard place to put extension libraries, then it's not that hard to pick a place to put common runtime libraries. Note there's the precedent of the TDM GCC build of mingw-w64 that defaults to static linking, and it is widely considered a bad idea that the core developers of mingw-w64 do not support. If it were a good idea, it would be a recommended configuration.
Unfortunately CPython on Windows doesn't include any kind of package manager to handle 3rd-party DLLs needed by more than one python package/extension. conda and conda-forge may solve this problem, but they are not part of the standard distribution. Any binary extension built with mingw-w64 will be dependent on one or more gcc runtime DLLs. You will encounter several problems:
An obvious solution is to create a dedicated mingw-w64 runtime package as a requirement for all mingw-w64 or mingwpy built extensions. This package could preload all gcc runtimes into the process space with the help of ctypes. However, there wasn't an agreement on such a solution in the community. You can voice your support on the mingwpy mailing list or at https://github.com/mingwpy/mingwpy/issues
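A minimal sketch of what such a preloading shim might look like, assuming the usual mingw-w64 runtime DLL names; the directory layout and package structure are assumptions, not anything agreed on in this thread.

```python
# Hypothetical preloading shim for a dedicated mingw-w64 runtime package.
# Loading each DLL by absolute path brings it into the process space, so any
# extension that later depends on it by name finds it already loaded.
import ctypes
import os

_RUNTIME_DIR = os.path.join(os.path.dirname(__file__), "dlls")  # assumed layout
_RUNTIME_DLLS = [
    "libwinpthread-1.dll",
    "libgcc_s_seh-1.dll",   # libgcc_s_dw2-1.dll on 32-bit builds
    "libquadmath-0.dll",
    "libgfortran-3.dll",
]

_handles = []
for _name in _RUNTIME_DLLS:
    _path = os.path.join(_RUNTIME_DIR, _name)
    if os.path.exists(_path):
        _handles.append(ctypes.WinDLL(_path))  # keep handles alive for the process lifetime
```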
Where's the standard place to deploy libopenblaspy.dll or equivalent for numpy? Isn't placing the gcc runtime DLLs alongside that more or less equivalent to static linking, while the whole pynativelib thing is still getting figured out?
@njsmith, @matthew-brett, @ogrisel, @mingwandroid, @gillins, @mikofski, @patricksnape, @jjhelmus further discussion of getting OpenBLAS to build on Windows is occurring on this issue if you are interested. Please feel free to contribute.
For numpy […] C. Gohlke's MKL builds work with this mechanism. Unfortunately, under some circumstances even this workaround will fail, as the PATH variable is more or less the last entry in the DLL search order. Now for scipy: some extensions will suddenly need […]. Are you aware that binary extension files are spread across multiple places inside a Windows CPython installation? There is no way to put all the needed GCC runtime DLLs alongside all of these extensions.
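A quick diagnostic sketch for the search-order problem described here: load the DLL by bare name (which walks the Windows search order, with PATH near the end) and then ask Windows which file actually backs the loaded module. The DLL name is just the example used earlier in this thread.

```python
# Check which copy of a DLL the loader actually picked when relying on PATH.
import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.GetModuleFileNameW.argtypes = [wintypes.HMODULE, wintypes.LPWSTR, wintypes.DWORD]
kernel32.GetModuleFileNameW.restype = wintypes.DWORD

dll = ctypes.WinDLL("libopenblaspy.dll")      # resolved via the normal DLL search order
buf = ctypes.create_unicode_buffer(1024)
kernel32.GetModuleFileNameW(dll._handle, buf, len(buf))
print(buf.value)                              # reveals whether PATH picked up the copy you expected
```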
Are we only planning to use mingw for building openblas and not other packages? If it's only openblas, maybe we can go for the simplest solution.
@carlkl, conda-forge needs to talk about MSYS2. It can be used to provide a POSIX shell, autotools, git, etc., or it can be used to provide mingw-w64 compilers via the m2w64-toolchain meta-package, in which case the gcc-libs are provided by the m2w64-gcc-libs package. Hopefully this discussion can take place soon.
I really wanted to talk about this in the last meeting, @mingwandroid. There are many exciting possibilities that come with having MSYS2 in conda. I hope we can talk about this at our next meeting.
@jakirkham, I guess you considered repackaging binaries; is it not an option?
We would strongly prefer not to. Also, installing MinGW-w64 compilers with […]
msys2 is without doubt a great resource of libraries and tools; personally I moved from msys to msys2 two years ago. However, mingw-w64 as used by msys2 or from mingw-builds has many issues that affect its usage together with CPython on Windows, see https://mingwpy.github.io/issues.html. There is more: mingw-w64 places great value on long double support (extended precision, https://en.wikipedia.org/wiki/Extended_precision#x86_Extended_Precision_Format). CPython is compiled with MSVC and uses long double as a synonym for the double format. Worse: mingw-w64 makes a change to the FPU precision control field to use extended precision. Unfortunately, as a side effect this changes the FPU control field for the callee as well and can change the behaviour of the callee. The problems with the usage of non-matching CRT runtimes and the stack alignment problem for SSE code in 32-bit builds have to be stressed, as these have been the reason for hard-to-debug segfaults in the past. Simply using DLLs from the msys2 packages may or may not work together with CPython.
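For context, a small sketch of how to check what "long double" means to the NumPy build you are running: an MSVC-built NumPy reports longdouble as a plain 64-bit double (about 15 decimal digits), whereas mingw-w64-compiled code may be working with 80-bit x87 extended precision.

```python
# Inspect the longdouble type of the running NumPy build.
import numpy as np

info = np.finfo(np.longdouble)
print(np.dtype(np.longdouble).itemsize, "bytes,", info.precision, "decimal digits")
# On an MSVC-built NumPy this typically prints: 8 bytes, 15 decimal digits
```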
I'm not really sure what the alternative is here @carlkl. Is there something else that is going to take its place? Or are you just trying to make us aware of some surprises when we inevitably go down this road?
We try hard to address all these problems with the mingwpy project.
@carlkl we have mingwpy here in conda-forge too. Do you want to be a maintainer? I use it with success to build Fortran modules for Win-32, but I am not sure about the Win-64 status.
Yes, as @ocefpaf said, we have that in […]. I'm not sure we are going this way though. The general thought is we don't really want static linking to the gcc libraries. There are cases where we have accepted this, as compilation with […]. Also, and I am a bit fuzzy on this so feel free to help me get a clearer picture here @carlkl, my understanding is […]
@jakirkham, @ocefpaf, @tkelman: "static linking" is something that may change later for mingwpy to the canonical way of linking. But the CRT and stack alignment problems have to be addressed, at least for CPython extension builds. OpenBLAS is a library that is known to work with msvcrt.dll though. EDIT: there is a parallel discussion about that: conda-forge/conda-forge.github.io#112
@jakirkham: there's nothing Python-specific about the binaries that mingwpy generates, except that they're ABI compatible with some chosen version of MSVC.
Yeah, I'm not really understanding this, but I have limited exposure to […]
@jakirkham, personally I use mingwpy to build openblas. I only have to add mingwpy's /bin folder to the PATH environment within msys2.
Could you point me to how you do this, @carlkl?
My latest toolchain builds, which I use for preparation of the mingwpy packages, can be downloaded here: https://dl.bintray.com/carlkl/generic/. These toolchains have been built with https://github.com/mingwpy/mingw-builds. Just use them as a replacement for the msys2 mingw-w64 toolchains.
Just so we are on the same page, I'm mainly asking these questions as I want more information about how this works with […]
Me too, see conda-forge/conda-forge.github.io#112 (comment)
mingwpy works for compatibility with python2.7. You could make a lot of these C runtime problems go away (and trade them for an entirely different set of problems 😄) by choosing to build CPython itself with a different toolchain, a la conda-forge/conda-forge.github.io#112 (comment)
@tkelman, I doubt that a CPython implementation built with mingw-w64 will ever be supported by the python community.
@mingwandroid, good catch. However, I had read this blog: […] And Microsoft has unexpectedly discontinued so much of its technology in the last few years. So take my comment as a sign of my mistrust; I don't have access to a crystal ball or other hidden information.
I believe Raymond is trying to give the product team fewer headaches here by dissuading people. In the middle of it he even discusses investigating, and someone else finding and fixing, a problem with a program that used msvcrt.dll and broke due to a change to a structure within it. We also have: https://msdn.microsoft.com/en-us/library/aa376609.aspx But if my list of MS-maintained software that relies on msvcrt.dll isn't proof enough, then I don't see what else will convince here. Anyway, personally I support using the UCRT (so preparatory work is welcome), but I do not expect to do anything with it in AD until Python 2.7 goes away (I will have a big celebration on that day). Also, for the GPL we probably need the UCRT to be part of the operating system, and I'm not sure it fits that definition on Windows 7 (this is a grey area). Personally I'd recommend ignoring that concern though.
@carlkl - I am speaking again from ignorance, but my impression is that building scipy linked against msvcrt is taking the risk that objects passed from Python (or numpy, I guess) will be passed across the runtime boundaries. So, it might pass the tests, but the money has to be on obscure problems in widespread use. I believe the same is not true for the current scipy builds, because the msvcrt code is housed in DLLs, and, being only the Fortran code, probably does not need to deal with dangerous other-runtime objects such as file handles. But back to @mingwandroid - can I ask then - how do you think we should build scipy such that it will work correctly with UCRT Python?
numpy and scipy do not trade in OS-level objects AFAIK, so it would be fine. Code generated by a GCC linking to the UCRT will probably work just as well as code generated by one linking to msvcrt. The work that needs to be done before this can be tested is to build this UCRT-based GCC, and I don't know who's working on that. @njsmith? Then there's flang, but AFAICT that is further out and will probably require a lot of changes to scipy/numpy before it can be trialed.
Are you sure? If you are, I'm happy to believe you, but that's a guarantee that we probably don't want to make into the future, even if currently true. Numpy and Scipy certainly use file objects - are you saying that would not be a problem?
In Python 2, the CPython ABI uses […]
I was posting here in the hopes of tempting someone into doing that :-). You know as much as I do.
No, I'm far from an expert in numpy. You can use FILE* in C code just fine so long as it's kept internal. If you pass them back up to (or down from) Python, then it is a problem.
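To illustrate the boundary problem with a resource that is simpler to demonstrate than FILE*: each CRT keeps its own heap state, so memory obtained from one CRT's malloc must be released by the same CRT. The sketch below is a demonstration only; it assumes ucrtbase.dll is present (systems with the Universal CRT installed) and is not something to run in production code.

```python
# Demonstrate why CRT-owned objects must stay with the runtime that created them.
import ctypes

legacy_crt = ctypes.CDLL("msvcrt")    # the old system CRT that mingw-w64 links by default
ucrt = ctypes.CDLL("ucrtbase")        # the Universal CRT used by MSVC 2015+

for crt in (legacy_crt, ucrt):
    crt.malloc.restype = ctypes.c_void_p
    crt.malloc.argtypes = [ctypes.c_size_t]
    crt.free.argtypes = [ctypes.c_void_p]

p = legacy_crt.malloc(64)
legacy_crt.free(p)    # fine: allocated and freed by the same runtime
# ucrt.free(p)        # crossing the runtime boundary like this is undefined behaviour
```

The same reasoning applies to FILE* handles, file descriptors, and other CRT-managed state.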
@mingwandroid - can I clarify further? I realized I didn't understand what you were saying about Python 2.7. Please correct me if I am wrong, but we can currently use mingw-w64 to build Python modules against MSVCR90.DLL, so mingw-w64 is an option for Python 2.7. But we cannot currently build against the UCRT, so it is not an option for Python >= 3.5. For posterity: https://matthew-brett.github.io/pydagogue/python_msvc.html Above you say: […]
but - I'm not sure how Python 2.7 is relevant here. Do you mean that you are waiting until the default mingw-w64 linking is against the UCRT instead of msvcrt? I guess that will take a very long time. Meanwhile we can't use an open-source compiler for our Python packages.
If you are in the mingw-w64 loop, you can see that M. Storsjö and others are working hard on UCRT support for mingw-w64. No idea when it will be ready for production; I don't think it will take too much time though. Right now mingw-w64 can be used to build cp35, cp36, ... packages linked against msvcrt instead of the UCRT. BTW: in the case of scipy, @xoviat's builds are partly linked against msvcrt instead of the UCRT as well. I don't see a difference to a scipy build with all binary components linked against msvcrt.
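For anyone who wants to verify which CRT a particular built extension actually pulls in, one option is to inspect its import table, for example with the third-party pefile package. A rough sketch; the file path is a placeholder:

```python
# List CRT-related DLLs imported by a compiled extension, to see whether it
# pulls in msvcrt.dll, a versioned msvcrNN.dll, or the UCRT's api-ms-win-crt-*
# forwarder DLLs. Requires the third-party 'pefile' package.
import pefile

pe = pefile.PE(r"C:\path\to\some_extension.pyd")   # placeholder path
for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
    name = entry.dll.decode().lower()
    if "crt" in name or name.startswith("msvcr"):
        print(name)
```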
@carlkl - I think the point is that we can't guarantee that numpy or scipy will not receive runtime objects that should not be passed across runtime boundaries. Hence we have reason to worry about what will happen if we build against msvcrt. I plead ignorance yet again, but I believe @xoviat's technique is safer because a) it's much more tractable to guarantee that Fortran code does not receive such runtime objects, and b) I had the vague impression that passing runtime objects to DLLs with different runtime linking was safer, but I cannot justify this.
@matthew-brett, I don't see the difference anyway. It is all about DLLs; even in the case of a […]
I can't comment on the DLL thing without embarrassing myself. The difference between Fortran and C/C++ has nothing to do with the language - it's just that we only use Fortran for doing numerics, not IO, so it's easier to be fairly sure we aren't passing runtime-related things to Fortran.
You can build modules with […]
@mingwandroid - sure - and, as Carl has shown, you can build scipy against msvcrt and it passes its tests. My worry would be that it will fail at some point in a confusing way because we're mixing runtimes. Is that not a reasonable worry?
I don't want to hypothesize about scenarios where 3rd-party binary extensions misuse memory (conceivable) or file descriptors (unlikely?) from scipy. Of course the safest option is a consistent use of only one CRT. The pragmatic way is to use msvcrt alongside the UCRT as long as mingw-w64 is not ready for the UCRT and flang is not ready to compile scipy's Fortran components.
But we're not talking about misuse - we do not (and should not) specify that people aren't allowed to pass runtime objects into numpy or scipy. The reason we're doing what we are doing now is to minimize risk. Sure, it's a bit of a hack to build the Fortran code with mingw/msvcrt and the C code with MSVC/UCRT, but, if my understanding above is correct, the risk is less in doing that than in building the whole thing with mingw/msvcrt and hoping that we get away with it when our users pass runtime objects.
Link flags used for the MSVC/mingw-w64 combination like […]
@xoviat - is […]
No idea. Try removing it and see what happens. BTW, the method used for the Fortran code was originally suggested by @zooba, so I think it's probably safe.
FYI I don't think that this is correct. The failures that have occurred have been pretty much all compiler bugs, so I expect ten or twenty lines at most will need to be changed in both scipy and numpy at this point, if that. The rest is waiting for NVIDIA to fix all of their bugs.
@carlkl Looks like you were correct on this matter. Thanks!
Yup - it looks like it: https://ci.appveyor.com/project/matthew-brett/scipy/build/1.0.14 and https://github.com/numpy/numpy/pull/10165/files
The difference between Fortran and Python C extensions is that with typical Fortran code we can be pretty sure no CRT objects are passed over, simply because file handles and array memory allocation are incompatible with or do not exist in Fortran 77, so you cannot really do it even if using the same CRT. With extension code written in C, that's much less of a given --- Python has a large C API, which at least in the past explicitly contained parts dealing with e.g. FILE* pointers, and it's not really specified which parts resolve to macros calling the CRT and which do not.
This may be a safe assumption for NumPy and SciPy. However, it is not a safe assumption in conda-forge in general. For example, people make use of HDF5 with Fortran support on macOS and Linux. I would expect that once a viable Fortran solution emerges, people will want the same option on Windows, which does mean passing FILE* around.
The assumption is very likely valid for any pure Fortran 77 code.
If you want to go beyond that, then you need to find a better solution.
In the case of HDF5, it provides Fortran 90 and/or Fortran 2003, so not Fortran 77. Yes, I agree.
This happened in PR #34.
Needs to use MSYS2 packages. Needs to build with the MinGW-w64 toolchain. This needs to be done to ensure we have Fortran support and to get optimized AT&T assembly support. Without these, a build of OpenBLAS for Windows is pretty pointless: it will be a little slow (no AT&T assembly support with VS), have no LAPACK support, and be missing a Fortran API that many things expect to use.
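A small post-build sanity check, sketched under the assumption that the build produced a libopenblas DLL at some known path (the path below is a placeholder): symbol lookup via ctypes raises AttributeError when an export is missing, so this quickly shows whether the Fortran/LAPACK entry points made it into the library.

```python
# Verify that the built OpenBLAS DLL exports BLAS and LAPACK symbols.
import ctypes

openblas = ctypes.CDLL(r"C:\path\to\libopenblas.dll")   # placeholder path
for sym in ("dgemm_", "dgesv_", "dsyevd_"):             # BLAS + LAPACK entry points
    try:
        getattr(openblas, sym)
        print(sym, "found")
    except AttributeError:
        print(sym, "MISSING")
```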