Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linking to the library statically on Windows breaks multi-arch support #3985

Open
3e33 opened this issue Mar 31, 2023 · 20 comments
Open

Linking to the library statically on Windows breaks multi-arch support #3985

3e33 opened this issue Mar 31, 2023 · 20 comments
Labels
Bug in other software Compiler, Virtual Machine, etc. bug affecting OpenBLAS

Comments

@3e33
Copy link

3e33 commented Mar 31, 2023

Now I'm not entirely sure what's going on, but MSVC on Windows requires /OPT:NOREF to be added when static linking to OpenBLAS, otherwise it doesn't work on architectures the library wasn't being built on.
The problem is this option doesn't work for clang-cl, even with /opt:noref, something required is still removed during linking.
This issue doesn't happen at all on Linux. But happens with two different linkers on Windows (don't know about GCC), which makes me think it's a bug in OpenBLAS.

@martin-frbg
Copy link
Collaborator

Can you please be more specific about what "it does not work" means, and what you are trying to achieve ? By "architectures it was not built on, do you mean a dual x64/arm64 build, or just different models of x64 cpus (what DYNAMIC_ARCH is for) ?

@3e33
Copy link
Author

3e33 commented Apr 1, 2023

Other models of CPUs don't work.
If I compile as a dynamic library, and link my application to it, everything is fine.
If I compile a static library, and link my app to it statically on Linux with Clang++, everything is fine and the final app (also a dynamic library) can be loaded.
If I link on Windows with Clang++ or MSVC's cl, the final .dll loads on my machine, but not on other CPUs (so AMD for me, but Intel doesn't work).
If I link with MSVC and add /OPT:NOREF, then once again everything works as expected.
But the same doesn't work for clang with /opt:noref (which might be a problem with lld / clang)

From what I can understand, while linking there's a large chunk of OpenBLAS that gets optimised out, which is what /OPT:NOREF fixes.

/OPT:REF eliminates functions and data that are never referenced; /OPT:NOREF keeps functions and data that are never referenced.

When /OPT:REF is enabled, LINK removes unreferenced packaged functions and data, known as COMDATs. This optimization is known as transitive COMDAT elimination. The /OPT:REF option also disables incremental linking.
https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170

For some reason, on Windows, compilers / linkers think that needed parts of OpenBLAS aren't referenced and remove them during a static build.

@martin-frbg
Copy link
Collaborator

Maybe it is related to lld's link-time optimizations, can you try building with -fno-lto added to the CFLAGS ?
But searching finds many similar questions (even in the llvm issue tracker) with no clear answer, so it could well be a design limitation in llvm on Windows.

@3e33
Copy link
Author

3e33 commented Apr 1, 2023

-fno-lto hasn't made any difference, it's also on by default according to clang++ -help

@martin-frbg
Copy link
Collaborator

That's unfortunate - and as I'm not a Windows guy I can only hope someone else has a better suggestion. Maybe https://developercommunity.visualstudio.com/t/optnoref-still-eliminates-unused-functiondata/1178871 and what it says about clang is related ?

@3e33
Copy link
Author

3e33 commented Apr 1, 2023

Sure sounds related, but that means that OpenBLAS can only be linked to dynamically, if you're using clang on Windows. Seems like a bug.

@martin-frbg martin-frbg added the Bug in other software Compiler, Virtual Machine, etc. bug affecting OpenBLAS label Apr 1, 2023
@brada4
Copy link
Contributor

brada4 commented Apr 2, 2023

In file driver/others/dynamic.c there is dependency that MSVC linker does not see.
It goes like

gotoblas=&gotoblas_TARGET
detect_cpu
gotoblas=&gotoblas_detected // MSVC does not see this dozen-function list as dependencies

You can work around limitation by force-including all gotoblas_CORETYPE functions, which will more or less include all functions in library. By large no gain over dynamic library.

@borrrden
Copy link

You can work around limitation by force-including all gotoblas_CORETYPE functions, which will more or less include all functions in library. By large no gain over dynamic library.

What does this entail exactly? I'm unfamiliar with the "force include" terminology.

@martin-frbg
Copy link
Collaborator

Unfortunately that suggestion may have been pure handwaving... sadly the VS community thread linked above did not offer any workarounds beyond "don't do that then" either

@martin-frbg
Copy link
Collaborator

could you try if using /OPT:NOICF in addition to the /OPT:NOREF has any effect ? (I guess the chance is slim, but I have no better suggestion at the moment)

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2024

Also if in project you need to change 2 project options:
https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170

@borrrden
Copy link

borrrden commented Jan 31, 2024

Do these options apply to clang-cl as well? I haven't actually confirmed any issues yet I'm just trying to get ahead of this in case I immediately run into it when we are ready to use this on Windows. If all else fails I'll switch to building shared, I just thought it would be a shame to alter things just for Windows' sake. In theory once we are ready to use Windows I should run into this pretty quickly given that our build agents are Intel based and my dev machine is AMD based so I'll try any suggestions that I see here.

My build process is essentially the same as in the wiki

@3e33
Copy link
Author

3e33 commented Jan 31, 2024

I ended up using Clang on Linux and MSVC on Windows. The reason I wanted to avoid building the shared library is because of its size.

@martin-frbg
Copy link
Collaborator

I assume so, and it could be that it is specifically a problem with clang-cl (via the LLVM linker). (On the other hand, one wouldn't want to use plain old MSVC as that cannot handle all the optimized assembly kernels)

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2024

The size is extra code paths for more architectures. You cannot have both eliminate them and expect them to be there.

@borrrden
Copy link

Thanks for the insight. Unlike the entire world, it seems, I do the majority of my dev work using Windows so when I get to this I will surely report back what I find, and any potential fixes if I'm clever enough to find them 😋 .

@3e33
Copy link
Author

3e33 commented Jan 31, 2024

@brada4 I could be wrong but I haven't noticed any issues and the size is reduced by more than half compiling statically just for the archs I need.

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2024

Yeah, then you complain its not linking together.... You can build TARGET=GENERIC for small 1-arch library (better clang than cl.exe)

@3e33
Copy link
Author

3e33 commented Jan 31, 2024

It works with MSVC on Windows and with Clang on Linux. It's only a clang-cl issue.

@brada4
Copy link
Contributor

brada4 commented Jan 31, 2024

Yes, and Linux makes 50MB library that you cannot trim down, but thats not a problem with windows linker, it is certainly OpenBLAS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug in other software Compiler, Virtual Machine, etc. bug affecting OpenBLAS
Projects
None yet
Development

No branches or pull requests

4 participants