-
Notifications
You must be signed in to change notification settings - Fork 710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows ARM64 support #1339
Windows ARM64 support #1339
Conversation
crypto/fipsmodule/ec/p256.c
Outdated
// 'function': incompatible types - from 'uint32_t *' to 'Limb *' | ||
// 'function': incompatible types - from 'fiat_p256_felem' to 'const Limb *' | ||
// etc. | ||
#pragma warning(disable: 4133) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a bad idea, as I'm not sure if it's okay to use 32-bit math on ARM64 in this module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we shouldn't disable these warnings. When the code is correct, then there will be no warnings. These warnings help us ensure we're not accidentally using 32-bit integer types instead of 32-bit integer types or vice versa. If there are cases we need to do conversions then we should add explicit casts for those specific instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, it looks like we have a problem then because we don't have the __uint128_t
type, and we also don't have a OPENSSL_USE_NISTZ256
implementation 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansmith What do you think about including a bigint library? For example boost's cpp_int or calccrypto? That could solve all the issues with math at once, and we could only include it on Windows ARM64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, it looks like we have a problem then because we don't have the __uint128_t type, and we also don't have a OPENSSL_USE_NISTZ256 implementation.
If I understand what you're saying correctly, there is another PR that adds the fallback bn_mul_mont
implementation that would be needed for supporting MSVC to work.
HOWEVER, that code will be much slower than using the uint128_t-based arithmetic from Fiat Crypto that using clang would provide.
So, with that in mind, maybe we should first get this working with TARGET_CC=clang
and then worry about supporting MSVC as the C compiler in a separate PR?
@briansmith What do you think about including a bigint library? For example boost's cpp_int or calccrypto? That could solve all the issues with math at once, and we could only include it on Windows ARM64.
It would be very unlikely that we'd add an external dependency for the math.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the main goals for me is to make sure Rustup can compile with rustls-tls feature enabled. If we can somehow specify in build.rs that we want to use clang.exe as the C compiler, and that setting can be dragged into Rustup, I'm generally happy 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansmith Okay, before I change a ton of #if defined(__GNUC__)
to #if defined(__GNUC__) || defined(__clang__)
in this context:
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic ignored "-Wsign-conversion"
#pragma GCC diagnostic ignored "-Wconversion"
#endif
-- is this an expected change? 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some code that does that in ring, because BoringSSL does that. But only code that we share with BoringSSL is supposed to do that, ideally.
For code not shared with BoringSSL, we should be using explicit casts and avoid disabling the warnings.
If you're going to hand this off to me, then go ahead and disable whatever warnings and then I'll fix that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to be as conservative as I could when introducing those #if ... defined(__clang__)
🙂 It now builds on my SPX and all the tests run. I'm going to try cross-compiling everything from an x64 machine now, and if that succeeds, I think my job here is largely done 🙂 I'd prefer to leave the code itself to you since it likely has nothing to do with Windows ARM64 target but rather with the logic itself, and I don't trust myself there as I have zero knowledge of crypto 😬
crypto/limbs/limbs.inl
Outdated
{ \ | ||
Limb tmp = (_a) - (_borrow); \ | ||
*(_r) = tmp - (_b); \ | ||
(_ret) = ((_a) < tmp) | ((*(_r)) > tmp); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did my best, you know the rest... Please validate this math
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, these macros are just doing what the #else
branches below already do, so we can just undo this change if you take my suggestion above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not possible to use the #else
branches below unfortunately because they use 128-bit __uint128_t
type which is not available on MSVC 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not possible to use the
#else
branches below unfortunately because they use 128-bit__uint128_t
type which is not available on MSVC 🤷
Thanks. I forgot about that. I will think about what to do. Probably it makes sense to change the #else
branches so that they don't require __uint128_t
and then are shared across MSVC and non-MSVC targets, if that doesn't negatively affect the performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave this decision up to you if you don't mind 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three aspects of this:
- Get something working. If we can rely on clang as the C compiler, then this is less much less work than what's in this PR.
- Get it working with MSVC as the C compiler.
- Unifying, if practical, the MSVC and non-MSVC code paths.
If we want to have, for expediency, a separate MSVC-only code path for now, then we "just" need to convert your #define
macros into true inline functions, since we don't do that #define
stuff except as a last resort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I'd leave the crypto/fipsmodule/bn/internal.h
changes as is because they're present upstream in BoringSSL, but if you'd prefer to revert that file, please let me know 🙂
@Alovchin91 I have improved the build.rs script to add dynamic detection of nasm.exe / clang.exe within the PATH + a few extra paths, which solves the annoying issue of having to copy nasm.exe inside the target folder every time. Next on my list is to figure out how the pregeneration works, I think one issue could be that both x64 and ARM object files use the same names currently:
|
@awakecoding Hmm, I don't think the issue is the file name but rather the lack of "target arch" parameter in clang.exe invocation. |
I think you are correct, it doesn't hurt to explicitly specify it anyway, otherwise you rely on defaults (and that's never good). |
ok, I fixed it - I can build the tests now if I copy the "tests" folder inside the "target" folder to avoid the incorrect path issue |
For some odd reason, I don't seem to have the issue with compiling the test code anymore. I did a "git clean -fdx" and I didn't copy the tests directory, and it still builds. In other words, the latest code on your branch just builds correctly for me |
I think it's important to call both |
My next step is to add proper detection for the MSVC clang distribution (the one installed as an MSVC component) because I think this is what @briansmith would rather use. the llvm.org clang distribution has an installer that can add clang to the PATH, but the clang.exe installed via MSVC doesn't, so it'll need a bit more work to detect and find properly. The proper modern way might be to call vswhere.exe to find it, unless there's already something better available to do this detection. |
@awakecoding I think you might want to use https://docs.rs/cc/1.0.69/cc/windows_registry/fn.find_tool.html BTW could you please also upgrade cc to 1.0.69? This version contains fixes to find MSVC tools on Windows ARM. |
@Alovchin91 sure, I absolutely hate vswhere.exe, so if cc has some helpers, I'm all for it! |
I'm trying to call cc::windows_registry::find_tool(&msvc_target, "clang.exe") with "x86", "x64" and "ARM64" as the msvc_target, but nothing appears to work. I'm not sure if I'm calling it right? |
The name of the parameter is a bit confusing, but it means that it should be an |
yeah, I tried all of those and even looked at the source code for CC, it just doesn't seem to want to find it. Even if I could get a proper result, it returns a Command, and there doesn't seem to be a stable feature right now to extract the path to the executable? That could be a problem |
|
I can call cc::windows_registry::find_tool to find link.exe and other tools, but I think it is unfortunately limited to just the executable in the "primary" tools directory you get with the visual studio command prompt. maybe that's why it doesn't find clang.exe? The thing is that we can find clang.exe through the PATH in the visual studio command prompt. |
I think you're right: https://github.com/alexcrichton/cc-rs/blob/4a6e8b7c7522392de0cda0325d4041f536745fe0/src/windows_registry.rs#L420 A good idea for extending the library then! 🙂 |
As a workaround you could ask it to find devenv.exe and given that it has a well-known path, build a new path based on it: https://github.com/alexcrichton/cc-rs/blob/4a6e8b7c7522392de0cda0325d4041f536745fe0/src/windows_registry.rs#L841 Something like this could work: cc::windows_registry::find_tool("x86_64-pc-windows-msvc", "devenv.exe")
.map(|tool| tool.path().parent().join(r"..\..\VC\Tools\Llvm\x64\bin\clang.exe")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a quick look at the patch and
build.rs
Outdated
@@ -287,6 +295,24 @@ fn read_env_var(name: &'static str) -> Result<String, std::env::VarError> { | |||
std::env::var(name) | |||
} | |||
|
|||
// Find executable in PATH with optional extra search directories | |||
fn find_in_path(path: &Path, extra: Vec<&Path>) -> Option<PathBuf> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to avoid this. The goal of the original design is to ensure we're using a known-good version of build tools during packaging and not just whatever happens to be around. When ring is packaged on crates.io, the code is pre-assembled and the object files are included in the crate. The previous way nasm was found, in a specific directory, ensures we don't accidentally use a different version of nasm that happens to be around, when packaging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your point, so may I suggest to modify it such that specific directories are searched before checking the PATH instead? This would ensure that target/tools/windows/nasm/nasm.exe always gets picked up even if nasm.exe is also present in the PATH. Requiring a specific copy of a tool within a custom directory is particularly painful for everybody else. I have nasm installed in my CI environment for other dependencies that require it, I just make sure to make it available in my system PATH so they can pick it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansmith I think you need to mark this comment as "resolved"
build.rs
Outdated
@@ -365,7 +391,7 @@ fn pregenerate_asm_main() { | |||
let srcs = asm_srcs(perlasm_src_dsts); | |||
for src in srcs { | |||
let obj_path = obj_path(&pregenerated, &src, MSVC_OBJ_EXT); | |||
run_command(nasm(&src, asm_target.arch, &obj_path, &pregenerated)); | |||
run_command(win_asm(&src, asm_target.arch, &obj_path, &pregenerated)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this should be made consistent with the code below that does this:
let cmd = if target.os != WINDOWS || (ext != "asm" && ext != "S") {
cc(p, ext, target, warnings_are_errors, &out_path, out_dir)
} else {
win_asm(p, &target.arch, &out_path, out_dir)
};
Probably factor out this conditonal into a function (named asm
perhaps) and use it here and below.
In particular, it is confusing how win_asm
is used for .S
files here (IIUC) but not used for .S
files below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disregard this. This was leftover feedback that was building on some other feedback I deleted before posting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the refactoring in recent PRs will allow us to avoid making any changes here in this PR.
build.rs
Outdated
} | ||
} | ||
|
||
fn find_nasm() -> Option<PathBuf> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially since this PR doesn't actually use NASM, we don't need to change the way NASM is found in this PR. If we want to be more flexible with NASM then let's discuss that in another issue/PR.
crypto/fipsmodule/ec/p256.c
Outdated
// 'function': incompatible types - from 'uint32_t *' to 'Limb *' | ||
// 'function': incompatible types - from 'fiat_p256_felem' to 'const Limb *' | ||
// etc. | ||
#pragma warning(disable: 4133) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we shouldn't disable these warnings. When the code is correct, then there will be no warnings. These warnings help us ensure we're not accidentally using 32-bit integer types instead of 32-bit integer types or vice versa. If there are cases we need to do conversions then we should add explicit casts for those specific instances.
crypto/limbs/limbs.inl
Outdated
@@ -90,6 +109,8 @@ static inline Carry limb_sbb(Limb *r, Limb a, Limb b, Carry borrow_in) { | |||
Carry ret; | |||
#if defined(RING_CORE_SUBBORROW_INTRINSIC) | |||
ret = RING_CORE_SUBBORROW_INTRINSIC(borrow_in, a, b, r); | |||
#elif defined(RING_CORE_SUBBORROW) | |||
RING_CORE_SUBBORROW(borrow_in, a, b, r, ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
Hi, I took a quick look at the PR and offered some suggested changes. I am impressed with your ability to get this mostly sorted out already on your own. Could you please also extend the GitHub Actions configuration to include Windows AAarch64 as a build-but-don't-test target? This would make it clear how one gets |
Again, I wouldn't worry about trying to search for |
Codecov Report
@@ Coverage Diff @@
## main #1339 +/- ##
=======================================
Coverage 92.81% 92.82%
=======================================
Files 117 117
Lines 17826 17841 +15
Branches 195 195
=======================================
+ Hits 16546 16561 +15
+ Misses 1246 1245 -1
- Partials 34 35 +1
Continue to review full report at Codecov.
|
The concern is not limited to the way ring is packaged for crates.io, but how one can rebuild it entirely from source, including for the assembly files. You mentioned that you'd rather use the MSVC clang, and that's the only thing missing right now - properly detecting its install path so we can pick it up. If you don't care about detection then I guess you can manually add it the PATH in your build environment? We're just trying to make it simpler with minimal custom operations, such that all one needs to do is either install MSVC clang or llvm.org clang to rebuild everything. |
I understand. I would prefer to deal with it in a separate PR though. For a long time the crates.io distribution is the only one I "support." I am open to changing that but I'd rather deal with all the things that are involved in changing that policy separately. |
No problem, we can trim and remove some of the detection code and expect clang.exe to be already present in the PATH, and avoid touching nasm.exe to keep it all for a separate PR. Just to make sure you understand, we're not going to implement MSVC clang detection in this PR, mostly because it's quite painful and we've been on it for hours already, but also because it would bring even more complex detection code in the current PR. The most sensible way to detect it would be to implement calling vswhere.exe like in the screenshot below. I pointed to the x86 and x64 versions of the MSVC-distributed clang.exe: |
@briansmith @Alovchin91 I removed the dynamic nasm/clang detection from the current branch. From the build system point of view, it should be good for the current PR (excluding the update to the CI environment). I am done for now, let me know if you need anything, I think the rest has to do with the math and the lack of a uint128_t data type in MSVC for ARM64. |
I was worried that cleanup in other PRs for specific things like the cc handling of clangs, and clang-related improvements, would ultimately create conflicts with the current PR. Can we keep it a minimum such that we can merge this PR sooner than later, and complete the improvements after the current PR has been merged? It would likely result in a lot less conflicts. |
There are parts of this PR that I don't want to maintain, and which are hard for me to verify the correctness of, including the Keep in mind that most people who contribute to ring are not interested in Windows AAarch64, so we have to optimize for those people, by minimizing the amount of Windows-AAarch64-specific logic. In any case, I expect the work to be done today. |
@briansmith I'm sorry, I've made a comment about conflicts after I saw an email about #1343 but before I read your previous message 😅 Please go ahead, I totally support your argument that supporting Windows AArch64 shouldn't require (too much) additional effort compared to other architectures, especially given that API-wise it's just
I was hoping that would be possible, indeed 😄 @briansmith My only concern is that I hope that we could merge this PR before we enable compilation with MSVC, so there still will be some hardcoded
|
Discussion from #1346:
I'm afraid that would make usage of ring unnecessarily over-complicated on Windows AArch64. I'd probably still suggest to hardcode At the same time, if you don't want to release a new ring version before it's compilable under MSVC on AArch64, and given that one still needs additional setup (Clang in the $PATH), maybe it doesn't really matter much. 🤔 I guess I don't have a strong opinion but am slightly inclined to hardcode |
I suggest we just make sure that it can be built for Windows ARM64, and then we could make further clang detection and flexibility improvements in a separate PR to handle clang vs clang-cl, and clang-12 vs clang, etc, if need be. |
@awakecoding To be honest, I don't think ring should know anything about where to find clang and which clang to use. I'd remove any knowledge of these specifics as soon as MSVC support is there. My suggestion to default to clang is just because I think it's easier to comprehend "clang-cl was not found" error message than something cryptic about int128_t 😅
@briansmith Sounds reasonable to me 🙂 |
build.rs
Outdated
@@ -365,7 +391,7 @@ fn pregenerate_asm_main() { | |||
let srcs = asm_srcs(perlasm_src_dsts); | |||
for src in srcs { | |||
let obj_path = obj_path(&pregenerated, &src, MSVC_OBJ_EXT); | |||
run_command(nasm(&src, asm_target.arch, &obj_path, &pregenerated)); | |||
run_command(win_asm(&src, asm_target.arch, &obj_path, &pregenerated)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the refactoring in recent PRs will allow us to avoid making any changes here in this PR.
I can't seem to build it with
|
@briansmith Please take a look at the changes I've made to |
Hmm, forget about |
Okay, so unless we care about the ability to compile assembly code with |
I agree to license my contributions to each file under the terms given at the top of each file I changed. Co-authored-by: Marc-André Moreau <marcandre.moreau@gmail.com>
I don't think we care about building the assembly code with clang-cl, mostly because we really intend to use clang for it, and I'm not even sure that clang-cl can compile the assembly code in question with the MSVC-style arguments. |
I'm not even sure it tries to compile it using a cl-driver when it sees a |
I must admit I'm pretty happy with the result! 🙂 |
Done in PR #1354. |
Here's how I tested this on the device:
On the Surface Pro X:
I also tested the debug builds in an analogous way. |
Nice! If you want to get rid of the annoying vcruntime140.dll, you can use RUSTFLAGS='-C target-feature=+crt-static' as documented here. |
@Alovchin91, @awakecoding Thanks for all your effort on this! I especially appreciate the effort you put in after the extra PR to make this easier to maintain (I hope) for people who don't have access to an aarch64-pc-windows-msvc device. |
@briansmith Thanks for all your help along the way! 😊 That was a lot of fun gentlemen 😀 |
@briansmith it's been a year to the day that these Windows ARM64 build fixes are been merged on the main branch, but no release has been done since then, and we're still unable to build some of our Rust dependencies for all platforms because of this. We're getting closer to a first possible Windows ARM64 build of Remote Desktop Manager, so it's becoming a more pressing issue that we get a working build. I have tried backporting the fixes back on top of 0.16.20 (there is no release tag on GitHub, so finding the right commit is tricky) and force it to work for Windows ARM64, but that version of build.rs was already quite different from the version on top of which the fixes were made. I wasted a few hours on it, and I'm stick figuring out how the static library is built assuming a .a extension, and why it still wants to build a bunch of linux64 assembly files. I tried using the version of ring right after this pull request got merged, and edited Cargo.toml to make a fake 0.16.20 release out of it, such that it would get picked up by other crates like webpki, but then I hit "expected struct What I need is a version of ring that can be built for Windows ARM64 and webpki, and I guess this means updating webpki to an newer version of the untrusted crate matching the latest version of ring. What would be the simplest way to get this unblocked? Is a ring 0.16.20 backport the only option, or is a new release of compatible crates coming soon? |
@briansmith What can we do to get a new release out for the main branch? Rust is basically unusable on Windows ARM I you're relying on major libraries like Reqwest, Rustls, OAuth, etc. |
#1167