-
Notifications
You must be signed in to change notification settings - Fork 463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime backend autodetection #523
Conversation
51206bb
to
738cfee
Compare
FWIW, in lieu of In this PR though, it seems like you're applying it to larger code blocks. |
The
gets turned into this:
So the inner function which actually contains the body of the function is not marked with |
I've also ran the benchmarks for this PR; here are the results for anyone interested. The benches were ran on a Xeon Platinum 8481C 2.70GHz on Google Cloud. Scalar vs AVX2
Scalar vs AVX512
AVX2 vs AVX512
Before this PR vs after this PR (AVX2)
Before this PR vs after this PR (AVX512)
|
@koute any chance you could push a commit that would show the difference of what this PR would look like without I ask because I think we'd really like to keep 3rd party dependencies to a minimum. There's currently only two enabled by default: |
Done. Please take a look at the latest commit. I've only done it partially though, as doing it is really tedious and error-prone (it's easy to forget e.g. an |
Can I suggest to have the features as optional by way of them being negative ? This way if using negative "disallow" features these can be explicitly ruled out and if one rules them out it applies wholesale Also this would make them non-default so managing the featureset becomes easier when people use one-size-fits-all featureset This would make sense also given most people should not need to be disallowing the detection. fwiw - I'm beginning to think that disallow / forbid should be cfg flags as well giben it's niche e.g.
Having niche configuration scattered between featuresets and Depending on which one is chosen I can send documentation PR after. |
| `simd_avx2` | ✓ | Allows the AVX2 SIMD backend to be used, if available. | | ||
| `simd_avx512` | ✓ | Allows the AVX512 SIMD backend to be used, if available. | | ||
| `simd` | ✓ | Allows every SIMD backend to be used, if available. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this:
| `simd_avx2` | ✓ | Allows the AVX2 SIMD backend to be used, if available. | | |
| `simd_avx512` | ✓ | Allows the AVX512 SIMD backend to be used, if available. | | |
| `simd` | ✓ | Allows every SIMD backend to be used, if available. | | |
| `forbid_avx2` | | Disallows the AVX2 SIMD backend to be used, if available. | | |
| `forbid_avx512` | | Disallows the AVX512 SIMD backend to be used, if available. | | |
| `forbid_simd` | | Disallows every SIMD backend to be used, if available. | |
Optionally could make the forbid as cfg as is done with the rest of the configuration.
So do we have a consensus regarding removing the @pinkforest Usually cargo features should be only additive, but I guess in this case this doesn't really matter as those features don't actually change any functionality per-se, so sure, we could make those negative. However, I'd really prefer to keep them as cargo features though; the issue with specifying anything through RUSTFLAGS is that it's really awkward to use, especially for bigger framework-like projects with hundreds/thousands of downstream users where telling everyone "hey, you now need to specify those RUSTFLAGS to compile your stuff because it depends on our stuff" is just not feasible. If we had to specify the flags we need through RUSTFLAGS we'd be most likely forced to fork the crate and change the configuration knob to be a cargo feature. |
Yeah I hear you - we can have both via build.rs - similarly in build.rs for the No need to change anything on this PR - I can send a PR straight after this to address this as negative feature that works both ways. |
I'm happy to change it if you want, but that works for me too. Thank you! |
Just to offer an alternate perspective: I think the usual justification for "no negative features" is that they're not composable: if client crate A enables "no_apples" and client crate B enables "no_bananas", a project might not be able to use A and B together because A depends on bananas and B on apples. But in this case it's "avx2_backend" and "avx512_backend" that aren't composable; you cannot have more than one backend enabled at once. If you want to phrase these positively, you could call them "pre_avx2_compatibility" and "pre_avx512_compatibility" or something; then at least (The cfg vs. feature debate is not unrelated; I agree with @koute that cfg flags are much less discoverable and more awkward to work with, and I agree with @tarcieri that some controls really are "one choice for the entire build" and cannot be represented composably.) |
Sorry all, this thread flew under the radar for me. Getting up to date now. |
Yeah that's valid point re: Should the feature be around to disable detection and forcing backend. In any case I'll create a separate issue as we've visited this issue before and it needs to be re-visited properly. |
FYI, actually in this case they are composable. (: That's because they don't force a given backend, they just make it available for selection at runtime if the host on which the program's running supports it. |
Just reviewed everything. My thought is: removing |
If we had an in-tree dependency like Note that it will make the custom derive stack at least a default dependency where it isn't right now (i.e. |
Would be good reason to do the monorepo base here (now) for this + git combine (later) then May I suggest landing this PR pinning to version first and then I can just send another PR to rename some files and add it in-tree as monorepo ? |
I like that! |
That sounds good to me! So I'll revert the last commit and I'll pin the version of the Or perhaps alternatively I could just transfer it to My three cents regarding the |
Yeah better just pin it for now - we were going to re-organise everything to monorepo anyways so this was a good reason to start it |
Sorry for the delay. I have reverted the commit, pinned the version of the dependency, and fixed failing clippy. Should be ready to go! |
environment variable: | ||
```sh | ||
RUSTFLAGS='--cfg curve25519_dalek_backend="BACKEND"' | ||
``` | ||
where `BACKEND` is `simd` or `fiat`. Equivalently, you can write to | ||
where `BACKEND` is `fiat`. Equivalently, you can write to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, if fiat
is the only backend we have, perhaps we should reconsider how gating works. But I think that's something we can do in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with one caveat: we're back to an odd mixture of crate features and cfg attributes for backend selection. I think it kind of makes sense in this case since autodetection can map to multiple backends which work additively, but I worry this might persist the problem where curve25519-dalek
is a transitive dependency with these features being enabled via intermediate dependencies.
Curious how this is all going to work with SIMD backends for non-x86 platforms, e.g. NEON (#457). It may make sense to try to land that PR before a final release, just to make sure we have the right abstractions for backend selection which can work correctly across CPU architectures.
Indeed. Not sure if this is a good idea, and this would be a big change and require more refactoring, but maybe we could parametrize all of the public types by the backend? For example, instead of
Hm, this is a fair point, but I think doing what @pinkforest suggested would probably fix this -- make the features be negative and non-default, so that would make it very unlikely that an intermediate crate using
From what I can see it should work mostly fine, although it might still require some minor refactoring. |
FWIW we did this with @RustCrypto and I still consider it a mistake. We have/had That was the major impetus for using |
Agreed. Using either positive or negative features for selection of crypto implementations is also a nightmare for auditing, because it greatly expands the number of crates in which a backdoor could potentially force use of an insecure implementation. (Example: suppose that you don't trust RDRAND. Now try to audit that the |
| CPU feature | `RUSTFLAGS` | Requires nightly? | | ||
| :--- | :--- | :--- | | ||
| avx2 | `-C target_feature=+avx2` | no | | ||
| avx512ifma | `-C target_feature=+avx512ifma` | yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed all of this documentation is removed, but the newly introduced crate features aren't equivalent and perhaps this should still be documented.
Namely these flags bypass autodetection and force the use of SIMD features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I can take care of documenting that - just need to re-organise a bit to make room for the vendored dep first
Also need to do a lot more testing now
This PR makes the following changes:
-C target_feature
) then that particular feature is assumed to be always available (courtesy of thecpufeatures
crate), and we won't try to detect whether it's available at runtime.x86_64-fortanix-unknown-sgx
) then no runtime autodetection will take place (courtesy of thecpufeatures
crate)--cfg curve25519_dalek_backend
configuration knob doesn't acceptsimd
anymore; now this can only be used to switch to thefiat
backend.avx512ifma
feature is sufficient. This is incorrect. This backend actually uses two features:avx512ifma
andavx512vl
, so I've made this explicit and corrected the README. In particular, if you've previously compiled with only-C target_feature=+avx512ifma
without theavx512vl
feature the performance of the resulting code was utterly horrible since Rust disables inlining if the appropriate target feature is not enabled (but the code still accidentally worked since CPUs which supportavx512ifma
usually also supportavx512vl
)unsafe_target_feature
crate's description for why it's necessary.Since the code in
src/backend/vector/scalar_mul
had to be nested in an extramod
the diff is a little messy; I suggest reviewing commit-by-commit and just ignoring the commit whichrustfmt
s those files.