aes: autodetection support for AES-NI #208
Preliminary benchmarks show that whatever effect this has on performance is well within the noise. Perhaps criterion would give better data, but otherwise the impact on performance appears negligible.
ca62f3f
to
0fb7bb5
Compare
I think we can use `union` instead of `enum`. The new version of `cpuid-bool` now supports initialization tokens, which makes the tag in the AES types redundant (i.e. the tag will be stored in a static variable inside the module created by the new `cpuid-bool` macro).
    // Creates the `aes_cpuid` module with the detection state and `InitToken`.
    cpuid_bool::new!(aes_cpuid, "aes");

    union Inner {
        ni: crate::ni::Aes128,
        soft: crate::soft::Aes128,
    }

    pub struct Aes128 {
        inner: Inner,
        token: aes_cpuid::InitToken,
    }

    fn new(key: &GenericArray<u8, $key_size>) -> Self {
        let (token, val) = aes_cpuid::init_get();
        let inner = if val {
            Inner { ni: crate::ni::Aes128::new(key) }
        } else {
            Inner { soft: crate::soft::Aes128::new(key) }
        };
        Self { inner, token }
    }

    fn encrypt_block(&self, block: &mut Block) {
        // The token records which union field was initialized in `new`.
        if self.token.get() {
            unsafe { self.inner.ni.encrypt_block(block) }
        } else {
            unsafe { self.inner.soft.encrypt_block(block) }
        }
    }
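For readers who want to try the union-plus-flag idea outside the crate, here is a minimal standalone sketch. Everything in it is an assumption made for illustration: `NiKey`, `SoftKey`, `Cipher` and `xor_block` are hypothetical stand-ins rather than the `aes` crate's types, `std::is_x86_feature_detected!` is swapped in for `cpuid-bool`, a plain `bool` plays the role of the init token, and both "backends" do the same toy XOR so the example behaves identically on any machine.

```rust
// Hypothetical stand-ins for the two backends. On stable Rust, union fields
// must be `Copy` (or wrapped in `ManuallyDrop`), hence the derives.
#[derive(Clone, Copy)]
pub struct NiKey([u8; 16]);

#[derive(Clone, Copy)]
pub struct SoftKey([u8; 16]);

impl NiKey {
    fn xor_block(&self, block: &mut [u8; 16]) {
        for (b, k) in block.iter_mut().zip(self.0.iter()) {
            *b ^= k;
        }
    }
}

impl SoftKey {
    fn xor_block(&self, block: &mut [u8; 16]) {
        for (b, k) in block.iter_mut().zip(self.0.iter()) {
            *b ^= k;
        }
    }
}

union Inner {
    ni: NiKey,
    soft: SoftKey,
}

pub struct Cipher {
    inner: Inner,
    // Stand-in for the init token: remembers which union field is live.
    use_ni: bool,
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn aes_ni_detected() -> bool {
    std::is_x86_feature_detected!("aes")
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn aes_ni_detected() -> bool {
    false
}

impl Cipher {
    pub fn new(key: [u8; 16]) -> Self {
        let use_ni = aes_ni_detected();
        let inner = if use_ni {
            Inner { ni: NiKey(key) }
        } else {
            Inner { soft: SoftKey(key) }
        };
        Self { inner, use_ni }
    }

    pub fn xor_block(&self, block: &mut [u8; 16]) {
        if self.use_ni {
            // SAFETY: `ni` is the field written in `new` when `use_ni` is true.
            unsafe { self.inner.ni.xor_block(block) }
        } else {
            // SAFETY: `soft` is the field written in `new` when `use_ni` is false.
            unsafe { self.inner.soft.xor_block(block) }
        }
    }
}
```

The flag and the union are only sound together: reading the wrong field is undefined behaviour, which is why both stay private to the type.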
@newpavlov I'm not sure that enabling detection without the corresponding target feature being enabled accomplishes anything. See the issue @kazcw opened here: cryptocorrosion/cryptocorrosion#7
If this is true, trying to use AES-NI without the corresponding target feature enabled will use a software fallback provided by LLVM instead. As a separate issue, I need to test how enabling/disabling AES-NI in UEFI/BIOS affects CPUID.
I confirmed disabling AES-NI in UEFI removes the flag from CPUID. However, this is a notable example of how, even on the exact same physical CPU, the flag may or may not be present depending on environmental conditions.
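Because the flag can differ between boots of the same machine, the check has to happen in the running process rather than at build time. A small sketch of such a query, using the standard library's `is_x86_feature_detected!` macro instead of `cpuid-bool` (the function names `aes_ni_available` and `backend_name` are made up for this example):

```rust
/// Returns `true` if the CPU this process is running on reports AES-NI.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn aes_ni_available() -> bool {
    std::is_x86_feature_detected!("aes")
}

/// On other architectures there is no AES-NI to detect.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn aes_ni_available() -> bool {
    false
}

/// Picks a label for the backend that would be used on this machine.
pub fn backend_name() -> &'static str {
    if aes_ni_available() {
        "aes-ni"
    } else {
        "software"
    }
}
```

Running this before and after toggling AES-NI in firmware is one way to reproduce the observation above.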
Well, this isn't good. I just ran the test suite against this PR in its current state with the
Will investigate where things are going wrong...
@newpavlov as far as I can tell the SIGILL is a bug in
(I confirmed the flag was present before disabling AES) I also double checked this is occurring in the autodetection code in this PR, both via a simple
...and double checking using lldb. I can try seeing if the problem is solved by using the
I confirmed using the
@tarcieri I was wrong about the soft fallback in cryptocorrosion/cryptocorrosion#7. In the follow-up comment in that thread I go into what was actually causing the effect I observed.
@kazcw okay thanks! I guess it also makes sense that if Will try removing the
Okay, I confirmed that without supplying any target features, Let me flip AES-NI back on and determine that it is properly detected when enabled and the performance is what is expected.
@newpavlov okay, seems everything is working the way you expected. Let me at least remove the After that we can then look at inverting the
0fb7bb5
to
9407a2d
Compare
Note that with runtime detection we should add
@newpavlov this is somewhat problematic:
I can try marking the inner structs non- Edit: I guess the
d85a29f
to
b1468ba
Compare
Since
I'm curious what's the use-case for
@roblabla
b1468ba
to
4d1dff8
Compare
Release cpuid-bool v0.2
Done.
Decide if enum is acceptable for `Aes*Ctr` types, and if not, potentially add `Copy` (and at least `Clone`) impls to `ctr::Ctr128`
The lack of `Clone` impls is certainly an oversight, but I don't think we should add `Copy`. The main reason is the existence of methods which take `&mut self`; adding `Copy` would make the types too error-prone for my taste.
The best solution for now would probably be to vendor the CTR implementation into the `aes` crate until non-`Copy` unions get stabilized. Luckily the soft-CTR code is neither complex nor large.
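To make the concern about `Copy` plus `&mut self` concrete, here is a toy illustration. `Counter` and `next_value` are hypothetical and deliberately unrelated to `ctr::Ctr128`; the point is only that a `Copy` type lets an assignment silently duplicate mutable state, so advancing the duplicate leaves the original untouched.

```rust
/// Hypothetical stateful type; deriving `Copy` is what makes it error-prone.
#[derive(Clone, Copy)]
pub struct Counter {
    value: u64,
}

impl Counter {
    pub fn new() -> Self {
        Counter { value: 0 }
    }

    /// Returns the current value and advances the internal state.
    pub fn next_value(&mut self) -> u64 {
        let v = self.value;
        self.value += 1;
        v
    }
}

/// Returns the first value produced by a silent copy and by the original.
pub fn copy_pitfall() -> (u64, u64) {
    let mut original = Counter::new();
    // Because `Counter` is `Copy`, this duplicates the state instead of moving it.
    let mut duplicate = original;
    let from_duplicate = duplicate.next_value();
    // `original` never saw that call, so it hands out the same value again.
    let from_original = original.next_value();
    (from_duplicate, from_original)
}
```

For a stream cipher, handing out the same counter value twice would mean reusing keystream, which is presumably the kind of mistake being guarded against here. Without `Copy`, the assignment would move `original` and the second call would not compile.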
Does anyone know if there are CPUs which have AES-NI, but not SSSE3?
2880fe1
to
d8587e8
Compare
Wouldn't it be simpler to just stick with an enum for now? It bloats the size of an Aes*Ctr instance marginally, but to me that seems preferable to code duplication. Note that the
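The size difference mentioned above can be measured directly. The sketch below uses made-up 176-byte payloads (`NiState` and `SoftState` are hypothetical, and the size is illustrative only); it compares an enum, which has to store a discriminant alongside the payload, with a union, which relies on an external flag.

```rust
use std::mem::size_of;

// Hypothetical backend states; the 176-byte size is purely illustrative.
#[derive(Clone, Copy)]
pub struct NiState([u8; 176]);

#[derive(Clone, Copy)]
pub struct SoftState([u8; 176]);

/// Enum-based dispatch: the discriminant lives inside the value.
#[allow(dead_code)]
pub enum EnumInner {
    Ni(NiState),
    Soft(SoftState),
}

/// Union-based dispatch: which field is live must be tracked elsewhere.
#[allow(dead_code)]
pub union UnionInner {
    ni: NiState,
    soft: SoftState,
}

/// Returns `(size of the enum, size of the union)` in bytes.
pub fn inner_sizes() -> (usize, usize) {
    (size_of::<EnumInner>(), size_of::<UnionInner>())
}
```

With byte-array payloads there is no spare bit pattern for the compiler to hide the discriminant in, so the enum comes out slightly larger; the exact layout of a default-representation enum is not guaranteed by the language.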
Alright, this is strange: just tried to re-enable Note: this specifically occurs when
ARM64: https://github.com/RustCrypto/block-ciphers/pull/208/checks?check_run_id=1480139183
PPC32: https://github.com/RustCrypto/block-ciphers/pull/208/checks?check_run_id=1480139204
It's breaking cross-based builds. See: #208 (comment)
5ee51e5
to
f2f37ec
Compare
61ce5e4
to
13f2968
Compare
Okay, managed to get the test suite green again with the Will try to get
ac003cc
to
7582d0f
Compare
Seems I added it back and
AES-NI was introduced in Westmere, which also had SSSE3, so I think it's fairly safe to assume any AES-NI capable CPU also has SSSE3. @newpavlov do you see anywhere that needs to be annotated with `target_feature(enable = "aes")` that isn't already? I did a quick spot check and it seems like relevant functions are either annotated with that or
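As a reminder of what such an annotation looks like in practice, here is a small sketch, not code from this PR. The module `ni` and the functions `aes_round` and `aes_round_checked` are hypothetical names; the intrinsics are the real ones from `core::arch::x86_64`. The loads and stores sit inside the annotated function, so the whole body is compiled with the feature enabled, and the safe wrapper only calls it after a runtime check.

```rust
#[cfg(target_arch = "x86_64")]
mod ni {
    use core::arch::x86_64::{__m128i, _mm_aesenc_si128, _mm_loadu_si128, _mm_storeu_si128};

    /// Performs a single AES encryption round (the AESENC instruction).
    ///
    /// Safety: the caller must ensure the CPU supports AES-NI.
    #[target_feature(enable = "aes,sse2")]
    pub unsafe fn aes_round(block: &[u8; 16], round_key: &[u8; 16]) -> [u8; 16] {
        // Unaligned loads, so plain byte arrays are fine as inputs.
        let state = _mm_loadu_si128(block.as_ptr() as *const __m128i);
        let key = _mm_loadu_si128(round_key.as_ptr() as *const __m128i);
        let result = _mm_aesenc_si128(state, key);
        let mut out = [0u8; 16];
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, result);
        out
    }
}

/// Safe wrapper: returns `None` when AES-NI is not available at runtime.
#[cfg(target_arch = "x86_64")]
pub fn aes_round_checked(block: &[u8; 16], round_key: &[u8; 16]) -> Option<[u8; 16]> {
    if std::is_x86_feature_detected!("aes") {
        // SAFETY: AES-NI was just detected; SSE2 is baseline on x86_64.
        Some(unsafe { ni::aes_round(block, round_key) })
    } else {
        None
    }
}

/// On other architectures this sketch has no hardware path.
#[cfg(not(target_arch = "x86_64"))]
pub fn aes_round_checked(_block: &[u8; 16], _round_key: &[u8; 16]) -> Option<[u8; 16]> {
    None
}
```

Calling an annotated function on a CPU that lacks the feature is undefined behaviour (typically a SIGILL), which is why the check and the `unsafe` call sit right next to each other.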
On i686/x86_64 platforms, uses the `cpuid-bool` crate to detect at runtime whether AES-NI is available. This eliminates the need to specify `target_feature=+aes` when compiling the crate in order to take advantage of AES-NI.
7582d0f
to
d8aead3
Compare
I'd say this is now ready to merge, so long as the It seems like it might be worth waiting for non-
A disadvantage of the `enum`-based approach is that we will not be able to remove the fallback branches by enabling the necessary target features via `RUSTFLAGS`. But I guess it can be a good enough starting point, and we can revisit it after deciding which CTR flavors we want to expose in the `ctr` crate.
do you see anywhere that needs to be annotated with `target_feature(enable = "aes")` that isn't already?
It looks like you've missed annotations for CTR types and some load/store intrinsics stay outside of the enabled functions. With some code shuffling AES-128 parallel encryption improved on my PC from 8 GB/s to 10.6 GB/s. I will try to submit a separate PR with improvements, so I think we can merge this PR as is.
    BlockCipher, BlockDecrypt, BlockEncrypt, NewBlockCipher,
};

cpuid_bool::new!(aes_cpuid, "aes");
We either need to add `ssse3` to this list or use a separate `aes_ssse3_cpuid` module as you did before. It may also be worth adding a comment that SSE2 is implied by AES-NI, so we don't need to check for it separately. Or we could simply add `sse2` to the list to be completely thorough. :)
Per my comments above, as far as I can tell every CPU with AES-NI has both SSE2 and SSSE3.
AES-NI was introduced in the Westmere architecture, which also has SSSE3. Unless there's some strange AMD/other CPU that has AES-NI but not SSSE3, I don't think it will be a problem.
If you're worried though, I can add back `aes_ssse3_cpuid` (perhaps as a separate PR).
Per the reference, we can omit the `sse2` check if we have verified that `aes` or `ssse3` is enabled, but it does not indicate any implicit dependency between `aes` and `ssse3`. So even though in practice there is probably no such CPU, for our code to be correct we have to check both of those features. The check is cheap and will be executed only once, so I think it's worth being extra safe.
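A sketch of what checking both features once could look like with only the standard library. This is an assumption-laden stand-in for the `cpuid-bool` module discussed here: `detect_aes_and_ssse3` and `aes_ssse3_available` are hypothetical names, and `std::sync::OnceLock` is used to mirror the run-once behaviour.

```rust
use std::sync::OnceLock;

/// Queries both features; SSE2 is not checked separately, per the comment above.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_aes_and_ssse3() -> bool {
    std::is_x86_feature_detected!("aes") && std::is_x86_feature_detected!("ssse3")
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_aes_and_ssse3() -> bool {
    false
}

/// Returns the cached result; detection runs at most once per process.
pub fn aes_ssse3_available() -> bool {
    static CACHE: OnceLock<bool> = OnceLock::new();
    *CACHE.get_or_init(detect_aes_and_ssse3)
}
```

Requiring both flags means a hypothetical CPU with AES-NI but no SSSE3 would simply take the software path instead of faulting.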
Hmm ok, sorry just merged but I can follow up with this.
@newpavlov I took a look into work on supporting And this PR appears to stabilize what we'd need, namely:
So I think this problem may get addressed soon, after which we can migrate the
I just double checked:
Closes #25.
Adds an off-by-default `autodetect` feature which uses the `cpuid-bool` crate to detect whether AES-NI is available when compiling on i686/x86_64 architectures.