AArch64 implementation #90

liamwhite · 2021-05-30T05:47:53Z

Fixes #84. Not particularly polished.

Passes all tests:

running 42 tests
test fallback_hash::tests::test_conversion ... ok
test aes_hash::tests::test_conversion ... ok
test aes_hash::tests::test_sanity ... ok
test fallback_hash::tests::test_hash ... ok
test hash_quality_test::aes_tests::aes_all_bytes_matter ... ok
test aes_hash::tests::test_hash ... ok
test hash_quality_test::aes_tests::aes_finish_is_consistant ... ok
test hash_map::test::test_borrow ... ok
test hash_quality_test::aes_tests::aes_input_affect_every_byte ... ok
test hash_quality_test::aes_tests::aes_keys_change_output ... ok
test hash_quality_test::aes_tests::aes_keys_affect_every_byte ... ok
test hash_quality_test::aes_tests::aes_single_key_bit_flip ... ok
test hash_quality_test::aes_tests::test_single_bit_in_byte ... ok
test hash_quality_test::fallback_tests::fallback_all_bytes_matter ... ok
test hash_quality_test::fallback_tests::fallback_finish_is_consistant ... ok
test hash_quality_test::fallback_tests::fallback_input_affect_every_byte ... ok
test hash_quality_test::fallback_tests::fallback_keys_affect_every_byte ... ok
test hash_quality_test::fallback_tests::fallback_keys_change_output ... ok
test hash_quality_test::aes_tests::aes_single_bit_flip ... ok
test hash_quality_test::aes_tests::aes_test_no_pair_collisions ... ok
test hash_quality_test::fallback_tests::fallback_single_key_bit_flip ... ok
test hash_quality_test::fallback_tests::fallback_single_bit_flip ... ok
test operations::test::test_shuffle_contains_each_value ... ok
test hash_quality_test::fallback_tests::fallback_test_no_pair_collisions ... ok
test hash_quality_test::aes_tests::aes_padding_doesnot_collide ... ok
test operations::test::test_shuffle_moves_high_bits ... ok
test operations::test::test_shuffle_moves_every_value ... ok
test random_state::test::test_unique ... ok
test random_state::test::test_not_pi ... ok
test random_state::test::test_with_seeds_const ... ok
test specialize::test::test_input_processed ... ok
test specialize::test::test_ref_independent ... ok
test specialize::test::test_specialized_invoked ... ok
test test::test_ahasher_construction ... ok
test test::test_builder ... ok
test test::test_default_builder ... ok
test test::test_conversion ... ok
test hash_quality_test::fallback_tests::fallback_padding_doesnot_collide ... ok
test test::test_non_zero ... ok
test test::test_non_zero_specialized ... ok
test hash_quality_test::fallback_tests::fallback_test_no_full_collisions ... ok
test hash_quality_test::aes_tests::ase_test_no_full_collisions ... ok

test result: ok. 42 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.04s

Benchmark (Qualcomm Snapdragon 730G):

aeshash/u8              time:   [3.2561 ns 3.2563 ns 3.2564 ns]                        
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe
aeshash/u16             time:   [3.2557 ns 3.2560 ns 3.2564 ns]                         
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  6 (6.00%) high severe
aeshash/u32             time:   [3.2558 ns 3.2560 ns 3.2562 ns]                         
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low severe
  3 (3.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe
aeshash/u64             time:   [3.2557 ns 3.2571 ns 3.2599 ns]                         
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  3 (3.00%) high severe
aeshash/u128            time:   [940.45 ps 944.50 ps 948.68 ps]                          
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
aeshash/string          time:   [199.92 ns 200.29 ns 200.65 ns]                           
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

... as compared to fallback implementation on 730G:

fallback/u8             time:   [3.2571 ns 3.2574 ns 3.2576 ns]                         
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe
fallback/u16            time:   [3.2569 ns 3.2570 ns 3.2572 ns]                          
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe
fallback/u32            time:   [3.2572 ns 3.2575 ns 3.2577 ns]                          
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe
fallback/u64            time:   [3.2573 ns 3.2591 ns 3.2630 ns]                          
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe
fallback/u128           time:   [913.23 ps 914.34 ps 915.56 ps]                           
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
fallback/string         time:   [346.18 ns 346.20 ns 346.22 ns]                            
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) low severe
  3 (3.00%) high mild
  2 (2.00%) high severe

(main timing difference appears in string benchmark)

eadf · 2021-06-01T11:49:18Z

All tests pass on M1 Mac.

aeshash/u8              time:   [581.65 ps 581.77 ps 581.90 ps]                        
Found 17 outliers among 100 measurements (17.00%)
  1 (1.00%) low mild
  9 (9.00%) high mild
  7 (7.00%) high severe
aeshash/u16             time:   [581.67 ps 581.80 ps 581.95 ps]                         
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe
aeshash/u32             time:   [583.43 ps 585.03 ps 587.65 ps]                         
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
aeshash/u64             time:   [585.17 ps 587.13 ps 589.45 ps]                         
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe
aeshash/u128            time:   [373.79 ps 374.13 ps 374.53 ps]                         
Found 19 outliers among 100 measurements (19.00%)
  4 (4.00%) low mild
  1 (1.00%) high mild
  14 (14.00%) high severe
aeshash/string          time:   [90.148 ns 90.166 ns 90.190 ns]                           
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) high mild
  9 (9.00%) high severe

Edit: fallback comparison (thanks liamwhite)

fallback/u8             time:   [581.93 ps 582.21 ps 582.55 ps]                         
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe
fallback/u16            time:   [582.17 ps 582.45 ps 582.78 ps]                          
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
fallback/u32            time:   [582.06 ps 582.35 ps 582.70 ps]                          
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
fallback/u64            time:   [581.95 ps 582.20 ps 582.50 ps]                          
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
fallback/u128           time:   [373.65 ps 373.77 ps 373.91 ps]                          
Found 15 outliers among 100 measurements (15.00%)
  4 (4.00%) high mild
  11 (11.00%) high severe
fallback/string         time:   [108.99 ns 109.06 ns 109.14 ns]                            
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe

I'm getting a few warning: unused import:... warnings while compiling. But didn't investigate If any of those are added by this PR.

liamwhite · 2021-06-01T13:03:10Z

@eadf comment this line to test the fallback implementation

aHash/tests/bench.rs

Line 136 in ffa04fc

bench_ahash,

eadf · 2021-06-02T20:40:36Z

M1 Mac: passes all tests and aes benches, but only on +nightly (because of the #![feature(stdsimd)] in lib.rs)

tkaitchuck · 2021-06-07T06:09:32Z

The #![feature(stdsimd)] will have to be put behind a nightly flag to get the build to pass.

There also is a second issue:
On x86 the aes instruction is defined strangely, it first does the sub-bytes, shift rows, and mix columns and then xors the key at the end. (This means the first round of the encryption requires a manual xor, and all the intermediate rounds use one instruction to complete the round and start the next). This is actually very unfortunate for aHash because it is much more convenient if the xor happens first, because it means an additional round is required. Currently the code is designed to work with this x86 way of doing things, so doing it the other way or arm invalidates my analysis of the correctness of the algorithm.

I will look through the code and see if there is any problem with this approach as-is. However this is also an opportunity, it ought to be possible to design a version which on arm invokes one less call to aes, but produces the same quality output.

In the meantime, can either of you run the smhasher test against it as a sanity test? There is a shell script checked into the repo in the smhasher directory. If you run clone_smhasher.sh from that directory, and then build the ahash-cbindings and run the install script
you should be able to just go into the smhasher directory and run ./build.sh && ./SMHasher ahash64

tkaitchuck · 2021-06-07T06:25:08Z

~~In fact there is a problem caused by the ordering:~~
https://github.com/tkaitchuck/aHash/blob/master/src/aes_hash.rs#L86

Currently the length of a slice is added in at the beginning under the assumption that the first aes round will mix it BEFORE xoring with the first block of user data. On ARM this would not be the case, making it possible to force a collision with a 50% chance between two slices which are identical but differ by one of them being longer by one byte (which has a value of 0).

~~The most direct solution would be on ARM to have an alternative implementation of add_in_length:~~

        self.enc = aesenc(self.enc, length as u128);

This way the length is mixed in instead of simply adding it. If we want to avoid the additional aes call it would be possible to move the adding of the length to the end of write instead of the beginning. But in my experience performing additional instruction after the if statements which are common between branches causes sub-optimal assembly to be generated. You can try it and benchmark which is better.

liamwhite · 2021-06-08T02:54:47Z

I wrote a test:

fn test_length_extension<T: Hasher>(hasher: impl Fn() -> T) {
    for i in 0..=255u8 {
        let v1 = hash(&[i], &hasher);
        let v2 = hash(&[i, 0], &hasher);
        assert!(v1 != v2, "i: {} -> v1 = {:x}, v2 = {:x}", i, v1, v2);
    }
}

mod aes_tests {
    #[test]
    fn aes_length_extension() {
        test_length_extension(|| AHasher::test_with_keys(0, 0));
    }
}

It passes on aarch64:

test hash_quality_test::aes_tests::aes_length_extension ... ok

(note that it fails the fallback test)

How should I change it to get the failure mode you describe?

tkaitchuck · 2021-06-09T05:01:49Z

The fallback is failing because you are calling test_with_keys instead of new_with_keys.
The code there is not quite what I intended. I'll post a test which I suspect may fail.

tkaitchuck · 2021-06-13T02:36:11Z

I was wrong above. There is no issue. The code is robust against what I was proposing there.
If we can fix the build issue this should be good to go.

eadf · 2021-06-14T13:04:01Z

The name of the "nightly" feature feels wrong. If it was automatically enabled when using the +nightly toolkit it would be ok, but it isn't. To use it we have to build with +nightly and select the feature "nightly".

Wouldn't it be more descriptive to name it "stdsimd" - i.e. a feature that only can be used in nightly?

P.S. I just checked that #![feature(stdsimd)] and #[cfg(feature = "stdsimd"))] can use the same name.

novacrazy · 2021-06-14T13:18:43Z

Quite a few crates use a nightly flag, including hashbrown.

eadf · 2021-06-14T13:42:28Z

Guess it comes down to the intent of the feature flag. If there are more (future) features that could make use of the same flag (like hash brown) it makes sense with a name like "nightly". For an isolated feature it would be better with a more descriptive name, imho.
That said, "stdsimd", probably isn't the best name for this flag.

And yes: M1 mac completes all tests on +stable and +nightly.
On +stable the aes bench complains about 'aes must be enabled' (as expected). But fallback bench works.
On +nightly aes and fallback benches works as before (if the flag is set accordingly).

tkaitchuck · 2021-06-15T18:42:11Z

Detection of nightly can be handled in the build without requiring users to specify a flag. See:
https://github.com/tkaitchuck/aHash/blob/master/build.rs#L8

Cargo.toml

build.rs

tkaitchuck

It would be good if we could get the SMHasher results also.

Cargo.toml

eadf · 2021-06-20T13:27:39Z

M1 Mac, regarding SMHasher:

The patch 0001-Add-support-for-aHash.patch used by clone_smhasher.sh seems to be out of date, it does not apply at all.

git clone ... && cd smhasher && git apply ../0001-Add-support-for-aHash.patch

error: patch failed: CMakeLists.txt:638
error: CMakeLists.txt: patch does not apply
error: patch failed: Hashes.h:17
error: Hashes.h: patch does not apply
error: ahash.h: already exists in working directory
error: patch failed: main.cpp:302
error: main.cpp: patch does not apply

tkaitchuck · 2021-06-25T23:15:26Z

I see the patch does not apply because smhasher directly added support for aHash!
So it should be possible to run directly.

tkaitchuck · 2021-07-07T04:15:57Z

@eadf What is the status of SMHasher? I don't quite follow from the linked PR.

eadf · 2021-07-07T09:21:51Z

It does not build on a M1, the cmake script fails to identify the architecture and the compiler says that it does not support -march 'native'. But I don't have the time nor cmake knowledge to fix that.
However, I could help run a few tests if someone w/o access to a M1 wants to give it a go.

tkaitchuck · 2021-07-13T05:49:21Z

In that case I am going to merge this, and open an issue to track it.

aarch64 implementation

b021133

add aarch32 support

cb2eb63

tkaitchuck self-requested a review June 7, 2021 05:49

conditionally enable nightly crypto on cargo feature

2c087fd

add nightly feature in build script

abd30f5

tkaitchuck reviewed Jun 16, 2021

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

build.rs Outdated Show resolved Hide resolved

rename nightly feature to stdsimd

c68a29a

tkaitchuck approved these changes Jun 20, 2021

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

remove stdsimd feature from Cargo.toml

86839d0

eadf mentioned this pull request Jun 26, 2021

Surrounded CMakeLists.txt ${CPUINFO} with quotes for Darwin. (M1/Arm) rurban/smhasher#202

Closed

tkaitchuck merged commit 8781819 into tkaitchuck:master Jul 13, 2021

tkaitchuck mentioned this pull request Jul 13, 2021

Run smhasher on ARM #96

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AArch64 implementation #90

AArch64 implementation #90

liamwhite commented May 30, 2021

eadf commented Jun 1, 2021 •

edited

Loading

liamwhite commented Jun 1, 2021

eadf commented Jun 2, 2021

tkaitchuck commented Jun 7, 2021

tkaitchuck commented Jun 7, 2021 •

edited

Loading

liamwhite commented Jun 8, 2021

tkaitchuck commented Jun 9, 2021

tkaitchuck commented Jun 13, 2021

eadf commented Jun 14, 2021 •

edited

Loading

novacrazy commented Jun 14, 2021

eadf commented Jun 14, 2021 •

edited

Loading

tkaitchuck commented Jun 15, 2021

tkaitchuck left a comment

eadf commented Jun 20, 2021 •

edited

Loading

tkaitchuck commented Jun 25, 2021 •

edited

Loading

tkaitchuck commented Jul 7, 2021

eadf commented Jul 7, 2021

tkaitchuck commented Jul 13, 2021

AArch64 implementation #90

AArch64 implementation #90

Conversation

liamwhite commented May 30, 2021

eadf commented Jun 1, 2021 • edited Loading

liamwhite commented Jun 1, 2021

eadf commented Jun 2, 2021

tkaitchuck commented Jun 7, 2021

tkaitchuck commented Jun 7, 2021 • edited Loading

liamwhite commented Jun 8, 2021

tkaitchuck commented Jun 9, 2021

tkaitchuck commented Jun 13, 2021

eadf commented Jun 14, 2021 • edited Loading

novacrazy commented Jun 14, 2021

eadf commented Jun 14, 2021 • edited Loading

tkaitchuck commented Jun 15, 2021

tkaitchuck left a comment

Choose a reason for hiding this comment

eadf commented Jun 20, 2021 • edited Loading

tkaitchuck commented Jun 25, 2021 • edited Loading

tkaitchuck commented Jul 7, 2021

eadf commented Jul 7, 2021

tkaitchuck commented Jul 13, 2021

eadf commented Jun 1, 2021 •

edited

Loading

tkaitchuck commented Jun 7, 2021 •

edited

Loading

eadf commented Jun 14, 2021 •

edited

Loading

eadf commented Jun 14, 2021 •

edited

Loading

eadf commented Jun 20, 2021 •

edited

Loading

tkaitchuck commented Jun 25, 2021 •

edited

Loading