CPU Benchmarks
Originally written by @lxp in Issue #23
In the benchmark output below:
- Benchmark4kEncStupidGCM = OpenSSL through our stupidgcm wrapper,
- Benchmark4kEncGoGCM = Go stdlib.
In recent gocryptfs versions you can run gocryptfs -speed
to run the benchmarks and get nicer output.
The tests were run on go version go1.6 linux/amd64
unless noted otherwise.
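For orientation, a benchmark of the Benchmark4kEncGoGCM kind can be sketched with the Go standard library alone. This is a minimal sketch, not the gocryptfs source: the 4096-byte block size is inferred from the "4k" in the benchmark name, and the all-zero 32-byte key, the default 12-byte nonce and the helper names `newGCM` and `bench4kEncGoGCM` are assumptions made for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
	"testing"
)

// newGCM builds an AES-256-GCM AEAD from an all-zero key.
// A zero key is acceptable only because this is a throughput test.
func newGCM() (cipher.AEAD, error) {
	block, err := aes.NewCipher(make([]byte, 32))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// bench4kEncGoGCM seals one 4096-byte block per iteration.
func bench4kEncGoGCM(b *testing.B) {
	gcm, err := newGCM()
	if err != nil {
		b.Fatal(err)
	}
	plain := make([]byte, 4096)
	nonce := make([]byte, gcm.NonceSize())
	dst := make([]byte, 0, len(plain)+gcm.Overhead())
	b.SetBytes(int64(len(plain))) // lets the framework report MB/s
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Reusing a nonce is unsafe in real use; it is done here for timing only.
		dst = gcm.Seal(dst[:0], nonce, plain, nil)
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	res := testing.Benchmark(bench4kEncGoGCM)
	fmt.Println(res.String())
}
```

The printed line has the same columns as the stupidgcm.test output on this page (iterations, ns/op, MB/s), but the numbers are not directly comparable unless the real benchmark uses the same parameters.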
Kaby Lake (Launch: Q2'17)
$ cat /proc/cpuinfo | grep -E "model name|flags" | head -2
model name : Intel(R) Core(TM) i3-7130U CPU @ 2.70GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
$ ./gocryptfs -version
gocryptfs v1.7-23-gcc0a603; go-fuse v1.0.0-186-g467f4e0; 2019-04-14 go1.12.4
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 877.83 MB/s
AES-GCM-256-Go 1905.48 MB/s (selected in auto mode)
AES-SIV-512-Go 212.29 MB/s
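To compare results from several machines, the -speed output can be parsed mechanically. The following is a minimal sketch that assumes the line layout seen on this page (name, number, the literal "MB/s", then an optional note); `speedResult` and `parseSpeed` are hypothetical names, not part of gocryptfs.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// speedResult holds one line of "gocryptfs -speed" output.
type speedResult struct {
	Name string
	MBps float64
}

// parseSpeed extracts "<name> <number> MB/s" lines and skips everything
// else, such as the version banner or "N/A" entries.
func parseSpeed(out string) []speedResult {
	var results []speedResult
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 3 || f[2] != "MB/s" {
			continue
		}
		v, err := strconv.ParseFloat(f[1], 64)
		if err != nil {
			continue
		}
		results = append(results, speedResult{Name: f[0], MBps: v})
	}
	return results
}

func main() {
	out := `AES-GCM-256-OpenSSL 877.83 MB/s
AES-GCM-256-Go 1905.48 MB/s (selected in auto mode)
AES-SIV-512-Go 212.29 MB/s`
	for _, r := range parseSpeed(out) {
		fmt.Printf("%-22s %8.2f MB/s\n", r.Name, r.MBps)
	}
}
```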
Skylake (Launch: Q3'15)
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4 200000 10688 ns/op 383.22 MB/s
Benchmark4kEncGoGCM-4 300000 4073 ns/op 1005.57 MB/s
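The MB/s column in the stupidgcm.test output follows from the ns/op column and the block size. Assuming 4096-byte blocks (suggested by the "4k" in the benchmark names) and 1 MB = 10^6 bytes, the conversion can be sketched as below; `mbPerSec` is a hypothetical helper. For 10688 ns/op it gives about 383.23 MB/s against the printed 383.22 MB/s; small differences of this kind are expected because the printed ns/op value is rounded.

```go
package main

import "fmt"

// mbPerSec converts a per-operation time into throughput,
// with 1 MB = 1e6 bytes.
func mbPerSec(bytesPerOp int, nsPerOp float64) float64 {
	// bytes/ns * 1e9 ns/s / 1e6 bytes/MB = bytes/ns * 1e3
	return float64(bytesPerOp) / nsPerOp * 1e3
}

func main() {
	fmt.Printf("%.2f MB/s\n", mbPerSec(4096, 10688)) // Benchmark4kEncStupidGCM-4
	fmt.Printf("%.2f MB/s\n", mbPerSec(4096, 4073))  // Benchmark4kEncGoGCM-4
}
```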
Haswell (Launch: Q2'14)
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4 200000 6710 ns/op 610.43 MB/s
Benchmark4kEncGoGCM-4 500000 2422 ns/op 1690.86 MB/s
Ivy Bridge (Launch: Q2'12)
$ grep 'model name\|flags' /proc/cpuinfo | head -n2
model name : Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
$ gocryptfs -version
gocryptfs v1.7-37-gb1468a7; go-fuse v1.0.0-174-g22a9cb9; 2019-06-11 go1.12 linux/amd64
$ gocryptfs -speed
AES-GCM-256-OpenSSL 546.39 MB/s
AES-GCM-256-Go 828.67 MB/s (selected in auto mode)
AES-SIV-512-Go 158.73 MB/s
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4 200000 14684 ns/op 278.94 MB/s
Benchmark4kEncGoGCM-4 300000 7792 ns/op 525.62 MB/s
Sandy Bridge (Launch: Q1'11)
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4 100000 19070 ns/op 214.78 MB/s
Benchmark4kEncGoGCM-4 200000 10981 ns/op 373.01 MB/s
Westmere (Launch: Q1'10)
$ cat /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb tpr_shadow vnmi flexpriority ept vpid dtherm ida arat
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-16 100000 18297 ns/op 223.85 MB/s
Benchmark4kEncGoGCM-16 200000 9579 ns/op 427.58 MB/s
Ivy Bridge (Launch: Q1'13)
$ cat /proc/cpuinfo
model name : Intel(R) Pentium(R) CPU G2130 @ 3.20GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2 100000 22691 ns/op 180.51 MB/s
Benchmark4kEncGoGCM-2 20000 92810 ns/op 44.13 MB/s
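The flags lines on this page differ in one way that appears to matter for these numbers: the Pentium G2130 above lists no aes flag, and its Go GCM result is far below its OpenSSL result, while the CPUs that list aes show the opposite. The link between the flag and the Go GCM speed is an inference from the data on this page, not something the page states. A minimal sketch for checking a flags line, with `hasFlags` as a hypothetical helper:

```go
package main

import (
	"fmt"
	"strings"
)

// hasFlags reports whether every wanted flag appears in a /proc/cpuinfo
// "flags : ..." line. Matching is on whole words, so "aes" does not
// match "vaes".
func hasFlags(flagsLine string, want ...string) bool {
	// Drop the "flags :" prefix if present.
	if i := strings.IndexByte(flagsLine, ':'); i >= 0 {
		flagsLine = flagsLine[i+1:]
	}
	have := make(map[string]bool)
	for _, f := range strings.Fields(flagsLine) {
		have[f] = true
	}
	for _, w := range want {
		if !have[w] {
			return false
		}
	}
	return true
}

func main() {
	// Shortened flags lines for illustration; the full ones are listed on this page.
	withAES := "flags : fpu sse2 pclmulqdq ssse3 sse4_1 sse4_2 aes avx"
	withoutAES := "flags : fpu sse2 pclmulqdq ssse3 sse4_1 sse4_2 xsave"
	fmt.Println(hasFlags(withAES, "aes", "pclmulqdq"))
	fmt.Println(hasFlags(withoutAES, "aes", "pclmulqdq"))
}
```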
Sandy Bridge (Launch: Q3'11)
$ grep 'model name\|flags' /proc/cpuinfo | head -n2
model name : Intel(R) Pentium(R) CPU G630 @ 2.70GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 175.80 MB/s (selected in auto mode)
AES-GCM-256-Go 49.53 MB/s
AES-SIV-512-Go 38.37 MB/s
Nehalem (Launch: Q3'09)
$ cat /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU X3460 @ 2.80GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-8 50000 35247 ns/op 116.21 MB/s
Benchmark4kEncGoGCM-8 20000 92230 ns/op 44.41 MB/s
Core (Launch: Q1'08)
$ cat /proc/cpuinfo
model name : Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2 30000 46697 ns/op 87.71 MB/s
Benchmark4kEncGoGCM-2 10000 194095 ns/op 21.10 MB/s
Apple M1 ( https://github.com/rfjakob/gocryptfs/issues/556#issuecomment-848079309 )
% gocryptfs -speed
gocryptfs v2.0-beta4-5-g09870bf; go-fuse v2.1.1-0.20210423170155-a90e1f463c3f => github.com/rfjakob/go-fuse/v2 v2.1.1-0.20210508151621-62c5aa1919a7; 2021-05-25 go1.16.3 darwin/arm64
AES-GCM-256-OpenSSL 1627.09 MB/s (selected in auto mode)
AES-GCM-256-Go 3746.85 MB/s
AES-SIV-512-Go 452.57 MB/s
XChaCha20-Poly1305-Go 747.43 MB/s (benchmark only, not selectable yet)
Data gathered by @DavyLandman, thank you very much!
Ubuntu 19.10, 64-bit
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 21.46 MB/s (selected in auto mode)
AES-GCM-256-Go 21.65 MB/s
AES-SIV-512-Go 17.59 MB/s
From https://github.com/rfjakob/gocryptfs/issues/531#issue-760624096 (Raspberry Pi 4b running Ubuntu 20.10, 64-bit):
gocryptfs 1.8.0; go-fuse 2.0.3; 2020-11-27 go1.15.5 linux/arm64
AES-GCM-256-OpenSSL 21.44 MB/s (selected in auto mode)
AES-GCM-256-Go 21.06 MB/s
AES-SIV-512-Go 17.70 MB/s
XChaCha20-Poly1305-Go 122.86 MB/s
model name : ARMv7 Processor rev 3 (v7l)
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
$ gocryptfs -speed
AES-GCM-256-OpenSSL 34.26 MB/s (selected in auto mode)
AES-GCM-256-Go 17.24 MB/s
AES-SIV-512-Go 17.58 MB/s
From https://github.com/rfjakob/gocryptfs/issues/452#issuecomment-593334109 :
$ ./gocryptfs.xchacha20.armv7 --speed
AES-GCM-256-OpenSSL N/A
AES-GCM-256-Go 17.04 MB/s (selected in auto mode)
AES-SIV-512-Go 14.79 MB/s
XChaCha20-Poly1305-Go 23.37 MB/s
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
chacha20-poly1305 64066.72k 130153.44k 275532.80k 306572.84k 320018.56k 307903.74k
aes-256-gcm 40323.87k 49980.74k 64734.47k 70323.03k 71862.66k 71786.19k
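The openssl speed figures are in thousands of bytes per second, so they need to be divided by 1000 before they can be set next to the MB/s values from gocryptfs -speed. A minimal sketch, assuming 1 MB = 10^6 bytes on the gocryptfs side; `opensslToMBps` is a hypothetical helper. The aes-256-gcm figure for 16384-byte blocks above, 71786.19k, corresponds to about 71.79 MB/s.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// opensslToMBps converts an "openssl speed" cell such as "71786.19k"
// (thousands of bytes per second) to MB/s with 1 MB = 1e6 bytes.
func opensslToMBps(cell string) (float64, error) {
	v, err := strconv.ParseFloat(strings.TrimSuffix(cell, "k"), 64)
	if err != nil {
		return 0, err
	}
	return v / 1000, nil
}

func main() {
	for _, cell := range []string{"307903.74k", "71786.19k"} {
		mbps, err := opensslToMBps(cell)
		if err != nil {
			fmt.Println("cannot parse", cell, err)
			continue
		}
		fmt.Printf("%s = %.2f MB/s\n", cell, mbps)
	}
}
```

Note that openssl speed and gocryptfs -speed measure different code paths and block sizes, so the converted values are only a rough cross-check.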
model name : ARMv7 Processor rev 4 (v7l)
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
$ gocryptfs -speed
AES-GCM-256-OpenSSL 17.13 MB/s (selected in auto mode)
AES-GCM-256-Go 5.27 MB/s
AES-SIV-512-Go 4.31 MB/s
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
chacha20-poly1305 30020.39k 63560.13k 77169.32k 82019.33k 83536.55k 83645.78k
aes-256-gcm 16137.38k 19500.97k 20668.33k 20986.20k 21127.17k 21135.36k
model name : ARMv6-compatible processor rev 7 (v6l)
Features : half thumb fastmult vfp edsp java tls
$ gocryptfs -speed
AES-GCM-256-OpenSSL 4.80 MB/s (selected in auto mode)
AES-GCM-256-Go 1.85 MB/s
AES-SIV-512-Go 1.50 MB/s
$ openssl speed -evp chacha20-poly1305 && openssl speed -evp aes-256-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
chacha20-poly1305 8090.97k 18202.65k 23222.03k 24960.34k 25666.44k 24958.29k
aes-256-gcm 4525.91k 6268.65k 6972.36k 7141.38k 7230.33k 7150.88k