-
Notifications
You must be signed in to change notification settings - Fork 17.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: use SwissTable #54766
Comments
Performance comparisonHere the SwissTable is just a basic version. To avoid a CL bringing too many changes, it does not use SIMD and is compatible with the previous hashmap memory layout(it means many optimizations are not included in this version). Its performance still has room for improvement; we will add these optimizations in subsequent CLs. In general, the performance changes for the basic version of SwissTable are as follows.
Platform
Benchmark-1Located in https://github.com/zhangyunhao116/gomapbench This benchmark contains performance comparisons of common operations of common types in different situations. INFOname old time/op new time/op delta MapIter/Int/6-16 60.2ns ± 1% 55.7ns ± 0% -7.35% (p=0.000 n=9+7) MapIter/Int/12-16 121ns ±10% 102ns ± 3% -15.51% (p=0.000 n=10+10) MapIter/Int/18-16 173ns ± 2% 165ns ± 1% -4.58% (p=0.000 n=10+8) MapIter/Int/24-16 214ns ± 4% 194ns ± 3% -9.46% (p=0.000 n=10+10) MapIter/Int/30-16 287ns ± 3% 272ns ± 2% -5.22% (p=0.000 n=10+9) MapIter/Int/64-16 609ns ± 0% 572ns ± 3% -6.10% (p=0.000 n=7+10) MapIter/Int/128-16 1.21µs ± 4% 1.17µs ± 2% -3.63% (p=0.004 n=10+10) MapIter/Int/256-16 2.46µs ± 2% 2.23µs ± 2% -9.31% (p=0.000 n=10+10) MapIter/Int/512-16 4.98µs ± 4% 4.38µs ± 2% -12.22% (p=0.000 n=10+10) MapIter/Int/1024-16 9.94µs ± 3% 8.84µs ± 2% -11.11% (p=0.000 n=10+10) MapIter/Int/2048-16 20.1µs ± 3% 17.8µs ± 1% -11.33% (p=0.000 n=10+10) MapIter/Int/4096-16 39.7µs ± 1% 35.5µs ± 2% -10.49% (p=0.000 n=10+10) MapIter/Int/8192-16 79.3µs ± 2% 71.8µs ± 1% -9.42% (p=0.000 n=10+10) MapIter/Int/65536-16 631µs ± 2% 571µs ± 1% -9.52% (p=0.000 n=10+9) MapAccessHit/Int64/6-16 4.24ns ± 1% 3.71ns ± 1% -12.37% (p=0.000 n=8+9) MapAccessHit/Int64/12-16 6.61ns ± 6% 6.63ns ±11% ~ (p=0.796 n=9+9) MapAccessHit/Int64/18-16 6.74ns ± 4% 7.01ns ± 1% +4.02% (p=0.002 n=9+8) MapAccessHit/Int64/24-16 7.81ns ± 8% 8.17ns ± 5% +4.62% (p=0.034 n=10+8) MapAccessHit/Int64/30-16 6.23ns ± 6% 6.99ns ± 1% +12.34% (p=0.000 n=10+7) MapAccessHit/Int64/64-16 6.68ns ± 6% 7.22ns ± 2% +8.08% (p=0.000 n=10+9) MapAccessHit/Int64/128-16 6.98ns ± 6% 7.32ns ± 6% +4.93% (p=0.004 n=10+10) MapAccessHit/Int64/256-16 7.12ns ± 2% 7.28ns ± 2% +2.21% (p=0.002 n=9+10) MapAccessHit/Int64/512-16 7.68ns ± 4% 7.40ns ± 2% -3.65% (p=0.000 n=9+10) MapAccessHit/Int64/1024-16 8.85ns ± 2% 7.35ns ± 2% -16.92% (p=0.000 n=10+9) MapAccessHit/Int64/2048-16 11.2ns ± 3% 7.7ns ± 1% -31.23% (p=0.000 n=10+8) MapAccessHit/Int64/4096-16 14.1ns ± 1% 7.9ns ± 1% -43.96% (p=0.000 n=7+8) MapAccessHit/Int64/8192-16 16.2ns ± 2% 8.2ns ± 2% -49.14% (p=0.000 n=10+10) MapAccessHit/Int64/65536-16 19.6ns ± 2% 10.5ns ± 1% -46.18% (p=0.000 n=10+9) MapAccessHit/Int32/6-16 3.78ns ± 1% 3.86ns ± 1% +2.12% (p=0.000 n=10+8) MapAccessHit/Int32/12-16 6.23ns ± 6% 6.16ns ± 4% ~ (p=0.483 n=10+9) MapAccessHit/Int32/18-16 6.42ns ± 5% 6.99ns ± 3% +8.91% (p=0.000 n=10+9) MapAccessHit/Int32/24-16 7.46ns ± 5% 8.60ns ± 8% +15.32% (p=0.000 n=9+10) MapAccessHit/Int32/30-16 6.07ns ± 5% 6.97ns ± 1% +14.88% (p=0.000 n=10+8) MapAccessHit/Int32/64-16 6.50ns ± 4% 7.18ns ± 5% +10.34% (p=0.000 n=9+10) MapAccessHit/Int32/128-16 6.68ns ± 6% 7.15ns ± 4% +7.12% (p=0.000 n=10+10) MapAccessHit/Int32/256-16 6.81ns ± 4% 7.32ns ± 4% +7.45% (p=0.000 n=10+10) MapAccessHit/Int32/512-16 7.40ns ± 3% 7.30ns ± 3% ~ (p=0.156 n=9+10) MapAccessHit/Int32/1024-16 8.46ns ± 2% 7.38ns ± 1% -12.79% (p=0.000 n=10+9) MapAccessHit/Int32/2048-16 11.0ns ± 3% 7.6ns ± 2% -31.46% (p=0.000 n=10+10) MapAccessHit/Int32/4096-16 13.9ns ± 2% 7.8ns ± 2% -44.12% (p=0.000 n=10+10) MapAccessHit/Int32/8192-16 15.8ns ± 2% 8.1ns ± 1% -48.99% (p=0.000 n=10+10) MapAccessHit/Int32/65536-16 19.0ns ± 2% 10.3ns ± 2% -45.85% (p=0.000 n=10+10) MapAccessHit/Str/6-16 13.5ns ± 1% 12.7ns ± 2% -5.88% (p=0.000 n=9+10) MapAccessHit/Str/12-16 9.77ns ±11% 9.40ns ± 2% ~ (p=0.460 n=10+8) MapAccessHit/Str/18-16 9.05ns ± 7% 9.47ns ± 1% +4.66% (p=0.001 n=9+9) MapAccessHit/Str/24-16 9.92ns ± 7% 10.07ns ± 9% ~ (p=0.604 n=9+10) MapAccessHit/Str/30-16 8.46ns ± 7% 9.43ns ± 1% +11.50% (p=0.000 n=10+8) MapAccessHit/Str/64-16 8.91ns ± 5% 9.57ns ± 2% +7.39% (p=0.000 n=10+8) MapAccessHit/Str/128-16 9.82ns ± 4% 10.58ns ± 5% +7.75% (p=0.000 n=10+10) MapAccessHit/Str/256-16 11.8ns ± 4% 11.4ns ± 1% -3.01% (p=0.008 n=10+8) MapAccessHit/Str/512-16 14.6ns ± 2% 11.9ns ± 2% -18.89% (p=0.000 n=8+9) MapAccessHit/Str/1024-16 17.9ns ± 2% 12.7ns ± 1% -29.02% (p=0.000 n=10+9) MapAccessHit/Str/2048-16 21.5ns ± 3% 13.0ns ± 2% -39.43% (p=0.000 n=10+9) MapAccessHit/Str/4096-16 25.1ns ± 1% 13.2ns ± 2% -47.63% (p=0.000 n=10+10) MapAccessHit/Str/8192-16 26.6ns ± 1% 14.2ns ± 1% -46.62% (p=0.000 n=9+9) MapAccessHit/Str/65536-16 31.8ns ± 1% 17.3ns ± 2% -45.64% (p=0.000 n=10+10) MapAccessMiss/Int64/6-16 8.03ns ± 1% 7.35ns ± 1% -8.39% (p=0.000 n=10+10) MapAccessMiss/Int64/12-16 10.2ns ± 2% 18.3ns ± 3% +79.48% (p=0.000 n=8+8) MapAccessMiss/Int64/18-16 10.3ns ± 1% 7.2ns ± 1% -29.85% (p=0.000 n=9+8) MapAccessMiss/Int64/24-16 13.9ns ± 2% 10.8ns ± 2% -22.43% (p=0.000 n=7+8) MapAccessMiss/Int64/30-16 10.1ns ± 3% 7.1ns ± 3% -29.69% (p=0.000 n=8+8) MapAccessMiss/Int64/64-16 10.3ns ± 1% 7.9ns ±12% -23.29% (p=0.000 n=7+10) MapAccessMiss/Int64/128-16 10.2ns ± 2% 7.6ns ± 8% -25.64% (p=0.000 n=10+9) MapAccessMiss/Int64/256-16 10.4ns ± 2% 7.9ns ± 7% -23.40% (p=0.000 n=9+9) MapAccessMiss/Int64/512-16 10.3ns ± 3% 7.9ns ± 8% -23.37% (p=0.000 n=9+10) MapAccessMiss/Int64/1024-16 10.4ns ± 2% 8.0ns ± 4% -22.40% (p=0.000 n=9+10) MapAccessMiss/Int64/2048-16 10.4ns ± 2% 7.9ns ± 3% -23.97% (p=0.000 n=10+10) MapAccessMiss/Int64/4096-16 10.4ns ± 2% 8.2ns ± 3% -21.67% (p=0.000 n=10+10) MapAccessMiss/Int64/8192-16 10.3ns ± 2% 8.6ns ± 3% -16.95% (p=0.000 n=10+10) MapAccessMiss/Int64/65536-16 13.5ns ± 2% 10.9ns ± 3% -19.23% (p=0.000 n=9+8) MapAccessMiss/Int32/6-16 6.17ns ± 2% 7.45ns ± 2% +20.86% (p=0.000 n=10+10) MapAccessMiss/Int32/12-16 8.50ns ± 1% 15.94ns ± 3% +87.44% (p=0.000 n=8+8) MapAccessMiss/Int32/18-16 8.49ns ± 2% 8.21ns ±30% ~ (p=0.156 n=9+10) MapAccessMiss/Int32/24-16 11.9ns ± 2% 10.8ns ± 2% -9.03% (p=0.000 n=8+7) MapAccessMiss/Int32/30-16 8.48ns ± 1% 7.53ns ±15% ~ (p=0.138 n=10+10) MapAccessMiss/Int32/64-16 8.49ns ± 1% 8.11ns ±17% -4.50% (p=0.034 n=8+10) MapAccessMiss/Int32/128-16 8.89ns ± 2% 7.86ns ±10% -11.57% (p=0.000 n=7+10) MapAccessMiss/Int32/256-16 8.72ns ± 3% 7.83ns ± 3% -10.23% (p=0.000 n=10+9) MapAccessMiss/Int32/512-16 8.79ns ± 4% 7.70ns ± 4% -12.40% (p=0.000 n=10+9) MapAccessMiss/Int32/1024-16 8.81ns ± 1% 7.74ns ± 5% -12.10% (p=0.000 n=9+10) MapAccessMiss/Int32/2048-16 9.16ns ± 1% 7.91ns ± 2% -13.64% (p=0.000 n=7+9) MapAccessMiss/Int32/4096-16 9.40ns ± 2% 8.15ns ± 3% -13.29% (p=0.000 n=9+9) MapAccessMiss/Int32/8192-16 9.47ns ± 2% 8.29ns ± 2% -12.45% (p=0.000 n=9+9) MapAccessMiss/Int32/65536-16 12.9ns ± 2% 10.6ns ± 2% -18.32% (p=0.000 n=8+10) MapAccessMiss/Str/6-16 5.69ns ± 1% 6.89ns ± 1% +21.05% (p=0.000 n=10+9) MapAccessMiss/Str/12-16 11.7ns ± 5% 10.6ns ±11% -9.74% (p=0.001 n=10+10) MapAccessMiss/Str/18-16 10.2ns ± 1% 9.6ns ± 1% -5.81% (p=0.000 n=8+8) MapAccessMiss/Str/24-16 12.5ns ±11% 11.4ns ± 9% -8.83% (p=0.011 n=10+10) MapAccessMiss/Str/30-16 10.8ns ± 4% 8.9ns ± 1% -17.66% (p=0.000 n=10+8) MapAccessMiss/Str/64-16 10.8ns ± 4% 9.3ns ± 8% -13.63% (p=0.000 n=10+10) MapAccessMiss/Str/128-16 12.0ns ± 7% 9.9ns ±10% -17.57% (p=0.000 n=10+10) MapAccessMiss/Str/256-16 11.2ns ± 3% 10.0ns ± 5% -11.09% (p=0.000 n=10+10) MapAccessMiss/Str/512-16 11.1ns ± 3% 9.6ns ± 6% -13.51% (p=0.000 n=10+10) MapAccessMiss/Str/1024-16 12.3ns ± 2% 9.7ns ± 3% -20.85% (p=0.000 n=10+10) MapAccessMiss/Str/2048-16 15.7ns ± 2% 10.1ns ± 4% -35.32% (p=0.000 n=10+10) MapAccessMiss/Str/4096-16 14.0ns ± 3% 10.2ns ± 3% -27.20% (p=0.000 n=10+10) MapAccessMiss/Str/8192-16 14.1ns ± 2% 10.9ns ± 4% -22.70% (p=0.000 n=10+10) MapAccessMiss/Str/65536-16 19.2ns ± 2% 13.6ns ± 4% -29.47% (p=0.000 n=10+10) MapAssignGrow/Int64/6-16 69.1ns ± 1% 69.5ns ± 1% ~ (p=0.133 n=9+10) MapAssignGrow/Int64/12-16 759ns ± 0% 539ns ± 0% -28.91% (p=0.000 n=7+10) MapAssignGrow/Int64/18-16 1.67µs ± 0% 1.19µs ± 1% -28.56% (p=0.000 n=10+10) MapAssignGrow/Int64/24-16 2.02µs ± 0% 1.36µs ± 0% -32.89% (p=0.000 n=10+10) MapAssignGrow/Int64/30-16 3.53µs ± 0% 2.53µs ± 0% -28.19% (p=0.000 n=8+10) MapAssignGrow/Int64/64-16 7.84µs ± 1% 5.38µs ± 1% -31.36% (p=0.000 n=10+10) MapAssignGrow/Int64/128-16 15.5µs ± 0% 10.9µs ± 0% -29.52% (p=0.000 n=9+10) MapAssignGrow/Int64/256-16 30.2µs ± 0% 21.9µs ± 0% -27.59% (p=0.000 n=10+10) MapAssignGrow/Int64/512-16 58.8µs ± 0% 43.7µs ± 0% -25.58% (p=0.000 n=10+9) MapAssignGrow/Int64/1024-16 116µs ± 0% 88µs ± 0% -24.24% (p=0.000 n=10+9) MapAssignGrow/Int64/2048-16 231µs ± 0% 175µs ± 0% -24.04% (p=0.000 n=10+10) MapAssignGrow/Int64/4096-16 458µs ± 0% 351µs ± 0% -23.49% (p=0.000 n=10+10) MapAssignGrow/Int64/8192-16 919µs ± 0% 712µs ± 0% -22.51% (p=0.000 n=10+10) MapAssignGrow/Int64/65536-16 8.69ms ± 1% 6.29ms ± 0% -27.66% (p=0.000 n=10+7) MapAssignGrow/Int32/6-16 71.7ns ± 1% 67.6ns ± 1% -5.72% (p=0.000 n=9+10) MapAssignGrow/Int32/12-16 734ns ± 0% 521ns ± 0% -29.00% (p=0.000 n=9+10) MapAssignGrow/Int32/18-16 1.61µs ± 1% 1.15µs ± 0% -28.59% (p=0.000 n=10+7) MapAssignGrow/Int32/24-16 1.97µs ± 0% 1.32µs ± 0% -33.10% (p=0.000 n=10+9) MapAssignGrow/Int32/30-16 3.41µs ± 0% 2.43µs ± 1% -28.67% (p=0.000 n=10+10) MapAssignGrow/Int32/64-16 7.45µs ± 0% 5.13µs ± 0% -31.09% (p=0.000 n=10+10) MapAssignGrow/Int32/128-16 15.0µs ± 0% 10.7µs ± 0% -28.89% (p=0.000 n=10+9) MapAssignGrow/Int32/256-16 29.8µs ± 0% 22.0µs ± 1% -26.05% (p=0.000 n=9+10) MapAssignGrow/Int32/512-16 58.1µs ± 0% 42.9µs ± 0% -26.07% (p=0.000 n=10+8) MapAssignGrow/Int32/1024-16 115µs ± 0% 86µs ± 0% -24.96% (p=0.000 n=10+10) MapAssignGrow/Int32/2048-16 226µs ± 1% 171µs ± 1% -24.44% (p=0.000 n=10+10) MapAssignGrow/Int32/4096-16 445µs ± 1% 341µs ± 0% -23.31% (p=0.000 n=10+10) MapAssignGrow/Int32/8192-16 900µs ± 0% 686µs ± 0% -23.72% (p=0.000 n=10+10) MapAssignGrow/Int32/65536-16 7.96ms ± 0% 6.11ms ± 1% -23.18% (p=0.000 n=7+10) MapAssignGrow/Str/6-16 86.5ns ± 2% 91.6ns ± 2% +5.88% (p=0.000 n=10+10) MapAssignGrow/Str/12-16 908ns ± 0% 719ns ± 0% -20.87% (p=0.000 n=10+10) MapAssignGrow/Str/18-16 2.01µs ± 0% 1.59µs ± 0% -20.98% (p=0.000 n=7+8) MapAssignGrow/Str/24-16 2.36µs ± 0% 1.79µs ± 1% -23.93% (p=0.000 n=10+10) MapAssignGrow/Str/30-16 4.17µs ± 0% 3.22µs ± 0% -22.72% (p=0.000 n=10+10) MapAssignGrow/Str/64-16 9.21µs ± 0% 6.73µs ± 0% -26.96% (p=0.000 n=7+9) MapAssignGrow/Str/128-16 18.6µs ± 0% 13.4µs ± 1% -27.60% (p=0.000 n=8+10) MapAssignGrow/Str/256-16 35.5µs ± 0% 26.8µs ± 0% -24.53% (p=0.000 n=10+10) MapAssignGrow/Str/512-16 70.6µs ± 0% 53.3µs ± 0% -24.58% (p=0.000 n=10+10) MapAssignGrow/Str/1024-16 142µs ± 1% 109µs ± 0% -23.46% (p=0.000 n=9+10) MapAssignGrow/Str/2048-16 289µs ± 1% 221µs ± 0% -23.54% (p=0.000 n=10+9) MapAssignGrow/Str/4096-16 588µs ± 0% 448µs ± 0% -23.83% (p=0.000 n=9+9) MapAssignGrow/Str/8192-16 1.27ms ± 1% 0.91ms ± 0% -28.82% (p=0.000 n=10+10) MapAssignGrow/Str/65536-16 12.4ms ± 1% 9.9ms ± 1% -20.47% (p=0.000 n=10+9) MapAssignPreAllocate/Pointer/6-16 310ns ± 1% 281ns ± 0% -9.41% (p=0.000 n=10+7) MapAssignPreAllocate/Pointer/12-16 864ns ± 0% 668ns ± 1% -22.73% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/18-16 1.31µs ± 0% 0.97µs ± 0% -25.94% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/24-16 1.77µs ± 0% 1.26µs ± 0% -29.07% (p=0.000 n=10+9) MapAssignPreAllocate/Pointer/30-16 2.16µs ± 0% 1.60µs ± 1% -26.13% (p=0.000 n=10+9) MapAssignPreAllocate/Pointer/64-16 4.73µs ± 1% 3.23µs ± 0% -31.72% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/128-16 9.18µs ± 0% 6.36µs ± 1% -30.65% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/256-16 18.0µs ± 0% 12.4µs ± 0% -30.98% (p=0.000 n=8+10) MapAssignPreAllocate/Pointer/512-16 36.2µs ± 1% 24.6µs ± 0% -32.11% (p=0.000 n=10+8) MapAssignPreAllocate/Pointer/1024-16 72.4µs ± 1% 50.3µs ± 0% -30.49% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/2048-16 145µs ± 1% 102µs ± 0% -29.93% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/4096-16 291µs ± 0% 209µs ± 1% -28.18% (p=0.000 n=10+9) MapAssignPreAllocate/Pointer/8192-16 590µs ± 0% 444µs ± 1% -24.76% (p=0.000 n=10+10) MapAssignPreAllocate/Pointer/65536-16 6.21ms ± 1% 5.68ms ± 1% -8.48% (p=0.000 n=9+9) MapAssignPreAllocate/Int64/6-16 74.7ns ± 2% 71.2ns ± 2% -4.68% (p=0.000 n=10+10) MapAssignPreAllocate/Int64/12-16 540ns ± 0% 360ns ± 0% -33.39% (p=0.000 n=9+9) MapAssignPreAllocate/Int64/18-16 811ns ± 0% 525ns ± 0% -35.20% (p=0.000 n=9+9) MapAssignPreAllocate/Int64/24-16 1.16µs ± 0% 0.68µs ± 1% -41.94% (p=0.000 n=9+10) MapAssignPreAllocate/Int64/30-16 1.33µs ± 1% 0.87µs ± 1% -34.60% (p=0.000 n=10+10) MapAssignPreAllocate/Int64/64-16 2.97µs ± 1% 1.75µs ± 0% -41.01% (p=0.000 n=10+10) MapAssignPreAllocate/Int64/128-16 5.73µs ± 0% 3.39µs ± 0% -40.79% (p=0.000 n=9+8) MapAssignPreAllocate/Int64/256-16 11.0µs ± 0% 6.5µs ± 0% -40.50% (p=0.000 n=10+8) MapAssignPreAllocate/Int64/512-16 21.8µs ± 0% 12.9µs ± 0% -40.91% (p=0.000 n=10+10) MapAssignPreAllocate/Int64/1024-16 43.6µs ± 0% 26.4µs ± 0% -39.42% (p=0.000 n=10+10) MapAssignPreAllocate/Int64/2048-16 88.0µs ± 1% 52.5µs ± 0% -40.40% (p=0.000 n=10+8) MapAssignPreAllocate/Int64/4096-16 175µs ± 0% 108µs ± 0% -38.60% (p=0.000 n=9+10) MapAssignPreAllocate/Int64/8192-16 362µs ± 0% 224µs ± 1% -38.10% (p=0.000 n=8+10) MapAssignPreAllocate/Int64/65536-16 3.78ms ± 0% 3.21ms ± 2% -15.14% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/6-16 72.8ns ± 1% 69.2ns ± 1% -4.98% (p=0.000 n=9+10) MapAssignPreAllocate/Int32/12-16 503ns ± 0% 342ns ± 0% -32.02% (p=0.000 n=10+9) MapAssignPreAllocate/Int32/18-16 770ns ± 1% 498ns ± 1% -35.29% (p=0.000 n=10+8) MapAssignPreAllocate/Int32/24-16 1.11µs ± 1% 0.65µs ± 0% -41.79% (p=0.000 n=10+8) MapAssignPreAllocate/Int32/30-16 1.26µs ± 0% 0.81µs ± 1% -35.71% (p=0.000 n=10+8) MapAssignPreAllocate/Int32/64-16 2.79µs ± 1% 1.63µs ± 0% -41.60% (p=0.000 n=10+9) MapAssignPreAllocate/Int32/128-16 5.60µs ± 1% 3.39µs ± 1% -39.52% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/256-16 11.1µs ± 1% 6.7µs ± 1% -39.66% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/512-16 21.6µs ± 0% 12.2µs ± 0% -43.30% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/1024-16 42.4µs ± 1% 24.2µs ± 0% -42.91% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/2048-16 84.9µs ± 0% 50.0µs ± 0% -41.08% (p=0.000 n=10+8) MapAssignPreAllocate/Int32/4096-16 173µs ± 0% 104µs ± 0% -40.04% (p=0.000 n=10+10) MapAssignPreAllocate/Int32/8192-16 347µs ± 1% 212µs ± 1% -39.07% (p=0.000 n=10+8) MapAssignPreAllocate/Int32/65536-16 3.69ms ± 1% 3.07ms ± 2% -16.96% (p=0.000 n=10+10) MapAssignPreAllocate/Str/6-16 89.6ns ± 2% 94.6ns ± 1% +5.49% (p=0.000 n=9+7) MapAssignPreAllocate/Str/12-16 641ns ± 1% 504ns ± 0% -21.37% (p=0.000 n=9+9) MapAssignPreAllocate/Str/18-16 1.01µs ± 0% 0.76µs ± 0% -24.98% (p=0.000 n=7+8) MapAssignPreAllocate/Str/24-16 1.36µs ± 0% 0.96µs ± 0% -29.74% (p=0.000 n=9+9) MapAssignPreAllocate/Str/30-16 1.68µs ± 0% 1.22µs ± 0% -27.15% (p=0.000 n=10+9) MapAssignPreAllocate/Str/64-16 3.82µs ± 0% 2.43µs ± 0% -36.29% (p=0.000 n=10+10) MapAssignPreAllocate/Str/128-16 7.60µs ± 0% 4.66µs ± 0% -38.67% (p=0.000 n=9+10) MapAssignPreAllocate/Str/256-16 13.8µs ± 0% 9.2µs ± 0% -33.55% (p=0.000 n=10+10) MapAssignPreAllocate/Str/512-16 27.5µs ± 0% 18.2µs ± 0% -33.89% (p=0.000 n=10+9) MapAssignPreAllocate/Str/1024-16 55.4µs ± 0% 37.8µs ± 0% -31.90% (p=0.000 n=10+10) MapAssignPreAllocate/Str/2048-16 111µs ± 0% 77µs ± 0% -30.20% (p=0.000 n=9+9) MapAssignPreAllocate/Str/4096-16 226µs ± 0% 159µs ± 0% -29.57% (p=0.000 n=10+9) MapAssignPreAllocate/Str/8192-16 481µs ± 0% 368µs ± 1% -23.34% (p=0.000 n=9+10) MapAssignPreAllocate/Str/65536-16 4.83ms ± 1% 4.11ms ± 1% -14.86% (p=0.000 n=10+10) MapAssignReuse/Pointer/6-16 330ns ± 2% 290ns ± 1% -12.18% (p=0.000 n=10+10) MapAssignReuse/Pointer/12-16 757ns ± 1% 550ns ± 1% -27.35% (p=0.000 n=9+9) MapAssignReuse/Pointer/18-16 1.16µs ± 1% 0.81µs ± 1% -30.13% (p=0.000 n=9+10) MapAssignReuse/Pointer/24-16 1.61µs ± 1% 1.08µs ± 1% -32.59% (p=0.000 n=8+10) MapAssignReuse/Pointer/30-16 1.91µs ± 2% 1.31µs ± 1% -31.72% (p=0.000 n=10+9) MapAssignReuse/Pointer/64-16 4.12µs ± 1% 2.75µs ± 1% -33.26% (p=0.000 n=8+9) MapAssignReuse/Pointer/128-16 8.22µs ± 2% 5.50µs ± 1% -33.03% (p=0.000 n=10+10) MapAssignReuse/Pointer/256-16 16.4µs ± 1% 11.0µs ± 1% -33.01% (p=0.000 n=9+10) MapAssignReuse/Pointer/512-16 33.1µs ± 2% 21.9µs ± 1% -33.72% (p=0.000 n=10+10) MapAssignReuse/Pointer/1024-16 66.7µs ± 2% 44.8µs ± 1% -32.78% (p=0.000 n=10+9) MapAssignReuse/Pointer/2048-16 132µs ± 2% 92µs ± 1% -30.11% (p=0.000 n=10+10) MapAssignReuse/Pointer/4096-16 268µs ± 1% 187µs ± 1% -30.25% (p=0.000 n=9+10) MapAssignReuse/Pointer/8192-16 546µs ± 4% 389µs ± 1% -28.67% (p=0.000 n=10+10) MapAssignReuse/Pointer/65536-16 5.14ms ± 3% 4.18ms ± 1% -18.60% (p=0.000 n=10+10) MapAssignReuse/Int64/6-16 81.1ns ± 1% 74.8ns ± 2% -7.75% (p=0.000 n=10+10) MapAssignReuse/Int64/12-16 413ns ± 3% 152ns ± 1% -63.04% (p=0.000 n=9+10) MapAssignReuse/Int64/18-16 429ns ± 3% 223ns ± 1% -48.00% (p=0.000 n=8+10) MapAssignReuse/Int64/24-16 1.01µs ± 4% 0.30µs ± 1% -69.93% (p=0.000 n=10+9) MapAssignReuse/Int64/30-16 618ns ± 3% 358ns ± 2% -42.00% (p=0.000 n=9+10) MapAssignReuse/Int64/64-16 1.31µs ± 1% 0.75µs ± 2% -42.74% (p=0.000 n=8+8) MapAssignReuse/Int64/128-16 2.64µs ± 3% 1.51µs ± 2% -42.96% (p=0.000 n=9+10) MapAssignReuse/Int64/256-16 5.23µs ± 4% 3.00µs ± 2% -42.59% (p=0.000 n=10+10) MapAssignReuse/Int64/512-16 10.3µs ± 3% 6.0µs ± 1% -41.70% (p=0.000 n=10+8) MapAssignReuse/Int64/1024-16 21.0µs ± 2% 12.1µs ± 2% -42.09% (p=0.000 n=10+10) MapAssignReuse/Int64/2048-16 42.2µs ± 2% 25.5µs ± 2% -39.56% (p=0.000 n=10+10) MapAssignReuse/Int64/4096-16 85.3µs ± 3% 51.3µs ± 2% -39.88% (p=0.000 n=10+10) MapAssignReuse/Int64/8192-16 176µs ± 3% 107µs ± 2% -39.54% (p=0.000 n=10+10) MapAssignReuse/Int64/65536-16 1.73ms ± 3% 1.16ms ± 2% -33.33% (p=0.000 n=10+10) MapAssignReuse/Int32/6-16 79.6ns ± 1% 72.4ns ± 2% -9.14% (p=0.000 n=10+10) MapAssignReuse/Int32/12-16 356ns ± 7% 150ns ± 1% -57.73% (p=0.000 n=9+10) MapAssignReuse/Int32/18-16 394ns ± 5% 217ns ± 2% -44.82% (p=0.000 n=9+9) MapAssignReuse/Int32/24-16 1.01µs ± 7% 0.29µs ± 2% -70.98% (p=0.000 n=10+10) MapAssignReuse/Int32/30-16 618ns ± 5% 348ns ± 2% -43.68% (p=0.000 n=10+10) MapAssignReuse/Int32/64-16 1.29µs ± 4% 0.73µs ± 2% -43.23% (p=0.000 n=10+10) MapAssignReuse/Int32/128-16 2.61µs ± 3% 1.46µs ± 2% -43.88% (p=0.000 n=10+10) MapAssignReuse/Int32/256-16 5.21µs ± 3% 2.92µs ± 2% -43.90% (p=0.000 n=10+10) MapAssignReuse/Int32/512-16 10.2µs ± 3% 5.9µs ± 1% -42.54% (p=0.000 n=10+9) MapAssignReuse/Int32/1024-16 20.5µs ± 3% 11.8µs ± 1% -42.72% (p=0.000 n=10+9) MapAssignReuse/Int32/2048-16 41.7µs ± 3% 24.8µs ± 2% -40.50% (p=0.000 n=9+10) MapAssignReuse/Int32/4096-16 84.5µs ± 2% 50.8µs ± 1% -39.93% (p=0.000 n=10+9) MapAssignReuse/Int32/8192-16 170µs ± 1% 103µs ± 2% -39.52% (p=0.000 n=10+8) MapAssignReuse/Int32/65536-16 1.68ms ± 3% 1.12ms ± 2% -33.25% (p=0.000 n=10+10) MapAssignReuse/Str/6-16 96.8ns ± 3% 102.0ns ± 1% +5.33% (p=0.000 n=10+10) MapAssignReuse/Str/12-16 491ns ± 3% 202ns ± 1% -58.99% (p=0.000 n=8+9) MapAssignReuse/Str/18-16 500ns ± 4% 292ns ± 2% -41.68% (p=0.000 n=10+10) MapAssignReuse/Str/24-16 1.13µs ± 3% 0.40µs ± 1% -64.33% (p=0.000 n=9+10) MapAssignReuse/Str/30-16 783ns ± 5% 480ns ± 2% -38.77% (p=0.000 n=10+10) MapAssignReuse/Str/64-16 1.53µs ± 2% 1.02µs ± 2% -33.00% (p=0.000 n=10+10) MapAssignReuse/Str/128-16 3.06µs ± 1% 2.04µs ± 2% -33.22% (p=0.000 n=10+10) MapAssignReuse/Str/256-16 6.15µs ± 3% 4.08µs ± 2% -33.59% (p=0.000 n=9+10) MapAssignReuse/Str/512-16 12.3µs ± 3% 8.2µs ± 1% -33.49% (p=0.000 n=10+10) MapAssignReuse/Str/1024-16 25.3µs ± 2% 17.1µs ± 2% -32.55% (p=0.000 n=10+10) MapAssignReuse/Str/2048-16 50.7µs ± 3% 34.6µs ± 2% -31.79% (p=0.000 n=10+10) MapAssignReuse/Str/4096-16 103µs ± 2% 70µs ± 2% -31.98% (p=0.000 n=10+10) MapAssignReuse/Str/8192-16 218µs ± 1% 157µs ± 2% -28.32% (p=0.000 n=9+10) MapAssignReuse/Str/65536-16 2.03ms ± 1% 1.64ms ± 2% -19.62% (p=0.000 n=10+9) Benchmark-2The benchmarks from runtime. INFOname old time/op new time/op delta MegMap-16 9.01ns ± 1% 8.83ns ± 1% -2.00% (p=0.000 n=8+8) MegOneMap-16 4.59ns ± 1% 9.21ns ± 2% +100.52% (p=0.000 n=9+10) MegEqMap-16 20.4µs ± 2% 20.1µs ± 2% -1.49% (p=0.011 n=9+9) MegEmptyMap-16 2.18ns ± 3% 2.06ns ± 1% -5.42% (p=0.000 n=10+10) SmallStrMap-16 8.27ns ± 1% 7.17ns ± 4% -13.31% (p=0.000 n=8+10) MapStringKeysEight_16-16 6.89ns ± 3% 7.79ns ± 1% +13.09% (p=0.000 n=9+8) MapStringKeysEight_32-16 7.75ns ± 5% 7.97ns ± 0% +2.83% (p=0.000 n=10+7) MapStringKeysEight_64-16 7.66ns ± 3% 8.76ns ± 1% +14.43% (p=0.000 n=10+7) MapStringKeysEight_1M-16 7.72ns ± 3% 16483.00ns ± 1% +213363.36% (p=0.000 n=10+8) IntMap-16 5.21ns ± 1% 6.42ns ±16% +23.30% (p=0.000 n=8+10) MapFirst/1-16 2.68ns ± 1% 2.45ns ± 1% -8.78% (p=0.000 n=10+9) MapFirst/2-16 2.68ns ± 1% 2.43ns ± 1% -9.34% (p=0.000 n=9+9) MapFirst/3-16 2.66ns ± 2% 2.44ns ± 1% -7.98% (p=0.000 n=10+9) MapFirst/4-16 2.68ns ± 1% 2.44ns ± 0% -8.67% (p=0.000 n=8+9) MapFirst/5-16 2.69ns ± 1% 2.45ns ± 0% -8.73% (p=0.000 n=8+8) MapFirst/6-16 2.68ns ± 0% 2.44ns ± 1% -8.78% (p=0.000 n=8+9) MapFirst/7-16 2.69ns ± 1% 2.45ns ± 1% -9.10% (p=0.000 n=10+9) MapFirst/8-16 2.67ns ± 1% 4.79ns ± 1% +79.47% (p=0.000 n=10+9) MapFirst/9-16 4.66ns ± 1% 4.75ns ± 2% +1.98% (p=0.000 n=10+9) MapFirst/10-16 4.68ns ± 2% 4.73ns ± 1% +1.19% (p=0.016 n=9+10) MapFirst/11-16 4.69ns ± 2% 4.78ns ± 1% +1.93% (p=0.000 n=9+8) MapFirst/12-16 4.64ns ± 2% 4.75ns ± 2% +2.47% (p=0.005 n=10+10) MapFirst/13-16 4.69ns ± 1% 4.78ns ± 1% +1.86% (p=0.000 n=8+9) MapFirst/14-16 4.68ns ± 2% 4.79ns ± 1% +2.34% (p=0.000 n=10+8) MapFirst/15-16 4.67ns ± 1% 6.90ns ± 2% +47.88% (p=0.000 n=8+10) MapFirst/16-16 4.65ns ± 2% 6.95ns ± 1% +49.60% (p=0.000 n=10+9) MapMid/1-16 2.68ns ± 1% 2.62ns ± 1% -2.15% (p=0.000 n=10+9) MapMid/2-16 3.23ns ± 1% 2.95ns ± 1% -8.59% (p=0.000 n=10+9) MapMid/3-16 3.21ns ± 1% 2.94ns ± 0% -8.33% (p=0.000 n=10+7) MapMid/4-16 3.64ns ± 2% 3.46ns ± 1% -4.84% (p=0.000 n=10+8) MapMid/5-16 3.67ns ± 1% 3.47ns ± 1% -5.46% (p=0.000 n=10+8) MapMid/6-16 4.15ns ± 2% 3.93ns ± 1% -5.25% (p=0.000 n=10+9) MapMid/7-16 4.19ns ± 1% 3.91ns ± 1% -6.55% (p=0.000 n=10+10) MapMid/8-16 4.84ns ± 2% 5.76ns ±11% +19.02% (p=0.000 n=10+10) MapMid/9-16 5.66ns ±10% 5.83ns ± 9% ~ (p=0.393 n=10+10) MapMid/10-16 5.81ns ± 5% 6.20ns ± 4% +6.67% (p=0.001 n=8+7) MapMid/11-16 5.80ns ± 4% 6.17ns ±11% ~ (p=0.278 n=9+10) MapMid/12-16 5.87ns ±10% 6.26ns ± 2% +6.48% (p=0.008 n=9+7) MapMid/13-16 6.15ns ±13% 5.92ns ± 8% ~ (p=0.315 n=10+8) MapMid/14-16 5.29ns ± 8% 6.18ns ±14% +16.89% (p=0.001 n=9+10) MapMid/15-16 5.44ns ±12% 7.04ns ± 2% +29.37% (p=0.000 n=10+10) MapMid/16-16 5.91ns ±11% 7.18ns ± 3% +21.44% (p=0.000 n=10+9) MapLast/1-16 2.69ns ± 1% 2.64ns ± 2% -2.00% (p=0.000 n=8+10) MapLast/2-16 3.22ns ± 2% 2.93ns ± 1% -9.05% (p=0.000 n=9+10) MapLast/3-16 3.66ns ± 2% 3.47ns ± 1% -5.17% (p=0.000 n=10+8) MapLast/4-16 4.20ns ± 1% 3.88ns ± 2% -7.54% (p=0.000 n=8+9) MapLast/5-16 4.85ns ± 2% 4.39ns ± 2% -9.53% (p=0.000 n=9+9) MapLast/6-16 5.33ns ± 1% 5.10ns ± 8% -4.27% (p=0.017 n=9+10) MapLast/7-16 5.79ns ± 1% 5.33ns ± 2% -7.93% (p=0.000 n=9+8) MapLast/8-16 6.28ns ± 1% 6.55ns ±17% ~ (p=0.704 n=9+10) MapLast/9-16 6.90ns ±18% 6.87ns ±12% ~ (p=1.000 n=9+9) MapLast/10-16 6.83ns ±15% 7.32ns ± 7% +7.10% (p=0.010 n=9+8) MapLast/11-16 7.84ns ± 3% 7.46ns ± 5% -4.78% (p=0.006 n=7+8) MapLast/12-16 7.58ns ± 9% 7.58ns ±12% ~ (p=0.743 n=8+9) MapLast/13-16 9.15ns ±25% 14.06ns ± 2% +53.70% (p=0.000 n=10+8) MapLast/14-16 6.31ns ±15% 7.95ns ± 8% +25.94% (p=0.000 n=9+8) MapLast/15-16 6.20ns ±16% 7.11ns ± 2% +14.59% (p=0.004 n=10+10) MapLast/16-16 6.78ns ±19% 7.19ns ± 1% ~ (p=0.139 n=9+8) MapCycle-16 11.5ns ± 2% 15.8ns ± 2% +36.83% (p=0.000 n=10+10) RepeatedLookupStrMapKey32-16 9.37ns ± 2% 7.81ns ± 1% -16.67% (p=0.000 n=9+9) RepeatedLookupStrMapKey1M-16 16.6µs ± 1% 16.6µs ± 2% ~ (p=0.604 n=10+9) MakeMap/[Byte]Byte-16 120ns ± 1% 116ns ± 1% -3.61% (p=0.000 n=10+10) MakeMap/[Int]Int-16 164ns ± 1% 160ns ± 0% -2.36% (p=0.000 n=10+9) NewEmptyMap-16 4.32ns ± 1% 4.55ns ± 3% +5.26% (p=0.000 n=10+10) NewSmallMap-16 19.4ns ± 2% 24.1ns ± 1% +24.31% (p=0.000 n=10+10) MapIter-16 73.7ns ± 2% 82.3ns ± 3% +11.60% (p=0.000 n=10+10) MapIterEmpty-16 3.65ns ± 1% 4.11ns ± 2% +12.80% (p=0.000 n=9+9) SameLengthMap-16 3.14ns ± 2% 3.38ns ± 2% +7.50% (p=0.000 n=9+10) BigKeyMap-16 10.0ns ± 4% 11.9ns ± 1% +18.84% (p=0.000 n=10+10) BigValMap-16 10.0ns ± 3% 11.9ns ± 2% +18.74% (p=0.000 n=10+10) SmallKeyMap-16 8.51ns ± 1% 9.91ns ± 1% +16.44% (p=0.000 n=10+9) MapPopulate/1-16 10.4ns ± 1% 14.4ns ± 2% +38.28% (p=0.000 n=9+9) MapPopulate/10-16 608ns ± 1% 495ns ± 1% -18.58% (p=0.000 n=10+10) MapPopulate/100-16 9.32µs ± 1% 6.28µs ± 1% -32.59% (p=0.000 n=9+10) MapPopulate/1000-16 111µs ± 0% 86µs ± 0% -22.28% (p=0.000 n=10+10) MapPopulate/10000-16 965µs ± 0% 745µs ± 1% -22.78% (p=0.000 n=10+10) MapPopulate/100000-16 10.4ms ± 1% 7.4ms ± 0% -28.46% (p=0.000 n=10+10) ComplexAlgMap-16 20.6ns ± 1% 22.9ns ± 1% +10.86% (p=0.000 n=9+10) GoMapClear/Reflexive/1-16 19.3ns ± 1% 17.9ns ± 1% -7.39% (p=0.000 n=10+10) GoMapClear/Reflexive/10-16 21.2ns ± 1% 18.9ns ± 2% -10.91% (p=0.000 n=10+9) GoMapClear/Reflexive/100-16 37.3ns ± 1% 40.2ns ± 1% +7.90% (p=0.000 n=9+9) GoMapClear/Reflexive/1000-16 345ns ± 2% 433ns ± 1% +25.73% (p=0.000 n=10+10) GoMapClear/Reflexive/10000-16 2.62µs ± 1% 3.94µs ± 1% +50.33% (p=0.000 n=10+10) GoMapClear/NonReflexive/1-16 80.2ns ± 2% 62.8ns ± 2% -21.61% (p=0.000 n=9+9) GoMapClear/NonReflexive/10-16 92.2ns ± 2% 72.5ns ± 2% -21.42% (p=0.000 n=10+10) GoMapClear/NonReflexive/100-16 216ns ± 2% 101ns ± 1% -53.49% (p=0.000 n=9+8) GoMapClear/NonReflexive/1000-16 2.13µs ± 1% 0.34µs ± 2% -84.02% (p=0.000 n=10+9) GoMapClear/NonReflexive/10000-16 16.4µs ± 1% 2.1µs ± 1% -86.96% (p=0.000 n=10+8) MapStringConversion/32/simple-16 7.79ns ± 1% 12.22ns ± 1% +56.92% (p=0.000 n=9+9) MapStringConversion/32/struct-16 7.72ns ± 1% 12.32ns ± 2% +59.70% (p=0.000 n=9+10) MapStringConversion/32/array-16 7.72ns ± 3% 12.00ns ± 3% +55.34% (p=0.000 n=10+9) MapStringConversion/64/simple-16 7.47ns ± 1% 11.77ns ± 3% +57.65% (p=0.000 n=9+10) MapStringConversion/64/struct-16 7.49ns ± 1% 11.94ns ± 2% +59.38% (p=0.000 n=9+10) MapStringConversion/64/array-16 7.50ns ± 1% 11.92ns ± 2% +58.89% (p=0.000 n=9+10) MapInterfaceString-16 11.2ns ± 5% 13.7ns ±61% ~ (p=1.000 n=8+10) MapInterfacePtr-16 10.8ns ±29% 11.3ns ±52% ~ (p=0.684 n=10+10) NewEmptyMapHintLessThan8-16 6.54ns ± 1% 6.04ns ± 1% -7.56% (p=0.000 n=9+9) NewEmptyMapHintGreaterThan8-16 271ns ± 2% 268ns ± 1% ~ (p=0.062 n=10+9) MapPop100-16 10.8µs ± 6% 7.2µs ±14% -33.41% (p=0.000 n=10+10) MapPop1000-16 170µs ± 1% 101µs ± 1% -40.54% (p=0.000 n=9+9) MapPop10000-16 3.35ms ± 3% 1.74ms ± 4% -48.15% (p=0.000 n=10+9) MapAssign/Int32/256-16 10.1ns ± 4% 8.9ns ± 1% -12.13% (p=0.000 n=10+9) MapAssign/Int32/65536-16 22.0ns ± 1% 12.9ns ± 1% -41.11% (p=0.000 n=10+8) MapAssign/Int64/256-16 10.7ns ± 2% 8.6ns ± 2% -19.79% (p=0.000 n=8+9) MapAssign/Int64/65536-16 23.2ns ± 2% 13.1ns ± 2% -43.39% (p=0.000 n=10+9) MapAssign/Str/256-16 16.6ns ± 4% 13.1ns ± 3% -21.18% (p=0.000 n=10+9) MapAssign/Str/65536-16 27.5ns ± 3% 20.5ns ± 3% -25.66% (p=0.000 n=10+9) MapOperatorAssign/Int32/256-16 9.89ns ± 2% 8.69ns ± 3% -12.15% (p=0.000 n=9+9) MapOperatorAssign/Int32/65536-16 21.9ns ± 1% 12.8ns ± 3% -41.52% (p=0.000 n=10+10) MapOperatorAssign/Int64/256-16 10.9ns ± 3% 8.6ns ± 1% -21.20% (p=0.000 n=9+8) MapOperatorAssign/Int64/65536-16 22.4ns ± 0% 13.2ns ± 2% -41.15% (p=0.000 n=7+9) MapOperatorAssign/Str/256-16 1.43µs ± 2% 1.40µs ± 1% -1.93% (p=0.000 n=10+10) MapOperatorAssign/Str/65536-16 234ns ± 3% 237ns ± 4% ~ (p=0.138 n=10+10) MapAppendAssign/Int32/256-16 21.0ns ± 2% 20.1ns ± 4% -4.43% (p=0.000 n=8+10) MapAppendAssign/Int32/65536-16 38.5ns ± 2% 33.9ns ± 2% -12.00% (p=0.000 n=10+10) MapAppendAssign/Int64/256-16 21.4ns ± 3% 20.4ns ± 6% -4.26% (p=0.008 n=9+10) MapAppendAssign/Int64/65536-16 39.8ns ± 2% 34.9ns ± 1% -12.40% (p=0.000 n=10+8) MapAppendAssign/Str/256-16 52.8ns ± 7% 52.1ns ± 5% ~ (p=0.739 n=10+10) MapAppendAssign/Str/65536-16 66.5ns ± 2% 67.7ns ± 1% +1.94% (p=0.004 n=9+9) MapDelete/Int32/100-16 27.7ns ± 2% 17.8ns ± 2% -35.77% (p=0.000 n=10+10) MapDelete/Int32/1000-16 22.6ns ± 2% 12.8ns ± 1% -43.31% (p=0.000 n=9+9) MapDelete/Int32/10000-16 25.0ns ± 2% 15.2ns ± 1% -39.33% (p=0.000 n=7+10) MapDelete/Int64/100-16 29.8ns ± 3% 17.6ns ± 2% -40.98% (p=0.000 n=10+9) MapDelete/Int64/1000-16 23.3ns ± 2% 13.1ns ± 1% -43.77% (p=0.000 n=10+8) MapDelete/Int64/10000-16 25.6ns ± 2% 16.1ns ± 1% -37.13% (p=0.000 n=9+9) MapDelete/Str/100-16 38.4ns ± 4% 24.8ns ± 2% -35.29% (p=0.000 n=10+10) MapDelete/Str/1000-16 30.8ns ± 2% 20.7ns ± 1% -32.72% (p=0.000 n=10+10) MapDelete/Str/10000-16 35.8ns ± 2% 24.4ns ± 2% -31.95% (p=0.000 n=10+10) MapDelete/Pointer/100-16 30.7ns ± 2% 21.0ns ± 1% -31.69% (p=0.000 n=10+10) MapDelete/Pointer/1000-16 24.5ns ± 2% 16.6ns ± 1% -32.37% (p=0.000 n=10+10) MapDelete/Pointer/10000-16 27.1ns ± 3% 19.8ns ± 2% -26.87% (p=0.000 n=10+10) |
Change https://go.dev/cl/426614 mentions this issue: |
cc @golang/runtime |
cc @randall77 Overall this sounds great. The main concern I have up front here is that growing happens all at once rather than incrementally. Could you explain more why that is and if it is possible to do so incrementally? |
Do iterators get invalidated when the map is modified? This is common for open-addressing hash tables. An important feature of the current map is that you can modify the map while iterating and the iterator still works. |
I checked the implementation and it does look like iterators stay valid when the map is modified. |
The semantics of Go iterators basically mean you can't move items in the map data structure once you've placed them. So, for instance, you can't move a valid item to fill in a deleted one - you need to always use a tombstone. But other than that, iterators don't add much of a restriction to the implementation. |
This is really exciting. Thanks for doing all of this work.
There are many slowdowns in these benchmarks, some quite significant. Do you understand the source of these slowdowns? Is it something that can be fixed?
This is an important property to maintain because it can significantly impact the tail latency of services. We've done a lot of work throughout the Go runtime over the years to avoid unpredictable performance spikes and it would be a shame to regress here. I'm not personally familiar with the details of SwissTable. Is incremental rehashing something that can be done and just hasn't been implemented yet, or is it fundamentally difficult? If it's fundamentally difficult to do directly in SwissTable, I've been kickaround the the idea of adopting aspects of extendible hashing for a while, which could be a general solution to incremental resizing. I should probably just file an "idea" issue about this, but I'll try to lay it out here. The idea is that, once a hash table grows beyond some size threshold based on bounding how long it takes to resize a map, you switch to a two-level scheme. I'm guessing that threshold would be around 4MiB, but that would have to be determined experimentally. In the two-level scheme, the top level is an index array keyed by the top k bits of the hash, which then points to map shards, which are individual SwissTables. The Wikipedia article I linked has some good diagrams for visualizing this. Each shard s contains only the items whose top js <= k bits are all identical, and multiple index entries will point to the same shard if that shard's js < k. When an individual shard overflows the threshold size, you split just that shard into two new shards s1 and s2 with js1 = js2 = js+1, and update the index. If js1 > k, then you also double the index array in order to increase k by 1. This would enable incremental growth and bound the resize cost paid by any single map operation. Iterators can continue to work while a map is resized by holding on to the pre-resize shard they are currently iterating (let the GC collect this once all iterators have moved past it). It would also reduce overall memory footprint because the total memory size of a map doesn't have to be a power of two (even if each individual shard is). It would result in less memory fragmentation caused by large maps because it limits the size of each individual allocation. And it would address problems with our current map resizing algorithm where, if a map is grown to a size that puts it into resizing mode, but then modifications stop and the application switches to only reading from the map, accesses are relatively slow and the map continues to consume additional memory (#51410). |
I placed this in the Unplanned milestone for now since it seems like there's still a bunch to discuss before landing it (it's non-committal). I assigned it to @zhangyunhao116 since you're driving the effort at the moment. |
Thanks for your advice! @aclements We hope to get more feedback from the community and the Go team.
Many reasons affect the performance degradation of these benchmarks(for basic version):
I think it is fundamentally difficult since a single empty slot in SwissTable can break the search process(https://faultlore.com/blah/hashbrown-tldr/#deletion).
That sounds like a great idea! I'll think about this idea carefully later. |
I like to use small maps for many purposes, so it worries me a bit that SwissTable seems to perform worse for them. Maybe a special case is needed for small maps? |
Yes, but it may have to consider carefully(optimizing the small map may cause performance impact in other situations). SwissTable does not always cause worse performance for small maps, and we can see the cases in the benchmark(e.g., in most key miss scenarios, the performance improves instead). Could you please tell me the cases where the small maps are often used? Some of the slowdowns in the benchmarks can be fixed. PatchSet 10 should fix the slowdown of |
Well I use small maps often in stead of case statements, or for mapping a few input file names to processors, or for lookups such as os.Substitute can do. I am looking forward to the improved results. |
Thanks for the detailed breakdown of the performance losses! My take is that there are two things that need to happen on that front:
You mentioned that SwissTable is better for the "miss" case than the "hit" case. If I understand your layout correctly (I haven't looked closely), you're already packing the tophash and the data together, so you'd expect one cache miss for a hit. Is that right? Does the SIMD version of SwissTable level the field between the hit and miss cases at all? |
Stepping back, I think there are three high-level potential concerns to address with this: Incremental resizing. We've already been discussing this above. :) DoS prevention. Currently, Go maps have fairly strong denial-of-service properties. The main concern is if an attacker can control the placement of items to induce collisions, leading to n^2 behavior. Having a good, per-map seed that is factored into the hash computation (not just combined at the end) goes a long way toward preventing this. Beyond this, the concern becomes leaking too much information about long-lived maps such that an attacker could reconstruct the seed. Having non-deterministic iteration order helps a lot here. A third problem is that copying elements one-by-one from one map into another, newly allocated map can easily go n^2, but having a per-map seed also mitigates that. I think all of these mechanisms are still in place in your implementation, but wanted to confirm that. SIMD, specifically how to actually use SIMD instructions in the map implementation. The most straightforward approach is to write the SIMD bits in assembly, but that has significant maintainability problems and may even perform worse because of the cost of transitioning to and from assembly. So, that suggests adding compiler intrinsics. General SIMD intrinsics are a wide-open area right now and I don't think we want to block the map implementation on solving that problem. :) A lot of proposals have floated around (admittedly, some of which have not been published). We could start trying them out just in the runtime. Or, if I understand correctly, SwissTable doesn't need very many or any particularly fancy SIMD operations, so maybe we just hack in some intrinsics (which we could use as a starting point for more general exploration). |
Another option that would help alleviate most n^2 concerns is having a per-map seed that changes when the map grows/shrinks. |
This is challenging with incremental map growth. FWIW, I believe we do opportunistically reset the seed when the length hits zero. |
Thanks for all of the feedback :)
PatchSet-11 adding the optimization of There are still two optimizations that can be added
Indeed. We can add some new benchmarks to it. I feel like adding the new one into https://github.com/zhangyunhao116/gomapbench first and then migrating some of it into the runtime is a good idea.
Yes! All of these mechanisms are still in the current implementation, including the non-deterministic iteration and resetting the map's seed when the length hits zero.
We planned to write the SIMD directly in assembly, but the experiments outside the runtime show that their performance is almost the same. I wonder if we can use ABIInternal can make the assembly run faster? I think SIMD intrinsics is a great idea! Agree that we can start trying SIMD intrinsics just in the runtime. After some basic optimizations are done, I will start to implement the SIMD version :) Most SIMD in the SwissTable is not very tricky except the |
Another challenge of SIMD is that its default bucket size is 16(the basic version is 8, which is the same as before), which means memory consumption may increase significantly in some cases. |
Sounds good. Possibly we should fold your package into bent, so they run as part of the performance dashboard.
Perfect!
Yeah, this doesn't surprise me if you're jumping to assembly just to run a few instructions. ABIInternal would certainly improve that, since it would probably allow the entire call to be registerized. I'm not sure exactly what operations you need, but if they're taking or returning vector values there's still going to be some overhead since ABIInternal doesn't know anything about passing vector registers. I'm not sure how much effort it is to add a few one-off SIMD intrinsics. You just need SSE, right, not AVX? The operations themselves are probably not hard to add. The register allocator already knows about the XMM registers, but only as non-vector 64-bit floating-point. That mostly shouldn't matter unless it decides to spill one (thanks @cherrymui for pointing that out). If you need AVX, things may get a lot more complicated, or maybe not since the YMM registers are just aliased onto the XMM registers, so we would just call them X registers anyway.
It's hard to say how concerning this is. Sure, if an application has a huge number of small maps, perhaps with large key or value types (though they can't go over 128 bytes each), this could add up to a lot. But the relative overhead of this decreases as map size increases. I'll see if we can get some data on this. Are you considering the trick described in this video, where you need to allocate 16 bytes of control (or a little more for unaligned groups) but the actual bucket array can be smaller? The net result of that might actually be that the tables would be smaller than current tables. |
This is exciting, but before going too far down the optimization path, it seems to me it’d be good to figure out incremental growth, as that could be a dealbreaker. |
I asked @mcy for useful references to learn more about SwissTables and he pointed me to these:
|
I totally agree with @josharian . Making sure there's a workable incremental growth story is higher priority than SIMD optimizations.
I have not had a chance to look closely at your code yet, but I just realized you mentioned interleaving the control bytes and the values, so you must be using aligned groups. De-interleaving them may be worth exploring, but again incremental growth is more important to figure out. |
Hi there, this is indeed quite exciting! Two other references I found useful:
(My possibly incorrect understanding of the history is the Facebook team was inspired by the 2017 CppCon Swisstable talk, but before any Swisstable code was released to the public, the Facebook team implemented their own version of the core idea along with some different tradeoffs. One difference of interest might be the recommended F14FastMap variant that based on entry size chooses between values inline vs. values packed in a contiguous array). Also, FWIW, I had an older rough Swisstable implemented outside the runtime that used SSE for the 16-way probing, but which did not support incremental growth. Inspired by this issue here, I dusted it off recently and I am attempting to add incremental growth. I have a sketch of how it could work, but it is a bit subtle, so before adding too much noise here, I am trying to implement at least a first cut of incremental growth. I will hopefully post more details here within a few days (which might be "whoops, sorry, flawed idea"). Finally, just to be clear, what I am doing is a simple prototype. I don't want to derail any work on the current CLs, and I would be more than happy for the talented team at ByteDance to be the ones to make something real in the runtime. ;-) |
Change https://go.dev/cl/611189 mentions this issue: |
Change https://go.dev/cl/616457 mentions this issue: |
Change https://go.dev/cl/616455 mentions this issue: |
Change https://go.dev/cl/616459 mentions this issue: |
Change https://go.dev/cl/616466 mentions this issue: |
Change https://go.dev/cl/611187 mentions this issue: |
Change https://go.dev/cl/616463 mentions this issue: |
Change https://go.dev/cl/616460 mentions this issue: |
Change https://go.dev/cl/611188 mentions this issue: |
Change https://go.dev/cl/616461 mentions this issue: |
Change https://go.dev/cl/618016 mentions this issue: |
Change https://go.dev/cl/618535 mentions this issue: |
Add a new package that will contain a new "Swiss Table" (https://abseil.io/about/design/swisstables) map implementation, which is intended to eventually replace the existing runtime map implementation. This implementation is based on the fabulous github.com/cockroachdb/swiss package contributed by Peter Mattis. This CL adds an hash map implementation. It supports all the core operations, but does not have incremental growth. For #54766. Change-Id: I52cf371448c3817d471ddb1f5a78f3513565db41 Reviewed-on: https://go-review.googlesource.com/c/go/+/582415 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change https://go.dev/cl/619035 mentions this issue: |
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10 Change-Id: I0a928c4b1e90056c50d2abca8982bdb540c33a34 Reviewed-on: https://go-review.googlesource.com/c/go/+/619035 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
json-iterator heavily relies on an unmaintained modern-go/reflect2 that pokes at the internals of Go runtime. This was shown to cause failures in golang/go#54766 (comment). Switching to encoding/json does come with a performance penalty. Updates #9380
Change https://go.dev/cl/620215 mentions this issue: |
Change https://go.dev/cl/620217 mentions this issue: |
Change https://go.dev/cl/620216 mentions this issue: |
Use the new SwissTable-based map in internal/runtime/maps as the basis for the runtime map when GOEXPERIMENT=swissmap. Integration is complete enough to pass all.bash. Notable missing features: * Race integration / concurrent write detection * Stack-allocated maps * Specialized "fast" map variants * Indirect key / elem For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639 Reviewed-on: https://go-review.googlesource.com/c/go/+/594596 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/621116 mentions this issue: |
Change https://go.dev/cl/621115 mentions this issue: |
Based on the benchmarks in github.com/cockroachlabs/swiss. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I9ad925d3272c671e21ec04eb2da5ebd8f0fc6a28 Reviewed-on: https://go-review.googlesource.com/c/go/+/596295 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
Extendible hashing splits a swisstable map into many swisstables. This keeps grow operations small. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Id91f34af9e686bf35eb8882ee479956ece89e821 Reviewed-on: https://go-review.googlesource.com/c/go/+/604936 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
Change https://go.dev/cl/621615 mentions this issue: |
CL 594596 already did this for regabi, but missed non-regabi. Stack allocated swiss maps don't call rand32. For #54766. Change-Id: I312ea77532ecc6fa860adfea58ea00b01683ca69 Reviewed-on: https://go-review.googlesource.com/c/go/+/621615 Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: I6695c0b143560d974b710e1d78e7a7d09278f7cc Reviewed-on: https://go-review.googlesource.com/c/go/+/620215 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/622376 mentions this issue: |
Change https://go.dev/cl/622377 mentions this issue: |
Change https://go.dev/cl/622042 mentions this issue: |
Abstract
We suggest using SwissTable in the runtime to replace the original implementation of the hashmap.
SwissTable is now used in abseil and rust by default. We borrowed ideas from the two implementations and combined some optimization ideas of Go's original hashmap to construct the current implementation.
See the following comment for performance comparison(the GitHub issue has a character limit).
Motivations
We need to improve the performance of the hashmap as much as possible. At ByteDance, our Go service consumes about 4% of the CPU on hashmap. It is worth noting that on many services, the CPU consumption of
mapassign
andmapaccess
are almost 1:1, so the improvement of insertion performance is equally important.Support dynamic adjustment of the capacity of the hashmap. Some Go services will make a map too large due to burst traffic, which will cause a lot of pressure on the memory, but the map does not shrink after elements removal. We found that there are also many related discussions in the community:
Implementations
Basic version: https://go-review.googlesource.com/c/go/+/426614
SIMD version: TODO
Tasks
Advantages compared to the original implementation
The overall performance is better, and we can also use SIMD to improve performance.
SwissTable's LoadFactor can be set higher to save some memory.
Open-addressing and making dynamic adjustments to the capacity is easier to implement.
After allocating a fixed-size hashmap, if the number of inserted elements does not exceed the capacity, no additional memory is required, and the performance will be significantly improved, which has a greater advantage over reused fixed-size hashmap. (The original hashmap may still allocate additional memory even when the limit is not exceeded)
Disadvantages compared to the original implementation
The rehash is done at once, so
Since SwissTable needs to ensure that each bucket has at least one empty slot, it will cause performance degradation in some cases. For example, if we insert an element when there are seven elements in the hashmap, the CPU consumption will significantly increase because the hashmap needs to grow.
Implementation details
Here is an overview of the basic version. For more detailed implementation, you can read the code. Before reading the code, it is recommended to check the video or the article in the references to get a general idea of how it works.
Differences with SwissTable
tophash
into the bucket, not as an array alone.The classic SwissTable aggregates all tophash into an array (also called control bytes array, here we follow the convention, called tophash), while the original hashmap of Go puts tophash and items in the same bucket.
Our experiments outside the runtime found that if tophash is used as an array alone, we couldn’t find any performance wins. This is probably because we need two memory allocations when initializing the hashmap.
So we use the bucket memory layout almost the same as the original in the current version. In the future, we may consider aggregating all tophash into an array, using a data structure similar to
sync.Pool
, so that all tophash of the hashmap can be reused to improve performance.Differences with the original implementation
For hashmap header (
hmap
) andhiter
For bucket (
bmap
)The tophash is still 8-bit(uint8), but 7-bit for hash info and 1-bit as flags.
Removed overflow pointer.
For the way of querying
For initializing bucket arrays
0xff
instead of0
;0xff
represents the empty slot now.References
C++: https://github.com/abseil/abseil-cpp/blob/master/absl/container/internal/raw_hash_set.h
Rust: https://github.com/rust-lang/hashbrown
The text was updated successfully, but these errors were encountered: