Changes
🚀 Enhancements
- Further speed up vectorized array/span initialization (improves `ToList()` too) (9a667be)
Loop body comparison:
| Old | New | New Aarch64 |
|-------------------------------|-------------------------------|-----------------------------------|
| M02_L01: | M02_L01: | G_M000_IG04: |
| vmovd xmm1,r9d | vmovupd [rax],ymm2 | str q18, [x1] |
| vpbroadcastd ymm1,xmm1 | vmovupd [rax+20],ymm0 | str q16, [x1, #0x10] |
| vpaddd ymm1,ymm1,ymm0 | vpaddd ymm2,ymm2,ymm1 | add v18.4s, v18.4s, v17.4s |
| add r9d,r11d | vpaddd ymm0,ymm0,ymm1 | add v16.4s, v16.4s, v17.4s |
| vmovd xmm2,r9d | add rax,40 | add x1, x1, #32 |
| vpbroadcastd ymm2,xmm2 | add r9d,10 | add w5, w5, #8 |
| vpaddd ymm2,ymm2,ymm0 | cmp r11d,r9d | cmp w4, w5 |
| add r9d,r11d | jg short M02_L01 | bgt G_M000_IG04 |
| movsxd r10,esi | | |
| vmovupd [rax+r10*4],ymm1 | | |
| lea r10d,[rsi+8] | | |
| movsxd r10,r10d | | |
| vmovupd [rax+r10*4],ymm2 | | |
| add esi,10 | | |
| cmp edi,esi | | |
| jg short M02_L01 | | |
|-------------------------------|-------------------------------|-----------------------------------|
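For readers who prefer source to disassembly, the shape of the new loop body can be sketched in C# roughly as follows. This is an illustrative approximation, not the library's actual code: `RangeFillSketch` and `Fill` are hypothetical names, and the sketch uses the portable `System.Numerics.Vector<int>` API, which may differ from what the commit uses. It assumes the x64 listing's immediates are hexadecimal, so each iteration stores two vectors and advances by 16 elements.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration only; not part of RangeExtensions.
static class RangeFillSketch
{
    // Fills destination with start, start + 1, start + 2, ...
    public static void Fill(Span<int> destination, int start)
    {
        int width = Vector<int>.Count;   // lanes per vector (8 with AVX2, 4 with NEON)
        int stride = width * 2;          // two vectors are stored per iteration
        int i = 0;

        if (Vector.IsHardwareAccelerated && destination.Length >= stride)
        {
            // Build [0, 1, ..., width - 1] once, outside the loop.
            Span<int> lanes = stackalloc int[width];
            for (int lane = 0; lane < width; lane++)
                lanes[lane] = lane;

            // Two running accumulators plus a constant step vector. The loop
            // body is then just two stores and two vector adds, with no
            // per-iteration broadcast of a scalar counter.
            Vector<int> first = new Vector<int>(lanes) + new Vector<int>(start);
            Vector<int> second = first + new Vector<int>(width);
            Vector<int> step = new Vector<int>(stride);

            for (; i <= destination.Length - stride; i += stride)
            {
                first.CopyTo(destination.Slice(i));
                second.CopyTo(destination.Slice(i + width));
                first += step;
                second += step;
            }
        }

        // Scalar tail for the elements that do not fill a whole stride.
        for (; i < destination.Length; i++)
            destination[i] = start + i;
    }
}
```

The difference from the "Old" column is that the step vector is computed once and the accumulators stay in registers, so the broadcast, sign-extension and indexed-address instructions drop out of the loop.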
🗒️ Notes
If you are interested, make sure to check the past two releases for more details!
https://github.com/neon-sunset/RangeExtensions/releases/tag/2.0.0
https://github.com/neon-sunset/RangeExtensions/releases/tag/2.1.0
Full Changelog: 2.1.0...2.1.1
Published with dotnet-releaser