Remove dead code in fwht #275

AndersTrier · 2024-03-11T10:53:22Z

When reworking my Rust implementation of leopard16, I realized that the last loop in fwht is unreachable.

On the last iteration of the outermost loop, dist4 == order, after which we update the value of dist to dist4, making us unable to enter the body of the last loop.

`order` is always either 256 or 16384. Both are powers of 4, so `dist4` lands exactly on `order` in the final iteration:

In [2]: 4<<2<<2<<2
Out[2]: 256

In [5]: 4<<2<<2<<2<<2<<2<<2
Out[5]: 16384
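
The argument above can be checked with a small sketch that mirrors only the loop control of fwht, not its butterflies. The helper names `dist_after_main_loop` and `needs_final_layer` are hypothetical, and the guard `dist < m` on the trailing loop is an assumption based on how the leopard reference code is structured; the real implementation is Go, not Python.

```python
def dist_after_main_loop(m: int) -> int:
    """Return `dist` once the 4-at-a-time outer loop has finished for size m."""
    dist, dist4 = 1, 4
    while dist4 <= m:
        # The real loop runs the 4-point butterflies here; only the
        # control variables matter for the reachability argument.
        dist = dist4
        dist4 <<= 2
    return dist


def needs_final_layer(m: int) -> bool:
    """True when one 2-point layer is left over (assumed guard: dist < m)."""
    return dist_after_main_loop(m) < m


for m in (256, 16384):
    print(m, needs_final_layer(m))
```

Both sizes print `False`, since `dist` ends up equal to `m`. A size that is a power of 2 but not of 4, such as 512, would return `True`, which is the only case the removed loop could serve.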

On a related note: I tried to optimize fwht further by using 8 values at a time (fwht_8). It was slightly faster on x86, but slower on ARM.
fwht_16 was slower on both architectures.
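
For readers unfamiliar with the unrolling being discussed: handling 4 values at a time means fusing two butterfly layers into one pass, and an 8-value variant would fuse three. The following is a minimal sketch of the 4-value case checked against a plain layer-by-layer transform. It uses ordinary Python integers rather than the field arithmetic the library uses, and the names `fwht_radix2` and `fwht_radix4` are hypothetical.

```python
def fwht_radix2(a):
    """Reference transform: one butterfly layer per pass."""
    a = list(a)
    n = len(a)
    dist = 1
    while dist < n:
        for r in range(0, n, 2 * dist):
            for i in range(r, r + dist):
                x, y = a[i], a[i + dist]
                a[i], a[i + dist] = x + y, x - y
        dist *= 2
    return a


def fwht_radix4(a):
    """Two butterfly layers per pass, plus one leftover layer if needed."""
    a = list(a)
    n = len(a)  # assumed to be a power of two, n >= 2
    dist, dist4 = 1, 4
    while dist4 <= n:
        for r in range(0, n, dist4):
            for i in range(r, r + dist):
                t0, t1 = a[i], a[i + dist]
                t2, t3 = a[i + 2 * dist], a[i + 3 * dist]
                # First layer: pairs at distance dist.
                s0, d0 = t0 + t1, t0 - t1
                s1, d1 = t2 + t3, t2 - t3
                # Second layer: pairs at distance 2 * dist.
                a[i], a[i + 2 * dist] = s0 + s1, s0 - s1
                a[i + dist], a[i + 3 * dist] = d0 + d1, d0 - d1
        dist, dist4 = dist4, dist4 << 2
    # Leftover layer: only reached when n is not a power of 4.
    if dist < n:
        for i in range(dist):
            x, y = a[i], a[i + dist]
            a[i], a[i + dist] = x + y, x - y
    return a


data = list(range(16))
print(fwht_radix4(data) == fwht_radix2(data))
```

The two functions agree for any power-of-two length; for lengths that are powers of 4 (such as 16 here) the leftover branch in `fwht_radix4` is never taken.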


klauspost · 2024-03-11T11:14:34Z

Looks like a nice cleanup! btw, added your repo to the README since I assume they are compatible.

klauspost · 2024-03-11T11:37:09Z

wrt unrolling, yeah I settled on 4x since 8x didn't give a significant speedup and didn't want to choke old platforms with limited registers more than needed.

AndersTrier · 2024-03-11T11:38:07Z

> Looks like a nice cleanup! btw, added your repo to the README since I assume they are compatible.

Thanks!

I would have to test wrt. compatibility, as I don't think high rate and low rate are compatible, and sometimes you can choose either.

AndersTrier · 2024-03-11T11:46:06Z

> wrt unrolling, yeah I settled on 4x since 8x didn't give a significant speedup and didn't want to choke old platforms with limited registers more than needed.

Did you consider implementing it with SIMD?

I don't know if it's worth it, but I found some code to get you started if you want to give it a try ;)
https://github.com/catid/leopard/blob/master/docs/vector_fwht_4.txt
https://github.com/paritytech/reed-solomon-novelpoly/blob/df906e1ca27dc6c0c1b663f1653f57d4620f03dd/reed-solomon-novelpoly/src/field/inc_log_mul.rs#L118
https://stackoverflow.com/a/54133143

klauspost · 2024-03-11T12:26:06Z

Yeah, it could be done... It could actually be rather effective AFAICT. But it's been too long since I looked at this to judge if/when it would provide a speedup.

AndersTrier marked this pull request as draft March 11, 2024 11:00

AndersTrier added 2 commits March 11, 2024 12:08

fwht() doesn't need length of data parameter

1960068

as we're always working on an array of size `order`

Remove dead code in fwht

b32e610

AndersTrier force-pushed the AndersTrier/leopard_fwht_del_dead_code branch from 4b07afa to b32e610 Compare March 11, 2024 11:10

AndersTrier marked this pull request as ready for review March 11, 2024 11:14

klauspost merged commit 85a5e93 into klauspost:master Mar 11, 2024
13 checks passed
