Hi, I have a quick question regarding code generation with `code_asm`. I couldn't find anything related in the issues list, so I figured I would ask here.

When I assemble with

I get the result:

I.e. it uses `SUB (EAX, I32)`.

However, this can also theoretically be assembled with `SUB (R32, I8)`:

    0: 83 e8 0a    sub eax,0xa

which yields smaller generated code.

At first I expected the EAX-specific instruction to be faster, so I had a look at uops.info, and it appears that the non-specialized instruction is measured to be faster on modern CPUs.

Namely, on modern architectures like Zen 3, `SUB (R32, I8)` clocks in at `0.25` throughput and `SUB (EAX, I32)` clocks in at `0.33` (a higher value is worse). Intel CPUs follow the same trend.

I cross-referenced these values with the well-known Agner Fog CPU optimization guide, and they matched. However, Agner does not have measurements for `SUB (EAX, I32)` specifically, only for `SUB (R32, I8)`.

Note: In the Rust bindings, the code is auto-generated as:

(Picking `EAX` when possible)
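To make the size difference concrete, here is a minimal sketch that hand-encodes both forms and picks the shorter one when the immediate fits in a signed byte. It does not use iced at all, and the function names (`encode_sub_eax_imm32`, `encode_sub_r32_imm8`, `encode_sub_eax_shortest`) are hypothetical helpers made up for illustration. The opcode bytes (`2D id` and `83 /5 ib`) are the standard x86 encodings for these two forms.

```rust
/// Encode `SUB EAX, imm32` (opcode 2D id): 5 bytes, no ModRM byte.
fn encode_sub_eax_imm32(imm: i32) -> Vec<u8> {
    let mut bytes = vec![0x2D];
    // The 32-bit immediate is stored little-endian.
    bytes.extend_from_slice(&imm.to_le_bytes());
    bytes
}

/// Encode `SUB r/m32, imm8` (opcode 83 /5 ib) with a register operand: 3 bytes.
/// `reg` is the 3-bit register number (EAX = 0, ECX = 1, ...).
/// The CPU sign-extends the 8-bit immediate to 32 bits.
fn encode_sub_r32_imm8(reg: u8, imm: i8) -> Vec<u8> {
    assert!(reg < 8, "only the eight legacy 32-bit registers are handled");
    // ModRM: mod = 11 (register direct), reg field = 5 (the /5 opcode extension), rm = reg.
    let modrm = 0b1100_0000 | (5 << 3) | reg;
    vec![0x83, modrm, imm as u8]
}

/// Hypothetical selection policy: prefer the 3-byte imm8 form whenever the
/// immediate fits in a signed byte, otherwise fall back to the EAX-specific form.
fn encode_sub_eax_shortest(imm: i32) -> Vec<u8> {
    if (-128..=127).contains(&imm) {
        encode_sub_r32_imm8(0, imm as i8)
    } else {
        encode_sub_eax_imm32(imm)
    }
}
```

With this policy, `sub eax, 0xa` comes out as the 3-byte `83 e8 0a` shown above instead of the 5-byte EAX-specific form. For immediates outside the signed-byte range the EAX-specific form is still the shortest option, since the generic `SUB r/m32, imm32` form needs an extra ModRM byte.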