-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MIR] SwitchInt Everywhere #39456
[MIR] SwitchInt Everywhere #39456
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
So my take on this has traditionally been that I would prefer to do it as a kind of LOWERING step. That is, we would retain the semantic information when we build the tree, but lose it after dropck. However, I am not sure that this has a lot of value, and I'd be willing to go forward with this PR (after I give it a bit more of a look...), with the understanding that if, at some time later, we want to reason about the fact that some particular integer represents an enum discriminant, we may modify MIR to make that more direct (of course, one could also imagine tracking that fact via some kind of limited abstract interpretation or other analysis). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this feels pretty nice. I am definitely +1 on merging If and SwitchInt (and have been for a while). As I wrote earlier, merging Switch and SwitchInt also seems ok to me for now at least. =)
@@ -195,7 +209,7 @@ impl<'tcx> Rvalue<'tcx> { | |||
} | |||
} | |||
} | |||
&Rvalue::InlineAsm { .. } => None | |||
Rvalue::InlineAsm { .. } => None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, this is pre-existing and not a problem with this PR per se, but I think that every rvalue should have a type. The cases where we see None
here are, I think, basically bugs. For example, inline-asm should probably be a statement, if it has no result, and AggregateKind::Array
should store the type as part of its variant, to handle the empty case. Is there a strong reason for it not to be this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a strong reason for it not to be this way?
IIRC it has been this way all the way back to when first MIR patches landed. I have no idea why is it this way at the moment and I do not see any reason why it should stay an Rvalue
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, seems fine (but independent from this PR)
src/librustc/mir/tcx.rs
Outdated
}; | ||
Some(ty) | ||
} else { | ||
None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just always return int here. But I am biased (see below) against this function even returning an Option<>
to begin with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really depends on the semantics we pick for the Discriminant
rvalue. If we want it to be UB for non-Adt, it seems fine to me to return int
. Otherwise I’d just report a bug!
here.
src/librustc_trans/mir/rvalue.rs
Outdated
let enum_ty = discr_lvalue.ty.to_ty(bcx.tcx()); | ||
let discr_ty = rvalue.ty(&*self.mir, bcx.tcx()).unwrap(); | ||
let discr_type = type_of::immediate_type_of(bcx.ccx, discr_ty); | ||
// FIXME: inline this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why inline this? Is it the only place it is called? (Even so, why inline it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I’d expect would happen to this function once we finally remove the old trans. Since now MIR can express the Discriminant
rvalue, it is possible to port all the old trans bits which deal with discriminants (e.g. discriminant
intrinsic) to the MIR trans.
That being said, this FIXME should not exist currently, as there a fair amount of work involved in making this happen still.
src/librustc_typeck/collect.rs
Outdated
let hint = UncheckedExprHint(ty_hint); | ||
match ConstContext::new(ccx.tcx, body).eval(e, hint) { | ||
Ok(ConstVal::Integral(i)) => { | ||
// FIXME: eval should return an error if the hint is wrong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this FIXME talking about? I don't follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a preexisting FIXME, it is saying that const-evaluation should fail if hint is incorrect so the check below shouldn’t be necessary. I should double-check whether it is not fixed yet.
src/librustc/ty/layout.rs
Outdated
fn repr_discr(tcx: TyCtxt, ty: Ty, hints: &[attr::ReprAttr], min: i64, max: i64) | ||
-> (Integer, bool) { | ||
pub fn repr_discr(tcx: TyCtxt, hints: &[attr::ReprAttr], min: i128, max: i128) | ||
-> (Integer, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: formatting is unusual.
src/librustc/mir/mod.rs
Outdated
@@ -471,6 +464,7 @@ pub enum TerminatorKind<'tcx> { | |||
|
|||
/// Possible values. The locations to branch to in each case | |||
/// are found in the corresponding indices from the `targets` vector. | |||
// FIXME: ConstVal doesn’t quite make any sense here? Its a Switch*Int*. | |||
values: Vec<ConstVal>, | |||
|
|||
/// Possible branch sites. The length of this vector should be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we may want to adjust SwitchInt
here -- perhaps you did so but I didn't notice, but it so this comment is out of date. The current comment suggests you always need a "fall-through" case, but we don't want that now, right? Although I guess that we could code it up by loading the set of discriminants (let's call them D[i]
for i = 1...n
) and then generating branches for D[i]
where i in 1...n-1
, and using the fall through case for D[n]
. That is probably better, actually, as I'd prefer to keep this uniform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not out of date. That being said I think it could make sense to adjust SwitchInt
to be something like this instead:
values: Vec<(ConstVal, BasicBlock)>,
otherwise: Option<BasicBlock> // if None, is exhaustive Switch.
At the very least setup as above would make invariants obviously clear, rather than fuzzy like it is now. Does that sound sensible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current setup is nice as the last block always gets considered an otherwise
(without much effort, as you describe), but is also easy to overlook.
kind: TerminatorKind::SwitchInt { | ||
discr: Operand::Consume(discr), | ||
switch_ty: discr_ty, | ||
values: values, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to adopt the strategy I advocated above, then values
has one too many elements. In any case, this seems to be wrong today, unless the comment is out of date.
} | ||
debug!("num_enum_variants: {}, tested variants: {:?}, variants: {:?}", | ||
num_enum_variants, values, variants); | ||
// FIXME: WHY THIS DOES NOT WORK?! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, how does it fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails with an error that looks like this:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter 'tcx in function call due to conflicting requirements
--> src/librustc/mir/tcx.rs:177:43
|
177 | Some(adt_def.discr_ty.to_ty(tcx))
| ^^^^^
And I have no idea where the conflict arises – everything looks correct to me. The function being called looks like this:
impl IntTypeExt for attr::IntType {
fn to_ty<'a, 'gcx, 'tcx>(self, tcx: TyCtxt<'a, 'gcx, 'tcx>) -> Ty<'tcx> {
match self {
SignedInt(i) => tcx.mk_mach_int(i),
UnsignedInt(i) => tcx.mk_mach_uint(i),
}
}
}
Do you have an idea as to why it happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figured it out. Bounds on to_ty
are too relaxed.
src/librustc_mir/build/expr/into.rs
Outdated
@@ -120,15 +119,15 @@ impl<'a, 'gcx, 'tcx> Builder<'a, 'gcx, 'tcx> { | |||
this.cfg.terminate(block, source_info, TerminatorKind::SwitchInt { | |||
discr: lhs, | |||
switch_ty: this.hir.bool_ty(), | |||
values: vec![ConstVal::Bool(true)], | |||
values: BOOL_SWITCH_TRUE.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think I'd expect, when using switch, that we set it up as false => ..., _ => ...
, since in C one normally interprets "non-zero == true". Maybe it leads to marginally more efficiency assembly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acknowledged.
literal: Literal::Value { ref value }, .. | ||
}), ref values, ref targets, .. } => { | ||
if let Some(ref constint) = value.to_const_int() { | ||
let (otherwise, targets) = targets.split_last().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code suggests the "blocks 1 longer than values invariant" is still in effect.
fd2cc21
to
f2147bd
Compare
Could you add a |
src/librustc_typeck/collect.rs
Outdated
/// Find the appropriate type for the given signed discriminant range and #[repr] attribute. | ||
/// N.B.: u128 values above i128::MAX will be treated as signed, but that shouldn't affect | ||
/// anything, other than maybe debuginfo. | ||
pub fn repr_discr(tcx: TyCtxt, hints: &[attr::ReprAttr], min: i128, max: i128) -> attr::IntType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use the results of evaluating constants in collect
.
1f945cb
to
90b5c70
Compare
☔ The latest upstream changes (presumably #39463) made this pull request unmergeable. Please resolve the merge conflicts. |
f756897
to
1eff9d3
Compare
I feel like this is pretty much done ± the fixes for test failures. If the travis passes, its ready to be r+d. |
79b29b0
to
2087848
Compare
@@ -644,13 +645,13 @@ impl<'b, 'tcx> ElaborateDropsCtxt<'b, 'tcx> { | |||
variant_path, | |||
&adt.variants[variant_index], | |||
substs); | |||
self.drop_ladder(c, fields) | |||
(self.drop_ladder(c, fields), true) | |||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this logic to open_drop_for_adt
instead of messing with booleans?
targets: targets | ||
}) | ||
} | ||
let (values, targets, ret) = if switch_ty.sty == ty::TyBool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the old code with TerminatorKind::if_
?
☔ The latest upstream changes (presumably #39563) made this pull request unmergeable. Please resolve the merge conflicts. |
0b02d79
to
dfea053
Compare
// otherwise: Option<BasicBlock> // exhaustive if None | ||
// | ||
// However we’ve decided to keep this as-is until we figure a case | ||
// where some other approach seems to be strictly better than other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason that I didn't do it this way is so that one could easily get a [BasicBlock]
of places you might branch to. We could also just have the type-checker or other sanity checkers assert this property dynamically, which would help.
@bors r+ |
📌 Commit dfea053 has been approved by |
@bors r- |
So, I'm happy with this PR, but I want to give @rust-lang/compiler a chance to weigh in. I'll r+ a litle later today, unless anyone raises any blocking objections in the next few hours. =) |
📌 Commit dfea053 has been approved by |
7964f16
to
075ae04
Compare
Should be okay for 'nother go at bots. |
Because certain somebody sucks at resolving big conflicts
075ae04
to
b663d9d
Compare
@bors r=nikomatsakis |
📌 Commit b663d9d has been approved by |
@bors: retry |
@bors: r=nikomatsakis |
💡 This pull request was already approved, no need to approve it again. |
📌 Commit b663d9d has been approved by |
@bors: retry |
[MIR] SwitchInt Everywhere Something I've been meaning to do for a very long while. This PR essentially gets rid of 3 kinds of conditional branching and only keeps the most general one - `SwitchInt`. Primary benefits are such that dealing with MIR now does not involve dealing with 3 different ways to do conditional control flow. On the other hand, constructing a `SwitchInt` currently requires more code than what previously was necessary to build an equivalent `If` terminator. Something trivially "fixable" with some constructor methods somewhere (MIR needs stuff like that badly in general). Some timings (tl;dr: slightly faster^1 (unexpected), but also uses slightly more memory at peak (expected)): ^1: Not sure if the speed benefits are because of LLVM liking the generated code better or the compiler itself getting compiled better. Either way, its a net benefit. The CORE and SYNTAX timings done for compilation without optimisation. ``` AFTER: Building stage1 std artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Finished release [optimized] target(s) in 31.50 secs Finished release [optimized] target(s) in 31.42 secs Building stage1 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Finished release [optimized] target(s) in 439.56 secs Finished release [optimized] target(s) in 435.15 secs CORE: 99% (24.81 real, 0.13 kernel, 24.57 user); 358536k resident CORE: 99% (24.56 real, 0.15 kernel, 24.36 user); 359168k resident SYNTAX: 99% (49.98 real, 0.48 kernel, 49.42 user); 653416k resident SYNTAX: 99% (50.07 real, 0.58 kernel, 49.43 user); 653604k resident BEFORE: Building stage1 std artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Finished release [optimized] target(s) in 31.84 secs Building stage1 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu) Finished release [optimized] target(s) in 451.17 secs CORE: 99% (24.66 real, 0.20 kernel, 24.38 user); 351096k resident CORE: 99% (24.36 real, 0.17 kernel, 24.18 user); 352284k resident SYNTAX: 99% (52.24 real, 0.56 kernel, 51.66 user); 645544k resident SYNTAX: 99% (51.55 real, 0.48 kernel, 50.99 user); 646428k resident ``` cc @nikomatsakis @eddyb
☀️ Test successful - status-appveyor, status-travis |
If I’m reading the release-mode assembly correctly, before this change PropertyDeclaration::id is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ```
If I’m reading the release-mode assembly correctly, before this change PropertyDeclaration::id is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ```
If I’m reading the release-mode assembly correctly, before this change PropertyDeclaration::id is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ```
Rewrite PropertyDeclaration::id to help the optimizer. If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/15992) <!-- Reviewable:end -->
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae --HG-- extra : subtree_source : https%3A//hg.mozilla.org/projects/converted-servo-linear extra : subtree_revision : 80fb6158601e5f4715805013dd7c9492ceb38461
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae
If I’m reading the release-mode assembly correctly, before this change PropertyDeclaration::id is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ```
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae UltraBlame original commit: af90f70b05df8c81f440b222728ae48c2e2bf5f8
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae UltraBlame original commit: af90f70b05df8c81f440b222728ae48c2e2bf5f8
…imizer (from servo:id-table); r=bholley If I’m reading the release-mode assembly correctly, before this change `PropertyDeclaration::id` is implemented with a computed jump: ```assembly lea rcx, [rip + .LJTI117_0] movsxd rax, dword ptr [rcx + 4*rax] add rax, rcx jmp rax .LBB117_3: mov dword ptr [rdi], 65536 mov rax, rdi ret .LBB117_2: mov dword ptr [rdi], 0 mov rax, rdi ret .LBB117_4: mov dword ptr [rdi], 131072 mov rax, rdi ret .LBB117_6: mov dword ptr [rdi], 262144 mov rax, rdi ret .LBB117_7: mov dword ptr [rdi], 327680 mov rax, rdi ret ; Four similar lines repeated for each of the few hundred variants... ``` With Rust 1.15 (currently used for geckolib) this doesn’t change significantly. In Nightly 1.17 however, the compiled code uses a lookup table, possibly thanks to rust-lang/rust#39456. ```assembly movq (%rsi), %rax cmpq $171, %rax jne .LBB23_1 addq $8, %rsi movq %rsi, 8(%rdi) movb $1, %al jmp .LBB23_3 .LBB23_1: xorq $128, %rax leaq .Lswitch.table.6(%rip), %rcx movb (%rax,%rcx), %al movb %al, 1(%rdi) xorl %eax, %eax .LBB23_3: movb %al, (%rdi) movq %rdi, %rax retq ``` Source-Repo: https://github.com/servo/servo Source-Revision: 9e8e1a47241c6906b4f5777da0d04e3655982fae UltraBlame original commit: af90f70b05df8c81f440b222728ae48c2e2bf5f8
Something I've been meaning to do for a very long while. This PR essentially gets rid of 3 kinds of conditional branching and only keeps the most general one -
SwitchInt
. Primary benefits are such that dealing with MIR now does not involve dealing with 3 different ways to do conditional control flow. On the other hand, constructing aSwitchInt
currently requires more code than what previously was necessary to build an equivalentIf
terminator. Something trivially "fixable" with some constructor methods somewhere (MIR needs stuff like that badly in general).Some timings (tl;dr: slightly faster^1 (unexpected), but also uses slightly more memory at peak (expected)):
^1: Not sure if the speed benefits are because of LLVM liking the generated code better or the compiler itself getting compiled better. Either way, its a net benefit. The CORE and SYNTAX timings done for compilation without optimisation.
cc @nikomatsakis @eddyb