Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy trait defeats generator optimization #62952

Open
tmandry opened this issue Jul 24, 2019 · 2 comments
Open

Copy trait defeats generator optimization #62952

tmandry opened this issue Jul 24, 2019 · 2 comments
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tmandry
Copy link
Member

tmandry commented Jul 24, 2019

The following example (playground):

#![feature(generators, generator_trait)]

use std::ops::Generator;

#[derive(Clone)]
struct Foo([u8; 1024]);

fn overlap_foo() -> impl Generator<Yield = (), Return = ()>{
    static || {
        let x = Foo([0u8; 1024]);
        yield;
        drop(x);
        let y = Foo([0u8; 1024]);
        yield;
        drop(y);
    }
}

#[derive(Clone, Copy)]
struct Bar([u8; 1024]);

fn overlap_bar() -> impl Generator<Yield = (), Return = ()>{
    static || {
        let x = Bar([0u8; 1024]);
        yield;
        drop(x);
        let y = Bar([0u8; 1024]);
        yield;
        drop(y);
    }
}

fn main() {
    dbg!(std::mem::size_of_val(&overlap_foo()));
    dbg!(std::mem::size_of_val(&overlap_bar()));
}

Outputs:

[gen-explicit-drop.rs:34] std::mem::size_of_val(&overlap_foo()) = 1028                                                                                                                                                                                                                                                       
[gen-explicit-drop.rs:35] std::mem::size_of_val(&overlap_bar()) = 2052  

The only difference between Foo and Bar is that Foo is Copy. This defeats the generator optimization implemented in #61922 which considers any local that has been moved from as a candidate for overlap.

Technically, this is semantically consistent with the existing overlap behavior for generators. But making a type Copy and getting worse performance is a footgun that we should fix.

This can be fixed with a pass that turns unnecessary copies into moves in MIR.

@jonas-schievink jonas-schievink added A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 24, 2019
@Centril
Copy link
Contributor

Centril commented Jul 25, 2019

This can be fixed with a pass that turns unnecessary copies into moves in MIR.

I thought that's what https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/copy_prop/index.html does?

cc @oli-obk

@Centril Centril added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Jul 25, 2019
@tmandry
Copy link
Member Author

tmandry commented Jul 30, 2019

I thought that's what https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/copy_prop/index.html does?

Nice find, this does indeed do the required transformation. Running that (along with simplify_locals) right before the generator transform almost does the trick.

The problem is that we currently eliminate all storage markers as part of copy_prop, and this defeats the generator optimization. If we merged the live ranges of the two variables instead, it should work.

(EDIT: simplify_locals is not needed)

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants