Broken loop unrolling in branchy loop #117659

Open
dragostis opened this issue Nov 7, 2023 · 1 comment
Labels
A-codegen Area: Code generation A-iterators Area: Iterators C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@dragostis

I've been looking at why some iterators fail to optimize well (#38038 & #80416), and it seems to boil down to branches in the next() call, which break loop unrolling. For context, these seem to be some of the lowest-hanging fruit in the Iterator API, which otherwise almost always ends up generating ideal assembly.
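To make the pattern concrete, here is a minimal sketch of an iterator whose next() contains this kind of branch. `TwoSlices` and `sqrt_all` are hypothetical names made up for illustration; they are not taken from the linked issues or from std, and the codegen of this exact sketch has not been checked.

```rust
/// Hypothetical iterator over two slices, one after the other.
/// (Illustrative only; not a std type.)
pub struct TwoSlices<'a> {
    current: std::slice::IterMut<'a, f32>,
    pending: Option<std::slice::IterMut<'a, f32>>,
}

impl<'a> TwoSlices<'a> {
    pub fn new(first: &'a mut [f32], second: &'a mut [f32]) -> Self {
        Self {
            current: first.iter_mut(),
            pending: Some(second.iter_mut()),
        }
    }
}

impl<'a> Iterator for TwoSlices<'a> {
    type Item = &'a mut f32;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Hot path: the current slice still has elements.
            if let Some(val) = self.current.next() {
                return Some(val);
            }
            // Cold path: switch to the pending slice at most once;
            // `?` returns None when there is nothing left to switch to.
            self.current = self.pending.take()?;
        }
    }
}

/// Every element goes through the branch in `next()` above.
pub fn sqrt_all(first: &mut [f32], second: &mut [f32]) {
    for val in TwoSlices::new(first, second) {
        *val = val.sqrt();
    }
}
```

The switch to the second slice happens only once, but the check for it sits on every call to next(), which is the shape of loop discussed here.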

This fails to unroll:

const LEN: usize = 100_000;

pub fn compute(vals0: &mut [f32; LEN / 2], vals1: &mut [f32; LEN / 2]) {
    struct Iter<'a> {
        vals: &'a mut [f32; LEN / 2],
        i: usize,
    }

    let mut iter = Iter { vals: vals0, i: 0 };
    let mut iters = Some(Iter { vals: vals1, i: 0 });

    loop {
        // Adding a likely hint here doesn't change the codegen.
        if let Some(val) = iter.vals.get_mut(iter.i) {
            *val = val.sqrt();
            iter.i += 1;
        } else {
            if let Some(new_iter) = iters.take() {
                iter = new_iter;
            } else {
                break;
            }
        }
    }
}

... while this unrolls as expected:

const LEN: usize = 100_000;

pub fn compute(vals0: &mut [f32; LEN / 2], vals1: &mut [f32; LEN / 2]) {
    for vals in [vals0, vals1] {
        for val in vals {
            *val = val.sqrt();
        }
    }
}
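For comparison, the nested loops can also be written with iterator adapters. This is only a sketch, and `compute_chain` is a name chosen here. It relies on `for_each` being driven by `fold`, which `Chain` implements by folding each half in turn, so the per-element branch in next() is not on this path. Whether it produces the same assembly as the nested loops is an assumption that would need checking.

```rust
const LEN: usize = 100_000;

/// Same computation as the nested `for` loops, using internal iteration.
pub fn compute_chain(vals0: &mut [f32; LEN / 2], vals1: &mut [f32; LEN / 2]) {
    vals0
        .iter_mut()
        .chain(vals1.iter_mut())
        // `for_each` consumes the whole chain at once instead of
        // calling `next()` once per element.
        .for_each(|val| *val = val.sqrt());
}
```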

The first case can be unrolled manually:

const AMOUNT: usize = 128;
if let Some(vals) = iter.vals.get_mut(iter.i..iter.i + AMOUNT) {
    for val in vals {
        *val = val.sqrt();
    }
    iter.i += AMOUNT;
}

... but this isn't an option with Iterator::next, which yields a single item per call.
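The manual-unrolling fragment can be placed into the first loop roughly as follows. This is a sketch under assumptions rather than the original author's full code: `compute_chunked` is a made-up name, and the single-element fallback is added here because 128 does not evenly divide LEN / 2 = 50_000 (390 full chunks leave a tail of 80 elements).

```rust
const LEN: usize = 100_000;
// Chunk size from the fragment above.
const AMOUNT: usize = 128;

pub fn compute_chunked(vals0: &mut [f32; LEN / 2], vals1: &mut [f32; LEN / 2]) {
    struct Iter<'a> {
        vals: &'a mut [f32; LEN / 2],
        i: usize,
    }

    let mut iter = Iter { vals: vals0, i: 0 };
    let mut iters = Some(Iter { vals: vals1, i: 0 });

    loop {
        if let Some(vals) = iter.vals.get_mut(iter.i..iter.i + AMOUNT) {
            // Fast path: a full chunk is available, so the inner loop
            // has a fixed trip count and no extra branches.
            for val in vals {
                *val = val.sqrt();
            }
            iter.i += AMOUNT;
        } else if let Some(val) = iter.vals.get_mut(iter.i) {
            // Tail (assumption added for this sketch): fewer than
            // AMOUNT elements remain, process them one at a time.
            *val = val.sqrt();
            iter.i += 1;
        } else if let Some(new_iter) = iters.take() {
            // Current array is exhausted, move on to the second one.
            iter = new_iter;
        } else {
            break;
        }
    }
}
```

The chunked path needs access to a whole sub-slice at once, which is exactly what a per-element next() call does not expose.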

@dragostis dragostis added the C-bug Category: This is a bug. label Nov 7, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 7, 2023
@Jules-Bertholet
Contributor

@rustbot label A-iterators A-codegen I-slow

@rustbot rustbot added A-codegen Area: Code generation A-iterators Area: Iterators I-slow Issue: Problems and improvements with respect to performance of generated code. labels Nov 7, 2023
@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 7, 2023
4 participants