Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New lint: creation of huge array on the stack #4520

Open
Shnatsel opened this issue Sep 7, 2019 · 11 comments
Open

New lint: creation of huge array on the stack #4520

Shnatsel opened this issue Sep 7, 2019 · 11 comments
Labels
A-lint Area: New lints E-hard Call for participation: This a hard problem and requires more experience or effort to work on L-suggestion Lint: Improving, adding or fixing lint suggestions

Comments

@Shnatsel
Copy link
Member

Shnatsel commented Sep 7, 2019

The following code will segfault on playground due to stack overflow:

fn do_stuff() {
    let data = [0; 10000000];
}

The problem is that Rust arrays are created on the stack, but this array is too large to fit on the stack.

What's worse, the naive solution doesn't work either:

fn do_stuff() {
    let data = Box::new([0; 10000000]);
}

This still instantiates the array on the stack and segfaults. The proper solution is this:

fn do_stuff() {
    let data = vec![0; 10000000].into_boxed_slice();
}

This issue is particularly tricky if the array size is dynamic, and does not typically manifest on tests, resulting in unexpected crashes in production. Example:

fn do_stuff(len: usize) {
    let data = [0; len];
}

Here len can be set to an arbitrarily large number that would overflow the stack. Only length values of types u8, i8, u16, i16 are definitely safe. The solution is to use one of them or create the array on the heap as described above.

@llogiq
Copy link
Contributor

llogiq commented Sep 7, 2019

@Shnatsel your last example fails to compile because len is non-const. Once const generics are widespread, it could be a generic parameter, though, like

fn do_stuff<const N>() { // I probably got the syntax wrong here
    let _data = [0usize; N];
    ..
}

So we should detect large numbers – though how large is too large? Should we take the type into account?

@mati865
Copy link
Contributor

mati865 commented Sep 7, 2019

Stack size is different between systems and there is no way to confidently tell if array will fit without asking the kernel.

@Shnatsel
Copy link
Member Author

Shnatsel commented Sep 7, 2019

The default stack size in Rust is 2Mb: https://doc.rust-lang.org/std/thread/index.html#stack-size

Any dynamic length type larger than u16 will definitely allow stack overflow.

In fact, if you're making an array of u64, then even a u16 length is already potentially problematic: every u64 is 8 bytes, so you're taking up 512Kb out of the entire 2Mb allotment with this one array. But that's potentially, not definitely, so that's only applicable as a pedantic lint.

As for large fixed sizes, that's more of a rookie mistake, this code would crash more or less up front, so I'd err on the side of people knowing what they're doing and set a rather high limit such as 512Kb.

And yes, the type needs to be taken into account because the memory limit counts against size_of(T) * len, not just len. If you're putting huge sized structs in an array you could get an overflow with very few of them.

@jakubadamw
Copy link
Contributor

jakubadamw commented Sep 8, 2019

@Shnatsel, that's the default stack size for spawned threads. As the page says, "the stack size of the main thread is not determined by Rust".

I think a lint like this would best be made to sum all worst-case stack allocations within a given function/stack frame. It, of course, could not reason about the possible stack growth scenarios of the whole program but at the very least it could trigger on:

fn main() {
    let data1 = [0; 2000000];
    if f() {
        let data2 = [0; 2000000];
    }
}

As indeed it's impossible to know the maximum stack size of the system the program would run on, this would essentially be a lint against stack frames that are simply very large (where how large would be configurable like in some other lints) and not necessarily bound to overflow the stack.

@flip1995 flip1995 added L-suggestion Lint: Improving, adding or fixing lint suggestions E-medium Call for participation: Medium difficulty level problem and requires some initial experience. good-first-issue These issues are a good way to get started with Clippy A-lint Area: New lints labels Sep 10, 2019
@flip1995
Copy link
Member

flip1995 commented Sep 10, 2019

good first issue: A basic implementation of the lint with a configurable parameter

E-hard: Extending this to const generics and sum up required stack size in a bigger scope, like @jakubadamw suggested

@basil-cow
Copy link
Contributor

I would like to tackle this one.

@basil-cow
Copy link
Contributor

I am not sure about local variable allocation in Rust (I've read LLVM documentation and Rust Reference about local variables, but they left me with the feeling that it's not really specified). Is it ok to just sum up all the array allocations on the worst path that are binded to a variable + temp variable with max size?

@flip1995 flip1995 added E-hard Call for participation: This a hard problem and requires more experience or effort to work on and removed E-medium Call for participation: Medium difficulty level problem and requires some initial experience. labels Nov 13, 2019
@flip1995
Copy link
Member

I am not sure about local variable allocation in Rust

Yeah, me neither. I changed the classification to E-hard. No idea, why I thought, this is E-medium.

At first, only the "easy" implementation would be enough. So to just search for array allocations and check the size against a threshold.

Is it ok to just sum up all the array allocations on the worst path that are binded to a variable + temp variable with max size?

I would say that this is a good heuristic.

@cauebs
Copy link
Contributor

cauebs commented Oct 3, 2020

Hasn't this been implemented? https://rust-lang.github.io/rust-clippy/master/index.html#large_stack_arrays

Or is something still missing?

@ebroto
Copy link
Member

ebroto commented Oct 3, 2020

@cauebs only the "good first issue" part of this comment was implemented.

@cauebs
Copy link
Contributor

cauebs commented Oct 3, 2020

@ebroto Thanks! Missed that. Maybe the labels should be updated to reflect that, then. Or even close and open another issue to make it clearer.

@ebroto ebroto removed good-first-issue These issues are a good way to get started with Clippy hacktoberfest labels Oct 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: New lints E-hard Call for participation: This a hard problem and requires more experience or effort to work on L-suggestion Lint: Improving, adding or fixing lint suggestions
Projects
None yet
Development

No branches or pull requests

8 participants