missed optimization with C constructors of sparse structures #40122
Comments
After playing around with it a little more, I see that clang also avoids the stack allocation and memcpy if I initialize any member besides the string to a nonzero value:
struct i2c_adapter {
    char name[48];
    int a[1000];
} s;
void hexium_attach(void) {
    s = (struct i2c_adapter){"h", {1}};
}
I think this is a bug very early in the optimization pipeline, within SROA. Comparing Arnd's second example with
$ clang -O0 -Xclang -disable-O0-optnone x0.c -emit-llvm -c -g0 -S -o x.0.ll
it looks like things start to diverge after the 4th pass (of many):
For x.1.ll:
For x.0.ll:
So the next thing for me to dig into is: why is SROA not able to correctly split the large alloca in this case?
I see Arnd submitted a few more kernel patches for this, as it seems to be a recurring issue. Playing with the example more, I think another issue is the assignment of char[] members via a string literal. I think that bug may be earlier, in Clang's codegen of LLVM IR. For example:
struct foo {
    char bar [1000];
} my_foo;
void bad_lazy_init(void) {
    my_foo = (struct foo){
        .bar = "h",
    };
}
produces a 4000B .asciiz string "h<3999 zeros>" but
produces much more concise code (and LLVM IR). For an AST node like:
in a compound literal, I wonder if we should try to optimize that in clang codegen, or tackle this strictly in SROA?
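To make the shape of that constant concrete: C zero-fills the remainder of a char array that is initialized from a shorter string literal, so the compound literal describes a single 'h' followed by zeros. The sketch below checks that property and compares the result with an in-place memset plus a one-byte store. The helper names init_via_literal, init_in_place, and tail_is_zero are hypothetical, chosen only for illustration, and nothing here is a claim about what clang actually emits.

```c
#include <stddef.h>
#include <string.h>

struct foo {
    char bar[1000];
};

/* Hypothetical helper: initialize through a compound literal,
 * mirroring bad_lazy_init() from the comment above. */
void init_via_literal(struct foo *f) {
    *f = (struct foo){
        .bar = "h",
    };
}

/* Hypothetical helper: produce the same bytes in place,
 * without naming a temporary object in the source. */
void init_in_place(struct foo *f) {
    memset(f, 0, sizeof(*f));
    f->bar[0] = 'h';
}

/* Returns 1 if every byte of bar after the first is zero, else 0. */
int tail_is_zero(const struct foo *f) {
    for (size_t i = 1; i < sizeof(f->bar); i++) {
        if (f->bar[i] != 0)
            return 0;
    }
    return 1;
}
```

Calling both helpers on two separately poisoned objects and comparing them with memcmp should show identical contents, which is the equivalence an optimizer would need in order to replace the large constant copy.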
Extended Description
I stumbled over a file with excessive stack usage in the Linux kernel when building with clang, but not with gcc, and reduced the test case to:
$ clang-8 -Werror -c hexium-gemini.c -Wframe-larger-than=100 -O2
hexium-gemini.c:5:6: error: stack frame size of 4056 bytes in function 'hexium_attach' [-Werror,-Wframe-larger-than=]
Comparing the output shows that clang not only allocates stack space for the extra copy of the structure, but also fails to optimize the string assignment:
https://godbolt.org/z/7jpy_y
I worked around it in the kernel sources by using an explicit memset and strscpy, but this seems like a generally useful optimization that could be added.
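A minimal sketch of that kind of workaround, assuming the struct layout from the comments above. strscpy() is a kernel-only API, so a hypothetical userspace stand-in named copy_bounded is used here; attach_in_place is likewise an illustrative name, not the actual kernel patch.

```c
#include <stddef.h>
#include <string.h>

struct i2c_adapter {
    char name[48];
    int a[1000];
};

/* Hypothetical userspace stand-in for the kernel's strscpy():
 * copies at most size - 1 bytes and always NUL-terminates. */
static void copy_bounded(char *dst, const char *src, size_t size) {
    if (size == 0)
        return;
    size_t n = strlen(src);
    if (n >= size)
        n = size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Initialize the object in place: zero it, then copy the name.
 * No temporary struct appears in the source, so there is nothing
 * for the compiler to materialize on the stack and memcpy over. */
void attach_in_place(struct i2c_adapter *s) {
    memset(s, 0, sizeof(*s));
    copy_bounded(s->name, "h", sizeof(s->name));
}
```

The resulting object holds the same values as the compound-literal assignment, but the zeroing and the string copy are spelled out explicitly instead of being left for the optimizer to recover.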