Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang] Miscompilation concerning const pointer to file-scope compound literal #72784

Open
vchakhno opened this issue Nov 19, 2023 · 7 comments
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" miscompilation

Comments

@vchakhno
Copy link

Minimal reproducible example:

main.c

#include <stdio.h>

typedef struct {
	int foo;
}	global_t;

global_t *const global = &(global_t){0};

int main(void) {
	global->foo = 10;
	printf("%d\n", global->foo);
}

Expected behaviour:

The above code should print 10. The value being assigned also appears in the generated assembly, but isn't printed at execution.

Actual behaviour:

This code compiles successfully without warnings and prints 0.

Compiled with:

clang main.c -Wall -Wextra -Werror -pedantic -fsanitize=undefined,address

  • on version 17.0.0.1, target x86_64-unknown-linux-gnu on godbolt (link)
  • on version 14.0.0-1ubuntu1.1, target x86_64-pc-linux-gnu
  • on version 12.0.1-19ubuntu3, target x86_64-pc-linux-gnu

All three versions are affected.

I also compiled it on gcc in order to compare:
gcc main.c -Wall -Wextra -Werror -pedantic -fsanitize=undefined,address

  • on version 13.2.0 on godbolt (link)
  • on version 11.4.0-1ubuntu1~22.04

Both versions compile successfully and produce 10, the expected behaviour.

@github-actions github-actions bot added the clang Clang issues not falling into any other category label Nov 19, 2023
@shafik
Copy link
Collaborator

shafik commented Nov 29, 2023

Interestingly removing the const from the declaration of the pointer changes the behavior in clang: https://godbolt.org/z/zqzKv9E73

I am not sure what is going on here but @AaronBallman probably has a better idea

@AaronBallman
Copy link
Collaborator

Wow, that is a neat one! The behavior seems to have changed around Clang 4.0.0: https://godbolt.org/z/8s4ffY5Pf and the IR changed drastically. If I had to make a guess at what caused this change, I would guess it was this patch: 9648288 -- changing the literal from a global to a local also resolves the issue, so this seems to be related to lifetimes.

I continue to believe that the way we model compound literals is wrong in C (see #68746 (comment) for details) and I think this may be another demonstration of it. CC @rjmccall @efriedma-quic for opinions or ideas on how to resolve this. Perhaps there's a more narrow fix we can make, but I suspect we should be putting time into refactoring compound literals more broadly because I'm not certain our current approach works for C23's ability to add storage class specifiers to the compound literal.

@efriedma-quic
Copy link
Collaborator

This is not a CodeGen thing; it's constant evaluation misbehaving. For example:

typedef struct {
	int foo;
}	global_t;
global_t *const global = &(global_t){0};
_Static_assert(global->foo==0,"");

I'm not sure how much it helps to introduce CompouldLiteralVarDecls; you still need an expression in the AST to represent the point at which the contents of the CompoundLiteral are evaluated, and nothing else can actually refer to the CompouldLiteralVarDecl directly, so I don't see when you'd ever use a DeclRefExpr to refer to a CompoundLiteral. You could introduce a CompouldLiteralVarDecl anyway, but that doesn't seem to get you much benefit unless you want to introduce decls for everything that allocates space. I mean, do you want MaterializeTemporaryVarDecl for C++?

@AaronBallman
Copy link
Collaborator

This is not a CodeGen thing; it's constant evaluation misbehaving. For example:

typedef struct {
	int foo;
}	global_t;
global_t *const global = &(global_t){0};
_Static_assert(global->foo==0,"");

I'm not sure how much it helps to introduce CompouldLiteralVarDecls; you still need an expression in the AST to represent the point at which the contents of the CompoundLiteral are evaluated, and nothing else can actually refer to the CompouldLiteralVarDecl directly, so I don't see when you'd ever use a DeclRefExpr to refer to a CompoundLiteral. You could introduce a CompouldLiteralVarDecl anyway, but that doesn't seem to get you much benefit unless you want to introduce decls for everything that allocates space. I mean, do you want MaterializeTemporaryVarDecl for C++?

My thinking was that this would be modeled after OpaqueValueExpr; the CompoundLiteralExpr opaquely wraps a DeclRefExpr so that use of the CompoundLiteralExpr eventually references the hidden variable. That CompoundLiteralExpr would also be what stores the VarDecl for the hidden variable.

That said, I'm not tied to this strategy specifically. Do you have other ideas on how we would cleanly associate a storage class like thread_local with a compound literal given the way we codegen it today?

@efriedma-quic
Copy link
Collaborator

CompoundLiteralExpr already has an "isFileScope()" bit, which is effectively the storage class. We can extend it to return a proper storage class if we want. It would require a slight extension of the code to support thread_local.

My thinking was that this would be modeled after OpaqueValueExpr; the CompoundLiteralExpr opaquely wraps a DeclRefExpr so that use of the CompoundLiteralExpr eventually references the hidden variable. That CompoundLiteralExpr would also be what stores the VarDecl for the hidden variable.

That seems mostly like a refactoring. You can always just write the relevant code to handle "a VarDecl or a CompoundLiteralExpr". We don't have very much code specific to CompoundLiteral anyway.

@EugeneZelenko EugeneZelenko added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed clang Clang issues not falling into any other category labels Nov 30, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 30, 2023

@llvm/issue-subscribers-clang-frontend

Author: Vélimir Chakhnovski (vchakhno)

### Minimal reproducible example: `main.c` ```c #include <stdio.h>

typedef struct {
int foo;
} global_t;

global_t *const global = &(global_t){0};

int main(void) {
global->foo = 10;
printf("%d\n", global->foo);
}


### Expected behaviour:
The above code should print 10. The value being assigned also appears in the generated assembly, but isn't printed at execution.

### Actual behaviour:
This code compiles successfully without warnings and prints 0.

### Compiled with:
`clang main.c -Wall -Wextra -Werror -pedantic -fsanitize=undefined,address`
- on version `17.0.0.1`, target `x86_64-unknown-linux-gnu` on godbolt ([link](https://godbolt.org/z/1xqhqrqor))
- on version `14.0.0-1ubuntu1.1`, target `x86_64-pc-linux-gnu`
- on version `12.0.1-19ubuntu3`, target `x86_64-pc-linux-gnu`

All three versions are affected.

I also compiled it on gcc in order to compare:
`gcc main.c -Wall -Wextra -Werror -pedantic -fsanitize=undefined,address`
- on version `13.2.0` on godbolt ([link](https://godbolt.org/z/33xWvh4vK))
- on version `11.4.0-1ubuntu1~22.04`

Both versions compile successfully and produce `10`, the expected behaviour.
</details>

@rjmccall
Copy link
Contributor

How do we handle l-value references to temporaries in C++ constant evaluation? I don't think we make Decls for them all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" miscompilation
Projects
None yet
Development

No branches or pull requests

8 participants