Skip to content

[clang] Taking address of unreachable function can be used to obtain identical integers that compare unequal #60596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
SvizelPritula opened this issue Feb 8, 2023 · 4 comments · May be fixed by #109732
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc.

Comments

@SvizelPritula
Copy link

SvizelPritula commented Feb 8, 2023

Using __builtin_unreachable, it's possible to create a function a that compiles to zero Assembly instructions, like this:

void a() {
    __builtin_unreachable();
}

void b() {}

With -O1 or higher this compiles to:

a:
b:
        ret

The function a is pretty useless, since calling it will unconditionally result in undefined behaviour. It is, however, possible to take its address, like this:

#include <stdlib.h>
#include <stdio.h>

void a() {
    __builtin_unreachable();
}

void b() {}

int main() {
    size_t ap = (size_t) a;
    size_t bp = (size_t) b;

    printf("%zu %zu %d\n", ap, bp, ap == bp);
}

Executing this will reveal that ap and bp have identical values, since a and b have the same address. However, it will also show that ap == bp is false, which contradicts that.

My guess is that some optimization pass assumes that different functions have different addresses, which is required by the C standard.

This bug is unlikely to happen in real programs, since:
a) few programs have functions that have unconditionally undefined behaviour,
b) even fewer programs will take the address of such a function, and
c) fewer still programs compare function pointers.

__builtin_unreachable can also be replaced by other statements with undefined behaviour, such as for (int i=0; i>=0; i++);.

Tested with clang and clang++ 15.0.7 with an optimization level of 1.

@Endilll
Copy link
Contributor

Endilll commented Sep 29, 2023

Confirmed: https://godbolt.org/z/75z4zMTnd
CC @AaronBallman

@Endilll Endilll added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Sep 29, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 29, 2023

@llvm/issue-subscribers-clang-frontend

Using `__builtin_unreachable`, it's possible to create a function `a` that compiles to zero Assembly instructions, like this:
void a() {
    __builtin_unreachable();
}

void b() {}

With -O1 or higher this compiles to:

a:
b:
        ret

The function a is pretty useless, since calling it will unconditionally result in undefined behaviour. It is, however, possible to take its address, like this:

#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

void a() {
    __builtin_unreachable();
}

void b() {}

int main() {
    size_t ap = (size_t) a;
    size_t bp = (size_t) b;

    printf("%zu %zu %d\n", ap, bp, ap == bp);
}

Executing this will reveal that ap and bp have identical values, since a and b have the same address. However, it will also show that ap == bp is false, which contradicts that.

My guess is that some optimization pass assumes that different functions have different addresses, which is required by the C standard.

This bug is unlikely to happen in real programs, since:
a) few programs have functions that have unconditionally undefined behaviour,
b) even fewer programs will take the address of such a function, and
c) fewer still programs compare function pointers.

__builtin_unreachable can also be replaced by other statements with undefined behaviour, such as for (int i=0; i&gt;=0; i++);.

Tested with clang and clang++ 15.0.7 with an optimization level of 1.

@shafik
Copy link
Collaborator

shafik commented Sep 29, 2023

This is known problem with unreachable: #48943

Also see: https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205

Right now it does not seem like anyone has the bandwidth to tackle this issue.

I feel like this is not exactly a duplicate but if the OP feels like it is a close enough then feel free to close.

@Endilll Endilll added clang:codegen IR generation bugs: mangling, exceptions, etc. and removed clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 29, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 29, 2023

@llvm/issue-subscribers-clang-codegen

Using `__builtin_unreachable`, it's possible to create a function `a` that compiles to zero Assembly instructions, like this:
void a() {
    __builtin_unreachable();
}

void b() {}

With -O1 or higher this compiles to:

a:
b:
        ret

The function a is pretty useless, since calling it will unconditionally result in undefined behaviour. It is, however, possible to take its address, like this:

#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

void a() {
    __builtin_unreachable();
}

void b() {}

int main() {
    size_t ap = (size_t) a;
    size_t bp = (size_t) b;

    printf("%zu %zu %d\n", ap, bp, ap == bp);
}

Executing this will reveal that ap and bp have identical values, since a and b have the same address. However, it will also show that ap == bp is false, which contradicts that.

My guess is that some optimization pass assumes that different functions have different addresses, which is required by the C standard.

This bug is unlikely to happen in real programs, since:
a) few programs have functions that have unconditionally undefined behaviour,
b) even fewer programs will take the address of such a function, and
c) fewer still programs compare function pointers.

__builtin_unreachable can also be replaced by other statements with undefined behaviour, such as for (int i=0; i&gt;=0; i++);.

Tested with clang and clang++ 15.0.7 with an optimization level of 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants