Skip to content

[bug] clang (incorrectly?) optimizes away an entire simple coroutine #65030

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jacobsa opened this issue Aug 28, 2023 · 7 comments
Closed

[bug] clang (incorrectly?) optimizes away an entire simple coroutine #65030

jacobsa opened this issue Aug 28, 2023 · 7 comments

Comments

@jacobsa
Copy link
Contributor

jacobsa commented Aug 28, 2023

In #65018 I made a report about clang introducing a data race in the following program with -O1, by writing to the coroutine frame after suspending:

#include <coroutine>

// A simple awaiter type with an await_suspend method that can't be
// inlined.
struct Awaiter {
  const int& x;

  bool await_ready() { return false; }
  std::coroutine_handle<> await_suspend(const std::coroutine_handle<> h);
  void await_resume() {}
};

struct MyTask {
  // A lazy promise with an await_transform method that supports awaiting
  // integer references using the Awaiter struct above.
  struct promise_type {
    MyTask get_return_object() { return {}; }
    std::suspend_always initial_suspend() { return {}; }
    std::suspend_always final_suspend() noexcept { return {}; }
    void unhandled_exception();

    auto await_transform(const int& x) { return Awaiter{x}; }
  };
};

// A global array of integers.
int g_array[32];

// A coroutine that awaits each integer in the global array.
MyTask FooBar() {
  for (const int& x : g_array) {
    co_await x;
  }
}

I found that if you compile it with -O2 instead at a738bdf35e (Compiler Explorer), clang completely eliminates the application logic and just returns a pointer to an uninitialized object:

FooBar():                             # @FooBar()
        mov     edi, 56
        jmp     operator new(unsigned long)@PLT                       # TAILCALL
g_array:
        .zero   128

I don't think this optimization is correct. It seems like clang is asserting there is undefined behavior in the original code, but if there is I can't find it. It should be initializing the resume and destroy function pointers in the coroutine frame, then faithfully calling Awaiter::await_suspend for each element in the array. That function may have arbitrary behavior, including side effects that consume the integer in the array.

This may or may not have the same cause as #65018, where clang introduces a data race into this coroutine. Perhaps the data race that clang introduces allows it to convince itself that there is undefined behavior?

@llvmbot
Copy link
Member

llvmbot commented Aug 28, 2023

@llvm/issue-subscribers-coroutines

1 similar comment
@llvmbot
Copy link
Member

llvmbot commented Aug 28, 2023

@llvm/issue-subscribers-coroutines

@ChuanqiXu9
Copy link
Member

At least this may not be the same reason with #65018, since the problem exists at least from clang14 (may be longer).

@jacobsa
Copy link
Contributor Author

jacobsa commented Aug 28, 2023

Yes, good point! I should have mentioned that.

@ChuanqiXu9
Copy link
Member

I feel this might not be a bug now. Because the coroutine handle is not passed to any where after it gets created. So we can't resume it or destroy it. Then no matter what the await_suspend may do, it is unreachable. Then the compiler chose to eliminate the unused coroutine handle, which makes sense.

So I'd like to close the page and feel free to reopen it if you find something differently.

@jacobsa
Copy link
Contributor Author

jacobsa commented Aug 29, 2023

Good point, thank you! If I alter my example so that the coroutine handle escapes, then the optimization goes away. I apologize for missing this.

@ChuanqiXu9
Copy link
Member

Never mind. Your excellent reproducer helped a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants