
Clang mistakenly elides coroutine allocation resulting in a segfault and stack-use-after-return from AddressSanitizer #59723

Closed
cppdev123 opened this issue Dec 27, 2022 · 12 comments · Fixed by llvm/llvm-project-release-prs#637

Comments


cppdev123 commented Dec 27, 2022

The problem was submitted in #56513 and #56455 but there were no responses.
I thought it was specific to Windows, but it also happens on Linux with Clang 15.0.6 at optimization level -O3: AddressSanitizer detects a stack access after the coroutine returns.
The code in #56513 had a race condition in final_suspend, but that was not the cause of the problem.

Compile the following code with:

clang++-15 clangcorobug.cpp -std=c++20 -O3 -g -fsanitize=address -lpthread -o corobug
#include <atomic>
#include <thread>
#include <condition_variable>
#include <coroutine>
#include <variant>
#include <deque>
#include <cassert>

// executor and operation base

class bug_any_executor;

struct bug_async_op_base {
	void invoke() {
		invoke_operation();
	}

protected:

	~bug_async_op_base() = default;

	virtual void invoke_operation() = 0;
};

class bug_any_executor {
	using op_type = bug_async_op_base;

public:

	virtual ~bug_any_executor() = default;

	// removing noexcept enables clang to find that the pointer has escaped
	virtual void post(op_type& op) noexcept = 0;

	virtual void wait() noexcept = 0;
};

class bug_thread_executor : public bug_any_executor {
	void work_thd() {
		while (!ops_.empty()) {
			std::unique_lock<std::mutex> lock{ lock_ };
			cv_.wait(lock, [this] { return !ops_.empty(); });

			while (!ops_.empty()) {
				bug_async_op_base* op = ops_.front();
				ops_.pop_front();
				op->invoke();
			}
		}

		cv_.notify_all();
	}

	std::mutex lock_;
	std::condition_variable cv_;
	std::deque<bug_async_op_base*> ops_;
	std::thread thd_;

public:

	void start() {
		thd_ = std::thread(&bug_thread_executor::work_thd, this);
	}

	~bug_thread_executor() {
		if (thd_.joinable())
			thd_.join();
	}

	// although this implementation is not really noexcept (it allocates), the real one it models is noexcept and is required to be
	virtual void post(bug_async_op_base& op) noexcept override {
		{
			std::unique_lock<std::mutex> lock{ lock_ };
			ops_.push_back(&op);
		}
		cv_.notify_all();
	}

	virtual void wait() noexcept override {
		std::unique_lock<std::mutex> lock{ lock_ };
		cv_.wait(lock, [this] { return ops_.empty(); });
	}
};

// task and promise

struct bug_final_suspend_notification {
	virtual std::coroutine_handle<> get_waiter() = 0;
};

class bug_task;

class bug_resume_waiter {
public:
	bug_resume_waiter(std::variant<std::coroutine_handle<>, bug_final_suspend_notification*> waiter) noexcept : waiter_{ waiter } {}

	constexpr bool await_ready() const noexcept { return false; }

	std::coroutine_handle<> await_suspend(std::coroutine_handle<>) noexcept {
		return waiter_.index() == 0 ? std::get<0>(waiter_) : std::get<1>(waiter_)->get_waiter();
	}

	constexpr void await_resume() const noexcept {}

private:
	std::variant<std::coroutine_handle<>, bug_final_suspend_notification*> waiter_;
};

class bug_task_promise {
	friend bug_task;
public:

	bug_task get_return_object() noexcept;

	constexpr std::suspend_always initial_suspend() noexcept { return {}; }

	bug_resume_waiter final_suspend() noexcept {
		return bug_resume_waiter{ waiter_ };
	}

	void unhandled_exception() noexcept {
		ex_ptr = std::current_exception();
	}

	constexpr void return_void() const noexcept {}

	void get_result() const {
		if (ex_ptr)
			std::rethrow_exception(ex_ptr);
	}

	std::variant<std::monostate, std::exception_ptr> result_or_error() const noexcept {
		if (ex_ptr)
			return ex_ptr;
		return {};
	}

private:
	std::variant<std::coroutine_handle<>, bug_final_suspend_notification*> waiter_;
	std::exception_ptr ex_ptr = nullptr;
};

class bug_task {
	friend bug_task_promise;
	using handle = std::coroutine_handle<>;
	using promise_t = bug_task_promise;

	bug_task(handle coro, promise_t* p) noexcept : this_coro{ coro }, this_promise{ p } {
		//printf("task(%p) coroutine(%p) promise(%p)\n", this, this_coro.address(), this_promise);
	}

public:

	using promise_type = bug_task_promise;

	bug_task(bug_task&& other) noexcept
		: this_coro{ std::exchange(other.this_coro, nullptr) }, this_promise{ std::exchange(other.this_promise, nullptr) } { 
		printf("task(task&&: %p) coroutine(%p) promise(%p)\n", this, this_coro.address(), this_promise); 
	}

	~bug_task() {
		if (this_coro) {
			//printf("~task(%p) coroutine(%p) promise(%p)\n", this, this_coro.address(), this_promise);
			this_coro.destroy();
		}
	}

	constexpr bool await_ready() const noexcept {
		return false;
	}

	handle await_suspend(handle waiter) noexcept {
		assert(this_coro != nullptr && this_promise != nullptr);
		this_promise->waiter_ = waiter;
		return this_coro;
	}

	void await_resume() {
		return this_promise->get_result();
	}

	bool is_valid() const noexcept {
		return this_promise != nullptr && this_coro != nullptr;
	}

	void start_coro(bug_final_suspend_notification& w) noexcept {
		assert(this_promise != nullptr && this_coro != nullptr);
		this_promise->waiter_ = &w;
		this_coro.resume(); // never throws since all exceptions are caught by the promise
	}

private:
	handle this_coro;
	promise_t* this_promise;
};

bug_task bug_task_promise::get_return_object() noexcept {
	return { std::coroutine_handle<bug_task_promise>::from_promise(*this), this };
}

// spawn operation and spawner

template<class Handler>
class bug_spawn_op final : public bug_async_op_base, bug_final_suspend_notification {
	Handler handler;
	bug_task task_;

public:

	bug_spawn_op(Handler handler, bug_task&& t)
		: handler { handler }, task_{ std::move(t) } {}

	virtual void invoke_operation() override {
		printf("starting the coroutine\n");
		task_.start_coro(*this);
		printf("started the coroutine\n");
	}

	virtual std::coroutine_handle<> get_waiter() override {
                auto handler2 = std::move(handler);
                delete this;
		handler2();
		return std::noop_coroutine();
	}
};

struct dummy_spawn_handler_t {
	constexpr void operator()() const noexcept {}
};

void bug_spawn(bug_any_executor& ex, bug_task&& t) {
	using op_t = bug_spawn_op<dummy_spawn_handler_t>;
	op_t* op = new op_t{ dummy_spawn_handler_t{}, std::move(t) };
	ex.post(*op);
}

class bug_spawner;

struct bug_spawner_awaiter {
	bug_spawner& s;
	std::coroutine_handle<> waiter;

	bug_spawner_awaiter(bug_spawner& s) : s{ s } {}

	bool await_ready() const noexcept;

	void await_suspend(std::coroutine_handle<> coro);

	void await_resume() {}
};

class bug_spawner {
	friend bug_spawner_awaiter;

	struct final_handler_t {
		bug_spawner& s;

		void operator()() {
			s.on_spawn_finished();
		}
	};

public:

	bug_spawner(bug_any_executor& ex) : ex_{ ex } {}

	void spawn(bug_task&& t) {
		using op_t = bug_spawn_op<final_handler_t>;
		// move task into ptr
		op_t* ptr = new op_t(final_handler_t{ *this }, std::move(t));
		++count_;
		ex_.post(*ptr); // ptr escapes here and thus the task escapes, but clang can't deduce that unless post() is not noexcept
	}

	bug_spawner_awaiter wait() noexcept { return { *this }; }

	void on_spawn_finished()
	{
		if (!--count_ && awaiter_)
		{
			auto a = std::exchange(awaiter_, nullptr);
			a->waiter.resume();
		}
	}

private:

	bug_any_executor& ex_; // using bug_thread_executor& here instead enables clang to detect the escape of the promise
	bug_spawner_awaiter* awaiter_ = nullptr;
	std::atomic<std::size_t> count_ = 0;
};

bool bug_spawner_awaiter::await_ready() const noexcept {
	return s.count_ == 0;
}

void bug_spawner_awaiter::await_suspend(std::coroutine_handle<> coro) {
	waiter = coro;
	s.awaiter_ = this;
}

template<std::invocable<bug_spawner&> Fn>
bug_task scoped_spawn(bug_any_executor& ex, Fn fn) {
	bug_spawner s{ ex };
	std::exception_ptr ex_ptr;

	try
	{
		fn(s);
	}
	catch (const std::exception& ex) // ex instead of ... to observe the address of ex
	{
		printf("caught an exception from fn(s): %p\n", std::addressof(ex));
		ex_ptr = std::current_exception();
	}

	co_await s.wait();
	if (ex_ptr)
		std::rethrow_exception(ex_ptr);
}

// forked task to start the coroutine from sync code

struct bug_forked_task_promise;

class bug_forked_task {
	friend struct bug_forked_task_promise;
	bug_forked_task() = default;
public:
	using promise_type = bug_forked_task_promise;
};

struct bug_forked_task_promise {
	bug_forked_task get_return_object() noexcept { return {}; }

	constexpr std::suspend_never initial_suspend() noexcept { return {}; }

	constexpr std::suspend_never final_suspend() noexcept { return {}; }

	void unhandled_exception() noexcept {
		std::terminate();
	}

	constexpr void return_void() const noexcept {}
};

// test case

bug_task bug_spawned_task(int id, int inc, std::atomic<int>& n) {
	int result = n += inc;
	std::string msg = "count in coro (" + std::to_string(id) + ") = " + std::to_string(result);
	printf("%s\n", msg.c_str());
	co_return;
}

// using bug_thread_executor& instead of bug_any_executor& resolves the problem
bug_forked_task run_coros(bug_any_executor& ex) {
	std::atomic<int> count = 0;
	auto throwing_fn = [&](bug_spawner& s) {
		int frame_ptr = 0;
		printf("frame ptr ptr: %p\n", std::addressof(frame_ptr));
		s.spawn(bug_spawned_task(1, 2, count)); // the coroutine frame is allocated on the stack !
		s.spawn(bug_spawned_task(2, 3, count));
		s.spawn(bug_spawned_task(3, 5, count));
                // commenting the following line hides the problem
		throw std::runtime_error{ "catch this !" }; // on windows allocated on the stack as required by msvc c++ abi
	};

	try {
		co_await scoped_spawn(ex, throwing_fn);
	}
	catch (const std::exception& ex) {
		printf("scoped_spawn propagated exception: %s\n", ex.what());
	}

	printf("count after scoped_spawn: %d\n", count.load());
}


int main() {
	int var = 0;
	bug_thread_executor ex;
	printf("stack address: %p\n", std::addressof(var));
	run_coros(ex);
	ex.start();
	ex.wait();
	return 0;
}

Then run ./corobug and you will get something like this: AddressSanitizer: stack-use-after-return on address 0x7fb1ba9f9338 at pc 0x7fb1bd4d3b58 bp 0x7fb1ba1efd40 sp 0x7fb1ba1efd38
Without AddressSanitizer, a segfault occurs instead.
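
For reference, the optimized IR can also be inspected locally with something like the following (same source and flags as the reproducer above, swapping the ASan and link options for -S -emit-llvm; the exact invocation is only a sketch):

clang++-15 clangcorobug.cpp -std=c++20 -O3 -S -emit-llvm -o corobug.ll

The [72 x i8] allocas for the three coroutine frames then appear directly inside throwing_fn, as in the IR excerpt quoted later in this thread.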

@EugeneZelenko added the coroutines (C++20 coroutines) label and removed the new issue label Dec 27, 2022

llvmbot commented Dec 27, 2022

@llvm/issue-subscribers-coroutines

cppdev123 (Author) commented:

I put it on Compiler Explorer, and it is clear from the LLVM output that the coroutines are allocated on the stack.

std::atomic<int> count = 0;
void throwing_fn(bug_spawner& s) {
	int frame_ptr = 0;
	printf("frame ptr ptr: %p\n", std::addressof(frame_ptr));
	s.spawn(bug_spawned_task(1, 2, count)); // the coroutine frame is allocated on the stack !
	s.spawn(bug_spawned_task(2, 3, count));
	s.spawn(bug_spawned_task(3, 5, count));
	// commenting the following line hides the problem
	throw std::runtime_error{ "catch this !" }; // allocated on the stack as required by msvc c++ abi
};

translates to:

define dso_local void @_Z11throwing_fnR11bug_spawner(ptr noundef nonnull align 8 dereferenceable(24) %0) #13 personality ptr @__gxx_personality_v0 !dbg !5530 {
  %2 = alloca i32, align 4
  %3 = alloca [72 x i8], align 8 ; stack space for first spawned coroutine
  %4 = alloca [72 x i8], align 8  ; stack space for second spawned coroutine
  %5 = alloca [72 x i8], align 8  ; stack space for third spawned coroutine
  call void @llvm.dbg.value(metadata ptr %0, metadata !5534, metadata !DIExpression()), !dbg !5536
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %2) #27, !dbg !5537
  call void @llvm.dbg.value(metadata i32 0, metadata !5535, metadata !DIExpression()), !dbg !5536
  store i32 0, ptr %2, align 4, !dbg !5538, !tbaa !5539
  call void @llvm.dbg.value(metadata ptr %2, metadata !5535, metadata !DIExpression(DW_OP_deref)), !dbg !5536
  %6 = call i32 (ptr, ...) @printf(ptr noundef nonnull @.str.3, ptr noundef nonnull %2), !dbg !5541
  call void @llvm.dbg.value(metadata i32 1, metadata !4950, metadata !DIExpression()), !dbg !5542
  call void @llvm.dbg.value(metadata i32 2, metadata !4951, metadata !DIExpression()), !dbg !5542
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5542
  call void @llvm.dbg.declare(metadata ptr %3, metadata !4953, metadata !DIExpression(DW_OP_plus_uconst, 16)), !dbg !5542
  call void @llvm.dbg.declare(metadata ptr %3, metadata !4964, metadata !DIExpression()), !dbg !5542
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.resume, ptr %3, align 8, !dbg !5542
  %7 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 1, !dbg !5542
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.cleanup, ptr %7, align 8, !dbg !5542
  %8 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 2, !dbg !5542
  %9 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 3, !dbg !5542
  store ptr @count, ptr %9, align 8, !dbg !5542
  %10 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 5, !dbg !5542
  store <2 x i32> <i32 1, i32 2>, ptr %10, align 8, !dbg !5542
  call void @llvm.dbg.value(metadata i32 1, metadata !4950, metadata !DIExpression()), !dbg !5542
  call void @llvm.dbg.value(metadata i32 2, metadata !4951, metadata !DIExpression()), !dbg !5542
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5542
  call void @llvm.lifetime.start.p0(i64 24, ptr nonnull %8) #27, !dbg !5544
  call void @llvm.dbg.value(metadata ptr %8, metadata !4982, metadata !DIExpression()), !dbg !5545
  call void @llvm.dbg.value(metadata ptr %8, metadata !4988, metadata !DIExpression()), !dbg !5547
  call void @llvm.dbg.value(metadata ptr %8, metadata !4993, metadata !DIExpression()), !dbg !5549
  call void @llvm.dbg.value(metadata ptr %8, metadata !4999, metadata !DIExpression()), !dbg !5551
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5007, metadata !DIExpression()), !dbg !5553
  call void @llvm.dbg.value(metadata ptr %8, metadata !5011, metadata !DIExpression()), !dbg !5554
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5018, metadata !DIExpression()), !dbg !5554
  call void @llvm.dbg.value(metadata ptr %8, metadata !5022, metadata !DIExpression()), !dbg !5556
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5029, metadata !DIExpression()), !dbg !5556
  call void @llvm.dbg.value(metadata ptr %8, metadata !5033, metadata !DIExpression()), !dbg !5558
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5040, metadata !DIExpression()), !dbg !5558
  call void @llvm.dbg.value(metadata ptr %8, metadata !5044, metadata !DIExpression()), !dbg !5560
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5051, metadata !DIExpression()), !dbg !5560
  call void @llvm.dbg.value(metadata ptr %8, metadata !5055, metadata !DIExpression()), !dbg !5562
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5061, metadata !DIExpression()), !dbg !5564
  call void @llvm.dbg.value(metadata ptr %8, metadata !5066, metadata !DIExpression()), !dbg !5565
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5073, metadata !DIExpression()), !dbg !5567
  call void @llvm.dbg.value(metadata ptr %8, metadata !5078, metadata !DIExpression()), !dbg !5568
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5084, metadata !DIExpression()), !dbg !5570
  call void @llvm.dbg.value(metadata ptr %8, metadata !5089, metadata !DIExpression()), !dbg !5571
  store ptr null, ptr %8, align 8, !dbg !5573, !tbaa !5095
  %11 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 1, !dbg !5574
  store i8 0, ptr %11, align 8, !dbg !5574, !tbaa !5097
  %12 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 2, i32 1, !dbg !5575
  call void @llvm.dbg.value(metadata ptr %12, metadata !5100, metadata !DIExpression()), !dbg !5576
  call void @llvm.dbg.value(metadata ptr null, metadata !5103, metadata !DIExpression()), !dbg !5576
  store ptr null, ptr %12, align 8, !dbg !5578, !tbaa !5107
  %13 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %3, i64 0, i32 7, !dbg !5544
  store i1 false, ptr %13, align 8, !dbg !5544
  call void @llvm.dbg.value(metadata ptr %0, metadata !5579, metadata !DIExpression()), !dbg !5587
  call void @llvm.dbg.value(metadata ptr undef, metadata !5582, metadata !DIExpression()), !dbg !5587
  %14 = call noalias noundef nonnull dereferenceable(40) ptr @_Znwm(i64 noundef 40) #29, !dbg !5589
  call void @llvm.dbg.value(metadata ptr %0, metadata !5590, metadata !DIExpression()), !dbg !5596
  call void @llvm.dbg.value(metadata ptr %14, metadata !5593, metadata !DIExpression()), !dbg !5596
  call void @llvm.dbg.value(metadata ptr undef, metadata !5595, metadata !DIExpression()), !dbg !5596
  %15 = getelementptr inbounds i8, ptr %14, i64 8, !dbg !5598
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 0, i64 2), ptr %14, align 8, !dbg !5598, !tbaa !4835
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 1, i64 2), ptr %15, align 8, !dbg !5598, !tbaa !4835
  %16 = getelementptr inbounds %class.bug_spawn_op.14, ptr %14, i64 0, i32 2, !dbg !5599
  store ptr %0, ptr %16, align 8, !dbg !5599, !tbaa.struct !4802
  %17 = getelementptr inbounds %class.bug_spawn_op.14, ptr %14, i64 0, i32 3, !dbg !5600
  call void @llvm.dbg.value(metadata ptr %17, metadata !4838, metadata !DIExpression()), !dbg !5601
  call void @llvm.dbg.value(metadata ptr undef, metadata !4841, metadata !DIExpression()), !dbg !5601
  store ptr %3, ptr %17, align 8, !dbg !5603
  %18 = getelementptr inbounds %class.bug_spawn_op.14, ptr %14, i64 0, i32 3, i32 1, !dbg !5604
  store ptr %8, ptr %18, align 8, !dbg !5604, !tbaa !4808
  %19 = call i32 (ptr, ...) @printf(ptr noundef nonnull @.str.15, ptr noundef nonnull %17, ptr noundef nonnull %3, ptr noundef nonnull %8), !dbg !5605
  call void @llvm.dbg.value(metadata ptr %14, metadata !5583, metadata !DIExpression()), !dbg !5587
  %20 = getelementptr inbounds %class.bug_spawner, ptr %0, i64 0, i32 2, !dbg !5606
  call void @llvm.dbg.value(metadata ptr %20, metadata !5607, metadata !DIExpression()), !dbg !5611
  %21 = atomicrmw add ptr %20, i64 1 seq_cst, align 8, !dbg !5613
  %22 = load ptr, ptr %0, align 8, !dbg !5614, !tbaa !5615
  %23 = load ptr, ptr %22, align 8, !dbg !5616, !tbaa !4835
  %24 = getelementptr inbounds ptr, ptr %23, i64 2, !dbg !5616
  %25 = load ptr, ptr %24, align 8, !dbg !5616
  call void %25(ptr noundef nonnull align 8 dereferenceable(8) %22, ptr noundef nonnull align 8 dereferenceable(8) %14), !dbg !5616
  call void @llvm.dbg.value(metadata i32 2, metadata !4950, metadata !DIExpression()), !dbg !5617
  call void @llvm.dbg.value(metadata i32 3, metadata !4951, metadata !DIExpression()), !dbg !5617
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5617
  call void @llvm.dbg.declare(metadata ptr %4, metadata !4953, metadata !DIExpression(DW_OP_plus_uconst, 16)), !dbg !5617
  call void @llvm.dbg.declare(metadata ptr %4, metadata !4964, metadata !DIExpression()), !dbg !5617
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.resume, ptr %4, align 8, !dbg !5617
  %26 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 1, !dbg !5617
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.cleanup, ptr %26, align 8, !dbg !5617
  %27 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 2, !dbg !5617
  %28 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 3, !dbg !5617
  store ptr @count, ptr %28, align 8, !dbg !5617
  %29 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 5, !dbg !5617
  store <2 x i32> <i32 2, i32 3>, ptr %29, align 8, !dbg !5617
  call void @llvm.dbg.value(metadata i32 2, metadata !4950, metadata !DIExpression()), !dbg !5617
  call void @llvm.dbg.value(metadata i32 3, metadata !4951, metadata !DIExpression()), !dbg !5617
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5617
  call void @llvm.lifetime.start.p0(i64 24, ptr nonnull %27) #27, !dbg !5619
  call void @llvm.dbg.value(metadata ptr %27, metadata !4982, metadata !DIExpression()), !dbg !5620
  call void @llvm.dbg.value(metadata ptr %27, metadata !4988, metadata !DIExpression()), !dbg !5622
  call void @llvm.dbg.value(metadata ptr %27, metadata !4993, metadata !DIExpression()), !dbg !5624
  call void @llvm.dbg.value(metadata ptr %27, metadata !4999, metadata !DIExpression()), !dbg !5626
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5007, metadata !DIExpression()), !dbg !5628
  call void @llvm.dbg.value(metadata ptr %27, metadata !5011, metadata !DIExpression()), !dbg !5629
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5018, metadata !DIExpression()), !dbg !5629
  call void @llvm.dbg.value(metadata ptr %27, metadata !5022, metadata !DIExpression()), !dbg !5631
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5029, metadata !DIExpression()), !dbg !5631
  call void @llvm.dbg.value(metadata ptr %27, metadata !5033, metadata !DIExpression()), !dbg !5633
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5040, metadata !DIExpression()), !dbg !5633
  call void @llvm.dbg.value(metadata ptr %27, metadata !5044, metadata !DIExpression()), !dbg !5635
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5051, metadata !DIExpression()), !dbg !5635
  call void @llvm.dbg.value(metadata ptr %27, metadata !5055, metadata !DIExpression()), !dbg !5637
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5061, metadata !DIExpression()), !dbg !5639
  call void @llvm.dbg.value(metadata ptr %27, metadata !5066, metadata !DIExpression()), !dbg !5640
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5073, metadata !DIExpression()), !dbg !5642
  call void @llvm.dbg.value(metadata ptr %27, metadata !5078, metadata !DIExpression()), !dbg !5643
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5084, metadata !DIExpression()), !dbg !5645
  call void @llvm.dbg.value(metadata ptr %27, metadata !5089, metadata !DIExpression()), !dbg !5646
  store ptr null, ptr %27, align 8, !dbg !5648, !tbaa !5095
  %30 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 1, !dbg !5649
  store i8 0, ptr %30, align 8, !dbg !5649, !tbaa !5097
  %31 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 2, i32 1, !dbg !5650
  call void @llvm.dbg.value(metadata ptr %31, metadata !5100, metadata !DIExpression()), !dbg !5651
  call void @llvm.dbg.value(metadata ptr null, metadata !5103, metadata !DIExpression()), !dbg !5651
  store ptr null, ptr %31, align 8, !dbg !5653, !tbaa !5107
  %32 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %4, i64 0, i32 7, !dbg !5619
  store i1 false, ptr %32, align 8, !dbg !5619
  call void @llvm.dbg.value(metadata ptr %0, metadata !5579, metadata !DIExpression()), !dbg !5654
  call void @llvm.dbg.value(metadata ptr undef, metadata !5582, metadata !DIExpression()), !dbg !5654
  %33 = call noalias noundef nonnull dereferenceable(40) ptr @_Znwm(i64 noundef 40) #29, !dbg !5656
  call void @llvm.dbg.value(metadata ptr %0, metadata !5590, metadata !DIExpression()), !dbg !5657
  call void @llvm.dbg.value(metadata ptr %33, metadata !5593, metadata !DIExpression()), !dbg !5657
  call void @llvm.dbg.value(metadata ptr undef, metadata !5595, metadata !DIExpression()), !dbg !5657
  %34 = getelementptr inbounds i8, ptr %33, i64 8, !dbg !5659
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 0, i64 2), ptr %33, align 8, !dbg !5659, !tbaa !4835
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 1, i64 2), ptr %34, align 8, !dbg !5659, !tbaa !4835
  %35 = getelementptr inbounds %class.bug_spawn_op.14, ptr %33, i64 0, i32 2, !dbg !5660
  store ptr %0, ptr %35, align 8, !dbg !5660, !tbaa.struct !4802
  %36 = getelementptr inbounds %class.bug_spawn_op.14, ptr %33, i64 0, i32 3, !dbg !5661
  call void @llvm.dbg.value(metadata ptr %36, metadata !4838, metadata !DIExpression()), !dbg !5662
  call void @llvm.dbg.value(metadata ptr undef, metadata !4841, metadata !DIExpression()), !dbg !5662
  store ptr %4, ptr %36, align 8, !dbg !5664
  %37 = getelementptr inbounds %class.bug_spawn_op.14, ptr %33, i64 0, i32 3, i32 1, !dbg !5665
  store ptr %27, ptr %37, align 8, !dbg !5665, !tbaa !4808
  %38 = call i32 (ptr, ...) @printf(ptr noundef nonnull @.str.15, ptr noundef nonnull %36, ptr noundef nonnull %4, ptr noundef nonnull %27), !dbg !5666
  call void @llvm.dbg.value(metadata ptr %33, metadata !5583, metadata !DIExpression()), !dbg !5654
  call void @llvm.dbg.value(metadata ptr %20, metadata !5607, metadata !DIExpression()), !dbg !5667
  %39 = atomicrmw add ptr %20, i64 1 seq_cst, align 8, !dbg !5669
  %40 = load ptr, ptr %0, align 8, !dbg !5670, !tbaa !5615
  %41 = load ptr, ptr %40, align 8, !dbg !5671, !tbaa !4835
  %42 = getelementptr inbounds ptr, ptr %41, i64 2, !dbg !5671
  %43 = load ptr, ptr %42, align 8, !dbg !5671
  call void %43(ptr noundef nonnull align 8 dereferenceable(8) %40, ptr noundef nonnull align 8 dereferenceable(8) %33), !dbg !5671
  call void @llvm.dbg.value(metadata i32 3, metadata !4950, metadata !DIExpression()), !dbg !5672
  call void @llvm.dbg.value(metadata i32 5, metadata !4951, metadata !DIExpression()), !dbg !5672
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5672
  call void @llvm.dbg.declare(metadata ptr %5, metadata !4953, metadata !DIExpression(DW_OP_plus_uconst, 16)), !dbg !5672
  call void @llvm.dbg.declare(metadata ptr %5, metadata !4964, metadata !DIExpression()), !dbg !5672
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.resume, ptr %5, align 8, !dbg !5672
  %44 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 1, !dbg !5672
  store ptr @_Z16bug_spawned_taskiiRSt6atomicIiE.cleanup, ptr %44, align 8, !dbg !5672
  %45 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 2, !dbg !5672
  %46 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 3, !dbg !5672
  store ptr @count, ptr %46, align 8, !dbg !5672
  %47 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 5, !dbg !5672
  store <2 x i32> <i32 3, i32 5>, ptr %47, align 8, !dbg !5672
  call void @llvm.dbg.value(metadata i32 3, metadata !4950, metadata !DIExpression()), !dbg !5672
  call void @llvm.dbg.value(metadata i32 5, metadata !4951, metadata !DIExpression()), !dbg !5672
  call void @llvm.dbg.value(metadata ptr @count, metadata !4952, metadata !DIExpression()), !dbg !5672
  call void @llvm.lifetime.start.p0(i64 24, ptr nonnull %45) #27, !dbg !5674
  call void @llvm.dbg.value(metadata ptr %45, metadata !4982, metadata !DIExpression()), !dbg !5675
  call void @llvm.dbg.value(metadata ptr %45, metadata !4988, metadata !DIExpression()), !dbg !5677
  call void @llvm.dbg.value(metadata ptr %45, metadata !4993, metadata !DIExpression()), !dbg !5679
  call void @llvm.dbg.value(metadata ptr %45, metadata !4999, metadata !DIExpression()), !dbg !5681
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5007, metadata !DIExpression()), !dbg !5683
  call void @llvm.dbg.value(metadata ptr %45, metadata !5011, metadata !DIExpression()), !dbg !5684
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5018, metadata !DIExpression()), !dbg !5684
  call void @llvm.dbg.value(metadata ptr %45, metadata !5022, metadata !DIExpression()), !dbg !5686
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5029, metadata !DIExpression()), !dbg !5686
  call void @llvm.dbg.value(metadata ptr %45, metadata !5033, metadata !DIExpression()), !dbg !5688
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5040, metadata !DIExpression()), !dbg !5688
  call void @llvm.dbg.value(metadata ptr %45, metadata !5044, metadata !DIExpression()), !dbg !5690
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5051, metadata !DIExpression()), !dbg !5690
  call void @llvm.dbg.value(metadata ptr %45, metadata !5055, metadata !DIExpression()), !dbg !5692
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5061, metadata !DIExpression()), !dbg !5694
  call void @llvm.dbg.value(metadata ptr %45, metadata !5066, metadata !DIExpression()), !dbg !5695
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5073, metadata !DIExpression()), !dbg !5697
  call void @llvm.dbg.value(metadata ptr %45, metadata !5078, metadata !DIExpression()), !dbg !5698
  call void @llvm.dbg.declare(metadata ptr undef, metadata !5084, metadata !DIExpression()), !dbg !5700
  call void @llvm.dbg.value(metadata ptr %45, metadata !5089, metadata !DIExpression()), !dbg !5701
  store ptr null, ptr %45, align 8, !dbg !5703, !tbaa !5095
  %48 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 1, !dbg !5704
  store i8 0, ptr %48, align 8, !dbg !5704, !tbaa !5097
  %49 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 2, i32 1, !dbg !5705
  call void @llvm.dbg.value(metadata ptr %49, metadata !5100, metadata !DIExpression()), !dbg !5706
  call void @llvm.dbg.value(metadata ptr null, metadata !5103, metadata !DIExpression()), !dbg !5706
  store ptr null, ptr %49, align 8, !dbg !5708, !tbaa !5107
  %50 = getelementptr inbounds %_Z16bug_spawned_taskiiRSt6atomicIiE.Frame, ptr %5, i64 0, i32 7, !dbg !5674
  store i1 false, ptr %50, align 8, !dbg !5674
  call void @llvm.dbg.value(metadata ptr %0, metadata !5579, metadata !DIExpression()), !dbg !5709
  call void @llvm.dbg.value(metadata ptr undef, metadata !5582, metadata !DIExpression()), !dbg !5709
  %51 = call noalias noundef nonnull dereferenceable(40) ptr @_Znwm(i64 noundef 40) #29, !dbg !5711
  call void @llvm.dbg.value(metadata ptr %0, metadata !5590, metadata !DIExpression()), !dbg !5712
  call void @llvm.dbg.value(metadata ptr %51, metadata !5593, metadata !DIExpression()), !dbg !5712
  call void @llvm.dbg.value(metadata ptr undef, metadata !5595, metadata !DIExpression()), !dbg !5712
  %52 = getelementptr inbounds i8, ptr %51, i64 8, !dbg !5714
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 0, i64 2), ptr %51, align 8, !dbg !5714, !tbaa !4835
  store ptr getelementptr inbounds ({ [4 x ptr], [3 x ptr] }, ptr @_ZTV12bug_spawn_opIN11bug_spawner15final_handler_tEE, i64 0, inrange i32 1, i64 2), ptr %52, align 8, !dbg !5714, !tbaa !4835
  %53 = getelementptr inbounds %class.bug_spawn_op.14, ptr %51, i64 0, i32 2, !dbg !5715
  store ptr %0, ptr %53, align 8, !dbg !5715, !tbaa.struct !4802
  %54 = getelementptr inbounds %class.bug_spawn_op.14, ptr %51, i64 0, i32 3, !dbg !5716
  call void @llvm.dbg.value(metadata ptr %54, metadata !4838, metadata !DIExpression()), !dbg !5717
  call void @llvm.dbg.value(metadata ptr undef, metadata !4841, metadata !DIExpression()), !dbg !5717
  store ptr %5, ptr %54, align 8, !dbg !5719
  %55 = getelementptr inbounds %class.bug_spawn_op.14, ptr %51, i64 0, i32 3, i32 1, !dbg !5720
  store ptr %45, ptr %55, align 8, !dbg !5720, !tbaa !4808
  %56 = call i32 (ptr, ...) @printf(ptr noundef nonnull @.str.15, ptr noundef nonnull %54, ptr noundef nonnull %5, ptr noundef nonnull %45), !dbg !5721
  call void @llvm.dbg.value(metadata ptr %51, metadata !5583, metadata !DIExpression()), !dbg !5709
  call void @llvm.dbg.value(metadata ptr %20, metadata !5607, metadata !DIExpression()), !dbg !5722
  %57 = atomicrmw add ptr %20, i64 1 seq_cst, align 8, !dbg !5724
  %58 = load ptr, ptr %0, align 8, !dbg !5725, !tbaa !5615
  %59 = load ptr, ptr %58, align 8, !dbg !5726, !tbaa !4835
  %60 = getelementptr inbounds ptr, ptr %59, i64 2, !dbg !5726
  %61 = load ptr, ptr %60, align 8, !dbg !5726
  call void %61(ptr noundef nonnull align 8 dereferenceable(8) %58, ptr noundef nonnull align 8 dereferenceable(8) %51), !dbg !5726
  %62 = call ptr @__cxa_allocate_exception(i64 16) #27, !dbg !5727
  invoke void @_ZNSt13runtime_errorC1EPKc(ptr noundef nonnull align 8 dereferenceable(16) %62, ptr noundef nonnull @.str.4)
          to label %63 unwind label %64, !dbg !5728

ChuanqiXu9 (Member) commented:

Thanks for reporting this. I might not be able to look at it soon. I'm also wondering if it is possible to reduce this further; it is still long for a minimal reproducer. That is not required, but it would be pretty helpful.

ChuanqiXu9 (Member) commented:

Note to myself: the fundamental problem behind the issue is that we didn't perform a real/strict escape analysis. We imagined that stack unwinding could take over the job of destroying the coroutine frame when the frame allocation is elided. But we missed one point: stack unwinding has different semantics from an explicit coroutine_handle<>::destroy(). The latter is explicit, so it shows the intention of the user; if a use-after-free happens after it, we can blame the user for destroying the coroutine frame incorrectly. We can't say the same about stack unwinding.
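
A minimal sketch of that distinction (hypothetical types and names, not the reporter's code): the handle escapes into a queue that outlives the caller, and the caller then unwinds. If the frame allocation had been elided onto the caller's stack, resuming from the queue later would touch dead stack memory, which is exactly the stack-use-after-return reported above.

#include <coroutine>
#include <deque>
#include <stdexcept>

// Hypothetical minimal task type for illustration: it suspends at
// initial_suspend so the caller decides when the body runs, and the frame
// frees itself at final_suspend (suspend_never).
struct fire_and_forget_task {
	struct promise_type {
		fire_and_forget_task get_return_object() {
			return { std::coroutine_handle<promise_type>::from_promise(*this) };
		}
		std::suspend_always initial_suspend() noexcept { return {}; }
		std::suspend_never final_suspend() noexcept { return {}; }
		void return_void() noexcept {}
		void unhandled_exception() noexcept {}
	};
	std::coroutine_handle<> handle;
};

std::deque<std::coroutine_handle<>> pending; // outlives caller(), drained elsewhere

fire_and_forget_task work() { co_return; }

void caller() {
	pending.push_back(work().handle); // the handle escapes the scope here
	throw std::runtime_error{ "unwind" }; // unwinding must not free the escaped frame
}
// Elsewhere (e.g. on an executor thread): pending.front().resume();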

avikivity (Contributor) commented:

Please consider backporting to 16.0.x.

EugeneZelenko (Contributor) commented:

@avikivity: 17 is the only currently maintained branch.

avikivity (Contributor) commented:

@mykaul @tchaikov this may be responsible for the problems we see with aarch64 and excessive inlining.

See https://github.com/scylladb/scylladb/blob/93be4c0cb0f0c53fe0eb7cd4d06ba15a7fc01d29/configure.py#L1417 for the workaround. Maybe we can get rid of it on the next clang release.

@nikic added this to the LLVM 17.0.X Release milestone Aug 24, 2023
@nikic reopened this Aug 24, 2023
github-project-automation bot moved this to Needs Triage in LLVM Release Status Aug 24, 2023

nikic commented Aug 24, 2023

/cherry-pick 7037331


tchaikov commented Aug 24, 2023

@mykaul @tchaikov this may be responsible for the problems we see with aarch64 and excessive inlining.

See https://github.com/scylladb/scylladb/blob/93be4c0cb0f0c53fe0eb7cd4d06ba15a7fc01d29/configure.py#L1417 for the workaround. Maybe we can get rid of it on the next clang release.

Not sure what the right number to use in -mllvm -inline-threshold=<number> is, not to mention that it would be difficult to teach CMake not to pass -mllvm ... when it uses clang++ as the driver for the archiver. I guess a better (or more dangerous) approach would probably be https://src.fedoraproject.org/rpms/llvm/pull-request/182 .
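
For context (an illustrative command shape only, not a recommended value), the workaround being discussed amounts to passing the LLVM inliner threshold through the clang driver on each compile, e.g.:

clang++ -std=c++20 -O2 -mllvm -inline-threshold=100 -c foo.cpp -o foo.o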


llvmbot commented Aug 24, 2023

/branch llvm/llvm-project-release-prs/issue59723

llvmbot pushed a commit to llvm/llvm-project-release-prs that referenced this issue Aug 24, 2023
…k coro handle unconditionally any more

Close llvm/llvm-project#59723.

The fundamental cause of the above issue is that we assumed the memory of the
coroutine frame could be released automatically by stack unwinding if the
allocation of the coroutine frame is elided. But we missed one point: stack
unwinding has different semantics from the explicit
coroutine_handle<>::destroy(). Since the latter is explicit, it shows the
intention of the user, so we can blame the user for destroying the coroutine
frame incorrectly if a use-after-free happens. We can't do so with stack
unwinding.

So after this patch, we no longer assume unconditionally that an exceptional
terminator doesn't leak the coroutine handle. Instead, we treat an exceptional
terminator as leaking the coroutine handle too if the coroutine is leaked
somewhere along the search path.

Concretely for C++, the exceptional terminator is no longer considered
special. This may cause some performance regressions, but I've tested the
motivating example (std::generator). On the other hand, coroutine elision is a
middle-end optimization and not a language feature, so we don't think such
regressions should be blamed, especially since we are correcting
miscompilations.

(cherry picked from commit 7037331a2f05990cd59f35a7c0f6ce87c0f3cb5f)
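
For contrast with the escaping case sketched earlier in the thread, a hedged illustration (reusing the hypothetical work() coroutine from above) of the situation where elision stays valid: the handle never leaves the calling scope, so freeing the frame together with the caller's stack is not observable by the user.

// Safe case for elision: the handle stays local and the coroutine runs to
// completion, so the frame's lifetime is bounded by this scope.
void local_use() {
	fire_and_forget_task t = work();
	t.handle.resume(); // runs the body; the frame is freed at final_suspend
	// t.handle is never stored anywhere that outlives this scope
}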
avikivity (Contributor) commented:


Let's discuss this on a scylladb issue so we don't bore the clang coroutine developers here.


llvmbot commented Aug 24, 2023

/pull-request llvm/llvm-project-release-prs#637

tru pushed a commit to llvm/llvm-project-release-prs that referenced this issue Aug 25, 2023
…k coro handle unconditionally any more

Close llvm/llvm-project#59723. (Same cherry-picked commit 7037331a2f05990cd59f35a7c0f6ce87c0f3cb5f and commit message as the llvmbot push above.)
@tru moved this from Needs Review to Done in LLVM Release Status Aug 25, 2023
razmser pushed commits to SuduIDE/llvm-project that referenced this issue on Oct 2, Oct 3, Oct 6, and Oct 11, 2023, each with the same commit message as above.