kafka: avoid access to freed context in produce handler
When the first stage of the produce handler failed, the original code
also immediately failed the second-stage return future with the same
error. There is nothing wrong with those semantics. The problem is that
immediately failing the second stage left the actual second-stage
futures running in the background while still holding a reference to
the request context, which by then had been freed, leading to the
segfault below.

What's most interesting about this is that the produce request must
have data for two partitions for the bug to trigger. In that scenario
the append for one partition succeeds in the first and second stages,
while the append for the other partition fails in the first stage. The
first-stage failure causes the request context to be freed while the
second stage still contains a live append request. It is this live
append request that eventually touches the request context.

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:758
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:788
 (inlined by) seastar::print_with_backtrace(char const*, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:800
 (inlined by) seastar::sigsegv_action() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3593
 (inlined by) operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3579
 (inlined by) __invoke at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3575
?? ??:0
addr2line: DWARF error: could not find variable specification at offset 4b
addr2line: DWARF error: could not find variable specification at offset ec210
addr2line: DWARF error: could not find variable specification at offset ec3d7
lw_shared_ptr at /vectorized/include/seastar/core/shared_ptr.hh:291
 (inlined by) kafka::request_context::connection() at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-0eeaee3e12186c1d5-1/vectorized/redpanda/vbuild/release/clang/../../../src/v/kafka/server/request_context.h:81
 (inlined by) operator() at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-0eeaee3e12186c1d5-1/vectorized/redpanda/vbuild/release/clang/../../../src/v/kafka/server/handlers/produce.cc:315
 (inlined by) _ZNSt3__18__invokeIRZN5kafkaL23produce_topic_partitionERNS1_11produce_ctxERNS1_18topic_produce_dataERNS1_22partition_produce_dataEE3$_3JNS1_26partition_produce_responseEEEEDTclclsr3std3__1E7forwardIT_Efp_Espclsr3std3__1E7forwardIT0_Efp0_EEEOSB_DpOSC_ at /vectorized/llvm/bin/../include/c++/v1/type_traits:3694
 (inlined by) std::__1::invoke_result<kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, kafka::partition_produce_response>::type std::__1::invoke<kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, kafka::partition_produce_response>(kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, kafka::partition_produce_response&&) at /vectorized/llvm/bin/../include/c++/v1/functional:2989
 (inlined by) auto seastar::internal::future_invoke<kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, kafka::partition_produce_response>(kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, kafka::partition_produce_response&&) at /vectorized/include/seastar/core/future.hh:1211
 (inlined by) operator() at /vectorized/include/seastar/core/future.hh:1582
 (inlined by) void seastar::futurize<kafka::partition_produce_response>::satisfy_with_result_of<seastar::future<kafka::partition_produce_response>::then_impl_nrvo<kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3, seastar::future<kafka::partition_produce_response> >(kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&&)::{lambda(seastar::internal::promise_base_with_type<kafka::partition_produce_response>&&, kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, seastar::future_state<kafka::partition_produce_response>&&)#1}::operator()(seastar::internal::promise_base_with_type<kafka::partition_produce_response>&&, kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, seastar::future_state<kafka::partition_produce_response>&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<kafka::partition_produce_response>&&, kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&&) at /vectorized/include/seastar/core/future.hh:2122
 (inlined by) operator() at /vectorized/include/seastar/core/future.hh:1575
 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<kafka::partition_produce_response>, kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3, seastar::future<kafka::partition_produce_response>::then_impl_nrvo<kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3, seastar::future<kafka::partition_produce_response> >(kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&&)::{lambda(seastar::internal::promise_base_with_type<kafka::partition_produce_response>&&, kafka::produce_topic_partition(kafka::produce_ctx&, kafka::topic_produce_data&, kafka::partition_produce_data&)::$_3&, seastar::future_state<kafka::partition_produce_response>&&)#1}, kafka::partition_produce_response>::run_and_dispose() at /vectorized/include/seastar/core/future.hh:767
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2263
 (inlined by) seastar::reactor::run_some_tasks() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2672
seastar::reactor::run() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2831
operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4022
std::__1::__function::__value_func<void ()>::operator()() const at /vectorized/llvm/bin/../include/c++/v1/functional:1885
 (inlined by) std::__1::function<void ()>::operator()() const at /vectorized/llvm/bin/../include/c++/v1/functional:2560
 (inlined by) seastar::posix_thread::start_routine(void*) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/posix.cc:60
addr2line: '/opt/redpanda/lib/libpthread.so.0': No such file
/opt/redpanda/lib/libpthread.so.0 0x9298
addr2line: '/opt/redpanda/lib/libc.so.6': No such file
/opt/redpanda/lib/libc.so.6 0x1006a2

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
dotnwat committed Aug 8, 2021
1 parent 1adef78 commit 7e35133
Showing 1 changed file with 18 additions and 2 deletions.
20 changes: 18 additions & 2 deletions src/v/kafka/server/handlers/produce.cc
@@ -598,9 +598,25 @@ produce_handler::handle(request_context ctx, ss::smp_service_group ssg) {
                         octx.response)));
                   });
             } catch (...) {
+                /*
+                 * if the first stage failed then we cannot resolve the
+                 * current future (do_with holding octx) immediately,
+                 * otherwise octx will be destroyed and all of the second
+                 * stage futures (which have a reference to octx) will be
+                 * backgrounded. logging about the second stage return value
+                 * is handled in connection_context handler.
+                 */
                 dispatched_promise.set_exception(std::current_exception());
-                return ss::make_exception_future<response_ptr>(
-                  std::current_exception());
+                return when_all_succeed(produced.begin(), produced.end())
+                  .discard_result()
+                  .then([] {
+                      return ss::make_exception_future<response_ptr>(
+                        std::runtime_error("First stage produce failed but "
+                                           "second stage succeeded."));
+                  })
+                  .handle_exception([](std::exception_ptr e) {
+                      return ss::make_exception_future<response_ptr>(e);
+                  });
             }
         });
   });
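
The ordering problem described in the commit message can be sketched without Seastar. This is a simplified, standard-library-only illustration, not the Redpanda code: `request_state`, `stale_accesses`, and the `pending` queue are hypothetical names, and a `std::weak_ptr` stands in for the raw reference to `octx` so that the dangling case can be counted instead of crashing.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the per-request state (octx in the diff).
struct request_state {
    int responses_recorded = 0;
};

// Simulates one produce request after a first-stage failure. Returns how
// many deferred second-stage callbacks ran after the request state had
// already been destroyed.
int stale_accesses(bool wait_for_second_stage) {
    auto state = std::make_shared<request_state>();
    std::vector<std::function<void()>> pending; // deferred second-stage work
    int stale = 0;

    // The partition whose first stage succeeded still has a second-stage
    // callback queued. The weak_ptr models the non-owning reference.
    std::weak_ptr<request_state> ref = state;
    pending.push_back([ref, &stale] {
        if (auto s = ref.lock()) {
            ++s->responses_recorded; // state still alive: safe access
        } else {
            ++stale; // in the real bug this was a read of freed memory
        }
    });

    auto drain = [&pending] {
        for (auto& task : pending) {
            task();
        }
        pending.clear();
    };

    if (wait_for_second_stage) {
        drain();       // fixed order: let second-stage work finish first
        state.reset(); // then release the request state
    } else {
        state.reset(); // buggy order: fail immediately, state destroyed
        drain();       // backgrounded callback runs afterwards
    }
    return stale;
}
```

Calling `stale_accesses(false)` models the original behaviour, where the state is released before the queued callback runs, and `stale_accesses(true)` models the fixed ordering, where the handler waits for all second-stage work before the state goes away.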