Segfault in round-robin assignor #2121

Closed
ankon opened this issue Nov 27, 2018 · 8 comments

@ankon (Contributor) commented Nov 27, 2018

Description

We're using the 'roundrobin' partition assignment strategy, and I sometimes get segfaults in the rdk:main thread. I looked at one of these in more detail:

#0  0x00007fc0f32d6fd3 in rd_kafka_group_member_find_subscription (rk=rk@entry=0x7fc0d4002e50, rkgm=rkgm@entry=0x7fc0b806a380, topic=0x7fc0b8076de0 "ml3-commands") at rdkafka_assignor.c:81
#1  0x00007fc0f32d7fb4 in rd_kafka_roundrobin_assignor_assign_cb (rk=0x7fc0d4002e50, member_id=<optimized out>, protocol_name=<optimized out>, metadata=<optimized out>, members=0x7fc0b806a2e0, 
    member_cnt=<optimized out>, eligible_topics=0x7fc0b805bfa0, eligible_topic_cnt=12, errstr=0x7fc0d97f9a00 "", errstr_size=8, opaque=0x0) at rdkafka_roundrobin_assignor.c:80
#2  0x00007fc0f32d74d8 in rd_kafka_assignor_run (rkcg=rkcg@entry=0x7fc0d4003b20, protocol_name=protocol_name@entry=0x6b8ec40 "roundrobin", metadata=0x7fc0b80769f0, members=members@entry=0x7fc0b806a2e0, 
    member_cnt=member_cnt@entry=2, errstr=errstr@entry=0x7fc0d97f9a00 "", errstr_size=512) at rdkafka_assignor.c:331
#3  0x00007fc0f32be5f7 in rd_kafka_cgrp_assignor_run (rkcg=0x7fc0d4003b20, protocol_name=0x6b8ec40 "roundrobin", err=<optimized out>, metadata=<optimized out>, members=0x7fc0b806a2e0, member_cnt=2)
    at rdkafka_cgrp.c:671
#4  0x00007fc0f32be7cf in rd_kafka_cgrp_assignor_handle_Metadata_op (rk=<optimized out>, rkq=<optimized out>, rko=<optimized out>) at rdkafka_cgrp.c:733
#5  0x00007fc0f32ab5fc in rd_kafka_op_call (rk=<optimized out>, rkq=<optimized out>, rko=0x7fc0b802d090) at rdkafka_op.c:519
#6  0x00007fc0f32ac7e5 in rd_kafka_op_handle (rk=0x7fc0d4002e50, rkq=0x7fc0d97f9cc0, rko=0x7fc0b802d090, cb_type=RD_KAFKA_Q_CB_CALLBACK, opaque=0x0, callback=0x0) at rdkafka_op.c:647
#7  0x00007fc0f32a58df in rd_kafka_q_serve (rkq=0x7fc0d40038f0, timeout_ms=<optimized out>, max_cnt=max_cnt@entry=0, cb_type=cb_type@entry=RD_KAFKA_Q_CB_CALLBACK, callback=callback@entry=0x0, 
    opaque=opaque@entry=0x0) at rdkafka_queue.c:481
#8  0x00007fc0f327d4e4 in rd_kafka_thread_main (arg=arg@entry=0x7fc0d4002e50) at rdkafka.c:1393
#9  0x00007fc0f32da907 in _thrd_wrapper_function (aArg=<optimized out>) at tinycthread.c:624
#10 0x00007fc0f5d8358e in start_thread () from /lib64/libpthread.so.0
#11 0x00007fc0f5cb0653 in clone () from /lib64/libc.so.6

Note:

  1. This is librdkafka used via node-rdkafka in Node.js
  2. The gdb commands below have been trimmed and reordered for readability :D

It seems that in rd_kafka_roundrobin_assignor_assign_cb the code runs past the end of the members array:

(gdb) up
#1  0x00007fc0f32d7fb4 in rd_kafka_roundrobin_assignor_assign_cb (rk=0x7fc0d4002e50, member_id=<optimized out>, protocol_name=<optimized out>, metadata=<optimized out>, members=0x7fc0b806a2e0, 
    member_cnt=<optimized out>, eligible_topics=0x7fc0b805bfa0, eligible_topic_cnt=12, errstr=0x7fc0d97f9a00 "", errstr_size=8, opaque=0x0) at rdkafka_roundrobin_assignor.c:80
80			for (partition = 0 ;
(gdb) p next
$12 = 2

(cf. "2", an index, to the member_cnt "2" seen in frame 2).

(gdb) p members[2]
$13 = {rkgm_subscription = 0x0, rkgm_assignment = 0x3885, rkgm_eligible = {rl_size = 0, rl_cnt = 0, rl_elems = 0xf4240, rl_free_cb = 0x0, rl_flags = 2, rl_elemsize = 0, rl_p = 0x8000000007}, 
  rkgm_member_id = 0xff, rkgm_userdata = 0xd00000100, rkgm_member_metadata = 0x700}

I also looked at the subscriptions involved for members[0]:

(gdb) p *members[0].rkgm_subscription 
$21 = {cnt = 12, size = 24, elems = 0x7fc0b80591c0}
(gdb) p members[0].rkgm_subscription->elems[0]
$23 = {topic = 0x7fc0b8017ad0 "collaborne-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[1]
$24 = {topic = 0x7fc0b8056e10 "collaborne-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[2]
$25 = {topic = 0x7fc0b805c110 "ml1-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[3]
$26 = {topic = 0x7fc0b8016620 "ml1-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[4]
$27 = {topic = 0x7fc0b8016670 "ml2-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[5]
$28 = {topic = 0x7fc0b8015bf0 "ml2-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[6]
$29 = {topic = 0x7fc0b803e260 "ml3-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[7]
$30 = {topic = 0x7fc0b8058170 "ml3-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[8]
$31 = {topic = 0x7fc0b803e2b0 "ml4-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[9]
$32 = {topic = 0x7fc0b80508f0 "ml4-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[10]
$33 = {topic = 0x7fc0b805ee50 "qa-commands", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[0].rkgm_subscription->elems[11]
$34 = {topic = 0x7fc0b80378d0 "qa-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}

... and members[1]:

(gdb)  p *members[1].rkgm_subscription
$50 = {cnt = 5, size = 10, elems = 0x7fc0b8015660}
(gdb) p members[1].rkgm_subscription->elems[0]
$42 = {topic = 0x7fc0b8057760 "collaborne-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[1].rkgm_subscription->elems[1]
$43 = {topic = 0x7fc0b8046c10 "ml1-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[1].rkgm_subscription->elems[2]
$44 = {topic = 0x7fc0b805e730 "ml2-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[1].rkgm_subscription->elems[3]
$45 = {topic = 0x7fc0b8037340 "ml3-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}
(gdb) p members[1].rkgm_subscription->elems[4]
$46 = {topic = 0x7fc0b803d9e0 "qa-events", partition = -1, offset = -1001, metadata = 0x0, metadata_size = 0, opaque = 0x0, err = RD_KAFKA_RESP_ERR_NO_ERROR, _private = 0x0}

The crash doesn't always happen. Looking at the subscriptions, there is a good chance that it is caused by some form of concurrent modification of the subscription list; that would not be impossible, since our consumer does change its subscribed topics from time to time.

Looking at https://github.com/edenhill/librdkafka/blob/d7d58e5407852d17485cf8a82841523eb2b8f6d1/src/rdkafka_roundrobin_assignor.c#L85-L90: it seems that the next++ should be guarded against running past the number of members, for example:

next = (next+1) % member_cnt;

I'm also a bit in doubt about the later part at https://github.com/edenhill/librdkafka/blob/d7d58e5407852d17485cf8a82841523eb2b8f6d1/src/rdkafka_roundrobin_assignor.c#L105: instead of checking against the eligible topic, this should probably use member_cnt as well.
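
To make that concrete, here is a minimal sketch of such a guard (a hypothetical helper, not actual librdkafka code; it assumes rd_kafka_group_member_find_subscription returns non-zero when the member subscribes to the topic, which is what the backtrace above suggests):

/* Hypothetical helper: scan for the next member subscribed to `topic`,
 * wrapping `next` modulo member_cnt so it can never index past the end
 * of the members array. `scanned` bounds the loop in case no member
 * subscribes to the topic at all. */
static int
find_next_subscribed_member (rd_kafka_t *rk,
                             rd_kafka_group_member_t *members,
                             int member_cnt,
                             int *next,
                             const char *topic) {
        int scanned;

        for (scanned = 0 ; scanned < member_cnt ; scanned++) {
                if (rd_kafka_group_member_find_subscription(
                            rk, &members[*next], topic))
                        return *next;
                *next = (*next + 1) % member_cnt;
        }

        return -1; /* no member subscribes to this topic */
}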

Checklist

Please provide the following information:

  • librdkafka version (release number or git tag): v0.11.5 (from pkgconfig)
  • Apache Kafka version: 2.0.0
  • librdkafka client configuration: partition.assignment.strategy=roundrobin (and others)
  • Operating system: Fedora 29/x86_64
  • Provide logs (with debug=.. as necessary) from librdkafka
  • Provide broker log excerpts
  • Critical issue
@edenhill (Contributor)

Can you provide:

  • the number of members and each member's topics subscription
  • the number of partitions for each subscribed topic

This way I can reproduce this locally.

@ankon (Contributor, Author) commented Nov 29, 2018

the number of partitions for each subscribed topic

1 partition per topic.

the number of members and each member's topics subscription

2 members in total. The subscriptions seen in the gdb output seem reasonable to me, in the sense that members[1] was starting up and adding subscriptions, while members[0] was already running with the full set. The crash happened before the second member was fully started.

In other words:
member0 subscriptions: collaborne-commands, collaborne-events, ml1-commands, ml1-events, ml2-commands, ml2-events, ml3-commands, ml3-events, ml4-commands, ml4-events, qa-commands, qa-events
member1 subscriptions: collaborne-events, ml1-events, ml2-events, ml3-events, qa-events

On startup the application incrementally subscribes the consumer to the ...-events topics, and then adds further subscriptions for the linked ...-commands topics a while later. The crash seems to have happened during the startup phase of the second member, so seeing only some of the ...-events topics there "makes complete sense to me" :)
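
For illustration only, the same two-phase pattern expressed against the librdkafka C API would look roughly like the sketch below (our application actually does this through node-rdkafka, so the topic names, the split into two calls, and the helper name are examples rather than the literal application code):

#include <librdkafka/rdkafka.h>

/* `rk` is an already-created consumer handle
 * (rd_kafka_new(RD_KAFKA_CONSUMER, ...)). */
static void subscribe_in_phases (rd_kafka_t *rk) {
        rd_kafka_topic_partition_list_t *topics;

        /* Phase 1: subscribe to the ...-events topics only. */
        topics = rd_kafka_topic_partition_list_new(2);
        rd_kafka_topic_partition_list_add(topics, "collaborne-events",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_topic_partition_list_add(topics, "ml1-events",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_subscribe(rk, topics);
        rd_kafka_topic_partition_list_destroy(topics);

        /* ... a while later ... */

        /* Phase 2: resubscribe with the linked ...-commands topics added.
         * Each rd_kafka_subscribe() call replaces the previous subscription
         * and triggers a rebalance while the other member may still hold
         * the full topic set. */
        topics = rd_kafka_topic_partition_list_new(4);
        rd_kafka_topic_partition_list_add(topics, "collaborne-events",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_topic_partition_list_add(topics, "collaborne-commands",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_topic_partition_list_add(topics, "ml1-events",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_topic_partition_list_add(topics, "ml1-commands",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_subscribe(rk, topics);
        rd_kafka_topic_partition_list_destroy(topics);
}
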
Unfortunately I wasn't able to capture the logs of this run, and I had to learn the hard way that coredumpctl debug (or something related to it) for some reason decided to delete the crash dump on me.
I have also tried to debug the situation with the debug level set to cgrp, but so far haven't been able to capture anything useful there.

@ankon (Contributor, Author) commented Nov 30, 2018

FWIW: I now have a specific container image and a matching coredump, so I can look into the state again. I've also checked how often this happens for us, and it seems to be very frequent, so for now I'll have to switch back to the "range" strategy for our production systems.

@edenhill (Contributor)

It might be worthwhile to recompile librdkafka with --disable-optimization to get better backtraces, and then do the full gdb dance to provide enough information for me to create a reproducible test case.

@wilmai commented Dec 10, 2019

#2650

I ran into this while using roundrobin assignment. It seems to be caused by a race condition where a new topic is picked up by only a subset of the consumers, and the code handling the case of a consumer that is not subscribed to the topic then overruns the members array.

Reproducing (a consumer-side sketch follows these steps):

  • Run 3 consumers in a group, subscribed to "foo.*"
  • Create a topic matching the regex with 3 partitions; alternatively, 3 topics with 1 partition each
  • Wait for the new topic to trigger rebalancing
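
A rough sketch of the consumer side of these steps with the librdkafka C API is below (the broker address, group id, and error handling are placeholders; topics prefixed with "^" are treated as regex subscriptions by librdkafka):

#include <librdkafka/rdkafka.h>

/* Run three instances of this to get three members in the group. */
static void run_consumer (void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();
        rd_kafka_t *rk;
        rd_kafka_topic_partition_list_t *topics;

        rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                          errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "group.id", "repro-group",
                          errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "partition.assignment.strategy", "roundrobin",
                          errstr, sizeof(errstr));

        rk = rd_kafka_new(RD_KAFKA_CONSUMER, conf, errstr, sizeof(errstr));
        rd_kafka_poll_set_consumer(rk);

        /* Regex subscription: matches any topic whose name starts with "foo". */
        topics = rd_kafka_topic_partition_list_new(1);
        rd_kafka_topic_partition_list_add(topics, "^foo.*",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_subscribe(rk, topics);
        rd_kafka_topic_partition_list_destroy(topics);

        /* Keep polling so group rebalances are served; creating a new
         * matching topic on the broker then triggers the rebalance. */
        while (1) {
                rd_kafka_message_t *rkm = rd_kafka_consumer_poll(rk, 1000);
                if (rkm)
                        rd_kafka_message_destroy(rkm);
        }
}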

@edenhill (Contributor)

Okay, so it sounds like it is a problem with asymmetric subscriptions; we should add a unit test to trigger this.

@takaomag commented Feb 5, 2020

When is this expected to be completed/closed?
I want to avoid the uneven message distribution caused by the range strategy.

@edenhill (Contributor) commented Feb 5, 2020

This will be addressed after the transactions PR has been merged.

wilmai added a commit to wilmai/librdkafka that referenced this issue Feb 12, 2020

Check that the next roundrobin consumer is in the topic's eligible list; otherwise select the next eligible consumer.
The previous code overflowed members[next] and crashed.
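
In rough terms, the intent of that fix is sketched below (not the literal patch; `topic` stands for the eligible topic currently being assigned and `partition` for its partition index, with the other names taken from the backtrace and gdb output above):

/* Only hand the partition to members[next] if that member actually
 * subscribes to `topic`; otherwise advance with wrap-around to the next
 * member that does, so members[next] can never run past the array.
 * This assumes every eligible topic has at least one subscribed member,
 * so the scan terminates. */
while (!rd_kafka_group_member_find_subscription(rk, &members[next], topic))
        next = (next + 1) % member_cnt;

rd_kafka_topic_partition_list_add(members[next].rkgm_assignment,
                                  topic, partition);
next = (next + 1) % member_cnt;
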
edenhill added this to the v1.5.0 milestone Mar 12, 2020
edenhill added a commit that referenced this issue May 4, 2020
This also adds declarative unit tests of the assignors.
edenhill added a commit that referenced this issue May 5, 2020
This also adds declarative unit tests of the assignors.
edenhill closed this as completed May 6, 2020