Consumer gets stuck after temporary network connection drops #2363
Comments
As a start, let socket.timeout.ms be at least max(fetch.wait.max.ms, session.timeout.ms) + 1000 (preferably more).
Secondly, I'd need some more debugging; the current debug level seems insufficient — could you capture logs with debug set to 'consumer,cgrp,topic,fetch,broker'?
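For illustration, a minimal configuration sketch that follows this advice (assuming the confluent-kafka Python client; the broker address, group id, and session.timeout.ms value are placeholders, not from this issue):

```python
from confluent_kafka import Consumer

# Placeholder values for illustration only: socket.timeout.ms is kept at
# least 1000 ms above max(fetch.wait.max.ms, session.timeout.ms).
conf = {
    'bootstrap.servers': 'broker1:9092',            # placeholder
    'group.id': 'example-group',                    # placeholder
    'fetch.wait.max.ms': 2500,
    'session.timeout.ms': 10000,
    'socket.timeout.ms': 11000,                     # >= max(2500, 10000) + 1000
    'debug': 'consumer,cgrp,topic,fetch,broker',    # debug contexts requested above
}
consumer = Consumer(conf)
```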
Thanks for your response! We rolled out a new configuration to all clients yesterday, reducing fetch.wait.max.ms from 5000 to 2500. I'm very much hoping the issue was caused by this! Yes, we logged using 'consumer,cgrp,topic,fetch'. If the problem still occurs, I will replace the attached log entries with an extract captured using 'consumer,cgrp,topic,fetch,broker'. So far it hasn't happened again, but it's too early to say; by the end of the day it should be clearer. We always had at least 2 occurrences per day.
Unfortunately it happened again today; 3 clients were affected. Logging was not enabled. We will enable it using 'consumer,cgrp,topic,fetch,broker' and hope to catch an event soon.
Alright, it happened again on a client that had logging activated. At 2019-06-21 09:32:37 the client was restarted and then successfully received messages again. @edenhill, is there something you can see in the logs? I tried, but I'm not that experienced at interpreting them.
This issue is really hitting us badly... I'm thinking of implementing a workaround, like triggering an automatic reconnect. But when would be a suitable time to reconnect, without causing additional delay once the connection recovers? Maybe I could use the Admin Client to periodically check whether brokers are gone/available and then have the consumer reconnect. Or is there a better way to detect whether the consumer connection is up or down? Perhaps the cgrp part of the librdkafka statistics could be used to detect whether partitions are assigned?
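A rough sketch of that statistics-based idea (assuming the confluent-kafka Python client; the topic name, broker address, 5-minute threshold, and recreate-the-consumer recovery action are all arbitrary placeholders, not something from this issue; assignment_size is the partition count reported in the cgrp section of librdkafka's statistics):

```python
import json
import time
from confluent_kafka import Consumer

last_assigned = time.time()

def stats_cb(stats_json):
    """Track when the consumer group last had partitions assigned."""
    global last_assigned
    stats = json.loads(stats_json)
    # 'assignment_size' in the 'cgrp' section is the current number of
    # assigned partitions (see librdkafka's STATISTICS.md).
    if stats.get('cgrp', {}).get('assignment_size', 0) > 0:
        last_assigned = time.time()

conf = {
    'bootstrap.servers': 'broker1:9092',   # placeholder
    'group.id': 'example-group',           # placeholder
    'statistics.interval.ms': 10000,
    'stats_cb': stats_cb,
}

consumer = Consumer(conf)
consumer.subscribe(['important-topic'])    # placeholder topic

while True:
    msg = consumer.poll(1.0)
    if msg is not None and msg.error() is None:
        pass  # process msg.value() here
    # If no partitions have been assigned for 5 minutes (arbitrary threshold),
    # tear the consumer down and recreate it as a crude recovery.
    if time.time() - last_assigned > 300:
        consumer.close()
        consumer = Consumer(conf)
        consumer.subscribe(['important-topic'])
        last_assigned = time.time()
```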
Those symptoms sound exactly like the problem we are seeing. We are using librdkafka (v1.0.1) on Windows (built with OpenSSL) and connecting via SSL with a client certificate to the Kafka cluster. We have 1000+ clients delivering log data, and many of them stop delivering any data to the Kafka cluster. We don't know yet what triggers the problem. Unfortunately we couldn't activate verbose logging on the application so far. However, I was able to profile some of the affected applications and noticed that about 1-4 of the librdkafka threads run in a more or less tight loop, eating up about one CPU core each. Once we have debug logging for one of the problematic applications I will add the information here. We are also rolling out 1.1.0 right now. This problem is very critical for us as well.
I didn't realize that 1.1.0 had been released in the meantime. We might consider updating from 1.0.1 to 1.1.0 as well; not sure whether that change might be related to our problem. @haegele-tv we're not using SSL. CPU looked good, including during the time the problem occurred (we log it); we never go over 50%. Producing is not an issue in our case, but the consumer stops receiving messages.
We see a similar issue where the consumer cannot consume from a partition when the client-to-broker connection drops or when there is a leader change at the broker. TestNode is a compacted topic. After a restart, the consumer can start consuming again. Can you help me understand this?
@edenhill any suggestions on what is going wrong here?
Any update on this? I suspect we have hit this issue as well. We are using node-rdkafka v2.7.4, which uses librdkafka v1.2.2.
The issue is that even though it looks up the proper offset, it doesn't actually use it for the next fetch, but rather OFFSET_INVALID. This has been fixed in da7a0a0#diff-ede56b68a6b39a69e16aea71fd6952b0R1958, which was included in librdkafka v1.3.0.
I suspect I encountered a similar problem on the producer side. We use Vector, which depends on librdkafka 1.2.1.
There was a performance regression in v1.2.1 that was fixed in v1.2.2. |
We only have 10+ messages/second; the producer got stuck in an endless loop on a broken TCP connection without reconnecting.
Hi @tarvip. How are you reconnecting to the server on disconnect? You might be interested in a post I published today with a workaround at the app layer: It is only a problem if you are not already using a callback in the connect() method. A nasty and rare one, but it hit us the other day. I am about to file another issue here with another reconnection scenario, but I am in the middle of gathering the logs for the report. I had to write a script which periodically enables/disables the firewall port to Kafka and DNS to simulate a network outage, but it paid off, as I managed to reproduce the condition earlier today outside of production. Unfortunately I don't have a workaround for this one, as there appear to be no errors at the app layer that I can hook into for recovery. I will reference the issue number here once I have posted it.
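For reference, a sketch of that outage-simulation idea (assuming a Linux host with iptables, a Kafka broker on port 9092, and root privileges; the DNS blocking mentioned above is omitted, and all timings are arbitrary):

```python
import subprocess
import time

# iptables rule that drops outgoing traffic to the (assumed) broker port 9092.
RULE = ['OUTPUT', '-p', 'tcp', '--dport', '9092', '-j', 'DROP']

def block_kafka():
    # Add the DROP rule to simulate a network outage toward Kafka.
    subprocess.run(['iptables', '-A'] + RULE, check=True)

def unblock_kafka():
    # Remove the rule again so the connection can recover.
    subprocess.run(['iptables', '-D'] + RULE, check=True)

while True:
    block_kafka()
    time.sleep(60)     # outage window (arbitrary)
    unblock_kafka()
    time.sleep(300)    # recovery window (arbitrary)
```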
Issue posted here:
Description
We have about 70 clients that connect to a 3-broker Kafka cluster via WiFi. The clients are mobile, so the network connection can drop for a short time (access point roaming) or a longer time (when moved out of range). All clients have the same configuration. They subscribe to two topics: the important one has a single partition, the other has 3 partitions.
The problem is that sometimes a client does not recover its connection. The topics do not get assigned anymore, even when the connection is restored and stable for a long time. Only an application restart helps; after the restart, all queued messages are successfully received from the broker. Unfortunately we are hitting this issue in production!
After looking into the logs, it seems that the metadata request fails, so the assignment is not triggered. I was hoping we were running into #2266, but the problem still persists after updating to version 1.0.1.
I noticed the log entry complaining about fetch.wait.max.ms not being at least 1000 ms shorter than socket.timeout.ms. We changed fetch.wait.max.ms to 2500 ms and rolled it out to a couple of clients, but I'm not sure this is the cause of the issue. I would expect individual requests to fail, but not the consumer to get stuck without recovering.
A log extract is included below. At 2019-06-14 08:43:30 the client was restarted, and the messages were received successfully after that.
How to reproduce
I was not able to reliably reproduce the issue. It seems to be related to bad network quality, but does not always occur. Over the course of a day, about 3-5 out of ~70 random clients encounter this issue.
Checklist
- librdkafka version: 1.0.1
- Apache Kafka version: 2.0.0
- librdkafka client configuration: fetch.wait.max.ms=5000, socket.timeout.ms=5000, bootstrap.servers=xxx, group.id=136-Group, log.connection.close=False, auto.offset.reset=latest, enable.auto.offset.store=False, enable.auto.commit=True, socket.keepalive.enable=True
- Operating system: Ubuntu 18.04 x64
- Logs (with debug=.. as necessary) from librdkafka: log-kafka.txt