Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker]A failed consumer/producer future in ServerCnx can never be removed #23123

Merged

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Aug 4, 2024

Motivation

Conditions under which problems occur

  • Failover/Exclusive subscription
  • Sockets of Pulsar IO are not stable
  • ServerCnx thread is busy.
time registering consumer 1st registering consumer 2nd
1 Registered
2 Socket is unstable
3 Closing ServerCnx
4 Retry to subscribe
5 Check whether the previous consumer is still active[1], if not, override it by the current registering
6 ServerCnx is closed
7 Call serverCnx.checkConnectionLiveness, the Completable Future returned will never be completed
8 Client-side: call ClientCnx.removeConsumer after the subscribing is timeout
9 There is a failed future in ServerCnx.consumers

[1]: AbstractDispatcherSingleActiveConsumer.java

 public synchronized CompletableFuture<Void> addConsumer(Consumer consumer) {

  if (getActiveConsumer() != null) {
      return actConsumer.cnx().checkConnectionLiveness().thenCompose(actConsumerStillAlive -> {
          if (actConsumerStillAlive.isEmpty() || actConsumerStillAlive.get()) {
              return FutureUtil.failedFuture(new ConsumerBusyException("Exclusive consumer is already"
                      + " connected"));
          } else {
              return addConsumer(consumer);
          }
      });
  }
}

Affects

It leads to the following subscribing can never succeed due to the ServerCnx maintaining a failed future and it can never be removed.

Error logs

2024-07-25T09:28:00,109+0000 [pulsar-io-4-30] WARN  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.2:48345][persistent://{tenant}/{namespace}/__change_events-partition-0][multiTopicsReader-381f38cdef] A failed consumer with id is already present on the connection, consumerId=7425
2024-07-25T09:28:00,109+0000 [pulsar-io-4-30] ERROR org.apache.pulsar.broker.service.ServerCnx - A failed consumer with id is already present on the connection. consumerId: 7425, remoteAddress: /127.0.0.2:48345, subscription: multiTopicsReader-381f38cdef
...
...
...
// loop

Modifications

Fix the bug.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode added type/bug The PR fixed a bug or issue reported a bug release/2.11.5 release/3.0.7 release/3.3.2 labels Aug 4, 2024
@poorbarcode poorbarcode added this to the 3.4.0 milestone Aug 4, 2024
@poorbarcode poorbarcode self-assigned this Aug 4, 2024
Copy link

github-actions bot commented Aug 4, 2024

@poorbarcode Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@poorbarcode poorbarcode requested a review from lhotari August 4, 2024 16:49
Copy link
Member

@RobertIndie RobertIndie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@poorbarcode
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@poorbarcode poorbarcode changed the title [fix] [broker] A failed consumer/producer future in ServerCnx can never be removed [fix][broker]A failed consumer/producer future in ServerCnx can never be removed Aug 5, 2024
@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Aug 5, 2024
@poorbarcode poorbarcode requested review from gaoran10 and RobertIndie and removed request for RobertIndie August 5, 2024 02:45
@poorbarcode poorbarcode requested a review from RobertIndie August 5, 2024 03:08
@poorbarcode poorbarcode requested a review from lhotari August 5, 2024 08:26
@codecov-commenter
Copy link

codecov-commenter commented Aug 5, 2024

Codecov Report

Attention: Patch coverage is 69.23077% with 4 lines in your changes missing coverage. Please review.

Project coverage is 74.54%. Comparing base (bbc6224) to head (ae26a3e).
Report is 494 commits behind head on master.

Files Patch % Lines
...va/org/apache/pulsar/broker/service/ServerCnx.java 69.23% 3 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #23123      +/-   ##
============================================
+ Coverage     73.57%   74.54%   +0.96%     
- Complexity    32624    34063    +1439     
============================================
  Files          1877     1919      +42     
  Lines        139502   144143    +4641     
  Branches      15299    15755     +456     
============================================
+ Hits         102638   107446    +4808     
+ Misses        28908    28485     -423     
- Partials       7956     8212     +256     
Flag Coverage Δ
inttests 27.83% <30.76%> (+3.25%) ⬆️
systests 24.66% <0.00%> (+0.34%) ⬆️
unittests 73.89% <69.23%> (+1.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...va/org/apache/pulsar/broker/service/ServerCnx.java 72.76% <69.23%> (+0.62%) ⬆️

... and 484 files with indirect coverage changes

@poorbarcode poorbarcode requested review from lhotari and removed request for lhotari August 5, 2024 16:23
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@poorbarcode poorbarcode merged commit 114880b into apache:master Aug 6, 2024
51 checks passed
poorbarcode added a commit that referenced this pull request Aug 6, 2024
poorbarcode added a commit that referenced this pull request Aug 6, 2024
poorbarcode added a commit that referenced this pull request Aug 6, 2024
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 8, 2024
… be removed (apache#23123)

(cherry picked from commit 114880b)
(cherry picked from commit 2d162ba)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 12, 2024
… be removed (apache#23123)

(cherry picked from commit 114880b)
(cherry picked from commit 2d162ba)
Denovo1998 pushed a commit to Denovo1998/pulsar that referenced this pull request Aug 17, 2024
grssam pushed a commit to grssam/pulsar that referenced this pull request Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants