[fix][broker] The topic might reference a closed ledger #22860
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #22860 +/- ##
============================================
- Coverage 73.57% 73.27% -0.30%
- Complexity 32624 32708 +84
============================================
Files 1877 1891 +14
Lines 139502 141982 +2480
Branches 15299 15571 +272
============================================
+ Hits 102638 104040 +1402
- Misses 28908 29931 +1023
- Partials 7956 8011 +55
(cherry picked from commit a91a172)
…pache#22860) (apache#22900) (cherry picked from commit 8be3e8a)
@shibd I have a concern about this PR. After this PR, the process is "topicFuture timeout -> close topic -> remove topicFuture". During "close topic", the client receives the timeout response immediately and retries. That means that if the topicFuture times out, the client may send a large number of retry requests to the server, which does not seem good. I have tested this and can see the client retrying frequently in the log.
Yes, you are right. Maybe we need to improve the client so that it is not so aggressive in retrying when it gets a timeout exception. Or we should completely refactor the broker's behavior in topic load:
Anyway, the current solution will not leave the topic in an unusable state (linked to a closed ledger). WDYT?
This way seems better.
Could you cherry-pick this PR into
Hi @poorbarcode, for branch-2.11, I think we can use this fix. It is a simple solution. Do you agree?
@shibd I have another question. Could the following situation occur: the broker's memory usage is high, and the persistentTopic object is garbage-collected between its creation and its addition to the reference map, so that the topicFuture never completes?
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java Lines 1746 to 1787 in 47f204f
(cherry picked from commit a91a172) Signed-off-by: Zixuan Liu <nodeces@gmail.com>
Motivation
We observe that a `normal topic` might reference a `closed ledger` and never recover automatically, which causes producers and consumers to get stuck.
The root cause is related to `topic create timeout`. Assume we have two concurrent requests to create a topic: `firstCreate(Thread-1)` and `secondCreate(Thread-2)`.
This can result in the `old ledger` being referenced by the `new topic` while that ledger's state is closed.
When the `firstCreate` topic times out, it calls `topic.close`, which closes the ledger and removes it from the ledger cache.
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java
Lines 1745 to 1752 in f07b3a0
If the `secondCreate` request successfully creates a topic before the `old ledger closes`, the new topic will reference the `old ledger`.
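The interleaving described above can be replayed deterministically in a small sketch. This is not the broker code: `Ledger`, `Topic`, `ledgerCache`, `createTopic`, and `closeTopic` are hypothetical stand-ins, and the two "threads" are replayed as sequential steps so that the outcome does not depend on scheduling.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClosedLedgerRaceSketch {
    /** Hypothetical stand-in for a managed ledger. */
    static final class Ledger {
        volatile boolean closed;
    }

    /** Hypothetical stand-in for a topic that holds a ledger reference. */
    static final class Topic {
        final Ledger ledger;

        Topic(Ledger ledger) {
            this.ledger = ledger;
        }

        boolean usable() {
            return !ledger.closed;
        }
    }

    // Hypothetical ledger cache keyed by topic name.
    static final Map<String, Ledger> ledgerCache = new ConcurrentHashMap<>();

    /** Creates a topic, reusing the cached ledger if one exists. */
    static Topic createTopic(String name) {
        Ledger ledger = ledgerCache.computeIfAbsent(name, k -> new Ledger());
        return new Topic(ledger);
    }

    /** Closes the topic's ledger and removes it from the cache. */
    static void closeTopic(String name, Topic topic) {
        topic.ledger.closed = true;
        ledgerCache.remove(name, topic.ledger);
    }

    /** Replays the race step by step; returns whether the surviving topic is usable. */
    static boolean replayRace() {
        ledgerCache.clear();
        String name = "persistent://tenant/ns/topic";
        Topic first = createTopic(name);   // firstCreate opens the ledger, then times out
        Topic second = createTopic(name);  // secondCreate picks up the cached (old) ledger
        closeTopic(name, first);           // timeout handling closes the old ledger
        return second.usable();
    }

    public static void main(String[] args) {
        System.out.println("second topic usable: " + replayRace());
    }
}
```

In this replay the second topic ends up holding a closed ledger, and nothing in the sketch ever replaces it, which mirrors the stuck state described in the motivation.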
Modifications
Refactor the `BrokerService#getTopic()` method:
`pulsar.getExecutor().execute(() -> topics.remove(topicName.toString()));`
`topicFuture` that was created by the previous get-topic operation. If the previous topicFuture has not been removed from the map yet, the broker should always use the existing topicFuture (waiting for its completion or returning an error to the client side directly).
`not include topicFuture`, because the actual future object placed in the map might not be the same as the `topicFuture`. You can check this code to validate it.
Verifying this change
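Separately from the PR's own tests, the map handling described under Modifications can be checked in isolation. The following is a minimal sketch under stated assumptions, not the actual `BrokerService` code: the `topics` map holds `String` values instead of real topics, and `getTopic` and `failAndCleanUp` are hypothetical helpers. It follows the order mentioned in the conversation (timeout, then close, then remove the future) and checks that a request arriving during the close sees the existing future rather than a new one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeoutException;

public class TopicFutureReuseSketch {
    // Hypothetical stand-in for the broker's topic map.
    static final ConcurrentHashMap<String, CompletableFuture<String>> topics =
            new ConcurrentHashMap<>();

    /** Returns the future already in the map, or registers a new pending one. */
    static CompletableFuture<String> getTopic(String name) {
        return topics.computeIfAbsent(name, k -> new CompletableFuture<>());
    }

    /** On timeout: fail the future, run the close step, and only then drop the map entry. */
    static void failAndCleanUp(String name, CompletableFuture<String> future, Runnable close) {
        future.completeExceptionally(new TimeoutException("topic load timed out"));
        close.run();                  // resources are released first
        topics.remove(name, future);  // the entry is removed only after close has finished
    }

    /** Returns true if requests during close reuse the old future and later ones get a new one. */
    static boolean demo() {
        topics.clear();
        String name = "persistent://tenant/ns/topic";
        CompletableFuture<String> first = getTopic(name);

        boolean[] reusedDuringClose = new boolean[1];
        // The close step simulates a concurrent request arriving while the topic is closing.
        failAndCleanUp(name, first, () -> reusedDuringClose[0] = (getTopic(name) == first));

        CompletableFuture<String> afterCleanup = getTopic(name);
        return reusedDuringClose[0]
                && first.isCompletedExceptionally()
                && afterCleanup != first;
    }

    public static void main(String[] args) {
        System.out.println("existing future reused during close: " + demo());
    }
}
```

The conditional `topics.remove(name, future)` removes the entry only if the map still holds that exact future, which fits the note above that the object in the map might not be the same as the local `topicFuture`.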
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: shibd#37