-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improve][broker] Reduce the CPU pressure from the transaction buffer in rolling restarts #23062
[improve][broker] Reduce the CPU pressure from the transaction buffer in rolling restarts #23062
Conversation
31d847c
to
79430fc
Compare
This improvement for the reader side can be verified by the following example. public class LoadTopicTest extends ProducerConsumerBase {
@BeforeClass
@Override
protected void setup() throws Exception {
conf.setDefaultNumberOfNamespaceBundles(1);
conf.setNumTransactionReplayThreadPoolSize(1);
conf.setTransactionCoordinatorEnabled(true);
super.isTcpLookup = true;
super.internalSetup();
super.producerBaseSetup();
}
@AfterClass
@Override
protected void cleanup() throws Exception {
super.internalCleanup();
}
@Test
public void test() throws Exception {
final var topic = "test";
admin.topics().createPartitionedTopic(topic, 5);
@Cleanup final var producer = pulsarClient.newProducer().topic(topic + "-partition-0").create();
Thread.sleep(1000);
}
} Save the outputs into
Even though the After this change, the logs become
|
Mark it as draft to avoid repeated creation of reader. It's better to maintain a long live reader. |
…ders and writers in rolling restarts ### Motivation During the rolling restarts, the namespace bundle ownerships will change. Assuming there is a producer created on a single topic, and the ownership was transferred to the new broker. Assuming the namespace bundle has N topics and the namespace is `tenant/ns`, 1. All N topics in the same bundle of that topic will be loaded. 2. For each topic, the managed ledger will be initialized, when the transaction coordinator is enabled, a `TopicTransactionBuffer` will be created. 2.1 A Pulsar producer will be created on `tenant/ns/__transaction_buffer_snapshot` concurrently. 2.2 A Pulsar reader will be created on `tenant/ns/__transaction_buffer_snapshot` concurrently. 3. Once all N readers are created, the owner of the snapshot topic will start dispatching messages to N readers. Each dispatcher will read messages from BookKeeper concurrently and might fail with too many requests error because BK can only have `maxPendingReadRequestsPerThread` pending read requests (default: 10000). We have a `numTransactionReplayThreadPoolSize` config to limit the concurrency of transaction snapshot readers. However, it only limits the read loop. For example, if it's configured with 1, only 1 reader could read messages at the same time. However, N readers will be created concurrently. Each when one of these reader explicitly calls `readNext`, all N dispatchers at brokers side will dispatch messages to N readers. The behaviors above brings much CPU pressure on the owner broker, especially for a small cluster with only two brokers. ### Modifications - Synchronize the reader creation, read loop and the following process on its result. Maintain only one reader for each namespace.
9378fa9
to
ffcc578
Compare
@liangyepianzhou @congbobo184 @gaoran10 Please take a look |
ef22430
to
3b192a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #23062 +/- ##
============================================
- Coverage 73.57% 73.47% -0.11%
- Complexity 32624 33530 +906
============================================
Files 1877 1919 +42
Lines 139502 144086 +4584
Branches 15299 15741 +442
============================================
+ Hits 102638 105860 +3222
- Misses 28908 30101 +1193
- Partials 7956 8125 +169
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Merge it first. |
… in rolling restarts (apache#23062) (cherry picked from commit 40c8c23) (cherry picked from commit 53fb549)
… in rolling restarts (apache#23062) (cherry picked from commit 40c8c23) (cherry picked from commit 53fb549)
Motivation
During the rolling restarts, the namespace bundle ownerships will change. Assuming there is a producer created on a single topic, and the ownership was transferred to the new broker. Assuming the namespace bundle has N topics and the namespace is
tenant/ns
,TopicTransactionBuffer
will be created. A Pulsar reader will be created ontenant/ns/__transaction_buffer_snapshot
concurrently.maxPendingReadRequestsPerThread
pending read requests (default: 10000).We have a
numTransactionReplayThreadPoolSize
config to limit the concurrency of transaction snapshot readers. However, it only limits the read loop. For example, if it's configured with 1, only 1 reader could read messages at the same time. However, N readers will be created concurrently. Each when one of these reader explicitly callsreadNext
, all N dispatchers at brokers side will dispatch messages to N readers.The behaviors above brings much CPU pressure on the owner broker, especially for a small cluster with only two brokers.
Modifications
Synchronize the reader creation, read loop and the following process on its result. Maintain only one reader for each namespace. The reader is now not closed unless there is no snapshot read request in 1 minute.
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: BewareMyPower#32