Describe the bug
Pulsar and BookKeeper version:
pulsar-2.8.0, with the BookKeeper bundled in pulsar-2.8.0
Cluster of 5 brokers and 5 bookies
To figure out the cause of the Pulsar broker's direct memory OOM, I tested different scenarios and got different results.
After analyzing the Pulsar broker heap dump, I found a large number of PendingAddOp instances that had not been recycled or destroyed.
As shown in the figure below, I suspect that many entry write requests sent to the bookies never received all of their WQ responses, which prevents the PendingAddOp instances from being recycled or destroyed.
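To make the suspicion concrete, the sketch below (a simplified illustration, not the actual org.apache.bookkeeper.client.PendingAddOp code; names are illustrative) shows how an add operation whose callback fires after AQ acks could still hold its payload buffer until every WQ response has arrived:

```java
import io.netty.buffer.ByteBuf;
import java.util.BitSet;

// Simplified illustration only; the real PendingAddOp goes through Netty's
// Recycler and has more states than shown here.
class SimplifiedAddOp {
    private final ByteBuf payload;               // pinned direct memory
    private final int writeQuorum;               // WQ
    private final int ackQuorum;                 // AQ
    private final BitSet responded = new BitSet();
    private boolean callbackFired = false;

    SimplifiedAddOp(ByteBuf payload, int writeQuorum, int ackQuorum) {
        this.payload = payload.retain();         // the op keeps a reference while in flight
        this.writeQuorum = writeQuorum;
        this.ackQuorum = ackQuorum;
    }

    // Called once per bookie response for this entry.
    void onBookieResponse(int bookieIndex) {
        responded.set(bookieIndex);
        if (!callbackFired && responded.cardinality() >= ackQuorum) {
            callbackFired = true;                // the add is acknowledged to the writer here
        }
        if (responded.cardinality() >= writeQuorum) {
            payload.release();                   // only now can the buffer be freed and the op recycled
        }
    }
}
```

If this is what happens, then with E:W:A = 3:3:2 the writer keeps producing as soon as 2 bookies respond, while the buffers of entries still waiting for the third response accumulate in direct memory.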
Therefore, following apache#7406 and apache#6178, I used maxMessagePublishBufferSizeInMB to limit the traffic handled by the broker.
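My understanding of that setting, shown as a rough sketch below (illustrative names, not Pulsar's actual implementation), is that the broker tracks the bytes of publish requests that have been received but not yet persisted, and pauses reading from producer connections once the configured limit is exceeded:

```java
import java.util.concurrent.atomic.AtomicLong;

// Rough sketch of a publish-buffer limiter; class and method names are illustrative.
class PublishBufferLimiter {
    private final long maxBufferBytes;                 // maxMessagePublishBufferSizeInMB * 1024 * 1024
    private final AtomicLong pendingBytes = new AtomicLong();

    PublishBufferLimiter(long maxBufferBytes) {
        this.maxBufferBytes = maxBufferBytes;
    }

    // Called when a publish request arrives from a producer.
    // Returns true if the broker should stop reading from producer channels.
    boolean onMessageReceived(long msgSize) {
        return pendingBytes.addAndGet(msgSize) > maxBufferBytes;
    }

    // Called when the write has been acknowledged by the bookies.
    // Returns true if the broker can resume reading from producer channels.
    boolean onMessagePersisted(long msgSize) {
        return pendingBytes.addAndGet(-msgSize) <= maxBufferBytes / 2;
    }
}
```

If the pending counter is decremented as soon as the add is acknowledged (i.e. after AQ responses) rather than after all WQ responses, the limit would not cover the buffers still referenced by un-recycled PendingAddOp instances, which would be consistent with the results below; but that is only my guess.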
Here are my test results:
- maxMessagePublishBufferSizeInMB=512, E:W:A = 3:3:2: OOM still occurs after the pressure test.
- maxMessagePublishBufferSizeInMB=512, E:W:A = 3:3:3, 3:2:2, and 2:2:2: direct memory is normal after the pressure test.
- maxMessagePublishBufferSizeInMB=2048, E:W:A = 3:3:3 and 3:2:2: direct memory is normal after the pressure test.
- maxMessagePublishBufferSizeInMB left at the default (1/2 of the maximum direct memory, i.e. 8/2 = 4 GB in this test), E:W:A = 3:3:3 and 3:2:2: direct memory is normal after the pressure test.
- maxMessagePublishBufferSizeInMB=-1 (throttling disabled), E:W:A = 3:3:3 and 3:2:2: direct memory is normal after the pressure test.
- maxMessagePublishBufferSizeInMB=-1 (throttling disabled), E:W:A = 3:3:2: OOM occurs after the pressure test.
The following questions are also related to apache#9562.
My question is: regardless of whether maxMessagePublishBufferSizeInMB is configured,
as long as AQ = WQ, direct memory stays normal, and
as long as AQ < WQ, direct memory ends up in OOM.
This may be related to the bookies' processing logic, but how does maxMessagePublishBufferSizeInMB actually work?
Apart from the different E:W:A configurations, all tests use the same configuration and only involve writes, with no consumption.
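For clarity, E:W:A here means ensemble size : write quorum (WQ) : ack quorum (AQ). The snippet below (illustrative only; in practice the broker sets these values through its managed-ledger configuration) shows roughly how a 3:3:2 setting maps onto the BookKeeper client API:

```java
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerHandle;

public class QuorumExample {
    public static void main(String[] args) throws Exception {
        // Illustrative ZooKeeper connect string.
        BookKeeper bk = new BookKeeper("zk1:2181");

        // ensemble = 3, write quorum = 3, ack quorum = 2  ->  E:W:A = 3:3:2
        LedgerHandle lh = bk.createLedger(3, 3, 2,
                DigestType.CRC32, "password".getBytes());

        // addEntry() returns once 2 of the 3 bookies (AQ) have acknowledged,
        // even though the entry was sent to all 3 (WQ); the slowest bookie's
        // response may still be outstanding at that point.
        long entryId = lh.addEntry("hello".getBytes());

        lh.close();
        bk.close();
    }
}
```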
Original Issue: apache#12169
Workload YAML:
topics: 1
partitionsPerTopic: 2
messageSize: 1024
payloadFile: "payload/payload-1Kb.data"
subscriptionsPerTopic: 0
consumerPerSubscription: 0
producersPerTopic: 2
producerRate: 880000000
consumerBacklogSizeGB: 0
testDurationMinutes: 60