[Fix][broker] Fix NPE when ledger id not found in OpReadEntry
#15837
Conversation
@mattisonchao: Thanks for your contribution. For this PR, do we need to update docs?
/pulsarbot run-failure-checks
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
After discussing with @Technoboy-, we may find why the Precondition
Cases
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java Lines 1618 to 1659 in 5455f4d
After line 1643, we may have empty ledgers.
And because the |
LGTM
Maybe we could add a unit test to reproduce this problem, for example, reading a position greater than the max ledger id that exists in
@gaoran10 Got it, I will do that.
Done, PTAL.
(cherry picked from commit 7a3ad61)
move |
Motivation
In the current implementation, there is more than one potential NPE related to the ManagedLedgerImpl#startReadOperationOnLedger method.
1. Unbox NPE
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
Lines 2230 to 2244 in b13d15c
According to this code, when the ledger id is null, we fail the OpReadEntry. However, there is no return after the failure, so an NPE is thrown on line 2238 when we try to unbox null.
2. null OpReadEntry context
According to the code above, when we fail the OpReadEntry, we pass a null value as the context (line 2235). This causes an NPE when the dispatcher gets the callback.
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherSingleActiveConsumer.java
Lines 478 to 484 in 0975cdc
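The unboxing problem from point 1 can be reproduced in isolation. The following is a minimal sketch, not the Pulsar code itself: the class `UnboxNpeSketch`, both method names, and the `failures` buffer are hypothetical stand-ins, and a plain `TreeMap` plays the role of the ledger map. It only illustrates that comparing a `long` with a null `Long` throws, and that returning right after reporting the failure avoids it.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class UnboxNpeSketch {
    // Buggy shape (hypothetical): the failure is reported, but execution continues.
    static boolean buggyNeedsSwitch(NavigableMap<Long, String> ledgers,
                                    long readLedgerId, StringBuilder failures) {
        // ceilingKey returns null when no key >= readLedgerId exists.
        Long ledgerId = ledgers.ceilingKey(readLedgerId);
        if (ledgerId == null) {
            failures.append("failed;"); // no return after reporting the failure
        }
        // Comparing long with Long auto-unboxes ledgerId; null throws NullPointerException.
        return readLedgerId != ledgerId;
    }

    // Safe shape (hypothetical): return immediately once the failure is reported.
    static boolean safeNeedsSwitch(NavigableMap<Long, String> ledgers,
                                   long readLedgerId, StringBuilder failures) {
        Long ledgerId = ledgers.ceilingKey(readLedgerId);
        if (ledgerId == null) {
            failures.append("failed;");
            return false;
        }
        return readLedgerId != ledgerId;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> ledgers = new TreeMap<>();
        ledgers.put(3L, "ledger-3");
        StringBuilder failures = new StringBuilder();
        // Ledger id 10 is greater than every known ledger id.
        System.out.println(safeNeedsSwitch(ledgers, 10L, failures));
        try {
            buggyNeedsSwitch(ledgers, 10L, failures);
        } catch (NullPointerException e) {
            System.out.println("NPE from unboxing null");
        }
    }
}
```

The same reasoning applies to point 2: any value handed to a callback as its context should be non-null if the receiver dereferences it.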
3. OpReadEntry#create
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Lines 48 to 63 in 9376128
Suppose we create an OpReadEntry and then call ManagedCursor#startReadOperationOnLedger while the ledger id is null. At this point, we fail the OpReadEntry, but its other fields are not yet initialized. When OpReadEntry#readEntriesFailed is called, the cursor is null and an NPE is thrown.
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Lines 90 to 94 in 9376128
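The initialization-order problem from point 3 can be sketched as follows. This is an illustration under stated assumptions, not the actual OpReadEntry: the class `ReadOpSketch`, its `Cursor` interface, and the methods `createBuggy`, `create`, and `read` are all hypothetical names. It shows a factory that fails the operation before its fields are assigned, and an alternative in which the factory only initializes the object and the check happens later, at read time.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class ReadOpSketch {
    /** Hypothetical stand-in for the cursor the operation belongs to. */
    interface Cursor {
        String name();
    }

    Cursor cursor;   // assigned by the factory
    long ledgerId;   // may be illegal; checked at read time
    String failure;  // last recorded failure, for illustration only

    // The failure path dereferences the cursor, like a failure callback would.
    void fail(String reason) {
        failure = cursor.name() + ": " + reason;
    }

    // Buggy shape: fails the operation before its fields are assigned.
    static ReadOpSketch createBuggy(Cursor cursor, long ledgerId,
                                    NavigableMap<Long, String> ledgers) {
        ReadOpSketch op = new ReadOpSketch();
        if (ledgers.ceilingKey(ledgerId) == null) {
            op.fail("ledger not found"); // op.cursor is still null here
        }
        op.cursor = cursor;
        op.ledgerId = ledgerId;
        return op;
    }

    // Alternative shape: the factory only initializes the object.
    static ReadOpSketch create(Cursor cursor, long ledgerId) {
        ReadOpSketch op = new ReadOpSketch();
        op.cursor = cursor;
        op.ledgerId = ledgerId;
        return op;
    }

    // Validation happens at read time, on a fully initialized operation.
    static boolean read(ReadOpSketch op, NavigableMap<Long, String> ledgers) {
        if (ledgers.ceilingKey(op.ledgerId) == null) {
            op.fail("ledger not found");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> ledgers = new TreeMap<>();
        ledgers.put(3L, "ledger-3");
        ReadOpSketch op = create(() -> "cursor-1", 10L);
        System.out.println(read(op, ledgers) + " / " + op.failure);
    }
}
```

Deferring the check keeps the operation object a plain carrier of state, so every failure path sees a non-null cursor.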
Modifications
OpReadEntry logic in ManagedCursor#startReadOperationOnLedger: when the ledger id equals null, we can return the original value. The ManagedLedger will validate whether this operation is legal.
From the perspective of the overall design, the OpReadEntry is just an intermediate-state object that may hold illegal values, so we have to check that the OpReadEntry is valid in ManagedLedger#asyncReadEntries (currently we already do that).
Verifying this change
Documentation
doc-not-needed
(Please explain why)