Avoid using same OpAddEntry between different ledger handles #5942

codelipenghui · 2019-12-26T09:21:02Z

Motivation

Avoid using the same OpAddEntry across different ledger handles.

Modifications

Add a state to OpAddEntry. When an op is taken over by a new ledger handle, the original op is set to the CLOSED state. When the legacy callback from the old ledger handle fires later, it checks the op state, and only an op in the INITIATED state is processed.

When a ledger rollover happens, pendingAddEntries is processed. While processing pendingAddEntries, a new OpAddEntry is created from each old OpAddEntry so that different ledger handles never use the same OpAddEntry.
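
The state check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names `OpStateSketch` and `AddOp` and the methods `tryComplete` and `close` are hypothetical, and only the state names (INITIATED, COMPLETED, CLOSED) and the compare-and-set idea come from the change itself.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical stand-in for OpAddEntry; only the state handling is modelled.
final class OpStateSketch {
    enum State { INITIATED, COMPLETED, CLOSED }

    static final class AddOp {
        private static final AtomicReferenceFieldUpdater<AddOp, State> STATE_UPDATER =
                AtomicReferenceFieldUpdater.newUpdater(AddOp.class, State.class, "state");

        // Must be volatile for AtomicReferenceFieldUpdater to accept it.
        private volatile State state = State.INITIATED;
        final String payload;

        AddOp(String payload) {
            this.payload = payload;
        }

        // Called when a new ledger handle takes over: the old op must no longer complete.
        boolean close() {
            return STATE_UPDATER.compareAndSet(this, State.INITIATED, State.CLOSED);
        }

        // Called from the add callback: succeeds only once, and only if the op was not closed.
        boolean tryComplete() {
            return STATE_UPDATER.compareAndSet(this, State.INITIATED, State.COMPLETED);
        }

        State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        AddOp stale = new AddOp("entry-1");
        stale.close();
        // A late callback from the old ledger handle is rejected by the state check.
        System.out.println("stale op completed: " + stale.tryComplete());
    }
}
```

Because both transitions start from INITIATED, whichever of `close` and `tryComplete` runs first wins, and the other becomes a no-op.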

Verifying this change

Added new unit test

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

Dependencies (does it add or upgrade a dependency): (no)
The public API: (no)
The schema: (no)
The default values of configurations: (no)
The wire protocol: (no)
The rest endpoints: (no)
The admin cli options: (no)
Anything that affects deployment: (no)

Documentation

Does this pull request introduce a new feature? (no)

codelipenghui · 2019-12-27T09:29:02Z

run cpp tests
run integration tests

codelipenghui · 2019-12-27T12:04:45Z

run cpp tests

codelipenghui · 2019-12-31T02:33:51Z

run cpp tests

tuteng · 2019-12-31T11:34:56Z

run cpp tests

rdhabalia · 2019-12-31T18:44:57Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java

            internalAsyncAddEntry(addOperation);
        }));
    }

    private synchronized void internalAsyncAddEntry(OpAddEntry addOperation) {
+        pendingAddEntries.add(addOperation);


Is this the real root cause of #5588, or is it just a patch to avoid such behavior?

The root cause of #5588 is that an entry is "re-used" between ledgers. The code at line 1297 is the fix.

sijie · 2020-01-01T01:28:46Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java

            internalAsyncAddEntry(addOperation);
        }));
    }

    private synchronized void internalAsyncAddEntry(OpAddEntry addOperation) {
+        pendingAddEntries.add(addOperation);


The root cause of #5588 is that an entry is "re-used" between ledgers. The code at line 1297 is the fix.

sijie · 2020-01-01T01:32:21Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java

@@ -1294,9 +1293,24 @@ public synchronized void updateLedgersIdsComplete(Stat stat) {
            log.debug("[{}] Resending {} pending messages", name, pendingAddEntries.size());
        }

+        // Avoid use same OpAddEntry between different ledger handle
+        int pendingSize = pendingAddEntries.size();


Hmm, this doesn't seem correct to me. You need to preserve the order when adding the newly created ops back to the queue.

What you need to do:

1. Drain the pendingAddEntries queue.
2. Create a new OpAddEntry for each entry.
3. Add these ops to an intermediate list in the order in which they are drained.
4. After pendingAddEntries is drained, add the intermediate list back to the pendingAddEntries queue.

pendingAddEntries is a ConcurrentLinkedQueue, which has no drainTo method. Maybe we can call pendingAddEntries.toArray(), create a new OpAddEntry for each item of the array, and add each new entry to an intermediate list.

I see. It is a queue here.
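
The ordering requirement discussed in this thread can be illustrated with a small sketch. It is not the code that was merged: `RequeueSketch` and `replaceAllInOrder` are hypothetical names, the `duplicate` function stands in for "create a new OpAddEntry from the old one", and the sketch assumes the caller holds a lock so that nothing is added to the queue between the drain and the refill (both `updateLedgersIdsComplete` and `internalAsyncAddEntry` are `synchronized` in the diff context shown here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.UnaryOperator;

final class RequeueSketch {
    // Replaces every queued element with a fresh copy while keeping FIFO order.
    // ConcurrentLinkedQueue has no drainTo, so the queue is drained with poll().
    static <T> void replaceAllInOrder(Queue<T> pending, UnaryOperator<T> duplicate) {
        List<T> fresh = new ArrayList<>();
        T old;
        while ((old = pending.poll()) != null) {
            fresh.add(duplicate.apply(old)); // intermediate list keeps drain order
        }
        pending.addAll(fresh); // refill in the same order
    }

    public static void main(String[] args) {
        Queue<String> pending = new ConcurrentLinkedQueue<>(Arrays.asList("a", "b", "c"));
        replaceAllInOrder(pending, s -> s + "'");
        System.out.println(pending);
    }
}
```

Draining with `poll()` rather than snapshotting with `toArray()` empties the queue as a side effect, so the old ops cannot stay queued next to their replacements.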

sijie · 2020-01-01T01:33:06Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java

    LedgerHandle ledger;
    private long entryId;

    @SuppressWarnings("unused")
-    private volatile AddEntryCallback callback;
+    private static final AtomicReferenceFieldUpdater<OpAddEntry, AddEntryCallback> callbackUpdater =


Do we need the change here?

I just moved the updater next to the field.

sijie · 2020-01-06T03:37:52Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java

+
+        if (!STATE_UPDATER.compareAndSet(OpAddEntry.this, State.INITIATED, State.COMPLETED)) {
+            log.warn("[{}] The add op is terminal legacy callback for entry {}-{} adding.", ml.getName(), lh.getId(), entryId);
+            OpAddEntry.this.recycle();


Since we are creating a new entry when retrying the ops on the new ledger, do we need to introduce the state field and recycle here?

Since this op is only used by the old ledger handle, it will not be reused across ledgers. It should already be recycled correctly.

sijie · 2020-01-06T03:39:06Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java

+            if (existsOp != null) {
+                // If op is used by another ledger handle, we need to close it and create a new one
+                if (existsOp.ledger != null) {
+                    existsOp.close();


Not sure we need to close here. I think once we duplicate the operation, we can just let the original callback close the old op, so it seems to me that we don't need to introduce another state field here.

We need to close the original op. Otherwise, when the old op's callback fires, it will poll the first op in pendingAddEntries:

pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java

Line 165 in ef23a4b

checkArgument(this == firstInQueue);

But the first op is now the new op that replaced it.
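
The failure mode described in this reply can be reproduced in a small, simplified model. Everything here is hypothetical: `StaleCallbackSketch`, `Op` and `onAddComplete` are illustration names, a plain `closed` flag replaces the state field, and `IllegalStateException` stands in for the `checkArgument(this == firstInQueue)` check quoted above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class StaleCallbackSketch {
    static final class Op {
        final String name;
        volatile boolean closed;

        Op(String name) {
            this.name = name;
        }

        // Mimics the add-complete path: poll the head and expect it to be this op.
        // Returns false when the op was closed and the callback is ignored.
        boolean onAddComplete(Queue<Op> pending) {
            if (closed) {
                return false; // late callback from the old ledger handle: leave the queue alone
            }
            Op first = pending.poll();
            if (first != this) {
                throw new IllegalStateException("expected " + name + " at the head of the queue");
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Queue<Op> pending = new ConcurrentLinkedQueue<>();
        Op oldOp = new Op("old");
        Op newOp = new Op("new");
        pending.add(newOp);   // the replacement op now sits at the head
        oldOp.closed = true;  // closing the old op makes its late callback harmless
        System.out.println("old callback processed: " + oldOp.onAddComplete(pending));
        System.out.println("new callback processed: " + newOp.onAddComplete(pending));
    }
}
```

Without the `closed` check, the old op's callback would remove the replacement op from the queue and then fail the identity check.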

Avoid using same ops between different ledger handles

70b83be

codelipenghui requested review from rdhabalia, sijie and merlimat and removed request for rdhabalia and sijie December 26, 2019 09:21

codelipenghui assigned jiazhai and codelipenghui and unassigned jiazhai Dec 26, 2019

codelipenghui requested a review from jiazhai December 26, 2019 09:22

codelipenghui added the area/broker label Dec 26, 2019

codelipenghui added 5 commits December 26, 2019 18:13

Add unit test.

309de4f

Get pendingSize before set state to LedgerOpened

c60556a

Handle none ledger ops

736cf41

Add more test case

f659e8c

Fix unit test

ef23a4b

rdhabalia reviewed Dec 31, 2019

sijie requested changes Jan 1, 2020

sijie requested changes Jan 6, 2020

sijie approved these changes Jan 8, 2020

codelipenghui merged commit 7ec17b2 into apache:master Jan 9, 2020

sijie added this to the 2.5.1 milestone Jan 13, 2020

sijie added the release/2.5.1 label Jan 22, 2020

sijie modified the milestones: 2.5.1, 2.6.0 Jan 22, 2020

codelipenghui deleted the issue-5588 branch April 24, 2020 08:23

merlimat mentioned this pull request Jun 1, 2021

[ML] Fix ByteBuf leaks when execution fails #10755

Closed
