This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-8293: Race condition in updating ManagedCursorImpl.readPosition #1569

Closed
sijie opened this issue Oct 19, 2020 · 0 comments

sijie commented Oct 19, 2020

Original Issue: apache#8293


Describe the bug
apache#8229 seems to have been caused by a race condition in updating ManagedCursorImpl.readPosition.

To Reproduce
Since this is a concurrency issue, it is hard to reproduce, and no publicly shared reproduction exists yet.

Expected behavior
Updates to the ManagedCursorImpl.readPosition field should not lead to inconsistent state. Without a deeper understanding of the code, it is not clear how concurrent updates are meant to be handled.

Additional context
Please refer to apache#8229 for additional context. There's a link to a Slack thread for more discussions.

There is a fix for apache#8229 that prevents the infinite loop: apache#8284. That fix does not specifically address the race condition in updating the ManagedCursorImpl.readPosition field.

There seem to be quite a few past issues in which a race condition in updating readPosition was the problem, for example apache#1478, apache#3015 and apache#287.

There is also a change, apache#6606, which adds READ_POSITION_UPDATER for ManagedCursorImpl.readPosition.

Regarding the race condition in apache#8229: OpReadEntry's readPosition is initialized from ManagedCursorImpl.readPosition when the OpReadEntry is created. If ManagedCursorImpl.readPosition is updated after that point, the two values can get out of sync.
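To make the suspected sequence concrete, here is a minimal sketch of a read operation that copies the cursor's position at creation time and is then left behind by a later cursor update. All class and method names (Position, Cursor, ReadOp, snapshotIsStale) are hypothetical stand-ins, not Pulsar's actual classes, and the single-threaded ordering only simulates what an unlucky interleaving of two threads might produce.

```java
// Hypothetical stand-ins only; none of these are Pulsar's real classes.
class StaleSnapshotSketch {

    // Immutable (ledgerId, entryId) pair, loosely modelled on a ledger position.
    static final class Position {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        boolean samePlaceAs(Position other) {
            return ledgerId == other.ledgerId && entryId == other.entryId;
        }
    }

    // Stand-in for the cursor: owns the authoritative read position.
    static final class Cursor {
        volatile Position readPosition = new Position(1, 0);
    }

    // Stand-in for a pending read: copies the cursor's position once, at creation.
    static final class ReadOp {
        final Position readPosition;

        ReadOp(Cursor cursor) {
            this.readPosition = cursor.readPosition;
        }
    }

    // Replays the problematic ordering: create the op, then move the cursor.
    static boolean snapshotIsStale() {
        Cursor cursor = new Cursor();
        ReadOp op = new ReadOp(cursor);
        // Simulates another code path advancing the cursor after the op exists.
        cursor.readPosition = new Position(1, 10);
        return !op.readPosition.samePlaceAs(cursor.readPosition);
    }

    public static void main(String[] args) {
        System.out.println("snapshot stale: " + snapshotIsStale());
    }
}
```

In this sketch the pending read still points at entry 0 while the cursor has moved to entry 10. Whether that divergence is harmful in the real code depends on how the read result is reconciled with the cursor afterwards, which the sketch does not model.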

The race condition seems to happen in this code in the setAcknowledgePosition method:

https://github.com/apache/pulsar/blob/825fdd4222dd65ef3099f1a975a1555226297379/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java#L1512-L1523

In other locations, whenever the readPosition field is modified, the write happens while holding a lock. In this particular location, there is no lock.

However, apache#6606 introduced another mechanism for handling race conditions. So there are now two ways of guarding the ManagedCursorImpl.readPosition field: ManagedCursorImpl.lock.writeLock(), and ManagedCursorImpl.READ_POSITION_UPDATER, which is used in one location to update ManagedCursorImpl.readPosition.
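For illustration, here is a minimal sketch of what using a single mechanism consistently could look like, taking the field-updater approach: every writer goes through one compare-and-set loop that only ever moves the position forward. This is not the actual ManagedCursorImpl code and not a proposed patch. The class ForwardOnlyCursorSketch, its Position type and the advanceTo method are assumptions made up for the example; only AtomicReferenceFieldUpdater itself is a real JDK class.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical cursor; illustrates a CAS-only update discipline.
class ForwardOnlyCursorSketch {

    // Immutable, comparable (ledgerId, entryId) pair; a stand-in type.
    static final class Position implements Comparable<Position> {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public int compareTo(Position other) {
            int byLedger = Long.compare(ledgerId, other.ledgerId);
            return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
        }
    }

    private static final AtomicReferenceFieldUpdater<ForwardOnlyCursorSketch, Position>
            READ_POSITION_UPDATER = AtomicReferenceFieldUpdater.newUpdater(
                    ForwardOnlyCursorSketch.class, Position.class, "readPosition");

    // Must be volatile for the field updater to accept it.
    private volatile Position readPosition = new Position(1, 0);

    // Moves readPosition to candidate only if candidate is ahead of the
    // current value; retries when another thread wins the CAS.
    boolean advanceTo(Position candidate) {
        while (true) {
            Position current = readPosition;
            if (current.compareTo(candidate) >= 0) {
                return false; // already at or past candidate, nothing to do
            }
            if (READ_POSITION_UPDATER.compareAndSet(this, current, candidate)) {
                return true;
            }
        }
    }

    Position getReadPosition() {
        return readPosition;
    }

    public static void main(String[] args) throws InterruptedException {
        ForwardOnlyCursorSketch cursor = new ForwardOnlyCursorSketch();
        Thread[] threads = new Thread[8];
        for (int t = 0; t < threads.length; t++) {
            final long target = (t + 1) * 100L;
            threads[t] = new Thread(() -> cursor.advanceTo(new Position(1, target)));
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        // The largest target wins regardless of thread scheduling.
        System.out.println("final entryId: " + cursor.getReadPosition().entryId);
    }
}
```

The point of the sketch is that the check ("is the new position ahead?") and the write happen as one atomic step. If some writers instead take a lock while others use the updater or a plain assignment, the lock no longer protects the check-then-write sequence, which is the kind of gap this issue describes.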

@sijie sijie added the type/bug label Oct 19, 2020
@sijie sijie closed this as completed Oct 19, 2020