-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a message id tracker for acking messages that are batched. #82
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,30 +18,31 @@ | |
package pulsar | ||
|
||
import ( | ||
"math/big" | ||
"strings" | ||
"sync" | ||
"time" | ||
|
||
"github.com/apache/pulsar-client-go/pkg/pb" | ||
"github.com/golang/protobuf/proto" | ||
) | ||
|
||
func earliestMessageID() MessageID { | ||
return newMessageID(-1, -1, -1, -1) | ||
} | ||
"github.com/apache/pulsar-client-go/pkg/pb" | ||
) | ||
|
||
type messageID struct { | ||
ledgerID int64 | ||
entryID int64 | ||
batchIdx int | ||
partitionIdx int | ||
|
||
tracker *ackTracker | ||
} | ||
|
||
func newMessageID(ledgerID int64, entryID int64, batchIdx int, partitionIdx int) MessageID { | ||
return &messageID{ | ||
ledgerID: ledgerID, | ||
entryID: entryID, | ||
batchIdx: batchIdx, | ||
partitionIdx: partitionIdx, | ||
func (id *messageID) ack() bool { | ||
if id.tracker != nil && id.batchIdx > -1 { | ||
return id.tracker.ack(id.batchIdx) | ||
} | ||
|
||
return true | ||
} | ||
|
||
func (id *messageID) Serialize() []byte { | ||
|
@@ -70,10 +71,24 @@ func deserializeMessageID(data []byte) (MessageID, error) { | |
return id, nil | ||
} | ||
|
||
const maxLong int64 = 0x7fffffffffffffff | ||
func newMessageID(ledgerID int64, entryID int64, batchIdx int, partitionIdx int) MessageID { | ||
return &messageID{ | ||
ledgerID: ledgerID, | ||
entryID: entryID, | ||
batchIdx: batchIdx, | ||
partitionIdx: partitionIdx, | ||
} | ||
} | ||
|
||
func latestMessageID() MessageID { | ||
return newMessageID(maxLong, maxLong, -1, -1) | ||
func newTrackingMessageID(ledgerID int64, entryID int64, batchIdx int, partitionIdx int, | ||
tracker *ackTracker) *messageID { | ||
return &messageID{ | ||
ledgerID: ledgerID, | ||
entryID: entryID, | ||
batchIdx: batchIdx, | ||
partitionIdx: partitionIdx, | ||
tracker: tracker, | ||
} | ||
} | ||
|
||
func timeFromUnixTimestampMillis(timestamp uint64) time.Time { | ||
|
@@ -126,3 +141,37 @@ func (msg *message) EventTime() time.Time { | |
func (msg *message) Key() string { | ||
return msg.key | ||
} | ||
|
||
func newAckTracker(size int) *ackTracker { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this ack tracker is self-contained, should we move it into a separate file, perhaps in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it makes sense in this file since it's related to the message id implementation. |
||
var batchIDs *big.Int | ||
if size <= 64 { | ||
shift := uint32(64 - size) | ||
setBits := ^uint64(0) >> shift | ||
batchIDs = new(big.Int).SetUint64(setBits) | ||
} else { | ||
batchIDs, _ = new(big.Int).SetString(strings.Repeat("1", size), 2) | ||
} | ||
return &ackTracker{ | ||
size: size, | ||
batchIDs: batchIDs, | ||
} | ||
} | ||
|
||
type ackTracker struct { | ||
sync.Mutex | ||
size int | ||
batchIDs *big.Int | ||
} | ||
|
||
func (t *ackTracker) ack(batchID int) bool { | ||
t.Lock() | ||
defer t.Unlock() | ||
t.batchIDs = t.batchIDs.SetBit(t.batchIDs, batchID, 0) | ||
return len(t.batchIDs.Bits()) == 0 | ||
} | ||
|
||
func (t *ackTracker) completed() bool { | ||
t.Lock() | ||
defer t.Unlock() | ||
return len(t.batchIDs.Bits()) == 0 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why adding
ackTracker
into themessageID
structure?MessageID
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @sijie
Maybe we can do this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The go client produces messages in a batch format. Each message in that batch needs to be acked before sending the ack to the broker. This seemed like the logical place to put it since acks are based on a message id and it's similar to the java client BatchMessageIdImpl. It's an internal class too so it's not exported and can be changed later without effecting clients or apis. I think this approach makes the code easier to reason about since everything is treated as a batch message. If the messageId.ack() returns true we can send the ack to the broker. The other option is to maintain the state in a consumer and I think that complicates the code. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i agree with you, just a little difference.
In here,
tracker
does not belong tomessageId
, as defined by PulsarApi.proto. So, we can abstract new struct(MsgTracker or BatchMsgID or so on) to impl it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the tracker in the message Id object is the same implementation pattern as the Java lib. The reason is not keep the tracker in a separate map.
The fact that the protobuf doesn’t have the tracker doesn’t matter, since it’s completely different usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@merlimat @cckellogg I understand the implementation follows the Java implementation. The approach doesn't have any problem here. However I think we should do it in golang's way. We can follow what @wolfstudy suggested. It is exactly same as your approach but much clearer.
In this way, msgTracker inherits all the fields from
messageID
and it avoids pollutingmessageID
struct.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it looks like this design will cause us always have to check for two different message ids types and then handle them in a different way. Is that correct?
@wolfstudy how do you see your suggested design working when the client sends back a MessageID interface that we need to ack?
The general goal of this patch was to make ack checking in the code simple whether the message was batched or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sijie can you take another look? the other approach would involve a lot of downcasting, which we're actively trying to avoid.