[feature][broker] PIP 37: Support chunking with Shared subscription #16202

Merged

@BewareMyPower (Contributor) commented Jun 23, 2022

Motivation

https://github.com/apache/pulsar/wiki/PIP-37%3A-Large-message-size-handling-in-Pulsar#option-1-broker-caches-mapping-of-message-uuid-and-consumerid

The drawbacks of option 1 described in the original proposal no longer
apply to the current code, because the broker now keeps redelivered
messages in a sorted map.

Modifications

First of all, to limit the amount of code changes, an EntryAndMetadata
class is introduced that binds an Entry to its associated
MessageMetadata, so that the metadata is not parsed repeatedly. It also
implements the Entry interface, so this PR changes some
List<Entry> parameters to List<? extends Entry> so that a
List<EntryAndMetadata> argument can be accepted.
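
To illustrate why the parameter type has to change, here is a minimal sketch. All types in it (Entry, MessageMetadata, SimpleEntry, EntryAndMetadata, and the sumEntryIds helper) are simplified, hypothetical stand-ins, not the broker's real classes. Java generics are invariant, so a List<EntryAndMetadata> is not a List<Entry>, but it is a List<? extends Entry>.

```java
import java.util.ArrayList;
import java.util.List;

// All types here are simplified, hypothetical stand-ins for illustration only.
final class EntryWrapperSketch {
    interface Entry {
        long getLedgerId();
        long getEntryId();
    }

    // Stand-in for parsed metadata; uuid is null for non-chunked messages.
    static final class MessageMetadata {
        final String uuid;
        MessageMetadata(String uuid) { this.uuid = uuid; }
    }

    static final class SimpleEntry implements Entry {
        private final long ledgerId;
        private final long entryId;
        SimpleEntry(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }
        @Override public long getLedgerId() { return ledgerId; }
        @Override public long getEntryId() { return entryId; }
    }

    // Binds an entry to metadata that was parsed once, and delegates Entry calls.
    static final class EntryAndMetadata implements Entry {
        private final Entry entry;
        private final MessageMetadata metadata;
        EntryAndMetadata(Entry entry, MessageMetadata metadata) {
            this.entry = entry;
            this.metadata = metadata;
        }
        MessageMetadata getMetadata() { return metadata; }
        @Override public long getLedgerId() { return entry.getLedgerId(); }
        @Override public long getEntryId() { return entry.getEntryId(); }
    }

    // If this parameter were List<Entry>, passing a List<EntryAndMetadata>
    // would not compile; the wildcard accepts both.
    static long sumEntryIds(List<? extends Entry> entries) {
        long sum = 0;
        for (Entry e : entries) {
            sum += e.getEntryId();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<EntryAndMetadata> wrapped = new ArrayList<>();
        wrapped.add(new EntryAndMetadata(new SimpleEntry(1, 0), new MessageMetadata(null)));
        wrapped.add(new EntryAndMetadata(new SimpleEntry(1, 1), new MessageMetadata("uuid-A")));
        System.out.println(sumEntryIds(wrapped));
    }
}
```

Because the wrapper only delegates, code that reads entries is unaffected, while code that needs the metadata can call getMetadata() instead of parsing it again.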

Then, a SharedConsumerAssignor is introduced to assign a list of
entries to all shared consumers.

  1. Use a default selector to select the next consumer, like
    PersistentDispatcherMultipleConsumers#getNextConsumer.
  2. Each time a consumer is chosen, assign the entries in range
    [i, i+permits) to that consumer, except entries that have a uuid
    (i.e. chunks of a chunked message):
    • If the uuid is not cached, cache uuid -> consumer to indicate that
      the chunked message with this uuid must be dispatched to this consumer.
    • Otherwise, assign the entry to the consumer that owns the uuid.

The assign method returns a map from each Consumer to its
List<EntryAndMetadata>. The subsequent dispatch logic is similar to
that of the Key_Shared dispatcher.
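
The assignment rule described above can be sketched roughly as follows. This is not the actual SharedConsumerAssignor: Msg and Consumer are hypothetical stand-ins, the selector is a plain round-robin, and cases such as an owner that has run out of permits or evicting a uuid after its last chunk are deliberately not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch of the uuid -> consumer assignment rule.
final class SharedAssignSketch {
    // Stand-in for EntryAndMetadata: uuid is null for non-chunked messages.
    static final class Msg {
        final long entryId;
        final String uuid;
        Msg(long entryId, String uuid) { this.entryId = entryId; this.uuid = uuid; }
    }

    static final class Consumer {
        final String name;
        int permits;
        Consumer(String name, int permits) { this.name = name; this.permits = permits; }
    }

    private final List<Consumer> consumers;
    private final Map<String, Consumer> uuidToOwner = new HashMap<>();
    private int next = 0;

    SharedAssignSketch(List<Consumer> consumers) { this.consumers = consumers; }

    // Round-robin stand-in for the default selector; null if nobody has permits.
    private Consumer nextConsumer() {
        for (int k = 0; k < consumers.size(); k++) {
            Consumer c = consumers.get(next);
            next = (next + 1) % consumers.size();
            if (c.permits > 0) {
                return c;
            }
        }
        return null;
    }

    Map<Consumer, List<Msg>> assign(List<Msg> entries) {
        Map<Consumer, List<Msg>> result = new LinkedHashMap<>();
        int i = 0;
        while (i < entries.size()) {
            Consumer chosen = nextConsumer();
            if (chosen == null) {
                break; // no permits anywhere; remaining entries stay unassigned
            }
            // The range [i, i + permits) is fixed when the consumer is chosen.
            int end = Math.min(i + chosen.permits, entries.size());
            for (; i < end; i++) {
                Msg msg = entries.get(i);
                Consumer target = chosen;
                if (msg.uuid != null) {
                    // First chunk seen: chosen becomes the owner. Later chunks: reuse the owner.
                    Consumer owner = uuidToOwner.putIfAbsent(msg.uuid, chosen);
                    if (owner != null) {
                        target = owner;
                    }
                }
                target.permits--; // simplification: the owner's permits are not checked
                result.computeIfAbsent(target, c -> new ArrayList<>()).add(msg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Consumer c1 = new Consumer("c1", 3);
        Consumer c2 = new Consumer("c2", 3);
        SharedAssignSketch assignor = new SharedAssignSketch(Arrays.asList(c1, c2));
        // The first read contains only the first chunk of message "A".
        assignor.assign(Arrays.asList(new Msg(0, "A")));
        // The second read interleaves the next chunk of "A" with ordinary messages.
        Map<Consumer, List<Msg>> second = assignor.assign(
                Arrays.asList(new Msg(1, null), new Msg(2, "A"), new Msg(3, null)));
        second.forEach((c, msgs) -> msgs.forEach(
                m -> System.out.println(c.name + " <- entry " + m.entryId)));
    }
}
```

In the usage above, the second chunk of "A" is routed back to the consumer that received the first chunk, even though the selector has moved on to another consumer.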

Finally, the limit in ConsumerImpl is removed.

Verifying this change

SharedConsumerAssignorTest is added to show how the assignor works
in detail.

MessageChunkingSharedTest is added to verify the Shared dispatcher
works on chunked messages, including:

  • A single producer sends chunked messages with various chunk counts to a
    consumer that has limited permits.
  • A single producer sends chunked messages to two consumers to verify
    that both can receive chunked messages.
  • Interleaved chunks are produced via PersistentTopic directly to simulate
    multiple producers, verifying that the new consumer can receive all
    unacknowledged messages received by the old consumer.

@BewareMyPower BewareMyPower added the type/feature, area/broker, and type/PIP labels Jun 23, 2022
@BewareMyPower BewareMyPower added this to the 2.11.0 milestone Jun 23, 2022
@BewareMyPower BewareMyPower self-assigned this Jun 23, 2022
@github-actions

@BewareMyPower Please provide a correct documentation label for your PR.
For instructions, see the Pulsar Documentation Label Guide.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch from a815c22 to b60557b Compare June 23, 2022 18:00
@BewareMyPower BewareMyPower added the doc-not-needed label and removed the doc-label-missing label Jun 23, 2022
@RobertIndie (Member) left a comment

Overall looks good to me. Left some comments.

long stickyKeyHash = getStickyKeyHash(entry);
addMessageToReplay(entry.getLedgerId(), entry.getEntryId(), stickyKeyHash);
entry.release();
if (messagesForC < entryAndMetadataList.size()) {

When will messagesForC or consumer.getAvailablePermits() be less than entryAndMetadataList.size()?

@BewareMyPower (Contributor Author)

I'm not sure; the consumer's permits might change inside the loop. The check is added here as defensive programming.

@RobertIndie
Member

This PR will fix #7645

@BewareMyPower BewareMyPower requested a review from 315157973 July 4, 2022 07:34
@BewareMyPower
Contributor Author

@codelipenghui @merlimat @rdhabalia @Jason918 @eolivelli @315157973 Could you take a look at this PR?

@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch from 588c3c1 to a89ef8a Compare July 11, 2022 08:46
@BewareMyPower
Contributor Author

/pulsarbot rerun-failure-checks

@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch 2 times, most recently from 89cc8ce to 788c422 Compare July 15, 2022 03:17
@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022
@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch from 788c422 to 43d7264 Compare July 28, 2022 09:17
@codelipenghui
Contributor

@BewareMyPower Please help resolve the conflicts.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch from 43d7264 to 76db14f Compare July 29, 2022 03:42

### TODO

We need to change the implementation of `ChunkMessageIdImpl` to make
it possible for a consumer to acknowledge all entries of a chunked
message.

Since this PR already includes many changes, I will do that in a follow-up.
@BewareMyPower BewareMyPower force-pushed the bewaremypower/chunk-msg-for-shared branch from 76db14f to 6fcc4cf Compare August 1, 2022 02:24
@codelipenghui codelipenghui merged commit b1a29b5 into apache:master Aug 2, 2022
Gleiphir2769 pushed a commit to Gleiphir2769/pulsar that referenced this pull request Aug 4, 2022
…pache#16202)

* [feature][broker] PIP 37: Support chunking with Shared subscription