Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][pip] PIP-282: Change definition of the recently joined consumers position #20776

Conversation

equanz
Copy link
Contributor

@equanz equanz commented Jul 11, 2023

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@poorbarcode poorbarcode added this to the 3.1.0 milestone Jul 11, 2023
pip/pip-282.md Outdated Show resolved Hide resolved
pip/pip-282.md Outdated Show resolved Hide resolved
pip/pip-282.md Show resolved Hide resolved
pip/pip-282.md Show resolved Hide resolved
@equanz
Copy link
Contributor Author

equanz commented Jul 31, 2023

@poorbarcode Sorry for the late reply. I addressed your comments. PTAL

@Technoboy- Technoboy- modified the milestones: 3.1.0, 3.2.0 Jul 31, 2023
@poorbarcode poorbarcode added the type/bug The PR fixed a bug or issue reported a bug label Aug 3, 2023
private final LinkedHashMap<Consumer, PositionImpl> recentlyJoinedConsumers;

+ private PositionImpl lastSentPosition;
+ private final RangeSetWrapper<PositionImpl> individuallySentPositions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • individuallySentPositions = {Range-Full} - {cursor.individualDeletedMessages} - {dispatcher.redeliveryMessages} - {the positions in inflight Replay Reading}, right?
  • Should we add metrics to describe how much memory individuallySentPositions usage?
  • Should we add a mechanism to stop delivering messages to the client if individuallySentPositions uses more memory than expected?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@equanz Sorry for the late reply. Could you take a look at these questions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

individuallySentPositions = {Range-Full} - {cursor.individualDeletedMessages} - {dispatcher.redeliveryMessages} - {the positions in inflight Replay Reading}, right?

Roughly, the {cursor.individualDeletedMessages} is not subtracted. It is initialized by individualDeletedMessages. Messages scheduled to be sent are pushed to this field.
After, if the first range is contiguous to the last sent position, remove the first range and update the last sent position to the range's upper bound.
So, if it is not initialized by individualDeletedMessages, then the last sent position can be stuck because of "sent-hole".

More specifically, the details may differ because the definitions are different.

Should we add metrics to describe how much memory individuallySentPositions usage?

I'll add these metrics to the subscription stats.

    /** The last sent position of the cursor. This is for Key_Shared subscription. */
    public String lastSentPosition;

    /** Set of individually sent ranges. This is for Key_Shared subscription. */
    public String individuallySentPositions;

Should we add a mechanism to stop delivering messages to the client if individuallySentPositions uses more memory than expected?

I referred to the definition of the individualDeletedMessages. It has no limitation to persist on the memory(not the storage).
(Of course, we can add the limitation if necessary.)

Copy link
Contributor

@poorbarcode poorbarcode Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@equanz

I'll add these metrics to the subscription stats.

/** The last sent position of the cursor. This is for Key_Shared subscription. */
public String lastSentPosition;

/** Set of individually sent ranges. This is for Key_Shared subscription. */
public String individuallySentPositions;

But individuallySentPositions in topic stats doesn't accurately reflect how much memory individuallySentPositions uses, right?

If there are a huge number of elements in individuallySentPositions, it is possible that pulsar-admin topics stats might not work, because the response body is too large (we have encountered responses close to 200m).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poorbarcode

But individuallySentPositions in topic stats doesn't accurately reflect how much memory individuallySentPositions uses, right?

Yes.
However, I think we can't do effective operations even if we can observe these metrics. What do you think?

If there are a huge number of elements in individuallySentPositions, it is possible that pulsar-admin topics stats might not work, because the response body is too large (we have encountered responses close to 200m).

Your concerns are correct. I'll reconsider any other approaches(e.g. add a new REST API to expose these stats, just output log as debug level).
BTW, it is not considered in stats(e.g. consumersAfterMarkDeletePosition, keyHashRanges) and stats-internal(e.g. individuallyDeletedMessages). We should care about these stats, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@equanz

it is not considered in stats(e.g. consumersAfterMarkDeletePosition, keyHashRanges) and stats-internal(e.g. individuallyDeletedMessages). We should care about these stats, too.

Yes, I'm planning to do it.

Your concerns are correct. I'll reconsider any other approaches(e.g. add a new REST API to expose these stats, just output log as debug level).

After we added new metrics which indicate how much memory is used by individuallySentPositions. Push an alert when the memory limit is exceeded, so there are no scenarios that the HTTP API cannot handle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new PIP to limit the place usage of individuallyDeletedMessages, the PIP will be submitted today

Okay.
If implementation is not far off, I will wait for your new design for reference.
(This PIP issue is one of bug fixes. So, I think we should fix it sooner if possible.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It is #21118

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poorbarcode
Sorry for the late reply. I have reconsidered your comments.

This field is appended when messages sent are not contiguous, and is automatically removed when the delivery becomes contiguous.
The frequency of occurrence between instances where messages are not delivered and instances where messages are not acked, would likely differ.
We could add limitations as you mentioned. However, it raises questions about how often this would be effective.

Additionally, introducing such additional stoppage specifications for deliveries would impose more complex limitations on Key_Shared alone.
If these limitations work correctly, that would be ideal. However, past bugs suggest that these complexities often cause omission from consideration.
Hence, if these added limitations do not effectively contribute, it may be better to consider not imposing them in the first place.

@poorbarcode poorbarcode changed the title [pip][design] PIP-282: Change definition of the recently joined consumers position [improve][pip] PIP-282: Change definition of the recently joined consumers position Sep 4, 2023
@github-actions
Copy link

github-actions bot commented Oct 5, 2023

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Oct 5, 2023
@equanz
Copy link
Contributor Author

equanz commented Nov 20, 2023

If the discussion does not seem to return to active, I will start the voting phase once. => I'll wait for approval in this PR.
This PIP issue is one of bug fixes(not improvement). So, I think we should fix it sooner if possible.

@poorbarcode
Copy link
Contributor

I'll wait for approval in this PR.
This PIP issue is one of bug fixes(not improvement). So, I think we should fix it sooner if possible.

Sure, agree

@equanz
Copy link
Contributor Author

equanz commented Nov 28, 2023

I have opened a voting thread, so please check it out.
https://lists.apache.org/thread/45x056t8njjnzflbkhkofh00gcy4z5g6

@Technoboy- Technoboy- modified the milestones: 3.2.0, 3.3.0 Dec 22, 2023
@equanz
Copy link
Contributor Author

equanz commented Dec 25, 2023

@nkurihar @poorbarcode @codelipenghui

The vote was closed by 3 binding +1. Could you merge this proposal?
https://lists.apache.org/thread/ovckl6gdfppffvq0wk2l2lp5tq9hz8gb

@poorbarcode
Copy link
Contributor

@nkurihar @poorbarcode @codelipenghui
The vote was closed by 3 binding +1. Could you merge this proposal?
https://lists.apache.org/thread/ovckl6gdfppffvq0wk2l2lp5tq9hz8gb

Sure, merging

@poorbarcode poorbarcode merged commit 4e96e7f into apache:master Dec 25, 2023
19 checks passed
@equanz equanz deleted the pip_change_definition_of_recently_joined_consumers_position branch December 25, 2023 09:34
@equanz
Copy link
Contributor Author

equanz commented Feb 7, 2024

I opened implementation PR #21953.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs Stale type/bug The PR fixed a bug or issue reported a bug type/PIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants