Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] [broker] correct compaction phase one timeout meaning #21908

Merged
merged 2 commits into from
Jan 22, 2024

Conversation

thetumbled
Copy link
Member

@thetumbled thetumbled commented Jan 17, 2024

Motivation

Configuratioin brokerServiceCompactionPhaseOneLoopTimeInSeconds is defined as the total timeout of phase one.

    @FieldContext(
            category = CATEGORY_SERVER,
            doc = "Timeout for the compaction phase one loop, If the execution time of the compaction "
                    + "phase one loop exceeds this time, the compaction will not proceed."
    )
    private long brokerServiceCompactionPhaseOneLoopTimeInSeconds = 30;

But current implementation just set a timeout for a single message read operation, which is different with the meaning of brokerServiceCompactionPhaseOneLoopTimeInSeconds and will result into thousands of timer tasks be scheduled and cancelled in the scheduler because we may iterate many messages.

Modifications

Set a timer task for the tatol phase one.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: thetumbled#32

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 17, 2024
@codelipenghui
Copy link
Contributor

I think I have introduced a bad configuration name in #11206. Actually, it was the timeout for an entry read.

IMO, we should have a new configuration name and deprecate the existing one. I don't think it's reasonable to have a timeout for the whole phase one loop. After the first time the phase one loop get timeout, the subsequent phase one loop will almost always get timeout with more published messages unless changing phase one loop timeout.

@thetumbled
Copy link
Member Author

I think I have introduced a bad configuration name in #11206. Actually, it was the timeout for an entry read.

IMO, we should have a new configuration name and deprecate the existing one. I don't think it's reasonable to have a timeout for the whole phase one loop. After the first time the phase one loop get timeout, the subsequent phase one loop will almost always get timeout with more published messages unless changing phase one loop timeout.

I agree. I have introduced a new configuration name and deprecate the existing one.
PTAL, thanks.

@thetumbled
Copy link
Member Author

I have created a pip to change the conf name, please help review it, thanks!
#21910
@codelipenghui @poorbarcode @BewareMyPower

@thetumbled thetumbled changed the title [fix] [broker] fix compaction phase one timeout [fix] [broker] correct compaction phase one timeout meaning Jan 18, 2024
@thetumbled
Copy link
Member Author

I have refactor the implementation to just modify the doc, so that we can avoid to create a pip and introduce a new conf name. PTAL, thanks.
@codelipenghui @poorbarcode @BewareMyPower

@poorbarcode
Copy link
Contributor

/pulsarbot rerun-failure-checks

1 similar comment
@thetumbled
Copy link
Member Author

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (74f1cd5) 36.46% compared to head (4f977ff) 73.58%.
Report is 6 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #21908       +/-   ##
=============================================
+ Coverage     36.46%   73.58%   +37.11%     
- Complexity    12381    32402    +20021     
=============================================
  Files          1725     1861      +136     
  Lines        131697   138589     +6892     
  Branches      14400    15184      +784     
=============================================
+ Hits          48019   101976    +53957     
+ Misses        77278    28720    -48558     
- Partials       6400     7893     +1493     
Flag Coverage Δ
inttests 24.13% <ø> (+0.02%) ⬆️
systests 23.63% <ø> (-0.01%) ⬇️
unittests 72.87% <ø> (+40.84%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...org/apache/pulsar/broker/ServiceConfiguration.java 99.39% <ø> (+1.41%) ⬆️

... and 1437 files with indirect coverage changes

@poorbarcode poorbarcode merged commit 3fa85f0 into apache:master Jan 22, 2024
49 checks passed
@Technoboy- Technoboy- added this to the 3.3.0 milestone Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants