Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cleanup offload #40

Open
wants to merge 19 commits into
base: dev-master
Choose a base branch
from
Open

Fix cleanup offload #40

wants to merge 19 commits into from

Conversation

zymap
Copy link
Owner

@zymap zymap commented Sep 8, 2022

(If this PR fixes a github issue, please add Fixes #<xyz>.)

Fixes #

(or if this PR is one task of a github issue, please add Master Issue: #<xyz> to link to the master issue.)

Master Issue: #

Motivation

Explain here the context, and why you're making that change. What is the problem you're trying to solve.

Modifications

Describe the modifications you've done.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)
  • The rest endpoints: (yes / no)
  • The admin cli options: (yes / no)
  • Anything that affects deployment: (yes / no / don't know)

Documentation

Check the box below or label this PR directly.

Need to update docs?

  • doc-required
    (Your PR needs to update docs and you will update later)

  • doc-not-needed
    (Please explain why)

  • doc
    (Your PR contains doc changes)

  • doc-complete
    (Docs have been already added)

zymap and others added 19 commits September 8, 2022 10:36
---

*Motivation*

There have two ways that will cause the offload data cleanup. One is met
offload conflict exception, and another is completeLedgerInfoForOffloaded
reaches max retry time and throws zookeeper exceptions.

We retry the zookeeper operation on connection loss exception. We should
be careful about this exception, because we may loss data if the metadata
update successfully.

When a MetaStore exception happens, we can not make sure the metadata update is
failed or not. Because we have a retry on the connection loss, it is
possible to get a BadVersion or other exception after retrying.

So we don't clean up the data if this happens.

*Modification*

- don't delete data if has meta store exception
* Following the action plan defined in https://lists.apache.org/thread/yh72bmlf2w97593c5d5pqf3jz1ldb501 .
* A lazy consensus was reached on this matter to fix CI following the action plan.
- Follow up on apache#17539 changes due to error message https://lists.apache.org/thread/hj8b9n1n29vg4c6oslz6n1j230f9v284

- Following the action plan defined in https://lists.apache.org/thread/yh72bmlf2w97593c5d5pqf3jz1ldb501 .
- A lazy consensus was reached on this matter to fix CI following the action plan.
)

- there's no need for this anymore since GitHub Actions has built-in support
  for cancelling duplicate workflows and this is used in Pulsar CI workflows.
- move the filter from the build step to the build job
   - only start the build job if the comment includes "/pulsarbot"
- no need to checkout code
- no need to run the tune VM action
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
apache#17541)

* [ci] Reduce runners load for pulls that affects only doc or cpp changes

* add final jobs to mark required workflows as completed even if they are skipped

* test parametrized job id

* fix latest step
…che#17551)

* [ci] Skip run flaky tests suite for cpp-only changes

* also ignore doc changes

* use changed_files
…le after 10 minutes (apache#17401)

* [fix][broker] Leader election cache should not invalidate entries

- cache expiration leads to problems and it is better to serve a stale entry than to ever expire the
  entry
- MetadataStore didn't support passing cache configuration for a MetadataCache, it
  was necessary to modify the interface to support that.

* Add test case

* Prevent stale values in leader election cache

Co-authored-by: Enrico Olivelli <eolivelli@apache.org>
@github-actions
Copy link

github-actions bot commented Oct 9, 2022

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Oct 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants