
refactor(notification): ensure new snapshot is only notified after new fragment_mapping #7042

Merged
3 commits merged into main on Dec 29, 2022

Conversation

@zwang28 (Contributor) commented Dec 23, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR addresses #5446 by ensuring that the frontend is always notified of a new snapshot only after the corresponding fragment_mapping.

It also includes a minor refactor that extracts collect_synced_ssts from complete_barrier, to make the latter cleaner.

This PR doesn't (and doesn't need to) affect the relative order between snapshot and catalog notifications.
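
For illustration, a minimal, self-contained Rust sketch of the ordering this PR enforces. All names below (Notifier, FragmentMapping, Snapshot, and the simplified complete_barrier) are illustrative stand-ins, not the actual RisingWave APIs; the real change lives around complete_barrier in the meta barrier manager.

```rust
// Illustrative sketch only: `Notifier`, `FragmentMapping`, `Snapshot` and
// `complete_barrier` are stand-in names, not the actual RisingWave types.

#[derive(Debug)]
struct FragmentMapping(u64);

#[derive(Debug)]
struct Snapshot(u64);

struct Notifier;

impl Notifier {
    fn notify_fragment_mapping(&self, mapping: &FragmentMapping) {
        println!("notify frontend: fragment_mapping {:?}", mapping);
    }
    fn notify_snapshot(&self, snapshot: &Snapshot) {
        println!("notify frontend: snapshot {:?}", snapshot);
    }
}

// After this PR, the barrier-completion path hands the new snapshot back to the
// caller instead of notifying it directly, so the caller controls the order:
// fragment_mapping first, snapshot second.
fn complete_barrier(epoch: u64) -> Option<Snapshot> {
    Some(Snapshot(epoch))
}

fn main() {
    let notifier = Notifier;
    let epoch = 42;
    let snapshot = complete_barrier(epoch);
    // A frontend that observes the new snapshot has already received the
    // fragment_mapping it needs for that snapshot.
    notifier.notify_fragment_mapping(&FragmentMapping(epoch));
    if let Some(s) = snapshot {
        notifier.notify_snapshot(&s);
    }
}
```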

Checklist

  • I have written necessary rustdoc comments
  • [ ] I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.

Types of user-facing changes

Please keep the types that apply to your changes, and remove those that do not apply.

  • Installation and deployment
  • Connector (sources & sinks)
  • SQL commands, functions, and operators
  • RisingWave cluster configuration changes
  • Other (please specify in the release note below)

Release note

Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.

Refer to a related PR or issue link (optional)

#5446

@zwang28 added the component/meta (Meta related issue) label on Dec 23, 2022
codecov bot commented Dec 23, 2022

Codecov Report

Merging #7042 (dc875a8) into main (7de49c0) will increase coverage by 0.00%.
The diff coverage is 64.17%.

@@           Coverage Diff           @@
##             main    #7042   +/-   ##
=======================================
  Coverage   73.15%   73.16%           
=======================================
  Files        1052     1052           
  Lines      167399   167413   +14     
=======================================
+ Hits       122467   122488   +21     
+ Misses      44932    44925    -7     
Flag Coverage Δ
rust 73.16% <64.17%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/meta/src/rpc/service/hummock_service.rs 3.14% <0.00%> (+0.03%) ⬆️
src/meta/src/hummock/manager/mod.rs 79.45% <40.90%> (-0.04%) ⬇️
src/meta/src/hummock/mock_hummock_meta_client.rs 65.78% <50.00%> (+0.52%) ⬆️
src/meta/src/barrier/mod.rs 84.58% <79.48%> (+0.24%) ⬆️
src/meta/src/hummock/test_utils.rs 94.95% <100.00%> (ø)
src/stream/src/executor/aggregation/minput.rs 96.49% <0.00%> (+0.10%) ⬆️
src/storage/src/hummock/compactor/mod.rs 83.41% <0.00%> (+0.15%) ⬆️
... and 1 more


@hzxa21 (Collaborator) left a comment

Rest LGTM! Thanks for taking this up.

@@ -1714,11 +1708,11 @@ where
         {
             self.check_state_consistency().await;
         }
-        Ok(())
+        Ok(Some(snapshot))
Collaborator commented:

We may return an error at L1704 and thus fail to notify the frontend. Can this be an issue? Recovery should be triggered in this case, so maybe missing one notification is fine?

@zwang28 (Contributor, Author) replied:

I refactored try_send_compaction_request to make it infallible by handling errors internally, because I think a compaction scheduling failure should not trigger recovery; when it fails, recovery doesn't help at all, and currently the only cause of failure is a compaction scheduler shutdown.
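
For illustration, a self-contained sketch of that "handle errors internally" pattern. The channel type, error cases, and names below are assumptions made for the sketch, not the exact RisingWave implementation.

```rust
// Illustrative sketch only: approximates an infallible `try_send_compaction_request`
// that logs failures instead of returning an error to the caller.
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

type CompactionGroupId = u64;

/// Send a compaction request without propagating errors: a scheduling failure
/// should not trigger recovery, and recovery would not help anyway, so the
/// failure is only logged.
fn try_send_compaction_request(
    tx: &SyncSender<CompactionGroupId>,
    group: CompactionGroupId,
) -> bool {
    match tx.try_send(group) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => {
            eprintln!("compaction scheduler is busy; dropped request for group {group}");
            false
        }
        Err(TrySendError::Disconnected(_)) => {
            eprintln!("compaction scheduler has shut down; dropped request for group {group}");
            false
        }
    }
}

fn main() {
    let (tx, rx) = sync_channel(1);
    assert!(try_send_compaction_request(&tx, 3));
    drop(rx); // simulate compaction scheduler shutdown
    // No error reaches the caller; the failure is handled (logged) internally.
    assert!(!try_send_compaction_request(&tx, 4));
}
```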

@BugenZhao (Member) left a comment

LGTM. Thanks!

@zwang28 (Contributor, Author) commented Dec 29, 2022

The scale test fails after merging main; investigating 🤔
I don't think it's related to this PR.

mergify bot merged commit 105f931 into main on Dec 29, 2022
mergify bot deleted the wangzheng/refactor_notification branch on Dec 29, 2022 07:55
@BugenZhao (Member) commented

> The scale test fails after merging main; investigating 🤔
> I don't think it's related to this PR.

Looks similar to #7103. 🤔
