Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up cluster controller's wait on recoveredDiskFiles #9105

Merged
merged 2 commits into from
Jan 10, 2023

Conversation

sfc-gh-yiwu
Copy link
Collaborator

@sfc-gh-yiwu sfc-gh-yiwu commented Jan 7, 2023

The recoveredDiskFiles is a promise that will be fulfilled once all the local TLog and storage files have been initialized in a process. It was added previously to make a process wait on it before joining the cluster, and it was to avoid a slow recovering TLog to join the cluster to slowdown cluster recovery.

With #7510, we allow a process to join the cluster to play stateless role, while still avoid it to join the cluster as stateful role before its TLog and storage is recovered. As such, the recoveredDiskFiles wait is no longer needed. This PR cleanup the logic.

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: fe3051d
  • Duration 0:18:14
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: fe3051d
  • Duration 0:37:24
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@fdb-windows-ci
Copy link
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: fe3051d
  • Duration 0:45:58
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: fe3051d
  • Duration 2:01:57
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@sfc-gh-yiwu sfc-gh-yiwu marked this pull request as ready for review January 9, 2023 21:57
Copy link
Collaborator

@sfc-gh-ahusain sfc-gh-ahusain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!
Thanks for the cleanup and simplification.

@fdb-windows-ci
Copy link
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: c9247bf
  • Duration 0:20:17
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: c9247bf
  • Duration 0:54:15
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: c9247bf
  • Duration 1:20:58
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: c9247bf
  • Duration 2:03:27
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@sfc-gh-yiwu sfc-gh-yiwu merged commit 0849f60 into apple:main Jan 10, 2023
@sfc-gh-yiwu sfc-gh-yiwu deleted the encrypt_recoveredfiles branch January 10, 2023 00:26
xis19 pushed a commit to xis19/foundationdb that referenced this pull request Jan 11, 2023
The `recoveredDiskFiles` is a promise that will be fulfilled once all the local TLog and storage files have been initialized in a process. It was added previously to make a process wait on it before joining the cluster, and it was to avoid a slow recovering TLog to join the cluster to slowdown cluster recovery. 

With apple#7510, we allow a process to join the cluster to play stateless role, while still avoid it to join the cluster as stateful role before its TLog and storage is recovered. As such, the `recoveredDiskFiles` wait is no longer needed. This PR cleanup the logic.
sfc-gh-nwijetunga added a commit to sfc-gh-nwijetunga/foundationdb that referenced this pull request Jan 14, 2023
…ant-deletion

* nim/restore-optional-to-required: (40 commits)
  Use existing function to read database configuration
  Ignore `g_simulator` when testing on a real cluster
  More ss cf perf fixes main (apple#9109)
  The metacluster consistency check didn't account for the possibility that a partially applied operation could leave the set of tenant groups different between the management cluster and a data cluster. Also update metacluster consistency to use comparison based asserts, where appropriate.
  fix the no tenant check failure
  Trigger a commit if none happens within some amount of time when a tenant lookup is performed
  Change TLog pull async data warning timeout
  clearify the return type
  Blob Worker Encryption doesn't use BG_METADATA_SOURCE (apple#9121)
  Restart joshua
  Add tokensign dependency for Windows
  Trace data hall id in MachineMetrics events
  Add event for txn server initialization and a warning for TLog slow catching up
  fix assertion error
  check SetVersionstampedKey offset
  toml file format
  Resolver uses Encryption DB Config (apple#9002)
  Apply suggestions from code review
  Add data verification at the end of BlobRestoreBasic.toml
  Clean up cluster controller's wait on recoveredDiskFiles (apple#9105)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants