More ss cf perf fixes main #9109

Merged
merged 6 commits into from
Jan 12, 2023

sfc-gh-jslocum
Collaborator

@sfc-gh-jslocum commented Jan 9, 2023

This PR contains several change feed performance optimizations, each in its own commit.

  • One large optimization concerns fetching feeds from storage servers that are behind. So that a lagging storage server can catch up instead of spending time forwarding new mutations to fetching feed queries, a fetching feed query now waits until the storage server has fully caught up to the query's end version before it starts.
  • Optimizing feed map data structures and removing per-loop accesses where possible
  • Coalescing contiguous client-side feed query ranges when they go to the same team. This is especially common during data movement: if a shard is split and one half is being moved away, the shard map holds [A - B) = (old team) and [B - C) = (old team) as separate shards until the move finishes. Originally these produced 2+ separate feed queries to the same team; now they produce one. This greatly reduces the number of individual feed query streams during heavy data movement and shard splitting.
  • The largest optimization is to limit the in-memory bytes traversed, rather than the bytes returned, in getChangeFeedMutations, similar to how the disk portion works. With skewed write workloads and range-filtered feed reads, the old returned-bytes limit caused feed queries to read essentially the entire in-memory mutation queue on every getChangeFeedMutations loop, which completely dominated storage CPU.
  • Using the same trick Redwood does (skipping common prefixes) to make filterMutations more CPU-efficient.
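
The range coalescing described above can be illustrated with a minimal sketch. This is not the code from this PR: `ShardLocation` and `coalesceByTeam` are hypothetical names, keys are plain strings, and a team is modeled as a list of storage server IDs assumed to be stored in a consistent (sorted) order. The input is assumed to be sorted by begin key and non-overlapping.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the client's shard map;
// not FoundationDB's actual type.
struct ShardLocation {
    std::string begin;     // inclusive begin key
    std::string end;       // exclusive end key
    std::vector<int> team; // storage server IDs, assumed sorted
};

// Merge neighbouring entries that are contiguous and served by the same team,
// so that one feed query can cover what used to be several.
// Assumes `shards` is sorted by begin key and non-overlapping.
std::vector<ShardLocation> coalesceByTeam(const std::vector<ShardLocation>& shards) {
    std::vector<ShardLocation> merged;
    for (const ShardLocation& s : shards) {
        if (!merged.empty() && merged.back().end == s.begin && merged.back().team == s.team) {
            // Contiguous and same team: extend the previous range.
            merged.back().end = s.end;
        } else {
            merged.push_back(s);
        }
    }
    return merged;
}
```

With the split-shard example from the description, [A - B) and [B - C) on the old team collapse into a single [A - C) entry, while a following range on a different team stays separate.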

100k BlobGranule* correctness in progress
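
For reviewers, the traversed-bytes limit from the description can be sketched roughly as follows. This is an illustration under assumptions, not the actual getChangeFeedMutations code: `Mutation`, `ScanResult`, and `scanWithTraversalLimit` are hypothetical names, and the in-memory queue is modeled as a `std::deque` of key/value strings. The point is that the byte budget is charged for every mutation examined, so a range filter that matches little cannot force a scan of the whole queue in one loop iteration.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical simplified mutation; not FoundationDB's actual type.
struct Mutation {
    std::string key;
    std::string value;
};

struct ScanResult {
    std::vector<Mutation> matched; // mutations inside the filter range
    std::size_t nextIndex;         // where the next loop iteration resumes
};

// Scan `queue` from `start`, collecting mutations whose key lies in
// [rangeBegin, rangeEnd). The budget counts bytes traversed, not bytes
// returned, so non-matching mutations also consume it.
ScanResult scanWithTraversalLimit(const std::deque<Mutation>& queue,
                                  std::size_t start,
                                  const std::string& rangeBegin,
                                  const std::string& rangeEnd,
                                  std::size_t byteLimit) {
    ScanResult result{ {}, start };
    std::size_t traversed = 0;
    while (result.nextIndex < queue.size() && traversed < byteLimit) {
        const Mutation& m = queue[result.nextIndex++];
        traversed += m.key.size() + m.value.size();
        if (m.key >= rangeBegin && m.key < rangeEnd) {
            result.matched.push_back(m);
        }
    }
    return result;
}
```

A caller would invoke this repeatedly, passing the previous `nextIndex` as `start`, so each iteration does a bounded amount of work even when the filter range receives few writes.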

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: aa0c52d
  • Duration 0:04:35
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: aa0c52d
  • Duration 0:05:14
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: aa0c52d
  • Duration 0:05:14
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: aa0c52d
  • Duration 0:05:37
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: e08d8f5
  • Duration 0:23:21
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: e08d8f5
  • Duration 0:33:32
  • Result: ❌ FAILED
  • Error: Error while executing command: docker build --label "org.foundationdb.version=${FDB_VERSION}" --label "org.foundationdb.build_date=${BUILD_DATE}" --label "org.foundationdb.commit=${COMMIT_SHA}" --progress plain --build-arg FDB_VERSION="${FDB_VERSION}" --build-arg FDB_LIBRARY_VERSIONS="${FDB_VERSION}" --build-arg FDB_WEBSITE="${FDB_WEBSITE}" --tag foundationdb/foundationdb-kubernetes-sidecar:${FDB_VERSION}-${COMMIT_SHA}-1 --file Dockerfile --target foundationdb-kubernetes-sidecar .. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: e08d8f5
  • Duration 0:50:30
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: e08d8f5
  • Duration 1:05:44
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@sfc-gh-jslocum force-pushed the more_ss_cf_perf_fixes_main branch from e08d8f5 to cf93fb8 on January 11, 2023 22:12
@foundationdb-ci
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: cf93fb8
  • Duration 0:18:20
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: cf93fb8
  • Duration 0:31:11
  • Result: ❌ FAILED
  • Error: Error while executing command: docker build --label "org.foundationdb.version=${FDB_VERSION}" --label "org.foundationdb.build_date=${BUILD_DATE}" --label "org.foundationdb.commit=${COMMIT_SHA}" --progress plain --build-arg FDB_VERSION="${FDB_VERSION}" --build-arg FDB_LIBRARY_VERSIONS="${FDB_VERSION}" --build-arg FDB_WEBSITE="${FDB_WEBSITE}" --tag foundationdb/base:${FDB_VERSION}-${COMMIT_SHA} --file Dockerfile --target base .. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS Monterey 12.x

  • Commit ID: cf93fb8
  • Duration 0:43:03
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x

  • Commit ID: cf93fb8
  • Duration 0:52:40
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: cf93fb8
  • Duration 1:04:08
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: cf93fb8
  • Duration 1:04:43
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@sfc-gh-jslocum merged commit b6450f9 into apple:main Jan 12, 2023
sfc-gh-nwijetunga added a commit to sfc-gh-nwijetunga/foundationdb that referenced this pull request Jan 14, 2023
…ant-deletion

* nim/restore-optional-to-required: (40 commits)
  Use existing function to read database configuration
  Ignore `g_simulator` when testing on a real cluster
  More ss cf perf fixes main (apple#9109)
  The metacluster consistency check didn't account for the possibility that a partially applied operation could leave the set of tenant groups different between the management cluster and a data cluster. Also update metacluster consistency to use comparison based asserts, where appropriate.
  fix the no tenant check failure
  Trigger a commit if none happens within some amount of time when a tenant lookup is performed
  Change TLog pull async data warning timeout
  clearify the return type
  Blob Worker Encryption doesn't use BG_METADATA_SOURCE (apple#9121)
  Restart joshua
  Add tokensign dependency for Windows
  Trace data hall id in MachineMetrics events
  Add event for txn server initialization and a warning for TLog slow catching up
  fix assertion error
  check SetVersionstampedKey offset
  toml file format
  Resolver uses Encryption DB Config (apple#9002)
  Apply suggestions from code review
  Add data verification at the end of BlobRestoreBasic.toml
  Clean up cluster controller's wait on recoveredDiskFiles (apple#9105)
  ...
4 participants