(PDB-5691) benchmark: space out command types #3880

Merged: 2 commits merged on Sep 26, 2023

austb (Contributor) commented Sep 15, 2023

To alleviate the certname row contention, this replaces the pipeline that sends each command separately to the submission pipeline with a go-loop that schedules the sending of catalogs and reports at a later time (a delay of 5-15 seconds for each). This should more accurately simulate the actual behavior and allow more realistic performance testing.

Every command runs scf-storage/maybe-activate-node!, which causes contention on the certname row when multiple commands for the same certname are processed simultaneously. Previously we generated a set of commands for a certname and submitted them all at once, so there was contention on most submissions from benchmark. This does not reflect reality: a catalog is delayed by the time it takes puppetserver to compile the node's catalog after receiving the factset, and the report is delayed after catalog compilation by the time it takes the agent to apply it.
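
As a rough illustration of the scheduling idea described above (not the actual change, which is Clojure and uses a core.async go-loop), here is a minimal Python sketch. All names are hypothetical, and it assumes the report delay is counted from the catalog's send time, so the last command for a node goes out at most 30 seconds after its factset.

```python
import heapq
import random

# Hypothetical constants mirroring the 5-15 second delay from the description.
MIN_DELAY_S = 5
MAX_DELAY_S = 15


def schedule_node_commands(certname, now, rng):
    """Return (send_time, certname, kind) entries for one simulated agent run.

    The factset is sent immediately; the catalog and the report are each
    pushed back by a random delay (assumption: the report delay is relative
    to the catalog's send time).
    """
    catalog_at = now + rng.uniform(MIN_DELAY_S, MAX_DELAY_S)
    report_at = catalog_at + rng.uniform(MIN_DELAY_S, MAX_DELAY_S)
    return [
        (now, certname, "factset"),
        (catalog_at, certname, "catalog"),
        (report_at, certname, "report"),
    ]


def build_send_queue(certnames, now=0.0, seed=0):
    """Build a min-heap of pending commands ordered by send time."""
    rng = random.Random(seed)
    queue = []
    for certname in certnames:
        for entry in schedule_node_commands(certname, now, rng):
            heapq.heappush(queue, entry)
    return queue


def drain(queue):
    """Yield commands in the order they would be submitted."""
    while queue:
        yield heapq.heappop(queue)
```

With this ordering, two commands for the same certname are always at least 5 seconds apart, which is the property that keeps them from contending for the same certname row.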

austb (Contributor, Author) commented Sep 15, 2023

Marked as don't merge until I complete the performance testing of this change.

It might be helpful to review this with whitespace changes off.

@austb austb force-pushed the pdb-5691/main/benchmark-delay-commands branch from a001d97 to 76119a1 Compare September 15, 2023 18:03
@austb austb marked this pull request as ready for review September 15, 2023 18:08
@austb austb requested review from a team as code owners September 15, 2023 18:08
austb (Contributor, Author) commented Sep 15, 2023

This change does not appear to have degraded performance.

In the performance test environment, benchmark is still able to send the 83.333 commands per second required to simulate 50,000 nodes.
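
The 83.333 figure can be reproduced with a short calculation. This is a sketch under two assumptions: the 30-minute run interval shown in the benchmark output, and three commands per node per run (factset, catalog, report), which is not stated explicitly here but matches the reported numbers. The function names are made up for illustration.

```python
RUN_INTERVAL_S = 30 * 60   # 30-minute run interval, as printed by benchmark
COMMANDS_PER_RUN = 3       # assumption: one factset, one catalog, one report


def required_rate(nodes):
    """Commands per second needed to simulate the given number of nodes."""
    return nodes * COMMANDS_PER_RUN / RUN_INTERVAL_S


def equivalent_nodes(rate):
    """Number of nodes whose load corresponds to a commands-per-second rate."""
    return rate * RUN_INTERVAL_S / COMMANDS_PER_RUN


print(round(required_rate(50_000), 3))  # 83.333
```

The same relation run in reverse gives the "load equivalent to N nodes" values in the output further down: each message per second corresponds to 600 nodes.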

Perf test results (locally)

Before this change, submitting 5000 nodes:

real    6m28.132s
user    10m0.507s
sys     0m24.076s

After this change, submitting 5000 nodes:

real    7m10.615s
user    10m11.129s
sys     0m24.837s

We see a small increase in user CPU time and a moderate increase in real time. The increase in real time is at least partially because commands are no longer all sent immediately; they are scheduled to send within the next 30 seconds. So instead of ending at a high rate, the run ends at a slow rate as the final commands are submitted.
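
To put numbers on "small" and "moderate", the two timing runs above can be compared directly. This is a small sketch; the helper names are hypothetical, and the inputs are copied from the `time` output in this comment.

```python
def parse_time(text):
    """Convert a `time`-style value such as '6m28.132s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def deltas(before, after):
    """Return {metric: (absolute increase in s, relative increase)}."""
    result = {}
    for key in before:
        b = parse_time(before[key])
        a = parse_time(after[key])
        result[key] = (a - b, (a - b) / b)
    return result


before = {"real": "6m28.132s", "user": "10m0.507s", "sys": "0m24.076s"}
after = {"real": "7m10.615s", "user": "10m11.129s", "sys": "0m24.837s"}

for metric, (abs_s, rel) in deltas(before, after).items():
    print(f"{metric}: +{abs_s:.3f}s ({rel:.1%})")
```

Real time grows by roughly 42 seconds (about 11%), while user CPU time grows by roughly 11 seconds (under 2%).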

End of the run before the change:

Sending 45.717 messages/s (load equivalent to 27,429 nodes with a run interval of 30 minutes)
Sending 46.684 messages/s (load equivalent to 28,010 nodes with a run interval of 30 minutes)
Cleaning up temp files from "/tmp/pdb-bench-16280516017042729251"
Sending 44.737 messages/s (load equivalent to 26,842 nodes with a run interval of 30 minutes)
Finished cleaning up temp files

End of the run after the change:

Cleaning up temp files from "/tmp/pdb-bench-9898774072576177076"
Sending 36.749 messages/s (load equivalent to 22,049 nodes with a run interval of 30 minutes)
Finished cleaning up temp files
Sending 24.922 messages/s (load equivalent to 14,953 nodes with a run interval of 30 minutes)
Sending 10.923 messages/s (load equivalent to 6,553 nodes with a run interval of 30 minutes)
Sending 3.723 messages/s (load equivalent to 2,233 nodes with a run interval of 30 minutes)

@austb austb removed the don't merge label Sep 15, 2023
@austb austb force-pushed the pdb-5691/main/benchmark-delay-commands branch 3 times, most recently from 203034b to e769809 Compare September 26, 2023 20:27
@austb austb force-pushed the pdb-5691/main/benchmark-delay-commands branch from e769809 to 75c90ed Compare September 26, 2023 21:33
@austb austb merged commit 632f383 into puppetlabs:main Sep 26, 2023
11 of 12 checks passed
@austb austb deleted the pdb-5691/main/benchmark-delay-commands branch September 26, 2023 22:50