[PROPOSAL] Run performance benchmarks on pull requests #14553

Closed · 1 of 2 tasks · rishabh6788 opened this issue Jun 26, 2024 · 8 comments
Labels: enhancement, Other, Performance

rishabh6788 (Contributor) commented Jun 26, 2024

Is your feature request related to a problem? Please describe

Creating this issue to gather feedback and track progress on a feature that lets contributors submit performance benchmark runs on pull requests in the OpenSearch repository or any plugin repository, with the results posted back on the same pull request.

The fundamentals of the proposed design are:

  • Open source in the true sense: any contributor who wants to run a performance benchmark on their pull request should be able to do so.
  • To prevent abuse and the execution of malicious code, maintainers will have the final authority to allow or deny each request.
  • Maintainers of the repository should be able to submit benchmarks without any approval.
  • Users should have a baseline to compare their results against.

Describe the solution you'd like

The solution we are proposing is very similar to how gradle-check runs on each PR.
We will leverage GitHub Actions and the Jenkins remote job execution feature to enable this automation.
The choice was between a label trigger and an issue_comment trigger for this workflow.
We recommend using issue_comment, as it gives the flexibility to easily manage many different cluster and workload configurations; that is not impossible with labels, but there would be too many labels to maintain and manage.
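As a rough sketch only, an issue_comment-triggered workflow could look like the following; the trigger phrase, job name, and parsing logic here are assumptions for illustration, not the final design:

```yaml
# Hypothetical sketch: run benchmarks when a PR comment starts with a trigger phrase.
name: benchmark-pull-request
on:
  issue_comment:
    types: [created]

jobs:
  benchmark:
    # issue_comment fires for both issues and PRs; keep only PR comments
    # that start with the (assumed) trigger phrase.
    if: github.event.issue.pull_request && startsWith(github.event.comment.body, 'run-benchmark-test')
    runs-on: ubuntu-latest
    steps:
      - name: Parse the benchmark config id from the comment
        run: |
          echo "CONFIG_ID=$(echo '${{ github.event.comment.body }}' | awk '{print $2}')" >> "$GITHUB_ENV"
```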

Who can submit performance runs?

Any developer working on the OpenSearch project can submit a benchmark run request by adding a comment on the pull request; the comment will have to follow a particular format for the job to be able to process it.
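Purely as an illustration consistent with the sketch above (the actual comment format will be defined separately), such a trigger comment might look like:

```
run-benchmark-test id_1
```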

Who can approve/deny performance run request?

Any available maintainer can approve/deny the performance run request.

How does the approval mechanism work?

We will use the tried and trusted https://github.com/trstringer/manual-approval GitHub Action to manage approvals. Once a legitimate comment is added on the PR, the approval action creates an issue and links that issue to the PR in a comment; a maintainer then has to comment approve or deny on the issue. Once done, the issue auto-closes and the rest of the execution continues.
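A minimal sketch of the approval step, with placeholder approver names and issue title; see the action's documentation for the full input list:

```yaml
      - name: Wait for maintainer approval
        uses: trstringer/manual-approval@v1
        with:
          secret: ${{ github.TOKEN }}
          # Placeholder list; in practice this would be the repo maintainers.
          approvers: maintainer1,maintainer2
          minimum-approvals: 1
          issue-title: "Approve benchmark run for PR #${{ github.event.issue.number }}"
```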

Why is approval needed?

Running a benchmark requires significant infra resources, since it spins up a production-grade cluster and a benchmarking client, and we don't want a bad actor to abuse the system by submitting too many requests and overwhelming it. The other reason is the possibility of malicious code execution, given that we will be running an OpenSearch process containing the code submitted by the developer.

How will you manage cluster and workload settings?

The plan is to have a JSON file containing various cluster and workload configurations, each with a unique id. This id will be part of the formatted comment added to invoke the workflow. The user needs to look at the config file and decide which configuration they want to use for the benchmark run.
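A hypothetical example of such a file; all field names in this sketch are illustrative assumptions, not a committed schema:

```json
{
  "id_1": {
    "description": "Indexing-only benchmark on a 3-data-node cluster",
    "cluster": { "data_nodes": 3, "instance_type": "r5.xlarge" },
    "workload": "nyc_taxis",
    "restore_from_snapshot": false
  },
  "id_2": {
    "description": "Search-only benchmark restored from an existing snapshot",
    "cluster": { "data_nodes": 3, "instance_type": "r5.xlarge" },
    "workload": "big5",
    "restore_from_snapshot": true
  }
}
```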

Below is the sample flow:
[Diagram: pr_benchmark_flow]

Related component

Other

Describe alternatives you've considered

The alternative, which avoids the approval step, is to let a maintainer add the required comment to initiate the workflow. In this case the pull request submitter will have to ask a maintainer to add the desired comment on their behalf.
I am open to any other recommendation that makes this as smooth as possible for the maintainers.

Additional context

The Jenkins benchmark job that will be triggered is https://build.ci.opensearch.org/job/benchmark-test/.
The cluster will be spun up using opensearch-cluster-cdk in an AWS account.
opensearch-benchmark will be used to run the performance benchmark.
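As a sketch of how the workflow could invoke Jenkins' standard remote-trigger endpoint (buildWithParameters); the parameter names and secret names below are assumptions for illustration:

```yaml
      - name: Trigger the Jenkins benchmark job remotely
        env:
          JENKINS_USER: ${{ secrets.JENKINS_USER }}             # assumed secret name
          JENKINS_API_TOKEN: ${{ secrets.JENKINS_API_TOKEN }}   # assumed secret name
        run: |
          # buildWithParameters is Jenkins' standard parameterized remote trigger;
          # CONFIG_ID was exported by the comment-parsing step in the earlier sketch.
          curl -X POST \
            --user "$JENKINS_USER:$JENKINS_API_TOKEN" \
            "https://build.ci.opensearch.org/job/benchmark-test/buildWithParameters" \
            --data "pull_request_number=${{ github.event.issue.number }}" \
            --data "config_id=$CONFIG_ID"
```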

Related Milestones:

rishabh6788 added the enhancement and untriaged labels on Jun 26, 2024
The github-actions bot added the Other label on Jun 26, 2024
rishabh6788 (Contributor, Author) commented:
tagging @dblock @reta @andrross @peternied @getsaurabh02 for feedback and comments.

reta (Collaborator) commented Jun 26, 2024

> tagging @dblock @reta @andrross @peternied @getsaurabh02 for feedback and comments.

Thanks a lot @rishabh6788, the workflow + manual approval look great to me to start with.

peternied (Member) commented:
[Triage - attendees 1 2 3]
@rishabh6788 Thanks for creating this issue, looking forward to a pull request to resolve it.

peternied (Member) commented:
@rishabh6788 Nice work - looking forward to seeing it in action!

andrross (Member) commented:
@rishabh6788 What happens to the issue created in the "Create an issue in the repo" step? Is it automatically closed somewhere?

rishabh6788 (Contributor, Author) commented:
> @rishabh6788 What happens to the issue created in the "Create an issue in the repo" step? Is it automatically closed somewhere?

The issue's lifecycle will be maintained by the workflow: once it opens and gets added to the PR, a maintainer has to add the appropriate comment, and then it will auto-close. See the sample at opensearch-project/opensearch-benchmark#567

sohami (Collaborator) commented Jul 1, 2024

@rishabh6788 This looks great. A few questions:

  1. Assuming you are planning to share the format of the comment that triggers perf runs in a separate issue. It would be great if we could consider performing the runs both by freshly indexing the data on each run and by restoring the data from a snapshot.
  2. Does the maintainer need to be aware of or verify the load on the backend infra before approving the request (assuming no)? If there are any failures during execution post-approval, can we post those errors to the newly created issue? Or even tag an infra team member to help with it?

rishabh6788 (Contributor, Author) commented Jul 1, 2024

> @rishabh6788 This looks great. A few questions:
>
>   1. Assuming you are planning to share the format of the comment that triggers perf runs in a separate issue. It would be great if we could consider performing the runs both by freshly indexing the data on each run and by restoring the data from a snapshot.
>   2. Does the maintainer need to be aware of or verify the load on the backend infra before approving the request (assuming no)? If there are any failures during execution post-approval, can we post those errors to the newly created issue? Or even tag an infra team member to help with it?

Thank you for the feedback @sohami.
For 1, the plan is to use a separate JSON file to handle all the varied configurations, giving the user the flexibility to choose an indexing-only benchmark or a search-only benchmark (which will use our existing snapshots), and also to use any feature flags, such as enabling concurrent-search on the cluster. The user will have to add a comment on the PR with the id of the configuration they want to run.
The issue exists only to pause the workflow and get approval from a maintainer when a non-maintainer submits a request; the issue will auto-close once a maintainer has approved or denied the request, and the workflow will continue with execution.

For 2, the maintainer need not worry about the load on the infra; we will have monitors and alerts to let the infra team know if there are any faults or errors. In case of a workflow failure, a comment will be added on the PR with the job link so a maintainer can take a look and inform the infra team if they see any infra-related issues. We can also add a feature that creates an issue for the infra team to look at.
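To make the configuration flexibility from point 1 concrete, a hypothetical entry (all field names assumed, as in the earlier sketch) enabling concurrent segment search might look like:

```json
{
  "id_3": {
    "description": "Search-only benchmark with concurrent segment search enabled",
    "cluster": {
      "data_nodes": 3,
      "additional_config": { "search.concurrent_segment_search.enabled": true }
    },
    "workload": "big5",
    "restore_from_snapshot": true
  }
}
```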
