-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add workflow to test perf using sightglass #3740
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,151 @@ | ||
# This is a workflow triggered by PR or triggered manually | ||
# Runs quick performance tests and reports the comparison against HEAD | ||
# Test should take less than 10 minutes to run on current self-hosted devices | ||
name: "Performance Testing" | ||
|
||
# Controls when the action will run. | ||
# Workflow runs when manually triggered using the UI or API. | ||
on: | ||
push: | ||
pull_request: | ||
branches: [ main ] | ||
|
||
# Env variables | ||
env: | ||
SG_COMMIT: 649509c | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it possible to use the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a ref from the main branch of sightglas .. it's actually HEAD as of today. However I wanted to talk about versioning with sightglass so as to be more transparent about what we are using to run tests. |
||
|
||
jobs: | ||
Performance_x86-64: | ||
name: Performance x86-64 | ||
runs-on: [self-hosted, linux, x64] | ||
# Inputs the workflow accepts. | ||
steps: | ||
- run: echo "This job is now running on a ${{ runner.os }} self-hosted server." | ||
|
||
- name: Setup Rust Toolchain | ||
uses: actions-rs/toolchain@v1 | ||
with: | ||
toolchain: stable | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Personally I'd prefer to keep the usage of external actions to a minimum for the wasmtime repo, so here if we can arrange for rustup to be installed on the hosted runner this I think could be |
||
|
||
- name: Build sightglass commit ${{env.SG_COMMIT}}" | ||
run: | | ||
cd ../ && ls -l && rm -rf ./sightglass | ||
git clone https://github.com/bytecodealliance/sightglass.git && cd ./sightglass | ||
git checkout ${{env.SG_COMMIT}} | ||
cargo build --release | ||
|
||
- name: Checkout ${{ github.ref }} from ${{ github.repository }} | ||
uses: actions/checkout@v2 | ||
with: | ||
submodules: true | ||
path: wasmtime_commit | ||
|
||
- name: Build ${{ github.ref }} | ||
working-directory: ./wasmtime_commit | ||
run: | | ||
cargo build --release -p wasmtime-bench-api | ||
cp target/release/libwasmtime_bench_api.so /tmp/wasmtime_commit.so | ||
|
||
- name: Checkout Main | ||
uses: actions/checkout@v2 | ||
with: | ||
ref: 'main' | ||
submodules: true | ||
path: wasmtime_main | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When a push to Ideally though it'd be awesome if we could do just one build as part of this PR and use stored data for previous runs perhaps for comparison. We should be guaranteed that all commits on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To be honest, I think there is some caching that is taking place via this github code that lives on the server. I do though the blast the sightglass directory and rebuild that. I don't blast the wasmtime directories but at some point during developing this it seemed those directories were removed but then very quickly rebuilt by the environment, implying to me that something was cached. This iteration doesn't actually test against much of anything (that will be critiqued for improvement), but as a result this entire process from start to finish only takes about 4-1/2 minutes, so even if everything is rebuilt it currently really doesn't take that long. Also like I mentioned above (and suggested offline), I want this to only run on demand not after every PR is submitted or updated. For this patch, I wanted to have something that was for sure going to work every time and then get fancy with the caching and other efficiency improvements on a future iteration. If there is something obvious to change now though, lets do it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah ok if it's fast enough then seems reasonable to at least start with this and we can alter later if it ever becomes necessary |
||
|
||
- name: Build Main | ||
working-directory: ./wasmtime_main | ||
run: | | ||
cargo build --release -p wasmtime-bench-api | ||
cp target/release/libwasmtime_bench_api.so /tmp/wasmtime_main.so | ||
|
||
- name: Run performance tests | ||
working-directory: ../sightglass | ||
run: | | ||
cargo run -- \ | ||
benchmark \ | ||
--processes 1 \ | ||
--iterations-per-process 2 \ | ||
--engine /tmp/wasmtime_main.so \ | ||
--engine /tmp/wasmtime_commit.so \ | ||
--output-format csv \ | ||
--output-file /tmp/results.csv \ | ||
--raw \ | ||
-- benchmarks-next/blake3-scalar/benchmark.wasm | ||
./target/release/sightglass-cli summarize --input-format csv --output-format csv -f /tmp/results.csv > /tmp/results_summarized.csv | ||
|
||
- name: Setup Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.9' | ||
|
||
- name: Post Process Perf Results | ||
run: | | ||
pip3 install pandas | ||
pip3 install numpy | ||
grep -v "cycles" /tmp/results_summarized.csv > /tmp/results_cycles_summarized.csv | ||
sed -i 's/\/tmp\/wasmtime_commit.so/patch/g' /tmp/results_cycles_summarized.csv | ||
sed -i 's/\/tmp\/wasmtime_main.so/main/g' /tmp/results_cycles_summarized.csv | ||
python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp.sort_values('phase', ascending=True); print(pp_sorted.to_string(index=False));" > /tmp/results_cycles_summarized_pp_sorted.csv | ||
sed -i 's/^/ /' /tmp/results_cycles_summarized_pp_sorted.csv | ||
sed -i 's/ \+/|/g' /tmp/results_cycles_summarized_pp_sorted.csv | ||
sed -i -z 's/\n/|\n/g' /tmp/results_cycles_summarized_pp_sorted.csv | ||
sed -i '2 i\ |-|-|-|-|' /tmp/results_cycles_summarized_pp_sorted.csv | ||
sed -i '1 i\ Performance results in clockticks (lower is better):\n' /tmp/results_cycles_summarized_pp_sorted.csv | ||
|
||
#python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp; print(pp_sorted.to_string(index=False)); pp_sorted.insert(4, '%Change (1-Head/Patch)',[ 1 -pp_sorted.loc[0][3]/pp_sorted.loc[3][3], 0, 1-pp_sorted.loc[1][3]/pp_sorted.loc[4][3],0, 1-pp_sorted.loc[2][3]/pp_sorted.loc[5][3] ,0], True); print(pp_sorted)" > /tmp/results_cycles_summarized_pp_sorted2.csv | ||
python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp; pp_sorted.insert(4, '%Change_(1-Head/Patch)',[ 1 -pp_sorted.loc[0][3]/pp_sorted.loc[3][3], 1-pp_sorted.loc[1][3]/pp_sorted.loc[4][3], 1-pp_sorted.loc[2][3]/pp_sorted.loc[5][3],0,0,0], True); pp_sorted.drop('mean', axis=1, inplace=True); print(pp_sorted.to_string(index=False))" > /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i 's/^/ /' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i 's/ \+/|/g' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i -z 's/\n/|\n/g' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i '2 i\ |-|-|-|-|' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i '/main/d' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
sed -i '1 i\ Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):\n' /tmp/results_cycles_summarized_pp_sorted2.csv | ||
|
||
# - id: get-comment-body | ||
# name: Get Result | ||
# run: | | ||
# body="$(cat /tmp/results_cycles_summarized_pp_sorted.csv)" | ||
# body="${body//'%'/'%25'}" | ||
# body="${body//$'\n'/'%0A'}" | ||
# body="${body//$'\r'/'%0D'}" | ||
# echo "::set-output name=body::$body" | ||
|
||
- name: Find PR | ||
uses: jwalton/gh-find-current-pr@v1 | ||
id: findPr | ||
with: | ||
# Can be "open", "closed", or "all". Defaults to "open". | ||
state: open | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we might be able to detect this with various env vars set on github actions for prs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although if not my main concern about using third-party actions is the security implications. AFAIK we're largely giving all these actions read/write access to the wasmtime repo so a push to this jwalton/gh-find-current-pr repo could end up compromising us. One way to mitigate this perhaps would be to run this workflow in a separate repository that is triggered from this repository (or something like that). That way if our other repository gets messed up it's not the end of the world. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can get away with not using this findPr or other third-party actions which I'll try to do. That said, when you say "AFAIK we're largely giving all these actions read/write access to the wasmtime repo so a push to this jwalton/gh-find-current-pr repo could end up compromising us." do you mean read/write access to just the downloaded wasmtime repository that is housed on the remote server or are you thinking access to the wasmtime repository as it lives in github? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe that workflows generally get read/write tokens to the bytecodealliance/wasmtime repository that lives on github, so I'm under the impression that a malicious workflow could compromise big chunks of the source code and repo |
||
|
||
- run: echo ${{ steps.findPr.outputs.pr }} | ||
- run: echo ${{ steps.fc.outputs.comment-id }} | ||
# echo "${{ steps.findPr.outputs.pr }}" | ||
# echo "${{ steps.fc.outputs.comment-id }}" | ||
|
||
# - name: Publish Results | ||
# uses: peter-evans/create-or-update-comment@v1 | ||
# with: | ||
# issue-number: ${{ steps.findPr.outputs.pr }} | ||
# reactions: rocket | ||
# body: ${{ steps.get-comment-body.outputs.body }} | ||
|
||
- id: get-comment-body2 | ||
name: Get Result 2 | ||
run: | | ||
body2="$(cat /tmp/results_cycles_summarized_pp_sorted2.csv)" | ||
body2="${body2//'%'/'%25'}" | ||
body2="${body2//$'\n'/'%0A'}" | ||
body2="${body2//$'\r'/'%0D'}" | ||
echo "::set-output name=body2::$body2" | ||
|
||
- name: Publish Results 2 | ||
uses: peter-evans/create-or-update-comment@v1 | ||
with: | ||
issue-number: ${{ steps.findPr.outputs.pr }} | ||
reactions: rocket | ||
body: ${{ steps.get-comment-body2.outputs.body2 }} | ||
|
||
# Performance_Aarch64: | ||
# name: Performance Aarch64 | ||
# runs-on: [self-hosted, linux, x86-64] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll want to confirm the security story here before merging this. Unfortunately hooking up hosted runners to public repos is explicitly recommended against in the github actions documentation because it becomes pretty easy to run arbitrary code on these hosted runners from external contributors via PRs. I think there's various measures we could put in place to mitigate that but we'll want to make sure that's all in order before having this all hooked up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is some mechanism for generating/confirming with tokens that may be what we want. I don't believe I use that here as I was just following the github doc for a basic setup but it is something I'll look into. Also, since you highlighted the on: push/pull_request, note I want to change this to be on comment .. where the trigger occurs on a comment with the string /bench_x64 or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'd be worried about is that once we have a custom runner hooked up any contributor can make a PR with a new workflow file to run on all PRs which uses the custom runner which I think means we basically don't have any ability to limit what runs on the runner unless we can hard configure it to ignore PRs entirely (or something like that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was discussed today and I think the solution is to just only allow to have the privileges to run. Anyone else can run what this is running offline before committing. I will look into a way to have this be triggered via a comment (which I already have but need to work out a wrinkle) and then look to see if we can discover the author of that comment to allow running of the code based on that comment author.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'm mostly parroting this which says:
and this
I don't know why the "almost" is there, though.