Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add workflow to test perf using sightglass #3740

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
151 changes: 151 additions & 0 deletions .github/workflows/performance.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
# This is a workflow triggered by PR or triggered manually
# Runs quick performance tests and reports the comparison against HEAD
# Test should take less than 10 minutes to run on current self-hosted devices
name: "Performance Testing"

# Controls when the action will run.
# Workflow runs when manually triggered using the UI or API.
on:
push:
pull_request:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to confirm the security story here before merging this. Unfortunately hooking up hosted runners to public repos is explicitly recommended against in the github actions documentation because it becomes pretty easy to run arbitrary code on these hosted runners from external contributors via PRs. I think there's various measures we could put in place to mitigate that but we'll want to make sure that's all in order before having this all hooked up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is some mechanism for generating/confirming with tokens that may be what we want. I don't believe I use that here as I was just following the github doc for a basic setup but it is something I'll look into. Also, since you highlighted the on: push/pull_request, note I want to change this to be on comment .. where the trigger occurs on a comment with the string /bench_x64 or something like that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'd be worried about is that once we have a custom runner hooked up any contributor can make a PR with a new workflow file to run on all PRs which uses the custom runner which I think means we basically don't have any ability to limit what runs on the runner unless we can hard configure it to ignore PRs entirely (or something like that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed today and I think the solution is to just only allow to have the privileges to run. Anyone else can run what this is running offline before committing. I will look into a way to have this be triggered via a comment (which I already have but need to work out a wrinkle) and then look to see if we can discover the author of that comment to allow running of the code based on that comment author.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I'm mostly parroting this which says:

We recommend that you only use self-hosted runners with private repositories

and this

As a result, self-hosted runners should almost never be used for public repositories on GitHub

I don't know why the "almost" is there, though.

branches: [ main ]

# Env variables
env:
SG_COMMIT: 649509c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use the main branch of sightglass? (unsure if this is largely for testing or whether it's intended that we periodically update this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a ref from the main branch of sightglas .. it's actually HEAD as of today. However I wanted to talk about versioning with sightglass so as to be more transparent about what we are using to run tests.


jobs:
Performance_x86-64:
name: Performance x86-64
runs-on: [self-hosted, linux, x64]
# Inputs the workflow accepts.
steps:
- run: echo "This job is now running on a ${{ runner.os }} self-hosted server."

- name: Setup Rust Toolchain
uses: actions-rs/toolchain@v1
with:
toolchain: stable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd prefer to keep the usage of external actions to a minimum for the wasmtime repo, so here if we can arrange for rustup to be installed on the hosted runner this I think could be rustup update stable && rustup default stable


- name: Build sightglass commit ${{env.SG_COMMIT}}"
run: |
cd ../ && ls -l && rm -rf ./sightglass
git clone https://github.com/bytecodealliance/sightglass.git && cd ./sightglass
git checkout ${{env.SG_COMMIT}}
cargo build --release

- name: Checkout ${{ github.ref }} from ${{ github.repository }}
uses: actions/checkout@v2
with:
submodules: true
path: wasmtime_commit

- name: Build ${{ github.ref }}
working-directory: ./wasmtime_commit
run: |
cargo build --release -p wasmtime-bench-api
cp target/release/libwasmtime_bench_api.so /tmp/wasmtime_commit.so

- name: Checkout Main
uses: actions/checkout@v2
with:
ref: 'main'
submodules: true
path: wasmtime_main
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a push to main happens I think this'll double-checkout main and build it twice?

Ideally though it'd be awesome if we could do just one build as part of this PR and use stored data for previous runs perhaps for comparison. We should be guaranteed that all commits on main have performance information so comparing a new merge to main or a PR is ideally just about picking the revision to compare against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I think there is some caching that is taking place via this github code that lives on the server. I do though the blast the sightglass directory and rebuild that. I don't blast the wasmtime directories but at some point during developing this it seemed those directories were removed but then very quickly rebuilt by the environment, implying to me that something was cached.

This iteration doesn't actually test against much of anything (that will be critiqued for improvement), but as a result this entire process from start to finish only takes about 4-1/2 minutes, so even if everything is rebuilt it currently really doesn't take that long. Also like I mentioned above (and suggested offline), I want this to only run on demand not after every PR is submitted or updated. For this patch, I wanted to have something that was for sure going to work every time and then get fancy with the caching and other efficiency improvements on a future iteration. If there is something obvious to change now though, lets do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok if it's fast enough then seems reasonable to at least start with this and we can alter later if it ever becomes necessary


- name: Build Main
working-directory: ./wasmtime_main
run: |
cargo build --release -p wasmtime-bench-api
cp target/release/libwasmtime_bench_api.so /tmp/wasmtime_main.so

- name: Run performance tests
working-directory: ../sightglass
run: |
cargo run -- \
benchmark \
--processes 1 \
--iterations-per-process 2 \
--engine /tmp/wasmtime_main.so \
--engine /tmp/wasmtime_commit.so \
--output-format csv \
--output-file /tmp/results.csv \
--raw \
-- benchmarks-next/blake3-scalar/benchmark.wasm
./target/release/sightglass-cli summarize --input-format csv --output-format csv -f /tmp/results.csv > /tmp/results_summarized.csv

- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: '3.9'

- name: Post Process Perf Results
run: |
pip3 install pandas
pip3 install numpy
grep -v "cycles" /tmp/results_summarized.csv > /tmp/results_cycles_summarized.csv
sed -i 's/\/tmp\/wasmtime_commit.so/patch/g' /tmp/results_cycles_summarized.csv
sed -i 's/\/tmp\/wasmtime_main.so/main/g' /tmp/results_cycles_summarized.csv
python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp.sort_values('phase', ascending=True); print(pp_sorted.to_string(index=False));" > /tmp/results_cycles_summarized_pp_sorted.csv
sed -i 's/^/ /' /tmp/results_cycles_summarized_pp_sorted.csv
sed -i 's/ \+/|/g' /tmp/results_cycles_summarized_pp_sorted.csv
sed -i -z 's/\n/|\n/g' /tmp/results_cycles_summarized_pp_sorted.csv
sed -i '2 i\ |-|-|-|-|' /tmp/results_cycles_summarized_pp_sorted.csv
sed -i '1 i\ Performance results in clockticks (lower is better):\n' /tmp/results_cycles_summarized_pp_sorted.csv

#python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp; print(pp_sorted.to_string(index=False)); pp_sorted.insert(4, '%Change (1-Head/Patch)',[ 1 -pp_sorted.loc[0][3]/pp_sorted.loc[3][3], 0, 1-pp_sorted.loc[1][3]/pp_sorted.loc[4][3],0, 1-pp_sorted.loc[2][3]/pp_sorted.loc[5][3] ,0], True); print(pp_sorted)" > /tmp/results_cycles_summarized_pp_sorted2.csv
python3 -c "import pandas as pd; pp = pd.read_csv('/tmp/results_cycles_summarized.csv', usecols=['arch','engine','phase', 'mean'], header=0); pp_sorted = pp; pp_sorted.insert(4, '%Change_(1-Head/Patch)',[ 1 -pp_sorted.loc[0][3]/pp_sorted.loc[3][3], 1-pp_sorted.loc[1][3]/pp_sorted.loc[4][3], 1-pp_sorted.loc[2][3]/pp_sorted.loc[5][3],0,0,0], True); pp_sorted.drop('mean', axis=1, inplace=True); print(pp_sorted.to_string(index=False))" > /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i 's/^/ /' /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i 's/ \+/|/g' /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i -z 's/\n/|\n/g' /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i '2 i\ |-|-|-|-|' /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i '/main/d' /tmp/results_cycles_summarized_pp_sorted2.csv
sed -i '1 i\ Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):\n' /tmp/results_cycles_summarized_pp_sorted2.csv

# - id: get-comment-body
# name: Get Result
# run: |
# body="$(cat /tmp/results_cycles_summarized_pp_sorted.csv)"
# body="${body//'%'/'%25'}"
# body="${body//$'\n'/'%0A'}"
# body="${body//$'\r'/'%0D'}"
# echo "::set-output name=body::$body"

- name: Find PR
uses: jwalton/gh-find-current-pr@v1
id: findPr
with:
# Can be "open", "closed", or "all". Defaults to "open".
state: open
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be able to detect this with various env vars set on github actions for prs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although if not my main concern about using third-party actions is the security implications. AFAIK we're largely giving all these actions read/write access to the wasmtime repo so a push to this jwalton/gh-find-current-pr repo could end up compromising us.

One way to mitigate this perhaps would be to run this workflow in a separate repository that is triggered from this repository (or something like that). That way if our other repository gets messed up it's not the end of the world.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get away with not using this findPr or other third-party actions which I'll try to do. That said, when you say "AFAIK we're largely giving all these actions read/write access to the wasmtime repo so a push to this jwalton/gh-find-current-pr repo could end up compromising us." do you mean read/write access to just the downloaded wasmtime repository that is housed on the remote server or are you thinking access to the wasmtime repository as it lives in github?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that workflows generally get read/write tokens to the bytecodealliance/wasmtime repository that lives on github, so I'm under the impression that a malicious workflow could compromise big chunks of the source code and repo


- run: echo ${{ steps.findPr.outputs.pr }}
- run: echo ${{ steps.fc.outputs.comment-id }}
# echo "${{ steps.findPr.outputs.pr }}"
# echo "${{ steps.fc.outputs.comment-id }}"

# - name: Publish Results
# uses: peter-evans/create-or-update-comment@v1
# with:
# issue-number: ${{ steps.findPr.outputs.pr }}
# reactions: rocket
# body: ${{ steps.get-comment-body.outputs.body }}

- id: get-comment-body2
name: Get Result 2
run: |
body2="$(cat /tmp/results_cycles_summarized_pp_sorted2.csv)"
body2="${body2//'%'/'%25'}"
body2="${body2//$'\n'/'%0A'}"
body2="${body2//$'\r'/'%0D'}"
echo "::set-output name=body2::$body2"

- name: Publish Results 2
uses: peter-evans/create-or-update-comment@v1
with:
issue-number: ${{ steps.findPr.outputs.pr }}
reactions: rocket
body: ${{ steps.get-comment-body2.outputs.body2 }}

# Performance_Aarch64:
# name: Performance Aarch64
# runs-on: [self-hosted, linux, x86-64]