
Try 1 fast 2CPU and 3 slow 1CPU aarch64 runners #2040

Merged: 12 commits into jupyter:main on Jan 14, 2024

Conversation

@mathbunnyru (Member) commented Nov 23, 2023

Describe your changes

Depends on: #2066
Our time measurements will change after switching to actions/{download,upload}-artifact v4.
After we merge the mentioned PR, it might make sense to reconsider the speed of self-hosted runners we're using.

Issue ticket if applicable

Checklist (especially for first-time contributors)

  • I have performed a self-review of my code
  • If it is a core feature, I have added thorough tests
  • I will try not to use force-push to make the review process easier for reviewers
  • I have updated the documentation for significant changes

@mathbunnyru mathbunnyru changed the title Try 4 aarch64 1CPU runners Try 2 fast and 3 slow aarch64 1CPU runners Nov 25, 2023
@mathbunnyru mathbunnyru reopened this Nov 30, 2023
@mathbunnyru mathbunnyru changed the title Try 2 fast and 3 slow aarch64 1CPU runners Try fast 2CPU and slow aarch64 1CPU runners Dec 11, 2023
@mathbunnyru mathbunnyru marked this pull request as draft January 7, 2024 14:58
@mathbunnyru (Member, Author) commented Jan 13, 2024

The fresh run: https://github.com/jupyter/docker-stacks/actions/runs/7513154764

The best speed we can achieve for the build phase (excluding the push and merge-tags steps, since there is nothing to optimize there, to be honest) is 1min21s + 2min52s + 3min39s + 6min6s + 6min3s + 7min32s, which is 27min33s.
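As a quick sanity check of the sum above (an illustrative snippet, not part of the PR), the six quoted build-job durations can be added programmatically:

```python
# Sum the per-image build durations quoted from the linked CI run
# and print the total in the same min/s format used above.

def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a (minutes, seconds) pair to total seconds."""
    return minutes * 60 + seconds

# The six build-job durations: 1min21s, 2min52s, 3min39s, 6min6s, 6min3s, 7min32s
build_times = [(1, 21), (2, 52), (3, 39), (6, 6), (6, 3), (7, 32)]

total = sum(to_seconds(m, s) for m, s in build_times)
print(f"{total // 60}min{total % 60}s")  # -> 27min33s
```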

That time comes from the x86_64 runners: their number is unlimited (for our purposes, practically unlimited), and they are 4CPU machines.
Most of the time difference comes from running tests (the tests do run in parallel).

So, it makes sense to speed up our aarch64 runners.

The coolest thing, though, would be to implement autoscaling: https://medium.com/google-cloud/autoscaling-runners-on-google-cloud-f285ed8d85de
That way we could easily run 4CPU machines with zero idle VMs, and it would cost many times less.
I would welcome such a change, but I still hope GitHub will provide aarch64 Linux runners one day.

@mathbunnyru mathbunnyru changed the title Try fast 2CPU and slow aarch64 1CPU runners Try 1 fast 3CPU and 3 slow 1CPU aarch64 runners Jan 13, 2024
@mathbunnyru mathbunnyru changed the title Try 1 fast 3CPU and 3 slow 1CPU aarch64 runners Try 1 fast 2CPU and 3 slow 1CPU aarch64 runners Jan 13, 2024
@mathbunnyru mathbunnyru marked this pull request as ready for review January 13, 2024 20:17
@mathbunnyru (Member, Author) commented Jan 14, 2024

The aarch64 build phase took approximately 40 minutes, which is nice and really close to the optimal build time.

Total CI time in the 3x1CPU + 1x2CPU configuration was 43m 42s, which is also significantly better than the 1h 1m 38s of the 4x1CPU configuration.

(All comparisons were made with the v4 download/upload artifact actions already in place, of course.)
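To quantify the improvement between the two configurations (an illustrative snippet, not part of the PR, using only the two totals quoted above):

```python
# Compare total CI time between the old 4x1CPU configuration
# and the new 3x1CPU + 1x2CPU configuration.

def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an (hours, minutes, seconds) triple to total seconds."""
    return hours * 3600 + minutes * 60 + seconds

new = to_seconds(0, 43, 42)  # 3x1CPU + 1x2CPU: 43m 42s
old = to_seconds(1, 1, 38)   # 4x1CPU: 1h 1m 38s

saved = old - new
print(f"saved {saved // 60}m {saved % 60}s (~{100 * saved / old:.0f}% faster)")
# -> saved 17m 56s (~29% faster)
```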

@mathbunnyru mathbunnyru merged commit eb04996 into jupyter:main Jan 14, 2024
64 checks passed