ci: Limit macOS testing to one version of python #7507
base: main
This limits macOS testing to one version of Python, since macOS unit tests take a long time to run and use the most expensive runner type that we currently utilize.

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7507
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 Failures
As of commit 45806db: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
$$ savings, yeah!
LGTM! Why not 3.9 though, as we are using 3.9 for all macOS jobs in CI? 3.8, as the minimum version, also kind of makes sense though.
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
But we test the minimum version on other platforms, so 3.9 is probably a good idea.
Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
One thing I don't understand yet is how this saves money. My understanding is that the macos-12 runner is not self-hosted, but rather the "regular" one from GitHub. This is how this comment came into being:
vision/.github/workflows/test-macos.yml
Lines 29 to 31 in 5b07d6c
# We need an increased timeout here, since the macos-12 runner is the free one from GH
# and needs roughly 2 hours to just run the test suite
timeout: 240
Quoting from the billing documentation:
GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners.
Since we are a public repository, we should be able to use them for free?
One possibility that @malfet and I discussed offline is that we are paying extra to increase the concurrency limits and thus decrease the queue time. Can we make sure that this is actually saving money before we merge?
- python-version: "3.8"
runner: macos-m1-12
runner: "macos-12"
# Minimum version available for Apple Silicon is 3.9, so just use that
We have been testing against 3.8 before, so this can't be the whole truth? Could you clarify this?
include:
# Test against the most popular version of Python (at the time of commit 40% of torch downloads use 3.8)
Should we do the same for Windows as well? Meaning, only Linux will have the full 3.7 to 3.11 coverage?
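For readers unfamiliar with the mechanism under review, here is a minimal sketch of how a GitHub Actions matrix can be restricted to one Python version per macOS runner using `include`. It is not the actual `test-macos.yml` from this PR: the workflow name, job name, steps, and test command are hypothetical, and the pairing of Python versions with runner labels is an assumption read off the diff comments quoted above. Note that plain GitHub Actions spells the timeout `timeout-minutes`, whereas the repository's file quoted above uses `timeout`.

```yaml
# Hypothetical sketch, NOT the workflow file changed in this PR.
name: macos-unittests-sketch  # assumed name

on:
  pull_request:

jobs:
  unittests:  # assumed job name
    strategy:
      fail-fast: false
      matrix:
        # With only `include`, each entry becomes exactly one job,
        # so there is no cross product over Python versions.
        include:
          # Intel macOS: a single Python version (assumed to be 3.8 here).
          - python-version: "3.8"
            runner: macos-12
          # Apple Silicon: the diff comment says 3.9 is the minimum available.
          - python-version: "3.9"
            runner: macos-m1-12
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 240
    steps:
      - uses: actions/checkout@v3
      # Placeholder step; a real job would set up Python and run the test suite.
      - run: echo "Would test with Python ${{ matrix.python-version }}"
```

Listing entries under `include` rather than separate `python-version` and `runner` axes keeps the job count equal to the number of entries, which is the effect this PR is after.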
(I'm marking this as "request changes" to avoid merging prematurely before concerns are addressed)
Thanks for the PR @seemethere.
I have the same concerns as those raised in #5479. In particular, I am worried that removing those jobs will reduce our capacity to catch bugs or errors early. As mentioned in #5479 (review), it is fairly common for some Python version CI jobs to pass while others fail, often due to different dependency versions (PIL, etc.).
Before we stop running those on PRs, do we have a safe and reliable mechanism to be alerted when those start failing on main as well?
No, but we're going to need to do this anyway. Intel macOS is a platform that we are de-prioritizing, and given our need for efficiency we're going to have to cut Intel macOS testing across our entire organization. If these jobs could run in 5 minutes this wouldn't be an issue, but since they take over an hour to run they need to get cut.
Does that mean we won't be releasing binaries for intel macOS? If we're not releasing binaries then that's fine, we can remove the testing jobs. But if we're still going to provide binaries, surely we want to keep some form of testing for those platforms.
Do we know why these jobs take 1h? The Linux tests run in < 30min and they run the same tests. If it's just a matter of speeding up the CI, perhaps the macOS runners are simply underspecced?
We will still release binaries for the next release at a minimum but we're considering dropping after that release.
They are underspecced at 3 cores, so it makes sense as to why they take 1.5 hours to actually run. To give you an idea of how much each of these runs costs (reference: Pricing Documentation):
One other thing to note as well is that core
@seemethere The numbers you quoted come from the same page that I quoted above in #7507 (review)
Could you explain why we need to pay for them at all?
Perusing the pricing documentation some more, here is how I understand this:
@pmeier and I talked about this over VC. Here's an overview of some of the questions that came up:
A: Free plans typically only cover a maximum number of minutes, 2000 in the case of free organizations. Since we are an enterprise organization and we exceed 2000 minutes, we pay for all usage of GitHub Actions, including what would typically be "free" GitHub Actions runners. With that in mind, macOS x86 is our second most expensive platform to test on, so we are approaching this from all angles to bring the cost for this specific platform down, which includes limiting unittest runs here.
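The cost argument in this thread can be written as simple arithmetic. This is only a sketch: $n$ (the number of Python versions tested on macOS per workflow run) and $r$ (the per-minute price of the macOS runner) are symbols introduced here for illustration and are deliberately left unfilled, since neither figure survives in the text above. The only number taken from the discussion is the run time of roughly 1.5 hours, i.e. $t \approx 90$ minutes per job.

$$
\begin{aligned}
C_{\text{before}} &= n \cdot t \cdot r \\
C_{\text{after}} &= 1 \cdot t \cdot r \\
C_{\text{before}} - C_{\text{after}} &= (n - 1) \cdot t \cdot r \approx 90\,(n - 1)\, r
\end{aligned}
$$

Under these assumptions, every Python version dropped from the macOS matrix saves about 90 billed runner-minutes per workflow run, before any per-job rounding of minutes.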
This limits macOS testing to one version of Python, since macOS unit tests take a long time to run and use the most expensive runner type that we currently utilize.
Long time follow up to: