
ci: Limit macOS testing to one version of python #7507

Open
wants to merge 3 commits into main

Conversation

@seemethere (Member):

This limits macOS testing to one version of Python, since macOS unittests take a long time to run and are the most expensive runner type that we currently utilize.

Long-time follow-up to:

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
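
For context, here is a minimal sketch of what limiting the macOS matrix to one Python version per runner could look like, assuming a standard GitHub Actions include-style matrix. The workflow name, triggers, action versions, and steps are illustrative and not taken from the actual diff; only the runner labels (macos-12, macos-m1-12), the Python versions, and the two comment lines echo the review context quoted later in this thread:

name: Unit tests (macOS)  # illustrative workflow name, not the real one
on: [pull_request, push]

jobs:
  unittests-macos:
    strategy:
      matrix:
        include:
          # Intel macOS: run the unit tests on a single Python version only
          - python-version: "3.8"
            runner: "macos-12"
          # Minimum version available for Apple Silicon is 3.9, so just use that
          - python-version: "3.9"
            runner: macos-m1-12
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      # ... install torchvision and run the unit test suite here ...

The substance of the change is simply that the include list pins one Python version per macOS runner instead of expanding over the full set of versions tested on Linux.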
@seemethere requested a review from pmeier on April 7, 2023, 21:09
@pytorch-bot (bot) commented Apr 7, 2023:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7507

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 Failures

As of commit 45806db:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@weiwangmeta left a comment:

$$ savings, yeah!

@huydhn (Contributor) left a comment:

LGTM! Why not 3.9 though, since we are using 3.9 for all macOS jobs in CI? 3.8, as the minimum version, also kind of makes sense though.

.github/workflows/test-macos.yml (inline review thread, outdated and resolved)
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
@malfet (Contributor) commented Apr 7, 2023:

> LGTM! Why not 3.9 though, since we are using 3.9 for all macOS jobs in CI? 3.8, as the minimum version, also kind of makes sense though.

But we already test the minimum version on other platforms, so 3.9 is probably a good idea.

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
@pmeier (Collaborator) left a comment:

One thing I don't understand yet is how this saves money. My understanding is that the macos-12 runner is not self-hosted, but rather the "regular" one from GitHub. This is how this comment came into being:

# We need an increased timeout here, since the macos-12 runner is the free one from GH
# and needs roughly 2 hours to just run the test suite
timeout: 240

Quoting from the billing documentation:

> GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners.

Since we are a public repository, shouldn't we be able to use them for free?

One possibility @malfet and I discussed offline is that we are paying extra to increase the concurrency limits and thus decrease the queue time. Can we make sure that this is actually saving money before we merge?

.github/workflows/test-macos.yml (inline review context):

    - python-version: "3.8"
      runner: macos-m1-12
      runner: "macos-12"
      # Minimum version available for Apple Silicon is 3.9, so just use that
@pmeier (Collaborator) commented Apr 11, 2023:

We have been testing against 3.8 before, so this can't be the whole truth? Could you clarify this?

.github/workflows/test-macos.yml (inline review context):

    include:
      # Test against the most popular version of Python (at the time of commit 40% of torch downloads use 3.8)
Comment from a Collaborator:

Should we do the same for Windows as well? Meaning, only Linux will have the full 3.7 to 3.11 coverage?

@NicolasHug (Member) left a comment:

(I'm marking this as "request changes" to avoid merging prematurely before concerns are addressed)

Thanks for the PR @seemethere.

I have the same concerns as those raised in #5479. In particular, I am worried that removing those jobs will reduce our capacity to catch bugs or errors early. As mentioned in #5479 (review), it is fairly common to have some Python version CI jobs pass while others fail, often due to different dependency versions (PIL, etc.).

Before we stop running those on PRs, do we have a safe and reliable mechanism to be alerted when those start failing on main as well?

@seemethere (Member, Author) commented:

> (I'm marking this as "request changes" to avoid merging prematurely before concerns are addressed)
>
> Thanks for the PR @seemethere.
>
> I have the same concerns as those raised in #5479. In particular, I am worried that removing those jobs will reduce our capacity to catch bugs or errors early. As mentioned in #5479 (review), it is fairly common to have some Python version CI jobs pass while others fail, often due to different dependency versions (PIL, etc.).
>
> Before we stop running those on PRs, do we have a safe and reliable mechanism to be alerted when those start failing on main as well?

No, but we're going to need to do this anyway. Intel macOS is a platform that we are de-prioritizing, and given the need for efficiency we're going to have to cut Intel macOS testing across our entire organization. If these jobs could run in 5 minutes this wouldn't be an issue, but since they take over an hour to run, they need to get cut.

@NicolasHug (Member) commented:

> Intel macOS is a platform that we are de-prioritizing

Does that mean we won't be releasing binaries for Intel macOS? If we're not releasing binaries, then that's fine and we can remove the testing jobs. But if we're still going to provide binaries, surely we want to keep some form of testing for those platforms.

> If these jobs could run in 5 minutes this wouldn't be an issue, but since they take over an hour to run, they need to get cut.

Do we know why these jobs take an hour? The Linux tests run in under 30 minutes and they run the same tests. If it's just a matter of speeding up the CI, perhaps the macOS runners are simply underspecced?

@seemethere (Member, Author) commented:

> Intel macOS is a platform that we are de-prioritizing
>
> Does that mean we won't be releasing binaries for Intel macOS? If we're not releasing binaries, then that's fine and we can remove the testing jobs. But if we're still going to provide binaries, surely we want to keep some form of testing for those platforms.

We will still release binaries for the next release at a minimum, but we're considering dropping them after that.

> If these jobs could run in 5 minutes this wouldn't be an issue, but since they take over an hour to run, they need to get cut.
>
> Do we know why these jobs take an hour? The Linux tests run in under 30 minutes and they run the same tests. If it's just a matter of speeding up the CI, perhaps the macOS runners are simply underspecced?

They are underspecced at 3 cores, so it makes sense that they take 1.5 hours to run.

To give you an idea of how much each of these runs costs (reference: Pricing Documentation):

1.5 hours (90 minutes) * $0.08 / minute (rate for macos-12) * 4 versions of python = $28.80 / run

One other thing to note is that core pytorch/pytorch doesn't even test more than one macOS version in either its pull request or trunk workflows, so this will bring us more in line with what we do on core PyTorch as well.

@pmeier (Collaborator) commented Apr 13, 2023:

@seemethere The numbers you quoted come from the same page that I quoted above in #7507 (review):

> GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners.

Could you explain why we need to pay for them at all?

@pmeier (Collaborator) commented Apr 13, 2023:

Perusing the pricing documentation some more, here is how I understand this:

  • The very first paragraph states:

    GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the product used with the account. Any usage beyond the included amounts is controlled by spending limits. [emphasis mine]

    My understanding is that as long as we are using "standard GitHub-hosted runners", they should be free and incur no charge at all.

  • In the "Per-minute rates" section they qualify the above statement a little further:

    The larger runners are not free for public repositories.

  • The page dedicated to larger runners links to "standard GitHub-hosted runners". That page lists macos-12 and thus it shouldn't be a "larger runner".

  • The same page also lists a macos-12-xl runner, which might be a "larger runner" after all, but that doesn't matter here, since we are not using it.

@seemethere (Member, Author) commented Apr 17, 2023:

@pmeier and I talked about this over VC. Here's an overview of some of the questions that came up:

Q: Why don't the macOS runners fall under the free plan for open source repositories?
(Screenshot of GitHub's plan pricing information, taken April 17, 2023)

A: Free plans typically only cover a maximum number of minutes, 2000 in the case of free organizations. Since we are an enterprise organization and we exceed 2000 minutes, we pay for all usage of GitHub Actions, including what would typically be "free" GitHub Actions runners. With that in mind, macOS x86 is our second most expensive platform to test on, so we are approaching this from all angles to ensure that the cost for this specific platform goes down, which includes limiting unittest runs here.
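
As an aside on the earlier question about catching failures on main once the per-PR matrix is reduced: a common GitHub Actions pattern, sketched here purely as an illustration and not something this PR adds, is to keep the reduced matrix on pull requests and run the fuller matrix only on pushes to main and/or on a schedule. The workflow name, cron expression, steps, and the exact version list (taken from the "4 versions of python" figure above) are assumptions:

name: Unit tests (macOS, full matrix)  # illustrative workflow name
on:
  push:
    branches: [main]
  schedule:
    - cron: "0 6 * * *"  # nightly run; the exact cron expression is illustrative

jobs:
  unittests-macos-full:
    strategy:
      fail-fast: false
      matrix:
        # version list is an assumption based on the "4 versions of python" figure above
        python-version: ["3.8", "3.9", "3.10", "3.11"]
    runs-on: macos-12
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      # ... install torchvision and run the unit test suite here ...

Combined with alerting on failed runs on main, a setup like this would keep full-matrix coverage without paying for it on every pull request.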
