[tests] add slow tests to GH workflow #304
Conversation
The documentation is not available anymore as the PR was closed or merged.
Added the corresponding GCP machine to the list of hosted runners; I think we should be all good once we add the
Looks good to me! I'll note here that the nightly reporter occasionally bugs out, but for now this does the job :) (I'll ping you when I figure out a better reporting setup :) )
Specifically, it may report a job/test as failed when it actually failed the day before. You can still view the run itself to see if everything passed.
    jobs:
      run_all_tests_single_gpu:
        runs-on: [self-hosted, docker-gpu, multi-gpu]
You want to use multi-GPU on a single GPU VM 😃 It's definitely a way to save money.
    run_all_tests_single_gpu:
      runs-on: [self-hosted, docker-gpu, multi-gpu]
      env:
        CUDA_VISIBLE_DEVICES: "0"
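For context, pinning `CUDA_VISIBLE_DEVICES: "0"` means this job only sees one device even though it runs on the multi-GPU machine. A minimal sanity-check step (hypothetical, not part of this PR) could look like:

```yaml
      steps:
        - name: Check visible GPUs
          # with CUDA_VISIBLE_DEVICES="0" set at the job level, torch should report exactly 1 device
          run: python -c "import torch; print(torch.cuda.device_count())"
```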
Not sure we really need the `CUDA_VISIBLE_DEVICES` settings. If not, we can remove them and just use a matrix to combine the 2 different jobs.
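For reference, a minimal sketch of the matrix variant suggested here (the job name, matrix keys, and test command are assumptions, not taken from this PR) could look like:

```yaml
jobs:
  run_all_slow_tests:
    runs-on: [self-hosted, docker-gpu, multi-gpu]
    strategy:
      matrix:
        # one entry per GPU setup; both run on the same multi-GPU runner
        include:
          - setup: single_gpu
            visible_devices: "0"
          - setup: multi_gpu
            visible_devices: "0,1"
    env:
      CUDA_VISIBLE_DEVICES: ${{ matrix.visible_devices }}
    steps:
      - uses: actions/checkout@v3
      - name: Run slow tests (${{ matrix.setup }})
        run: make test   # placeholder test command
```

In the GHA UI this shows up as one job name with two matrix entries rather than two separately named jobs, which is the reporting trade-off discussed below.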
The issue with a matrix (imo) is that it's less clear than running them this way when we view the reports in GHA. And matrices get messy quickly as well :)
But you are right, both options work. @younesbelkada it's definitely worth playing with both, I think, to see which one you eventually prefer :)
Though thinking on this more, it's probably worth doing on our nightlies. Good point @ydshieh. Most of my fear was about matrix upon matrix, but the nightlies don't have split tests like our main CI does.
Not a necessary change for me, it is merely a suggestion :-) So your call 🤗
Thanks for all the pointers! I will stick with the first solution proposed by @muellerzr and see if I run into any issues.
    run_all_tests_multi_gpu:
      runs-on: [self-hosted, docker-gpu, multi-gpu]
      env:
        CUDA_VISIBLE_DEVICES: "0,1"
Same comment as above.
👍 but with 2 nits for you to check.
Thank you so much @younesbelkada, @muellerzr and @ydshieh for setting up the slow tests and Slack notifications! 🚀🤗✨
Thank you all for your comments!
This PR adds the workflow to run slow tests
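Since the slow tests are meant to run as nightlies, the workflow would typically be triggered on a schedule; a minimal sketch (the cron time and the manual trigger are assumptions, not taken from this PR) could be:

```yaml
on:
  schedule:
    - cron: "0 2 * * *"   # hypothetical: run every day at 02:00 UTC
  workflow_dispatch:       # also allow manual runs for debugging
```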