-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
as default value in Queue Mode and use false
for proper CI providers
#198
Conversation
true
as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
true
as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Mode
true
as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Modetrue
as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Mode
true
as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Modetrue
as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Mode for GitHub Actions, Buildkite, Travis, CodeShip
@ArturT what if we make it Since we should be able to configure all CIs, we can ensure this is always properly set. And, if the CI is "not supported," the behaviour is the most predictable (single retry gets the same tests). And the only downside is that full retries are slower (cached split instead of new split). |
@3v0k4 I was thinking about it. Maybe let's do something radical and use The biggest downside that I see by using when someone retries the whole CI build (all parallel nodes are retried and a new CI build ID is used) AND the same (commit sha, branch, nodes) are used. Then tests are split based on the cache split (not well balanced). This should be a rare case, but in general, this would be when someone retries the whole build. But in such case, we can give users the answer that this is on purpose to run tests in the same order, but if they want, they can disable that with KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false. I can image some users might want to run tests in a different order on purpose so that flaky tests are passed when run in a different order.
Maybe it should be set to false in development by default? But this still might be a bit confusing that gem behaves differently in certain environments. Maybe for the sake of simplicity, let's use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default always. If this will be problematic for someone, we can recommend using KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false. In case of Regular Mode no one ever ask about this so maybe it wouldn't be a big issue for Queue Mode as well. FYI @shadre |
I haven't thought about the development case. At the same time, what's a good reason to want dynamic splits in development at the moment? I'm pretty sure we will need to revisit some things anyway for the new onboarding, so I wouldn't mind this one too much now. One problem I see with defaulting to a fixed split is if people onboard by retrying builds:
|
You mentioned it:
@3v0k4 I worry that people might think it's not optimal split when they do retry locally (the retry would be cache split). I suspect when someone does retries to set up something locally or debug issues with configuring their project locally, then they might experience cache split often. If so then there won't be mulitple batches of tests fetched from Queue API and it might be harder to reproduce a problem (a failing specs) that occurs only when you run many batches. |
I cannot imagine people judging KS' perf in development. They would have to manually simulate multiple nodes and take the timings.
Though, those batches are not the same batches because it's a new split. This could be a problem too in debugging.
I'm still thinking about this one. If people onboard by retrying the first build on CI, they could be mislead. Maybe we could also add some logging later on? Like printing out in the terminal the (auto)configuration of Knapsack:
|
Yes. I was also thinking about it.
Good point about being mislead unless they use different number of nodes on CI than in development. If they use Github Actions they still might be mislead because they have to use fixed_queue_split=true |
true
as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT
in Queue Mode for GitHub Actions, Buildkite, Travis, CodeShipKNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
as default value in Queue Mode and use false
for GitHub Actions, Buildkite, Travis, CodeShip
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
as default value in Queue Mode and use false
for GitHub Actions, Buildkite, Travis, CodeShipKNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
as default value in Queue Mode and use false
for proper CI providers
@@ -632,20 +632,6 @@ | |||
end | |||
end | |||
|
|||
describe '.fixed_test_suite_split' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -667,38 +653,94 @@ | |||
end | |||
end | |||
|
|||
describe '.fixed_queue_split' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ArturT ready for CR |
lib/knapsack_pro/config/env.rb
Outdated
ENV.fetch('KNAPSACK_PRO_FIXED_QUEUE_SPLIT', false) | ||
@fixed_queue_split ||= begin | ||
env = ENV['KNAPSACK_PRO_FIXED_QUEUE_SPLIT'] | ||
computed = env || ci_env_for(:fixed_queue_split).to_s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ENV value is always string so it's fine to use env || ci_env_for(:fixed_queue_split).to_s
even when KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false
then the false
string value is used instead of ci_env_for(:fixed_queue_split).to_s
.
But if someone would stub KNAPSACK_PRO_FIXED_QUEUE_SPLIT to false as boolean in test environment then it would behave differently.
The more explicit approach would be:
ENV.fetch('KNAPSACK_PRO_FIXED_QUEUE_SPLIT', ci_env_for(:fixed_queue_split).to_s)
But I leave it up to you. The current code also works fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean if somebody in the Knapsack team stubbed it, right? Fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
@3v0k4 I approve the PR. Do you think we should update the list of CI providers and the their default value for |
@ArturT We definitely need to update docs; I'll do that when releasing. It's tracked already in the ticket thanks to your research 🙏 |
Use
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
as the default value and ensure it'strue
for:For other CI providers use
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false
:More info:
GitHub Actions, CodeShip use the same CI build ID for retried CI node. In such a case, Queue Mode would assume the queue is already consumed and there are no tests to run for retried CI node. You would expect to run the same set of tests as it was assigned to the CI node during the very first run. To achieve that, we use
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
then a retried CI node will run the same set of tests as on the very first run for a given (commit SHA, branch, and number of nodes).Some CI providers like Buildkite (or Github Actions) allow you to retry an individual CI node. In such a case, you would expect to run the same set of tests as it was assigned to the CI providers. To do so, you need to set
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
(we will do it automatically from now on).Other CI providers use a new CI build ID when you retry the CI node. So it's safe to use for them
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false
, in such case, a new dynamic tests split happens in Queue Mode. This also assumes that all nodes are retried.If you would rerun only a single CI node, then it's better to set
KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true
so that the retried CI node won't run the whole test suite on the retried CI node. Instead, it would run the same set of tests as during the very first run of tests for a given set of (commit hash, branch, number of nodes).