Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default value in Queue Mode and use false for proper CI providers #198

Merged
merged 1 commit into from
Jun 7, 2023

Conversation

ArturT
Copy link
Member

@ArturT ArturT commented May 24, 2023

Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as the default value and ensure it's true for:

  • GitHub Actions
  • Buildkite
  • Travis
  • CodeShip

For other CI providers use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false:

  • App Veyor
  • Cirrurs CI
  • CodeFresh
  • GitLab CI
  • Heroku CI
  • Semaphore CI 1.0
  • Semaphore CI 2.0

More info:

GitHub Actions, CodeShip use the same CI build ID for retried CI node. In such a case, Queue Mode would assume the queue is already consumed and there are no tests to run for retried CI node. You would expect to run the same set of tests as it was assigned to the CI node during the very first run. To achieve that, we use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true then a retried CI node will run the same set of tests as on the very first run for a given (commit SHA, branch, and number of nodes).

Some CI providers like Buildkite (or Github Actions) allow you to retry an individual CI node. In such a case, you would expect to run the same set of tests as it was assigned to the CI providers. To do so, you need to set KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true (we will do it automatically from now on).

Other CI providers use a new CI build ID when you retry the CI node. So it's safe to use for them KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false, in such case, a new dynamic tests split happens in Queue Mode. This also assumes that all nodes are retried.

If you would rerun only a single CI node, then it's better to set KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true so that the retried CI node won't run the whole test suite on the retried CI node. Instead, it would run the same set of tests as during the very first run of tests for a given set of (commit hash, branch, number of nodes).

@ArturT ArturT changed the title Use true as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT Use true as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode May 24, 2023
@ArturT ArturT changed the title Use true as default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode Use true as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode May 24, 2023
@ArturT ArturT changed the title Use true as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode Use true as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode for GitHub Actions, Buildkite, Travis, CodeShip May 24, 2023
@3v0k4
Copy link
Contributor

3v0k4 commented May 26, 2023

@ArturT what if we make it true by default and set it to false where needed?

Since we should be able to configure all CIs, we can ensure this is always properly set. And, if the CI is "not supported," the behaviour is the most predictable (single retry gets the same tests). And the only downside is that full retries are slower (cached split instead of new split).

@ArturT
Copy link
Member Author

ArturT commented May 27, 2023

@3v0k4 I was thinking about it.

Maybe let's do something radical and use true as a default value (always) and leave false as an option to set if someone really wants to. This is what's done for regular mode with KNAPSACK_PRO_FIXED_TEST_SUITE_SPLIT=true default.
https://docs.knapsackpro.com/ruby/reference/#knapsack_pro_fixed_test_suite_split-regular-mode.

The biggest downside that I see by using KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default always is:

when someone retries the whole CI build (all parallel nodes are retried and a new CI build ID is used) AND the same (commit sha, branch, nodes) are used. Then tests are split based on the cache split (not well balanced). This should be a rare case, but in general, this would be when someone retries the whole build. But in such case, we can give users the answer that this is on purpose to run tests in the same order, but if they want, they can disable that with KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false. I can image some users might want to run tests in a different order on purpose so that flaky tests are passed when run in a different order.
When you run tests locally with KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true and you keep testing something, for example during onboarding setup in development (if we recommend development setup first for new onboarding), you most likely work on the same commit hash so once you run tests once with a dynamic split (Queue split) then further splits will be cached even if you update CI build ID to a new one in development.

what if we make it true by default and set it to false where needed?

Maybe it should be set to false in development by default?
How can we know it is development environment? Maybe we should detect that non of the known CI provider env vars have been detected.

But this still might be a bit confusing that gem behaves differently in certain environments.

Maybe for the sake of simplicity, let's use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default always. If this will be problematic for someone, we can recommend using KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false. In case of Regular Mode no one ever ask about this so maybe it wouldn't be a big issue for Queue Mode as well.

FYI @shadre

@3v0k4
Copy link
Contributor

3v0k4 commented May 30, 2023

@ArturT

When you run tests locally with KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true and you keep testing something, for example during onboarding setup in development (if we recommend development setup first for new onboarding), you most likely work on the same commit hash so once you run tests once with a dynamic split (Queue split) then further splits will be cached even if you update CI build ID to a new one in development.

I haven't thought about the development case. At the same time, what's a good reason to want dynamic splits in development at the moment?

I'm pretty sure we will need to revisit some things anyway for the new onboarding, so I wouldn't mind this one too much now.

One problem I see with defaulting to a fixed split is if people onboard by retrying builds:

  • set up Knapsack
  • push
  • first (suboptimal) build completes
  • second build triggered with a full retry (same suboptimal build) instead of pushing a new commit

@ArturT
Copy link
Member Author

ArturT commented May 30, 2023

I haven't thought about the development case. At the same time, what's a good reason to want dynamic splits in development at the moment?

You mentioned it:

second build triggered with a full retry (same suboptimal build) instead of pushing a new commit

@3v0k4 I worry that people might think it's not optimal split when they do retry locally (the retry would be cache split). I suspect when someone does retries to set up something locally or debug issues with configuring their project locally, then they might experience cache split often. If so then there won't be mulitple batches of tests fetched from Queue API and it might be harder to reproduce a problem (a failing specs) that occurs only when you run many batches.

@3v0k4
Copy link
Contributor

3v0k4 commented May 30, 2023

@ArturT

I worry that people might think it's not optimal split when they do retry locally (the retry would be cache split)

I cannot imagine people judging KS' perf in development. They would have to manually simulate multiple nodes and take the timings.

it might be harder to reproduce a problem (a failing specs) that occurs only when you run many batches.

Though, those batches are not the same batches because it's a new split. This could be a problem too in debugging.

second build triggered with a full retry (same suboptimal build) instead of pushing a new commit (Riccardo)

I'm still thinking about this one. If people onboard by retrying the first build on CI, they could be mislead.


Maybe we could also add some logging later on? Like printing out in the terminal the (auto)configuration of Knapsack:

Detected Github Actions, using KNAPSACK_PRO_FIXED=true (you can override this one)
Detected RSpec, using KNAPSACK_PRO_SPLIT=true (you can override this one)

@ArturT
Copy link
Member Author

ArturT commented May 30, 2023

Maybe we could also add some logging later on? Like printing out in the terminal the (auto)configuration of Knapsack:

Yes. I was also thinking about it.

second build triggered with a full retry (same suboptimal build) instead of pushing a new commit (Riccardo)

I'm still thinking about this one. If people onboard by retrying the first build on CI, they could be mislead.

Good point about being mislead unless they use different number of nodes on CI than in development.

If they use Github Actions they still might be mislead because they have to use fixed_queue_split=true

@ArturT ArturT changed the title Use true as a default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in Queue Mode for GitHub Actions, Buildkite, Travis, CodeShip Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default value in Queue Mode and use false for GitHub Actions, Buildkite, Travis, CodeShip Jun 5, 2023
@3v0k4 3v0k4 self-assigned this Jun 6, 2023
@3v0k4 3v0k4 force-pushed the fixed_queue_split branch from b2ab364 to 9ea5c15 Compare June 6, 2023 15:50
@3v0k4 3v0k4 changed the base branch from master to detect-ci June 6, 2023 15:50
@3v0k4 3v0k4 marked this pull request as draft June 6, 2023 15:59
@ArturT ArturT changed the title Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default value in Queue Mode and use false for GitHub Actions, Buildkite, Travis, CodeShip Use KNAPSACK_PRO_FIXED_QUEUE_SPLIT=true as default value in Queue Mode and use false for proper CI providers Jun 6, 2023
Base automatically changed from detect-ci to master June 7, 2023 09:58
@3v0k4 3v0k4 force-pushed the fixed_queue_split branch from 3092bb2 to a4c8ddb Compare June 7, 2023 10:01
@@ -632,20 +632,6 @@
end
end

describe '.fixed_test_suite_split' do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -667,38 +653,94 @@
end
end

describe '.fixed_queue_split' do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@3v0k4 3v0k4 marked this pull request as ready for review June 7, 2023 10:08
@3v0k4
Copy link
Contributor

3v0k4 commented Jun 7, 2023

@ArturT ready for CR

ENV.fetch('KNAPSACK_PRO_FIXED_QUEUE_SPLIT', false)
@fixed_queue_split ||= begin
env = ENV['KNAPSACK_PRO_FIXED_QUEUE_SPLIT']
computed = env || ci_env_for(:fixed_queue_split).to_s
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ENV value is always string so it's fine to use env || ci_env_for(:fixed_queue_split).to_s even when KNAPSACK_PRO_FIXED_QUEUE_SPLIT=false then the false string value is used instead of ci_env_for(:fixed_queue_split).to_s.

But if someone would stub KNAPSACK_PRO_FIXED_QUEUE_SPLIT to false as boolean in test environment then it would behave differently.

The more explicit approach would be:

ENV.fetch('KNAPSACK_PRO_FIXED_QUEUE_SPLIT', ci_env_for(:fixed_queue_split).to_s)

But I leave it up to you. The current code also works fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean if somebody in the Knapsack team stubbed it, right? Fixed it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@ArturT
Copy link
Member Author

ArturT commented Jun 7, 2023

@3v0k4 I approve the PR.

Do you think we should update the list of CI providers and the their default value for KNAPSACK_PRO_FIXED_QUEUE_SPLIT in the reference page? Or is it enough that users would see the info log message.
https://docs.knapsackpro.com/ruby/reference/#knapsack_pro_fixed_queue_split-queue-mode

@3v0k4
Copy link
Contributor

3v0k4 commented Jun 7, 2023

@ArturT We definitely need to update docs; I'll do that when releasing. It's tracked already in the ticket thanks to your research 🙏

@3v0k4 3v0k4 force-pushed the fixed_queue_split branch from 42c5a1c to e6fbd52 Compare June 7, 2023 12:01
@3v0k4 3v0k4 merged commit 40076ee into master Jun 7, 2023
@3v0k4 3v0k4 deleted the fixed_queue_split branch June 7, 2023 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants