-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor integration tests #2821
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2821 +/- ##
==========================================
+ Coverage 76.05% 76.07% +0.02%
==========================================
Files 213 217 +4
Lines 16611 16626 +15
==========================================
+ Hits 12634 12649 +15
+ Misses 3202 3200 -2
- Partials 775 777 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍🏻 🎉 🙇🏻
3ebefce
to
cd90a96
Compare
cd90a96
to
8713cf5
Compare
|
||
// Writer syncs writes with a mutex and, if the output is a TTY, clears before | ||
// newlines. | ||
type Writer struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same consoleWriter
, but since it needed to be exported so that it could be used in cmd/tests/
and elsewhere, it's the only thing in this new ui/console
package. Otherwise the functionality is identical, and we can do larger changes and bring console.Console
back later.
ff11aeb
to
698d53f
Compare
698d53f
to
2d00049
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I tried to check that the previously found issues are now fixed but I would prefer that @na-- goes through it as well.
My only actionable feedback is - I would've amended the original commit to add ExecuteWithGlobalState
. IMO this would've been cleaner and I already think it is required to go through the PR commit by commit, so this makes it easier to not come back to the same change later.
There's already a lib/doc.go with much more documentation.
2d00049
to
91a287c
Compare
Sorry about the force-push again, but I squashed the latest two commits in previous ones, as suggested by Mihail, and resolved a merge conflict that GitHub couldn't, but somehow Git did automatically 🤷♂️ |
91a287c
to
03f8cee
Compare
To avoid having the "testing" package part of the production binary. Resolves #2821 (comment)
To avoid increasing the size of lib/. See #2821 (comment)
cmd/tests/doc.go
Outdated
// but also very expensive to run. They're also brittle, as they depend on large | ||
// parts of the codebase. When in doubt, prefer adding lower-level unit tests | ||
// first, and an integration test only if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually disagree with this on multiple levels 😅 I've found that these tests are the least brittle of them all, since the APIs they depend on are very unlikely to change, and since timing issues are not a problem.
You essentially give inputs to the k6 process and check its outputs (metrics, logs, etc.). Lower-level tests regularly have to be refactored because a Go API changed, whereas here the API consists of CLI flags, environment variables, JS scripts - all of the things that we try very hard not to break and keep backwards-compatible. Trying to hook and time things in different goroutines has also proven quite problematic, while waiting for some text to appear on the stdout is quite reliable.
So, my recommendation would be to always prefer to add an integration test to a lower-level test, unless we are testing some very clear-cut and independent low-level API. All of our currently flaky tests should probably be rewritten as integration tests.
They might be a bit more expensive in terms of time and CPU, considering the overheads, but I'd argue it's worth it. We can always add more CPU and increase the parallelism we run the tests with, if we need to,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The APIs certainly won't change much, but since they test a large portions of the system, integration tests are brittle by definition. Any issue in the underlying packages might trigger a failure in all integration tests that depend on it. The failure matrix is much larger with integration tests than for unit tests.
In addition to this, integration tests are much more expensive to run. I really don't want us to rely solely on them for any new features, since I would prefer to run them only occasionally locally (say, before git push
). Whereas I can run a batch of unit tests much more quickly, and rely on them as a more frequent check while developing.
Related to this, prefering integration tests to unit tests goes against the traditional test pyramid. I actually don't think we need to prefer one over the other at all. All levels of testing complement each other, but it's clear that we should have mostly unit tests, many integration tests, and some E2E tests.
The fact some of our unit tests are unreliable and flaky is not a reason to avoid them, but a reason to make them more reliable. Having to refactor them more frequently is the price of admission, and not a reason to prefer higher-level tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The APIs certainly won't change much, but since they test a large portions of the system, integration tests are brittle by definition. Any issue in the underlying packages might trigger a failure in all integration tests that depend on it. The failure matrix is much larger with integration tests than for unit tests.
No, it won't, these integration tests don't actually use most of the underlying packages directly 😅 If you can compile the k6
binary, you should pretty much be able to run the integration tests...
Again, these tests basically rely only on the user-facing APIs, which we try very, very hard to keep somewhat stable. They are the very opposite of brittle.
I actually don't think we need to prefer one over the other at all. All levels of testing complement each other, but it's clear that we should have mostly unit tests, many integration tests, and some E2E tests.
The fact some of our unit tests are unreliable and flaky is not a reason to avoid them, but a reason to make them more reliable. Having to refactor them more frequently is the price of admission, and not a reason to prefer higher-level tests.
Mostly agree with this, though with a clarification. Low-level unit tests that verify a single API are great. High-level integration tests like these that verify the whole application (basically from the user's perspective) are also great. The more you go towards the middle, the worse things get.
The most horrible and flaky tests we have are these mid-level tests that first have to do a whole lot of setup and initialization of a whole bunch of components and mocks. And then they have to inject and verify very complicated things. Basically, the old Engine
and ExecutionScheduler
tests, a bunch of the tests around the executors and js/
. Most of them would be better off as integration tests, since they try to set up the whole world before the test can be executed, and they do it by somewhat replicating what happens during a normal k6 run, but not quite exactly the same. So, any API change breaks them, and they often are not very realistic or good tests, i.e. they are the brittle ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it won't, these integration tests don't actually use most of the underlying packages directly 😅 If you can compile the k6 binary, you should pretty much be able to run the integration tests...
Again, these tests basically rely only on the user-facing APIs, which we try very, very hard to keep somewhat stable. They are the very opposite of brittle.
I think we have different definitions of "brittle" 😄
Having stable APIs doesn't matter if, e.g. cmdRun.run()
returns an error, which can happen for any reason in any of the underlying packages. The k6
binary will build just fine, but the error will happen at runtime. You can try it yourself, just return a fake error from cmdRun.run()
and see all integration tests fail. This is a contrived example, but a failure could occur deep within some dependent module, and trigger similar broad failures.
This is the opposite of a unit test where the only reason it will fail (or should fail, assuming it's not flaky) is if the unit under test fails. It's not dependent on any other part of the system, and so, by definition, unit test are much more stable than higher level tests.
The reason you want to have both is because the integration test failure often doesn't tell you anything about the root cause. Preferably you'd want to have at least one unit test failure that would pinpoint exactly the root cause, and then you wouldn't really spend time debugging integration test failures at all. This is why it's important to have both. Lower level tests to debug and fix things quickly. And higher level tests to ensure components integrate well, and E2E tests that it all works from the user's perspective.
BTW, as you mentioned, what we're currently calling integration tests are closer to E2E tests, just without actually running the k6 binary. So it's even a worse idea to suggest we should have a lot of them, or worse, prefer them over lower level tests.
I see your point about middle-layer tests (i.e. integration tests) being troublesome, but just as frequent refactoring is a pain-point for unit tests, mocking and test setup are a chore for integration tests. So I'm against trying to rewrite these as even higher-level almost-E2E tests to avoid the mocking, but try to make mockable APIs to begin with, and test helpers as we do now to assist with the setup boilerplate.
I think we fundamentally disagree on testing strategies, but a) we won't make much progress discussing it here, and b) I'd like to involve more people in this discussion. So let's schedule it for tomorrow's sync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we apparently have different preferences here, and 👍 for discussing it more with the rest of the team.
Until then, can you remove the opinion parts of this we disagree on, so only the more factual bits on which we have consensus remain? This will allow us to merge the PR more quickly, which is important, since it's blocking a few things.
// but also very expensive to run. They're also brittle, as they depend on large | |
// parts of the codebase. When in doubt, prefer adding lower-level unit tests | |
// first, and an integration test only if necessary. | |
// but also very expensive to run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything except #2821 (comment) LGTM now ❤️
This allows us to use it in tests outside of the cmd package. See #2459
This also removes the duplication of the xk6-timers module, and uses it directly now that it's an experimental built-in module. Although, shouldn't these be xk6-timers tests rather than event loop ones? We have plenty of tests that indirectly test the event loop, and we could always add much simpler ones that don't involve the timers module. See #2459
0df87b8
to
b03553b
Compare
OK, one last force-push to squash as much as possible. Hopefully didn't screw anything up, so give it one last glance. 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This does the changes discussed in #2459, see the commits for details. I'm adding everyone as a reviewer because it's a large refactor (splitting it into more PRs wouldn't help much), but @na-- should give the final approval.
Closes #2459