refactor [Scrutinizer] #3266
Conversation
One outstanding concern/question I have is around the route parameters. Scrutinizer supports 4 different types of VCS (GitHub, Bitbucket, GitLab, and what they call "Plain Git"), and the latter two seemingly require different params for the API call we need to get the data (for all of our Scrutinizer badges). I do not believe the routes as currently set in this PR will work correctly with those latter two use cases, and I'm also skeptical the existing regex in the legacy service implementation works in all cases either (particularly if the user wanted a Scrutinizer badge for a GitLab repo on a specific …). Need to think on this one for a bit, but welcome any thoughts/suggestions anyone may have.
I think I've found an approach that'll work. Will likely have this ready for review by tomorrow (the 24th).
Alright, this is finally ready for review. I suspect we may want to make some improvements to this in the future, but I wanted to go ahead and get a refactored version out there in the spirit of hitting the 26th target in #2863.
There's quite a lot going on in these! Nice work.
transform({ json, branch }) {
  branch = this.transformBranch({ json, wantedBranch: branch })
Maybe call this branchInfo? It's a bit confusing to re-use the name since (old) branch is a string and (new) branch is a nested object.
Or destructure:
const { build_status: { status } } = this.transformBranch(...)
return { status }
You could also inline this in render() to reduce indirection.
Maybe call this branchInfo? It's a bit confusing to re-use the name since (old) branch is a string and (new) branch is a nested object. Or destructure:
Good point!
You could also inline this in render() to reduce indirection
Not sure I follow, as there's no render function here. Are you suggesting handling all this below by skipping transform and inlining it in the renderBuildStatusBadge call? Something else?
Oops, I meant inlining the transform() function in handle().
Updated, let me know if that looks any better
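For reference, a minimal sketch of what the inlined version might look like; the fetch helper, the route parameters, and the shape returned by transformBranch are assumptions drawn from this thread, not the exact code in the PR:

// sketch only: assumes a fetch() helper on the service and that
// transformBranch() returns an object with a nested build_status
async handle({ vcs, user, repo, branch }) {
  const json = await this.fetch({ vcs, user, repo })
  const {
    build_status: { status },
  } = this.transformBranch({ json, wantedBranch: branch })
  return renderBuildStatusBadge({ status })
}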
index: Joi.object({
  _embedded: Joi.object({
    project: Joi.object({
      metric_values: Joi.object({
Have to wonder if they ever intended for people to read these…
🤣
static render({ coverage }) {
  if (isNaN(coverage)) {
Is this an error? Should it be checked in transform()?
copy/paste from legacy implementation!
I actually don't think this is a possible scenario anymore given what's already in place in the schema and transform, so I'll update it accordingly.
Apologies, I was wrong (one of the downsides of working on this off and on for nearly a month). The current behavior of our badge is to display unknown when the specified repo/project doesn't have any coverage available.
I've updated this behavior a bit by checking for coverage in the transform function, and throwing an error with a more explicit coverage not found message (open to any alternative error messages) in that scenario, removing this particular check in the render function.
It is admittedly another change to the existing implementation, but I think an improvement.
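For illustration, a minimal sketch of that transform-level check, assuming Shields' NotFound error (import path varies by service layout) and a metric key name that may not match the real API field:

// sketch only: field names follow the schema snippet above and are assumptions
const { NotFound } = require('..')

transform({ json }) {
  const coverage =
    json._embedded.project.metric_values['scrutinizer.test_coverage']
  if (coverage === undefined) {
    throw new NotFound({ prettyMessage: 'coverage not found' })
  }
  // assuming the API reports coverage as a 0-1 fraction
  return { coverage: coverage * 100 }
}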
given({ coverage: 61 }).expect({
  message: '61%',
  color: 'brightgreen',
})
Is this test worth keeping? It seems substantially to test colorScale.
I added this as a net-new test since there is custom render logic/behavior. Yes, the service class is using the colorScale helper, but the intent is to validate that our Scrutinizer Coverage badges follow the same scheme/thresholds that Scrutinizer presents to users (which I assume we want to support given #735, unless folks feel differently).
IMO that's definitely something worth testing/validating. Without this test, we would not have an automated way of validating that expected behavior (for example, if someone switched to using the coveragePercentage formatter instead).
Are you suggesting removing this render test, testing the render logic differently, or something else?
I also just remembered that I had intended to add similar tests for the Quality service but forgot. Will wait til we have a decision here though before making any other changes.
I don't feel strongly about this, just thinking that in the same way we don't write defensive tests that our Joi validation code rejects JSON that doesn't match the schema, it would be good if we felt the same about colorScale.
In other words, maybe we could revise the interface to colorScale so it's more obviously correct that it's working as intended, just from reading the code.
Unless there's something more nuanced happening here that I'm missing!
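To make that suggestion concrete, here is one hypothetical shape such an interface could take; this helper does not exist in Shields and the names are illustrative only:

// hypothetical alternative to colorScale: each color sits next to the
// range it covers, so thresholds read directly against the expected behavior
function colorRanges(ranges) {
  return value => {
    const match = ranges.find(({ max }) => max === undefined || value < max)
    return match.color
  }
}

const scale = colorRanges([
  { max: 40, color: 'red' },
  { max: 61, color: 'yellow' },
  { color: 'brightgreen' },
])
scale(39) // 'red'
scale(60) // 'yellow'
scale(61) // 'brightgreen'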
Back to testing philosophies 😄
I don't feel strongly enough to risk the 26th target, but I do think this is a good use case to have a dialog on an aspect of testing. I think that dialog should be helpful for future discussions on testing guidelines for Shields so I'll try to clarify my thought process:
There's an expected behavior/output from the system:
- When the Scrutinizer code coverage is less than 40%, the badge should be red
- When coverage is between 40% and 60% (inclusive), the badge should be yellow
- When coverage is greater than 60%, the badge should be brightgreen
I believe (strongly) in the notion of having automated tests that validate expected behavior, and in having automated tests that validate behavior at multiple levels (unit/component/integration/functional/etc.).
Yes, the current service code happens to implement that expected behavior by leveraging the colorScale helper function, and yes, that helper function itself has unit tests that validate its functionality, and yes, we trust colorScale.
However, if we remove this test, then we have zero automated tests that are explicitly validating the expected Scrutinizer Coverage behavior/coloring articulated above (unless I am missing some tests somewhere).
Without this test, it is entirely possible that this expected behavior/functionality could be broken by any number of code changes to the render function, and there'd be no automated test to catch that.
A trivial example would be someone changing the scale steps used on the upper end (61 -> 60):
const scale = colorScale([40, 60], ['red', 'yellow', 'brightgreen'])
The colorScale function would still be doing exactly what it is supposed to do (the tests for the above render function are not intended to verify the validity of colorScale, nor to imply that I doubt colorScale), but the expected behavior of the Scrutinizer Coverage badge would be broken (60% coverage would be brightgreen instead of yellow).
I suppose an alternative way of viewing the above test is as checking that the Scrutinizer Coverage implementation is passing the correct arguments to colorScale, as opposed to the functionality of colorScale itself. Were it easy to do so, I likely would have structured the tests to simply validate that render was using the helper function correctly; I've just not seen an easy way to validate that in these sorts of tests when the helper is colorScale.
Admittedly, a "true/pure" unit test of the render function would validate it in complete isolation with all external touch points mocked out. However, as I mentioned, that just didn't seem worth the effort to me here.
Anyway, my two cents!
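To pin down the three tranches above, the existing case could be extended into a small data-driven table; this sketch assumes the sazerac-based test()/given() helpers used elsewhere in Shields, the '61%'-style message format shown earlier, and a guessed file/export name:

const { test, given } = require('sazerac')
// assumed path and export name
const ScrutinizerCoverage = require('./scrutinizer-coverage.service')

describe('ScrutinizerCoverage.render', function() {
  test(ScrutinizerCoverage.render, () => {
    given({ coverage: 39 }).expect({ message: '39%', color: 'red' })
    given({ coverage: 40 }).expect({ message: '40%', color: 'yellow' })
    given({ coverage: 60 }).expect({ message: '60%', color: 'yellow' })
    given({ coverage: 61 }).expect({ message: '61%', color: 'brightgreen' })
  })
})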
My response got long, but here it is! It's more of a continuation of discussion than a conclusion… so maybe we should merge something and pick this up in a follow-on.
I wanted to start off with a couple points which summarize some previous discussions we’ve had about service-testing strategy.
What we ideally want from our service tests is that when a target project is in a certain state (e.g. code coverage 60%) the badge has certain contents (label, message, and color).
An important piece of that bracket – how the upstream API responds when the project is in that state – is (1) outside our control, (2) often undocumented, and (3) subject to change.
Few of the services will go out of their way to notify us, and we have hundreds of these APIs to keep track of. So in addition to the tests, we rely on schemas which are just permissive enough. One value of the schema is preventing code from crashing if the API changes. The other is to document for the next developer the exact assumptions the transform and render code is making of the API response. That way when the API changes, we can infer the correct way to update the transform and render code.
A real integration test of “code coverage 60%” would require a real project that is at 60%. That’s the only way we can know how the API responds to that call today. A service test backed by a mocked / recorded server response gives us much of this. It documents the URL our code should be hitting and an expected response. However it leaves out an important part of the bracket, which is the behavior of the upstream API. So it needs to be paired with a live test that uses the same schema. The live test gives us decent signal that the code is working. Though it’s not perfect, of course. Without a real project at 60% we have to trust that the mock fixture is accurate. For the simpler APIs this isn’t a big leap; for more complex APIs, it might be.
The other thing a live test with minimally lenient validation tells us is that everything is wired up correctly – i.e. the response, the validation, the transform, and the render.
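As a concrete illustration of “just permissive enough”, a schema in that spirit requires exactly the fields the transform/render path reads and nothing more; the nesting follows the snippet above, while the metric key and the 0-1 range are assumptions:

const Joi = require('joi')

const schema = Joi.object({
  _embedded: Joi.object({
    project: Joi.object({
      metric_values: Joi.object({
        // assumed key name; coverage assumed to be a 0-1 fraction
        'scrutinizer.test_coverage': Joi.number()
          .min(0)
          .max(1),
      }).required(),
    }).required(),
  }).required(),
}).required()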
When a render() function has lots of conditional rendering logic, we can write data-driven tests that exercise all the branches of the conditional rendering. Sometimes this logic might use a helper function. Like render(), the helper functions are tested with data-driven tests. I don’t particularly think mocking them helps anything; it obscures the meaning of the tests and introduces errors if anything. (I tend to call these data-driven tests of render() unit tests. I think it’s okay to call something a unit test when it doesn’t mock every helper function it calls. Though if that’s confusing or there’s disagreement on that point, maybe it’s better to stick with “data-driven.”)
A render() function at its best is almost declarative. If it has no branches nor conditionals it may not need tests. In a very declarative render() function, we can often verify its correctness from reading it. We have other assurances, too. BaseService will complain if we misspell a property in the return value. A live test covers the entire code path of a render() function (again, provided it has no branches).
Branches are a good reason to test render(), as it’s the easiest way to make sure all the branches run and behave as expected. There may also be some cases without branches where this is desirable. For example, if there’s a lot of manipulation, we might like to test that we’re passing the right value to a color function. Since the live test doesn’t know the specific value it won’t know the specific color either, and the minimally lenient validator may not really be able to prove the color function is being invoked correctly.
Now returning to the example at hand!
There's an expected behavior/output from the system:
- When the Scrutinizer code coverage is less than 40%, the badge should be red
- When coverage is between 40% and 60% (inclusive), the badge should be yellow
- When coverage is greater than 60%, the badge should be brightgreen
This is a well stated acceptance test: it describes the behavior we want.
The real thing would be a live test against a project that was locked into each of these tranches. That tells us things are wired up all the way through. That is of course not sustainable though, which leaves us to either write service tests with mocked responses, ensuring that our code is working as intended, or data-driven tests of render(), ensuring that if the non-conditional data-flow path is working as expected, the part that is conditional varies in the ways we expect.
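For the mocked-response flavor, a sketch along these lines (Shields’ IcedFrisby/nock service tester; the route, upstream URL, example repo, and fixture are all assumptions):

const t = (module.exports = require('../tester').createServiceTester())

// stand-in fixture: a real one would be a recorded response, matching the
// schema, for a project whose coverage sits in the tranche under test
const coverageFixture = {}

t.create('coverage (mocked upstream)')
  .get('/g/filp/whoops.json')
  .intercept(nock =>
    nock('https://scrutinizer-ci.com')
      .get('/api/repositories/g/filp/whoops')
      .reply(200, coverageFixture)
  )
  .expectBadge({ label: 'coverage', message: '61%', color: 'brightgreen' })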
I’m not opposed to behavior from colorScale being part of the bracket of these tests. As I articulated above, it makes the tests clearer and avoids false negatives.
However, if we remove this test, then we have zero automated tests that are explicitly validating the expected Scrutinizer Coverage behavior/coloring articulated above (unless I am missing some tests somewhere). Without this test, it is entirely possible that this expected behavior/functionality could be broken by any number of code changes to the render function, and there'd be no automated test to catch that. A trivial example would be someone changing the scale steps used on the upper end (61 -> 60):
const scale = colorScale([40, 60], ['red', 'yellow', 'brightgreen'])
The colorScale function would still be doing exactly what it is supposed to do (the tests for the above render function are not intended to verify the validity of colorScale, nor to imply that I doubt colorScale), but the expected behavior of the Scrutinizer Coverage badge would be broken (60% coverage would be brightgreen instead of yellow).
At its best I view render() as almost declarative. In the same way that we don’t write tests on url.base or examples, I’d argue that simple render functions could be treated the same way. Yes, it’s a function which needs to be called, so it’s fundamentally different from a string. However, the render functions are called not only by our live tests, but also (typically) by a server test that checks the service definitions.
Yes, the current service code happens to implement that expected behavior by leveraging the colorScale helper function
In contrast, here’s a declarative explanation:
The badge color is computed from score according to the service’s own defined scale.
I guess the question is:
When we already can prove a function is being called, and it’s declarative, is it necessary to include a test?
We have a lot of trivial render() functions which basically delegate to a color formatter and a text formatter. If the answer is yes, would they all need tests too?
Gonna take me a while to read through and digest all of this, so in the spirit of completing the service refactor by today's target date...
Do I need to remove this unit test in order to complete this PR?
Let's merge it as is!
👍 I was supposed to put something together beforehand so I'll take the action item to follow up on this to keep the conversation rolling.
Did you get a chance to look at the other changes already too? I think they're better-ish now
Yes, I think I had read it all before. Looks good!
… /scrutinizer/quality/...) with a redirector added to maintain any existing users of the old route.