refactor [Scrutinizer] #3266

Merged
merged 9 commits into from Apr 27, 2019
Conversation

calebcartwright
Member

@calebcartwright calebcartwright commented Apr 4, 2019

  • Split service classes into separate files
  • Quality badge was given an explicit route (/scrutinizer/quality/...), with a redirector added to keep any existing users of the old route working (see the sketch below)
  • Updated the color scale used for the coverage badge to match the Scrutinizer scale (closes #735: Code coverage color doesn't correspond to colors of Scrutinizer badge)
    • < 40% is red, between 40% and 60% (inclusive) is yellow, > 60% is bright green (the scale displayed on reports like this); see the sketch below
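Sketches of both pieces, for illustration only. First, the redirector from the old implicit route to the new explicit quality route; the helper is shields' redirector, but the pattern, category, and date shown here are assumptions rather than the exact code in this PR:

const { redirector } = require('..')

module.exports = redirector({
  category: 'analysis',
  route: {
    base: 'scrutinizer',
    pattern: ':vcs(g|b)/:user/:repo/:branch*',
  },
  transformPath: ({ vcs, user, repo, branch }) =>
    `/scrutinizer/quality/${vcs}/${user}/${repo}${branch ? `/${branch}` : ''}`,
  dateAdded: new Date('2019-04-27'),
})

Second, the coverage scale expressed with the existing colorScale helper; the variable name is illustrative, but the thresholds match the ones listed above:

const { colorScale } = require('../color-formatters')

// < 40 → red, 40–60 (inclusive) → yellow, > 60 → brightgreen
const scrutinizerColorScale = colorScale([40, 61], ['red', 'yellow', 'brightgreen'])

// e.g. scrutinizerColorScale(60) === 'yellow', scrutinizerColorScale(61) === 'brightgreen'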

@calebcartwright calebcartwright added the service-badge New or updated service badge label Apr 4, 2019
@shields-ci

shields-ci commented Apr 4, 2019

Messages
📖 ✨ Thanks for your contribution to Shields, @calebcartwright!

Generated by 🚫 dangerJS against 0a845b9

@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 4, 2019 03:00 Inactive
@calebcartwright
Member Author

One outstanding concern/question I have is around the route parameters.

Scrutinizer supports 4 different types of VCS (GitHub, Bitbucket, GitLab, and what they call "Plain Git"), and the latter two seemingly require different params for the API call we need to get the data (for all of our Scrutinizer badges)
https://scrutinizer-ci.com/docs/api/#repository-details

I do not believe the routes as currently set in this PR will work correctly with those latter two use cases, and I'm also skeptical the existing regex in the legacy service implementation works in all cases either (particularly if the user wanted a Scrutinizer badge for a GitLab repo on a specific branch) 🤔

Need to think on this one for a bit, but welcome any thoughts/suggestions anyone may have

@calebcartwright
Member Author

I think I've found an approach that'll work. Will likely have this ready for review by tomorrow (the 24th)

@calebcartwright calebcartwright marked this pull request as ready for review April 24, 2019 21:20
@calebcartwright calebcartwright changed the title from refactor [Scrutinizer] to [WIP] refactor [Scrutinizer] on Apr 24, 2019
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 24, 2019 21:21 Inactive
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 24, 2019 21:22 Inactive
@shields-ci

shields-ci commented Apr 24, 2019

Messages
📖 ✨ Thanks for your contribution to Shields, @calebcartwright!

Generated by 🚫 dangerJS against 6ed7e92

@calebcartwright calebcartwright changed the title from [WIP] refactor [Scrutinizer] to refactor [Scrutinizer] on Apr 24, 2019
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 24, 2019 22:14 Inactive
@calebcartwright calebcartwright changed the title from refactor [Scrutinizer] to [WIP] refactor [Scrutinizer] on Apr 24, 2019
@calebcartwright calebcartwright changed the title from [WIP] refactor [Scrutinizer] to refactor [Scrutinizer] on Apr 24, 2019
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 24, 2019 22:49 Inactive
@calebcartwright
Member Author

Alright, this is finally ready for review. I suspect we may want to make some improvements to this in the future, but I wanted to go ahead and get a refactored version out there in the spirit of hitting the 26th target in #2863.

Member

@paulmelnikow paulmelnikow left a comment

There's quite a lot going on in these! Nice work.

  }

  transform({ json, branch }) {
    branch = this.transformBranch({ json, wantedBranch: branch })
Member

Maybe call this branchInfo? It's a bit confusing to re-use the name since (old) branch is a string and (new) branch is a nested object.

Or destructure:

const { build_status: { status } } = this.transformBranch(...)
return { status }

You could also inline this in render() to reduce indirection.

Member Author

Maybe call this branchInfo? It's a bit confusing to re-use the name since (old) branch is a string and (new) branch is a nested object.

Or destructure:

Good point!

You could also inline this in render() to reduce indirection

Not sure I follow, as there are no render functions here. Are you suggesting handling all of this below by skipping transform and inlining it in the renderBuildStatusBadge call? Or something else?

Member

Oops, I meant inlining the transform() function in handle().
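For the record, a rough sketch of what that inlining might look like, assuming route params and a fetch() helper along the lines of the rest of this PR (the names here are illustrative, not the exact code):

async handle({ vcs, user, repo, branch }) {
  const json = await this.fetch({ vcs, user, repo })
  const {
    build_status: { status },
  } = this.transformBranch({ json, wantedBranch: branch })
  return renderBuildStatusBadge({ status })
}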

Member Author

Updated, let me know if that looks any better

index: Joi.object({
  _embedded: Joi.object({
    project: Joi.object({
      metric_values: Joi.object({
Member

Have to wonder if they ever intended for people to read these…

Member Author

🤣

  }

  static render({ coverage }) {
    if (isNaN(coverage)) {
Member

Is this an error? Should it be checked in transform()?

Member Author

copy/paste from legacy implementation!

I actually don't think this is a possible scenario anymore given what's already in place in the schema and transform, so I'll update it accordingly.

Member Author

Apologies, I was wrong (one of the downsides of working on this off and on for nearly a month). The current behavior of our badge is to display unknown when the specified repo/project doesn't have any coverage available.

I've updated this behavior a bit by checking for coverage in the transform function and throwing a more explicit coverage not found error in that scenario (open to any alternative error messages), removing this particular check from the render function.

It is admittedly another change to the existing implementation, but I think an improvement.
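Roughly, the transform-level check described above could look like the following; how coverage gets pulled out of the API response is elided, extractCoverage is a hypothetical helper, and NotFound is the error class services throw to produce a "not found" badge:

// assumes: const { NotFound } = require('..')
transform({ json, branch }) {
  const coverage = this.extractCoverage({ json, branch }) // hypothetical helper
  if (coverage === undefined) {
    throw new NotFound({ prettyMessage: 'coverage not found' })
  }
  return { coverage }
}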

given({ coverage: 61 }).expect({
  message: '61%',
  color: 'brightgreen',
})
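For context, a data-driven test along these lines presumably covers each threshold in the sazerac style used for render() tests in this repo; the class name, import path, and the additional cases below are assumptions:

const { test, given } = require('sazerac')
const ScrutinizerCoverage = require('./scrutinizer-coverage.service') // hypothetical path

describe('ScrutinizerCoverage', function() {
  test(ScrutinizerCoverage.render, () => {
    given({ coverage: 39 }).expect({ message: '39%', color: 'red' })
    given({ coverage: 40 }).expect({ message: '40%', color: 'yellow' })
    given({ coverage: 60 }).expect({ message: '60%', color: 'yellow' })
    given({ coverage: 61 }).expect({ message: '61%', color: 'brightgreen' })
  })
})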
Member

Is this test worth keeping? It seems substantially to test colorScale.

Member Author

@calebcartwright calebcartwright Apr 25, 2019

I added this as a net-new test since there is custom render logic/behavior. Yes, the service class is using the colorScale helper, but the intent is to validate that our Scrutinizer Coverage badges follow the same scheme/thresholds that Scrutinizer presents to users (which I assume we want to support given #735, unless folks feel differently).

IMO that's definitely something worth testing/validating. Without this test, we would not have an automated way of validating that expected behavior (for example if someone switched to using the coveragePercentage formatter instead)

Are you suggesting removing this render test, testing the render logic differently, or something else?

Member Author

I also just remembered that I had intended to add similar tests for the Quality service but forgot. Will wait til we have a decision here though before making any other changes.

Member

I don't feel strongly about this, just thinking that in the same way we don't write defensive tests that our Joi validation code rejects JSON that doesn't match the schema, it would be good if we felt the same about colorScale.

In other words, maybe we could revise the interface to colorScale so that it's obvious, just from reading the code, that it's working as intended.

Unless there's something more nuanced happening here that I'm missing!

Member Author

@calebcartwright calebcartwright Apr 25, 2019

Back to testing philosophies 😄

I don't feel strongly enough to risk the 26th target, but I do think this is a good use case to have a dialog on an aspect of testing. I think that dialog should be helpful for future discussions on testing guidelines for Shields so I'll try to clarify my thought process:

There's an expected behavior/output from the system:

  • When the scrutinizer code coverage is less than 40%, the badge should be red
  • When coverage is between 40% and 60% (inclusive), the badge should be yellow
  • When coverage is greater than 60% the badge should be brightgreen

I believe (strongly) in the notion of having automated tests that validate expected behavior, and in having automated tests that validate behavior at multiple levels (unit/component/integration/functional/etc.).

Yes, the current service code happens to implement that expected behavior by leveraging the colorScale helper function, and yes that helper function itself has unit tests that validate its functionality, and yes we trust colorScale.

However, if we remove this test, then we have zero automated tests that are explicitly validating the expected Scrutinizer Coverage behavior/coloring articulated above (unless I am missing some tests somewhere).

Without this test, it is entirely possible that this expected behavior/functionality could be broken by any number of code changes to the render function, and there'd be no automated test to catch that.

A trivial example would be someone changing the scale steps used on the upper end (61 -> 60):

const scale = colorScale([40, 60], ['red', 'yellow', 'brightgreen'])

the colorScale function would still be doing exactly what it is supposed to do (the tests for the above render function are not intended to verify the validity of colorScale, nor to imply that I doubt colorScale), but the expected behavior of the Scrutinizer Coverage badge would be broken (60% coverage would be brightgreen instead of yellow).

I suppose an alternative way of viewing the above test is that it checks that the Scrutinizer Coverage implementation is passing the correct arguments to colorScale, as opposed to testing the functionality of colorScale itself. Were it easy to do so, I likely would have structured the tests to simply validate that render was using the helper function correctly; I've just not seen an easy way to validate that in these sorts of tests when the helper is colorScale.

Admittedly, a "true/pure" unit test of the render function would validate it in complete isolation with all external touch points mocked out. However, as I mentioned, that just didn't seem worth the effort to me here.

Anyway, my two cents!

Member

My response got long, but here it is! It's more of a continuation of discussion than a conclusion… so maybe we should merge something and pick this up in a follow-on.

I wanted to start off with a couple points which summarize some previous discussions we’ve had about service-testing strategy.

What we ideally want from our service tests is that when a target project is in a certain state (e.g. code coverage 60%) the badge has certain contents (label, message, and color).

An important piece of that bracket – how the upstream API responds when the project is in that state – is (1) outside our control, (2) often undocumented, and (3) subject to change.

Few of the services will go out of their way to notify us, and we have hundreds of these APIs to keep track of. So in addition to the tests, we rely on schemas which are just permissive enough. One value of the schema is preventing code from crashing if the API changes. The other is to document for the next developer the exact assumptions the transform and render code is making of the API response. That way when the API changes, we can infer the correct way to update the transform and render code.

A real integration test of “code coverage 60%” would require a real project that is at 60%. That’s the only way we can know how the API responds to that call today. A service test backed by a mocked / recorded server response gives us much of this. It documents the URL our code should be hitting and an expected response. However, it leaves out an important part of the bracket, which is the behavior of the upstream API. So it needs to be paired with a live test that uses the same schema. The live test gives us a decent signal that the code is working. Though it’s not perfect, of course. Without a real project at 60% we have to trust that the mock fixture is accurate. For the simpler APIs this isn’t a big leap; for more complex APIs, it might be.

The other thing a live test with minimally lenient validation tells us is that everything is wired up correctly – i.e. the response, the validation, the transform, and the render.
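As an illustration of the mocked-response service test being described, something along these lines; the tester helpers follow the conventions used in the repo's service tests, but the exact helper names, the API path, and the fixture are assumptions here:

const t = (module.exports = require('../tester').createServiceTester())

t.create('coverage (mocked)')
  .get('/g/some-user/some-repo.json')
  .intercept(nock =>
    nock('https://scrutinizer-ci.com')
      .get('/api/repositories/g/some-user/some-repo')
      .reply(200, { /* recorded response for a project at 60% coverage, elided */ })
  )
  .expectBadge({ label: 'coverage', message: '60%', color: 'yellow' })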

When a render() function has lots of conditional rendering logic, we can write data-driven tests that exercise all the branches of the conditional rendering. Sometimes this logic might use a helper function. Like render(), the helper functions are tested with data-driven tests. I don’t particularly think mocking them helps anything; it obscures the meaning of the tests and introduces errors if anything. (I tend to call these data-driven tests of render() unit tests. I think it’s okay to call something a unit test when it doesn’t mock every helper function it calls. Though if that’s confusing or there’s disagreement on that point, maybe it’s better to stick with “data-driven.”)

A render() function at its best is almost declarative. If it has no branches nor conditionals it may not need tests. In a very declarative render() function, we can often verify its correctness from reading it. We have other assurances, too. BaseService will complain if we misspell a property in the return value. A live test covers the entire code path of a render() function (again, provided it has no branches).

Branches are a good reason to test render(), as it’s the easiest way to make sure all the branches run and behave as expected. There may also be some cases without branches where this is desirable. For example, if there’s a lot of manipulation, we might like to test that we’re passing the right value to a color function. Since the live test doesn’t know the specific value it won’t know the specific color either, and the minimally lenient validator may not really be able to prove the color function is being invoked correctly.

Now returning to the example at hand!

There's an expected behavior/output from the system:

  • When the scrutinizer code coverage is less than 40%, the badge should be red
  • When coverage is between 40% and 60% (inclusive), the badge should be yellow
  • When coverage is greater than 60% the badge should be brightgreen

This is a well stated acceptance test: it describes the behavior we want.

The real thing would be a live test against a project that was locked into each of these tranches. That tells us things are wired up all the way through. That is of course not sustainable, which leaves us either to write service tests with mocked responses, ensuring that our code is working as intended, or data-driven tests of render(), ensuring that, provided the non-conditional data-flow path works as expected, the conditional part varies in the ways we expect.

I’m not opposed to behavior from colorScale being part of the bracket of these tests. As I articulated above, it makes the tests clearer and avoids false negatives.

However, if we remove this test, then we have zero automated tests that are explicitly validating the expected Scrutinizer Coverage behavior/coloring articulated above (unless I am missing some tests somewhere).

Without this test, it is entirely possible that this expected behavior/functionality could be broken by any number of code changes to the render function, and there'd be no automated test to catch that.

A trivial example would be someone changing the scale steps used on the upper end (61 -> 60):

const scale = colorScale([40, 60], ['red', 'yellow', 'brightgreen'])

the colorScale function would still be doing exactly what it is supposed to do (the tests for the above render function are not intended to verify the validity of colorScale, nor to imply that I doubt colorScale), but the expected behavior of the Scrutinizer Coverage badge would be broken (60% coverage would be brightgreen instead of yellow).

At its best I view render() as almost declarative. In the same way that we don’t write tests on url.base or examples, I’d argue that simple render functions could go untested. Yes, it’s a function which needs to be called, so it’s fundamentally different from a string. However, the render functions are called not only by our live tests, but also (typically) by a server test that checks the service definitions.

Yes, the current service code happens to implement that expected behavior by leveraging the colorScale helper function

In contrast, here’s a declarative explanation:

The badge color is computed from score according to the service’s own defined scale.

I guess the question is:

When we can already prove a function is being called, and it’s declarative, is it necessary to include a test?

We have a lot of trivial render() functions which basically delegate to a color formatter and a text formatter. If the answer is yes, would they all need tests too?

Member Author

Gonna take me a while to read through and digest all of this, so in the spirit of completing the service refactor by today's target date...

Do I need to remove this unit test in order to complete this PR?

Member

Let's merge it as is!

Member Author

👍 I was supposed to put something together beforehand so I'll take the action item to follow up on this to keep the conversation rolling.

Did you get a chance to look at the other changes already too? I think they're better-ish now

Member

Yes, I think I had read it all before. Looks good!

@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 25, 2019 19:16 Inactive
@paulmelnikow paulmelnikow temporarily deployed to shields-staging-pr-3266 April 25, 2019 20:02 Inactive
@shields-deployment

This pull request was merged to master branch. This change is now waiting for deployment, which will usually happen within a few days. Stay tuned by joining our #ops channel on Discord!

After deployment, changes are copied to the gh-pages branch.

Labels
service-badge New or updated service badge

Successfully merging this pull request may close these issues:
Code coverage color doesn't correspond to colors of Scrutinizer badge

3 participants