-
Notifications
You must be signed in to change notification settings - Fork 800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to revive, drop golint #3945
Conversation
ef87b76
to
532a54b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nice! It will allow us to improve code quality in the long run.
One question on the initial rule set. As I understand this is currently roughly equivalent to golint + few more rules that you see as high value, but not do not have too much violations currently, correct?
Would it make sense to add all available rules in the config file, but comment some them out with either:
- TODO: too much failures currently, once fixed enable it
- We will not use this rule because of [reason]
That way it would be more visible what we are using, what we are planning to use and what is not suitable for us and why.
I've kinda debated doing that, yeah. The full list is rather large though, and some of them are just kinda clearly "meh" or "no" due to kinda innately having high false-positives or applying pressures that IMO lead to gaming metrics: https://github.com/mgechev/revive Also you can eyeball cyclomatic complexity and some other lints easily enough during code review. tbh I'm more interested in the things that are hard to review - those are the ones that are particularly valuable to automate. Anyway though. Yep, I could do that - rejects really only take a few words, and it'd at least document the full list that we've seen. |
ah, to answer the first part of your comment: yep, exactly. I'd include the I may add a "lint-next" target to make it easy to find stuff worth picking away at gradually, before enforcing. The others new ones are a bit higher importance though imo. |
66f0f46
to
6861044
Compare
All lints now documented and organized! |
d66902e
to
02e255e
Compare
f1e29be
to
5dff3e9
Compare
76000c5
to
b58a3ee
Compare
service/frontend/adminHandler.go
Outdated
@@ -248,7 +248,7 @@ func (adh *adminHandlerImpl) DescribeWorkflowExecution( | |||
} | |||
|
|||
shardID := common.WorkflowIDToHistoryShard(request.Execution.WorkflowID, adh.numberOfHistoryShards) | |||
shardIDstr := string(shardID) | |||
shardIDstr := strconv.Itoa(shardID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can drop one of shardID strings here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into this one a bit more (and failed to make a smaller repro), and have decided to undo it / change it to a form that go vet
allows (and is recommended in the github issue thread that lead to the vet check being added) but produces the same output as before.
Since this appears to change how shards in a hash ring are selected, is it safe to change at all? Or could it cause Problems™ on deploy, as different instances will compute different shards for the same keys?
shardIDstr := string(shardID) | ||
shardIDstr := string(rune(shardID)) // originally `string(int_shard_id)`, but changing it will change the ring hashing | ||
shardIDForOutput := strconv.Itoa(shardID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mantas-sidlauskas ^ this is the new version
lint: $(BIN)/revive | ||
$(BIN)/revive -config revive.toml -exclude './canary/...' -exclude './vendor/...' -formatter unix ./... | sort |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strangely it also doesn't exclude vendor by default. maybe if I set my go flags to include -mod=vendor...?
meh. at least you can tell it to ignore it.
@@ -18,8 +18,7 @@ require ( | |||
github.com/dmetzgar/goveralls v0.0.3 | |||
github.com/eapache/go-resiliency v1.2.0 // indirect | |||
github.com/emirpasic/gods v0.0.0-20190624094223-e689965507ab | |||
github.com/fatih/color v0.0.0-20181010231311-3f9d52f7176a | |||
github.com/fatih/structtag v1.1.0 // indirect | |||
github.com/fatih/color v1.10.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(to pick a random version change)
these changes are the result of a go get
for revive, not a go get -u
, so it likely has these versions as minimums.
either way though, none of these seem particularly concerning.
36943fa
to
12acdaf
Compare
@@ -1048,7 +1049,7 @@ func (c *clientImpl) getClientForDomainID(domainID string) (Client, error) { | |||
} | |||
|
|||
func (c *clientImpl) getClientForShardID(shardID int) (Client, error) { | |||
client, err := c.clients.GetClientForKey(string(shardID)) | |||
client, err := c.clients.GetClientForKey(strconv.Itoa(shardID)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't go into any load-distribution system, just an in-memory client cache. safe to change.
Golint is borderline abandonware (all that energy seems to be on `go vet` instead), and doesn't even ignore vendor folders from `./...` despite [having a github issue since 2017](golang/lint#320). Plus it's non-configurable by design, which IMO encourages some bad habits. See the next commit for details. After a brief skim of the current state of things, `revive` popped out as a possibility, and the results are pretty good. Runtimes on a fairly beefy machine: ```sh # time make lint (before) make lint 8.80s user 13.14s system 154% cpu 14.206 total # time make lint (simply swapping bins) make lint 10.85s user 16.89s system 149% cpu 18.545 total # time make lint (new call) make lint 21.72s user 4.22s system 2555% cpu 1.015 total ``` So that's a **14x reduction** in runtime, and now it's at the point where it's fine to run all the time (~1 second). Locally, perhaps due to having fewer cores, this goes from 14s -> 2s. Still substantially better, and still comfortably fast. This default list also finds an additional 14 items, none major, and they basically match newer `golint` versions / what IDEs will often highlight: ``` # new lint, potentially worth following. # there are many instances of this in the canary/ folder, but this was ignored earlier, so it is ignored now as well. service/worker/failovermanager/starter.go:91:9: [context-keys-type] should not use basic type string as key in context.WithValue # new lint, potentially worth following service/worker/scanner/shardscanner/scanner_workflow.go:176:2: [if-return] redundant if ...; err != nil check, just return error instead. # 3 new comment-formatting / exported-type-without-comment additions ``` Nothing significant, but (ignoring the new comment lints) possibly worth tackling, and not excessively noisy. The following commits add some new lints + fixes, this one just aims for parity.
This does not cover all of the comment-related lints, but it does cover all of the ones that currently produce warnings. After this, 10 go away, bringing the total to 18. **Rationale:** Many of the comment-requiring and comment-formatting lints are focused on both: 1. Encouraging documenting public APIs. But this repo is effectively entirely private / not intended for people to import. 2. Producing "nice" documentation in `go doc` output. Whether this makes more readable docs or not is debatable, but I understand the goal. When *either or both* of those don't apply, the lints just kinda suck. Their mere existence encourages pointless stuff like: ```go // ConvertPretty ... <snip> func (e *ESql) ConvertPretty(sql string, pagination ...interface{}) (dsl string, sortField []string, err error) { ``` Or this, which appears in 30 places: ```go // Ptr is a helper function for getting pointer value func ... Ptr() ... ``` Or this TBD pattern, in **OVER 2000** places: ```go // WorkflowExecutionCanceledEventAttributes is an internal type (TBD...) ``` It's a waste of time, space, and mental-energy. In my experience it also implies the code is in a "higher quality" state than it may actually be, as good code often has documentation. While that's obviously not *actually true*, I've frequently seen people stop reading when they find a documented func regardless of what the documentation actually says, on the assumption that it's either 1) functioning correctly, or 2) intended for broader use. 2 in particular often manifests as code importing some random unrelated package because it happens to have the only documented + commonly-used pointer-conversion-func. I've seen this in dozens of repositories, it's a plague. Even if we (think we) aren't vulnerable to that kind of thinking when writing code, it seems worth avoiding since it clearly brings no benefit. Document what needs documentation. Don't add documentation that offers nothing beyond the type name (or less). And let's delete bad docs when we see them.
This finds 21 failures, though only 3 of which are outside of tests. In practice, these look like: ```go # service/history/shard/controller.go:381:55: [string-of-int] dubious convertion of an integer into a string, use strconv.Itoa info, err := c.GetHistoryServiceResolver().Lookup(string(shardID)) ``` And they are *nearly always* a bug. The correlation is so strong, `go test` on 1.15 and newer will by default refuse to compile a package with this code. They have a pretty good bit on this in the release notes: https://golang.org/doc/go1.15#vet To copy a relevant snippet from that: > `string(9786)` does not evaluate to the string `"9786"`; it evaluates to the string `"\xe2\x98\xba"`, or `"☺"`. So yeah. That's basically never what is intended. If you are on 1.15 or newer, you can reproduce these with `go vet -stringintconv ./...`. Or just run `go test` on an affected package, it'll error rather than run tests. We should *absolutely* prevent *all* of these. We can't rely on `go` though, since this repo targets 1.12, so I'm glad this lint exists. --- There is one spot I *did not* change (behaviorally), because I'm not sure what doing so will do. The `shardIDstr` in service/frontend/adminHandler.go:251 is used as a hash-ring lookup key. Changing this will change how shards hash, which could cause issues when deployed. So, for now at least, I've converted it to the equivalent that the Go github issue recommends: `string(rune(int))`. This produces the same values as before. --- Strangely, this misses a fairly trivial case that `go vet` catches: ```go --- a/service/frontend/adminHandler.go +++ b/service/frontend/adminHandler.go @@ -248,7 +248,7 @@ func (adh *adminHandlerImpl) DescribeWorkflowExecution( } shardID := common.WorkflowIDToHistoryShard(request.Execution.WorkflowID, adh.numberOfHistoryShards) - shardIDstr := string(shardID) // misses this one + shardIDstr := string(rune(shardID)) // ... shardIDForOutput := strconv.Itoa(shardID) // correct, not changed in this diff historyHost, err := adh.GetMembershipMonitor().Lookup(common.HistoryServiceName, shardIDstr) ``` This might be due to the variable being handled correctly once? I dunno. I haven't been able to reproduce this with a smaller test, unfortunately. Thankfully, this repo is open source, so it can serve as the reproduction.
Finds 3 failures that are partially false positives: ``` common/log/panic.go:37:17: [defer] recover must be called inside a deferred function common/persistence/persistence-tests/executionManagerTestForEventsV2.go:51:7: [defer] recover must be called inside a deferred function service/history/execution/cache.go:285:14: [defer] recover must be called inside a deferred function ``` These are not *all* incorrect. This kind of construct is safe, but it will still warn: ```go recov := func() { recover() } defer recov() ``` This is the equivalent of `common/log/panic.go` plus lines like this in `service/frontend/adminHandler.go`: ```go defer log.CapturePanic(adh.GetLogger(), &retError) ``` But this kind is not correct: ```go recov := func() { recover() } defer func() { recov() }() // or recov() ``` When the containing func is not *directly* executed by `defer`, the `recover()` call *silently does nothing*, which is why this lint exists. And the non-deferred case is clearly ineffective as well. The first incorrect example is the same as `service/history/execution/cache.go`, when returned by `GetAndCreateWorkflowExecution`, and deferred in `service/history/historyEngine.go` in `DescribeMutableState`. There are more examples of this though, the signature of the recover-ing func encourages misuse because it accepts an `error` argument. The second is the same as `service/frontend/dcRedirectionHandler.go`. This one is likely not intended to catch anything though. Even though there is a safe way to do this, and much of the code does so, IMO this is simply too risky to even approach. So I would like to change ALL uses to get rid of the false positives. That change will involve making all `recover()` calls explicit, so they can be linted, e.g.: ```go defer func() { recov(recover(), &err) }() `` This is noticeably more verbose, but it's trivially linted, and (when enforced) is practically impossible to mess up. --- This is all pretty easily verified with the following code (comment out one of the `recov()` strategies): ```go var missed interface{} var caught interface{} defer func() { fmt.Println("caught:", caught, "missed:", missed) }() defer func() { missed = recover() }() recov := func() { caught = recover() } defer recov() // this works defer func() { recov() }() // this does not panic("oops") ``` This prints: ``` // defer recov(): caught: oops missed: <nil> // defer func() { recov() }(): caught: <nil> missed: oops ``` Good: https://play.golang.org/p/4D4Absnk7U7 Bad: https://play.golang.org/p/cX9XnqQgZJz I honestly have no idea how the Go authors think this is *even remotely* a good idea. Especially without warnings / errors. Maybe it's for performance reasons.......?
This set seems useful, and only has one violation: ``` common/testing/event_generator.go:420:2: [modifies-value-receiver] suspicious assignment to a by-value method receiver ``` That has been fixed in this commit. It appears unused, and is likely just a mistake from copying the value getters. And I've kept this lint especially because I can't see a legitimate reason to allow it - it's fairly error-prone, due to implicit copies when passed by value. The other lints are pretty obvious by their name, have zero results currently, and have no significant impact on runtime. They seem worth maintaining going forward.
With this, `revive.toml` mentions all currently-available lints. Some seem desirable, but they have more lint failures than I feel like tackling at the moment :) So they're just documented for now. --- `unhandled-error` I'd *love* to enable immediately, but ~300 warnings is far too many. Newly-introduced violations of other lints would just be lost in the noise. Reducing (or rarely whitelisting) these seems necessary before enabling, which IMO should be considered fairly high priority. Missing error handling is pretty frequently a source of hard-to-identify misbehavior, and a lint like this is the only feasible way to catch many cases. It's just far too hard to reliably catch during code review (as evidence: 300 cases!), especially when the returned value(s) are not assigned to anything.
See the individual commits for a detailed breakdown of lints / changes and what they mean / why they are made.
I intend to merge without squashing (as a fast-forward), so they are all retained.
As a high-level tl;dr, these commits: