proposal: cmd/vet: add check for sync.WaitGroup abuse #18022

Open
dsnet opened this issue Nov 22, 2016 · 40 comments

@dsnet
Member

dsnet commented Nov 22, 2016

The API for WaitGroup is very easy for people to misuse. The documentation for Add says:

calls to Add should execute before the statement creating the goroutine or other event to be waited for.

However, it is very common to see this incorrect pattern:

var wg sync.WaitGroup
defer wg.Wait()

go func() {
	wg.Add(1)
	defer wg.Done()
}()

This usage is fundamentally racy and probably does not do what the user wanted. Worse yet, it is not detectable by the race detector.

Since the above pattern is common, I propose that we add a method Go that essentially does the Add(1) and subsequent call to Done in the correct way. That is:

func (wg *WaitGroup) Go(f func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		f()
	}()
}
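
For illustration, here is a minimal sketch of how such a method might be used. The `group` type and the `countWith` function are hypothetical names invented for this example; the sketch wraps sync.WaitGroup rather than changing it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// group is a hypothetical wrapper type used only for this sketch.
type group struct {
	wg sync.WaitGroup
}

// Go increments the counter before starting the goroutine,
// so a later Wait cannot miss it.
func (g *group) Go(f func()) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		f()
	}()
}

// Wait blocks until every function started with Go has returned.
func (g *group) Wait() { g.wg.Wait() }

// countWith starts n goroutines through Go and returns how many ran.
func countWith(n int) int64 {
	var g group
	var ran atomic.Int64
	for i := 0; i < n; i++ {
		g.Go(func() { ran.Add(1) })
	}
	g.Wait()
	return ran.Load()
}

func main() {
	fmt.Println(countWith(10)) // prints 10
}
```

Because Add runs in the caller, the count is already non-zero by the time Wait can be reached, which is the ordering the Add documentation asks for.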
@dsnet added the Proposal label Nov 22, 2016
@dsnet added this to the Proposal milestone Nov 22, 2016
@cznic
Contributor

cznic commented Nov 23, 2016

I don't like the idea. It's IMHO perhaps a task for go vet, if not implemented already. Also, the proposed method would add a(nother) closure layer when the go-ed function has parameters.

@minux
Member

minux commented Nov 23, 2016 via email

@dsnet
Member Author

dsnet commented Nov 23, 2016

A go vet check seems pretty reasonable. I just tried it right now on the following:

func main() {
	var wg sync.WaitGroup
	defer wg.Wait()
	go func() {
		wg.Add(1)
		defer wg.Done()
	}()
}

and vet doesn't report anything.
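
For contrast, a sketch of the corrected form, with Add moved ahead of the go statement as the documentation asks. The function name `run` and the variable `msg` are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// run starts one goroutine, waits for it, and returns what it wrote.
func run() string {
	var wg sync.WaitGroup
	var msg string

	wg.Add(1) // before the go statement, so Wait is guaranteed to see it
	go func() {
		defer wg.Done()
		msg = "done"
	}()

	wg.Wait() // returns only after Done has been called
	return msg
}

func main() {
	fmt.Println(run()) // prints "done"
}
```

In the racy version, Wait can observe a zero counter and return before the goroutine has even called Add.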

In terms of vet's requirements:

  • Correctness: The problem is clearly a race.
  • Frequency: My gut feeling is that this is fairly common. A number of internal bugs that I've fixed are of this nature, so it subjectively feels common enough. I don't have hard numbers.
  • Precision: I don't have an algorithm in mind that can accurately identify the pattern and I can imagine some number of false positives.

@dominikh , staticcheck does report a problem:
main.go:10:3: should call wg.Add(1) before starting the goroutine to avoid a race (SA2000)
I'm wondering how accurate the check is and whether it is worth adding to vet.

@mvdan
Member

mvdan commented Nov 23, 2016

As far as the proposal goes, I don't like how it limits the func signature to func().

@dominikh
Member

@dsnet The check in staticcheck has no (known) false positives. It shouldn't have a significant number of false negatives, either. The implementation is a simple pattern-based check, detecting go with function literals where the first statement is a call to wg.Add – this avoids flagging wg.Add calls further down the goroutine, which tend to be valid uses.

I'm -1 on the proposed Go function. I'd prefer not having to read code with an unnecessary level of nesting that looks callback-esque and reminds me of JavaScript.

@mvdan
Member

mvdan commented Nov 23, 2016

@dominikh I don't see any extra level of nesting here, though (assuming any func signature is allowed).

To be nitpicky, another thing that stands out from the proposal is how wg.Go() will create a goroutine even though go is never directly used by the user. I don't know if the standard library does this anywhere else, but I would prefer if it was left explicit.

@dominikh
Member

@mvdan The extra level of nesting would come from a predicted usage that looks something like this:

wg.Go(func() {
  // do stuff
})

as opposed to

go func() {
  // do stuff
}()

Admittedly the same level of indentation, but syntactically it's one extra level of nesting.

@mvdan
Member

mvdan commented Nov 23, 2016

Ah yes, I was thinking indentation there.

@cznic
Contributor

cznic commented Nov 23, 2016

The problematic case is that

wg.Add(1)
go func(i int) { ... }(42)

// becomes

wg.Go(func() {
        go func(i int) { ... }(42)
})

@mvdan
Member

mvdan commented Nov 23, 2016

@cznic if you mean without the extra go, this would be solved if the restriction on the func() signature was removed.

@dominikh
Member

@mvdan Do you mean by allowing something like the following?

wg.Go(func(x, y int) { ...}, v1, v2)

IMHO that's way too much interface{} and not enough type safety.
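
To make the argument-passing concern concrete, here is a small sketch comparing the two styles. `spawn` is a hypothetical stand-in for the proposed wg.Go, and `withGoArgs` and `withClosure` are names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// spawn is a hypothetical stand-in for the proposed wg.Go(f).
func spawn(wg *sync.WaitGroup, f func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		f()
	}()
}

// withGoArgs passes x as an argument, which is evaluated at the go statement.
func withGoArgs() int {
	var wg sync.WaitGroup
	x, got := 1, 0
	wg.Add(1)
	go func(v int) {
		defer wg.Done()
		got = v
	}(x)
	x = 2 // too late to affect v
	wg.Wait()
	return got
}

// withClosure must copy x by hand, because func() takes no parameters.
func withClosure() int {
	var wg sync.WaitGroup
	x, got := 1, 0
	v := x // explicit copy; capturing x itself would race with the write below
	spawn(&wg, func() { got = v })
	x = 2
	wg.Wait()
	return got
}

func main() {
	fmt.Println(withGoArgs(), withClosure()) // prints "1 1"
}
```

Both forms yield the same value here, but only the first gets the copy for free from the go statement; the second depends on the author remembering to make it.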

@mattn
Member

mattn commented Nov 23, 2016

Is a panic recovered?

@mvdan
Member

mvdan commented Nov 23, 2016

@dominikh true; I was simply pointing at the issue without contemplating a solution :)

@rsc
Contributor

rsc commented Nov 28, 2016

The API change here has the problems identified above with argument evaluation. Also, in general the libraries do not typically phrase functionality in terms of callbacks. If we're going to start using callbacks broadly, that should be a separate decision (and not one to make today). For both these reasons, it seems like .Go is not a clear win.

It would be nice to have a vet check that we trust (no false positives). Perhaps it is enough to look for two statements

wg.Add(1)
defer wg.Done()

back to back and reject that always. Thoughts about how to make vet catch this reliably?

@renannprado

I agree that vet is a better place for this. The proposed API is reminiscent of JavaScript: it will often force us to wrap the code or function within a function with no arguments, when you could have just written go func()....
Still, nothing can stop you from creating such a helper method, even though I don't see the need for it.

@rsc
Contributor

rsc commented Dec 5, 2016

It sounds like we are deciding to make go vet check this and not add new API here. Any arguments against that?

@dsnet
Member Author

dsnet commented Dec 5, 2016

SGTM

@dsnet changed the title from "proposal: sync: add Go method to WaitGroup" to "cmd/vet: add check for sync.WaitGroup abuse" Dec 5, 2016
@dsnet removed the Proposal label Dec 9, 2016
@rsc
Contributor

rsc commented Aug 5, 2020

I've added this proposal to the proposal process bin, but it's blocked on someone figuring out how to implement a useful check. Is anyone interested in doing that?

@dominikh
Member

dominikh commented Aug 6, 2020

Staticcheck has a fairly trivial check: for a GoStmt of a FuncLit, if the first statement in the FuncLit is a call to (*sync.WaitGroup).Add, we flag it. That has potential for false positives, but none have been reported in all the years that the check has existed.

The check could be trivially hardened by

  1. looking for an immediately following defer of (*sync.WaitGroup).Done and comparing the two receivers.
  2. checking that the argument to Add is 1, not some other number.

Edit: which is pretty much what you have suggested in #18022 (comment)
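
As a rough sketch of the hardened check described above, using only the standard go/ast packages. This is not staticcheck's actual implementation: `findAddInGoroutine` and `methodRecv` are hypothetical names, and the match is purely syntactic, so it assumes rather than verifies that the receiver is a *sync.WaitGroup.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// findAddInGoroutine reports each `go func() { x.Add(1); defer x.Done(); ... }()`
// in src. It does no type checking; x is only assumed to be a WaitGroup.
func findAddInGoroutine(src string) ([]token.Position, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var found []token.Position
	ast.Inspect(file, func(n ast.Node) bool {
		g, ok := n.(*ast.GoStmt)
		if !ok {
			return true
		}
		fn, ok := g.Call.Fun.(*ast.FuncLit)
		if !ok || len(fn.Body.List) < 2 {
			return true
		}
		// First statement must be a call x.Add(1).
		first, ok := fn.Body.List[0].(*ast.ExprStmt)
		if !ok {
			return true
		}
		add, ok := first.X.(*ast.CallExpr)
		if !ok || len(add.Args) != 1 {
			return true
		}
		recv, ok := methodRecv(add, "Add")
		if !ok {
			return true
		}
		if one, ok := add.Args[0].(*ast.BasicLit); !ok || one.Value != "1" {
			return true
		}
		// Second statement must be defer x.Done() on the same receiver.
		def, ok := fn.Body.List[1].(*ast.DeferStmt)
		if !ok {
			return true
		}
		if r, ok := methodRecv(def.Call, "Done"); ok && r == recv {
			found = append(found, fset.Position(add.Pos()))
		}
		return true
	})
	return found, nil
}

// methodRecv returns the printed receiver of a call of the form recv.name(...).
func methodRecv(call *ast.CallExpr, name string) (string, bool) {
	sel, ok := call.Fun.(*ast.SelectorExpr)
	if !ok || sel.Sel.Name != name {
		return "", false
	}
	return types.ExprString(sel.X), true
}

const racy = `package p

import "sync"

func f() {
	var wg sync.WaitGroup
	go func() {
		wg.Add(1)
		defer wg.Done()
	}()
	wg.Wait()
}
`

func main() {
	hits, err := findAddInGoroutine(racy)
	if err != nil {
		panic(err)
	}
	for _, p := range hits {
		fmt.Printf("%s: Add called inside the new goroutine\n", p)
	}
}
```

A real analyzer would presumably use type information to confirm the receiver type; the string comparison of receivers here is only an approximation of "comparing the two receivers".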

dsnet added a commit to tailscale/tailscale that referenced this issue Mar 7, 2023
The addition of WaitGroup.Go in the standard library has been
repeatedly proposed and rejected.
See golang/go#18022, golang/go#23538, and golang/go#39863

In summary, the argument for WaitGroup.Go is that it avoids bugs like:

	go func() {
		wg.Add(1)
		defer wg.Done()
		...
	}()

where the increment happens after execution (not before)
and also (to a lesser degree) because:

	wg.Go(func() {
		...
	})

is shorter and more readable.

The argument against WaitGroup.Go is that the provided function
takes no arguments and so inputs and outputs must be closed over
by the provided function. The most common race bug for goroutines
is that the caller forgot to capture the loop iteration variable,
so this pattern may make it easier to be accidentally racy.
However, that is changing with golang/go#57969.

In my experience the probability of race bugs due to the former
still outweighs the latter, but I have no concrete evidence to prove it.

The existence of errgroup.Group.Go and frequent utility of the method
at least proves that this is a workable pattern and
the possibility of accidental races does not appear to
manifest as frequently as feared.

A reason *not* to use errgroup.Group everywhere is that there are many
situations where it doesn't make sense for the goroutine to return an error
since the error is handled in a different mechanism
(e.g., logged and ignored, formatted and printed to the frontend, etc.).
While you can use errgroup.Group by always returning nil,
the fact that you *can* return nil makes it easy to accidentally return
an error when nothing is checking the return of group.Wait.
This is not a hypothetical problem, but something that has bitten us
in usages that were only using errgroup.Group without intending to use
the error reporting part of it.

Thus, add a (yet another) variant of WaitGroup here that
is identical to sync.WaitGroup, but with an extra method.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
dsnet added a commit to tailscale/tailscale that referenced this issue Mar 9, 2023
darksip pushed a commit to darksip/tailscale that referenced this issue Apr 3, 2023
@adonovan added the Analysis label (issues related to static analysis: vet, x/tools/go/analysis) Apr 23, 2023
@adonovan
Member

adonovan commented Nov 20, 2024

I ran staticcheck's SA2000 analyzer across the module mirror corpus and found 57 matches (see attached file). A random 25 are shown here. All (!!) are true positives.

https://go-mod-viewer.appspot.com/github.com/kyma-project/kyma/components/asset-store-controller-manager@v0.0.0-20191203152857-3792b5df17c5/internal/controllers/suite_test.go#L38: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L250: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L320: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/Go-To-Byte/DouSheng/user_center@v0.0.0-20230524130918-ad531c1a3f6a/apps/user/impl/impl.go#L88: should call wait.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/go.dedis.ch/cothority/v3@v3.4.9/eventlog/api.go#L257: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L38: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/handler_test.go#L68: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L64: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L62: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/mholt/caddy-l4@v0.0.0-20241104153248-ec8fae209322/modules/l4proxyprotocol/matcher_test.go#L90: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/jtzjtz/kit@v1.0.2/conn/rabbitmq_pool/rabbitmq_pool.go#L121: should call S.wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/deso-protocol/core@v1.2.9/lib/txindex.go#L158: should call txi.updateWaitGroup.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/portworx/kvdb@v0.0.0-20241107215734-a185a966f535/test/kv.go#L537: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/abiosoft/semaphore@v0.0.0-20240818083615-bc6b5b10c137/semaphore_test.go#L21: should call g.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L266: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/Go-To-Byte/DouSheng/user_center@v0.0.0-20230524130918-ad531c1a3f6a/apps/user/impl/impl.go#L78: should call wait.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/lastbackend/toolkit@v0.0.0-20241020043710-cafa37b95aad/pkg/server/grpc/grpc.go#L258: should call g.wait.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/bluesky-social/indigo@v0.0.0-20241119181532-966c093275b7/cmd/sonar/main.go#L171: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/doc_test.go#L48: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/msales/pkg/v3@v3.24.0/clix/utils_test.go#L46: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L329: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L222: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L70: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/doc_test.go#L56: should call wg.Add(1) before starting the goroutine to avoid a race
https://go-mod-viewer.appspot.com/github.com/zhufuyi/sponge@v1.10.3/pkg/ws/server_test.go#L88: should call wg.Add(1) before starting the goroutine to avoid a race

waitgroup.txt

@timothy-king
Contributor

https://go-mod-viewer.appspot.com/github.com/kyma-project/kyma/components/asset-store-controller-manager@v0.0.0-20191203152857-3792b5df17c5/internal/controllers/suite_test.go#L38

    33  // StartTestManager adds recFn
    34  func StartTestManager(mgr manager.Manager, g *GomegaWithT) (chan struct{}, *sync.WaitGroup) {
    ...
    37  	go func() {
    38  		wg.Add(1)
    ...
    41  	}()
    42  	return stop, wg
    43  }

Do we need to establish that wg.Wait() is called to issue a diagnostic? That would be what this "races" with. We possibly could just always issue a report when the first statement in the function is wg.Add(*). That would have some potential issues with very obscure cases (like wrapping wg.Wait() in its own goroutine and blocking progress through additional channels), but is probably pretty accurate.

@adonovan
Member

Do we need to establish that wg.Wait() is called to issue a diagnostic?

No. If someone is using WaitGroup without calling Wait, they have bigger problems.

@gopherbot
Contributor

Change https://go.dev/cl/632915 mentions this issue: go/analysis/passes/waitgroup: report WaitGroup.Add in goroutine

@adonovan self-assigned this Dec 2, 2024
@adonovan moved this from Hold to Incoming in Proposals Dec 2, 2024
@adonovan
Member

adonovan commented Dec 2, 2024

CL 632915 contains a port of staticcheck's SA2000 checker, whose results on the module mirror corpus had no false positives.

I move to accept this proposal and (eventually) add the check to cmd/vet.

@adonovan
Member

adonovan commented Dec 2, 2024

Oops, closed by mistake. The new analyzer is enabled in gopls, but we need to wait for the tree to reopen and this proposal to be approved before adding it to cmd/vet.

@adonovan reopened this Dec 2, 2024
@gopherbot
Contributor

Change https://go.dev/cl/633704 mentions this issue: go/analysis/passes/waitgroup: report WaitGroup.Add in goroutine

@rsc
Contributor

rsc commented Dec 4, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc moved this from Incoming to Active in Proposals Dec 4, 2024
@rsc removed the Proposal-Hold label Dec 4, 2024
@lpxz

lpxz commented Dec 6, 2024

Since there is an active discussion here, I would like to bring to your attention that SA2000 in staticcheck misses real bugs.

If you run staticcheck against the following, it fails to find the bug. If you comment out the fmt.Println call in front of wg.Add(), it finds the bug. In general, as long as there is some code between go func() { and wg.Add(), SA2000 fails to raise an alarm.

// main.go
package main

import (
	"fmt"
	"sync"
)

func somefunc() error {
	var wg sync.WaitGroup
	go func() {
		fmt.Println("hi")
		wg.Add(1)
		defer wg.Done()
	}()
	return nil
}

func main() {
	fmt.Println("starting")
	somefunc()
}

$ go install honnef.co/go/tools/cmd/staticcheck@latest
$ staticcheck main.go

The SA2000 code is self-explanatory as to why this happens.

I wrote a linter to address this. I will move to the staticcheck GitHub repo to first discuss whether such an improvement is welcome or negligible.

@adonovan
Member

adonovan commented Dec 6, 2024

SA2000 in staticcheck misses real bugs.

Yes, SA2000 and the port of it in https://go.dev/cl/633704 are intentionally very limited in the patterns they match, to avoid false positives. I suspect that relaxing the "wg.Add must be the first statement" rule would cause the analyzer to report a spurious diagnostic for this program:

func f() error {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()

		...

		// Create another goroutine.
		wg.Add(1) // waitgroup: "WaitGroup.Add called from inside new goroutine"
		go func() {
			defer wg.Done()
			...
		}()
	}()
	return nil
}

I am sure there is room for improvement, but in general our inclination is to optimize for fewer false positives, even at the expense of true positives. If you have concrete suggestions (or code) for better algorithms, we can easily test them out on the corpus of the Module Mirror.

@lpxz

lpxz commented Dec 6, 2024

@adonovan yes, you spotted a valid false positive.

Here is my linter, which uses the golang analysis and is not based on staticcheck. It also has fixing logic, btw.

I scanned our code at Uber, and it led to two false positives, which look exactly like what you described. All other reports are true positives.

It found one more true positive that SA2000 missed, which is arguably marginal :)

@adonovan
Member

Thanks @lpxz. I ran your linter on a sample of 16,193 modules from the Go Module Mirror corpus, and got 73 diagnostics (see attached file).

waitgroup.txt

A random sample of 10 is shown below. Of those, three are true positives that would not otherwise have been caught, and zero are false positives (though I did notice false positives in a larger sample). That's good. Would you care to inspect a larger sample to classify them as (false, true existing, true new)? Thanks.

https://go-mod-viewer.appspot.com/github.com/jtzjtz/kit@v1.0.2/conn/rabbitmq_pool/rabbitmq_pool.go#L121: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/getlantern/eventual@v1.0.0/eventual_test.go#L102: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/itering/scale.go@v1.9.13/types/customType_test.go#L18: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/starskim/DDBOT@v1.1.4/lsp/bilibili/concern.go#L110: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/cgrates/cgrates@v0.10.4/cmd/cgr-tester/parallel/parallel.go#L37: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L336: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/omniscale/go-osm@v0.3.1/parser/pbf/parser_test.go#L266: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/xlab/c-for-go@v1.3.0/process.go#L123: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/operator-framework/operator-lifecycle-manager@v0.30.0/test/e2e/subscription_e2e_test.go#L580: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/github.com/zhufuyi/sponge@v1.11.0/pkg/ws/server_test.go#L88: Add and Done should not both exist inside the same goroutine block

@lpxz

lpxz commented Dec 10, 2024

@adonovan
Thanks for taking the time to try it out!

Yep, I can take a look.
Could you clarify the difference between "true existing" and "true new"?

@Groxx

Groxx commented Dec 11, 2024

From skimming a couple as well (maybe I'll get interested enough to be exhaustive):

https://go-mod-viewer.appspot.com/golang.org/x/crypto@v0.30.0/ssh/session_test.go#L678: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/golang.org/x/crypto@v0.30.0/ssh/session_test.go#L698: Add and Done should not both exist inside the same goroutine block
https://go-mod-viewer.appspot.com/golang.org/x/crypto@v0.30.0/ssh/session_test.go#L871: Add and Done should not both exist inside the same goroutine block

These are false positives, the same exception @adonovan pointed out. They could also be written like this though, and I believe it'd incorrectly trigger SA2000 as well:

func f() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		wg.Add(1)
		defer wg.Done() // done setting up work
		...
		go func() {
			defer wg.Done() // done doing work
			...
		}()
	}()
}

but I suspect that's less common code in practice. "defer Done() first" is a very strong pattern as far as I can tell.

I've been trying to think of a check that'd be better at allowing these, because I like "same block" a fair bit, but the closest I can come up with is something like "Add(1) plus includes a deeper nested Done() == ignore". Basically "valid inside invalid is fine". It would definitely miss some, but it might be a good enough sign of "perhaps this is complex enough code that it can be ignored".


This one is a bit interesting:

https://go-mod-viewer.appspot.com/github.com/getlantern/eventual@v1.0.0/eventual_test.go#L102: Add and Done should not both exist inside the same goroutine block

It's essentially inverted, and it's... arguably fine:

var wg sync.WaitGroup
wg.Add(1)
// Do some concurrent setting to make sure that it works
for i := 0; i < concurrency; i++ {
	go func() {
		// Wait for waitGroup so that all goroutines run at basically the same
		// time.
		wg.Wait()
		v.Set("hi")
		atomic.AddInt32(&sets, 1)
	}()
}
wg.Done()

I've used similar constructs to try to maximize racing, but I'm fairly sure this one is ineffective since in practice the Done() may run before any of the Wait()s. It's almost guaranteed with GOMAXPROCS=1 iirc. My pattern has been:

var ready, started, complete sync.WaitGroup
ready.Add(1)
started.Add(concurrency)
complete.Add(concurrency)
for i := 0; i < concurrency; i++ {
	go func() {
		started.Done()        // inform that this goroutine has been started
		ready.Wait()          // wait for all to be started
		atomic.AddInt32(&sets, 1)
		complete.Done()       // this goroutine has finished its work
	}()
}
started.Wait()  // wait for all goroutines to have been started, and likely at or near Wait
ready.Done()    // unblock all ~simultaneously
complete.Wait() // wait for them all to finish

which has so far been the most effective way to trigger real races that I've found, by quite a large margin, across various GOMAXPROCS. And doesn't take long to figure out when you see it.

I suspect this would also trigger the linter, on the ready waitgroup (which could easily be a close(chan) instead, and I generally do that in practice)
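
A self-contained sketch of that variant, with the ready gate written as a closed channel instead of a WaitGroup. The function name `raceStart` is hypothetical, and the atomic increment stands in for whatever operation is being stress-tested.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// raceStart releases `concurrency` goroutines at roughly the same moment
// and returns how many of them completed their work.
func raceStart(concurrency int) int32 {
	var started, complete sync.WaitGroup
	ready := make(chan struct{}) // closed to unblock all goroutines at once
	var sets int32

	started.Add(concurrency)
	complete.Add(concurrency)
	for i := 0; i < concurrency; i++ {
		go func() {
			defer complete.Done() // this goroutine has finished its work
			started.Done()        // report that this goroutine is running
			<-ready               // block until the gate opens
			atomic.AddInt32(&sets, 1)
		}()
	}

	started.Wait()  // all goroutines are running, at or near the gate
	close(ready)    // unblock them all
	complete.Wait() // wait for them all to finish
	return atomic.LoadInt32(&sets)
}

func main() {
	fmt.Println(raceStart(16)) // prints 16
}
```

With a channel as the gate, no WaitGroup has both Add and Done called from inside the goroutine, so a "same block" heuristic would presumably have nothing to flag.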

@aclements
Member

The proposal committee is on board with adding this check to vet. We'll let the domain experts hash out exactly what the right heuristic is. :)

@lpxz

lpxz commented Dec 12, 2024

@adonovan @Groxx
Given that we only saw one false positive pattern for my linter, I can improve it to avoid that false positive.

Then we can rerun the analysis.
