fix: avoid panic from releasing unacquired semaphore #363

abursavich · 2022-11-20T00:48:59Z

When using this pattern:

eg, egCtx := errgroup.WithContext(ctx)
for /* ... */ {
	limiter.Acquire(ctx, 1)
	eg.Go(func() error {
		defer limiter.Release(1)
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

If ctx is cancelled then Acquire may return an error, which will then cause Release to panic after it's called more times than the semaphore was successfully acquired.

This change uses the SetLimit functionality which has been added to errgroup.Group and provides a small wrapper around it in the internal/syncutil package to improve the ergonomics of its usage.

codecov-commenter · 2022-11-22T14:23:21Z

Codecov Report

Merging #363 (b19b776) into main (621970f) will decrease coverage by 0.05%.
The diff coverage is 18.75%.

@@            Coverage Diff             @@
##             main     #363      +/-   ##
==========================================
- Coverage   72.16%   72.10%   -0.06%     
==========================================
  Files          42       43       +1     
  Lines        4092     4094       +2     
==========================================
- Hits         2953     2952       -1     
- Misses        857      862       +5     
+ Partials      282      280       -2

Impacted Files	Coverage Δ
internal/syncutil/limitgroup.go	`0.00% <0.00%> (ø)`
content.go	`74.55% <100.00%> (-1.20%)`	⬇️
internal/syncutil/merge.go	`87.50% <0.00%> (+17.85%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

shizhMSFT · 2022-11-22T14:39:27Z

@abursavich Thank you for your contribution.

I thought about using SetLimit instead of semaphore. However, when I look into the implementation of SetLimit(), I find

func (g *Group) SetLimit(n int) {
	if n < 0 {
		g.sem = nil
		return
	}
	if len(g.sem) != 0 {
		panic(fmt.Errorf("errgroup: modify limit while %v goroutines in the group are still active", len(g.sem)))
	}
	g.sem = make(chan token, n)
}

The statement g.sem = make(chan token, n) makes sync.Group with SetLimit(n int) having a space complexity of O(n) while the semaphore.Weighted has a space complexity of O(1). Therefore, I think the original implementation is better.

However, your concern is still valid. The pattern

eg, egCtx := errgroup.WithContext(ctx)
for /* ... */ {
	limiter.Acquire(ctx, 1)
	eg.Go(func() error {
		defer limiter.Release(1)
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

is flawed since the error of limiter.Acquire(ctx, 1) is never checked. I think the fix will be as simple as checking the error as follows.

eg, egCtx := errgroup.WithContext(ctx)
for /* ... */ {
	if err := limiter.Acquire(ctx, 1); err != nil {
		return err
	}
	eg.Go(func() error {
		defer limiter.Release(1)
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

shizhMSFT · 2022-11-22T14:40:58Z

BTW, thank you for informing golang.org/x/sync has a tagged version v0.1.0.

abursavich · 2022-11-22T15:48:08Z

The statement g.sem = make(chan token, n) makes sync.Group with SetLimit(n int) having a space complexity of O(n) while the semaphore.Weighted has a space complexity of O(1). Therefore, I think the original implementation is better.

The type used in the channel is actually 0 bytes: https://go.dev/play/p/ctojkieS8zW

type token struct{}

// ...

func (g *Group) SetLimit(n int) {
	// ...
	g.sem = make(chan token, n)
}

So its space complexity ends up being O(1) too :)

I think the fix will be as simple as checking the error as follows.

limiter := semaphore.NewWeighted(opts.Concurrency)
eg, egCtx := errgroup.WithContext(ctx)
for /* ... */ {
	if err := limiter.Acquire(ctx, 1); err != nil {
		return err
	}
	eg.Go(func() error {
		defer limiter.Release(1)
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

This was also my initial fix, but I ended up changing it because:

If you use ctx to acquire the limiter, it will keep going after an error has occurred and egCtx has been cancelled.
If you use egCtx to acquire the limiter to fix that problem, it will return a "context cancelled" error instead of the underlying error that triggered the cancellation.

Then I did this, which fixed all of those problems:

limiter := semaphore.NewWeighted(opts.Concurrency)
eg, egCtx := errgroup.WithContext(ctx)
for /* ... */ {
	if err := limiter.Acquire(ctx, 1); err != nil {
		return err
	}
	eg.Go(func() error {
		defer limiter.Release(1)
		select {
		case <-egCtx.Done():
			return egCtx.Err()
		default:
		}
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

But I felt like that was too complicated to be repeating everywhere, so I switched to SetLimit since its built into errgroup and got rid of the limiter, which helped some.

eg, egCtx := errgroup.WithContext(ctx)
eg.SetLimit(int(opts.Concurrency))
for /* ... */ {
	eg.Go(func() error {
		select {
		case <-egCtx.Done():
			return egCtx.Err()
		default:
		}
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

But that still felt a little too complicated to be repeating everywhere, so I encapsulated all the grunge into LimitGroup, leaving behind:

eg, egCtx := syncutil.NewLimitGroup(ctx, int(opts.Concurrency))
for /* ... */ {
	eg.Go(func() error {
		// ...
		return nil
	})
}
if err := eg.Wait(); err != nil {
	return err
}

shizhMSFT · 2022-11-23T07:52:36Z

@abursavich You are right. The struct{} is indeed special and has zero cost in the go channel implementation.

c.buf = mallocgc(mem, elem, true)

shizhMSFT · 2022-11-23T11:07:57Z

content.go

@@ -119,13 +118,10 @@ func TagN(ctx context.Context, target Target, srcReference string, dstReferences
 			return err
 		}

-		limiter := semaphore.NewWeighted(opts.Concurrency)
-		eg, egCtx := errgroup.WithContext(ctx)
+		eg, egCtx := syncutil.NewLimitGroup(ctx, int(opts.Concurrency))


There is an issue on casting int64 to int. What if we lose precisions?

@Wwwsylvia Should we change the type of all occurrences of Concurrency to int from int64?

Yeah int should be enough.

Sent a PR for this. See #376.

shizhMSFT · 2022-11-23T11:09:56Z

internal/syncutil/limitgroup.go

+
+// A LimitGroup is a collection of goroutines working on subtasks that are part of
+// the same overall task.
+type LimitGroup struct {


Instead of LimitGroup / NewLimitGroup() pair, it is better to align the name to LimitedGroup / LimitGroup(), just like io.LimitedReader / io.LimitReader() and LimitedRegion / LimitRegion.

I disagree with the change to the constructor's name, but I made the change as requested.

LimitReader is taking in a Reader and limiting it, so its name is like a command "limit this reader". I think NewLimitedGroup would be better than LimitGroup, since it's creating the LimitedGroup and not just wrapping an existing Group with limits.

shizhMSFT · 2022-11-23T11:18:07Z

@abursavich @Wwwsylvia Do you also want to revisit internal/syncutil/limit.go for context cancellation?

abursavich · 2022-11-29T01:13:21Z

@abursavich @Wwwsylvia Do you also want to revisit internal/syncutil/limit.go for context cancellation?

I think that this type of concurrent code is tricky and you should probably use whatever you're most comfortable with. I'm more comfortable with for { select {} } loops that manage worker goroutines (kinda like this talk).

LimitedRegion has the ended field and guards all over the place to protect the Semaphore from being released when it isn't held. That's nice to avoid panics, but it makes me concerned that the code using it may be invalid.

Could probably tweak the Go function a little to use the egCtx and short-circuit on error:

func Go[T any](ctx context.Context, limiter *semaphore.Weighted, fn GoFunc[T], items ...T) error {
	eg, egCtx := errgroup.WithContext(ctx)
	for _, item := range items {
		region := LimitRegion(egCtx, limiter)
		if err := region.Start(); err != nil {
			eg.Go(func() error { return err })
			break
		}
		eg.Go(func(t T) func() error {
			return func() error {
				defer region.End()
				return fn(egCtx, region, t)
			}
		}(item))
	}
	return eg.Wait()
}

TerryHowe · 2022-12-06T00:41:39Z

rebsae?

Signed-off-by: Andy Bursavich <abursavich@gmail.com>

abursavich · 2022-12-06T00:59:26Z

I've rebased this a couple times to keep it up to date with main. It's unclear to me if there are any more requested changes to this PR, or if the more recent discussion is about other related changes that should be made separately?

Wwwsylvia

LGTM

Wwwsylvia · 2022-12-06T03:09:09Z

I've rebased this a couple times to keep it up to date with main. It's unclear to me if there are any more requested changes to this PR, or if the more recent discussion is about other related changes that should be made separately?

LGTM. Let's wait for review from @shizhMSFT.

I disagree with the change to the constructor's name, but I made the change as requested.
LimitReader is taking in a Reader and limiting it, so its name is like a command "limit this reader". I think NewLimitedGroup would be better than LimitGroup, since it's creating the LimitedGroup and not just wrapping an existing Group with limits.

+1 for NewLimitedGroup.

shizhMSFT

LGTM

abursavich force-pushed the semerr branch 3 times, most recently from a7188d3 to b19b776 Compare November 21, 2022 19:20

shizhMSFT requested a review from Wwwsylvia November 22, 2022 14:19

shizhMSFT reviewed Nov 23, 2022

View reviewed changes

abursavich force-pushed the semerr branch 2 times, most recently from 78c5148 to 43c0ad6 Compare November 28, 2022 23:12

abursavich force-pushed the semerr branch from 43c0ad6 to b715ec1 Compare November 30, 2022 19:49

fix: avoid panic from releasing unacquired semaphore

8552f40

Signed-off-by: Andy Bursavich <abursavich@gmail.com>

abursavich force-pushed the semerr branch from b715ec1 to 8552f40 Compare December 6, 2022 00:56

Wwwsylvia approved these changes Dec 6, 2022

View reviewed changes

shizhMSFT approved these changes Dec 6, 2022

View reviewed changes

shizhMSFT merged commit a5be49a into oras-project:main Dec 6, 2022

Wwwsylvia mentioned this pull request Dec 7, 2022

Update the type of Concurrency options to int from int64 #375

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: avoid panic from releasing unacquired semaphore #363

fix: avoid panic from releasing unacquired semaphore #363

abursavich commented Nov 20, 2022

codecov-commenter commented Nov 22, 2022

shizhMSFT commented Nov 22, 2022

shizhMSFT commented Nov 22, 2022

abursavich commented Nov 22, 2022 •

edited

Loading

shizhMSFT commented Nov 23, 2022

shizhMSFT Nov 23, 2022

Wwwsylvia Nov 24, 2022

Wwwsylvia Dec 7, 2022 •

edited

Loading

shizhMSFT Nov 23, 2022

abursavich Nov 28, 2022

shizhMSFT commented Nov 23, 2022

abursavich commented Nov 29, 2022

TerryHowe commented Dec 6, 2022

abursavich commented Dec 6, 2022

Wwwsylvia left a comment

Wwwsylvia commented Dec 6, 2022 •

edited

Loading

shizhMSFT left a comment

fix: avoid panic from releasing unacquired semaphore #363

fix: avoid panic from releasing unacquired semaphore #363

Conversation

abursavich commented Nov 20, 2022

codecov-commenter commented Nov 22, 2022

Codecov Report

shizhMSFT commented Nov 22, 2022

shizhMSFT commented Nov 22, 2022

abursavich commented Nov 22, 2022 • edited Loading

shizhMSFT commented Nov 23, 2022

shizhMSFT Nov 23, 2022

Choose a reason for hiding this comment

Wwwsylvia Nov 24, 2022

Choose a reason for hiding this comment

Wwwsylvia Dec 7, 2022 • edited Loading

Choose a reason for hiding this comment

shizhMSFT Nov 23, 2022

Choose a reason for hiding this comment

abursavich Nov 28, 2022

Choose a reason for hiding this comment

shizhMSFT commented Nov 23, 2022

abursavich commented Nov 29, 2022

TerryHowe commented Dec 6, 2022

abursavich commented Dec 6, 2022

Wwwsylvia left a comment

Choose a reason for hiding this comment

Wwwsylvia commented Dec 6, 2022 • edited Loading

shizhMSFT left a comment

Choose a reason for hiding this comment

abursavich commented Nov 22, 2022 •

edited

Loading

Wwwsylvia Dec 7, 2022 •

edited

Loading

Wwwsylvia commented Dec 6, 2022 •

edited

Loading