-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race condition allowing multiple Tick() calls to a single Shard #409
Conversation
storage/shard.go
Outdated
} | ||
|
||
func (s *dbShard) tickAndExpire( | ||
c context.Cancellable, | ||
softDeadline time.Duration, | ||
policy tickPolicy, | ||
) tickResult { | ||
skipIfClosing bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you combine tickPolicy
and skipIfClosing
? i.e. just change tickPolicyForceExpiry
to tickPolicyCloseShard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/shard.go
Outdated
if s.ticking.Swap(true) { | ||
// i.e. we were previously ticking | ||
return tickResult{}, errShardAlreadyTicking | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tickPolicy is tickPolicyRegular
and s.isClosing()
can we also return err? i.e. don't let regular ticks start if closing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/shard.go
Outdated
// NB(r): Asynchronously we purge expired series to ensure pressure on the | ||
// GC is not placed all at one time. If the deadline is too low and still | ||
// causes the GC to impact performance when closing shards the deadline | ||
// should be increased. | ||
cancellable := context.NewNoOpCanncellable() | ||
softDeadline := s.opts.TickInterval() | ||
s.tickAndExpire(cancellable, softDeadline, tickPolicyForceExpiry) | ||
s.tickAndExpire(cancellable, softDeadline, tickPolicyForceExpiry, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to check if this returns an error and return it yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
storage/namespace.go
Outdated
@@ -435,6 +438,8 @@ func (n *dbNamespace) Tick(c context.Cancellable) { | |||
n.metrics.tick.madeUnwiredBlocks.Inc(int64(r.madeUnwiredBlocks)) | |||
n.metrics.tick.mergedOutOfOrderBlocks.Inc(int64(r.mergedOutOfOrderBlocks)) | |||
n.metrics.tick.errors.Inc(int64(r.errors)) | |||
|
|||
return multiErr.FinalError() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to emit stats or set the last tick stats unless err == nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed up
glide.yaml
Outdated
@@ -89,6 +89,9 @@ import: | |||
- package: github.com/spf13/pflag | |||
version: 4f9190456aed1c2113ca51ea9b89219747458dc1 | |||
|
|||
- package: go.uber.org/atomic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to pull this in anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doh I remember thinking ill forget to revert that :|
storage/namespace.go
Outdated
return multiErr.FinalError() | ||
} | ||
|
||
if err := multiErr.FinalError(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe combine these two together? e.g. if err := multiErr.FinalError(); err != nil || c.IsCancelled() { return err }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -476,6 +487,145 @@ func TestShardWriteAsync(t *testing.T) { | |||
require.True(t, ok) | |||
} | |||
|
|||
// This tests a race in shard ticking with an empty series pending expiration. | |||
func TestShardTickRace(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good set of tests 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than nit about combining if statements
/cc @robskillington @richardartoul