Fix race condition allowing multiple Tick() calls to a single Shard #409

Merged: prateek merged 10 commits into master from prateek/hotfix/shard-tick-race on Dec 13, 2017

Collaborator

@prateek prateek commented Dec 12, 2017

@m3db m3db deleted a comment from coveralls Dec 13, 2017
@prateek prateek changed the title from "[WIP] Fix race condition allowing multiple Tick() calls to a single Shard" to "Fix race condition allowing multiple Tick() calls to a single Shard" Dec 13, 2017
storage/shard.go Outdated
 }

 func (s *dbShard) tickAndExpire(
 	c context.Cancellable,
 	softDeadline time.Duration,
 	policy tickPolicy,
-) tickResult {
+	skipIfClosing bool,
Collaborator

Can you combine tickPolicy and skipIfClosing? i.e. just change tickPolicyForceExpiry to tickPolicyCloseShard

Collaborator Author

sure thing

Collaborator Author

done
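
One way to read the suggestion above is that the policy value alone should say both whether expiry is forced and whether the tick may start on a closing shard. The following is a minimal sketch under that assumption; the two helper methods (forcesExpiry, skipsWhenClosing) are hypothetical names for illustration and are not taken from the PR.

```go
package main

// tickPolicy is a stand-in for the policy type discussed in the review.
type tickPolicy int

const (
	// tickPolicyRegular is an ordinary periodic tick.
	tickPolicyRegular tickPolicy = iota
	// tickPolicyCloseShard is the tick issued while closing a shard. In this
	// sketch it replaces tickPolicyForceExpiry plus the skipIfClosing flag.
	tickPolicyCloseShard
)

// forcesExpiry (hypothetical helper) reports whether a tick with this policy
// should expire every series regardless of its state.
func (p tickPolicy) forcesExpiry() bool { return p == tickPolicyCloseShard }

// skipsWhenClosing (hypothetical helper) reports whether a tick with this
// policy should refuse to run once the shard has started closing.
func (p tickPolicy) skipsWhenClosing() bool { return p == tickPolicyRegular }
```

With this shape the extra boolean parameter on tickAndExpire is no longer needed, since callers cannot pass a contradictory policy and flag combination.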

storage/shard.go Outdated
if s.ticking.Swap(true) {
// i.e. we were previously ticking
return tickResult{}, errShardAlreadyTicking
}
Collaborator

@robskillington robskillington Dec 13, 2017

If the tickPolicy is tickPolicyRegular and s.isClosing() can we also return err? i.e. don't let regular ticks start if closing.

Collaborator Author

sure thing

Collaborator Author

done
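
The guard discussed in this thread can be sketched roughly as follows. This is an illustration only: it uses the standard library's sync/atomic with an int32 flag instead of the Swap(true) call shown in the diff, and the shard type, the errShardClosing error, and the beginTick, endTick and markClosing helpers are hypothetical names, not the PR's code.

```go
package main

import (
	"errors"
	"sync/atomic"
)

var (
	errShardAlreadyTicking = errors.New("shard is already ticking")
	// errShardClosing is a hypothetical error name for this sketch.
	errShardClosing = errors.New("shard is closing, regular tick refused")
)

type tickPolicy int

const (
	tickPolicyRegular tickPolicy = iota
	tickPolicyCloseShard
)

// shard holds only the two flags the guard needs.
type shard struct {
	ticking int32 // 1 while a tick is in flight
	closing int32 // 1 once the shard has started closing
}

// markClosing flags the shard as closing.
func (s *shard) markClosing() { atomic.StoreInt32(&s.closing, 1) }

// beginTick claims the ticking flag. Only one caller can flip it from 0 to 1,
// so a second concurrent tick is rejected instead of running in parallel.
func (s *shard) beginTick(policy tickPolicy) error {
	if !atomic.CompareAndSwapInt32(&s.ticking, 0, 1) {
		return errShardAlreadyTicking
	}
	// Regular ticks must not start on a closing shard; release the flag first
	// so the close-time tick can still claim it.
	if policy == tickPolicyRegular && atomic.LoadInt32(&s.closing) == 1 {
		atomic.StoreInt32(&s.ticking, 0)
		return errShardClosing
	}
	return nil
}

// endTick releases the ticking flag once the tick has finished.
func (s *shard) endTick() { atomic.StoreInt32(&s.ticking, 0) }
```

A caller would pair the two calls, for example by calling beginTick, returning on error, and deferring endTick before doing the tick's work.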

storage/shard.go Outdated
 	// NB(r): Asynchronously we purge expired series to ensure pressure on the
 	// GC is not placed all at one time. If the deadline is too low and still
 	// causes the GC to impact performance when closing shards the deadline
 	// should be increased.
 	cancellable := context.NewNoOpCanncellable()
 	softDeadline := s.opts.TickInterval()
-	s.tickAndExpire(cancellable, softDeadline, tickPolicyForceExpiry)
+	s.tickAndExpire(cancellable, softDeadline, tickPolicyForceExpiry, false)
Collaborator

Probably want to check if this returns an error and return it yeah?

Collaborator Author

nice catch, will do

Collaborator Author

done
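
The point raised here is that the close path should not drop the error from its final tick. A minimal sketch of that shape, assuming tickAndExpire returns a result and an error as in the guard snippet earlier in the review; the simplified shard type and parameterless signatures below are assumptions for illustration.

```go
package main

import "errors"

var errShardAlreadyTicking = errors.New("shard is already ticking")

type tickResult struct{ expiredSeries int }

// shard is a simplified stand-in. The plain bool is not goroutine-safe and is
// only here to let the sketch produce an error.
type shard struct {
	ticking bool
}

// tickAndExpire fails when another tick is already in flight.
func (s *shard) tickAndExpire() (tickResult, error) {
	if s.ticking {
		return tickResult{}, errShardAlreadyTicking
	}
	return tickResult{}, nil
}

// Close runs the final tick and surfaces its error to the caller instead of
// discarding it.
func (s *shard) Close() error {
	_, err := s.tickAndExpire()
	return err
}
```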

coveralls commented Dec 13, 2017

Coverage increased (+0.07%) to 77.784% when pulling 194c5cc on prateek/hotfix/shard-tick-race into 0e5c270 on master.

@@ -435,6 +438,8 @@ func (n *dbNamespace) Tick(c context.Cancellable) {
n.metrics.tick.madeUnwiredBlocks.Inc(int64(r.madeUnwiredBlocks))
n.metrics.tick.mergedOutOfOrderBlocks.Inc(int64(r.mergedOutOfOrderBlocks))
n.metrics.tick.errors.Inc(int64(r.errors))

return multiErr.FinalError()
Collaborator

I don't think we want to emit stats or set the last tick stats unless err == nil.

Collaborator Author

fixed up
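
A rough sketch of the behaviour requested above: counters and the last-tick snapshot are updated only when the tick finished without error. The namespace type, its plain integer counters (standing in for the metrics in the diff), the recordTick helper and errTickFailed are all hypothetical.

```go
package main

import "errors"

// errTickFailed is a hypothetical error used only to exercise the sketch.
var errTickFailed = errors.New("tick failed")

type tickResult struct {
	expiredSeries int
	errors        int
}

// namespace uses plain counters in place of a real metrics client.
type namespace struct {
	expiredSeriesCounter int64
	errorsCounter        int64
	lastTick             tickResult
}

// recordTick returns early on error so that a failed or partial tick neither
// emits stats nor overwrites the last successful tick result.
func (n *namespace) recordTick(r tickResult, err error) error {
	if err != nil {
		return err
	}
	n.expiredSeriesCounter += int64(r.expiredSeries)
	n.errorsCounter += int64(r.errors)
	n.lastTick = r
	return nil
}
```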

glide.yaml Outdated
@@ -89,6 +89,9 @@ import:
- package: github.com/spf13/pflag
version: 4f9190456aed1c2113ca51ea9b89219747458dc1

- package: go.uber.org/atomic
Collaborator

Do we need to pull this in anymore?

Collaborator Author

doh, I remember thinking I'll forget to revert that :|

return multiErr.FinalError()
}

if err := multiErr.FinalError(); err != nil {
Collaborator

Maybe combine these two together? e.g. if err := multiErr.FinalError(); err != nil || c.IsCancelled() { return err }

Collaborator Author

sure thing

Collaborator Author

done
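
One consequence of the combined condition is worth spelling out: when the tick was cancelled but no error accumulated, the early return yields a nil error. The sketch below isolates that logic. The cancellable interface, the fakeCancellable type, the stopEarly helper and errTickFailed are hypothetical stand-ins, not the project's types.

```go
package main

import "errors"

// errTickFailed is a hypothetical error used only to exercise the sketch.
var errTickFailed = errors.New("tick failed")

// cancellable is a minimal stand-in exposing only the method used here.
type cancellable interface {
	IsCancelled() bool
}

// fakeCancellable reports a fixed cancellation state.
type fakeCancellable bool

func (f fakeCancellable) IsCancelled() bool { return bool(f) }

// stopEarly mirrors the suggested combined check. It reports whether the
// caller should return now, and which error to return in that case.
func stopEarly(err error, c cancellable) (bool, error) {
	if err != nil || c.IsCancelled() {
		return true, err // err is nil when only cancellation triggered the stop
	}
	return false, nil
}
```

If a cancelled tick needs to be distinguishable from a successful one, the caller would have to check cancellation separately, since the returned error alone does not carry that information.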

@@ -476,6 +487,145 @@ func TestShardWriteAsync(t *testing.T) {
require.True(t, ok)
}

// This tests a race in shard ticking with an empty series pending expiration.
func TestShardTickRace(t *testing.T) {
Collaborator

Good set of tests 👍
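
The body of TestShardTickRace is not shown in this excerpt. As an illustration of how a single-ticker guarantee can be exercised, the sketch below parks the winning tick on a channel so that every other goroutine must observe the flag as taken. The shard type and the tryTick and countWinners helpers are hypothetical and use sync/atomic from the standard library.

```go
package main

import "sync/atomic"

// shard carries only the flag under test.
type shard struct {
	ticking int32 // 1 while a tick is in flight
}

// tryTick runs work only if no other tick is in flight and reports whether it
// ran.
func (s *shard) tryTick(work func()) bool {
	if !atomic.CompareAndSwapInt32(&s.ticking, 0, 1) {
		return false
	}
	defer atomic.StoreInt32(&s.ticking, 0)
	work()
	return true
}

// countWinners starts n goroutines that all attempt to tick at once and
// returns how many of them were allowed in.
func countWinners(n int) int {
	s := &shard{}
	release := make(chan struct{})
	results := make(chan bool, n)
	for i := 0; i < n; i++ {
		go func() {
			// The winner blocks here until release is closed, keeping the
			// flag held while the other goroutines make their attempt.
			results <- s.tryTick(func() { <-release })
		}()
	}
	winners := 0
	// The winner is parked, so the first n-1 results come from losers.
	for i := 0; i < n-1; i++ {
		if <-results {
			winners++
		}
	}
	close(release)
	if <-results {
		winners++
	}
	return winners
}
```

In a real test the waits on results would normally carry a timeout, because a broken guard that admits two tickers would leave this helper blocked rather than failing quickly.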

Collaborator

@robskillington robskillington left a comment

LGTM other than nit about combining if statements

coveralls commented Dec 13, 2017

Coverage increased (+0.09%) to 77.807% when pulling 1195a9b on prateek/hotfix/shard-tick-race into 0e5c270 on master.

coveralls commented Dec 13, 2017

Coverage increased (+0.02%) to 77.733% when pulling 7826ab3 on prateek/hotfix/shard-tick-race into 0e5c270 on master.

@prateek prateek merged commit d21dce5 into master Dec 13, 2017
@prateek prateek deleted the prateek/hotfix/shard-tick-race branch December 13, 2017 23:11
3 participants