Skip to content
This repository has been archived by the owner on Dec 22, 2023. It is now read-only.

Fix fast restart of skygear-server result on worker stale #161

Merged
merged 3 commits into from
Oct 14, 2016

Conversation

rickmak
Copy link
Member

@rickmak rickmak commented Oct 13, 2016

zmq broker don't accept non register plugin heartbeat. It will force the plugin
resend a proper Ready and go through the init process again.

refs #150

Those function belong to broker logic and modify the Broker state that to be
used in `Run`
This fix the bug on fast skygear-server restart, with the plugin will not
know there is restart and fails to establish the init handshake.
@cheungpat cheungpat changed the title Fix fast restart of skyegar-server result on worker stale Fix fast restart of skygear-server result on worker stale Oct 13, 2016
@@ -217,10 +225,11 @@ func (q *workerQueue) Ready(worker pworker) {
for i, w = range workers {
if bytes.Equal(w.address, worker.address) {
*q = append(append(workers[:i], workers[i+1:]...), worker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an array that doesn’t work like any other arrays in golang
since we are changing the variable address here... are we sure all references to the array are updated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh you changed it in a later commit! good job!

switch status {
case Ready:
workers.Ready(newWorker(address))
lb.freshWorkers <- address
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a good move to here

case Heartbeat:
// do nothing
// no-ops
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/no-ops/no-op/

@cheungpat
Copy link
Contributor

zmq broker don't accept non register plugin heartbeat. It will force the plugin
resend a proper Ready and go through the init process again.

I am not sure if this is the old/undesired or new/desired behaviour

@rickmak
Copy link
Member Author

rickmak commented Oct 14, 2016

I am not sure if this is the old/undesired or new/desired behaviour

New, desired behaviour.

Copy link
Contributor

@limouren limouren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't see any compelling reasons to move those goczmq socks and pollers to Broker's fields, it is a matter of personal preference in this case...

@@ -194,54 +197,80 @@ func newWorker(address []byte) pworker {
}
}

type workerQueue []pworker
type pworkers []pworker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably get rid of this type alias as it doesn't provide extra value anymore after []pworker is moved as a field of workerQueue

}

func (q *workerQueue) Tick(worker pworker) error {
if _, ok := q.addresses[string(worker.address)]; !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are trying to divide Ready into Add and Tick s.t. a worker which haven't sent any Ready signal couldn't be added via Tick. I think we could further simplify the responsibility of Add and Tick s.t.

  1. Add only appends a worker to a queue (which is automatically the latest position);
  2. Tick only prompts an existing worker to latest position of the queue, and it is an error to Tick an non-existing worker

In this way we can get rid of workerQueue.addresses which is a duplicated state of workerQueue.pworkers.

Copy link
Member Author

@rickmak rickmak Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workerQueue.pworkers Only holy the idle worker. That is the queue for consuming order. It don't store all previously registered worker. So the state is not duplicated.

Hence, if we make Add only register worker, but not push to the queue. It will take next round heartbeat to push itself to worker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I forgot that a worker is removed from the queue before it is handed over a job. It LGTM then.

No I meant Add adds a worker to the queue and Tick will never add a new worker. Anyway it's not relevant now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the pull request is merged (good job rickmak and limouren!), I’d just like to say that some comments to describe Tick and Add would be a great idea. Maybe next time.

@limouren limouren merged commit f95e596 into SkygearIO:master Oct 14, 2016
@rickmak rickmak deleted the zmq-worker-state branch June 5, 2017 10:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants