Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

job: fix race between channel close and job exit #1247

Merged
merged 1 commit into from
Apr 2, 2017
Merged

job: fix race between channel close and job exit #1247

merged 1 commit into from
Apr 2, 2017

Conversation

pborzenkov
Copy link
Contributor

When channel's close_cb is called, the job this channel is assigned to
may still be alive and the result status of the job will be set to
FAILED (as no actual return code is available at that moment). This is
easily reproduced for me on a single CPU Linux VPS.

To fix this, check for job's status in exit_cb instead.

Signed-off-by: Pavel Borzenkov pavel.borzenkov@gmail.com

@fatih
Copy link
Owner

fatih commented Mar 28, 2017

I remember that exit_cb was not working well. I'm not sure if it's fixed recently. Thanks @pborzenkov , need to look into it and be sure it works fine. Also we have other places that still uses close_cb which is not fixed with this PR (such as guru.vim)

@pborzenkov
Copy link
Contributor Author

@fatih I believe there might be a potential problem with exit_cb, because there still can be data in channel's buffer at the time the callback is called:

                        Note that data can be buffered, callbacks may still be
                        called after the process ends.

I think this can be fixed by the following patch:

   function cbs.exit_cb(job, exitval) dict
+    let chan = job_getchannel(a:job)
+    while ch_canread(chan)
+      call add(self.messages, ch_read(chan))
+    endwhile
+

Though, I haven't encountered such problems yet (I have tested the patch on macOS and Linux).

Sure, I'll fix the rest of the places.

When channel's close_cb is called, the job this channel is assigned to
may still be alive and the result status of the job will be set to
FAILED (as no actual return code is available at that moment). This is
easily reproduced for me on a single CPU Linux VPS.

To fix this, check for job's status in exit_cb instead.

Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
@fatih
Copy link
Owner

fatih commented Apr 2, 2017

@pborzenkov

yeah that was the problem I had. Let me test this branch locally for couple of days. It's a very critical one that might break async build and co on anything.

@fatih
Copy link
Owner

fatih commented Apr 2, 2017

Alright extensively tested it. I think we can merge it to master (because master is a development branch). This would make it easier to spot errors, because people fetch from master frequently. Thanks @pborzenkov for the fix again.

@fatih fatih merged commit c71c75e into fatih:master Apr 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants