Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

proxy: close yamux session properly #94

Merged
merged 2 commits into from
Aug 3, 2018

Conversation

bergwolf
Copy link
Member

@bergwolf bergwolf commented Aug 2, 2018

When cleaning up, we need to make sure the yamux session is closed as
well. It does not close the session by just closing the listener.

OTOH, it closes all streams if the session is closed.

Fixes: #93

When cleaning up, we need to make sure the yamux session is closed as
well. It does not close the session by just closing the listener.

OTOH, it closes all streams if the session is closed.

Fixes: kata-containers#93

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@codecov
Copy link

codecov bot commented Aug 2, 2018

Codecov Report

Merging #94 into master will increase coverage by 3.34%.
The diff coverage is 64.28%.

@@            Coverage Diff             @@
##           master      #94      +/-   ##
==========================================
+ Coverage    34.4%   37.74%   +3.34%     
==========================================
  Files           2        2              
  Lines         250      257       +7     
==========================================
+ Hits           86       97      +11     
+ Misses        151      149       -2     
+ Partials       13       11       -2

@devimc
Copy link

devimc commented Aug 2, 2018

lgtm

proxy_test.go Outdated
defer func() {
close(closeCh)
lp.Close()
s.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is closing in the opposite order to the order in proxy.go.

@jodh-intel
Copy link
Contributor

This looks good but doesn't resolve the issue with #89: taking just the top commit from that PR and applying on top of this still causes a test race. Note that you need to run the tests a few times to actually hit the race though.

@bergwolf
Copy link
Member Author

bergwolf commented Aug 2, 2018

@jodh-intel yeah, the PR is not complete. It turns out that yamux stream is closed in an asynchronous manner. I'll see how to fix it.

@bergwolf
Copy link
Member Author

bergwolf commented Aug 2, 2018

@jodh-intel Now the data race should be closed. I've run 20 times (merged with #89 for sure) without hitting it. PTAL.

@jodh-intel
Copy link
Contributor

jodh-intel commented Aug 2, 2018

Thanks @bergwolf - I've run (this branch + #89) 30 times with no go test panics (with master + #89 I see panics ~ every 1-3 runs).

lgtm!

Approved with PullApprove

@opendev-zuul
Copy link

opendev-zuul bot commented Aug 2, 2018

Build failed (third-party-check pipeline) integration testing with
OpenStack. For information on how to proceed, see
http://docs.openstack.org/infra/manual/developers.html#automated-testing

Copy link

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Globally looks good, but I think the handling of the WaitGroup is racy.

proxy.go Outdated
}

go proxyConn(conn, stream)
go proxyConn(conn, stream, wg)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potentially racy. Usually, we call wg.Add() before to call into the go routine calling wg.Done().
The current go routine here might reach wg.Done() before the subsequent go routines will call into wg.Add().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks! ptal.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now !

@bergwolf bergwolf force-pushed the session_close branch 2 times, most recently from 82e3301 to 2df1fe5 Compare August 3, 2018 00:32
yamux stream closing is aync. Use a wait group to make sure
when we quit the serve() goroutine, all the stream copy goroutines
have quited as well.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@opendev-zuul
Copy link

opendev-zuul bot commented Aug 3, 2018

Build failed (third-party-check pipeline) integration testing with
OpenStack. For information on how to proceed, see
http://docs.openstack.org/infra/manual/developers.html#automated-testing

@bergwolf
Copy link
Member Author

bergwolf commented Aug 3, 2018

Merge it since the failed three checks are all expected.

@bergwolf bergwolf merged commit d7559c1 into kata-containers:master Aug 3, 2018
@bergwolf bergwolf deleted the session_close branch August 3, 2018 07:11
@sboeuf
Copy link

sboeuf commented Aug 3, 2018

@bergwolf

travis-ci is failing on #95

what's the story on this ?

openstack-zuul has been failing for a long time w/o showing reasons
Actually, if you look at the results from the comment left by Zuul, the CI did succeed on kata-runsh, which is the Ubuntu 16.04 job, but failed on Fedora 28. The Fedora 28 job is still failing but at least we can confidently look at kata-runsh.

@bergwolf
Copy link
Member Author

bergwolf commented Aug 3, 2018

@sboeuf
About macos travis-ci, I should have also referenced kata-containers/tests/issues/558. Somehow the yq command becomes a requirement for running CI, and it breaks CI on darwin where that command is missing.

About openstack-zuul, yes, kata-runsh did succeed. But kata-runsh-fedora-28 seems to always fail across all repos.

@sboeuf
Copy link

sboeuf commented Aug 3, 2018

ok thanks for the feedback

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

close yamux session properly
4 participants