Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fix exec stuck when exec get error #1605

Merged
merged 1 commit into from
Jul 5, 2018
Merged

bugfix: fix exec stuck when exec get error #1605

merged 1 commit into from
Jul 5, 2018

Conversation

Ace-Tang
Copy link
Contributor

@Ace-Tang Ace-Tang commented Jun 28, 2018

when daemon deal with exec process get error, process will get stuck,
since http connection has been hijacked, it won't finish and return
with error message. Fix it by write error message into hijacked
connection, close io when process get error.

fixes: #1151

Signed-off-by: Ace-Tang aceapril@126.com

Ⅰ. Describe what this PR did

Ⅱ. Does this pull request fix one issue?

fixes: #1151

Ⅲ. Describe how you did it

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@Ace-Tang
Copy link
Contributor Author

Ace-Tang commented Jun 28, 2018

Hi, @Letty5411 , I want you to help look at a test

func (suite *APIContainerExecStartSuite) TestContainerExecStartNotFound(c *check.C) {
    obj := map[string]interface{}{
        "Detach": false,
        "Tty":    false,
    }   
    body := request.WithJSONBody(obj)
    resp, err := request.Post("/exec/TestContainerExecStartNotFound/start", body)
    c.Assert(err, check.IsNil)
    CheckRespStatus(c, resp, 404)
}

As for fix exec hang, I ignore the error https://github.com/alibaba/pouch/pull/1605/files#diff-b7f6767e7c90c31b2aec9ef134e49e54L75, since error message can not be return by a hijacked connect. Does the test fails because I ignore the error?

do not worry about this, I have fix this.

@Ace-Tang Ace-Tang requested a review from fuweid June 28, 2018 08:33
@codecov-io
Copy link

codecov-io commented Jun 28, 2018

Codecov Report

Merging #1605 into master will decrease coverage by 1.19%.
The diff coverage is 16.88%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #1605     +/-   ##
=========================================
- Coverage   41.97%   40.78%   -1.2%     
=========================================
  Files         270      270             
  Lines       17609    18100    +491     
=========================================
- Hits         7392     7382     -10     
- Misses       9307     9806    +499     
- Partials      910      912      +2
Impacted Files Coverage Δ
daemon/mgr/container.go 38.52% <ø> (-11.68%) ⬇️
cri/v1alpha2/cri_stream.go 0% <0%> (ø) ⬆️
cli/exec.go 0% <0%> (ø) ⬆️
cri/v1alpha1/cri_stream.go 0% <0%> (ø) ⬆️
cri/v1alpha1/cri.go 0% <0%> (ø) ⬆️
cri/v1alpha2/cri.go 0% <0%> (ø) ⬆️
apis/server/router.go 91.36% <100%> (+0.06%) ⬆️
daemon/mgr/container_exec.go 67.74% <39.13%> (-2.38%) ⬇️
apis/server/exec_bridge.go 68.18% <60%> (-1.82%) ⬇️
daemon/mgr/image.go 54.09% <0%> (-14.38%) ⬇️
... and 7 more


"github.com/alibaba/pouch/apis/types"
"github.com/alibaba/pouch/ctrd"
"github.com/alibaba/pouch/pkg/errtypes"
"github.com/alibaba/pouch/pkg/randomid"
"github.com/docker/docker/pkg/stdcopy"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the package into next group

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh!

case err := <-stdoutDone:
if err != nil {
logrus.Debugf("receive stdout error: %s", err)
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the code goes into this part, the following code will go into the end?

go func() {
    if stdin {
       io.Copy(conn, os.Stdin)
    }

    close(stdinDone)
}()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is nit comment. If there exists goroutine leak, the cli will die to make the goroutine exit....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe depend on when golang schedule the goroutine if stdin is flase, if stdin if true, it must became a leak.

Copy link
Contributor

@fuweid fuweid Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But the cli will return and exit so that the leak problem will be solved. This part is LGTM right now.

@Ace-Tang
Copy link
Contributor Author

@YaoZengzeng , Could you help to see if the patch will infect CRI part.

exitCode := 126
execConfig.ExitCode = int64(exitCode)
}
mgr.ExecProcesses.Put(execid, execConfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mgr.ExecProcesses.Put(execid, execConfig) is unnecessary, since it has added in CreateExec, WDYT, @fuweid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allencloud , @HusterWan , does mgr.ExecProcesses.Put(execid, execConfig) can be removed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ace-Tang I do not think so, we will use exec/EXEC_ID/json to judge the exec result, so we should update the execConfig when exec done

defer func() {
if err != nil {
var stdout io.Writer = eio.Stdout
if !execConfig.Tty && !eio.MuxDisabled {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update the judgement, @fuweid

Copy link
Contributor

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ExecProcesses is useful and please add it back?

cli/exec.go Outdated
"os"
"sync"

"github.com/alibaba/pouch/apis/types"
"github.com/docker/docker/pkg/stdcopy"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you mind to split it into two parts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try...

Copy link
Contributor Author

@Ace-Tang Ace-Tang Jul 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think this part is not fixed this time.

@Ace-Tang
Copy link
Contributor Author

Ace-Tang commented Jul 2, 2018

Updated as the comment, @HusterWan @fuweid

@@ -55,8 +56,13 @@ func (s *Server) startContainerExec(ctx context.Context, rw http.ResponseWriter,
name := mux.Vars(req)["name"]
_, upgrade := req.Header["Upgrade"]

if err := s.ContainerMgr.CheckExecExist(ctx, name); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should keep consistent that object existence checking should be located in the manager layer rather than bridge layer?

So maybe we should encapsulate this checking into the function StartExec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if check in StartExec, it too late. StartExec will not return error, since hikacked connection not take error back, we only write error message into connect, then api will get the error message, if check is in StartExec, we will never get error.

@@ -186,6 +186,7 @@ func filter(handler handler, s *Server) http.HandlerFunc {
return
}
// Handle error if request handling fails.
logrus.Errorf("Handler for %s %s, client %s returen error: %s", req.Method, req.URL.RequestURI(), clientInfo, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/returen/returned

@pouchrobot
Copy link
Collaborator

ping @Ace-Tang
We found that this PR is 10 commits, which is more than 10 commits, behind master.
Please rebase the branch against master and push it back again. Thanks a lot.

when daemon deal with exec process get error, process will get stuck,
since http connection has been hijacked, it won't finish and return
with error message. Fix it by write error message into hijacked
connection, close io when process get error.

fixes: #1151

Signed-off-by: Ace-Tang <aceapril@126.com>
@fuweid
Copy link
Contributor

fuweid commented Jul 5, 2018

LGTM

@fuweid fuweid merged commit 540f51b into AliyunContainerService:master Jul 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] exec with -u nouser hang
6 participants