Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: copy data before put it into ringbuf #1471

Merged
merged 1 commit into from
Jun 11, 2018
Merged

bugfix: copy data before put it into ringbuf #1471

merged 1 commit into from
Jun 11, 2018

Conversation

fuweid
Copy link
Contributor

@fuweid fuweid commented Jun 5, 2018

if the consumer pops the data slower than the producer, the data will be
overrided by the coming data.

Signed-off-by: Wei Fu fhfuwei@163.com

Ⅰ. Describe what this PR did

Fixed bug caused by the shared data.

In the pouch, we use the following statement to copy data from fifo into containerio.

// https://github.com/alibaba/pouch/blob/master/daemon/containerio/cio.go#L125
go func(r io.ReadCloser) {
		io.Copy(ioset.out, r)
		r.Close()
		wg.Done()
}(f)

// https://github.com/alibaba/pouch/blob/master/daemon/containerio/container_io.go#L219
func (cio *ContainerIO) Write(data []byte) (int, error) {
	switch cio.typ {
	case stdout:
		for _, b := range cio.backends {
			if cover := b.outRing.Push(data); cover {
				logrus.Warnf("cover data, backend: %s, id: %s", b.backend.Name(), cio.id)
			}
		}
	case stderr:
		for _, b := range cio.backends {
			if cover := b.errRing.Push(data); cover {
				logrus.Warnf("cover data, backend: %s, id: %s", b.backend.Name(), cio.id)
			}
		}
	}

	return len(data), nil
}

However, the data is shared by the multiple write. Please check the io.Copy.

If we don't copy the data before write it into ringbuf, it maybe override by the coming new data/

Ⅱ. Does this pull request fix one issue?

NONE.

Ⅲ. Describe how you did it

Please check the code.

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@pouchrobot pouchrobot added kind/bug This is bug report for project size/S labels Jun 5, 2018
@allencloud
Copy link
Collaborator

daemon/containerio/container_io.go:233:55: "comming" is a misspelling of "coming"

Big brother is watching on you. 😄 @fuweid

if the consumer pops the data slower than the producer, the data will be
overrided by the coming data.

Signed-off-by: Wei Fu <fhfuwei@163.com>
@fuweid
Copy link
Contributor Author

fuweid commented Jun 5, 2018

@allencloud That is why I like the CI 😄

@allencloud
Copy link
Collaborator

I think the CI failure is a flaky test which is described in #1470 :

FAIL: /go/src/github.com/alibaba/pouch/test/cli_restart_test.go:34: PouchRestartSuite.TestPouchRestart
/go/src/github.com/alibaba/pouch/test/cli_restart_test.go:42:
    res.Assert(c, icmd.Success)
/go/src/github.com/alibaba/pouch/vendor/github.com/gotestyourself/gotestyourself/icmd/command.go:61:
    t.Fatalf("at %s:%d - %s\n", filepath.Base(file), line, err.Error())
... Error: at cli_restart_test.go:42 - 
Command:  /usr/local/bin/pouch restart -t 1 TestPouchRestart
ExitCode: 1
Error:    exit status 1
Stdout:   
Stderr:   Error: {"message":"service endpoint with name b4f54fcf already exists"}
Failures:
ExitCode was 1 expected 0
Expected no error

I re-trigger the test and I will fix the test asap.

@codecov-io
Copy link

Codecov Report

Merging #1471 into master will decrease coverage by 1.64%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1471      +/-   ##
==========================================
- Coverage   40.31%   38.67%   -1.65%     
==========================================
  Files         251      251              
  Lines       16416    17212     +796     
==========================================
+ Hits         6618     6656      +38     
- Misses       8956     9705     +749     
- Partials      842      851       +9
Impacted Files Coverage Δ
daemon/containerio/container_io.go 61.53% <50%> (+0.49%) ⬆️
apis/server/container_bridge.go 53.8% <0%> (-29.09%) ⬇️
apis/server/volume_bridge.go 63.1% <0%> (-23.56%) ⬇️
apis/server/network_bridge.go 44.87% <0%> (-18.77%) ⬇️
daemon/mgr/container_types.go 61.16% <0%> (-16.62%) ⬇️
apis/server/utils.go 49.09% <0%> (-15.2%) ⬇️
daemon/mgr/spec.go 40% <0%> (-14.55%) ⬇️
daemon/mgr/container.go 35.45% <0%> (-13.84%) ⬇️
daemon/mgr/container_exec.go 55.43% <0%> (-12.57%) ⬇️
pkg/system/cgroup.go 87.09% <0%> (-6.46%) ⬇️
... and 4 more

// in the ringbuf.
//
// However, copy data maybe impact the performance. We need to reconsider
// other better way to handle the IO.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, AWY, we need a better way to handle IO later !

@HusterWan
Copy link
Contributor

LGTM

@pouchrobot pouchrobot added the LGTM one maintainer or community participant agrees to merge the pull reuqest. label Jun 11, 2018
@allencloud allencloud merged commit 00a3631 into AliyunContainerService:master Jun 11, 2018
@fuweid fuweid deleted the bugfix_override_data_if_consume_slow_than_producer branch August 3, 2018 02:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is bug report for project LGTM one maintainer or community participant agrees to merge the pull reuqest. size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants