Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fix lock competition between Stats and Start interface #2238

Merged
merged 1 commit into from
Sep 14, 2018

Conversation

HusterWan
Copy link
Contributor

Signed-off-by: Michael Wan zirenwan@gmail.com

Ⅰ. Describe what this PR did

We use Stats interface to get running container's resource metrics, and this api can be called at any time of the container's life cycle.

If we called the Stats interface when the container is starting, there may caused Stats or Start failed because of lock competition.

The code that get the lock:

     if !c.lock.Trylock(id) {
          return nil, errtypes.ErrLockfailed
       }
       defer c.lock.Unlock(id)
  

The log that get lock failed:

time="2018-09-12T16:57:10.49956582+08:00" level=error msg="failed to decode metrics of container "b36804c94a701cfb5f8e3c27c6e9eed096e8179f6352ec465df156efb01b28c8": failed to get stats of container "b36804c94a701cfb5f8e3c27c6e9eed096e8179
f6352ec465df156efb01b28c8": lock failed"
500 Internal Server Error: {"message":"failed to create container(${CNRID}) on containerd: lock failed"}

So, in order to avoid the lock competition, we should just get the Metrics just when the container is running

Ⅱ. Does this pull request fix one issue?

none

Ⅲ. Why don't you add test cases (unit test/integration test)? (你真的觉得不需要加测试吗?)

This situation is very hard to write test case.

Ⅳ. Describe how to verify it

none

Ⅴ. Special notes for reviews

none

@pouchrobot pouchrobot added kind/bug This is bug report for project size/XS labels Sep 13, 2018
@@ -81,6 +81,11 @@ func (mgr *ContainerManager) Stats(ctx context.Context, name string) (*container
ID := c.ID
c.Unlock()

// only get metrics when the container is running
if !(c.IsRunning() || c.IsPaused()) {
return nil, nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return nil will panic if the StreamStat toContainerStats consumes the result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, i have fix it.

…ontainer is not running

Signed-off-by: Michael Wan <zirenwan@gmail.com>
Copy link
Contributor

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid merged commit 9f270ca into AliyunContainerService:master Sep 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is bug report for project size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants