Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: take over old containerd instance when pouchd restart #1275

Merged

Conversation

HusterWan
Copy link
Contributor

Signed-off-by: Michael Wan zirenwan@gmail.com

Ⅰ. Describe what this PR did

When pouchd restart, if old containerd is alive, we take it over.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how you did it

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@@ -166,3 +185,125 @@ func (c *Client) Version(ctx context.Context) (containerd.Version, error) {

return cli.client.Version(ctx)
}

func (c *Client) runContainerdDaemon(homeDir string, copts clientOpts) error {
if homeDir == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use os.Stat to check homeDir here? If the homeDir doesn't exist, we should return error right away.
WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the home-dir existence is checked in main.go, so we no need check here.
but we should check the stateDir here, thanks for your reminder, :)

ctrd/client.go Outdated
}
logrus.Infof("ctrd: new containerd process, pid: %d", cmd.Process.Pid)

if err := utils.SetOOMScore(cmd.Process.Pid, -17); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember the containerd has the OOM score settings in the configuration. In my opinion, it's better if we maintain configuration for the pouchd. The configuration can help us to locate the problem in the future. How do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, reasonable !!!

@HusterWan HusterWan force-pushed the zr/containerd-dev branch from 58c1400 to 662553d Compare May 7, 2018 08:05
@codecov-io
Copy link

codecov-io commented May 7, 2018

Codecov Report

Merging #1275 into master will decrease coverage by 0.2%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1275      +/-   ##
==========================================
- Coverage   15.52%   15.31%   -0.21%     
==========================================
  Files         173      174       +1     
  Lines       10868    11014     +146     
==========================================
  Hits         1687     1687              
- Misses       9062     9208     +146     
  Partials      119      119
Impacted Files Coverage Δ
daemon/config/config.go 7.89% <ø> (ø) ⬆️
ctrd/client_opts.go 0% <0%> (ø)
pkg/utils/utils.go 69.79% <0%> (-7.24%) ⬇️
ctrd/client.go 0% <0%> (ø) ⬆️
ctrd/wrapper_client.go 0% <0%> (ø) ⬆️

@HusterWan HusterWan force-pushed the zr/containerd-dev branch from 662553d to d21aaf3 Compare May 7, 2018 08:56
ctrd/client.go Outdated
os.RemoveAll(c.socketPath)
}

// Start a new containerd instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use a function to wrap the following functions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

daemon/daemon.go Outdated
errMsg = fmt.Sprintf("%s\n", err.Error())
}

fmt.Println("Start cleanup containerd...")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid we should use logrus to catch this line of message.
Rather than fmt.Println.

@@ -216,7 +225,23 @@ func (d *Daemon) Run() error {

// Shutdown stops daemon.
func (d *Daemon) Shutdown() error {
return d.server.Stop()
var errMsg string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if the variable errMsg is redundant.
I think with the current implementation, we can just return the returned err, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allencloud i want to execute all stop function even occurred an error when calling one of the stop function

@HusterWan HusterWan force-pushed the zr/containerd-dev branch from d21aaf3 to cb430da Compare May 8, 2018 02:27
@HusterWan
Copy link
Contributor Author

Updated! PTAL @allencloud @fuweid

@HusterWan HusterWan force-pushed the zr/containerd-dev branch 3 times, most recently from 9588676 to 002d016 Compare May 8, 2018 08:57
@allencloud
Copy link
Collaborator

I am wondering if we could add an integration test to cover this case? @HusterWan
Also cc @Letty5411 @fuweid for more feedbacks.

We only have api and cli integration test and unit test, for other kinds of test, shall we include them in test code as well?

@HusterWan
Copy link
Contributor Author

Maybe we can add test case in daemon test, here is test steps in my compute:

  1. first start containerd, and get the pid of it.
  2. echo -n pid of containerd to pidfile
  3. check if we can start pouchd normally
  4. kill pouchd and check if old containerd can exited normally

@Letty5411 can we add test case as my described ?

// Start a new containerd instance
args := []string{
"-a", c.rpcAddr,
"--root", path.Join(c.homeDir, "containerd/root"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I don't make my suggestion clear in the last comment.

The containerd has own the configuration. The link is here. Maybe we can marshal the settings in the file so that we can use the file to check the information about pouchd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, the containerd core Opts have been in the poucd configuration file, and i think we no need keep much containerd configuration opts in pouchd code, if you want to add more configurations , use the containerd configuration file outside, DYA?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

type clientOpts struct {
startDaemon bool
debugLog bool
rpcAddr string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have default value for the option? I'm wondering that the WithXXX is not necessary for every time. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not call WithXXX will set the Opts with default value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @HusterWan

// set default value for parameters
	if copts.rpcAddr == "" {
		copts.rpcAddr = unixSocketPath
	}
	client.rpcAddr = copts.rpcAddr

	if copts.grpcClientPoolCapacity <= 0 {
		copts.grpcClientPoolCapacity = defaultGrpcClientPoolCapacity
	}
	if copts.maxStreamsClient <= 0 {
		copts.maxStreamsClient = defaultMaxStreamsClient
	}

For this part, I means we can use the following code to handle

var defaultClientOpts = clientOpts{
rpcAddr: unixSocketPath,
grpcClientPoolCapacity: defaultGrpcClientPoolCapacity,
maxStreamsClient: defaultMaxStreamsClient
}

With the default value, we don't need to have the if-else code. :)

Hope it can help

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid, i can not agree with you, default value take effect when not specified or invalid params

Signed-off-by: Michael Wan <zirenwan@gmail.com>
@fuweid
Copy link
Contributor

fuweid commented May 9, 2018

LGTM

@allencloud allencloud merged commit 86bfa6e into AliyunContainerService:master May 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants