Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: re-design upgrade feature #2211

Merged
merged 2 commits into from
Oct 17, 2018

Conversation

HusterWan
Copy link
Contributor

@HusterWan HusterWan commented Sep 6, 2018

Signed-off-by: Michael Wan zirenwan@gmail.com

Ⅰ. Describe what this PR did

Before i re-design the upgrade feature, i have thought about a question for a long time: what's for user to use upgrade ? If you want to change memory, cpu, env etc, i think you should use the update feature. But if you want to update the container's image, and inherit the volume and network from the old container, now you should think about the upgrade feature.

After figured out the purpose of the upgrade feature, i re-designed the api interface and the core logic of the upgrade.

About API

Now, upgrade api request body is below, we only support you to specify the Image which used to create a new container and the new Entrypoint and Cmd that the new container used to run.

type ContainerUpgradeConfig struct {

	// Execution commands and args
	// Min Items: 1
	Cmd []string `json:"Cmd"`

	// The entrypoint for the container as a string or an array of strings.
	// If the array consists of exactly one empty string (`[""]`) then the entry point is reset to system default.
	//
	Entrypoint []string `json:"Entrypoint"`

	// image
	// Required: true
	Image string `json:"Image"`
}

I think it's enough to upgrade a container from the old image to a new one.

About the core logic

I think we must solve problems below when implement the upgrade feature.

1. Which cmd should be used to create the new container when upgrade

There are four commands as mentioned below, when create a new container in upgrade, the order of using the command is:

  • We should first use CMD3;
  • We should use the CMD1, if CMD4 is not specified ;
  • We need to use CMD4, if CMD1 is also not specified ;
  • CMD2 should not be used.
cmd named
old cmd of container config CMD1
old image's cmd CMD2
new cmd of container upgrade config CMD3
new image's cmd CMD4

2. Inherit the volumes and network from the old container when create new container

  • network: before create a new container, we should first stop the old container if it is running, then we can use the network config of the old container.
  • storage: new container inherit all the mounts information of the old one, so we can use the volumes used by old container, and we should also parse the volume information contained in new image.

3. How to prepare new rootfs for the new container

In containerd, snapshot represents the rootfs of container, so we should use the new image to create a new snapshot for the new container in upgrade. In case of snapshot name conflicted, we add a random suffix string to the container id as the name of the new snapshot.

4. Rollback

When occurred any error during upgrade, we should have the ability to rollback the upgrade and recover the old container just like it before.

Ⅱ. Does this pull request fix one issue?

none

Ⅲ. Why don't you add test cases (unit test/integration test)? (你真的觉得不需要加测试吗?)

Yes, add more test cases

Ⅳ. Describe how to verify it

none

Ⅴ. Special notes for reviews

none

@codecov-io
Copy link

codecov-io commented Sep 6, 2018

Codecov Report

Merging #2211 into master will increase coverage by 0.03%.
The diff coverage is 67.54%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2211      +/-   ##
==========================================
+ Coverage   67.33%   67.37%   +0.03%     
==========================================
  Files         213      215       +2     
  Lines       17517    17593      +76     
==========================================
+ Hits        11795    11853      +58     
- Misses       4327     4340      +13     
- Partials     1395     1400       +5
Flag Coverage Δ
#criv1alpha1test 32.11% <11.25%> (-0.12%) ⬇️
#criv1alpha2test 36.14% <11.25%> (-0.18%) ⬇️
#integrationtest 40.3% <66.88%> (+0.15%) ⬆️
#nodee2etest 33.52% <11.92%> (-0.14%) ⬇️
#unittest 23.13% <4.63%> (-0.07%) ⬇️
Impacted Files Coverage Δ
cri/v1alpha1/cri_utils.go 83.22% <0%> (ø) ⬆️
cri/v1alpha2/cri_utils.go 90.57% <100%> (ø) ⬆️
daemon/mgr/image.go 57.7% <50%> (-0.23%) ⬇️
ctrd/container.go 59.28% <50%> (+0.95%) ⬆️
daemon/mgr/container_upgrade.go 65.09% <65.09%> (ø)
daemon/mgr/container_types.go 83.33% <80%> (-0.31%) ⬇️
pkg/utils/random_string.go 81.81% <81.81%> (ø)
daemon/mgr/container.go 59% <83.33%> (+1.3%) ⬆️
ctrd/watch.go 75.75% <0%> (-4.55%) ⬇️
daemon/mgr/snapshot.go 89.85% <0%> (-4.35%) ⬇️
... and 9 more

Labels map[string]string
IO *containerio.IO
Spec *specs.Spec
SnapshotKey string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SnapshotKey is the snapshot id used by container.

we used to use container id as snapshot id, but when we support upgrade a container, the snapshot id may be different with container id.

So, we must record the snapshot id that container used

// Upgrade upgrades a container with new image and args.
func (mgr *ContainerManager) Upgrade(ctx context.Context, name string, config *types.ContainerUpgradeConfig) error {
c, err := mgr.container(name)
func (mgr *ContainerManager) getImageRef(ctx context.Context, image string) (digest.Digest, string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is function of ContainerManager. It is better to locate it in ImageManager. So we should use containerMgr.ImageMgr.GetImageRef. WDYT? @HusterWan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good advice, i will put those function to ImageMgr

return imageID, primaryRef.String(), nil
}

func (mgr *ContainerManager) getOCIImageConfig(ctx context.Context, image string) (ocispec.ImageConfig, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue, improper place to locate the function here.

return ociImage.Config, nil
}

func (mgr *ContainerManager) prepareContainerEntrypointForUpgrade(ctx context.Context, c *Container, config *types.ContainerUpgradeConfig) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the related Upgrade functions. could we move all of them to a saperated file container_upgrade.go ?
Just a piece of advice.

// Upgrade success, remove snapshot of old container
if err := mgr.Client.RemoveSnapshot(ctx, oldSnapID); err != nil {
// TODO(ziren): remove old snapshot failed, may cause dirty data
logrus.Errorf("failed to remove snapshot %s: %v", oldSnapID, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the log level of Error? Is Warn enough?
Just double check to see the right way to use error and warn.

@@ -258,6 +258,9 @@ type Container struct {

// MountFS is used to mark the directory of mount overlayfs for pouch daemon to operate the image.
MountFS string `json:"-"`

// SnapshotKey specify id of the snapshot that container using.
SnapshotKey string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about directly using SnapshotID?

@HusterWan HusterWan force-pushed the zr/fix-upgrade branch 2 times, most recently from 21c3b71 to bffe170 Compare September 7, 2018 11:03
@HusterWan HusterWan changed the title [wip]refactor: re-design upgrade feature refactor: re-design upgrade feature Sep 7, 2018
apis/swagger.yml Outdated
properties:
HostConfig:
$ref: "#/definitions/HostConfig"
ContainerUpgradeConfig is used for API "POST /containers/upgrade". when upgrade a container,
Copy link
Collaborator

@allencloud allencloud Sep 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"POST /containers/{id}/upgrade"?

Are you missing the {id}/? @HusterWan

s/when upgrade/When upgrading/? Uppercase and add the ing?

@allencloud
Copy link
Collaborator

validate-swagger
./hack/validate_swagger.sh
2018/09/09 15:14:46 yaml: line 2457: did not find expected key
Makefile:121: recipe for target 'validate-swagger' failed
make: *** [validate-swagger] Error 1

@HusterWan
Copy link
Contributor Author

validate-swagger
./hack/validate_swagger.sh
2018/09/09 15:14:46 yaml: line 2457: did not find expected key
Makefile:121: recipe for target 'validate-swagger' failed
make: *** [validate-swagger] Error 1

@allencloud fixed

@allencloud
Copy link
Collaborator

Could we make this move on ASAP? @HusterWan
I think it is always a fancy feature for end-users. ❤️

@HusterWan
Copy link
Contributor Author

Could we make this move on ASAP? @HusterWan
I think it is always a fancy feature for end-users. ❤️

of course, i have updated this PR

@@ -322,9 +321,9 @@ func (mgr *ContainerManager) Create(ctx context.Context, name string, config *ty
return nil, errors.Wrapf(errtypes.ErrInvalidParam, "unknown runtime %s", config.HostConfig.Runtime)
}

config.Image = primaryRef.String()
snapID := id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why to use generated container ID to assign to snapshot ID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SnapshotID is used to equal with ContainerID, but after upgrade they are different.

So, i use param snapID here just to reminder that snapshot id and container id are different

Signed-off-by: Michael Wan <zirenwan@gmail.com>
@HusterWan
Copy link
Contributor Author

upgrade should consider things below:

  1. Should support delete old volumes logic when upgrade a container.
  2. CopyData only does once when the volume is created
  3. the order of which cmd should be used need discuss again.

Signed-off-by: Michael Wan <zirenwan@gmail.com>
@HusterWan
Copy link
Contributor Author

Since we need more discussion to determine how to integrate upgrade feature to Sigma, we can first merge this pr now. @allencloud @fuweid

@allencloud
Copy link
Collaborator

LGTM

@pouchrobot pouchrobot added the LGTM one maintainer or community participant agrees to merge the pull reuqest. label Oct 17, 2018
@allencloud allencloud merged commit d53105e into AliyunContainerService:master Oct 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/design kind/feature kind/refactor LGTM one maintainer or community participant agrees to merge the pull reuqest. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants