Container deltas #11

Merged
merged 14 commits into from
Aug 21, 2017

petrosagg
Contributor

Overview

This is a fairly big PR, but it is cleanly split into individual commits, so I suggest reviewing it commit by commit.

This PR is based on the resin-os/librsync-go package that can produce and apply deltas on arbitrary data streams.

Building bottom up, this PR includes:

Groundwork in pkg/*

The pkg/ioutils package gains two new utilities: ReadSeekCloser, which is the same as ReadCloser but with Seek support, and ConcatReadSeekCloser, which concatenates two streams.

The pkg/tarsplitutils package augments the functionality of vbatts/tar-split by providing a seekable tar stream, built from the combination of tar metadata and a file tree on disk.

Making a delta-able stream for a docker image

A new TarSeekStream() method was added to the Layer interface. It is similar to the existing TarStream(), but the stream it returns is seekable. On the other hand, it does not implement digest verification.

A new GetTarSeekStream() method was added to the ImageStore interface. The name isn't very accurate, since the resulting stream is not a valid tar archive but the concatenation of all the layer streams in a given image.

The streams produced by these methods are reproducible on any host that has the same image.

Delta calculation

A new docker daemon API endpoint was added, POST deltas/create?src=<id>&dest=<id>. It takes the streams of the two images using GetTarSeekStream() and computes a binary delta using librsync-go. The result is stored as a new image with a single layer containing a single file, the binary diff. The config of the delta image contains two special labels, io.resin.delta.base and io.resin.delta.config. The former serves as metadata for delta application and the latter holds the serialised config of the destination image.
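
For illustration, a client could build the request path for this endpoint roughly as follows. This is a sketch under assumptions: the helper name `deltaCreatePath` is hypothetical, the leading slash is assumed, and the image identifiers are placeholders. Only the `src` and `dest` parameter names come from the description above.

```go
package main

import (
	"fmt"
	"net/url"
)

// deltaCreatePath is a hypothetical helper that builds the path and query
// for the delta creation endpoint. The leading slash is an assumption.
func deltaCreatePath(src, dest string) string {
	q := url.Values{}
	q.Set("src", src)   // image to diff from
	q.Set("dest", dest) // image to diff to
	// Encode escapes the values and sorts the parameters by key.
	return "/deltas/create?" + q.Encode()
}

func main() {
	// "image-a" and "image-b" are placeholder identifiers.
	fmt.Println(deltaCreatePath("image-a", "image-b"))
	// prints "/deltas/create?dest=image-b&src=image-a"
}
```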

Delta pull

Deltas are deliberately made compatible with the docker registry, so a delta pull is done with a normal docker pull command. When the pulling code detects the two special labels in the target config, it attempts to apply the delta. The layers are recreated on the fly inside the distribution code as if they were downloaded in full. This minimises the changes needed for the rest of the codebase.
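
The detection step can be sketched as below. This is not the PR's distribution code: `deltaInfo` is a hypothetical helper, and the label values used in `main` are placeholders, since the exact format of the values is not described here. Only the two label names come from the description above.

```go
package main

import "fmt"

const (
	labelDeltaBase   = "io.resin.delta.base"   // metadata for delta application
	labelDeltaConfig = "io.resin.delta.config" // serialised config of the destination image
)

// deltaInfo is a hypothetical helper: it reports whether an image config's
// labels mark it as a delta, and returns the two label values if so.
func deltaInfo(labels map[string]string) (string, string, bool) {
	base, hasBase := labels[labelDeltaBase]
	config, hasConfig := labels[labelDeltaConfig]
	if !hasBase || !hasConfig {
		return "", "", false
	}
	return base, config, true
}

func main() {
	// Placeholder values; the real label contents are not shown in this PR text.
	labels := map[string]string{
		labelDeltaBase:   "base-image-id",
		labelDeltaConfig: `{"placeholder":true}`,
	}
	if base, _, ok := deltaInfo(labels); ok {
		fmt.Println("delta image, base:", base)
	} else {
		fmt.Println("regular image")
	}
}
```

Requiring both labels keeps a regular image with only one of them from being treated as a delta; whether the PR applies the same rule is an assumption of this sketch.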

Secondary layer store for deltas

Lastly, three new parameters were added to the docker daemon: --delta-storage-root, --delta-storage-driver, and --delta-storage-opts. These options allow the docker daemon to load a secondary docker storage and use its layers as candidates for delta pulls.

This is useful for dual-partition setups (like the one we have in resinOS), where the source image lives on a different partition from the one where the destination image should go.

"github.com/vbatts/tar-split/tar/storage"
)

func min(x, y int64) int64 {
Contributor

Is there a reason why min uses int64 and max uses int?

Contributor Author

No reason other than that those are the types I have in the places where I use the respective functions.

image/store.go Outdated
@@ -213,6 +215,46 @@ func (is *store) Get(id ID) (*Image, error) {
return img, nil
}

func (is *store) GetTarSeekStream(id ID) (ioutils.ReadSeekCloser, error) {
Contributor

@zozo123, Aug 2, 2017

Please consider adding documentation to this function, something like:
// GetTarSeekStream returns a concatenation of Tar streams

Contributor Author

Done.

"golang.org/x/net/context"
)

func (d *deltaRouter) postDeltasCreate(ctx context.Context, w http.ResponseWriter, r *http.Request, vars map[string]string) error {
Contributor

It looks good, but I would expect it to be context-dependent.
Something like this:
https://github.com/moby/moby/blob/b248de7e332b6e67b08a8981f68060e6ae629ccf/distribution/pull_v2.go#L63

Contributor Author

I'm not sure I understand the comment. Do you mean passing the context down the line?

Contributor

I mean, the func gets ctx as an argument, but doesn't use it.

Contributor Author

Indeed, but that's just because the router framework passes it to the handlers. The same pattern can be seen here: https://github.com/moby/moby/blob/master/api/server/router/volume/volume_routes.go

return err
}

deltaSrc := r.Form.Get("src")
Contributor

I know that d.backend.DeltaCreate will return an error if the src and dest are invalid, though I think it's better to check whether Get returned an error as well.

Contributor Author

How would I check that?

Contributor

I guess it's OK here if DeltaCreate can work with empty strings.

Contributor Author

Yeah, it can handle empty strings.

}

deltaSrc := r.Form.Get("src")
deltaDest := r.Form.Get("dest")
Contributor

@zozo123, Aug 2, 2017

Ditto.

Contributor

@zozo123 left a comment

Looks very good. I think it's worth asking one more person to review it.

Contributor

@zozo123 left a comment

LGTM

This is a useful abstraction that exists in a few places in docker's
codebase

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
The TarSeekStream is added to the Layer interface to allow any user of
layers to request a ReadSeeker of the tar archive that would have been
produced in a normal TarStream().

This allows reading parts of the resulting archive on the fly, without
having to buffer the file on disk and then seek on top of it.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
This method creates a seekable stream that is the concatenation of the
tar seekable streams of the layers the image is composed of. This method
is intended to be used as a basis for delta based updates where bits of
the previous image can be reused to reconstruct a layer of a future
image.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
This will allow distribution code to request a seekable tar stream in
order to compute a binary delta on top of it.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Pull the config before starting anything else. We need the config before
starting the layer download to prepare for a delta pull if this config
happens to be a delta config.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
This commit allows the daemon to load a secondary daemon store to be
used as source for deltas.

Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
@petrosagg petrosagg merged commit d7b6017 into 17.06-resin Aug 21, 2017
@petrosagg petrosagg deleted the delta-pull branch August 21, 2017 09:12