Design proposal: make the docker-py interface more 'data oriented' #538

TL;DR Make client.build() feel more like client.create_container(), and then make create_container() feel more like the new build().

In its current form, the API exposed by docker-py is a direct one-to-one mapping over the names of the operations supported by the Docker Remote API. Since these names refer to operations that can be parametrized with query params, it would be nice if this parametrization were more explicit and dictionary-ish, so that it could be constructed independently of the call to the docker-py method that implements the operation.

For example, say I want to send a docker context to the daemon to build. The daemon accepts a certain number of options that control both the interpretation of the contents being sent and the properties of the image that will be generated from them. These options are formally documented in the Docker Remote API docs, so in principle it's safe to declare them as some kind of version-aware data structure in docker-py.

Right now, docker-py does not offer a way for a client application to declare what these options are, except implicitly in the call to client.build(). The build options are the named parameters of the build() method, which, when called, inspects the specific combination of arguments passed to it to decide how to construct the request to the daemon.

The proposal is to introduce a way for docker-py to construct that specific combination of parameters more explicitly, i.e., as a first-class value that can be generated and passed around.

So, keeping the example of a 'build' operation, suppose I want to send a directory containing a Dockerfile to the daemon. The way it works now is:

    client = Client(**params)
    client.build(path='my/path', tag='some_image',
                 rm=True, nocache=False, stream=True)

The signature for build() is:

    def build(self, path=None, tag=None, quiet=False, fileobj=None,
              nocache=False, rm=False, stream=False, timeout=None,
              custom_context=False, encoding=None, pull=True,
              forcerm=False, dockerfile=None):

One obvious problem is that there are four kinds of named arguments declared there (made concrete in the sketch after this list):

  1. path, fileobj and custom_context control how the final context object will be built locally.
  2. tag, quiet, nocache, rm, pull, forcerm and dockerfile control how the docker daemon will execute the actual build.
  3. stream and timeout specify the mechanics of the interaction with the server.
  4. encoding is kind of an outlier: it appears to be an implementation detail, and it should be calculated automatically based on the contents of the generated context. Permissible values are gzip, bzip2, xz, and identity.
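
To make the grouping concrete, it could be expressed as data. A minimal sketch (the group names are invented for illustration; the kwarg names come from the signature above):

    # Hypothetical partition of build()'s kwargs by role
    CONTEXT_ARGS = {'path', 'fileobj', 'custom_context'}     # local context assembly
    DAEMON_ARGS = {'tag', 'quiet', 'nocache', 'rm', 'pull',
                   'forcerm', 'dockerfile'}                  # daemon-side build options
    TRANSPORT_ARGS = {'stream', 'timeout'}                   # client/server mechanics
    ENCODING_ARGS = {'encoding'}                             # should be derived from the context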

The proposal:

a) The argument groups should be formally separated so that the distinction is made declaratively.
b) User code should be able to create each of these groups of parameters independently of the call to build(), in an "API safe" way.

A sketch of an implementation:

    # This is the parametrization of the remote build job
    spec = BuildSpec(api_version='1.17')
    spec['tag'] = 'my_image'
    spec['nocache'] = True
    spec['forcerm'] = False
    spec['dockerfile'] = './mydockerfile'

The set of valid keys is specified in the Docker Remote API documentation, so there's a level of safety here. Ideally, this should not be an open dictionary, but something that docker-py knows how to construct safely, maybe via a function:

    # the advantage here is that this function can raise an error if the
    # kwargs are not valid for the current API (the one 'client' was
    # constructed with)
    spec = client.make_build_spec(**params)
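
For concreteness, here is a minimal sketch of such a spec type, assuming a per-version whitelist of keys. The whitelist below just echoes the option names used in this proposal; a real one would be transcribed from the Docker Remote API docs:

    class BuildSpec(dict):
        # Illustrative whitelist only, keyed by Remote API version
        VALID_KEYS = {
            '1.17': {'tag', 'quiet', 'nocache', 'rm', 'pull',
                     'forcerm', 'dockerfile'},
        }

        def __init__(self, api_version):
            super(BuildSpec, self).__init__()
            self.api_version = api_version

        def __setitem__(self, key, value):
            # Reject keys the target API version does not know about
            if key not in self.VALID_KEYS[self.api_version]:
                raise KeyError('%r is not a valid build option in API %s'
                               % (key, self.api_version))
            super(BuildSpec, self).__setitem__(key, value)

    # make_build_spec() could then be a thin wrapper (assuming the
    # client remembers the API version it was constructed with):
    def make_build_spec(self, **params):
        spec = BuildSpec(api_version=self._version)
        for key, value in params.items():
            spec[key] = value
        return spec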

The other argument type is the group of parameters used to control how the context is built locally by docker-py, namely path, fileobj, and custom_context. A data oriented alternative could be something like:

    # This is the parametrization of the call to docker-py's build()
    ctx = Context(type='directory')
    ctx['dockerfile'] = 'a_docker_file_name'
    ctx['dockerignore'] = ['.git/', '**/*.ignorable']
    ctx['path'] = 'my/local/path'
    # etc.

BUT, since the Docker Remote API mandates that the context be sent as a tar file (optionally compressed), docker-py should expose a 'maker' that returns a properly constructed tar object, one that is presumably valid taking into account the API version implemented, the content supplied, etc. In the current implementation this knowledge (how to make a context) is camouflaged in the imperative logic of the build method itself.

Exposing the ability to create context values:

    # Expose a maker that builds a context from a local directory
    ctx = client.make_context_from_dir(path='my/path')

    # Here's another way
    ctx = client.make_context_from_dockerfile('path/to/dockerfile')

    # And another
    ctx = client.make_context_from_tarball('path/to/tarball')
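
A minimal sketch of what one of these makers might do internally, using the standard library's tarfile module. Representing the context as a dict with a 'data' key is an assumption here, chosen to line up with the build() sketch further down:

    import io
    import tarfile

    def make_context_from_dir(path, dockerignore=None):
        # Stream the directory into an in-memory tar archive
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode='w') as tar:
            # A real implementation would filter out dockerignore
            # matches before adding each entry
            tar.add(path, arcname='.')
        buf.seek(0)
        return {'type': 'directory', 'data': buf,
                'dockerignore': dockerignore or []}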

After the user has obtained a context value, they can optionally tweak it some more before passing it along (for example, merging the dockerignore list from the context with another list, adding some files to the context data object, etc.).

And then the implementation of build() would look like:

    def build(self, context, build_spec, stream=True, timeout=60):
        # validate the context and the build_spec

        # then
        response = self._post(
            self._url('/build'),
            data=context['data'],  # put there by make_context_from_*
            params=build_spec,     # might need a transformation here
            headers=headers,       # e.g. the content encoding of the tarball
            stream=stream,
            timeout=timeout,
        )

        # now use the response as directed by the stream and timeout arguments
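
Putting the pieces together, a call site might then read (a sketch; the maker names are the ones proposed above):

    client = Client(**params)

    ctx = client.make_context_from_dir(path='my/path')
    spec = client.make_build_spec(tag='some_image', rm=True, nocache=False)

    response = client.build(ctx, spec, stream=True, timeout=60)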

Good, bad, ugly?
