
Design proposal: make the docker-py interface more 'data oriented' #538

Open
moysesb opened this issue Mar 30, 2015 · 2 comments

Comments


moysesb commented Mar 30, 2015

TL;DR: Make client.build() feel more like client.create_container(), and then make create_container() feel more like the new build().

In its current form, the API exposed by docker-py is a direct one-to-one mapping over the names of the operations supported by the Docker Remote API. Since these names refer to operations that can be parametrized with query params, it would be nice if this parametrization were more explicit and dictionary-ish, so that it could be constructed independently of the call to the docker-py method that implements the operation.

For example, say I want to send a docker context to the daemon to build. The daemon accepts a number of options that control both the interpretation of the contents being sent and the properties of the image that will be generated from them. These options are formally documented in the Docker Remote API docs, so in principle it's safe to declare them as some kind of version-aware data structure in docker-py.

Right now, docker-py does not offer a way for a client application to declare what these options are, except implicitly in the call to client.build(). The build options are the named parameters of the build() method, which, when called, inspects the specific combination of arguments passed to decide how to construct the request to the daemon.

The proposal is to introduce a way for docker-py to construct that specific combination of parameters in a more explicit way, i.e., as a first class value that can be generated and passed around.

So, keeping the example of a 'build' operation, suppose I want to send a directory containing a Dockerfile to the daemon. The way it works now is:

    client = Client(**params)
    client.build(path='my/path', tag='some_image',
                 rm=True, nocache=False, stream=True)

The signature of build() is:

    def build(self, path=None, tag=None, quiet=False, fileobj=None,
              nocache=False, rm=False, stream=False, timeout=None,
              custom_context=False, encoding=None, pull=True,
              forcerm=False, dockerfile=None):

One obvious problem is that there are four distinct kinds of named arguments declared there:

  1. path, fileobj and custom_context control how the final context object is built locally.
  2. tag, quiet, nocache, rm, pull, forcerm and dockerfile control how the docker daemon will execute the actual build.
  3. stream and timeout specify the mechanics of the interaction with the server.
  4. encoding is an outlier: it appears to be an implementation detail, and it should arguably be calculated automatically from the contents of the generated context. Permissible values are gzip, bzip2, xz and identity.
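To make the grouping concrete, here is a hypothetical restructuring of the same keyword arguments into the four groups (the group names are illustrative, not part of docker-py):

```python
# Hypothetical regrouping of build()'s keyword arguments (illustrative only).
context_args = {"path": "my/path", "fileobj": None, "custom_context": False}
build_args = {"tag": "some_image", "quiet": False, "nocache": False,
              "rm": True, "pull": True, "forcerm": False, "dockerfile": None}
transport_args = {"stream": True, "timeout": None}
encoding_arg = {"encoding": None}  # could be inferred from the context contents

# Every keyword in the current signature falls into exactly one group.
all_args = {**context_args, **build_args, **transport_args, **encoding_arg}
```

Laid out this way, it is clear that only the second group ever reaches the daemon as query parameters; the rest is local or transport concern.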

The proposal:

a) The argument groups should be formally separated, so that the distinction is made declaratively.
b) User code should be able to create each of these groups of parameters independently of the call to build(), in an "API safe" way.

A sketch implementation:

  # This is the parametrization of the remote build job
  spec = BuildSpec(api_version='1.17')
  spec['tag'] = 'my_image'
  spec['nocache'] = True
  spec['forcerm'] = False
  spec['dockerfile'] = './mydockerfile'

The set of valid keys is specified in the Docker Remote API documentation, so there's a level of safety here. Ideally, this should not be an open dictionary, but something that docker-py knows how to construct safely, maybe via a function:

  # the advantage here is that this function can raise an error if the
  # kwargs are not valid for the current API (the one 'client' was
  # constructed with)
  spec = client.make_build_spec(**params)
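A minimal sketch of what such a factory could look like, assuming a per-version table of valid keys (make_build_spec and VALID_BUILD_KEYS are hypothetical names, not existing docker-py API):

```python
# Sketch of a version-aware build-spec factory (hypothetical; docker-py
# does not ship make_build_spec).
VALID_BUILD_KEYS = {
    # Parameters accepted by the build operation, per the Remote API docs
    # for the given version (subset shown for illustration).
    "1.17": {"tag", "quiet", "nocache", "rm", "pull", "forcerm", "dockerfile"},
}

def make_build_spec(api_version, **params):
    """Return a dict of build parameters, rejecting keys the given
    API version does not support."""
    valid = VALID_BUILD_KEYS[api_version]
    unknown = set(params) - valid
    if unknown:
        raise ValueError("invalid build parameters for API %s: %s"
                         % (api_version, ", ".join(sorted(unknown))))
    return dict(params)

spec = make_build_spec("1.17", tag="my_image", nocache=True)
```

Because the factory knows the API version up front, an invalid key fails loudly at spec-construction time rather than surfacing as a confusing daemon error later.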

The other argument type is the group of parameters used to control how the context is built locally by docker-py, namely path, fileobj, and custom_context. A data oriented alternative could be something like:

  # This is the parametrization of the call to docker-py's build()
  # This is the parametrization of the call to docker-py's build()
  ctx = Context(type='directory')
  ctx['dockerfile'] = 'a_docker_file_name'
  ctx['dockerignore'] = ['.git/', '**/*.ignorable']
  ctx['path'] = 'my/local/path'
  # etc.

But since the Docker Remote API mandates that the context be sent as a tar file (optionally compressed), docker-py should expose a 'maker' that returns a properly constructed tar object, one that is presumably valid given the API version in play, the content supplied, and so on. In the current implementation this knowledge (how to make a context) is buried in the imperative logic of the build() method itself.

Exposing the ability to create context values:

  # Expose a maker that builds a context from a local directory
  ctx = client.make_context_from_dir(path='my/path')

  # Here's another way
  ctx = client.make_context_from_dockerfile('path/to/dockerfile')

  # And another
  ctx = client.make_context_from_tarball('path/to/tarball')
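The first of these makers could be sketched with the standard library's tarfile module. Everything here is an assumption about shape, not real docker-py code; in particular the exclude parameter is a crude stand-in for real .dockerignore handling:

```python
import io
import tarfile

def make_context_from_dir(path, exclude=None):
    """Build an in-memory tar archive of `path`, suitable for POSTing
    as a build context. `exclude` is a set of top-level entry names to
    skip (a simplistic stand-in for .dockerignore support)."""
    exclude = exclude or set()

    def keep(ti):
        # Entry names look like '.', './Dockerfile', './sub/file'.
        parts = ti.name.split("/")
        top = parts[1] if len(parts) > 1 else ""
        return None if top in exclude else ti

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(path, arcname=".", filter=keep)
    buf.seek(0)
    # 'identity' (no compression) is one of the encodings the daemon accepts.
    return {"data": buf, "encoding": "identity"}
```

Note that tarfile's filter callback also prunes excluded directories recursively, which is exactly the behavior a dockerignore-style mechanism needs.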

After the user has obtained a context value, they can optionally tweak it further before passing it along (for example, merging the context's dockerignore list with another list, adding some files to the context data object, etc.).
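For instance, the dockerignore merge mentioned above might look like this (treating the context as a plain dict, as in the sketches above):

```python
# Hypothetical tweak step: merge the context's ignore list with extra
# patterns before handing the context to build().
ctx = {"dockerignore": [".git/", "**/*.ignorable"], "path": "my/local/path"}
extra_ignores = ["*.pyc", "build/"]
ctx["dockerignore"] = sorted(set(ctx["dockerignore"]) | set(extra_ignores))
```

This kind of manipulation is exactly what a first-class context value buys you: ordinary data operations instead of new keyword arguments on build().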

And then the implementation of build() would look like:

def build(self, context, build_spec, stream=True, timeout=60):
    # validate the context and the build_spec

    # then
    response = self._post(
        u,                        # the build endpoint URL
        data=context['data'],     # put there by make_context_from_*
        params=build_spec,        # might need a transformation here
        headers=headers,
        stream=stream,
        timeout=timeout,
    )

    # now use the response as directed by the stream and timeout arguments
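Put together, user code under this proposal might read as follows. StubClient and its methods are hypothetical stand-ins that only illustrate the call shape; a real client would validate the spec and POST the context to the daemon:

```python
# Minimal stub illustrating the proposed call shape (not a real client).
class StubClient:
    def make_build_spec(self, **params):
        # A real implementation would validate params against the API version.
        return dict(params)

    def make_context_from_dir(self, path):
        # A real implementation would return a tarred context.
        return {"data": b"<tarball bytes>", "path": path}

    def build(self, context, build_spec, stream=True, timeout=60):
        # A real implementation would POST context['data'] with
        # params=build_spec; the stub just echoes what it was given.
        yield {"context": context["path"], "spec": build_spec}

client = StubClient()
spec = client.make_build_spec(tag="my_image", nocache=True)
ctx = client.make_context_from_dir(path="my/path")
events = list(client.build(ctx, spec, stream=True))
```

The point is that spec and ctx exist as plain values before build() is ever called, so they can be constructed, inspected and reused independently.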

Good, bad, ugly?

Contributor

shin- commented Apr 22, 2015

Sorry for the lack of response; I haven't had much time to spend on the project recently. Things could definitely be better in that regard, but there is a lot to consider with regard to backwards compatibility, ease of use and familiarity. I'll give this proposal another look later and answer more thoroughly.

Author

moysesb commented Apr 22, 2015

@shin- If you'd like something concrete to look at while evaluating this proposal, there's an open PR in docker-compose with an implementation of the ideas presented here. Thanks!
