Define a config format #1

Closed
ferrouswheel opened this issue Dec 8, 2014 · 11 comments

@ferrouswheel
Contributor

build:
   # build this image, using the current dir as the docker context
   docker.example.com/app: .
   # example of building multiple images from dockerfile in different dir:
   #- docker.example.com/rabbitmq: containers/rabbitmq/.
   #- docker.example.com/couchdb: containers/couchdb/.

# All credentials that are used are defined. The UI will allow these
# to be filled in. They will be passed to any containers that need them.
# By default these are assumed to be unique and part of a set within a multi
# container setup.  e.g. if one container requires RABBIT_USER, then any other
# containers with credential RABBIT_USER will get the same value.
#
# All are provided to the container at run time as environment variables.
# FILE means a file will be uploaded and the environment variable will
# be the path to it.
#
# Some day these may alternatively be placed in a config system like etcd,
# or be placed in a config file format at a configurable location.
credentials:
    RABBIT_CERTIFICATE: FILE
    RABBIT_USER: VALUE
    RABBIT_PASSWORD: VALUE

# by default, all dependent containers will be built and be available
# and linked in as their label name
depends:
   rabbitmq:
      src: "git@github.com:docker-systems/rabbitmq.git"
      minhash: "deedbeef" # must have this commit
      # image should be inferred from the baleen.yml of the src repo
      # image: "docker.example.com/rabbitmq"
      #tag: v0.1.1
   db:
      # use official image from dockerhub
      image: "postgres"
   couchdb:
      image: "docker.example.com/couchdb"

# Need to define this better, what will be set up etc.
pre_test:
  - "echo 1"
  - "load up any data"

tests:
    # Tests should run with their minimal dependencies to avoid complexity
    links:
        # remap the link, otherwise uses --link db. --link rabbitmq is implicit
        - "db:db1"
    ports:
       - "8000:8000"
    env:
       TEST: 1
       # will also have credential variables defined above
    volumes:
       # preserve between runs so that multiple commands can have persistent effects
       - "/data"
    cmd:
       # each command is a separate container run, they will share the same
       # volume
       - "run_tests.sh"
       - "collect_coverage.sh"

# How to run integration tests?
# integration tests may require prompting from outside the system, whether it
# is from a browser, a selenium controlled browser like browserstack or other.
# To run them using baleen requires making another container that links all
# the dependencies together and then connects via running something within that
# integration container (e.g. a headless browser)

# All artifacts will be preserved and available for each build.
# Some artifacts like coverage % and test counts will be graphed.
# htmldir and files will be downloadable and served from baleen too
artifacts:
  xunit:
    python_xml: /data/xunit.xml
  coverage:
    python_xml: /data/coverage.xml
    htmldir: /data/htmlcov/
  documentation:
    htmldir: /data/mydocs
  pdf:
    file: /data/my.pdf

# custom or should we be more declarative?
deploy:
    - "docker push docker.example.com/app:latest"
    - "docker stop app"
    # question, should we try to define the parts like 
    - "docker run --rm --name app -t docker.example.com/app:latest"
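
As a rough illustration of the `credentials:` comments above, here is a minimal Python sketch of how VALUE and FILE credentials might be turned into environment variables for a container run. The function name `build_env`, the `supplied` mapping, and the `upload_dir` parameter are all hypothetical and not part of Baleen; the sketch only assumes that a FILE credential is stored on disk and exposed by path.

```python
from pathlib import Path
import tempfile

# Hypothetical in-memory form of the `credentials:` block above.
CREDENTIALS = {
    "RABBIT_CERTIFICATE": "FILE",
    "RABBIT_USER": "VALUE",
    "RABBIT_PASSWORD": "VALUE",
}


def build_env(credentials, supplied, upload_dir):
    """Return the environment variables for one container run.

    `supplied` maps credential names to what the user entered: a string for
    VALUE credentials, raw bytes for FILE credentials (an assumption made
    for this sketch).
    """
    env = {}
    for name, kind in credentials.items():
        if name not in supplied:
            raise KeyError(f"credential {name} has not been filled in")
        if kind == "VALUE":
            env[name] = supplied[name]
        elif kind == "FILE":
            # Store the uploaded file and expose its path, not its content.
            path = Path(upload_dir) / name
            path.write_bytes(supplied[name])
            env[name] = str(path)
        else:
            raise ValueError(f"unknown credential type {kind!r} for {name}")
    return env


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = build_env(
            CREDENTIALS,
            {
                "RABBIT_CERTIFICATE": b"-----BEGIN CERTIFICATE-----\n",
                "RABBIT_USER": "guest",
                "RABBIT_PASSWORD": "secret",
            },
            tmp,
        )
        print(env["RABBIT_USER"], env["RABBIT_CERTIFICATE"])
```

Because the same `supplied` mapping would be used for every container in a stack, two containers that both declare RABBIT_USER would receive the same value, which matches the sharing behaviour described in the comments.
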
@finlay

finlay commented Dec 8, 2014

Wow, you've been really busy!

Quite a lot to look at. We'll be having a look over the next couple of days.

@edwardabraham

Just an FYI, as noted by @vizowl, who pays attention to these things. Docker
are moving along with orchestration services:
http://blog.docker.com/2014/12/docker-announces-orchestration-for-multi-container-distributed-apps/

One new component is Docker Compose, which holds out the fabled "just a
few keystrokes":

"Defining a distributed application stack and its dependencies through a
simple YAML configuration file converts what was an incredibly complex
process into a simple one that can be executed in just a few keystrokes."

My guess is that this will help with the build step. Not much on
implementation yet, but it seems that it is based on this issue:
moby/moby#9459

@ferrouswheel
Contributor Author

Thanks! I saw that announcement and will want to try and keep things consistent with what they come up with. Ideally I can drop out some of the config, but I'm not sure if they'll provide what I'd see as the ideal CI for multi container applications.

Basically, I want to point Baleen at a git repo, have it read the config and be able to go fetch and build the rest of the docker containers in your stack (including versioning dependencies, which I haven't seen mentioned in group.yml yet).

I also kind of want to keep most of the configuration separate. So each git repo defines how to build, test, and deploy the containers it builds. The overall stack architecture would only be defined in as much as it's needed for testing the container.

@edwardabraham

@vizowl and I have been having a discussion, and this is where we have got to. Chris will have a go at translating your example into this format. We are imagining a more generic structure built around dependencies: your build, pre_test, tests, and artifacts steps would all have this same structure. Each label at the top level of the YAML file defines conditions that must be satisfied. If those conditions are not met, the command is run in the image once the label's dependencies are satisfied.

# All fields optional. The label is a name, that is used to refer to that dependency
LABEL: 
    conditions: [false] #define questions that are satisfied if they return true (0)
         - conditions schema
    image: [host] #the host
    cmd: [default] #the default command in a docker image
        - command schema
    wait: [0] #wait seconds before testing the condition
    depends: #list of labels
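
One possible reading of these semantics, as a minimal Python sketch rather than a definitive design: a label whose conditions already hold is skipped, otherwise its dependencies are satisfied first and then its command runs. The function `satisfy` and the `condition_met` and `run_cmd` callbacks are hypothetical hooks invented for this illustration, and the sketch assumes the `depends` graph has no cycles.

```python
import time


def satisfy(label, config, condition_met, run_cmd, done=None):
    """Make sure `label` is satisfied, running its command if needed.

    config maps each label to a dict with optional "conditions", "cmd",
    "wait" and "depends" keys. condition_met(conditions) -> bool and
    run_cmd(label, cmd) are hypothetical hooks supplied by the caller.
    """
    done = set() if done is None else done
    if label in done:
        return done
    spec = config[label]
    # `wait` seconds pass before the condition is tested (default 0).
    time.sleep(spec.get("wait", 0))
    conditions = spec.get("conditions")
    # A missing `conditions` entry is treated as false, so the command runs.
    if conditions and condition_met(conditions):
        done.add(label)
        return done
    # Conditions not met: satisfy the dependencies, then run the command.
    for dep in spec.get("depends", []):
        satisfy(dep, config, condition_met, run_cmd, done)
    run_cmd(label, spec.get("cmd"))
    done.add(label)
    return done


if __name__ == "__main__":
    config = {
        "db": {"conditions": {"container": "db"}, "cmd": "start db"},
        "tests": {"cmd": "run_tests.sh", "depends": ["db"]},
    }
    ran = []
    satisfy(
        "tests",
        config,
        condition_met=lambda conditions: False,  # pretend nothing exists yet
        run_cmd=lambda label, cmd: ran.append(label),
    )
    print(ran)
```

In this reading a dependency is only visited when the label's own conditions are not met, so an already-built image does not force its dependencies to be rechecked.
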

@ferrouswheel
Contributor Author

Looking forward to hearing further suggestions.

The definition will probably remain pluggable depending on what people need. For my initial work we'll just support using a fig file to run a test command in a container, so there'd be a single "Action" for "fig up".

Any new definition plugin will ultimately have to convert their actions into the underlying data model Baleen understands (Actions, ExpectedActionOutputs etc.). I'm attempting to tidy the data model a bit to make it easier to add new types of actions in future and will see if I can remove the db persistence around project actions. If the actions are defined in a static file, then we should be able to parse that whenever we need it.

I've already configured Projects to generate a project-specific ssh key and clone a repo, and moved the existing remotely executed ssh commands to their own Action type: RemoteSSHAction.

@vizowl

vizowl commented Dec 9, 2014

# maybe could define a namespace for container names
# in this case awesomeapp

# realised that conditions are actually optional - no conditions means that command
# is always run

rabbitmq:
  conditions:
    container: awesomeapp-rabbitmq
  cmd:
    docker_build_and_run: containers/rabbitmq

couchdb:
  conditions:
    container: awesomeapp-couchdb
  cmd:
    docker_build_and_run: containers/couchdb

db:
  conditions:
    container: awesomeapp-db
  cmd:
    docker_fetch_and_run: postgres

appimage:
  conditions:
    image: awesomeapp
  cmd:
    docker_build: app

pretest:
  cmd:
    shell: cleandb_and_load_fixtures.sh
  depends:
    - db

tests:
  cmd:
    container:
      image: awesomeapp
      name: awesomeapp-tests
      links:
        - "awesomeapp-db:db"
        - "awesomeapp-rabbitmq:rabbitmq"
        - "awesomeapp-couchdb:couchdb"
      ports:
        - "8000:8000"
      env:
        - TEST: 1
      volumes:
        - /data
      cmd: run_tests.sh
  depends:
    - pretest
    - couchdb
    - rabbitmq
    - appimage

coverage:
  # since we want coverage to run every time we ask for it, it would be easiest to not define
  # any conditions, and I don't think we actually need to list the coverage outputs here,
  # as it seems that is really for the web front end. However, if it turns out we do need
  # to define the conditions, then we could define an 'always' flag
  cmd:
    container:
      image: awesomeapp
      volumes:
        - "$PWD/testdata:/data"
      volumes_from:
        - awesomeapp-tests  # need logic to determine that awesomeapp-tests is not used after this and then delete it so that tests runs again
      cmd: collect_coverage.sh
  depends:
    - tests

deploy:
  cmd:
    container:
      image: awesomeapp
      env:
        - "SECRETS:SNEAKY_STUFF"
  depends:
    - appimage
    - db
    - rabbitmq
    - couchdb
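
To make the `depends` graph in this example concrete, here is a small Python sketch that works out which labels a given target needs and an order in which they could run. It copies the `depends` lists above by hand into a dict; `needed_for` and `run_order` are hypothetical helper names, and `graphlib.TopologicalSorter` is the standard-library class available from Python 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The `depends` lists from the example above, written out by hand.
DEPENDS = {
    "rabbitmq": [],
    "couchdb": [],
    "db": [],
    "appimage": [],
    "pretest": ["db"],
    "tests": ["pretest", "couchdb", "rabbitmq", "appimage"],
    "coverage": ["tests"],
    "deploy": ["appimage", "db", "rabbitmq", "couchdb"],
}


def needed_for(target, depends):
    """Return `target` plus everything it transitively depends on."""
    needed, stack = set(), [target]
    while stack:
        label = stack.pop()
        if label not in needed:
            needed.add(label)
            stack.extend(depends[label])
    return needed


def run_order(target, depends):
    """Order the labels needed for `target` so dependencies come first."""
    needed = needed_for(target, depends)
    graph = {label: depends[label] for label in needed}
    # static_order() yields each label after all of its predecessors and
    # raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(run_order("coverage", DEPENDS))
```

The relative order of independent labels such as `db` and `couchdb` is not fixed by the graph, so only the "dependencies first" property should be relied on.
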

@vizowl

vizowl commented Dec 9, 2014

A few other thoughts I had along the way

  • The config file should not assume the existence of a registry server; rather, that is an implementation detail of baleen itself. If a registry server is defined then baleen can use it to push images built on one host to another, but if it does not exist then it will just have to build images on the hosts that need them.
  • If a container has volumes and gets replaced then by default the containers should be preserved; we should add a flag to discard the volumes when the container image is updated.
  • This config assumes that if a container has a name then it is detached - that is probably too narrow.
  • The config file assumes that all the dependent containers for a given container are running on the same host. Again, this is probably too narrow.

@vizowl

vizowl commented Dec 9, 2014

In terms of how this would be used, I would imagine a set of tasks along the lines of:

git pull &&
baleen coverage &&
baleen staginghost deploy

@ferrouswheel
Contributor Author

Cool, thanks. I think this discussion might be easier to continue in person, one lunchtime or after work.

  • Making the dependency graph a bit more explicit for each action seems like it could be a good idea.
  • It'd be worth discussing what the config's goal is. I'm leaning towards purely the building and testing of containers, but nothing about running them in production (there are already competing tools for doing that).

Responding to your four points:

  • While the config defines docker.example.com/myapp it doesn't require the registry docker.example.com to exist, it just names the successful builds with the registry prefix. The deploy step can decide to push the image or not.
  • Sounds sensible, though I presume you mean "by default the volumes should be preserved"?
  • Might be better to have a container flag to indicate whether something is detached if we're trying to make the steps generic?
  • I'm happy to make this assumption for now as I want to use this purely for building and testing, whereas deployment and orchestration can happen independently (and be triggered by a successful build).

A couple of things about the direction we wanted to head:

The baleen.yml would link in other projects that are needed for a build. We have containers across repos and want to be able to set up a core project, then have the baleen.yml in each repo set up its own projects as they are included. That's the reason for:

depends:
   rabbitmq:
      src: "git@github.com:docker-systems/rabbitmq.git"
      minhash: "deedbeef"

This would automatically use the git@github.com:docker-systems/rabbitmq.git repo's baleen.yml to set up a new project inside baleen, and require a successful build with commit hash deedbeef or newer, before continuing with building awesomeapp.
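
The "commit hash deedbeef or newer" requirement amounts to an ancestry check: the built commit must be the minhash commit itself or a descendant of it. Below is a minimal Python sketch over an in-memory parent map; the function name `is_at_or_after` and the `parents` dict are hypothetical stand-ins for real repository history, and the sketch assumes hashes have already been resolved to a consistent form. Against an actual checkout, `git merge-base --is-ancestor <minhash> <built>` performs the equivalent test.

```python
def is_at_or_after(parents, minhash, built):
    """Return True if `built` is `minhash` or a descendant of it.

    parents maps a commit hash to the list of its parent hashes, a
    hypothetical in-memory stand-in for the repository history.
    """
    seen, stack = set(), [built]
    while stack:
        commit = stack.pop()
        if commit == minhash:
            return True
        if commit not in seen:
            seen.add(commit)
            # Walk back through every parent so merge commits are handled.
            stack.extend(parents.get(commit, []))
    return False


if __name__ == "__main__":
    history = {
        "c3": ["c2"],
        "c2": ["deedbeef"],
        "deedbeef": ["c0"],
        "c0": [],
    }
    print(is_at_or_after(history, "deedbeef", "c3"))
    print(is_at_or_after(history, "deedbeef", "c0"))
```

Using ancestry rather than commit timestamps means a build from an unrelated branch would not count as "newer", which seems closer to the intent of requiring a particular commit.
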

I didn't have any specific plans to make baleen a deployment tool as suggested in your last comment. We also wanted coverage and other build artifacts in the configuration because I don't want this to be a Python-specific system (indeed, it'd be mostly useless to us if it was!)

Having said that, I'm keen to make this a collaborative project that can be useful for more than just how one team does things :-)

@ferrouswheel
Contributor Author

Chatted with @holic and found that for our needs we'd probably be able to get away with using a fig.yml for describing the container start up (what's currently in the tests block).

Fig will also resolve environment variables from the host, so we can still keep credentials separate: docker/compose#297
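
A rough Python sketch of the idea, not of Fig's actual implementation: variables declared without a value are looked up in the host environment, so secrets never need to appear in the file. The function `resolve_environment`, the use of `None` to mean "take from host", and the choice to silently skip variables missing on the host are all assumptions made for this illustration; the real behaviour should be checked against the linked issue.

```python
import os


def resolve_environment(declared, host_env=None):
    """Fill in declared variables that have no value from the host.

    declared maps a variable name to a value, or to None when the value
    should come from the host environment (an assumption of this sketch).
    """
    host_env = os.environ if host_env is None else host_env
    resolved = {}
    for name, value in declared.items():
        if value is None:
            # Only pass the variable through if the host actually defines it.
            if name in host_env:
                resolved[name] = host_env[name]
        else:
            resolved[name] = str(value)
    return resolved


if __name__ == "__main__":
    declared = {"RABBIT_USER": None, "RABBIT_PASSWORD": None, "TEST": 1}
    host = {"RABBIT_USER": "guest", "RABBIT_PASSWORD": "secret"}
    print(resolve_environment(declared, host))
```
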

This doesn't negate the need for a file describing the build's other metadata (including the links to other repos with dependent projects to build).

@ferrouswheel
Contributor Author

I've implemented a baleen.yml format that is mostly functional, so am considering this issue closed. Alternative formats or improvements can be opened as new issues.
