129 validate compose yml #1808

mnowster · 2015-08-04T16:52:12Z

Introducing schema validation for compose yml.

Using jsonschema, we define a schema and pass it to the library, which validates the config that the user has specified.

The validation errors that are raised replicate existing behaviour and print out human-readable error messages. They also cover new scenarios, providing helpful error messages rather than stack traces.

Functionality in here should address each of the points raised in this issue: #129
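For readers unfamiliar with the approach, here is a minimal sketch of schema validation with jsonschema. The schema is a hypothetical, heavily trimmed stand-in and not the contents of compose/config/schema.json; `validation_messages` is an illustrative name, not a function from this PR.

```python
from jsonschema import Draft4Validator

# Hypothetical, trimmed-down schema: every top-level key is a service name,
# and each service only allows a handful of options.
SCHEMA = {
    "type": "object",
    "patternProperties": {
        "^[a-zA-Z0-9._-]+$": {
            "type": "object",
            "properties": {
                "image": {"type": "string"},
                "build": {"type": "string"},
                "ports": {"type": "array", "uniqueItems": True},
            },
            "additionalProperties": False,
        }
    },
    "additionalProperties": False,
}


def validation_messages(config):
    """Return one readable line per schema violation instead of raising."""
    validator = Draft4Validator(SCHEMA)
    messages = []
    for error in validator.iter_errors(config):
        # error.path is a deque of keys/indexes leading to the bad value.
        location = ".".join(str(part) for part in error.path) or "<top level>"
        messages.append("%s: %s" % (location, error.message))
    return sorted(messages)
```

Collecting every error with `iter_errors` rather than stopping at the first one lets a user fix several mistakes in a single pass.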

✨ 🐈

aanand · 2015-08-04T17:15:03Z

Looking great.

thaJeztah · 2015-08-04T18:15:44Z

Nice!! thank you :-)

Define a schema that we can pass to jsonschema to validate against the config a user has supplied. This will help catch a wide variety of common errors that occur. If the config does not pass schema validation then it raises an exception and prints out human readable reasons. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

We validate the config against our schema before a service is created so checking whether a service name is valid at time of instantiation of the Service class is not needed. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Improves readability. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

jsonschema provides a rich error tree of info, by parsing each error we can pull out relevant info and re-write the error messages. This covers current error handling behaviour. This includes new error handling behaviour for types and formatting of the ports field. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

These functions weren't being called by anything. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Move validation out into its own file without causing circular import errors. Fix some of the tests to import from the right place. Also fix tests that were not using valid test data, as the validation schema is now firing telling you that you couldn't "just" have this dict without a build/image config key. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

mnowster · 2015-08-07T11:22:39Z

@aanand ok, I've squashed commits as far as I'm happy to without it becoming difficult to follow, imo.

aanand · 2015-08-07T11:49:12Z

compose/config/schema.json

+
+        "ports": {
+          "oneOf": [
+            {"type": "string", "format": "ports"},


We don't actually support a string value for ports - it must be a list.

Rather than implement the logic a second time, use docker-py's split_port function to test whether a ports value is valid. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>
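The idea of reusing an existing parser as a format check can be sketched as follows. To keep the example self-contained, `parse_port_stub` is a hypothetical and much less complete stand-in for docker-py's split_port, and `is_valid_port` is an illustrative name; the only assumption about the real parser is that it raises ValueError on bad input.

```python
import re

# Hypothetical stand-in parser: accepts "80", "8080:80",
# "127.0.0.1:8080:80" and an optional "/tcp" or "/udp" suffix.
_PORT_SPEC = re.compile(
    r"^(?:(?:\d{1,3}(?:\.\d{1,3}){3}:)?(\d*(?:-\d+)?):)?"
    r"(\d+(?:-\d+)?)(?:/(?:tcp|udp))?$"
)


def parse_port_stub(spec):
    """Return (container_port, host_port) or raise ValueError."""
    match = _PORT_SPEC.match(str(spec))
    if match is None:
        raise ValueError("Invalid port %r" % (spec,))
    return match.group(2), match.group(1)


def is_valid_port(spec, parse=parse_port_stub):
    """Format check that delegates to a parser instead of duplicating it."""
    try:
        parse(spec)
    except ValueError:
        return False
    return True
```

Because the check only depends on the parser raising ValueError, swapping the stub for the real function would not require changing `is_valid_port`.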

funkyfuture · 2015-08-07T15:47:17Z

ciao @mnowster, i was already working on a config-validation and formulated a lot of invalid service-dictionaries for that purpose. i tested these against your implementation and 80 of them failed. (and that's out of a total of 80, which baffles me all the more. but i checked with debug-statements twice.)
shall i open a pull request on your repo to integrate these?

i have mixed feelings here. i was about to do some work on my implementation, or rather on the validation-library it uses, first. (it deserves better handling of error-messages incl. l10n.) another point that imo makes that library preferable is the possibility to use yaml for the schema, but that may be possible with jsonschema too.

on the other hand i'm relieved that i don't necessarily have that on my agenda anymore. and your implementation certainly is less invasive to the config-module than mine. seems to have gotten some proper refactoring in the meantime. :-)

funkyfuture · 2015-08-07T16:21:43Z

oh, the heat here… i found out that my tests do not cover the code, as they assume that validation happens after calling ServiceLoader.make_service_dict.

and it's to be criticized that it doesn't, as config-mixins are left out of validation. furthermore it makes more sense if you use the ServiceLoader in client-code.

i'll look into the test-behaviour with config.load when it cooled down.

funkyfuture · 2015-08-08T13:16:11Z

i opened a pull request with test-generators that will test a bunch of valid and invalid configurations. of these tests, 42 are failing.

i also roughly reviewed what i did in #1355 so far and want to emphasize which checks are imo missing here. my paradigm for that is the assumption that any configuration-error can possibly lead to data-loss and thus any validation must take racetime-conditions into consideration and anticipate any possible human- or machine-generated mistake.

validate complete configuration including anything resolved into it (paths, extends-, environment-files and such)
validate values for keys with a restricted set of allowed values (capabilities, net, memory, etc)
validate accessibility for referenced filesystem-objects
validate legal posix-paths for mount-points in containers
validate network-configuration like hostnames, ips, matching port-ranges, …

furthermore a nice-to-have:

propose keys that best fuzzy-match an invalid config-key
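The fuzzy-matching idea in the last item could be sketched with the standard library's difflib. `KNOWN_KEYS` is a hypothetical, incomplete list of config keys and `suggest_key` is an illustrative name; nothing like this is part of the PR.

```python
import difflib

# Hypothetical subset of valid service keys, for illustration only.
KNOWN_KEYS = ["build", "command", "environment", "image", "links", "ports", "volumes"]


def suggest_key(invalid_key, known_keys=KNOWN_KEYS):
    """Return the closest known key, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(invalid_key, known_keys, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

A caller could append the suggestion to an "unsupported option" message when the result is not None.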

mnowster · 2015-08-10T10:31:01Z

Hi @funkyfuture,

Thank you for your contribution to the initial spike on the approach.

I reviewed both your initial PR and draghuram's with @aanand and @bfirsh, and we decided on the jsonschema approach. We agreed an initial scope: replace stack traces with better error messages, without going much further than what we already have with existing validation functionality, e.g. nothing extra for network config, as the docker daemon takes care of this.

The requirement to get validation schema in for 1.5 release was increased in priority, as is the way of open source projects, so I took over the issue.

Thanks again for contributing.

🎈

bfirsh · 2015-08-10T16:59:39Z

Does the schema get included in the Python source package? It might need adding to MANIFEST.in or something or other...

funkyfuture · 2015-08-10T18:35:52Z

probably it's not a good idea to discuss this here, but i don't want to open a new issue unless you actually regard my concerns as an issue (or if you prefer: user story).

first of all, let me understand you correctly @mnowster:

"The requirement to get validation schema in for 1.5 release was increased in priority, as is the way of open source projects"

there surely are diverse ways to develop an open-source-project. do you rather mean "as is the way of agile development"? that'd make sense to me. otherwise i'd respond that there is always an alternative. if you mean "agile" and if this is the paradigm, me and my team will reconsider whether Compose is still the horse to bet on for orchestration in our production environments.

"eg nothing extra for network config as the docker daemon takes care of this."

i remember this argument, but i also recall that it was regarded as a non-satisfying answer, because the Docker daemon handles single containers, not a set of 'em. thus, a misconfigured composition of containers can run into unknown outcomes if one container fails to start, but the others are started anyway.

to not validate anything that gets mixed in is imo a significant design flaw.

my perspective on this is one of a paranoid and extra-cautious sysadmin. is this perspective relevant for y'all? if so, what do you propose how to deal with it?

aanand · 2015-08-10T18:50:11Z

@funkyfuture You're right - this isn't the place to discuss whether or not to validate things on the client that are already validated on the daemon. That's a complicated question, and you should feel free to open a separate issue for that.

As @mnowster said, this PR is an incremental step that should result in a significant reduction in cryptic stack traces for Compose users. We don't need to have a completely laid-out plan for our general approach to validation in order to merge it.

mnowster · 2015-08-11T09:17:17Z

@bfirsh ah, that is a good point, I'll look into it, thank you.

While stricter validation was intended as a positive, it would in fact break backwards compatibility, which we do not want to do. Consider revisiting this later and including a deprecation warning if we want to be stricter. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Unfortunately, because jsonschema calls %r on its property and then encodes the complete message, I've had to remove the literal string prefix, u', manually. E.g. key = 'extends'; message = "Invalid value for %r" % key; error.message = message.encode("utf-8") results in "Invalid value for u'extends'". Performing a replace to strip out the extra "u'" does not change the encoding of the string; at this point it is just the character u followed by a '. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>
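A minimal sketch of that replace, assuming the message has already been rendered to text. `strip_unicode_prefix` is an illustrative name, and on Python 3 the u' prefix no longer appears in repr output, so the input here is a literal string.

```python
def strip_unicode_prefix(message):
    """Drop the u prefix that Python 2's %r leaves in front of quoted keys.

    This is a plain textual replace: any other occurrence of the two
    characters u' (for example a word ending in "u" right before a
    quote) is rewritten as well.
    """
    return message.replace("u'", "'")
```

For example, `strip_unicode_prefix("Invalid value for u'extends'")` gives `"Invalid value for 'extends'"`.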

The validation message was confusing because it displayed only one level of a service's properties, even if the error was another level down. E.g. if the 'files' property of 'extends' was in the incorrect format, it displayed "an invalid value for 'extends'" rather than correctly retrieving 'files'. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>
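One way to name the deepest failing key is sketched below. It assumes a jsonschema-style error path (a sequence of keys and list indexes whose first element is the service name); `describe_invalid_value` and the message wording are illustrative, not taken from the PR.

```python
from collections import deque


def describe_invalid_value(path):
    """Build a message naming the service and the deepest failing key."""
    parts = deque(path)
    service_name = parts.popleft()  # the first element is the service name
    # Skip list indexes so the message names a key, not a position.
    keys = [part for part in parts if not isinstance(part, int)]
    if not keys:
        return "Service '%s' has an invalid configuration" % service_name
    return "Service '%s' has an invalid value for '%s'" % (service_name, keys[-1])
```

With the path `["web", "extends", "files"]` this reports 'files' rather than 'extends'.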

When a schema type is set as unique, we should display the validation error to indicate that non-unique values have been provided for a key. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>
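To make such a message concrete, the repeated values can be listed explicitly. This is a sketch under assumptions: `non_unique_items` and `non_unique_message` are hypothetical helpers, and the wording is not the PR's.

```python
def non_unique_items(values):
    """Return each value that occurs more than once, in first-repeat order."""
    seen = []
    duplicates = []
    for value in values:
        # Lists are used instead of sets so unhashable items (dicts) work too.
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.append(value)
    return duplicates


def non_unique_message(service_name, key, values):
    """Describe which values of a unique-items key were repeated."""
    repeated = ", ".join(repr(value) for value in non_unique_items(values))
    return "Service '%s' key '%s' has non-unique values: %s" % (
        service_name, key, repeated)
```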

Tiny bit of refactoring to make it clearer and only pop service_name once. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

We use $ref in the schema to allow us to specify multiple types, e.g. command can be a string or a list of strings. It required some extra parsing to retrieve a helpful type to display in our error message rather than 'string or string', which, while correct, is not helpful. We value helpful. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>
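The kind of parsing involved might look like the following sketch, which follows local $ref pointers before reading the type names. The schema fragment, the definition names and both helper functions are assumptions made for illustration; the actual schema and parsing code in the PR may differ.

```python
# Hypothetical schema fragment: "command" may be a string or a list of strings.
SCHEMA = {
    "definitions": {
        "string_or_list": {
            "oneOf": [
                {"type": "string"},
                {"$ref": "#/definitions/list_of_strings"},
            ]
        },
        "list_of_strings": {"type": "array", "items": {"type": "string"}},
    },
    "properties": {"command": {"$ref": "#/definitions/string_or_list"}},
}


def resolve(schema, node):
    """Follow local '#/...' references until a concrete subschema is reached."""
    while "$ref" in node:
        target = schema
        for part in node["$ref"][len("#/"):].split("/"):
            target = target[part]
        node = target
    return node


def describe_types(schema, node):
    """Return the allowed types of a subschema as readable text."""
    node = resolve(schema, node)
    if "oneOf" in node:
        return " or ".join(describe_types(schema, option) for option in node["oneOf"])
    return node["type"]
```

Here `describe_types(SCHEMA, SCHEMA["properties"]["command"])` yields "string or array", which can then be placed in the error message.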

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

aanand · 2015-08-12T10:15:46Z

LGTM

129 validate compose yml

GordonTheTurtle added the status/0-triage label Aug 4, 2015

mnowster force-pushed the 129-validate-compose-yml branch from 7919143 to 6bd2819 Compare August 4, 2015 16:59

mnowster force-pushed the 129-validate-compose-yml branch from 6bd2819 to 489edf7 Compare August 5, 2015 10:24

mnowster added current sprint labels Aug 5, 2015

aanand mentioned this pull request Aug 6, 2015

Interpolate environment variables #1765

Merged

mnowster force-pushed the 129-validate-compose-yml branch 2 times, most recently from 6d233bc to 096fd41 Compare August 7, 2015 10:34

mnowster changed the title ~~WIP: 129 validate compose yml~~ 129 validate compose yml Aug 7, 2015

mnowster force-pushed the 129-validate-compose-yml branch from 096fd41 to 368639b Compare August 7, 2015 10:53

mnowster mentioned this pull request Aug 7, 2015

Validate Compose file and produce better error messages #129

Closed

7 tasks

mnowster force-pushed the 129-validate-compose-yml branch from 368639b to 4cb7dc0 Compare August 7, 2015 11:03

mnowster added 9 commits August 7, 2015 12:06

Replace service tests with config tests

76e6029

We validate the config against our schema before a service is created so checking whether a service name is valid at time of instantiation of the Service class is not needed. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Format validation of ports

6c7c598

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Order properties alphabetically

98c7a7d

Improves readability. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Include remaining valid config properties

8d66940

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Improve test coverage for validation

ea3608e

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Remove dead code

0557b5d

These functions weren't being called by anything. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

mnowster force-pushed the 129-validate-compose-yml branch from 4cb7dc0 to 2e428f9 Compare August 7, 2015 11:07

aanand reviewed Aug 7, 2015

Use split_port for ports format check

df74b13

Rather than implement the logic a second time, use docker-py's split_port function to test whether a ports value is valid. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

mnowster added this to the 1.5.0 milestone Aug 7, 2015

bfirsh removed the current sprint label Aug 10, 2015

mnowster added 6 commits August 11, 2015 12:01

Catch non-unique errors

df14a43

When a schema type is set as unique, we should display the validation error to indicate that non-unique values have been provided for a key. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

Clean up error.path handling

68de84a

Tiny bit of refactoring to make it clearer and only pop service_name once. Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

mnowster force-pushed the 129-validate-compose-yml branch from d7d47d1 to f8efb54 Compare August 11, 2015 12:08

aanand mentioned this pull request Aug 11, 2015

TypeError when service name is a number #1845

Closed

Include schema in manifest

810bb70

Signed-off-by: Mazz Mosley <mazz@houseofmnowster.com>

aanand added a commit that referenced this pull request Aug 12, 2015

Merge pull request #1808 from mnowster/129-validate-compose-yml

fb4c9fb

129 validate compose yml

aanand merged commit fb4c9fb into docker:master Aug 12, 2015

mnowster mentioned this pull request Aug 12, 2015

Adds test-generators for valid and invalid configuration files mnowster/compose#1

Closed

mnowster deleted the 129-validate-compose-yml branch August 12, 2015 15:07

mnowster mentioned this pull request Aug 19, 2015

Introduce schema for compose's configuration file. #1348

Closed

thaJeztah mentioned this pull request Aug 21, 2015

Strict yaml validation? docker/libcompose#34

Closed

This was referenced Aug 24, 2015

WIP: Validate file and services #1355

Closed

Hardcoded Values? #1057

Closed

This was referenced Sep 4, 2015

TypeError: can only concatenate list (not "int") to list #1096

Closed

Cannot service with an empty list of links #1086

Closed

aanand mentioned this pull request Oct 19, 2015

Environment interpolation regression in docker-compose.yml #2221

Closed
