Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Systematic validation of Synapse's config #12651

Open
DMRobertson opened this issue May 6, 2022 · 12 comments
Open

Systematic validation of Synapse's config #12651

DMRobertson opened this issue May 6, 2022 · 12 comments
Labels
A-Config Configuration, or the documentation thereof T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@DMRobertson
Copy link
Contributor

I'd love to see something does all of our config key checking and typechecking at startup time. Ideally this would give us useful type annotations which could flow through the rest of the program; but I think we should optimise for something which produces good error messages for the server owner.

@DMRobertson
Copy link
Contributor Author

Thoughts: jsonschema, pydantic. Other options?

@DMRobertson DMRobertson added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label May 6, 2022
@dklimpel
Copy link
Contributor

dklimpel commented May 6, 2022

Related to

@clokep
Copy link
Member

clokep commented May 9, 2022

Thoughts: jsonschema, pydantic. Other options?

We tend to use jsonschema for the config already.

See also, the validate_config function:

def validate_config(
json_schema: JsonDict, config: Any, config_path: Iterable[str]
) -> None:
"""Validates a config setting against a JsonSchema definition
This can be used to validate a section of the config file against a schema
definition. If the validation fails, a ConfigError is raised with a textual
description of the problem.
Args:
json_schema: the schema to validate against
config: the configuration value to be validated
config_path: the path within the config file. This will be used as a basis
for the error message.
"""
try:
jsonschema.validate(config, json_schema)
except jsonschema.ValidationError as e:
raise json_error_to_config_error(e, config_path)

@dklimpel
Copy link
Contributor

dklimpel commented May 9, 2022

I am not sure if this also related:

@DMRobertson
Copy link
Contributor Author

Thoughts: jsonschema, pydantic. Other options?

We tend to use jsonschema for the config already.

Maybe we just need more schemas for the config subsections then? (I see 42 matches for def read_config\( but only 11 for validate_config\(.)

With that said: for fuller annotations, we'd have to manually maintain a type in the schema and a type in the config class itself. (That's the only reason I mention pydantic.)

@richvdh
Copy link
Member

richvdh commented May 16, 2022

Maybe we just need more schemas for the config subsections then?

Yes, my vision here was that we would add more jsonschema validation over time, and that we should make sure new config sections at least have validation. Ultimately we might be able to join up the bits of jsonschema into a single big schema.

With that said: for fuller annotations, we'd have to manually maintain a type in the schema and a type in the config class itself.

I had a bit of a go at automatic validation when I wrote the parser for the oidc config settings (which parses the settings into a typed OidcProviderConfig object, so suffers exactly the duplication problem you describe). I decided that an explicit schema would be better, but I don't think I actually tried pydantic.

One of the problems is that there is often a bit of pre-parsing between reading the YAML and storing it in the Config class. That's not necessarily a barrier to doing things differently, though.

I wouldn't be averse to switching to pydantic if it

  • gave clear errors to the user - in particular, pointing to which bit of the config is invalid in a way that doesn't make it sound like a Synapse bug.
  • gives code at least as clear as the jsonschema-based solution. Not that that is a very high bar.

If you're interested in pursuing pydantic, it might be worth trying an alternative implementation for OIDCConfig, so we can compare and contrast?

@DMRobertson
Copy link
Contributor Author

DMRobertson commented May 19, 2022

If you're interested in pursuing pydantic, it might be worth trying an alternative implementation for OIDCConfig, so we can compare and contrast?

I'm giving this a go on https://github.com/matrix-org/synapse/tree/dmr/oidc-config-pydantic in spare moments. There's some investigation to be done though. pydantic/pydantic#4087 for starters!

@symphorien
Copy link

It would be nice if it was possible to validate the syntax of config file without running synapse (and possibly without having access to the network, database, certificates etc). It allows to validate config before restarting synapse and possibly having it not restart because of a typo. It would also allow config validation at "build time" before deployment for example on NixOS.

Random examples of prior work:
postgres -C
sshd -t
varnishd -C
nagios -v
collectd -t

nginx has a similar possibility but it insists in reading all certificates so you can't do the syntax check on another machine pre-deployment.

@DMRobertson
Copy link
Contributor Author

If you're interested in pursuing pydantic, it might be worth trying an alternative implementation for OIDCConfig, so we can compare and contrast?

I'm giving this a go on https://github.com/matrix-org/synapse/tree/dmr/oidc-config-pydantic in spare moments. There's some investigation to be done though. samuelcolvin/pydantic#4087 for starters!

https://github.com/matrix-org/synapse/blob/44ac98422ce242d2ca7b383a33274fb4f47b980b/synapse/config/oidc2.py is a first pass at this. There are plenty of TODOs and it's a bit more verbose than I'd've liked. Maybe it's enough to see if people like the syntax.

On error messages: I've been checking that the model behaves as I expect it to here: https://github.com/matrix-org/synapse/blob/44ac98422ce242d2ca7b383a33274fb4f47b980b/tests/config/test_oidc2.py . I'd like to tweak that test case so that it can be invoked in a way which will print out all ValidationErrors, so we can get a sense of how clear the error messages are.

@richvdh
Copy link
Member

richvdh commented May 23, 2022

It would be nice if it was possible to validate the syntax of config file without running synapse

@symphorien: it's not well documented, but you can:

python -m synapse.config -c homeserver.yaml

(or equivalent under poetry), which does exactly that.

Nevertheless we should replace it with a "check mode" for synapse itself (and the worker apps)

@symphorien

This comment was marked as off-topic.

@richvdh

This comment was marked as off-topic.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Config Configuration, or the documentation thereof T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

No branches or pull requests

6 participants