HOCON schema is a type-safe data validation framework inspired by basho cuttlefish
There are two high level kinds of data types defined in HOCON schema: primitive
and complex
.
If we think of Erlang data structure as a 'tree', then primitive
types denote the 'leaves'
of the 'tree'. aka the terminal values.
While the complex
types denote the values which enclose either other complex
or primitive
values.
Most of the primitive
types are provided by (and also can be extended from) the typerefl library.
Typerefl is highly composible hence can be used to define complex types.
However in HOCON schema we only use it to define primitive
types.
Here is a list of the primitive
types for reference:
- enum: Enum is a list of Erlang atoms
- singleton: singleton is an Erlang atom
- integer: typerefl:integer()
- string: typerefl:string()
And an extended primitive type example: ip_port
-type ip_port() :: tuple().
-typerefl_from_string({ip_port/0, this_module, to_ip_port}).
to_ip_port(String) ->
case string:tokens(String) of
....
end.
HOCON schema supports 3 different complex types: struct
, array
, and union
.
NOTE: to make it easier for future extensions, it's recommended to use hoconsc
module APIs to define schema.
NOTE: HOCON schema does not support non-struct root level data types. e.g. it is not allowed to
define a root level schema with just a integer()
type.
Structs consist of data fields, which can be defined using hocon_schema
behaviour callbacks.
roots/0
: This callback returns all the root level fields.fields/1
: This callback returns the schema for each data field (in a list, so order matters).
For example, to define a struct named foo
having one integer field, the schema module may look like:
-export([roots/0, fields/1]).
roots() -> ["foo"]. %% 'exported' root names, equivalent to `[{"foo", hoconsc:ref("foo")}].`
fields("foo") -> [{"field1", typerefl:integer()}].
In this case, the schema for use in hocon_schema
APIs is the module name. There is another way to
define a struct as a Erlang map()
, so we do not have to implement the behaviour callbacks
(this is however mostly for test cases):
#{roots => ["foo"], %% 'exported' root names
fields => #{"foo" => [{"field1", typerefl:integer()}]}
}
In order to promote code abstraction and prevent copy-paste as much as possible, in HOCON schema, there is no way to define structs nested (child struct nested in a parent struct). The parent-children relationship has to be defined as struct 'referencing'.
e.g. if the type of parent-struct's field is another struct, the field's type should be defined as:
[ ...,
{field_N, hoconsc:ref("field_struct_name")},
...
].
The root struct name exported in the roots/0
API serves as top level struct's field names.
like listener
, zone
and broker
in etc/emqx.conf
.
Array is a sequence of other types which is defined as {array, Type}
.
A union type is in some contexts one_of
types.
When data is validated against the schema (recursively), the code enumerates
the union member types in the defined order until the given data matches any of the union member.
When starting a Erlang node it usually requires a system configuration file, (usually named sys.config
),
see Erlang doc for more details.
When using HOCON config format, we need a tool to transform a HOCON file to a config file of sys.config
format.
hocon_schema
is such a tool.
The content of the above mentioned config file for Erlang node to bootstrap is essentially an Erlang expression which evaluates to an Erlang term (the 'object' in Erlang).
To map HOCON objects (or their fields) to Erlang terms, we need to define a set of rules, such rules in HOCON schema is called 'mapping' rules.
This is when we need to introduce metadata to struct fields' schema.
The way to define a 'mapping' metadata is like below:
fields("struct_foo") ->
[ {field1, #{type => integer(),
mapping => "app_foo.field1"
}
]
This should map HOCON config {struct_foo: {field1: 12}}
to sys.config
like [{app_foo, [{field1, 12}]}]
.
Sometimes it's impossible to perform a perfect mapping from HOCON object to Erlang term.
This is when translation
is used.
Translations are defined as callback too, for example, if we want to translate
to config entries named 'min' and 'max' into a range tuple in sys.config
,
this schema below should do it.
-module(myapp_schema).
translation("foo") ->
[{"range", fun range/1}].
range(Conf) ->
Min = hocon_maps:get("foo.min", Conf),
Max = hocon_maps:get("foo.max", Conf),
case Min < Max of
true ->
{Min, Max};
_ ->
undefined
end.
As in the example, a translation callback is provided with the global config,
specific field values can be retrieved with hocon_maps:get
API.
Inter-field or even inter-object config validation can be done by implementing
the validations
optional callback.
Validations work similar to translations, only the OK (ok
or true
) return value is discarded
and failures are raised as exception in the map call.
NOTE: the integrity validation is performed after all fields are checked and converted.
Below is an example to ensure that the min
field is never greater than max
field.
-module(myapp_schema).
validations() ->
[{"min =< max", fun min_max/1}].
min_max(Conf) ->
Min = hocon_maps:get("foo.min", Conf),
Max = hocon_maps:get("foo.max", Conf),
case Min =< Max of
true -> ok %% return true | ok to pass this validation
false -> "min > max is not allowed" %% or If you need to return early, use throw(Reason)
end.
Besides fields' mapping
metadata, which is introduced above, for config mapping,
HOCON schema also supports below field metadata.
converter
: an anonymous function evaluated during config generation to convert the field value.validator
: field value validator, an anonymous function which should returntrue
orok
if the value is as expected. NOTE: the input to validator after convert (if present) is applied.default
: default value of the field. NOTE that default values are to be treated as raw inputs, meaning they are put through theconverter
s andvalidator
s etc, and then type-checked.required
: set tofalse
if this field is allowed to beundefined
. NOTE: there is no point setting it totrue
if fields has a default value.sensitive
: set totrue
if this field's value is sensitive so we will obfuscate the log with********
when logging.desc
: text for document generationhidden
: a boolean flag to hide it from appearing in config document
By default, a field (except for when it's inside an array element) can be overridden by an environment variable the name of which is translated from field's absolute path with dots replaced by double-underscores and then prepended with a prefix.
For example, the value of config entry foo.bar.field1
can be overridden by
PREFIX_FOO__BAR__FIELD1
, or PREFIX_foo_bar_field1
(i.e. not case-sensitive), where PREFIX_
is configurable by another environment variable HOCON_ENV_OVERRIDE_PREFIX
.
Define override_env
in struct field metadata.
Environment variables are not parsed as plain string, rather as HOCON values. This creates the flexibility for overriding config values in different ways:
- Set individual object paths, for example
export EMQX_MY__KEY__name=zz; export EMQX_MY__KEY__fingers=10
- Set the the entire object as escaped HOCON value:
export EMQX_MY__KEY="{name = \"zz\", fingers = 10}"
- Load the object from another file
export EMQX_MY__KEY="{\"include /config/my-key-override.conf\"}"
Using {include "path/to/file"}
is extremely useful to override a value with large object or an array.
NOTE: currently HOCON schema does not support array index (KEY__1
, KEY__2
etc) overrides.