Releases: Strech/avrora
Configurable schemas auto-registration
It's been a while since we had a release. In this one, you will find many good changes.
New configuration option
Starting with this version, you can have an explicit split between the schema reader and the schema writer. This means that you don't need to keep local schema files if you would like to rely only on the schemas already registered in the registry.
config :avrora,
  # ...
  registry_schemas_autoreg: false
If you disable auto-registration and have the schema registry configured, two major behavior changes will happen:
- Local files will be completely ignored for schema resolution
- For encoding and decoding, the schema will be retrieved from the registry (see n.1)
When the schema registry is not configured, the behavior should remain the same 😉
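From the caller's perspective nothing changes. Here is a minimal sketch, assuming an io.confluent.Payment schema is already registered in the registry (the payload values are made up):

{:ok, pid} = Avrora.start_link()

# With registry_schemas_autoreg: false and a registry configured,
# the schema is fetched from the registry; local schema files are ignored
{:ok, encoded} =
  Avrora.encode(%{"id" => "tx-1", "amount" => 15.99}, schema_name: "io.confluent.Payment")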
New Avrora.Codec interface
The Encoder module got some attention and was refactored: it was split into several submodules which implement the same behavior, allowing the code to be reused and tested separately. Everything comes at a cost, but hopefully the pros and cons balance each other out.
Happy coding everyone and any feedback is welcome 🤗
Extract and Register schemas like a boss
In this release, 2 new features emerge: one in the public API of the library and one in the CLI capabilities.
Avrora.extract_schema/1
Extracts a schema from an encoded message. It is useful when you would like to have some metadata about the schema used to encode the message. All retrieved schemas will be cached according to the settings.
{:ok, pid} = Avrora.start_link()
message =
<<79, 98, 106, 1, 3, 204, 2, 20, 97, 118, 114, 111, 46, 99, 111, 100, 101, 99,
8, 110, 117, 108, 108, 22, 97, 118, 114, 111, 46, 115, 99, 104, 101, 109, 97,
144, 2, 123, 34, 110, 97, 109, 101, 115, 112, 97, 99, 101, 34, 58, 34, 105,
111, 46, 99, 111, 110, 102, 108, 117, 101, 110, 116, 34, 44, 34, 110, 97, 109,
101, 34, 58, 34, 80, 97, 121, 109, 101, 110, 116, 34, 44, 34, 116, 121, 112,
101, 34, 58, 34, 114, 101, 99, 111, 114, 100, 34, 44, 34, 102, 105, 101, 108,
100, 115, 34, 58, 91, 123, 34, 110, 97, 109, 101, 34, 58, 34, 105, 100, 34, 44,
34, 116, 121, 112, 101, 34, 58, 34, 115, 116, 114, 105, 110, 103, 34, 125, 44,
123, 34, 110, 97, 109, 101, 34, 58, 34, 97, 109, 111, 117, 110, 116, 34, 44,
34, 116, 121, 112, 101, 34, 58, 34, 100, 111, 117, 98, 108, 101, 34, 125, 93,
125, 0, 84, 229, 97, 195, 95, 74, 85, 204, 143, 132, 4, 241, 94, 197, 178, 106,
2, 26, 8, 116, 120, 45, 49, 123, 20, 174, 71, 225, 250, 47, 64, 84, 229, 97,
195, 95, 74, 85, 204, 143, 132, 4, 241, 94, 197, 178, 106>>
{:ok, schema} = Avrora.extract_schema(message)
{:ok,
%Avrora.Schema{
full_name: "io.confluent.Payment",
id: nil,
json: "{\"namespace\":\"io.confluent\",\"name\":\"Payment\",\"type\":\"record\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}",
lookup_table: #Reference<0.146116641.3853647878.152744>,
version: nil
}}
Many thanks to @apellizzn for the help!
mix avrora.reg.schema
A separate mix task to register a specific schema or all schemas found in the schemas folder.
For instance, if you configure the Avrora schemas folder to be at ./priv/schemas and you want to register the schema io/confluent/Payment.avsc, then you can use this command:
$ mix avrora.reg.schema --name io.confluent.Payment
schema `io.confluent.Payment` will be registered
NOTE: It will search for the schema file ./priv/schemas/io/confluent/Payment.avsc
If you would like to register all schemas found under ./priv/schemas, you can simply execute this command:
$ mix avrora.reg.schema --all
schema `io.confluent.Payment` will be registered
schema `io.confluent.Wrong' will be skipped due to an error `argument error'
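For reference, the folder the task scans is the regular Avrora schemas folder from your configuration; a minimal sketch, assuming the option is named schemas_path as in the README:

config :avrora,
  # ...
  schemas_path: Path.expand("./priv/schemas") # assumed option name; the folder scanned by the task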
I hope you enjoy it ❤️
Basic auth for Confluent Schema Registry
Starting with version 0.11.0, a new configuration option emerges and will accompany the registry_url option.
config :avrora,
  # ...
  registry_url: "http://...",
  registry_auth: {:basic, ["username", "password"]}
  # ...
Schema evolution (from 🐸 into 👨‍🚀)
This release is a fix for the schema evolution process.
Before v0.10.0, if you had the Schema Registry enabled, the very first schema version would be used forever, even if you updated the schema files and restarted the service.
Starting with v0.10.0, a few major changes happen.
The flow
The schema resolution flow has changed. Now, if your schema was never resolved (i.e. it was not found in Avrora.Storage.Memory), we will always read the schema file, no matter whether you have a version in the name or not.
Then, if you do have a version in the name (for instance, when you decode a message), we will check the registry and find the schema there.
If you don't have a version in the name (for instance, when you encode a message), we will try to register the schema in the registry. Luckily, the Confluent Schema Registry allows you to register the same schema many times since it verifies its hash sum (i.e. this is an idempotent operation).
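To make the flow concrete, here is a minimal sketch using the public API (the payload values are made up; a Payment schema is assumed):

{:ok, pid} = Avrora.start_link()

# Name without a version: the schema file is read and (idempotently) registered in the registry
{:ok, encoded} =
  Avrora.encode(%{"id" => "tx-1", "amount" => 15.99}, schema_name: "io.confluent.Payment")

# Decoding resolves the schema through the registry when one is configured
{:ok, decoded} = Avrora.decode(encoded, schema_name: "io.confluent.Payment")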
Everything described above leads to the next change, which I consider breaking.
The generic name resolution
The names cache TTL was changed from 5 minutes to infinity. Why? Simply because a name will always resolve to the latest available schema, and as long as it is compatible we are good. If it's not, you will have to re-deploy your code anyway (yes, hot-reload is still an open question; if you have suggestions or problems, feel free to create an issue).
And if you want periodic disk reads of your schemas, set the TTL to something lower than infinity. Nonetheless, it's a public interface change, so I call it breaking! Boooo 👻
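A minimal sketch of that TTL setting, assuming the option is called names_cache_ttl as in the README:

config :avrora,
  # ...
  names_cache_ttl: :timer.minutes(5) # assumed option name; the new default is :infinity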
Happy coding 👨‍💻 and don't forget to wash your hands ✋
P.S. Thanks @coryodaniel for the issue report and collaboration 🤗
Better erlavro compliance
This is a very minor release, changes are internal only.
To avoid a discrepancy between the erlavro and Avrora libraries, the ETS host was renamed and re-implemented to use the :avro_schema_store.new/0 call instead of :ets.new/2.
Under the hood, erlavro uses :ets.new/2, but now this responsibility is shifted from Avrora to erlavro.
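For illustration only (this is not the library's exact code), the change boils down to who creates the lookup table:

# Before: a raw ETS table was created by Avrora itself (table name is made up)
table = :ets.new(:avrora_lookup, [:public])

# After: the lookup table is created through erlavro's own store API
store = :avro_schema_store.new()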
Happy coding, everyone 👏
Fixed broken ETS references for short-living processes
It turns out that when Avrora is used inside Phoenix, whose controllers are short-living processes, all the lookup tables generated by Avrora.Schema (which are in fact Erlang term stores) are cleaned up once the controller finishes processing the request.
But all the resolved schemas are stored in Avrora.Storage.Memory, which means that after the Phoenix controller dies, all the ETS references inside the Memory module are broken.
This release introduces an ETS host process that will own all the generated stores.
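Conceptually it looks like the simplified sketch below (not the library's actual implementation): a long-living GenServer creates the tables, so their lifetime is no longer tied to a short-living caller.

defmodule EtsHost do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # The host creates and therefore owns the table,
  # so it survives the death of the calling process
  def new_table(host), do: GenServer.call(host, :new_table)

  @impl true
  def init(:ok), do: {:ok, []}

  @impl true
  def handle_call(:new_table, _from, tables) do
    table = :ets.new(:lookup_table, [:public])
    {:reply, table, [table | tables]}
  end
end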
New functionality
A test helper module was added. Since Avrora.Storage.Memory can share state between tests, it's always better to either mock it or clean its state. Here is an example:
defmodule MyTest do
  use ExUnit.Case, async: true
  import Avrora.TestCase

  setup :cleanup_storage!

  test "memory storage was filled" do
    assert Avrora.Storage.Memory.get("some") == nil
    Avrora.Storage.Memory.put("some", 42)
    assert Avrora.Storage.Memory.get("some") == 42
  end

  test "memory storage is clean" do
    assert Avrora.Storage.Memory.get("some") == nil
  end
end
Minor changes
- Some documentation improvements
- Avrora.Name was renamed to Avrora.Schema.Name
Performance and memory improvements
Thanks to @ananthakumaran, who spotted an issue with the new inter-schema references feature.
The issue became visible with big schema files: the reference collection process contained a bug which led to massive memory allocations during schema traversal. Now it has been fixed, yay 🙌
Happy 2020 🎉
Inter-schema references
This is a feature release 🎊
From the very beginning, this library was heavily inspired by avro_turf's simplicity and features. Now it's time to say that Avrora moves one step closer to the feature set avro_turf provides.
The must-have feature, inter-schema references, comes to Avrora. Now you can split your huge schema into smaller pieces and glue them together via references.
What is a reference?
A reference is the canonical full name of a schema. According to the Avrora name-to-location rules, if you have a schema under io/confluent/Message.avsc, its full name (namespace + name) will be io.confluent.Message.
How do references work?
Technically, the Avro specification doesn't support inter-schema references, only local-schema references. Because of this limitation, inter-schema references are implemented by embedding the referenced schema into the schema which contains the reference and replacing all other references within this schema with local references.
How to use references?
For example, you have a Messenger schema which contains references to the Message schema:
priv/schemas/io/confluent/Messenger.avsc
{
  "type": "record",
  "name": "Messenger",
  "namespace": "io.confluent",
  "fields": [
    {
      "name": "inbox",
      "type": {
        "type": "array",
        "items": "io.confluent.Message"
      }
    },
    {
      "name": "archive",
      "type": {
        "type": "array",
        "items": "io.confluent.Message"
      }
    }
  ]
}
priv/schemas/io/confluent/Message.avsc
{
  "type": "record",
  "name": "Message",
  "namespace": "io.confluent",
  "fields": [
    {
      "name": "text",
      "type": "string"
    }
  ]
}
The final compiled schema, which will be stored and registered in the Confluent Schema Registry, will look like this:
{
  "type": "record",
  "name": "Messenger",
  "namespace": "io.confluent",
  "fields": [
    {
      "name": "inbox",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "Message",
          "fields": [
            {
              "name": "text",
              "type": "string"
            }
          ]
        }
      }
    },
    {
      "name": "archive",
      "type": {
        "type": "array",
        "items": "Message"
      }
    }
  ]
}
💢 In the case of avro_turf, the archive field would keep its canonical items type reference io.confluent.Message instead of the local reference Message.
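Once both files live under the configured schemas folder, the split schemas are used through the public API like any other schema; a minimal sketch (the payload values are made up):

{:ok, pid} = Avrora.start_link()

# The io.confluent.Message reference is resolved and embedded automatically
{:ok, encoded} =
  Avrora.encode(
    %{"inbox" => [%{"text" => "hello"}], "archive" => []},
    schema_name: "io.confluent.Messenger"
  )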
Documentation improvements
In this minor release, the documentation was greatly improved by @reachfh: clearer descriptions, more precise statements, and better consistency.
Thanks ❤️
Fixed sub-type name resolution
When you define a complex schema with, for instance, an array of records and you want to re-use a type you created in that array definition, you can simply use the name of that record.
{
  "type": "record",
  "name": "Messenger",
  "namespace": "io.confluent",
  "fields": [
    {
      "name": "inbox",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "Message",
          "fields": [
            {
              "name": "text",
              "type": "string"
            }
          ]
        }
      }
    },
    {
      "name": "archive",
      "type": {
        "type": "array",
        "items": "io.confluent.Message"
      }
    }
  ]
}
But in Avrora v0.6 this would throw an error because the erlavro schema was stored and used as-is, while the Avro specification allows a record type to be defined once and then re-used during schema resolution.
Since version v0.7, schema resolution is done through erlavro's built-in schema storage mechanism, which takes care of all the type names you define in the schema.
Thanks ❤️