Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update AVRO mappings to match current event schema and type system. #497

Merged
merged 5 commits into from
Sep 14, 2019

Conversation

evankanderson
Copy link
Contributor

It looks like we didn't update the AVRO format to actually include a place to store data when we graduated the payload from a special attribute to a header + payload model.

Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
"null",
"string"
]
["null", "string"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can undo this JSON auto-fomatting if preferred.

avro-format.md Outdated Show resolved Hide resolved
avro-format.md Outdated Show resolved Hide resolved
"int",
"string",
"bytes",
"CloudEvent"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar with Avro... is the "CloudEvent" above replaced by the stuff on line 95? (the data/bytes thingy)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "CloudEvent" was a recursive allowed value of the same object, to support Map values.

Without Map, these structures don't need to be recursive in their definition, which makes me much more comfortable.

The actual diff here is +boolean -CloudEvent

@duglin duglin added the v1.0 label Sep 3, 2019
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
@clemensv
Copy link
Contributor

clemensv commented Sep 4, 2019

Even though we dropped 'Map' as a "portable" cross-encoding construct, JSON retains the ability to carry structured data and Avro should, too. The recursive schema (which is required for the encoder to work) enabled that. The Avro schema is not a strict schema in the sense that it accurately tracks all constraints of CloudEvents; it just makes the encoder work.

With the change you are now limiting "data" in Avro to what "data_base64" does in JSON.

In effect, I should be able to read a JSON event into an in-memory structure and then write that structure back out into Avro.

With limiting "data" to binary, you're now forcing me to pick an encoding for that structured data, and if that were Avro, I would have to make my own schema and if that were JSON then I would wonder why I am using Avro to begin with.

The "normal" case remains a self-contained event with data containing structured data and that needs to work in all encodings.

@evankanderson
Copy link
Contributor Author

I updated the .avsc file to define a union type of JSON Value and bytes, which is very unpretty because:

Unions may not immediately contain other unions

I'm also no Avro expert, so there may be a better way. Feel free to send a pull request with corrections against this branch if there are better ways to do this: https://github.com/evankanderson/spec/tree/hadoop-doop-doop

@evankanderson
Copy link
Contributor Author

(Tested via python and:

import avro.schema
avro.schema.parse(open("spec.avsc", "rb").read())

)

@evankanderson
Copy link
Contributor Author

Whoops, I left in my virtualenv... let me back that out.

…ed form

Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
avro-format.md Outdated
## 3 Data

Before encoding, the AVRO serializer MUST first determine the runtime data type
of the content. This may be determined by examining the data for invalid UTF-8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/may/might/ I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to "can" rather than "might".

Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
@duglin
Copy link
Collaborator

duglin commented Sep 14, 2019

Approved on the 9/12 - just waited for the RFC2119 keyword fix.
Thanks @evankanderson
Merging.

@duglin duglin merged commit ee9d415 into cloudevents:master Sep 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants