How to determine which branch of a union was used #180

Open
jeroentervoorde opened this issue Feb 6, 2021 · 3 comments

Comments

@jeroentervoorde

I have a schema with a union of record types that all share the same layout. They differ only in record name.
For instance:
Added { payload: Record }
Deleted { payload: Record }
Updated { payload: Record }

I can parse these records, and that gives me a Union(Record[payload: Record]), but I can't find a way to determine whether it's Added, Deleted or Updated.

Is this a limitation of the library and, if so, what do you think about adding a reference to the schema or the schema name to the Record struct to make this possible? An alternative would be to add the index of the branch to Union so I can resolve it myself using the reader schema (a bit less user-friendly, but this won't cause any ownership issues, so it might be easier to implement).

Both options may break serde deserialization as well, I guess. Any ideas about that?
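
The second option (a branch index stored on Union) could look roughly like the sketch below. This is only an illustration: `Value`, `Schema`, `union_branch_name` and `example_schema` are simplified, hypothetical stand-ins written for this example, not the library's actual types or API, and the payload is a plain string here to keep it short.

```rust
// Hypothetical, simplified stand-ins for the library's value and schema types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Record(Vec<(String, Value)>),
    // Assumed shape: the index of the branch that was read, plus the value.
    Union(usize, Box<Value>),
}

#[derive(Debug, Clone, PartialEq)]
enum Schema {
    String,
    Record { name: String, fields: Vec<(String, Schema)> },
    Union(Vec<Schema>),
}

/// Looks up the record name of the union branch that produced `value`.
/// Returns None if the schema/value are not unions, the index is out of
/// range, or the selected branch is not a record.
fn union_branch_name<'a>(schema: &'a Schema, value: &Value) -> Option<&'a str> {
    match (schema, value) {
        (Schema::Union(branches), Value::Union(index, _)) => match branches.get(*index)? {
            Schema::Record { name, .. } => Some(name.as_str()),
            _ => None,
        },
        _ => None,
    }
}

/// Builds a union of three records with identical layout, as in the question.
fn example_schema() -> Schema {
    let record = |name: &str| Schema::Record {
        name: name.to_string(),
        fields: vec![("payload".to_string(), Schema::String)],
    };
    Schema::Union(vec![record("Added"), record("Deleted"), record("Updated")])
}
```

With something like this, the caller keeps the reader schema around and resolves the name only when needed, so the value itself never has to borrow from or clone the schema.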

@flavray
Owner

flavray commented Feb 7, 2021

This looks like one of the recurring issues that have been reported, and we still have no fix for it (yet!) after a few attempts :(

Issues #61 #95
Previous attempt #90

Let me know if you were referring to something else. 🙂

If you want to have a go at this, I'd be happy to review it any time you have something (even if not complete)! If not, I'll give it a go later on.

Regarding the implementation, I think adding the index in the Record would be a decent solution. We wouldn't have to deal with lifetime/cloning shenanigans, and we should have all the building blocks ready for that (albeit quite a lot of work is required to make it happen 😄)

@lerouxrgd
Contributor

How about using a dedicated enum as follows:

struct Record;

struct Added {
    payload: Record,
}

struct Deleted {
    payload: Record,
}

struct Updated {
    payload: Record,
}

enum UnionOperation {
    Added(Added),
    Updated(Updated),
    Deleted(Deleted),
}

@jeroentervoorde
Author

@flavray

No, I think that's the same issue. Sorry about that :)
Thanks for the pointers. That'll be very useful.

I'd like to take a stab at this, but I intend to wait until #99 is merged. If I can do something to help there, please let me know.

@lerouxrgd

This is indeed how I intend to deserialize this into a Rust type, but the problem I'm running into now is that the intermediate model (Value::Union and Value::Record specifically) does not contain the information needed to create the right variant of my enum, so that is what I want to change first.

I assume that I'll also need to change the serde deserialization code to use the additional information added to Union or Record to get that working.

I think I can either match the Avro record name to the Rust enum variant name, or use something like #[serde(tag = "recordName")] (as described here: https://serde.rs/enum-representations.html) if the Rust name doesn't match the Avro name. My schema contains fully qualified names, for instance, which I'd like to support.
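
A minimal sketch of the name-matching idea, assuming the record name of the selected branch has already been recovered somehow. It uses only the standard library and does not touch serde. `operation_from_name` is a hypothetical helper, `Record` is a placeholder payload, the enum is flattened compared to the one suggested above, and the `com.example` namespace in the usage note is made up.

```rust
// Placeholder for the shared payload record.
#[derive(Debug, PartialEq)]
struct Record;

// Flattened variant of the enum suggested earlier in the thread.
#[derive(Debug, PartialEq)]
enum UnionOperation {
    Added(Record),
    Updated(Record),
    Deleted(Record),
}

/// Picks the enum variant from an Avro record name.
/// Fully qualified names are accepted by comparing only the part after
/// the last '.', so the namespace is ignored in this sketch.
fn operation_from_name(avro_name: &str, payload: Record) -> Option<UnionOperation> {
    let simple_name = avro_name.rsplit('.').next().unwrap_or(avro_name);
    match simple_name {
        "Added" => Some(UnionOperation::Added(payload)),
        "Updated" => Some(UnionOperation::Updated(payload)),
        "Deleted" => Some(UnionOperation::Deleted(payload)),
        _ => None,
    }
}
```

For example, `operation_from_name("com.example.Updated", Record)` selects the `Updated` variant, and an unknown name yields `None`. Ignoring the namespace is a simplification; two records with the same simple name in different namespaces would need the full name in the match.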

3 participants