enum UnionMode:int { Sparse, Dense }
table Union {
  mode: UnionMode;
  typeIds: [Int32]; // optional, describes the type id of each child.
}
The idea is to allow providing a type id that differs from the child offset (the default).
This enables an optimization where we use predefined ids when constructing the type vector of the union, but want the children to include only the types that are actually used.
Wes McKinney / @wesm:
So if I understand correctly, suppose we had a union of 50 types, but only 5 of them actually occur in the data; then the typeIds would indicate the indices of the observed child types. That makes sense to me.
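As a rough illustration of that scenario, here is a minimal Python sketch, not taken from any Arrow implementation. The names `PREDEFINED_TYPES` and `observed_children` and the placeholder type names are hypothetical; the sketch only assumes that the type vector stores predefined ids and that typeIds lists the ids of the children that are actually present.

```python
# Hypothetical stand-ins for the 50 predefined types in the example above.
PREDEFINED_TYPES = [f"type_{i}" for i in range(50)]


def observed_children(type_vector):
    """Return (type_ids, children) covering only the types that occur in the data."""
    # Each distinct predefined id found in the type vector becomes one child.
    type_ids = sorted(set(type_vector))
    children = [PREDEFINED_TYPES[t] for t in type_ids]
    return type_ids, children


# A type vector that uses only 5 of the 50 predefined ids.
type_vector = [3, 17, 3, 42, 8, 17, 29]
type_ids, children = observed_children(type_vector)
```

Under these assumptions the union carries 5 children instead of 50, and `type_ids` records which predefined id each child corresponds to.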
Steven Phillips / @StevenMPhillips:
I don't understand the purpose or benefit of this change. Could you give a concrete example of where this would be useful?
Julien Le Dem / @julienledem:
The current Java implementation uses the ordinal in the MinorType to denote the type id in the type vector.
However, the Arrow spec defines it as the index in the children of the Field.
This JIRA is a way to reconcile the two.
When the Vector is not using the child index as a type id, it provides the ids in the typeIds field (typeIds has the same length as the children in the Field).
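The lookup this implies might be sketched as follows. This is only an illustration in Python under the assumptions stated in the thread: `child_index` is a hypothetical helper, not an Arrow API, and it assumes that an absent typeIds means the type id is the child index itself.

```python
from typing import List, Optional


def child_index(type_id: int, num_children: int,
                type_ids: Optional[List[int]] = None) -> int:
    """Map a type id from the union's type vector to a child position."""
    if type_ids is None:
        # Default behaviour: the type id is the child offset.
        if not 0 <= type_id < num_children:
            raise ValueError(f"type id {type_id} out of range")
        return type_id
    # typeIds must have one entry per child of the Field.
    if len(type_ids) != num_children:
        raise ValueError("typeIds must be the same length as the children")
    # list.index raises ValueError if the id is not listed.
    return type_ids.index(type_id)
```

For example, `child_index(2, 3)` resolves to child 2 by default, while `child_index(17, 3, [3, 17, 42])` resolves to child 1 because 17 is the second entry in typeIds.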
Reporter: Julien Le Dem / @julienledem
Assignee: Julien Le Dem / @julienledem
Note: This issue was originally created as ARROW-257. Please see the migration documentation for further details.