Add a typeids Vector to Union type #15598

Closed
asfimport opened this issue Aug 10, 2016 · 5 comments

asfimport commented Aug 10, 2016

enum UnionMode:int { Sparse, Dense }

table Union {
  mode: UnionMode;
  typeIds: [Int32]; // optional, describes typeid of each child.
}

The idea is to allow a type id that differs from the child offset (the default).
This enables an optimization where we use predefined ids when constructing the type vector of the union, while the children contain only the types actually used.

Reporter: Julien Le Dem / @julienledem
Assignee: Julien Le Dem / @julienledem

Note: This issue was originally created as ARROW-257. Please see the migration documentation for further details.

Wes McKinney / @wesm:
So if I understand correctly, suppose we had a union of 50 types, but only 5 of them actually occur in the data; then the typeIds would indicate the indices of the observed child types. That makes sense to me.
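
To make the 50-versus-5 scenario concrete, here is a minimal Python sketch of the writer side. It is not Arrow code: `build_union_children` and the sample ids are hypothetical, and keeping the children in ascending id order is an assumption made only for illustration.

```python
# Hypothetical setup: the union could hold 50 predefined types (ids 0..49),
# but only a handful of them occur in the data.

def build_union_children(type_vector):
    """Return (type_ids, child_index_of) for the ids actually observed.

    type_ids has one entry per child, so it can be stored in the proposed
    typeIds field; child_index_of maps a predefined id to its child slot.
    """
    # Keep only the ids that occur in the data (ascending order is an
    # assumption of this sketch, not something the issue prescribes).
    type_ids = sorted(set(type_vector))
    child_index_of = {type_id: i for i, type_id in enumerate(type_ids)}
    return type_ids, child_index_of


# The type vector keeps the predefined ids unchanged.
type_vector = [3, 17, 3, 42, 8, 17, 29]
type_ids, child_index_of = build_union_children(type_vector)
print(type_ids)            # [3, 8, 17, 29, 42]
print(child_index_of[17])  # 2
```

Only five children need to be materialized here, and the type vector does not have to be rewritten to use child offsets.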

Julien Le Dem / @julienledem:
PR: #143

Steven Phillips / @StevenMPhillips:
I don't understand the purpose or benefit of this change. Could you give a concrete example of where this would be useful?

Julien Le Dem / @julienledem:
The current Java implementation uses the ordinal in the MinorType to denote the type id in the type vector.
However, the Arrow spec defines it as the index in the children of the Field.
This JIRA is a way to reconcile the two.
When the Vector does not use the child index as a type id, it provides the ids in the typeIds field (typeIds has the same length as the children of the Field).
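
A rough sketch of how a reader might reconcile the two conventions, written in Python for brevity. The function name `resolve_child_indices` and its parameters are hypothetical and do not come from the Arrow codebase; the only rules taken from this thread are that the default type id is the child index and that typeIds, when present, has one entry per child.

```python
def resolve_child_indices(type_vector, num_children, type_ids=None):
    """Translate each entry of the type vector into a child index."""
    if type_ids is None:
        # Default convention: the type id is already the child index.
        for type_id in type_vector:
            if not 0 <= type_id < num_children:
                raise ValueError(f"type id {type_id} is not a valid child index")
        return list(type_vector)

    # typeIds must describe every child exactly once.
    if len(type_ids) != num_children:
        raise ValueError("typeIds must have the same length as the children")
    index_of = {type_id: i for i, type_id in enumerate(type_ids)}
    # An id missing from typeIds raises KeyError in this sketch.
    return [index_of[type_id] for type_id in type_vector]


# With typeIds: the ids 2, 5, 9 (e.g. enum ordinals) map to children 0, 1, 2.
print(resolve_child_indices([5, 2, 5, 9], 3, type_ids=[2, 5, 9]))  # [1, 0, 1, 2]
# Without typeIds: the type vector is used as-is.
print(resolve_child_indices([0, 1, 0], 2))  # [0, 1, 0]
```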

Julien Le Dem / @julienledem:
Issue resolved by pull request 143
#143
