Include HiveColumnIndex in HiveColumnHandle to Velox #23130

Closed
yingsu00 opened this issue Jul 3, 2024 · 1 comment

yingsu00 commented Jul 3, 2024

Currently, the Velox HiveDataSource matches column names from the file (fileType) against the column names in the requested schema. These two names can differ. For example, the Presto Iceberg writer replaces a space with "_x20". To handle this, the Presto Parquet reader has a session property, "parquet_use_column_names", which defaults to false. When it is set to false, the hiveColumnIndex in HiveColumnHandle is used to map the schema column name to the actual column name in the file. However, this field is not sent to Velox. To fix the problem, we need to send this field to Velox.

The same needs to be done for IcebergColumnHandle's columnIdentity.id.
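
To illustrate the mismatch described above, here is a minimal sketch contrasting name-based and index-based column resolution. It is not Presto or Velox code: `ColumnHandleSketch`, `resolveByName`, and `resolveByIndex` are hypothetical names introduced only for this example, and the column names are made up.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a column handle. This is NOT the
// real HiveColumnHandle from Presto or Velox.
struct ColumnHandleSketch {
  std::string name;     // column name as it appears in the table schema
  int hiveColumnIndex;  // ordinal position of the column in the file
};

// Name-based lookup: returns the position of the file column whose name
// equals the schema name, or nullopt if the writer stored a different name.
std::optional<std::size_t> resolveByName(
    const std::vector<std::string>& fileColumns,
    const ColumnHandleSketch& handle) {
  for (std::size_t i = 0; i < fileColumns.size(); ++i) {
    if (fileColumns[i] == handle.name) {
      return i;
    }
  }
  return std::nullopt;
}

// Index-based lookup: ignores names and uses the ordinal carried by the
// handle, returning nullopt only when the index is out of range.
std::optional<std::size_t> resolveByIndex(
    const std::vector<std::string>& fileColumns,
    const ColumnHandleSketch& handle) {
  if (handle.hiveColumnIndex < 0 ||
      static_cast<std::size_t>(handle.hiveColumnIndex) >= fileColumns.size()) {
    return std::nullopt;
  }
  return static_cast<std::size_t>(handle.hiveColumnIndex);
}
```

With file columns `{"id", "first_x20name"}` and a handle for the schema column `"first name"` at index 1, the name-based lookup finds nothing, while the index-based lookup resolves to position 1.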

Expected Behavior or Use Case

Presto Component, Service, or Connector

Hive and Iceberg connector

Possible Implementation

Change presto_cpp/main/types/PrestoToVeloxConnector.cpp to add these fields.

yingsu00 commented

Closing this, as it turned out not to be needed: facebookincubator/velox#10085
