Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use index based Table->File columns mapping. #6531

Closed
wants to merge 1 commit into from

Conversation

spershin
Copy link
Contributor

Summary:
Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: #6466

Differential Revision: D49117889

@netlify
Copy link

netlify bot commented Sep 12, 2023

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 06516a6
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6503476873d69b0008959839

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 12, 2023
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 12, 2023
Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Differential Revision: D49117889
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 13, 2023
Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Differential Revision: D49117889
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 13, 2023
Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Differential Revision: D49117889
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 14, 2023
Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889
spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 14, 2023
Summary:

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D49117889

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 14, 2023
Summary:
Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: 3daa876c1c186f09bd1abd2c7088bac5096f9d26
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 580e81a.

spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 14, 2023
Summary:

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987
spershin pushed a commit to spershin/velox-1 that referenced this pull request Sep 14, 2023
Summary:

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987
@conbench-facebook
Copy link

Conbench analyzed the 1 benchmark run on commit 580e81ab.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

facebook-github-bot pushed a commit that referenced this pull request Sep 15, 2023
Summary:
Pull Request resolved: #6579

Addressing lints and comments from D49117889 (#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: fbc92b32c0db9b3b284fa9c683fa78d7feeb43c3
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6531

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889

fbshipit-source-id: 423d94cb892003cf0ec705a936aac46869835f17
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6579

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: fbc92b32c0db9b3b284fa9c683fa78d7feeb43c3
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6531

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889

fbshipit-source-id: 423d94cb892003cf0ec705a936aac46869835f17
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6579

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: fbc92b32c0db9b3b284fa9c683fa78d7feeb43c3
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6531

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889

fbshipit-source-id: 423d94cb892003cf0ec705a936aac46869835f17
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6579

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: fbc92b32c0db9b3b284fa9c683fa78d7feeb43c3
ericyuliu pushed a commit to ericyuliu/velox that referenced this pull request Oct 12, 2023
Summary:
Pull Request resolved: facebookincubator#6531

Presto by default uses index based column mapping.
It has a connector config prop "hive.orc.use-column-names" to use names.
Make Velox use index based column mapping by default.
For this we simply rename fields in the file schema using table schema.
We also check the type compatibility of the fields.
Underneath the code still uses the name mapping, but since we adjusted
field names in the file schema using indices, it is equivalent to index mapping.
In the future the internal code might hande this on its own and this workaround
with column renaming might not be needed.
If "hive.orc.use-column-names" is passed - we don't perform renaming and
work as before.
GH issue: facebookincubator#6466

Reviewed By: Yuhta

Differential Revision: D49117889

fbshipit-source-id: 423d94cb892003cf0ec705a936aac46869835f17
ericyuliu pushed a commit to ericyuliu/velox that referenced this pull request Oct 12, 2023
Summary:
Pull Request resolved: facebookincubator#6579

Addressing lints and comments from D49117889 (facebookincubator#6531).
- Using TypePtr instead of std::shared_ptr<const Type> in some places.
- Throwing for bad indices in ArrayType::nameOf() and MapType::nameOf().
- Moving TypePtr in some places instead of copying.
- Fixing 'use after move'.

Reviewed By: Yuhta

Differential Revision: D49291987

fbshipit-source-id: fbc92b32c0db9b3b284fa9c683fa78d7feeb43c3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants