Add basic implementation of Unnest operator #27

Closed
wants to merge 1 commit into from

mbasmanova
Contributor

This is an initial cut which supports only one unnest column of type ARRAY and doesn't support the WITH ORDINALITY clause.

@mbasmanova requested review from kgpai and pedroerp August 11, 2021 21:59
@facebook-github-bot added the CLA Signed label Aug 11, 2021
@facebook-github-bot
Contributor

@mbasmanova has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D30264256

Contributor

@kgpai left a comment

LGTM.

@@ -34,4 +34,53 @@ const std::vector<std::shared_ptr<const PlanNode>>& ExchangeNode::sources()
  return EMPTY_SOURCES;
}

UnnestNode::UnnestNode(

Contributor

Basic question: does Unnest here mean Hive explode, or does it mean something else?

  std::vector<TypePtr> types;

  for (const auto& variable : replicateVariables_) {
    names.emplace_back(variable->name());

Contributor

nit: Might I suggest std::vector<std::pair<std::string, TypePtr>> so we can use an iterator to access both instead of an index?

Contributor Author

Type constructor takes two vectors, hence, no flexibility here.
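
The constraint mentioned in this reply can be illustrated with a small stand-in. The sketch below is not Velox code: `RowTypeSketch`, `Variable`, and `makeOutputType` are hypothetical names, and column types are represented as plain strings. It only shows why a loop would fill two parallel vectors when the row type's constructor takes names and types separately.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a row type whose constructor takes two
// parallel vectors: column names and column types.
struct RowTypeSketch {
  RowTypeSketch(std::vector<std::string> names, std::vector<std::string> types)
      : names(std::move(names)), types(std::move(types)) {
    // The two vectors must describe the same set of columns.
    assert(this->names.size() == this->types.size());
  }

  std::vector<std::string> names;
  std::vector<std::string> types;
};

// Hypothetical stand-in for a plan variable with a name and a type.
struct Variable {
  std::string name;
  std::string type;
};

// Collects names and types into two vectors in the same order, then hands
// both to the constructor.
RowTypeSketch makeOutputType(const std::vector<Variable>& variables) {
  std::vector<std::string> names;
  std::vector<std::string> types;
  names.reserve(variables.size());
  types.reserve(variables.size());
  for (const auto& variable : variables) {
    names.emplace_back(variable.name);
    types.emplace_back(variable.type);
  }
  return RowTypeSketch(std::move(names), std::move(types));
}
```

With a vector of pairs, the pairs would still have to be split into two vectors before calling such a constructor, so the parallel vectors avoid an extra pass.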

  ChannelIndex outputChannel = 0;
  for (const auto& variable : unnestNode->replicateVariables()) {
    identityProjections_.emplace_back(
        inputType->getChildIdx(variable->name()), outputChannel++);

Contributor

Can you explain what ChannelIndex and childIdx are here?

Contributor Author

It helps to think of the input data as a table made of columns. Each column has a name and a zero-based ordinal position called a channel. inputType is a RowType made of the column types. childIdx is the channel, and ChannelIndex is a typedef for the type of a channel.

using ChannelIndex = uint32_t;
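
The name-to-channel mapping described in this reply can be sketched without any Velox types. In the sketch below, `childIndex` and `makeIdentityProjections` are hypothetical helpers, and the input row type is reduced to a list of column names. Only the `ChannelIndex` alias comes from the discussion.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using ChannelIndex = uint32_t;

// Returns the zero-based position (channel) of the named column.
// Hypothetical helper; throws if the column does not exist.
ChannelIndex childIndex(
    const std::vector<std::string>& columnNames,
    const std::string& name) {
  for (ChannelIndex i = 0; i < columnNames.size(); ++i) {
    if (columnNames[i] == name) {
      return i;
    }
  }
  throw std::invalid_argument("Column not found: " + name);
}

// Pairs the input channel of each replicated column with consecutive
// output channels starting at 0.
std::vector<std::pair<ChannelIndex, ChannelIndex>> makeIdentityProjections(
    const std::vector<std::string>& inputColumns,
    const std::vector<std::string>& replicatedColumns) {
  std::vector<std::pair<ChannelIndex, ChannelIndex>> projections;
  ChannelIndex outputChannel = 0;
  for (const auto& name : replicatedColumns) {
    projections.emplace_back(childIndex(inputColumns, name), outputChannel++);
  }
  return projections;
}
```

For input columns `c0, c1, c2` and replicated columns `c2, c0`, this yields the pairs (2, 0) and (0, 1): input channel 2 is passed through as output channel 0, and input channel 0 as output channel 1.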

  // Create "indices" buffer to repeat rows as many times as there are elements
  // in the array.
  BufferPtr repeatedIndices =
      AlignedBuffer::allocate<vector_size_t>(numElements, pool());

Contributor

Is it possible for vector_size_t to be 0 here? Should we also do some max size check?

Contributor Author

vector_size_t is a type, hence, cannot be zero, but numElements can be zero. Nice catch. Will update the code to handle that case properly.
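
As a rough illustration of the "indices" buffer and the empty case discussed here, the sketch below uses `std::vector` in place of a Velox buffer. `buildRepeatedIndices` and its `arraySizes` parameter are hypothetical, `vector_size_t` is assumed to be a 32-bit signed integer, and the actual operator may handle `numElements == 0` differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for Velox's vector_size_t; assumed to be a 32-bit signed integer.
using vector_size_t = int32_t;

// For input row i with arraySizes[i] elements, emits index i that many
// times. Returns an empty vector when there are no elements at all, so the
// caller can skip producing output for the batch.
std::vector<vector_size_t> buildRepeatedIndices(
    const std::vector<vector_size_t>& arraySizes) {
  std::size_t numElements = 0;
  for (auto size : arraySizes) {
    assert(size >= 0);
    numElements += static_cast<std::size_t>(size);
  }

  // Explicit handling of the empty case: nothing to repeat.
  if (numElements == 0) {
    return {};
  }

  std::vector<vector_size_t> indices;
  indices.reserve(numElements);
  for (std::size_t row = 0; row < arraySizes.size(); ++row) {
    for (vector_size_t i = 0; i < arraySizes[row]; ++i) {
      indices.push_back(static_cast<vector_size_t>(row));
    }
  }
  return indices;
}
```

For array sizes `{2, 0, 3}` the result is `{0, 0, 2, 2, 2}`: row 0 is repeated twice, row 1 contributes nothing, and row 2 is repeated three times.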

  assertQuery(
      op,
      std::vector<std::shared_ptr<TempFilePath>>{},
      "SELECT c0, x FROM tmp, UNNEST(ARRAY[0, 1, 2]) as t(x)");

Contributor

Test for map?

Contributor Author

This initial cut of UNNEST doesn't support maps.

@mbasmanova
Contributor Author

@kgpai Krishna, thank you for the review. I replied to the questions and updated the code to handle numElements == 0 explicitly. Would you take another look?

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D30264256

Summary:
This is an initial cut which supports only one unnest column of type ARRAY and doesn't support the WITH ORDINALITY clause.

Pull Request resolved: facebookincubator/velox#27

Differential Revision: D30264256

Pulled By: mbasmanova

fbshipit-source-id: adf0c138d0a48a37437eb11e628c90532b21bdf5
@facebook-github-bot
Contributor

@mbasmanova merged this pull request in 545a610.

rui-mo pushed a commit to rui-mo/velox that referenced this pull request Jun 29, 2022
…incubator#27)

* Filter validation for Parquet reader at runtime

* Style

* Style

* Format