Skip to content

Conversation

@danielcweeks
Copy link

No description provided.

@rdblue
Copy link
Contributor

rdblue commented Dec 5, 2015

@isnotinvain, you might want to have a look at this one.

I'll take a look next week as well. Is this something you'd like to get into 1.9.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale behind not using predicate pushdown by default? Not breaking existing jobs until it is well tested?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but if other who are more familiar with pig pushdown or filter api think it's worth turning on by default, I'm OK with that as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to have it on by default once we're confident.
Possibly once it has run in production and works well?

@rdblue
Copy link
Contributor

rdblue commented Dec 8, 2015

@danielcweeks, I found a few things in the filter generation code that looks odd to me so I updated it in my local copy. You might want to have a look at rdblue@a39fdff, which fixes most of the issues I pointed out. It doesn't support BOOLEAN yet, but that wouldn't be hard to add.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdblue (your comment here got stomped with the updates)

This was a mistake on my part and should have been new ColumnWrapper(booleanColumn(f.alias));

In a sense this introduces a "bug", but I'm not sure you can even represent that in pig latin, so it's more to improve readability and let pig sort out the conflict. Conceptually, I don't like it, but it makes it pretty clean.

@danielcweeks
Copy link
Author

@rdblue I'll take a look to see how boolean would fit into your branch and we can take a look at the two approaches.

}

private <C extends Comparable<C>> C getValue(Expression expr, Class<C> type) {
Comparable value = (Comparable) ((Const) expr).getValue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type.cast(((Const) expr).getValue())?

@julienledem
Copy link
Member

This looks good to me.

@rdblue
Copy link
Contributor

rdblue commented Feb 20, 2016

I think the only thing that still needed to be done was to throw exceptions rather than returning null when a filter can't be constructed. There was discussion about this above, but it was on an outdated diff.

I made those changes and pushed them to https://github.com/rdblue/parquet-mr/tree/PARQUET-397-pig-predicate-pushdown for review.

I also implemented not, !=, in, and between operations because it seems likely that those will be used. In and between build expressions based on comparison operators, or, and and.

@asfgit asfgit closed this in fb46b94 Feb 26, 2016
piyushnarang pushed a commit to piyushnarang/parquet-mr that referenced this pull request Jun 15, 2016
This is based on apache#296 from @danielcweeks and implements a few remaining review items.

Closes apache#296.

Author: Daniel Weeks <dweeks@netflix.com>
Author: Ryan Blue <blue@apache.org>

Closes apache#331 from rdblue/PARQUET-397-pig-predicate-pushdown and squashes the following commits:

c7a9b02 [Ryan Blue] PARQUET-397: Address review comments.
54e23a6 [Ryan Blue] PARQUET-397: Update Pig PPD to throw for bad expressions.
388099b [Daniel Weeks] Cleaning up imports
6b405b4 [Daniel Weeks] Merge remote-tracking branch 'rdblue/pig-predicate-pushdown' into pig-predicate-pushdown
f1ef73e [Daniel Weeks] Fixed binary type and storing filter predicate
a39fdff [Ryan Blue] WIP: Handle a few error cases in Pig predicate pushdown.
2666849 [Daniel Weeks] Fixed test to check the actual number of materialized rows from the reader
7b019a6 [Daniel Weeks] update tests and logging
f8ca447 [Daniel Weeks] Add predicate pushdown using filter2 api
rdblue pushed a commit to rdblue/parquet-mr that referenced this pull request Jul 13, 2016
This is based on apache#296 from @danielcweeks and implements a few remaining review items.

Closes apache#296.

Author: Daniel Weeks <dweeks@netflix.com>
Author: Ryan Blue <blue@apache.org>

Closes apache#331 from rdblue/PARQUET-397-pig-predicate-pushdown and squashes the following commits:

c7a9b02 [Ryan Blue] PARQUET-397: Address review comments.
54e23a6 [Ryan Blue] PARQUET-397: Update Pig PPD to throw for bad expressions.
388099b [Daniel Weeks] Cleaning up imports
6b405b4 [Daniel Weeks] Merge remote-tracking branch 'rdblue/pig-predicate-pushdown' into pig-predicate-pushdown
f1ef73e [Daniel Weeks] Fixed binary type and storing filter predicate
a39fdff [Ryan Blue] WIP: Handle a few error cases in Pig predicate pushdown.
2666849 [Daniel Weeks] Fixed test to check the actual number of materialized rows from the reader
7b019a6 [Daniel Weeks] update tests and logging
f8ca447 [Daniel Weeks] Add predicate pushdown using filter2 api

Conflicts:
	pom.xml
Resolution:
    Fixed up adjacent changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants