Predicate to filter conversion. #62
Comments
Just curious, why do we need to convert the predicate to a Spark filter rather than just not ignoring the Parquet predicate in the BAM/SAM/VCF code path?
Ah, please disregard. I thought the predicate was being ignored for BAM/SAM data inside ADAM files (the name adamLoad is confusing in this respect).
I took an attempt at this here: https://github.com/hammerlab/adam/compare/predicate-bam?expand=1 Would love some feedback or alternative approaches. Right now this only supports boolean equality filters.
This looks good! Are you intentionally only starting with booleans for the
On Wed, Apr 16, 2014 at 2:54 PM, Arun Ahuja notifications@github.com wrote:
Yeah, it was for simplicity. I've now pushed an update with more general equality support. I hadn't realized that Parquet only supported equality filters. @fnothaft I saw you were the last contributor to Parquet on that. Is there support elsewhere for other types of ColumnFilters? Also, I see that what I did only supports ANDs of conditions.
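For readers following along, the equality-only, AND-only approach described above can be sketched in a few lines. This is only an illustration in plain Python, not the Scala code in the linked branch: `EqualityCondition`, `to_record_filter`, and the `mapped`/`contig` fields are hypothetical names, and a list of dicts stands in for an RDD of records.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Mapping


@dataclass(frozen=True)
class EqualityCondition:
    """Hypothetical stand-in for a column equality predicate (column == value)."""
    column: str
    value: Any


def to_record_filter(
    conditions: Iterable[EqualityCondition],
) -> Callable[[Mapping[str, Any]], bool]:
    """Turn a list of equality conditions into one record-level filter.

    The conditions are combined with AND only, mirroring the limitation
    mentioned in the comment above. An empty list keeps every record.
    """
    conds = tuple(conditions)

    def record_filter(record: Mapping[str, Any]) -> bool:
        # A record passes only if every named column is present and equal.
        return all(
            c.column in record and record[c.column] == c.value for c in conds
        )

    return record_filter


# Toy records; the field names are assumptions made for this example.
records = [
    {"mapped": True, "contig": "chr1"},
    {"mapped": False, "contig": "chr1"},
    {"mapped": True, "contig": "chr2"},
]

keep = to_record_filter(
    [EqualityCondition("mapped", True), EqualityCondition("contig", "chr1")]
)

# With Spark, `keep` would be the function handed to an RDD's filter method.
filtered = [r for r in records if keep(r)]
```

The same closure could be applied after loading BAM/SAM/VCF data, where no Parquet-level pushdown is available, so the caller sees the same records either way.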
Good stuff. I wonder if it would be worth it to generalize and let people
On Wed, Apr 16, 2014 at 7:32 PM, Arun Ahuja notifications@github.com wrote:
Nice! Indeed, this does look cool, and like a nice approach. @arahuja I've actually added code to Parquet that supports arbitrary predicate functions. These are the "applyFunctionTo*" predicates. If you've got any questions about the predicate functions, or would like any changes made to them, let me know. I'd be glad to help on this, or to work on anything that'd make the predicates easier to use.
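To illustrate how arbitrary predicate functions differ from equality-only conditions, here is a rough sketch in plain Python. It does not reproduce the signatures of the Parquet "applyFunctionTo*" predicates; `to_function_filter` and the `mapq`/`contig` fields are hypothetical, and a list of dicts stands in for the loaded records.

```python
from typing import Any, Callable, Mapping, Sequence, Tuple

# A column name paired with an arbitrary boolean test on that column's value.
ColumnPredicate = Tuple[str, Callable[[Any], bool]]


def to_function_filter(
    predicates: Sequence[ColumnPredicate],
) -> Callable[[Mapping[str, Any]], bool]:
    """Combine per-column predicate functions into one record-level filter.

    Each test may be any function of the column value (ranges, prefixes,
    and so on), not just equality. Tests are still combined with AND.
    """
    preds = tuple(predicates)

    def record_filter(record: Mapping[str, Any]) -> bool:
        # Records lacking a referenced column are rejected.
        return all(
            column in record and test(record[column]) for column, test in preds
        )

    return record_filter


# Toy records; the field names are assumptions made for this example.
records = [
    {"mapq": 10, "contig": "chr1"},
    {"mapq": 40, "contig": "chr1"},
    {"mapq": 60, "contig": "chr2"},
]

keep = to_function_filter(
    [
        ("mapq", lambda q: q >= 30),                   # range test
        ("contig", lambda name: name.endswith("1")),   # string test
    ]
)

high_quality = [r for r in records if keep(r)]
```

An equality condition is then just the special case `(column, lambda v: v == value)`, so one conversion path could serve both kinds of predicate.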
Closed by #234.
We need a way to convert Parquet predicates into Spark filters. This is needed for the adamRead methods for both read and variant data, because we currently ignore the predicate that is passed in when reading BAM/SAM/VCF data.