Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In apache/datafusion#870, @b41sh added support for filtering values that do or do not match a particular regular expression. However, it uses the regexp_match kernel (the only one available at the time of writing), which returns the actual matches (as a ListArray) rather than a plain true/false BooleanArray indicating whether each row matched. This is suboptimal because:
It is more work to construct a ListArray than a BooleanArray
There is extra work to then turn the ListArray back into a BooleanArray
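To make the two costs above concrete, here is a minimal std-only sketch, not the arrow implementation. Plain slices and `Vec`s stand in for `ListArray` and `BooleanArray`, and `find_matches` is a hypothetical substring-based stand-in for a regex engine, so every name in it is an assumption made for illustration.

```rust
/// Hypothetical stand-in for a regex engine: returns the matched substring
/// (as a one-element list) or `None`. Plain substring search is used so the
/// sketch needs nothing beyond std.
fn find_matches(value: &str, pattern: &str) -> Option<Vec<String>> {
    value
        .find(pattern)
        .map(|start| vec![value[start..start + pattern.len()].to_string()])
}

/// Two-step path: build a list of matches per row, then collapse it.
fn matched_via_lists(values: &[&str], pattern: &str) -> Vec<bool> {
    // Step 1: allocate an optional list of matched strings for every row.
    let lists: Vec<Option<Vec<String>>> =
        values.iter().map(|v| find_matches(v, pattern)).collect();
    // Step 2: discard the strings and keep only "was there a match?".
    lists.iter().map(|m| m.is_some()).collect()
}

/// Direct path: compute one boolean per row, with no intermediate lists.
fn matched_directly(values: &[&str], pattern: &str) -> Vec<bool> {
    values.iter().map(|v| v.contains(pattern)).collect()
}
```

Both functions return the same booleans for the same input; the first simply allocates and then throws away the matched strings, which is the overhead a boolean-only kernel would avoid.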
Describe the solution you'd like
Add an arrow compute kernel (perhaps in the comparison module) that looks like
A better name is TBD: regexp_matches_utf8 is similar to like_utf8, but perhaps also too similar to regexp_match.
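As a rough illustration of the shape such a kernel could take, the sketch below models nullable string columns as slices of `Option<&str>` and the boolean result as `Vec<Option<bool>>`. It is not the arrow API: the name `regexp_matches_sketch`, the per-row pattern column, the `is_match` callback standing in for a regex test, and the choice to return null when either input is null are all assumptions made for illustration.

```rust
/// Hypothetical element-wise "does it match?" kernel.
/// `values` and `patterns` model nullable string columns; `is_match` is a
/// stand-in for a real regex test (assumption, not an arrow function).
fn regexp_matches_sketch<F>(
    values: &[Option<&str>],
    patterns: &[Option<&str>],
    is_match: F,
) -> Result<Vec<Option<bool>>, String>
where
    F: Fn(&str, &str) -> bool,
{
    // Element-wise kernels need both inputs to have the same length.
    if values.len() != patterns.len() {
        return Err(format!(
            "length mismatch: {} values vs {} patterns",
            values.len(),
            patterns.len()
        ));
    }
    Ok(values
        .iter()
        .copied()
        .zip(patterns.iter().copied())
        .map(|(value, pattern)| match (value, pattern) {
            // Both present: produce a plain true/false for this row.
            (Some(v), Some(p)) => Some(is_match(v, p)),
            // Assumed null handling: a null input gives a null output.
            _ => None,
        })
        .collect())
}
```

A caller would pass a real matcher in place of the callback; with `|v, p| v.contains(p)` as a toy matcher, each row yields `Some(true)`, `Some(false)`, or `None` without any list of matches being built.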
Where the resulting BooleanArray is [...] regex_match and like_utf8)
Describe alternatives you've considered
None yet
Additional context
See use in apache/datafusion#870