ARROW-11960: [C++][Gandiva] Support escape in LIKE #9700

Closed
wants to merge 4 commits into from

Crystrix
Contributor

Add the gdv_fn_like_utf8_utf8_int8 function to Gandiva to support an escape character in LIKE. The escape character is stored as an int8, which is compatible with the char type in C++.

Crystrix changed the title from "ARROW-11960: [Gandiva][C++]Support escape in LIKE" to "ARROW-11960: [C++][Gandiva]Support escape in LIKE" on Mar 14, 2021

@@ -100,6 +100,10 @@ std::vector<NativeFunction> GetStringFunctionRegistry() {
kResultNullIfNull, "gdv_fn_like_utf8_utf8",
NativeFunction::kNeedsFunctionHolder),

NativeFunction("like", {}, DataTypeVector{utf8(), utf8(), int8()}, boolean(),
Contributor

Why not use utf8 instead of int8? Otherwise it won't work with multibyte UTF-8 characters.
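
To illustrate the concern, here is a minimal sketch of why a non-ASCII character cannot be carried in a single int8: in UTF-8 it occupies two to four bytes. The helper name `Utf8SequenceLength` is hypothetical and is not part of Gandiva.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Gandiva): number of bytes in the UTF-8
// sequence that starts with `lead`, judged only from the lead byte's bit
// pattern. Returns 0 for a continuation byte or an otherwise invalid lead.
// It does not reject overlong encodings.
inline std::size_t Utf8SequenceLength(unsigned char lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII, fits in one int8
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;                             // 10xxxxxx or invalid
}
```

For example, a backslash is one byte, while "é" is encoded as the two bytes 0xC3 0xA9, so only the former can be passed as an int8 escape character.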

Contributor Author

Int8 is used here because the function SqlLikePatternToPcre only supports a single-byte ASCII char as the escape character.
Supporting a multibyte escape character may not be trivial. Maybe we can create a new issue and another PR to implement it?
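
As a rough illustration of what a single-byte escape character means for the pattern translation, the sketch below converts a LIKE pattern into a regular expression. It is not Gandiva's SqlLikePatternToPcre: the names `LikeToRegex` and `LikeMatch` are hypothetical, `std::regex` stands in for the real regex engine, and rejecting an escape that is not followed by `%`, `_`, or the escape itself is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical sketch, NOT Gandiva's SqlLikePatternToPcre.
// Translates a SQL LIKE pattern into an ECMAScript regex, treating `escape`
// as a single-byte escape character. Returns std::nullopt for a dangling
// escape or an escape followed by anything other than '%', '_' or itself.
// Limitations: '_' matches one byte rather than one UTF-8 character, and
// '.' in std::regex does not match line terminators.
inline std::optional<std::string> LikeToRegex(const std::string& like,
                                              char escape) {
  // Characters that need a backslash to be taken literally in a regex.
  static const std::string kRegexMeta = R"meta(\^$.|?*+()[]{})meta";
  std::string out;
  for (std::size_t i = 0; i < like.size(); ++i) {
    char c = like[i];
    bool escaped = false;
    if (c == escape) {
      if (i + 1 == like.size()) return std::nullopt;  // dangling escape
      c = like[++i];
      if (c != '%' && c != '_' && c != escape) return std::nullopt;
      escaped = true;
    }
    if (!escaped && c == '%') {
      out += ".*";  // any run of characters
    } else if (!escaped && c == '_') {
      out += '.';   // exactly one character (one byte here)
    } else {
      if (kRegexMeta.find(c) != std::string::npos) out += '\\';
      out += c;     // literal character
    }
  }
  return out;
}

// Convenience wrapper: true when `text` matches `like` as a whole.
inline bool LikeMatch(const std::string& text, const std::string& like,
                      char escape) {
  auto re = LikeToRegex(like, escape);
  return re.has_value() && std::regex_match(text, std::regex(*re));
}
```

Because the loop compares one byte at a time against `escape`, extending it to a multibyte escape character would require decoding UTF-8 sequences first, which is the non-trivial part mentioned above.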

Contributor

Sure, creating a separate ticket to support a multibyte escape character seems fine. I still think you should use utf8 in the signature instead of int8 and add a check that it has length 1, since the argument is a string type. Otherwise it won't work when you call the function from Java.
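
The suggested length check could look roughly like the sketch below: accept the escape argument as a string, require exactly one single-byte character, and hand that byte to the existing single-byte code path. `ParseEscapeChar` is a hypothetical name, and returning `std::optional` instead of an Arrow `Status` is a simplification made for this example.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical helper (not the code merged in this PR): validate an escape
// argument that arrives as a utf8 string and extract it as a single char.
// Returns std::nullopt when the argument is empty, longer than one byte,
// or a lone non-ASCII byte (which is not valid UTF-8 on its own).
inline std::optional<char> ParseEscapeChar(std::string_view escape) {
  if (escape.size() != 1) return std::nullopt;
  const unsigned char byte = static_cast<unsigned char>(escape[0]);
  if (byte >= 0x80) return std::nullopt;
  return escape[0];
}
```

With this shape, a caller that only has string literals can still pass the escape character, while multibyte input is rejected up front instead of being silently truncated.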

Contributor Author

The code has been updated: the int8 type is replaced with the utf8 type.

Contributor Author

Crystrix commented Jun 3, 2021

Hi @projjal, I've rebased this PR onto the master branch. The CI failures seem to be unrelated. Do you know who could help merge this PR?

Contributor

projjal commented Jun 4, 2021

@emkornfield Can you merge this PR?

kou changed the title from "ARROW-11960: [C++][Gandiva]Support escape in LIKE" to "ARROW-11960: [C++][Gandiva] Support escape in LIKE" on Jun 4, 2021

Member

kou commented Jun 4, 2021

I'll do it.

@kou kou closed this in ca66567 Jun 4, 2021
@Crystrix Crystrix deleted the arrow-11960 branch June 7, 2021 05:52
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
Add gdv_fn_like_utf8_utf8_int8 function in Gandiva to support escape char in LIKE. An escape char is stored in an int8 type which is compatible with char type in C++.

Closes apache#9700 from Crystrix/arrow-11960

Authored-by: crystrix <chenxi.li@live.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
jvictorhuguenin pushed a commit to s1mbi0se/arrow that referenced this pull request Sep 21, 2021
(cherry picked from commit ca66567)
zhouyuan pushed a commit to zhouyuan/arrow that referenced this pull request Nov 24, 2021
zhouyuan added a commit to oap-project/arrow that referenced this pull request Nov 25, 2021
* ARROW-11960: [C++][Gandiva] Support escape in LIKE

Add gdv_fn_like_utf8_utf8_int8 function in Gandiva to support escape char in LIKE. An escape char is stored in an int8 type which is compatible with char type in C++.

Closes apache#9700 from Crystrix/arrow-11960

Authored-by: crystrix <chenxi.li@live.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>

* ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function

Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits:

f160880 <frank400> Optimize holder constructor call
97e6e2d <frank400> Remove unnecessary Make method
c2363b1 <frank400> Disable TryOptimize for ilike
a484149 <frank400> Fix checkstyle on cmake file
c6a8372 <frank400> Delete unnecessary holder
4be6cc6 <frank400> Fix redefined function
b78085a <frank400> Fix miss include
2efd43e <frank400> Implement ilike function

Authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>

* ARROW-12410: [C++][Gandiva] Implement regexp_replace function on Gandiva

Closes apache#10059 from rodrigojdebem/feature/implement-regexp-replace and squashes the following commits:

baf2778 <rodrigojdebem> Add implementation for REGEXP_REPLACE

Authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
Signed-off-by: Praveen <praveen@dremio.com>

Co-authored-by: crystrix <chenxi.li@live.com>
Co-authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Co-authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
3 participants