regexp_match does not appear to return matches #5479

Closed
alamb opened this issue Mar 4, 2023 · 4 comments
Labels
bug Something isn't working

alamb commented Mar 4, 2023

Describe the bug
regexp_match appears to return an empty list rather than the part of the string matched.

To Reproduce

Using datafusion-cli

❯ create table foo as values ('foo'), ('bar'), ('foobar');
0 rows in set. Query took 0.000 seconds.
❯ select * from foo;
+---------+
| column1 |
+---------+
| foo     |
| bar     |
| foobar  |
+---------+
❯ select regexp_match(column1, 'foo') from foo;
+--------------------------------------+
| regexpmatch(foo.column1,Utf8("foo")) |
+--------------------------------------+
| []                                   | <-- note this is empty, not 'foo'!
|                                      |
| []                                   | <-- this too!
+--------------------------------------+
3 rows in set. Query took 0.001 seconds.

Expected behavior
I expect each of the two matching rows to return a list containing foo, as in Postgres:

postgres=# create table foo as values ('foo'), ('bar'), ('foobar');
SELECT 3
postgres=# select * from foo;
 column1
---------
 foo
 bar
 foobar
(3 rows)

postgres=# select regexp_match(column1, 'foo') from foo;
 regexp_match
--------------
 {foo}

 {foo}
(3 rows)
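
For reference, the expected semantics can be sketched in a few lines of Python. This is only an illustration of the Postgres-style behavior shown above, not DataFusion's or arrow-rs's implementation; the helper name `regexp_match` here is a hypothetical stand-in, and Python's `re` syntax differs in places from the Postgres and Rust regex dialects.

```python
import re
from typing import List, Optional


def regexp_match(value: Optional[str], pattern: str) -> Optional[List[Optional[str]]]:
    """Hypothetical stand-in mimicking Postgres-style regexp_match semantics."""
    if value is None:
        return None
    m = re.search(pattern, value)
    if m is None:
        # No match: the result is NULL, not an empty list.
        return None
    if m.groups():
        # Pattern has capture groups: return the captured substrings.
        return list(m.groups())
    # No capture groups: return the whole match as a one-element list.
    return [m.group(0)]


rows = ["foo", "bar", "foobar"]
results = [regexp_match(r, "foo") for r in rows]
print(results)  # [['foo'], None, ['foo']]
```

The buggy output above corresponds to returning an empty list where `['foo']` is expected.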

Additional context

Found by @sanderson on influxdata/docs-v2#4774

Jefffrey commented Mar 5, 2023

Upstream arrow-rs issue has been raised

Jefffrey commented Mar 7, 2023

Upstream issue fixed by apache/arrow-rs#3807

This issue now depends on the next arrow-rs version update.

@Jefffrey

Issue resolved as of latest main by arrow-rs update: #5685

DataFusion CLI v21.0.0
❯ create table foo as values ('foo'), ('bar'), ('foobar');
0 rows in set. Query took 0.009 seconds.
❯ select regexp_match(column1, 'foo') from foo;
+--------------------------------------+
| regexpmatch(foo.column1,Utf8("foo")) |
+--------------------------------------+
| [foo]                                |
|                                      |
| [foo]                                |
+--------------------------------------+
3 rows in set. Query took 0.004 seconds.
❯

alamb commented Apr 1, 2023

Thanks @Jefffrey -- closing

@alamb alamb closed this as completed Apr 1, 2023