@waynexia
Member

Which issue does this PR close?

  • Closes #.

Rationale for this change

Simplify dumb regex cases like ~ '.*' or !~ '.*'.

What changes are included in this PR?

Handle the special wildcard regex pattern in the expr_simplifier rule.

Are these changes tested?

Yes, via sqllogictests and unit tests

Are there any user-facing changes?

no

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Mar 19, 2025

if let Expr::Literal(ScalarValue::Utf8(Some(pattern))) = right.as_ref() {
// Handle the special case for ".*" pattern
if pattern == ".*" {
Contributor

@jayzhan211 jayzhan211 Mar 19, 2025

We can make this a const similar to COUNT_STAR_EXPANSION
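
A minimal sketch of what that suggestion could look like. The constant name `ANY_CHAR_REGEX_PATTERN` and the helper `is_wildcard_pattern` are hypothetical, chosen for illustration; the name actually used in the PR may differ.

```rust
/// Hypothetical name for the match-everything regex literal.
/// Hoisting it into a const avoids repeating the magic string ".*".
const ANY_CHAR_REGEX_PATTERN: &str = ".*";

/// Returns true when the regex literal is exactly the wildcard pattern.
/// (Illustrative helper, not part of DataFusion.)
fn is_wildcard_pattern(pattern: &str) -> bool {
    pattern == ANY_CHAR_REGEX_PATTERN
}
```

With this in place, the check in the diff above would read `if pattern == ANY_CHAR_REGEX_PATTERN { ... }`.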

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
right: empty_lit,
})
} else {
// always true
Contributor

I think this needs to handle nulls too as null ~ '.*' is not true (it is null)

> create or replace table foo(x varchar) as values (1), (2), (null);
0 row(s) fetched.
Elapsed 0.004 seconds.

> select x ~ '.*' from foo;
+--------------------+
| foo.x ~ Utf8(".*") |
+--------------------+
| true               |
| true               |
| NULL               |
+--------------------+
3 row(s) fetched.
Elapsed 0.016 seconds.

So maybe instead of lit(true) it is x.is_not_null() 🤔
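
To make the difference concrete, here is a small standalone model of the three predicates. It does not use DataFusion; `Option<bool>` stands in for SQL's three-valued boolean (`None` is NULL), and all function names are made up for this sketch.

```rust
/// SQL three-valued boolean: `None` models NULL.
type SqlBool = Option<bool>;

/// Original predicate `x ~ '.*'`: NULL input yields NULL,
/// and every non-NULL string matches.
fn matches_wildcard(x: Option<&str>) -> SqlBool {
    x.map(|_| true)
}

/// Candidate rewrite 1: replace the predicate with a literal `true`.
fn rewrite_lit_true(_x: Option<&str>) -> SqlBool {
    Some(true)
}

/// Candidate rewrite 2: replace the predicate with `x IS NOT NULL`,
/// which never returns NULL.
fn rewrite_is_not_null(x: Option<&str>) -> SqlBool {
    Some(x.is_some())
}

/// A WHERE clause keeps a row only when the predicate is exactly TRUE.
fn filter_keeps(p: SqlBool) -> bool {
    p == Some(true)
}

/// Counts the rows a predicate keeps under WHERE semantics.
fn rows_kept(rows: &[Option<&str>], pred: fn(Option<&str>) -> SqlBool) -> usize {
    rows.iter().filter(|x| filter_keeps(pred(**x))).count()
}
```

For the three rows in the table above (`'1'`, `'2'`, NULL), the original predicate and the `IS NOT NULL` rewrite both keep two rows in a filter, while the literal `true` rewrite keeps all three. Note that `IS NOT NULL` still differs from the original on the NULL row when the value is projected rather than filtered (false instead of NULL), so this model only shows equivalence under filter semantics.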

Member Author

Good catch! I added two cases about null in simplify_expr.slt; it should work as expected now.

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
40

query I
SELECT * FROM v1;
Contributor

If this result is not deterministic, we need rowsort for it

Member Author

Added in 95848ef

Contributor

@alamb alamb left a comment

Looks good to me -- thank you @waynexia

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@waynexia waynexia merged commit 4af5cfc into apache:main Mar 21, 2025
27 checks passed
@waynexia
Member Author

Thank you for reviewing @jayzhan211 @alamb ❤️

@waynexia waynexia deleted the simplify-regex branch March 21, 2025 22:06