[CALCITE-6145] Function 'TRIM' without parameters throw NullPointerEx… #3869

Open
wants to merge 6 commits into
base: main
Changes from 2 commits
@@ -130,6 +130,9 @@ public SqlTrimFunction(String name, SqlKind kind,
if (operands[1] == null) {
operands[1] = SqlLiteral.createCharString(" ", pos);
}
if (operands[2] == null) {
Contributor

Something is wrong here. If you reached this point, you have one argument.

Contributor Author

@timgrein timgrein Jul 22, 2024

I think I have two (the flag for which side(s) to trim, and the string to trim from the 3rd operand), right?

For trim(both 'a' from 'aAa') you have three operands:

operand[0]: "BOTH"
operand[1]: "a"
operand[2]: "aAa"

Contributor

Yes, and the message says that there are "no arguments".

Contributor Author

Yup, currently adjusting this

Contributor Author

@timgrein timgrein Jul 22, 2024

Ok so I've debugged through this:

If you call trim() the following happens:

  • Enters the block for 3 operands (first gets a default value, second and third are null, not in the sense of their value, but of their absence)
  • operands[0] gets set to BOTH
  • operands[1] gets set to " "
  • operands[2] is null (indicating absence of the arg not that its actual value is null)

So the actual method call from a user perspective was one without arguments (trim()).

trim(" a "): works fine
trim(both " a "): works fine
trim("a" from): syntax error (string to trim is absent)
trim(both "a" from): syntax error (string to trim is absent)

So to handle the no-args case I think it's correct to check for if (operands[2] == null) in the block with 3 operands. I'll add a comment explaining it, otherwise it's confusing I think.
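
The flow described in this comment can be sketched roughly as follows. This is only an illustration under assumptions: it uses plain strings instead of Calcite's SqlNode and SqlLiteral types, the class and method names (TrimOperandDefaults, normalize) are hypothetical, and the exception message is a placeholder rather than the wording chosen in this PR.

```java
import java.util.Arrays;

// Hypothetical stand-in for the three-operand branch discussed above.
// A null slot means "argument was not written", not "SQL NULL value".
class TrimOperandDefaults {
  /** Returns a copy of the three-slot operand array with defaults filled in. */
  static String[] normalize(String[] operands) {
    if (operands.length != 3) {
      throw new IllegalArgumentException(
          "invalid operand count " + Arrays.toString(operands));
    }
    String[] result = operands.clone();
    if (result[0] == null) {
      result[0] = "BOTH"; // default trim flag
    }
    if (result[1] == null) {
      result[1] = " "; // default character to trim
    }
    if (result[2] == null) {
      // No sensible default exists: the user wrote trim() with no arguments.
      // Placeholder message, not the one used in the PR.
      throw new IllegalArgumentException("TRIM requires a string to trim from");
    }
    return result;
  }

  public static void main(String[] args) {
    // trim(" a "): only the third slot is supplied.
    System.out.println(Arrays.toString(normalize(new String[] {null, null, " a "})));
    try {
      // trim(): every slot is absent.
      normalize(new String[] {null, null, null});
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The point of the sketch is that the first two slots can always be defaulted, so the third slot being null is the only signal left that the call had no arguments.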

Contributor

Yes, the first two arguments have default values.

Contributor

@mihaibudiu @timgrein Perhaps we can refer to SparkSQL's exception information.
SparkSQL:

spark-sql> select trim();
Error in query: Invalid number of arguments for function trim. Expected: one of 1 and 2; Found: 0; line 1 pos 7
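
For comparison, a message in that style could be produced by a small arity check like the one below. This is a sketch only: ArityMessageSketch and checkArity are hypothetical names, not Calcite or Spark APIs, and the allowed counts (1 and 2) simply mirror the Spark message quoted above rather than Calcite's TRIM signature.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that reports an unexpected argument count
// in the style of the SparkSQL message quoted above.
class ArityMessageSketch {
  static void checkArity(String function, List<Integer> allowed, int found) {
    if (allowed.contains(found)) {
      return; // argument count is acceptable
    }
    String expected = allowed.stream()
        .map(String::valueOf)
        .collect(Collectors.joining(" and "));
    throw new IllegalArgumentException(
        "Invalid number of arguments for function " + function
            + ". Expected: one of " + expected + "; Found: " + found);
  }

  public static void main(String[] args) {
    try {
      checkArity("trim", Arrays.asList(1, 2), 0);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A message built this way names the function and the counts, which avoids the ambiguity of referring to a "null" string when no argument was written.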

throw new IllegalArgumentException("String to trim cannot be null");
Contributor

"The second argument of trim cannot be null"
It's not clear what "string" you are referring to.

Contributor Author

I think it's either the first or the third:

You can specify one argument (only the string to trim) or three (where to trim: leading, trailing, or both sides; what to trim from the string; and the string to trim).

Contributor Author

@timgrein timgrein Jul 22, 2024

"The second argument of trim cannot be null"

I think this is not 100% correct, as the string to trim from can be null: trim('a' from cast(null as varchar(1))) returns null, which is valid. I think it's more about the absence of the argument, which feels to me like a different thing than a value/field being null.

Ideally we should correctly fall through to the default case, or explicitly handle the cases where the string to trim is absent.
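
One way to picture the difference between an absent argument and a null value is the sketch below. It is an assumption-laden illustration, not how Calcite models operands: the Literal class, the use of Optional, and the error message are all hypothetical, and Java's String.trim() stands in for the SQL function.

```java
import java.util.Optional;

// Hypothetical model: Optional.empty() means the argument was not written,
// while a Literal holding null means the argument was written as SQL NULL.
class AbsentVersusNull {
  static final class Literal {
    final String value; // null models SQL NULL
    Literal(String value) {
      this.value = value;
    }
  }

  static String trim(Optional<Literal> arg) {
    // Absent argument: reject the call (placeholder message).
    Literal literal = arg.orElseThrow(
        () -> new IllegalArgumentException("TRIM cannot be called without arguments"));
    // Present but NULL: the result is NULL, which is valid.
    return literal.value == null ? null : literal.value.trim();
  }

  public static void main(String[] args) {
    System.out.println(trim(Optional.of(new Literal(" a "))));
    System.out.println(trim(Optional.of(new Literal(null))));
    try {
      trim(Optional.empty());
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```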

}
break;
default:
throw new IllegalArgumentException(
@@ -10843,6 +10843,7 @@ void assertSubFunReturns(boolean binary, String s, int start,
f.checkString("trim(trailing 'a' from 'aAa')", "aA", "VARCHAR(3) NOT NULL");
f.checkNull("trim(cast(null as varchar(1)) from 'a')");
f.checkNull("trim('a' from cast(null as varchar(1)))");
f.checkNull("trim(null)");

// SQL:2003 6.29.9: trim string must have length=1. Failure occurs
// at runtime.
@@ -10857,6 +10858,7 @@ void assertSubFunReturns(boolean binary, String s, int start,
f.checkFails("trim('' from 'abcde')",
"Trim error: trim character must be exactly 1 character",
true);
f.checkFails("trim()", "String to trim cannot be null", false);
Contributor

Is this the expected error for this function?
Or should the result be null?

Contributor Author

The creator of the original issue, https://issues.apache.org/jira/browse/CALCITE-6145, suggested throwing an exception with an exact description.

I've checked the behavior of PostgreSQL (v15):

SELECT TRIM() AS trimmed_string;

leads to a syntax error: Query Error: error: syntax error at or near ")".

SELECT TRIM("") AS trimmed_string;

leads to an error: Query Error: error: zero-length delimited identifier at or near """".

SELECT TRIM(null) AS trimmed_string;

returns null.

So I think this aligns with the behavior of PostgreSQL, which throws an error during query parsing, but I think the message could be a bit better. Maybe "trim cannot be called without arguments"?

I've added an explicit test for when the value is null, which indeed returns null (aligns with PostgreSQL's behavior).

WDYT? @mihaibudiu

Contributor

If the function trim() needs to have arguments, then the type checker should have complained about not finding a function with a signature without arguments. I don't expect the parser to complain about it; that is strange. The parser should just accept the function, and the type checker should validate it.

The original message is not good, since clearly there is no 'null' involved. However, if the error comes from the parser, there isn't much you can do about it.


final SqlOperatorFixture f1 = f.withConformance(SqlConformanceEnum.MYSQL_5);
f1.checkString("trim(leading 'eh' from 'hehe__hehe')", "__hehe",