Skip to content

Conversation

@kiszk
Copy link
Member

@kiszk kiszk commented May 20, 2017

What changes were proposed in this pull request?

This PR adds built-in SQL function BIT_LENGTH(), CHAR_LENGTH(), and OCTET_LENGTH() functions.

BIT_LENGTH() returns the bit length of the given string or binary expression.
CHAR_LENGTH() returns the length of the given string or binary expression. (i.e. equal to LENGTH())
OCTET_LENGTH() returns the byte length of the given string or binary expression.

How was this patch tested?

Added new test suites for these three functions

@maropu
Copy link
Member

maropu commented May 20, 2017

It seems you have the wrong JIRA number. Also, you need to add tests in SQLQueryTestSuite. Thanks.

@SparkQA
Copy link

SparkQA commented May 21, 2017

Test build #77135 has finished for PR 18046 at commit 82ef305.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class BitLength(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
  • case class OctetLength(child: Expression) extends UnaryExpression with ImplicitCastInputTypes

@kiszk kiszk changed the title [SPARK-20746][SQL] Built-in SQL Function Support - all variants of LEN[GTH] [SPARK-20749][SQL] Built-in SQL Function Support - all variants of LEN[GTH] May 21, 2017
@kiszk
Copy link
Member Author

kiszk commented May 21, 2017

@maropu Thank you for pointing them out. I addressed both.

@SparkQA
Copy link

SparkQA commented May 21, 2017

Test build #77142 has started for PR 18046 at commit 5967d0f.

@kiszk
Copy link
Member Author

kiszk commented May 21, 2017

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented May 21, 2017

Test build #77143 has finished for PR 18046 at commit 5967d0f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it 9 bits?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. updated.

Copy link
Member

@viirya viirya May 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it common that length function is just an alias of char_length?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g., in mysql, length is an alias of octet_length.

Copy link
Member

@viirya viirya May 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems length is also an alias of char_length in Hive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that length is an alias of octet_length in mysql. In postgreSql, length is alias an of char_length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the ANSI SQL, LENGTH function as a synonym for CHAR_LENGTH.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you forget to update to operators.sql too?

Copy link
Member Author

@kiszk kiszk May 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use an example string which returns different results on character_length and octet_length?

Copy link
Member Author

@kiszk kiszk May 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test suites handles the case. I have just used a simple case like other operations in the file.

@SparkQA
Copy link

SparkQA commented May 21, 2017

Test build #77146 has started for PR 18046 at commit 400640a.

@viirya
Copy link
Member

viirya commented May 21, 2017

Can we also update the doc of Length? It is better to explicitly say it returns character length.

@kiszk
Copy link
Member Author

kiszk commented May 21, 2017

@viirya I agree with you. However, I leave it for consistency of documents among versions.
I would like to hear @gatorsmile 's opinion.

@kiszk
Copy link
Member Author

kiszk commented May 21, 2017

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented May 21, 2017

Test build #77159 has started for PR 18046 at commit 400640a.

@gatorsmile
Copy link
Member

gatorsmile commented May 22, 2017

@kiszk @viirya The document of length needs an update. Please always correct the function description if you find anything is incorrect. BTW, our current function description and test cases need more improvement. Please feel free to submit PRs to improve it. Thanks!

@SparkQA
Copy link

SparkQA commented May 22, 2017

Test build #77190 has finished for PR 18046 at commit 9bc1976.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Please run the following command to generate the result file:

SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite"

@SparkQA
Copy link

SparkQA commented May 23, 2017

Test build #77208 has finished for PR 18046 at commit 599fd31.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 23, 2017

Test build #77221 has finished for PR 18046 at commit e9acb63.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented May 23, 2017

@gatorsmile thank you for your suggestion.

@kiszk
Copy link
Member Author

kiszk commented May 25, 2017

@gatorsmile Do I have to do additional things for this PR?

@gatorsmile
Copy link
Member

@kiszk Could you resolve the conflict? Thanks!

@SparkQA
Copy link

SparkQA commented May 28, 2017

Test build #77479 has started for PR 18046 at commit bf958c5.

@kiszk
Copy link
Member Author

kiszk commented May 28, 2017

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented May 28, 2017

Test build #77482 has finished for PR 18046 at commit bf958c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@viirya viirya May 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A function that returns the char length of the given string expression or number of bytes of the given binary expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... that returns ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@SparkQA
Copy link

SparkQA commented May 28, 2017

Test build #77488 has finished for PR 18046 at commit 8ca04ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented May 30, 2017

ping @gatorsmile

1 similar comment
@kiszk
Copy link
Member Author

kiszk commented Jun 3, 2017

ping @gatorsmile

@SparkQA
Copy link

SparkQA commented Jun 12, 2017

Test build #77926 has finished for PR 18046 at commit e2bbc5a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 12, 2017

Test build #77927 has finished for PR 18046 at commit 550a950.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented Jun 12, 2017

ping @gatorsmile

@SparkQA
Copy link

SparkQA commented Jun 15, 2017

Test build #78083 has started for PR 18046 at commit 1e8cbcd.

@gatorsmile
Copy link
Member

retest this please

@gatorsmile
Copy link
Member

@kiszk Sorry, I missed this ping.

@SparkQA
Copy link

SparkQA commented Jun 16, 2017

Test build #78133 has finished for PR 18046 at commit 1e8cbcd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

LGTM

@gatorsmile
Copy link
Member

Thanks! Merging to master.

@asfgit asfgit closed this in 7a3e5dc Jun 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants