Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why does spark convert to_timestamp to cast syntax #4102

Closed
hellozepp opened this issue Sep 10, 2024 · 4 comments
Closed

Why does spark convert to_timestamp to cast syntax #4102

hellozepp opened this issue Sep 10, 2024 · 4 comments

Comments

@hellozepp
Copy link
Contributor

Before you file an issue
sqlglot.transpile(sql, read="postgres", write="spark")

Fully reproducible code snippet
presto sql as following:

select to_timestamp(1415792726123/1000)

My expected spark output:

select to_timestamp(1415792726123/1000)

Actual output:

SELECT CAST(FROM_UNIXTIME(1415792726123 / 1000) AS TIMESTAMP)

Spark's to_timestamp is supported since: 2.2.0, and my execution results are as follows:

+------------------------------------+
|to_timestamp((1415792726123 / 1000))|
+------------------------------------+
| 2014-11-12 19:45:...|
+------------------------------------+

24/09/10 14:16:36 INFO SparkUI: Stopped Spark web UI at http://192.168.15.150:4040

So, can Spark use to_timestamp directly?

@hellozepp
Copy link
Contributor Author

In addition, in sqlglot.dialects.spark2._unix_to_time_sql, from_unixtime uses lowercase function, which should be unified into uppercase.

@VaggelisD
Copy link
Collaborator

VaggelisD commented Sep 10, 2024

Hey @hellozepp, thanks for the report.

Regarding TO_TIMESTAMP, this is the PR that introduced it #1929. Is there any issue that arises with the cast solution? Although it could be roundtripped as you mention, I see that they produce the same results:

spark-sql (default)> SELECT CAST(1415792726123 / 1000 AS TIMESTAMP);
2014-11-12 13:45:26.123

spark-sql (default)> select to_timestamp(1415792726123/1000);
2014-11-12 13:45:26.123

Regarding FROM_UNIXTIME, the function name will be capitalized at the end, but if we raise a PR to preserve the roundtrip of TO_TIMESTAMP we can also change that for consistency purposes.

@hellozepp
Copy link
Contributor Author

@VaggelisD yes, just for consistency purposes

@VaggelisD
Copy link
Collaborator

Thanks again for the concern, I'll go ahead and close this then as the semantics of the roundtrip are preserved.

If you come across a use case where this cast (or in general, any such transformation) breaks the roundtrip we can reconsider the refactor, otherwise it's fine for now.

@VaggelisD VaggelisD closed this as not planned Won't fix, can't repro, duplicate, stale Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants