Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when formatting Spark SQL #746

Open
liyuanhao6 opened this issue Jun 12, 2024 · 5 comments
Open

Error when formatting Spark SQL #746

liyuanhao6 opened this issue Jun 12, 2024 · 5 comments

Comments

@liyuanhao6
Copy link

First Bug

set livy.session.conf.spark.driver.maxResultSize=8g;
set livy.session.conf.spark.driver.memory=12g;
set livy.session.conf.spark.driver.memoryOverhead=8g;
set livy.session.conf.spark.kryoserializer.buffer.max=512m;

Unable to format SQL: Error: Parse error: Unexpected "8g; set li" at line 1 column 50

We usually use the Spark configurations in the SQL codes, and it reports the bug. We suppose those configurations are useful, and could you please help us fix this bug?

Second Bug

WHERE
    dayno = ${v_day}

Unable to format SQL: Error: Parse error: Unexpected "${v_day} )" at line 112 column 17

We usually use shell parameter in the SQL codes, and it reports the bug. We suppose those parameters are useful, and could you please help us fix this bug?

@liyuanhao6 liyuanhao6 added the bug label Jun 12, 2024
@nene nene changed the title Unable to format SQL Error when formatting Spark SQL Jun 21, 2024
@nene
Copy link
Collaborator

nene commented Jun 21, 2024

Thanks for reporting.

  • The second problem can be dealt with the help of paramTypes configuration. See the FAQ.

  • The first problem is a case of Spark SQL syntax which I haven't yet seen. I'll look into fixing this. Though during summer I have little time for open source.

@amadeuspzs
Copy link

@liyuanhao6 if you are using VSCode try:

  "SQL-Formatter-VSCode.paramTypes": {
    "custom": [
      {
        "regex": "\\$?\\{[a-zA-Z0-9_]+\\}"
      }
    ]
  },

for the second problem.

@nene nene added the spark label Oct 26, 2024
@palinkasnorbert
Copy link

@amadeuspzs Thank you for the fix!

Interesting though, that I could format my sparksql code without this fix in the settings.
However after formatting our special param syntax, I'm getting an unnecessary whitespace after the last } as example below:

--- code before formatting:

create or replace view ${unity_catalog_name}.${environment_name}_sublayer.tablename 

--- code after formatting:

create or replace view ${unity_catalog_name}.${environment_name} _sublayer.tablename -- whitespace appears after closing '}'

@nene
Copy link
Collaborator

nene commented Nov 6, 2024

Yeah, this ${...} syntax is actually supported by default.

However the formatter assumes that ${...} constitutes some kind of identifier. If it works as just a plain string substitution, then all bets are off. In that case I guess you could also have other SQL syntax as part of these substitutions, in which case the formatter would have no idea how it should format it:

CREATE ${orReplace} ${entity} ${type};

But really I know very little about Spark. I'm not even sure if this ${...} syntax is part of Apache Spark itself or comes from some other tool.

@amadeuspzs
Copy link

We're using ${...} as a plain string substitution in SparkSQL cells in Azure Synapse Analytics, so yes, it acts as a substitution for any string.

The additional whitespace looks like a result of nene's explanation that the formatter assumes it's an identifier, rather than a general purpose string replacement.

This is why whitespace isn't introduced after a . e.g.

${one}.${two}

whereas

${one}_${two}

becomes

${one} _ ${two}

One workaround would be to use/derive a "whole word" parameter instead of partial one e.g.

spark.conf.set("environment_name_sublayer", f"{environment_name}_sublayer")

followed by

create or replace view ${unity_catalog_name}.${environment_name_sublayer}.tablename 

which yields

CREATE    OR REPLACE VIEW ${unity_catalog_name}.${environment_name_sublayer}.tablename

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants