Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlfluff lint says Found unparsable section if dbt source contains dash #3501

Closed
2 of 3 tasks
astrojuanlu opened this issue Jun 28, 2022 · 2 comments
Closed
2 of 3 tasks
Labels
bug Something isn't working

Comments

@astrojuanlu
Copy link

Search before asking

  • I searched the issues and found no similar issues.

What Happened

I have this file:

{{
  config(
    materialized='table'
  )
}}

SELECT
  CAST("timestamp" AS date) AS "timestamp"
  , distinct_id
  , event
  , COUNT(*) as "count"
FROM {{ source('tap-postgres', 'telemetry_events') }}
GROUP BY
  CAST("timestamp" AS date)
  , event
  , distinct_id

And running sqlfluff lint on it printed a number of lining problems (ok) and a parsing error (not ok):

sqlfluff lint events_per_day.sql --dialect ansi

== [meltano/transform/models/tap_postgres/events_per_day.sql] FAIL
L:   7 | P:   1 | L034 | Select wildcards then simple targets before calculations
                       | and aggregates.
L:   8 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:   8 | P:   8 | L059 | Unnecessary quoted identifier "timestamp".
L:   8 | P:  32 | L059 | Unnecessary quoted identifier "timestamp".
L:   9 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:   9 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  10 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:  10 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  11 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:  11 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  11 | P:  14 | L010 | Keywords must be consistently upper case.
L:  11 | P:  17 | L059 | Unnecessary quoted identifier "count".
L:  12 | P:   6 |  PRS | Line 8, Position 9: Found unparsable section:
                       | '-postgres_telemetry_events'
L:  14 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  14 | P:   8 | L059 | Unnecessary quoted identifier "timestamp".
L:  15 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  15 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  16 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  16 | P:   3 | L019 | Found leading comma. Expected only trailing.
WARNING: Parsing errors found and dialect is set to 'ansi'. Have you configured your dialect correctly?
All Finished 📜 🎉!

This model has been running fine so far. Changing the dash to an underscore makes the parsing error disappear:

--- a/events_per_day.sql
+++ b/events_per_day.sql
@@ -9,7 +9,7 @@ SELECT
   , event
   , distinct_id
   , COUNT(*) as "count"
-FROM {{ source('tap-postgres', 'telemetry_events') }}
+FROM {{ source('tap_postgres', 'telemetry_events') }}
 GROUP BY
   CAST("timestamp" AS date)
   , event

Expected Behaviour

sqlfluff lint does not give parsing errors for SQL files that look fine.

Observed Behaviour

(See above)

How to reproduce

(See above)

Dialect

ansi

Version

sqlfluff, version 1.0.0

(By the way, sqlfluff-templater-dbt was not listed as a dependency, just discovered it in this issue template ❓ installing version 1.0.0 didn't make any difference, the parsing is just slower)

dbt-core 1.1.1

Configuration

(Empty)

Are you willing to work on and submit a PR to address the issue?

  • Yes I am willing to submit a PR!

Code of Conduct

@astrojuanlu astrojuanlu added the bug Something isn't working label Jun 28, 2022
@astrojuanlu
Copy link
Author

I think the problem was that I was not setting

[sqlfluff]
templater = dbt

in the configuration. Now I see another bunch of errors that are unrelated (because I'm using dbt managed by meltano and everything is a tad more complicated).

I leave up to you the decision to do something with this issue or not.

@tunetheweb
Copy link
Member

Our ANSI dialect does not allow identifiers with dashes in them:

NakedIdentifierSegment=SegmentGenerator(
# Generate the anti template from the set of reserved keywords
lambda dialect: RegexParser(
r"[A-Z0-9_]*[A-Z][A-Z0-9_]*",
CodeSegment,
name="naked_identifier",
type="identifier",

Depending what dialect you're actually using, they might be allowed (e.g,. our bigquery dialect allows them).

Not sure if moving to dbt templater actually sorted this, or you just hid the error with other errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants