`sqlfluff lint` says `Found unparsable section` if dbt source contains dash #3501

astrojuanlu · 2022-06-28T07:02:54Z

Search before asking

I searched the issues and found no similar issues.

What Happened

I have this file:

{{
  config(
    materialized='table'
  )
}}

SELECT
  CAST("timestamp" AS date) AS "timestamp"
  , distinct_id
  , event
  , COUNT(*) as "count"
FROM {{ source('tap-postgres', 'telemetry_events') }}
GROUP BY
  CAST("timestamp" AS date)
  , event
  , distinct_id

And running sqlfluff lint on it printed a number of linting problems (ok) and a parsing error (not ok):

sqlfluff lint events_per_day.sql --dialect ansi

== [meltano/transform/models/tap_postgres/events_per_day.sql] FAIL
L:   7 | P:   1 | L034 | Select wildcards then simple targets before calculations
                       | and aggregates.
L:   8 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:   8 | P:   8 | L059 | Unnecessary quoted identifier "timestamp".
L:   8 | P:  32 | L059 | Unnecessary quoted identifier "timestamp".
L:   9 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:   9 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  10 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:  10 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  11 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 03]
L:  11 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  11 | P:  14 | L010 | Keywords must be consistently upper case.
L:  11 | P:  17 | L059 | Unnecessary quoted identifier "count".
L:  12 | P:   6 |  PRS | Line 8, Position 9: Found unparsable section:
                       | '-postgres_telemetry_events'
L:  14 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  14 | P:   8 | L059 | Unnecessary quoted identifier "timestamp".
L:  15 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  15 | P:   3 | L019 | Found leading comma. Expected only trailing.
L:  16 | P:   3 | L003 | Expected 1 indentations, found 0 [compared to line 09]
L:  16 | P:   3 | L019 | Found leading comma. Expected only trailing.
WARNING: Parsing errors found and dialect is set to 'ansi'. Have you configured your dialect correctly?
All Finished 📜 🎉!

This model has been running fine so far. Changing the dash to an underscore makes the parsing error disappear:

--- a/events_per_day.sql
+++ b/events_per_day.sql
@@ -9,7 +9,7 @@ SELECT
   , event
   , distinct_id
   , COUNT(*) as "count"
-FROM {{ source('tap-postgres', 'telemetry_events') }}
+FROM {{ source('tap_postgres', 'telemetry_events') }}
 GROUP BY
   CAST("timestamp" AS date)
   , event

Expected Behaviour

sqlfluff lint does not give parsing errors for SQL files that look fine.

Observed Behaviour

(See above)

How to reproduce

(See above)

Dialect

ansi

Version

sqlfluff, version 1.0.0

(By the way, sqlfluff-templater-dbt was not listed as a dependency; I only discovered it in this issue template ❓ Installing version 1.0.0 didn't make any difference, the parsing is just slower.)

dbt-core 1.1.1

Configuration

(Empty)

Are you willing to work on and submit a PR to address the issue?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct

astrojuanlu · 2022-06-28T07:31:08Z

I think the problem was that I was not setting

[sqlfluff]
templater = dbt

in the configuration. Now I see another bunch of errors that are unrelated (because I'm using dbt managed by meltano and everything is a tad more complicated).

I leave it up to you to decide whether to do something with this issue or not.

tunetheweb · 2022-06-28T19:57:37Z

Our ANSI dialect does not allow identifiers with dashes in them:

sqlfluff/src/sqlfluff/dialects/dialect_ansi.py

Lines 285 to 291 in 7d9717d

    
    NakedIdentifierSegment=SegmentGenerator(
        # Generate the anti template from the set of reserved keywords
        lambda dialect: RegexParser(
            r"[A-Z0-9_]*[A-Z][A-Z0-9_]*",
            CodeSegment,
            name="naked_identifier",
            type="identifier",

Depending on what dialect you're actually using, they might be allowed (e.g., our bigquery dialect allows them).

Not sure if moving to dbt templater actually sorted this, or you just hid the error with other errors.
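To illustrate why the dash trips the parser, here is a minimal sketch that applies the identifier pattern quoted above with Python's `re` module. Two things are assumptions rather than confirmed sqlfluff internals: that the pattern is applied case-insensitively, and that without the dbt templater `source()` is rendered as the two arguments joined by an underscore (the `render_source` helper is hypothetical, chosen because it is consistent with the `'-postgres_telemetry_events'` fragment in the error output).

```python
import re
from typing import Tuple

# Pattern copied from the dialect_ansi.py excerpt above.
# Assumption: it is applied case-insensitively, hence re.IGNORECASE.
NAKED_IDENTIFIER = re.compile(r"[A-Z0-9_]*[A-Z][A-Z0-9_]*", re.IGNORECASE)


def render_source(source_name: str, table: str) -> str:
    """Hypothetical stand-in for a placeholder source() rendering (assumed)."""
    return f"{source_name}_{table}"


def split_identifier(text: str) -> Tuple[str, str]:
    """Return (prefix matched as a naked identifier, unmatched remainder)."""
    match = NAKED_IDENTIFIER.match(text)
    end = match.end() if match else 0
    return text[:end], text[end:]


for source_name in ("tap-postgres", "tap_postgres"):
    rendered = render_source(source_name, "telemetry_events")
    identifier, leftover = split_identifier(rendered)
    print(f"{rendered!r}: identifier={identifier!r}, leftover={leftover!r}")
```

With the dashed name, the pattern stops matching at the dash, and the remainder is the same text that the PRS message reports as unparsable. With the underscore, the whole rendered name matches as one identifier.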

astrojuanlu added the bug Something isn't working label Jun 28, 2022

tunetheweb closed this as not planned Jun 28, 2022

jeancochrane mentioned this issue Aug 15, 2023

Update dbt views to select from other dbt models where possible ccao-data/data-architecture#71

Merged
