Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

depends_on is empty when using backticks around schema and table names in BigQuery queries #298

Closed
1210yuichi0 opened this issue Dec 19, 2024 · 1 comment · Fixed by #299

Comments

@1210yuichi0
Copy link

Nice to meet you and thank you for the great OSS.
When using backticks to quote schema and table names in BigQuery queries, the depends_on field in the generated exposures.yml file is empty.

In BigQuery, schema and table names must be quoted with backticks. Documentation

Example 1: Using backticks (fails)

Metabase Query:

SELECT * FROM `work.transactions`  

Generated exposures.yml:

depends_on: []  

Example 2: Without backticks (works but breaks Metabase Field filters)

Metabase Query:

SELECT * FROM work.transactions  

Generated exposures.yml:

depends_on:  
  - ref('payments')  

Expected Behavior

The depends_on field should correctly capture dependencies even when schema and table names are quoted with backticks, as required by BigQuery's syntax.

Environment

  • dbt-bigquery: 1.8.0
  • dbt-core: 1.8.3
  • dbt-metabase: 1.4.1

Steps to Reproduce

  1. Create a query in Metabase that uses backticks to quote schema and table names.
  2. Run the dbt-metabase sync process.
  3. Check the exposures.yml file and observe that depends_on is empty.

Would it be possible to enhance dbt-metabase to correctly parse dependencies from queries that use backticks in BigQuery?

@syou6162
Copy link

@yuichi-github-dev The response from the API containing the query looks like this, for example.

  • "dataset_query": {
    "database": 2,
    "type": "native",
    "native": {
    "template-tags": {},
    "query": "select\n count(*)\nfrom STG_payments as p\n left join STG_orders as o on p.order_id = o.order_id\nwhere o.status = 'returned'\n;"
    }
    },

The code and regular expression for extracting the schema name and table name from a query like this are defined below, and I think that if you modify it to extend this, you should be able to extract ref correctly even in cases that include backquotes.

  • # Parse SQL for exposures through FROM or JOIN clauses
    for sql_ref in re.findall(_EXPOSURE_PARSER, native_query):
    # DATABASE.schema.table -> [database, schema, table]
    parsed_model_path = [s.strip('"').lower() for s in sql_ref.split(".")]
    # Scrub CTEs (qualified sql_refs can not reference CTEs)
    if parsed_model_path[-1] in ctes and "." not in sql_ref:
    continue
  • _EXPOSURE_PARSER = re.compile(r"[FfJj][RrOo][OoIi][MmNn]\s+([\w.\"`]+)")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants