Destination Bigquery Normalization: Name transformation should be consistent between normalization and connector #25163

edgao · 2023-04-13T15:06:29Z

The connector will ensure that the dataset ID (i.e. stream namespace) starts with an alphanumeric character - https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQuerySQLNameTransformer.java#L38

but normalization doesn't do this. So if you configure a custom stream namespace that starts with an underscore (e.g. _foo)... then destination-bigquery will create the raw tables in the n_foo dataset, but normalization will search for them in an _foo dataset.

We probably want to add the n prefix in normalization. (this stops being a problem once normalization runs in java, since we can just pass around a single BigQueryWriteConfig - but that's not an immediate solution)

(dataset names are allowed to start with an underscore, but they won't show up in the UI - https://cloud.google.com/bigquery/docs/datasets#dataset-naming)

The text was updated successfully, but these errors were encountered:

evantahler · 2023-04-18T20:40:25Z

Grooming

${namespace}_airbyte -> "hubspot" + _airbyte => hubspot_airbyte
*happy path
${namespace}_airbyte -> "" + _airbyte => _airbyte
- hidden in BQ
${namespace}_airbyte -> "_public" + _airbyte => _public_airbyte
- hidden in BQ

#3 won't be fixed by a platform change, so we still need to handle this.

appending "n" is silly to begin with, we should at lesat append "Airbyte" or something meaningful

The plan:

in metabase confirm that case 3 is rate
if checkin catalogs #3 is really rare, this fix belongs in the platform so that "" namespaces are never empty and we append something (name of the source?). We make this change as part of this story.
Remove the "n_" hack the destination has now
docs to educate BQ users that _ tables are hidden (add to destination docs).
Migration plan - what do we tell existing users (use output of step 1)

edgao · 2023-04-21T22:09:50Z

I was reminded that https://github.com/airbytehq/airbyte-internal-issues/issues/943 also exists. I think that's a nonissue because the UI trims off whitespace (#17113), but it's another thing to fix when we're doing the typing/deduping in java.

edgao · 2023-06-23T19:45:12Z

closing since we want platform to do this https://github.com/airbytehq/airbyte-internal-issues/issues/1875

edgao added type/bug Something isn't working team/destinations Destinations team's backlog labels Apr 13, 2023

edgao closed this as not planned Won't fix, can't repro, duplicate, stale Jun 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Destination Bigquery Normalization: Name transformation should be consistent between normalization and connector #25163

Destination Bigquery Normalization: Name transformation should be consistent between normalization and connector #25163

edgao commented Apr 13, 2023

evantahler commented Apr 18, 2023 •

edited

Loading

edgao commented Apr 21, 2023

edgao commented Jun 23, 2023

Destination Bigquery Normalization: Name transformation should be consistent between normalization and connector #25163

Destination Bigquery Normalization: Name transformation should be consistent between normalization and connector #25163

Comments

edgao commented Apr 13, 2023

evantahler commented Apr 18, 2023 • edited Loading

edgao commented Apr 21, 2023

edgao commented Jun 23, 2023

evantahler commented Apr 18, 2023 •

edited

Loading