SQLToGCSOperators Add Support for Dumping JSON #26273

Closed
2 tasks done
patricker opened this issue Sep 9, 2022 · 1 comment · Fixed by #26277

@patricker
Contributor

Description

When the output format of a SQLToGCSOperator is json, any "dict" value returned from the database (for example, from a Postgres JSON column) is not dumped to a string; it is written out as a nested JSON object.

Add an option to the JSON exporter that dumps dict objects to strings.
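
For illustration, here is a minimal sketch of the requested behavior in plain Python. It is not the operator's actual code or API; the helper name `stringify_dict_values` and the sample row are hypothetical. It only shows the difference between writing a dict as a nested object and dumping it to a string first.

```python
import json


def stringify_dict_values(row: dict) -> dict:
    """Return a copy of `row` with every dict value serialized to a JSON string.

    Hypothetical helper for illustration; not part of Airflow.
    """
    return {
        key: json.dumps(value, sort_keys=True) if isinstance(value, dict) else value
        for key, value in row.items()
    }


# Hypothetical row, as it might come back from a Postgres table with a JSON column.
row = {"id": 1, "payload": {"a": 1, "b": [1, 2]}}

# Current behavior: the dict is written as a nested JSON object.
nested_line = json.dumps(row, sort_keys=True)

# Requested behavior: the dict is dumped to a string before the row is written.
flat_line = json.dumps(stringify_dict_values(row), sort_keys=True)

print(nested_line)
print(flat_line)
```

With the string form, the column can be loaded as a plain STRING field and parsed downstream if needed.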

Use case/motivation

Currently, JSON-type columns are hard to ingest into BigQuery (BQ): a JSON field in a source database does not enforce a schema, so we can't reliably generate a RECORD schema for the column.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@patricker patricker added the kind:feature Feature Requests label Sep 9, 2022
@patricker
Contributor Author

patricker commented Sep 9, 2022

Also, somewhat unrelated: the schema generated for a column of type "JSON" declares it as type "STRING". If you try to load the data into BigQuery using the generated schema, the load will fail unless you dump the dictionaries to strings first.
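
The mismatch can be sketched as a simple check of a row against a schema. This is only an illustration under stated assumptions: the schema entries, the row, and the helper `find_string_mismatches` are hypothetical and do not come from the operator's code.

```python
def find_string_mismatches(row: dict, schema: list) -> list:
    """Return names of fields declared STRING whose value is still a dict or list.

    Hypothetical helper for illustration; not part of Airflow or BigQuery.
    """
    return [
        field["name"]
        for field in schema
        if field["type"] == "STRING" and isinstance(row.get(field["name"]), (dict, list))
    ]


# Hypothetical generated schema: the JSON column is declared as STRING.
schema = [
    {"name": "id", "type": "INTEGER", "mode": "NULLABLE"},
    {"name": "payload", "type": "STRING", "mode": "NULLABLE"},
]

# Hypothetical exported row in which the JSON column is still a nested object.
row = {"id": 1, "payload": {"a": 1}}

print(find_string_mismatches(row, schema))
```

Any field this reports is one where the exported data and the generated schema disagree, which is the case described above.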
