This repository has been archived by the owner on Sep 23, 2024. It is now read-only.

target-snowflake should handle anyOf type specifications from the input schema. #228

Open
stkbailey opened this issue Nov 5, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@stkbailey

Is your feature request related to a problem? Please describe.
Some taps emit a schema definition that uses, e.g., anyOf: [{"type": "string"}, {"type": "number"}] instead of {"type": ["string", "number"]}. Currently, target-snowflake silently omits the column from the output table. This has frequently resulted in integration errors or surprises. (tap-google-sheets is the main culprit here.)
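To make the two spellings concrete, here is a minimal sketch that reduces either form to the same set of type names. `collect_types` is a hypothetical helper written for illustration; it is not part of target-snowflake or any Singer library.

```python
def collect_types(prop):
    """Return the sorted JSON Schema type names a property may take.

    Handles both {"type": [...]} and {"anyOf": [{"type": ...}, ...]}.
    Hypothetical helper for illustration only.
    """
    types = set()
    if "type" in prop:
        declared = prop["type"]
        # "type" may be a single string or a list of strings.
        types.update([declared] if isinstance(declared, str) else declared)
    for option in prop.get("anyOf", []):
        # Each anyOf branch is itself a schema, so recurse into it.
        types.update(collect_types(option))
    return sorted(types)


any_of_form = {"anyOf": [{"type": "string"}, {"type": "number"}]}
list_form = {"type": ["string", "number"]}
same = collect_types(any_of_form) == collect_types(list_form)
```

Both forms describe the same set of allowed types here, which is why dropping the column for one of them is surprising.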

See also: transferwise/pipelinewise#449 (comment)

Describe the solution you'd like
I would like for target-snowflake to create and populate the column in a conservative way, e.g. with a text datatype.
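A rough sketch of what "conservative" could look like: any property that uses anyOf, or that otherwise cannot be resolved to a single non-null type, falls back to text. The function name `column_type_for` and the `SIMPLE_TYPES` table are assumptions made for this example and do not reflect target-snowflake's actual type mapping.

```python
# Illustrative mapping only; the real target's mapping may differ.
SIMPLE_TYPES = {
    "string": "text",
    "integer": "number",
    "number": "float",
    "boolean": "boolean",
}


def column_type_for(prop):
    """Pick a column type, falling back to text when the schema is ambiguous."""
    if "anyOf" in prop or "type" not in prop:
        # Conservative choice: keep the column and store values as text.
        return "text"
    declared = prop["type"]
    names = [declared] if isinstance(declared, str) else list(declared)
    non_null = [name for name in names if name != "null"]
    if len(non_null) == 1:
        return SIMPLE_TYPES.get(non_null[0], "text")
    # Several competing non-null types: text can hold any of them.
    return "text"
```

With this approach, the period_start_date and period_end_date properties from the example schema below would each get a text column instead of being dropped.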

Describe alternatives you've considered
The alternative would be to fix the tap, if, for example, this is actually invalid Singer formatting. We could also log a warning whenever a column is bypassed, so that it is at least clear to the user.
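For the warning option, something along these lines might be enough. This is a sketch under the assumption that a column is bypassed when its property has no top-level "type" key; I have not confirmed that this is the exact condition inside target-snowflake. The logger name and both function names are made up for the example.

```python
import logging

# Hypothetical logger name, chosen for this sketch.
logger = logging.getLogger("target_snowflake_sketch")


def find_untyped_columns(schema):
    """Return names of properties that have no top-level "type" key."""
    properties = schema.get("properties", {})
    return sorted(name for name, prop in properties.items() if "type" not in prop)


def warn_untyped_columns(stream, schema):
    """Log one warning per property that would be left out of the table."""
    skipped = find_untyped_columns(schema)
    for name in skipped:
        logger.warning(
            "Stream %s: column %s has no top-level type and will be skipped",
            stream,
            name,
        )
    return skipped
```

Calling `warn_untyped_columns("periods", schema)` with the schema from the SCHEMA message below should flag period_end_date and period_start_date.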

Additional context

Here is an example schema message being passed from the tap:

{"type": "SCHEMA", "stream": "periods", "schema": {"properties": {"__sdc_spreadsheet_id": {"type": ["null", "string"]}, "__sdc_sheet_id": {"type": ["null", "integer"]}, "__sdc_row": {"type": ["null", "integer"]}, "period_key": {"type": ["null", "string"]}, "fiscal_quarter": {"type": ["null", "string"]}, "fiscal_year": {"type": ["null", "string"]}, "period_start_date": {"anyOf": [{"type": ["null", "string"], "format": "date"}, {"type": ["null", "string"]}]}, "period_end_date": {"anyOf": [{"type": ["null", "string"], "format": "date"}, {"type": ["null", "string"]}]}}, "type": "object", "additionalProperties": false}, "key_properties": ["__sdc_row"]}

Here is an example of the create table statement issued by target-snowflake.

target-snowflake-prod           | time=2021-11-05 14:52:06 name=target_snowflake level=INFO message=Running query: 'CREATE TABLE IF NOT EXISTS company_goals."PERIODS" ("__SDC_ROW" number, "__SDC_SHEET_ID" number, "__SDC_SPREADSHEET_ID" text, "_SDC_BATCHED_AT" timestamp_ntz, "_SDC_DELETED_AT" text, "_SDC_EXTRACTED_AT" timestamp_ntz, "FISCAL_QUARTER" text, "FISCAL_YEAR" text, "PERIOD_KEY" text, PRIMARY KEY("__SDC_ROW")) data_retention_time_in_days = 1 ' with Params {'LAST_QID': None}
@stkbailey stkbailey added the enhancement New feature or request label Nov 5, 2021
@aaronsteers

aaronsteers commented Nov 5, 2021

Related: We have a draft proposal, mentioned in the link below, that is up for an upcoming Singer Working Group discussion. I think it's still an open question how targets should respond when they do not recognize the type described in the JSON schema, but in this case, and likely in most others, there's a good argument for simply falling back to string.

MeltanoLabs/Singer-Working-Group#20
