-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avro parser: return Decimal fields as strings #29182
Conversation
@@ -157,5 +157,11 @@ def _to_output_value(avro_format: AvroFormat, record_type: Mapping[str, Any], re | |||
return record_value | |||
if record_type.get("logicalType") == "uuid": | |||
return uuid.UUID(bytes=record_value) | |||
elif record_type.get("logicalType") == "decimal": | |||
return str(record_value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -398,7 +398,7 @@ | |||
"drummer": "George Daniel", | |||
}, | |||
"col_fixed": "\x12\x34\x56\x78", | |||
"col_decimal": 1234.56789, | |||
"col_decimal": "1234.56789", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decimals should always be read as strings
return str(record_value) | ||
return record_value | ||
if record_type.get("logicalType") == "uuid": | ||
return uuid.UUID(bytes=record_value) | ||
elif record_type.get("logicalType") == "decimal": | ||
return str(record_value) | ||
elif record_type.get("logicalType") == "date": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly convert the output to the format we want instead of returning a date/datetime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
double_as_string: bool = Field( | ||
title="Convert Double Fields to Strings", | ||
description="Whether to convert double fields to strings. There is a loss of precision when converting decimals to floats, so this is recommended.", | ||
default=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And to confirm. For legacy S3 connectors, the adapter will by default set this to False if I understand correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, created an issue #29225
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @girarda!
What
double_as_string
to make it clearer that it is affecting double fields, not decimalsRecommended reading order
airbyte-cdk/python/airbyte_cdk/sources/file_based/config/avro_format.py
airbyte-cdk/python/airbyte_cdk/sources/file_based/file_types/avro_parser.py
airbyte-cdk/python/unit_tests/sources/file_based/file_types/test_avro_parser.py
airbyte-cdk/python/unit_tests/sources/file_based/scenarios/avro_scenarios.py
airbyte-cdk/python/unit_tests/sources/file_based/test_scenarios.py
airbyte-cdk/python/unit_tests/sources/file_based/scenarios/csv_scenarios.py