feat(ingest): add classification to bigquery, redshift #10031
Looks reasonable to me
Contains refactoring changes for Snowflake classification
6795dda to a030388
Two non-blocking comments that we can address in a follow-up
data_reader_kwargs=dict(
    sample_size_percent=(
        self.config.classification.sample_size
        * 1.2
Can we extract this 1.2 constant in a follow-up PR?
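One way the requested extraction could look, as a minimal sketch: the literal is given a module-level name so its purpose is documented in one place. The names `SAMPLE_SIZE_MULTIPLIER` and `scaled_sample_size` are hypothetical and do not come from this PR, and the thread does not state why the factor is 1.2.

```python
# Hypothetical constant name; the review only asks that the bare 1.2 be named.
SAMPLE_SIZE_MULTIPLIER = 1.2


def scaled_sample_size(
    sample_size: int, multiplier: float = SAMPLE_SIZE_MULTIPLIER
) -> float:
    """Scale the configured sample size by the named multiplier.

    Hypothetical helper standing in for the inline
    `self.config.classification.sample_size * 1.2` expression.
    """
    return sample_size * multiplier
```

With a named constant, the call site reads `sample_size * SAMPLE_SIZE_MULTIPLIER`, and the value can be changed or documented without hunting for the literal.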
@@ -447,6 +447,11 @@ class TeradataConfig(BaseTeradataConfig, BaseTimeWindowConfig):
@capability(SourceCapability.LINEAGE_COARSE, "Optionally enabled via configuration")
@capability(SourceCapability.LINEAGE_FINE, "Optionally enabled via configuration")
@capability(SourceCapability.USAGE_STATS, "Optionally enabled via configuration")
@capability(
the capability annotations support inheritance, so I don't think these are actually necessary
I wondered why we had no annotations on SQLAlchemySource. Some, like PLATFORM_INSTANCE, DOMAINS, SCHEMA_METADATA, CONTAINERS, and DELETION_DETECTION, can probably move there.
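To illustrate why annotations declared on a base class would not need repeating on subclasses, here is a toy sketch of a class decorator whose registrations are inherited. This is not DataHub's implementation: the `capability` function below, the `_capabilities` attribute, the class names, and the plain string capability names are all stand-ins chosen for the example.

```python
from typing import Callable, Dict, Type


def capability(name: str, description: str) -> Callable[[Type], Type]:
    """Toy stand-in for a capability decorator (not the real DataHub one)."""

    def wrapper(cls: Type) -> Type:
        # getattr follows the MRO, so this picks up anything a base class
        # registered. Copy it so the subclass never mutates the parent's dict.
        merged: Dict[str, str] = dict(getattr(cls, "_capabilities", {}))
        merged[name] = description
        cls._capabilities = merged
        return cls

    return wrapper


# Hypothetical shared base class carrying the common annotations.
@capability("PLATFORM_INSTANCE", "Enabled by default")
@capability("DOMAINS", "Supported via configuration")
class BaseSqlSource:
    pass


# The subclass declares only what is specific to it (illustrative name).
@capability("EXTRA_CAPABILITY", "Optionally enabled via configuration")
class ChildSource(BaseSqlSource):
    pass
```

In this sketch `ChildSource._capabilities` contains all three entries while `BaseSqlSource._capabilities` keeps only its own two, which is the behavior that would make the repeated annotations on a subclass redundant.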
- Also addresses follow-ups from datahub-project#10031 (review)
Stacked on top of #10013