MINOR - Clean up configs & add auto classification docs #18907

Merged: 3 commits, Dec 4, 2024

@pmbrull (Collaborator) commented Dec 3, 2024

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@@ -1745,7 +1745,7 @@ WHERE JSON_EXTRACT(json, '$.pipelineType') = 'metadata';

-- classification and sampling configs from the profiler pipelines
UPDATE ingestion_pipeline_entity
-SET json = JSON_REMOVE(json, '$.sourceConfig.config.processPiiSensitive', '$.sourceConfig.config.confidence', '$.sourceConfig.config.generateSampleData')
+SET json = JSON_REMOVE(json, '$.sourceConfig.config.processPiiSensitive', '$.sourceConfig.config.confidence', '$.sourceConfig.config.generateSampleData', '$.sourceConfig.config.sampleDataCount')

pmbrull (Collaborator, Author) commented:

sampleDataCount is also not needed for the profiler. It was only used for sample data.
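
To illustrate what the migration above does to a stored pipeline document, here is a minimal Python sketch that mimics the effect of JSON_REMOVE on the four paths listed in the new statement. The function name `strip_profiler_keys` and the example document are assumptions made for illustration; they are not part of the PR or of the OpenMetadata codebase.

```python
import json

# Keys under $.sourceConfig.config that the updated migration removes.
REMOVED_KEYS = (
    "processPiiSensitive",
    "confidence",
    "generateSampleData",
    "sampleDataCount",
)


def strip_profiler_keys(pipeline_json: str) -> str:
    """Mimic JSON_REMOVE on '$.sourceConfig.config.<key>' for each removed key."""
    doc = json.loads(pipeline_json)
    config = doc.get("sourceConfig", {}).get("config", {})
    for key in REMOVED_KEYS:
        # Like JSON_REMOVE, a path that does not exist is silently ignored.
        config.pop(key, None)
    return json.dumps(doc)


# Hypothetical stored pipeline document, for illustration only.
example = json.dumps(
    {
        "pipelineType": "profiler",
        "sourceConfig": {
            "config": {
                "type": "Profiler",
                "generateSampleData": True,
                "sampleDataCount": 50,
                "confidence": 80,
            }
        },
    }
)
cleaned = json.loads(strip_profiler_keys(example))
```

In this sketch only the listed keys are dropped; every other field of the document is left untouched, which matches how JSON_REMOVE treats unrelated paths.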

@github-actions bot added the Ingestion and safe to test labels Dec 3, 2024
     session = self.create_session()
     if self.config.endPointURL is not None:
         return session.client(
             service_name=service_name, endpoint_url=str(self.config.endPointURL)
         )
     return session.client(service_name=service_name)

-logger.info(f"Getting AWS default client for service [{service_name}]")
+logger.debug(f"Getting AWS default client for service [{service_name}]")

pmbrull (Collaborator, Author) commented:

Since we now pass the client to the source externally, this log was too chatty at INFO level, so it is lowered to DEBUG.
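
The effect of lowering the message from INFO to DEBUG can be seen with the standard `logging` module alone. The sketch below is illustrative: the `ListHandler` class, the `capture` helper, the logger names, and the `"s3"` service name are assumptions, and no AWS client is created.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collect log messages in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def capture(level: int) -> List[str]:
    """Emit the DEBUG message on a logger set to `level` and return what got through."""
    # Hypothetical logger name; one logger per level keeps handlers separate.
    logger = logging.getLogger(f"aws_client_demo_{level}")
    logger.setLevel(level)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    service_name = "s3"  # illustrative value
    logger.debug(f"Getting AWS default client for service [{service_name}]")
    return handler.messages


at_info = capture(logging.INFO)    # message is filtered out
at_debug = capture(logging.DEBUG)  # message is recorded
```

With a logger set to INFO the message no longer appears, while it is still available when debugging is enabled.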

profile_sample_type=self.source_config.profileSampleType,
sampling_method_type=self.source_config.samplingMethodType,
),
default_sample_config=SampleConfig(),

pmbrull (Collaborator, Author) commented:

This keeps the pipeline simpler for auto classification. If anything is configured for the table, we pick it up from there directly.
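
The fallback described in this comment can be sketched as follows. This is a hedged illustration only: `DemoSampleConfig` and `resolve_sample_config` are hypothetical names, the fields are invented for the example, and the actual resolution logic in the codebase may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DemoSampleConfig:
    """Hypothetical stand-in for a sampling config; fields are illustrative."""

    profile_sample: Optional[float] = None
    profile_sample_type: Optional[str] = None


def resolve_sample_config(
    table_config: Optional[DemoSampleConfig],
    default: DemoSampleConfig,
) -> DemoSampleConfig:
    """Prefer the table-level config when present, else use the default."""
    return table_config if table_config is not None else default


# The pipeline passes an empty default, as in default_sample_config=SampleConfig().
pipeline_default = DemoSampleConfig()

# A table with its own (assumed) sampling settings wins over the default.
table_level = DemoSampleConfig(profile_sample=10.0, profile_sample_type="PERCENTAGE")

from_table = resolve_sample_config(table_level, pipeline_default)
from_default = resolve_sample_config(None, pipeline_default)
```

Passing an empty default at the pipeline level means the pipeline itself carries no sampling settings, so any sampling behavior comes from what is configured on the table.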

github-actions bot (Contributor) commented Dec 3, 2024

Jest test coverage (UI tests summary)

Lines: 63% (coverage badge)
Statements: 63.74% (40309/63235)
Branches: 40.23% (16043/39883)
Functions: 42.58% (4799/11270)

sonarqubecloud bot commented Dec 4, 2024

@pmbrull pmbrull merged commit 613fd33 into open-metadata:main Dec 4, 2024
26 of 29 checks passed