-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GEN-356] Use ServiceSpec for loading sources based on connectors #18322
Merged
Merged
Changes from all commits
Commits
Show all changes
20 commits
Select commit
Hold shift + click to select a range
6c3fbf7
ref(profiler): use di for system profile
sushi30 6ccbf2e
- added manifests for all custom profilers
sushi30 146025c
- implement spec for all source types
sushi30 92e53ce
remove TYPE_CHECKING in core.py
sushi30 f526351
Merge branch 'main' into system-metrics-refactor
sushi30 8d4f1d4
- deleted valuedispatch function
sushi30 0c87941
Merge remote-tracking branch 'origin/main' into system-metrics-refactor
sushi30 59a6997
- removed tests related to the profiler factory
sushi30 6476f9f
- reverted start_time
sushi30 04fcc27
- reverted start_time
sushi30 0c2cbec
fixed tests
sushi30 a32efc8
format
sushi30 0e16f3e
bigquery system profile e2e tests
sushi30 60f14d1
fixed module docstring
sushi30 c2619a9
- removed import_side_effects from redshift. we still use it in postg…
sushi30 71234d2
- tests for BaseSpec
sushi30 bd9583c
- moved constructors around to get rid of useless kwargs
sushi30 5dc797f
- changed test_system_metric
sushi30 ead59e9
- added linage and usage to service_spec
sushi30 6d678cb
add comments on collaborative constructors
sushi30 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/api/rest/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.api.rest.metadata import RestSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=RestSource) |
6 changes: 6 additions & 0 deletions
6
ingestion/src/metadata/ingestion/source/dashboard/domodashboard/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
from metadata.ingestion.source.dashboard.domodashboard.metadata import ( | ||
DomodashboardSource, | ||
) | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=DomodashboardSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/lightdash/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.lightdash.metadata import LightdashSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=LightdashSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/looker/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.looker.metadata import LookerSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=LookerSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/metabase/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.metabase.metadata import MetabaseSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=MetabaseSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/mode/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.mode.metadata import ModeSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=ModeSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/mstr/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.mstr.metadata import MstrSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=MstrSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/powerbi/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.powerbi.metadata import PowerbiSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=PowerbiSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/qlikcloud/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.qlikcloud.metadata import QlikcloudSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=QlikcloudSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/qliksense/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.qliksense.metadata import QliksenseSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=QliksenseSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/quicksight/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.quicksight.metadata import QuicksightSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=QuicksightSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/redash/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.redash.metadata import RedashSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=RedashSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/sigma/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.sigma.metadata import SigmaSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=SigmaSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/superset/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.superset.metadata import SupersetSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=SupersetSource) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/dashboard/tableau/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.dashboard.tableau.metadata import TableauSource | ||
from metadata.utils.service_spec import BaseSpec | ||
|
||
ServiceSpec = BaseSpec(metadata_source_class=TableauSource) |
10 changes: 10 additions & 0 deletions
10
ingestion/src/metadata/ingestion/source/database/athena/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
from metadata.ingestion.source.database.athena.lineage import AthenaLineageSource | ||
from metadata.ingestion.source.database.athena.metadata import AthenaSource | ||
from metadata.ingestion.source.database.athena.usage import AthenaUsageSource | ||
from metadata.utils.service_spec.default import DefaultDatabaseSpec | ||
|
||
ServiceSpec = DefaultDatabaseSpec( | ||
metadata_source_class=AthenaSource, | ||
lineage_source_class=AthenaLineageSource, | ||
usage_source_class=AthenaUsageSource, | ||
) |
10 changes: 10 additions & 0 deletions
10
ingestion/src/metadata/ingestion/source/database/azuresql/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
from metadata.ingestion.source.database.azuresql.lineage import AzuresqlLineageSource | ||
from metadata.ingestion.source.database.azuresql.metadata import AzuresqlSource | ||
from metadata.ingestion.source.database.azuresql.usage import AzuresqlUsageSource | ||
from metadata.utils.service_spec.default import DefaultDatabaseSpec | ||
|
||
ServiceSpec = DefaultDatabaseSpec( | ||
metadata_source_class=AzuresqlSource, | ||
lineage_source_class=AzuresqlLineageSource, | ||
usage_source_class=AzuresqlUsageSource, | ||
) |
Empty file.
29 changes: 29 additions & 0 deletions
29
ingestion/src/metadata/ingestion/source/database/bigquery/profiler/profiler.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
from typing import List, Type | ||
|
||
from metadata.generated.schema.entity.data.table import SystemProfile | ||
from metadata.ingestion.source.database.bigquery.profiler.system import ( | ||
BigQuerySystemMetricsComputer, | ||
) | ||
from metadata.profiler.interface.sqlalchemy.bigquery.profiler_interface import ( | ||
BigQueryProfilerInterface, | ||
) | ||
from metadata.profiler.metrics.system.system import System | ||
from metadata.profiler.processor.runner import QueryRunner | ||
|
||
|
||
class BigQueryProfiler(BigQueryProfilerInterface): | ||
def _compute_system_metrics( | ||
self, | ||
metrics: Type[System], | ||
runner: QueryRunner, | ||
*args, | ||
**kwargs, | ||
) -> List[SystemProfile]: | ||
return self.system_metrics_computer.get_system_metrics( | ||
runner.table, self.service_connection_config | ||
) | ||
|
||
def initialize_system_metrics_computer( | ||
self, **kwargs | ||
) -> BigQuerySystemMetricsComputer: | ||
return BigQuerySystemMetricsComputer(session=self.session) |
161 changes: 161 additions & 0 deletions
161
ingestion/src/metadata/ingestion/source/database/bigquery/profiler/system.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
from typing import List | ||
|
||
from pydantic import TypeAdapter | ||
from sqlalchemy.orm import DeclarativeMeta | ||
|
||
from metadata.generated.schema.entity.data.table import DmlOperationType, SystemProfile | ||
from metadata.generated.schema.entity.services.connections.database.bigQueryConnection import ( | ||
BigQueryConnection, | ||
) | ||
from metadata.ingestion.source.database.bigquery.queries import BigQueryQueryResult | ||
from metadata.profiler.metrics.system.dml_operation import DatabaseDMLOperations | ||
from metadata.profiler.metrics.system.system import ( | ||
CacheProvider, | ||
EmptySystemMetricsSource, | ||
SQASessionProvider, | ||
SystemMetricsComputer, | ||
) | ||
from metadata.utils.logger import profiler_logger | ||
from metadata.utils.time_utils import datetime_to_timestamp | ||
|
||
logger = profiler_logger() | ||
|
||
|
||
class BigQuerySystemMetricsSource( | ||
SQASessionProvider, EmptySystemMetricsSource, CacheProvider | ||
): | ||
def __init__(self, *args, **kwargs): | ||
super().__init__(*args, **kwargs) | ||
|
||
def get_kwargs( | ||
self, | ||
table: DeclarativeMeta, | ||
service_connection: BigQueryConnection, | ||
*args, | ||
**kwargs, | ||
): | ||
return { | ||
"table": table.__table__.name, | ||
"dataset_id": table.__table_args__["schema"], | ||
"project_id": super().get_session().get_bind().url.host, | ||
"usage_location": service_connection.usageLocation, | ||
} | ||
|
||
def get_deletes( | ||
self, table: str, project_id: str, usage_location: str, dataset_id: str | ||
) -> List[SystemProfile]: | ||
return self.get_system_profile( | ||
project_id, | ||
dataset_id, | ||
table, | ||
list( | ||
self.get_queries_by_operation( | ||
usage_location, | ||
project_id, | ||
dataset_id, | ||
[ | ||
DatabaseDMLOperations.DELETE, | ||
], | ||
) | ||
), | ||
"deleted_row_count", | ||
DmlOperationType.DELETE, | ||
) | ||
|
||
def get_updates( | ||
self, table: str, project_id: str, usage_location: str, dataset_id: str | ||
) -> List[SystemProfile]: | ||
return self.get_system_profile( | ||
project_id, | ||
dataset_id, | ||
table, | ||
self.get_queries_by_operation( | ||
usage_location, | ||
project_id, | ||
dataset_id, | ||
[ | ||
DatabaseDMLOperations.UPDATE, | ||
DatabaseDMLOperations.MERGE, | ||
], | ||
), | ||
"updated_row_count", | ||
DmlOperationType.UPDATE, | ||
) | ||
|
||
def get_inserts( | ||
self, table: str, project_id: str, usage_location: str, dataset_id: str | ||
) -> List[SystemProfile]: | ||
return self.get_system_profile( | ||
project_id, | ||
dataset_id, | ||
table, | ||
self.get_queries_by_operation( | ||
usage_location, | ||
project_id, | ||
dataset_id, | ||
[ | ||
DatabaseDMLOperations.INSERT, | ||
DatabaseDMLOperations.MERGE, | ||
], | ||
), | ||
"inserted_row_count", | ||
DmlOperationType.INSERT, | ||
) | ||
|
||
def get_queries_by_operation( | ||
self, | ||
usage_location: str, | ||
project_id: str, | ||
dataset_id: str, | ||
operations: List[DatabaseDMLOperations], | ||
) -> List[BigQueryQueryResult]: | ||
ops = {op.value for op in operations} | ||
yield from ( | ||
query | ||
for query in self.get_queries(usage_location, project_id, dataset_id) | ||
if query.statement_type in ops | ||
) | ||
|
||
def get_queries( | ||
self, usage_location: str, project_id: str, dataset_id: str | ||
) -> List[BigQueryQueryResult]: | ||
return self.get_or_update_cache( | ||
f"{project_id}.{dataset_id}", | ||
BigQueryQueryResult.get_for_table, | ||
session=super().get_session(), | ||
usage_location=usage_location, | ||
project_id=project_id, | ||
dataset_id=dataset_id, | ||
) | ||
|
||
@staticmethod | ||
def get_system_profile( | ||
project_id: str, | ||
dataset_id: str, | ||
table: str, | ||
query_results: List[BigQueryQueryResult], | ||
rows_affected_field: str, | ||
operation: DmlOperationType, | ||
) -> List[SystemProfile]: | ||
if not BigQueryQueryResult.model_fields.get(rows_affected_field): | ||
raise ValueError( | ||
f"rows_affected_field [{rows_affected_field}] is not a valid field in BigQueryQueryResult." | ||
) | ||
return TypeAdapter(List[SystemProfile]).validate_python( | ||
[ | ||
{ | ||
"timestamp": datetime_to_timestamp(q.start_time, milliseconds=True), | ||
"operation": operation, | ||
"rowsAffected": getattr(q, rows_affected_field), | ||
} | ||
for q in query_results | ||
if getattr(q, rows_affected_field) > 0 | ||
and q.project_id == project_id | ||
and q.dataset_id == dataset_id | ||
and q.table_name == table | ||
] | ||
) | ||
|
||
|
||
class BigQuerySystemMetricsComputer(SystemMetricsComputer, BigQuerySystemMetricsSource): | ||
pass | ||
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
14 changes: 14 additions & 0 deletions
14
ingestion/src/metadata/ingestion/source/database/bigquery/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
from metadata.ingestion.source.database.bigquery.lineage import BigqueryLineageSource | ||
from metadata.ingestion.source.database.bigquery.metadata import BigquerySource | ||
from metadata.ingestion.source.database.bigquery.profiler.profiler import ( | ||
BigQueryProfiler, | ||
) | ||
from metadata.ingestion.source.database.bigquery.usage import BigqueryUsageSource | ||
from metadata.utils.service_spec.default import DefaultDatabaseSpec | ||
|
||
ServiceSpec = DefaultDatabaseSpec( | ||
metadata_source_class=BigquerySource, | ||
lineage_source_class=BigqueryLineageSource, | ||
usage_source_class=BigqueryUsageSource, | ||
profiler_class=BigQueryProfiler, | ||
) |
4 changes: 4 additions & 0 deletions
4
ingestion/src/metadata/ingestion/source/database/bigtable/service_spec.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from metadata.ingestion.source.database.bigtable.metadata import BigtableSource | ||
from metadata.utils.service_spec.default import DefaultDatabaseSpec | ||
|
||
ServiceSpec = DefaultDatabaseSpec(metadata_source_class=BigtableSource) |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do? I am wondering why it does not override the base SystemMetricsComputer directly?