Add Slack Alert lifecycle to Dagster for Metadata publish #28759
Merged
Commits (11)
6b5ae6d  DNC (bnchrch)
1cad302  Add slack lifecycle logging (bnchrch)
08e7404  Update to use slack (bnchrch)
1143a6e  Update slack to use resource and bot (bnchrch)
3e32e32  Improve markdown (bnchrch)
b3fecea  Improve log (bnchrch)
fab7469  Merge remote-tracking branch 'origin/master' into bnchrch/dagster/add… (bnchrch)
36b5b7a  Add sensor logging (bnchrch)
d161e6d  Extend sensor time (bnchrch)
75c0c4c  Merge remote-tracking branch 'origin/master' into bnchrch/dagster/add… (bnchrch)
cf06986  Merge remote-tracking branch 'origin/master' into bnchrch/dagster/add… (bnchrch)
```diff
@@ -1,6 +1,7 @@
 from dagster import define_asset_job, AssetSelection, job, SkipReason, op
 from orchestrator.assets import registry_entry
 from orchestrator.config import MAX_METADATA_PARTITION_RUN_REQUEST, HIGH_QUEUE_PRIORITY
+from orchestrator.logging.publish_connector_lifecycle import PublishConnectorLifecycle, PublishConnectorLifecycleStage, StageStatus

 oss_registry_inclusive = AssetSelection.keys("persisted_oss_registry", "specs_secrets_mask_yaml").upstream()
 generate_oss_registry = define_asset_job(name="generate_oss_registry", selection=oss_registry_inclusive)
@@ -19,29 +20,41 @@
 )


-@op(required_resource_keys={"all_metadata_file_blobs"})
+@op(required_resource_keys={"slack", "all_metadata_file_blobs"})
 def add_new_metadata_partitions_op(context):
     """
     This op is responsible for polling for new metadata files and adding their etag to the dynamic partition.
     """
     all_metadata_file_blobs = context.resources.all_metadata_file_blobs
     partition_name = registry_entry.metadata_partitions_def.name

-    new_etags_found = [
-        blob.etag for blob in all_metadata_file_blobs if not context.instance.has_dynamic_partition(partition_name, blob.etag)
-    ]
+    new_files_found = {
+        blob.etag: blob.name for blob in all_metadata_file_blobs if not context.instance.has_dynamic_partition(partition_name, blob.etag)
+    }

+    new_etags_found = list(new_files_found.keys())
     context.log.info(f"New etags found: {new_etags_found}")

     if not new_etags_found:
         return SkipReason(f"No new metadata files to process in GCS bucket")

     # if there are more than the MAX_METADATA_PARTITION_RUN_REQUEST, we need to split them into multiple runs
+    etags_to_process = new_etags_found
     if len(new_etags_found) > MAX_METADATA_PARTITION_RUN_REQUEST:
-        new_etags_found = new_etags_found[:MAX_METADATA_PARTITION_RUN_REQUEST]
-        context.log.info(f"Only processing first {MAX_METADATA_PARTITION_RUN_REQUEST} new blobs: {new_etags_found}")
+        etags_to_process = etags_to_process[:MAX_METADATA_PARTITION_RUN_REQUEST]
+        context.log.info(f"Only processing first {MAX_METADATA_PARTITION_RUN_REQUEST} new blobs: {etags_to_process}")

-    context.instance.add_dynamic_partitions(partition_name, new_etags_found)
+    context.instance.add_dynamic_partitions(partition_name, etags_to_process)

+    # format new_files_found into a loggable string
+    new_metadata_log_string = "\n".join([f"{new_files_found[etag]} *{etag}* " for etag in etags_to_process])

+    PublishConnectorLifecycle.log(
+        context,
+        PublishConnectorLifecycleStage.METADATA_SENSOR,
+        StageStatus.SUCCESS,
+        f"*Queued {len(etags_to_process)}/{len(new_etags_found)} new metadata files for processing:*\n\n {new_metadata_log_string}",
+    )


 @job(tags={"dagster/priority": HIGH_QUEUE_PRIORITY})
```

Review thread on the `StageStatus.SUCCESS` argument:

Review comment: I'd maybe make this an in progress? But not strongly opinionated on that one.

Reply: At this point they are successfully queued! Next step is processing.
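The op above now requires a `slack` resource alongside `all_metadata_file_blobs`, but the `send_slack_message` helper it ultimately relies on (imported from `orchestrator.ops.slack` in the new module below) is not part of this diff. Here is a minimal sketch of what such a helper could look like, assuming the `slack` resource is a `slack_sdk.WebClient` such as the one provided by `dagster_slack.slack_resource`; the gating environment variable and the body are assumptions for illustration, not the PR's actual implementation.

```python
# Hypothetical sketch of orchestrator/ops/slack.py (not shown in this PR).
# Assumes context.resources.slack is a slack_sdk.WebClient, e.g. as configured
# via dagster_slack.slack_resource.
import os

from dagster import OpExecutionContext


def send_slack_message(context: OpExecutionContext, channel: str, message: str):
    """Post a message to Slack, but only when Slack notifications are enabled."""
    if os.getenv("SLACK_TOKEN"):  # assumed gating env var, for illustration only
        # chat_postMessage is the standard slack_sdk WebClient call
        context.resources.slack.chat_postMessage(channel=channel, text=message)
    else:
        context.log.info(f"Slack disabled; would have posted to {channel}: {message}")
```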
...nectors/metadata_service/orchestrator/orchestrator/logging/publish_connector_lifecycle.py (new file: 75 additions, 0 deletions)

```python
import os

from enum import Enum
from dagster import OpExecutionContext
from orchestrator.ops.slack import send_slack_message


class StageStatus(str, Enum):
    IN_PROGRESS = "in_progress"
    SUCCESS = "success"
    FAILED = "failed"

    def __str__(self) -> str:
        # convert to upper case
        return self.value.replace("_", " ").upper()

    def to_emoji(self) -> str:
        if self == StageStatus.IN_PROGRESS:
            return "🟡"
        elif self == StageStatus.SUCCESS:
            return "🟢"
        elif self == StageStatus.FAILED:
            return "🔴"
        else:
            return ""


class PublishConnectorLifecycleStage(str, Enum):
    METADATA_SENSOR = "metadata_sensor"
    METADATA_VALIDATION = "metadata_validation"
    REGISTRY_ENTRY_GENERATION = "registry_entry_generation"
    REGISTRY_GENERATION = "registry_generation"

    def __str__(self) -> str:
        # convert to title case
        return self.value.replace("_", " ").title()


class PublishConnectorLifecycle:
    """
    This class is used to log the lifecycle of publishing a connector to the registries.

    It is used to log to the logger and slack (if enabled).

    This is necessary as this lifecycle is not a single job, asset, resource, schedule, or sensor.
    """

    @staticmethod
    def stage_to_log_level(stage_status: StageStatus) -> str:
        if stage_status == StageStatus.FAILED:
            return "error"
        else:
            return "info"

    @staticmethod
    def create_log_message(
        lifecycle_stage: PublishConnectorLifecycleStage,
        stage_status: StageStatus,
        message: str,
    ) -> str:
        emoji = stage_status.to_emoji()
        return f"*{emoji} _{lifecycle_stage}_ {stage_status}*: {message}"

    @staticmethod
    def log(context: OpExecutionContext, lifecycle_stage: PublishConnectorLifecycleStage, stage_status: StageStatus, message: str):
        """Publish a connector notification log to logger and slack (if enabled)."""
        message = PublishConnectorLifecycle.create_log_message(lifecycle_stage, stage_status, message)

        level = PublishConnectorLifecycle.stage_to_log_level(stage_status)
        log_method = getattr(context.log, level)
        log_method(message)
        channel = os.getenv("PUBLISH_UPDATE_CHANNEL")
        if channel:
            slack_message = f"🤖 {message}"
            send_slack_message(context, channel, slack_message)
```
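For a concrete sense of what the sensor log from the first diff renders to, here is a small illustrative snippet; the file name, etag, and counts are invented, while the formatting follows `create_log_message` and the enum `__str__`/`to_emoji` definitions in this file.

```python
from orchestrator.logging.publish_connector_lifecycle import (
    PublishConnectorLifecycle,
    PublishConnectorLifecycleStage,
    StageStatus,
)

# Illustrative only: the file name, etag, and counts below are made up.
message = PublishConnectorLifecycle.create_log_message(
    PublishConnectorLifecycleStage.METADATA_SENSOR,
    StageStatus.SUCCESS,
    "*Queued 2/2 new metadata files for processing:*\n\n connectors/source-foo/metadata.yaml *etag123* ",
)

# The overridden __str__ methods give "Metadata Sensor" and "SUCCESS",
# and to_emoji() gives 🟢, so the rendered string is:
# "*🟢 _Metadata Sensor_ SUCCESS*: *Queued 2/2 new metadata files for processing:*\n\n connectors/source-foo/metadata.yaml *etag123* "
print(message)
```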
Review comment: Are we intentionally slowing things down here? What's the reasoning for doing so (just wondering)?
Reply: Snuck it in to deal with a sensor resource issue: https://airbytehq-team.slack.com/archives/C056HGD1QSW/p1690931225565009

Very minor change, figured I would save an approve-and-merge.
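The "Extend sensor time" commit (d161e6d) referenced here is not shown in this view. In Dagster, slowing a sensor down typically means raising its `minimum_interval_seconds`, so the change may have looked something like the hypothetical sketch below; every name in it is illustrative, not taken from the PR.

```python
# Hypothetical sketch only: not the actual diff from commit d161e6d.
from dagster import SensorEvaluationContext, SkipReason, sensor


@sensor(
    job_name="add_new_metadata_partitions",  # assumed job name, for illustration only
    minimum_interval_seconds=120,  # e.g. raised from Dagster's 30s default to ease resource contention
)
def add_new_metadata_partitions_sensor(context: SensorEvaluationContext):
    # The real sensor would poll GCS for new metadata blobs; this body only illustrates the decorator change.
    return SkipReason("illustrative sensor body only")
```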