-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metadata-service[lib]: add commands to rollback or promote a RC metadata #44876
metadata-service[lib]: add commands to rollback or promote a RC metadata #44876
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Skipped Deployment
|
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @alafanechere and the rest of your teammates on Graphite |
806550e
to
aab660b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blockers, but I do have a question and a small request!
@@ -120,6 +134,14 @@ def _save_blob_to_gcs(blob_to_save: storage.blob.Blob, file_path: str, disable_c | |||
return True | |||
|
|||
|
|||
def _delete_blob_from_gcs(blob_to_delete: storage.blob.Blob) -> bool: | |||
"""Deletes a blob from the bucket.""" | |||
print(f"Deleting {blob_to_delete.name}...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a log statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lib is always using print
... It's 🆗 as it's mainly in airbyte-ci
, which exposes the metadata lib stdout.
def _clear_release_candidate(bucket: storage.bucket.Bucket, metadata: ConnectorMetadataDefinitionV0) -> None: | ||
release_candidate_path = get_metadata_remote_file_path(metadata.data.dockerRepository, RELEASE_CANDIDATE_GCS_FOLDER_NAME) | ||
release_candidate_blob = bucket.blob(release_candidate_path) | ||
if release_candidate_blob.exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log if it doesn't exist? I think if we hit this line we expect it to be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not actually called, I just removed it, the full logic is in delete_release_candidate_from_gcs
. We do raise errors on missing expected files there.
raise FileNotFoundError(f"Release candidate metadata file {rc_path} does not exist in the bucket.") | ||
|
||
if rc_blob.md5_hash != version_blob.md5_hash: | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check my understanding - when a connector change is merged, the blob is pushed to both the RC directory and a directory for that version. So if we hit this code path it means that somehow there was a re-upload to the RC directory without a corresponding one to the version directory? If that's the case sounds like we really don't expect this to happen, but I'm curious what you think the solution would be if it did. Would be great to add that to the error message.
It would be nice to use pointers instead of uploading to multiple locations. I expect there's a good reason we're not doing that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case sounds like we really don't expect this to happen.
Yes this is not something which is expected to happen.
Inconsistent MD5 might happen on incomplete publish flow.
I think this check is important as it makes sure we're promoting/rolling back the right version.
It would be nice to use pointers instead of uploading to multiple locations. I expect there's a good reason we're not doing that?
Yes pointers would be better, but our whole registry generation logic is based on sensors triggered by upload events to GCS to specific path. It's more performant to identify existing RCs by listing blobs with a specific **/release_candidate/metadata.yaml
path than continually scanning all our metadata and regenerating a list of RC version metadata path. I'm open to change this approach later if we find it too fragile. But it's consistent to what we do for latest/metadata.yaml
. We have a per version folder + a latest folder holding the same metadata files.
aab660b
to
f54609d
Compare
What
Relates to https://github.com/airbytehq/airbyte-internal-issues/issues/9254
We want to communicate the finalization of a rollout by modifying metadata files we host on GCS:
Promoting an RC to a main version:
release_candidate/metadata.yaml
tolatest/metadata.yaml
RC Rollback:
<version>/metadata.yaml
andrelease_candidate/metadata.yaml
How
Expose two new commands in the metadata service.
They will be called by
airbyte-ci