Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metadata-service[lib]: add commands to rollback or promote a RC metadata #44876

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Aug 28, 2024

What

Relates to https://github.com/airbytehq/airbyte-internal-issues/issues/9254
We want to communicate the finalization of a rollout by modifying metadata files we host on GCS:

Promoting an RC to a main version:

  • Move the release_candidate/metadata.yaml to latest/metadata.yaml

RC Rollback:

  • Delete its <version>/metadata.yaml and release_candidate/metadata.yaml

How

Expose two new commands in the metadata service.
They will be called by airbyte-ci

Copy link

vercel bot commented Aug 28, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Sep 2, 2024 2:09pm

Copy link
Contributor Author

alafanechere commented Aug 28, 2024

@alafanechere alafanechere force-pushed the augustin/08-28-airbyte-ci_publish_promote_and_rollback_release_candidates branch from 806550e to aab660b Compare August 28, 2024 21:58
@alafanechere alafanechere marked this pull request as ready for review August 28, 2024 22:01
@alafanechere alafanechere requested a review from a team as a code owner August 28, 2024 22:01
Copy link
Contributor

@clnoll clnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blockers, but I do have a question and a small request!

@@ -120,6 +134,14 @@ def _save_blob_to_gcs(blob_to_save: storage.blob.Blob, file_path: str, disable_c
return True


def _delete_blob_from_gcs(blob_to_delete: storage.blob.Blob) -> bool:
"""Deletes a blob from the bucket."""
print(f"Deleting {blob_to_delete.name}...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a log statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lib is always using print... It's 🆗 as it's mainly in airbyte-ci, which exposes the metadata lib stdout.

def _clear_release_candidate(bucket: storage.bucket.Bucket, metadata: ConnectorMetadataDefinitionV0) -> None:
release_candidate_path = get_metadata_remote_file_path(metadata.data.dockerRepository, RELEASE_CANDIDATE_GCS_FOLDER_NAME)
release_candidate_blob = bucket.blob(release_candidate_path)
if release_candidate_blob.exists():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we log if it doesn't exist? I think if we hit this line we expect it to be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not actually called, I just removed it, the full logic is in delete_release_candidate_from_gcs. We do raise errors on missing expected files there.

raise FileNotFoundError(f"Release candidate metadata file {rc_path} does not exist in the bucket.")

if rc_blob.md5_hash != version_blob.md5_hash:
raise ValueError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check my understanding - when a connector change is merged, the blob is pushed to both the RC directory and a directory for that version. So if we hit this code path it means that somehow there was a re-upload to the RC directory without a corresponding one to the version directory? If that's the case sounds like we really don't expect this to happen, but I'm curious what you think the solution would be if it did. Would be great to add that to the error message.

It would be nice to use pointers instead of uploading to multiple locations. I expect there's a good reason we're not doing that?

Copy link
Contributor Author

@alafanechere alafanechere Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case sounds like we really don't expect this to happen.

Yes this is not something which is expected to happen.
Inconsistent MD5 might happen on incomplete publish flow.
I think this check is important as it makes sure we're promoting/rolling back the right version.

It would be nice to use pointers instead of uploading to multiple locations. I expect there's a good reason we're not doing that?

Yes pointers would be better, but our whole registry generation logic is based on sensors triggered by upload events to GCS to specific path. It's more performant to identify existing RCs by listing blobs with a specific **/release_candidate/metadata.yaml path than continually scanning all our metadata and regenerating a list of RC version metadata path. I'm open to change this approach later if we find it too fragile. But it's consistent to what we do for latest/metadata.yaml. We have a per version folder + a latest folder holding the same metadata files.

@alafanechere alafanechere force-pushed the augustin/08-28-airbyte-ci_publish_promote_and_rollback_release_candidates branch from aab660b to f54609d Compare September 2, 2024 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants