-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement persistent bucket fixtures for integration tests #301
Changes from 10 commits
575a69c
295630c
0251d03
b3c1b11
a171c1c
d1b65f5
efda8b5
1a9dba7
7d18983
2c05f31
01b8a69
b5c5c1c
a3b7485
c4413d4
fc2a907
6c844fe
6de2bf7
0c99247
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add persistent bucket fixtures for integration tests. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Introduce PersistentBucketAggregate class to manage bucket name and subfolder. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add utility functions for managing persistent buckets. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Update integration tests to use persistent buckets. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. seems like duplicate IMO as they are right now all of these would go under our "infrastracture" change category i.e. not something that typical users care about, since this is not what they get with the shipped version of CLI tool. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,10 +7,16 @@ | |
# License https://www.backblaze.com/using_b2_code.html | ||
# | ||
###################################################################### | ||
from .persistent_bucket import get_or_create_persistent_bucket | ||
|
||
|
||
def test_cleanup_buckets(b2_api): | ||
# this is not a test, but it is intended to be called | ||
# via pytest because it reuses fixtures which have everything | ||
# set up | ||
pass # b2_api calls b2_api.clean_buckets() in its finalizer | ||
# set up. | ||
# The persistent bucket is cleared manually now and not | ||
# when tests tear down, as otherwise we'd lose the main benefit | ||
# of a persistent bucket, whose identity is shared across tests. | ||
persistent_bucket = get_or_create_persistent_bucket(b2_api) | ||
b2_api.clean_bucket(persistent_bucket) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. won't this break concurrently run tests? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is why lifecycle rules was suggested instead to cleanup in the first place. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It cleans the buckets once, before the tests run. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, so what will happen if I someone opens a second PR when one is being tested? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the same thing that would happen if no changes were introduced—such a scenario disrupts the test bucket lifecycle, persistent or not. Question is, is it frequent enough to warrant addressing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The bucket cleanup process only removed stale buckets, not all of them, so before we did support concurrent GHA jobs, and this change breaks it, and for no reason AFAIK, since the solution is to simply to leave that bucket alone forever. |
||
b2_api.api.list_buckets() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what is this call here for? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. overlooked, redundant |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
###################################################################### | ||
# | ||
# File: test/integration/persistent_bucket.py | ||
# | ||
# Copyright 2024 Backblaze Inc. All Rights Reserved. | ||
# | ||
# License https://www.backblaze.com/using_b2_code.html | ||
# | ||
###################################################################### | ||
import hashlib | ||
import os | ||
from dataclasses import dataclass | ||
from functools import cached_property | ||
from test.integration.helpers import BUCKET_NAME_LENGTH, Api | ||
|
||
import backoff | ||
from b2sdk.v2 import Bucket | ||
from b2sdk.v2.exception import DuplicateBucketName, NonExistentBucket | ||
|
||
PERSISTENT_BUCKET_NAME_PREFIX = "constst" | ||
|
||
|
||
@dataclass | ||
class PersistentBucketAggregate: | ||
bucket_name: str | ||
subfolder: str | ||
|
||
@cached_property | ||
def virtual_bucket_name(self): | ||
return f"{self.bucket_name}/{self.subfolder}" | ||
|
||
|
||
@backoff.on_exception(backoff.expo, Exception, max_tries=3, max_time=10) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what is the backoff here for? the retry mechanism built in in b2sdk doesn't work properly? |
||
def delete_all_files(bucket: Bucket): | ||
all_items = list(bucket.ls(recursive=True)) | ||
for item, _ in all_items: | ||
bucket.delete_file_version(item.id_, item.file_name) | ||
|
||
|
||
@backoff.on_exception(backoff.expo, Exception, max_tries=3, max_time=10) | ||
def delete_files(bucket: Bucket, subfolder: str): | ||
kris-konina-reef marked this conversation as resolved.
Show resolved
Hide resolved
|
||
for file_version, _ in bucket.ls(recursive=True, folder_to_list=subfolder): | ||
bucket.delete_file_version(file_version.id_, file_version.file_name) | ||
|
||
|
||
def cleanup_persistent_bucket(b2_api: Api): | ||
all_buckets = b2_api.api.list_buckets() | ||
for bucket in all_buckets: | ||
if bucket.name.startswith(PERSISTENT_BUCKET_NAME_PREFIX): | ||
print(f"Deleting all files in bucket {bucket.name}") | ||
delete_all_files(bucket) | ||
mjurbanski-reef marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
|
||
def get_persistent_bucket_name(b2_api: Api) -> str: | ||
if "CI" in os.environ: | ||
# CI environment | ||
repo_id = os.environ.get("GITHUB_REPOSITORY_ID") | ||
if not repo_id: | ||
raise ValueError("GITHUB_REPOSITORY_ID is not set") | ||
bucket_hash = hashlib.sha256(repo_id.encode()).hexdigest() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I kinda like the source of ID you used (especially the account id), but please note the tests are also run under Jenkins in case of staging environment. Probably best to simply tests for GITHUB_REPOSITORY_ID presence and use account_id. |
||
else: | ||
# Local development | ||
bucket_hash = hashlib.sha256(b2_api.account_id.encode()).hexdigest() | ||
|
||
return f"{PERSISTENT_BUCKET_NAME_PREFIX}-{bucket_hash}" [:BUCKET_NAME_LENGTH] | ||
|
||
|
||
@backoff.on_exception( | ||
backoff.expo, | ||
DuplicateBucketName, | ||
max_tries=3, | ||
jitter=backoff.full_jitter, | ||
) | ||
def get_or_create_persistent_bucket(b2_api: Api) -> Bucket: | ||
bucket_name = get_persistent_bucket_name(b2_api) | ||
try: | ||
bucket = b2_api.api.get_bucket_by_name(bucket_name) | ||
except NonExistentBucket: | ||
bucket = b2_api.api.create_bucket( | ||
bucket_name, | ||
bucket_type="allPublic", | ||
mjurbanski-reef marked this conversation as resolved.
Show resolved
Hide resolved
|
||
lifecycle_rules=[ | ||
{ | ||
"daysFromHidingToDeleting": 1, | ||
"daysFromUploadingToHiding": 14, | ||
mjurbanski-reef marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"fileNamePrefix": "", | ||
} | ||
], | ||
) | ||
# add the new bucket name to the list of bucket names | ||
b2_api.bucket_name_log.append(bucket_name) | ||
return bucket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
users won't care for these details in CLI tool changelog
btw, what is your plan for SDK? you started with CLI, but won't this be harder to apply the same thing to SDK?
I can kinda see these changelog message make sense if they were part of public api of SDK that is used in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't aware SDK was part of the same ticket, will proceed with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I wouldn't rely on tickets' description much, more on what makes sense and fixing problem in one place while identical exists in another - doesn't