-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug-1827506: Implement the storage interface for GCS. #2957
Conversation
6c31908
to
72bd4ec
Compare
Original comment
|
9e74534
to
5ba4051
Compare
5ba4051
to
c23101d
Compare
I split out all commits that are for other bugs into separate PRs. Many of the PRs needed to be filed against other branches, so we now have to merge the commits in this order:
The other PRs I filed are completely orthogonal to everything else and can be merged at any time. |
Local testing is easier using the code in #2964 than this branch (see the comment above). |
777a60d
to
7a4ff40
Compare
c23101d
to
f07eee1
Compare
d579d69
to
ca471c8
Compare
ca471c8
to
df1fc5e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is pretty substantial and makes some big changes to the codebase to make it possible to implement GCS storage. We knew going into this that it would be hard to review and require multiple back-and-forths. That's how big PRs go.
I left a few comments on minor things we should change that are worth doing in this PR.
There are two comments that probably require bigger changes:
The first is how StorageError
is structured. I'm pretty sure the structure in this PR will cause us a lot of pain when looking at problems in Sentry. We should do something about that in this PR.
The second is how upload data is handled in tests. I think I want us to change that before I review the rest of the test code and validate the changes.
@@ -96,8 +96,8 @@ def possible_upload_urls(request): | |||
context = { | |||
"urls": [ | |||
{ | |||
"url": upload_backend.url, | |||
"bucket_name": upload_backend.name, | |||
"bucket_name": upload_backend.bucket, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets used by the "upload now" ui view. I bet we could remove this ui view and the API endpoints it uses. I wrote up a bug to consider doing that.
We talked about this on Zoom. I'm going to do a pass to minimally fix the |
afb7052
to
a324a95
Compare
^^^ That rebases against main tip and addresses my concerns with @smarnach Do the last 3 commits look ok to you? If so, then we have a couple of |
Thanks a lot! The commits look good to me. I particularly like that the symfile tests and uploads now use valid data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In testing, everything looked fine.
Validation:
- ran unit tests 👍
- ran system tests 👍
- looked at upload data
- regular uploads are marked correctly 👍
- try uploads are marked correctly 👍
- looked at file upload data
- compressed vs. uncompressed looks good 👍
- try vs. non try have the right public url 👍
- at the bottom of the upload file for try, it says "Try Symbols: No" which is wrong--but this is broken in stage, too. I wrote up bug 1910180 😒
- public url is something like http://localhost:8000/TestBaseProfiler/F9E38D3857BEE30A4339CC3B15892CF00/TestBaseProfiler.sym?try and clicking on it redirects to https://gcs-emulator:8001/publicbucket/try/v1/TestBaseProfiler/F9E38D3857BEE30A4339CC3B15892CF00/TestBaseProfiler.sym which fails. I think this gets fixed in pull bug-1905455: Enable anonymous access to storage buckets. #2965. 🤷
- i uploaded a pdb sym file
- verified sym file header extraction works 👍
- verified codeinfo lookup works 👍
We dropped checking the Location
value in the download tests. I'm not sure why we did that. I kind of think we should be checking the Location
value. After we land this, we can talk about it and maybe implement it in a followup PR.
I think this is good. Let's land it!
|
||
assert parsed.path == ( | ||
"/publicbucket/v1/xul.pdb/44E4EC8C2F41492B9369D6B9A059577C2/xul.sym" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should be testing the HTTP 302 response Location. Something like this:
def test_client_happy_path(client, db, symbol_storage, metricsmock):
upload = UPLOADS["compressed"]
upload.upload(symbol_storage)
bucket = symbol_storage.upload_backend.bucket
prefix = symbol_storage.upload_backend.prefix
url = reverse(
"download:download_symbol",
args=(upload.debug_file, upload.debug_id, upload.sym_file),
)
response = client.get(url)
assert response.status_code == 302
parsed = urlparse(response["location"])
assert parsed.netloc in ["localstack:4566", "gcs-emulator:8001"]
assert parsed.path == (
f"/{bucket}/{prefix}/{upload.debug_file}/{upload.debug_id}/{upload.sym_file}"
)
metricsmock.assert_histogram_once(
"tecken.symboldownloader.file_age_days",
value=0,
tags=["storage:regular", "host:testnode"],
)
response = requests.get(response.headers["location"])
assert response.status_code == 200
assert response.content == upload.original_body
response = client.head(url)
assert response.status_code == 200
assert response["Access-Control-Allow-Origin"] == "*"
assert response["Access-Control-Allow-Methods"] == "GET"
We can add this in a follow-up PR unless there's some compelling reason I'm not seeing for not doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Determining the actual redirect location gets more complex with the next PR that adds a potential CDN. At that point, we'd also need to check whether the storage backend has public_url
set and take that into account. It's possible, but the question is how useful it is – we'd basically be re-implementing some logic from the application code in the tests and check that both implementations have the same result.
For this reason I added the code to actually download the symbols file and compare the body to what we uploaded. If we get the correct file back, that seems to be a much stronger indication that the redirect location is correct.
This adds the chained exception message to the StorageError message. This will prevent Sentry from coalescing unconnected StorageErrors and make it easier to understand what happened from the exception without having to look up the exception chain.
This removes a file that wasn't used and moves the other test files into a data/ subdirectory. This adds a __init__.py making the test directory a Python package.
This moves extract_sym_file_header to a new tecken/libsym.py where it can be used by the upload view and the tests. This reworks the tests to use well-formed sym files in the tests/data/ directory.
Co-authored-by: Will Kahn-Greene <willkg@users.noreply.github.com>
e807085
to
ea0ff03
Compare
@willkg There were two outstanding comments from your first review that I addressed in the last two commits. I think this is ready to land now – could you take a final look at the last two PRs? |
@@ -11,23 +12,25 @@ | |||
from tecken.libstorage import StorageError | |||
|
|||
|
|||
logger = logging.getLogger("tecken") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1910363 to rename the loggers in Tecken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Quoting from the description of this PR:
I forgot to do that. I'll file a PR. |
This pull request implements the storage interface for Google Cloud Storage and makes the code backend-agnostic.
symbol_storage
fixture.s3_helper
andbotomock
fixtures are no longer needed and have been removed.Next steps: