-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tokenless V3 #533
Tokenless V3 #533
Conversation
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found. @@ Coverage Diff @@
## main #533 +/- ##
==========================================
- Coverage 91.49% 91.45% -0.04%
==========================================
Files 615 615
Lines 16370 16354 -16
==========================================
- Hits 14977 14957 -20
- Misses 1393 1397 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today! |
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found @@ Coverage Diff @@
## main #533 +/- ##
==========================================
- Coverage 91.49% 91.45% -0.04%
==========================================
Files 615 615
Lines 16370 16354 -16
==========================================
- Hits 14977 14957 -20
- Misses 1393 1397 +4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## main #533 +/- ##
================================================
- Coverage 95.92000 95.91000 -0.01000
================================================
Files 793 793
Lines 17688 17671 -17
================================================
- Hits 16968 16949 -19
- Misses 720 722 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
We need to be careful merging these changes to avoid re-introducing the bug of having an upload from a fork branch accidentally overwrite coverage in the upstream branch. Considering the upload endpoints used by the CLI either we change the CLI to send the branch info every time (maybe through headers) OR we use the info we have to make the validation that the branch name is in the format Currently I believe it should be possible to make this check:
Personally I feel that we should be able to accept / reject a request without having to look at the request body. So I'd opt to make the CLI send specific headers with the information we need. |
We could just validate that the x-tokenless-pr header is in the format we expect and has the correct repo name (it matches the one in the url)? |
X-Tokenless-PR is a number [1] that we used to get the correct PR from the provider. That does little for us considering the changes in this PR. If you just change the value the header name will be misleading. X-Tokenless should not match the repo name in the URL if it comes from a fork [1], cause it's supposed to be the fork slug, while the slug in the URL should be the upstream's. |
960927f
to
fecbe1b
Compare
if ":" not in tokenless: | ||
raise exceptions.AuthenticationFailed(tokenless_auth_failed_message) | ||
|
||
# make sure it's backwards compatible with the old way that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't be compatible if validation fails (in the line above) because "the old way" doesn't include a :
in the X-Tokenless
header, ...right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this probably belongs in another PR, but in addition to getting rid of the get_pull_request_info()
check like you do here, we also want to get rid of the "recent CI run" check. that happens in upload/helpers.py
. with that gone, i think we can delete the whole upload/tokenless
directory
also, the other day you made a great point about the report/upload endpoints being able to get the branch from the commit rather than needing it to be sent by the CLI. i think we can take advantage of that to simplify this logic across the API, CLI, and CI actions
on the CLI side, we already have to send the branch name in the request body for the create-commit step, and we already modify the branch name for forks there. the other two steps include which commit they are working with, so the API will be able to look up the commit and get the branch from there. so none of the commands need to set X-Tokenless
, and the action doesn't have to set any extra env vars. (i think this also avoids an edge case where an attacker passes authentication by passing X-Tokenless: fork:hahaha
to the do-upload endpoint but the commit SHA they passed is actually a commit on main
)
then on the API side, i think the only difference between the commit endpoint and the others is where we get the branch name from? if so, i think we can merge back into one TokenAuthentication
class, and if the commit sha is part of the URL we get the branch from the commit DB object, otherwise we look for the branch in the request body because we're creating a new commit
my brain is a little fried right now so this comment might not be very useful. i'll do another pass tomorrow, or maybe we could hop on a call and talk about it
fecbe1b
to
b791fc6
Compare
Codecov ReportAttention: Patch coverage is ✅ All tests successful. No failed tests found.
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you can do this with one TokenAuthentication
class. the regex with the commit sha already makes that part an optional group so the same regex should work for all three endpoints. the logic could be something like:
def _get_branch(self, request, commit_from_url):
if commit_from_url:
return commit_from_url.branch
else:
body = json.loads(str(request.body, "utf8"))
return body.get("branch")
def authenticate(self, request):
repo, commit_from_url = self._get_info_from_request_path(request)
branch = self._get_branch(request, commit_from_url)
if not branch:
raise Whatever()
...
also, i left a couple comments in test files themselves but as a general note, can you make sure there are test cases that ensure a tokenless upload for a unprefixed branch name fails? the test cases covering tokenless rejections are all due to the repo being private, but we also want to reject based on the branch name
# Validate provider | ||
service = match.group(1) | ||
try: | ||
service_enum = Service(service) | ||
# Currently only Github is supported | ||
# TODO [codecov/engineering-team#914]: Extend tokenless support to other providers | ||
if service_enum != Service.GITHUB: | ||
raise exceptions.AuthenticationFailed(self.auth_failed_message) | ||
raise exceptions.AuthenticationFailed(tokenless_auth_failed_message) | ||
except ValueError: | ||
raise exceptions.AuthenticationFailed(self.auth_failed_message) | ||
raise exceptions.AuthenticationFailed(tokenless_auth_failed_message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this stuff isn't necessary anymore. if we aren't making requests to the git provider it shouldn't matter who that provider is
we shouldn't close the ticket in the comment because the CLI still needs work to support other providers, but this logic here can go
upload/tests/views/test_reports.py
Outdated
@@ -100,6 +114,7 @@ def test_reports_post_no_auth(db, mocker): | |||
repository = RepositoryFactory( | |||
name="the_repo", author__username="codecov", author__service="github" | |||
) | |||
repository.private = False | |||
token = "BAD" | |||
commit = CommitFactory(repository=repository) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test never calls commit.save()
. is it failing for the reason we expect, which is an invalid token? or is it failing because the commit isn't in the DB?
b791fc6
to
f220f21
Compare
raise exceptions.Throttled( | ||
wait=retry_after, detail=self.rate_limit_failed_message | ||
) | ||
commitid = match.group(3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to aggregate the match.group(x)
calls into one paragraph so it's easier to see
|
||
def get_branch(self, request, commitid=None): | ||
if commitid: | ||
commit = Commit.objects.get(commitid=commitid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think .get
will throw an error if this doesn't exist instead of None
113e734
to
bdf2bf1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason you got rid of all of the exceptions that we used to throw? i think returning None
means something different from raising an exception to Django. if we want to stop throwing we should probably do it as a separate change
aside from that, the implementation looks good! now just drilling into the tests because this area keeps biting us (and our users) and i want to make sure this is rock-solid so we can be free of it for a while haha. the tests generally exercise all the important behaviors, but with some tweaks we can be more confident that the system works the way we think and that the tests will start failing if any part of it breaks
- logically, we have two inputs:
private
andprefixed_branch
, and we want to test all combinations of those two inputs. all of them are tested at one point or another in your PR currently, but IMO we should have versions of all combos in all four test files:private=False
,prefixed_branch=True
, auth succeedsprivate=False
,prefixed_branch=False
, auth failsprivate=True
,prefixed_branch=True
, auth failsprivate=True
,prefixed_branch=False
, auth fails (we can actually skip this one; both of the failure reasons are already covered by other tests)
- when deciding whether a branch is prefixed or not, we can get the branch from two different places. what should happen if the branch is found in both places? our tests should address that one way or the other:
- write tests for
get_branch()
directly to guarantee that the request body's branch is ignored if the commit sha is passed in, and/or - in test cases that put the commit ID in the URL, put an "opposite" branch in the request body. in other words, if the URL commit's branch is prefixed, put an unprefixed branch in the request body. if the URL commit's branch is unprefixed, put a prefixed branch in the request body. that way, the test will fail if we ever start reading the branch name from the wrong place.
- write tests for
if a request is creating a commit then we require the name of the branch it's creating a commit on to contain a colon, because that means that it won't affect the coverage of the branches that already belong to that repository. This will only work for public repositories. Subsequent API calls to the reports and uploads endpoints will check the branch and visibility of the repo those requests are targetting. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>
- consolidate Tokenless authentication methods - TokenlessAuthencation now accepts GET requests without the user having to provide the branch. - Ensures TokenlessAuthentication sends the clear error message every time it fails Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>
Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>
Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>
bdf2bf1
to
eca8697
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
Purpose/Motivation
We don't want to make GH API calls when doing tokenless anymore
Links to relevant tickets
codecov/engineering-team#1574
What does this PR do?