-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eng: Update validate_csrf() to parse CSRF tokens generated by flask 0.13 #12
Conversation
assert validate_csrf(legacy_token1, secret_key='dev') is None | ||
|
||
legacy_token2 = "##12a714b52cf57340c08dcab228f89c453399a2b4" | ||
assert validate_csrf(legacy_token2, time_limit=0, secret_key='dev') is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
time_limit=0
here because the stuff before ##
is the expiration, and in this case we have no expiration to compare against.
see https://github.com/benchling/aurelia/blob/dev/tests/unit/flask_test.py#L140
cf7056c
to
e00d6b9
Compare
5cb8865
to
2d02c8e
Compare
2d02c8e
to
5d8a643
Compare
284e788
to
248673a
Compare
248673a
to
8736e5b
Compare
nit: links in description are to a branch rather than a specific commit (or at least |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some requests for clarification... otherwise 🚀 and thanks for digging into this
|
||
if not hmac.compare_digest(session[field_name], token): | ||
raise ValidationError("The CSRF tokens do not match.") | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this broad except is hard to reason about; can the "missing field" ValidationError
s be separated out and handled differently? if I'm reading correctly, those will just never succeed (with old or new scheme), so we can just early-exit, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I'm reading correctly, those will just never succeed (with old or new scheme), so we can just early-exit, right?
I think so
are you looking for something like:
try:
...
except ValidationError as e:
if e.message == "The CSRF token is missing.":
raise e
if e.message == "The CSRF session token is missing.":
raise e
logger.info("Falling back to legacy CSRF validation.")
token_key = 'csrf_token' if token_key is None else token_key
is_valid = legacy_validate_csrf(
data=data,
secret_key=secret_key,
time_limit=time_limit,
token_key=token_key
)
if is_valid is False:
raise e
I think this works and saves us the call to legacy_validate_csrf
, but I'd say it complicates the fork, because we're adding more code / more conditional to tests. Just more places for bugs to sneak in.
this broad except is hard to reason about;
Even if we add early exit handling for ValidationError
, I think I want to keep the broad except around for the token = s.loads(data, max_age=time_limit)
line. I haven't tested, but I could imagine the old CSRF signed token could cause this function to raise an unhandled error -- but we'd still want that data to be tried against the legacy_validate_csrf()
function
token_key=token_key | ||
) | ||
if is_valid is False: | ||
raise e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kinda not sure this is the right error in all cases... if it doesn't validate with the new version, but it's actually an expired/invalid old kind of token, we should actually fail with "old-style validation error" i think? the only time we actually want to attempt "legacy validation" is if URLSafeTimedSerializer(secret_key, salt="wtf-csrf-token").loads(data)
fails? trying to solidify my understanding of error cases + ensure we have clear branch flow; the except
at the end makes it hard to tell why we might reach there, even though i'm pretty sure there will be no false positives or negatives with this implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the old "legacy validation" has a different API than new validation -- which makes it hard to return the "legacy validation" error.
- The new validation will raise a ValidationError if invalid. Otherwise nothing.
- The old validation will return true or false.
So if we returned the old result (true
or false
) -- then those would both be seen as a valid result from anything that's consuming the output of the new validation.
From the caller's perspective, I think they can just treat this forked function as the same as the unforked new validation. It will either raise a known ValidationError
if invalid or it will not.
writing this out, I guess this would be better accomplished by wrapping legacy_validate_csrf()
in another try..catch
, so we don't leak any of those errors as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to default to not wrapping it in another try..catch
, since I don't think this will affect anything -- but happy to add it if the extra compatability is worth it
8736e5b
to
428fe36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the explanations
@@ -0,0 +1,2 @@ | |||
flask_wtf | |||
3.9.11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooc, how did we pick this version specifically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random -- just a version i had installed locally.
in theory, this library should support multiple python version.
in practice, this hopefully won't matter once we delete this fork
return False | ||
|
||
if not secret_key: | ||
secret_key = current_app.config.get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if neither of these options are set, does this error/is that ok? in the new validate
above, we look for
secret_key = _get_config(
secret_key,
"WTF_CSRF_SECRET_KEY",
current_app.secret_key,
message="A secret key is required to use CSRF.",
)
which raises a RuntimeError if it's not found, then directs to here; here, this errors at the hmac.new
below with a TypeError
>>> hmac.new(None, None, digestmod=hashlib.sha1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/hmac.py", line 153, in new
return HMAC(key, msg, digestmod)
File "/usr/lib/python3.8/hmac.py", line 48, in __init__
raise TypeError("key: expected bytes or bytearray, but got %r" % type(key).__name__)
TypeError: key: expected bytes or bytearray, but got 'NoneType'
idek if its possible for both of these to be unset tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh, i actually cant tell you how secret_key gets set in our application -- but this legacy_validate_csrf
is taken from the version of flask-wtf that we're currently running.. so somehow these values are all present when they need to be.
the links in the description are broken TvT i think they just have an old branch
also finally know what wtforms is short for: pallets-eco/wtforms#146 |
Description
This PR amends the current flask-wtf 1.21 main branch with support for CSRF tokens generated by flask-wtf 0.13.
flask-wtf 0.14 changelog: https://flask-wtf.readthedocs.io/en/1.2.x/changes/#version-0-14
Note that this merges into the
main
branch! This is not the default branchmaster
for this repository. Themaster
branch is currently based off of 0.13, so the diff is very large. If this PR is approved, we can probably just change the default branch to bemain
.CSRF tokens are stored in sessions, so we should be okay to replace this forked 1.2.1 with the regular library after 90days when most of our sessions with legacy tokens will have expired. It's probably fine to replace this earlier since new tokens are generated on page load, but 90 days is the safest.
Background
A long time ago we forked flask-wtf 0.13.1, because they had a backwards incompatible change. Specifically, signed CSRF tokens generated by flask-wtf 0.13.1 were incompatible with flask-wtf >= 0.14 -- because they have changed the algorithm used to (un)sign the CSRF token. breaking change here.
This PR begins our migration back to an unforked flask-wtf. The migration strategy is:
flask-wtf-1.21.post1
to support both old and new CSRF signed tokens. This means all old signed tokens should continue working but all new page loads should get the new signed CSRF token.And we will have successfully unforked a library! 🎉
TIL about CSRF
I learned a lot about CSRF as well as our usage at benchling. Here's some of that information condensed:
hashlib.sha1(os.urandom(64)).hexdigest()
). It's stored on a user's session.URLSafeTimedSerializer
. This is what is exposed to the javascript client on page load. source code. This is also what is sent back by the client via theX-CSRFToken
header on POST/PATCH/DELETE HTTP requests to show that the intended client is making this request (and not a malicious client that doesn't have this signed token). source code.Time
part of theURLSafeTimedSerializer
sinceWTF_CSRF_TIME_LIMIT = None
. source codeURLSafeTimedSerializer
is able to unsign each signed token back into the same underlying CSRF token.CSRF by Example
Here's an example of how this works for extra clarity:
Testing Done
- [x] pip install -r requirements/tests.txt && pip install . && pytest tests