Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Token Validations #1682

Merged
merged 33 commits into from
Nov 7, 2024
Merged
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
155db7a
Add checks to ensure access tokenn validity
noah-paige Nov 4, 2024
b70b55c
store access token non cognito
noah-paige Nov 4, 2024
88c03e5
Enforce validation on Custom Auth Lambda
noah-paige Nov 4, 2024
5d1c2c9
clean up
noah-paige Nov 4, 2024
02b5d36
env var user info endpoint
noah-paige Nov 4, 2024
7117d55
fix lambda API CDK
noah-paige Nov 4, 2024
cba99e9
remove unused import
noah-paige Nov 4, 2024
09caacb
clean up and lint
noah-paige Nov 4, 2024
0e77776
handle cognito groups header
noah-paige Nov 4, 2024
e6fcf8f
update custom auth logic
noah-paige Nov 5, 2024
58a5440
fix
noah-paige Nov 5, 2024
c04760a
fix custom Auth logic
noah-paige Nov 5, 2024
b5c0c08
fix cognito groups list handling
noah-paige Nov 5, 2024
25334bc
lint
noah-paige Nov 5, 2024
fca14f2
update checkov baseline
noah-paige Nov 5, 2024
64fa2f1
use well-known/openid-config
noah-paige Nov 5, 2024
d416a55
Update deps custom authorizer
noah-paige Nov 5, 2024
913c1ce
clean up requirements
noah-paige Nov 5, 2024
a8cfe08
update cryptography version
noah-paige Nov 6, 2024
647b689
Add cryptography lambda layer
noah-paige Nov 6, 2024
eddcb86
Reformat
noah-paige Nov 6, 2024
3b4418d
Clean up and PR comments
noah-paige Nov 6, 2024
f30dd4f
Fix Integration Tests to fetch ID and Access Tokens
noah-paige Nov 6, 2024
320f652
use cdk klayers
noah-paige Nov 6, 2024
ff910c7
refactor jwt services
noah-paige Nov 6, 2024
9b9cbad
fix string formatting
noah-paige Nov 6, 2024
451b0d4
raise exceptions
noah-paige Nov 6, 2024
0e329a3
only use access token, remove id token
noah-paige Nov 6, 2024
5a0893b
update exception str
noah-paige Nov 6, 2024
8b8eb1a
Clean Up
noah-paige Nov 6, 2024
26bd79c
merge latest os main
noah-paige Nov 6, 2024
86db3e8
PR comments
noah-paige Nov 7, 2024
8db81a1
fix IDP_DOMAIN_URL setting
noah-paige Nov 7, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions .checkov.baseline
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,13 @@
"CKV_AWS_115"
]
},
{
"resource": "AWS::Lambda::Function.CustomAuthorizerFunctiondevB38B5CCB",
"check_ids": [
"CKV_AWS_115",
"CKV_AWS_116"
]
},
{
"resource": "AWS::Lambda::Function.ElasticSearchProxyHandlerDBDE7574",
"check_ids": [
Expand All @@ -210,6 +217,12 @@
"CKV_AWS_158"
]
},
{
"resource": "AWS::Logs::LogGroup.customauthorizerloggroup8F3B5B9D",
"check_ids": [
"CKV_AWS_158"
]
},
{
"resource": "AWS::Logs::LogGroup.dataalldevapigateway2625FE76",
"check_ids": [
Expand Down
15 changes: 12 additions & 3 deletions backend/dataall/base/utils/api_handler_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,13 +24,22 @@
]
ENGINE = get_engine(envname=ENVNAME)
ALLOWED_ORIGINS = os.getenv('ALLOWED_ORIGINS', '*')
AWS_REGION = os.getenv('AWS_REGION')


def redact_creds(event):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: That's why I don't like this approach, you have to remember to obfuscate stuff

if 'headers' in event and 'Authorization' in event['headers']:
if event.get('headers', {}).get('Authorization'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can anyone please tell why are we redacting these ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user creds information does not need to be logged and is no longer relevant for the remaining request lifecycle - opting to redact that info

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of redacting , can we just extract useful information and then pass it onto our lambda ?

event['headers']['Authorization'] = 'XXXXXXXXXXXX'
if 'multiValueHeaders' in event and 'Authorization' in event['multiValueHeaders']:

if event.get('multiValueHeaders', {}).get('Authorization'):
event['multiValueHeaders']['Authorization'] = 'XXXXXXXXXXXX'

if event.get('multiValueHeaders', {}).get('accesskeyid'):
event['multiValueHeaders']['accesskeyid'] = 'XXXXXXXXXXXX'

if event.get('headers', {}).get('accesskeyid'):
event['headers']['accesskeyid'] = 'XXXXXXXXXXXX'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in deploy/custom_resources/custom_authorizer/custom_authorizer_lambda.py the only values that we actually use are coming from the headers. Is it possible that the API succeeds with multiVValueHeaders? and if not, maybe we don't need these checks here. (It might be out of scope for this PR)

auth_token = incoming_event['headers']['Authorization']
access_token = incoming_event['headers']['accesskeyid']

return event


Expand Down Expand Up @@ -115,7 +124,7 @@ def check_reauth(query, auth_time, username):
# Determine if there are any Operations that Require ReAuth From SSM Parameter
try:
reauth_apis = ParameterStoreManager.get_parameter_value(
region=os.getenv('AWS_REGION', 'eu-west-1'), parameter_path=f'/dataall/{ENVNAME}/reauth/apis'
region=AWS_REGION, parameter_path=f'/dataall/{ENVNAME}/reauth/apis'
).split(',')
except Exception:
log.info('No ReAuth APIs Found in SSM')
Expand Down
2 changes: 1 addition & 1 deletion deploy/custom_resources/custom_authorizer/auth_services.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ def generate_policy(verified_claims: dict, effect, incoming_resource_str: str):

for claim_name, claim_value in verified_claims.items():
if isinstance(claim_value, list):
verified_claims.update({claim_name: json.dumps(claim_value)})
verified_claims.update({claim_name: ','.join(claim_value)})

context = {**verified_claims}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import logging
import os
import json

from auth_services import AuthServices
from jwt_services import JWTServices
Expand All @@ -19,18 +20,25 @@

def lambda_handler(incoming_event, context):
# Get the Token which is sent in the Authorization Header
logger.debug(incoming_event)
petrkalos marked this conversation as resolved.
Show resolved Hide resolved
auth_token = incoming_event['headers']['Authorization']
if not auth_token:
raise Exception('Unauthorized . Token not found')
access_token = incoming_event['headers']['accesskeyid']
if not auth_token or not access_token:
raise Exception('Unauthorized. Missing identity or access JWT')

# Validate User is Active with Proper Access Token
JWTServices.validate_access_token(access_token)

# Validate JWT
verified_claims = JWTServices.validate_jwt_token(auth_token)
logger.debug(verified_claims)
if not verified_claims:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method validate_jwt_token is now raising exception this if block can be removed

raise Exception('Unauthorized. Token is not valid')
logger.debug(verified_claims)

# Generate Allow Policy w/ Context
effect = 'Allow'
policy = AuthServices.generate_policy(verified_claims, effect, incoming_event['methodArn'])
logger.debug('Generated policy is ', policy)
logger.debug('Generated policy is ', json.dumps(policy))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.debug(f'Generated policy is json.dumps(policy)').

using f-string,. I think this logger.debug('Generated policy is ', json.dumps(policy)) doesn't work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup I made this change earlier and just ran into this - updating now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.debug('Generated policy is ', policy) was also not showing policy so opting for f string approach

return policy


Expand All @@ -39,12 +47,13 @@ def lambda_handler(incoming_event, context):
# AWS Lambda and any other local environments
if __name__ == '__main__':
# for testing locally you can enter the JWT ID Token here
token = ''
id_token = ''
access_token = ''
account_id = ''
api_gw_id = ''
event = {
'headers': {'Authorization': id_token, 'accesskeyid': access_token},
'type': 'TOKEN',
'Authorization': token,
'methodArn': f'arn:aws:execute-api:us-east-1:{account_id}:{api_gw_id}/prod/POST/graphql/api',
}
lambda_handler(event, None)
99 changes: 37 additions & 62 deletions deploy/custom_resources/custom_authorizer/jwt_services.py
Original file line number Diff line number Diff line change
@@ -1,46 +1,13 @@
import os

import requests
from jose import jwk
from jose.jwt import get_unverified_header, decode, ExpiredSignatureError, JWTError
import jwt

import logging

logger = logging.getLogger()
logger.setLevel(os.environ.get('LOG_LEVEL', 'INFO'))

# Configs required to fetch public keys from JWKS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this because pyJWT does it with some inbuilt method ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISSUER_CONFIGS = {
f'{os.environ.get("custom_auth_url")}': {
'jwks_uri': f'{os.environ.get("custom_auth_jwks_url")}',
'allowed_audiences': f'{os.environ.get("custom_auth_client")}',
},
}

issuer_keys = {}


# instead of re-downloading the public keys every time
# we download them only on cold start
# https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/
def fetch_public_keys():
try:
for issuer, issuer_config in ISSUER_CONFIGS.items():
jwks_response = requests.get(issuer_config['jwks_uri'])
jwks_response.raise_for_status()
jwks: dict = jwks_response.json()
for key in jwks['keys']:
value = {
'issuer': issuer,
'audience': issuer_config['allowed_audiences'],
'jwk': jwk.construct(key),
'public_key': jwk.construct(key).public_key(),
}
issuer_keys.update({key['kid']: value})
except Exception as e:
raise Exception(f'Unable to fetch public keys due to {str(e)}')


fetch_public_keys()

# Options to validate the JWT token
# Only modification from default is to turn off verify_at_hash as we don't provide the access token for this validation
Expand All @@ -53,49 +20,57 @@ def fetch_public_keys():
'verify_iss': True,
'verify_sub': True,
'verify_jti': True,
'verify_at_hash': False,
'require_aud': True,
'require_iat': True,
'require_exp': True,
'require_nbf': False,
'require_iss': True,
'require_sub': True,
'require_jti': True,
'require_at_hash': False,
'leeway': 0,
'verify_at_hash': True,
'require': ['aud', 'iat', 'exp', 'iss', 'sub', 'jti'], # "nbf", "at_hash"
}


class JWTServices:
petrkalos marked this conversation as resolved.
Show resolved Hide resolved
@staticmethod
def _fetch_openid_url(key):
response = requests.get(
os.path.join(os.environ.get('custom_auth_url', ''), '.well-known', 'openid-configuration')
)
response.raise_for_status()
return response.json().get(key)

@staticmethod
def validate_jwt_token(jwt_token):
try:
# Decode and verify the JWT token
header = get_unverified_header(jwt_token)
kid = header['kid']
if kid not in issuer_keys:
logger.info('Public key not found in provided set of keys')
# Retry Fetching the public certificates again in case rotation occurs and lambda has cached the publicKeys
fetch_public_keys()
if kid not in issuer_keys:
raise Exception('Unauthorized')
public_key = issuer_keys.get(kid)
payload = decode(
# get JWK URI from OpenId Configuration
jwks_url = JWTServices._fetch_openid_url('jwks_uri')

# Init pyJWT.JWKClient with JWK URI
jwks_client = jwt.PyJWKClient(jwks_url)

# get signing_key from JWT
signing_key = jwks_client.get_signing_key_from_jwt(jwt_token)

# Decode and Verify JWT
payload = jwt.decode(
jwt_token,
public_key.get('jwk'),
signing_key.key,
algorithms=['RS256', 'HS256'],
issuer=public_key.get('issuer'),
audience=public_key.get('audience'),
issuer=os.environ.get('custom_auth_url'),
audience=os.environ.get('custom_auth_client'),
leeway=0,
options=jwt_options,
)

return payload
except ExpiredSignatureError:
except jwt.exceptions.ExpiredSignatureError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would directly raise these Exceptions here instead of passing the None and then raising the general exception

logger.error('JWT token has expired.')
return None
except JWTError as e:
except jwt.exceptions.PyJWTError as e:
logger.error(f'JWT token validation failed: {str(e)}')
return None
except Exception as e:
logger.error(f'Failed to validate token - {str(e)}')
return None

@staticmethod
def validate_access_token(access_token):
Copy link
Contributor

@petrkalos petrkalos Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put a TTLCache decorator around it with a configurable ttl? We must check the negative scenario carefully though...

  1. Users is not authorized and tries to access
  2. User gets access denied and TTLCache is populated with an "Exception?"
  3. User asks admins to grant access
  4. User tries again to access

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for TTL Cache - please note custom authorizer already ahs a results_cache_ttl (set to 1 minute for now) at deploy/stacks/lambda_api.py:

custom_authorizer = apigw.RequestAuthorizer(
            self,
            'CustomAuthorizer',
            handler=self.authorizer_fn,
            identity_sources=[apigw.IdentitySource.header('Authorization')],
            authorizer_name=f'{resource_prefix}-{envname}-custom-authorizer',
            assume_role=custom_authorizer_role,
            results_cache_ttl=Duration.minutes(1),
        )

Would adding an additional caching mechanism at the above be useful?

# get JWK UserInfo URI from OpenId Configuration
user_info_url = JWTServices._fetch_openid_url('userinfo_endpoint')
r = requests.get(user_info_url, headers={'Authorization': access_token})
r.raise_for_status()
logger.debug(r.json())
5 changes: 2 additions & 3 deletions deploy/custom_resources/custom_authorizer/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,10 +1,9 @@
certifi==2024.7.4
charset-normalizer==3.1.0
ecdsa==0.18.0
idna==3.7
pyasn1==0.5.0
python-jose==3.3.0
requests==2.32.2
rsa==4.9
six==1.16.0
urllib3==1.26.19
urllib3==1.26.19
pyjwt==2.9.0
1 change: 1 addition & 0 deletions deploy/stacks/backend_stack.py
Original file line number Diff line number Diff line change
Expand Up @@ -195,6 +195,7 @@ def __init__(
apig_vpce=apig_vpce,
prod_sizing=prod_sizing,
user_pool=cognito_stack.user_pool if custom_auth is None else None,
user_pool_client=cognito_stack.client if custom_auth is None else None,
pivot_role_name=self.pivot_role_name,
reauth_ttl=reauth_config.get('ttl', 5) if reauth_config else 5,
email_notification_sender_email_id=email_sender,
Expand Down
Loading
Loading