Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Token Validations #1682

Merged
merged 33 commits into from
Nov 7, 2024
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
155db7a
Add checks to ensure access tokenn validity
noah-paige Nov 4, 2024
b70b55c
store access token non cognito
noah-paige Nov 4, 2024
88c03e5
Enforce validation on Custom Auth Lambda
noah-paige Nov 4, 2024
5d1c2c9
clean up
noah-paige Nov 4, 2024
02b5d36
env var user info endpoint
noah-paige Nov 4, 2024
7117d55
fix lambda API CDK
noah-paige Nov 4, 2024
cba99e9
remove unused import
noah-paige Nov 4, 2024
09caacb
clean up and lint
noah-paige Nov 4, 2024
0e77776
handle cognito groups header
noah-paige Nov 4, 2024
e6fcf8f
update custom auth logic
noah-paige Nov 5, 2024
58a5440
fix
noah-paige Nov 5, 2024
c04760a
fix custom Auth logic
noah-paige Nov 5, 2024
b5c0c08
fix cognito groups list handling
noah-paige Nov 5, 2024
25334bc
lint
noah-paige Nov 5, 2024
fca14f2
update checkov baseline
noah-paige Nov 5, 2024
64fa2f1
use well-known/openid-config
noah-paige Nov 5, 2024
d416a55
Update deps custom authorizer
noah-paige Nov 5, 2024
913c1ce
clean up requirements
noah-paige Nov 5, 2024
a8cfe08
update cryptography version
noah-paige Nov 6, 2024
647b689
Add cryptography lambda layer
noah-paige Nov 6, 2024
eddcb86
Reformat
noah-paige Nov 6, 2024
3b4418d
Clean up and PR comments
noah-paige Nov 6, 2024
f30dd4f
Fix Integration Tests to fetch ID and Access Tokens
noah-paige Nov 6, 2024
320f652
use cdk klayers
noah-paige Nov 6, 2024
ff910c7
refactor jwt services
noah-paige Nov 6, 2024
9b9cbad
fix string formatting
noah-paige Nov 6, 2024
451b0d4
raise exceptions
noah-paige Nov 6, 2024
0e329a3
only use access token, remove id token
noah-paige Nov 6, 2024
5a0893b
update exception str
noah-paige Nov 6, 2024
8b8eb1a
Clean Up
noah-paige Nov 6, 2024
26bd79c
merge latest os main
noah-paige Nov 6, 2024
86db3e8
PR comments
noah-paige Nov 7, 2024
8db81a1
fix IDP_DOMAIN_URL setting
noah-paige Nov 7, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion backend/api_handler.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
import os
from argparse import Namespace
from time import perf_counter

from ariadne import (
gql,
graphql_sync,
Expand Down
15 changes: 12 additions & 3 deletions backend/dataall/base/utils/api_handler_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,13 +24,22 @@
]
ENGINE = get_engine(envname=ENVNAME)
ALLOWED_ORIGINS = os.getenv('ALLOWED_ORIGINS', '*')
AWS_REGION = os.getenv('AWS_REGION')


def redact_creds(event):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: That's why I don't like this approach, you have to remember to obfuscate stuff

if 'headers' in event and 'Authorization' in event['headers']:
if event.get('headers', {}).get('Authorization'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can anyone please tell why are we redacting these ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user creds information does not need to be logged and is no longer relevant for the remaining request lifecycle - opting to redact that info

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of redacting , can we just extract useful information and then pass it onto our lambda ?

event['headers']['Authorization'] = 'XXXXXXXXXXXX'
if 'multiValueHeaders' in event and 'Authorization' in event['multiValueHeaders']:

if event.get('multiValueHeaders', {}).get('Authorization'):
event['multiValueHeaders']['Authorization'] = 'XXXXXXXXXXXX'

if event.get('multiValueHeaders', {}).get('accesskeyid'):
event['multiValueHeaders']['accesskeyid'] = 'XXXXXXXXXXXX'

if event.get('headers', {}).get('accesskeyid'):
event['headers']['accesskeyid'] = 'XXXXXXXXXXXX'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in deploy/custom_resources/custom_authorizer/custom_authorizer_lambda.py the only values that we actually use are coming from the headers. Is it possible that the API succeeds with multiVValueHeaders? and if not, maybe we don't need these checks here. (It might be out of scope for this PR)

auth_token = incoming_event['headers']['Authorization']
access_token = incoming_event['headers']['accesskeyid']

return event


Expand Down Expand Up @@ -115,7 +124,7 @@ def check_reauth(query, auth_time, username):
# Determine if there are any Operations that Require ReAuth From SSM Parameter
try:
reauth_apis = ParameterStoreManager.get_parameter_value(
region=os.getenv('AWS_REGION', 'eu-west-1'), parameter_path=f'/dataall/{ENVNAME}/reauth/apis'
region=AWS_REGION, parameter_path=f'/dataall/{ENVNAME}/reauth/apis'
).split(',')
except Exception:
log.info('No ReAuth APIs Found in SSM')
Expand Down
6 changes: 4 additions & 2 deletions deploy/custom_resources/custom_authorizer/auth_services.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,17 +24,19 @@ def generate_policy(verified_claims: dict, effect, incoming_resource_str: str):
principal_id = verified_claims['sub']

# Attach a claim called 'email'. This is needed by Api Handler
verified_claims['email'] = verified_claims[os.getenv('email')]
verified_claims['email'] = verified_claims[os.getenv('email', 'email')]

for claim_name, claim_value in verified_claims.items():
if isinstance(claim_value, list):
if claim_name == 'cognito:groups':
verified_claims.update({claim_name: ','.join(claim_value)})
verified_claims.update({claim_name: json.dumps(claim_value)})

context = {**verified_claims}

context.update(
{
'user_id': verified_claims[os.getenv('user_id')],
'user_id': verified_claims[os.getenv('user_id', 'email')],
'custom_authorizer': 'true',
}
)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,10 @@ def lambda_handler(incoming_event, context):
if not verified_claims:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method validate_jwt_token is now raising exception this if block can be removed

raise Exception('Unauthorized. Token is not valid')

if os.getenv('provider') == 'Cognito':
access_token = incoming_event['headers']['AccessKeyId']
JWTServices.validate_access_token(access_token)

effect = 'Allow'
policy = AuthServices.generate_policy(verified_claims, effect, incoming_event['methodArn'])
logger.debug('Generated policy is ', policy)
Expand All @@ -39,12 +43,13 @@ def lambda_handler(incoming_event, context):
# AWS Lambda and any other local environments
if __name__ == '__main__':
# for testing locally you can enter the JWT ID Token here
token = ''
id_token = ''
access_token = ''
account_id = ''
api_gw_id = ''
event = {
'headers': {'Authorization': id_token, 'AccessKeyId': access_token},
'type': 'TOKEN',
'Authorization': token,
'methodArn': f'arn:aws:execute-api:us-east-1:{account_id}:{api_gw_id}/prod/POST/graphql/api',
}
lambda_handler(event, None)
6 changes: 6 additions & 0 deletions deploy/custom_resources/custom_authorizer/jwt_services.py
Original file line number Diff line number Diff line change
Expand Up @@ -99,3 +99,9 @@ def validate_jwt_token(jwt_token):
except Exception as e:
logger.error(f'Failed to validate token - {str(e)}')
return None

@staticmethod
def validate_access_token(access_token):
Copy link
Contributor

@petrkalos petrkalos Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put a TTLCache decorator around it with a configurable ttl? We must check the negative scenario carefully though...

  1. Users is not authorized and tries to access
  2. User gets access denied and TTLCache is populated with an "Exception?"
  3. User asks admins to grant access
  4. User tries again to access

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for TTL Cache - please note custom authorizer already ahs a results_cache_ttl (set to 1 minute for now) at deploy/stacks/lambda_api.py:

custom_authorizer = apigw.RequestAuthorizer(
            self,
            'CustomAuthorizer',
            handler=self.authorizer_fn,
            identity_sources=[apigw.IdentitySource.header('Authorization')],
            authorizer_name=f'{resource_prefix}-{envname}-custom-authorizer',
            assume_role=custom_authorizer_role,
            results_cache_ttl=Duration.minutes(1),
        )

Would adding an additional caching mechanism at the above be useful?

user_info_url = os.getenv('user_info_url', '')
r = requests.get(user_info_url, headers={'Authorization': access_token})
r.raise_for_status()
3 changes: 3 additions & 0 deletions deploy/stacks/backend_stack.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ def __init__(
id,
envname: str = 'dev',
resource_prefix='dataall',
tooling_region=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tooling_region is not used anywhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed!

tooling_account_id=None,
ecr_repository=None,
image_tag=None,
Expand Down Expand Up @@ -195,6 +196,8 @@ def __init__(
apig_vpce=apig_vpce,
prod_sizing=prod_sizing,
user_pool=cognito_stack.user_pool if custom_auth is None else None,
user_pool_client=cognito_stack.client if custom_auth is None else None,
user_pool_domain=cognito_stack.domain if custom_auth is None else None,
pivot_role_name=self.pivot_role_name,
reauth_ttl=reauth_config.get('ttl', 5) if reauth_config else 5,
email_notification_sender_email_id=email_sender,
Expand Down
Loading
Loading