Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Token Validations #1682

Merged
merged 33 commits into from
Nov 7, 2024
Merged

Added Token Validations #1682

merged 33 commits into from
Nov 7, 2024

Conversation

noah-paige
Copy link
Contributor

@noah-paige noah-paige commented Nov 4, 2024

Feature or Bugfix

  • Bugfix

Detail

  • This PR does the following w.r.t Cognito IdP and Auth Flow
    • Changes Authorizer from built-in Cognito Authorizer to Custom Authorizer to validate token signature, issuer, and expiry time, etc.
    • Adds aditional step to execute GET API on Cognito's /oauth/userInfo/ endpoint to ensure access Token validity

Allows data.all API request to execute if the above criteria are met

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use a standard proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@noah-paige
Copy link
Contributor Author

noah-paige commented Nov 4, 2024

Testing Completed:

  • CodePipeline Deploys Successfully (for Cognito Auth with already existing data.all instance - i.e. UPDATE)
  • Login + Store Access Token
  • Run data.all Operation (GraphQL endpoint)
  • Run Catalog Search (ESProxy endpoint)
  • SignOut (clear tokens)
  • ReAuth Workflow Successfuly (for query)
  • ReAuth Workflow Successfuly (for mutation)
  • User removed (or disabled) from Cognito IdP --> User no longer can perform actions on data.all UI immediately (note: at most have additional access up to TTL of cached custom authorizer response)

Custom Auth Checks:

  • Proper Access and Id JWT - Checks Pass (Cognito)

  • Exceptions Raised and Request Denied For

    • Malformed JWT
    • No JWT Provided
    • Wrong Issuer
    • Wrong Audience
    • User Disabled
  • Proper Access and Id JWT - Checks Pass (Okta)

  • Exceptions Raised and Request Denied For

    • Malformed JWT
    • No JWT Provided
    • Wrong Issuer
    • Wrong Audience
    • User Deactivated

'LOG_LEVEL': 'DEBUG',
'user_info_url': f'https://{user_pool_domain.domain_name}.auth.{self.region}.amazoncognito.com/oauth2/userInfo',
'custom_auth_provider': 'Cognito',
'custom_auth_url': f'https://cognito-idp.{self.region}.amazonaws.com/{user_pool.user_pool_id}',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of passing the user_info_url explicitly we can use https://{custom_auth_url}/.well-known/openid-configuration and read the userinfo_endpoint field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using https://{custom_auth_url}/.well-known/openid-configuration for userinfo URL now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove the custom_auth_jwks_url since you are pulling this from openid-configuration endpoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in latest

@noah-paige noah-paige marked this pull request as ready for review November 5, 2024 17:29
@noah-paige noah-paige self-assigned this Nov 6, 2024
@@ -21,14 +23,18 @@ export const useToken = () => {
await auth.signinSilent();
}
const t = auth.user.id_token;
const at = auth.user.access_token;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we now using both Access and Identity token ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need any particular token for any specific use ? Like a specific gql endpoint needs an identity token and cannot work with access token ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debating whether we should just send access token and forget about ID

currently we send both which may be overkill

ID token contains headers like email that we do need. However when we run a request with access token against userinfo endpoint we get that information (such as email) in response that we can add to context

Additionally, access tokens (for cognito at least) contain user and group information - not sure if that is standard

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a high level the token uses are:

  • the access token is needed to invoke the /userinfo endpoint
  • the verified claims of the ID Token are needed to populate context information (email, user, groups) for downstream lambda (this theoretically could be sourced elsewhere)

@@ -35,6 +35,7 @@ def __init__(
id,
envname: str = 'dev',
resource_prefix='dataall',
tooling_region=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tooling_region is not used anywhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed!

effect = 'Allow'
policy = AuthServices.generate_policy(verified_claims, effect, incoming_event['methodArn'])
logger.debug('Generated policy is ', policy)
logger.debug('Generated policy is ', json.dumps(policy))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.debug(f'Generated policy is json.dumps(policy)').

using f-string,. I think this logger.debug('Generated policy is ', json.dumps(policy)) doesn't work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup I made this change earlier and just ran into this - updating now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.debug('Generated policy is ', policy) was also not showing policy so opting for f string approach

return payload
except ExpiredSignatureError:
except jwt.exceptions.ExpiredSignatureError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would directly raise these Exceptions here instead of passing the None and then raising the general exception

import logging

logger = logging.getLogger()
logger.setLevel(os.environ.get('LOG_LEVEL', 'INFO'))

# Configs required to fetch public keys from JWKS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this because pyJWT does it with some inbuilt method ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'custom_auth_provider': custom_auth.get('provider'),
'custom_auth_url': custom_auth.get('url'),
'custom_auth_client': custom_auth.get('client_id'),
'custom_auth_jwks_url': custom_auth.get('jwks_url'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we removing the custom_auth_jwks_url URL completely ? and using the .well-known/openid-configuration instead ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

@@ -45,5 +51,5 @@ export const useToken = () => {
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the if (!token ) should it be instead if ( !token and !accessToken ) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated only using access token now



def redact_creds(event):
if 'headers' in event and 'Authorization' in event['headers']:
if event.get('headers', {}).get('Authorization'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can anyone please tell why are we redacting these ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user creds information does not need to be logged and is no longer relevant for the remaining request lifecycle - opting to redact that info

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of redacting , can we just extract useful information and then pass it onto our lambda ?

user_info = jwt_service.validate_access_token(auth_token)

# Validate JWT
verified_claims = jwt_service.validate_jwt_token(auth_token[7:])
if not verified_claims:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method validate_jwt_token is now raising exception this if block can be removed

@@ -16,21 +17,33 @@
Custom Lambda Authorizer is attached to the API Gateway. Check the deploy/stacks/lambda_api.py for more details on deployment
"""

OPENID_CONFIG_PATH = os.path.join(os.environ.get('custom_auth_url', ''), '.well-known', 'openid-configuration')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use os.environ['custom_auth_url'] and let it throw if auth_url isn't specified? otherwise it will construct a wrong url and later on will fail with an obscure message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure that is a good call out - just made that change to os.environ['custom_auth_url']

token_url = os.path.join(idp_domain_url, 'oauth2', 'token')
login_url = os.path.join(idp_domain_url, 'login')

client_id = os.environ['COGNITO_CLIENT']
Copy link
Contributor

@petrkalos petrkalos Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a quite generic OAuth2 client, should we rename this to CLIENT_ID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are still some particularities that makes integration tests only work with cognito. I am inclined to make it fully generic in a separate PR out of scope from here and leave it as it was inherited in this PR for now if that is okay

token = uuid.uuid4()
scope = 'aws.cognito.signin.user.admin openid'

idp_domain_url = os.environ['IDP_DOMAIN_URL']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must amend the env variables in the pipeline step

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

believe I already do that in the deploy/stacks/pipeline.py changes in this PR - will tag you on those lines for ref

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TejasRGitHub, take a look at this change as I think it will enable support for any IdP

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @petrkalos for pointing at this. Will take a look

@@ -16,21 +17,33 @@
Custom Lambda Authorizer is attached to the API Gateway. Check the deploy/stacks/lambda_api.py for more details on deployment
"""

OPENID_CONFIG_PATH = os.path.join(os.environ.get('custom_auth_url', ''), '.well-known', 'openid-configuration')
jwt_service = JWTServices(OPENID_CONFIG_PATH)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would capitalize global variables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't sure for global class instance - capitalizing now

@@ -716,6 +716,8 @@ def set_approval_tests_stage(
'aws sts get-caller-identity --profile buildprofile',
f'export COGNITO_CLIENT=$(aws ssm get-parameter --name /dataall/{target_env["envname"]}/cognito/appclient --profile buildprofile --output text --query "Parameter.Value")',
f'export API_ENDPOINT=$(aws ssm get-parameter --name /dataall/{target_env["envname"]}/apiGateway/backendUrl --profile buildprofile --output text --query "Parameter.Value")',
f'export IDP_DOMAIN_URL=https://$(aws ssm get-parameter --name /dataall/{target_env["envname"]}/cognito/domain --profile buildprofile --output text --query "Parameter.Value").auth.us-east-1.amazoncognito.com',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -16,21 +17,33 @@
Custom Lambda Authorizer is attached to the API Gateway. Check the deploy/stacks/lambda_api.py for more details on deployment
"""

OPENID_CONFIG_PATH = os.path.join(os.environ.get('custom_auth_url', ''), '.well-known', 'openid-configuration')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure that is a good call out - just made that change to os.environ['custom_auth_url']


# Initialize Klayers
runtime = _lambda.Runtime.PYTHON_3_9
klayers = Klayers(self, python_version=runtime, region=self.region)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@noah-paige noah-paige merged commit 5069bf8 into main Nov 7, 2024
10 checks passed
dlpzx pushed a commit that referenced this pull request Nov 8, 2024
<!-- please choose -->
- Bugfix

- This PR does the following w.r.t Cognito IdP and Auth Flow
- Changes Authorizer from built-in Cognito Authorizer to Custom
Authorizer to validate token signature, issuer, and expiry time, etc.
- Adds aditional step to execute GET API on Cognito's `/oauth/userInfo/`
endpoint to ensure access Token validity

Allows data.all API request to execute if the above criteria are met

- <URL or Ticket>

Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
@dlpzx dlpzx mentioned this pull request Nov 8, 2024
dlpzx added a commit that referenced this pull request Nov 8, 2024
### Feature or Bugfix
- Security

### Detail
* get-parameter CloudfrontDistributionDomainName from us-east-1 (#1687 )
* Added Token Validations (#1682)
* add warning to untrust data.all account when removing an environment
(#1685)
* add custom domain support for apigw (#1679)
* Lambda Event Logs Handling (#1678)
* Upgrade Spark version to 3.3 (#1675) -
a0c63a4
* ES Search Query Collect All Response  (#1631)
* Extend Tenant Perms Coverage (#1630)
* Limit Response info dataset queries (#1665)
* Add Removal Policy Retain to Bucket Policy IaC (#1660) 
* log API handler response only for LOG_LEVEL DEBUG. Set log level INFO
for prod deployments (#1662)
* Add permission checks to markNotificationAsRead + deleteNotification
(#1654)
* Added error view and unified utility to check tenant user (#1657
* Userguide signout flow (#1629)

### Relates
- Security release

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

---------

Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: Petros Kalos <kalosp@amazon.com>
dlpzx pushed a commit that referenced this pull request Nov 15, 2024
### Feature or Bugfix
<!-- please choose -->
- Enhancement

### Detail
- Add explicit token duration (60 min) over default 60 min

### Relates
- #1682 

### Security
Please answer the questions below briefly where applicable, or write
`N/A`. Based on
[OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this
includes
fetching data from storage outside the application (e.g. a database, an
S3 bucket)?
  - Is the input sanitized?
- What precautions are you taking before deserializing the data you
consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires
authorization?
- How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use a standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants