Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for builtin API Gateway authorizers #381

Closed
wants to merge 11 commits into from

Conversation

jamesls
Copy link
Member

@jamesls jamesls commented Jun 15, 2017

Chalice currently supports IAM, Cognito, and Custom authorizers.
The custom authorizer allows you to define a lambda function which
contains custom authorization logic specific for your use case.
Previously, you'd have to create your lambda authorizer function out
of band from your chalice app. We just provided an API to associated
your custom authorizer with certain views.

This change adds support for a "built in authorizer", which allows you
to create custom authorizers that are defined within your chalice app.
When you run "chalice deploy" we'll automatically create and associate
the authorizer for you, but we'll also create the lambda function
associated with the authorizer.

This introduces new concepts in chalice including multi lambda function
support. We now have an API handler (the existing lambda function) as
well as auth handlers.

There's still a few missing pieces that need to get ironed out:

  • Need to work out how we want to configure the auth functions
    in config.json (timeouts, memory, etc)
  • Need to work out how to deal with "chalice package". It appears
    SAM doesn't support giving implicit permission to invoke the lambda
    function in the same way the event type "Api" works.
  • Docs (in progress, should be coming shortly).

Closes #356

@jamesls jamesls requested review from stealthycoin and kyleknap June 15, 2017 22:22
@codecov-io
Copy link

codecov-io commented Jun 15, 2017

Codecov Report

Merging #381 into master will increase coverage by 0.54%.
The diff coverage is 98.95%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #381      +/-   ##
==========================================
+ Coverage   91.62%   92.17%   +0.54%     
==========================================
  Files          18       18              
  Lines        2280     2440     +160     
  Branches      294      320      +26     
==========================================
+ Hits         2089     2249     +160     
  Misses        142      142              
  Partials       49       49
Impacted Files Coverage Δ
chalice/__init__.py 100% <ø> (ø) ⬆️
chalice/app.py 96.82% <100%> (+0.72%) ⬆️
chalice/cli/factory.py 89.79% <100%> (ø) ⬆️
chalice/config.py 97.14% <100%> (+0.23%) ⬆️
chalice/awsclient.py 91.48% <100%> (+0.68%) ⬆️
chalice/deploy/swagger.py 100% <100%> (ø) ⬆️
chalice/package.py 97.59% <100%> (+0.18%) ⬆️
chalice/constants.py 100% <100%> (ø) ⬆️
chalice/deploy/deployer.py 89.3% <96.22%> (+1.31%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f6705a...c95b37b. Read the comment docs.

Copy link
Contributor

@kyleknap kyleknap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good. My comments were mainly just related to questions, suggestions on refactorings, and double checking edge cases.

@@ -483,3 +483,29 @@ def _client(self, service_name):
self._client_cache[service_name] = self._session.create_client(
service_name)
return self._client_cache[service_name]

def add_permission_for_authorizer(self, rest_api_id, function_arn,
random_id):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense that the random_id is a required parameter to pass in? Seems like it would make more sense if the awsclient handled it automatically for the user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would. IIRC correctly it was done this way to make testing easier. I believe there were issues with the botocore stubber's ANY implementation. Those might be fixed now, I can check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I think it would make sense to parameterize it but not completely hide it. This is used to ensure uniqueness and makes it impossible to do retries at any level above the awsclient level. I'd propose making this an Optional[str].

@@ -67,7 +68,27 @@ def _add_route_paths(self, api, app):

def _generate_security_from_auth_obj(self, api_config, authorizer):
# type: (Dict[str, Any], Authorizer) -> None
config = authorizer.to_swagger()
if isinstance(authorizer, ChaliceAuthorizer):
function_name = '%s-%s' % (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is second time this name is generated. I wonder if it makes sense to have a shared helper function/property to determine this.

chalice/app.py Outdated
auth_name = auth_func.__name__
ttl_seconds = kwargs.pop('ttl_seconds', None)
execution_role = kwargs.pop('execution_role', None)
auth_config = BuiltinAuthConfig(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to put validation in on if they pass in kwargs that do not get popped off like we do in route()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update.

chalice/app.py Outdated
ttl_seconds=ttl_seconds,
execution_role=execution_role,
)
# Do we even need the builtin_auth_handlers attr? See if we can
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we would given it is used for iteration in the deployer/swagger code. Might want to remove the comment.

@@ -626,3 +646,120 @@ def _add_cors_headers(self, response, cors):
for name, value in cors.get_access_control_headers().items():
if name not in response.headers:
response.headers[name] = value


class BuiltinAuthConfig(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason for calling it Builtin as opposed to Custom? Custom falls more in-line with what API Gateway uses in its documentation. Or maybe just calling them AuthConfig's?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There'a already a custom authorizer in the chalice code. The nomenclature that made the most sense to be (given what's already there):

  • Custom authorizers - using the custom authorizer functionality in API gateway with a lambda function that you manage outside of chalice. This maps to the pre-existing CustomAuthorizer class.
  • Builtin authorizers - authorizers that are defined in a chalice app that we manage/configure for you. Implementation wise it uses the custom authorizer feature of API gateway but is largely abstracted in chalice.

random_id):
# type: (str, str, str) -> None
client = self._client('apigateway')
authorizers = client.get_authorizers(restApiId=rest_api_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to be using a paginator here. Unfortunately, I do not think botocore has one yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not. I couldn't find in their docs what the page size, but I think this might be a forward compatibility thing in their API. The max authorizers for an API is 10 (http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html) so I don't think we'll be affected. I can add a comment about that though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works too.

authorizer_id = authorizer['id']
break
else:
raise ValueError("Unable to find authorizer associated "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is also a ResourceDoesNotExistError exception. Should that be used instead of ValueError? There is not really any documentation on when the ResourceDoesNotExistError should be raised.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think carlyl added that with the delete feature, which was done after this branch was started. I can update to use ResourceDoesNotExistError.

@@ -280,7 +308,48 @@ def deploy(self, config, existing_resources, stage_name):
config, function_name, stage_name)
deployed_values['api_handler_name'] = function_name
deployed_values['api_handler_arn'] = function_arn
return deployed_values

def _deploy_auth_handlers(self, config, existing_resources, stage_name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we currently handle auth handlers that get removed from a chalice app? So say I defined a auth handler deployed my application and then decided I did not want it anymore. So I delete it from my app.py file and redeploy. My expectation would be that the lambda function for the auth handler will get deleted. From the looks of it, it seems that the deploy code will leave orphaned lambda functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be semi-orphaned in the sense that we won't delete it on deploy, but it will still stay in deployed.json so will get removed as part of chalice delete.

I agree it does make sense if you delete your auth handler code we should remove the lambda function. I'll look into how much work that is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capturing offline: I'll use either the deployed.json or query lambda based on the expected tags to delete things at the end of the deploy.

self.aws_client, self.packager, None, self.osutils,
self.app_policy)
self.aws_client.lambda_function_exists.return_value = True
self.aws_client.update_function.return_value = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to make assertions about the parameters the update_function and create_function were called with? Currently the only way to tell that the correct parameters were used is by running an integration test. At the very least we should check that the correct function name and handler name is being past in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can update this.

"arn:aws:execute-api:us-west-2:123:rest-api-id/dev/POST/a/b",
]
response = app.AuthResponse(
[app.AuthRoute('/a', ['GET']),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is worth adding a test that mixes string and AuthRoute in the AuthResponse list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll update this.

@jamesls
Copy link
Member Author

jamesls commented Jun 19, 2017

Incorporated feedback so far. Remaining items:

  • Docs
  • Configuration for authorizer functions (timeouts/roles/memory_size)
  • Delete auth lambda functions when removed from code on redeploy

Let me know if I missed anything outside of those three items.

@kyleknap
Copy link
Contributor

Changes look good to me up to this point.

@jamesls
Copy link
Member Author

jamesls commented Jun 20, 2017

The last piece of functionality missing is how to configure the authorizers. I'm going to update the original issue with an updated proposal before implementing.

@jamesls
Copy link
Member Author

jamesls commented Jun 21, 2017

@kyleknap Updated. This should be all the code related changes. Just needs docs. I'm going to try it out a bit more and see if there's any edge cases I missed, but otherwise is ready to look at again.

Copy link
Contributor

@kyleknap kyleknap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just some small comments and questions

#: Params that a user provided explicitly,
#: typically via the command line.
self.chalice_stage = chalice_stage
self.function_name = function_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think this is fine for now as lambda functions are really the only thing we configure. But how do you see this will expand if we ever support per-resource configurations for other AWS resources besides Lambda? Would we add resource name to the config as well and the config would then be scoped to: stage + lambda function + any other specific managed resource configuration? I just want to make sure there won't be any issues expanding this out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually see additional resources not using Config directly but instead using a resource specific config obj that Config can create. Initially I had this as Config --references-> LambdaConfig, but reverted that in favor of this current approach because:

  1. It's a bigger interface change across all the deployer code which is likely getting refactored soon anyways
  2. Almost all the config in Config is actually lambda configuration.

So say if we had ddb tables, you'd say something like:

ddb_config = cfg.resource_config(cfg.stage_name, 'dynamodb_tables')
ddb_config['mytable']['name']
ddb_config['mytable']['provisioned_throughput']
... etc ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool. I was thinking along the same lines of how you would get resource specific configuration. Makes sense to keep the changes small for now until more configurable managed resources get added as the deployer will need to get refactored.

@@ -854,6 +855,82 @@ def test_can_update_auth_handlers(self, sample_app_with_auth):
zip_contents=b'package contents',
)

def test_can_create_auth_with_different_config(self, sample_app_with_auth):
# We're notusing create_config_obj because we want to approximate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notusing -> not using




def test_can_create_scope_new_stage_only():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unimplemented test

self.aws_client.create_function.side_effect = [
self.lambda_arn, 'arn:auth-function']
deployer.deploy(config, None, stage_name='dev')
self.aws_client.create_function.assert_called_with(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to check that the call to create_function to create the api handler used the expected configuration values as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following. That's what's happening on lines 900/901 for the timeout and memory size. Was there more you wanted asserted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So right now we are only checking the last client create_function call which was for the auth handler. The first create_function call is for the api handler which is not checked, but has different parameters such as its lambda timeout will be 10 and its lambda memory size will be the default. The assertion would be on what was called in the first create_function call and it would be just a sanity check to make sure the correct scope was used in creating the api handler. If there is already a check for that though, I don't think we need to add an assertion. I may have missed it if there is one as I am only looking at the diffs as you add them, not the entire PR diff each time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok yeah I can update the test if that's not covered.

('beta', 'myauth', 'beta-myauth'),
('prod', 'api_handler', 'prod-stage'),
('prod', 'myauth', 'prod-stage'),
('foostage', 'api_handler', 'global'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a stage or a function name that does not exist in the config does not raise an error? That seems like it could raise some unexpected results down the road if the wrong stage or function name is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have no way of knowing that unless we want to change the interface to declare/define your stages before you can deploy them, which IMO is a tedious API.

Perhaps, we could add an optional prompt when you deploy a stage for the first time, but that wouldn't need changes to the config file validation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Yeah I think it would be an optional prompt for a deploying to a new stage that is not the very first chalice deploy. I feel that the initial chalice deploy should work with minimal amount of prompting. I don't think this should be done in this PR though.

c = Config(chalice_stage='dev', config_from_disk=disk_config)
new_config = c.scope(chalice_stage='dev',
function_name='myauth')
assert new_config.lambda_timeout == 40
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I know it may be tedious, but I wonder if it makes sense to have explicit test to make sure all the valid variables can be configured per lambda function? In the tests, lambda_timeout, environment_variables, and tags are used routinely, but variables such as iam_role_arn, lambda_memory_size, autogen_policy, and iam_policy_file are not. I feel that it would be good to have for a quick check. If not split this into individual tests, maybe have one large test or general helper assertion that makes sure that each of these values exhibit the per function behavior (i.e. override global/stage values and are separate from other functions)?

def scope(self, chalice_stage, function_name):
# type: (str, str) -> Config
# Used to create a new config object that's scoped to a different
# stage and/or function. This creates a completely separate copy.
Copy link
Member Author

@jamesls jamesls Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is actually wrong now. Initially I was doing a deepcopy on the self.__dict__, but ran into issues that couldn't be serialized because our config obj has a reference to the chalice_app which contains all kinds of objects. Now that we're sharing __dict__, this doesn't actually work. I'll get this fixed.

@kyleknap
Copy link
Contributor

Latest updates look good. Looks like all that is left is rebasing and adding docs then it should be good.

jamesls added 10 commits June 21, 2017 15:59
Chalice currently supports IAM, Cognito, and Custom authorizers.
The custom authorizer allows you to define a lambda function which
contains custom authorization logic specific for your use case.
Previously, you'd have to create your lambda authorizer function out
of band from your chalice app.  We just provided an API to associated
your custom authorizer with certain views.

This change adds support for a "built in authorizer", which allows you
to create custom authorizers that are defined within your chalice app.
When you run "chalice deploy" we'll automatically create and associate
the authorizer for you, but we'll also create the lambda function
associated with the authorizer.

This introduces new concepts in chalice including multi lambda function
support.  We now have an API handler (the existing lambda function) as
well as auth handlers.

There's still a few missing pieces that need to get ironed out:

* Need to work out how we want to configure the auth functions
  in config.json (timeouts, memory, etc)
* Need to work out how to deal with "chalice package".  It appears
  SAM doesn't support giving implicit permission to invoke the lambda
  function in the same way the event type "Api" works.
* Docs (in progress, should be coming shortly).
* Use ResourceDoesNotExistError
* Provide defaults for random_id param in awsclient
* Add note about API gateway authorizer pagination
* Validate unknown kwargs to authorizer()
* Test expected kwargs passed to create/update function
* Add test for mixed string and AuthRoutes in AuthResponse
As part of this change, I updated the pylintrc to permit
the scope() method to assign attributes for an instance.
That way refs to the :class: work as expected.
Includes api doc updates as well as a topic guide.
@jamesls jamesls force-pushed the multi-lambda-auth branch from 2c6213b to 8aa07e7 Compare June 21, 2017 23:12
def demo_auth(auth_request):
token = auth_request.token
# This is just for demo purposes as shown in the API Gateway docs.
# Normally you'd call an oauth provider, validae the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validae -> validate

Copy link
Contributor

@kyleknap kyleknap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good. 🚢 Also make sure to add a changelog entry

@jamesls
Copy link
Member Author

jamesls commented Jun 22, 2017

Merged via 4d1751d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants