Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for layers to Docker runtimes #950

Merged
merged 21 commits into from
Aug 26, 2020

Conversation

footballencarta
Copy link
Contributor

@footballencarta footballencarta commented Apr 2, 2020

Added support for Docker to download and use Lambda Layers. This requires a number of things to be in place, but are in the documentation.

aws-sdk has been moved as a dependency because of the layer download, but this is only instantiated if it detects layers, and the provider is aws - so there should be little chance of affecting other items.

The docker runtime is also more configurable - you can now disable the read-only of the runtime (we use lumen, with a SQLite database for local development, so it was kinda needed).

This is fully implemented to HTTP items, but I've added it to the pass-through for the events and WebSocket as well incase they support Docker in the future.

Any other questions or comments, just let me know and I'll take a look 🙂 Really keen to get this merged so we can start using layers locally!

Resolves #630 ...ish

@dherault
Copy link
Owner

dherault commented Apr 4, 2020

Hi, nice work thanks! Can you add damon.williams@realitymine.com to the list of your emails under Settings > Emails so your commits are linked to a GitHub account please?

@dherault
Copy link
Owner

dherault commented Apr 4, 2020

Also can you resolve the conflicts?

@frozenbonito
Copy link
Contributor

It is a great work.
Let me some comments.

  • I think the layers should not be passed to HTTP and the other events. It complicates the implementation.
    It is simpler to get the layer settings from functionDefinition in LambdaFunction class.

  • I think --dockerLayersDir option should not be added.

    • Not only docker-runner, all handler-runners should support layers in the future.
    • I think the layers should be cached globally by default like as serverless invoke local (please refer to the here) in the future. Once the default of the option is determined, it will require breaking changes. I suggests using tmpdir for layers for now.
      For test, I think you can use public lambda layer. For example, mthenw/awesome-layers.

@footballencarta
Copy link
Contributor Author

@dherault I'll get on those changes right away. I've fixed the email one already.

@frozenbonito I'll address your points in order, prepare for lots of words:

I think the layers should not be passed to HTTP and the other events. It complicates the implementation.
It is simpler to get the layer settings from functionDefinition in LambdaFunction class.

Noted, that seems like a sensible idea - I didn't even realise you could get them from there, so I'll certainly be making that change.

I think --dockerLayersDir option should not be added.

I'll be honest, I disagree with you here - I think giving control over where you want layers to be stored is essential for reuse and maintenance. You may want to store types of layers in different places for sanity's sake, or anything like that. Having a sensible default, but allowing an over-ride gives the best of both worlds, and it doesn't tightly-couple it to a temporary directory.

Not only docker-runner, all handler-runners should support layers in the future.

I'm not sure how you'd police that, as layers are constructed in a very specific way, it would require you to run all your binary code through the same directory mimicking the structure of lambda. I understand the logic, but the complexity of something like that doesn't seem worth the effort when docker can be used - that's what AWS SAM does after all.

I think the layers should be cached globally by default like as serverless invoke local (please refer to the here) in the future. Once the default of the option is determined, it will require breaking changes. I suggests using tmpdir for layers for now.

I originally toyed with that idea, but ultimately decided against enforcing storing it in a temp directory by default.

tmpdir's can be cleared out after reboots, or other events meaning control of their persistence disappears from the developer. Allowing them to be placed in other places allows the developer to control that persistence. tmpdir's are handled badly with node on macOS as well (nodejs/node#11422) which means the tmpdir from node can't be easily mounted to docker containers on macOS without config changes. That also means I'll need to patch the recent artifacts change, as that doesn't work on macs, and the tests are failing because of it.

Again, I believe the same reasoning as above applies here; I believe a sensible default should be applied, and I believe having it in the same directory as your code is the most sensible place of putting it.

For test, I think you can use public lambda layer. For example, mthenw/awesome-layers.

I did originally try that - I was using gkrizek/bash-lambda-layer for testing as it required no other dependencies. The problem came with testing. As the build server's don't have an AWS account on them, you can't run the required aws lambda getLayerVersion check to find out where the zip file location is. If we could add an AWS account to the tests, I'll certainly add it back in.

I've tested it with bref as well using some work projects, and it worked completely as expected.

Hope that clears up some of the technical choices! 🙂

@frozenbonito
Copy link
Contributor

frozenbonito commented Apr 5, 2020

@footballencarta Thank you for your comments!

I think giving control over where you want layers to be stored is essential for reuse and maintenance. You may want to store types of layers in different places for sanity's sake, or anything like that. Having a sensible default, but allowing an over-ride gives the best of both worlds, and it doesn't tightly-couple it to a temporary directory.

I'm not sure how you'd police that, as layers are constructed in a very specific way, it would require you to run all your binary code through the same directory mimicking the structure of lambda. I understand the logic, but the complexity of something like that doesn't seem worth the effort when docker can be used - that's what AWS SAM does after all.

I agree, but docker is just one option for serverless-offline. At least, I think it's better to rename the option to --layersDirfor future expansion. For the time being, only docker-runner will support it.

tmpdir's can be cleared out after reboots, or other events meaning control of their persistence disappears from the developer. Allowing them to be placed in other places allows the developer to control that persistence.

I suggested tmpdir as a temporary workaround. In actually, I think cachedir is more would be more appropriate (like as serverless invoke local) for the global cache persistence.

tmpdir's are handled badly with node on macOS as well (nodejs/node#11422) which means the tmpdir from node can't be easily mounted to docker containers on macOS without config changes. That also means I'll need to patch the recent artifacts change, as that doesn't work on macs, and the tests are failing because of it.

Oh, sorry. I didn't know that. Thank you for patching!

Again, I believe the same reasoning as above applies here; I believe a sensible default should be applied, and I believe having it in the same directory as your code is the most sensible place of putting it.

I'm tired of having to manage many dot directories (which should be ignored by git and excluded by serverless packaging), so I prefer the default place is out of the repository.
Or I hope only one directory like a .serverless for serverless will be defined for this plugin. For example, .serverless-offline (the layers will be stored at .serverless-offline/layers by default). In this case, the plugin can use the directory for various purposes, and users only need to manage one directory.

I did originally try that - I was using gkrizek/bash-lambda-layer for testing as it required no other dependencies. The problem came with testing. As the build server's don't have an AWS account on them, you can't run the required aws lambda getLayerVersion check to find out where the zip file location is. If we could add an AWS account to the tests, I'll certainly add it back in.

I got it, thank you!

@footballencarta
Copy link
Contributor Author

@frozenbonito thanks for the comments, I think there are some really good ideas in there!

I agree, but docker is just one option for serverless-offline. At least, I think it's better to rename the option to --layersDir for future expansion. For the time being, only docker-runner will support it.

Noted. I'll change the option to --layersDir.

I'm tired of having to manage many dot directories (which should be ignored by git and excluded by serverless packaging), so I prefer the default place is out of the repository.
Or I hope only one directory like a .serverless for serverless will be defined for this plugin. For example, .serverless-offline (the layers will be stored at .serverless-offline/layers by default). In this case, the plugin can use the directory for various purposes, and users only need to manage one directory.

I've moved everything over to the .serverless-offline/layers folder as you gave in the example. I think it's a great idea to store the stuff serverless-offline might need in one place 🙂

Based on those changes, do we think moving the default to a cachedir is no longer needed?

@frozenbonito
Copy link
Contributor

@footballencarta Thank you!

Based on those changes, do we think moving the default to a cachedir is no longer needed?

Yes, I think so.

@footballencarta
Copy link
Contributor Author

footballencarta commented Apr 6, 2020

Awesome! The builds seem to be trying to compile nodejs from scratch which is causing them to timeout - is there any chance someone could have a look at re-running them?

@footballencarta
Copy link
Contributor Author

Hey, is there anything else to do on this? Or could it be merged?

@mrkirchner
Copy link

any status on this?

@YOU54F YOU54F mentioned this pull request May 7, 2020
@dherault
Copy link
Owner

If the conflicts are resolved I'm willing to merge

@footballencarta
Copy link
Contributor Author

Apologies, I've been out of the loop because of work recently. I'll take a look in the next few days.

@footballencarta
Copy link
Contributor Author

Well that was easier than I expected - all resolved now!

@Maxwell2022
Copy link

@footballencarta @dherault Any plan to merge this PR soon? We'd like to use the offline plugin to run Bref using SQS triggers

@andyinabox
Copy link

This looks great! Can't wait to see it merged...

@footballencarta
Copy link
Contributor Author

@dherault any chance of the merge?

@Maxwell2022
Copy link

@footballencarta would it be possible to inject docker options? For instance to put the container in a specific docker network so that we don't have to expose ports from other containers

@footballencarta
Copy link
Contributor Author

@Maxwell2022 - I'm not very well up on Docker, so if it's something you'd want to add, then feel free :)

@dherault - is there any chance of merging this?

@dherault dherault merged commit b31f89d into dherault:master Aug 26, 2020
@dherault
Copy link
Owner

Amazing work @footballencarta!
I'm super glad to merge this, if you need anything else ping me, I'll be more responsive in the following days since my holidays are over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can Lambda layers be used in offline mode?
6 participants