Chrome automation microservice for AWS Lambda with an HTTP API, automatic screenshotting on errors and file upload support.
Just plop the lambdafunc.zip from the download link into AWS Lambda.
Then set up AWS API Gateway or Edgerouter as the HTTP ingress.
Our base project recommends allocating at least 1600 MB of RAM. You know the memes about Chrome and RAM.
You should also increase the Lambda timeout from its default. I used 2 minutes, but your needs may vary.
Save as script.js:
    exports.handler = async (ctx, console, browser) => {
        console.log('example log message')
        ctx.data = {
            message: 'hello world',
        };
    }
Now run the curl command (from "How to submit jobs"). You should see this:
    $ curl ...
    {
        "logMessages": [
            "example log message"
        ],
        "errorMessages": [],
        "data": {
            "message": "hello world"
        },
        "error": null
    }
Now let's actually do something with Chrome. Modify script.js:
    exports.handler = async (ctx, console, browser) => {
        const page = await browser.newPage();
        await page.goto('https://www.reddit.com/');
        ctx.data = {
            title: await page.title(),
        };
    }
Running it:
    $ curl ...
    {
        "logMessages": [],
        "errorMessages": [],
        "data": {
            "title": "reddit: the front page of the internet"
        },
        "error": null
    }
There is a small Go-based client available.
Example with curl and authorization (you can use anything as the token if you haven't set up authorization):
$ curl --data-binary "@script.js" -H 'Authorization: Bearer ...' -H 'Content-Type: application/javascript' https://example.com/api/chromeserver/job
Since you're sending the script with each request, in theory you could substitute placeholders in the script before sending it, but that's ugly.
Any URL parameters like POST /api/chromeserver/job?msg=foo will be available at ctx.params.msg.
This isn't enabled by default; you have to add the errorAutoScreenshot=1 URL parameter.
If your script throws an exception, a screenshot will be taken and saved to S3, and its URL will be returned along with your error.
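From Node, a job submission that passes a custom parameter and enables automatic screenshots could be assembled roughly like this. It is only a sketch: buildJobRequest is a hypothetical helper name, the base URL is the placeholder from the curl example, and the commented-out fetch call assumes Node 18 or newer.

```javascript
// Hypothetical helper: builds the URL and fetch options for submitting a job.
function buildJobRequest(baseUrl, script, token, params = {}) {
    const url = new URL(baseUrl);

    // Every entry becomes a URL parameter, readable in the script via ctx.params.
    for (const [key, value] of Object.entries(params)) {
        url.searchParams.set(key, String(value));
    }

    return {
        url: url.toString(),
        options: {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${token}`,
                'Content-Type': 'application/javascript',
            },
            body: script,
        },
    };
}

// Usage (not run here, it needs a deployed endpoint):
//   const fs = require('fs');
//   const req = buildJobRequest(
//       'https://example.com/api/chromeserver/job',
//       fs.readFileSync('script.js', 'utf8'),
//       process.env.TOKEN, // hypothetical environment variable
//       { msg: 'foo', errorAutoScreenshot: 1 },
//   );
//   const res = await fetch(req.url, req.options);
//   console.log(await res.json());
```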
If your script produces a lot of data (like files) that you'd rather not return in the response JSON, you can call ctx.uploadFile() to have the files uploaded to S3. The function returns a URL to the file, which you can include in your JSON.
You need an S3 bucket, with its name defined as the S3_BUCKET environment variable for your Lambda.
We recommend setting up automatic expiration for a specific directory prefix, like temp-7d/, so that files are deleted after 7 days.
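One way to express such an expiration rule is an S3 lifecycle configuration. The snippet below is a sketch that builds one for the temp-7d/ prefix and prints it as JSON; the rule ID is an arbitrary label chosen for illustration.

```javascript
// Builds an S3 lifecycle configuration that expires objects under temp-7d/
// seven days after creation. The rule ID is an arbitrary illustrative label.
const lifecycleConfiguration = {
    Rules: [
        {
            ID: 'expire-temp-7d',
            Status: 'Enabled',
            Filter: { Prefix: 'temp-7d/' },
            Expiration: { Days: 7 },
        },
    ],
};

// Print as JSON, e.g. to save it as lifecycle.json.
console.log(JSON.stringify(lifecycleConfiguration, null, 2));
```

The printed JSON has the shape that `aws s3api put-bucket-lifecycle-configuration --bucket YOUR-BUCKET_NAME --lifecycle-configuration file://lifecycle.json` accepts; the same rule can also be created by hand in the S3 console.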
We recommend that you don't give your Lambda full S3 access, not even to the whole bucket, but only to a subdirectory. Here's an example policy:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:PutObjectAcl"
                ],
                "Resource": [
                    "arn:aws:s3:::YOUR-BUCKET_NAME/temp-7d/chrome-server/*"
                ]
            }
        ]
    }
When developing a complex script, it's very useful to be able to test it locally (think: getting screenshots when you encounter an error).
First, meet Turbo Bob.
Then:
- enter dev container
- test your script
It looks like this:
    $ bob dev
    $ ./devrunner.js script.js
    {
        logMessages: [],
        errorMessages: [],
        data: { title: 'reddit: the front page of the internet' },
        error: null
    }
(If it complains about $ npm install, run it.)
The devrunner.js gives you the same output as you would get from Lambda over HTTP.
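For quick checks of script logic that don't need a real browser, the response envelope can also be imitated in a few lines of plain Node. This is not devrunner.js or chrome-server's implementation, only a sketch: runJob is a hypothetical name, and both the routing of console.error into errorMessages and the serialization of a thrown error as its message string are assumptions.

```javascript
// Hypothetical stand-in for a job runner: executes a handler and wraps the
// result in the same envelope shape as the documented responses.
async function runJob(handler, params = {}, browser = null) {
    const logMessages = [];
    const errorMessages = [];

    // Minimal console replacement that records messages instead of printing.
    // ASSUMPTION: console.error output ends up in errorMessages.
    const jobConsole = {
        log: (msg) => logMessages.push(String(msg)),
        error: (msg) => errorMessages.push(String(msg)),
    };

    const ctx = { params, data: null };
    let error = null;

    try {
        await handler(ctx, jobConsole, browser);
    } catch (err) {
        // ASSUMPTION: a thrown error is reported as its message string.
        error = err instanceof Error ? err.message : String(err);
    }

    return { logMessages, errorMessages, data: ctx.data, error };
}

// Example: run the hello world script without Chrome.
const helloWorld = async (ctx, console, browser) => {
    console.log('example log message');
    ctx.data = { message: 'hello world' };
};

runJob(helloWorld).then((result) => console.log(JSON.stringify(result, null, 2)));
```

Scripts that call browser.newPage() still need the dev container or a hand-written browser stub passed as the third argument.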
Note: after doing local dev, if you build chrome-server, your lambdafunc.zip will be larger (and above Lambda's limit) because of the additional node_modules. But usually you won't develop your job scripts and change chrome-server at the same time. :)
Since this service evaluates user-sent JavaScript, you need to trust the API users.
Protect the API with authorization in AWS API Gateway or in Edgerouter.
Don't give the Lambda function any privileges beyond what is needed to do its job:
- run Lambda
- upload data to S3 to a restricted path within the bucket
This is a small decoration on top of alixaxel/chrome-aws-lambda. He's done amazing work!