Add AWS ECR integration guide for Serverless endpoints #321

Open: wants to merge 3 commits into base: main
3 changes: 1 addition & 2 deletions .cursor/rules/rp-styleguide.mdc
@@ -3,7 +3,6 @@ description:
globs:
alwaysApply: true
---

Always use sentence case for headings and titles.
These are proper nouns: Runpod, Pods, Serverless, Hub, Instant Clusters, Secure Cloud, Community Cloud, Tetra.
These are generic terms: endpoint, worker, cluster, template, handler, fine-tune, network volume.
@@ -13,7 +12,7 @@ When using bullet points, end each line with a period.

When creating a tutorial, always include these sections:

- What you'll learn
- What you'll learn (followed by a bulleted list of topics the tutorial covers)
- Requirements (rather than "prerequisites")

And number steps like this:
3 changes: 2 additions & 1 deletion docs.json
@@ -49,7 +49,8 @@
"serverless/workers/handler-functions",
"serverless/workers/concurrent-handler",
"serverless/workers/deploy",
"serverless/workers/github-integration"
"serverless/workers/github-integration",
"serverless/workers/aws-ecr-integration"
]
},
{
4 changes: 4 additions & 0 deletions references/faq.mdx
@@ -107,6 +107,10 @@ Data privacy is important to us at Runpod. Our Terms of Service prohibit hosts f

You can run any Docker container available on any publicly reachable container registry. If you are not well versed in containers, we recommend sticking with the default run templates like our Runpod PyTorch template. However, if you know what you are doing, you can do a lot more!

### Can I use AWS ECR with Runpod?

Yes, Runpod supports pulling container images from AWS Elastic Container Registry (ECR). However, ECR credentials expire every 12 hours, so you'll need to set up automated credential refresh using AWS Lambda. See our [AWS ECR integration guide](/serverless/workers/aws-ecr-integration) for detailed setup instructions.

### Can I run my own Docker daemon on Runpod?

You can't currently spin up your own instance of Docker, as we run Docker for you! Unfortunately, this means that you cannot currently build Docker containers on Runpod or use things like Docker Compose. Many use cases can be solved by creating a custom template with the Docker image that you want to run.
281 changes: 281 additions & 0 deletions serverless/workers/aws-ecr-integration.mdx
@@ -0,0 +1,281 @@
---
title: "Deploy from AWS ECR"
sidebarTitle: "Deploy from AWS ECR"
description: "Learn how to deploy Serverless workers from Amazon Elastic Container Registry (ECR) with automated credential management."
---

This guide shows how to deploy Serverless workers from Amazon Elastic Container Registry (ECR). ECR integration requires more setup than other container registries like Docker Hub or GitHub Container Registry, because ECR credentials expire every 12 hours by default. This means you'll need to set up automated credential refreshing so that your Serverless endpoints can keep pulling from your private ECR repositories.

<Warning>
Because of the additional setup required, we recommend deploying workers with [Docker Hub](/serverless/workers/deploy) or [GitHub Container Registry](/serverless/workers/github-integration) if ECR-specific features aren't required.
</Warning>

## Requirements

Before integrating ECR with Runpod, make sure that you have:

- An AWS account with ECR access.
- The AWS CLI configured on your local machine with appropriate permissions.
- A Runpod account with [API access](/get-started/api-keys).
- A local folder containing all the necessary components to build a worker image: a Dockerfile, handler function, and requirements.txt file.

<Tip>
If you want to test this workflow but haven't yet created the necessary files, you can download this [basic worker template](https://github.com/runpod-workers/worker-basic).
</Tip>

## Step 1: Create an ECR repository and push your container image

First, create an ECR repository and push your container image.

```bash
# Create ECR repository
aws ecr create-repository --repository-name my-serverless-worker --region us-east-1

# Get login token and authenticate Docker
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com

# Build and tag your image; run these commands in the same directory as your Dockerfile
docker build --platform linux/amd64 -t my-serverless-worker .
docker tag my-serverless-worker:latest AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/my-serverless-worker:latest

# Push to ECR
docker push AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/my-serverless-worker:latest
```

Replace `AWS_ACCOUNT_ID` with your actual AWS account ID number and adjust the region as needed.
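
The image reference used in the `docker tag` and `docker push` commands above follows a fixed pattern. As a minimal sketch, the hypothetical helper below (`ecr_image_uri` is an illustrative name, not part of any AWS or Runpod SDK) assembles it from its parts, which can help avoid typos when you script these steps:

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build the full ECR image reference used by `docker tag` and `docker push`."""
    # AWS account IDs are 12-digit numbers; catch placeholders such as "AWS_ACCOUNT_ID".
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Example values only; substitute your own account ID, region, and repository.
print(ecr_image_uri("123456789012", "us-east-1", "my-serverless-worker"))
```

Passing an explicit `tag` instead of relying on the `latest` default keeps deployments reproducible.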

## Step 2: Create initial container registry credentials

Generate ECR credentials and add them to Runpod:

```bash
# Get ECR authorization token
aws ecr get-authorization-token --region us-east-1 --query 'authorizationData[0].authorizationToken' --output text | base64 -d
```

This returns credentials in the format `AWS:password`. Use these to create container registry authentication in Runpod:

<Tabs>
<Tab title="REST API">
```bash
curl -X POST "https://rest.runpod.io/v1/containerregistryauth" \
  -H "Authorization: Bearer YOUR_RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ECR Credentials",
    "username": "AWS",
    "password": "YOUR_ECR_TOKEN_PASSWORD"
  }'
```
</Tab>

<Tab title="Python">
```python
import requests

# Create the container registry credentials in Runpod
response = requests.post(
    "https://rest.runpod.io/v1/containerregistryauth",
    headers={
        "Authorization": "Bearer YOUR_RUNPOD_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "name": "ECR Credentials",
        "username": "AWS",
        "password": "YOUR_ECR_TOKEN_PASSWORD"
    },
    timeout=30
)
response.raise_for_status()  # Stop here if the request was rejected

registry_auth_id = response.json()["id"]
print(f"Registry auth ID: {registry_auth_id}")
```
</Tab>
</Tabs>

Save the returned `id` value. You'll need it for the automation script and the endpoint configuration.
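
If you would rather not pipe the token through `base64 -d` by hand, the decoding from the start of this step can be sketched in Python. This is an illustration only: `split_ecr_token` is a hypothetical helper name, and the sample token is fabricated rather than a real credential.

```python
import base64


def split_ecr_token(authorization_token: str) -> tuple[str, str]:
    """Decode a base64 ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # Split on the first colon only, in case the password itself contains colons.
    username, password = decoded.split(":", 1)
    return username, password


# Fabricated token for illustration; a real one comes from
# `aws ecr get-authorization-token`.
sample_token = base64.b64encode(b"AWS:example-password").decode("ascii")
print(split_ecr_token(sample_token))  # ('AWS', 'example-password')
```

The second element of the tuple is the value to use in place of `YOUR_ECR_TOKEN_PASSWORD`.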

## Step 3: Set up automated credential refresh

Create an AWS Lambda function to automatically refresh ECR credentials in Runpod:

### Lambda function code

```python
import base64
import json
import os
import urllib.request

import boto3


def lambda_handler(event, context):
    # Initialize AWS clients
    ecr_client = boto3.client('ecr')

    # Runpod configuration
    runpod_api_key = os.environ['RUNPOD_API_KEY']
    registry_auth_id = os.environ['REGISTRY_AUTH_ID']

    try:
        # Get new ECR authorization token
        response = ecr_client.get_authorization_token()
        auth_data = response['authorizationData'][0]

        # Decode the authorization token (format: "AWS:<password>")
        token = base64.b64decode(auth_data['authorizationToken']).decode('utf-8')
        username, password = token.split(':', 1)

        # Update Runpod container registry credentials. This uses urllib from the
        # standard library because Lambda's Python runtimes don't bundle `requests`.
        request = urllib.request.Request(
            f"https://rest.runpod.io/v1/containerregistryauth/{registry_auth_id}",
            data=json.dumps({"username": username, "password": password}).encode('utf-8'),
            headers={
                "Authorization": f"Bearer {runpod_api_key}",
                "Content-Type": "application/json"
            },
            method="PATCH"
        )
        # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
        # which the except block below reports.
        with urllib.request.urlopen(request, timeout=10) as update_response:
            status = update_response.status
            body = update_response.read().decode('utf-8')

        if status == 200:
            print("Successfully updated Runpod ECR credentials")
            return {
                'statusCode': 200,
                'body': json.dumps('ECR credentials updated successfully')
            }
        else:
            print(f"Failed to update credentials: {body}")
            return {
                'statusCode': 500,
                'body': json.dumps('Failed to update credentials')
            }

    except Exception as e:
        print(f"Error: {str(e)}")
        return {
            'statusCode': 500,
            'body': json.dumps(f'Error: {str(e)}')
        }
```
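
Each entry in `authorizationData` also carries an `expiresAt` timestamp alongside the token. As a rough sketch, the helpers below check whether a freshly issued token will outlive the gap until the next scheduled run. The function names are hypothetical and the timestamps are made-up example values; in the Lambda function you could pass `auth_data['expiresAt']` instead.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def remaining_lifetime(expires_at: datetime, now: Optional[datetime] = None) -> timedelta:
    """Return how long the token stays valid, measured from `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return expires_at - now


def survives_until_next_run(expires_at: datetime, refresh_interval: timedelta,
                            now: Optional[datetime] = None) -> bool:
    """True if the token is still valid when the next scheduled refresh fires."""
    return remaining_lifetime(expires_at, now) > refresh_interval


# Example: a token issued at midnight UTC with the default 12-hour lifetime,
# refreshed on a 6-hour schedule.
issued = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=12)
print(survives_until_next_run(expires, timedelta(hours=6), now=issued))  # True
```

Logging the remaining lifetime on each run makes it easier to spot a schedule that is too close to the expiry window.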

### Lambda configuration

1. Create a new Lambda function in the AWS Console.
2. Set the runtime to Python 3.9 or later.
3. Add the following environment variables:
   - `RUNPOD_API_KEY`: Your Runpod API key.
   - `REGISTRY_AUTH_ID`: The container registry auth ID from Step 2.
4. Attach an IAM role with the `AmazonEC2ContainerRegistryReadOnly` policy.
5. Set up an EventBridge (CloudWatch Events) rule to trigger the function every 6 hours:

```json
{
  "Rules": [
    {
      "Name": "ECRCredentialRefresh",
      "ScheduleExpression": "rate(6 hours)",
      "State": "ENABLED",
      "Targets": [
        {
          "Id": "1",
          "Arn": "arn:aws:lambda:us-east-1:AWS_ACCOUNT_ID:function:refresh-ecr-credentials"
        }
      ]
    }
  ]
}
```
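
If you prefer to create the schedule from a script rather than the console, the rule described above could be set up along the following lines. This is a sketch under assumptions, not the guide's official method: `schedule_refresh` is a hypothetical helper, and it takes the boto3 `events` and `lambda` clients as arguments so the logic can be read (and tested) without AWS access.

```python
def schedule_refresh(events_client, lambda_client, function_arn,
                     rule_name="ECRCredentialRefresh", schedule="rate(6 hours)"):
    """Create the schedule rule, allow it to invoke the function, and attach the target."""
    # Create (or update) the scheduled rule and keep its ARN.
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )["RuleArn"]

    # EventBridge needs explicit permission to invoke the Lambda function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Point the rule at the Lambda function.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn


# Usage with real clients (requires boto3 and configured AWS credentials):
#   import boto3
#   schedule_refresh(
#       boto3.client("events"),
#       boto3.client("lambda"),
#       "arn:aws:lambda:us-east-1:AWS_ACCOUNT_ID:function:refresh-ecr-credentials",
#   )
```

Running the helper a second time with the same rule name would likely fail at `add_permission`, because the statement ID already exists, so treat it as a one-time setup step.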

## Step 4: Deploy Serverless endpoint with ECR image

Once credential automation is set up, you can create your Serverless endpoint using the Runpod console or the REST API.

<Tabs>
<Tab title="Runpod Console">

Follow these steps to create your Serverless endpoint in the Runpod console:

1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless).
2. Click **New Endpoint**.
3. Select **Docker Image** as your source.
4. Enter your ECR image URL: `123456789012.dkr.ecr.us-east-1.amazonaws.com/my-serverless-worker:latest`.
5. In the **Advanced** section, select your ECR credentials from the **Container Registry Auth** dropdown.
6. Configure other endpoint settings as needed.
7. Click **Create Endpoint**.
</Tab>

<Tab title="REST API">

Use the following cURL command to create your Serverless endpoint with the REST API:

```bash
curl -X POST "https://rest.runpod.io/v1/endpoints" \
  -H "Authorization: Bearer YOUR_RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ECR Serverless Worker",
    "template": {
      "imageName": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-serverless-worker:latest",
      "containerRegistryAuthId": "YOUR_REGISTRY_AUTH_ID"
    },
    "workersMax": 3,
    "workersMin": 0,
    "gpuIds": "AMPERE_16"
  }'
```
</Tab>
</Tabs>

## Step 5: Monitor credential refresh

Check your Lambda function logs in CloudWatch to ensure credentials are being refreshed successfully:

```bash
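# Show the last hour of logs. The date syntax below is GNU (Linux);
# on macOS/BSD, use $(date -v-1H +%s)000 instead.
aws logs filter-log-events \
  --log-group-name /aws/lambda/refresh-ecr-credentials \
  --start-time $(date -d '1 hour ago' +%s)000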
```

## Common issues and solutions

**Image pull failures**: If your endpoint fails to pull the ECR image:

1. Verify the ECR image URL is correct.
2. Check that the Lambda function is running successfully.
3. Ensure the container registry auth ID is properly configured.
4. Confirm your ECR repository permissions allow the necessary access.

**Credential expiration**: If you see authentication errors:

1. Manually trigger the Lambda function to refresh credentials immediately.
2. Check that the EventBridge (CloudWatch Events) rule is properly configured.
3. Verify the Lambda function has the correct IAM permissions.

**Performance considerations**: ECR image pulls may be slower than Docker Hub:

1. Consider using smaller base images to reduce pull times.
2. Enable FlashBoot in your endpoint configuration for faster cold starts.
3. Use network volumes to cache frequently accessed data.

## Best practices

* **Monitor Lambda execution**: Set up CloudWatch alarms for Lambda function failures.
* **Use specific image tags**: Avoid using `:latest` tags in production deployments.
* **Implement retry logic**: Add error handling and retry mechanisms to your Lambda function.
* **Regional considerations**: Deploy Lambda functions in the same region as your ECR repositories.
* **Security**: Use least-privilege IAM policies and rotate Runpod API keys regularly.
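
The "Implement retry logic" practice above could look something like the following. This is a minimal sketch, assuming exponential backoff is acceptable for your use case; `with_retries` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:
            # Re-raise the last failure so the caller (and CloudWatch) sees it.
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({error}); retrying in {delay:.0f}s")
            sleep(delay)


# Usage inside the Lambda function (ecr_client is the boto3 ECR client created there):
#   response = with_retries(lambda: ecr_client.get_authorization_token())
```

Lambda's default timeout is 3 seconds, so if you add delays like these, raise the function timeout to cover the worst-case total wait.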

## Alternative approaches

If automated credential refresh adds complexity to your workflow, consider these alternatives:

* **Docker Hub**: Simpler authentication with longer-lived credentials.
* **GitHub Container Registry**: Integrated with GitHub workflows and repositories.
* **Public ECR**: Use Amazon's public ECR for open-source projects without authentication.

## Next steps

Now that you've integrated ECR with Runpod, you can:

* [Learn how to configure your endpoint.](/serverless/endpoints/endpoint-configurations)
* [Set up monitoring and logging for your endpoints.](/serverless/endpoints/job-states)
1 change: 1 addition & 0 deletions serverless/workers/deploy.mdx
@@ -161,3 +161,4 @@ After successfully deploying your worker, you can:
* [Create more advanced handler functions](/serverless/workers/handler-functions)
* [Optimize your endpoint configurations](/serverless/endpoints/endpoint-configurations)
* [Learn how to deploy workers directly from GitHub](/serverless/workers/github-integration)
* [Deploy from private AWS ECR repositories](/serverless/workers/aws-ecr-integration)
1 change: 1 addition & 0 deletions serverless/workers/overview.mdx
@@ -56,5 +56,6 @@ You can view the state of your workers using the **Workers** tab of a Serverless
* [Build your first worker.](/serverless/workers/custom-worker)
* [Create a custom handler function.](/serverless/workers/handler-functions)
* [Learn how to deploy workers from Docker Hub.](/serverless/workers/deploy)
* [Deploy from private AWS ECR repositories.](/serverless/workers/aws-ecr-integration)
* [Deploy large language models using vLLM.](/serverless/vllm/overview)
* [Configure your endpoints for optimal performance.](/serverless/endpoints/endpoint-configurations)