New serverless pattern - API Gateway - Lambda - Redshift Data API #2487

Open — wants to merge 9 commits into base: main
10 changes: 10 additions & 0 deletions apigw-lambda-redshiftdataapi/.gitignore
@@ -0,0 +1,10 @@
*.swp
package-lock.json
__pycache__
.pytest_cache
.venv
*.egg-info

# CDK asset staging directory
.cdk.staging
cdk.out
67 changes: 67 additions & 0 deletions apigw-lambda-redshiftdataapi/README.md
@@ -0,0 +1,67 @@
# Amazon API Gateway to AWS Lambda to Amazon Redshift Data API

This pattern demonstrates how to expose data from Amazon Redshift through a REST API using Amazon API Gateway, AWS Lambda, and the Redshift Data API.

Learn more about this pattern at Serverless Land Patterns: [https://serverlessland.com/patterns/apigw-lambda-redshiftdataapi](https://serverlessland.com/patterns/apigw-lambda-redshiftdataapi)

Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Requirements

- [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
- [Git Installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
- [AWS CDK installed](https://docs.aws.amazon.com/cdk/latest/guide/cli.html)
- [Python 3 installed](https://www.python.org/downloads/)
- [Create Redshift Serverless](https://docs.aws.amazon.com/redshift/latest/gsg/new-user-serverless.html#serverless-console-resource-creation)

## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository:
```
git clone https://github.com/aws-samples/serverless-patterns
```
1. Change directory to the pattern directory:
```
cd serverless-patterns/apigw-lambda-redshiftdataapi
```
1. Create a Python virtual environment and install the dependencies:
```
python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt
```
1. Create a `.env` file in the pattern directory with the Redshift Serverless environment variables:
```
REDSHIFT_CLUSTER_ARN=arn:aws:redshift-serverless:<region>:<accountid>:workgroup/<workgroupid>
REDSHIFT_WORKGROUP=<workgroup-name>
REDSHIFT_DATABASE=<database-name>
```
1. From the command line, use AWS CDK to deploy the AWS resources for the pattern as defined in `apigw_lambda_redshiftdataapi_stack.py`:

```
cdk deploy --app "python3 apigw_lambda_redshiftdataapi_stack.py"
```

1. Note the outputs from the CDK deployment process. These contain the resource names and/or ARNs which are used for testing.

## How it works

API Gateway exposes a POST endpoint protected by an Amazon Cognito authorizer using the OAuth 2.0 client-credentials flow. Authorized requests invoke a Lambda function, which submits a SQL query to Redshift Serverless through the Redshift Data API, polls for completion, and returns the result set as JSON.
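The Lambda function's interaction with the Data API — submit, poll, fetch — can be sketched as follows. This is a minimal sketch mirroring the inline handler in the stack; `run_redshift_query` is a hypothetical helper, and `client` would be `boto3.client("redshift-data")`:

```python
import time


def run_redshift_query(client, workgroup, database, sql, poll_interval=0.5):
    """Submit SQL via the Redshift Data API and poll until it completes."""
    query_id = client.execute_statement(
        WorkgroupName=workgroup, Database=database, Sql=sql
    )["Id"]
    while True:
        desc = client.describe_statement(Id=query_id)
        status = desc["Status"]
        if status == "FINISHED":
            result = client.get_statement_result(Id=query_id)
            columns = [col["name"] for col in result["ColumnMetadata"]]
            # Each record is a list of typed value dicts, e.g. {"longValue": 1}.
            return [
                dict(zip(columns, [list(v.values())[0] for v in row]))
                for row in result["Records"]
            ]
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(desc.get("Error", f"Query {status.lower()}"))
        time.sleep(poll_interval)
```

The Data API is asynchronous by design, so the polling loop is unavoidable; in the deployed Lambda it is bounded by the function's 29-second timeout, which matches the API Gateway integration limit.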

## Testing

1. Use an API client with OAuth capability, such as Postman, to test the API. The API URL and token endpoint are in the CDK deployment output. Sign in to the AWS Console, navigate to the Cognito app client, and retrieve the client ID and secret; then request an access token using the client-credentials grant and pass it in the `Authorization` header of the API call.
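The token request can also be scripted. The sketch below builds the Basic auth header Cognito's token endpoint expects and fetches an access token for the `redshiftapi/redshift-api` scope defined in the stack; the endpoint URLs, client ID, and secret are placeholders — substitute the `TokenEndpoint` and `ApiUrl` values from the CDK output and the credentials from the Cognito console:

```python
import base64
import json
import urllib.parse
import urllib.request


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the HTTP Basic auth header for Cognito's token endpoint."""
    raw = f"{client_id}:{client_secret}".encode()
    return "Basic " + base64.b64encode(raw).decode()


def fetch_access_token(token_endpoint: str, client_id: str, client_secret: str) -> str:
    """Request an access token via the OAuth 2.0 client-credentials grant."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": "redshiftapi/redshift-api",
    }).encode()
    req = urllib.request.Request(
        token_endpoint,
        data=body,
        headers={
            "Authorization": basic_auth_header(client_id, client_secret),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]


# Example usage (placeholders -- use the TokenEndpoint/ApiUrl CDK outputs):
# token = fetch_access_token(
#     "https://apigwredshiftapi-<accountid>.auth.<region>.amazoncognito.com/oauth2/token",
#     "<app-client-id>", "<app-client-secret>")
# req = urllib.request.Request(
#     "https://<api-id>.execute-api.<region>.amazonaws.com/prod/api",
#     data=b"{}", headers={"Authorization": token}, method="POST")
# print(urllib.request.urlopen(req).read().decode())
```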

## Cleanup

```
cdk destroy --app "python3 apigw_lambda_redshiftdataapi_stack.py"
```

---

Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
255 changes: 255 additions & 0 deletions apigw-lambda-redshiftdataapi/apigw_lambda_redshiftdataapi_stack.py
@@ -0,0 +1,255 @@
import aws_cdk as cdk
from aws_cdk import (
Stack,
aws_lambda as _lambda,
aws_apigateway as apigw,
aws_iam as iam,
aws_cognito as cognito,
aws_logs as logs,
Duration,
CfnOutput,
)
from constructs import Construct
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Access the environment variables and add error handling
redshift_cluster_arn = os.getenv("REDSHIFT_CLUSTER_ARN")
redshift_workgroup = os.getenv("REDSHIFT_WORKGROUP")
redshift_database = os.getenv("REDSHIFT_DATABASE")

if not redshift_cluster_arn:
raise ValueError("REDSHIFT_CLUSTER_ARN environment variable is not set")
if not redshift_workgroup:
raise ValueError("REDSHIFT_WORKGROUP environment variable is not set")
if not redshift_database:
raise ValueError("REDSHIFT_DATABASE environment variable is not set")


class ApigwRedshiftDataApi(Stack):

def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
super().__init__(scope, construct_id, **kwargs)

# Create Cognito User Pool
user_pool = cognito.UserPool(
self,
"RedshiftApiUserPool",
sign_in_aliases=cognito.SignInAliases(username=True, email=True),
auto_verify=cognito.AutoVerifiedAttrs(email=True),
standard_attributes=cognito.StandardAttributes(
email=cognito.StandardAttribute(required=True, mutable=True)
),
)

# Add a domain to the User Pool
domain = user_pool.add_domain(
"CognitoDomain",
cognito_domain=cognito.CognitoDomainOptions(
domain_prefix="apigwredshiftapi-" + self.account
),
)

# Define a custom scope
redshift_api_scope = cognito.ResourceServerScope(
scope_name="redshift-api",
scope_description="Access to Redshift API",
)

# Create a resource server for custom scopes
resource_server = user_pool.add_resource_server(
"RedshiftApiResourceServer",
identifier="redshiftapi",
scopes=[redshift_api_scope],
)

# Create Cognito App Client
app_client = user_pool.add_client(
"RedshiftApiAppClient",
generate_secret=True,
o_auth=cognito.OAuthSettings(
flows=cognito.OAuthFlows(
client_credentials=True,
),
scopes=[
cognito.OAuthScope.resource_server(
resource_server, redshift_api_scope
)
],
),
access_token_validity=Duration.minutes(60),
prevent_user_existence_errors=True,
)

# Create Lambda function
lambda_function = _lambda.Function(
self,
"RedshiftApiLambda",
runtime=_lambda.Runtime.PYTHON_3_9,
handler="index.lambda_handler",
code=_lambda.InlineCode(
"""
import json
import boto3
import time
import os
from botocore.exceptions import ClientError

def lambda_handler(event, context):
redshift_workgroup = os.environ['REDSHIFT_WORKGROUP']
redshift_database = os.environ['REDSHIFT_DATABASE']
sql_query = "SELECT * FROM tickit.users LIMIT 10;"

client = boto3.client("redshift-data")

try:
# Execute the query
response = client.execute_statement(
WorkgroupName=redshift_workgroup,
Database=redshift_database,
Sql=sql_query
)
query_id = response["Id"]

# Wait for the query to complete
while True:
status_response = client.describe_statement(Id=query_id)
status = status_response['Status']

if status == 'FINISHED':
# Query completed successfully
result = client.get_statement_result(Id=query_id)
columns = [col["name"] for col in result["ColumnMetadata"]]
rows = result["Records"]
results = [
dict(zip(columns, [list(value.values())[0] for value in row]))
for row in rows
]
return {"statusCode": 200, "body": json.dumps(results)}
elif status == 'FAILED':
# Query failed
error = status_response.get('Error', 'Unknown error')
return {"statusCode": 500, "body": json.dumps({"error": error})}
elif status == 'ABORTED':
return {"statusCode": 500, "body": json.dumps({"error": "Query was aborted"})}

# Add a small delay before checking again
time.sleep(0.5)

except ClientError as e:
error_message = e.response['Error']['Message']
return {"statusCode": 500, "body": json.dumps({"error": error_message})}
except Exception as e:
return {"statusCode": 500, "body": json.dumps({"error": str(e)})}
"""
),
timeout=Duration.seconds(29),
environment={
"REDSHIFT_WORKGROUP": redshift_workgroup,
"REDSHIFT_DATABASE": redshift_database,
},
)

# Grant Redshift Serverless and Redshift Data API access to Lambda
redshift_serverless_arn = redshift_cluster_arn

lambda_function.add_to_role_policy(
iam.PolicyStatement(
actions=[
"redshift-serverless:GetWorkgroup",
"redshift-serverless:ListWorkgroups",
"redshift-data:ExecuteStatement",
"redshift-data:DescribeStatement",
"redshift-data:GetStatementResult",
"redshift-data:ListStatements",
"redshift-data:CancelStatement",
"redshift-serverless:GetCredentials",
],
resources=[redshift_serverless_arn, "*"],
)
)

# Create API Gateway
api = apigw.RestApi(
self,
"RedshiftApi",
rest_api_name="Redshift API Service",
endpoint_types=[apigw.EndpointType.REGIONAL], # Specify REGIONAL endpoint
deploy_options=apigw.StageOptions(
stage_name="prod",
logging_level=apigw.MethodLoggingLevel.INFO,
data_trace_enabled=True,
metrics_enabled=True,
),
)

# Create Cognito Authorizer specifically for the created app client
auth = apigw.CognitoUserPoolsAuthorizer(
self,
"RedshiftApiAuthorizer",
cognito_user_pools=[user_pool],
authorizer_name="RedshiftApiCognitoAuthorizer",
identity_source="method.request.header.Authorization",
)

# Create API Gateway integration with Lambda
integration = apigw.LambdaIntegration(lambda_function)

# Add a resource and method to the API Gateway with Cognito Authorizer and API Key
api_resource = api.root.add_resource("api")
api_resource.add_method(
"POST",
integration,
authorizer=auth,
authorization_type=apigw.AuthorizationType.COGNITO,
authorization_scopes=[
f"{resource_server.user_pool_resource_server_id}/redshift-api"
],
)

# Create IAM role for API Gateway to push logs to CloudWatch
api_gateway_logging_role = iam.Role(
self,
"ApiGatewayLoggingRole",
assumed_by=iam.ServicePrincipal("apigateway.amazonaws.com"),
managed_policies=[
iam.ManagedPolicy.from_aws_managed_policy_name(
"service-role/AmazonAPIGatewayPushToCloudWatchLogs"
)
],
)

# Grant API Gateway permission to assume the logging role
api_gateway_logging_role.grant_assume_role(
iam.ServicePrincipal("apigateway.amazonaws.com")
)

# Create a log group for API Gateway
api_log_group = logs.LogGroup(
self, "ApiGatewayLogGroup", retention=logs.RetentionDays.ONE_WEEK
)

# Update the API's account settings to enable CloudWatch logging
api_gateway_account = apigw.CfnAccount(
self,
"ApiGatewayAccount",
cloud_watch_role_arn=api_gateway_logging_role.role_arn,
)

# Ensure the API Gateway account settings are updated before the API is created
api.node.add_dependency(api_gateway_account)

# Output the Cognito User Pool ID, App Client ID, Domain URL, API URL, and API Key
CfnOutput(self, "UserPoolId", value=user_pool.user_pool_id)
CfnOutput(self, "AppClientId", value=app_client.user_pool_client_id)
CfnOutput(self, "TokenEndpoint", value=f"{domain.base_url()}/oauth2/token")
CfnOutput(self, "ApiUrl", value=api.url)


if __name__ == "__main__":
app = cdk.App()
ApigwRedshiftDataApi(app, "ApigwRedshiftDataApi")
app.synth()
55 changes: 55 additions & 0 deletions apigw-lambda-redshiftdataapi/example-pattern.json
@@ -0,0 +1,55 @@
{
  "title": "Redshift Data API with API Gateway and Lambda",
  "description": "Expose data from Redshift through an API using API Gateway, Lambda and the Redshift Data API",
  "language": "Python",
"level": "200",
"framework": "CDK",
"introBox": {
"headline": "How it works",
"text": [
      "This sample project demonstrates how to expose data from Redshift through an API using API Gateway, Lambda and the Redshift Data API.",
"Implemented in CDK."
]
},
"gitHub": {
"template": {
"repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/apigw-lambda-redshiftdataapi",
"templateURL": "serverless-patterns/apigw-lambda-redshiftdataapi",
"projectFolder": "apigw-lambda-redshiftdataapi",
      "templateFile": "apigw-lambda-redshiftdataapi/apigw_lambda_redshiftdataapi_stack.py"
}
},
"resources": {
"bullets": [
{
"text": "Redshift Data API",
"link": "https://docs.aws.amazon.com/redshift/latest/mgmt/data-api.html"
},
{
"text": "API Gateway",
"link": "https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html"
}
]
},
"deploy": {
"text": [
"cdk deploy --app \"python3 apigw_lambda_redshiftdataapi_stack.py\""
]
},
"testing": {
"text": ["See the GitHub repo for detailed testing instructions."]
},
"cleanup": {
"text": [
      "Delete the stack: <code>cdk destroy --app \"python3 apigw_lambda_redshiftdataapi_stack.py\"</code>."
]
},
"authors": [
{
"name": "Muthu Kumar",
"image": "https://avatars.githubusercontent.com/u/51725180",
"bio": "AWS Solutions Architect. Serverless advocate and enthusiast.",
"linkedin": "klmuthu"
}
]
}
3 changes: 3 additions & 0 deletions apigw-lambda-redshiftdataapi/requirements.txt
@@ -0,0 +1,3 @@
aws-cdk-lib==2.118.0
constructs>=10.0.0,<11.0.0
python-dotenv==0.19.1