Worker Service: SQS - Not being configured with FIFO & Dead Letter #5944

Open
adrianbrowning opened this issue Sep 24, 2024 · 7 comments

@adrianbrowning
adrianbrowning commented Sep 24, 2024

Issue:

After specifying in my manifest that I would like the default queue to

  • be FIFO,
  • have a delay of 1s, and
  • have a dead-letter queue after 5 failed attempts,

I run `copilot init` and `copilot deploy`, then view my SQS queue in the UI. It is shown as a Standard queue, and no dead-letter options have been set.

I am using these as references:

Am I doing something wrong or is this a bug?


Sample Repo: https://github.com/adrianbrowning/sqs-fargate

My manifest.yml:

name: queue-consumer
type: Worker Service

# Configuration for your containers and service.
image:
  # Docker build arguments.
  build: Dockerfile
  platform: linux/arm64

build:
  platform: linux/arm64

cpu: 256 
memory: 512 
count:        
  range: 0-3
  queue_delay:
    acceptable_latency: 5m
    msg_processing_time: 5m
    cooldown:
      in: 60s
      out: 30s
exec: true     # Enable running commands in your container.

# New field that allows you to subscribe to events from other services in your application.
subscribe:
  queue:
    delay: 1s
    dead_letter:
      tries: 5
    fifo: true # Configure the default SQS queue to be FIFO.

cfn.patches.yml:

- op: replace
  path: /Resources/AutoScalingPolicyEventsQueue/Properties/TargetTrackingScalingPolicyConfiguration/TargetValue
  value: 0.5
@KollaAdithya
Contributor

Hello @adrianbrowning

Can you run `copilot svc package` and verify whether the FIFO queue appears in the CloudFormation template?

Also, can you check that the manifest has no extra spaces and is formatted correctly according to the Copilot documentation?
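
As a rough guide to what to look for: if the `subscribe.queue` settings were picked up, I would expect the `EventsQueue` resource in the packaged template to carry FIFO, delay, and redrive properties along the lines of the sketch below. The property names are standard `AWS::SQS::Queue` properties, but the exact shape Copilot renders and the `DeadLetterQueue` logical ID are assumptions.

```yaml
# Sketch only: what a FIFO queue with a delay and a DLQ could look like
# in the packaged template. Not actual Copilot output.
EventsQueue:
  Type: AWS::SQS::Queue
  Properties:
    KmsMasterKeyId: !Ref EventsKMSKey
    FifoQueue: true          # from subscribe.queue.fifo
    DelaySeconds: 1          # from subscribe.queue.delay
    RedrivePolicy:           # from subscribe.queue.dead_letter.tries
      deadLetterTargetArn: !GetAtt DeadLetterQueue.Arn  # assumed logical ID
      maxReceiveCount: 5
```

If the `EventsQueue` resource only has `KmsMasterKeyId`, the queue settings from the manifest did not reach the template.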

@adrianbrowning
Author

Hi @KollaAdithya ,

`copilot svc package` gives:
# Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
AWSTemplateFormatVersion: 2010-09-09
Description: CloudFormation template that represents a worker service on Amazon ECS.
Metadata:
  Version: v1.34.0
  Manifest: "# The manifest for the \"worker-test-worker\" service.\n# Read the full specification for the \"Worker Service\" type at:\n# https://aws.github.io/copilot-cli/docs/manifest/worker-service/\n\n# Your service name will be used in naming your resources like log groups, ECS services, etc.\nname: worker-test-worker\ntype: Worker Service\n# Configuration for your containers and service.\nimage:\n  # Docker build arguments.\n  build: Dockerfile\ncpu: 256 # Number of CPU units for the task.\nmemory: 512 # Amount of memory in MiB used by the task.\nplatform: linux/x86_64 # See https://aws.github.io/copilot-cli/docs/manifest/worker-service/#platform\ncount: 1 # Number of tasks that should be running in your service.\nexec: true # Enable running commands in your container.\n\n# storage:\n# readonly_fs: true       # Limit to read-only access to mounted root filesystems.\n\n# You can register to topics from other services.\n# The events can be received from an SQS queue via the env var $COPILOT_QUEUE_URI.\n# subscribe:\n#   topics: \n#     - name: topic-from-another-service\n#       service: another-service\n\n# Optional fields for more advanced use-cases.\n#\n#variables:                    # Pass environment variables as key value pairs.\n#  LOG_LEVEL: info\n\n#secrets:                      # Pass secrets from AWS Systems Manager (SSM) Parameter Store.\n#  GITHUB_TOKEN: GITHUB_TOKEN  # The key is the name of the environment variable, the value is the name of the SSM parameter.\n\n# You can override any of the values defined above by environment.\n#environments:\n#  test:\n#    count: 2               # Number of tasks to run for the \"test\" environment.\n#    deployment:            # The deployment strategy for the \"test\" environment.\n#       rolling: 'recreate' # Stops existing tasks before new ones are started for faster deployments.\n"
Parameters:
  AppName:
    Type: String
  EnvName:
    Type: String
  WorkloadName:
    Type: String
  ContainerImage:
    Type: String
  TaskCPU:
    Type: String
  TaskMemory:
    Type: String
  TaskCount:
    Type: Number
  AddonsTemplateURL:
    Description: 'URL of the addons nested stack template within the S3 bucket.'
    Type: String
    Default: ""
  EnvFileARN:
    Description: 'URL of the environment file.'
    Type: String
    Default: ""
  ArtifactKeyARN:
    Type: String
    Description: 'KMS Key used for encrypting artifacts'
  LogRetention:
    Type: Number
    Default: 30
Conditions:
  IsGovCloud: !Equals [!Ref "AWS::Partition", "aws-us-gov"]
  HasAddons: !Not [!Equals [!Ref AddonsTemplateURL, ""]]
  HasEnvFile: !Not [!Equals [!Ref EnvFileARN, ""]]
Resources:
  LogGroup:
    Metadata:
      'aws:copilot:description': 'A CloudWatch log group to hold your service logs'
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Join ['', [/copilot/, !Ref AppName, '-', !Ref EnvName, '-', !Ref WorkloadName]]
      RetentionInDays: !Ref LogRetention
  TaskDefinition:
    Metadata:
      'aws:copilot:description': 'An ECS task definition to group your containers and run them on ECS'
    Type: AWS::ECS::TaskDefinition
    DependsOn: LogGroup
    Properties:
      Family: !Join ['', [!Ref AppName, '-', !Ref EnvName, '-', !Ref WorkloadName]]
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      Cpu: !Ref TaskCPU
      Memory: !Ref TaskMemory
      ExecutionRoleArn: !GetAtt ExecutionRole.Arn
      TaskRoleArn: !GetAtt TaskRole.Arn
      ContainerDefinitions:
        - Name: !Ref WorkloadName
          Image: !Ref ContainerImage
          Environment:
            - Name: COPILOT_APPLICATION_NAME
              Value: !Sub '${AppName}'
            - Name: COPILOT_SERVICE_DISCOVERY_ENDPOINT
              Value: worker-test-env.worker-test-app.local
            - Name: COPILOT_ENVIRONMENT_NAME
              Value: !Sub '${EnvName}'
            - Name: COPILOT_SERVICE_NAME
              Value: !Sub '${WorkloadName}'
            - Name: COPILOT_QUEUE_URI
              Value: !Ref EventsQueue
          EnvironmentFiles:
            - !If
              - HasEnvFile
              - Type: s3
                Value: !Ref EnvFileARN
              - !Ref AWS::NoValue
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref LogGroup
              awslogs-stream-prefix: copilot
  ExecutionRole:
    Metadata:
      'aws:copilot:description': 'An IAM Role for the Fargate agent to make AWS API calls on your behalf'
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyName: !Join ['', [!Ref AppName, '-', !Ref EnvName, '-', !Ref WorkloadName, SecretsPolicy]]
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: 'Allow'
                Action:
                  - 'ssm:GetParameters'
                Resource:
                  - !Sub 'arn:${AWS::Partition}:ssm:${AWS::Region}:${AWS::AccountId}:parameter/*'
                Condition:
                  StringEquals:
                    'ssm:ResourceTag/copilot-application': !Sub '${AppName}'
                    'ssm:ResourceTag/copilot-environment': !Sub '${EnvName}'
              - Effect: 'Allow'
                Action:
                  - 'secretsmanager:GetSecretValue'
                Resource:
                  - !Sub 'arn:${AWS::Partition}:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:*'
                Condition:
                  StringEquals:
                    'secretsmanager:ResourceTag/copilot-application': !Sub '${AppName}'
                    'secretsmanager:ResourceTag/copilot-environment': !Sub '${EnvName}'
              - Effect: 'Allow'
                Action:
                  - 'kms:Decrypt'
                Resource:
                  - !Ref ArtifactKeyARN
              - Sid: DecryptTaggedKMSKey
                Effect: 'Allow'
                Action:
                  - 'kms:Decrypt'
                Resource:
                  - !Sub 'arn:${AWS::Partition}:kms:${AWS::Region}:${AWS::AccountId}:key/*'
                Condition:
                  StringEquals:
                    'aws:ResourceTag/copilot-application': !Sub '${AppName}'
                    'aws:ResourceTag/copilot-environment': !Sub '${EnvName}'
        - !If
          # Optional IAM permission required by ECS task def env file
          # https://docs.aws.amazon.com/AmazonECS/latest/developerguide/taskdef-envfiles.html#taskdef-envfiles-iam
          # Example EnvFileARN: arn:aws:s3:::stackset-demo-infrastruc-pipelinebuiltartifactbuc-11dj7ctf52wyf/manual/1638391936/env
          - HasEnvFile
          - PolicyName: !Join ['', [!Ref AppName, '-', !Ref EnvName, '-', !Ref WorkloadName, GetEnvFilePolicy]]
            PolicyDocument:
              Version: '2012-10-17'
              Statement:
                - Effect: 'Allow'
                  Action:
                    - 's3:GetObject'
                  Resource:
                    - !Ref EnvFileARN
                - Effect: 'Allow'
                  Action:
                    - 's3:GetBucketLocation'
                  Resource:
                    - !Join
                      - ''
                      - - 'arn:'
                        - !Ref AWS::Partition
                        - ':s3:::'
                        - !Select [0, !Split ['/', !Select [5, !Split [':', !Ref EnvFileARN]]]]
          - !Ref AWS::NoValue
      ManagedPolicyArns:
        - !Sub 'arn:${AWS::Partition}:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy'
  TaskRole:
    Metadata:
      'aws:copilot:description': 'An IAM role to control permissions for the containers in your tasks'
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: 'sts:AssumeRole'
      Policies:
        - PolicyName: 'DenyIAM'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: 'Deny'
                Action: 'iam:*'
                Resource: '*'
        - PolicyName: 'ExecuteCommand'
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: 'Allow'
                Action: ["ssmmessages:CreateControlChannel", "ssmmessages:OpenControlChannel", "ssmmessages:CreateDataChannel", "ssmmessages:OpenDataChannel"]
                Resource: "*"
              - Effect: 'Allow'
                Action: ["logs:CreateLogStream", "logs:DescribeLogGroups", "logs:DescribeLogStreams", "logs:PutLogEvents"]
                Resource: "*"
  Service:
    DependsOn:
      - EnvControllerAction
    Metadata:
      'aws:copilot:description': 'An ECS service to run and maintain your tasks in the environment cluster'
    Type: AWS::ECS::Service
    Properties:
      PlatformVersion: LATEST
      Cluster:
        Fn::ImportValue: !Sub '${AppName}-${EnvName}-ClusterId'
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: !Ref TaskCount
      DeploymentConfiguration:
        DeploymentCircuitBreaker:
          Enable: true
          Rollback: true
        MinimumHealthyPercent: 100
        MaximumPercent: 200
        Alarms: !If
          - IsGovCloud
          - !Ref AWS::NoValue
          - Enable: false
            AlarmNames: []
            Rollback: true
      PropagateTags: SERVICE
      EnableExecuteCommand: true
      LaunchType: FARGATE
      ServiceConnectConfiguration: !If
        - IsGovCloud
        - !Ref AWS::NoValue
        - Enabled: False
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          Subnets:
            Fn::Split:
              - ','
              - Fn::ImportValue: !Sub '${AppName}-${EnvName}-PublicSubnets'
          SecurityGroups:
            - Fn::ImportValue: !Sub '${AppName}-${EnvName}-EnvironmentSecurityGroup'
      ServiceRegistries: !Ref 'AWS::NoValue'
  EventsKMSKey:
    Metadata:
      'aws:copilot:description': 'A KMS key to encrypt messages in your queues'
    Type: AWS::KMS::Key
    Properties:
      KeyPolicy:
        Version: '2012-10-17'
        Statement:
          - Sid: "Allow key use"
            Effect: Allow
            Principal:
              AWS: !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:root'
            Action:
              - "kms:Create*"
              - "kms:Describe*"
              - "kms:Enable*"
              - "kms:List*"
              - "kms:Put*"
              - "kms:Update*"
              - "kms:Revoke*"
              - "kms:Disable*"
              - "kms:Get*"
              - "kms:Delete*"
              - "kms:ScheduleKeyDeletion"
              - "kms:CancelKeyDeletion"
              - "kms:Tag*"
              - "kms:UntagResource"
              - "kms:Encrypt"
              - "kms:Decrypt"
              - "kms:ReEncrypt*"
              - "kms:GenerateDataKey*"
            Resource: '*'
          - Sid: "Allow SNS encryption"
            Effect: "Allow"
            Principal:
              Service: sns.amazonaws.com
            Action:
              - "kms:Decrypt"
              - "kms:GenerateDataKey*"
            Resource: '*'
          - Sid: "Allow SQS encryption"
            Effect: "Allow"
            Principal:
              Service: sqs.amazonaws.com
            Action:
              - "kms:Encrypt"
              - "kms:Decrypt"
              - "kms:ReEncrypt*"
              - "kms:GenerateDataKey*"
            Resource: '*'
          - Sid: "Allow task role encrypt/decrypt"
            Effect: "Allow"
            Principal:
              AWS:
                - !GetAtt TaskRole.Arn
            Action:
              - "kms:Encrypt"
              - "kms:Decrypt"
            Resource: '*'
  EventsQueue:
    Metadata:
      'aws:copilot:description': 'An events SQS queue to buffer messages'
    Type: AWS::SQS::Queue
    Properties:
      KmsMasterKeyId: !Ref EventsKMSKey
  QueuePolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      Queues: [!Ref 'EventsQueue']
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS:
                - !GetAtt TaskRole.Arn
            Action:
              - sqs:ReceiveMessage
              - sqs:DeleteMessage
            Resource: !GetAtt EventsQueue.Arn
  AddonsStack:
    Metadata:
      'aws:copilot:description': 'An Addons CloudFormation Stack for your additional AWS resources'
    Type: AWS::CloudFormation::Stack
    Condition: HasAddons
    Properties:
      Parameters:
        App: !Ref AppName
        Env: !Ref EnvName
        Name: !Ref WorkloadName
      TemplateURL: !Ref AddonsTemplateURL
  EnvControllerAction:
    Metadata:
      'aws:copilot:description': "Update your environment's shared resources"
    Type: Custom::EnvControllerFunction
    Properties:
      ServiceToken: !GetAtt EnvControllerFunction.Arn
      Workload: !Ref WorkloadName
      EnvStack: !Sub '${AppName}-${EnvName}'
      Parameters: []
      EnvVersion: v1.34.0
  EnvControllerFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: stackset-worker-test-app--pipelinebuiltartifactbuc-qvxjwl2ojrx5
        S3Key: manual/scripts/custom-resources/envcontrollerfunction/5cdb2f63626cf4ce22c15032e1f5842c8a16567b342c66ff137188f19c2cebb7.zip
      Handler: "index.handler"
      Timeout: 900
      MemorySize: 512
      Role: !GetAtt 'EnvControllerRole.Arn'
      Runtime: nodejs20.x
  EnvControllerRole:
    Metadata:
      'aws:copilot:description': "An IAM role to update your environment stack"
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: "EnvControllerStackUpdate"
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - cloudformation:DescribeStacks
                  - cloudformation:UpdateStack
                Resource: !Sub 'arn:${AWS::Partition}:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${AppName}-${EnvName}/*'
                Condition:
                  StringEquals:
                    'cloudformation:ResourceTag/copilot-application': !Sub '${AppName}'
                    'cloudformation:ResourceTag/copilot-environment': !Sub '${EnvName}'
        - PolicyName: "EnvControllerRolePass"
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - iam:PassRole
                Resource: !Sub 'arn:${AWS::Partition}:iam::${AWS::AccountId}:role/${AppName}-${EnvName}-CFNExecutionRole'
                Condition:
                  StringEquals:
                    'iam:ResourceTag/copilot-application': !Sub '${AppName}'
                    'iam:ResourceTag/copilot-environment': !Sub '${EnvName}'
      ManagedPolicyArns:
        - !Sub arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

I can also confirm that the formatting is correct.

Thanks for your help!

@adrianbrowning
Author

@KollaAdithya do you have any ideas? Or even an example repo that does work?

Many thanks.

@KollaAdithya
Contributor

@adrianbrowning which version of Copilot are you using?

@adrianbrowning
Author

@KollaAdithya `copilot --version` -> `copilot version: v1.34.0`

@KollaAdithya
Contributor

KollaAdithya commented Oct 16, 2024

Hello @adrianbrowning !

To set up a FIFO queue, you first need to set up a service with a FIFO SNS topic and publish to it.

Here is a blog post on using a pub/sub architecture with Copilot: https://aws.amazon.com/blogs/containers/implementing-a-pub-sub-architecture-with-aws-copilot/

You can follow this blog post: first set `fifo: true` on the topic and deploy a Load Balanced Web Service.

Then set `fifo: true` on the queue and deploy the Worker Service.
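
The two steps could look roughly like the manifest fragments below. This is a sketch rather than a verified configuration: the publisher name `api` and the topic name `orders` are hypothetical, other required manifest fields are omitted, and the `publish.topics[].fifo` and `subscribe.queue.fifo` fields are used as I understand them from the Copilot manifest documentation.

```yaml
# copilot/api/manifest.yml (hypothetical publisher; deploy this first)
name: api
type: Load Balanced Web Service

publish:
  topics:
    - name: orders   # hypothetical topic name
      fifo: true     # create the SNS topic as a FIFO topic
---
# copilot/queue-consumer/manifest.yml (worker; deploy after the publisher)
name: queue-consumer
type: Worker Service

subscribe:
  topics:
    - name: orders   # the topic published by the service below
      service: api   # the service that publishes the topic
  queue:
    fifo: true       # make the default queue FIFO to match the topic
    dead_letter:
      tries: 5       # move a message to the DLQ after 5 failed receives
```

After deploying both, `copilot svc package` for the worker should show whether the queue is now rendered as FIFO.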

@adrianbrowning
Author

Hi @KollaAdithya

I ended up using `copilot svc override` with the CDK to do what I needed:

import * as cdk from 'aws-cdk-lib';
import * as path from 'node:path';
import {
    aws_sqs as sqs,
    aws_lambda as lambda,
    aws_events as events,
    aws_events_targets as targets,
    Duration
} from 'aws-cdk-lib';
import {NodejsFunction} from 'aws-cdk-lib/aws-lambda-nodejs';
import {PolicyStatement} from "aws-cdk-lib/aws-iam";

interface TransformedStackProps extends cdk.StackProps {
    readonly appName: string;
    readonly envName: string;
}

export class TransformedStack extends cdk.Stack {
    public readonly template: cdk.cloudformation_include.CfnInclude;
    public readonly appName: string;
    public readonly envName: string;

    constructor(scope: cdk.App, id: string, props: TransformedStackProps) {
        super(scope, id, props);
        this.template = new cdk.cloudformation_include.CfnInclude(this, 'Template', {
            templateFile: path.join('.build', 'in.yml'),
        });
        this.appName = props.appName;
        this.envName = props.envName;

        // Modify the EventsQueue
        const eventsQueue = this.transformEventsQueue();

        // Create the Lambda function and schedule it to run every hour
        this.createScheduledLambda(eventsQueue);
    }

    transformEventsQueue() {
        // Access the EventsQueue from the included template
        const eventsQueue = this.template.getResource("EventsQueue") as sqs.CfnQueue;

        // Create a Dead Letter Queue (DLQ)
        const deadLetterQueue = new sqs.Queue(this, 'EventsDLQ', {
            queueName: `${this.appName}-${this.envName}-EventsDLQ.fifo`,
            fifo: true,
            encryption: sqs.QueueEncryption.KMS_MANAGED, // Optional: KMS encryption for the DLQ
        });

        // Modify the existing EventsQueue to enable FIFO and associate the DLQ
        eventsQueue.fifoQueue = true; // Mark the queue as FIFO
        eventsQueue.contentBasedDeduplication = true; // Optional: Enable content-based deduplication

        // Add the Dead Letter Queue (DLQ) configuration
        eventsQueue.redrivePolicy = {
            maxReceiveCount: 5, // Messages are sent to DLQ after 5 failed receive attempts
            deadLetterTargetArn: deadLetterQueue.queueArn,
        };

        // Optional: Output the ARN of the queues
        new cdk.CfnOutput(this, 'EventsQueueArn', {
            value: eventsQueue.attrArn,
        });

        new cdk.CfnOutput(this, 'EventsDLQArn', {
            value: deadLetterQueue.queueArn,
        });

        return eventsQueue;
    }

    createScheduledLambda(eventsQueue: sqs.CfnQueue) {
        // Define the Lambda function, which CDK will compile from TypeScript to JavaScript
        const scheduledLambda = new NodejsFunction(this, 'PingUrlLambda', {
            entry: path.join(__dirname, "..", "..", "..", 'lambda', 'ping-url.ts'), // Path to your Lambda function code
            handler: 'handler', 
            runtime: lambda.Runtime.NODEJS_20_X,
            environment: {
                SQS_QUEUE_URL: eventsQueue.attrQueueUrl, // Pass the SQS queue URL to the Lambda
                REMOTE_URL: 'some URL to fill in later', // Replace with the URL to ping
                BEARER_TOKEN: "some string got from ENV",
            },
            timeout: Duration.seconds(30),
        });

        scheduledLambda.addToRolePolicy(new PolicyStatement({
            actions: ["sqs:SendMessage"],
            resources: [eventsQueue.attrArn],
        }));

        // Set up a CloudWatch Events rule to trigger the Lambda every hour
        const hourlyRule = new events.Rule(this, 'HourlyRule', {
            schedule: events.Schedule.rate(Duration.hours(1)),
        });
        hourlyRule.addTarget(new targets.LambdaFunction(scheduledLambda));
    }
}

Modifying the events queue and adding a dead-letter queue works perfectly.
However, adding the Lambda always causes the deploy to fail with this cryptic response:

The following resource(s) failed to create: [PingUrlLambda664D764A].
The following resource(s) failed to update: [Service].

I've stripped the stack down to just the Lambda and it still fails, so I'm not sure why, and I can't find any logs that indicate what is happening.
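
For the queue changes alone, a YAML patch override might be a lighter alternative to the CDK stack above. The following is an untested sketch for `cfn.patches.yml`: `EventsDLQ` is a hypothetical logical ID, the paths assume the `EventsQueue` and `EventsKMSKey` resources shown in the packaged template, and as far as I understand it would replace the CDK override rather than sit alongside it.

```yaml
# Sketch only: add a FIFO dead-letter queue (hypothetical logical ID EventsDLQ)
- op: add
  path: /Resources/EventsDLQ
  value:
    Type: AWS::SQS::Queue
    Properties:
      FifoQueue: true
      KmsMasterKeyId: !Ref EventsKMSKey

# Turn the default events queue into a FIFO queue
- op: add
  path: /Resources/EventsQueue/Properties/FifoQueue
  value: true
- op: add
  path: /Resources/EventsQueue/Properties/ContentBasedDeduplication
  value: true

# Send messages to the DLQ after 5 failed receive attempts
- op: add
  path: /Resources/EventsQueue/Properties/RedrivePolicy
  value:
    deadLetterTargetArn: !GetAtt EventsDLQ.Arn
    maxReceiveCount: 5
```

This covers only the queue. The scheduled Lambda would still need its own solution.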
