[🐛 BUG]: Region is missing from SQS endpoint URL when RR is running inside EC2 #1833

Closed
1 task done
matteokov opened this issue Jan 2, 2024 · 9 comments · Fixed by roadrunner-server/sqs#376
Labels
B-bug Bug: bug, exception
@matteokov

No duplicates 🥲.

  • I have searched for a similar issue in our bug tracker and didn't find any solutions.

What happened?

After moving my application to EC2, I started getting an error when GetQueueUrl is called: sqs..amazonaws.com: no such host.

It looks like the region part is missing from the endpoint. If I try to override the endpoint with the global sqs config key, the override is ignored and the same error occurs.

sqs:
  region: eu-central-1
  endpoint: https://sqs.eu-central-1.amazonaws.com
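The doubled dot in the failing host suggests that the region slot of the endpoint was filled with an empty string. A minimal sketch of that effect follows, assuming the endpoint is built from a template of the form `https://sqs.{region}.amazonaws.com`. The function name is hypothetical and this is not the AWS SDK's actual resolver.

```python
def sqs_endpoint(region: str) -> str:
    """Build a regional SQS endpoint URL from a region name.

    Hypothetical helper: it only mirrors the URL shape seen in the
    issue, not the real endpoint resolution logic of the AWS SDK.
    """
    return f"https://sqs.{region}.amazonaws.com"


# With a region, the URL matches the endpoint from the config above.
print(sqs_endpoint("eu-central-1"))

# With an empty region, the host collapses to the one from the error.
print(sqs_endpoint(""))
```

An empty region produces `https://sqs..amazonaws.com`, which is the host the DNS lookup fails on.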

If I understood correctly, inside the EC2 instance RoadRunner should detect its identity by sending a request to http://169.254.169.254/latest/dynamic/instance-identity/. The AWS Go SDK should then fetch credentials from EC2 and use them to perform requests to SQS.

To debug, I sent a request to http://169.254.169.254/latest/dynamic/instance-identity/document from inside the instance and I can see that my region is eu-central-1.
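The same check can be scripted. The sketch below reads the `region` field from the instance identity document. It assumes the instance metadata service is reachable and uses the IMDSv2 token flow. The sample JSON is made up for illustration, and the function names are hypothetical.

```python
import json
import urllib.request

IDENTITY_URL = "http://169.254.169.254/latest/dynamic/instance-identity/document"
TOKEN_URL = "http://169.254.169.254/latest/api/token"


def region_from_identity_document(raw: str) -> str:
    """Return the "region" field of an identity document, or "" if absent."""
    return json.loads(raw).get("region", "")


def fetch_identity_document(timeout: float = 2.0) -> str:
    """Fetch the identity document (only works from inside an EC2 instance)."""
    # IMDSv2: request a short-lived session token first.
    token_req = urllib.request.Request(
        TOKEN_URL,
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    # Then pass the token along with the actual metadata request.
    doc_req = urllib.request.Request(
        IDENTITY_URL, headers={"X-aws-ec2-metadata-token": token}
    )
    with urllib.request.urlopen(doc_req, timeout=timeout) as resp:
        return resp.read().decode()


# Made-up sample document, so the parsing step can be tried off-instance.
sample = '{"region": "eu-central-1", "instanceId": "i-0123456789abcdef0"}'
print(region_from_identity_document(sample))
```

On an instance, `region_from_identity_document(fetch_identity_document())` should return the instance's region, assuming the metadata service is enabled.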

I'm using Symfony and baldinof/roadrunner-bundle, but this shouldn't be relevant for reproducing the issue, as it fails before even reaching PHP/Symfony.

Version (rr --version)

Tested on both 2023.3.6 and 2023.3.8

How to reproduce the issue?

RoadRunner config:

server:
  command: "php public/index.php"
  env:
    - APP_RUNTIME: Baldinof\RoadRunnerBundle\Runtime\Runtime

http:
  address: 0.0.0.0:8080
  middleware: [ "static", "gzip" ]
  pool:
    debug: true
  uploads:
    forbid: [ ".php", ".exe", ".bat" ]
  static:
    dir: "public"
    forbid: [ ".php", ".htaccess" ]

logs:
  mode: development
  channels:
    http:
      level: debug
    server:
      level: info
      mode: raw
    metrics:
      level: debug

jobs:
  num_pollers: 2
  timeout: 60
  pipeline_size: 100000

  pool:
    debug: true

  consume: [ "events-sqs-pipeline" ]

  pipelines:
    events-sqs-pipeline:
      driver: sqs
      config:
        skip_queue_declaration: true
        prefetch: 10
        consume_all: true
        visibility_timeout: 30
        wait_time_seconds: 20
        queue: test_queue

Run the application on an EC2 instance with PHP 8.2.

Relevant log output

{"level":"DEBUG","ts":"2024-01-02T21:06:37+0000","logger":"jobs        ","msg":"initializing driver","pipeline":"events-sqs-pipeline","driver":"sqs"}
{"level":"ERROR","ts":"2024-01-02T21:06:37+0000","logger":"jobs        ","msg":"failed to initialize driver","pipeline":"events-sqs-pipeline","driver":"sqs","error":"new_sqs_consumer: operation error SQS: GetQueueUrl, https response error StatusCode: 0, RequestID: , request send failed, Post \"https://sqs..amazonaws.com/\": dial tcp: lookup sqs..amazonaws.com: no such host"}
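Log lines like the one above can be scanned automatically for this failure mode. The following is a rough sketch, assuming each line is a JSON object with an `error` field as in the output shown. The helper name and the regular expression are illustrative only.

```python
import json
import re
from typing import List
from urllib.parse import urlsplit

# Matches http(s) URLs up to the next quote, backslash, or whitespace.
URL_RE = re.compile(r'https?://[^\s"\\]+')


def malformed_hosts(log_line: str) -> List[str]:
    """Return hosts from the "error" field that contain an empty DNS label."""
    entry = json.loads(log_line)
    hosts = []
    for url in URL_RE.findall(entry.get("error", "")):
        host = urlsplit(url).hostname or ""
        # "sqs..amazonaws.com" splits into ["sqs", "", "amazonaws", "com"].
        if host and "" in host.split("."):
            hosts.append(host)
    return hosts


# The ERROR line from the log output above (raw string keeps the \" escapes).
line = r'{"level":"ERROR","ts":"2024-01-02T21:06:37+0000","logger":"jobs        ","msg":"failed to initialize driver","pipeline":"events-sqs-pipeline","driver":"sqs","error":"new_sqs_consumer: operation error SQS: GetQueueUrl, https response error StatusCode: 0, RequestID: , request send failed, Post \"https://sqs..amazonaws.com/\": dial tcp: lookup sqs..amazonaws.com: no such host"}'
print(malformed_hosts(line))
```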
@rustatian
Member

Hey @matteokov 👋
Yeah, RR tries to perform a call to the well-known local URL to check if it is running inside an AWS environment.
The global sqs key is ignored in that case (which might be a wrong decision btw).
I'll double-check that behavior because configuration from inside AWS is managed by the AWS Golang package...

@matteokov
Author

@rustatian thanks for a fast response :)

Regarding the global sqs key, I think a better flow would be to always respect the explicitly provided configuration.

So priority would be:

  1. If there is an sqs key and credentials/config values are provided, use them
  2. Use credentials from the EC2 instance (current flow)

Let me know if you need any additional info regarding the endpoint region issue.
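The proposed priority can be sketched as a field-by-field merge in which explicitly configured values win over values derived from the EC2 instance. This is only an illustration of the idea, not RoadRunner's implementation. The function name and the dictionary shapes are assumptions.

```python
from typing import Dict, Mapping


def effective_sqs_config(
    global_cfg: Mapping[str, str], instance_cfg: Mapping[str, str]
) -> Dict[str, str]:
    """Merge two config sources; non-empty explicit values take priority.

    global_cfg:   values from the global `sqs` key (hypothetical shape)
    instance_cfg: values discovered from the EC2 instance (hypothetical shape)
    """
    merged = dict(instance_cfg)
    for key, value in global_cfg.items():
        if value:  # empty or missing explicit values fall back to the instance
            merged[key] = value
    return merged


explicit = {
    "region": "eu-central-1",
    "endpoint": "https://sqs.eu-central-1.amazonaws.com",
}
discovered = {"region": "", "endpoint": ""}  # simulates the failed region lookup
print(effective_sqs_config(explicit, discovered))
```

With this ordering, a broken or empty value from the instance never masks a value the user set on purpose.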

@rustatian
Member

Yeah, I'm currently verifying your info, but in any case I'll update this behavior in 2024.1 (since this is a BC break).

@rustatian
Member

@matteokov Could you please try setting the AWS_REGION env variable and restarting RR?

@matteokov
Author

@rustatian This fixed the issue and RoadRunner started normally.
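For reference, the workaround amounts to making sure AWS_REGION is present in the environment of the RR process. A small sketch of a launcher helper follows. The helper name is hypothetical, and the commented `rr serve` invocation assumes the binary is on PATH.

```python
import os
from typing import Dict, Mapping


def env_with_region(base_env: Mapping[str, str], region: str) -> Dict[str, str]:
    """Return a copy of base_env with AWS_REGION set if it is missing or empty."""
    env = dict(base_env)
    if not env.get("AWS_REGION"):
        env["AWS_REGION"] = region
    return env


# An already-set AWS_REGION is left untouched; a missing one gets the fallback.
print(env_with_region({"AWS_REGION": "us-east-1"}, "eu-central-1")["AWS_REGION"])
print(env_with_region({}, "eu-central-1")["AWS_REGION"])

# Usage (not executed here): start RoadRunner with the patched environment.
# import subprocess
# subprocess.run(["rr", "serve"], env=env_with_region(os.environ, "eu-central-1"), check=True)
```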

@rustatian
Member

It looks like a bug in the AWS Go SDK; I found an issue describing the same problem.

@rustatian rustatian added this to the v2023.3.9 milestone Jan 3, 2024
@rustatian rustatian moved this to 🏗 In progress in Jira 😄 Jan 3, 2024
@rustatian
Member

But I found a way to support both the global sqs config and the data provided by IAM.

@rustatian
Member

Ok, since I can't control the changed behavior in the AWS SDK, the global configuration will be able to override the existing IAM values starting with the next bugfix version (2023.3.9).

@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Jira 😄 Jan 3, 2024
@rustatian
Member

The fix will be released next Thursday.
