Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receive: stop relying on grpc server config to set grpc client secure/skipVerify #7219

Conversation

guillaumelecerf
Copy link
Contributor

@guillaumelecerf guillaumelecerf commented Mar 20, 2024

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

I'm using a thanos receive HA cluster on ECS Fargate, with 3 endpoints in the hashring, with one dedicated listener + grpc target group per endpoint.

The TLS termination is done at the listener level (so out of thanos). Therefore the thanos receive containers are configured with no TLS.

When testing, I ended up with error reading server preface: http2: frame too large errors.

After debugging, I found that the receive grpc client is configured in secure mode only if the receive grpc server has TLS enabled.
This breaks the inter-endpoint communication between our hashring members.

To fix this problem, I introduce 2 new flags --remote-write.client-tls-secure and --remote-write.client-tls-skip-verify to mimic what is done in the grpc client configuration of thanos query..

Verification

Everything is now working on our cluster.

@guillaumelecerf guillaumelecerf force-pushed the bugfix/client-tls-external-termination branch 6 times, most recently from 96b6157 to 515cb94 Compare March 27, 2024 13:47
@guillaumelecerf guillaumelecerf force-pushed the bugfix/client-tls-external-termination branch 3 times, most recently from 12861c0 to 2c34e17 Compare April 8, 2024 12:43
@guillaumelecerf
Copy link
Contributor Author

Hi,
Can someone please have a look?

@guillaumelecerf guillaumelecerf force-pushed the bugfix/client-tls-external-termination branch from 2c34e17 to ccc77d8 Compare April 9, 2024 08:58
@fpetkovski
Copy link
Contributor

I did not fully understand the TLS setup, at which point in Thanos do you have TLS enabled?

@guillaumelecerf
Copy link
Contributor Author

guillaumelecerf commented Apr 9, 2024

I did not fully understand the TLS setup, at which point in Thanos do you have TLS enabled?

@fpetkovski : thanks for having a look.

In my setup, thanos is never configured with TLS, as the TLS termination is done externally at the listener level in AWS.

The problem here is that the code assumes that the gRPC client (used when talking with the other members of the hashring) is using TLS only if the server is configured with TLS.

For my use case, these 2 notions must be decoupled, as it is done in query.

@guillaumelecerf guillaumelecerf force-pushed the bugfix/client-tls-external-termination branch from ccc77d8 to c026df8 Compare April 9, 2024 12:51
fpetkovski
fpetkovski previously approved these changes Apr 9, 2024
Copy link
Contributor

@fpetkovski fpetkovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this makes sense, and is also consistent with the query configuration.

@guillaumelecerf guillaumelecerf force-pushed the bugfix/client-tls-external-termination branch from c026df8 to a347b49 Compare April 10, 2024 08:04
…/skipVerify

Signed-off-by: Guillaume Lecerf <guillaume.lecerf@iziwork.com>
@fpetkovski fpetkovski merged commit c3cd031 into thanos-io:main Apr 22, 2024
20 checks passed
@guillaumelecerf guillaumelecerf deleted the bugfix/client-tls-external-termination branch April 22, 2024 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants