Refactoring of azure obj store #3957 #3970
Need to check why the test fails on the cortex dep. :D
We use this part: https://github.com/cortexproject/cortex/blob/master/pkg/storage/bucket/azure/bucket_client.go#L10 and Cortex uses the Thanos part. @pracucci could you weigh in here on whether my statement is even correct? Thank you!
Cyclic dependency problem, so:
Some suggestions otherwise look good!
Is there anything I can do to help move this along? :) Or is this dependent on changes that have to be made in Cortex?
It's currently not blocked, it just needs some fixes in packages, which eventually also requires some changes on the Cortex side. I did however not foresee the complexity at first :p But it's good to know that people need this, so I can make it a higher prio!
Glad to see this moving along. My compact pod is dying multiple times a day at this point; hopefully this will resolve the issue for me and others.
@airkewld & @phillebaba see #3957 - I've built an image. But feel free to build it yourself. Wouldn't mind more testers on this feature :)
I can probably test this as I am already running Thanos in AKS. Hopefully this will work with aad-pod-identity.
Don't use it on production yet tho.. :D Also can we move potential discussions to the issue? Makes it a bit easier. Thanks so much btw :)
I changed the configuration in a way that it's not breaking anymore AND it has no dependency issue anymore. We now use the So this resolved two things:
Hi dudes!
pkg/objstore/azure/azure.go (outdated)
}

// If we don't have config specific retry values but we do have the generic MaxRetries.
// This is part backwards compatibility but also ease of configuration
unless I have misunderstood, do you mean
- // This is part backwards compatibility but also ease of configuration
+ // This part backwards compatibility but also ease of configuration
😅
Yes, thanks, I will make this proper English :D
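The fallback described in the diff comment above (use the generic MaxRetries when no specific retry values are configured) could look roughly like the sketch below. The struct and field names (`ReaderConfig.MaxRetryRequests`, `PipelineConfig.MaxTries`, `applyRetryFallback`) are assumptions made for illustration and are not taken from the PR's code.

```go
package main

import "fmt"

// ReaderConfig and PipelineConfig are illustrative stand-ins for the
// specific retry settings; their field names are assumptions.
type ReaderConfig struct {
	MaxRetryRequests int
}

type PipelineConfig struct {
	MaxTries int32
}

// Config keeps the generic MaxRetries next to the specific settings.
type Config struct {
	MaxRetries     int
	ReaderConfig   ReaderConfig
	PipelineConfig PipelineConfig
}

// applyRetryFallback copies the generic MaxRetries into every specific
// setting that was left at its zero value. Explicitly set values win.
func applyRetryFallback(c Config) Config {
	if c.MaxRetries <= 0 {
		return c
	}
	if c.ReaderConfig.MaxRetryRequests == 0 {
		c.ReaderConfig.MaxRetryRequests = c.MaxRetries
	}
	if c.PipelineConfig.MaxTries == 0 {
		c.PipelineConfig.MaxTries = int32(c.MaxRetries)
	}
	return c
}

func main() {
	// An old-style config that only sets the generic value.
	legacy := applyRetryFallback(Config{MaxRetries: 5})
	fmt.Println(legacy.ReaderConfig.MaxRetryRequests, legacy.PipelineConfig.MaxTries)

	// A config that sets one specific value; only the other one falls back.
	mixed := applyRetryFallback(Config{
		MaxRetries:   5,
		ReaderConfig: ReaderConfig{MaxRetryRequests: 2},
	})
	fmt.Println(mixed.ReaderConfig.MaxRetryRequests, mixed.PipelineConfig.MaxTries)
}
```

With this shape, existing configurations that only set the generic value keep working, while newer ones can tune each setting separately.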
Small question about the comment 😛
LGTM, 💪🏽
I will update this PR as soon as this is merged: #4104 (comment)
@wiardvanrij had some time now to try this PR. The container is created without an issue but I am seeing authorization issues when uploading blocks from the receiver. I am using aad-pod-identity which adds additional complexities, but unless someone else has verified that this feature works I think you should hold off on merging.
Thanks! Could you perhaps provide more information in #3957 ?
Just small suggestions to fix the msi resource value.
LGTM! There are minor style nits, otherwise I would be happy to merge this.
Great work and sorry for major lag on review!
http_config:
  idle_conn_timeout: 0s
  response_header_timeout: 0s
  insecure_skip_verify: false
Could tls_config be added instead of just insecure_skip_verify?
I know blob storage itself does not support it, but we'd like to use a proxy (by changing the endpoint in the configuration) with mTLS for blob storage, to avoid having to provide the storage account key to every cluster we monitor.
I also think it would be better to make the config as similar to Prometheus as possible, for consistency and code reuse (if possible).
yes and no. This is in line with all other storage options in Thanos: https://thanos.io/tip/thanos/storage.md/
There is some work being done to make these configs more logical, but for now I try to keep it aligned with how everything in Thanos works (:
edit:
We must also not confuse which connection the TLS is for. I.e. this is for the connection towards whatever object store you have, not for implementing TLS on the component itself. Just for clarification.
thanks for the answer, I guess the solution to this problem is to implement a custom proxy and run it as a sidecar, using https with a dummy certificate and insecure_skip_verify
> thanks for the answer, I guess the solution to this problem is to implement a custom proxy and run it as a sidecar, using https with a dummy certificate and insecure_skip_verify

Yea I understand the case, it's just that I don't feel like making one specific change here. However, it would make sense to implement a generic tls_config, not for component-related configs but for 'object storage connections', as this should be implemented not only for Azure Blob but also for all the other options Thanos supports.
Perhaps you could create a new issue for this?
Signed-off-by: Wiard van Rij <wiard@outlook.com>
Co-authored-by: Philip Laine <philip.laine@gmail.com> Signed-off-by: Wiard van Rij <wiard@outlook.com>
Amazing! Let's get this in and get some nice, sweet feedback.
Nice work @wiardvanrij , Mr Golang dev! (:
@wiardvanrij Thanks for adding these changes!
I think I got my questions answered. I have a follow-up PR to support user-assigned managed identity: #4636
Wow thanks! Really good and thanks for helping out!
* Introduce `http` config settings in Azure storage

  Cortex v1.11.0 included thanos-io/thanos#3970, which added configuration options to Azure's http client and transport, replacing usage of `http.DefaultClient`. Unfortunately since Cortex was not setting this config, Cortex implicitly switched from `http.DefaultClient` to all empty values (e.g. `MaxIdleConns: 0` rather than 100).

  Introduce `http` config settings to Azure storage. This motivated moving `s3.HTTPConfig` into a new `pkg/storage/bucket/config` package, to allow `azure` and `s3` to share it.

  Also update the instructions for running the website to include installing `embedmd`.

  Signed-off-by: Andrew Seigner <andrew@sig.gy>

* feedback: `config.HTTP` -> `http.Config`

  also back out changelog cleanup

  Signed-off-by: Andrew Seigner <andrew@sig.gy>

* Back out accidental changelog addition

  Signed-off-by: Andrew Seigner <andrew@sig.gy>
Signed-off-by: Wiard van Rij wiard@outlook.com
Changes
Breaks configuration by moving max_retries to its own reader_config
Verification
How you tested it?
No, other than the tests included
Especially the MSI authentication part is untested.
How do you know it works?
This is tested via other users on #3957 :
@wiardvanrij ran your image all night locally with no failures.
Originally posted by @airkewld in #3957 (comment)
I hope this PR can get tested by users of Azure to confirm that the changes work and whether they have a positive impact on, for example, #3952. Relates to #3957.