-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add no_proxy
support
#63314
Add no_proxy
support
#63314
Conversation
src/Common/HTTPConnectionPool.cpp
Outdated
{ | ||
ret.append("|"); | ||
} | ||
ret.append(host); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing escaping.
This feature looks too ad-hoc. Missing motivation about use cases. Why environment variables? |
Consider the scenario where a ClickHouse instance uses two different S3 installations: AWS and a private one. For the AWS one, the user wants all requests to go through their proxy, while for the private one, no. With the current implementation, there is no way to achieve this.
ClickHouse currently supports proxy configuration in two different places: env variables and configuration files. This PR will add the functionality to both, but I haven't had the time to implement for configuration files yet. Plus, it's pretty standard to support |
Not invited by us, just most tools in linux support that. https://stackoverflow.com/questions/62632642/how-to-use-no-proxy-on-linux-machines-wildcards-leading-dots
Curl: https://curl.se/libcurl/c/CURLOPT_NOPROXY.html Discussion about standardizing that. https://about.gitlab.com/blog/2021/01/27/we-need-to-talk-no-proxy/ |
…regex with match anything wildcard for poco
@alexey-milovidov can you enable CI? |
We should merge #63427 first |
This is an automated comment for commit f026cc4 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
@evillique can you also enable ci on #63427? That pr needs to be merged before this one |
src/Common/ProxyConfiguration.h
Outdated
@@ -49,6 +49,7 @@ struct ProxyConfiguration | |||
uint16_t port = 0; | |||
bool tunneling = false; | |||
Protocol original_request_protocol = Protocol::HTTP; | |||
std::string no_proxy_hosts = ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This list is constructed from env. It is immutable for the rest of the program live.
Why we copy it to each connection?
Why do we leave it in raw format and parse each time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm... if I had to argue, I would say simplicity and consistency with other types of resolvers that do not rely on environment variables. They always read stuff on the fly and generate a new configuration.
It is not a strong argument, tho. I could introduce some sort of static
variable in EnvironmentProxyConfigurationResolver
so that it reads only once from the environment and always returns that.
As for the native format, mostly because DB::ProxyConfiguration
is 3rd party agnostic. Meaning it has no dependencies on Poco or AWS SDK. In any case, I am fine with converting it only once and storing the poco format in DB::ProxyConfiguration
. If the need to abstract that ever appears, we can do it. For now, we can make this change.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last commit is a possible implementation of the above idea, please take a look: 3ae8176.
@CheSema what dou you think of the current approach? env variables are queried only once. Poco regexp is build only once as well |
I believe the stress test failure is already being tracked -- #65043 Failing everywhere - test_checking_s3_blobs_paranoid |
e8c0caa
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Allow proxy to be bypassed for hosts specified in
no_proxy
env variable and ClickHouse proxy configuration.Documentation entry for user-facing changes
Modify your CI run
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
Include tests (required builds will be added automatically):
Exclude tests:
Extra options:
Only specified batches in multi-batch jobs: