Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Add support for running CHProxy behind another proxy #225

Merged

Conversation

Blokje5
Copy link
Collaborator

@Blokje5 Blokje5 commented Sep 11, 2022

Description

This PR adds a few new server settings: proxy.enable & proxy.header. proxy.enable adds enables the proxy middleware, which by default fetches the IP from well known proxy headers (X-Forwarded-For, X-Real-Ip, Forwarded). It will assume the first IP in the chain (if there are multiple) is the actual client IP. If proxy.header is set to a string, instead the proxy middleware will try to find a header that mathes that string (e.g. X-My-Proxy-Header).

Fixes #216

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

  • Linter passes correctly
  • Add tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

Does this introduce a breaking change?

  • Yes
  • No

Further comments

There are some points of note:

  • I have assumed everywhere that the first IP in the chain is the real client IP. For most use cases of CHProxy this will most likely be the right call. I guess documenting the behaviour is the way to go?
  • in the an.Contains method, we now ignore the fact that a RemoteAddr might not contain a port. We could try to fetch this information from other headers (e.g. X-Forwarded-Port) but we didn't seem to use the port anyway.
  • For custom headers I directly use the information stored in the header. This can go wrong if the custom header uses a custom scheme as well (e.g. IP string list, something similar to the forwarded header). But to support all edge cases, we would most likely need to support custom middleware.
  • Right now I ignore all other proxy related headers and assume that the only thing the proxy changes on the request is the IP. This is not true for all proxies, but will suffice for most.

The main takeaway is that we won't be able to solve all possible use cases, however this should provide decent "default" behaviour for running CHProxy behind another proxy or load balancer.

@@ -696,6 +697,29 @@ func TestServe(t *testing.T) {
},
startHTTP,
},
{
"http request with default proxy headers",
Copy link
Collaborator

@mga-chka mga-chka Sep 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might miss something but were do you check in this test that the remote addr is modified?

Also, you should add the test that when the proxy is disabled, nothing is changed

Copy link
Collaborator Author

@Blokje5 Blokje5 Sep 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test works because allowed_networks: ["10.0.0.0/24"] is set in the config. It would return a 403 if the remote addr wasn't modified.

The proxy disabled setting should be covered by the other tests (as the middleware is always used).

@Blokje5 Blokje5 force-pushed the fix/support-chproxy-behind-remote-proxy branch 2 times, most recently from 0a0d39e to bf0529b Compare September 11, 2022 16:34
@Blokje5
Copy link
Collaborator Author

Blokje5 commented Sep 12, 2022

I noticed an issue in the tests (I still need to update the PR). The test worked well in isolation but failed when I run all the tests:

The tests rely on creating a net.Listener to listenAndServe which wraps the listener in a server handling connections. To end the tests, the net.Listener is closed. However closing the net.Listener doesn't immediately close the server. If instead I make the http.Server available to the tests and call Shutdown the issue is removed. So it seems there are some race issues with multiple server instances running in the tests.

I need to find a clean way to handle this in the code (I used var globalServer *http.Server for easy debugging) and will let you know once I updated the tests.

@Blokje5 Blokje5 requested a review from mga-chka September 12, 2022 18:15
@Blokje5 Blokje5 force-pushed the fix/support-chproxy-behind-remote-proxy branch from 40133e0 to ada7596 Compare September 13, 2022 07:33
Copy link
Collaborator

@mga-chka mga-chka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work.

The logic looks good (minus the few edge cases I spotted regarding RFC 7239).

You should add a brief summary of the feature in chproxy/docs/content/en/index.md so that anyone reading chproxy.org can quickly see that chproxy can run behind another proxy.
You should also add more information (like the description of your PR) in chproxy/docs/content/en/configuration/server.md

Regarding the code organization, I have some concerns about creating a new package (called middleware, which is a bit vague) and unsing the chain of responsability/middleware pattern.
Out of context I like this pattern and it makes sense in a proxy. But, inside the chproxy codebase, it feels weird because it was not used before despite the fact that many features could have been implemented using this middleware pattern (like authentication, user limits ...). Moreover, all the small features are currently stored in the main package, so why adding a new one.
IMHO, this PR makes sense is if we want to refactor the current code and use a chaine of responsability pattern everywhere we can.

Since the code organization part is a bit opinionated, I'll let @sigua-cs & @Garnek20 give their point of views.

type Proxy struct {
// Enable enables parsing proxy headers. In proxy mode, CHProxy will try to
// parse the X-Forwarded-For, X-Real-IP or Forwarded header to extract the IP. If an other header is configured
// in the proxy settings, CHProxy will use that header instead.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, you should add the fact that the first ip the list will be chosen

return ""
}

return forSplits[0][4:]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will not work for ipv6 since cf the RFC, ipv6 are quoted strings;
Note that as ":" and "[]" are not valid characters in "token", IPv6 addresses are written as "quoted-string".


for _, split := range splits {
trimmed := strings.TrimSpace(split)
if strings.HasPrefix(trimmed, "for=") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should make the comparison case insensitive because in the RFC 7239 the examples shows that it can be for or For:
Forwarded: For="[2001:db8:cafe::17]:4711" Forwarded: for=192.0.2.43, for=198.51.100.17

for _, split := range splits {
trimmed := strings.TrimSpace(split)
if strings.HasPrefix(trimmed, "for=") {
forSplits := strings.Split(trimmed, ", ")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should also split by ";" because cf RFC it's a valid separator and since your return forSplits[0][4:], you'll return more than just the ip:
The header field value can be defined in ABNF syntax as:
Forwarded = 1#forwarded-element

forwarded-element = [ forwarded-pair ] *( ";" [ forwarded-pair ] )
Examples:

   Forwarded: for=192.0.2.60;proto=http;by=203.0.113.43`

t.remoteAddr = r.RemoteAddr
}

func TestProxyMiddleware(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice tests
it might be worth adding the edge cases I mentioned above

Copy link
Contributor

@gontarzpawel gontarzpawel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just IP validation for me is missing

@@ -105,7 +105,8 @@ func (n Networks) Contains(addr string) bool {

h, _, err := net.SplitHostPort(addr)
if err != nil {
panic(fmt.Sprintf("BUG: unexpected error while parsing RemoteAddr: %s", err))
// If we only have an IP address. This happens when the proxy middleware is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the err comes due to other reason ?

main.go Outdated
@@ -187,8 +191,8 @@ func newTLSConfig(cfg config.HTTPS) *tls.Config {
return &tlsCfg
}

func listenAndServe(ln net.Listener, h http.Handler, cfg config.TimeoutCfg) error {
s := &http.Server{
func createServer(ln net.Listener, h http.Handler, cfg config.TimeoutCfg) *http.Server {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're changing that part let's be golang agnostic and therefore (new given it returns a pointer)

Suggested change
func createServer(ln net.Listener, h http.Handler, cfg config.TimeoutCfg) *http.Server {
func newServer(ln net.Listener, h http.Handler, cfg config.TimeoutCfg) *http.Server {

m.next.ServeHTTP(w, r)
}

func (m *ProxyMiddleware) getIP(r *http.Request) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also validate input IP. Maybe with ip := net.ParseIP(h) ?

This PR adds a few new server settings: proxy.enable & proxy.header. proxy.enable adds enables the proxy middleware, which by default fetches the IP from well known proxy headers (X-Forwarded-For, X-Real-Ip, Forwarded). It will assume the first IP in the chain (if there are multiple) is the actual client IP. If proxy.header is set to a string, instead the proxy middleware will try to find a header that mathes that string (e.g. X-My-Proxy-Header).

Signed-off-by: Lennard Eijsackers <lennardeijsackers92@gmail.com>
The tests where creating a race condition where a previous http.Server could still accept connections from a test. This lead to issues specifically with the proxy middleware tests, as the previous server instance accepted the traffic (an instance where the proxy middleware isn't enabled).
The tests where creating a race condition where a previous http.Server could still accept connections from a test. This lead to issues specifically with the proxy middleware tests, as the previous server instance accepted the traffic (an instance where the proxy middleware isn't enabled).
@Blokje5 Blokje5 force-pushed the fix/support-chproxy-behind-remote-proxy branch from ada7596 to 0fee568 Compare October 12, 2022 12:45
@@ -105,7 +105,8 @@ func (n Networks) Contains(addr string) bool {

h, _, err := net.SplitHostPort(addr)
if err != nil {
panic(fmt.Sprintf("BUG: unexpected error while parsing RemoteAddr: %s", err))
// If we only have an IP address. This happens when the proxy middleware is enabled.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should remove the mention of middleware

@@ -0,0 +1,32 @@
---
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

main.go Outdated

proxySettings := proxySettings.Load().(*config.Proxy)
proxyHandler := NewProxyHandler(proxySettings)
remoteAddr := proxyHandler.GetRemoteAddr(r)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you extract the remoteAddr but I don't see where you do the logic r.RemoteAddr = remoteAddr
I'm not an expert but I guess this logic is needed given the pb, cf a lib doing the same as what we do https://github.com/gorilla/handlers/blob/v1.5.1/proxy_headers.go#L43
I might have loose it because I'm sure this logic was in your code 7 days ago

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the code below, the extracted remoteAddr is passed to the ANs:

remoteAddr := proxyHandler.GetRemoteAddr(r)

var an *config.Networks
if r.TLS != nil {
	an = allowedNetworksHTTPS.Load().(*config.Networks)
	err = fmt.Errorf("https connections are not allowed from %s", remoteAddr)
} else {
	an = allowedNetworksHTTP.Load().(*config.Networks)
	err = fmt.Errorf("http connections are not allowed from %s", remoteAddr)
}

main.go Outdated
@@ -234,15 +242,20 @@ func serveHTTP(rw http.ResponseWriter, r *http.Request) {
promHandler.ServeHTTP(rw, r)
case "/", "/query":
var err error

proxySettings := proxySettings.Load().(*config.Proxy)
proxyHandler := NewProxyHandler(proxySettings)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit picky but IMHO, instead of storing proxySettings, you could store a proxyHandler as an atomicValue. Indeed, conceptually, there is no reason to instantiate as new proxyHandler for every query. We only need to instantiate one every time the config change (which is what you're currently doing with config.Proxy who is only needed for the proxyHandler so not really needed in the end)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually that makes a lot of sense. I've updated the PR to load the proxyHandler.

@Blokje5 Blokje5 requested a review from mga-chka October 20, 2022 07:34
Copy link
Contributor

@gontarzpawel gontarzpawel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Blokje5 !

@mga-chka mga-chka merged commit 23e016a into ContentSquare:master Oct 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

[BUG] Remote IP is wrong when deployed behind load balancer
3 participants