Non blocking matchers & matching timeout #192

Merged
merged 14 commits into from
May 9, 2024

ydylla
Collaborator

@ydylla ydylla commented May 6, 2024

This is a better version of #72 that actually works 😄
Instead of using the wrapped route chain, routing/matching is now done in a loop until a timeout is reached. Each connection is prefetched without blocking, then all routes and their matchers are tried on the available data. If nothing matches, more data is prefetched and all routes are tried again, until the timeout is reached. The timeout can be configured per layer4 server and defaults to 3s. I first had it set to 500ms, but since the matching phase may contain TLS handshakes, that was too short: I saw some failed connections from faraway servers where the TLS handshake apparently takes about 1000ms.

If you want to test or compare both prefetching branches, you can use this Caddy config:

prefetch.json
{
  "admin": {
    "disabled": true
  },
  "logging": {
    "logs": {
      "default": {"level":"DEBUG", "encoder": {"format":"console"}}
    }
  },
  "apps": {
    "tls": {
      "certificates": {
        "automate": ["localhost"]
      },
      "automation": {
        "policies": [{
          "subjects": ["localhost"],
          "issuers": [{
            "module": "internal"
          }]
        }]
      }
    },
    "layer4": {
      "servers": {
        "https": {
          "listen": ["0.0.0.0:10443"],
          "routes": [
            {
              "match": [
                {"proxy_protocol": {}}
              ],
              "handle": [
                {
                  "handler": "proxy_protocol",
                  "timeout": "2s",
                  "allow": ["127.0.0.1/32"]
                }
              ]
            },
            {
              "match": [
                {"tls": {"sni": ["localhost"]}}
              ],
              "handle": [
                {"handler": "tls"}
              ]
            },
            {
              "match": [
                {"http": [{"host": ["localhost"]}]}
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{"dial": ["127.0.0.1:10080"]}]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "servers": {
        "backend": {
          "protocols": ["h1","h2","h2c"],
          "listen": ["127.0.0.1:10080"],
          "routes": [
            {
              "handle": [{
                "handler": "static_response",
                "status_code": "200",
                "body": "Hello World\n",
                "headers": {
                  "Content-Type": ["text/plain"]
                }
              }]
            }
          ]
        }
      }
    }
  }
}

The following curl commands should all work with this branch while some do not work with the old version:

curl -v -k --http2 https://localhost:10443
curl -v -k --http2-prior-knowledge https://localhost:10443
curl -v -k http://localhost:10443
curl -v -k --haproxy-protocol https://localhost:10443

I hope this rewrite is not too big and that you are still interested in merging it. I am also currently testing it with real browser traffic, and so far it works fine.

@mholt
Owner

mholt commented May 7, 2024

Oh wow! This is quite the change 😅 I'm impressed.

It is a big change, but I can see the crux of the changes is in about 2-3 places; the rest are just minor edits to accommodate different APIs.

I do have some questions which I'll post in-line or as a follow-up comment.

But overall I think I like where this is going and it's likely to be merged. 👍 Thank you, this is really awesome!

Owner

@mholt mholt left a comment

Ok, I might do this in chunks/iterations. Starting with the first things I notice, these are just questions I have -- overall I think this is a very thoughtful change so it's looking quite good. I just want to make sure I understand it and that we eke out the best possible design. 💯

layer4/routes.go (outdated)
modules/l4subroute/handler.go
Owner

@mholt mholt left a comment

Ok did another quick scan -- looking better! I love that we don't have IsTerminal() anymore. I'm glad to see you were able to make that work 😃

I still need to fully grok the "freeze/unfreeze" stuff (as opposed to recording?) but I'll do another pass soon :) Thank you!

modules/l4subroute/handler.go (outdated)
layer4/routes.go (outdated)
@ydylla
Collaborator Author

ydylla commented May 9, 2024

I still need to fully grok the "freeze/unfreeze" stuff (as opposed to recording?)

I renamed them because "record" felt wrong since there is no recording happening during matching anymore. Instead the connection is frozen and can only read data from the prefetched buffer. Technically a matcher could still write but that would very likely result in an invalid state. We could return an error if writes are attempted during matching to prevent misuse.

Owner

@mholt mholt left a comment

That makes sense. Thanks for the explanation!

Ok I think we're close. I understand things a little better. Still a few more questions but nothing too major. Almost there!! Thanks for hanging in there with me!

modules/l4http/httpmatcher.go
layer4/connection.go (outdated)
layer4/connection.go (outdated)
Owner

@mholt mholt left a comment

I think this is looking good 😃 Are you able to help with it if anyone reports an issue related to this? I'll be happy to merge this and we can try it out! Excellent work.

@mholt mholt added the enhancement New feature or request label May 9, 2024
@ydylla
Collaborator Author

ydylla commented May 9, 2024

Yes, I can help with bug fixing, especially as long as I use it myself 😄

@mholt mholt merged commit f049165 into master May 9, 2024
6 checks passed
@mholt
Owner

mholt commented May 9, 2024

Awesome, thanks!!

@mholt mholt deleted the prefetching-2 branch May 9, 2024 23:31
@ydylla ydylla mentioned this pull request May 13, 2024
@T4cC0re

T4cC0re commented May 24, 2024

This PR breaks our configuration.
A TLS connection is never established successfully with the module on this PR and Caddy 2.7.6. It works fine on c5d815d with Caddy 2.7.6.

We use caddy-l4 to forward to a Unix socket depending on the TLS handshake, then have a secondary listener on a Unix socket with on-demand TLS provisioning. (We do it this way to be able to use different TLS provisioning policies, among other things.)

Setting the new timeout parameters to something like '15m' does not help.

The config:

{
  "logging": {
    "logs": {
      "default": {
        "level": "info",
        "writer": {
          "output": "stdout"
        },
        "encoder": {
          "format": "json"
        }
      }
    }
  },
  "apps": {
    "layer4": {
      "servers": {
        "http": {
          "listen": [
            ":80"
          ],
          "routes": [
            {
              "match": [
                {
                  "http": [
                    {
                      "host": [
                        "gitlab.REDACTED"
                      ]
                    }
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//var/opt/caddy/caddy.gitlab"
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "match": [
                {
                  "http": [
                    {
                      "host": [
                        "registry.REDACTED"
                      ]
                    }
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//var/opt/caddy/caddy.registry"
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        },
        "https": {
          "listen": [
            ":443"
          ],
          "routes": [
            {
              "match": [
                {
                  "tls": {}
                }
              ],
              "handle": [
                {
                  "handler": "subroute",
                  "routes": [
                    {
                      "match": [
                        {
                          "tls": {
                            "sni": [
                              "gitlab.REDACTED"
                            ]
                          }
                        }
                      ],
                      "handle": [
                        {
                          "handler": "proxy",
                          "proxy_protocol": "v2",
                          "upstreams": [
                            {
                              "dial": [
                                "unix//var/opt/caddy/caddy.gitlab"
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "match": [
                        {
                          "tls": {
                            "sni": [
                              "registry.REDACTED"
                            ]
                          }
                        }
                      ],
                      "handle": [
                        {
                          "handler": "proxy",
                          "proxy_protocol": "v2",
                          "upstreams": [
                            {
                              "dial": [
                                "unix//var/opt/caddy/caddy.registry"
                              ]
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        },
        "ssh": {
          "listen": [
            ":22"
          ],
          "routes": [
            {
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "10.0.1.1:2022"
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "http_port": 80,
      "https_port": 443,
      "servers": {
        "gitlab": {
          "strict_sni_host": true,
          "listen": [
            "unix//var/opt/caddy/caddy.gitlab"
          ],
          "listener_wrappers": [
            {
              "wrapper": "proxy_protocol",
              "timeout": 0
            },
            {
              "wrapper": "http_redirect"
            },
            {
              "wrapper": "tls"
            }
          ],
          "metrics": {},
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "gitlab.REDACTED"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "headers",
                  "response": {
                    "set": {
                      "Server": [
                        "REDACTED"
                      ]
                    }
                  }
                },
                {
                  "encodings": {
                    "gzip": {},
                    "zstd": {}
                  },
                  "handler": "encode",
                  "prefer": [
                    "zstd",
                    "gzip"
                  ]
                },
                {
                  "handler": "reverse_proxy",
                  "load_balancing": {
                    "selection_policy": {
                      "policy": "round_robin"
                    }
                  },
                  "upstreams": [
                    {
                      "dial": "10.0.1.1:8081"
                    }
                  ]
                }
              ],
              "terminal": true
            }
          ]
        },
        "registry": {
          "strict_sni_host": true,
          "listen": [
            "unix//var/opt/caddy/caddy.registry"
          ],
          "listener_wrappers": [
            {
              "wrapper": "proxy_protocol",
              "timeout": 0
            },
            {
              "wrapper": "http_redirect"
            },
            {
              "wrapper": "tls"
            }
          ],
          "metrics": {},
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "registry.REDACTED"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "headers",
                  "response": {
                    "set": {
                      "Server": [
                        "REDACTED"
                      ]
                    }
                  }
                },
                {
                  "encodings": {
                    "gzip": {},
                    "zstd": {}
                  },
                  "handler": "encode",
                  "prefer": [
                    "zstd",
                    "gzip"
                  ]
                },
                {
                  "handler": "reverse_proxy",
                  "load_balancing": {
                    "selection_policy": {
                      "policy": "round_robin"
                    }
                  },
                  "upstreams": [
                    {
                      "dial": "10.0.1.1:18081"
                    }
                  ]
                }
              ],
              "terminal": true
            }
          ]
        }
      }
    },
    "tls": {
      "certificates": {
        "load_folders": [
          "/var/opt/caddy/certificates"
        ]
      },
      "automation": {
        "policies": [
          {
            "on_demand": true,
            "issuers": [
              {
                "email": "REDACTED",
                "module": "acme"
              },
              {
                "email": "REDACTED",
                "module": "zerossl"
              }
            ]
          }
        ],
        "on_demand": {
          "ask": "http://127.0.0.1:2223"
        }
      }
    }
  }
}

@mholt
Owner

mholt commented May 24, 2024

@T4cC0re Is there any particular error message? "A TLS connection is never established successfully" is not a lot to go on...

@ydylla
Collaborator Author

ydylla commented May 25, 2024

I think I know what's happening. The subroute is blocking on its prefetch call because the root route already fetched all bytes from the client. The compiled route in the subroute handler should probably not call prefetch if there is already some data available (at least not for the first matcher loop iteration). Sorry I overlooked this, I never actually tested the subroute handler.

@T4cC0re You can work around this by not using the subroute handler. You should be able to just copy your subroute "routes" to the root "routes".

@ydylla
Collaborator Author

ydylla commented May 25, 2024

The subroute is blocking on its prefetch call because the root route already fetched all bytes from the client.

This is exactly what happens. #194 fixes it.

@T4cC0re

T4cC0re commented May 31, 2024

This is exactly what happens. #194 fixes it.

Can confirm. It works now :) Thanks for the quick turnaround!

@mholt
Owner

mholt commented May 31, 2024

Really appreciate your help @ydylla -- thank you!

@mholt
Owner

mholt commented Jun 2, 2024

@ydylla One more potential bug with this commit: have you considered the case where a TLS connection is wrapped? i.e.:

{
	"apps": {
		"layer4": {
			"servers": {
				"signal_proxy": {
					"listen": [
						":443"
					],
					"routes": [
						{
							"handle": [
								{
									"handler": "tls"
								}
							]
						},
						{
							"match": [
								{
									"tls": {
										"sni": [
											"chat.signal.org",
											"ud-chat.signal.org"
										]
									}
								}
							],
							"handle": [
								{
									"handler": "proxy",
									"upstreams": [
										{
											"dial": [
												"chat.signal.org:443"
											]
										}
									]
								}
							]
						},
						{
							"match": [
								{
									"tls": {
										"sni": [
											"storage.signal.org",
											"cdn.signal.org",
											"cdn2.signal.org",
											"cdn3.signal.org",
											"cdsi.signal.org",
											"contentproxy.signal.org",
											"uptime.signal.org",
											"sfu.voip.signal.org",
											"svr2.signal.org",
											"updates.signal.org",
											"updates2.signal.org",
											"backend1.svr3.signal.org",
											"backend2.svr3.signal.org",
											"backend3.svr3.signal.org"
										]
									}
								}
							],
							"handle": [
								{
									"handler": "proxy",
									"upstreams": [
										{
											"dial": [
												"{l4.tls.server_name}:443"
											]
										}
									]
								}
							]
						}
					]
				}
			}
		},
		"tls": {
			"certificates": {
				"automate": [
					"example.com"
				]
			}
		}
	}
}

A Signal proxy first terminates the "outer" layer of TLS; then, within that encrypted tunnel, an inner TLS connection is sent. This config used to be able to match the ServerName (SNI) of the inner TLS connection without terminating it, but I noticed that with this change, it seems to try to terminate the inner connection.

Do you think you could help fix this? It appears to be a regression. (Sorry!)

@ydylla
Collaborator Author

ydylla commented Jun 2, 2024

@mholt I don't think the wrapping is the problem. It's likely my rather arbitrary choice of jumping back to the first route after a match instead of continuing. It was easier to code and seemed reasonable. If you add an "example.com" SNI matcher to the first route with the tls handler, it should work.
I will try to reproduce and fix it, should be possible. And don't worry I also want the stuff I write to work 😄

@mholt
Owner

mholt commented Jun 2, 2024

Oh you're good! I super appreciate your contributions (I've been stretched a bit thin lately.) Let me know if I can help at all. Thanks so much 😅

ydylla added a commit that referenced this pull request Jun 2, 2024
This is done to be backwards compatible with the old matching behavior see #192 (comment)
@ydylla
Collaborator Author

ydylla commented Jun 2, 2024

@mholt #196 recreates the old matching behavior.
I spent most of the time trying to come up with a command line that does TLS in TLS with curl, echo and openssl s_client, but eventually I gave up. Turns out writing it in Go was way easier 😅. In case anyone also needs this:

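// Test client that opens a TLS connection inside another TLS connection.
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	// Outer TLS layer: a regular TLS connection to the server under test.
	outerConn, err := tls.Dial("tcp", "127.0.0.1:10443", &tls.Config{
		ServerName:         "outer",
		InsecureSkipVerify: true, // test only: skips certificate verification
	})
	if err != nil {
		log.Fatal(err)
	}

	// Inner TLS layer: tunneled through the outer connection.
	// Its handshake runs lazily on the first Write below.
	innerConn := tls.Client(outerConn, &tls.Config{
		ServerName:         "inner",
		InsecureSkipVerify: true, // test only
	})

	// Send a plain HTTP request through both TLS layers.
	_, err = innerConn.Write([]byte("GET / HTTP/1.0\r\nHost: localhost\r\n\r\n"))
	if err != nil {
		log.Fatal(err)
	}

	// Print the first chunk of the response.
	buf := make([]byte, 1024)
	n, err := innerConn.Read(buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(buf[:n]))
}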

ydylla added a commit that referenced this pull request Jun 2, 2024
This is done to be backwards compatible with the old matching behavior see #192 (comment)
ydylla added a commit that referenced this pull request Jun 3, 2024
This is done to be backwards compatible with the old matching behavior see #192 (comment)
@mholt
Owner

mholt commented Jun 4, 2024

Ah 😅 Thanks for working on that and posting your test program. I will review the patch soon!! Thank you!
