libgit2: Configured libgit2 clone ProxyOptions #524
Thanks a lot for your contribution 💯 🙇 Just a heads up: most maintainers start their holidays next week, and libgit2 is a critical and complex area of the controller, so I am (sadly) afraid it is highly likely this won't be (properly) looked at before the beginning of the new year. For the tests, it may be worth trying to run them using |
No worries. I did try adding |
Hi, libgit2 has a built-in option which tries to detect the proxy from the Git configuration and then from environment variables. I have made an example implementation here: https://github.com/au2001/fluxcd-source-controller/pull/1/files With this PR, the behavior of libgit2 would be closer to that of the go-git implementation, which would help fix #131 (open for over a year) and potentially fluxcd/image-automation-controller#279 as well. |
Given this is my last hour of availability for this year, I won't be able to help this move any further I am afraid. However, if tested and proved to work, it could be moved forward by others (cc: @darkowlzz). I would still like to see at least a simple test added, as we have been working really hard on getting the test coverage up to an acceptable rate. |
@au2001 What about opening a PR on source-controller? |
I've also confirmed that @au2001's code works in our clusters. Happy to update this PR to use the Auto option if @au2001 doesn't want to raise his own. I'm a bit stuck on how to test it though, so any hints would be appreciated (I can only think of actually cloning a real remote repo via a running proxy and somehow checking that the proxy is used based on the env values). Admittedly I did notice the Auto proxy type, but (mistakenly) assumed that it was the default. That will teach me. |
I can raise a PR but as @hiddeco stated, at least a test (and maybe some documentation?) will have to be written. As for tests, most that I can think of would just be testing libgit2 and not source-controller itself so I'm not sure. |
Hi, thanks for the patch and all the helpful information around this. I've been looking into this and thinking about ways to test it. I will be trying it out myself in a while. As recommended above, using https://github.com/elazarl/goproxy or something simpler, we should be able to create a simple proxy that adds whatever is necessary for a successful checkout, like the credentials. We can create a test git server with basic auth (examples are in the existing tests) and not provide the credentials to the checkout strategy, but let the proxy add the credentials to the outgoing request. The test cases could be: try a checkout directly without credentials, which should fail, and then try a checkout through the proxy, which should succeed. @racdev you can update the PR with @au2001's recommendations. And thank you both for all the help in flux and git2go. |
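The test shape described above (direct request fails, proxied request succeeds because the proxy adds the credentials) can be sketched with the Go standard library alone. This is only an illustration under stated assumptions: it uses a plain HTTP client instead of a git checkout, a plain-HTTP target instead of https, and the helper names `newAuthServer`, `newInjectingProxy`, `statusVia` and `runDemo` are hypothetical and not part of source-controller.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// newAuthServer stands in for the test git server (hypothetical helper): it
// rejects any request that lacks the expected basic-auth credentials.
func newAuthServer(user, pass string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || u != user || p != pass {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "ok")
	}))
}

// newInjectingProxy is a plain-HTTP forward proxy (hypothetical helper) that
// adds basic-auth credentials to each request before forwarding it.
func newInjectingProxy(user, pass string) *httptest.Server {
	upstream := &http.Transport{} // no proxy configured for the outgoing hop
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		out := r.Clone(r.Context())
		out.RequestURI = "" // only meaningful on server-side requests
		out.SetBasicAuth(user, pass)
		resp, err := upstream.RoundTrip(out)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)
	}))
}

// statusVia GETs target, optionally through proxyURL, and returns the status code.
func statusVia(target, proxyURL string) (int, error) {
	tr := &http.Transport{}
	defer tr.CloseIdleConnections()
	if proxyURL != "" {
		u, err := url.Parse(proxyURL)
		if err != nil {
			return 0, err
		}
		// An explicitly configured proxy is used even for loopback targets.
		tr.Proxy = http.ProxyURL(u)
	}
	resp, err := (&http.Client{Transport: tr}).Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// runDemo returns the status of a direct request and of a proxied request.
func runDemo() (direct, proxied int, err error) {
	server := newAuthServer("user", "secret")
	defer server.Close()
	proxy := newInjectingProxy("user", "secret")
	defer proxy.Close()
	if direct, err = statusVia(server.URL, ""); err != nil {
		return 0, 0, err
	}
	proxied, err = statusVia(server.URL, proxy.URL)
	return direct, proxied, err
}

func main() {
	direct, proxied, err := runDemo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("direct: %d, via proxy: %d\n", direct, proxied)
}
```

The client never holds the credentials, so a 200 response can only come from a request that went through the proxy. Whether the same trick carries over to an https remote depends on the proxy being able to modify the tunnelled request, which this sketch does not attempt.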
Got some updates. I tried the proxy option in the test code in strategy_test.go, starting with just an http git server, and found out that it doesn't work: libgit2 just ignores the proxy. It turns out that it only works for an https git server, as per libgit2/libgit2#5650. I guess we'll have to document this properly. Continuing the testing with an https git server, I created a local proxy using goproxy. The basic default usage of goproxy provided in the examples works only for http requests. Enabling Following is a code snippet from my testing. It creates a goproxy server after creating the git server and sets
// Create git server.
...
proxyGotRequest := false
// Run a proxy server.
proxy := goproxy.NewProxyHttpServer()
proxy.Verbose = true
var httpsHandler goproxy.FuncHttpsHandler = func(host string, ctx *goproxy.ProxyCtx) (*goproxy.ConnectAction, string) {
	// Check if the host matches with the git server address and the user-agent is the expected git client.
	userAgent := ctx.Req.Header.Get("User-Agent")
	if strings.Contains(gitServer.HTTPAddress(), host) && strings.Contains(userAgent, "libgit2") {
		proxyGotRequest = true
	}
	return goproxy.OkConnect, host
}
proxy.OnRequest().HandleConnect(httpsHandler)
proxyServer := http.Server{
	Addr:    "localhost:9999",
	Handler: proxy,
}
// Run the proxy server in a separate goroutine.
go func() {
	proxyServer.ListenAndServe()
}()
defer proxyServer.Close()
// Set the proxy env var.
os.Setenv("HTTPS_PROXY", fmt.Sprintf("http://%s", proxyServer.Addr))
defer func() {
	os.Unsetenv("HTTPS_PROXY")
}()
...
// At the end of the test, check if the proxy received the expected request.
g.Expect(proxyGotRequest).To(BeTrue())
This succeeds with some warnings:
2022/01/05 12:33:53 [001] INFO: Running 1 CONNECT handlers
2022/01/05 12:33:53 [001] INFO: on 0th handler: &{0 <nil> 0xda4c80} 127.0.0.1:34279
2022/01/05 12:33:53 [001] INFO: Accepting CONNECT to 127.0.0.1:34279
2022/01/05 12:33:53 request: GET 127.0.0.1:34279/bar/test-reponame/info/refs?service=git-upload-pack
2022/01/05 12:33:53 request: GET 127.0.0.1:34279/bar/test-reponame/info/refs?service=git-upload-pack
2022/01/05 12:33:53 request: POST 127.0.0.1:34279/bar/test-reponame/git-upload-pack
2022/01/05 12:33:53 [001] WARN: Error copying to client: readfrom tcp 127.0.0.1:51448->127.0.0.1:34279: splice: connection reset by peer
2022/01/05 12:33:53 [001] WARN: Error copying to client: readfrom tcp 127.0.0.1:9999->127.0.0.1:46302: splice: broken pipe
And fails if Unexpected error:
<*fmt.wrapError | 0xc000214280>: {
msg: "unable to clone: failed to connect to localhost: Connection refused",
err: <*git.GitError | 0xc000214260>{
Message: "failed to connect to localhost: Connection refused",
Class: 2,
Code: -1,
},
}
unable to clone: failed to connect to localhost: Connection refused
occurred
Tried the same for go-git, but it doesn't work. I saw that go-git has some support for proxying ssh through SOCKS5. Maybe we need to document that as well. This may be a little more involved than expected. Please let me know if I can help more with the tests. |
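The "did the proxy see the request" check in the snippet above can also be approximated without goproxy. The following is a minimal sketch, assuming a stand-in proxy that only records CONNECT targets and then refuses the tunnel, so no TLS or real upstream is involved. The type `connectRecorder`, the function `probe` and the target `example.invalid` are hypothetical. A mutex guards the recorded hosts because the handler runs on a different goroutine than the code that reads the result.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"sync"
)

// connectRecorder is a stand-in proxy (hypothetical type) that records the
// target of every CONNECT request and then refuses to open the tunnel.
type connectRecorder struct {
	mu    sync.Mutex
	hosts []string
}

func (c *connectRecorder) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Method == http.MethodConnect {
		c.mu.Lock()
		c.hosts = append(c.hosts, r.Host)
		c.mu.Unlock()
	}
	http.Error(w, "tunnelling disabled in this sketch", http.StatusForbidden)
}

// Hosts returns a copy of the CONNECT targets seen so far.
func (c *connectRecorder) Hosts() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]string(nil), c.hosts...)
}

// probe sends an HTTPS request for target through a recording proxy and
// returns the CONNECT targets that the proxy observed.
func probe(target string) ([]string, error) {
	rec := &connectRecorder{}
	srv := httptest.NewServer(rec)
	defer srv.Close()

	proxyURL, err := url.Parse(srv.URL)
	if err != nil {
		return nil, err
	}
	tr := &http.Transport{Proxy: http.ProxyURL(proxyURL)}
	defer tr.CloseIdleConnections()

	// The request is expected to fail, because the proxy refuses the tunnel.
	// Only the proxy-side observation matters here.
	if resp, err := (&http.Client{Transport: tr}).Get(target); err == nil {
		resp.Body.Close()
	}
	return rec.Hosts(), nil
}

func main() {
	hosts, err := probe("https://example.invalid/repo.git")
	if err != nil {
		panic(err)
	}
	fmt.Println("CONNECT targets seen by the proxy:", hosts)
}
```

Because the client hands the hostname to the proxy in the CONNECT request, the target never has to resolve. The recorded value includes the port even though the URL did not specify one.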
@au2001 Have you been able to use HTTPS_PROXY with go-git? If yes, how did you configure it? I'll have to go through go-git more to investigate further. |
@darkowlzz With `source-controller` and `go-git`, I can simply set `HTTPS_PROXY` on the pod and see it work.
|
I'm unable to reproduce it manually for some reason. The proxy doesn't receive any request. |
Did you try to use the lowercase version of the variables, Edit: Sorry, my mistake. I did use HTTP_PROXY and HTTPS_PROXY with go-git and only tried HTTPS_PROXY with libgit2. As you stated, plain HTTP proxies with libgit2 are not supported with the proposed patch.
|
@darkowlzz Is the git server running on localhost? If so, the default ProxyFromEnvironment used by the http Client in go-git will ignore the proxy, because requests to localhost are a special case that is never proxied. https://github.com/golang/go/blob/master/src/net/http/transport.go#L437 Also, the proxy determination is cached (which is the issue I was having when trying to test the helper function in this PR). I don't know whether that is an issue for your tests or not. |
@racdev Yes, it was on localhost. Tried a few things:
I'm not sure what's going on. Need to investigate further. Just to isolate and verify the idea that the Go http transport ignores a localhost proxy, I wrote a small program to check the behavior:
package main
import (
	"fmt"
	"net/http"
	"os"
)
func main() {
	// Point plain-HTTP requests at the local proxy.
	os.Setenv("HTTP_PROXY", "http://localhost:8899")
	resp, err := http.Get("http://example.com")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp)
}
It seems to be going through the proxy. I see the logs:
2022/01/05 16:23:46 [004] INFO: Got request / example.com GET http://example.com/
2022/01/05 16:23:46 [004] INFO: Sending request GET http://example.com/
2022/01/05 16:23:46 [004] INFO: Received response 200 OK
2022/01/05 16:23:46 [004] INFO: Copying response to client 200 OK [200]
2022/01/05 16:23:46 [004] INFO: Copied 1256 bytes to client error=<nil>
Changing to
I haven't noticed any caching issue yet.
package main
import (
	"log"
	"net/http"
	"github.com/elazarl/goproxy"
)
func main() {
	proxy := goproxy.NewProxyHttpServer()
	proxy.Verbose = true
	log.Fatal(http.ListenAndServe(":8899", proxy))
}
You can run the simple client code against this proxy and see if you observe the same behavior. Since this is working, I suspect that something specific to go-git is ignoring the proxy settings. |
@au2001 I just tried Will investigate more into what's going on with go-git and my setup. |
@darkowlzz The localhost ignore is based on the request URL, not the proxy URL, so your code will work as your request isn't to localhost (the proxy running on localhost is fine). Your comment reads as though you changed the proxy to not be on localhost, is that correct? If the git server/repo URL is localhost in the tests then the proxy will be ignored as I understand it. |
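The distinction drawn above (the exemption depends on the request URL, not on where the proxy runs) can be checked directly against `http.ProxyFromEnvironment`. This is a small sketch rather than anything from the PR. The helpers `configureEnv` and `proxyFor` are hypothetical, and the environment must be set before the first lookup because net/http reads it only once per process.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// configureEnv (hypothetical helper) points plain-HTTP traffic at a proxy on
// localhost and clears variables that could otherwise change the outcome.
func configureEnv() {
	os.Setenv("HTTP_PROXY", "http://localhost:8899")
	os.Unsetenv("NO_PROXY")
	os.Unsetenv("no_proxy")
	os.Unsetenv("REQUEST_METHOD") // HTTP_PROXY is ignored in CGI environments
}

// proxyFor reports which proxy http.ProxyFromEnvironment selects for rawURL.
// An empty string means the request would bypass the proxy.
func proxyFor(rawURL string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	u, err := http.ProxyFromEnvironment(req)
	if err != nil {
		return "", err
	}
	if u == nil {
		return "", nil
	}
	return u.String(), nil
}

func main() {
	configureEnv()
	targets := []string{
		"http://example.com/",    // remote target: proxied
		"http://localhost:8080/", // localhost target: bypasses the proxy
		"http://127.0.0.1:8080/", // loopback address: bypasses the proxy
	}
	for _, target := range targets {
		p, err := proxyFor(target)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-24s proxy=%q\n", target, p)
	}
}
```

A proxy listening on localhost is selected for the remote target, while both loopback targets get no proxy at all. That matches the observation that a git server bound to localhost cannot be used to exercise the environment-based proxy path in go-git.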
@racdev Sorry, my bad. I'm able to reproduce it now. Thanks a lot 😅 In this case, I think we'll have to clone a public repo. I was testing with a small repo that go-git uses for its own testing: https://github.com/git-fixtures/basic.git. So, with this, we know that go-git works, and that libgit2 works for https only. |
Yeah I had come to the conclusion that a public repo clone would be required to test. I will see if I can get some time this evening to update this PR with something. |
@darkowlzz Since we already have a proxy running, why not try to clone a repository at Also, from what I can see, the SSL certificate used for HTTPS cloning already contains |
@au2001 That would be great if it works. In my experiments with goproxy earlier today, I found that when it's https, the request can't be modified. I tried adding some headers but on the server side, the received requests don't contain the headers added by the proxy. But it works for http requests. Please share more information if you know how to achieve it. Regarding the idea of binding to something like |
@darkowlzz Indeed, goproxy does not let you modify the content of the request without setting up MITM. I tried to use your test code and adapt it, but I can't seem to make go-git use the proxy, no matter what URL I use. It works fine with libgit2 though. Here is a PoC redirecting github.com to a.github.io (which serves a certificate valid on github.com):
package main
import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
	"github.com/elazarl/goproxy"
)
func main() {
	proxy := goproxy.NewProxyHttpServer()
	proxy.Verbose = true
	var httpsHandler goproxy.FuncHttpsHandler = func(host string, ctx *goproxy.ProxyCtx) (*goproxy.ConnectAction, string) {
		if host == "github.com:443" { // The port is always added, even if the user doesn't specify it.
			return goproxy.OkConnect, "a.github.io:443"
		}
		return goproxy.RejectConnect, host // Prevent all connections except to github.com - just to be safe.
	}
	proxy.OnRequest().HandleConnect(httpsHandler)
	proxyServer := http.Server{
		Addr:    "localhost:9999",
		Handler: proxy,
	}
	go proxyServer.ListenAndServe()
	defer proxyServer.Close()
	os.Setenv("HTTPS_PROXY", fmt.Sprintf("http://%s", proxyServer.Addr))
	defer os.Unsetenv("HTTPS_PROXY")
	time.Sleep(1 * time.Second) // Wait for the proxy server to start.
	res, err := http.Get("https://github.com")
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body) // io.ReadAll replaces the deprecated ioutil.ReadAll.
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // Displays the GitHub.io 404 page and not GitHub.com's home page.
} |
Thanks for trying that @au2001 . I too managed to make it work for libgit2 based on your method but the proxy seems to be ignored with go-git even for example.com. Tried a few things, but couldn't get it to work. |
I also couldn't get go-git to work (I still think it has something to do with the ProxyFromEnvironment caching), but libgit2 was OK. NO_PROXY didn't work for libgit2, though I think that is because the version of libgit2 in use doesn't include the NO_PROXY update from libgit2/libgit2#6026. If we can't get the go-git proxy tests to work, do we just go with adding tests for the libgit2 implementation (which is what this PR specifically targets)? Or do we want to keep trying to get go-git working too? |
Caching could definitely be an issue later on, but I believe it is not the (only) cause of the issue here. It works fine with go-git through HTTP and libgit2 through HTTPS.
I was able to bypass the proxy using A test for |
I don't think that code is in the version of libgit2 that the controller is using. I am perhaps (more than likely) completely wrong, but as far as I can see the makefile uses libgit2 version 1.1.1 which would be this, that doesn't have anything in there for NO_PROXY: https://github.com/libgit2/libgit2/blob/v1.1.1/src/remote.c#L789 Same story if you follow the version of git2go v31.6.1 and its libgit2 version: https://github.com/libgit2/git2go/tree/v31.6.1/vendor
Yeah, I did want to add a NO_PROXY test, I was trying to simply use example.com for the no proxy value and testing to make sure the request didn't go through the proxy when set, but it always did in my tests (for libgit2). |
Looks right to me, libgit2/libgit2@e5a3277 this change is only present in v1.2.0 and v1.3.0, and we are using v1.1.1 due to some issues in the newer versions of git2go.
@racdev We have seen that it all works fine with remote/public repos. Due to the situation, let's use a public repo for just proxy tests. In the future, if we figure out how to make it work locally, we can change it. |
@darkowlzz Yes, that's a way better solution. Thanks.
@racdev Agreed, that makes sense.
@racdev Seems fine to me. I actually was using a 5 seconds timeout too during most of my testing, but forgot to add it back.
@darkowlzz I'm not sure if I'm doing something wrong/different than you but when moving the proxy test function to
Which indicates empty environment variables were cached by the previous tests and the proxy isn't used. |
@au2001 I pasted the whole test function in strategy_test.go and just been running I've also been running |
@darkowlzz If I place the proxy tests at the top of the file, they are run first and thus the cached variables are the good ones. Is it working for you even with the proxy tests last? |
@au2001 Yes, fails when added at the very end with the same unable to clone, repository not found error. |
I'm not sure why this is happening. But this proves that there's some interference when the tests are in the same file. |
Discovered another related issue with the test ordering: if the proxy tests are in a separate file whose name sorts after the other test files, so that the proxy tests run after the other tests, they fail. |
Good catch. That's an issue though. |
Yes, I think I mentioned this a few days ago. We can skip just the proxy tests and run them later separately. After playing around with build tags and other options, it looks like moving the proxy tests into a separate package helps. In a separate package, it doesn't matter whether they run before or after the tests in other packages. So, I created a directory within strategy dir,
go test github.com/fluxcd/source-controller/pkg/git/strategy/... -count 1
ok github.com/fluxcd/source-controller/pkg/git/strategy 5.015s
ok github.com/fluxcd/source-controller/pkg/git/strategy/proxy 0.973s
It succeeded even though the proxy tests ran at the end. |
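One plausible contributor to the ordering problem is that Go's net/http reads the proxy environment variables only once per process, so a lookup made by an earlier test fixes the result for every later test in the same test binary. Since `go test` builds and runs one test binary per package, a separate package starts with a fresh cache. The sketch below shows only the net/http side of this and makes no claim about libgit2's own behavior. The helpers `selectedProxy` and `observe` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// selectedProxy returns the host of the proxy that http.ProxyFromEnvironment
// picks for a fixed remote URL, or "" when no proxy is selected.
func selectedProxy() string {
	req, err := http.NewRequest(http.MethodGet, "http://example.com/", nil)
	if err != nil {
		return ""
	}
	u, err := http.ProxyFromEnvironment(req)
	if err != nil || u == nil {
		return ""
	}
	return u.Host
}

// observe changes HTTP_PROXY between two lookups and returns what each
// lookup saw.
func observe() (first, second string) {
	os.Unsetenv("NO_PROXY")
	os.Unsetenv("no_proxy")
	os.Unsetenv("REQUEST_METHOD") // HTTP_PROXY is ignored in CGI environments

	os.Setenv("HTTP_PROXY", "http://localhost:8899")
	first = selectedProxy()

	// This change happens after the first lookup, so net/http no longer
	// consults the environment.
	os.Setenv("HTTP_PROXY", "http://localhost:9999")
	second = selectedProxy()
	return first, second
}

func main() {
	first, second := observe()
	fmt.Printf("first lookup:  %s\nsecond lookup: %s\n", first, second)
}
```

Both lookups report the value that was in the environment at the time of the first call. In a test suite this means that a test which sets HTTP_PROXY or HTTPS_PROXY with os.Setenv only has an effect on go-git style clients if no earlier test in the same package has already triggered a lookup.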
A separate package also enables easy ignoring of packages using
$ go test -v `go list ./... | grep -v proxy` -count 1
That'll test all the packages except for the proxy package.
$ go list ./... | grep -v proxy
github.com/fluxcd/source-controller
github.com/fluxcd/source-controller/controllers
github.com/fluxcd/source-controller/internal/fs
github.com/fluxcd/source-controller/internal/helm
github.com/fluxcd/source-controller/internal/helm/chart
github.com/fluxcd/source-controller/internal/helm/getter
github.com/fluxcd/source-controller/internal/helm/repository
github.com/fluxcd/source-controller/pkg/gcp
github.com/fluxcd/source-controller/pkg/git
github.com/fluxcd/source-controller/pkg/git/gogit
github.com/fluxcd/source-controller/pkg/git/libgit2
github.com/fluxcd/source-controller/pkg/git/strategy
github.com/fluxcd/source-controller/pkg/sourceignore
This pattern was very common many years ago, when Go didn't ignore the vendor directory when running tests. |
Think I have made the updates as required 🙏 |
@racdev Everything looks and works great.
Sorry that I couldn't add some more suggestions last week. Added a few minor suggestions for improvements. Hope that's okay.
Apologies for the delay in making updates. Hopefully I have updated as required. |
LGTM!
Thanks a lot.
This configures ProxyOptions for all libgit2 Checkout functions when cloning and configures the options based on current environment settings using the git2go.ProxyTypeAuto option. Refs: fluxcd#131 Signed-off-by: Robert Clarke <rob@robertandrewclarke.com> Co-authored-by: Aurélien GARNIER <aurelien.garnier@atos.net>
Thank you very much @racdev 🙇 💯
We should document this in the API spec and update the gogit vs libgit2 table. |
This configures ProxyOptions for all libgit2 Checkout functions when cloning and configures the options based on current environment settings using the http.ProxyFromEnvironment function.
Refs: #131
I haven't added any tests as they would only really be testing the http.ProxyFromEnvironment function.
However when I tried to create some tests, it appeared as though the go test caching meant the proxy environment values set for each test were being cached and hence the tests always failed. I don't have a huge amount of experience developing in go so that might just be an issue on my part - happy to be told/shown how I could get such tests to work.