OPA gatekeeper error: dial tcp: connect: cannot assign requested address #1416

Closed
1 task done
mannbiher opened this issue Apr 25, 2024 · 2 comments · Fixed by #1418
Labels
bug Something isn't working triage Needs investigation

Comments

@mannbiher
Contributor

mannbiher commented Apr 25, 2024

What happened in your environment?

I installed the Ratify Helm chart by following the AWS Signer quickstart: https://ratify.dev/docs/quickstarts/ratify-with-aws-signer. We don't need image mutation, so the chart is installed with the additional value provider.enableMutation=false. After running for a few days, the OPA Gatekeeper audit controller can no longer open new connections to Ratify, and the ratifyconstraint shows the following error in its violations.

- enforcementAction: warn
  group: ""
  kind: Pod
  message: 'System error calling external data provider: failed to send external
    data request: Post "https://ratify.gatekeeper-system:6001/ratify/gatekeeper/v1/verify":
    dial tcp 172.20.146.228:6001: connect: cannot assign requested address'
  name: abc-767bb47d54-kvb79
  namespace: abc
  version: v1

What did you expect to happen?

ratifyconstraint should show the actual violations.

What version of Kubernetes are you running?

1.27

What version of Ratify are you running?

1.1.0

Anything else you would like to add?

I looked at the external data provider code and can see that this happens because a new client is created for every request. No IdleConnTimeout is set on the transport, so the old connections remain open. Eventually no new connections can be opened and we get the error above.
https://github.com/open-policy-agent/frameworks/blob/master/constraint/pkg/externaldata/request.go#L140

Even though this has to be fixed in OPA Gatekeeper, the Ratify HTTP server should set a default IdleTimeout so that old idle connections are not kept open.

https://github.com/deislabs/ratify/blob/dev/httpserver/server.go#L137

I made the change below and ran a local build. After running it for a day, I can see that no idle connections remain open in the gatekeeper-audit controller.

 	svr := &http.Server{
 		Addr:              server.Address,
 		Handler:           server.Router,
 		ReadHeaderTimeout: readHeaderTimeout,
+		IdleTimeout:       90 * time.Second,
 	}

Are you willing to submit PRs to contribute to this bug fix?

  • Yes, I am willing to implement it.
mannbiher added the bug (Something isn't working) and triage (Needs investigation) labels on Apr 25, 2024
mannbiher changed the title from "OPA gatekeeper error: dial tcp 172.20.146.228:6001: connect: cannot assign requested address" to "OPA gatekeeper error: dial tcp : connect: cannot assign requested address" on Apr 25, 2024
mannbiher changed the title from "OPA gatekeeper error: dial tcp : connect: cannot assign requested address" to "OPA gatekeeper error: dial tcp: connect: cannot assign requested address" on Apr 25, 2024
@akashsinghal
Collaborator

akashsinghal commented Apr 25, 2024

Thanks for surfacing this issue, @mannbiher. Would you be able to open a PR with a fix? Happy to review it.

Also, I think this is a good issue to file on the Gatekeeper project too. As the docs say, and as you pointed out, the client should be reused.

@mannbiher
Contributor Author

Filed the OPA Gatekeeper issue: open-policy-agent/frameworks#423

@susanshi susanshi modified the milestone: v1.2.0 Apr 26, 2024