Attempting to resolve TLS certificate chain on OpenShift repeatedly hangs for minutes if cluster uses proxy #21087

Closed
amisevsk opened this issue Jan 28, 2022 · 1 comment
Labels
area/che-operator: Issues and PRs related to Eclipse Che Kubernetes Operator
engine/devworkspace: Issues related to Che configured to use the devworkspace controller as workspace engine.
kind/bug: Outline of a bug - must adhere to the bug report template.
severity/P1: Has a major impact to usage or development of the system.
sprint/current
@amisevsk
Contributor

amisevsk commented Jan 28, 2022

Describe the bug

Running Che on an OpenShift cluster with a proxy configured causes the che-operator to repeatedly hang while resolving the TLS certificate chain. This appears to be related to calling doRequestForTLSCrtChain in this section.

The initial check, which bypasses the proxy configuration, can end up waiting minutes for a timeout to expire.

Che version

7.42@latest

Steps to reproduce

  1. Start an OpenShift cluster with a proxy configured (if you have access to cluster-bot: launch 4.9 aws,proxy)
  2. Install Che with devworkspace enabled
  3. Check che-operator logs

Expected behavior

The timeout should at least be decreased to a reasonable duration. In the first 15 minutes of running che-operator, around 15 minutes are spent waiting on this timeout.

Runtime

OpenShift

Screenshots

No response

Installation method

chectl/latest

Environment

Linux

Eclipse Che Logs

(note the timestamps)

2022-01-28T16:34:26.256Z	INFO	controller-runtime.manager.controller.checluster	Starting workers	{"reconciler group": "org.eclipse.che", "reconciler kind": "CheCluster", "worker count": 1}
time="2022-01-28T16:38:48Z" level=error msg="An error occurred when reaching test TLS route: Get \"https://test-eclipse-che.apps.<snip>.com\": dial tcp 18.219.65.50:443: connect: connection timed out"
time="2022-01-28T16:38:48Z" level=warning msg="Failed to get certificate chain of trust of the OpenShift Ingress bypassing the proxy"
time="2022-01-28T16:38:48Z" level=info msg="Configuring proxy with http://<snip>.compute.amazonaws.com:3128 to extract certificate chain from the following URL: https://test-eclipse-che.apps.<snip>.com"
time="2022-01-28T16:38:48Z" level=info msg="Using proxy: http://<snip>.compute.amazonaws.com:3128 to access TLS endpoint URL: https://test-eclipse-che.apps.<snip>.com"
time="2022-01-28T16:43:14Z" level=error msg="An error occurred when reaching test TLS route: Get \"https://test-eclipse-che.apps.<snip>.com\": dial tcp 3.19.207.178:443: connect: connection timed out"
time="2022-01-28T16:43:14Z" level=warning msg="Failed to get certificate chain of trust of the OpenShift Ingress bypassing the proxy"
time="2022-01-28T16:43:14Z" level=info msg="Configuring proxy with http://<snip>.compute.amazonaws.com:3128 to extract certificate chain from the following URL: https://test-eclipse-che.apps.<snip>.com"
time="2022-01-28T16:43:14Z" level=info msg="Using proxy: http://<snip>.compute.amazonaws.com:3128 to access TLS endpoint URL: https://test-eclipse-che.apps.<snip>.com"

Additional context

Originally detected while verifying Che on an airgap cluster; reproduced on a regular cluster with a proxy configured.

@amisevsk amisevsk added kind/bug Outline of a bug - must adhere to the bug report template. area/che-operator Issues and PRs related to Eclipse Che Kubernetes Operator labels Jan 28, 2022
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Jan 28, 2022
@amisevsk amisevsk added the engine/devworkspace Issues related to Che configured to use the devworkspace controller as workspace engine. label Jan 28, 2022
@amisevsk
Contributor Author

This issue also results in chectl server:deploy taking a total of 16m35s to deploy Che server.

@Kasturi1820 Kasturi1820 added severity/P2 Has a minor but important impact to the usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Jan 31, 2022
@ibuziuk ibuziuk added severity/P1 Has a major impact to usage or development of the system. kind/bug Outline of a bug - must adhere to the bug report template. and removed kind/bug Outline of a bug - must adhere to the bug report template. severity/P2 Has a minor but important impact to the usage or development of the system. labels Jan 31, 2022
@tolusha tolusha mentioned this issue Jan 31, 2022
18 tasks
@tolusha tolusha added this to the 7.44 milestone Feb 1, 2022
@tolusha tolusha closed this as completed Feb 4, 2022
5 participants