CloudServicesRequest CRs are not processed and updated after updating to 0.6.8 #182

Closed · b1zzu opened this issue Apr 16, 2021 · 4 comments · Fixed by #188
Labels: bug (Something isn't working)

b1zzu (Collaborator) commented Apr 16, 2021:

After creating a CloudServicesRequest, nothing happens and the status never gets updated:

apiVersion: rhoas.redhat.com/v1alpha1
kind: CloudServicesRequest
metadata:
  creationTimestamp: '2021-04-16T12:19:32Z'
  generation: 1
  name: mk-e2e-kafka-request
  namespace: mk-e2e-test-dbizzarr
  resourceVersion: '156687910'
  selfLink: >-
    /apis/rhoas.redhat.com/v1alpha1/namespaces/mk-e2e-test-dbizzarr/cloudservicesrequests/mk-e2e-kafka-request
  uid: 7c1e3856-4940-4b64-b3ca-b333933deae0
spec:
  accessTokenSecretName: mk-e2e-api-accesstoken

Looking at the Operator logs, this is what I see:

2021-04-16 12:23:11,411 WARN  [io.jav.ope.pro.eve.int.CustomResourceEventSource] (OkHttp https://172.30.0.1/...) Received error for watch, will try to reconnect.: io.fabric8.kubernetes.client.WatcherException: too old resource version: 156684782 (156691299)
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$TypedWatcherWebSocketListener.onMessage(WatchConnectionManager.java:103)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:322)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
	at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: too old resource version: 156684782 (156691299)
	... 11 more
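For context on the error above: "too old resource version" is the API server's HTTP 410 Gone case, meaning the resourceVersion the watcher tried to resume from has already been compacted away, so the client has to re-list and open a fresh watch rather than reconnect with the stale version. Below is a minimal, hypothetical sketch of that list-then-watch pattern against the fabric8 client (ConfigMaps stand in for the real custom resource); it is not the operator's or the SDK's actual code, and the class layout and method names outside the fabric8 API are assumptions.

// Hypothetical sketch only: re-list to get a fresh resourceVersion, then re-watch.
import io.fabric8.kubernetes.api.model.ConfigMap;
import io.fabric8.kubernetes.api.model.ListOptionsBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.Watcher;
import io.fabric8.kubernetes.client.WatcherException;

public class RelistAndRewatch {

    private final KubernetesClient client = new DefaultKubernetesClient();

    public void startWatch(String namespace) {
        // List first so the watch starts from a resourceVersion the API server still knows.
        String fresh = client.configMaps().inNamespace(namespace)
                .list().getMetadata().getResourceVersion();

        client.configMaps().inNamespace(namespace).watch(
                new ListOptionsBuilder().withResourceVersion(fresh).build(),
                new Watcher<ConfigMap>() {
                    @Override
                    public void eventReceived(Action action, ConfigMap resource) {
                        // hand the event off to the reconciler
                    }

                    @Override
                    public void onClose(WatcherException cause) {
                        // HTTP 410 Gone ("too old resource version"): the stored version
                        // was compacted away, so resuming is pointless; re-list and open
                        // a brand new watch instead.
                        if (cause != null && cause.isHttpGone()) {
                            startWatch(namespace);
                        }
                    }
                });
    }
}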
b1zzu added the bug label on Apr 16, 2021
secondsun (Contributor) commented:

This seems like a dupe of #170

secondsun (Contributor) commented:

I've opened operator-framework/java-operator-sdk#395, which has the details of my explorations.

Locally I am testing the reconnect schedule in a crc cluster. It doesn't seem to be causing any harm, but it is too "hacky" IMHO to drop into production unannounced. There are also some warnings printed from OkHttp about leaked connections.

@b1zzu If you have a test cluster you can spare, you can use my catalog source (https://github.com/secondsun/app-services-operator/blob/main/olm/catalogsource.yaml) to deploy a version of the operator with this logic. If we leave it idling for a period of time, we can see whether the connection drops or not.
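For readers following along, here is a minimal, hypothetical sketch of the scheduled-reconnect idea being tested: a background task periodically opens a fresh watch and closes the previous one, so the connection never sits around long enough to hit the error above. The class name, the Supplier-based watch factory, and the interval are illustrative assumptions, not the operator's actual implementation.

// Hypothetical sketch only: periodically swap the watch for a fresh one.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import io.fabric8.kubernetes.client.Watch;

public class ScheduledWatchRefresher {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private volatile Watch currentWatch;

    // watchFactory opens a fresh watch (for example, a list-then-watch call
    // like the sketch further up) and returns its handle.
    public void start(Supplier<Watch> watchFactory, long intervalMinutes) {
        scheduler.scheduleAtFixedRate(() -> {
            Watch previous = currentWatch;
            currentWatch = watchFactory.get();
            if (previous != null) {
                previous.close(); // drop the old connection once the new one is up
            }
        }, 0, intervalMinutes, TimeUnit.MINUTES);
    }

    public void stop() {
        if (currentWatch != null) {
            currentWatch.close();
        }
        scheduler.shutdownNow();
    }
}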

b1zzu (Collaborator, Author) commented Apr 19, 2021:

secondsun (Contributor) commented:

@b1zzu Awesome, I've got my fingers crossed.

There are two big drawbacks with this implementation right now.

  1. It spams the logs, because the method we're calling to reset the connection is supposed to be an error handler that is invoked when the connection drops unexpectedly.
  2. OkHttp reports that connections are being leaked; this happens because we call onClose ourselves when the connection isn't actually closed.

This is a short-term bandaid of course, but if we get a fix for operator-framework/java-operator-sdk#395 then we can remove this workaround (assuming it works in the clusters).
