fix(threads): back out #1388 Vertexecutor #1449
Conversation
Test image available:
Force-pushed from 8c89be0 to 92a83cd
Test image available:
Force-pushed from 92a83cd to e8408a5
Test image available:
@maxcao13 @tthvo if you have some spare time, could you help with this? I think this is a pretty low-risk change since it mostly sets some things back to how they were ~1 month ago. I am waiting for some further details from the bug reporter, but I think this bug can be triggered by setting up a scenario where there are 20+ target applications available for Cryostat to discover on startup. The k8s/OpenShift part of the report is probably not important - I expect this would also reproduce in the smoketest podman setup, so simply copy-pasting enough configurations of one of the sample apps should be able to trigger it, but I haven't tried that yet. This is a small patch and a bugfix, so there's no rush to get it reviewed before tomorrow's development codefreeze.
I ran smoketest with 100 Here's my
Hmm, okay. Those have the
Original reporter confirmed that the bug reproduced in their environment with 20 targets, but with 17 it was OK. That confirms my suspicion of the root cause, so I'm pretty confident this is a good fix. It would still be best to actually test that.
Running
I think there are still some problems with blocking the event loop, however. For example, using this PR image as the CORE_IMG, I had 11 vertx-fib-demos running in a namespace, and when I created an Automated Rule with match expression: true, I got this:
I'm unable to get more than 20 targets up on my oc cluster because it apparently takes too many resources; the first log came from a local crc instance and the second from cluster bot. Another thing I notice is that the Cryostat logger generates tons of messages, super fast. This might be because of the recursive discovery tree update function, which I think emits a log message every time it observes a target.
Tons of these messages get sent even though the targets likely can't all be new.
I think that last case of the
It looks like in those cases it is just one event loop thread that is blocked for too long, but it ends up becoming unblocked after a few seconds. That's still not proper threading behaviour, but it isn't as critically broken as in the #1448 report. When you run this scenario, does the Cryostat container end up getting killed by OpenShift, or does it stay up and running?
I think the container got killed; it was just hard to tell since it gets rolling updates, but the log said so.
Ah okay, I wasn't sure if that was because you had actually shut it down or because it got killed. Strange.
https://vertx.io/docs/apidocs/io/vertx/core/VertxOptions.html#setWorkerPoolSize-int-
The size of the worker pool can be reduced to make this condition easier to test.
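For reference, a minimal sketch (using the plain Vert.x API from the linked Javadoc, not Cryostat's actual bootstrap wiring) of shrinking the worker pool so that pool exhaustion is easier to reproduce with only a handful of targets:

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;

public class SmallWorkerPool {
    public static void main(String[] args) {
        // The default worker pool size is 20; shrinking it means only a few
        // discovered targets are needed before every worker thread is occupied.
        VertxOptions options = new VertxOptions().setWorkerPoolSize(5);
        Vertx vertx = Vertx.vertx(options);
        // ... deploy verticles / run the scenario under test against this instance
    }
}
```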
Force-pushed from e8408a5 to adf8102
Force-pushed from c599a0a to 09c6c19
cryostat-sample-5cf7b86589-72hc6-cryostat-sample.log
Yup, seems like that was the issue. I get Vert.x thread blocking just like in the original issue; testing with your PR, there is no such thing with the 9 pods and the worker size of 5.
Okay great, thank you for running those tests for me. I have finished setting back every instance where I think the Vert.x worker threads were being misused, but I have had very split attention today, so I need to go over this more carefully and exercise things to make sure nothing new is broken.
I also tried with 4 vertx pods, and even that was an issue since Cryostat discovers itself. When I set the pod count to 3, it was fine :-)
You mean this test case was still broken (as expected according to the report) pre-patch, and post-patch it now works fine, right?
Force-pushed from 2b86d5b to f29b865
Test image available:
I mean, pre-patch, if the number of targets was >= the worker pool size, then there would be problems, and if I set numTargets to workerPool - 1, it would be okay, confirming the issue. But yes, the patch fixes it regardless.
Force-pushed from f29b865 to b8a16ca
Hmm, but this is strange: I now get blocked threads in the normal smoketest.sh deployment (probably because the default deployment still uses Vertexecutor):
This is with
and no changes to anything, including smoketest.sh itself. Seems like something to do with jvmId handling again... I'll look and try to see what the problem is.
Test image available:
This reverts commit b2fdcfd.
Signed-off-by: Andrew Azores <aazores@redhat.com>
Force-pushed from 7f60a77 to 78f6f7a
This PR/issue depends on:
Related to #1448
Depends on #1466
Description of the change:
This backs out a threadpool-related change from #1388, setting many injection sites of the new implementation back to what they previously were.
Motivation for the change:
The Vert.x worker pool was being used in a lot of places where it was not strictly necessary. That thread pool has a fixed size. If there are too many forking tasks to be done, it is possible that all the threads in the pool become deadlocked, waiting on tasks that will never complete because those tasks are stuck in the queue waiting to be serviced by the same pool. This leaves only the Vert.x event loop alive.
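To make that failure mode concrete, here is a minimal, self-contained sketch against the plain Vert.x 4 executeBlocking API (illustrative only, not Cryostat's actual code): outer blocking tasks occupy the entire worker pool while waiting on inner tasks that are queued to the same pool, so none of them can ever make progress, while the event loop itself stays alive.

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;

import java.util.concurrent.CountDownLatch;

public class WorkerPoolStarvation {
    public static void main(String[] args) {
        // Two worker threads, three outer tasks: once both workers are occupied,
        // the inner tasks (and the third outer task) sit in the queue forever.
        Vertx vertx = Vertx.vertx(new VertxOptions().setWorkerPoolSize(2));

        for (int i = 0; i < 3; i++) {
            final int id = i;
            vertx.executeBlocking(outer -> {
                CountDownLatch latch = new CountDownLatch(1);
                // The inner task needs a free worker thread from the same pool...
                vertx.executeBlocking(inner -> {
                    latch.countDown();
                    inner.complete();
                }, false, r -> {});
                try {
                    // ...but this outer task is holding a worker while it waits,
                    // so with the pool full, neither task can make progress.
                    latch.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                outer.complete(id);
            }, false, res -> System.out.println("completed task " + res.result()));
        }
        // The event loop threads remain alive, and the blocked-thread checker
        // starts logging warnings about the stuck worker threads.
    }
}
```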