Kafka UI says MSK cluster is offline when it isn't. #1353

Closed
ctwilleager-nt opened this issue Jan 4, 2022 · 2 comments
Labels
status/duplicate This issue or pull request already exists type/bug Something isn't working

Comments

@ctwilleager-nt

Describe the bug
Kafka UI says MSK cluster is offline when it isn't.

Set up
AWS MSK version 2.6.2
Helm 3.6.0
Provectus Kafka UI 0.3.1
Chart Version 0.0.3
Kubernetes 1.20

Steps to Reproduce
Steps to reproduce the behavior:

  1. Set up Amazon MSK using version 2.6.2 with open monitoring enabled.
  2. Deploy Provectus Kafka UI into Kubernetes cluster using chart version 0.0.3 and image 0.3.1.
  3. Cluster is always marked as offline.

Additionally

Expected behavior
I expect the cluster to be marked as online, with JMX stats showing on the dashboard.

Note that if you deploy Image version 0.2.1 using chart version 0.0.1, it does work. No other configuration items in values.yaml are changing.

Screenshots
image

JMX endpoint error

DEBUG [kafka-admin-client-thread | adminclient-1] j.m.r.rmi: [javax.management.remote.rmi.RMIConnector: jmxServiceURL=service:jmx:rmi:///jndi/rmi://b-1.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:11001/jmxrmi] Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
	java.net.SocketTimeoutException: Read timed out]
2022-01-04 16:22:17,965 ERROR [kafka-admin-client-thread | adminclient-1] c.p.k.u.u.JmxClusterUtil: Cannot get JMX connector for the pool due to: 
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
	java.net.SocketTimeoutException: Read timed out]
	at java.management.rmi/javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:370)
	at java.management/javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
	at com.provectus.kafka.ui.util.JmxPoolFactory.create(JmxPoolFactory.java:31)
	at com.provectus.kafka.ui.util.JmxPoolFactory.create(JmxPoolFactory.java:17)
	at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1012)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:277)
	at com.provectus.kafka.ui.util.JmxClusterUtil.getJmxMetrics(JmxClusterUtil.java:99)
	at com.provectus.kafka.ui.util.JmxClusterUtil.lambda$getJmxMetric$3(JmxClusterUtil.java:82)
	at java.base/java.util.Optional.map(Optional.java:258)
	at com.provectus.kafka.ui.util.JmxClusterUtil.getJmxMetric(JmxClusterUtil.java:82)
	at com.provectus.kafka.ui.util.JmxClusterUtil.lambda$getBrokerMetrics$0(JmxClusterUtil.java:73)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:113)
	at reactor.core.publisher.FluxIterable$IterableSubscription.fastPath(FluxIterable.java:340)
	at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:227)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:169)
	at reactor.core.publisher.MonoCollect$CollectSubscriber.onSubscribe(MonoCollect.java:103)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:165)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:87)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4399)
	at reactor.core.publisher.MonoZip.subscribe(MonoZip.java:128)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
	at reactor.core.publisher.MonoCreate$DefaultMonoSink.success(MonoCreate.java:165)
	at com.provectus.kafka.ui.service.ReactiveAdminClient.lambda$describeCluster$18(ReactiveAdminClient.java:223)
	at org.apache.kafka.common.internals.KafkaFutureImpl$WhenCompleteBiConsumer.accept(KafkaFutureImpl.java:177)
	at org.apache.kafka.common.internals.KafkaFutureImpl$WhenCompleteBiConsumer.accept(KafkaFutureImpl.java:162)
	at org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:221)
	at org.apache.kafka.common.KafkaFuture$AllOfAdapter.maybeComplete(KafkaFuture.java:82)
	at org.apache.kafka.common.KafkaFuture$AllOfAdapter.accept(KafkaFuture.java:76)
	at org.apache.kafka.common.KafkaFuture$AllOfAdapter.accept(KafkaFuture.java:57)
	at org.apache.kafka.common.internals.KafkaFutureImpl.complete(KafkaFutureImpl.java:221)
	at org.apache.kafka.clients.admin.KafkaAdminClient$5.handleResponse(KafkaAdminClient.java:1882)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.handleResponses(KafkaAdminClient.java:1189)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1341)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1264)
	at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: javax.naming.CommunicationException: null
	at jdk.naming.rmi/com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:137)
	at java.naming/com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:207)
	at java.naming/javax.naming.InitialContext.lookup(InitialContext.java:409)
	at java.management.rmi/javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1839)
	at java.management.rmi/javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1813)
	at java.management.rmi/javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:302)
	... 37 common frames omitted
Caused by: java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is: 
	java.net.SocketTimeoutException: Read timed out
	at java.rmi/sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:300)
	at java.rmi/sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:196)
	at java.rmi/sun.rmi.server.UnicastRef.newCall(UnicastRef.java:343)
	at java.rmi/sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:116)
	at jdk.naming.rmi/com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:133)
	... 42 common frames omitted
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:284)
	at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:310)
	at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:351)
	at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:802)
	at java.base/java.net.Socket$SocketInputStream.read(Socket.java:937)
	at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:245)
	at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:264)
	at java.base/java.io.DataInputStream.readByte(DataInputStream.java:270)
	at java.rmi/sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:239)
	... 46 common frames omitted

values.yaml

# Reference: https://github.com/provectus/kafka-ui
---
replicaCount: 1

image:
  repository: provectuslabs/kafka-ui
  pullPolicy: IfNotPresent

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "kafka-ui-svcaccount"

existingConfigMap: ""
existingSecret: ""
envs:
  config:
    KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: b-1.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092,b-2.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092,b-3.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092
    KAFKA_CLUSTERS_0_JMXPORT: "11001"
    KAFKA_CLUSTERS_0_JMXSSL: "false"
    KAFKA_CLUSTERS_0_NAME: dev
    KAFKA_CLUSTERS_0_READONLY: "true"
    KAFKA_CLUSTERS_0_ZOOKEEPER: z-1.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181,z-2.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181,z-3.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181
    LOGGING_LEVEL_ROOT: trace
  secret: {}
networkPolicy:
  enabled: false
  egressRules:
    ## Additional custom egress rules
    ## e.g:
    ## customRules:
    ##   - to:
    ##       - namespaceSelector:
    ##           matchLabels:
    ##             label: example
    customRules: []
  ingressRules:
    ## Additional custom ingress rules
    ## e.g:
    ## customRules:
    ##   - from:
    ##       - namespaceSelector:
    ##           matchLabels:
    ##             label: example
    customRules: []

podAnnotations: {}
podLabels: {}

podSecurityContext: {}
  # fsGroup: 2000

securityContext: {}
  # capabilities:
  #   drop:
  #   - ALL
  # readOnlyRootFilesystem: true
  # runAsNonRoot: true
  # runAsUser: 1000

service:
  type: ClusterIP
  port: 80
  # if you want to force a specific nodePort. Must be used with service.type=NodePort
  # nodePort:

# Ingress configuration
ingress:
  # Enable ingress resource
  enabled: false

  # Annotations for the Ingress
  annotations: {}

  # The path for the Ingress
  path: ""

  # The hostname for the Ingress
  host: ""

  # configs for Ingress TLS
  tls:
    # Enable TLS termination for the Ingress
    enabled: false
    # the name of a pre-created Secret containing a TLS private key and certificate
    secretName: ""

  # HTTP paths to add to the Ingress before the default path
  precedingPaths: []

  # HTTP paths to add to the Ingress after the default path
  succeedingPaths: []

resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 500m
    memory: 256Mi

autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80
  # targetMemoryUtilizationPercentage: 80

tolerations: []

affinity: {}

Additional context
If I deploy Kafka UI 0.2.1 using chart version 0.0.1, then eventually the MSK cluster is shown as online.

@ctwilleager-nt ctwilleager-nt added the type/bug Something isn't working label Jan 4, 2022
@github-actions github-actions bot added the status/triage Issues pending maintainers triage label Jan 4, 2022
@Haarolean
Contributor

Hi, thanks for reaching out.

As far as I recall, AWS MSK no longer exposes JMX; only Prometheus is available, and we haven't implemented support for it yet.
Could you try disabling JMX by removing the corresponding env variables?
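
For reference, a minimal sketch of the `envs` block from the values.yaml above with JMX disabled. It assumes the "corresponding env variables" are `KAFKA_CLUSTERS_0_JMXPORT` and `KAFKA_CLUSTERS_0_JMXSSL`, since those are the only JMX-related entries in the posted configuration; everything else is copied unchanged.

```yaml
envs:
  config:
    KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: b-1.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092,b-2.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092,b-3.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:9092
    # Assumption: KAFKA_CLUSTERS_0_JMXPORT and KAFKA_CLUSTERS_0_JMXSSL are the
    # variables to remove, so that Kafka UI no longer tries to open a JMX
    # connection to the brokers.
    KAFKA_CLUSTERS_0_NAME: dev
    KAFKA_CLUSTERS_0_READONLY: "true"
    KAFKA_CLUSTERS_0_ZOOKEEPER: z-1.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181,z-2.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181,z-3.foo-msk-clust.cd6ifg.c3.kafka.us-west-2.amazonaws.com:2181
    LOGGING_LEVEL_ROOT: trace
  secret: {}
```

With these two entries gone, the `JmxClusterUtil` connection attempts shown in the log above should presumably no longer occur, though JMX stats will not appear on the dashboard either.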

@Haarolean Haarolean added status/pending Further information is requested and removed status/triage Issues pending maintainers triage labels Jan 4, 2022
@ctwilleager-nt
Author

Yeah, that works. I wasn't aware that MSK no longer supports JMX.

@Haarolean Haarolean added status/duplicate This issue or pull request already exists and removed status/pending Further information is requested labels Jan 11, 2022