
Help needed on migrating from kafka with zookeeper to Kraft #1212

Closed
fayak opened this issue Apr 11, 2024 · 15 comments

@fayak

fayak commented Apr 11, 2024

Hello !

I'm looking to upgrade my deployment from 21.x to 22.x. I'm trying to follow the instructions from the README, but I'm having a hard time figuring a few things out.
At step 3:

Deploy at least one Kraft controller-only in your deployment with zookeeperMigrationMode=true. The Kraft controllers will migrate the data from your Kafka ZkBroker to Kraft mode.

To do this, add the following values to your Zookeeper deployment when upgrading:

controller:
    replicaCount: 1
    controllerOnly: true
    zookeeperMigrationMode: true
broker:
    zookeeperMigrationMode: true
kraft:
    enabled: true
    clusterId: "<your_cluster_id>"

What should the value of kafka.zookeeper.enabled be here? If I understand correctly, we want to keep ZooKeeper at first so its data can be migrated to the Kafka controller, and only remove ZooKeeper once that's done.
If I set kafka.zookeeper.enabled: true, the following error is thrown:

coalesce.go:237: warning: skipped value for kafka.config: Not a table.
Error: execution error at (sentry/charts/kafka/templates/NOTES.txt:333:4): 
VALUES VALIDATION:

kafka: Zookeeper mode - No Kafka brokers configured
    Zookeper mode has been enabled, but no Kafka brokers nodes have been configured

Setting it to false (or not setting it) looks like it works, but I don't see a StatefulSet for ZooKeeper anymore. Is that expected? Won't I lose data by proceeding?

@TartanLeGrand I think you wrote the migration guide, could you advise, please? :)

@TartanLeGrand
Contributor

Hello @fayak 👋,

What should the value of kafka.zookeeper.enabled be here? If I understand correctly, we want to keep ZooKeeper at first so its data can be migrated to the Kafka controller, and only remove ZooKeeper once that's done.

Until you have reached step 6 of the migration doc, you have to keep kafka.zookeeper.enabled: true.

Why are you unsure about setting kafka.zookeeper.enabled: true? Were you not running under ZooKeeper before, i.e. is this a fresh install?

@fayak
Author

fayak commented Apr 11, 2024

Hi !

Until you have reached step 6 of the migration doc, you have to keep kafka.zookeeper.enabled: true.

OK, I think this used to be the default value, so I now set it explicitly, but then I get the templating error I mentioned.

Apparently I also need to add broker.replicaCount:

kafka:
  broker:
    zookeeperMigrationMode: true
    replicaCount: 1
  controller:
    replicaCount: 1
    controllerOnly: true
    zookeeperMigrationMode: true
  kraft:
    enabled: true
    clusterId: "<id>"
  zookeeper:
    enabled: true

Does that make sense? To be honest, I'm not familiar at all with ZooKeeper and Kafka, and my developers will be (rightfully) annoyed if I drop their data by mistake 😅

I'm using zookeeper right now on my sentry deployment, it's not a fresh install

Thanks for the help!

@fayak
Author

fayak commented Apr 11, 2024

The fact that it migrates pods from sentry-kafka-0 to sentry-kafka-broker-0 means the new Kafka brokers start with a fresh disk, losing their data. How is the data migration supposed to happen here?

I tried and got the brokers to output

INFO [KafkaServer id=100] Finished catching up on KRaft metadata log, requesting that the KRaft controller unfence this broker (kafka.server.KafkaServer)
INFO [BrokerLifecycleManager id=100 isZkBroker=true] The broker has been unfenced. Transitioning from RECOVERY to RUNNING. (kafka.server.BrokerLifecycleManager)

But the controller never logs the expected transition:

Transitioning ZK migration state from PRE_MIGRATION to MIGRATION (org.apache.kafka.controller.FeatureControlManager)

Instead, I see logs like:

[2024-04-11 17:33:13,554] INFO [KRaftMigrationDriver id=0] Still waiting for ZK brokers [0, 1, 2] found in metadata to register with KRaft. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
[2024-04-11 17:33:16,184] INFO [QuorumController id=0] Cannot run write operation maybeFenceReplicas in pre-migration mode. Returning NOT_CONTROLLER. (org.apache.kafka.controller.QuorumController)
[2024-04-11 17:33:16,184] INFO [QuorumController id=0] maybeFenceReplicas: event failed with NotControllerException in 505 microseconds. (org.apache.kafka.controller.QuorumController)
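The "Still waiting for ZK brokers [0, 1, 2]" line suggests the old ZooKeeper-managed cluster had three registered brokers, while the values above only deploy one broker in zookeeperMigrationMode. A hedged diagnostic sketch; the namespace and pod names below are assumptions for a typical sentry release and may differ in your deployment:

```shell
# See which broker IDs the controller is still waiting for;
# broker.replicaCount during migration must cover all of them
# (here: 0, 1 and 2, so at least 3 migration-mode brokers).
kubectl -n sentry logs sentry-kafka-controller-0 \
  | grep "Still waiting for ZK brokers"

# Check how many migration-mode brokers are actually running:
kubectl -n sentry get pods | grep kafka-broker
```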

@TartanLeGrand
Contributor

Maybe you have 3 ZooKeeper nodes but only 1 controller? 😄

Do you have any more logs from the controllers? 🤔 Are your disks out of place?

@JustDoItSascha

I had the same questions as above, but additionally I don't understand whether I have to do these steps BEFORE upgrading to the 22.0.0 chart, i.e. while still on 21.6.3?

Because when I try it in combination with upgrading to 22.0.0, I get the error:

'sasl.interbroker.password' must not be empty, please add '--set sasl.interbroker.password=$INTER_BROKER_PASSWORD' to the command. To get the current value:

export INTER_BROKER_PASSWORD=$(kubectl get secret --namespace "sentry" sentry-kafka-user-passwords -o jsonpath="{.data.inter-broker-password}" | base64 -d)

But there is no secret named sentry-kafka-user-passwords ...

@sschamp

sschamp commented Apr 22, 2024

I had the same questions as above, but additionally I don't understand whether I have to do these steps BEFORE upgrading to the 22.0.0 chart, i.e. while still on 21.6.3?

Because when I try it in combination with upgrading to 22.0.0 I get the error

'sasl.interbroker.password' must not be empty, please add '--set sasl.interbroker.password=$INTER_BROKER_PASSWORD' to the command. To get the current value:

export INTER_BROKER_PASSWORD=$(kubectl get secret --namespace "sentry" sentry-kafka-user-passwords -o jsonpath="{.data.inter-broker-password}" | base64 -d)

But there is no secret named sentry-kafka-user-passwords ...

I think this issue was fixed in 22.2.0

#1222

In short, your listeners should now look like this:

  listeners:
    client:
      protocol: "PLAINTEXT"
    controller:
      protocol: "PLAINTEXT"
    interbroker:
      protocol: "PLAINTEXT"
    external:
      protocol: "PLAINTEXT"

@sschamp

sschamp commented Apr 22, 2024

By the way, what do we need the data in Kafka for?
Is it important, or can we just start over from scratch?

@Mokto
Contributor

Mokto commented Apr 22, 2024

I would recommend starting from scratch. It has worked well in my case.

@JustDoItSascha

I would recommend starting from scratch. It has worked well in my case.

How exactly did you start from scratch? Did you just remove the Kafka config from your old values.yaml and take the default one?

@Mokto
Contributor

Mokto commented Apr 23, 2024

I changed values.yaml to something like this:

externalKafka:
  host: "temp"

kafka:
  enabled: false

Then I deploy the chart, delete the PVCs, and revert my changes.
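For anyone following along, those steps might look roughly like this. This is a hedged sketch: the release name sentry, the namespace, and the PVC label selector are assumptions and may differ in your deployment.

```shell
# 1. Disable the bundled Kafka and point the chart at a placeholder external host.
helm upgrade sentry sentry/sentry -n sentry \
  --set kafka.enabled=false \
  --set externalKafka.host=temp

# 2. Delete the old Kafka PVCs so the new brokers start with clean disks.
#    The label selector is an assumption; verify with `kubectl get pvc -n sentry` first.
kubectl delete pvc -n sentry -l app.kubernetes.io/name=kafka

# 3. Revert the values (re-enable kafka, drop externalKafka) and upgrade again;
#    the chart re-creates Kafka in KRaft mode with empty volumes.
```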

@sschamp

sschamp commented Apr 23, 2024

From what I understand, Kafka only stores events, and ideally they are consumed almost immediately.
So no data of high importance is kept in Kafka; events should have been processed and stored in another DB ASAP.

Deleting the PVCs and starting Kafka over from scratch should cause only minor data loss: new events that had not yet been processed might be lost, but only a small amount.

For me personally, it doesn't sound like anything important would be on there.
And if I temporarily disable the ingress for the Sentry API, no new events will reach Kafka anyway.

So then I can do as @Mokto says: disable local Kafka, point the chart at a fake externalKafka, and set Kafka up fresh with KRaft.

Am I correct in my assumptions?

@fayak
Author

fayak commented Apr 29, 2024

Hi again,

In the end, as the steps were not working as described, I removed everything and redeployed a new Kafka from scratch with KRaft enabled. I may have lost some events along the way, but that's OK, and things worked out in the end.

So this may be a valid solution: remove everything Kafka-related, deploy the new version with KRaft (without migration mode enabled), and let it re-provision things and start again from there.

@Mokto
Contributor

Mokto commented May 30, 2024

This issue is stale because it has been open for 30 days with no activity.

@Mokto Mokto added the stale label May 30, 2024
@Mokto
Contributor

Mokto commented Jun 13, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@Mokto Mokto closed this as not planned Jun 13, 2024
@honghainguyen777

Hi everyone,

I am facing a similar problem when trying to migrate from Zookeeper to Kraft.

I have:

kafka:
  zookeeper:
    enabled: true
  controller:
    replicaCount: 1
    controllerOnly: true
    zookeeperMigrationMode: true
  broker:
    replicaCount: 1
    zookeeperMigrationMode: true
  kraft:
    enabled: true
    clusterId: "REDACTED"

Unfortunately, the sentry-db-check job does not work. There is no sentry-kafka-headless service anymore, because sentry-kafka-<replica> has been replaced by sentry-kafka-broker-<replica>:

KAFKA_STATUS=1
KAFKA_REPLICAS=1
echo "Kafka Zookeeper is enabled, checking if Zookeeper is up"
ZOOKEEPER_STATUS=0
while [ $ZOOKEEPER_STATUS -eq 0 ]; do
 ZOOKEEPER_STATUS=1
 i=0; while [ $i -lt $KAFKA_REPLICAS ]; do
   ZOOKEEPER_HOST=sentry-kafka-$i.sentry-kafka-headless 
   if ! nc -z "$ZOOKEEPER_HOST" ; then
     ZOOKEEPER_STATUS=0
     echo "$ZOOKEEPER_HOST is not available yet"
   fi
   i=$((i+1))
 done
 if [ "$ZOOKEEPER_STATUS" -eq 0 ]; then
   echo "Zookeeper not ready. Sleeping for 10s before trying again"
   sleep 10;
 fi
done

As you can see, ZOOKEEPER_HOST=sentry-kafka-$i.sentry-kafka-headless is no longer valid. I manually tried replacing ZOOKEEPER_HOST=sentry-kafka-$i.sentry-kafka-headless with ZOOKEEPER_HOST=sentry-kafka-broker-$i.sentry-kafka-broker-headless, but that does not work either.
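For reference, this is roughly what the wait loop would need to look like if it targeted the new broker pods. The sentry-kafka-broker-* names and port 9092 are assumptions inferred from the pod names above, and the real fix belongs in the chart template rather than in a hand-edited job:

```shell
# Hypothetical adaptation of the db-check wait loop for KRaft-mode brokers.
KAFKA_REPLICAS=1
KAFKA_PORT=9092
i=0
while [ "$i" -lt "$KAFKA_REPLICAS" ]; do
  BROKER_HOST="sentry-kafka-broker-$i.sentry-kafka-broker-headless"
  # Note: the original script called nc without a port; nc needs host AND port.
  until nc -z -w 2 "$BROKER_HOST" "$KAFKA_PORT"; do
    echo "$BROKER_HOST:$KAFKA_PORT is not available yet; retrying in 10s"
    sleep 10
  done
  i=$((i+1))
done
```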

Could you please re-open this issue and help me out?

Best wishes,
Hai


6 participants