At this point, we have implemented the ability to throw an exception if the Kafka (aiokafka client) batch overflows (pull #1990). However, we could extend this functionality by allowing developers to control batch overflow themselves.
aiokafka operates on a BatchBuilder object, which has an append method. This method adds a message to the batch buffer and returns some metadata. However, if the message does not fit into the buffer, it returns None. By checking the returned value directly, we can throw an exception, which is already implemented. But we assume that some developers will want to monitor batch overflow themselves, perhaps as part of the application's business logic. To do this, we need to make a KafkaBatch facade object that implements two methods: extend and append.
The append method should behave the same as aiokafka's BatchBuilder.append, including its return value: if the batch is full, it returns None. The extend method should accept a collection of messages, add them to the current batch, and throw an exception if the batch overflows.
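The two methods described above can be sketched roughly as follows. This is not the prototype from the issue, only a minimal illustration: StubBuilder is a hypothetical stand-in for aiokafka's BatchBuilder that reproduces only the "return None when the message does not fit" behavior, and BatchOverflowError is an assumed name for the exception.

```python
from typing import Iterable, Optional


class BatchOverflowError(Exception):
    """Hypothetical error raised when a message does not fit into the batch."""

    def __init__(self, position: int) -> None:
        super().__init__(f"Batch overflowed at message #{position}")
        self.position = position


class StubBuilder:
    """Stand-in for aiokafka's BatchBuilder (an assumption, not the real class).

    It only mimics the behavior described in the text: append() returns
    some metadata, or None when the message does not fit into the buffer.
    """

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.size = 0
        self.count = 0

    def append(
        self,
        *,
        key: Optional[bytes],
        value: bytes,
        timestamp: Optional[int],
    ) -> Optional[dict]:
        needed = len(value) + len(key or b"")
        if self.size + needed > self.max_size:
            return None  # batch is full, nothing was added
        self.size += needed
        self.count += 1
        return {"offset": self.count - 1, "size": needed}


class KafkaBatch:
    """Facade over a builder that exposes append and extend."""

    def __init__(self, builder: StubBuilder) -> None:
        self._builder = builder

    def append(
        self,
        value: bytes,
        *,
        key: Optional[bytes] = None,
        timestamp: Optional[int] = None,
    ) -> Optional[dict]:
        # Same contract as the underlying builder: None means "batch is full".
        return self._builder.append(key=key, value=value, timestamp=timestamp)

    def extend(self, msgs: Iterable[bytes]) -> None:
        # Raise as soon as one message does not fit, reporting its position.
        for position, msg in enumerate(msgs):
            if self.append(msg) is None:
                raise BatchOverflowError(position)


batch = KafkaBatch(StubBuilder(max_size=10))
first = batch.append(b"12345")     # fits into the 10-byte budget
second = batch.append(b"1234567")  # would exceed it, so append returns None
```

With append the caller decides what to do when None comes back (start a new batch, drop the message, and so on), while extend keeps the exception-based behavior for callers that do not want to handle overflow themselves.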
This KafkaBatch prototype has a number of problems that it is not yet clear how to solve. The main problem is the set of parameters that BatchBuilder requires in order to be instantiated. The most painful parameter is magic. Magic is a number in aiokafka that is derived from the current API version. I used the code for forming this number in the prototype so that the example could be run immediately. To generate this number yourself, you need to take api_version as input, which is also extremely painful given the purpose of our task: to fill the batch with our messages in a controlled way. So far, the solution is to leave the default value and, if necessary, pass the api_version value in some other way.
The unpleasant part is that two pieces of code have to be pulled out of aiokafka: the magic-number generation and the formation of headers for the batch. It's not a big deal, but I'd like to find a more elegant solution than just copying implementations from different parts of the code.
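For reference, the magic-number derivation could look roughly like the sketch below. The function name magic_for_api_version is hypothetical, and the thresholds are an assumption based on the Kafka record format versions (v2 from broker 0.11, v1 from 0.10, v0 before that); they should be checked against the aiokafka version in use.

```python
from typing import Tuple


def magic_for_api_version(api_version: Tuple[int, ...]) -> int:
    """Map a broker API version to a record-format "magic" number.

    Assumed thresholds: 0.11+ -> 2, 0.10.x -> 1, anything older -> 0.
    """
    if api_version >= (0, 11):
        return 2
    if api_version >= (0, 10):
        return 1
    return 0
```

Defaulting to the newest format and letting the caller override api_version only when needed would match the "leave the default value" approach mentioned above.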
We also need to extend Kafka's publish_batch function signature:
We can have three cases when we use this behavior:
- We fill the batch before calling the method and pass the finished batch to publish_batch without an additional message collection. If the batch overflows, we get the error before the method call.
- We pass the messages to be sent as before, and if the batch overflows, we get the error when the method is called.
- We pass both a KafkaBatch and a message collection. If the KafkaBatch has not overflowed yet but overflows when it is extended with the new messages (*msgs in the publish_batch method), an exception is thrown that specifies the position, within the passed message collection, of the message on which the overflow occurred.
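The three cases can be illustrated with a simplified, synchronous sketch. Everything here is an assumption made for illustration: the real publish_batch is asynchronous and has a different signature, the batch keyword argument is hypothetical, and this toy KafkaBatch limits the number of messages rather than their size in bytes.

```python
from typing import List, Optional


class BatchOverflowError(Exception):
    """Hypothetical error carrying the position of the message that did not fit."""

    def __init__(self, position: int) -> None:
        super().__init__(f"Batch overflowed at message #{position}")
        self.position = position


class KafkaBatch:
    """Toy batch limited by message count (an assumption for brevity)."""

    def __init__(self, max_messages: int) -> None:
        self.max_messages = max_messages
        self.messages: List[bytes] = []

    def append(self, msg: bytes) -> Optional[int]:
        if len(self.messages) >= self.max_messages:
            return None  # batch is full
        self.messages.append(msg)
        return len(self.messages) - 1


def publish_batch(
    *msgs: bytes,
    batch: Optional[KafkaBatch] = None,
    max_messages: int = 3,
) -> List[bytes]:
    """Extend the given (or a fresh) batch with msgs and "send" it."""
    if batch is None:
        batch = KafkaBatch(max_messages)
    for position, msg in enumerate(msgs):
        if batch.append(msg) is None:
            # The position refers to the passed collection, not to the batch.
            raise BatchOverflowError(position)
    return list(batch.messages)  # a real implementation would send here


# Case 1: the batch is filled beforehand and passed on its own.
prefilled = KafkaBatch(max_messages=2)
prefilled.append(b"a")
prefilled.append(b"b")
sent = publish_batch(batch=prefilled)

# Case 3: a partially filled batch plus extra messages.
partial = KafkaBatch(max_messages=2)
partial.append(b"a")
```

In case 2, calling publish_batch(b"a", b"b", b"c", b"d") with the default limit of three raises BatchOverflowError with position 3; in case 3, publish_batch(b"b", b"c", batch=partial) raises with position 1, because b"c" is the second message of the passed collection.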
To control batch overflow yourself, create a KafkaBatch and use its append method. It returns None if the batch is full.
To summarize: the only problem we face at the moment in introducing this feature is the design of KafkaBatch. If you have any ideas on how it can be made better, more elegant, or simpler, you are welcome to share them!
Lancetnik added the Confluent (issues related to `faststream.confluent` module) and AioKafka (issues related to `faststream.kafka` module) labels and removed the Confluent label on Dec 16, 2024.