Producer
You should be aware that producing synchronously will reduce maximum throughput dramatically. For example, you might achieve a rate of about 500,000 msg/s if you don't wait on the result of each produce call before sending the next. If you do, you'll get at best ~300 msg/s.
This means you should typically not `await` your `ProduceAsync` calls, the exception being if you are in a highly concurrent scenario (such as a web request handler). In that case, although an individual call may be slow, it will not impact other calls happening simultaneously.
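For instance, here is a minimal sketch of awaiting `ProduceAsync` in a request handler; the handler class, topic name and message types are hypothetical:

```csharp
using System.Threading.Tasks;
using Confluent.Kafka;

public class OrderHandler
{
    private readonly IProducer<string, string> producer;

    public OrderHandler(IProducer<string, string> producer)
    {
        this.producer = producer;
    }

    // Awaiting here only delays this particular request; other requests
    // being handled concurrently are unaffected.
    public async Task HandleAsync(string orderId, string payload)
    {
        await producer.ProduceAsync(
            "orders", new Message<string, string> { Key = orderId, Value = payload });
    }
}
```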
The `Produce` and `ProduceAsync` methods are equivalent, but tailored to different usage patterns. The `Produce` method is more efficient, and you should care about that if your throughput is high (>~ 20k msgs/s); otherwise the difference will be negligible compared to whatever else your application is doing.
Another difference is that tasks returned by `ProduceAsync` complete on thread pool threads, so ordering is not guaranteed. By comparison, delivery report callbacks passed to `Produce` are all called on the same thread, so the ordering will exactly match the order in which the client learns they have completed.
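A sketch of the callback variant (broker address and topic name are placeholders):

```csharp
using System;
using Confluent.Kafka;

var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

using var producer = new ProducerBuilder<Null, string>(config).Build();

for (int i = 0; i < 100; i++)
{
    producer.Produce("my-topic", new Message<Null, string> { Value = $"message {i}" },
        deliveryReport =>
        {
            // All delivery handlers run on the same background thread,
            // in the order the produce requests completed.
            Console.WriteLine($"delivered to: {deliveryReport.TopicPartitionOffset}");
        });
}

// Block until all outstanding produce requests have completed.
producer.Flush(TimeSpan.FromSeconds(10));
```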
Most importantly, you should be checking the result of each produce call. In the case of `ProduceAsync`, any error will be exposed in the form of a `KafkaException` if you `await` the call; otherwise it will show up in the `Task`'s `Exception` property (with the `IsFaulted` property set to `true`). In the case of `Produce`, errors are exposed via the `Error` property on the `DeliveryReport` instance passed to the callback.
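A sketch of checking results both ways (topic and config values are placeholders; note the exception an awaited `ProduceAsync` throws is `ProduceException`, which derives from `KafkaException`):

```csharp
using System;
using Confluent.Kafka;

var config = new ProducerConfig { BootstrapServers = "localhost:9092" };
using var producer = new ProducerBuilder<Null, string>(config).Build();

// ProduceAsync: errors surface as an exception when awaited.
try
{
    var result = await producer.ProduceAsync(
        "my-topic", new Message<Null, string> { Value = "hello" });
}
catch (ProduceException<Null, string> e)
{
    Console.WriteLine($"delivery failed: {e.Error.Reason}");
}

// Produce: errors surface via the Error property of the delivery report.
producer.Produce("my-topic", new Message<Null, string> { Value = "hello" },
    deliveryReport =>
    {
        if (deliveryReport.Error.IsError)
        {
            Console.WriteLine($"delivery failed: {deliveryReport.Error.Reason}");
        }
    });

producer.Flush(TimeSpan.FromSeconds(10));
```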
Note: when transactions become available in v1.4, you will typically be able to rely on the result of the transaction commit - you won't need to consider the produce delivery reports for errors if you are using transactions.
Events sent to the error event handler and log event handler are typically informational in nature - you can generally ignore them. In very rare cases, events on the error handler may be marked as fatal (`IsFatal` set to `true`). This can occur if idempotence or transaction semantics cannot be satisfied due to an unlikely scenario on the server (e.g. unexpected log truncation), or a bug in the system.
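A sketch of distinguishing fatal from informational events in the error handler (config values are placeholders):

```csharp
using System;
using Confluent.Kafka;

var config = new ProducerConfig { BootstrapServers = "localhost:9092" };

using var producer = new ProducerBuilder<Null, string>(config)
    .SetErrorHandler((_, error) =>
    {
        if (error.IsFatal)
        {
            // The producer instance can no longer be used, e.g. because
            // idempotence guarantees could not be satisfied.
            Console.WriteLine($"fatal error: {error.Reason}");
        }
        else
        {
            // Usually informational; safe to just log.
            Console.WriteLine($"error: {error.Reason}");
        }
    })
    .Build();
```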
The words 'error' and even 'fatal' are used a fair bit in the logs in benign situations. Don't be easily alarmed by this - if the producer is still operating, these are likely nothing to worry about.
A commonly adjusted configuration property on the producer is `LingerMs` - the minimum time between batches of messages being sent to the cluster. Larger values allow for more batching, which increases throughput; smaller values may reduce latency. The tradeoff becomes more complicated as throughput increases, though, because a smaller `LingerMs` results in more broker requests, putting greater load on the server, which in turn may reduce throughput. A good general purpose setting is 5 (the default is 0.5).
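For example (broker address is a placeholder):

```csharp
using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092",
    // Wait up to 5ms for messages to accumulate before sending a batch.
    LingerMs = 5
};
```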
If your throughput is very low and you really care about latency, you should probably set `SocketNagleDisable` to `true`.
The default value for the `Acks` configuration property is `All` (prior to v1.0, the default was `1`). This means that if a delivery report returns without error, the message has been replicated to all replicas in the in-sync replica set.
If you have `EnableIdempotence` set to `true`, `Acks` must be `All`.
You should generally prefer having `Acks` set to `All`. There's no real benefit to setting it lower. Note that a lower setting won't improve end-to-end latency, because messages must be replicated to all in-sync replicas before they are available for consumption - you will just learn a little earlier whether the message has been successfully written to the leader replica.
Set `EnableIdempotence` to `true`. This has very little overhead - there's not much downside to having this on by default.
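A sketch of enabling idempotence (broker address is a placeholder); `Acks.All` is set explicitly here for clarity, though it is the default:

```csharp
using Confluent.Kafka;

var config = new ProducerConfig
{
    BootstrapServers = "localhost:9092",
    EnableIdempotence = true,
    Acks = Acks.All // required when idempotence is enabled
};
```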
Before the idempotent producer was available, you could achieve ordering guarantees by setting `MaxInFlight` to 1 (at the expense of reduced throughput): messages are sent in the same order as produced. However, when idempotence is not enabled, in case of failure (a temporary network failure, for example), Confluent.Kafka will try to resend the data (if the `Retries` config property is set to > 0; the default is 2), which can cause reordering (note that this will not happen frequently, only in case of failure).
Custom partitioners are not available in the .NET Client yet. You can work around this by producing to specific partitions in your produce call (i.e. partition outside the framework of the client).
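A sketch of doing so, with an arbitrary hash scheme (topic, key and partition count are hypothetical; `GetHashCode` is not stable across processes, so a real deployment would use a stable hash):

```csharp
using System;
using Confluent.Kafka;

var config = new ProducerConfig { BootstrapServers = "localhost:9092" };
using var producer = new ProducerBuilder<string, string>(config).Build();

int partitionCount = 6; // assumed known for the topic
string key = "some-key";
// Mask the sign bit so the partition index is non-negative.
int partition = (key.GetHashCode() & 0x7fffffff) % partitionCount;

// Target an explicit partition rather than letting the client choose.
producer.Produce(
    new TopicPartition("my-topic", new Partition(partition)),
    new Message<string, string> { Key = key, Value = "hello" },
    dr => Console.WriteLine($"delivered to: {dr.TopicPartitionOffset}"));

producer.Flush(TimeSpan.FromSeconds(10));
```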