Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and data formats #42777
/// Need for backward compatibility.
if (format_name == "Avro" && local_context->getSettingsRef().output_format_avro_rows_in_file.changed)
    max_rows = local_context->getSettingsRef().output_format_avro_rows_in_file.value;
This is temporary. I will make a PR to deprecate the setting output_format_avro_rows_in_file after this PR, because this setting doesn't make sense anymore.
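The backward-compatibility override discussed in this review thread can be illustrated in isolation. The following is only a minimal sketch, not the PR's code: `Setting`, `Settings` and `chooseMaxRows` are hypothetical stand-ins for the real ClickHouse settings machinery, and the only assumption taken from the diff is that a setting exposes `changed` and `value`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a setting: remembers whether the user set it explicitly.
template <typename T>
struct Setting
{
    T value{};
    bool changed = false;
};

// Hypothetical settings container holding only the setting from the diff.
struct Settings
{
    Setting<uint64_t> output_format_avro_rows_in_file;
};

// Returns the row limit to use for one message.
// The legacy Avro setting wins only if the user changed it explicitly;
// otherwise the limit passed in by the caller is kept.
uint64_t chooseMaxRows(const std::string & format_name, const Settings & settings, uint64_t max_rows)
{
    if (format_name == "Avro" && settings.output_format_avro_rows_in_file.changed)
        return settings.output_format_avro_rows_in_file.value;
    return max_rows;
}
```

Checking `changed` rather than comparing against a default value means an untouched legacy setting never overrides the new limit, which is what makes it safe to deprecate later.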
Quite a big change; I did not finish reading it yet. We also need to run benchmarks before merging.
It would be great if you could help me with that.
I checked the performance difference in Kafka and found that asynchronous producing is less efficient. It seems the overhead of copying each message from the buffer to the queue outweighs the benefits. I will think about how to optimize producing later. Let's make the Kafka producing process single-threaded, as it was before.
Hm, no. Initially it seemed that asynchronous writing was preferable because the RabbitMQ library is event-based and requires running event loops, but indeed we need to compare and check the difference.
Revert some changes from #42777 to fix performance tests
Fixes: ClickHouse#42777 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Fixes: ClickHouse#42777 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> (cherry picked from commit 51d4f58)
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and add support for all formats, also refactor formats a bit:
max_block_size.
kafka_max_rows_per_message/rabbitmq_max_rows_per_message/nats_max_rows_per_message. They control the number of rows formatted in one message in row-based formats. Default value: 1.
CC: @filimonov, @kssenii
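To make the effect of a rows-per-message limit concrete, here is a minimal sketch of grouping formatted rows into messages. It is an illustration under assumptions, not the engine's implementation: `splitRowsIntoMessages` is a hypothetical helper, rows are assumed to be already formatted as text, and a newline is assumed as the row delimiter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not ClickHouse code: packs already-formatted rows
// into messages that contain at most max_rows_per_message rows each.
std::vector<std::string> splitRowsIntoMessages(
    const std::vector<std::string> & rows, std::size_t max_rows_per_message)
{
    assert(max_rows_per_message > 0);

    std::vector<std::string> messages;
    std::string current;
    std::size_t rows_in_current = 0;

    for (const auto & row : rows)
    {
        current += row;
        current += '\n'; // assumed row delimiter, as in a line-based format
        if (++rows_in_current == max_rows_per_message)
        {
            messages.push_back(std::move(current));
            current.clear(); // reset the moved-from string to a known state
            rows_in_current = 0;
        }
    }

    // Flush the last message, which may hold fewer rows than the limit.
    if (rows_in_current > 0)
        messages.push_back(std::move(current));

    return messages;
}
```

With a limit of 1, every row becomes its own message; raising the limit trades more rows per message for fewer messages overall.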