This repository has been archived by the owner on Jan 24, 2024. It is now read-only.
[FEATURE] Need Increase the guarantee of the correctness of the internal message offset #687
Labels
type/feature
Indicates new functionality
Comments
BewareMyPower
pushed a commit
that referenced
this issue
Sep 1, 2021
…ons (#686)

Fixes #681 #687

### Motivation

#605 added a test framework for Kafka clients of different versions. However, it only added the basic e2e test; an important API, commitOffset, was not verified.

### Modifications

Add a commit-offset test to BasicEndToEndPulsarTest and BasicEndToEndKafkaTest in the io.streamnative.pulsar.handlers.kop.compatibility package. This feature belongs to the compatibility work between different versions of the Kafka client and KoP.
BewareMyPower
added a commit
that referenced
this issue
Oct 20, 2021
…ession codec when handling producer requests with entryFormat=kafka (#791)

fixes #687

### Motivation

Previously, in order to support low-version Kafka clients, such as 0.10 clients, we validated the data when processing fetch requests. Lower message formats carry an internal message set, and in production I have encountered cases where the offsets inside the internal message set are wrong, for example due to a bug in old versions of the sarama client. The problem is that for low-version message formats we regenerate the message records whether or not the internal message set is actually broken, which is unnecessary in some cases.

### Modifications

Validate the message set at produce time, as Kafka does, and only when entryFormat=kafka. When entryFormat=pulsar, we still cannot avoid message conversion during consumption.

Co-authored-by: Yunze Xu <xyzinfernity@163.com> Co-authored-by: Kai Wang <kwang@streamnative.io> Co-authored-by: Huanli Meng <48120384+Huanli-Meng@users.noreply.github.com>
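The produce-time check described in this commit can be sketched roughly as follows. This is a hedged, minimal model (the class and method names here are hypothetical, not KoP's implementation): with entryFormat=kafka, check whether a batch's internal offset deltas are exactly 0..n-1, and only rewrite the batch when that check fails, instead of unconditionally regenerating records.

```java
// Hypothetical sketch of produce-time offset-delta validation.
// Not KoP's actual classes; it only models the idea that a well-formed
// batch has internal offset deltas 0, 1, ..., n-1.
public class BatchValidator {
    // Returns true if the deltas are already correct, meaning the
    // batch can be stored as-is without regenerating its records.
    static boolean deltasValid(int[] offsetDeltas) {
        for (int i = 0; i < offsetDeltas.length; i++) {
            if (offsetDeltas[i] != i) {
                return false; // e.g. a batch from a buggy old client
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(deltasValid(new int[]{0, 1, 2})); // true
        System.out.println(deltasValid(new int[]{0, 0, 0})); // false
    }
}
```

Only batches that fail this check would need their records rewritten, which is the cost the commit is trying to avoid in the common case.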
BewareMyPower
added a commit
that referenced
this issue
Oct 20, 2021
Is your feature request related to a problem? Please describe.
This issue was discovered while fixing #681.
In Kafka, the message offset is allocated locally by the partition leader, and a message written through KoP does not have the allocated offset rewritten into its records. When the producer enables compression, or writes v0/v1 messages, the offsets inside the internal message set may be wrong. When the consumer reads the message, even after down-conversion only the batch baseOffset is corrected; the internal record offsets may still be wrong.
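The offset arithmetic behind this problem can be illustrated with a minimal model (hypothetical names, not KoP or kafka-clients code): a consumer resolves each record's absolute offset as the batch's baseOffset plus the record's offsetDelta, so patching only the baseOffset leaves records with broken deltas at wrong absolute offsets.

```java
// Minimal model of Kafka v2 batch offset arithmetic (hypothetical,
// for illustration only): each record stores a delta relative to the
// batch's baseOffset.
public class OffsetModel {
    static long absoluteOffset(long baseOffset, int offsetDelta) {
        return baseOffset + offsetDelta;
    }

    public static void main(String[] args) {
        // Suppose the broker assigns baseOffset 100 to a 3-record batch.
        long baseOffset = 100L;
        int[] deltas = {0, 1, 2};
        for (int d : deltas) {
            System.out.println(absoluteOffset(baseOffset, d)); // 100, 101, 102
        }
        // If a buggy client had written deltas {0, 0, 0}, all three
        // records would resolve to offset 100 even after the batch's
        // baseOffset was fixed during down-conversion.
    }
}
```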
Describe the solution you'd like
The batch offset in brokerEntryMetadata is correct. When down-conversion is needed, or the message version is v0 or v1, we should reallocate the internal message offsets.
At present we do not verify internal message offsets at write time the way Kafka does. If there is no better way, then at least for now we should provide this guarantee at consume time.
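The proposed consume-time fix can be sketched as follows. This is a hedged sketch under the assumptions stated in the issue (the helper below is hypothetical, not KoP's code): since the batch offset from brokerEntryMetadata is trusted, each internal record offset is reassigned sequentially from it, ignoring the possibly-broken deltas written by the client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of consume-time offset reallocation.
// The trusted base offset would come from brokerEntryMetadata; the
// records' own (possibly wrong) internal offsets are discarded.
public class OffsetReassigner {
    // Returns corrected absolute offsets for a batch of `count` records.
    static List<Long> reassign(long trustedBaseOffset, int count) {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            offsets.add(trustedBaseOffset + i); // sequential, like Kafka's leader
        }
        return offsets;
    }

    public static void main(String[] args) {
        // A 3-record batch whose internal offsets were garbage, but whose
        // batch offset (42) from brokerEntryMetadata is correct.
        System.out.println(reassign(42L, 3)); // [42, 43, 44]
    }
}
```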