[server] [client] [test] Global RT DIV: Chunking Support #1385
I would like to call out another item we didn't discuss in the review meeting:
- Clean up the Global RT DIV messages from VT.
This new type of message can be large, and with chunking support we leak the data chunks because each chunk ID is unique.
I was thinking of a strategy where the follower sends out a Kafka delete, or a Kafka message with an empty value, for the previous key after consuming a new Global RT DIV message.
If we don't do the cleanup, the size of the Kafka topic might grow a lot, depending on the sending frequency.
Maybe we don't need to implement such cleanup in the MVP, but I think we eventually need some way to clean these messages up from the version topics.
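To make the cleanup idea above concrete, here is a minimal sketch of the bookkeeping it would need. It is not part of this PR. The class `DivChunkCleanupSketch`, the `OutgoingRecord` type, and the string keys are all hypothetical stand-ins; a record with a `null` value models a Kafka delete (tombstone) for a key that belonged to the previous Global RT DIV message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: emit tombstones for the keys of the previous Global RT DIV message. */
final class DivChunkCleanupSketch {
  /** Simplified stand-in for a record to be produced to the version topic. */
  static final class OutgoingRecord {
    final String key;
    final byte[] value; // null models a Kafka delete (tombstone)

    OutgoingRecord(String key, byte[] value) {
      this.key = key;
      this.value = value;
    }
  }

  // Keys (chunks and manifest) written for the most recently consumed DIV message.
  private List<String> previousKeys = new ArrayList<>();

  /**
   * Call after a new Global RT DIV message has been fully consumed.
   * Returns one tombstone per key of the previous message that the new message does not reuse.
   */
  List<OutgoingRecord> onNewDivConsumed(List<String> newKeys) {
    List<OutgoingRecord> tombstones = new ArrayList<>();
    for (String key : previousKeys) {
      if (!newKeys.contains(key)) {
        tombstones.add(new OutgoingRecord(key, null));
      }
    }
    previousKeys = new ArrayList<>(newKeys);
    return tombstones;
  }
}
```

Because chunk IDs are unique, every key of the previous message would normally get a tombstone. How often this runs would follow the sending frequency of the DIV messages.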
Looks good overall. Left a few minor comments. `ChunkAssembler` is mentioned in the description; can we remove it if it's not part of this change anymore?
* Renamed `resetUpstreamOffsetMap()` to `mergeUpstreamOffsets()`. 😮💨
* Minor refactor around `validateMessage()` 🦯🦯
* Copied `KafkaMessageEnvelope.avsc` for a new protocol version. 🥣
* Created `GlobalRtDiv` new `MessageType` that is based on the `Put` messages, and renamed `GlobalDivState` avro object to `GlobalRtDivState`. 🌯
* Unified the code paths of `bufferAndAssembleRecord()` with deserialization. 🥓
* Revised `GlobalRtDiv` chunking support in `VeniceWriter` and adjacent. 🍭
* Reusing `Put` instead of creating new message type `GlobalRtDiv`. 🫐
  1. Fixed `shouldProcessRecord()` condition in LFSIT. 🍊
  2. Split `ChunkAssembler` for RT DIV into its own object. 🍐
  3. `GlobalRtDiv` serializer is per-message to be safe, because it doesn't seem to be thread-safe. 🍋🟩
  4. Fixed spotbugs. 🌶️
  5. Fixed `divChunkAssembler` for the SIT unit test. 🫨
* The `MessageType` for the KME needs to be `PUT`. Only the `KafkaKey` will have `GLOBAL_RT_DIV` as the `MessageType`. 🏯
* Minor cleanup in the `VeniceWriterUnitTest` and `Put` rather than `Object` in the `sendMessageFunction`. 🪀
LGTM!
Summary
Continuation from #1257. Schema changes in #1523.
This PR mainly focuses on adding chunking support for DIV messages when they are produced to Kafka topics, as the size of a DIV message can surpass the ~1MB Kafka message limit. The existing chunking mechanism is reused, including the `CHUNK` and `CHUNKED_VALUE_MANIFEST` values in the message's `schemaId`:
- Every DIV message has `GLOBAL_RT_DIV` for the header byte in its `KafkaKey`. The corresponding `KafkaMessageEnvelope` has a `Put` payload that uses the `putValue` field to hold the `GlobalRtDiv` data, and its `schemaId` is set as follows:
  - Non-chunked message: `schemaId` is set to the current protocol version of `GLOBAL_RT_DIV`.
  - Chunk message: `schemaId` is set to `CHUNK`.
  - Chunk manifest message: `schemaId` is set to `CHUNKED_VALUE_MANIFEST`. The `schemaId` of the `ChunkedValueManifest` will be the current protocol version of `GLOBAL_RT_DIV`.
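The three `schemaId` cases above can be illustrated with a small, self-contained sketch. It is not the `VeniceWriter` implementation: `GlobalRtDivChunkingSketch`, `PutRecord`, `buildRecords()`, the numeric constants, and the string-based manifest are all assumptions made for illustration. Only the rule for choosing `schemaId` and the shared `GLOBAL_RT_DIV` header byte are taken from the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: models how schemaId could be chosen for a Global RT DIV payload. */
final class GlobalRtDivChunkingSketch {
  // Placeholder constants (assumptions); the real values are defined in Venice.
  static final int CHUNK_SCHEMA_ID = -10;
  static final int CHUNKED_VALUE_MANIFEST_SCHEMA_ID = -20;
  static final int GLOBAL_RT_DIV_PROTOCOL_VERSION = 1;
  static final byte GLOBAL_RT_DIV_HEADER_BYTE = 3;

  /** Simplified stand-in for a KafkaKey header byte plus a Put payload. */
  static final class PutRecord {
    final byte keyHeaderByte;
    final int schemaId;
    final byte[] putValue;

    PutRecord(byte keyHeaderByte, int schemaId, byte[] putValue) {
      this.keyHeaderByte = keyHeaderByte;
      this.schemaId = schemaId;
      this.putValue = putValue;
    }
  }

  /** Splits a serialized DIV payload into records, following the three schemaId cases. */
  static List<PutRecord> buildRecords(byte[] serializedDiv, int maxChunkSize) {
    if (maxChunkSize <= 0) {
      throw new IllegalArgumentException("maxChunkSize must be positive");
    }
    List<PutRecord> records = new ArrayList<>();
    // Case 1: the payload fits in one message, so schemaId is the DIV protocol version.
    if (serializedDiv.length <= maxChunkSize) {
      records.add(new PutRecord(GLOBAL_RT_DIV_HEADER_BYTE, GLOBAL_RT_DIV_PROTOCOL_VERSION, serializedDiv));
      return records;
    }
    // Case 2: each chunk carries schemaId = CHUNK.
    int chunkCount = 0;
    for (int start = 0; start < serializedDiv.length; start += maxChunkSize) {
      int end = Math.min(start + maxChunkSize, serializedDiv.length);
      records.add(new PutRecord(GLOBAL_RT_DIV_HEADER_BYTE, CHUNK_SCHEMA_ID,
          Arrays.copyOfRange(serializedDiv, start, end)));
      chunkCount++;
    }
    // Case 3: the manifest carries schemaId = CHUNKED_VALUE_MANIFEST and records the
    // DIV protocol version internally (modeled here as a plain string).
    byte[] manifest = ("schemaId=" + GLOBAL_RT_DIV_PROTOCOL_VERSION + ",chunks=" + chunkCount
        + ",size=" + serializedDiv.length).getBytes(StandardCharsets.UTF_8);
    records.add(new PutRecord(GLOBAL_RT_DIV_HEADER_BYTE, CHUNKED_VALUE_MANIFEST_SCHEMA_ID, manifest));
    return records;
  }
}
```

In this sketch, a 40-byte payload with a 16-byte chunk limit yields three chunk records followed by one manifest record, and every record keeps the same key header byte.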
Changes
- Added a new `MessageType` called `GlobalRtDiv`, which reuses the `Put` message type format and objects. When the Venice server encounters a message whose `KafkaKey` contains the `GlobalRtDiv` header byte, it will know to process this message differently from a regular `Put`.
- The only place that carries the `GlobalRtDiv` message type is the header byte in `KafkaKey`. Otherwise, the message is identical to a regular `Put`.
- `KafkaMessageEnvelope.avsc` will not be updated, to avoid the unnecessary risk of incompatible avro formats when upgrading the cluster.
- The risk of not having a `GlobalRtDiv` message type in KME is that the `GlobalRtDiv` objects will be processed as user records and stored in the storage engine, which seems to be much less scary than a cluster upgrade issue.
- `GlobalRtDiv` messages should not be processed if they originate from remote VT and RT, because those are invalid scenarios. These two conditions are checked.
Minor Changes
- Renamed `resetUpstreamOffsetMap()` to `mergeUpstreamOffsets()` in `OffsetRecord`.
- Fixed `toString()` in `KafkaKey`, which incorrectly assumed all messages would be `ControlMessage`, `Put`, or `Delete`. This misses `Update` messages and the new `GlobalRtDiv` message that is being added.
- Added methods (`buildPutPayload()` and `buildManifestPayload()`) in `VeniceWriter` for creating the `Put` payloads, including when chunking is involved.
Testing
- `testGlobalRtDivChunking()` in `VeniceWriterUnitTest`
- `testShouldProcessRecordForGlobalRtDivMessage()` in `StoreIngestionTaskTest`
- `testProcessGlobalRtDivMessage()` in `StoreIngestionTaskTest`
- `testChunkedDiv()` in `TestGlobalRtDiv`
Does this PR introduce any user-facing changes?