Support KeyValue Schema with GenericRecord/AUTO_CONSUME #9844
Labels
lifecycle/stale
type/enhancement
We have the KeyValue schema, which supports a generic key-value model in which both the key and the value have a schema.
When you are dealing with structured data types, you currently use Sink<GenericRecord> with the AUTO_CONSUME schema. This way you can automatically deal with any supported form of data structure.
But if you use AUTO_CONSUME you cannot consume KeyValue records.
Describe the solution you'd like
I would like to see a way to use AUTO_CONSUME so that, in the case of a KeyValue schema, it passes a special GenericRecord instance with two fields, one for the key and one for the value.
GenericRecord already supports nested data structures, so it is possible to set the schema for the key field and for the value field.
Advanced processors that can handle nested structures will benefit from this new feature: they will automatically be able to deal with KeyValue without changes and in a consistent way, that is, by dealing only with GenericRecord, which is the generic key-value dictionary we have in Pulsar.
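To illustrate the idea, here is a minimal, self-contained sketch in plain Java. It does not use the Pulsar API: SimpleRecord, MapRecord, wrapKeyValue and flatten are hypothetical names invented for this example, with SimpleRecord standing in for GenericRecord. The point is only that a processor written against nested records handles the wrapped key/value pair with no special case.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyValueRecordSketch {

    /** Hypothetical stand-in for GenericRecord: named fields that may hold nested records. */
    public interface SimpleRecord {
        List<String> fieldNames();
        Object field(String name);
    }

    /** A record backed by an insertion-ordered map (assumption: field order matters). */
    public static final class MapRecord implements SimpleRecord {
        private final Map<String, Object> fields;

        public MapRecord(Map<String, Object> fields) {
            this.fields = new LinkedHashMap<>(fields);
        }

        @Override
        public List<String> fieldNames() {
            return new ArrayList<>(fields.keySet());
        }

        @Override
        public Object field(String name) {
            return fields.get(name);
        }
    }

    /** Wraps a key and a value into one record with exactly two fields: "key" and "value". */
    public static SimpleRecord wrapKeyValue(Object key, Object value) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("key", key);
        fields.put("value", value);
        return new MapRecord(fields);
    }

    /** A generic processor: flattens nested records into dotted paths. */
    public static Map<String, Object> flatten(SimpleRecord record) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", record, out);
        return out;
    }

    private static void flattenInto(String prefix, SimpleRecord record, Map<String, Object> out) {
        for (String name : record.fieldNames()) {
            Object fieldValue = record.field(name);
            String path = prefix.isEmpty() ? name : prefix + "." + name;
            if (fieldValue instanceof SimpleRecord) {
                // Nested record: recurse, extending the dotted path.
                flattenInto(path, (SimpleRecord) fieldValue, out);
            } else {
                out.put(path, fieldValue);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> keyFields = new LinkedHashMap<>();
        keyFields.put("id", 42);
        Map<String, Object> valueFields = new LinkedHashMap<>();
        valueFields.put("name", "alice");
        valueFields.put("age", 30);

        SimpleRecord wrapped = wrapKeyValue(new MapRecord(keyFields), new MapRecord(valueFields));
        // prints {key.id=42, value.name=alice, value.age=30}
        System.out.println(flatten(wrapped));
    }
}
```

The flatten method knows nothing about keys or values. It only walks nested records, which is the behavior this request asks for from existing GenericRecord-based sinks.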
Describe alternatives you've considered
Modifying all of the connectors to deal with both KeyValue and GenericRecord. This would be a big effort, and currently (2.7.x) you cannot have a Sink that deals with two separate data types (the user must explicitly set a "classname").
Additional context
I have implementations of Sinks that deal with generic data structures and allow the user to transform or map the data before writing it to the external system.