Support KeyValue Schema with GenericRecord/AUTO_CONSUME #9844

Closed
eolivelli opened this issue Mar 9, 2021 · 4 comments
Labels
lifecycle/stale type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages

Comments

@eolivelli
Contributor

We have the KeyValue schema that supports a generic key-value model, and both the key and the value have a schema.

When you are dealing with structured data types, you currently use Sink<GenericRecord> with the AUTO_CONSUME schema. This way you can automatically handle any supported form of data structure.

But if you use AUTO_CONSUME you cannot consume KeyValue records.

Describe the solution you'd like
I would like AUTO_CONSUME to handle the KeyValue schema by passing a special GenericRecord instance with two fields:

  • key
  • value

GenericRecord already supports nested data structures, so it is possible to set the schema for the key field and for the value field.

Advanced processors that can deal with nested structures will benefit from this feature: they will automatically handle KeyValue without changes and in a consistent way, dealing only with GenericRecord, which is the generic key-value dictionary we have in Pulsar.
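To illustrate the idea, here is a minimal sketch of the proposed shape. It does not use the Pulsar API: `Rec` is a hypothetical stand-in for GenericRecord, and `wrap` and `flatten` are names invented for this example. It only shows that a processor written against nested records needs no special case once a key/value pair is presented as one record with `key` and `value` fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeyValueSketch {

    // Hypothetical stand-in for GenericRecord: a dictionary of named fields
    // whose values may themselves be nested records. Not the Pulsar API.
    static final class Rec {
        private final Map<String, Object> fields = new LinkedHashMap<>();

        Rec set(String name, Object value) {
            fields.put(name, value);
            return this;
        }

        Map<String, Object> fields() {
            return fields;
        }
    }

    // Wraps a key and a value into one record with exactly two fields,
    // "key" and "value", mirroring the shape proposed in this issue.
    static Rec wrap(Object key, Object value) {
        return new Rec().set("key", key).set("value", value);
    }

    // A generic processor: flattens any nested record into dotted paths.
    // It has no knowledge of key/value pairs.
    static Map<String, Object> flatten(Rec record) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", record, out);
        return out;
    }

    private static void flattenInto(String prefix, Rec record, Map<String, Object> out) {
        for (Map.Entry<String, Object> e : record.fields().entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Rec) {
                flattenInto(path, (Rec) e.getValue(), out);
            } else {
                out.put(path, e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Rec key = new Rec().set("id", 42);
        Rec value = new Rec().set("name", "alice").set("age", 30);
        // Prints {key.id=42, value.name=alice, value.age=30}
        System.out.println(flatten(wrap(key, value)));
    }
}
```

In a real implementation the `key` and `value` fields would carry their own schemas, as described above; the sketch leaves schemas out entirely.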

Describe alternatives you've considered
Modifying all of the connectors to deal with both KeyValue and GenericRecord. This would be a big effort, and currently (2.7.x) you cannot have a Sink that deals with two separate data types (the user must explicitly set a "classname").

Additional context
I have implementations of Sinks that deal with generic data structures and allow the user to transform/map the data before writing to the external system.

@eolivelli eolivelli added the type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages label Mar 9, 2021
@eolivelli
Contributor Author

@rdhabalia @sijie @merlimat @jerrypeng

Is there any current work in this direction on your side?

@eolivelli
Contributor Author

Related work:
#9895

@codelipenghui
Contributor

This issue had no activity for 30 days; marking it with the Stale label.

@tisonkun
Member

tisonkun commented Dec 7, 2022

Can be resolved by #10057

@tisonkun tisonkun closed this as completed Dec 7, 2022
3 participants