Enhancing the architecture of the ETL pipeline's transformers #1582

Open
kevintsai1202 opened this issue Oct 22, 2024 · 1 comment

Comments

@kevintsai1202

When building an ETL pipeline, a transformation step often has to perform several actions, which results in deeply nested function calls that are hard to maintain. Would it be possible to design transformers like advisors, so that they can be chained?

The following example combines chunking, keyword extraction, and summarization. Every transformation added in the future means another layer of nesting.

vectorStore.write(splitter.split(loadTextAsDocuments(summaryDocuments(keywordDocuments(docs)))));
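
To illustrate the kind of chaining being asked for, here is a minimal sketch in plain Java. It does not use Spring AI's API: `Doc`, `Transformer`, `chain`, `tag`, and `splitOn` are hypothetical names invented for this example, and the steps are trivial placeholders for keyword extraction, summarization, and chunking. It assumes Java 16 or later (records and `Stream.toList()`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

public class TransformerChain {

    // Hypothetical stand-in for a document; NOT the Spring AI Document class.
    record Doc(String text, List<String> tags) {}

    // Hypothetical type: a transformer maps a list of documents to a new list.
    interface Transformer extends Function<List<Doc>, List<Doc>> {}

    // Composes the given steps left to right into a single transformer,
    // so adding a step means adding an argument instead of a nesting level.
    static Transformer chain(Transformer... steps) {
        return docs -> {
            List<Doc> current = docs;
            for (Transformer step : steps) {
                current = step.apply(current);
            }
            return current;
        };
    }

    // Placeholder enrichment step: appends a label to every document's tags.
    static Transformer tag(String label) {
        return docs -> docs.stream().map(d -> {
            List<String> tags = new ArrayList<>(d.tags());
            tags.add(label);
            return new Doc(d.text(), tags);
        }).toList();
    }

    // Placeholder chunking step: splits each document's text on a literal delimiter.
    static Transformer splitOn(String delimiter) {
        return docs -> docs.stream()
                .flatMap(d -> Arrays.stream(d.text().split(Pattern.quote(delimiter)))
                        .map(part -> new Doc(part, d.tags())))
                .toList();
    }

    public static void main(String[] args) {
        // Same order as the nested call above: keywords, then summary, then splitting.
        Transformer pipeline = chain(tag("keywords"), tag("summary"), splitOn("|"));
        List<Doc> result = pipeline.apply(List.of(new Doc("first part|second part", List.of())));
        result.forEach(System.out::println);
    }
}
```

With this shape, the steps are listed in execution order rather than inside out, and the final list could be handed to a writer such as `vectorStore.write(...)` in one place.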

@youngmoneee
Contributor

There is a previous discussion #1253 regarding the design of the ETL pipeline.

This discussion covers not only handling multiple transformers in a chain format based on Reactive Streams but also includes improvements for efficiently processing large volumes of data and enhancing performance.

Please feel free to join the discussion if you’re interested. 🙇🏻‍♂️
