4 thread in parallel #147
# Conflicts: # dotnet/CoreLib/Handlers/SummarizationHandler.cs
The pipeline scalability is based on asynchronous queues being processed in parallel. If a message in the queue is taking too long because it is doing too much work and the work could be divided, why not leverage the existing infrastructure and split the task over multiple messages? About embedding, many embedding generators support passing a list of strings to generate multiple embeddings at once. Maybe we should look into that too? |
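To make the batching idea above concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the project's code: `embed_all`, `batched`, and `fake_embed_batch` are hypothetical names, and `fake_embed_batch` only stands in for a real embedding generator that accepts a list of strings. The point is that N partitions turn into roughly N / batch_size generator calls instead of N.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 16,
) -> List[Vector]:
    """Embed every text with one generator call per batch, preserving input order."""
    vectors: List[Vector] = []
    for batch in batched(texts, batch_size):
        result = embed_batch(batch)
        # Guard against a generator that drops or duplicates entries.
        if len(result) != len(batch):
            raise RuntimeError("generator returned a different number of vectors than inputs")
        vectors.extend(result)
    return vectors


def fake_embed_batch(batch: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in for a real embedding generator; counts its own calls."""
    fake_embed_batch.calls += 1
    return [[float(len(text))] for text in batch]


fake_embed_batch.calls = 0
```

With five short strings and `batch_size=2`, the stand-in generator is called three times rather than five. Whether a real generator accepts a list, and what batch size it tolerates, depends on the provider.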
@dluc FWIW I have an open PR in the SK repo to enable batching microsoft/semantic-kernel#3295. If SK supports batching natively the changes required here are minimal. Or, could create a KM specific |
As I see in my tests, this Maybe it is a good idea to send a batch of strings, because the files aren't the problem; the many small embedding requests are the problem |
Here's my suggestion: rather than changing Longer term, what I would recommend investigating:
Pros and Cons:
@dluc btw how can I add my custom handler? I don't see any extensions like |
I was about to provide an example, but the code is too complex. Currently there are two different methods, one |
@dluc what do you think about |
# Conflicts: # service/Core/Handlers/GenerateParallelEmbeddingsHandler.cs # service/Core/Handlers/SummarizationParallelHandler.cs # service/tests/FunctionalTests/National-Planning-Policy-Framework.pdf
@dluc I tested the latest version and it performs fine, so maybe we can convert this PR into "custom handler extensions"? |
extensions/LlamaSharp/LlamaSharp.FunctionalTests/LlamaSharp.FunctionalTests.csproj
PR updated. If the code is still working it could be merged as is. Handlers now can be configured in the service without touching dependency injection and other files (see appsettings.json list of handlers) |
I will check this code |
# Conflicts: # service/Core/KernelMemoryBuilder.cs # service/tests/Core.FunctionalTests/National-Planning-Policy-Framework.pdf # service/tests/Core.FunctionalTests/ServerLess/SubDirFilesAndStreamsTest.cs # service/tests/Service.FunctionalTests/Service.FunctionalTests.csproj
…pipline # Conflicts: # service/Core/AppBuilders/DependencyInjection.cs # service/Core/KernelMemoryBuilder.cs
# Conflicts: # service/Core/Core.csproj
@dluc I made some changes in the processing, what do you think about it? I now prefer a more standard way - parallel foreach. Also, I think it can be just part of a regular Handler, as it really relies on asynchronous operations. As an option, we could consider how many threads this should be distributed over, because I feel that supporting 2-3 handlers might be challenging. |
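The "parallel foreach with a cap on the number of threads" idea can be sketched as follows. This is an illustration in Python, not the handler code from this PR: `parallel_for_each` and `FakeEmbedder` are hypothetical names, and the default of 4 is only borrowed from the PR title. A semaphore bounds how many items are processed at once, and results come back in input order.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def parallel_for_each(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    max_parallelism: int = 4,
) -> List[R]:
    """Run `worker` on every item with at most `max_parallelism` calls in flight."""
    if max_parallelism < 1:
        raise ValueError("max_parallelism must be at least 1")
    semaphore = asyncio.Semaphore(max_parallelism)

    async def guarded(item: T) -> R:
        # Each call waits here until one of the `max_parallelism` slots is free.
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the inputs.
    return list(await asyncio.gather(*(guarded(item) for item in items)))


class FakeEmbedder:
    """Hypothetical stand-in for a slow, I/O-bound call; records peak concurrency."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0

    async def embed(self, text: str) -> int:
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulate network latency
        self.in_flight -= 1
        return len(text)
```

Calling `asyncio.run(parallel_for_each(partitions, embedder.embed, max_parallelism=2))` never has more than two `embed` calls running at once. Making the cap a configurable setting would be one way to address the question of how many threads the work should be spread over.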
also I use |
I added some tests and renamed the handlers, not to replace the default ones. From my tests the parallel embeddings handler shows a faster execution, while the summarization handler takes about the same time to generate a summary. The handlers can be used on demand, while the default ones are still in use.
This is amazing! Thanks a lot! |
Motivation and Context (Why the change? What's the scenario?)
Related to #131. I'm trying to speed up document processing.
High level description (Approach, Design)
Just parallel tasks for now.
This PR is meant to open a discussion about this idea for improving performance.