
[Epic] Improve concurrency in LLMBlock #135

Closed · 2 tasks done
markmc opened this issue Jul 15, 2024 · 1 comment

markmc (Contributor) commented Jul 15, 2024

From aakankshaduggal#8

Overview

The current implementation of LLMBlock sends the entire dataset to the OpenAI server as a single large batch of requests. Some requests may then wait too long for a response, causing timeout errors, and an extremely large batch risks overloading the backend server.

Proposed Changes

Add concurrent processing to LLMBlock using Python’s concurrent.futures package. The key changes:

  1. Use concurrent.futures with a pool of threads to manage and launch parallel tasks (see the sketch below).
  2. Allow users to specify the number of requests to send in each batch.
  3. Allow users to specify the number of concurrent worker threads that handle batches.

Example Usage

If the user sets the concurrency to 8 and the batch size to 32, the system will run 8 concurrent threads, each sending a batch of 32 prompts, resulting in a total of 256 requests processed simultaneously by the backend server.
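
For illustration, a minimal sketch of how batching with concurrent.futures might look. This is not the actual LLMBlock implementation; `send_batch`, `generate_concurrently`, and the echoed responses are placeholders for the real OpenAI request logic.

```python
from concurrent.futures import ThreadPoolExecutor


def send_batch(batch):
    # Placeholder for the real request logic: in LLMBlock this would send
    # one batch of prompts to the OpenAI-compatible server.
    return [f"response to {prompt}" for prompt in batch]


def generate_concurrently(prompts, batch_size=32, concurrency=8):
    # Split the dataset into fixed-size batches instead of one giant request.
    batches = [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]
    # `concurrency` worker threads process the batches in parallel, so at
    # most concurrency * batch_size prompts are in flight at once
    # (8 * 32 = 256 in the example above).
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        per_batch = list(pool.map(send_batch, batches))
    # Flatten the per-batch responses back into a single list.
    return [response for responses in per_batch for response in responses]


results = generate_concurrently([f"prompt {i}" for i in range(256)])
assert len(results) == 256
```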


@markmc markmc added this to the 0.1.3 milestone Jul 15, 2024
shivchander (Member) commented

Please use this: aakankshaduggal#6

aakankshaduggal#8 has been closed in favor of aakankshaduggal#6

gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 18, 2024
Problem statement from npalaska@redhat.com: as in the Overview, Proposed Changes, and Example Usage in the issue description above.


Ref: aakankshaduggal#6

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

Co-authored-by: Nikhil Palaskar <npalaska@redhat.com>
Co-authored-by: shiv <shivchander.s30@gmail.com>
Co-authored-by: Kai Xu <xuk@ibm.com>
Co-authored-by: Aakanksha Duggal <aduggal@redhat.com>
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
…text

This allows None to be used as a default in generate_data and from the CLI

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
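
A hedged sketch of the change this commit describes; the parameter name `batch_size` and the library default are assumptions, not confirmed by the thread.

```python
# Hedged sketch; `batch_size` as the parameter name is an assumption.
def generate_data(batch_size=None):
    # None acts as "not specified", so the CLI can omit the option and let
    # the library decide whether and how to batch.
    if batch_size is None:
        batch_size = 32  # hypothetical library default
    print(f"generating with batch_size={batch_size}")


generate_data()              # CLI passed nothing; library default applies
generate_data(batch_size=8)  # caller overrides explicitly
```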
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
This is a mitigation to allow the `instructlab-sdg` library to merge and
release before `instructlab` has updated the CLI invocation of
generate_data to properly distinguish between backend types. It should be
reverted once that change is made in the CLI.

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
@russellb russellb modified the milestones: 0.1.3, 0.2.0 Jul 22, 2024
@markmc markmc modified the milestones: 0.2.0, 0.2.1 Jul 23, 2024
@markmc markmc modified the milestones: 0.2.1, 0.2.2 Jul 26, 2024
@markmc markmc changed the title from "Improve concurrency in LLMBlock" to "[Epic] Improve concurrency in LLMBlock" Jul 27, 2024
markmc added a commit to gabe-l-hart/instructlab that referenced this issue Jul 27, 2024
Relates to instructlab/sdg#135

Since instructlab-sdg-0.1.3, data generation in batches is supported and
controlled by a parameter to the `generate_data()` function.

This is not supported with llama-cpp, and so we disable it in that case.

Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
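
A minimal sketch of the guard this commit describes, reusing the hypothetical `batch_size` parameter from above and assuming the backend name comes from the serving configuration; none of these names are confirmed by the thread.

```python
# Hedged sketch; parameter and variable names are assumptions.
def generate_data(batch_size=None):
    if batch_size:
        print(f"generating in batches of {batch_size}")
    else:
        print("batching disabled")


backend = "llama-cpp"  # hypothetically detected from the serving config

# llama-cpp cannot handle large concurrent batches, so disable batching
# there and keep it enabled for backends such as vLLM.
generate_data(batch_size=None if backend == "llama-cpp" else 32)
```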
@markmc markmc closed this as completed Jul 28, 2024
jwm4 pushed a commit to jwm4/sdg that referenced this issue Dec 13, 2024
…_actions/DavidAnson/markdownlint-cli2-action-17.0.0

Bump DavidAnson/markdownlint-cli2-action from 16.0.0 to 17.0.0