
[Epic] Improve concurrency in LLMBlock #135

Closed · 2 tasks done
markmc opened this issue Jul 15, 2024 · 1 comment

markmc (Contributor) commented Jul 15, 2024

From aakankshaduggal#8

Overview

The current implementation of LLMBlock sends the entire dataset to the OpenAI server as a single large batch of requests. Some requests may then wait too long for a response, causing timeout errors, and an extremely large batch risks overloading the backend server.

Proposed Changes

Add concurrent processing to LLMBlock using Python’s concurrent.futures package. The key changes:

  1. Use concurrent.futures with a pool of threads to manage and launch parallel tasks (see the sketch below).
  2. Allow users to specify the number of requests to send in each batch.
  3. Allow users to specify the number of concurrent worker threads that handle batches.

Example Usage

If the user sets the concurrency to 8 and the batch size to 32, the system will run 8 concurrent threads, each sending a batch of 32 prompts, resulting in a total of 256 requests processed simultaneously by the backend server.
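
For illustration, a minimal sketch of how batching with concurrent.futures might look. This is not the actual LLMBlock implementation; `send_batch`, `generate_concurrently`, and the echoed responses are placeholders for the real OpenAI request logic.

```python
from concurrent.futures import ThreadPoolExecutor


def send_batch(batch):
    # Placeholder for the real request logic: in LLMBlock this would send
    # one batch of prompts to the OpenAI-compatible server.
    return [f"response to {prompt}" for prompt in batch]


def generate_concurrently(prompts, batch_size=32, concurrency=8):
    # Split the dataset into fixed-size batches instead of one giant request.
    batches = [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]
    # `concurrency` worker threads process the batches in parallel, so at
    # most concurrency * batch_size prompts are in flight at once
    # (8 * 32 = 256 in the example above).
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        per_batch = list(pool.map(send_batch, batches))
    # Flatten the per-batch responses back into a single list.
    return [response for responses in per_batch for response in responses]


results = generate_concurrently([f"prompt {i}" for i in range(256)])
assert len(results) == 256
```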


@markmc markmc added this to the 0.1.3 milestone Jul 15, 2024
shivchander (Member) commented

Please use this: aakankshaduggal#6

aakankshaduggal#8 has been closed in favor of aakankshaduggal#6

gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 18, 2024
Problem statement from npalaska@redhat.com: as in the Overview, Proposed Changes, and Example Usage in the issue description above.


Ref: aakankshaduggal#6

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

Co-authored-by: Nikhil Palaskar <npalaska@redhat.com>
Co-authored-by: shiv <shivchander.s30@gmail.com>
Co-authored-by: Kai Xu <xuk@ibm.com>
Co-authored-by: Aakanksha Duggal <aduggal@redhat.com>
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
…text

This allows None to be used as a default in generate_data and from the CLI

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
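
A hedged sketch of the change this commit describes; the parameter name `batch_size` and the library default are assumptions, not confirmed by the thread.

```python
# Hedged sketch; `batch_size` as the parameter name is an assumption.
def generate_data(batch_size=None):
    # None acts as "not specified", so the CLI can omit the option and let
    # the library decide whether and how to batch.
    if batch_size is None:
        batch_size = 32  # hypothetical library default
    print(f"generating with batch_size={batch_size}")


generate_data()              # CLI passed nothing; library default applies
generate_data(batch_size=8)  # caller overrides explicitly
```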
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
gabe-l-hart added a commit to gabe-l-hart/instructlab-sdg that referenced this issue Jul 19, 2024
This is a mitigation to allow the `instructlab-sdg` library to merge and
release before `instructlab` has updated the CLI invocation of
generate_data to properly distinguish between backend types. It should be
reverted once that change is made in the CLI.

instructlab#135

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
@russellb russellb modified the milestones: 0.1.3, 0.2.0 Jul 22, 2024
@markmc markmc modified the milestones: 0.2.0, 0.2.1 Jul 23, 2024
@markmc markmc modified the milestones: 0.2.1, 0.2.2 Jul 26, 2024
@markmc markmc changed the title from "Improve concurrency in LLMBlock" to "[Epic] Improve concurrency in LLMBlock" Jul 27, 2024
markmc added a commit to gabe-l-hart/instructlab that referenced this issue Jul 27, 2024
Relates to instructlab/sdg#135

Since instructlab-sdg-0.1.3, data generation in batches is supported and
controlled by a parameter to the `generate_data()` function.

This is not supported with llama-cpp, and so we disable it in that case.

Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
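
A minimal sketch of the guard this commit describes, reusing the hypothetical `batch_size` parameter from above and assuming the backend name comes from the serving configuration; none of these names are confirmed by the thread.

```python
# Hedged sketch; parameter and variable names are assumptions.
def generate_data(batch_size=None):
    if batch_size:
        print(f"generating in batches of {batch_size}")
    else:
        print("batching disabled")


backend = "llama-cpp"  # hypothetically detected from the serving config

# llama-cpp cannot handle large concurrent batches, so disable batching
# there and keep it enabled for backends such as vLLM.
generate_data(batch_size=None if backend == "llama-cpp" else 32)
```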
@markmc markmc closed this as completed Jul 28, 2024
jwm4 pushed a commit to jwm4/sdg that referenced this issue Dec 13, 2024
…_actions/DavidAnson/markdownlint-cli2-action-17.0.0

Bump DavidAnson/markdownlint-cli2-action from 16.0.0 to 17.0.0