Concurrently send OpenAI requests in batches #8

npalaska · 2024-07-12T21:44:13Z

Overview

The current implementation of LLMBlock sends the entire dataset to the OpenAI server as a single large batch of requests. With very large batches, some requests can wait too long for a response and time out, and the backend server can be overloaded.

Changes Introduced

This PR refactors LLMBlock to process requests concurrently using Python’s concurrent.futures package. The key changes are:

Uses concurrent.futures to launch and manage parallel tasks on threads.
Allows users to specify the number of requests to send in each batch.
Allows users to specify the number of concurrent worker threads to handle batches.
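The changes listed above can be pictured with a minimal sketch. This is not the code in this PR: the names split_into_batches, generate_concurrently, and fake_send_batch are hypothetical, and fake_send_batch stands in for the real call to the OpenAI server. It assumes a thread pool from concurrent.futures and that each batch is sent as one call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_batches(prompts: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split prompts into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(prompts[i:i + batch_size]) for i in range(0, len(prompts), batch_size)]


def generate_concurrently(
    prompts: Sequence[str],
    send_batch: Callable[[List[str]], List[str]],
    batch_size: int = 32,
    num_workers: int = 8,
) -> List[str]:
    """Send batches through a thread pool and return outputs in input order."""
    batches = split_into_batches(prompts, batch_size)
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        # executor.map yields results in the order the batches were submitted,
        # so the flattened outputs line up with the original prompts.
        results = executor.map(send_batch, batches)
        return [output for batch_output in results for output in batch_output]


def fake_send_batch(batch: List[str]) -> List[str]:
    # Hypothetical stand-in for the real OpenAI completion request.
    return [f"response to {prompt}" for prompt in batch]


prompts = [f"prompt {i}" for i in range(100)]
outputs = generate_concurrently(prompts, fake_send_batch, batch_size=32, num_workers=8)
```

Threads suit this workload because each worker spends most of its time waiting on network I/O rather than running Python code.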

Example Usage

If the user sets the concurrency to 8 and the batch size to 32, the system runs 8 concurrent threads, each sending a batch of 32 prompts, so the backend server handles up to 256 requests at a time.
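The arithmetic behind that figure can be checked with a short helper. This is an illustrative sketch only: concurrency_profile is a hypothetical name, and it assumes every prompt in a batch is outstanding for as long as its worker thread is busy.

```python
import math
from typing import Tuple


def concurrency_profile(num_prompts: int, batch_size: int, num_workers: int) -> Tuple[int, int]:
    """Return (number of batches, peak number of prompts in flight)."""
    num_batches = math.ceil(num_prompts / batch_size)
    # No more workers can be busy than there are batches to hand out.
    active_workers = min(num_workers, num_batches)
    # The peak cannot exceed the dataset size itself.
    max_in_flight = min(num_prompts, active_workers * batch_size)
    return num_batches, max_in_flight


large = concurrency_profile(num_prompts=1000, batch_size=32, num_workers=8)
small = concurrency_profile(num_prompts=100, batch_size=32, num_workers=8)
```

With 1000 prompts the peak is 8 × 32 = 256 across 32 batches; with only 100 prompts there are 4 batches, so at most 4 workers are busy and the peak is 100.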

shivchander · 2024-07-15T20:30:01Z

closing in favor of #6

Concurrently send OpenAI requests in batches

ba11848

markmc mentioned this pull request Jul 15, 2024

[Epic] Improve concurrency in LLMBlock instructlab/sdg#135

Closed

2 tasks

shivchander closed this Jul 15, 2024

This was referenced Jul 17, 2024

LLMBlock concurrency instructlab/sdg#157

Merged

Add support for batching and parallel data generation instructlab/sdg#167

Closed
