Add support for fixed schedule warmup #366

matthewkotila · 2025-04-23T23:25:38Z

Previously, Perf Analyzer (PA) fixed schedule mode was run like this:

perf_analyzer \
  --fixed-schedule \
  --input-data=input_data.json \
  -m facebook/opt-125m \
  --service-kind=openai \
  --endpoint=v1/chat/completions \
  --async

with an input_data.json like this:

{
  "data": [
    {
      "payload": [{
          "model": "facebook/opt-125m",
          "messages": [{"role": "user","content": "my_prompt_1"}],
          "max_completion_tokens": 1
        }],
      "timestamp": [1000]
    },
    {
      "payload": [{
          "model": "facebook/opt-125m",
          "messages": [{"role": "user","content": "my_prompt_2"}],
          "max_completion_tokens": 1
        }],
      "timestamp": [2000]
    }
  ]
}
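For longer schedules, writing this file by hand gets tedious. Below is a minimal sketch of a generator that emits the same layout as the example above. The helper name `build_fixed_schedule_input` and its parameters are hypothetical and not part of PA or genai-perf. The timestamp values are passed through unchanged and are assumed to follow the same convention as the example.

```python
import json


def build_fixed_schedule_input(prompts, timestamps, model="facebook/opt-125m"):
    """Return a dict in the input_data.json layout shown above.

    Hypothetical helper for illustration only; not part of PA or genai-perf.
    """
    if len(prompts) != len(timestamps):
        raise ValueError("each prompt needs exactly one timestamp")

    entries = []
    for prompt, timestamp in zip(prompts, timestamps):
        entries.append({
            "payload": [{
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_completion_tokens": 1,
            }],
            # One timestamp per payload, as in the example file.
            "timestamp": [timestamp],
        })
    return {"data": entries}


input_data = build_fixed_schedule_input(
    ["my_prompt_1", "my_prompt_2"], [1000, 2000]
)
print(json.dumps(input_data, indent=2))
```

Writing the result with `json.dump(input_data, open("input_data.json", "w"))` would produce a file that can be passed to `--input-data`.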

However, the warmup feature (--warmup-request-count) could not be combined with fixed schedule mode.

With this PR, warmup can be used together with fixed schedule mode:

perf_analyzer \
  --fixed-schedule \
  --warmup-request-count=1 \
  --input-data=input_data.json \
  -m facebook/opt-125m \
  --service-kind=openai \
  --endpoint=v1/chat/completions \
  --async

If --warmup-request-count=N is set, the first N payloads in input_data.json are sent as warmup requests, which are excluded from the final performance metrics and statistics and from the profile export JSON. The remaining payloads make up the standard benchmark.
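The split described above can be illustrated with a short sketch. This only models the semantics stated in this description and is not the C++ implementation. The function name `split_warmup` is hypothetical, and how PA handles an N larger than the number of payloads is not stated here, so the sketch simply lets Python slicing decide.

```python
def split_warmup(entries, warmup_request_count):
    """Partition dataset entries into (warmup, benchmark) lists.

    Hypothetical illustration of the described behavior: the first N
    entries are warmup, the remainder are benchmarked.
    """
    if warmup_request_count < 0:
        raise ValueError("warmup_request_count must be non-negative")
    # Assumption: no special handling when N exceeds len(entries);
    # slicing then yields an empty benchmark list.
    warmup = entries[:warmup_request_count]
    benchmark = entries[warmup_request_count:]
    return warmup, benchmark


# Entries mirror the two payloads from the example input_data.json.
entries = [
    {"prompt": "my_prompt_1", "timestamp": 1000},
    {"prompt": "my_prompt_2", "timestamp": 2000},
]

warmup, benchmark = split_warmup(entries, warmup_request_count=1)
print("warmup:", [e["prompt"] for e in warmup])
print("benchmark:", [e["prompt"] for e in benchmark])
```

With the two-entry example file and --warmup-request-count=1, only the second payload would contribute to the reported metrics under this reading.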

docs/cli.md

docs/inference_load_modes.md

genai-perf/genai_perf/config/generate/perf_analyzer_config.py

genai-perf/genai_perf/config/input/config_command.py

genai-perf/genai_perf/inputs/converters/tensorrtllm_engine_converter.py

genai-perf/tests/test_cli.py

genai-perf/tests/test_perf_analyzer_config.py

src/command_line_parser.cc

src/custom_request_schedule_manager.cc

src/infer_context.cc

src/inference_profiler.cc

src/mock_request_rate_worker.h

src/model_parser.cc

src/model_parser.h

src/request_rate_manager.cc

src/test_command_line_parser.cc

src/test_custom_request_schedule_manager.cc

src/test_model_parser.cc

Copilot

Pull Request Overview

This PR adds support for fixed schedule warmup by modifying request scheduling, adjusting input parsing, and updating command-line validations. Key changes include:

Introducing a new dataset_offset parameter in worker and thread configuration classes.
Modifying the ModelParser to initialize fixed schedule inputs via a new constructor.
Refactoring CustomRequestScheduleManager to distinguish between warmup and benchmark schedules.
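One plausible reading of the dataset_offset change is that benchmark workers start reading the dataset after the entries consumed by warmup. The sketch below illustrates that idea only. The class `SketchWorker`, its fields other than `dataset_offset`, and its method are assumptions made for illustration and do not mirror the actual C++ worker classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchWorker:
    """Hypothetical worker that sends a contiguous slice of the dataset."""

    dataset: List[str]
    dataset_offset: int = 0               # index of the first entry to send
    request_count: Optional[int] = None   # None means "send until the end"

    def entries_to_send(self) -> List[str]:
        if self.request_count is None:
            end = len(self.dataset)
        else:
            end = self.dataset_offset + self.request_count
        return self.dataset[self.dataset_offset:end]


dataset = ["my_prompt_1", "my_prompt_2", "my_prompt_3"]
warmup_request_count = 1

# Assumption: warmup starts at index 0 and benchmark starts where warmup ends.
warmup_worker = SketchWorker(dataset, dataset_offset=0,
                             request_count=warmup_request_count)
benchmark_worker = SketchWorker(dataset, dataset_offset=warmup_request_count)

print(warmup_worker.entries_to_send())
print(benchmark_worker.entries_to_send())
```

Passing an offset rather than a pre-sliced copy lets both phases share one dataset object, which may be why the parameter had to be propagated through several classes.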

Reviewed Changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated no comments.

File	Description
src/request_rate_worker.h	Added dataset_offset parameter to worker construction.
src/request_rate_manager.h & .cc	Updated thread configuration and worker creation APIs.
src/perf_analyzer.cc	Adjusted ModelParser instantiation for fixed schedule.
src/model_parser.{h,cc}	Introduced a constructor that initializes fixed schedule inputs and removed duplicate code in InitOpenAI.
src/custom_request_schedule_manager.{h,cc}	Refactored schedule generation to separate warmup and benchmark schedules.
src/command_line_parser.cc	Updated help messages and validation checks for fixed schedule mode.
Test and documentation files	Updated to support new warmup functionality.

nicolasnoble

Yeah looks good to me. The change is large due to propagation of some of the parameters, but that's mostly mechanical work. I've left only a few nits here and there, otherwise it's fine with me.

src/custom_request_schedule_manager.h

src/custom_request_schedule_manager.cc

src/test_command_line_parser.cc

Add support for fixed schedule warmup

81627f0

