you-seesee-you commented Oct 20, 2025

Purpose

Performance testing tool based on the PyTest testing framework

Modifications

1. Added tests for UC-related performance metrics, including full throughput and incremental throughput.
2. Added support for a custom PC hit rate.
3. Added support for a custom tokenizer.

Test

image

Performance test
case_hit_rate_map: mapping of {case_idx: hit_rate}
"""
print(f"[INFO] {len(test_cases)} test cases in total to be executed")
failed_case = []

failed_case is not used in this function.
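
If the list is meant to be kept rather than deleted, a minimal sketch of how it could be used follows. The names `run_all` and `run_one` are hypothetical and not taken from the PR; the sketch only assumes that a failing case raises an exception.

```python
def run_all(test_cases, run_one):
    """Run every case and return the indices of the cases that raised."""
    print(f"[INFO] {len(test_cases)} test cases in total to be executed")
    failed_cases = []
    for idx, case in enumerate(test_cases):
        try:
            run_one(case)  # hypothetical callable that executes a single case
        except Exception as exc:
            # Record the failure and keep going so one bad case does not stop the run.
            print(f"[ERROR] case {idx} failed: {exc}")
            failed_cases.append(idx)
    return failed_cases
```

Returning the list lets the caller assert on it or write it into the report; otherwise the variable can simply be removed.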

Contributor

ygwpz commented Oct 23, 2025

Putting this file in the benchmark directory seems better.

@yuanzhg078

Per the UCM code-repository guidelines, all code comments must be written in English.

Potterluo left a comment

log,config,single

Please list the pip packages you use (e.g., pandas, pydantic), along with their versions, so that others who don't have them installed can run the tool.
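
Besides listing the packages in a requirements file, the tool could check for them at start-up. The following is only a sketch: `missing_packages` is a hypothetical helper name, and it relies solely on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def missing_packages(names):
    """Return the entries of `names` that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            version(name)  # raises PackageNotFoundError when the distribution is absent
        except PackageNotFoundError:
            missing.append(name)
    return missing


# pandas and pydantic are the examples named in the review comment.
not_installed = missing_packages(["pandas", "pydantic"])
if not_installed:
    print(f"[WARN] missing packages: {', '.join(not_installed)}")
```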

server_url: "http://141.111.32.70:9382"
tokenizer_path: "/home/models/QwQ-32B"
# Performance Test Configuration
llmperf_test_cases:

  1. Add configuration items for the output location, following the implementation of logs and reports, and store results under timestamps. (They can all go in the reports directory to avoid too many subdirectories, and they should carry an llmperf flag.)
  2. Parameter names such as max_num_completed_requests and num_concurrent_requests are not descriptive enough; add descriptions for them.
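
Both points can be sketched roughly as follows. This is an illustration under assumptions, not the repository's implementation: `LlmperfCase` and `make_report_dir` are hypothetical names, the `reports` base directory and `llmperf` tag come from the comment above, and the field descriptions are my reading of the parameter names rather than anything stated in the PR.

```python
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path


@dataclass
class LlmperfCase:
    # The descriptions below are assumptions inferred from the names.
    max_num_completed_requests: int = field(
        metadata={"help": "assumed: stop the case once this many requests have completed"}
    )
    num_concurrent_requests: int = field(
        metadata={"help": "assumed: number of requests kept in flight at the same time"}
    )


def make_report_dir(base="reports", tag="llmperf", now=None):
    """Create and return <base>/<tag>_<YYYYmmdd_HHMMSS> for one test run."""
    now = now or datetime.now()
    path = Path(base) / f"{tag}_{now:%Y%m%d_%H%M%S}"
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Keeping every run directly under one base directory, with the tag in the directory name, avoids a deep subdirectory tree while still making llmperf results easy to identify.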

from common.llmperf.utils.utils import reset_prefill_cache


def run_test_cases(test_cases, timestamp_dir, model, server_url, tokenizer_path):

Consider using the Singleton pattern here: it ensures that only one instance is created per program run and that the test is executed only once, so repeated assertions do not cause the test to run multiple times. (Test results can be stored on the instance; see config_utils for details.)
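
A minimal sketch of that idea, assuming the benchmark can be wrapped in a single callable. `PerfTestRunner` and its `results` method are hypothetical names, and this is not the implementation in config_utils.

```python
class PerfTestRunner:
    """Process-wide holder that runs the benchmark once and caches its results."""

    _instance = None

    def __new__(cls):
        # Create the instance on first use; every later call returns the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._results = None
        return cls._instance

    def results(self, run_fn):
        """Call `run_fn` only on first use; later calls return the cached value."""
        if self._results is None:
            self._results = run_fn()
        return self._results
```

Each test can then call `PerfTestRunner().results(...)` and assert on a different metric without triggering another benchmark run. A session-scoped pytest fixture would be another way to get the same run-once behavior.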

you-seesee-you changed the title from "Added performance test" to "[Feature] Added performance testing tool based on the PyTest testing framework" on Nov 5, 2025
NaganooMei and others added 3 commits November 5, 2025 14:35
…odelEngine-Group#322)

* linear buffer for device

* check data consistency after embedding
New performance testing tools

New performance testing tools