[Feature] Added performance testing tool based on the PyTest testing framework #295
base: develop
Performance test
858b406 to af584ff
test/test_uc_performance
Outdated
| case_hit_rate_map: mapping of {case_idx: hit_rate}
| """
| print(f"[INFO] {len(test_cases)} test cases to run in total")
| failed_case = []
failed_case is not used in this function.
Putting this file in the benchmark directory seems better.
Per the UCM code-repository guidelines, all code comments must be written in English.
log,config,single
Please list the pip packages you use (e.g., pandas, pydantic) together with their versions, so that others who do not have them installed can run the tests.
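One way to produce the pinned list is sketched below. It is only an illustration: the helper name pinned_requirements is hypothetical, and the package list is an assumption based on the examples named in the comment above, not the PR's actual dependency set.

```python
from importlib import metadata


def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages.

    Packages that are not installed are reported as comment lines
    instead of raising, so the output can still be pasted into a
    requirements file and reviewed by hand.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name} is not installed")
    return lines


if __name__ == "__main__":
    # Assumed package list; replace it with what the tests really import.
    print("\n".join(pinned_requirements(["pandas", "pydantic", "pytest"])))
```

The printed lines can be copied into a requirements file next to the tests.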
| server_url: "http://141.111.32.70:9382"
| tokenizer_path: "/home/models/QwQ-32B"
| # Performance Test Configuration
| llmperf_test_cases:
- Configuration items could be added here, following the implementation of logs and reports, with results stored under timestamped names. (They can all be placed in the reports directory to avoid too many subdirectories, and should carry an llmperf flag.)
- Parameter names such as max_num_completed_requests and num_concurrent_requests are not descriptive enough; please add descriptions for them.
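A minimal sketch of both suggestions follows. The names LlmperfCase and make_llmperf_report_dir are hypothetical, the "llmperf_" prefix is one possible form of the requested flag, and the parameter descriptions are assumed from the upstream llmperf options of the same name rather than taken from this PR.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class LlmperfCase:
    """One performance test case (hypothetical container)."""

    # Assumed meaning: stop the case once this many requests have completed.
    max_num_completed_requests: int
    # Assumed meaning: number of requests kept in flight at the same time.
    num_concurrent_requests: int


def make_llmperf_report_dir(reports_root, now=None):
    """Create and return <reports_root>/llmperf_<YYYYmmdd_HHMMSS>.

    All runs share one reports directory; the 'llmperf_' prefix marks
    the entry as coming from the performance tool.
    """
    now = now or datetime.now()
    path = Path(reports_root) / f"llmperf_{now.strftime('%Y%m%d_%H%M%S')}"
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    case = LlmperfCase(max_num_completed_requests=20, num_concurrent_requests=4)
    print(case, make_llmperf_report_dir("reports"))
```

Keeping the timestamp in the directory name rather than in nested subdirectories keeps the reports tree flat, which is what the first bullet asks for.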
| from common.llmperf.utils.utils import reset_prefill_cache
|
|
| def run_test_cases(test_cases, timestamp_dir, model, server_url, tokenizer_path):
The Singleton pattern could be used here so that only one instance is created during a single program run and the test is executed only once. This prevents multiple assertions from triggering repeated test runs. (Test results can be stored in the instance; refer to config_utils for details.)
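The suggestion could look roughly like the sketch below. PerfResultCache and the runner callable are hypothetical names used for illustration; how config_utils actually implements its singleton is not shown in this thread, so this is not a copy of it.

```python
import threading


class PerfResultCache:
    """Process-wide singleton that runs the performance test at most once."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Create the single shared instance on first use.
        with cls._lock:
            if cls._instance is None:
                instance = super().__new__(cls)
                instance._has_run = False
                instance._results = None
                cls._instance = instance
        return cls._instance

    def get_results(self, runner):
        """Call runner() on the first request; reuse its result afterwards."""
        with self._lock:
            if not self._has_run:
                self._results = runner()
                self._has_run = True
        return self._results


if __name__ == "__main__":
    calls = []

    def fake_runner():
        # Stand-in for the expensive run_test_cases(...) call.
        calls.append(1)
        return {0: 0.5}

    first = PerfResultCache().get_results(fake_runner)
    second = PerfResultCache().get_results(fake_runner)
    print(first is second, len(calls))
```

Each test function can then assert on the stored results without starting a new run. A session-scoped pytest fixture would be another way to get run-once behaviour if a class-level singleton is not wanted.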
* fix mtp in ucm
…odelEngine-Group#322) * linear buffer for device * check data consistency after embedding
New performance testing tools
5a3bc0c to 4b8b8de
Purpose
Performance testing tool based on the PyTest testing framework
Modifications
1. Added tests for UC-related performance metrics, including full throughput and incremental throughput.
2. Added support for a custom PC hit rate.
3. Added support for a custom tokenizer.
Test