
benchmark fix #229

Merged
merged 24 commits into from
Oct 30, 2024

Conversation

@kiddyjinjin (Collaborator) commented Sep 26, 2024

PR Category

Benchmark

Type of Change

New Feature

PR Description

1. New Testing Parameters

This PR introduces new benchmark testing parameters, including:

  • level

    • Type: str
    • Description: Marks the level of the benchmark.
    • Available levels:
      • comprehensive (default): Comprehensive testing.
      • core: Core testing.
  • warmup

    • Type: int
    • Description: The number of warm-up iterations.
    • Default value: DEFAULT_WARMUP_COUNT = 1000
  • iter

    • Type: int
    • Description: The number of benchmark iterations.
    • Default value: DEFAULT_ITER_COUNT = 100
  • query

    • Description: Indicates that the benchmark will only query properties without executing the full benchmark logic.
    • Default: This parameter is not set by default.
  • record

    • Type: str
    • Description: Specifies the format of the output data.
    • Available options:
      • none (default)
      • log: Logs output in JSON format.
  • dtype

    • Type: list[str]
    • Description: Specifies the data types for benchmark testing. Available dtypes can be listed using pytest --help.
    • Available data types:
      • torch.float16, torch.float32, torch.bfloat16, torch.int16, torch.int32, torch.bool, torch.complex64
  • metric

    • Type: list[str]
    • Description: Specifies the metrics covered by the benchmark test.
    • Available metrics:
      • latency, speedup, tflops, latency_base, accuracy, utilization
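The parameters above could be surfaced roughly as in the sketch below. This is a hypothetical illustration using a standalone argparse parser, not the PR's actual implementation (the real benchmark registers options through pytest, whose exact hook code is not shown here); defaults mirror the list above.

```python
import argparse

DEFAULT_WARMUP_COUNT = 1000  # default warm-up iterations (per the PR description)
DEFAULT_ITER_COUNT = 100     # default benchmark iterations

def build_parser():
    # Hypothetical CLI surface mirroring the new benchmark parameters.
    p = argparse.ArgumentParser("benchmark")
    p.add_argument("--level", choices=["comprehensive", "core"],
                   default="comprehensive", help="benchmark level")
    p.add_argument("--warmup", type=int, default=DEFAULT_WARMUP_COUNT,
                   help="number of warm-up iterations")
    p.add_argument("--iter", type=int, default=DEFAULT_ITER_COUNT,
                   help="number of benchmark iterations")
    p.add_argument("--query", action="store_true",
                   help="only query properties, skip the benchmark body")
    p.add_argument("--record", choices=["none", "log"], default="none",
                   help="'log' writes results as JSON")
    p.add_argument("--dtype", action="append", default=None,
                   help="repeatable, e.g. --dtype float16 --dtype float32")
    p.add_argument("--metric", action="append", default=None,
                   help="repeatable, e.g. latency, speedup, tflops")
    return p

args = build_parser().parse_args(["--level", "core", "--warmup", "10"])
```

Unspecified options fall back to their defaults, so the parsed `args` above carries `iter=100` and `record="none"` alongside the overridden `level` and `warmup`.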

2. Structural Design Adjustments

This section outlines several structural design adjustments:

  • Added the BenchmarkMetrics abstraction

    • Represents the benchmark information to be recorded for specific operations at specific sizes and data types.
  • Added the BenchmarkResult abstraction

    • Represents all test results for a specific operation on specific hardware and at a specified benchmark level.
  • Adjusted the design of the Benchmark structure

    • Changed the per-operator Function-level benchmark to a Class-level benchmark for a category of operators, facilitating unified configuration of default benchmark parameters and allowing for inheritance and overrides.
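The two abstractions might look roughly like the dataclasses below. The field names are illustrative guesses based on the descriptions above, not the PR's actual definitions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class BenchmarkMetrics:
    # One record per (shape, dtype) measurement of a specific operation.
    shape: Tuple[int, ...]
    dtype: str
    latency: float = 0.0
    speedup: float = 0.0
    tflops: float = 0.0

@dataclass
class BenchmarkResult:
    # All measurements for one operator on one device at one benchmark level.
    op_name: str
    device: str
    level: str = "comprehensive"
    metrics: List[BenchmarkMetrics] = field(default_factory=list)

result = BenchmarkResult(op_name="mm", device="cuda")
result.metrics.append(
    BenchmarkMetrics(shape=(495, 5333, 71), dtype="float16", latency=0.12)
)
```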

3. Improvements to Test Data

  • The previous testing data was based on a specific batch, optional size list, and optional dtype list for combinatorial testing. This approach was somewhat limited in expression. It has now been changed to a more abstract input_generator, which provides a default input generator. Special input scenarios can directly override the corresponding generator.
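The input_generator idea can be sketched as follows: the class-level benchmark supplies a default generator, and a subclass overrides it for a special input scenario. All names here are illustrative, not the PR's API.

```python
class Benchmark:
    # Default shape list; real benchmarks would configure this per category.
    shapes = [(1, 32), (160, 1024), (5333, 71)]

    def input_generator(self, dtype):
        # Default: yield one argument tuple per shape.
        for shape in self.shapes:
            yield (shape, dtype)

class ConstInputBenchmark(Benchmark):
    # Special scenario: override the generator to add a fixed scalar argument.
    def input_generator(self, dtype):
        for shape in self.shapes:
            yield (shape, dtype, 3.14)

cases = list(ConstInputBenchmark().input_generator("float32"))
```

Compared with cross-producting a batch, a size list, and a dtype list, an overridable generator lets one benchmark class express arbitrary input construction logic.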

@tongxin (Contributor) left a comment:
It's a nice rewrite and enhancement to current benchmarking tools!

Comment on lines 8 to 14
class ReadOnly:
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value
Contributor:
Why do we need this abstraction?

Collaborator Author:
This structure is intended to protect data from unexpected changes. Since Python passes objects by reference rather than by value, module-level data like this is susceptible to unintended modification.
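For illustration, the guard works because the property defines no setter, so reassignment through `value` raises. (Note the wrapped list itself is still mutable; wrapping a tuple would close that gap too. The op list below is a shortened example, not the PR's exact contents.)

```python
class ReadOnly:
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value

BLAS_OPS = ReadOnly(["addmm", "mv", "mm", "outer"])

try:
    BLAS_OPS.value = []  # no setter defined, so this raises AttributeError
except AttributeError:
    blocked = True
```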


BLAS_OPS = ReadOnly(["addmm", "mv", "mm", "outer"])

DEFAULT_WARMUP_COUNT = 100
Contributor:
My experience is that we need a much longer warmup to warrant a stable perf result. I suggest flipping the warmup and repeat values.

Collaborator Author:
ok.

# BLAS case
# BLAS shapes are defined by (B, M, N, K), unlike the non-BLAS shapes
DEFAULT_BLAS_BENCH_SHAPES = [(1, 1, 1, 32), (4, 15, 160, 1024), (16, 495, 5333, 71)]
DEFAULT_BLAS_WITHOUT_BATCH_BENCH_SHAPES = [(1, 1, 32), (15, 160, 1024), (495, 5333, 71)]
Contributor:
What about BLAS_DEFAULT_BMNK, BLAS_DEFAULT_MNK?
Also, the larger sizes are rather small.


@dataclass
class BenckmarkMatrics:
    # the simple version shape info, this shape setted here just to with the last version.
Contributor:
Sorry I'm a little fussy here. The past tense of set is set still.

@Bowen12992 (Collaborator) left a comment:
Great job! How about adding some docs to CONTRIBUTING.md?

fn = lambda: op(*args, **kwargs)
if self.is_backward:
    out = fn()
    dout = torch.randn_like(out)
    fn = lambda: out.backward(dout, retain_graph=True)
if Config.cpu_mode:
    for i in range(WARMUP):
Collaborator:
Could we design a new mode that outputs both CPU latency and GPU latency? @tianxiao-baai
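Such a combined mode might time the host side with a wall clock and the device side with CUDA events. The sketch below is a hypothetical illustration (all names are invented here); the GPU half is shown only as a comment since it needs a CUDA device.

```python
import time

def bench_cpu_latency(fn, warmup=10, iters=100):
    # Host-side wall-clock latency per call, in milliseconds.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

# The device-side number would come from CUDA events instead, roughly:
#   start_evt = torch.cuda.Event(enable_timing=True)
#   end_evt = torch.cuda.Event(enable_timing=True)
#   start_evt.record(); fn(); end_evt.record()
#   torch.cuda.synchronize()
#   gpu_ms = start_evt.elapsed_time(end_evt)

cpu_ms = bench_cpu_latency(lambda: sum(range(1000)))
```

Reporting both numbers side by side would expose launch overhead (host time minus device time) as well as kernel time.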

@kiddyjinjin kiddyjinjin merged commit 4e6cb3b into FlagOpen:master Oct 30, 2024
4 checks passed
machuanjiang pushed a commit that referenced this pull request Nov 15, 2024
* benchmark fix

* add seven new testing parameters

* move shapes info to yaml file

* Added the BenchmarkMetrics & BenchmarkResult abstraction