
Benchmark synthesis #24

Open · 10 tasks
mcopik opened this issue Jan 21, 2020 · 19 comments

@mcopik (Collaborator) commented Jan 21, 2020

We need the following:

  • Python
    • computation in FLOPs/instructions
    • memory allocation
    • storage read/write
    • disk read/write
  • NodeJS
    • computation in FLOPs/instructions
    • memory allocation
    • storage read/write
    • disk read/write
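
To make the computation item concrete, here is a minimal sketch of what such a Python component could look like (illustrative only; the function name and interface are made up):

```python
# Hypothetical sketch: a CPU component measured in FLOPs, using matrix-matrix
# multiplication because its FLOP count is known analytically (~2*n^3).
import time
import numpy as np

def compute(size: int = 512, reps: int = 10) -> dict:
    a = np.random.rand(size, size)
    b = np.random.rand(size, size)
    start = time.perf_counter()
    for _ in range(reps):
        np.dot(a, b)
    elapsed = time.perf_counter() - start
    flop = 2 * size ** 3 * reps
    return {"elapsed_s": elapsed, "gflops": flop / elapsed / 1e9}
```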
@mcopik (Collaborator, Author) commented Feb 16, 2023

We made progress on this issue on branch meta-benchmarks and in PR #59. However, there is still work to be done - any input and help towards synthesizing benchmarks are welcome!

@veenaamb commented:

I will research this and update here shortly.

@AtulRajput01 commented:

I am working on it.

@octonawish-akcodes (Collaborator) commented:

@mcopik Can I get some guidance on this issue?

@mcopik (Collaborator, Author) commented Mar 12, 2024

@octonawish-akcodes Hi! The overall idea is to synthetically create Python/JS functions that perform CPU computations, memory accesses, and I/O accesses. Given a simple configuration, it should generate a function that performs selected actions with a specified frequency and intensity, e.g., calling some well-established CPU benchmark (like matrix-matrix multiplication), using our interface to make storage calls, etc.

The next step will be to make these functions more varied, e.g., with different loop complexity.
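
To illustrate the configuration idea, a rough sketch of what such a config and runner could look like (all field names are hypothetical):

```python
# Hypothetical sketch: a config lists the actions to perform, each with a
# frequency ("repeat") and intensity; the runner executes them in order.
config = {
    "actions": [
        {"type": "cpu", "intensity": 512, "repeat": 10},   # e.g., matmul size
        {"type": "memory", "intensity": 64, "repeat": 5},  # e.g., MB to allocate
        {"type": "storage", "intensity": 8, "repeat": 2},  # e.g., MB per object
    ]
}

def run(config: dict, components: dict) -> None:
    # `components` maps an action type to a callable benchmark component.
    for action in config["actions"]:
        for _ in range(action["repeat"]):
            components[action["type"]](action["intensity"])
```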

@octonawish-akcodes (Collaborator) commented:

Can you also provide me with some resources and target files to get started?

@mcopik (Collaborator, Author) commented Mar 12, 2024

@octonawish-akcodes I'd look into what can be reused from the prior PR: https://github.com/spcl/serverless-benchmarks/pull/59/files

I wouldn't try to merge new updates into it as it's quite difficult. Instead, I'd cherry-pick some of the files you find useful.

@MinhThieu145 commented:

Hello @mcopik,

Thank you for outlining the specific benchmarks you're interested in: computation in FLOPS/instructions, memory allocation, storage read/write, and disk read/write. I've reviewed our current benchmark suite, and here is what I found:

  • Computation: Our workload/python/function.py benchmark measures computational performance by executing arithmetic operations on numpy arrays.

  • Memory Allocation: The memory/python/function.py benchmark is designed to evaluate memory allocation performance by timing numpy array allocations.

  • Storage Read/Write: The storage/python/function.py benchmark assesses storage operation speeds, focusing on read/write performance.

  • Disk Read/Write: While we don't have a direct benchmark for disk I/O, the disc/python/function.py script performs read/write operations with numpy arrays to and from disk, which might be useful for your disk I/O performance analysis.

Could you please provide more details on the specific improvements or additional metrics you're looking to incorporate? I currently have a few ideas, but I would appreciate any additional sources I can look into.

@mcopik (Collaborator, Author) commented Mar 19, 2024

@MinhThieu145 I think the best way forward would be to add a generator that accepts a simple config - CPU ops, memory ops, storage ops - and synthesizes a single Python function out of the components you just described. Do you think it's feasible?

I'd like to hear about other ideas you might have here :)
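
As a very rough illustration of the generator idea (purely hypothetical; the templates would call the existing benchmark components):

```python
# Hypothetical sketch: synthesize the source code of a single handler
# function from a config by gluing together calls to existing components.
TEMPLATES = {
    "cpu": "    compute(size={intensity})",
    "memory": "    allocate(mb={intensity})",
    "storage": "    storage_io(mb={intensity})",
}

def synthesize(config: dict) -> str:
    lines = ["def handler(event):"]
    for action in config["actions"]:
        lines.append(TEMPLATES[action["type"]].format(**action))
    lines.append("    return {}")
    return "\n".join(lines)

print(synthesize({"actions": [{"type": "cpu", "intensity": 512},
                              {"type": "storage", "intensity": 8}]}))
```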

@octonawish-akcodes (Collaborator) commented:

@mcopik So did you mean creating simple functions that perform the operations you listed? Can't we reuse the functions proposed in PR #59?

@mcopik (Collaborator, Author) commented Mar 20, 2024

@octonawish-akcodes Yes, please feel free to reuse the code snippets.

@octonawish-akcodes @MinhThieu145 Since you are both interested in the issue, it might be beneficial to coordinate.

@MinhThieu145 commented:

Hi @mcopik,

I'm leaning towards writing functions that are similar to the current, pre-written ones, rather than creating a dynamic generator. From my POV:

  • The current functions are reliable and give consistent results, which is crucial for benchmarks.
  • They are easier to implement, since we already have similar functions.
  • We can then add customizability through parameters: this lets us easily adjust a function's behavior, like changing the number of loops or the amount of data it handles, without rewriting it from scratch.

With the pre-written functions, here are some things that could be made dynamic and that I think would be helpful (a small sketch follows the list):

  • Loop Control: Introduce parameters to adjust the number of loops in a function, helping us test different levels of computational intensity.
  • Data Size Adjustment: Add parameters to change the size or type of data the functions work with, allowing us to test memory usage more effectively.
  • I/O Intensity: Implement parameters to vary the intensity of input/output operations, giving us a better view of storage and disk performance.
  • Combination Operations: Develop functions that can perform a mix of CPU, memory, and I/O operations, mirroring real-world application scenarios.

This way, the pre-written functions become more dynamic. Looking forward to your feedback and any further ideas.
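
As a concrete (hypothetical) example of this parameterization, an allocation benchmark where loop count and data size are arguments rather than constants:

```python
# Hypothetical sketch: loop count and array size exposed as parameters.
import numpy as np

def memory_benchmark(loops: int = 100, array_mb: int = 16) -> float:
    elements = array_mb * 1024 * 1024 // 8  # number of float64 elements
    checksum = 0.0
    for _ in range(loops):
        arr = np.ones(elements)  # allocates and touches the memory
        checksum += arr[::4096].sum()
    return checksum
```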

@MinhThieu145 commented:

Hi @octonawish-akcodes,

I totally agree with the idea of using the functions we already have. Right now, I'm trying out some different kinds of functions for the new serverless-benchmark issue mentioned here: SEBS New Serverless Benchmarks. But really, the main idea is the same as before.

I'm all for making the most of what we've got and seeing how we can adapt those functions to fit our new needs. Let's keep in touch about how the testing goes!

@mcopik (Collaborator, Author) commented Mar 20, 2024

@MinhThieu145 Yes, we should reuse those functions. What I meant by the generator is that we should glue together the functions that already exist in the PR, and synthesize functions that combine different behaviors, e.g., a function that does compute, then some I/O accesses, etc.

It should be reproducible - if a user specifies the same config, they should receive exactly the same function and observe the same behavior :)
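
One way to get that reproducibility (a sketch of my own, assuming we hash the config into an RNG seed):

```python
# Hypothetical sketch: derive every random choice from a seed computed from
# the config, so identical configs always synthesize identical functions.
import hashlib
import json
import random

def rng_for(config: dict) -> random.Random:
    canonical = json.dumps(config, sort_keys=True).encode()
    seed = int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big")
    return random.Random(seed)

rng = rng_for({"cpu_ops": 3, "io_ops": 2})
order = [rng.choice(["cpu", "memory", "io"]) for _ in range(5)]
# `order` is the same on every run for the same config
```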

@MinhThieu145 commented:

Thank you for your input, @mcopik. I've been exploring how functions work together and found a really helpful paper, ServerlessBench: ServerlessBench Paper

This paper dives deep into how serverless functions interact, which is just what we need for our project. Based on this and our existing setup, here's what I'm thinking:

Improving Our Benchmarks

We have four experiments in our toolkit right now: Current Experiments
But they don't fully cover how functions flow and work with each other. The ServerlessBench paper suggests focusing on areas like:

  • Communication Performance: This is about how well functions talk to each other and to other services, which is key for complex applications.
  • Startup Latency: Since serverless functions start on-demand, it's important to know how quickly they get going, especially when many functions start at once.
  • Stateless Execution: This looks at how the lack of saved state affects data sharing and performance.
  • Resource Efficiency and Isolation: It's crucial to use resources wisely and ensure that different functions or workloads don't interfere with each other.

These areas could really enhance how we measure and understand our benchmarks.

Bringing in Function Flows

ServerlessBench outlines two ways to orchestrate functions (a toy sketch follows this list):

  • Nested Function Chain: This is similar to what I've done with AWS Step Functions, where one function's output directly influences the next.
  • Sequence Function Chain: This could add a fresh perspective, allowing functions to operate in order, but without depending directly on each other.
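
A toy sketch of the difference, in plain Python (`invoke` stands in for a real remote invocation through a platform SDK; all names are hypothetical):

```python
# Toy sketch of the two chaining styles.
def invoke(fn, payload):
    return fn(payload)  # placeholder for invoking another serverless function

def resize(payload):
    return dict(payload, resized=True)

def compress(payload):
    return dict(payload, compressed=True)

# Nested function chain: the first function invokes the next one itself,
# so it stays alive (and billed) while waiting for the callee.
def nested(payload):
    return invoke(compress, resize(payload))

# Sequence function chain: an external orchestrator runs the functions in
# order; neither function knows about the other.
def sequence(payload):
    out = invoke(resize, payload)
    return invoke(compress, out)
```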

Ideas for Our Benchmarks

  • Thumbnail and Compression Workflow: We could start with creating a thumbnail (using the Thumbnailer benchmark) and then compress it (using the Compression benchmark). This mirrors a common process in handling media files.
  • Dynamic HTML and Uploader Workflow: First, generate HTML content using the 110.dynamic-html benchmark, and then upload it using the 120.uploader. This simulates creating and storing web content.

Thinking About AWS Tools

  • AWS Step Functions: It's a powerful tool for managing function flows but adds complexity. It's worth a deeper look to see how it might fit into our benchmarks.
  • ECR and Docker Containers: Using ECR could help with large benchmarks like 411.image-recognition. We need to balance this with the need to manage containers in ECR carefully to avoid extra costs. Maybe using AWS CDK could help automate this, setting up and removing resources as needed.

I'm actively developing these concepts and would greatly value your insights, particularly regarding the use of Step Functions and ECR. If you have any additional resources or suggestions, please feel free to share. I'm eager to hear your perspective and incorporate your feedback into our ongoing work.

@octonawish-akcodes (Collaborator) commented:

@mcopik I raised PR #194 here, please have a look.

@mcopik (Collaborator, Author) commented Mar 22, 2024

@MinhThieu145 Thanks - yes, I know the paper, and it complements our Middleware paper in some aspects.

We already have communication performance benchmarks (unmerged branch using FMI benchmarks), and the invocation-overhead benchmark covers startup latency. Regarding stateless execution and resource efficiency, I'm happy to hear proposals in these areas.

Workflows - we have a branch with results from a paper in submission, and I hope we will be able to merge it soon :) It supports Step Functions, Durable Functions, and Google Cloud Workflows. I don't think we have a workflow covering typical website use cases, but adding something like this could be a good idea; there are also similar ideas for website-based workflows in #140.

ECR and containers - this is a feature we definitely need, but we should also support it on other platforms where possible (Azure also supports this).

@entiolliko commented:

@mcopik So to have a small recap, we need the following types of computation:

  1. CPU Computation - Functions that make heavy use of the CPU, e.g., MMM with specified sizes.
  2. GPU Computation - We could use ML training or Torch tensor multiplication on the GPU. The config file could specify the model to be used or the size of the tensors to be multiplied.
  3. Memory Allocation - The memory/python/function.py benchmark covers this by allocating numpy arrays.
  4. Disk Read/Write - We could dynamically generate some random text, write it to disk, then read it again and measure the speed.

Since most of these are already implemented, we could add support for a config that lets you select, for example, how many loops you want for the MMM, or provides more fine-grained control. Any suggestions?

As for your suggestion to have a single config file: when you say CPU ops, do you mean the number of FLOPs we perform, by memory ops the amount of data we store and use in RAM, and by storage ops the number of bytes we read and write from disk?
For example, given a specific config file, should we generate a Python script that has 3 function calls inside, one for CPU ops, one for memory ops, and one for storage ops?
Example:
fun1(input1)  # CPU intensive
fun2(input2)  # memory intensive
fun3(input3)  # disk intensive
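
For point 4 above, a rough sketch of the disk read/write idea (hypothetical code: write random bytes to a temporary file, then read them back, timing both directions):

```python
# Hypothetical sketch: measure sequential disk write and read bandwidth.
import os
import tempfile
import time

def disk_benchmark(size_mb: int = 64) -> dict:
    data = os.urandom(size_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data actually hits disk
        write_s = time.perf_counter() - t0
        t0 = time.perf_counter()
        with open(path, "rb") as f:
            f.read()
        read_s = time.perf_counter() - t0
    finally:
        os.unlink(path)
    return {"write_MBps": size_mb / write_s, "read_MBps": size_mb / read_s}
```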

@mcopik (Collaborator, Author) commented Mar 27, 2024

@entiolliko @octonawish-akcodes @MinhThieu145 Linear algebra as a replacement for CPU is a good idea; we can use LAPACK for that. It can be quite flexible. I'd put GPU as the next feature, which is a different category.

Yes, that is what I mean. I think the ideal result would be a single serverless function that executes the specified configuration of computations (which can, of course, be composed of many local functions).
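
For reference, a small sketch of what a LAPACK-backed CPU component could look like through SciPy's bindings (illustrative only):

```python
# Hypothetical sketch: dense linear solve via LAPACK's dgesv
# (LU factorization + solve, roughly (2/3)*n^3 FLOPs).
import numpy as np
from scipy.linalg import lapack

def cpu_lapack(n: int = 1024) -> np.ndarray:
    a = np.random.rand(n, n)
    b = np.random.rand(n)
    lu, piv, x, info = lapack.dgesv(a, b)
    assert info == 0, "dgesv failed"
    return x
```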
