
[Rust] Implement micro benchmarks for each operator #94

Closed · alamb opened this issue Apr 26, 2021 · 13 comments

Labels: datafusion (Changes in the datafusion crate), good first issue (Good for newcomers), help wanted (Extra attention is needed)

alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-9551

We should implement criterion microbenchmarks for each operator so that we can test the impact of code changes on performance and catch regressions.
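
A minimal sketch of what one such criterion bench could look like, using an in-memory table so the measured loop exercises the operator rather than I/O. The table name, data, and query are illustrative, and the SessionContext/MemTable APIs shown are from recent DataFusion and may differ across versions:

```rust
// benches/filter.rs -- hypothetical file; the bench target needs
// `harness = false` in Cargo.toml so criterion supplies its own main().
use std::sync::Arc;

use arrow::array::Int32Array;
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use criterion::{criterion_group, criterion_main, Criterion};
use datafusion::datasource::MemTable;
use datafusion::prelude::*;
use tokio::runtime::Runtime;

fn bench_filter(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    let ctx = SessionContext::new();

    // Register an in-memory table once, outside the measured loop, so
    // the benchmark measures the operator rather than data loading.
    let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(Int32Array::from_iter_values(0..8192))],
    )
    .unwrap();
    let table = MemTable::try_new(schema, vec![vec![batch]]).unwrap();
    ctx.register_table("t", Arc::new(table)).unwrap();

    c.bench_function("filter a > 100", |b| {
        b.iter(|| {
            rt.block_on(async {
                ctx.sql("SELECT * FROM t WHERE a > 100")
                    .await
                    .unwrap()
                    .collect()
                    .await
                    .unwrap()
            })
        })
    });
}

criterion_group!(benches, bench_filter);
criterion_main!(benches);
```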

alamb added the datafusion label on Apr 26, 2021

alamb commented Apr 26, 2021

Comment from Andrew Lamb (alamb) @ 2021-04-26T12:32:40.877+0000:

Migrated to github: https://github.com/apache/arrow-rs/issues/89

houqp added the good first issue and help wanted labels on Sep 15, 2021

OscarTHZhang commented

Hi, I'd like to explore this ticket, but I wonder how and where the benchmarks should be run, and what test workload each operator should be run against?

alamb commented Jun 26, 2022

Thanks @OscarTHZhang

I think part of this ticket would be to define a reasonable test workload

Here are some examples of benches that might serve as inspiration:

Maybe the first thing to do is to take stock of the current coverage and propose some additions?

OscarTHZhang commented Aug 8, 2022

Hi @alamb,

Here are some questions on my mind:

  • At what granularity should a benchmark operate?
  • For aggregates, for example, do we also need to implement micro benchmarks for all the aggregate functions and all the physical aggregate expressions (like correlation)?

I think we can divide the micro-benches into two types (as described above):

  • Single Operator bench
  • Targeted SQL

For the aggregations, if we are going to cover them all, we can simply write targeted SQL benchmarks.
For operators that operate at column and table granularity, with output still as columns and tables, we can set up single-operator benches, such as merge, join, and filter.

How does this sound? Anything missing?
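
A minimal sketch of the "single operator bench" type described above, driving a FilterExec directly over in-memory batches with no SQL layer. Module paths and constructor signatures (MemoryExec, FilterExec, the binary/col/lit helpers) shift between DataFusion releases, so treat this as an assumption-laden outline rather than the project's actual benchmark code:

```rust
use std::sync::Arc;

use arrow::array::Int32Array;
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use criterion::{criterion_group, criterion_main, Criterion};
use datafusion::logical_expr::Operator;
use datafusion::physical_plan::collect;
use datafusion::physical_plan::expressions::{binary, col, lit};
use datafusion::physical_plan::filter::FilterExec;
use datafusion::physical_plan::memory::MemoryExec;
use datafusion::prelude::SessionContext;
use tokio::runtime::Runtime;

fn bench_filter_exec(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();

    // One 8k-row Int32 batch as input; a real bench would vary sizes.
    let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![Arc::new(Int32Array::from_iter_values(0..8192))],
    )
    .unwrap();

    // Physical predicate `a > 100`, built without going through SQL.
    let predicate =
        binary(col("a", &schema).unwrap(), Operator::Gt, lit(100i32), &schema).unwrap();
    let task_ctx = SessionContext::new().task_ctx();

    c.bench_function("FilterExec a > 100", |b| {
        b.iter(|| {
            let input = Arc::new(
                MemoryExec::try_new(&[vec![batch.clone()]], schema.clone(), None).unwrap(),
            );
            let plan = Arc::new(FilterExec::try_new(predicate.clone(), input).unwrap());
            // collect() drives the operator to completion and returns batches.
            rt.block_on(collect(plan, task_ctx.clone())).unwrap()
        })
    });
}

criterion_group!(benches, bench_filter_exec);
criterion_main!(benches);
```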

alamb commented Aug 9, 2022

Hi @OscarTHZhang Thanks for commenting on this ticket.

I think we can divide the micro-bench into 2 types (as described above)

I think the core goal for the ticket is to ensure the vast majority of the time is spent doing the operation rather than reading data.

It might make sense to go through existing benchmarks and try to see what coverage we already have

End to end benchmarks: https://github.com/apache/arrow-datafusion/tree/master/benchmarks

more micro level benchmarks:
https://github.com/apache/arrow-datafusion/tree/master/datafusion/core/benches

There are already some benchmarks that appear to be Targeted SQL that you describe, for example https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/benches/sql_planner.rs and https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/benches/aggregate_query_sql.rs

There are also some benchmarks for operators that are used as part of other operations, such as https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/benches/merge.rs

spencerwilson commented

Not sure how strong the suggestion of using Criterion was, but I recently discovered Divan. It may be worth evaluating.

(I have no affiliation; am just an aspiring OSS contributor browsing the good-first-issues 🙈)
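
For comparison with criterion, a Divan bench is declared with an attribute macro rather than group/main macros; a minimal sketch (the benched function body is a stand-in workload, not DataFusion code):

```rust
// Cargo.toml still needs `harness = false` for the bench target.
fn main() {
    divan::main();
}

// Divan discovers functions tagged with this attribute.
#[divan::bench]
fn sum_1000() -> u64 {
    // Stand-in workload; a real bench would exercise a DataFusion operator.
    divan::black_box((0..1000u64).sum::<u64>())
}
```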

spencerwilson commented

https://github.com/bheisler/iai could be a good fit for benchmarking those ExecutionPlan implementations that do little or no I/O. It reports not wall-clock durations but exact counts or estimates of low-level metrics:

bench_fibonacci_short
  Instructions:                1735
  L1 Accesses:                 2364
  L2 Accesses:                    1
  RAM Accesses:                   1
  Estimated Cycles:            2404

I’m not sure if there are any caveats around using it to measure async-style Rust code, though.
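
For reference, the shape of an iai benchmark, mirroring the fibonacci example from iai's README that produced the numbers above:

```rust
use iai::black_box;

fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

// Each benchmark is a plain function; iai runs it under Cachegrind and
// reports instruction and cache-access counts instead of wall time.
fn bench_fibonacci_short() -> u64 {
    fibonacci(black_box(10))
}

iai::main!(bench_fibonacci_short);
```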

edmondop commented

It might make sense to go through existing benchmarks and try to see what coverage we already have …

@alamb the way this issue title is phrased, it seems the right way to address it is to extend the benchmarks you shared here (datafusion/core/benches) as micro-benchmarks. Is that correct?

mnorfolk03 commented

@alamb Is this issue still something that would be welcome? If so, I'd like to take a shot at it as my first issue.

I think I could start by implementing some microbenchmarks for the physical plan operators, such as filter, limit, union, and the different types of joins -- I didn't see any in the repo, although I may have missed them.

Let me know your thoughts, thanks!

alamb commented Oct 15, 2024

Hi @mnorfolk03 👋 -- thanks.

I think since this ticket was filed, we have moved more into "end to end" type benchmarks like in https://github.com/apache/datafusion/tree/main/benchmarks

I think joins are an area where we don't really have any great benchmarks -- we only have the TPC-H queries.

The art of writing benchmarks is choosing what to benchmark, I think, so it is often a bit hard to choose.

Perhaps you could start by creating a benchmark for physical planning (aka the process of creating the final optimized ExecutionPlan), which is not an area where we have a lot of coverage.

You could perhaps use the report on #12738 to create a planning benchmark in https://github.com/apache/datafusion/blob/main/datafusion/core/benches/sql_planner.rs ?
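
A rough sketch of what such a planning-only benchmark could look like, assuming DataFrame::create_physical_plan is used to stop before execution; the query and table setup are placeholders:

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use datafusion::prelude::*;
use tokio::runtime::Runtime;

// ctx.sql() parses, plans, and optimizes the logical plan;
// create_physical_plan() then runs physical planning/optimization.
// Nothing is executed, so only planning time is measured.
fn plan(ctx: &SessionContext, rt: &Runtime, sql: &str) {
    rt.block_on(async {
        ctx.sql(sql)
            .await
            .unwrap()
            .create_physical_plan()
            .await
            .unwrap();
    });
}

fn bench_planning(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    let ctx = SessionContext::new();
    // Table registration omitted; the query assumes a table `t` has
    // been registered (e.g. via ctx.register_table).
    c.bench_function("plan group-by", |b| {
        b.iter(|| plan(&ctx, &rt, "SELECT a, count(*) FROM t GROUP BY a"))
    });
}

criterion_group!(benches, bench_planning);
criterion_main!(benches);
```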

mnorfolk03 commented

Perhaps you could start with creating a benchmark for physical planning (aka the process of creating the final optimized ExecutionPlan) …

Thanks I'll look into it and start working on it!

alamb commented Oct 18, 2024

@askalt may have added some in #12950 -- maybe you can review the benchmarks there and see if there are others worth adding

alamb commented Oct 24, 2024

Given the lack of specificity on this ticket (it tracks a basic idea rather than any particular project, I think), I'll claim it is done for the moment.

I think a better approach is to add microbenchmarks for operators we are planning to improve.

alamb closed this as completed on Oct 24, 2024