@Kontinuation Kontinuation commented Feb 13, 2025

Which issue does this PR close?

Rationale for this change

I needed to run sort-tpch queries under a memory limit while testing fixes for memory-related issues, so I decided to add a --memory-limit option to most of the benchmarking CLI tools. I hope other developers find it handy.

What changes are included in this PR?

This PR adds three CLI options, --memory-limit, --mem-pool-type, and --sort-spill-reservation-bytes, to the following benchmarking tools:

  • dfbench subcommands: sort, sort-tpch, clickbench, h2o, imdb, parquet-filter
  • tpch
  • imdb

external_aggr already supported --memory-limit; it now also accepts --mem-pool-type. The default value of --mem-pool-type is fair, so existing behavior remains unchanged.
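
To illustrate how the three options fit together, here is a minimal, self-contained sketch of parsing them with only the Rust standard library. It is not the code from this PR: the names `MemoryOptions`, `MemPoolType`, `parse_memory_limit`, and `parse_memory_options` are hypothetical, the `K`/`M`/`G` suffixes for `--memory-limit` and the `greedy` pool value are assumptions, and only the `fair` default comes from the description above.

```rust
/// Pool flavours selectable via `--mem-pool-type`.
/// `Fair` is the documented default; `Greedy` is an assumed alternative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemPoolType {
    Fair,
    Greedy,
}

/// Hypothetical container for the three options discussed in this PR.
#[derive(Debug, PartialEq, Eq)]
struct MemoryOptions {
    memory_limit: Option<usize>,
    mem_pool_type: MemPoolType,
    sort_spill_reservation_bytes: Option<usize>,
}

/// Parse a limit such as "4096", "512M" or "2G" into bytes.
/// The K/M/G suffixes (powers of 1024) are an assumption of this sketch.
fn parse_memory_limit(text: &str) -> Result<usize, String> {
    let text = text.trim();
    let (digits, multiplier) = match text.chars().last() {
        Some('K') => (&text[..text.len() - 1], 1024usize),
        Some('M') => (&text[..text.len() - 1], 1024 * 1024),
        Some('G') => (&text[..text.len() - 1], 1024 * 1024 * 1024),
        Some(_) => (text, 1),
        None => return Err("empty memory limit".to_string()),
    };
    let value: usize = digits
        .parse()
        .map_err(|e| format!("invalid memory limit '{}': {}", text, e))?;
    value
        .checked_mul(multiplier)
        .ok_or_else(|| format!("memory limit '{}' overflows usize", text))
}

/// Map the textual pool name to the enum, rejecting unknown names.
fn parse_pool_type(text: &str) -> Result<MemPoolType, String> {
    match text {
        "fair" => Ok(MemPoolType::Fair),
        "greedy" => Ok(MemPoolType::Greedy),
        other => Err(format!("unknown memory pool type '{}'", other)),
    }
}

/// Parse `--flag value` pairs; options that are not given keep their defaults.
fn parse_memory_options(args: &[&str]) -> Result<MemoryOptions, String> {
    let mut options = MemoryOptions {
        memory_limit: None,
        mem_pool_type: MemPoolType::Fair, // default stays `fair`
        sort_spill_reservation_bytes: None,
    };
    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        match *flag {
            "--memory-limit" => options.memory_limit = Some(parse_memory_limit(value)?),
            "--mem-pool-type" => options.mem_pool_type = parse_pool_type(value)?,
            "--sort-spill-reservation-bytes" => {
                let bytes = value
                    .parse::<usize>()
                    .map_err(|e| format!("invalid reservation '{}': {}", value, e))?;
                options.sort_spill_reservation_bytes = Some(bytes);
            }
            other => return Err(format!("unknown option {}", other)),
        }
    }
    Ok(options)
}

fn main() {
    // Only a limit and a spill reservation are given; the pool type stays `Fair`.
    let args = ["--memory-limit", "2G", "--sort-spill-reservation-bytes", "1048576"];
    let options = parse_memory_options(&args).expect("valid options");
    println!("{:?}", options);
}
```

Because the pool type defaults to `Fair` in this sketch, passing only `--memory-limit` leaves the pool choice untouched, mirroring the "behavior remains unchanged" note above. Run each tool with --help to see the value formats it really accepts.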

Are these changes tested?

The changes were tested manually.

Are there any user-facing changes?

No. The benchmarking guide does not cover every option, so hopefully developers can find these options themselves using --help.

@Kontinuation Kontinuation changed the title Add support --mem-pool-type and --memory-limit options for all benchmarks feat: Add support --mem-pool-type and --memory-limit options for all benchmarks Feb 13, 2025
@Kontinuation Kontinuation changed the title feat: Add support --mem-pool-type and --memory-limit options for all benchmarks feat: Add --mem-pool-type and --memory-limit options to multiple benchmarks Feb 13, 2025
@Kontinuation Kontinuation changed the title feat: Add --mem-pool-type and --memory-limit options to multiple benchmarks feat: Add support for --mem-pool-type and --memory-limit options to multiple benchmarks Feb 13, 2025
@Kontinuation Kontinuation marked this pull request as ready for review February 13, 2025 12:46
@Kontinuation
Member Author

sort_spill_reservation_bytes is also an important configuration to tune for benchmarks involving sorts, so I think we may also want to add it to benchmarking tools.

Contributor

@alamb alamb left a comment

This makes sense to me -- thank you @Kontinuation

Contributor

@2010YOUY01 2010YOUY01 left a comment

I've tested some queries and it's working well, thank you!

@alamb
Contributor

alamb commented Feb 14, 2025

Thanks again @Kontinuation and @2010YOUY01

@alamb
Contributor

alamb commented Feb 14, 2025

This PR is merged, but for some reason the GitHub UI is not showing it:

@alamb alamb merged commit c1338b7 into apache:main Feb 14, 2025
25 checks passed
jonahgao pushed a commit to jonahgao/datafusion that referenced this pull request Feb 14, 2025
…ultiple benchmarks (apache#14642)

* Add support --mem-pool-type and --memory-limit options for all benchmarks

* Add --sort-spill-reservation-bytes option
acking-you pushed commits to acking-you/arrow-datafusion that referenced this pull request on Mar 27, Mar 28, Mar 29 (twice), Mar 31 (twice), and Apr 7 (twice), 2025, each with the same message:
…ultiple benchmarks (apache#14642)

* Add support --mem-pool-type and --memory-limit options for all benchmarks

* Add --sort-spill-reservation-bytes option
alamb added a commit that referenced this pull request Apr 8, 2025
* [draft] add shot circuit in BinaryExpr

* refactor: add check_short_circuit function

* refactor: change if condition to match

* feat: Add support for --mem-pool-type and --memory-limit options to multiple benchmarks (#14642)

* Add support --mem-pool-type and --memory-limit options for all benchmarks

* Add --sort-spill-reservation-bytes option

* Chore/Add additional FFI unit tests (#14802)

* Add unit tests to FFI_ExecutionPlan

* Add unit tests for FFI table source

* Add round trip tests for volatility

* Add unit tests for FFI insert op

* Simplify string generation in unit test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Fix drop of borrowed value

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Improve feature flag CI coverage `datafusion` and `datafusion-functions` (#15203)

* add extend sql & docs

* fix: incorrect false judgment

* add test

* separate q6 to new PR

* add benchmark for boolean_op

* fix cargo doc

* add binary_op bench

* Better comments

---------

Co-authored-by: Kristin Cowalcijk <bo@wherobots.com>
Co-authored-by: Tim Saucer <timsaucer@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
* [draft] add shot circuit in BinaryExpr

* refactor: add check_short_circuit function

* refactor: change if condition to match

* feat: Add support for --mem-pool-type and --memory-limit options to multiple benchmarks (apache#14642)

* Add support --mem-pool-type and --memory-limit options for all benchmarks

* Add --sort-spill-reservation-bytes option

* Chore/Add additional FFI unit tests (apache#14802)

* Add unit tests to FFI_ExecutionPlan

* Add unit tests for FFI table source

* Add round trip tests for volatility

* Add unit tests for FFI insert op

* Simplify string generation in unit test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Fix drop of borrowed value

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Improve feature flag CI coverage `datafusion` and `datafusion-functions` (apache#15203)

* add extend sql & docs

* fix: incorrect false judgment

* add test

* separate q6 to new PR

* add benchmark for boolean_op

* fix cargo doc

* add binary_op bench

* Better comments

---------

Co-authored-by: Kristin Cowalcijk <bo@wherobots.com>
Co-authored-by: Tim Saucer <timsaucer@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Development

Successfully merging this pull request may close these issues.

Support --memory-limit for all benchmarking tools