
Proposal for fixed generation of benchmark data. #124

Merged (13 commits) - Nov 9, 2021

Conversation

trexfeathers (Contributor) commented Oct 26, 2021

Have a read of benchmarks/benchmarks/generate_data.py - the docstrings detail my thinking on this.

Summary: run the data-generation scripts in a separate, dedicated environment, which therefore remains unchanged throughout the benchmark run.
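
To illustrate the idea (a sketch only, not the contents of generate_data.py; the helper name, the DATA_GEN_PYTHON path and the snippet being run are all placeholders):

import subprocess
import tempfile
from pathlib import Path
from textwrap import dedent

# Placeholder: the interpreter of the dedicated data-generation environment.
# In practice this path would be supplied by the benchmark tooling.
DATA_GEN_PYTHON = Path("/path/to/data_gen_env/bin/python")


def run_in_data_gen_env(code: str) -> None:
    """Execute `code` with the data-generation environment's interpreter."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as script:
        script.write(dedent(code))
    subprocess.run([str(DATA_GEN_PYTHON), script.name], check=True)


# The benchmarks themselves only ever load the files this produces, so
# upgrading the benchmark environment cannot silently change the input data.
run_in_data_gen_env(
    """
    from pathlib import Path

    Path("benchmark_data").mkdir(exist_ok=True)
    # ... build and save the synthetic sample files here ...
    """
)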

codecov bot commented Oct 26, 2021

Codecov Report

Merging #124 (5340ce6) into main (10820f2) will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##             main     #124   +/-   ##
=======================================
  Coverage   98.85%   98.85%           
=======================================
  Files          14       14           
  Lines         699      699           
=======================================
  Hits          691      691           
  Misses          8        8           


trexfeathers (Contributor, Author)

I've so far made minimal changes to the benchmarks themselves, but given that this PR forces sample data to be saved to file (rather than just created as a Python object), I'd be interested in @stephenworsley's thoughts on better optimisation via ASV's setup_cache.
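
(For reference, a minimal sketch of the setup_cache pattern; the class, file name and load step are hypothetical, not the benchmarks in this PR. ASV runs setup_cache once per benchmark environment and passes its return value into the timed methods, so the expensive generate-and-save step stays outside the timed code.)

from pathlib import Path


class TimeLoadSampleData:
    """Hypothetical ASV benchmark re-using a file created in setup_cache."""

    def setup_cache(self):
        # Runs once (not once per repeat); the return value is handed to the
        # timed method below as its first argument.
        path = Path("sample.nc")
        # ... call the data-generation helper here to create `path` ...
        return str(path)

    def time_load(self, cached_path):
        # Only the loading of the pre-generated file is timed.
        # ... e.g. iris.load(cached_path) ...
        pass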

jamesp (Member) commented Nov 8, 2021

Take this as a suggestion, not a review. You could refactor into either a closure or a class that wraps the python executable rather than hard coding to a global constant.

e.g.

from pathlib import Path


class PythonRunner:
    def __init__(self, python: Path):
        self.python = python

    def __call__(self, code, *args, **kwargs):
        # ... body of run_elsewhere here, using self.python ...
        ...

# in the specific data-generation code
run_elsewhere = PythonRunner(GEN_DATA_PYTHON)

or

def make_python_runner(python: Path):
    def run_elsewhere(code, *args, **kwargs):
        # ... body of run_elsewhere here, using the captured `python` ...
        ...
    return run_elsewhere

run_elsewhere = make_python_runner(GEN_DATA_PYTHON)

You get the picture.
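
(Filling in the closure variant for concreteness; the subprocess call and the way `code` reaches the interpreter are assumptions for illustration, not the PR's actual run_elsewhere implementation.)

import subprocess
from pathlib import Path


def make_python_runner(python: Path):
    """Return a runner bound to one specific Python executable."""

    def run_elsewhere(code: str, *args: str) -> None:
        # Execute `code` with the wrapped interpreter; any extra args end up
        # in the child process's sys.argv.
        subprocess.run([str(python), "-c", code, *args], check=True)

    return run_elsewhere


# Hypothetical usage; GEN_DATA_PYTHON would point at the data-gen environment.
GEN_DATA_PYTHON = Path("/path/to/data_gen_env/bin/python")
run_elsewhere = make_python_runner(GEN_DATA_PYTHON)
run_elsewhere("print('hello from the data generation environment')")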

trexfeathers (Contributor, Author)

> You could refactor into either a closure or a class that wraps the python executable rather than hard coding to a global constant.

If this needed to be more generic, sure. But that's engineering for something that I don't expect to happen: #124 (comment)

pp-mo (Member) left a comment


Generally, between us (@pp-mo, @stephenworsley, @jamesp) we think it is OK!

pp-mo merged commit fa2e34b into SciTools:main on Nov 9, 2021
stephenworsley pushed a commit to stephenworsley/iris-esmf-regrid that referenced this pull request Nov 19, 2021
trexfeathers added a commit that referenced this pull request on Nov 29, 2021, with the following commit message:
* add benchmark

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* handle cases where files don't exist

* add benchmark

* Proposal for fixed generation of benchmark data. (#124)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* post merge fixes

* refactor _gridlike_mesh

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix benchmarks

* turn on debugging

* turn on debugging

* fix data generation

* fix benchmarks

* try saving with unique address

* Synthetic file generation - re-use files and ensure uniqueness.

* try saving with unique address

* fix benchmarks

* fix nox

* refactor long benchmarks

* refactor long benchmarks

* move DATA_GEN_PYTHON setup to nox

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* avoid python name "type"

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* changes to test infrastructure

* lint fix

* complete "type" refactor

* fix benchmarks

* toggle ci benchmark debugging off

* make codecov more lenient

* parameterise load/save benchmarks

* address review comments

* lint fix

* review actions

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Martin Yeo <martin.yeo@metoffice.gov.uk>
trexfeathers deleted the benchmark_data_gen branch on March 31, 2022 at 13:47