Skip to content

Feature/asv#25

Merged
fluidnumerics-joe merged 36 commits intomainfrom
feature/asv
Dec 15, 2025
Merged

Feature/asv#25
fluidnumerics-joe merged 36 commits intomainfrom
feature/asv

Conversation

@fluidnumerics-joe
Copy link
Contributor

@fluidnumerics-joe fluidnumerics-joe commented Dec 10, 2025

This PR transitions all benchmarking to use ASV.

@fluidnumerics-joe
Copy link
Contributor Author

@VeckoTheGecko - you might give this a try. On this branch, do

pixi install
pixi shell

Once in the pixi shell, try,

asv run --python=same --quick --show-stderr --dry-run --verbose

At the moment, asv is not picking up any benchmarks and its unclear why... output shown below

$ asv run --python=same --quick --show-stderr --dry-run --verbose
· Running '/home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/.pixi/envs/default/bin/python -c import sys; print(str(sys.version_info[0]) + "." + str(sys.version_info[1]))'
  OUTPUT -------->
  3.12
· Discovering benchmarks
·· Running '/home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py discover /home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/benchmarks /tmp/tmphyvjz_3_/result.json' in existing-py_home_joe_Projects_Geomar-Utrecht_parcels-benchmarks_.pixi_envs_default_bin_python
·· Running '/home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/.pixi/envs/default/bin/python /home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/.pixi/envs/default/lib/python3.12/site-packages/asv/benchmark.py discover /home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/benchmarks /tmp/tmphyvjz_3_/result.json'
   OUTPUT -------->
   /home/joe/Projects/Geomar-Utrecht/parcels-benchmarks/benchmarks/moi_curvilinear.py:8: UserWarning: This is an alpha version of Parcels v4. The API is not stable and may change without deprecation warnings.
     import parcels
· No benchmarks selected

@fluidnumerics-joe
Copy link
Contributor Author

Right now this is set up to use all packages (including parcels) as defined by the pixi environment.

@fluidnumerics-joe
Copy link
Contributor Author

Looks like benchmarks need to be prefixed with specific names, e.g. time_ or peakmem_ so that asv picks them up.. See https://asv.readthedocs.io/en/stable/benchmarks.html

Pixi environment now only provides the necessary packages for the
benchmarking environment. asv will use rattler to install parcels and
its dependencies in another environment
…hmarks

The changes provided here allow for the
parcels_bencharks/benchmark_setup.py module to be found within asv
benchmarks. This makes it easy to handle downloading the necessary
datasets for benchmarks.

I've added a helper function in the moi curvilinear benchmarks to load
the xarray dataset; storing the dataset as an object attribute can cause
contamination between benchmarks that I want to avoid. Each benchmark
now loads a fresh dataset from disk at the beginning by calling
_load_ds(...)
@fluidnumerics-joe fluidnumerics-joe changed the title [WIP] Feature/asv Feature/asv Dec 11, 2025
@willirath
Copy link
Contributor

@willirath - I'm honestly inclined to remove the cli and just keep the data downloading in the 'setup' of each benchmark. The gist was to provide a means for folks to pre-download the data. While yes, there's no direct enforcement of the --data-home flag and the data_home in the benchmarks, if someone is using something other than the default, they'll need to modify the benchmark.

This is really a half baked idea that requires folks know what they're doing here at the moment.

What about removing the CLI but adding a DATA_DIR env var for override? Without this, I have to clean out my $HOME/.cache dir each time I run these on Levante benchmarks because DKRZ.de is really strict about $HOME quota.

@fluidnumerics-joe
Copy link
Contributor Author

@willirath - makes sense. Let me see what I could do to tighten up the connection here wrt to the data home

fluidnumerics-joe and others added 9 commits December 12, 2025 13:33
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Until we determine how we want to manage centrally storing benchmark
results, we'll keep these out of the repository for now
@fluidnumerics-joe
Copy link
Contributor Author

@VeckoTheGecko - I've removed the .asv/ subdirectory and added .asv/ to .gitignore. I figured it'd be best to discuss in a separate issue how we want to centrally manage benchmarks submitted by multiple users. IMO, having contributions from the developers and any interested members of the Parcels community would be awesome; this way we can concretely see the variation in performance across Parcels users' systems.

@fluidnumerics-joe
Copy link
Contributor Author

@willirath - I think I've addressed all of your comments. You can now use the PARCELS_DATADIR environment variable to control where the local data cache is stored. By default, it points to whatever pooch.os_cache returns with. The CLI is removed and the benchmark_setup.py script simply can be used to pre-download all data before running benchmarks, if desired.

Copy link
Contributor

@willirath willirath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor things: leftover argparse logic in main block and missing Path on env var.

fluidnumerics-joe and others added 2 commits December 13, 2025 08:45
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
Co-authored-by: Willi Rath <willirath@users.noreply.github.com>
@willirath
Copy link
Contributor

Just tested, both, with / without overriding the cache dir on our Uni Kiel HPC and on my Macbook. I think it's good to be merged.

@fluidnumerics-joe fluidnumerics-joe merged commit 905c8cb into main Dec 15, 2025
@fluidnumerics-joe fluidnumerics-joe deleted the feature/asv branch December 15, 2025 15:54
Copy link
Member

@erikvansebille erikvansebille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Nice push forward. And installation was a breeze 👍

A few comments below


pset.execute(parcels.kernels.AdvectionEE, runtime=runtime, dt=dt, verbose_progress=False)

def peakmem_pset_execute_3d(self,interpolator,chunk,npart):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The body of this function appears identical(?) to the body of the time_pset_execute_3d()? Can we reduce code duplication by organising this more smartly? Why do we need two functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are identical, but this is the way ASV works. If we want a benchmark for measuring peak memory the function has to start with peakmem_. If we want runtime, it has to start with time_

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but can't we then have these two functions that then call another function _execute() or so, so that the execute remains the same. Also important to avoid one benchmark changing but the other not?

@@ -0,0 +1 @@
# parcels_benchmarks/benchmark_utils
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentionally commented out?

__pycache__
build/
parcels/
.asv/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand @VeckoTheGecko comment to ignore the individual run outputs, but how/where are they stored now then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you run, the results are stored under .asv/ . I commented in #24 to discuss what we should retain here in version control as a follow up :)

@erikvansebille
Copy link
Member

Oops, I now only realise the PR was merged already. Well; perhaps some of my comments are useful for other rounds of enhancements?

@fluidnumerics-joe fluidnumerics-joe mentioned this pull request Dec 15, 2025
@fluidnumerics-joe
Copy link
Contributor Author

Oops, I now only realise the PR was merged already. Well; perhaps some of my comments are useful for other rounds of enhancements?

Yes indeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants