Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simpler/faster data aggregation code in aggregated_by #4970

Merged
merged 6 commits into from
May 11, 2023

Conversation

rcomer
Copy link
Member

@rcomer rcomer commented Sep 16, 2022

🚀 Pull Request

Description

The code currently used for the lazy case in aggregated_by just needs a few tweaks to also use it for the real case. So we can have fewer lines, and the McCabe complexity of the method is reduced from 27 to 20.

Edit: it turns out this also makes it faster (#4970 (comment)).


Consult Iris pull request check list

@rcomer rcomer marked this pull request as ready for review September 16, 2022 15:47
@rcomer rcomer force-pushed the simpler-aggregated_by branch from ebea66b to 8667424 Compare September 30, 2022 13:18
@rcomer
Copy link
Member Author

rcomer commented Sep 30, 2022

Ran the "cube" benchmarks locally and got

-      1.78±0.06s       1.18±0.02s     0.66  cube.Aggregation.time_aggregated_by

@rcomer rcomer changed the title Simpler aggregated_by code Simpler/faster aggregated_by code Sep 30, 2022
@rcomer rcomer changed the title Simpler/faster aggregated_by code Simpler/faster data aggregation code in aggregated_by Oct 5, 2022
@rcomer rcomer force-pushed the simpler-aggregated_by branch from e866cdf to de848f8 Compare November 23, 2022 17:47
@rcomer rcomer force-pushed the simpler-aggregated_by branch from de848f8 to 52f6430 Compare April 28, 2023 16:24
@rcomer rcomer added the benchmark_this Request that this pull request be benchmarked to check if it introduces performance shifts label Apr 28, 2023
@codecov
Copy link

codecov bot commented Apr 28, 2023

Codecov Report

Patch coverage: 95.23% and project coverage change: -0.01 ⚠️

Comparison is base (a49b5ac) 89.32% compared to head (52f6430) 89.31%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4970      +/-   ##
==========================================
- Coverage   89.32%   89.31%   -0.01%     
==========================================
  Files          89       89              
  Lines       22390    22373      -17     
  Branches     5374     5367       -7     
==========================================
- Hits        20000    19983      -17     
  Misses       1640     1640              
  Partials      750      750              
Impacted Files Coverage Δ
lib/iris/cube.py 90.45% <95.23%> (-0.12%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@rcomer rcomer removed the benchmark_this Request that this pull request be benchmarked to check if it introduces performance shifts label Apr 28, 2023
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rcomer! Well done for making the code less scary. I've put a question in the comments, and will see if I can replicate your performance findings tomorrow.

lib/iris/cube.py Show resolved Hide resolved
@trexfeathers
Copy link
Contributor

Ran the "cube" benchmarks locally and got

-      1.78±0.06s       1.18±0.02s     0.66  cube.Aggregation.time_aggregated_by

I have been able to replicate this 👍

Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'm happy to merge as-is. Shame about the ❌ on the benchmark CI, but that's unavoidable until #5289 is finished.

@trexfeathers trexfeathers merged commit c0153ae into SciTools:main May 11, 2023
@rcomer rcomer deleted the simpler-aggregated_by branch May 11, 2023 14:39
tkknight added a commit to tkknight/iris that referenced this pull request May 21, 2023
* upstream/main:
  Updated environment lockfiles (SciTools#5332)
  Support netcdf variable emulation (SciTools#5212)
  Support netCDF load+save on dataset-like objects as well as filepaths. (SciTools#5214)
  minor refinement to release do-nothing script (SciTools#5326)
  update bibtex citation for v3.6.0 release (SciTools#5324)
  Whats new updates for v3.6.0 (SciTools#5323)
  Added whatsnew notes on netcdf delayed saving and Distributed support. (SciTools#5322)
  Updated environment lockfiles (SciTools#5320)
  Bugfix 4566 (SciTools#4569)
  Simpler/faster data aggregation code in `aggregated_by` (SciTools#4970)
tkknight added a commit to tkknight/iris that referenced this pull request Nov 9, 2023
* upstream/main:
  Updated environment lockfiles (SciTools#5334)
  Iris Docs: Dark theme ready (SciTools#5299)
  Updated environment lockfiles (SciTools#5332)
  Support netcdf variable emulation (SciTools#5212)
  Support netCDF load+save on dataset-like objects as well as filepaths. (SciTools#5214)
  minor refinement to release do-nothing script (SciTools#5326)
  update bibtex citation for v3.6.0 release (SciTools#5324)
  Whats new updates for v3.6.0 (SciTools#5323)
  Added whatsnew notes on netcdf delayed saving and Distributed support. (SciTools#5322)
  Updated environment lockfiles (SciTools#5320)
  Bugfix 4566 (SciTools#4569)
  Simpler/faster data aggregation code in `aggregated_by` (SciTools#4970)
  update release_do_nothing script (SciTools#5311)
  Restore latest Whats New files.
  Updated environment lockfiles (SciTools#5308)
  release_do_nothing script updates from v3.6.0rc0 (SciTools#5306)
  Whats new updates for v3.6.0rc0 (SciTools#5300)
  Bump scitools/workflows from 2023.04.3 to 2023.05.0 (SciTools#5303)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: 🆕 New
Development

Successfully merging this pull request may close these issues.

2 participants