Enhance MODE to use OpenMP to make the convolution step faster #2724
Comments
Ran a simple test on seneca using HRRR data to compare 6-hour precip to itself.
The improvement from 1 to 2 is the result of swapping in a much more efficient looping algorithm. |
… object for each field. Still more work to do to reduce memory usage and also apply OpenMP to the ShapeData::select() function.
…() to exactly reproduce existing results. There were some subtle diffs in the handling of missing data and points off the grid.
…ikely a way to make the memory usage more efficient, but it'll require a tweak to the logic.
…ory allocation for ShapeData objects.
@DanielAdriaansen, on 11/2/23, we discussed some additional refinements to MODE with the goal of minimizing unnecessary memory allocation. Currently, the fuzzy logic engine allocates memory for one copy of the entire domain for each forecast and observation object. In your case, running with 75 forecast and 75 observation objects on a 3km CONUS HRRR domain, that adds up to a lot of memory. It was a little more involved than I expected, but the basic change we discussed is now in place on the Your run on 11/2/23 took just under 10 minutes to complete. Please let me know what the new runtime is. I'll note that there is additional memory allocation done in the double-threshold merging step that could probably also be eliminated. I have an idea for how we could use STL maps to keep track of the simple object ids falling inside the merge objects and vice versa. That should provide the information needed without allocating so many copies of the grid. |
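As an illustration of the STL map idea mentioned above, here is a minimal sketch. It is not MODE's implementation: the `ObjectOverlap` struct, the `build_overlap()` function, and the convention that an id of 0 means "no object" are all assumptions made for this example. It makes a single pass over two flattened label grids and records which simple object ids fall inside which merge objects, and the reverse, without allocating a grid copy per object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical container for this sketch; not a MET/MODE class.
struct ObjectOverlap {
    std::map<int, std::set<int>> simple_to_merge;  // simple id -> merge ids it touches
    std::map<int, std::set<int>> merge_to_simple;  // merge id  -> simple ids inside it
};

// Both inputs are flattened label grids of equal size.
// Assumption for this sketch: an id of 0 means "no object at this point".
ObjectOverlap build_overlap(const std::vector<int>& simple_ids,
                            const std::vector<int>& merge_ids) {
    assert(simple_ids.size() == merge_ids.size());
    ObjectOverlap out;
    // One pass over the grid; memory grows with the number of
    // (simple, merge) pairs rather than with objects times grid points.
    for (std::size_t i = 0; i < simple_ids.size(); ++i) {
        const int s = simple_ids[i];
        const int m = merge_ids[i];
        if (s == 0 || m == 0) continue;
        out.simple_to_merge[s].insert(m);
        out.merge_to_simple[m].insert(s);
    }
    return out;
}
```

Looking up `merge_to_simple[m]` then gives the set of simple objects contained in merge object `m` without touching the grid again.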
@JohnHalleyGotway I tested using Yesterday:
This morning:
|
@DanielAdriaansen thanks for re-testing and passing along the test you're using on seneca. I suspect the slowness is ultimately caused by MODE looping over the input domain many, many, many times. Here are some thoughts.
|
…were computing NMEP outputs. That was removed from ensemble-stat in MET version 11.1 but the OpenMP setup remained there. This removes it from ensemble-stat and updates the documentation to accurately indicate that OpenMP currently applies to gen-ens-prod, grid-stat, and now mode.
…hould be faster and use much less memory.
Good news. This GHA run flagged no diffs. So my reimplementation of the double-thresholding to minimize memory use works. |
…ions to be more efficient by accessing the vector of data rather than the slower get(x,y) data accessor function.
…bosity level to avoid unnecessary loops through the data. Note that every call to the logger builds the log message first, and the logger then decides whether or not to print it. Wrapping expensive debugging log messages in a verbosity level check is more efficient.
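A minimal sketch of that pattern follows. The `Logger` struct, `summarize()`, and `log_summary()` are hypothetical stand-ins invented for this example and do not reflect MET's actual logging class. The point is only that the expensive pass over the data is skipped entirely when the message would not be printed.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a logger; not MET's logging class.
struct Logger {
    int verbosity = 2;
    std::vector<std::string> lines;
    // The message has already been built by the caller at this point.
    void write(int level, const std::string& msg) {
        if (level <= verbosity) lines.push_back(msg);
    }
};

// Counts how often the expensive summary is actually computed.
int g_summary_calls = 0;

// Expensive debug message: requires a full loop over the data.
std::string summarize(const std::vector<double>& data) {
    ++g_summary_calls;
    double sum = 0.0;
    for (double v : data) sum += v;
    std::ostringstream os;
    os << "n=" << data.size() << " sum=" << sum;
    return os.str();
}

// Check the verbosity level first so the loop in summarize()
// only runs when the message will be printed.
void log_summary(Logger& log, int level, const std::vector<double>& data) {
    if (level <= log.verbosity) {
        log.write(level, summarize(data));
    }
}
```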
… diff a bit more efficient by accessing the data() array directly rather than range-checking with the data(x,y) accessor function.
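The difference between the two access styles can be sketched as follows. The `Field` class here is a hypothetical simplification written for this example; MET's own classes differ in detail. Both functions count differing points, but the second walks the underlying vectors directly and avoids a range check on every point.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical 2D field for this sketch; not a MET class.
class Field {
public:
    Field(int nx, int ny)
        : nx_(nx), ny_(ny), data_(static_cast<std::size_t>(nx) * ny, 0.0) {}

    // Range-checked accessor: safe, but pays for the checks on every call.
    double get(int x, int y) const {
        if (x < 0 || x >= nx_ || y < 0 || y >= ny_)
            throw std::out_of_range("Field::get");
        return data_[static_cast<std::size_t>(y) * nx_ + x];
    }
    void set(int x, int y, double v) {
        if (x < 0 || x >= nx_ || y < 0 || y >= ny_)
            throw std::out_of_range("Field::set");
        data_[static_cast<std::size_t>(y) * nx_ + x] = v;
    }
    // Direct access to the flattened data.
    const std::vector<double>& data() const { return data_; }
    int nx() const { return nx_; }
    int ny() const { return ny_; }

private:
    int nx_, ny_;
    std::vector<double> data_;
};

// Slower: one range check per point.
std::size_t count_diffs_checked(const Field& a, const Field& b) {
    assert(a.nx() == b.nx() && a.ny() == b.ny());
    std::size_t n = 0;
    for (int y = 0; y < a.ny(); ++y)
        for (int x = 0; x < a.nx(); ++x)
            if (a.get(x, y) != b.get(x, y)) ++n;
    return n;
}

// Faster: dimensions are checked once, then the vectors are scanned directly.
std::size_t count_diffs_direct(const Field& a, const Field& b) {
    assert(a.nx() == b.nx() && a.ny() == b.ny());
    const std::vector<double>& da = a.data();
    const std::vector<double>& db = b.data();
    std::size_t n = 0;
    for (std::size_t i = 0; i < da.size(); ++i)
        if (da[i] != db[i]) ++n;
    return n;
}
```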
When this is running on HPC, please contact Jeff Duda at GSL to test performance. |
@bonnystrong Which HPC would Jeff use to test performance? |
Either hera or jet. You should ask him for more specifics.
|
Describe the Enhancement
MET#1926 added OpenMP to parallelize the computation of fractional coverage fields. The same approach can easily be applied to the convolution step in MODE. This issue is to reimplement ShapeData::conv_filter_circ(...) using the same OpenMP-wrapped algorithm employed by the fractional_coverage() utility function.
Time Estimate
1 day.
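For readers unfamiliar with the convolution step described in this enhancement, the following is a rough sketch of a circular filter parallelized with OpenMP. It is not the code in ShapeData::conv_filter_circ(...) or fractional_coverage(): the function name `circular_mean_filter`, the flattened row-major layout, and the choice to skip points that fall off the grid are assumptions made for illustration, and missing-data handling is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only; not MET's implementation.
// Replaces each grid point with the mean of the on-grid points that lie
// within `radius` grid units of it. Off-grid points are skipped
// (an assumption of this sketch).
std::vector<double> circular_mean_filter(const std::vector<double>& in,
                                         int nx, int ny, int radius) {
    assert(nx > 0 && ny > 0 && radius >= 0);
    assert(in.size() == static_cast<std::size_t>(nx) * ny);

    std::vector<double> out(in.size(), 0.0);
    const int r2 = radius * radius;

    // Rows are independent: each thread reads `in` and writes only its
    // own rows of `out`, so no synchronization is needed.
    #pragma omp parallel for schedule(static)
    for (int y = 0; y < ny; ++y) {
        for (int x = 0; x < nx; ++x) {
            double sum = 0.0;
            int count = 0;
            for (int dy = -radius; dy <= radius; ++dy) {
                const int yy = y + dy;
                if (yy < 0 || yy >= ny) continue;
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int xx = x + dx;
                    if (xx < 0 || xx >= nx) continue;
                    if (dx * dx + dy * dy > r2) continue;  // outside the circle
                    sum += in[static_cast<std::size_t>(yy) * nx + xx];
                    ++count;
                }
            }
            // count >= 1 because the center point is always inside the circle.
            out[static_cast<std::size_t>(y) * nx + x] = sum / count;
        }
    }
    return out;
}
```

With GCC or Clang the pragma takes effect when compiling with `-fopenmp`; without that flag the pragma is ignored and the loop runs serially with the same result.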
Sub-Issues
Consider breaking the enhancement down into sub-issues.
None needed.
Relevant Deadlines
List relevant project deadlines here or state NONE.
Funding Source
Define the source of funding and account keys here or state NONE.
Define the Metadata
Assignee
Labels
Milestone and Projects
Define Related Issue(s)
Consider the impact to the other METplus components.
No impacts.
Enhancement Checklist
See the METplus Workflow for details.
Branch name:
feature_<Issue Number>_<Description>
Pull request:
feature <Issue Number> <Description>
Select: Reviewer(s) and Development issue
Select: Milestone as the next official version
Select: MET-X.Y.Z Development project for development toward the next official release