parallelize hot spots #2438

clyne · 2020-11-11T21:18:53Z

We've held off on parallelizing much of vapor because Apple has not adopted support for OpenMP in their version of the clang compiler. It's getting pretty appear that apple may never support OMP. However, the generic version of clang can easily be installed on the Mac via homebrew or macports. We should explore using these compilers and parallelizing many of the hot spots in the code. Preliminary investigation indicates that the Grid class iterators are thread safe and can easily be used in parallel.

clyne · 2021-09-30T20:59:04Z

The Grid::GetRange() method would be a good "first issue" for parallelization: it is simple, can be expensive with larger data, and is invoked every time a new variable/time-step is accessed.

A good example showing how to use OpenMP with Grid Iterators is the test_get_value() function inside of vapor/test_apps/./datamgr/test_datamgr.cpp

vapor/test_apps/grid_iter/test_grid_iter.cpp might serve as a starting point for a unit test

shaomeng · 2021-09-30T21:49:41Z

@clyne John, I feel parallelization should not be a priority right now, and the reason is that existing codebase has too many issues. I'm afraid that once we move on to parallelization, both of the following scenarios will come true:

The majority of the development time will be spent on debugging existing code, and on an ad hoc basis;
There will still be many bugs not found during development and only show up when a user starts to use VAPOR with 64 threads and doing heavy lifting work.

I'm particularly concerned about the existing memory issues. To make it clear, even serial code that has no memory issues would likely need some taking care of when it is parallelized. One thing you definitely don't want to have before moving to parallelization is that the serial code is already showing many memory issues.

To explain my point, I opened VAPOR, loaded a CF dataset, closed the dataset, and closed VAPOR. This process is monitored by Valgrind and it reports the following issues:

Conditional jump or move depends on uninitialised value(s) 40 occurrences.
Invalid read of size xxx 119 occurrences.
Use of uninitialised value of size xxx 30 occurences.
invalid file descriptor 6 occurences.
definitely lost memory blocks: 222.

On the one hand, a lot of these reported errors are from QT and Python, so only a fraction of these numbers are caused by VAPOR. On the other hand, my test only involves open and close a dataset; there isn't even a real rendering job performed, so the actual number of issues is likely much bigger. It is true that many memory issues won't stop a program from being parallelized, but the sheer number of detected issues is very worrisome.

To provide a reference here and not to brag or anything, my compression code (much less complex than VAPOR, but still fairly complex) has zero memory error, so it is something very doable.

So my suggestion is to address the code quality first (eliminate much of memory errors at least, possibly take other measures) before we deploy parallelization. This approach will save overall development time by a huge amount, compared to an ad hoc bug busting approach.

clyne · 2021-10-01T21:29:24Z

Fair enough. Then I'd like to task you with taking a memory sanitation pass for 3.6. You might want to chat with @StasJ. At one point he had some scripts to remove the Qt memory errors from the valgrind report.

shaomeng · 2021-10-01T22:39:44Z

I can take on this task, and I don't think a script is needed, since Valgrind provides detailed report on where the memory error is from.

I'm thinking of extracting individual memory errors from the Valgrind report which contains the nature of the error and where they are, and then assign each error to individuals who're responsible for that file. Does this sound like a good approach to you @clyne ?

clyne · 2021-10-04T17:57:13Z

Great! I'd ask to please try and address as many of these yourself as possible, but feel free to engage the "code owner" if something is not straightforward. @StasJ and @sgpearse already have plenty on their plate for 3.6.

shaomeng · 2021-10-04T19:48:37Z

Well, I'll have >5000 lines of Valgrind report to analyze (just from opening and closing VAPOR) so that I can achieve pinpointing individual errors. I think I'll provide assistance but it'll be up to the real owner to actually fix the error.

P.S. It's a good learning process for a programmer to look back their code and realize that some of the old habits are error-prone.

Update: this issue is dependent on issue #2870

clyne · 2022-07-27T18:42:06Z

Fixed by #3091

clyne added the Performance label Nov 11, 2020

clyne mentioned this issue Mar 11, 2021

Parallelization of QuadTreeRectangle class with OpenMP #2439

Merged

clyne added this to the 3_5_0 release milestone Apr 26, 2021

clyne removed this from the 3_5_0 release milestone Jun 9, 2021

clyne added the High label Sep 30, 2021

clyne added this to the 3_6_0 Release milestone Sep 30, 2021

clyne assigned shaomeng Sep 30, 2021

clyne removed this from the 3_6_0 Release milestone Oct 19, 2021

clyne modified the milestones: 3_6_0 Release, 3_7_0_Release Mar 17, 2022

clyne closed this as completed Jul 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

parallelize hot spots #2438

parallelize hot spots #2438

clyne commented Nov 11, 2020

clyne commented Sep 30, 2021

shaomeng commented Sep 30, 2021 •

edited

Loading

clyne commented Oct 1, 2021

shaomeng commented Oct 1, 2021

clyne commented Oct 4, 2021

shaomeng commented Oct 4, 2021 •

edited

Loading

clyne commented Jul 27, 2022

parallelize hot spots #2438

parallelize hot spots #2438

Comments

clyne commented Nov 11, 2020

clyne commented Sep 30, 2021

shaomeng commented Sep 30, 2021 • edited Loading

clyne commented Oct 1, 2021

shaomeng commented Oct 1, 2021

clyne commented Oct 4, 2021

shaomeng commented Oct 4, 2021 • edited Loading

clyne commented Jul 27, 2022

shaomeng commented Sep 30, 2021 •

edited

Loading

shaomeng commented Oct 4, 2021 •

edited

Loading