FYI- this showed up in a tech forum. #3

sxlijin · 2020-04-23T23:30:49Z

https://news.ycombinator.com/item?id=22957884

Comments are somewhat toxic, but may contain interesting notes about implementation decisions.

LowLevelMahn · 2020-04-24T07:10:36Z

Claiming that the C++ code is slower than Go, when it is written like this, is bound to draw some "feedback" :) but it's far from "toxic".

LowLevelMahn · 2020-04-24T07:49:46Z

So the questions from Hacker News are:

Why such constructs?
https://github.com/ExaScience/elprep-bench/blob/master/cpp/filter_pipeline.cpp#L20-L33
auto alns = any_cast<shared_ptr<deque<shared_ptr<sam_alignment>>>>(data);
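
For readers unfamiliar with the construct, here is a minimal sketch of what the quoted line does. It assumes `any_cast` is `std::any_cast` from C++17 (the project may use a different `any` type), and `sam_alignment_stub`, `alignment_batch`, `make_shared_batch`, `owners_after_cast` and `is_batch` are hypothetical names made up for illustration.

```cpp
#include <any>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-in for sam_alignment; the real type has many more fields.
struct sam_alignment_stub {
    std::string qname;
};

// The type named in the quoted any_cast: a shared deque of shared alignments.
using alignment_batch =
    std::shared_ptr<std::deque<std::shared_ptr<sam_alignment_stub>>>;

alignment_batch make_shared_batch(std::size_t n) {
    auto batch = std::make_shared<std::deque<std::shared_ptr<sam_alignment_stub>>>();
    for (std::size_t i = 0; i < n; ++i) {
        batch->push_back(std::make_shared<sam_alignment_stub>());
    }
    return batch;
}

// any_cast by value copies the stored shared_ptr, so the reference count of
// the deque goes up by one for as long as `alns` is alive.
long owners_after_cast(const std::any& data) {
    auto alns = std::any_cast<alignment_batch>(data);
    return alns.use_count();
}

// The pointer form of any_cast returns nullptr on a type mismatch instead of
// throwing std::bad_any_cast.
bool is_batch(const std::any& data) {
    return std::any_cast<alignment_batch>(&data) != nullptr;
}
```

The type erasure lets differently typed stages share one interface; the price is a run-time type check per cast plus one reference-count update for the copied pointer.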

It is strange that there is no unique_ptr anywhere in the code. Is everything always shared?
Why no std::unique_ptr?
Does the pointer really need to be shareable across multiple threads, with thread-safe reference counting?
shared_ptr is costly (reference counting, atomic operations...), while unique_ptr is nearly free.

Is it clear that C++ needs far fewer calls to new or make_shared/make_unique than Java?

It seems you used shared_ptr to implement some sort of move semantics, which also
comes for free with unique_ptr.

The allocation overhead seems to be huge.
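
A minimal sketch of the single-owner alternative raised in these questions, assuming each alignment is only ever needed by one pipeline stage at a time (an assumption that may not hold for elPrep's more complex filters). The names `alignment`, `batch`, `make_batch` and `keep_every_second` are hypothetical and do not come from the repository.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for sam_alignment; the real type has many more fields.
struct alignment {
    std::string qname;
};

// Each alignment has exactly one owner: the deque that currently holds it.
using batch = std::deque<std::unique_ptr<alignment>>;

batch make_batch(std::size_t n) {
    batch b;
    for (std::size_t i = 0; i < n; ++i) {
        auto a = std::make_unique<alignment>();
        a->qname = "read" + std::to_string(i);
        b.push_back(std::move(a));
    }
    return b;
}

// Hypothetical filter stage: takes the batch by value (the caller moves it in),
// keeps every second alignment, and hands the survivors to the next stage.
// No reference counts are touched; dropped alignments are freed when `in` dies.
batch keep_every_second(batch in) {
    batch out;
    for (std::size_t i = 0; i < in.size(); i += 2) {
        out.push_back(std::move(in[i]));
    }
    return out;
}
```

A caller would write `auto kept = keep_every_second(std::move(b));`, after which `b` no longer owns anything. Ownership transfer is spelled out at every call site, which is exactly what becomes hard to maintain once lifetimes stop being predictable.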

maximegmd · 2020-05-01T22:25:26Z

It's hard to believe this kind of publication gets accepted when the variable that is actually measured here is the authors' relative competence in 3 languages.

pcostanza · 2020-05-02T11:41:58Z

@sxlijin Thanks a lot for the link, such notifications on discussions around elPrep are very much appreciated. However, we are currently very busy with working on the next release of elPrep, so we are focusing on that rather than participating in such discussions. Maybe we will comment sometime later. (There was a similar discussion on reddit some time ago, with similar criticisms which we already addressed at https://www.reddit.com/r/programming/comments/avsfc6/performance_comparison_of_go_c_and_java_for/ - many of our answers back then probably apply here as well, but we would have to double-check.)

@LowLevelMahn An important aspect of elPrep is that it is an open-ended framework where more filters can be added, including complex ones like the ones for marking duplicates or base-quality score recalibration, and combined in arbitrary ways. (We are currently working on other more complex ones.) In the general case, this makes it impossible to predict the lifetimes of the objects involved, which is why you need something like shared_ptr in the general case (or garbage collection if available). We already had a version of elPrep with mostly manual memory management before, and this became impractical, which is why we did the study. Doing manual memory management in C++ wouldn't have improved our situation, so it wasn't a real option. This motivation for our work is discussed in the paper at https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2903-5 , and it is important to assess our work in this light. You can find more information about the background of our work in our other papers about elPrep, namely https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0209523 and https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132868

As far as we can tell, reference counting across multiple threads didn't account for major performance losses. Performance is mostly lost during a long-running deallocation phase which is strictly sequential. For the remaining phases, the C++ version is actually on par with the other implementations. This is actually discussed in some detail in the paper.
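
The sequential deallocation described here can be illustrated with a small sketch. It shows only the general behaviour of `shared_ptr` (dropping the last reference to a container destroys every element, one after another, on the thread that drops it) and is not a reconstruction of elPrep's actual deallocation phase. The names `release_stats`, `node` and `release_all` are hypothetical.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <thread>

// Records what happened while the container was torn down.
struct release_stats {
    std::size_t destroyed = 0;
    bool same_thread = true;
};

// Hypothetical payload whose destructor reports where it ran.
struct node {
    node(release_stats* s, std::thread::id e) : stats(s), expected(e) {}
    ~node() {
        ++stats->destroyed;
        if (std::this_thread::get_id() != expected) {
            stats->same_thread = false;
        }
    }
    release_stats* stats;
    std::thread::id expected;
};

release_stats release_all(std::size_t n) {
    release_stats stats;
    const auto releasing_thread = std::this_thread::get_id();
    auto data = std::make_shared<std::deque<std::shared_ptr<node>>>();
    for (std::size_t i = 0; i < n; ++i) {
        data->push_back(std::make_shared<node>(&stats, releasing_thread));
    }
    // Last reference to the deque: the deque and all n nodes are destroyed
    // right here, in sequence, before reset() returns.
    data.reset();
    return stats;
}
```

With millions of small objects, that single `reset()` becomes a long sequential phase regardless of how many threads produced the data, unless the teardown is explicitly split across threads.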

@Yamashi Competence in programming languages is difficult to assess, but productivity is an important dimension for real-world projects. It has been known for quite some time that automatic memory management can drastically improve productivity. See https://ieeexplore.ieee.org/document/5387117 for example.

Feel free to assess the proficiency in other programming languages at our lab by looking at our other projects at https://github.com/exascience/
