Vectorization of Fortran loops and other optimizations #227
Comments
Short answer: no. My Fortran knowledge is rusty and certainly not up to par with modern standards. |
I just installed PyBDSF using
Do you think the compiler is capable of doing the vectorizations that well? |
I'm confused. AFAICS |
No, it is not, that is a different bottleneck.
I profiled an entire run on a small FITS image.
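For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a deterministic profile using only the standard library's `cProfile` and `pstats`. The helper `profile_call` and the `placeholder_workload` function are hypothetical names made up for this illustration; they are not part of PyBDSF, and this is not necessarily how the numbers in this thread were produced.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def placeholder_workload(n):
    # Hypothetical stand-in for the code being measured (assumption:
    # in practice this would be the image-processing entry point).
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(placeholder_workload, 100_000)
    print(report)
```

In a real session the callable would be whatever entry point is being measured. Note that `cProfile` only sees Python-level calls, so time spent inside compiled Fortran or C++ extensions is attributed to the Python function that called them.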
FYI: this is how this class looks when profiled:
I don't understand the whole flow of PyBDSF's code, but maybe |
Maybe, but this is yet another potential hot spot 😄. It's quite possible that there are numerous performance bottlenecks in PyBDSF. Unfortunately, I lack the time to dive into this. It also doesn't help that there's no reliable suite of unit tests, so I'm a bit hesitant to make changes to the code without knowing whether I'd break something. Maybe @darafferty can give some insights here? |
Interesting results -- as far as I know, this kind of profiling has never really been done. So I'm not surprised that there are lots of bottlenecks! The So I would say that it would be worthwhile to experiment with improving some of the worst bottlenecks that Alex has identified. Perhaps for those at least we can write unit tests without too much work. |
As far as I can remember, there were also issues with images that took a very long time (or seemingly forever) to process. This may be a good way to identify where PyBDSF stalls in such cases. But I would use a statistical profiler for this. Tagging @mhardcastle since you may be interested to see this. |
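To illustrate what a statistical (sampling) profiler does differently from a deterministic one, here is a rough sketch using only the standard library. It relies on `sys._current_frames()`, which is CPython-specific, and the names `sample_stacks` and `busy_work` are hypothetical, made up for this example. A dedicated tool would be the better choice for real runs; this only shows the idea of periodically recording where the program currently is, which also works for a run that never finishes.

```python
import collections
import sys
import threading
import time


def sample_stacks(target, interval=0.001):
    """Run target() while a background thread samples this thread's stack."""
    caller_id = threading.get_ident()
    counts = collections.Counter()
    stop = threading.Event()

    def sampler():
        while not stop.is_set():
            # Look up the frame the profiled thread is executing right now.
            frame = sys._current_frames().get(caller_id)
            if frame is not None:
                counts[frame.f_code.co_name] += 1
            time.sleep(interval)

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        target()
    finally:
        stop.set()
        thread.join()
    return counts


def busy_work(duration=0.2):
    # Hypothetical placeholder workload: spin for a fixed wall-clock time.
    end = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < end:
        total += 1
    return total


if __name__ == "__main__":
    for name, hits in sample_stacks(busy_work).most_common(5):
        print(f"{hits:6d}  {name}")
```

Functions that collect the most samples are where the time goes. Because the overhead is a fixed sampling rate rather than a hook on every call, this approach distorts the timings less than deterministic tracing.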
BTW: for each |
Yes, unfortunately I think it is necessary to recalculate the rms and mean maps for each scale, as they do change (as it turns out, this fact was relevant to the last PR that was just merged). |
This is how it looks in general for a 150 MB image: Profile of
And there:
|
Thanks for this work. We definitely need to follow up on this, but we do need to find the time for it. |
Just a question / suggestion: these are the results of profiling:
Have you tried vectorizing the loops here:
https://github.com/lofar-astron/PyBDSF/blob/master/src/fortran/pytess_simple.f#L16C1-L29C15
?
ChatGPT seems to be doing this:
but I'm bad at Fortran, so I cannot verify whether this is OK.
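One way to gain confidence in a rewritten loop without reading it line by line is to compare it against the original on many random inputs. The sketch below shows that idea in plain Python. Both kernels are hypothetical stand-ins (a weighted sum of squares), not the contents of `pytess_simple.f`, and `check_equivalent` is a name made up for this example. The comparison uses a tolerance rather than exact equality, since vectorized code typically reorders floating-point additions.

```python
import math
import random


def reference_kernel(values, weights):
    """Straightforward loop: weighted sum of squares, accumulated in order."""
    total = 0.0
    for v, w in zip(values, weights):
        total += w * v * v
    return total


def candidate_kernel(values, weights, lanes=4):
    """Rewritten loop with several partial sums, similar to SIMD reordering."""
    partial = [0.0] * lanes
    for i, (v, w) in enumerate(zip(values, weights)):
        partial[i % lanes] += w * v * v
    return sum(partial)


def check_equivalent(reference, candidate, trials=200, size=1001,
                     rel_tol=1e-9, seed=0):
    """Compare both kernels on random inputs; return the number of trials run."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for trial in range(trials):
        values = [rng.uniform(-10.0, 10.0) for _ in range(size)]
        weights = [rng.uniform(0.0, 1.0) for _ in range(size)]
        expected = reference(values, weights)
        actual = candidate(values, weights)
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=1e-12):
            raise AssertionError(f"trial {trial}: {expected!r} != {actual!r}")
    return trials


if __name__ == "__main__":
    print(check_equivalent(reference_kernel, candidate_kernel), "trials passed")
```

The same pattern would apply to the compiled routine: keep the old build around, call both versions with identical arrays, and compare the outputs. A check like this could also serve as a first unit test for the routine.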