Future plans #4

Open
TiborGY opened this issue Oct 31, 2021 · 2 comments
Contributor

TiborGY commented Oct 31, 2021

I am writing to let you know about the PRs I am planning, and to make sure my proposed changes do not clash with the project before I start writing code. Comments are welcome.

We have private forks of the old MSA project (https://github.com/Kee-Wang/PES-Fitting-MSA), where @vtajti has made some changes to code generation, making energy evaluation more computationally efficient, and we also have a variant that generates C++ code instead of Fortran. Note that these are purely code generation changes, i.e. msa.cpp is unchanged.

I am aiming to upstream these improvements and extensions into this repo while preserving the existing algorithm and functionality, especially since our new code cannot yet compute analytical gradients or use them during fitting.

My current idea on how to do that:

  • Overhaul msa.py:
  • The user will first be asked to choose between two modes: demo and research. Demo mode would preserve the existing behavior and work much as it does now.
  • Research mode would focus on generating the source and object files without compiling the demo program that evaluates the sample CH4 geometries. It would ask the user whether they want the current algorithm, which can do gradients, or the alternative one, which does energies only but is likely faster (possibly much faster, as the speedup depends nonlinearly on details such as the hardware used and the number of monomials).
  • If they choose the latter, ask whether they want to generate C++, Fortran, or both.
  • Make msa.py friendlier to new users by printing messages before operations start and when they finish. Also add more error checking, e.g. if compiling something fails, print a clear error message.
  • Update tutorial.txt accordingly.
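
To make the proposed flow concrete, here is a rough sketch of how the mode selection and the error reporting could be structured. Everything in it is hypothetical: the function names (`plan_generation`, `run_step`), the option strings, and the returned dictionary are illustrative assumptions and are not taken from the current msa.py.

```python
import subprocess
import sys


def plan_generation(mode, algorithm="gradient", languages=("fortran",)):
    """Describe what a run would generate (hypothetical layout, not msa.py's)."""
    if mode not in ("demo", "research"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "demo":
        # Demo mode keeps the existing behavior, including the demo program.
        return {"mode": "demo", "build_demo": True}

    if algorithm not in ("gradient", "energy-only"):
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if algorithm == "gradient":
        # Assumption: the gradient-capable path stays Fortran-only,
        # as the language question is only asked for the energy-only path.
        chosen = ["fortran"]
    else:
        chosen = sorted(set(languages))
        if not chosen or set(chosen) - {"c++", "fortran"}:
            raise ValueError(f"unsupported languages: {languages!r}")
    return {
        "mode": "research",
        "algorithm": algorithm,
        "languages": chosen,
        "build_demo": False,  # research mode skips the CH4 demo program
    }


def run_step(description, command):
    """Run one external command with start/finish messages and a clear error."""
    print(f"{description} ...")
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{description} failed (exit code {result.returncode}):\n{result.stderr}"
        )
    print(f"{description} finished.")
    return result.stdout


if __name__ == "__main__":
    print(plan_generation("research", "energy-only", ("c++", "fortran")))
    # Stand-in for a compiler invocation, so the sketch runs anywhere.
    run_step("Example step", [sys.executable, "-c", "print('ok')"])
```

Keeping the decision logic in a plain function, separate from the interactive prompts, would make it easy to test without typing answers by hand.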

I plan to work on this as a side project, so no firm timeline. Comments are welcome.

Owner

szquchen commented Nov 3, 2021

Thank you for all the contributions to the MSA project.

Your plan sounds good. Just a few comments:

  • For research mode, you can certainly skip compiling the demo program, but it is very likely that pes_shell.f90 (or something similar in C++) is still needed. It provides the interface for calling the fitted potential energy surface, and users who want to hook the fitted potential into other programs, such as molecular dynamics codes, will probably need pes_shell.f90.
  • This version of MSA has the option to fit the potential energy surface with gradients or with energies only, and it asks the user which to use. It is true that fitting with energies only is much faster than fitting with gradients.

We also have some news about our recent development of a faster approach to evaluating the gradient (this will not affect the fitting of the PES, but predicting gradients with the PES will be much more efficient). Currently we rely on Mathematica to help generate the Fortran code for the efficient gradient prediction, but we would eventually add this to msa.py. Our changes are mainly about the prediction, so I don't think they will conflict with your plan.

Contributor Author

TiborGY commented Jan 8, 2022

Brief update: beyond the changes you can see in PR #5, I have not spent much time on msa.py yet. Instead, I have mostly been experimenting with different ways to modify the code generator to make energy evaluation even more efficient.

These are very preliminary results, but I ran some tests on a 5th-order fit of an X6Y2Z1 system. The Fortran code generated by the generator currently available in this repo takes ~414 µs per emsav call, whereas my current best C++ code takes only ~63 µs per call, which is faster by a factor of ~6.5. The new code also takes much less time to compile and runs much better on a wider range of CPUs; for example, the speedups are even better on AMD CPUs.
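
As a quick sanity check on the figures quoted above, the ratio and the implied call rates can be worked out directly. The two timings are the approximate values from this comment; the variable names are only illustrative.

```python
# Approximate per-call timings quoted above, in microseconds.
fortran_us = 414.0  # current Fortran generator, per emsav call
cpp_us = 63.0       # current best C++ code, per call

speedup = fortran_us / cpp_us
fortran_calls_per_s = 1e6 / fortran_us
cpp_calls_per_s = 1e6 / cpp_us

print(f"speedup: {speedup:.2f}x")
print(f"Fortran: ~{fortran_calls_per_s:.0f} calls/s")
print(f"C++:     ~{cpp_calls_per_s:.0f} calls/s")
```

With these rounded inputs the ratio comes out at about 6.57, consistent with the quoted factor given that both timings are approximate.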

Note that this speedup is achieved without any pruning or fragmentation: none of the monomials or polynomials are removed, and the results are identical to within numerical precision.
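
One way such an equivalence check could be done is to evaluate both implementations on the same geometries and compare the energies with a relative tolerance. The sketch below assumes the energies are already available as two lists; the helper names, the tolerance of 1e-12, and the sample values are all made up for illustration and are not results from the actual codes.

```python
def max_relative_difference(reference, candidate):
    """Largest relative difference between two equally long energy lists."""
    if len(reference) != len(candidate):
        raise ValueError("energy lists must have the same length")
    worst = 0.0
    for ref, new in zip(reference, candidate):
        # Guard against division by zero when both values are exactly 0.
        scale = max(abs(ref), abs(new), 1e-300)
        worst = max(worst, abs(ref - new) / scale)
    return worst


def energies_match(reference, candidate, rel_tol=1e-12):
    """True if every energy agrees to within rel_tol (an assumed tolerance)."""
    return max_relative_difference(reference, candidate) <= rel_tol


if __name__ == "__main__":
    # Made-up numbers standing in for energies from the two generated codes.
    old_code = [-40.5123, -40.4987, -40.5011]
    new_code = [e * (1.0 + 1e-15) for e in old_code]
    print(energies_match(old_code, new_code))
```

A relative rather than absolute tolerance is used here because summation order differs between generated codes, so rounding differences scale with the magnitude of the energy.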

I am still looking for ways to improve performance, but after I am done, I will try to continue the work on msa.py, and upload the new code generator, plus the utility I have written to measure execution speed.
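
For readers who want to reproduce this kind of per-call measurement, a minimal timing loop might look like the following. This is not the utility mentioned above, which has not been uploaded yet; `time_per_call_us` and `dummy_energy` are hypothetical names, and the dummy function merely stands in for a real energy routine.

```python
import time


def time_per_call_us(func, args=(), calls=10000, repeats=5):
    """Best-of-`repeats` average time per call, in microseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(calls):
            func(*args)
        elapsed = time.perf_counter() - start
        # Keep the fastest repeat to reduce noise from other processes.
        best = min(best, elapsed / calls)
    return best * 1e6


def dummy_energy(coords):
    """Placeholder workload; a real test would call the generated PES code."""
    return sum(x * x for x in coords)


if __name__ == "__main__":
    coords = [0.1] * 27  # arbitrary placeholder input
    print(f"{time_per_call_us(dummy_energy, (coords,)):.2f} us per call")
```

Timing compiled Fortran or C++ through Python would add call overhead of its own, so for routines in the tens of microseconds a harness written in the same language as the generated code is likely the better choice.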
