Ideas for finite_difference_grad.py #182

Open
annulen opened this issue Nov 13, 2023 · 4 comments

annulen commented Nov 13, 2023

  • It should be possible to compute gradients in a massively parallel way: instead of running one parallel calculation at a time for each displacement, N independent serial calculations should be run. In Perl a similar thing can be done with Parallel::ForkManager, which manages a process pool automatically, and I'm pretty sure Python has something similar as well. The benefit is that serial SCF calculations are more efficient, and each can take a different number of iterations; in separate processes they won't slow each other down. Another benefit is the ability to use more processor cores/threads: AFAIK hybrid DFT stops gaining from parallelization beyond ~20 concurrent threads, while the massively parallel approach would allow using, say, 32 threads on the same machine without any synchronization overhead (see the sketch after this list).
  • I think a cheap alternative to the full 6N-point gradient check is possible: if we take the two points from a single optimization step and add one more point (-step), we can check the gradient along the direction of that step. This could even be done as part of the optimization algorithm if needed. Having a cheap check would allow performing such checks casually, without allocating resources for a large task equivalent to a numerical Hessian calculation.
  • Instead of printing status reports every 5 steps, a progress bar module could be used. For example, I've used tqdm: it can wrap any iterable and automatically draws a nice progress bar in the terminal. See e.g. annulen/vibAnalysis@334d920 for a usage example (the sketch after this list also shows it wrapping a parallel map).
  • finite_difference_grad.py should share at least part of its command-line argument definitions with params.py, to avoid code duplication and at the same time allow using all relevant features. For example, I had to apply the following patch to use the ASE engine:
diff --git a/tools/finite_difference_grad.py b/tools/finite_difference_grad.py
old mode 100644
new mode 100755
index f824f05..d51f348
--- a/tools/finite_difference_grad.py
+++ b/tools/finite_difference_grad.py
@@ -133,6 +133,16 @@ def parse_fd_args(*args):
     grp_univ.add_argument('--port', type=int, help='Work Queue port number (optional).\n')
     grp_univ.add_argument('--verbose', type=bool, default=False, help='Print extra information.\n')
 
+    grp_software = parser.add_argument_group('software', 'Options specific for certain software packages')
+    grp_software.add_argument(
+        '--ase-class',
+        type=str,
+        help='ASE calculator import path, eg. "ase.calculators.lj.LennardJones"')
+    grp_software.add_argument(
+        '--ase-kwargs',
+        type=str,
+        help='ASE calculator keyword args, as JSON dictionary, eg. {"param_filename":"path/to/file.xml"}')
+
     grp_help = parser.add_argument_group('help', 'Get help')
     grp_help.add_argument('-h', '--help', action='help', help='Show this help message and exit')
 
@@ -145,7 +155,7 @@ def parse_fd_args(*args):
 
 def main():
     args = parse_fd_args(sys.argv[1:])
-    molecule, engine = get_molecule_engine(engine=args['engine'], input=args['input'], dirname='fdgrad.tmp')
+    molecule, engine = get_molecule_engine(dirname='fdgrad.tmp', **args)
     if 'port' in args:
         createWorkQueue(args['port'], debug=False)
     coords = molecule.xyzs[0].flatten() * ang2bohr

However, other engine-specific arguments should also be handled, in case users of the respective engines need to check gradients. Perhaps grp_software has to be exported from params.py, and maybe some other option groups too.
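
To illustrate the first and third points, here is a rough sketch of what I mean; compute_energy() is a hypothetical stand-in for one serial single-point calculation, and a real implementation would also need per-job scratch directories, which I omit here.

import numpy as np
from concurrent.futures import ProcessPoolExecutor
from tqdm import tqdm

def fd_gradient(compute_energy, coords, h=1.0e-3, max_workers=32):
    """Central-difference gradient; every displaced geometry is an
    independent serial job running in its own worker process."""
    coords = np.asarray(coords, dtype=float)
    displaced = []
    for i in range(coords.size):
        for sign in (+1.0, -1.0):
            x = coords.copy()
            x[i] += sign * h
            displaced.append(x)
    # compute_energy must be picklable (a module-level function) so it can
    # be dispatched to the worker processes.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        energies = list(tqdm(pool.map(compute_energy, displaced),
                             total=len(displaced), desc='FD gradient'))
    energies = np.array(energies).reshape(coords.size, 2)
    # Column 0 holds E(+h), column 1 holds E(-h) for each coordinate.
    return (energies[:, 0] - energies[:, 1]) / (2.0 * h)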

On the other hand, I have experience with another module for handling command-line arguments: absl.flags. It allows defining flags right in the modules where they are used, and any script that uses those modules, directly or indirectly, will automatically be able to parse their flags from argv. The downsides are an extra dependency and a slightly less user-friendly --help. I can explain more about it if you are interested.
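
To give an idea of the pattern (flag names here are hypothetical examples, not an actual patch):

from absl import app, flags

# Flags are defined in the module that consumes them; any script that
# imports this module can parse them from argv via app.run().
flags.DEFINE_string('ase_class', None,
                    'ASE calculator import path, e.g. "ase.calculators.lj.LennardJones"')
flags.DEFINE_string('ase_kwargs', '{}',
                    'ASE calculator keyword arguments as a JSON dictionary')

FLAGS = flags.FLAGS

def main(argv):
    # FLAGS is populated once app.run() has parsed sys.argv.
    print(FLAGS.ase_class, FLAGS.ase_kwargs)

if __name__ == '__main__':
    app.run(main)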

leeping commented Nov 13, 2023

Thanks for the suggestions. I agree with many of the suggestions you made for finite_difference_grad.py; ideally it would share many of the command-line arguments with the other "programs". A progress bar would be great (I never did get this to work). The embarrassingly parallel mode of finite_difference_grad.py (as well as the normal mode calculation in the optimizations) is currently handled using Work Queue. It creates a dependency, but also enables one to run the gradient jobs on different physical machines.

Yes, I think it would be a good idea to use the optimization step as a finite difference step and compare it with the projection of the gradient along that direction. For large steps a significant disagreement can be expected, but the agreement should improve as the steps become smaller, assuming the gradient and energy are consistent. This could be done as part of the geometry optimization loop, so that the user can be warned when there is an energy/gradient inconsistency. I don't think additional steps to improve the quality of the numerical gradient are necessary, but it could be nice if implemented cleanly.
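
A rough sketch of the check I have in mind is below; compute_energy() is just a placeholder for a single-point calculation, and the analytic gradient and proposed step are assumed to be available already.

import numpy as np

def check_gradient_along_step(compute_energy, x, grad, step, h=1.0e-3):
    """Compare the projection of the analytic gradient onto the step
    direction with a central finite difference along that direction."""
    d = step / np.linalg.norm(step)    # unit vector along the step
    analytic = float(np.dot(grad, d))  # directional derivative from the gradient
    numeric = (compute_energy(x + h * d) - compute_energy(x - h * d)) / (2.0 * h)
    return analytic, numeric, abs(analytic - numeric)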

In fact it may be possible to use the energy change to "correct" the quantum chemical gradient, similar to how one updates the Hessian using BFGS, but I think that is a new research project.

annulen commented Nov 13, 2023

> The embarrassingly parallel mode of finite_difference_grad.py (as well as the normal mode calculation in the optimizations) is currently handled using Work Queue. It creates a dependency but also enables one to run the gradient jobs on different physical machines.

Does it allow running N jobs on the same machine? I only have one machine for now :)

leeping commented Nov 13, 2023

Yes. You simply run finite_difference_grad.py and multiple copies of work_queue_worker on the same machine.

annulen commented Nov 16, 2023

Another practical consideration: I would like to evaluate the quality of gradients for the 9-molecule cluster we discussed in another issue. However, that system contains 180 atoms, so the full finite-difference gradient would require a huge amount of resources, whereas evaluating 3 points along a single step vector could be done quickly.
