Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add auto restart #292

Merged
merged 13 commits into from
Sep 3, 2024
Merged

Add auto restart #292

merged 13 commits into from
Sep 3, 2024

Conversation

ElliottKasoar
Copy link
Member

Resolves #169

Adds auto restart option for MD, which generates restart_stem based on inputs (including file_prefix and restart_stem), and if restarting simulation, uses the most recent matching file for the input structure.

Also includes fixes for:

  • prefix_override, which threw errors if given a Paths (e.g. anything via --restart-stem)
  • How restart_stem was used, as currently for restart_stem=stem, files of the form stem-T300.0-res-1.extxyz are created, the same as the old usage of struct_name. With these changes, they should now be stem-1.extxyz

Note: the limit after restarting is the effectively the same as before - however far the previous simulation got + the full steps input (e.g. for NaCl-nvt-T300.0-res-100.extxyz, with --steps 1000, the last step would be 1100.)

One thing to consider is that this implementation can (and arguably is likely) to lead to duplicated steps for statistics and/or trajectory file.

E.g. if you were to print statistics every 100 steps, but restart every 1000 steps, then the statistics file may to be written for a while after the last restart file. When we restart, we then append duplicate steps.

Do we want to delete these duplicate steps? This is even harder for the trajectory, since we'd have to calculate which images would be duplicated.

@ElliottKasoar ElliottKasoar added the enhancement New/improved feature or request label Aug 28, 2024
@ElliottKasoar ElliottKasoar self-assigned this Aug 28, 2024
@alinelena
Copy link
Member

I would not delete the steps ideally one write restart at the same step as printing, I will add maybe a note to that effect is the two are different. let us not defend the users from themselves

@ElliottKasoar
Copy link
Member Author

ElliottKasoar commented Aug 28, 2024

Hopefully there would be a corresponding step, but realistically it's quite wasteful to generate a restart file as often as you'd want to output the statistics, so I'd expect this would happen for most use cases unavoidably.

(Unavoidable, and non-trivially fixable being the problem, since the users' choices are reasonable)

@ElliottKasoar
Copy link
Member Author

ElliottKasoar commented Aug 28, 2024

Just to make sure we're on the same page, I'm talking about something like

janus md --ensemble nvt --struct tests/data/NaCl.cif --steps 1000 --stats-every 10 --restart-every 100
janus md --ensemble nvt --struct tests/data/NaCl.cif --steps 1000 --stats-every 10 --restart-every 100  --restart

If the first command fails at step 140, then the -stats file will have steps 100, 110, 120, 130 and 140.

When we (auto) restart, it resumes from step 100, so we end up with stats for 110, 120, 130 and 140 again.

My issues is (1) I don't think the user should have to set restart_every to match stats_every, and even then this could be inconsistent with traj_every, and (2) there's not a nice way to avoid this otherwise, that I can think of...

Rather than deleting written data, for stats we could maybe check that the corresponding step doesn't already exist before writing again? If we added a label for traj we could potentially do the same for that?

@ElliottKasoar
Copy link
Member Author

ElliottKasoar commented Aug 30, 2024

From discussions with @oerc0122, perhaps for now it might be sufficient to add a label to each trajectory image (dyn_step?), which means the user can relatively straightforwardly determine from both the stats and traj files where any overlap is?

Either way, it needs to be clearly documented.

@alinelena
Copy link
Member

I will call it time_step or timestep otherwise is perfectly ok... maybe since you are at this add another time= time in fs...

@ElliottKasoar
Copy link
Member Author

I will call it time_step or timestep otherwise is perfectly ok... maybe since you are at this add another time= time in fs...

It turned out we already had an info["step"] from when we set the statistics, but that meant it would only update at the stats frequency, so wasn't necessarily accurate at the time each trajectory step is written out.

I think the varying frequencies of updating different info keys is still a problem for others e.g. "density" and "real_time", but we can think about that in a separate issue.

@ElliottKasoar ElliottKasoar marked this pull request as ready for review September 2, 2024 11:28
Copy link
Collaborator

@oerc0122 oerc0122 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor things.

janus_core/calculations/md.py Show resolved Hide resolved
janus_core/calculations/md.py Show resolved Hide resolved
janus_core/calculations/md.py Outdated Show resolved Hide resolved
janus_core/calculations/md.py Outdated Show resolved Hide resolved
janus_core/calculations/md.py Outdated Show resolved Hide resolved
janus_core/helpers/utils.py Outdated Show resolved Hide resolved
tests/test_md.py Show resolved Hide resolved
Co-authored-by: Jacob Wilkins <46597752+oerc0122@users.noreply.github.com>
@ElliottKasoar ElliottKasoar merged commit 084b2c9 into stfc:main Sep 3, 2024
8 checks passed
@ElliottKasoar ElliottKasoar deleted the add-auto-restart branch September 3, 2024 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New/improved feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add auto restart for MD
3 participants