-
Notifications
You must be signed in to change notification settings - Fork 0
/
quickstart.tex
242 lines (155 loc) · 18.5 KB
/
quickstart.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
\section{Quick start}
\subsection{Input files}
There are two mandatory input files -- one contains the input coordinates in xyz format and the other one contains all the parameters specifying the simulation.
The XYZ format is specified as follows:
first line contains number of atoms, second line is empty or contains some comment and the following lines contains cartesian coordinates (in \AA ngstroms) in the form "\textit{atom-name x y z}". The atom names are not case-sensitive. For example, a water molecule is specified as:
\begin{verbatim}
3
Useful comment
O 0.000000 0.000000 0.000000
H 0.000000 0.000000 0.950000
H 0.895670 0.000000 -0.316663
\end{verbatim}
The parameter file contains all parameters needed for the simulation.
The input uses Fortran namelist syntax.
Variables are grouped into different namelists and there are two namelists that have to be always present: \textit{general} and \textit{nhcopt}. For each type of simulation, there is an example input file in
the folder SAMPLES/. Here is the minimum input file for classical MD.
\begin{verbatim}
----------------------------------------------------------------
Everything outside namelists is ignored
&general ! This is the beginning of a namelist.
!Comments within a namelist start with exclamation mark.
!The order of keywords is irrelevant.
!Most of the keywords have reasonable default values.
pot='g09' ! where do we get forces and energies?
natom=27, ! number of atoms
ipimd=0, ! 0=CMD, 1=PIMD,2=SH
mini=2000, ! equilibration period
!keywords on one line are separated by a comma
nstep=50000,dt=20., ! number of time steps, time step [au]
irandom=1651563, ! random seed
irest=0, ! should we restart from restart.xyz?
nwrite=1, ! how often some output should be printed (energies etc.)
nwritex=5, ! how often should we print coordinates?
nrest=5, ! how often should we print restart files?
nwritev=0, ! how often should we print velocities?
/ ! end of a namelist
&nhcopt
temp=298.15, ! Temperature [K] for Maxwell-Boltzmann sampling and thermostat.
inose=1, ! Thermostatting: 0 = no thermostat, 1 = Nosé-Hoover 1, 2 = GLE
tau0=0.001, ! relaxation time of the NHC thermostat [ps]
/
\end{verbatim}
Besides these two input files, you need to prepare the directory with a bash script calling the specified \textit{ab initio} program e.\,g. ORCA/r.orca. You have to modify this file and specify the method, basis set and other parameters specifying the \textit{ab initio} calculation. The interface can be found in the folder INTERFACES.
After you prepared your input files, you can run \abin simply as:
\bigskip
\noindent \colorbox{black!20}{$ \$ ./abin \quad [-x \; coordinates.xyz]\; [-i\; input.dat]\quad [-v \; velocities.xyz]\; \quad > abin.out $}
\bigskip \noindent
As you can see, you can also provide file with initial velocities in the same XYZ format as coordinates. However, velocities must be in atomic units. Without the command line options, the program looks for initial coordinates in file \textit{mini.dat}, input parameters in \textit{input.in} and takes initial velocities from Maxvell-Boltzmann distribution.
If you run \abin in cluster environment, it is better to use the automated scripts as described in the section \ref{sec:clusters}.
\subsection{Output files}
At the beginning of the simulation, \abin dumps all the input parameters (including the defaults not specified by the user)
to the standard output along with some other essential information like atom masses, number of threads in parallel simulations etc.
During the simulation, only the timestep and time in femtosecond is printed to standards output, all the other information are printed to separate files.
All output files are simple text files. Besides the files containing the geometries, they should all be readable by \textsc{xmgrace} and you can thus visualize the time evolution of each quantity very easily. Each file contains a one-line header explaining the quantity in each column.
\begin{description}
\item[temper.dat] This file contains instantaneous and average temperature. The third column shows the conserved quantity of the respective thermostat (see section about NH thermostat). You should always check, whether this quantity does not drift. Otherwise, the thermostat
is probably not working properly.
\item[energies.dat] This file contains various types of energies. The content may vary slightly depending on the type of simulation. However, the columns are usually in the following order: \textit{time}, \textit{potential energy}, \textit{kinetic energy}, \textit{total energy}.
\item[movie.xyz] This file contains the trajectory in the xyz format and is directly e.g. by VMD or Molden.
\item[restart.xyz] This file contains all the necessary information so that \abin can restart the simulation in case it crashed or if the user wants to extend it. All data are given in atomic units. The structure of this file depends on the simulation parameters (\textit{e.g.} which thermostat you use). The first line contains the current time step. Then there is a section containing the current geometry followed by the section containing the velocities. The third section contains either the thermostat parameters (if you use one) or surface hopping wave function. The next section contains the cumulative averages of certain estimators. The last part contains the state of random number generator. This part is not mandatory. If it's missing, the random number generator simply uses the random seed from the input file and starts the sequence from the beginning.
% This is obsolete, user should not mess with the restart file at all
%If you need to produce restart file manually (but really you probably don't) you must always use the correct character string that comes at the beginning of each section (i.e. "\textit{Cartesian Velocities [au]}").
%For surface hopping or classical MD, you can use program UTILS/make\_restart.f90 to produce the appropriate restart file.
\end{description}
The other output files are written only when using certain methods. These are described later in the manual.
\subsection{Running ABIN on clusters}
\label{sec:clusters}
There are a few scripts that are recommended for efficient execution of \abin jobs on clusters. All of them are simple BASH scripts and it should be straightforward to customize them to your needs.
The launching script \textit{r.abin} (it should be located at the same directory as your input files) can be submitted simply as follows:
\bigskip
\colorbox{black!20}{\$ \textit{qsub -cwd r.abin}} (you may also want to specify the queue by \textit{-q QueueName})
\bigskip \noindent
The script copies all files in the current working directory to the directory on the node scratch and launches \abin there to avoid I/O traffic over the network.
The names of the node and the scratch directory are written to file job.log
In order to see the current data from the run, one can use the script \textit{fetchabin.sh} simply as:
\bigskip
\colorbox{black!20}{$ \$ \quad fetchabin.sh $} (assuming it is in your \textsc{\$PATH}).
\bigskip \noindent
This will copy the files from the scratch to the current working directory.
\noindent
To interrupt the simulation, use:
\colorbox{black!20}{$ \$ \quad ExitAbin.sh $}
\bigskip \noindent
This script will create empty file \textit{EXIT} in the scratch directory. \abin checks for the existence of this file at the beginning of each step and if present, prints the restart file and exits. There are two advantages of this approach over simple \textit{qdel} command. First, you have the most recent restart file. Moreover, the files from scratch disk are automatically deleted (if you used script r.abin).
\subsection{Running classical MD}
Here we will see how to setup a basic classical NVT simulation using velocity Verlet algorithm to propagate classical Newton's equations and Nos\'{e}-Hoover-Chain thermostat to keep constant temperature.
It is supposed that you have a clean directory with the XYZ geometry of your system.
\begin{itemize}
\item The first thing to decide is what program to use to get forces and energies. If you would choose for example ORCA, copy the directory
ABIN-INTERFACES/ORCA. Now you have to modify script ORCA/r.orca and specify the method and basis that you want to use. The job of this script is simple: take the geometry from \abin, create input for \textsc{Orca}, launch \textsc{Orca}, collect data from \textsc{Orca}'s output and send them back (via files) to \abin.
\item Now you need to create or modify the file with input parameters. Copy the sample file from SAMPLES/input.in.cmd.
This file is commented so it should be straightforward to understand. There are a few basic things that need to be specified in every simulation: where do we get forces (parameter \textit{pot}), how many atoms we have (\textit{natom}), what type of simulation do we want (\textit{ipimd}, how many time steps and their length (\textit{nstep, dt}). These keywords are in namelist "general". There are two more critical parameters in namelist "nhcopt". You need to specify the thermostat (in this case we will use Nosé-Hoover-Chain thermostat, inose=1)
and temperature of the system (\textit{temp}). The keyword \textit{temp} also specifies the initial temperature i.e. the Maxvell-Boltzmann distribution from which we randomly take initial velocities. And that's basically it.
\item Now you can launch the simulation. If you are using the cluster environment, copy the script UTILS/r.abin.
You should modify it and specify the names of your input files and and path to the \abin binary. Then you can send the job to queue and use \textit{fetchabin.sh} script too see, what's happening. It is strongly advised that each simulation has its own folder and that there is no other stuff in that folder. Otherwise, \textit{fetchabin.sh} will copy unnecessary amount of data and will rewrite any changes that you could make since the simulation began.
\end{itemize}
There are couple of other options that you typically want to set. Each simulation should have its unique random seed (\textit{irandom}). This is especially critical for surface hopping simulations and simulations utilizing quantum thermostat ($inose=2$). Secondly, you might want to control the frequency of writing to the output to files (\textit{nwrite, nwritex, nrest, nwritev}). Especially the trajectory file \textit{movie.xyz} might get very big for large systems and/or long simulations. If you are doing thermodynamical simulations $(inose > 0$) you don't need every geometry anyway, since they are correlated and therefore useless for statistical analysis.
When running a classical simulation without thermostat (\textit{inose=0}) you should always check the conservation of energy in file \textit{energies.dat}. If you use thermostat ($inose>0$), you should check the conserved quantity of the thermostat, which is printed at the fourth column in file \textit{temper.dat}. This might vividly oscillate, but it should not drift. If it drifts, then your time step is too long, or you need to tune the parameters of the thermostat, as explained below the next section.
\subsubsection{Tuning Nosé-Hoover thermostat}
A comprehensive review of Nosé-Hoover algorithm is given in section \ref{sec:NHC}. Here we describe only practical details.
It is sometimes necessary to tune the parameters of the thermostat if its conserved quantity drifts or if it even produces incorrect temperature. There are couple of parameters that are needed to specify the action of the thermostat. All of them have default values, which were chosen for typical systems containing hydrogen atoms. The most important parameter is so called relaxation time of the thermostat ($tau0$) given in picoseconds. If this value is not chosen appropriately, the thermostat may decouple from the system and will not work.
If your system does not contain fast vibrations, you may want to set a larger value.
Other parameters are more robust. Number of chains ($nchain=4$) and number of inner iterations during of the NHC integrator ($nrespnose$, $nyosh$) have by default high values that should suffice for common usage. Since the simulation time is typically determined by \textit{ab initio} calculations, the overhead caused by these conservative settings is usually completely negligible.
\abin by default uses so called massive thermostatting scheme originally devised for PIMD simulations ($imasst=1$).
If you know what you are doing, you might want to use global thermostat, but be aware that this is not tested nearly as much as the massive version and can in some cases provide wrong results for small systems, because \abin does not remove initial rotations. The details about the global version can be found in section \ref{sec:shake} which explains how to set up the simulation with SHAKE constraints.
\subsection{Restarting the simulation}
Quite often you will need to restart the simulation from the last point. This is typically pretty straightforward. Simply set \textit{irest=1} in namelist "general" and start the simulation again.
Note that \abin always checks for the presence of files movie.xyz and restart.xyz. If they are present but \textit{irest=0}, the program will stop with an error. This prevents an inadvertent deletion of your precious data. However, if you delete movie.xyz and restart.xyz and set \textit{irest=0}, all other output files will be rewritten.
Before reading any data, \abin copies the original restart.xyz file to "restart.xyz.\textit{timestep}". This file serves as a backup if anything goes wrong, so that you can always go back to that restart file.
\medskip
WARNING: If the frequency of restart printing (\textit{nrest}) is different from the other ones (\textit{nwrite}, \textit{nwritex}), it may happen that there will be duplicate entries in the output files after the restart. However, you can easily fix this after the end of simulation by simply deleting the conflicting lines and geometries. You can easily determine, whether the simulation was restarted by searching for headers in the output files (lines beginning with "\# Time").
\medskip
In some rare cases, you may want to change the parameters of the simulations when before restart.
This may cause troubles because the format of the restart file might be different. If you want for example to turn on the thermostat, the restart file will be missing the thermostat section. You may fix that by setting \textit{readNHC=0} or \textit{readQT=0}. In other more complicated cases, you will have to manually edit the restart file. As mentioned before, \abin is strict about the exact string match in order to prevent reading corrupted restart file.
If you want to see what's really going on during restart, checkout the subroutine \textit{restin} in src/analysis.f90. If you want to create the restart file for surface hopping simulation, you should use/modify program \verb|UTILS/make\_restart.f90|.
\subsection{Running Path Integral MD}
TODO:
PIMD simulation produces one more output file -- \textit{est\_energy.dat} -- which contains the instantaneous values and cumulative averages of primitive and centroid virial total energy estimators (equations \ref{eqeprim} a \ref{eqevir}). The cumulative average of the virial estimator should converge much more quickly to an constant value. However, both estimators should always converge to the same value. If they do not, something is wrong with your simulation.
Note that during PIMD simulations \abin always prints geometries of every bead.
\subsubsection{Running parallel simulations}
PIMD algorithm can be straightforwardly parallelized, as we need to do do \textit{nwalk} independent \textit{ab initio} calculations at each time step.
\abin uses OpenMP parallelization for this. Note that this is the only part of the code, that is currently parallelized.
You can set variable \textit{nproc} to specify the number of processors you want to use. As you might suspect, \textit{nwalk} should be divisible by \textit{nproc} for maximum efficiency (\abin will complain if it is not).
Note that you can in addition use parallel version of the \textit{ab initio} program as well. You can of course do that for all types of simulations. However, in the case of PIMD, it is almost always more efficient to use \abin parallelization, since you rarely get perfect linear scaling in \textit{ab initio} codes.
For example, it might be convenient to set \textit{nproc=4} when using PI-GLE method with 4 beads (see below), and also set 2 processors for each ab initio calculation. This setup requires 8 processors in total.
\subsection{Running simulations with GLE thermostat}
Besides the Nosé-Hoover thermostat, \abin can also use the thermostat based on Generalized Langevin equation (GLE). While this thermostat can be used also for classical MD and has many different application, it is currently used to simulate quantum nuclear effects using either Quantum Thermostat or PI+GLE method as described below.
\subsubsection{Quantum thermostat}
The quantum thermostat method (here denoted QT) was devised by M. Cerriotti et al. CITE for efficient simulations of quantum nuclear effects without the need for Path Integrals, which are very computationally demanding. There are two key ideas behind this method:
\begin{itemize}
\item The QT method is based on the solution of harmonic oscillator in thermal equilibrium.
It turns out that both classical and quantum distributions have Gaussian shape, albeit with different widths. One can then convert the classical distribution into the quantum distribution
by running classical MD at elevated temperature $T^{\star}$ given by:
\begin{equation}
T =
\end{equation}
\end{itemize}
\subsubsection{PI+GLE method}
The ideas behind the quantum thermostat can be also applied for accelerating the convergend of PIMD simulations to the quantum limit.
\subsection{Running Surface Hopping simulations}
TODO:
\subsection{SHAKE and global NH thermostat}
TODO:
\label{sec:shake}
\colorbox{black!20}{imasst=0 } %\hspace{4cm}} \newline
\colorbox{black!20}{natmolt=2 } %\hspace{4cm}}
\subsection{Boundary conditions}
By default, all simulations are performed under open boundary conditions.
For classical simulation with Amber force fields (keyword pot=nab), the periodic boundary conditions (\textbf{PBC}) are being developed.
For cluster simulations, one can use simple spherical restraining potential (\textbf{SBC}) as explained in detail below.
%The more elaborate elastic bag method\cite{bag} is currently under development.
%\subsubsection{PBC}
\subsubsection{SBC}
TODO
%\subsubsection{Elastic bag}