08_Running_Ambuild

Ambuild-code edited this page Oct 18, 2023 · 17 revisions

8.1 How to Set Up an Ambuild Input File

Once Ambuild has been installed on your machine (as described in the installation guide README.md, found at: https://github.com/linucks/ambuild), you can start using it to generate models of amorphous materials such as hyper-crosslinked polymers (HCPs), covalent triazine frameworks (CTFs), or conjugated microporous polymers (CMPs), to name but a few. The following instructions explain how to set up and run a test script.

  1. After logging in to your account, create (or move to) a directory where you wish to run your Ambuild job. For example, my directory will be called: /opt/ambuild/work/testcase.
  2. Inside this directory, we will need three things: the Ambuild python input file, all of the required building blocks that you will use in your script, and all of the parameters required to describe the bonds, angles, and intermolecular potentials encountered during the production of your polymer material.

Here, my Ambuild input file is called test.py. I have placed all of the building blocks I need to run this script (found within the tests/blocks directory downloaded as part of the Ambuild install) within a directory called blocks, and all of the parameter forcefields in a directory called params.
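Before going further, it can be worth confirming that the working directory really contains the three things listed above. The following is a minimal sketch rather than part of Ambuild: the function name missing_inputs is hypothetical, and the default names (test.py, blocks, params) are simply the ones used in this example.

```python
from pathlib import Path


def missing_inputs(workdir, script="test.py", subdirs=("blocks", "params")):
    """Return the names of required items that are absent from workdir.

    The defaults mirror the layout used in this guide; they are not
    names that Ambuild itself requires.
    """
    root = Path(workdir)
    missing = []
    if not (root / script).is_file():
        missing.append(script)
    for name in subdirs:
        if not (root / name).is_dir():
            missing.append(name)
    return missing
```

Calling missing_inputs(".") from inside the working directory returns an empty list when everything is in place.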

  3. Next, we will make sure that the test.py Ambuild input file is ready to run. Opening it up shows me this:
#!/usr/bin/env python3
#Our imports
from ambuild import ab_cell

paramsDir='./params'

#Cell Dimensions:
boxDim=[20,20,20]

#Create the cell and add fragments to the library
mycell = ab_cell.Cell(boxDim, atomMargin=0.1, bondMargin=0.5, bondAngleMargin=5, paramsDir=paramsDir )

mycell.libraryAddFragment( filename='./blocks/ch4.car', fragmentType='ch4', markBonded=True )
mycell.libraryAddFragment( filename='./blocks/hydrogen.car', fragmentType='h2', markBonded=True )

#Specify the bonding rules
mycell.addBondType( 'ch4:a-ch4:a' )
mycell.addBondType( 'h2:a-h2:a' )
mycell.addBondType( 'ch4:a-h2:a' )

#Add building blocks to the cell in a loop of 100 steps
for i in range(100):
    mycell.seed( 1, fragmentType='ch4')
    mycell.seed( 1, fragmentType='h2')

    #Retain structural information after each step
    mycell.dump()

test.py is a very simple input file which is designed to determine the minimum cell size that can successfully incorporate all of the building blocks I have asked it to include.

All Ambuild input scripts must begin with the line: #!/usr/bin/env python3.

They must also import the ab_cell module from the ambuild python package in order to be able to create a cell to build the polymer structures up within.

Additional imports may also be required depending on the particular script you wish to run. For example, the ab_util module (also imported from the ambuild package) is needed in order to restart a script from a previous .pkl.gz output step. This will be shown in more detail later in this guide.

Once we have imported all the necessary modules, we must specify the path to the directory containing all of the forcefield parameters (in .csv format) needed to build our models. In this example the path is given relative to the working directory: paramsDir='./params'.

The next thing to do is to set the cell dimensions. Ambuild is designed to build cubic unit cells, with identical x, y, and z dimensions, but it can also be used to generate cuboidal cells. These can be set using the boxDim argument, as described in our wiki page (https://github.com/linucks/ambuild/wiki/03_Creating_The_Cell).

We can now create our unit cell and add to the library the fragments we wish to use in our script. This is done using:

mycell = ab_cell.Cell(boxDim, atomMargin=0.1, bondMargin=0.5, bondAngleMargin=5, paramsDir=paramsDir )
mycell.libraryAddFragment( filename='./blocks/ch4.car', fragmentType='ch4', markBonded=True )

To add multiple fragments to the library, simply repeat the mycell.libraryAddFragment command until each fragment has been added. Again, more details about this can be found in our wiki page (https://github.com/linucks/ambuild/wiki/03_Creating_The_Cell).

As with the parameters, the path to each building block must be provided so that Ambuild can find all of the building blocks it needs to run the script. The path given here shows that the building blocks are found within the blocks subdirectory of the working directory in which the input file is located. The blocks directory must contain both the .car and .csv file for each building block required to allow Ambuild to run.
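Because each building block needs both a .car and a .csv file, a quick check of the blocks directory can catch a missing file before the job is started. This is only a sketch: the helper name blocks_missing_csv is hypothetical, and it assumes the .csv file shares the base name of its .car file (for example ch4.car and ch4.csv).

```python
from pathlib import Path


def blocks_missing_csv(blocks_dir):
    """Return sorted names of .car files with no matching .csv alongside.

    Assumes the two files share a base name, e.g. ch4.car and ch4.csv.
    """
    root = Path(blocks_dir)
    return sorted(
        car.name
        for car in root.glob("*.car")
        if not car.with_suffix(".csv").is_file()
    )
```

An empty list from blocks_missing_csv("./blocks") means every .car file found has a partner .csv file.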

The bonding rules can then be specified for the polymer you wish to build using the mycell.addBondType command, as specified in our wiki (section 03_Creating_The_Cell). The bonding rules can be used to describe heterocoupling, homocoupling, and a catalytic pathway.

The next stage in our test.py input file is to add the required building blocks into the cell using the mycell.seed command. Here, I have added 100 building blocks of methane and 100 building blocks of hydrogen by running a loop of 100 steps and adding one of each building block during each step, but these values can, of course, be adjusted to suit your particular case.

The mycell.dump command can then be used to retain structural information. Each time mycell.dump is called, a new .pkl.gz output file will be generated. This command can also be incorporated into for loops within the script. However, it is advisable not to retain too many output files, as these can quickly use up the free space on your computer. If successful, test.py should produce 100 .pkl.gz output steps.
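To keep an eye on how much space the output steps are taking up, something like the sketch below can be run in the working directory. The helper dump_usage is hypothetical, and the step_*.pkl.gz pattern is an assumption based on the step_X.pkl.gz file names used later in this guide.

```python
from pathlib import Path


def dump_usage(workdir, pattern="step_*.pkl.gz"):
    """Return (file count, total size in bytes) of dump files in workdir."""
    files = list(Path(workdir).glob(pattern))
    return len(files), sum(f.stat().st_size for f in files)
```

For example, count, size = dump_usage(".") gives the number of output steps written so far and their combined size in bytes.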

8.2 Running Ambuild Using Docker

Now that the Ambuild input file is set up correctly, it can be run directly from the terminal. (Please be aware that the first time Ambuild is run on a new computer account or machine there may be a relatively long start-up process whilst Ambuild downloads the docker images for HOOMD-blue, which are several GB in size. Ambuild will only do this once).

To run Ambuild from the terminal, firstly ensure that you are inside the working directory containing the Ambuild input script, in my case /opt/ambuild/work/testcase. Then, simply type in:

/opt/ambuild/misc/run_ambuild_docker.sh test.py

Replace test.py with the name of your Ambuild input script. The output will be printed to the screen.

It is also possible to run Ambuild scripts in the background (recommended), allowing you to close your terminal window without causing the script to crash. This is done using:

nohup /opt/ambuild/misc/run_ambuild_docker.sh test.py &

Another option, which again runs Ambuild in the background, is to have the script write to a log file. The command above writes its output to a file called 'nohup.out'; the following command instead redirects both the output and any error messages to a file called 'log':

nohup /opt/ambuild/misc/run_ambuild_docker.sh test.py > log 2>&1 &
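If you prefer to launch jobs from Python rather than typing the nohup command, a roughly similar effect can be had with the standard subprocess module. This is a sketch under assumptions: launch_in_background is a hypothetical helper, and starting a new session is used here in place of nohup, so the behaviour is similar but not identical.

```python
import subprocess


def launch_in_background(command, log_path="log"):
    """Start command detached from the terminal, logging stdout and stderr.

    command is a list such as
    ["/opt/ambuild/misc/run_ambuild_docker.sh", "test.py"].
    Returns the subprocess.Popen object for the running job.
    """
    with open(log_path, "w") as log:
        return subprocess.Popen(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # same effect as 2>&1
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # POSIX: job is not tied to this terminal
        )
```

The returned object can be ignored, or its wait() method can be called if you do want to block until the job finishes.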

Each line of the run_ambuild_docker.sh script is explained in detail below:

docker run --rm: run docker and remove the container on exit.

--runtime=nvidia: use the Nvidia environment to take advantage of the GPU acceleration.

--volume $run_dir:$run_dir: make the current working directory from where this command is run (specified using the variable `run_dir="$PWD"`) available inside the container.

--volume ${ambuild_dir}/ambuild:/home/abbie/ambuild/ambuild: make the ambuild directory within ${ambuild_dir} on the local file system available as /home/abbie/ambuild/ambuild within the container. This makes it possible for the python3 executable within the container to find the ambuild code, so that import ambuild within the Ambuild script works.

--workdir $run_dir: make the working directory inside the container the full path to the working directory on the local machine. This means that the directory from which the script is launched is also used as the working directory for running Ambuild inside the container.

glotzerlab/software:2020.11.18-cuda10: use the docker image from [glotzerlab/software version 2020.11.18-cuda10](https://hub.docker.com/r/glotzerlab/software/). This downloads the image from the Docker repository (it's very large, several GB, so the download can take some time, although it's only done once), and uses this to create the container.

python3 $script: run the specified script, containing the Ambuild commands in the current directory with the python3 executable in the container.
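To see how these pieces fit together, the sketch below assembles the same arguments into a single command as a Python list. It is reconstructed from the descriptions above and is not a copy of run_ambuild_docker.sh; the function name docker_command and the default ambuild_dir of /opt/ambuild are assumptions for illustration.

```python
import os


def docker_command(script, ambuild_dir="/opt/ambuild", run_dir=None):
    """Build the docker argument list described above for one Ambuild script.

    ambuild_dir is assumed to be the Ambuild install location;
    run_dir defaults to the current working directory.
    """
    run_dir = run_dir or os.getcwd()
    return [
        "docker", "run", "--rm",                 # remove the container on exit
        "--runtime=nvidia",                      # GPU acceleration
        "--volume", f"{run_dir}:{run_dir}",      # working directory
        "--volume", f"{ambuild_dir}/ambuild:/home/abbie/ambuild/ambuild",
        "--workdir", run_dir,
        "glotzerlab/software:2020.11.18-cuda10",  # docker image
        "python3", script,
    ]
```

Printing " ".join(docker_command("test.py")) shows the full command on one line, which can be handy for comparing against your own copy of the shell script.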

Structural information from the .pkl.gz output steps can be obtained as discussed in our wiki (https://github.com/linucks/ambuild/wiki/09_Saving_Results).

8.3 Running Ambuild from a Previous Output Step using Docker

If you wish to run Ambuild from a previously-generated .pkl.gz step, this can be done as described below:

  1. Follow steps 1 and 2 of section 8.1. In addition to the Ambuild input file and the directories containing all of the required parameters and building blocks, the working directory must also contain the .pkl.gz output step from which you wish to restart.
  2. Now, we need to make some modifications to our Ambuild input file, test.py, in order to allow it to be started from a previous output step. This is done as follows:
  • In the "#Our imports" section of the script, add a line saying: from ambuild import ab_util.
  • Remove the "#Cell Dimensions:" section of the script, as the unit cell has already been assembled in the output step you will be importing.
  • Remove the mycell.libraryAddFragment and mycell.addBondType lines (unless you are adding new fragments or bonding rules).
  • Remove the mycell = ab_cell.Cell... line and replace with the following (with X being the number associated with the .pkl.gz output step you wish to begin from):

mycell = ab_util.cellFromPickle("step_X.pkl.gz", paramsDir=paramsDir )

The test.py script should now look like the one below. If so, it can be run as described above.

#!/usr/bin/env python3

#Our imports
from ambuild import ab_cell
from ambuild import ab_util

paramsDir='./params'

#Load output file
mycell = ab_util.cellFromPickle("step_1.pkl.gz", paramsDir=paramsDir )

#Add building blocks to the cell
mycell.seed( 100, fragmentType='ch4')
mycell.seed( 100, fragmentType='h2')

#Retain structural information
mycell.dump()
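When restarting, the step you want is often simply the most recent one. The sketch below picks out the step_X.pkl.gz file with the largest X in a directory. The helper latest_step is hypothetical, and it assumes X is a plain integer, as in the step_1.pkl.gz example above.

```python
import re
from pathlib import Path


def latest_step(workdir):
    """Return the step_X.pkl.gz path with the largest X, or None if none exist."""
    best = None
    for path in Path(workdir).glob("step_*.pkl.gz"):
        match = re.fullmatch(r"step_(\d+)\.pkl\.gz", path.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return best[1] if best else None
```

Comparing the numbers as integers matters here: sorting the names as text would place step_10.pkl.gz before step_2.pkl.gz.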

8.4 Restarter Script

To automatically restart your simulation when it fails with a specified error (as currently set up, the script catches "error computing cell list"), three parts of the restarter script must be set up correctly. These can be found within /opt/ambuild/misc:

  • run_ambuild_docker_restart.sh: a modified version of run_ambuild_docker.sh which restarts the script when it stops, up to a specified maximum number of tries (default = 100), feeds the most recent saved state to Ambuild, and redirects stdout and stderr to numbered output files for convenience.
  • Python_header.py: a python header that replaces the mycell = line in the Ambuild submission script. It accepts the most recent saved state from the submission script and reads it into the simulation.
  • Testrestart: a short test script that reads the most recent output file, checks whether the run has failed with a specific error (such as a cell list error), and tells the submission script to restart if it has.

To restart your scripts, first start your simulation off as normal. Then, when it fails:

  1. Edit the Testrestart script so that it catches the desired error or errors (currently set to error computing cell list).
  2. Make a version of your python submission script that contains the Python_header.py and only the operations you want to restart with. For example, if your initial script seeds some blocks and then loops through MD and bonding, include only the MD and bonding in the restart script.
  3. Set the maximum number of tries in the run_ambuild_docker_restart.sh script.
  4. Run that script with your modified Python_header.py.
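The logic of the restart procedure can be sketched in a few lines of Python. This is an illustration of the idea only, not the contents of run_ambuild_docker_restart.sh or Testrestart: the function run_with_restarts and its run_once argument are hypothetical, and the error text is matched case-insensitively because the exact wording of the message is an assumption.

```python
def run_with_restarts(run_once, max_tries=100,
                      errors=("error computing cell list",)):
    """Call run_once(attempt) until its output shows none of the errors.

    run_once is supplied by the caller and must return the captured
    output of one run as a string. Returns the number of attempts made.
    """
    for attempt in range(1, max_tries + 1):
        output = run_once(attempt).lower()
        if not any(err.lower() in output for err in errors):
            # Either the job finished, or it failed with an error
            # that we have not asked to restart on.
            return attempt
    return max_tries
```

In a real setup, run_once would launch Ambuild from the most recent saved state and return what was written to the numbered output file for that attempt.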

The stdout and stderr are redirected to restart.o.x and restart.e.x, where x is the loop count. You can monitor your job by grepping restart.o.* for the output you are interested in, to see how much is happening between restarts.
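The same kind of monitoring can be done from Python if that is more convenient than grep. The sketch below counts, for each restart.o.* file, how many lines contain a given piece of text. The helper count_matches is hypothetical, and the text you search for depends on what your own script prints.

```python
from pathlib import Path


def count_matches(workdir, text, pattern="restart.o.*"):
    """Return {file name: number of lines containing text} for each output file."""
    counts = {}
    for path in sorted(Path(workdir).glob(pattern)):
        with open(path, errors="replace") as handle:
            counts[path.name] = sum(text in line for line in handle)
    return counts
```

A file with a count of zero suggests that attempt failed before reaching the stage you are searching for.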

An example modified Python_header.py is shown below. Here, the script loads the restart file and then runs a loop of 100 steps, seeding one methane building block and one hydrogen building block in each step:

#!/usr/bin/env python3

#Our imports
from ambuild import ab_cell
from ambuild import ab_util

import sys, getopt

inputfile = ''
try:
   opts, args = getopt.getopt(sys.argv[1:],"hi:",["ifile="])
except getopt.GetoptError:
   print (sys.argv[0], ' -i <restartfile>')
   sys.exit(2)
for opt, arg in opts:
   if opt == '-h':
      print (sys.argv[0], ' -i <restartfile>')
      sys.exit(0)
   elif opt in ("-i", "--ifile"):
      inputfile = arg
sys.argv = [sys.argv[0]]
#print ('Restart file is', inputfile)

paramsDir='./params'

#Load the cell from the restart file
mycell = ab_util.cellFromPickle(inputfile, paramsDir=paramsDir )

#Add building blocks to the cell in a loop of 100 steps
for i in range(100):
    mycell.seed( 1, fragmentType='ch4')
    mycell.seed( 1, fragmentType='h2')

    #Retain structural information after each step
    mycell.dump()

8.5 Adding Volume Flags to the run_ambuild_docker.sh Script

The key thing to understand is that the Docker container cannot see the local computer file system - it can only access the directory structure within the container. In order to access files on the local computer, any directories will need to be mounted into the container using `--volume` arguments, and then the internal container path used within any scripts.

For example, if you wish to store your forcefield parameter files or building block files in an external directory (i.e. somewhere that is not within the Ambuild docker download), it is possible to tell the computer to search for these at a particular location by adding a volume flag within the run_ambuild_docker.sh script to that place. This is done by adding a line such as:

--volume full_path_to_directory:full_path_to_directory \

The backslash at the end of the line here is important, as it tells the shell that this section of the script is one continuous command, and to read it as such.

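Once this has been added to the run_ambuild_docker.sh script, it is important to also ensure that the Ambuild input script gives the corresponding path to the parameters or building blocks. For example, if I want Ambuild to take the parameters and building blocks from a Dropbox directory containing one directory called Parameters and another called Blocks, I would add the volume flag below. (The ~ here is shorthand for the full path to my home directory. Write the path out in full in your own scripts, as Docker requires an absolute path inside the container and Python does not expand ~ by default when opening files.)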

--volume ~/Dropbox:~/Dropbox

The file path to the building blocks I would like to call in my Ambuild input script would read:

mycell.libraryAddFragment( filename="~/Dropbox/Blocks/ch4.car..." )

And I would reference my parameters directory using:

paramsDir="~/Dropbox/Parameters"
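Since the same full path has to appear on both sides of the volume flag and again inside the Ambuild input script, it can help to generate it once. The sketch below is run on the local machine, not inside the container. The helper volume_flag is hypothetical, and it assumes you want the directory mounted at the same path inside the container, as in the examples above.

```python
import os


def volume_flag(directory):
    """Return ['--volume', 'path:path'] with directory made absolute.

    A leading ~ is expanded to the home directory of the user
    running this function on the local machine.
    """
    path = os.path.abspath(os.path.expanduser(directory))
    return ["--volume", f"{path}:{path}"]
```

The path printed in the second element, before the colon, is also the one to use for filename and paramsDir in the Ambuild input script.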

8.6 Using Multiple GPUs on Parallel Jobs

To assign a job to a specific GPU, add an additional flag to the run_ambuild_docker.sh script before the volume flags: **-e NVIDIA_VISIBLE_DEVICES=X**, where X is the desired GPU. By default the job is assigned to the first GPU, GPU0. Make as many versions of the script as you have GPUs that you want to utilise, and save the modified shell scripts with different names. Then use the corresponding script when you want to assign a job to a particular GPU card.
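If the docker arguments are held as a Python list, the GPU flag can be slotted in at the position described above. This is a sketch only: assign_gpu is a hypothetical helper, and it assumes the argument list already contains at least one --volume entry.

```python
def assign_gpu(docker_args, gpu):
    """Return a copy of docker_args with the GPU flag before the first --volume.

    Raises ValueError if docker_args contains no --volume entry.
    """
    args = list(docker_args)
    index = args.index("--volume")
    return args[:index] + ["-e", f"NVIDIA_VISIBLE_DEVICES={gpu}"] + args[index:]
```

Calling assign_gpu(args, 0) and assign_gpu(args, 1) on the same list gives one variant per card, mirroring the separately named shell scripts described above.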