The Division of Biostatistics at Washington University School of Medicine in St. Louis maintains a 12-node Linux cluster named Saturn for use by faculty and staff in the Division. This vignette discusses how to run SAS jobs according to a cron schedule, and optionally, to share output via email.
- Virtual Private Network (VPN) Client
- allows you to connect to the University network
- the University supports Cisco AnyConnect
- Secure Shell (SSH) Client
- allows you to issue commands on the cluster
- examples include PuTTY or X-Win32
- configure your client to connect to saturn.biostat.lan using the SSH protocol on port 22
- Secure File Transfer Protocol (SFTP) Client (optional)
- makes it easier to upload/download files to/from the cluster
- examples include FileZilla or WinSCP
- configure your client to connect to saturn.biostat.lan on port 22
- GNU nano
- a command-line text editor for Unix-like systems, available on the cluster
- TORQUE
- the resource management system used for submitting and controlling jobs on the cluster
- allows for numerous directives, which are used to specify resource requirements and other attributes for jobs
- For additional help using TORQUE to submit and manage jobs, see the Submitting and managing jobs chapter of Adaptive Computing’s TORQUE Administrative Guide.
- cron
- a time-based job scheduler for Unix-like operating systems, available on the cluster
- mail
- a command-line email client for Unix and Unix-like operating systems, available on the cluster
- SAS script file
- performs all the report generation steps from loading data to generating one or more output files
- this file is called my_sas_script.sas in the vignette and is written to create an output PDF file called “my_sas_output.pdf” when executed
- A TORQUE job script
- a place to specify all of the resource requirements and other attributes for the job
- In this example we will use a file called my_job_script.pbs to request our job be run on a node with SAS installed.
- Crontab file
- In the vignette, we will issue commands to open and edit your
default crontab file to schedule times for the example report to
run.
- In this file each line corresponds to a different combination of job and schedule.
- The first characters of each line specify when the job should be run.
- Various websites such as https://crontab.guru can help determine what to enter to specify the desired schedule.
- The latter characters of each line specify the commands to run at that time.
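As a rough illustration of how the first characters of a crontab line encode the schedule, the sketch below splits the entry used later in this vignette into the five standard cron time fields and the command. The variable names are chosen here for readability only; cron itself does not use them.

```shell
#!/bin/sh
# Split a crontab line into its five schedule fields and the command.
# Variable names are illustrative, not part of cron.
entry='0 13 1 * * cd my_sas_job; qsub my_job_script.pbs'

# read assigns one whitespace-separated word to each variable; the last
# variable receives the rest of the line (the command to run).
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

printf 'minute:       %s\n' "$minute"
printf 'hour:         %s\n' "$hour"
printf 'day of month: %s\n' "$day_of_month"
printf 'month:        %s\n' "$month"
printf 'day of week:  %s\n' "$day_of_week"
printf 'command:      %s\n' "$command"
```

Read this way, the entry means minute 0 of hour 13 on day 1 of any month, regardless of the day of the week.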
- Connect to the WUSTL network via VPN.
- Connect to the Saturn cluster head node via your SSH client.
- Create a new directory to help keep your files organized.
- In this example we will use a directory called “my_sas_job”
- Issue the command
mkdir my_sas_job
to make the directory
- Navigate into the new directory.
- Use the command
cd my_sas_job
- Upload your SAS script file like my_sas_script.sas to the “my_sas_job” directory using your SFTP program.
- Create the TORQUE job script file
- Issue the command
nano my_job_script.pbs
in your SSH program to open a new text file in the nano editor, then enter the following code:
#!/bin/bash
#PBS -l nodes=1:sas
cd my_sas_job
sas my_sas_script.sas
where:
- #!/bin/bash specifies that this is code to be run in a bash shell
- #PBS -l nodes=1:sas tells TORQUE you want one node and, more importantly, that you want a node that has SAS installed (as of this writing, only nodes saturn2.biostat.lan and saturn7.biostat.lan appear to have SAS installed and accept jobs from the default queue)
- cd my_sas_job changes the working directory from your home directory to your job directory
- sas my_sas_script.sas tells SAS to run your SAS script
- Use “Ctrl + O” then press “Enter” to save the file and “Ctrl + X” to exit the nano editor.
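If you would rather not type the job script into nano, one alternative is to write it from the command line with a here-document. The sketch below is a variant built on assumptions, not the Division's recommended script: it adds a job name (`#PBS -N`), merges the error stream into the output file (`#PBS -j oe`), and changes to `$PBS_O_WORKDIR`, the directory from which `qsub` was called, instead of a hard-coded path. The job name `my_sas_job` in the directive and the `job_dir` scratch location are hypothetical.

```shell
#!/bin/bash
# Write the job script to a scratch directory. job_dir is a hypothetical
# location; on Saturn you would use your own my_sas_job directory.
job_dir=$(mktemp -d)

# The quoted 'EOF' stops the shell from expanding $PBS_O_WORKDIR now;
# TORQUE sets that variable when the job actually runs.
cat > "$job_dir/my_job_script.pbs" <<'EOF'
#!/bin/bash
#PBS -N my_sas_job
#PBS -l nodes=1:sas
#PBS -j oe
cd "$PBS_O_WORKDIR" || exit 1
sas my_sas_script.sas
EOF

# Check the script for shell syntax errors without running it.
bash -n "$job_dir/my_job_script.pbs" && echo "syntax OK"
```

Because the crontab entry in this vignette changes into my_sas_job before calling qsub, `$PBS_O_WORKDIR` would point at that directory when the job runs.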
- Schedule your job via cron
- Issue the command
export VISUAL=nano; crontab -e
to edit your default crontab file in the nano editor.
- Enter the following line (make sure the file ends with a newline) to run your job on the first day of every month at 1:00 PM:
0 13 1 * * cd my_sas_job; qsub my_job_script.pbs
- Use “Ctrl + O” then press “Enter” to save the file and “Ctrl + X” to exit the nano editor.
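One detail of the crontab entry worth noting is the `;` between `cd` and `qsub`: the shell runs the second command even if the first one fails. Joining the commands with `&&` would skip the submission when the directory is missing. The sketch below shows the difference, using `echo submitted` as a stand-in for `qsub` and a directory name that is assumed not to exist. It runs each line through `sh -c`, which is roughly how cron hands a command to the shell.

```shell
#!/bin/sh
# 'echo submitted' stands in for 'qsub my_job_script.pbs'.
# missing_dir is a hypothetical path assumed not to exist.
missing_dir=/no_such_directory_for_demo

# With ';' the second command runs even though cd failed.
with_semicolon=$(sh -c "cd '$missing_dir' 2>/dev/null; echo submitted")

# With '&&' the second command runs only if cd succeeded.
with_and=$(sh -c "cd '$missing_dir' 2>/dev/null && echo submitted") || true

echo "with ';'  -> '$with_semicolon'"
echo "with '&&' -> '$with_and'"
```

Under that reading, a stricter entry would be `0 13 1 * * cd my_sas_job && qsub my_job_script.pbs`; whether that is preferable on Saturn is a judgment call.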
- If your job at least starts at the appointed time, you will likely get mail indicating success or the point of failure.
- To check your mail, enter the command
mail
in your SSH client.
- Type the number of the message you wish to read and press “Enter”.
- After reading the email you may enter
d
and press “Enter” to delete the message.
- Type
q
and press “Enter” to quit mail.
- You may also wish to clean up all of the extra files generated in
your job directory “my_sas_job”.
- Navigate into the directory.
- Use the command
rm filename
to remove files you no longer want.
- The mail program is also configured to send email. If you wish to use it to send your report output (e.g., “my_sas_output.pdf”):
- Use
nano message.txt
to open the “message.txt” file in the nano text editor and enter a message for the body of your email.
- Use “Ctrl + O” then press “Enter” to save the file.
- Use “Ctrl + X” to exit the nano editor.
- Add the following code to the bottom of your TORQUE job script:
mail -s "My Subject" -a "my_sas_output.pdf" -c cc_1@wustl.edu,cc_2@wustl.edu -r '"Your Last Name, Your First Name" <your_email@wustl.edu>' recipient_1@wustl.edu,recipient_2@wustl.edu < message.txt
where:
- -s specifies the email subject in quotes
- -a specifies the attachment file
- -c sends carbon copies to user(s) (comma-separated list of names)
- -r specifies the from address
- recipient_1@wustl.edu,recipient_2@wustl.edu is the comma-separated list of recipients
- < message.txt sets the email message body to be the contents of message.txt
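Because the mail command runs whether or not SAS actually produced the report, you may want to guard it. The following is a minimal sketch, assuming the attachment is written to the job's working directory and that message.txt sits alongside it. The function name `send_report` is hypothetical, and only the `-s` and `-a` options described in this vignette are used.

```shell
#!/bin/sh
# send_report is a hypothetical helper: it mails the report only if the
# attachment exists and is not empty, and returns 1 otherwise.
send_report() {
    attachment=$1
    if [ ! -s "$attachment" ]; then
        echo "Report $attachment is missing or empty; no email sent." >&2
        return 1
    fi
    mail -s "My Subject" -a "$attachment" \
        recipient_1@wustl.edu < message.txt
}

# Example call, as it might appear at the bottom of the job script:
# send_report my_sas_output.pdf
```

The `-s` test is true only for a file that exists and has a size greater than zero, so an empty PDF left behind by a failed run would not be sent.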
- Unlike SAS, most everything in Unix is case-sensitive.
- Be careful when typing commands. Whether a character is upper or lower case does make a difference.
- Relatedly, it is recommended to avoid spaces in directory
and file names.
- One option to avoid spaces is to substitute underscores.
- Be careful when transferring files between Windows machines and Unix machines to make sure the line endings are translated properly.
- Windows uses carriage return and line feed (“\r\n”) as a line ending, while Unix uses just line feed (“\n”).
- If you created, or possibly even just edited, either your TORQUE job script or your crontab file on a Windows machine, use a command on Saturn called
dos2unix
to replace the problematic line endings.
- Issue the command
dos2unix file_with_windows_line_endings.ext
- Better yet, only create and edit such files on Saturn.
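To see whether a file actually contains Windows line endings before converting it, you can count the lines that contain a carriage return. The sketch below creates a small sample file for demonstration and strips the carriage returns with `tr`, which is offered here only as a stand-in in case `dos2unix` is not at hand. The file names and the `work_dir` scratch directory are hypothetical.

```shell
#!/bin/sh
# Create a sample file with Windows (CRLF) line endings.
work_dir=$(mktemp -d)
printf 'cd my_sas_job\r\nsas my_sas_script.sas\r\n' > "$work_dir/windows_file.pbs"

# Hold a literal carriage return to use as the search pattern.
cr=$(printf '\r')

# Count lines containing a carriage return before conversion.
before=$(grep -c "$cr" "$work_dir/windows_file.pbs")

# Remove every carriage return, writing a converted copy.
tr -d '\r' < "$work_dir/windows_file.pbs" > "$work_dir/unix_file.pbs"

# grep -c exits non-zero when the count is 0, hence '|| true'.
after=$(grep -c "$cr" "$work_dir/unix_file.pbs") || true

echo "lines with carriage returns: before=$before after=$after"
```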