Setting up AaronTools

Steven E. Wheeler edited this page Aug 25, 2018 · 27 revisions

Overview

This tutorial will walk you through setting up AaronTools.

AaronTools is a collection of Perl modules for building, manipulating, and probing molecular structures, parsing Gaussian09 and Gaussian16 output files, and interacting with HPC queuing systems.

Minimal requirements for AaronTools

  1. Perl (reasonably recent version)
  2. Perl modules Math::Vector::Real and Math::MatrixReal (available from CPAN)
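
A quick way to confirm both requirements is to ask Perl itself. The sketch below assumes `perl` is on your PATH; the helper name `check_perl_module` is hypothetical and not part of AaronTools. If the modules live outside Perl's default search path, they will be reported as missing unless that directory is listed in PERL5LIB.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of AaronTools): succeed if Perl can load module $1.
check_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

# Report the Perl version, then the status of the two required modules.
perl -e 'print "Perl version: $]\n"' 2>/dev/null || echo "perl not found"
for module in Math::Vector::Real Math::MatrixReal; do
    if check_perl_module "$module"; then
        echo "$module: found"
    else
        echo "$module: MISSING (install from CPAN)"
    fi
done
```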

Setting Up AaronTools

(Below, we assume you are using Bash or a related shell.)

Minimal Install

If you only wish to use AaronTools to build, manipulate, and probe molecular structures and to parse Gaussian output files, but not submit or monitor jobs, the minimal setup is as follows:

Note: A full install of AaronTools is required for AARON (https://github.com/QChASM/Aaron/wiki).

1. Create a directory where you will keep the main files for AaronTools

 cd /home/group/wheeler
 mkdir QChASM

The preferred way to set up AaronTools is to keep one central copy in a location accessible to all users. Individuals who wish to add their own custom ligands, TS libraries, etc. keep these in $HOME/Aaron_libs.

2. Define an environment variable QCHASM

This will point to the directory you just created. Add this variable to your .bashrc (or .bash_profile):

 echo 'export QCHASM=/home/group/wheeler/QChASM' >> ~/.bashrc

3. Define an environment variable PERL_LIB

This points to the directory where Math::Vector::Real and Math::MatrixReal are installed.

 echo 'export PERL_LIB=/home/group/wheeler/Perl' >> ~/.bashrc
 source ~/.bashrc
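
Because `echo ... >> ~/.bashrc` appends a new line every time it is run, repeating these steps leaves duplicate exports behind. A minimal sketch of an idempotent alternative follows. The helper name `append_once` is hypothetical, and the demonstration writes to a temporary file rather than your real .bashrc.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append line $1 to file $2 only if that exact line is absent.
append_once() {
    local line=$1 file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; substitute ~/.bashrc for real use.
rc=$(mktemp)
append_once 'export QCHASM=/home/group/wheeler/QChASM' "$rc"
append_once 'export QCHASM=/home/group/wheeler/QChASM' "$rc"   # no second copy is written
append_once 'export PERL_LIB=/home/group/wheeler/Perl' "$rc"
cat "$rc"
rm -f "$rc"
```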

4. Change into $QCHASM and clone AaronTools from GitHub

 cd $QCHASM
 git clone https://github.com/QChASM/AaronTools.git

5. Run Simple Tests

To make sure everything is set up correctly and all required software is in place, you can run a series of quick tests:

 cd $QCHASM/AaronTools/test
 ./test_simple_install.t

This will run 6 test sets, which cover basic functionalities of AaronTools. Ignore any warnings unless there is an indication of a failed test.

For more detail on a failed test, change into the directory with the same name as the test and run the corresponding test script (ending in .t):

 $ ./test_simple_install.t 
 not ok 1 - environment_setup
 #   Failed test 'environment_setup'
 ...
 $ cd environment_setup
 $ ./environment_setup.t

6. Add AaronTools to your PATH

 echo 'export PATH=$PATH:$QCHASM/AaronTools/bin' >> ~/.bashrc
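
The new PATH entry only takes effect in shells started after .bashrc is re-read. A small sketch for checking whether the current shell already has it, assuming QCHASM is set as in step 2 (the helper name `path_contains` is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if directory $1 is an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains "${QCHASM:-}/AaronTools/bin"; then
    echo "AaronTools bin directory is on PATH"
else
    echo "AaronTools bin directory is NOT on PATH; open a new shell or run: source ~/.bashrc"
fi
```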

Full Install

To enable AaronTools to submit and monitor Gaussian jobs on an HPC cluster, you will also need:

  1. Gaussian09 or Gaussian16
  2. Moab/Torque (PBS), LSF, SGE, or Slurm queuing software

Note: A full install of AaronTools is required for AARON (https://github.com/QChASM/Aaron/wiki).

First complete the Minimal Install (above), then:

7. Define an environment variable QUEUE_TYPE

This tells AaronTools which queuing software your cluster uses. "PBS" (Moab/Torque), "Slurm", "SGE", and "LSF" queue types are currently supported.

 echo 'export QUEUE_TYPE=PBS' >> ~/.bashrc
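
A misspelled queue type is easy to miss, so a quick sanity check can help. The sketch below assumes the value must be spelled exactly as in the list above (this tutorial does not say whether case matters); `valid_queue_type` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 is one of the queue types listed above.
# Assumption: values must match the spelling used in this tutorial exactly.
valid_queue_type() {
    case "$1" in
        PBS|Slurm|SGE|LSF) return 0 ;;
        *)                 return 1 ;;
    esac
}

if valid_queue_type "${QUEUE_TYPE:-}"; then
    echo "QUEUE_TYPE=$QUEUE_TYPE"
else
    echo "QUEUE_TYPE is unset or not one of: PBS Slurm SGE LSF"
fi
```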

8. Create template.job

You need to create a file called template.job in $QCHASM/AaronTools that contains a generic job submission script for your HPC cluster. AaronTools will use this generic job script to construct submission scripts for individual jobs.

The easiest way to do this is to generate a job submission script for a typical Gaussian09 or Gaussian16 computation, save it as template.job, and then make the following replacements:

job name -> $jobname
wall time (in hours) -> $walltime
number of cores -> $numprocs
total memory for job -> $memory
total memory for Gaussian -> $G09memory

Note: AaronTools will not specify the number of cores or memory in the G09 input file, so the job submission script (template.job) must take care of this when it calls g09 via -p=XXX and -m=XXX or by creating a Default.Route file. See the example below for the former, which we find to be preferable.

For job parameters that depend on the number of processors (e.g. memory), you need to define a &formula& section at the bottom of template.job to calculate them.

For example,

 &formula&
 $memory=$numprocs*2
 $G09memory=$numprocs*1.5
 &formula&

will determine, based on $numprocs, the memory requested for the node ($memory, 2GB per core) and the memory allotted to Gaussian ($G09memory, 1.5GB per core). Note that 1.5GB per core gives a whole number of GB only for an even number of cores, so be careful to run only on an even number of cores!
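
To see what these two formulas give for a particular core count, the arithmetic can be reproduced with awk. This is only a worked example of the multiplication, not how AaronTools evaluates the &formula& section; `formula_memory` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the values the example formulas give for a core
# count $1 (2 GB per core for the job, 1.5 GB per core for Gaussian).
formula_memory() {
    awk -v n="$1" 'BEGIN { printf "memory=%g G09memory=%g\n", n * 2, n * 1.5 }'
}

formula_memory 28   # prints: memory=56 G09memory=42
formula_memory 3    # prints: memory=6 G09memory=4.5 (odd core count, fractional GB)
```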

For example, suppose this was a normal job submission script for your cluster for an input file called water.com (using Moab/Torque in this case):

 #PBS -S /bin/bash
 #PBS -N water
 #PBS -q wheeler_q
 #PBS -l nodes=1:ppn=28:Intel
 #PBS -l walltime=12:00:00
 #PBS -l mem=56gb
 module purge
 module load gaussian/09-Intel-SSE4_2
 export GAUSS_SCRDIR=/lscratch/$USER/$PBS_JOBID
 trap "rm -r $GAUSS_SCRDIR" 0 1 2 3 9 13 14 15
 mkdir -p $GAUSS_SCRDIR
 . $g09root/g09/bsd/g09.profile
 cd $GAUSS_SCRDIR
 find $PBS_O_WORKDIR -maxdepth 1 -name "*.chk" -exec cp {} . \;
 g09 -m="42GB" -p=28 $PBS_O_WORKDIR/water.com $PBS_O_WORKDIR/water.log
 # Copy any checkpoint files back to the submission directory
 for chkfile in *.chk; do
   if [ -e "$chkfile" ]; then
     cp "$chkfile" $PBS_O_WORKDIR/.
   fi
 done
 exit

Your template.job file would look like this:

 #PBS -S /bin/bash
 #PBS -N $jobname
 #PBS -q wheeler_q
 #PBS -l nodes=1:ppn=$numprocs:Intel
 #PBS -l walltime=$walltime:00:00
 #PBS -l mem=$memorygb
 module purge
 module load gaussian/09-Intel-SSE4_2
 export GAUSS_SCRDIR=/lscratch/$USER/$PBS_JOBID
 trap "rm -r $GAUSS_SCRDIR" 0 1 2 3 9 13 14 15
 mkdir -p $GAUSS_SCRDIR
 . $g09root/g09/bsd/g09.profile
 cd $GAUSS_SCRDIR
 find $PBS_O_WORKDIR -maxdepth 1 -name "*.chk" -exec cp {} . \;
 g09 -m="$G09memoryGB" -p=$numprocs $PBS_O_WORKDIR/$jobname.com $PBS_O_WORKDIR/$jobname.log
 # Copy any checkpoint files back to the submission directory
 for chkfile in *.chk; do
   if [ -e "$chkfile" ]; then
     cp "$chkfile" $PBS_O_WORKDIR/.
   fi
 done
 exit
 &formula&
 $memory=$numprocs*2
 $G09memory=$numprocs*1.5
 &formula&
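
Before running the tests, it can be useful to preview what the template turns into once the placeholders are filled in. The sketch below mimics the substitution with sed. It is an illustration only, not AaronTools' own code, and `render_template` is a hypothetical helper name. The sample lines are taken from the template above and the values from the water.com example.

```shell
#!/usr/bin/env bash
# Illustration only: mimic the placeholder substitution with sed.
# This is NOT AaronTools' implementation; render_template is a hypothetical name.
# Arguments: jobname walltime(hours) numprocs memory G09memory; template on stdin.
# Values must not contain "/" because it is used as the sed delimiter.
render_template() {
    sed -e 's/\$jobname/'"$1"'/g' \
        -e 's/\$walltime/'"$2"'/g' \
        -e 's/\$numprocs/'"$3"'/g' \
        -e 's/\$memory/'"$4"'/g' \
        -e 's/\$G09memory/'"$5"'/g'
}

# A few lines from the template above, filled in with the water.com values.
printf '%s\n' \
    '#PBS -N $jobname' \
    '#PBS -l nodes=1:ppn=$numprocs:Intel' \
    '#PBS -l walltime=$walltime:00:00' \
    '#PBS -l mem=$memorygb' \
    'g09 -m="$G09memoryGB" -p=$numprocs $PBS_O_WORKDIR/$jobname.com $PBS_O_WORKDIR/$jobname.log' |
    render_template water 12 28 56 42
```

The output should match the corresponding lines of the original water submission script. Shell variables such as $PBS_O_WORKDIR pass through unchanged.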

9. Run Full Tests

Note: For the full install, AaronTools will need to be able to submit jobs to the queue. As such, you should run these tests on the head/login node or any node that will allow job submissions.

To make sure everything is set up correctly and all required software is in place, run the full set of tests:

 cd $QCHASM/AaronTools/test
 ./test_full_install.t

This will run 8 tests, which cover basic functionalities of AaronTools. Ignore any warnings unless there is an indication of a failed test.

After the job_setup test has run, it is important to check the file $QCHASM/AaronTools/test/job_setup/test.job, which AaronTools generates from your template.job.

Make sure that 2 cores and a 2-hour walltime are correctly specified. Also check that the job memory and the memory for Gaussian are correct according to the &formula& section in your template.job.
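
One way to make this check quicker is to print only the lines of test.job that mention cores, walltime, or memory. The sketch below is a rough filter. The helper name `show_resources` is hypothetical, and the patterns are guesses covering common PBS and Slurm directives plus the g09 -p/-m flags, so adjust them for your scheduler. The demonstration file assumes the example template and formulas from step 8.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (with line numbers) the lines of job script $1
# that mention cores, walltime, or memory. Patterns are guesses; adjust as needed.
show_resources() {
    grep -n -i -E -e 'ppn|ntasks|cpus|walltime|--time|mem|-p=|-m=' "$1"
}

# Demonstration on a scratch file for 2 cores and a 2-hour walltime.
job=$(mktemp)
printf '%s\n' \
    '#PBS -l nodes=1:ppn=2:Intel' \
    '#PBS -l walltime=2:00:00' \
    '#PBS -l mem=4gb' \
    'module purge' > "$job"
show_resources "$job"
rm -f "$job"
```

For the real check, pass the path of the generated test.job to `show_resources`.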

The most likely test to fail is job_setup, which tests the ability of AaronTools to submit, find, and kill jobs on the queue. The most likely reason for failure is a problem with template.job. If job_setup fails, check test.job, which was generated by AaronTools based on $QCHASM/AaronTools/template.job, and adjust template.job accordingly. Contact catalysttrends@uga.edu if you continue to have problems.