Here we show example job scripts for various kinds of parallelization, jobs that use fewer cores than are available on a node, GPU jobs, low-priority condo jobs, and Hadoop and Spark jobs.
Simple Serial Job
Threaded/OpenMP Job
Simple Multi-Core Job
Serial Tasks Running in Parallel Job
MPI Job
Alternative MPI Job
Hybrid OpenMP+MPI Job
GPU Job
Low-Priority Job
Hadoop Job
Spark Job
Simple Serial Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Partition:
#SBATCH --partition=partition_name
#
# Account:
#SBATCH --account=account_name
#
# Wall clock limit:
#SBATCH --time=00:00:30
## Run command
./a.out
Threaded/OpenMP Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Request one node:
#SBATCH --nodes=1
#
# Specify one task:
#SBATCH --ntasks-per-node=1
#
# Number of processors for the single task, as needed for your use case (example):
#SBATCH --cpus-per-task=4
#
# Wall clock limit:
#SBATCH --time=00:00:30
## Command(s) to run (example):
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./a.out
Simple Multi-Core Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Request one node:
#SBATCH --nodes=1
#
# Specify the number of tasks for your use case (example):
#SBATCH --ntasks-per-node=20
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
./a.out
Serial Tasks Running in Parallel Job
#!/bin/bash
#SBATCH --job-name=job-name
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --nodes=2
#SBATCH --cpus-per-task=2
#SBATCH --time=2:00:00
#
## Command(s) to run (example):
module load bio/blast/2.6.0
module load gnu-parallel/2019.03.22
#
export WDIR=/your/desired/path
cd $WDIR
#
# set number of jobs based on number of cores available and number of threads per job
export JOBS_PER_NODE=$(( $SLURM_CPUS_ON_NODE / $SLURM_CPUS_PER_TASK ))
#
# Expand the (possibly range-compressed) node list into one hostname per line
scontrol show hostnames "$SLURM_JOB_NODELIST" > hostfile
#
parallel --jobs $JOBS_PER_NODE --slf hostfile --wd $WDIR --joblog task.log --resume --progress -a task.lst sh run-blast.sh {} output/{/.}.blst $SLURM_CPUS_PER_TASK
See the GNU Parallel documentation for details.
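The wrapper script run-blast.sh is not shown above. A minimal sketch of what it might contain, assuming it takes the query file, the output file, and the thread count as its three arguments (matching the order passed by the parallel command above), and assuming a BLAST nucleotide database named nt:
#!/bin/bash
# Hypothetical run-blast.sh: $1 = query file (a line from task.lst),
# $2 = output file, $3 = number of BLAST threads.
# The database name (nt) is an assumption; substitute your own.
blastn -query "$1" -db nt -out "$2" -num_threads "$3"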
MPI Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of MPI tasks needed for use case (example):
#SBATCH --ntasks=40
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
mpirun ./a.out
On shared partitions, multiple jobs can be allocated on the same node when cores are available, and each node's memory is divided evenly among its CPU cores.
LR7 is a shared partition on the Lawrencium supercluster, so one CPU core and 4590MB of RAM are allocated by default. --cpus-per-task=xxx
must be specified when your job needs multiple cores.
If more than 4590MB of memory is needed, --cpus-per-task=xxx
is likewise required to allocate the right amount of memory, since memory is assigned in proportion to the cores requested.
Another shared partition is Es1, where users share the GPU cards on each node. Please make sure the ratio of GPU cards to CPU cores is 1:2, e.g.: --gres=gpu:1 --ntasks=2
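For example, a minimal sketch of the directives for a job that needs roughly 18GB of memory on LR7 (the 4590MB-per-core figure is taken from the description above; the lowercase partition name lr7 is an assumption):
#SBATCH --partition=lr7
#SBATCH --ntasks=1
# 4 cores x 4590MB per core gives roughly 18GB of memory
#SBATCH --cpus-per-task=4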
Alternative MPI Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of nodes needed for use case:
#SBATCH --nodes=2
#
# Tasks per node based on number of cores per node (example):
#SBATCH --ntasks-per-node=20
#
# Processors per task:
#SBATCH --cpus-per-task=1
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
mpirun ./a.out
If the MPI application is compiled with Open MPI built with PMIx support, it can be launched directly with the srun command. Below is an example with the executable a.out.
srun ./a.out
The other parameters in the above Slurm script can be kept the same.
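If you are unsure whether PMIx support is available, you can list the MPI plugin types that srun supports, and select PMIx explicitly if needed:
# List the MPI plugin types supported by srun; look for "pmix"
srun --mpi=list
# Launch with PMIx selected explicitly
srun --mpi=pmix ./a.out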
Hybrid OpenMP+MPI Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Number of nodes needed for use case (example):
#SBATCH --nodes=2
#
# Tasks per node based on --cpus-per-task below and number of cores
# per node (example):
#SBATCH --ntasks-per-node=4
#
# Processors per task needed for use case (example):
#SBATCH --cpus-per-task=5
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run (example):
module load gcc openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./a.out
GPU Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=es1
#
# Number of nodes:
#SBATCH --nodes=1
#
# Number of tasks (one for each GPU desired for your use case; example):
#SBATCH --ntasks=1
#
# Processors per task (always request a total number of processors equal to twice the number of GPUs):
#SBATCH --cpus-per-task=2
#
# Number of GPUs; this can be in the format "gpu:[1-4]" or, with the GPU type included, "gpu:V100:[1-4]":
#SBATCH --gres=gpu:1
#
# Wall clock limit:
#SBATCH --time=1:00:00
#
## Command(s) to run (example):
./a.out
The Es1 partition consists of GPU nodes with three generations of NVIDIA GPU cards (V100, GTX 2080TI, A40); see this page for details. A compute node with a particular GPU type and count can be requested from Slurm as follows.
General format: --gres=gpu:[type]:count
This format can schedule jobs on nodes with V100, GTX 2080TI, or A40 GPU cards:
GRTX2080TI: --gres=gpu:GRTX2080TI:1 (up to 3 or 4 GPUs)
V100: --gres=gpu:V100:1 (up to 2 GPUs)
A40: --gres=gpu:A40:1 (up to 4 GPUs)
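For instance, a minimal sketch of the directives for requesting two V100 GPUs on a single Es1 node, paired with two CPUs per GPU as required by the policy described below:
#SBATCH --partition=es1
#SBATCH --nodes=1
# Two V100 GPUs on one node
#SBATCH --gres=gpu:V100:2
# 2 tasks x 2 CPUs per task = 4 CPUs, two per GPU
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2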
To help the job scheduler manage the use of GPUs effectively, your job submission script must request two CPUs for each GPU you use; the scheduler will reject jobs that do not request sufficient CPUs for every GPU. The required GPU-to-CPU ratio is 1:2.
Here's how to request two CPUs for each GPU: the total number of CPUs requested is the product of two settings, the number of tasks (--ntasks) and the CPUs per task (--cpus-per-task).
For instance, in the above example, one GPU was requested via --gres=gpu:1, and the required total of two CPUs was thus requested via the combination of --ntasks=1 and --cpus-per-task=2. Similarly, if your job script requests four GPUs via --gres=gpu:4 and uses --ntasks=8, it should also include --cpus-per-task=1 to request the required total of eight CPUs.
Note that in the --gres=gpu:n specification, n must be between 1 and the number of GPUs on a single node (the per-type maximums listed above). This is because --gres requests resources on a per-node basis.
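As a further sketch, the four-GPU example from the paragraph above written out as directives (assuming A40 nodes, which allow up to 4 GPUs per node):
# Four A40 GPUs on one node
#SBATCH --gres=gpu:A40:4
# 8 tasks x 1 CPU per task = 8 CPUs, two per GPU
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1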
Low-Priority Job
#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=account_name
#
# Partition:
#SBATCH --partition=partition_name
#
# Quality of Service:
#SBATCH --qos=lrc_lowprio
#
# Wall clock limit:
#SBATCH --time=00:00:30
#
## Command(s) to run:
echo "hello world"
Hadoop Job
#!/bin/bash
#SBATCH --job-name=hadoop
#SBATCH --partition=lr2
#SBATCH --qos=lr_debug
#SBATCH --account=ac_abc
#SBATCH --nodes=4
#SBATCH --time=00:10:00
#
source /global/home/groups/allhands/bin/hadoop_helper.sh
#
# Start Hadoop On Demand
hadoop-start
#
# Example 1
hadoop jar $HADOOP_DIR/hadoop-examples-1.2.1.jar pi 4 10000
#
# Example 2
mkdir in
cp /foo/bar in/
hadoop jar $HADOOP_DIR/hadoop-examples-1.2.1.jar wordcount in out
#
# Stop Hadoop On Demand
hadoop-stop
See the Hadoop job submission page for more information.
Spark Job
#!/bin/bash
#SBATCH --job-name=spark
#SBATCH --partition=lr2
#SBATCH --qos=lr_debug
#SBATCH --account=ac_abc
#SBATCH --nodes=4
#SBATCH --time=00:10:00
source /global/home/groups/allhands/bin/spark_helper.sh
# Start Spark On Demand
spark-start
# Example 1
spark-submit --master $SPARK_URL $SPARK_DIR/examples/src/main/python/pi.py
# Example 2
spark-submit --master $SPARK_URL $SPARK_DIR/examples/src/main/python/wordcount.py /foo/bar
# Stop Spark On Demand
spark-stop
See the Spark job submission page for more information.