5.2) Batch System Basics

Many different batch schedulers are in use across HPC facilities, e.g. PBS, SLURM and LSF. They all share the same principles, but the specific commands and switches depend on both the batch system in question and how it is configured on a particular HPC facility.

Typically, working with batch schedulers follows the pattern:

1. A user submits a request for resources, e.g. execution time, number of cores, memory per core. The job then queues in a spooling area.
2. The scheduler, which manages and controls all the available resources, prioritises the request according to a defined scheduling policy and reserves the requested resources.
3. The scheduler runs the job on the compute nodes when the resources become available. The user has exclusive use of the allocated resources for the duration of the job.
4. Once the job has completed, the scheduler writes the output to the file or directory specified by the user and returns the resources to the available pool.
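
The pattern above can be illustrated with a small simulation. The following is a minimal sketch in Python, not a model of any real scheduler: the names ToyScheduler and Job are hypothetical, the only resource tracked is a core count, time is measured in arbitrary ticks, and the scheduling policy is assumed to be strict first come, first served. Production schedulers such as PBS, SLURM and LSF use considerably more elaborate policies.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Job:
    # Jobs are ordered by sort_key only; a lower key is served first.
    sort_key: int
    name: str = field(compare=False)
    cores: int = field(compare=False)
    walltime: int = field(compare=False)  # requested execution time, in ticks


class ToyScheduler:
    """Hypothetical scheduler: one pool of cores, first come, first served."""

    def __init__(self, total_cores):
        self.total_cores = total_cores
        self.free_cores = total_cores
        self.queue = []     # pending jobs (a heap ordered by sort_key)
        self.running = []   # (finish_time, job) pairs
        self.finished = []  # (job name, finish time), in completion order
        self.clock = 0
        self._ids = count()

    def submit(self, name, cores, walltime):
        # Step 1: the user requests resources and the job joins the queue.
        if cores > self.total_cores:
            raise ValueError("request exceeds the size of the machine")
        heapq.heappush(self.queue, Job(next(self._ids), name, cores, walltime))

    def _start_ready_jobs(self):
        # Steps 2 and 3: take jobs in priority order and start each one
        # while enough cores are free. A job that does not fit blocks
        # everything behind it (no backfilling in this sketch).
        while self.queue and self.queue[0].cores <= self.free_cores:
            job = heapq.heappop(self.queue)
            self.free_cores -= job.cores  # cores are now exclusive to this job
            self.running.append((self.clock + job.walltime, job))

    def run(self):
        self._start_ready_jobs()
        while self.running:
            # Advance the clock to the next job completion.
            self.running.sort(key=lambda pair: pair[0])
            finish_time, job = self.running.pop(0)
            self.clock = finish_time
            # Step 4: record the result and return cores to the pool.
            self.free_cores += job.cores
            self.finished.append((job.name, finish_time))
            self._start_ready_jobs()
        return self.finished


scheduler = ToyScheduler(total_cores=4)
scheduler.submit("a", cores=2, walltime=10)
scheduler.submit("b", cores=4, walltime=5)
scheduler.submit("c", cores=2, walltime=3)
print(scheduler.run())  # → [('a', 10), ('b', 15), ('c', 18)]
```

In this run, job "c" would fit alongside job "a" at the start, but it waits because job "b" is ahead of it in the queue and needs the whole machine. This is one reason real scheduling policies are more sophisticated than the strict ordering assumed here.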
