Glossary



B

Batch Job - A job that can be assigned to a compute node and run to completion without any further user intervention.

Batch Scheduler - A program that controls resources and background program execution on HPC systems. It allocates resources to jobs and ensures high utilisation of those resources.


C

Compute Node/Node - The workhorse units that make up a computer cluster. They are typically off-the-shelf servers with multiple processors in each server.

Computer Cluster - A group of servers and resources that are configured to work together as a single computing resource.

Core/CPU/Processors - A server will contain one or more CPUs or processors, and each CPU will contain multiple cores, which are the units that carry out the actual computations. These terms are sometimes used interchangeably, which can lead to some confusion.

CPU Hours - A measure of the amount of processing used by a program. One CPU hour is a single core running for one hour.
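
As a rough illustration of the CPU hours definition above, the sketch below multiplies the number of cores a job holds by the time it runs. The function name and the example figures are hypothetical, and real accounting on a given cluster may charge differently (for example, per node rather than per core).

```python
def cpu_hours(cores: int, wall_hours: float) -> float:
    """CPU hours used by a job that holds `cores` cores for `wall_hours` hours."""
    return cores * wall_hours


# Hypothetical job: 16 cores running for 2.5 hours
print(cpu_hours(16, 2.5))  # 40.0
```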


D

Distributed Memory Machine - A parallel computer where each processor has its own local memory and operates independently. The processors are connected via a high speed interconnect.

Distributed Memory Parallelism - A programming model that utilises message passing to take advantage of parallel computers with a distributed memory architecture.


E

Embarrassingly Parallel - A set of tasks that can be performed concurrently, with no communication between them, is said to be embarrassingly parallel, e.g. a parameter sweep.
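
A minimal sketch of a parameter sweep, assuming a hypothetical `simulate` function that stands in for one independent task. It uses a thread pool from the Python standard library only to keep the example short. On a cluster, each task would more likely be a separate process or batch job.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(x: float) -> float:
    """Hypothetical stand-in for one independent task in the sweep."""
    return x * x


def sweep(params, workers=4):
    """Run simulate() on every parameter; no task depends on another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(simulate, params))


print(sweep([0.5, 1.0, 1.5, 2.0]))  # [0.25, 1.0, 2.25, 4.0]
```

Because the tasks never exchange data, they can be run in any order and on any number of workers without changing the result.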


H

HPC - High performance computing is a collection of hardware systems, software tools, programming languages and parallel programming paradigms, which together make previously infeasible calculations possible.


I

I/O - Input and output operations. Usually refers to reading from and writing to files on disk.

I/O Bound - Refers to processes whose rate of progress is limited by the speed of the I/O subsystem.

InfiniBand Network - A high speed network link often used in HPC clusters.

Interactive Job - Jobs that require manual user interactions, such as debugging or interacting with a GUI.


L

Login Node - A user facing node/server that is used to connect to the HPC cluster, submit jobs, compile code and transfer files. This node is available on a free-for-all basis to all users. It should not be used to run computational jobs.

Lustre - A parallel file-system used to provide a high rate of I/O operations.


M

Modules - A software package used to manage a user’s environment, to allow access to and to easily switch between various pieces of software.

MPI - Short for Message Passing Interface. This is the standard library interface used for message passing, and is used to write distributed memory parallel code.


N

Node - Refers to a standalone computer or server, which can perform any one of a number of tasks, e.g. login node or compute node.


O

Observed Speedup - A simple indicator of a parallel program’s performance, defined as the ratio of a code’s execution time on one core to the code’s execution time on multiple cores.
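
The ratio in the observed speedup definition above can be sketched as follows. The function name and the timings are hypothetical and are chosen only to illustrate the arithmetic.

```python
def observed_speedup(time_one_core: float, time_n_cores: float) -> float:
    """Ratio of execution time on one core to execution time on multiple cores."""
    return time_one_core / time_n_cores


# Hypothetical timings: 120 s on one core, 20 s on eight cores
print(observed_speedup(120.0, 20.0))  # 6.0
```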

OpenMP - A standard library and set of directives used for shared memory parallel code.


P

Parallel File-System - A high performance file-system that enables a high rate of I/O operations and can greatly improve the performance of jobs making use of it.


Q

Queues - Refers to classes or groups of computing resources that can be requested via a batch scheduler. They can, for example, be differentiated by the maximum allowable runtime or the type of hardware.


S

Scalability - Refers to a program’s ability to exhibit a proportional increase in speedup when more resources are added.

SCP - Secure copy is a means to securely transfer files between local and remote computers or hosts.

Secure Shell (ssh) - A secure network protocol allowing remote command line logins and command execution.

Shared Memory Machine - A class of machines where CPUs share a single memory address space. All the processors have access to a pool of shared memory.


W

Walltime - The amount of time specified in a submission script for which a job will run on a batch system.


X

X-Windows - The X Window System, commonly known as X or X11. It is a software system and network protocol that provides a graphical user interface (GUI) for networked computers.