Use GPUs¶

Overview¶

Run GPU-accelerated jobs on the HPC system using the job scheduler.

Before you begin¶

Make sure:

you can log into the HPC system
(see: Log into HPC)
your code supports GPU execution
required software or libraries are available
you know how many GPUs your job requires

Steps¶

1. Log into HPC¶

ssh <username>@hpc.uct.ac.za

2. Create a GPU job script¶

Create a file called gpu-job.slurm:

nano gpu-job.slurm

Add the following content:

#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --output=gpu-%j.out
#SBATCH --error=gpu-%j.err
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8G
#SBATCH --gres=gpu:1

# Load required modules (if needed)
# module load <gpu-enabled-module>

# Run your program
<command-to-run>

Replace: - <job-name> with a meaningful name - <command-to-run> with your GPU-enabled command

3. Submit the job¶

sbatch gpu-job.slurm

4. Check job status¶

squeue -u <username>

Verify¶

Check GPU allocation¶

After the job starts, check output:

cat gpu-<job-id>.out

Look for indicators that GPUs are being used (depends on your software).

Check job completion¶

squeue -u <username>

If the job no longer appears, it has finished or failed.

Troubleshooting¶

Job does not start¶

Possible causes:

no GPUs available
incorrect resource request
queue limitations

GPU not used¶