Submit a Job¶
Overview¶
Run a batch job on the HPC system using the job scheduler.
Before you begin¶
Make sure:
- you can log into the HPC system (see: Log into HPC)
- your code or executable is available on the system
- your input data is accessible
- you know how long your job is expected to run
- you have identified the resources your job requires (CPU, memory, GPU if needed)
Steps¶
1. Log into HPC¶
Connect to the HPC system (see: Log into HPC).
2. Create a job script¶
Create a file called job.slurm and add the following content:
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --output=job-%j.out
#SBATCH --error=job-%j.err
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
# Load required modules (if needed)
# module load <module-name>
# Run your program
<command-to-run>
Replace:
- <job-name> with a meaningful name for the job
- <command-to-run> with the command that actually runs your program
Adjust the resource requests (--time, --ntasks, --cpus-per-task, and --mem) to match what your job needs; jobs that exceed their requested time or memory are typically terminated by the scheduler.
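If you routinely submit many similar jobs, the script can also be generated from the shell. This is a minimal self-contained sketch; the job name demo-job and the placeholder command echo hello are assumptions, to be replaced with your own values:

```bash
# Write a minimal Slurm job script to job.slurm.
# "demo-job" and "echo hello" are placeholders; substitute your own
# job name and command.
cat > job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo-job
#SBATCH --output=job-%j.out
#SBATCH --error=job-%j.err
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

echo hello
EOF

# Show the generated script as a quick sanity check
cat job.slurm
```

The quoted 'EOF' delimiter stops the shell from expanding anything inside the here-document, so the #SBATCH lines are written verbatim.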
3. Submit the job¶
Submit the script with sbatch:
sbatch job.slurm
You should see output similar to:
Submitted batch job <job-id>
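With Slurm, the submission command is sbatch, and its --parsable flag prints only the job ID, which makes it easy to capture for later status checks. A sketch (this requires a Slurm installation, so it will not run elsewhere):

```bash
# Capture the assigned job ID; --parsable makes sbatch print just the ID
# (plus the cluster name on multi-cluster setups).
jobid=$(sbatch --parsable job.slurm)
echo "Submitted as job $jobid"
```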
4. Check job status¶
List your jobs in the queue:
squeue -u $USER
This shows your running and queued jobs.
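To watch one job rather than the whole queue, squeue and scancel accept a job ID. A sketch, assuming <job-id> is the ID assigned at submission (Slurm required):

```bash
squeue -j <job-id>   # status of one specific job
scancel <job-id>     # cancel it if something is wrong
```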
Verify¶
Check output files¶
After the job completes, list the files in the directory you submitted from:
ls job-<job-id>.out job-<job-id>.err
You should see:
job-<job-id>.out
job-<job-id>.err
Inspect the output:
cat job-<job-id>.out
Check job completion¶
If the job no longer appears in the queue (squeue shows no entry for it), it has finished or failed. Check the .out and .err files to determine which.
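If your site has Slurm's job accounting enabled, sacct reports the final state and exit code even after the job has left the queue. A sketch (Slurm with accounting required):

```bash
# Final state (COMPLETED, FAILED, TIMEOUT, ...) and exit code.
sacct -j <job-id> --format=JobID,JobName,State,ExitCode
```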
Troubleshooting¶
Job stays in queue¶
Possible causes:
- the requested resources exceed what is currently available
- queue is busy
- job requires a specific partition or resource
The NODELIST(REASON) column in squeue output indicates why the job is still pending.
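The pending reason can also be inspected directly with scontrol. A sketch (Slurm required):

```bash
# The "Reason" field explains why the job is still pending
# (e.g. Priority, Resources, PartitionTimeLimit).
scontrol show job <job-id> | grep -i reason
```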
Job fails immediately¶
Check:
- the error file (job-<job-id>.err)
- the exit code and state reported by the scheduler
Possible causes:
- incorrect command
- missing input files
- required modules not loaded
Output files are empty¶
Possible causes:
- job did not execute correctly
- command produced no output
- job terminated early
Command not found¶
Possible causes:
- required software module is not loaded (add module load <module-name> to the job script)
- incorrect command path (try the full path to the executable)