Running Your First Job
After gaining access to the HPC system at Tribhuvan University, users can submit jobs using the SLURM job scheduler. This guide explains how to run your first job, monitor its status, and optimize job submissions.
1. Logging into the HPC System
Before running jobs, connect to the TU HPC system via SSH:
ssh username@tu-hpc-ip
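Optionally, a host entry in ~/.ssh/config saves retyping the address each time. Below is a minimal sketch; tu-hpc is an arbitrary alias, and tu-hpc-ip and username stand in for your actual login details:
# ~/.ssh/config entry (hypothetical alias; substitute the real host address and your username)
Host tu-hpc
    HostName tu-hpc-ip
    User username
With this in place, logging in shortens to ssh tu-hpc.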
2. Creating a SLURM Job Script
A job script tells the SLURM scheduler how to allocate resources and execute your program. Create a job script (e.g., job.slurm) using a text editor:
nano job.slurm
Add the following lines to the file:
#!/bin/bash
#SBATCH --job-name=test_job # Job name
#SBATCH --output=output.txt # Output file
#SBATCH --error=error.log # Error file
#SBATCH --ntasks=1 # Number of tasks
#SBATCH --cpus-per-task=4 # Number of CPU cores per task
#SBATCH --time=01:00:00 # Time limit (hh:mm:ss)
#SBATCH --partition=normal # Specify partition (e.g., normal, protein, fplo)
module load python/3.10.12 # Load required module
python my_script.py # Run Python script
Save and exit the editor (press CTRL + X, then Y, then ENTER).
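The module and partition names in the script are site-specific. Before writing the script, the available options can be listed with the standard Environment Modules and SLURM commands (the output depends on the TU HPC configuration):
module avail                  # list all software modules installed on the system
sinfo -o "%P %l %D"           # list partitions with their time limits and node counts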
3. Submitting the Job
Submit the job to the SLURM scheduler using:
sbatch job.slurm
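If the submission succeeds, sbatch prints a confirmation line with the assigned job ID (the number below is illustrative):
Submitted batch job 12345
This job ID is what the monitoring commands in the next step expect.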
4. Monitoring Job Status
To check the status of your submitted jobs, use:
squeue -u $USER
To view detailed information about a specific job:
scontrol show job <job_id>
To cancel a queued or running job:
scancel <job_id>
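A typical squeue listing looks like the following (all values are illustrative):
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
12345    normal test_job username  R       0:42      1 node01
In the ST column, R means the job is running and PD means it is pending in the queue.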
5. Checking Job Output
Once the job has completed, check the output and error files:
cat output.txt # View job output
cat error.log # View error messages (if any)
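While a job is still running, its output file can be followed live; press CTRL + C to stop watching (this does not affect the job):
tail -f output.txt            # stream new lines as the job writes them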
6. Optimizing Job Submissions
- Request only the CPU cores, memory, and time your job actually needs; over-requesting resources leads to longer queue times.
- Run test jobs with small datasets before scaling up computations.
- Use job dependencies if running multiple interdependent tasks (see the sketch after this list):
sbatch --dependency=afterok:<job_id> job2.slurm
- Consider using parallel computing (MPI, OpenMP) for efficiency.
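As referenced above, a dependency chain can be scripted with sbatch's --parsable flag, which makes sbatch print only the job ID. This is a minimal sketch; job1.slurm and job2.slurm are placeholder script names:
# Submit the first job and capture its ID (--parsable prints only the job ID)
jobid=$(sbatch --parsable job1.slurm)
# Start the second job only if the first one finishes successfully
sbatch --dependency=afterok:${jobid} job2.slurm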
7. Example: Running a GPU Job
For GPU-based workloads, modify the script as follows and save it as, e.g., gpu_job.slurm:
#!/bin/bash
#SBATCH --job-name=gpu_test
#SBATCH --output=gpu_output.txt
#SBATCH --gres=gpu:1 # Request 1 GPU
#SBATCH --partition=protein # Use GPU-enabled partition
module load cuda/11.2
./gpu_program
Then submit it the same way:
sbatch gpu_job.slurm
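To confirm that a GPU is actually allocated before launching longer work, a quick interactive check can help (this assumes the standard nvidia-smi tool is installed on the GPU nodes):
srun --partition=protein --gres=gpu:1 nvidia-smi   # print the status of the allocated GPU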
Start computing efficiently on TU HPC! 🚀