Managing jobs

The lifecycle of a job can be managed with as few as three commands:

  • Submit the job with sbatch <script_name>.
  • Check the job status with squeue (use squeue -u <user_name> to show only your jobs).
  • (Optional) Cancel the job with scancel <job_id>.
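
As a minimal sketch of this lifecycle, the shell functions below wrap the submit and status steps. The function names submit_job and my_jobs and the script name job.sh are illustrative assumptions, not part of Slurm. The sketch relies on the sbatch option --parsable, which prints only the job ID (optionally followed by ";" and a cluster name) instead of a full sentence.

```shell
# Submit a job script and print only the job ID.
# With --parsable, sbatch prints "<job_id>" or "<job_id>;<cluster>".
submit_job() {
    out=$(sbatch --parsable "$1") || return 1
    # Strip an optional ";<cluster>" suffix.
    printf '%s\n' "${out%%;*}"
}

# List only your own jobs (falls back to `id -un` if USER is unset).
my_jobs() {
    squeue -u "${USER:-$(id -un)}"
}

# Example session (job.sh is a placeholder script name):
#   job_id=$(submit_job job.sh)
#   my_jobs
#   scancel "$job_id"
```

Capturing the job ID at submission time avoids having to read it back from the squeue listing before cancelling.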

You can also delay the start of a job: scontrol hold <job_id> puts a hold on the job, and scontrol release <job_id> releases the hold. A job on hold will not start, and it will not block other jobs from starting, until you release the hold.
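
If you need to hold or release several jobs at once, a small wrapper can loop over the job IDs. This is only a sketch: hold_jobs and release_jobs are hypothetical helper names, and the job IDs in the usage comment are placeholders.

```shell
# Put a hold on every job ID given as an argument.
hold_jobs() {
    for id in "$@"; do
        scontrol hold "$id" || return 1
    done
}

# Release the hold on every job ID given as an argument.
release_jobs() {
    for id in "$@"; do
        scontrol release "$id" || return 1
    done
}

# Example (placeholder job IDs):
#   hold_jobs 1001 1002
#   release_jobs 1001 1002
```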

Job status descriptions in squeue

When you run squeue (typically limited to your own jobs with squeue -u <user_name>), you will get a list of all jobs that are currently running or waiting to start. Most of the columns should be self-explanatory, but the ST and NODELIST (REASON) columns can be confusing.

ST stands for state. The most important states are listed below. For a more comprehensive list, check the squeue help page section Job State Codes.

  • R The job is running
  • PD The job is pending (i.e. waiting to run)
  • CG The job is completing, meaning that it will be finished soon
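
To get a quick overview of how many of your jobs are in each state, you can ask squeue for the state column only and count the codes. The sketch below assumes the standard squeue options -h (no header) and -o '%t' (compact state code); count_states is a hypothetical helper name.

```shell
# Print one line per state code with the number of your jobs in that state,
# for example "PD 3" and "R 1".
count_states() {
    squeue -h -u "${USER:-$(id -un)}" -o '%t' |
        awk '{ n[$1]++ } END { for (s in n) print s, n[s] }' |
        sort
}
```

To list only the jobs in a particular state instead, squeue also accepts a state filter, for example squeue -u <user_name> -t PD for pending jobs.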

If the job is running, the NODELIST (REASON) column lists the compute nodes it is running on. If the job is pending, the column gives the reason why it is still pending. The most important reasons are listed below. For a more comprehensive list, check the squeue help page section Job Reason Codes.

  • Priority There is another pending job with higher priority.
  • Resources The job has the highest priority, but is waiting for some running job to finish.
  • launch failed requeued held The job launch failed for some reason. This is normally due to a faulty node. Please contact us, stating the problem, your user name, and the job ID(s).
  • Dependency The job cannot start until some other job has finished. This should only happen if you submitted the job with --dependency=...
  • DependencyNeverSatisfied Same as Dependency, but the other job failed, so the dependency can never be met. You must cancel the job with scancel <job_id>.
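
Jobs stuck with DependencyNeverSatisfied can be found by printing the job ID and reason of your pending jobs and filtering on the reason. The sketch below assumes the standard squeue format fields %i (job ID) and %r (reason) and the state filter -t PD; stuck_dependency_jobs is a hypothetical helper name.

```shell
# Print the IDs of your pending jobs whose reason is DependencyNeverSatisfied.
stuck_dependency_jobs() {
    squeue -h -u "${USER:-$(id -un)}" -t PD -o '%i %r' |
        awk '$2 == "DependencyNeverSatisfied" { print $1 }'
}

# Example: review the list first, then cancel each job.
#   stuck_dependency_jobs
#   for id in $(stuck_dependency_jobs); do scancel "$id"; done
```

Printing the IDs before cancelling gives you a chance to check that only the intended jobs are affected.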