Script to force job termination after a given time


#1

I am using PBS on a cluster where it is configured as follows: if a job is submitted with a given walltime and runs for longer than that walltime, the scheduler does not kill it. Instead, it attaches a ‘red flag’ to the user who submitted it and gives their next job a lower priority.

To avoid this, I would like to add a manual job termination to my PBS script.

See for example the following basic script:

#!/bin/bash

#PBS -l walltime=00:01:00
#PBS -l mem=1gb
#PBS -l nodes=1:ppn=1
#PBS -q batch

./program.o

How may I modify this script in such a way that the job is automatically killed with qdel after 1 minute?


#2

If you submit the above script and program.o runs for more than 1 minute, pbs_mom will kill it automatically, because the job has exceeded its requested walltime of 1 minute. You can see this message in the log file:

02/26/2018 08:30:25;0008;pbs_mom;Job;JOBID.pbsserver;walltime XX exceeded limit 60


#3

As I said, this is not what happens on the cluster I am running the job on, because of that cluster's specific PBS configuration.


#4

Assuming the cluster runs Linux, I would suggest checking whether the ‘timeout’ command is available on the cluster.

With ‘timeout’ you could use something like:

timeout -s TERM 55 ./program.o
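
If the script needs to know whether the program was actually cut off, the exit status can be checked. The following is only a sketch and assumes GNU coreutils ‘timeout’, which returns 124 when the time limit is reached; the function name run_limited is made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: run a command under a time limit (in seconds)
# and report on stderr if the limit was hit.
run_limited() {
    local limit=$1
    shift
    timeout -s TERM "$limit" "$@"
    local status=$?
    # GNU timeout exits with 124 when the command timed out.
    if [ "$status" -eq 124 ]; then
        echo "command timed out after ${limit}s" >&2
    fi
    return "$status"
}
```

In the PBS script above this would be called as: run_limited 55 ./program.o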

Without ‘timeout’ you could use something like:

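# Start the program in the background and remember its PID
./program.o &
PID=$!
# Wait 55 seconds, then ask the program to terminate
sleep 55
kill -TERM "$PID"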
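
Note that the snippet above always sleeps for the full 55 seconds, even if the program finishes earlier. A possible variant that returns as soon as the program exits is sketched below; the function name run_with_watchdog is made up for illustration, and the sketch assumes bash:

```shell
#!/bin/bash

# Hypothetical helper: run a command in the background and send it
# SIGTERM if it is still running after the given number of seconds.
run_with_watchdog() {
    local limit=$1
    shift
    "$@" &
    local pid=$!
    # Watchdog subshell: sleep, then terminate the command if it still exists.
    ( sleep "$limit"; kill -TERM "$pid" 2>/dev/null ) >/dev/null 2>&1 &
    local watchdog=$!
    # Returns as soon as the command exits, whether on its own or by signal.
    wait "$pid"
    local status=$?
    # Stop the watchdog; its sleep may linger until it expires.
    kill "$watchdog" 2>/dev/null
    wait "$watchdog" 2>/dev/null
    return "$status"
}
```

In the PBS script above this would be called as: run_with_watchdog 55 ./program.o
A command terminated by SIGTERM is reported by bash with exit status 143 (128 + 15).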

Vasek


#5

@vchlum: This works, thank you.