I need to restart jobs that have failed because of a license error. I have found that the job exit status is 2 in that case. Any suggestions?
Suppose a job starts running but cannot get a license. When it fails to get the license, we would like to detect that and rerun the job. Is this possible by any chance?
I think the correct solution for your issue would be to track the number of licenses as a custom resource so the scheduler would never run a job without an available license.
Another solution would be to use a job dependency with ‘afternotok’. See the qsub man page for this. You need more than one job for this: the original job plus a second job that runs only if the first one terminates with an error.
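A minimal sketch of the ‘afternotok’ idea, assuming the retry is simply a second submission of the same script. The script name `job.sh` and the helper names `qsub_command` and `submit` are hypothetical; only `qsub -W depend=afternotok:<job id>` comes from the qsub man page. Note that the dependent job runs after any failure of the first job, not only a license failure.

```python
import subprocess


def qsub_command(script, after_not_ok=None):
    """Build the qsub argument list, adding an afternotok dependency if given."""
    cmd = ["qsub"]
    if after_not_ok is not None:
        # Run this job only if the named job terminates with an error.
        cmd += ["-W", f"depend=afternotok:{after_not_ok}"]
    cmd.append(script)
    return cmd


def submit(script, after_not_ok=None):
    """Submit the script and return the job ID that qsub prints on stdout."""
    result = subprocess.run(
        qsub_command(script, after_not_ok),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # "job.sh" is a placeholder for your own job script.
    first = submit("job.sh")
    retry = submit("job.sh", after_not_ok=first)
    print(first, retry)
```

This gives one retry per extra submission, so it suits cases where a single rerun is enough.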
I have one more idea, though the job owner needs to be a manager or operator for this. It is probably not the ideal solution, but it may help…
If your job is rerunnable, then you can use the qrerun command within the job on the same job. qrerun returns a rerunnable job to the queued (Q) state. So the idea is to detect the license unavailability within the job script and, if the license is not available, run qrerun on the job's own ID from within the job.
You need to run the command before the job exits.
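As a rough sketch of the qrerun idea, the job script could wrap the application like this. It assumes the application itself exits with status 2 when it cannot get a license (the thread only says the job exit status is 2), that the job is rerunnable, and that the job owner has manager or operator privilege. The names `LICENSE_EXIT_STATUS`, `should_requeue`, `qrerun_command`, and `run_and_requeue` are hypothetical.

```python
import os
import subprocess
import sys

# Assumption: the application returns 2 when no license is available.
LICENSE_EXIT_STATUS = 2


def should_requeue(exit_status):
    """Return True when the exit status looks like a license failure."""
    return exit_status == LICENSE_EXIT_STATUS


def qrerun_command(job_id):
    """Build the qrerun argument list for the given job ID."""
    return ["qrerun", job_id]


def run_and_requeue(app_cmd):
    """Run the application; on a license failure, requeue this job."""
    status = subprocess.run(app_cmd).returncode
    if should_requeue(status):
        # PBS sets PBS_JOBID inside the job; qrerun must run before the job exits.
        subprocess.run(qrerun_command(os.environ["PBS_JOBID"]), check=True)
    return status


if __name__ == "__main__":
    # Usage inside the job script: python wrapper.py <application> [args...]
    sys.exit(run_and_requeue(sys.argv[1:]))
```

If the license never becomes available, the job will keep requeueing itself, so some limit on the number of attempts is worth adding.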
Thanks for the reply.
I have written an execjob_epilogue hook. It looks for a file in the PBS job working directory, and if the file exists it puts the job on hold. This is working.
Now I am working on a periodic hook to release the held jobs every 10 minutes, but I am facing some issues.
Can you help me with it?
I think a periodic hook is not meant for releasing jobs, so is there any other solution using a hook?
Checking licenses with a script is not easy because of the various group access rules on the licenses, so I am looking for a way to release the jobs. I do not want to use a shell script.