PBS job submit problem


#1

Dear all,

I submitted some jobs to a compute node (node1) via qsub.

When node1's resources were full and I kept submitting jobs, the state of a newly submitted job changed E->Q->R->Q->(R->Q)->H. In the end the job was held, and it sits there permanently.

But when I submit a job (job1) and specify which compute node it should run on (qsub -l vnode=node1), job1 stays in state Q (queued) until resources are released, and then it runs.

I want my jobs to run whenever resources become available. How can I do that?
When does a job go into state H (held), and how can I get a held job to run again?

Please help me.
Thanks for any help.


#2

Hi kimolai,
Apologies for the late reply. It seems the jobs are failing to start on the node, and after multiple failed attempts the server moves the job to the held state (H). This behaviour is observable when the job's run_count is greater than 20. You should check the server and MoM logs to find out why the job is failing to run.
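
As a rough sketch of how to check this, the snippet below filters the output of qstat -f down to the attributes that matter here. The helper name job_summary and the job id 123 are only placeholders, and the log locations mentioned in the comments assume a default install where the logs live under $PBS_HOME.

```shell
# Sketch only: job_summary is a hypothetical helper, not a PBS command.
# It keeps the lines of `qstat -f` output that show why a job is held.
job_summary() {
    # qstat -f prints job attributes as "    name = value"
    grep -E '^[[:space:]]*(job_state|run_count|Hold_Types|comment) = '
}

# Typical use on the cluster (replace 123 with your own job id):
#   qstat -f 123 | job_summary
#
# To see why the job fails to start, look at the logs for that job id.
# tracejob collects the entries for one job, but it needs read access
# to the log files, so it is usually run on the server host:
#   tracejob 123
# Assumed default locations: $PBS_HOME/server_logs and $PBS_HOME/mom_logs
# (the MoM logs are on the compute node, node1 in this case).
```

If run_count is above 20 and job_state is H, that matches the behaviour described above.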

To release a job in the H state, use the qrls command.
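
A minimal sketch of that step follows. The helper held_ids is a made-up name, and it assumes the default qstat layout where the state is the fifth column. I believe a hold placed by the server after too many failed run attempts is a system hold, in which case a plain qrls (which releases a user hold) will not be enough and an operator or manager has to release it. Treat that as an assumption and check Hold_Types on your job first.

```shell
# Sketch only: held_ids is a hypothetical helper, not a PBS command.
# It reads default `qstat` output on stdin and prints the ids of held jobs.
# Assumes the default layout: Job id, Name, User, Time Use, S, Queue.
held_ids() {
    awk '$5 == "H" { print $1 }'
}

# Typical use on the cluster (replace 123 with your own job id):
#   qstat | held_ids      # list held jobs
#   qrls 123              # release a user hold (the default hold type)
#   qrls -h s 123         # release a system hold; needs operator or
#                         # manager privilege
```

Releasing the job without fixing the cause will most likely just send it back through the same cycle, so check the logs first.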

Regards
Dilip