Qsub blocking is not working

#1

Hello,
Sorry for double posting; I have also posted this on GitHub. But since I believe this is not a bug, I think this is the better place to discuss my problem.

I am submitting a job with qsub from python3 as follows (here is the complete code):

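import shutil
import subprocess

scfc = ["qsub", "-Wblock=true", "script.sh"]
optstate = ["opt1", "opt2", "opt3"]

for sstate in optstate:
    # Submit the job; with -Wblock=true, qsub should not return until it finishes
    subprocess.call(scfc)
    sdir = optstate.index(sstate)
    print(sstate)
    genincar2(sdir)  # helper defined in the complete code
    shutil.copy2("INCAR", "INCAR." + str(sstate))
    shutil.copy2("CONTCAR", "CONTCAR." + str(sstate))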

and the script.sh is:

#!/bin/bash
#PBS -S /bin/bash
#PBS -N Test
#PBS -l select=2:ncpus=24:mpiprocs=24
#PBS -q workq
#PBS -joe
#PBS -V
export I_MPI_FABRICS=shm:tmi
export I_MPI_PROVIDER=psm2
export I_MPI_FALLBACK=0
export KMP_AFFINITY=verbose,scatter
module load intel/2018
module load vasp/5.4.4
cd $PBS_O_WORKDIR
cat $PBS_NODEFILE > pbs_nodes
echo Working directory is $PBS_O_WORKDIR
NPROCS=`wc -l < $PBS_NODEFILE`
NNODES=`uniq $PBS_NODEFILE | wc -l`
mpirun -np $NPROCS --machinefile $PBS_NODEFILE vasp_std

With -Wblock=true (and subprocess.call, which itself waits for the command to exit), I expect the Python code to wait until vasp_std has finished before moving on to the sdir = optstate.index(sstate) line.

But this is not the case: I get an error because CONTCAR has not been generated yet.

Can someone kindly help?

#2

Sorry, I forgot the versions:
pbsnodes --version
pbs_version = 18.1.3

python3 --version
Python 3.4.9

uname -a
Linux master1.clusternet 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

#3

When used with -Wblock=true, no exit status is returned.

Basically, submitting a job with qsub -Wblock=true causes the server to hang (it does no work, much like a denial of service) and blocks the operations that other users of the server need to perform, so it is better to use some other method to monitor the job.
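
One such method is polling the job state from Python. The following is only a minimal sketch. It assumes that `qstat <job_id>` exits with a nonzero status once the job has finished or is no longer known to the server, which is worth verifying on your PBS version. The names `job_is_active` and `wait_for_job` are hypothetical helpers, not part of PBS.

```python
import subprocess
import time


def job_is_active(job_id):
    """Return True while `qstat <job_id>` exits with status 0.

    Assumption: qstat exits nonzero once the job is finished or unknown.
    """
    status = subprocess.call(
        ["qstat", job_id],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return status == 0


def wait_for_job(job_id, is_active=job_is_active, poll_seconds=30.0, timeout=None):
    """Poll until the job is no longer active.

    Returns True if the job left the queue, False if `timeout` seconds
    of polling elapsed first. `is_active` can be replaced for testing.
    """
    waited = 0.0
    while is_active(job_id):
        if timeout is not None and waited >= timeout:
            return False
        time.sleep(poll_seconds)
        waited += poll_seconds
    return True
```

Here the job id would come from the output of a plain (non-blocking) qsub call, for example `subprocess.check_output(["qsub", "script.sh"])`, decoded and stripped of whitespace.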

#4

That isn’t true: -Wblock=true only blocks qsub from exiting. The server can continue to service requests.

In one window:

[vstumpf@shecil ~]$ qsub -Wblock=true -- /bin/sleep 10
7.shecil

In another:

[vstumpf@shecil ~]$ qsub -- /bin/sleep 10
8.shecil
[vstumpf@shecil ~]$ qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
7.shecil          STDIN            vstumpf           00:00:00 R workq           
8.shecil          STDIN            vstumpf           00:00:00 R workq           
[vstumpf@shecil ~]$ 

Eventually, once job 7 finishes, the qsub in the first window will exit.
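
Since qsub itself blocks, the calling script can check what qsub returned instead of assuming the job succeeded. Below is a minimal sketch, not a definitive fix. It assumes that a nonzero return code from the blocking command indicates a failed submission or job, and it adds an independent check that the expected output file exists. `run_blocking_job` is a hypothetical helper name. It uses `Popen` rather than `subprocess.run`, because `subprocess.run` requires Python 3.5 or later and the original poster is on 3.4.9.

```python
import os
import subprocess


def run_blocking_job(cmd, expected_output):
    """Run a blocking submit command, then confirm its output file exists.

    Returns whatever the command printed on stdout (for qsub, the job id).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()  # blocks until the command exits

    # Assumption: a nonzero return code means the submission or job failed.
    if proc.returncode != 0:
        raise RuntimeError(
            "{} exited with status {}: {}".format(
                cmd[0], proc.returncode, err.strip()
            )
        )
    # Independent check: do not rely on the return code alone.
    if not os.path.exists(expected_output):
        raise RuntimeError("{} was not produced".format(expected_output))
    return out.strip()
```

In the loop from the first post, `subprocess.call(scfc)` would become `run_blocking_job(scfc, "CONTCAR")`. Note that a CONTCAR left over from an earlier iteration would also pass the existence check, so removing or renaming it between runs may be needed.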

#5

FYI:
https://pbspro.atlassian.net/browse/PP-823
