STDOUT and STDERR in submission directory


#1

Hello,

I want the job output files to be kept in the submission directory. At the moment the .o and .e files are being sent to $HOME. This is my job script:

$ cat submit.sh
#!/bin/bash
### Job Name
#PBS -N test_job
### Keep output and error files on the execution host
#PBS -k oe
### Select 1 node with 32 CPUs
#PBS -l select=1:ncpus=32

##########################################
#                                        #
#   Output some useful job information.  #
#                                        #
##########################################

echo ------------------------------------------------------
echo -n 'Job is running on node '; cat $PBS_NODEFILE
echo ------------------------------------------------------
echo PBS: qsub is running on $PBS_O_HOST
echo PBS: originating queue is $PBS_O_QUEUE
echo PBS: executing queue is $PBS_QUEUE
echo PBS: working directory is $PBS_O_WORKDIR
echo PBS: execution mode is $PBS_ENVIRONMENT
echo PBS: job identifier is $PBS_JOBID
echo PBS: job name is $PBS_JOBNAME
echo PBS: node file is $PBS_NODEFILE
echo PBS: current home directory is $PBS_O_HOME
echo PBS: PATH = $PBS_O_PATH
echo ------------------------------------------------------

myjob

After submitting it using

qsub submit.sh

I get test_job.o44 and test_job.e44 in $HOME. The test_job.o44 does show the correct submission/working directory.

PBS: working directory is /home/trumee/job/test1

How can I force the output files to stay in the submission directory?


#2

In addition, qstat -f shows

jobdir = /home/trumee
Output_Path = myserver:/home/trumee/job/test1/test_job.o46
Error_Path = myserver:/home/trumee/job/test1/test_job.e46

So the Path attributes are being set correctly, but the .o and .e files still end up in $HOME. Is that because jobdir is being set to $HOME instead of the submission directory?


#3

It is probably because of the “-koe”. Please try submitting the job without this.

Regards,
Shwetha


#4

I want to look at the output in real time rather than at the end of the job. I need the “-k” option for that.


#5

“-koe” is only used to retain the job’s error and output files on the execution host at the end of the job. In your case it is retaining them in the user’s shared home directory. Here’s a small clip from the man page:

-k keep Specifies whether and which of the standard output and standard error streams is retained
on the execution host. Overrides default path names for these streams. Sets the job’s
Keep_Files attribute to keep. Default: neither is retained. Overrides -o and -e options.


#6

You are probably looking for the direct_write feature in PBSPro, which was checked in a week ago. It will let you monitor the stdout/stderr files in real time, provided the final destination is mapped on the execution host.
Design Document:
https://pbspro.atlassian.net/wiki/spaces/PD/pages/51901651/PP-516+Direct+write+of+the+job+s+stdout+err+files.

If you want to use this feature, you may need to create a build from the mainline code.

This is targeted for PBSPro 18.1 release.
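
As a rough illustration of how a job script might request direct write, here is a minimal sketch. It assumes a build that includes the feature, and it assumes (going by the design document) that direct write is requested by adding "d" to the -k argument; please check the qsub man page of your build before relying on it. The job name and resource request are copied from the script in post #1.

```shell
#!/bin/bash
#PBS -N test_job
# Assumption: "d" asks PBS to write stdout/stderr directly to the final
# destination. This follows the direct write design document and is not
# available in releases that predate the feature.
#PBS -k oed
#PBS -l select=1:ncpus=32

# Use the submission directory; fall back to $PWD when run outside PBS.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

echo "PBS: working directory is $workdir"
# The real workload ("myjob" in the original script) would be started here.
```

With this in place, the test_job.o and test_job.e files should grow in the submission directory while the job runs, as long as that directory is reachable from the execution host.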


#7

Thanks, I am running OHPC which is using the stable release of PBS.

If I omit the ‘-k oe’ flag and specify the output/error files like

#PBS -o /home/trumee/test1/myjob.o
#PBS -e /home/trumee/test1/myjob.e

unfortunately, I don't see these files being created on the head node while the job is running. However, I can see the .OU and .ER files being created in /var/spool/pbs/spool/ on the slave exec node.

So is there any way to have these STDOUT files written to the submission directory on the submission host while the job runs?


#8

Thanks for your post @trumee. We will be working with the OHPC folks to include the latest version of PBS once it is released. As @nithinj pointed out, you will be able to use the direct write feature to accomplish your goal. I can’t give you an exact date, but I suspect the latest PBS release should be available to OHPC users in the first half of 2018. The sooner the better, IMHO.
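
Until direct write is available, a common workaround that does not depend on it is to redirect the output yourself inside the job script. The sketch below assumes the submission directory sits on a filesystem shared with the execution node (as /home appears to be in this thread). The file names and the run_job function are hypothetical stand-ins for the real workload.

```shell
#!/bin/bash
#PBS -N test_job
#PBS -l select=1:ncpus=32

# Assumption: the submission directory is on a filesystem that the
# execution node can also see. Falls back to $PWD when run outside PBS.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# Hypothetical file names, chosen for this example only.
out="$workdir/test_job.${PBS_JOBID:-local}.out"
err="$workdir/test_job.${PBS_JOBID:-local}.err"

# Stand-in for the real workload ("myjob" in the original script).
run_job() {
    echo "PBS: working directory is $workdir"
}

# Write stdout and stderr straight into the submission directory.
run_job >"$out" 2>"$err"
```

The files can then be followed from the submission host with tail -f. How quickly new lines show up depends on the program's own output buffering and on the shared filesystem's caching. The regular .o and .e files are still produced by PBS at the end of the job, but they will be mostly empty.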