Passwordless configuration


#1

Hi Experts,

Scenario1:
I have a question about PBS 18.1.3. The cluster has two nodes:

  1. PBS server node “serverNode”
  2. PBS execution node “node1”
  3. Both “serverNode” and “node1” have the same user “test”.
  4. I didn’t configure passwordless access between “serverNode” and “node1” for user “test”.

When I submit a PBS job such as “echo "sleep 60" | qsub” on “serverNode” as user “test”, I get the job output and error files successfully.

Scenario2:
The same PBS 18.1.3 setup, but the cluster has three nodes:

  1. PBS server node “serverNode”
  2. PBS execution nodes “node1” and “node2”
  3. “serverNode”, “node1” and “node2” all have the same user “test”.
  4. I didn’t configure passwordless access between “serverNode”, “node1” and “node2” for user “test”.

When I submit a PBS job such as “echo "sleep 60" | qsub” on “serverNode” as user “test”, I cannot get the job output and error files. The mom log says “Unable to copy file /var/spool/pbs/spool/0.pbspro-server.ER to pbspro-server.pbspro.pbspro.oraclevcn.com:/home/test/STDIN.e0”.

So my question is:

  1. Why can scenario 1 get the output/error files without any passwordless configuration?

Thanks a lot for the help!


#2

Please share the output of PBS_EXEC/unsupported/pbs_dtj <ID of the job that ran in scenario 1>, and the same for scenario 2.


#3

Thanks a lot for the reply. I had mixed up the two cluster configurations; both work fine now.

I would also like to confirm two things with you:

  1. I know that for PBS job files to be staged in and out, we need to configure passwordless ssh from the execution nodes to the PBS server node.
    Do we also need to configure passwordless ssh from the server node to the execution nodes?

  2. If we run openmpi from the PBS server (which is not a mom node), do we need to configure passwordless ssh from the server node to the execution nodes?

Thanks a lot!


#4

Yes: server to node(s), node(s) to server, and node(s) to node(s).

  • Yes, passwordless ssh is required (same as mentioned above).
  • openmpi should be accessible on all the compute nodes (at the same path).
  • openmpi is not required on the PBS server host.
  • Usually, openmpi is compiled with PBS TM and run from a shared location accessible by all the nodes and by the users on those nodes.

#5

Thanks a lot for your patience in helping me. What is the reason for requiring passwordless access from the server to the node(s)? Could you please explain? It seems that we don’t need to copy files from the server to the nodes.


#6

Stage in – PBS Server side
The process of moving one or more job-related files from a storage location to the execution host before running the job.

Stage out – PBS Mom side
The process of moving one or more job-related files from the execution host to a storage location after running the job.

Staging and execution directory
The staging and execution directory is a directory on the execution host where the following happens:
• Files are staged into this directory before execution
• The job runs in this directory
• Files are staged out from this directory after execution
A job-specific staging and execution directory can be created for each job, or PBS can use a specified directory, or a default directory.
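
To illustrate the definitions above, here is a minimal job script sketch. It assumes the stagein/stageout attribute form execution_path@storage_host:storage_path; the host “serverNode” and user “test” come from this thread, while the file names input.dat and result.dat are made up for the example. Please check the syntax against the User's Guide for your PBS version.

```shell
#!/bin/sh
#PBS -N staging_demo
# Stage in: copy input.dat from serverNode to the execution host before the job runs.
#PBS -W stagein=input.dat@serverNode:/home/test/input.dat
# Stage out: copy result.dat from the execution host back to serverNode afterwards.
#PBS -W stageout=result.dat@serverNode:/home/test/result.dat

# Work in the staging and execution directory; fall back to the
# current directory when the script is run outside of PBS.
cd "${PBS_JOBDIR:-.}" || exit 1

# Write a small result file that will be staged out.
{
    echo "ran on: $(uname -n)"
    if [ -f input.dat ]; then
        echo "input bytes: $(wc -c < input.dat)"
    fi
} > result.dat
```

Both copies are done non-interactively with the configured copy command, which is why the job user needs passwordless access between the execution host and the storage host.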

All users should be able to ssh without a password from server to nodes, nodes to nodes, and nodes to server.

  1. Implement host-based authentication (instead of key-based).

  2. The /etc/pbs.conf file on the PBS server and moms should have these lines, in the same order. Append them to your existing pbs.conf and restart the services:

PBS_RCP=/bin/false
PBS_SCP=/usr/bin/scp
PBS_RSHCOMMAND=/usr/bin/ssh

  3. Make sure StrictHostKeyChecking in the /etc/ssh/ssh_config file is set to “no”.
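
To check the result, something along these lines can be run as the job user on each host. It is only a sketch: check_host and the SSH override variable are names invented for this example, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# SSH client to use; can be overridden for a dry run (hypothetical knob).
SSH="${SSH:-ssh}"

check_host() {
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so exit status 0 means non-interactive login to the host works.
    if "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "OK   $1"
    else
        echo "FAIL $1"
    fi
}

# Check every host given on the command line.
for h in "$@"; do
    check_host "$h"
done
```

For example, saved as check_ssh.sh (a hypothetical file name), it could be run as “sh check_ssh.sh serverNode node1 node2” on the server and on each node. Any FAIL line points at a direction where file copy is likely to fail.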

Please let us know if you are stuck with any log files or snapshots.

Thank you


#7

Thanks a lot @adarsh. It is very helpful.