I have a problem accessing my nodes via ssh. The nodes are visible on the network (they reply to ping), and pbsnodes -a reports their state as “free”. When I submit a job, it is queued and its status changes to R, but nothing actually happens with it.
When I try to log on to the nodes via ssh, either I can’t connect at all, or I log on, see the last-login information, and then can’t do anything: no command prompt appears.
The headnode is working correctly. When I submitted a test job and then stopped PBS and started it again, pbsnodes -a showed the state of the node on which the job had been running before the restart as “state-unknown, down”. But one day later pbsnodes -a showed the state of all nodes as “free”.
This happened on all my nodes simultaneously. What can I do? Do I need to reinstall all my nodes? How can I diagnose the problem?
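
As a starting point for diagnosis, here is a minimal sketch of a check that could be run from the headnode. It separates "node does not reply to ping" from "ssh cannot run even a trivial command", without waiting for an interactive prompt. The function name check_node and the node names in the usage note are hypothetical. It assumes Linux ping (where -W is the reply timeout in seconds), the coreutils timeout command, and key-based ssh login; it does not identify the root cause by itself.

```shell
# Sketch only: check_node is a hypothetical helper, not part of PBS.
check_node() {
    node="$1"

    # One ICMP probe; wait at most 2 seconds for a reply (Linux ping syntax).
    if ! ping -c 1 -W 2 "$node" >/dev/null 2>&1; then
        echo "$node: no ping reply"
        return 1
    fi

    # BatchMode=yes makes ssh fail instead of prompting for a password.
    # ConnectTimeout bounds only the connection attempt, so the outer
    # `timeout 15` stops the test if the session hangs after login.
    # Running `true` instead of a login shell avoids needing a prompt.
    if timeout 15 ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true \
            </dev/null >/dev/null 2>&1; then
        echo "$node: ssh command ok"
    else
        echo "$node: ssh failed or timed out"
        return 1
    fi
}
```

It could be called once per node, for example `for n in node01 node02; do check_node "$n"; done`, with the names replaced by the hosts that pbsnodes -a lists. If a node replies to ping but the ssh step times out, the problem is more likely on the node itself than in the network.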