Issues removing a node from a queue


#1

Hello Everyone,

My users currently cannot run jobs on our cluster. I traced the issue to a faulty node and tried to delete it, but I am getting the error below.

[root@server548 ~]# qmgr -c "delete node term079"
qmgr obj=term079 svr=default: Cannot delete busy object
qmgr: Error (15027) returned from server

I have no idea how to resolve this issue, as I am new to HPC.


#2

If jobs are running in a queue, the queue cannot be deleted.
If jobs are running on a node, the node cannot be deleted.

Please run the commands below to see whether there are any jobs on that node:

  • pbsnodes term079
  • pbsnodes term079 | grep -i job
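If the grep output is hard to read, a small helper along these lines may help pull out just the job IDs. This is only a sketch: `extract_job_ids` is a made-up name, and it assumes the `jobs = ...` line lists entries either as `<jobid>/<slot>` or as `<slot>/<jobid>`, with job IDs that include the server name (for example `1234.server548`). The exact layout can differ between PBS versions, so check it against your own `pbsnodes` output first.

```shell
# Hypothetical helper: print the unique job IDs from the "jobs = ..." line
# of pbsnodes output read on stdin.
# Assumption: entries look like "1234.server/0" or "0/1234.server".
extract_job_ids() {
    grep -i '^[[:space:]]*jobs[[:space:]]*=' \
        | sed 's/^[^=]*=//' \
        | tr ',' '\n' \
        | sed -e 's/[[:space:]]//g' \
              -e 's|^[0-9][0-9-]*/||' \
              -e 's|/[0-9][0-9-]*$||' \
        | grep -v '^$' \
        | sort -u
}

# Example use on the cluster (not run here):
#   pbsnodes term079 | extract_job_ids
```

Each ID it prints can then be inspected with `qstat -f <jobid>` before deciding what to do with the job. If it prints nothing, no jobs were listed on the node.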