Host utilisation


#1

qstat does not report the run queue length (r15s, r1m, r15m), memory paging (pg), io, it (idle time), ls (list of users), tmp, or swap.

Are you aware of any native PBS command that can provide host utilization or load metrics for the hosts?


#2

PBS does not track this information, because it is node-specific rather than job-specific. If you need it, you can create an exechost_periodic hook that collects these values and assigns them to custom resources on the nodes. For example, you could create three custom host-level resources in qmgr:

qmgr -c "create resource r1m,r5m,r15m type=float,flag=h"

Then create a periodic MoM hook like this:

import os
import pbs

# Load averages over the last 1, 5 and 15 minutes
r1m, r5m, r15m = os.getloadavg()

# The hook runs on the MoM, so update the vnode of the local host
e = pbs.event()
mynode = pbs.get_local_nodename()
v = e.vnode_list[mynode]
v.resources_available["r1m"] = r1m
v.resources_available["r5m"] = r5m
v.resources_available["r15m"] = r15m
pbs.logmsg(pbs.LOG_DEBUG, "getloadavg: vnode %s, r1m = %f, r5m = %f, r15m = %f" % (repr(mynode), r1m, r5m, r15m))

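Then import the script into the exechost_periodic hook with qmgr. After the hook has run, you should start to see these values in resources_available on each node (for example in pbsnodes output), where users can read them.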
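Once the values are on the nodes, something has to read them back. Here is a rough sketch of how that could look in Python. It assumes pbsnodes -a prints each node name on an unindented line followed by indented "attribute = value" lines. The function name parse_load_resources and the sample text are made up for illustration, so check them against the output on your own system.

```python
def parse_load_resources(pbsnodes_text, names=("r1m", "r5m", "r15m")):
    """Return {node: {resource: float}} from pbsnodes -a style text.

    Assumption: node names are unindented and attributes are indented
    "key = value" lines. Adjust if your output looks different.
    """
    prefix = "resources_available."
    result = {}
    node = None
    for line in pbsnodes_text.splitlines():
        if not line.strip():
            continue  # blank line between node records
        if not line[0].isspace():
            # An unindented line starts a new node record
            node = line.strip()
            result[node] = {}
            continue
        if node is None:
            continue  # attribute line before any node name, ignore it
        key, sep, value = line.strip().partition(" = ")
        if not sep or not key.startswith(prefix):
            continue
        resource = key[len(prefix):]
        if resource in names:
            try:
                result[node][resource] = float(value)
            except ValueError:
                pass  # value is not numeric, skip it
    return result


# Hypothetical sample, not taken from a real cluster
SAMPLE = """\
node01
     Mom = node01
     state = free
     resources_available.ncpus = 8
     resources_available.r1m = 0.52
     resources_available.r5m = 0.40
     resources_available.r15m = 0.31

node02
     Mom = node02
     state = free
     resources_available.r1m = 3.75
"""

if __name__ == "__main__":
    for name, loads in parse_load_resources(SAMPLE).items():
        print(name, loads)
```

On a live system you would pass in the real output instead of SAMPLE, for example subprocess.run(["pbsnodes", "-a"], capture_output=True, text=True).stdout.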

