Job Submission by Memory


#1

Good morning,
I have a cluster with varying amounts of memory in the nodes. We’d like to ensure that jobs that get submitted without memory requirements get scheduled first on the lowest memory nodes, then middle tier, then high tier (rather than by node name). We’ve looked in the configs and searched for hooks but haven’t found anything at this time. Does anyone have a solution for this? Thanks in advance!


#2

You may want to review the sched_config parameter called node_sort_key. This will allow you to sort your nodes based on a numerical resource (including custom resources).


#3

That was something we had tried but it didn’t seem to be working. Perhaps there is something we were missing?


#4

When you modified the sched_config, did you kill -HUP the pbs_sched daemon to have it re-read the new configuration?

I am assuming you did the following as root on the PBS Server/Scheduler:

  1. Modified $PBS_HOME/sched_priv/sched_config to have the following node_sort_key attribute:

node_sort_key: "mem LOW" ALL

  2. Sent the HUP signal to the pbs_sched daemon:

kill -HUP $(cat $PBS_HOME/sched_priv/sched.lock)

Then tried submitting your jobs again?
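
For what it's worth, here is a rough shell sketch of checking the setting and signalling the scheduler. The function names show_node_sort_key and reload_sched are made up for illustration, and it assumes PBS_HOME is set in your environment (for example from /etc/pbs.conf) and that you run it as root. Treat it as a starting point rather than a supported procedure.

```shell
#!/bin/sh
# Sketch only: confirm which node_sort_key lines are active, then ask
# pbs_sched to re-read its configuration.
# Assumption: PBS_HOME is set in the environment and this runs as root.

show_node_sort_key() {
    # $1 = path to sched_config
    # Prints only active node_sort_key lines; lines commented out with '#'
    # do not match because the pattern is anchored at the start of the line.
    grep '^[[:space:]]*node_sort_key:' "$1"
}

reload_sched() {
    # $1 = path to sched.lock, which holds the PID of pbs_sched
    kill -HUP "$(cat "$1")"
}

# Example usage (left commented out so nothing runs by accident):
# show_node_sort_key "$PBS_HOME/sched_priv/sched_config"
# reload_sched "$PBS_HOME/sched_priv/sched.lock"
```

If show_node_sort_key prints nothing, or prints a line other than the one you expected, the scheduler is most likely still sorting with its previous setting.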

If it didn’t work, then I will need to see a little more detail about your config and the sched_log.

First, as root…

  1. Update the log_filter attribute in $PBS_HOME/sched_priv/sched_config:

log_filter: 0

  2. Send the HUP signal to the pbs_sched daemon:

kill -HUP $(cat $PBS_HOME/sched_priv/sched.lock)

Next, submit your test jobs and provide the output of qstat -f (preferably while the jobs are still running, so I can see which nodes were selected).

Finally, once your jobs finish, provide the output of
A. pbsnodes -a
B. $PBS_HOME/sched_logs
C. grep -v -e '^#' -e '^$' $PBS_HOME/sched_priv/sched_config
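
If it helps, items A to C could be gathered in one go with something like the sketch below. The function names strip_comments and collect_sched_info and the output file names are hypothetical, and it assumes PBS_HOME is set in the environment and that pbsnodes is on root's PATH.

```shell
#!/bin/sh
# Sketch only: gather items A-C into one directory for posting.
# Assumption: PBS_HOME is set in the environment (e.g. from /etc/pbs.conf).

strip_comments() {
    # $1 = path to a config file
    # Prints the file without comment lines and without empty lines (item C).
    grep -v -e '^#' -e '^$' "$1"
}

collect_sched_info() {
    # $1 = output directory (hypothetical layout, created if missing)
    out=$1
    mkdir -p "$out" || return 1
    # Item A: node state as seen by the server
    pbsnodes -a > "$out/pbsnodes.txt"
    # Item B: the scheduler log directory
    cp -r "$PBS_HOME/sched_logs" "$out/sched_logs"
    # Item C: the active scheduler settings
    strip_comments "$PBS_HOME/sched_priv/sched_config" > "$out/sched_config.txt"
}

# Example usage (commented out; run as root once the test jobs have finished):
# collect_sched_info /tmp/pbs_sched_info
```

Copying the whole sched_logs directory may be more than is needed; trimming it to the log covering the test run is usually enough.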


#5

It’s all working. That leads to a question, however… What is the purpose of the /opt/pbs/etc folder?


#6

The /opt/pbs/etc directory is there to house “sysconf data” as defined by GNU autotools. Any Makefile.am that provides a list of files for “dist_sysconf_DATA” is placing files in that directory. It’s generally intended for system configuration files, as one would expect.

$ grep -r dist_sysconf
src/scheduler/Makefile.am:dist_sysconf_DATA =
src/cmds/scripts/Makefile.am:dist_sysconf_DATA = \