Exclude the node from qsub


#1

Hello! Can I exclude a host when I submit a job? If so, how do I do that?


#2

When submitting a job via qsub, we request the resources required to run it.
If you would like to exclude one or more resources (compute nodes), you can tag the nodes with a custom resource and then request that resource, so the scheduler selects only the nodes where it is set to the requested value.

For example, say you have 3 nodes: n1, n2 and n3.

  • Create a custom resource called "node_select":
    qmgr -c "create resource node_select type=string_array,flag=h"

  • Add node_select to the "resources:" line in sched_config, then send the scheduler a HUP so it rereads the file: kill -HUP <PID of pbs_sched>

  • Set the custom resource "node_select" on all the nodes:

qmgr -c 's n n1 resources_available.node_select=yes'
qmgr -c 's n n2 resources_available.node_select=yes'
qmgr -c 's n n3 resources_available.node_select=no'

  • Say you now want to avoid node n3; your qsub statement should then look like this:

qsub -l select=1:ncpus=1:mem=100mb:node_select=yes -- /bin/sleep 1000
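With more than a handful of nodes, the tagging step above could be generated rather than typed. The following is only a sketch: the function name tag_cmds and its argument layout are made up for illustration, and it prints the qmgr commands instead of running them so they can be reviewed first.

```shell
# Print (do not run) the qmgr commands that tag each node with node_select.
# Hypothetical helper. Usage: tag_cmds "<space-separated excluded nodes>" node...
tag_cmds() {
    excluded=" $1 "
    shift
    for node in "$@"; do
        # Excluded nodes get node_select=no, all others node_select=yes.
        case "$excluded" in
            *" $node "*) value=no ;;
            *) value=yes ;;
        esac
        printf 'qmgr -c "set node %s resources_available.node_select=%s"\n' "$node" "$value"
    done
}

# Example: tag n1 and n2 as selectable, n3 as excluded.
tag_cmds "n3" n1 n2 n3
```

Once the output looks right, it could be piped to sh on a host where qmgr has manager privileges.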

Thank you


#3

Thank you for your answer!

But what if the exclusion needs to be dynamic? One user wants to exclude n1, another wants to exclude n3.
I have a farm with 30 compute nodes. Sometimes users want to exclude different servers for various reasons.

What shall I do?


#4

We cannot exclude resources dynamically once they have been matched to a job.
You can qalter the resource request of a QUEUED job, and the job-wide resources of a RUNNING job.

Instead, the user can request specific hosts, as below:
qsub -l select=1:ncpus=1:mem=100mb:host=n1+1:ncpus=1:mem=100mb:host=n2
qsub -l nodes=n1+n2
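For longer host lists, a select statement of the first form could be assembled by a small helper. This is a sketch only: host_select is a hypothetical name, and note that it requests one chunk on every listed host (the job spans all of them), not a choice among them.

```shell
# Build a select statement with one chunk per named host.
# Hypothetical helper. Usage: host_select "<chunk spec>" host...
host_select() {
    chunk=$1
    shift
    spec=""
    for host in "$@"; do
        # Join chunks with "+"; the separator is added only after the first chunk.
        spec="${spec:+$spec+}1:$chunk:host=$host"
    done
    printf 'select=%s\n' "$spec"
}

host_select "ncpus=1:mem=100mb" n1 n2
```

The result could then be passed to qsub, for example: qsub -l "$(host_select "ncpus=1:mem=100mb" n1 n2)" -- /bin/sleep 1000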


#5

It’s possible, but it requires a lot of work (for the admin, not for the user; once it is set up, the user only needs a single qsub to the route queue).

  • create a route queue, say route
  • create a queue for every user, say q_userA q_userB q_userC …
  • create a host-level boolean resource (flag=h, type=boolean) for every user, say run_userA, run_userB, run_userC …
  • use the ACL settings of q_userA q_userB q_userC … to ensure that only the specific user can enter each queue;
    let’s say userA is routed to q_userA, and the queue is assigned a default chunk
qmgr -c "s q q_userA acl_user_enable=t"
qmgr -c "s q q_userA acl_users+=userA@*"
qmgr -c "s q q_userA default_chunk.run_userA=t"
  • add each of these queues as a destination of route
qmgr -c "s q route route_destinations+=q_userA"

… (and likewise for the other queues)

  • mark the nodes each user may run on: as you mentioned, to keep userA off n1, set run_userA=t on all nodes except n1 (… do this for every user you want to control)

After this, the user only needs to run

qsub -l select=1:ncpus=1 -q route -- /bin/sleep 1000

By the way, I might have some typos in the commands, as I’m typing on the fly without testing, but this approach should do the trick.
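Since the per-user setup above repeats for every user, it could be generated. The sketch below is untested against a live server and makes assumptions: user_queue_cmds is a hypothetical helper, the queue attributes acl_user_enable and route_destinations are the names as I understand them in PBS Professional, and the create/enabled/started lines are additions not spelled out in the steps above. It only prints the commands so they can be checked before use.

```shell
# Print (do not run) the qmgr commands for one user's queue and resource.
# Hypothetical helper. Usage: user_queue_cmds <user> [route queue name]
user_queue_cmds() {
    user=$1
    route=${2:-route}     # name of the route queue, "route" by default
    queue="q_$user"       # per-user execution queue
    res="run_$user"       # per-user host-level boolean resource
    cat <<EOF
qmgr -c "create resource $res type=boolean,flag=h"
qmgr -c "create queue $queue queue_type=execution"
qmgr -c "set queue $queue acl_user_enable=true"
qmgr -c "set queue $queue acl_users+=$user@*"
qmgr -c "set queue $queue default_chunk.$res=true"
qmgr -c "set queue $queue enabled=true"
qmgr -c "set queue $queue started=true"
qmgr -c "set queue $route route_destinations+=$queue"
EOF
}

user_queue_cmds userA
```

The nodes still have to be marked separately for each user, for example with resources_available.run_userA=true on every node except the one userA wants to avoid.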

FYI


#6

Another solution might be to create a custom host-level string_array resource (allowed_users) and enable it in the sched_config file.

  1. Use a server hook (or a periodic MoM hook) to read a centralised text file, whose contents might be in this format:
    Node Users_Allowed
    n1 user1, user2
    n2 user2
    n3 user3

  2. The hook reads this file and updates the allowed_users setting of each node accordingly. You can update the text file dynamically.
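To illustrate the file-to-node mapping, here is a sketch in plain shell rather than an actual PBS hook (hooks are written in Python, so this is a stand-in that could be run periodically, for instance from cron). The function name allowed_users_cmds is hypothetical, the file layout is the one shown above, and the commands are printed rather than executed.

```shell
# Print (do not run) one qmgr command per node listed in the mapping file.
# Hypothetical helper. Usage: allowed_users_cmds <file>
allowed_users_cmds() {
    # Skip the header line; field 1 is the node, the rest is the user list.
    tail -n +2 "$1" | while read -r node users; do
        [ -n "$node" ] || continue                     # ignore blank lines
        users=$(printf '%s' "$users" | tr -d ' \t')    # "user1, user2" -> "user1,user2"
        printf "qmgr -c 'set node %s resources_available.allowed_users=\"%s\"'\n" "$node" "$users"
    done
}

# Example (assuming the mapping lives in a file named nodes_users.txt):
#   allowed_users_cmds nodes_users.txt
```

For the sample file above, the first command printed would set allowed_users on n1 to "user1,user2".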