PP-1217: Support for multiple fairshare trees


#1

Hi,

I would like to propose the ability to set a different fairshare tree for different execution queues. I have created a design doc. Please let me know whether you are interested and whether it seems reasonable to you.

Our simplified use case is as follows:
We have several user groups, and some user groups have a special relation to a particular cluster - let’s call the nodes “reserved nodes”. Reserved nodes are provided to the user group with priority. Other users are also able to access reserved nodes, but only with short jobs. We also have clusters for everybody. The important thing is that, due to the special relationship between the user group and the reserved nodes, we want to provide a completely different fairshare tree for the user group on reserved nodes.

Vasek


#2

Thanks for sharing this design. We have been working on a feature, PP-337, that will allow a site to assign reserved nodes to a separate scheduler. It should be checked in this week. Each separate scheduler will have its own fairshare tree. Will this feature address your use case?


#3

I understand the similarity, and I am watching PP-337, but it is not what I need. PP-337 does not allow queues or nodes to be shared between schedulers. Using more than one scheduler and fairshare tree with PP-337 actually requires tearing the world apart, but we only need separate fairshare trees, especially because of the short (backfill) jobs.

In our use case, everybody can submit short jobs into the backfill queues. The backfill queues use the default fairshare tree, and backfill queues supply jobs to all nodes. For us, it is necessary to be able to use two different fairshare trees on the same node.


#4

Hi,

I wonder if you could get the behavior you want with a single fairshare tree, by embedding more information in the fairshare entity itself. This technique has been used successfully to provide “per-Q fairshare” by combining the queue name and user name into the fairshare entity, e.g., (workq-bill, highp-bill, lowp-bill, …), which effectively makes all the versions of “bill” separate entities for fairshare.

In this case, you could use the -A account_string as the fairshare entity and use a submission hook to set it, something like:

if (user_group is on reserved nodes) then
  account_string = "RESV-" + user_group
else
  account_string = "OTHER-" + user_group

Then the RESV-bill and OTHER-bill fairshare entities will accrue fairshare independently.
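
As a rough illustration, the pseudocode above could be sketched in Python, the language PBS hooks are written in. The helper name account_for_group and the set of reserved groups are hypothetical, and how the hook learns the submitter's group is left open here. Only the naming logic is shown as runnable code; the hook wiring in the trailing comment is an untested assumption.

```python
# Hypothetical set of groups that have reserved nodes; a site would fill this in.
RESERVED_GROUPS = {"chem", "bio"}


def account_for_group(user_group, reserved_groups=RESERVED_GROUPS):
    """Return the fairshare entity name for a job submitted by user_group."""
    prefix = "RESV-" if user_group in reserved_groups else "OTHER-"
    return prefix + user_group


# In a queuejob hook this might be used roughly as follows (assumption, untested):
#   import pbs
#   e = pbs.event()
#   e.job.Account_Name = account_for_group(group)  # 'group' obtained by the site
#   e.accept()

if __name__ == "__main__":
    print(account_for_group("chem"))
    print(account_for_group("physics"))
```

With this naming, a group with reserved nodes and the same group name without them end up as two distinct entities in the single fairshare tree.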


#5

Hi @billnitzberg,

Thank you for sharing the technique. I believe it could substitute for the proposed solution, but I have a technical issue.

In my case, I think I need the queuejob and modifyjob hooks to be something like:

if (queue is reserved_queue) then 
    account_string = queue + "-" + username
else
    account_string = username

The issue is that account_string can be set in only two hook events (queuejob and modifyjob), but the username, which I need, is not available in the queuejob hook. In theory, it can be obtained from the Variable_List (PBS_O_LOGNAME), but I doubt this is a good way.
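
For reference, the per-queue variant of the pseudocode could be sketched in Python as below. The function name account_for_queue and the queue name "reserved_q" are hypothetical, and the username is passed in as a plain argument because obtaining it at queuejob time is exactly the open question.

```python
# Hypothetical names of queues that feed the reserved nodes.
RESERVED_QUEUES = {"reserved_q"}


def account_for_queue(queue, username, reserved_queues=RESERVED_QUEUES):
    """Build the fairshare entity: 'queue-username' on reserved queues, else username."""
    if queue in reserved_queues:
        return queue + "-" + username
    return username


if __name__ == "__main__":
    print(account_for_queue("reserved_q", "vasek"))
    print(account_for_queue("backfill", "vasek"))
```

Jobs from other queues keep the plain username, so they continue to accrue usage against the default entity.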

I will do more tests with this technique.

Vasek


#6

It is true that Job_Owner and euser are not set at queuejob time. Your options are PBS_O_LOGNAME from the job’s Variable_List or the event’s requestor.

In the case of qsub -u, requestor and PBS_O_LOGNAME will be the user who issued the qsub, but in addition User_List will be set to the argument of qsub -u.
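
A minimal sketch of choosing a username from these sources follows. It assumes the Variable_List is available as a comma-separated NAME=value string (the form qstat -f prints) with no escaped commas, and it assumes a site would want the first qsub -u user to win when User_List is set; neither choice is prescribed above. The helper names are hypothetical.

```python
def parse_variable_list(raw):
    """Split 'NAME=value,NAME=value' into a dict (simplified: no escaped commas)."""
    result = {}
    for item in raw.split(","):
        name, sep, value = item.partition("=")
        if sep:
            result[name.strip()] = value
    return result


def pick_username(requestor=None, variable_list="", user_list=None):
    """Pick a username: User_List first (assumed policy), then requestor, then PBS_O_LOGNAME."""
    if user_list:
        first = user_list.split(",")[0]   # entries look like user[@host]
        return first.split("@")[0]
    if requestor:
        return requestor
    return parse_variable_list(variable_list).get("PBS_O_LOGNAME")


# In a queuejob hook, 'requestor' would come from pbs.event().requestor
# (the Variable_List and User_List handling here is an untested assumption).

if __name__ == "__main__":
    print(pick_username("alice", "PBS_O_HOME=/home/alice,PBS_O_LOGNAME=alice"))
```

A site that prefers to charge the user who issued the qsub, rather than the qsub -u target, would simply drop the User_List branch.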