Runjob hook modify select problem


#1

Dear Wizards,

I am writing hooks to deal with a reservation-like scenario.
In certain cases, I will need to have a runjob hook reject a particular job (based on the node choice), and then force the job onto a specific node group (placement set).

According to the AG §6.10.5 table 6-13 (page AG-514) this should be possible (as long as I reject the job), and it is stated that I should do something like:

jobB = pbs.event().job
jobB.select = pbs.select("2:mem=2gb:ncpus=1+6:mem=8gb:ncpus=16")

However, when I try to do this, the hook fails. (It does not fail when I import the hook, but rather at runtime).

The log states:

<date/server>;PBS server internal error (15011) in Error evaluating Python script, <class 'pbs.v1._exc_types.UnsetAttributeNameError'>
<date/server>;PBS server internal error (15011) in Error evaluating Python script, job attribute 'select' not found

I cannot say if this is a problem with the job.select, the documentation - or just that I did it wrong.
For reference, my test code is simply (for debug purposes):

e = pbs.event()
jobB = e.job
jobB.select = pbs.select("2:nodetype=io")
e.reject("Some comment")

All input is welcome! (Means “help!”)

Thanks,

Bjarne


#2

Hi Bjarne, yes, there is certainly a problem with that example in the documentation. It should show something like this:

jobB = pbs.event().job
jobB.Resource_List["place"] = pbs.place("pack:exclhost")
jobB.Resource_List["select"] = pbs.select("2:mem=2gb:ncpus=1+6:mem=8gb:ncpus=16")

So that would make your starting point something like this:

[root@centos7 tmp]# cat bjarne_r.py
import pbs
pbs.event().job.Resource_List["select"] = pbs.select("2:nodetype=io")
pbs.event().reject("Some comment")

This works with the caveat that the job must already have a nodetype request in the select statement before the runjob hook tries to set it, otherwise it will not appear:

[root@centos7 tmp]# qmgr -c "s s scheduling=f"

[user1@centos7 ~]$ echo "echo bar" | qsub
11.centos7.prog.altair.com

[root@centos7 tmp]# qstat -f 11 | grep select
    Resource_List.select = 1:ncpus=1
    schedselect = 1:ncpus=1
[root@centos7 tmp]# qrun 11
qrun: Failed to run: Some comment (15136) 11.centos7.prog.altair.com
[root@centos7 tmp]# qstat -f 11 | grep select
    Resource_List.select = 1:ncpus=1
    schedselect = 1:ncpus=1
[root@centos7 tmp]# qalter -lselect=1:nodetype=compute 11
[root@centos7 tmp]# qstat -f 11 | grep select
    Resource_List.select = 1:nodetype=compute
    schedselect = 1:nodetype=compute:ncpus=1
[root@centos7 tmp]# qrun 11
qrun: Failed to run: Some comment (15136) 11.centos7.prog.altair.com
[root@centos7 tmp]# qstat -f 11 | grep select
    Resource_List.select = 2:nodetype=io
    schedselect = 2:nodetype=io:ncpus=1

I believe this is a bug and I will do more investigation and file one if need be.

Will all of your jobs already have a nodetype request as part of the job before run-time? If not, it is fairly trivial to create a queuejob hook which adds one and sets it to request a value like “any” (or whatever generic value you may already have in place in your configuration). Using a default_chunk.nodetype server/queue attribute does not appear to be a viable workaround, since it is not applied to the job’s actual Resource_List.select but affects schedselect directly (which is what the scheduler ultimately uses to evaluate the job).
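
A minimal sketch of such a queuejob hook, under stated assumptions: the helper name with_default_nodetype is hypothetical, the chunk parsing is a plain string split that ignores quoted values, and it is assumed that str() of the job's select value yields the select string and that an unset select reads as None. The pbs import is guarded only so the helper can be tried outside PBS.

```python
try:
    import pbs  # available only inside the PBS hook interpreter
except ImportError:
    pbs = None  # lets the helper below be exercised outside PBS


def with_default_nodetype(select, default="any"):
    """Return select with nodetype=<default> added to chunks that lack one."""
    chunks = []
    for chunk in select.split("+"):
        parts = chunk.split(":")
        # Leave chunks alone if they already carry a nodetype request.
        if not any(p.startswith("nodetype=") for p in parts):
            parts.append("nodetype=" + default)
        chunks.append(":".join(parts))
    return "+".join(chunks)


if pbs is not None:
    e = pbs.event()
    current = e.job.Resource_List["select"]
    # Assumption: select may be unset at queuejob time (for example when
    # the job was submitted without -lselect); such jobs are left untouched.
    if current is not None:
        e.job.Resource_List["select"] = pbs.select(
            with_default_nodetype(str(current)))
    e.accept()
```

With this sketch, a job submitted with -lselect=1:ncpus=1 would end up requesting 1:ncpus=1:nodetype=any, while a job that already names a nodetype is not changed.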


#3

The following seems to work (this is the AG example updated to the syntax that now works):

jobB = pbs.event().job
jobB.Resource_List['select']  = pbs.select("2:mem=2gb:ncpus=1+6:mem=8gb:ncpus=16")

So, it seems the correct method to do this has been changed with the OSS update (PBS Pro v14), while the “newest doc” is still for v13.1. Probably, “place” should be equivalently updated.

It would be really good to get a list of the actual changes from v13 to v14 - especially when it comes to the API.

Best,

Bjarne


#4

Hi Scott,

Thanks (apparently I was typing when your reply came in, so I did not see it). We agree on the solution.

To your question whether all jobs will already have a nodetype request: no. Some jobs will have it set and some will not. The hook should not change a nodetype that is already set. In certain cases (when the high-priority nodes are “reserved”), the hook will add a low-priority nodetype=io if none is set, but it will not change an existing one. However, I need to make another hook that can remove the added nodetype=io if the job has not yet run when the reservation is lifted. (In certain cases, the job will not be able to run on the io-nodes and will just remain queued, which in our case is OK for a while.)

As for the caveat that the job must already have a nodetype request in its select statement: hmmm, that is not what I see.
Here is my example case (with comments added along the way).

[bjb@bifrost1 ~]$ qsub -l walltime=00:02:00 -l select=2:ncpus=2 -l place=scatter:excl -- /bin/sleep 30
1386.bifrost1

[bjb@bifrost1 ~]$ qstat2lin 1386|grep -e comment -e select
    Resource_List.select = 2:ncpus=2
    schedselect = 2:ncpus=2
    Submit_arguments = -l walltime=00:02:00 -l select=2:ncpus=2 -l place=scatter:excl -- /bin/sleep 30

qstat2lin is just my hack for qstat -f without the line breaks; see the thread “Qstat -f <JOBID> line breaks” and
https://sourceforge.net/p/dcoo/cluster-tools/ci/master/tree/pbs/qstat2lin

The following will let the scheduler try to run the job. The hook will catch it, and update select - forcing it onto the io-nodes:

[bjb@bifrost1 ~]$ qmgr -c 'set server scheduling = 1'

Note that both Resource_List.select and schedselect get updated:

[bjb@bifrost1 ~]$ qstat2lin 1386|grep -e comment -e select
    Resource_List.select = 2:ncpus=2:nodetype=io
    schedselect = 2:ncpus=2:nodetype=io
    comment = Not Running: PBS Error: JOBSELECT2io - nodetype=io explicitly added to job select, as the walltime overlaps the reserve time set by the adminstrator
    Submit_arguments = -l walltime=00:02:00 -l select=2:ncpus=2 -l place=scatter:excl -- /bin/sleep 30

Luckily, Submit_arguments is not updated, so in principle I can use that (match for nodetype=io) to see if the hooks have fiddled with the nodetype. Alternatively, I might try to store something in job.Variable_List. (I cannot rely on the comment field, as the server/scheduler may overwrite what I have put in.)
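
A rough sketch of that Submit_arguments check, combined with the later removal of a hook-added nodetype=io. All function names here are hypothetical, the parsing assumes the plain-text Submit_arguments format shown above and does not handle quoted commas or colons inside a select value, and how a hook would read Submit_arguments from the job is left out as untested.

```python
import shlex


def submitted_select(submit_arguments):
    """Return the select spec given on the qsub command line, or None."""
    tokens = shlex.split(submit_arguments or "")
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--":  # the job command starts here
            break
        if tok == "-l" and i + 1 < len(tokens):
            value = tokens[i + 1]  # form: -l select=...
            i += 1
        elif tok.startswith("-l") and len(tok) > 2:
            value = tok[2:]        # form: -lselect=...
        else:
            value = ""
        for item in value.split(","):
            if item.startswith("select="):
                return item[len("select="):]
        i += 1
    return None


def user_requested_nodetype(submit_arguments):
    """True if the submitter's own select already named a nodetype."""
    select = submitted_select(submit_arguments)
    return select is not None and "nodetype=" in select


def drop_hook_nodetype(current_select, submit_arguments, value="io"):
    """Remove nodetype=<value> unless the submitter asked for a nodetype."""
    if user_requested_nodetype(submit_arguments):
        return current_select
    marker = "nodetype=" + value
    chunks = []
    for chunk in current_select.split("+"):
        chunks.append(":".join(p for p in chunk.split(":") if p != marker))
    return "+".join(chunks)
```

For job 1386 above, the submitted select is 2:ncpus=2, so the nodetype=io in the current select would be treated as hook-added and stripped again.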

Forcing yet another scheduling iteration makes the scheduler realize that I have really only defined one io-node, so the job cannot run:

[bjb@bifrost1 ~]$ qmgr -c 'set server scheduling = 1'
[bjb@bifrost1 ~]$ /home/bjb/DcooWare/dcoo-clustertools-ssh2016-09-27/pbs/qstat2lin 1386|grep -e comment -e select
    Resource_List.select = 2:ncpus=2:nodetype=io
    schedselect = 2:ncpus=2:nodetype=io
    comment = Can Never Run: can't fit in the largest placement set, and can't span psets
    Submit_arguments = -l walltime=00:02:00 -l select=2:ncpus=2 -l place=scatter:excl -- /bin/sleep 30

My nice comment got overwritten, but really the behaviour of the system is exactly what I expect.

I have also tried using qrun, and it does not seem to make a difference: nodetype still gets set correctly. So, with my usage, I can’t confirm “that the job must already have a nodetype request in the select statement before the runjob hook tries to set it”.

Thank you so much for answering!

/Bjarne


#5

Hi Bjarne, you are right, it is not about the job having requested a nodetype in a select statement in particular, rather the critical difference is whether or not the job requested ANYTHING in a select statement explicitly:

[user1@centos7 ~]$ echo "echo bar" | qsub -lselect=1:ncpus=1
16.centos7.prog.altair.com

[root@centos7 tmp]# qstat -f 16 | grep select
    Resource_List.select = 1:ncpus=1
    schedselect = 1:ncpus=1
    Submit_arguments = -lselect=1:ncpus=1
[root@centos7 tmp]# qrun 16
qrun: Failed to run: Some comment (15136) 16.centos7.prog.altair.com
[root@centos7 tmp]# qstat -f 16 | grep select
    Resource_List.select = 2:nodetype=io
    schedselect = 2:nodetype=io:ncpus=1
    Submit_arguments = -lselect=1:ncpus=1



[user1@centos7 ~]$ echo "echo bar" | qsub
17.centos7.prog.altair.com

[root@centos7 tmp]# qstat -f 17 | grep select
    Resource_List.select = 1:ncpus=1
    schedselect = 1:ncpus=1
[root@centos7 tmp]# qrun 17
qrun: Failed to run: Some comment (15136) 17.centos7.prog.altair.com
[root@centos7 tmp]# qstat -f 17 | grep select
    Resource_List.select = 1:ncpus=1
    schedselect = 1:ncpus=1

#6

PP-470 has been filed for the inability to set a select statement for a job which was submitted without one.

An internal Altair ticket has been filed for the documentation bug.

Thanks!


#7

So, if the hook finds the select part empty, then (with the present version of the code) it is not a good idea to add anything to the select, right?
Yes, that might be considered a bug. However, if -lresource is used (which may be the case, explicitly or not, whenever -lselect is not used), then we would probably not be allowed to update select. In that case, would it be OK for the server to throw away any updates to select?

Best,

/Bjarne


#8

Actually, if -lresource (or -lnodes, for that matter) is used, then the server translates it into a select statement for the job. You DO have to worry about whether the submitter used -lselect or -lncpus or -lnodes when writing a queuejob hook, since that translation has not taken place yet at queuejob time, but that is not a concern for a runjob hook.
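
A small sketch of how a queuejob hook might tell these cases apart before touching select. The function name request_style and the list of old-style resource names are illustrative assumptions (the list is not exhaustive), and it is assumed that an unset resource reads as None or raises KeyError.

```python
def request_style(resource_list, old_style=("nodes", "ncpus")):
    """Classify a job's resource request as 'select', 'old-style' or 'none'."""

    def get(name):
        # Works for a plain dict; a Resource_List that returns None for
        # unset resources is handled by the None checks below.
        try:
            return resource_list[name]
        except KeyError:
            return None

    if get("select") is not None:
        return "select"
    if any(get(name) is not None for name in old_style):
        return "old-style"  # not yet translated to select at queuejob time
    return "none"
```

A queuejob hook could then modify select only when the result is "select", and leave the other two cases alone.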


#9

Thanks a bunch, Scott.

That is really helpful advice. I see that writing to select in a queuejob hook could lead to problems.
I guess that might be mentioned in the AG (v13.1) §6.10.5, incl. Table 6-13, where it appears that the “Job select statement” can be read and set (r, s) by queuejob hooks. There seem to be no special caveats associated with this. If there are special cases where (for instance) select should not, may not, or cannot be updated, it would be very useful to add that information to this section.

/Bjarne

PS: On a side note, I wonder about this text on p. AG-514 (v13.1) §6.10.5 (also related to Table 6-13):

However, there are no occurrences of “o” in the table(?)


#10

Hi Bjarne,

I’ve filed a doc bug for this issue. Thank you for noticing this.

Regards,
-Anne