Add new hook events at the server end


#1

Hi All,

With reference to PP-912 and PP-913, this is a proposal for introducing additional hook events at the server end. They are as follows:-

  1. Reservation confirmation – The hook invoked at this event will be executed at the confirmation of the reservation. The hook will have read access to the server and, reservation attributes. Some of the reservation attributes can be settable by this hook.
  2. Reservation end – The hook invoked at this event will be executed at the successful/abnormal termination of the reservation. The hook will have read access to the server and, reservation attributes.
  3. Job end - The hook invoked at this event will be executed at the successful/abnormal termination of the job. The hook will have read access to the server and, job attributes.

Please provide your valuable feedback.
Thank you.

External Design Document


#2

Thanks @sujatapatnaik52, can you also detail as to why (from your point of view) these events need hooks/customization?


#3

If I want to do some additional tasks at the reservation confirmation or end, then I can use the hooks of these events for the same.

For the job end event, I can use the job end event of the execution hosts for executing any additional tasks. However, there can be many execution hosts and thus, I can use the server side job end event for the same. Thank you.


#4

In point #1 you mentioned some of the reservation attributes can be settable by hook. Could you please name the attributes?
#2 and #3 you mention attributes can only be read. Any reason why you don’t want to write/update the attributes?
Also please explain why you propose separate end events for reservation and job.
Thanks.


#5

One feature that was left out of the reservation modifications RFE was to have a reservation end hook to allow for extension of the reservation.


#6

#1 - By some attributes, I am presuming it can be only those attributes which are permitted to be modified at the reservation confirmation event. On a seperate note, If I want to create and use a custom attribute for the reservation then I can get the value of the custom attributes from the Resource_List attribute.

#3 - Writing for point 3 first because I have tried this one. If I am updating any job attributes at the server end job event, I cannot see the same updated attribute in the accounting logs as they are already written possibly at the execjob_epilogue event. I don’t find any use for updating the job attributes which I cannot get at a later stage.

#2 - I don’t have any specific reason for not wanting to update the reservation attributes.

Proposing seperate end events - If I want to do different tasks for reservation and job then I can use the seperate events individually.

Please correct me If I am wrong. Thank you.


#7

I have created a dcoument which details the usage of the new hook events. Please review it and give your feedback.

Design Document

Thank you.


#8
  1. Interface 1:
    a. Please rename “resvconf” to “resv_confirm” for readability. Since we have “resvsub”, it just looks unreadable if we say “resvconf” or “resvconfirm”. We have used underscore for hook event names like “execjob_begin”.
    b. “hook can be created by pbsadmin”, actually, it’s the only way for the hooks to be created righ now: running qmgr under “root” or pbsadmin". I suggest drop “can be created” to simply “is”.
    c. There are messages in the desigh that goes “alarm call while running … 'h1.”’ Replace h1 with <hook_name>.
    d. “Upon accept(), the changes done on the reservation object is committed to PBS server.”: which reservation attributes (i.e. pbs.event().resv.<attribute_name> = ??) can be altered by this “confirm_resv” hook? Please be specific on the design doc.
    e. Some suggestions on the following:

     "upon reject(), the following are the usecases:
    
     The reservation is submitted and has not yet started. 			The state of the reservation will stay at RESV_UNCONFIRMED and the reservation object will be purged from the server.
              [Al] Maybe say "if the reservation that was submitted has not yet started..." and also, no need to mention "reservation will stay at RESV_CONFIRMED" since the "reservation object will be purged from the server." It's going to be gone anyway.
    
    
     "The reservation has not started yet and an alter is requested. The reservation will continue to run with it’s original attributes."
                  [Al] Say "if the reservation that was submitted had an alter (i.e. pbs_ralter) requested..."
    
     "The reservation is running and an alter is requested. The reservation will continue to run with it’s original attributes."
                  [Al] Say "If the reservation is already running (confirmed)...."
    
  2. Interface 2:
    a. Please rename “resvend” to “resv_end” for readability.
    b. See my 1b) and 1c) comments which apply here too.
    c. What is really the effect of reject()? Are
    there things that will be prevented from happening? Right now, I only
    see a “reject” message being printed out.

  3. Interface 3:
    a. See 1a) and 1b) comments which also apply here.
    b. See 2c) comment which applies here.


#9

Hi Al,

Thank you for the suggestions. I have updated the design document as suggested except 1.d. I will take guidance from @prakashcv13 as he has worked on the reservations. I will update the specific settable attribute list after discussing with him.

Thank you.


#10

@sujatapatnaik52,

As discussed offline, I think we need a UCR document before we can go ahead.

Thanks,
Prakash


#11

@sujatapatnaik52:
Thanks for making the changes! Just additional comments:

  • I don’t think you need to change “endjob” to “job_end”. The former is fine as is and blends well with “queuejob”, “modifyjob”, “movejob”, “runjob” and it’s readable. I just wanted the “resvconf” and “resvend” change to “resv_confirm” and “resv_end” to be more readable and in line with its existing counterpart “resvsub”.

  • The error message
    " r obj=h1 svr=default: invalid argument (resvtest) to event.
    Should be one or more of: queuejob,modifyjob,resvsub,movejob,runjob,provision,periodic,job_end,execjob_begin,execjob_prologue,execjob_epilogue,execjob_preterm,
    execjob_end,exechost_periodic,execjob_launch,exechost_startup or execjob_attach for no event."
    should be the same message all error cases listing the new events: resv_confirm, resv_end, and endjob.


#12

Thank you @bayucan I have updated the changes in the document. Apart from this, In your previous comment, regarding 1.d. I have not added anything regarding the settable attributes. I have modified it to currently support the read privileges. Further details will be added in the UCR.


#13
  • Thanks for reverting back the “job_end” into “enjob”.
  • For 1d, I’m confused. what got changed? And can you point me to the UCR link? The design doc must be explicit as to what really is set when an accept() happens. If there’s nothing that is allowed to be set inside the resv_confirm hook (i.e. pbs.event().resv.<attribute_name> = <new_value>), then just point that out.
  • In regards to the error messages, maybe I wasn’t being clear. You have 3 separate error messages that in code, would actually be just one:

Interface 1:
qmgr obj=h1 svr=default: invalid argument (resvtest) to event.
Should be one or more of: queuejob,modifyjob,resvsub,movejob,runjob,provision,periodic,resv_confirm,execjob_begin,execjob_prologue,execjob_epilogue,execjob_preterm,execjob_end,exechost_periodic,execjob_launch,exechost_startup or execjob_attach for no event.

Interface 2:
Should be one or more of: queuejob,modifyjob,resvsub,movejob,runjob,provision,periodic,resv_end,execjob_begin,execjob_prologue,execjob_epilogue,execjob_preterm, execjob_end,exechost_periodic,execjob_launch,exechost_startup or execjob_attach for no event.

Interface 3:
Should be one or more of: queuejob,modifyjob,resvsub,movejob,runjob,provision,periodic,endjob,execjob_begin,execjob_prologue,execjob_epilogue,execjob_preterm, execjob_end,exechost_periodic,execjob_launch,exechost_startup or execjob_attach for no event.

I mean these all collapse into one:
Should be one or more of:
queuejob,modifyjob,resvsub,movejob,runjob,provision,periodic,execjob_begin,execjob_prologue,execjob_epilogue,execjob_preterm, execjob_end,exechost_periodic,execjob_launch,exechost_startup,endjob,resv_confirm,resv_end or execjob_attach for no event.


#14

Hi All,

Please have a look at the UCR and provide your feedback.

@bayucan Thank you for the suggestions.
For 1d, the changed part is the documented usecases and requirements which I got now and accordingly I have updated.
Regarding the error messages, you are right about the one error message for all. The reason behind keeping them seperate for now is that they are all not part of the existing error messages for the hook events and yet to be added.


#15

Now that I’ve seen the requirements and use cases, interface 1 needs a bit more tweaking.
It says:
“The hook can accept or reject the reservation. Upon accept(), the changes done on the reservation object is committed to PBS server.”
Please update this to say:

"“The hook can accept or reject the confirmation of the reservation. Upon accept(), the reservation request gets confirmed status.”
Since no changes are actually done to the reservation object via this hook, then we don’t need to mention that. Note that the resvsub hook will allow reservation object changes to take effect but that’s a separate hook.

Now I have additional questions as follows:

"Upon reject(), the following are the usecases:

if the reservation that was submitted has not yet started, the reservation object will be purged from the server."

So I take it that the reservation had gotten past the resvsub hook but is under RESV_UNCONFIRMED status. Ok, the reservation is deleted.

“If the reservation that was submitted had an alter requested, the reservation will continue to run with it’s original attributes.”
I’m a bit confused with this. An alter request? Does this mean a reservation submitted which could be in an RESV_UNCONFIRMED status or may be already running, and a pbs_ralter is done? Shouldn’t this be a separate resv_alter hook that would be deciding this rather than made as part of resv_confirm?

“If the reservation is already running, the reservation will continue to run with it’s original attributes.”
Why can’t the reject() action for resv_confirm hook just delete the reservation entirely?

There’s already one error message in the code that lists all the possible legal hook event values, and the implementation to this will just add the 3 new hook events to this existing error message. So if you need to, just repeat the same unified error message in interface 1, 2, 3.

And going over the design doc again, interface 2 about “resv_end” hook has not addressed the part below that says “hook shall be executed when there are no running jobs in the queue.”:
“Requirements :
A new reservation end hook event shall be introduced when a confirmed reservation(standing or advanced) ends successfully or there is a user request for deleting the same.
The hook shall be executed when there are no running jobs in the queue.”


#16

Updated this one.

The reservation is deleted because it’s not confirmed. You took it right.

Apologies for not making it clear. It is supposed to be something like this - “If there is a confirmed reservation which has not yet started and and an alter requested on it, the reservation will continue to stay with it’s original attributes.”

I feel there can be a seperate hook for the pbs_ralter. As per the UCR, I think there’s a requirement for confirmation hook which will take care of the confirmation of the reservation requests ex - confirmation after modification, confirmation after submission, etc.

I think I missed something here - If there is a running reservation and an alter is requested on that, the reservation will continue to run with it’s original attributes.

The resv_confirm hook which will be running at this point is for the alter requested. The hook rejects the alteration of the reservation. The original reservation should not have any impact if the alter request is rejected.

I have added it considering the sequence of the interfaces and not all together. :slight_smile:

Updated it.

Thank you for all the suggestions.


#17

Hi All,

For a standing reservation, there are two ways of running a reservation end hook :

  1. Running the resv_end hook for every occurrence as the occurrence is ending.
  2. Running the resv_end hook for the last occurrence as the complete reservation is ending.

My thought is that the resv_end hook should run only for the last occurrence, as the last occurrence points to the end of the reservation. As per the use case requirements, the resv_end hook is for the reservation end or for deletion and there is no mention about the occurrence end.

Please share your thoughts.


#18

I look at the standing reservation as normal reservation that repeats itself at a definite interval.
If we look at this way I think the end of every occurrence be treated as end of a normal reservation.
So I think resv_end should run for every occurrence.


#19

Thank you varun for sharing your thoughts. I agree to what you have said. For every occurrence, I feel the whole reservation is really not ending - by this I mean the entire reservation is not unlinked from the server. The unlinking happens only at the end of the last occurrence and thus, the end of the reservation.


#20

@sujatapatnaik52,
I think the implementation is not the question here. That can be handled later.
We should focus here on the use case. How to implement can be thought of later.