What would be the benefit to limiting the number of times a reservation can be altered? Why impose an arbitrary limit?
The reservation end hook is essentially synchronous, and the decision on whether the reservation should continue or end should be made as soon as the hook finishes executing.
Altering the end time of a reservation is asynchronous. We send the request, and some time later the scheduler confirms whether the reservation can be extended. So, theoretically, the server has to wait for an undefined amount of time.
In any case, we need to decide whether to end the reservation immediately after running the hook. If we have to alter the reservation, the server has to wait for an undefined amount of time to decide. So I think it is not practical to alter the reservation end time in the reservation end hook.
Thank you for highlighting this. I feel we should have a resv_alter event. However, as per the current UCR, we have a requirement for the reservation confirmation event. I think this event can handle the alter requests that come from the scheduler. We can create a resv_alter event, with a separate set of features, when there is a requirement for it.
I think @crjayadev has rightly pointed out why I was asking that question. It is mostly related to extending the reservation while running the reservation end hook.
I have an idea for a possible use of the reservation end hook event, though I’m not sure it’s 100% appropriate for the hook event. The idea is to use the reservation end hook event to extend the reservation’s walltime and allow all the jobs to finish.
The real use case is that the jobs in the reservation are running long and you want to allow them to finish. Is it better to have a hook that keeps extending the reservation, or to just enhance reservations? We could make a new type of reservation with a soft end time: the reservation should take N hrs, but the jobs are allowed to finish even if they take longer.
In any case, I don’t think altering reservations is possible in hooks yet. I think the only change we’d make to the design now is to call the reservation end hook before you delete all the jobs rather than after. This will open up the ability to allow us to handle this use case once altering reservations is possible.
I like the idea of enhancing reservations. I have a question, though. In the above statement, did you mean a reservation with a dynamic end time that extends itself (that is, the server extends it), or a hook event that extends the walltime of “reservations with soft end time”?
Thank you for this suggestion. I have updated the EDD.
I agree with this.
Either will work, but I’m not sure which is the right way to design it. Hooks are a bit of a hammer. Pretty much anything can be done with them, but is it the right way to implement something? We have soft_walltime for jobs now, should we have a similar feature for reservations? I don’t know.
In any case, thanks for updating the design document. I don’t think there is much difference to the functionality of the reservation end hook if it is run before or after we delete the jobs. If we run it before we delete the jobs, we can leave the way open for implementing these soft reservations via hooks.
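To make the ordering point concrete, here is a minimal sketch in plain Python. It does not use the PBS hook API: `end_reservation`, `list_running`, and the job IDs are hypothetical stand-ins, and the only thing it models is whether the end hook runs before or after the reservation's jobs are deleted.

```python
def end_reservation(jobs, end_hook, hook_before_delete):
    """Simulate reservation end; return the job IDs the hook could observe.

    `jobs` is a plain list of job IDs standing in for the reservation's jobs.
    """
    if hook_before_delete:
        observed = end_hook(list(jobs))  # hook sees the jobs that are still present
        jobs.clear()                     # then the jobs are deleted
    else:
        jobs.clear()                     # jobs are deleted first
        observed = end_hook(list(jobs))  # hook has nothing left to inspect
    return observed


def list_running(jobs):
    """Hypothetical end hook: report the jobs it was able to see."""
    return list(jobs)


print(end_reservation(["1.svr", "2.svr"], list_running, hook_before_delete=True))
print(end_reservation(["1.svr", "2.svr"], list_running, hook_before_delete=False))
```

With the hook first, it still sees both jobs, which is what a future "extend and let the jobs finish" hook would need. With the deletion first, the hook sees an empty list.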
It was my understanding that we already have use cases for the reservation end hook and this discussion is limited to adding new hook events.
Regardless of whether it is possible to extend a reservation through the reservation end hook, we already have the reservation end hook in the design you have shared, so in my view it is irrelevant to discuss whether we could use this event for altering reservations.
As far as I know, there are separate requirements that need a resv_alter hook event and @smgoosen can confirm it as well.
Apart from adding new events like resv_alter, at this stage is there any further discussion needed on the three events defined in the initial scope here? We are now implementing these initial three and I want to make sure there are no further changes required.
It sounds from the discussion like the reservation end event should trigger before we delete jobs.
What about the existence of pbs.event().resv.queue as a duplicate of pbs.server().queue(pbs.event().resv.queue.name) ? I would like to hear input from the PBS Pro crowd on whether or not this is desirable – my own inclination is that duplication should be avoided as a general principle.
I agree that duplication should be avoided. It is probably better if we just get the queue name from the event and then interface with it via the server.
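A minimal sketch of that approach, assuming the event exposes only the reservation's queue name and the queue object is always resolved through the server. The `Queue` and `Server` classes below are stand-ins so the example runs outside PBS; in a real hook the lookup would go through `pbs.server().queue(name)`. The `acl_users` field and the `reservation_queue` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Stand-in for a PBS queue object (hypothetical, for illustration only)."""
    name: str
    acl_users: list = field(default_factory=list)


class Server:
    """Stand-in for the object returned by pbs.server() in a real hook."""

    def __init__(self, queues):
        self._queues = {q.name: q for q in queues}

    def queue(self, name):
        # Mirrors the pbs.server().queue(name) lookup discussed above.
        return self._queues[name]


def reservation_queue(server, resv_queue_name):
    """Resolve a reservation's queue through the server, given only its name."""
    return server.queue(resv_queue_name)


server = Server([Queue("workq"), Queue("R123")])
resv_queue = reservation_queue(server, "R123")
resv_queue.acl_users.append("alice")

# There is a single queue object, so the change is visible via the server.
print(server.queue("R123").acl_users)
```

Because there is only one way to reach the queue, there is no second copy on the event object that could drift out of sync.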
Even better, I think, would be to extend the reservation object further to abstract away the fact that in today’s PBS Pro reservations just happen to be implemented via special queues. For example, a hook writer should be able to control who has access to a reservation directly through the reservation object, and today PBS would achieve that for them seamlessly by setting ACL and related attributes on the queue behind the scenes.
I consider this out of scope for the current work, though, so in the meantime building in duplication for the sake of minor convenience seems like a bad idea.
Apologies for the delayed response. Thank you for clarifying things.
Okay. In that case, I think this should be handled in a separate ticket.
I agree with this and I feel it should be doable. However, I have one thought: in other job-related hooks, we see similar duplication, where the queue object can be obtained from both pbs.server() and pbs.job.queue(). I will probably take a similar approach for reservations.
The above suggestion seems to me like an enhancement on top of the existing features. Please share your thoughts.
Question: it looks like the reservation confirmation event occurs -after- the scheduler has responded; is this true? If so, and we want to limit how many times or by how much a reservation can be extended by pbs_alter, it seems like a waste of time to wait for the scheduler before deciding that the requested change is not going to be allowed (without these limits a user could extend their reservation indefinitely). These limits seem better handled by a pbs_alter event that gets executed as soon as the server receives the alter request.
Are there actually use cases where it would be better to wait for the reservation confirmation vs accepting/rejecting an alter request when it’s made?
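As a rough illustration of the kind of check that could run as soon as the alter request arrives, here is a sketch in plain Python. Everything in it is an assumption made for the example: the `AlterRequest` fields, the `alters_so_far` counter, and the two limits are hypothetical and not part of any existing PBS hook API.

```python
from dataclasses import dataclass

MAX_ALTERS = 3              # hypothetical site limit on the number of alters
MAX_EXTENSION = 2 * 3600    # hypothetical limit on one extension, in seconds


@dataclass
class AlterRequest:
    """Hypothetical view of an alter request as the server receives it."""
    alters_so_far: int   # how many times this reservation was already altered
    current_end: int     # current end time, epoch seconds
    requested_end: int   # requested new end time, epoch seconds


def check_alter(req, max_alters=MAX_ALTERS, max_extension=MAX_EXTENSION):
    """Return (accepted, reason) without involving the scheduler."""
    if req.alters_so_far >= max_alters:
        return False, "alter limit reached"
    if req.requested_end - req.current_end > max_extension:
        return False, "requested extension too long"
    return True, "ok"


print(check_alter(AlterRequest(alters_so_far=0, current_end=1000, requested_end=4600)))
print(check_alter(AlterRequest(alters_so_far=3, current_end=1000, requested_end=1060)))
```

Both decisions depend only on data the server already has when the request is made, which is why waiting for a scheduling cycle adds nothing for this use case.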
Apologies for the delayed response.
I would say yes to the above. The scheduler confirms the reservation request from the server and issues a PBS_Batch_ConfirmResv request, which invokes the reservation confirmation event.
I agree with this. I feel it’s better to have a pbs_alter hook event.
Currently, as per the UCR, there are no specific use cases for the above. The one you mentioned, limiting the number of times a reservation can be altered, seems to me a use case where accepting or rejecting the alter request immediately is better than waiting for the reservation confirmation event.
Has there ever been a decision made re: resv_alter (for which we have a use case) vs resv_confirm (for which we have no use cases)?
As far as I know, no further decisions have been made on resv_alter versus resv_confirm.
Then I would like to propose that the design be changed from having an event after the sched cycle to having one before it, when the pbs_ralter request is made.
As I said in my previous comments, I feel the idea of resv_alter is worth having. However, the UCR presented in this section doesn’t specify use cases for the resv_alter event. Is it possible to handle that in a separate ticket, perhaps with a separate design document?
I would rather have you implement an event for which we know there is a need than a hook event that has no use cases. Doesn’t that make sense?
Use cases are what a customer/admin wants to accomplish with the new capabilities we are introducing; they describe “why would a customer want this”. Could you point me at the use case(s) for the hook event after scheduler confirmation? I see a description of new functionality, but I don’t see anything in the UCR that says why an admin may want to do this. A use case would be something like (e.g. for the resv end hook event) “An admin wishes to extend a reservation to allow jobs to finish running”. You mention allocation management for the job end hook event; perhaps a “why” use case for that hook event might be “when using allocation management an admin would like to restore unused allocation to the user’s account” or something like that. Why are we introducing a post-confirmation hook event?