PP-734: Ability to release limited resources when a job is suspended


Blank (unset) = no restrictions, release everything
a particular resource is mentioned = restrict to releasing only that particular resource


@arungrover, I agree with using “restrict_res_to_release_on_suspend” and the rest of the EDD looks good to me.


Also looks good to me, thanks!


I was thinking about this feature’s interaction with Multiple scheduler enhancement that is currently in progress.

The attribute to specify which resource to be released while suspending a job (restrict_res_to_release_on_suspend) is currently a server attribute. Now in case of multiple schedulers, I don’t know if there is any use case to have each of the scheduler have their own list of resources that can be released during suspension.

I’m thinking that if there is any such need then, each of the scheduler can have it’s own set of “restrict_res_to_release_on_suspend” as part of their policy and override server’s attribute. If it is not present in the policy then server’s value will take effect (similar to how backfill_depth is implemented in server/queue).


I’ve changed the design proposal because of an anomaly we found during code review.
Change is in Interface 2 and the change is highlighted here -
“the amount of resources that are released on each node that the job was running on”


Looks good @arungrover!


Could you add the Python types for each of the attributes to the EDD?


@agurban Thanks for your review comment. I’ve added python type to the document. Also changed the type of one of the attribute from “array_string” to “string_array”.
Please have a look.


Thanks @arungrover!

It seems dangerous (or at least un-Python-y) to use a plain string as the Python type when the type is really a very structured string (i.e., an array of resource names). Wouldn’t it be better to have a type that represents an array of resources and a constructor that takes a string version and converts/checks it appropriately, e.g., like pbs.node_group_key does? (I know there are other RFEs that previously took the “string” shortcut hack, but…).


You are right! When I had a chat with @bayucan we believed that it will be a list of strings and Ideally it should be a list of strings but in reality it turns out that it was a string.
Now the problem I think is more generic in nature. Whenever anyone creates an attribute of type string_array our code internally translates into a python string by default. I believe instead of doing that it should by default be translating all string_array to python list.
If you agree then I’ll log a bug so that any such addition in future gets correctly translated in python.


As I was verifying the feature, I noticed that the example for interface 3 doesn’t match the name of the attribute “resource_released_list”. I do wish the plural version was the correct one (resources_released_list) the symmetry would have been nice. But I supposed it’s a bit late for that comment. :wink: Can you please correct the example?


Re: string_array mapped to Python str

Yes, PBS Pro should definitely do a better job here, please do file an issue. Thx!


Changed the example. Thanks for catching that.

Filed PP-847


There was a bug filed (PP-844) which is about user being able to see only resorces_released_list when he/she stats the job after the job is suspended.
while addressing this bug I realized that user does not really have to see these attributes at all because RRTROS is set by an admin and what is release/not-released does not really matter to a user.
Thus, I’ve made a change in the design document to make “resources_released” and “resource_released_list” attributes to be only visible to operator/manager privilege users.

Request community to please have a look at the document.


Makes sense that the user doesn’t need to see it. The design change looks good.


The changes in the design document to make “resources_released” and “resource_released_list” attributes to be only visible to PBS operator/manager look good to me.


@agurban pointed out that there seems to be a mistake in the document where it mentions resources_released_list as a Private interface which is not true. An admin can very well view this information by doing a qstat on the job.
I’ve made a change to the interface and request community to have a look.
Thanks @agurban for catching this.


Makes sense to me. I’m fine with the design change.