PP-706: Automatically create KNL specific information


#21

Hi @scc,
But I guess there are plenty of ways to avoid such a scenario, such as node priority, associating queues with KNL and non-KNL nodes, or using a node sort formula with a custom resource.

Warm regards
Dilip


#22

No, I think it is OK to proceed with the proposal; I was just highlighting it as a pretty major difference between the two approaches. Thanks!


#23

Yes, these are differences in behavior, but PBS has the means for an admin to “classify” a node by using node attributes. I know that Cray customers have been using the vntype attribute to define “roles” for their nodes. This allows users to submit jobs with vntype=cray_serial or vntype=blue.

For sites that are upgrading, the vntype attribute will carry over. Sites that are installing for the first time will use the PBS way of configuring the node attributes.


#24

The changes to the design regarding provisioning look promising. Based on the suggestions above, perhaps we should rename the hook script “PBS_xeon_phi_provision”.


#25

Hi Sam,
Thank you for the comment.
I have addressed your comment and added more details to the interface.
Please review the design document and provide comments / sign off.


#26

Hi All,

I have updated the design document and added a new interface for the PBS provisioning hook. I request that you all review it and provide comment(s) / sign off.

Thanks and regards,
Riyaz


#27

The latest changes to the design look good to me.

Regards,
Lovely


#28

Thanks for making the change, the design looks good to me!


#29

Hi, I have a question about the PBS_xeon_phi_provision hook. The design says:
“This will be invoked whenever aoe resource is requested in the job.”

Is this true even when an assigned node has the requested aoe as the current aoe?
Or will the hook check to see if the node is already set to what is requested, and thus not provision unnecessarily?


#30

Hi Lisa,

Thanks for the review.
If the node's current aoe already matches the requested aoe, the provisioning hook will not be invoked.
I have now updated the design document.
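
For illustration only, the rule above can be sketched as a small check. This is a minimal sketch, not the actual PBS code: the function name `needs_provisioning` and the aoe strings are hypothetical, and it assumes the requested and current aoe values are available as plain strings.

```python
from typing import Optional


def needs_provisioning(requested_aoe: Optional[str], current_aoe: Optional[str]) -> bool:
    """Hypothetical helper: True only if the job requests an aoe the node is not already set to."""
    if not requested_aoe:
        # The job did not request an aoe, so there is nothing to provision.
        return False
    # Provision only when the node's current aoe differs from the request.
    return requested_aoe != current_aoe


# The aoe names below are made up for the example.
print(needs_provisioning("cache_quad", "cache_quad"))  # node already matches the request
print(needs_provisioning("flat_snc4", "cache_quad"))   # node would need provisioning
```

With a check like this, a job that lands on a node already in the requested aoe would start without an unnecessary provisioning cycle.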


#31

Hi @riyazhakki
Good to know. Thank you for updating the design. I sign off.


#32

Sorry, it looks like I have one more comment. I made this comment in my code review, but since it pertains to the external design, I’ll also post it here.

In the code, log messages #2 and #3 are actually logged at PBSEVENT_DEBUG4, but the design doc says they should be logged at PBSEVENT_DEBUG2. Which is correct?


#33

Hi Lisa,

I have updated the design document: I changed the log messages and the debug levels for #2 and #3, and removed #5, based on the latest code changes made during the review. Please review the document.


#34

Thanks! The code and design now match! Looks good to me.


#35

I see that you have updated hbmem. Thanks for the new information.


#36

Thanks for making the change, the design looks good to me!


#37

Thank you for the quick turnaround on the review and updates; the latest changes look good!