Hey Bill, thank you for looking into this.
For this project we have a brief description on the UCR page here. I like your idea of connecting the features and coming up with a design description. I’ll work with @dilip-krishnan to set up a page with a brief design description of the power work, provisioning, and KNL work. That will give us a single location where all these features are connected and explained.
The current implementation and design of the power provisioning and ramp rate features follow a built-in model. Part of the code lives in the PBS core (mostly decision making), and the action-level part is implemented as a PBS hook. This is the main reason for making the provision event support multiple hooks: we can ship the power work as a PBS hook, and a site can put its own needs in a site hook.
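To illustrate the split, here is a minimal sketch in plain Python of one provision event being handed to several hooks in turn. This is not the PBS hook API: `ProvisionEvent`, `power_hook`, `site_hook`, and `run_provision_hooks` are hypothetical names, and stopping at the first hook that fails is an assumption about ordering, not a statement of how PBS behaves.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProvisionEvent:
    """Hypothetical stand-in for the data a provision hook receives."""
    vnode: str
    aoe: str


Hook = Callable[[ProvisionEvent], bool]

# Records what each hook did, so the order of execution is visible.
actions: List[str] = []


def power_hook(event: ProvisionEvent) -> bool:
    """Shipped hook: the action side of the power work (illustrative only)."""
    actions.append(f"power: bring up {event.vnode} for aoe {event.aoe}")
    return True


def site_hook(event: ProvisionEvent) -> bool:
    """Site hook: whatever extra step a site needs (illustrative only)."""
    actions.append(f"site: custom setup on {event.vnode}")
    return True


def run_provision_hooks(event: ProvisionEvent, hooks: List[Hook]) -> bool:
    """Run every hook in order; stop and report failure if one returns False."""
    for hook in hooks:
        if not hook(event):
            return False
    return True


ok = run_provision_hooks(ProvisionEvent("node01", "compute"),
                         [power_hook, site_hook])
```

The point is only that the shipped hook and the site hook stay separate pieces of code while both see the same vnode and aoe inputs.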
Sounds fine. I restored the interface to take only the vnode and aoe values. If we extend the aoe capabilities, it should serve all oe requirements in the future, and we will still have simple, minimal inputs to the hook.
Yes, they are from a previous revision. Fixed them now.
Introducing these new states won’t break backward compatibility. The states will be seen only when the feature is enabled, so tools monitoring the node states shouldn’t be affected when the feature is off. Even when the feature is on, the new states will appear when the cluster is idle or when we provision some of the idle nodes to run jobs. I need a better understanding of how these tools monitor the node states. Also, we don’t have node substates; only jobs support them.