After backporting the cgroups hook from 19.1.2 to 18.1.4 and having great success using it to manage allocation and usage on our nodes, I’ve run into a node with an unusual division of NUMA nodes that the cgroups hook is having difficulty with.
The node has 32 cores (16 per socket), 4 GPUs, and 64 GB of memory. However, when vnode_per_numa_node is enabled in the cgroups hook config, the node is divided into 2 vnodes, each with 8 cores, 2 GPUs, and 32 GB of memory. As a result, an entire socket is just gone as far as PBS is concerned. This seems to be due to the layout of this node, which I have provided a highly technical diagram of below:
A less technical diagram from $ nvidia-smi topo -m
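In case it helps anyone compare against their own hardware, here is a minimal sketch for dumping the kernel's view of which CPUs belong to which NUMA node, so it can be checked against the vnodes PBS reports. It is not part of the cgroups hook and is not how the hook discovers topology; it only reads the standard Linux sysfs files under /sys/devices/system/node. The function names parse_cpulist and numa_layout are made up for this example.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-7,16-23' into a sorted list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            # An empty cpulist means a NUMA node with no CPUs attached.
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)


def numa_layout(sysfs_root="/sys/devices/system/node"):
    """Return {numa_node_id: [cpu ids]} as reported by the kernel."""
    layout = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        node_id = int(re.search(r"node(\d+)$", path).group(1))
        with open(os.path.join(path, "cpulist")) as f:
            layout[node_id] = parse_cpulist(f.read())
    return layout


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_layout().items()):
        print("NUMA node {}: {} CPUs -> {}".format(node_id, len(cpus), cpus))
```

If the kernel lists a NUMA node with an empty CPU list, or CPUs that do not show up in any vnode, that would at least narrow down where the missing socket is getting dropped.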
Has anyone encountered a node like this? For the sake of using the GPUs efficiently, I’m considering leaving these as 8-core vnodes, since the hook is managing the reported resources very well. However, I really don’t want to have an unallocatable socket. I’d be grateful for any suggestions.
I can also provide the backported Python script for anyone who would like it, though I’m not sure what the best way to share it is. The backporting just involved commenting out a handful of lines.