I’m experimenting with placement sets, using 16 VMs to simulate a small 3D hypercube environment (8 vertices, each a switch with 2 nodes attached):
- Vertex 1 (switch 1) - nodes n101 & n102
- Vertex 2 (switch 2) - nodes n201 & n202
- Vertex 3 (switch 3) - nodes n301 & n302
- Vertex 4 (switch 4) - nodes n401 & n402
- Vertex 5 (switch 5) - nodes n501 & n502
- Vertex 6 (switch 6) - nodes n601 & n602
- Vertex 7 (switch 7) - nodes n701 & n702
- Vertex 8 (switch 8) - nodes n801 & n802
The results so far are pleasing, with jobs being placed on nodes as expected (minimising the number of ‘hops’ between nodes assigned to the same job).
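To make "hops" concrete, here is a minimal sketch of how I'm counting them. It assumes (this is only an illustration, not necessarily how the switches are wired) that vertex k carries the 3-bit label k-1 and that two switches are directly linked when their labels differ in exactly one bit, so the hop count between two vertices is the Hamming distance between their labels. The helper names `vertex_of` and `hops` are made up for the example.

```python
def vertex_of(node: str) -> int:
    """Return the vertex (switch) number encoded in a node name, e.g. 'n301' -> 3."""
    return int(node[1])


def hops(a: str, b: str) -> int:
    """Switch-to-switch hops between two nodes in the 3D hypercube.

    Assumption: vertex k has binary label k-1, and switches whose labels
    differ in one bit are neighbours, so hops = Hamming distance.
    """
    label_a = vertex_of(a) - 1
    label_b = vertex_of(b) - 1
    return bin(label_a ^ label_b).count("1")


if __name__ == "__main__":
    for pair in [("n101", "n102"), ("n101", "n201"), ("n201", "n301"), ("n101", "n802")]:
        print(pair, hops(*pair))
```

Under that labelling, two nodes on the same switch are 0 hops apart and the worst case (opposite corners, such as vertices 1 and 8) is 3 hops.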
The thing is, although it’s working fine, I’m not sure I’ve gone about configuring PBS the best way. The resulting config is large and complex for a simple 16-node cluster, and if I used the same methodology on a large cluster it would be far too complex to manage. I’m probably over-thinking it by trying to cover every (optimal) node combination.
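As a rough illustration of why this worries me, the sketch below enumerates every sub-cube (single vertex, edge, face, whole cube) of a hypercube. It is only one reading of "every optimal node combination", it uses the same assumed labelling of vertex k as binary k-1, and the pattern strings and function names are hypothetical; nothing here is PBS syntax.

```python
from itertools import product


def subcubes(dim: int):
    """Yield every sub-cube of a dim-dimensional hypercube as a pattern.

    Each position is '0' or '1' (bit fixed) or '*' (bit free), so '0**'
    is one face of a 3D cube and '***' is the whole cube.
    """
    for pattern in product("01*", repeat=dim):
        yield "".join(pattern)


def members(pattern: str) -> list[int]:
    """1-based vertex numbers inside a sub-cube (vertex k has label k-1)."""
    dim = len(pattern)
    result = []
    for label in range(2 ** dim):
        bits = format(label, f"0{dim}b")
        if all(p in ("*", b) for p, b in zip(pattern, bits)):
            result.append(label + 1)
    return result


if __name__ == "__main__":
    for dim in (3, 6, 10):
        print(dim, "dimensions:", sum(1 for _ in subcubes(dim)), "sub-cubes")
    print("0** ->", members("0**"))
```

The count is 3^d (27 for my 3D case, 59,049 for 10 dimensions), and every vertex sits in 2^d of those sub-cubes, which is why enumerating them all by hand does not look like it scales.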
I’d really appreciate some guidance from anyone who has experience with a similar (real-world) configuration.