PP-927 Install procedure for PBS on Cray CLE 6.0 using the PBS RPM


#1

Hi,

I have written up the steps to install the PBS RPM on Cray CLE 6.0 using Cray’s IMPS commands. They can be found here: https://pbspro.atlassian.net/wiki/spaces/PD/pages/61014047/PP-927+Update+the+installation+procedure+for+PBS+on+Cray+CLE+6.0+to+use+the+PBS+rpm+under+construction
This is for PP-927.

Please have a look at the instructions and provide comments/feedback.


#2

Where does the admin acquire the sle-server and updates RPMs referred to in steps 4a and 4b?

Should I know how to accomplish step 4c, or should additional guidance be provided?

Step 6 looks more like a warning than an actual step. Use shift+enter to create a newline without adding a new numbered entry.

What are “applicable nodes” in step 7? All nodes that need PBS Pro services started? Nodes that don’t run PBS Pro services but need the client commands?

When the system was rebooted and the nodes restarted, didn’t PBS Pro start automatically at boot? Do we need to shut down PBS Pro before we start it as instructed in step 9?


#3
  • Hmm…good question about the sle-server and updates RPMs. The RPMs I needed were readily available on the machine I was on…but I don’t know if that will be the case by default. I’ll find out.

  • 4c is from the S-2559 doc. I’ll add that pointer back to the Cray documentation.

  • I’ll add a bit more text around Step 6.

  • For step 7, I meant all nodes that need the image that contains PBS Professional. I’ll clarify step 7.

  • PBS Professional is not started automatically at boot. So we do not need to stop PBS Professional.


#4

@lisa-altair,

  • Step 6: I looked at the man page for cfgset and did not find the string ‘persistent’. You might want to mention adding /etc/pbs.conf to the config set according to the node’s role (e.g. head node or mom-only node).
  • Between steps 9a and 9b you might want to mention that the nodes should be in batch mode.
  • 9b seems to apply to server nodes only.
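
To illustrate the per-role point about /etc/pbs.conf, here is a rough Python sketch that renders the file contents for a head node versus a mom-only node. It is only an illustration and is not taken from the instructions page: the helper name `render_pbs_conf`, the role names, the host name `pbs-head`, and the `/opt/pbs` install prefix are all assumptions, and the daemon flags a site actually needs may differ.

```python
# Hypothetical helper: render /etc/pbs.conf text for a given node role.
# Role names, host name, and default paths are assumptions for illustration.

ROLE_DAEMONS = {
    # role: (PBS_START_SERVER, PBS_START_SCHED, PBS_START_COMM, PBS_START_MOM)
    "head": (1, 1, 1, 0),
    "mom-only": (0, 0, 0, 1),
}


def render_pbs_conf(role, server_host, pbs_home="/var/spool/pbs", pbs_exec="/opt/pbs"):
    """Return pbs.conf contents for the given role as a string."""
    server, sched, comm, mom = ROLE_DAEMONS[role]
    lines = [
        f"PBS_SERVER={server_host}",
        f"PBS_START_SERVER={server}",
        f"PBS_START_SCHED={sched}",
        f"PBS_START_COMM={comm}",
        f"PBS_START_MOM={mom}",
        f"PBS_EXEC={pbs_exec}",
        f"PBS_HOME={pbs_home}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "pbs-head" is a placeholder host name, not a real system.
    print(render_pbs_conf("mom-only", "pbs-head"))
```

Each rendered variant would then go into the config set that matches the node's role, using whichever config set workflow from the Cray documentation the site prefers.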

#5

Thanks for the comments @vccardenas.
I have updated step 6 and added new documentation references.
Good catch about batch mode, I have added that as the new 9b.
The old 9b became 9c and I added some text there too.


#6

@mkaro @vccardenas and others. I have made changes to the install steps based on the comments provided. Please have a look.


#7

Hi @lisa-altair, this is looking good. My only question is whether we’ll be supporting installations where admins want to preserve their old settings from PBS_HOME. As currently documented, it will always be a fresh installation where admins have to reconfigure their queues, nodes, etc.


#8

@lisa-altair, the updates look good to me, thanks. I forgot to ask about upgrades:

  • from 13.0.40x to 17.x and
  • from 17.x to 17.y

Are there steps for upgrading in the cases above?

#9

Thanks @mkaro. In step 6 where the admin is told to create configuration sets, this is where the admin can preserve their settings, including preserving /var/spool/pbs (A.K.A. PBS_HOME).
I don’t give steps on how to create configuration sets because there are multiple ways to do it. Instead, I provide a pointer to the Cray documentation so the admin can choose the way that works for them.


#10

Step 6 mentions configuration sets but does not mention their role in the context of either migration or overlay upgrades. If either of those is supported, more information on how they’re integrated with PBS-specific upgrade steps is needed.


#11

@vccardenas @iestockdale PP-927 is only for new installs.
Upgrades will be covered in a separate RFE.


#12

@lisa-altair, the instructions look good to me now.


#13

I’m satisfied with the changes. You have my signoff.


#14

@lisa-altair I tried the steps in the document. Step 3 talks about creating a recipe by following the steps in Cray’s documentation, but doing only that makes validation of the recipe fail on a Cray.
Once step 4 is applied, which adds more repos to the previously created recipe, everything passes.
It would be better to link these two steps together so that the user does not proceed to validate the recipe without going through step 4.


#15

@arungrover Thanks for the feedback. I made (former) step 4 a sub-step of step 3. I also moved the “repeat as needed” note to the end, after all the RPMs have been added.
Take a look and let me know if this helps.


#16

@lisa-altair, the changes look good to me.


#17

@lisa-altair, on the current (v16) page, step 3.a.i should say “also add sle-server_12sp3_x86-64”.


#18

Hi folks,
I have added a behavior change to the EDD, and made other changes to the instructions. Please take a look.


#19

@lisa-altair, I have reviewed the EDD changes and they look good to me.


#20

Changes look fine. I sign off (again).