New configuration variable: PBS_MOM_NODE_NAME


#1

This note is to inform the community of a new configuration variable that has been recently added. Details may be found here: https://pbspro.atlassian.net/wiki/display/PD/PP-277%3A+Multinode+jobs+may+fail+to+start

Please feel free to provide feedback on this topic.


#2

Hi @mkaro,

Your external design change to add the new configuration variable PBS_MOM_HOST_NAME looks good to me.

Regards,
Subhasis


#3

Hi @mkaro,
The external design change looks good. You might want to add an example situation on how someone would make use of defining PBS_MOM_HOST_NAME. Perhaps take one of the situations mentioned in pbspro-networking document, and show how the variable should be set, and what happens if not set. Maybe it will show the node as down when doing pbsnodes -av…


#4

Hi @bayucan,

Thanks for the suggestion. You will find a detailed description for testing the fix in the ticket here: https://pbspro.atlassian.net/browse/PP-277

I believe that provides the example you are looking for.

Thanks,

Mike


#5

Thanks for the link. Yes it is helpful. The only thing is, I still can’t find an example in PP-277 about setting PBS_MOM_HOST_NAME. I see PBS_LEAF_NAME. Maybe I missed it somewhere…


#6

@mkaro, Does the value of PBS_MOM_NODE_NAME need to be resolvable?


#7

I think we’re conflating a couple of issues.

And as a result we’re also using two different names…PBS_MOM_NODE_NAME and PBS_MOM_HOST_NAME.

PBS_MOM_NODE_NAME is not used to fix the issue with PBS_LEAF_NAME that was mentioned in the original post. It is used to ensure that when the MoM starts up it uses a name for the natural node that is consistent with the name that was used on the server to create the node.

I.e. if you have a host with this in /etc/hosts:

192.168.2.11 nid0001 preproc01

It allows you to use qmgr -c "create node preproc01" on the server without ill effects.

The problem arises because when MoM starts up it builds a list of vnodes. It is either only the natural node or it is a list of local vnodes (either configured with a v2 configuration file or with an exechost_startup or even exechost_periodic hook). Obviously, to build this list it cannot check what the server thinks is the name of the natural node: the server may not even be running at all! So by default it assumes that the name of the natural node is the (non-canonicalized) hostname.

So if sites want to use an alias (or even a name bound to another IP address owned by the host!) to create nodes on hosts, instead of the “official” hostname, we need a way to tell MoM what we’ve done on the server at startup.

Otherwise many odd things happen. For example, if you manipulate the vnode list in an exechost_startup hook, the vnode list on MoM becomes a list with the natural node named after the output of “hostname”. When the UPDATE2 message hits the server, it will render the ‘original’ node created on the server stale and enforce the new (and presumably not improved!) naming for the vnodes. That, of course, means that any resources set using qmgr on the original vnode are now on a stale vnode…

Hence the name PBS_MOM_NODE_NAME. It’s the name of the vnode when the hostname is actually different. In the code, it does not correspond to “mom_host”, which is the canonicalized hostname, but mom_short_name.

So it should, in my opinion, never be called PBS_MOM_HOST_NAME since it is actually used when it is NOT the hostname.
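
To make the selection rule above concrete, here is a minimal Python sketch. It is not the MoM code; it only illustrates the behaviour described in this post. The helper names `parse_pbs_conf` and `natural_node_name` are made up for illustration, and truncating the hostname at the first dot to get the "short name" is an assumption on my part.

```python
import socket


def parse_pbs_conf(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def natural_node_name(conf, hostname=None):
    """Pick the name MoM would use for the natural node (illustrative only)."""
    override = conf.get("PBS_MOM_NODE_NAME")
    if override:
        # The site created the node on the server under this name.
        return override
    if hostname is None:
        hostname = socket.gethostname()
    # Assumption: the short name is the hostname up to the first dot.
    return hostname.split(".", 1)[0]


conf = parse_pbs_conf("PBS_SERVER=headnode\nPBS_MOM_NODE_NAME=preproc01\n")
with_override = natural_node_name(conf, "nid0001")
without_override = natural_node_name({}, "nid0001.example.com")
```

With the /etc/hosts line from the example, `with_override` is the alias used in "create node preproc01", while `without_override` falls back to the name derived from the hostname.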

The problem with multihost jobs is actually an internal problem that arose when PBS_LEAF_NAME was created. It does not concern this variable. PBS_LEAF_NAME changes the address with which MoM registers with pbs_comm. As a result, it will also change the Mom= attribute of the vnodes involved on the server. That, in turn, will change what is used in the exec_host2 attribute of jobs, and the code erroneously checks if it’s “part of the job” by matching its canonicalized hostname against it. Instead, it should be checking for a match against the canonicalized PBS_LEAF_NAME. That will always work, provided that name resolution is consistent across the cluster (i.e. if PBS_LEAF_NAME resolves to an address that is canonicalized to the same name on the MoM and the server).
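
The difference between the two checks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `is_part_of_job`, the `CANONICAL` table and the `-ib` host names are hypothetical, the table stands in for real name resolution, and the job's host list is assumed to be already canonical.

```python
def is_part_of_job(job_hosts, local_name, canonicalize):
    """Return True if the canonical form of local_name is among job_hosts.

    job_hosts: host names taken from the job (assumed already canonical).
    canonicalize: callable mapping a name to its canonical form.
    """
    local = canonicalize(local_name).lower()
    return any(host.lower() == local for host in job_hosts)


# Hypothetical name table standing in for real name resolution.
CANONICAL = {
    "nid0001": "nid0001.example.com",
    "nid0001-ib": "nid0001-ib.example.com",
}


def canonicalize(name):
    return CANONICAL.get(name, name)


# Hosts as they would appear in the job once MoM registers under its leaf name.
job_hosts = ["nid0001-ib.example.com", "nid0002-ib.example.com"]

# Matching on the hostname misses the job; matching on the leaf name finds it.
matched_by_hostname = is_part_of_job(job_hosts, "nid0001", canonicalize)
matched_by_leaf_name = is_part_of_job(job_hosts, "nid0001-ib", canonicalize)
```

Here the hostname check fails even though the host is part of the job, which is the failure mode described above; the leaf-name check succeeds as long as both sides canonicalize the name the same way.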


#8

I would strongly suggest enforcing that it be resolvable to an IP address on the execution host. Otherwise, it’s far too easy to accept a typo in /etc/pbs.conf that will result in very weird and hard-to-debug failures.

If people want to do qmgr -c "create node [name] Mom=[name_of_some_MoM_leaf_address]", then I don’t think it’s too much of a burden to insist that the name be known and resolvable on the execution host.

It usually would be an alias of one of the other names of the host on an IP address bound to the host anyway, and if not, then all that needs to be done is to add the name somewhere in /etc/hosts on the execution host.
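
For sites that want to catch such a typo before starting MoM, a resolvability check could look roughly like the Python sketch below. This is not the check PBS itself performs; `is_resolvable` is a hypothetical helper, and the `resolver` parameter exists only so the lookup can be replaced in tests.

```python
import socket


def is_resolvable(name, resolver=socket.getaddrinfo):
    """Return True if name resolves to at least one address on this host."""
    try:
        return len(resolver(name, None)) > 0
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised for unknown names.
        return False


# Stand-in resolvers so the behaviour can be shown without touching DNS.
def fake_known(name, port):
    return [(socket.AF_INET, socket.SOCK_STREAM, 0, "", ("192.168.2.11", 0))]


def fake_unknown(name, port):
    raise socket.gaierror("name or service not known")


good_name = is_resolvable("preproc01", fake_known)
typo_name = is_resolvable("preprco01", fake_unknown)
```

On a real execution host you would call `is_resolvable(name)` with the value configured in /etc/pbs.conf and refuse to proceed if it returns False.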


#9

The title of the post has been corrected. The new variable is PBS_MOM_NODE_NAME. My apologies for causing any confusion.


#10

@mkaro, does the current design require that PBS_MOM_NODE_NAME be resolvable? If not, what do you think are the pros and cons of updating the design to require this?


#11

Yes, it must be resolvable, just like it had been in the past. All we’re doing is overriding the value that would otherwise have been obtained from gethostname(). The code that verifies it is resolvable is unchanged.


#12

I still have a number of issues with how we’ve organised the documentation of the (separate) issues.

The bottom of https://pbspro.atlassian.net/wiki/display/PD/PP-277%3A+Multinode+jobs+may+fail+to+start

still refers to mom_host whereas PBS_MOM_NODE_NAME changes mom_short_name.

In addition, the top of the page and the PP-277 problem concern a different internal problem with PBS_LEAF_NAME.

Of course the important thing is that the code actually works, and that’s fine.


#13

I think iestockdale’s issue is not that the implementation asks for the name to be resolvable. I think his suggestion is that the design actually mention this (and possibly the rationale).


#14

I have added a sentence in the design to indicate that the value must be a resolvable host.


#15

The code Alexis contributed addresses two issues, both of which were contributed under PP-277.

  1. Join-job was failing when the vnode name in the job’s node list didn’t match PBS_LEAF_NAME.
  2. There was no way to override the default hostname; PBS_MOM_NODE_NAME now provides one.

To further confuse the issue, I accidentally used the string “PBS_MOM_HOST_NAME” in an earlier version of the EDD. This was entirely incorrect since it does not exist. The correct name is “PBS_MOM_NODE_NAME”.

All of his changes were committed under the same ticket. The EDD covers the second item. The first item was simply a bug that needed to be addressed.


#16

The change looks good, Mike. I sign off on the EDD.

I notice you mentioned that the first item is a bug. Has this bug been filed? I’d hate to lose the fact after we close out this EDD.


#17

Yes, the bug you are referring to is this ticket, and siblings have been created for each branch.


#18

I am fine with the changes Carl suggested (on the confluence page). I sign off again.


#19

The details of PBS_MOM_NODE_NAME have been updated to reflect Alexis’ review comments. Alexis has since provided confirmation that he is satisfied with these changes. Recording the fact here for posterity.


#20

Recently updated the name resolution paper provided by field support.