Qsub BAD UID for execution


#1

Hello.

I’m not able to submit jobs from other nodes, only from the node where the PBS server and scheduler run.
I can run qstat, though, from submit hosts and compute nodes.

The server log does not tell much:
11/29/2016 13:49:33;0080;Server@nas1;Req;req_reject;Reject reply code=15023, aux=0, type=1, from einjen@compute-0-2.some.server.com


#2

This is for the open source version, not the commercial version.


#3

15023 is PBSE_BADUSER.
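
When scanning a server log for rejections like the one in post #1, a small helper can pull the reply code out of a line. This is a minimal sketch: the function names are hypothetical, the log layout is assumed from the single line quoted above, and the only code-to-name mapping included is the one stated in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract N from "Reject reply code=N" in a log line.
   Returns -1 if the line does not contain that text. */
int reject_code(const char *line)
{
        const char *p = strstr(line, "Reject reply code=");
        int code;

        if (p == NULL)
                return -1;
        if (sscanf(p, "Reject reply code=%d", &code) != 1)
                return -1;
        return code;
}

/* Only the mapping given in this thread is known here. */
const char *reject_name(int code)
{
        return code == 15023 ? "PBSE_BADUSER" : "unknown";
}
```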

Does the account named einjen exist on nas1?

If so, is the einjen user able to submit jobs while logged into nas1, or are you using a different account on nas1 when you say submission is successful?

What does “getent passwd einjen” show when run on both nas1 and on compute-0-2?
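
If getent is not handy, roughly the same lookup can be done from C with getpwnam(), which goes through the host's configured name services. This is only a sketch; lookup_uid and print_account are names made up for illustration. Run it on each host and compare the uid values.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns the numeric uid for name, or -1 if the account is not found
   on this host. */
long lookup_uid(const char *name)
{
        struct passwd *pw = getpwnam(name);

        return pw ? (long)pw->pw_uid : -1L;
}

/* Print a short summary line for one account. */
void print_account(const char *name)
{
        struct passwd *pw = getpwnam(name);

        if (pw == NULL) {
                printf("%s: no such account on this host\n", name);
                return;
        }
        printf("%s uid=%ld gid=%ld home=%s\n", pw->pw_name,
               (long)pw->pw_uid, (long)pw->pw_gid, pw->pw_dir);
}
```

For example, calling print_account("einjen") on both nas1 and compute-0-2 should report the same uid if the accounts are consistent.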

Is any error displayed by the qsub command (“Bad UID for job execution”)?


#4

einjen does exist on all nodes. With same uid on all nodes.

set server scheduling = True
set server acl_host_enable = True
set server acl_hosts = lille-login2
set server default_queue = workq

this would not accept login from “lille-login2”

for future reference:

I added:

set server flatuid = True

and now it is working


#5

Right, OK, so here is what I was going to share once the problem was more isolated. You hit one way to solve it, though maybe not the ideal way:

For a job to be accepted by the PBS server, the user at the submitting host must pass an ruserok() test.

From the RCMD(3) man page:

* The iruserok() and ruserok() functions take a remote host's IP address or name, respectively, two user names and a flag indicating whether the local user's name is that of the superuser. Then, if the user is NOT the superuser, it checks the /etc/hosts.equiv file. If that lookup is not done, or is unsuccessful, the .rhosts in the local user's home directory is checked to see if the request for service is allowed.

  If this file does not exist, is not a regular file, is owned by anyone other than the user or the superuser, or is writeable by anyone other than the owner, the check automatically fails. Zero is returned if the machine name is listed in the hosts.equiv file, or the host and remote user name are found in the .rhosts file; otherwise iruserok() and ruserok() return -1. If the local domain (as obtained from gethostname(2)) is the same as the remote domain, only the machine name need be specified. 
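
The file conditions in that excerpt can be checked by hand before blaming PBS. The following is a rough sketch of those four conditions only, not the libc implementation; rhosts_file_ok is a hypothetical name, and using lstat() (so that a symlink counts as "not a regular file") is an assumption.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if the file at path passes the checks quoted above for the
   user with uid owner_uid, else 0. */
int rhosts_file_ok(const char *path, uid_t owner_uid)
{
        struct stat st;

        if (lstat(path, &st) != 0)
                return 0;       /* does not exist or cannot be examined */
        if (!S_ISREG(st.st_mode))
                return 0;       /* not a regular file */
        if (st.st_uid != owner_uid && st.st_uid != 0)
                return 0;       /* owned by someone other than the user or root */
        if (st.st_mode & (S_IWGRP | S_IWOTH))
                return 0;       /* writable by someone other than the owner */
        return 1;
}
```

A typical call would pass the path of the user's .rhosts file and that user's uid.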

If the pbs_server attribute flatuid is set to true, this test is skipped and the job is accepted based on the submitting user's name alone. The security implications are fairly obvious: users can easily impersonate one another if they can create arbitrarily named accounts on systems which can qsub to the PBS server.

Flatuid or not, to run as a user other than the job owner (the submitter) you must have authorization to do so. Otherwise, any user could run a job as any other user. You authorize userA to run a job as userB the same way you authorize userA@host1 to run a job as userA on host2 when flatuid is not set, i.e. see ruserok() and .rhosts.
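
The rule described in the last two paragraphs can be summarized in a few lines of C. This is not PBS source code, just a sketch of the logic as stated here; would_accept, equiv_check, allow_all and deny_all are hypothetical names, and the check is passed in as a function pointer so that ruserok() or a stand-in can be used.

```c
#include <string.h>

/* A check that returns 0 when access is allowed, like ruserok(). */
typedef int (*equiv_check)(const char *rhost, const char *ruser,
                           const char *luser);

/* Stand-ins for illustration only. */
int allow_all(const char *rhost, const char *ruser, const char *luser)
{
        (void)rhost; (void)ruser; (void)luser;
        return 0;
}

int deny_all(const char *rhost, const char *ruser, const char *luser)
{
        (void)rhost; (void)ruser; (void)luser;
        return -1;
}

/* Returns 1 if the request would be accepted under the rule above, else 0. */
int would_accept(int flatuid, const char *rhost, const char *ruser,
                 const char *luser, equiv_check check)
{
        /* flatuid only skips the test when the user names match */
        if (flatuid && strcmp(ruser, luser) == 0)
                return 1;
        /* otherwise the host/user equivalence test decides */
        return check(rhost, ruser, luser) == 0;
}
```

With flatuid set, einjen submitting as einjen is accepted even when the equivalence test would fail, while userA submitting as userB still depends on that test.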

Here is a test program to see if ruserok passes for a given user and host:

/*
   Two use cases:
    1) User submitting job from remote host to server getting unexpected
        "Bad UID" message. That is, user doesn't have access when he thinks
        he should.
    2) User(s) can delete, etc other user(s) jobs. That is, one user is able
        to act as what he thinks is a different user, server sees them as
        being equivalent.

Build with "cc ruserok.c -o ruserok"

Usage (run on the PBS server system):

ruserok remote_host remote_user1 local_user2

where:

remote_host:  the host from which the job is being submitted, or where the PBS client command is issued

remote_user1: the username of the user submitting the job, or issuing the client command

local_user2: the username of the user remote_user1 is trying to submit the job as, or owner of the job that remote_user1 is trying to act on with the client command

*/


#include <errno.h>
#include <netdb.h>      /* declares ruserok() on glibc; some systems use <unistd.h> */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int rc;
        char hn[257];

        if (argc != 4) {
                fprintf(stderr, "Usage: %s remote_host remote_user1 local_user2\n", argv[0]);
                return 1;
        }
        if (gethostname(hn, 256) < 0) {
                perror("unable to get hostname");
                return 2;
        }
        hn[256] = '\0';

        printf("on local host %s, from remote host %s\n", hn, argv[1]);
        rc = ruserok(argv[1], 0, argv[2], argv[3]);
        if (rc == 0)
                printf("remote user %s is allowed access as local user %s\n", argv[2], argv[3]);
        else
                printf("remote user %s is denied access as local user %s\n", argv[2], argv[3]);

        return 0;
}

#6

Ok.

Not sure if I understood how that was supposed to work:
[root@nas1 ~]# ./testpbsuser nas1 einjen einjen
on local host nas1, from remote host nas1
remote user einjen is denied access as local user einjen


#7

It is not really meaningful unless you use a hostname different from the one you are on, so in your case you’d want to test with “compute-0-2” rather than “nas1” (you are ON nas1).

Here’s an example showing one way (hosts.equiv in this case, there are other methods alluded to in the man page) to make an ruserok() test pass:

[root@centos7 tmp]# cat /etc/hosts.equiv

[root@centos7 tmp]# ./ruserok rhel64 user1 user1
on local host centos7.prog.altair.com, from remote host rhel64
remote user user1 is denied access as local user user1

[root@centos7 tmp]# echo rhel64 > /etc/hosts.equiv

[root@centos7 tmp]# ./ruserok rhel64 user1 user1
on local host centos7.prog.altair.com, from remote host rhel64
remote user user1 is allowed access as local user user1
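
To see at a glance whether a submit host is already listed in a hosts.equiv-style file, something like the following could be used. It is a deliberately simplified sketch: host_listed is a hypothetical name, it only compares the first field of each line exactly, and it ignores the +, - and netgroup forms that the real file format allows. The ruserok test program above remains the authoritative check.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if host is the first field of some line in the file at path,
   0 if it is not, and -1 if the file cannot be opened. */
int host_listed(const char *path, const char *host)
{
        char line[1024];
        char first[1024];
        int found = 0;
        FILE *fp = fopen(path, "r");

        if (fp == NULL)
                return -1;
        while (fgets(line, sizeof line, fp) != NULL) {
                /* take the first whitespace-delimited field of the line */
                if (sscanf(line, "%1023s", first) == 1 &&
                    strcmp(first, host) == 0) {
                        found = 1;
                        break;
                }
        }
        fclose(fp);
        return found;
}
```

In the example above, host_listed("/etc/hosts.equiv", "rhel64") would be 0 before the echo command and 1 after it.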