Increasing the open files limit across all nodes


#1

Hi

I need to raise the ulimit for open files from 1024 to around 10000.
I have added that limit for a user in /etc/security/limits.conf on all nodes.
I have also added “session required pam_limits.so” to /etc/pam.d/login.
With a normal login the larger limit now applies, but it does not apply to PBS interactive logins or PBS jobs.

I have seen this post where they needed to up the stack-size.


They needed to edit PBS_EXEC/lib/init.d/limits.pbs_mom on each mom.

I have therefore edited /usr/pbs/lib/init.d/limits.pbs_mom (on one idle node) and added at the end of the file:

if [ -f /etc/sgi-release -o -f /etc/sgi-compute-node-release ] ; then
    MEMLOCKLIM=`ulimit -l`
    NOFILESLIM=`ulimit -n`
    STACKLIM=`ulimit -s`
    ulimit -l unlimited
    ulimit -n 16384
    ulimit -s unlimited
fi
NOFILESLIM=10000

Then I ran /etc/init.d/pbs restart.

Using a PBS interactive job to log in to that node, I still get the old limit:

mynode~$ ulimit -Sn
1024

If I log in directly without going through PBS, I get the larger limit set in limits.conf, as expected.

So there is something else I need to do to get PBS to set a ulimit. (Note: I have not changed limits.pbs_mom on the head node.)
I’m using PBSPro 14.2. There is no mention of ulimit or NOFILESLIM in the Admin or Reference Guides.

Mike


#2

Please edit $PBS_EXEC/lib/init.d/limits.pbs_mom on each mom so that it contains only the lines below (remove any other contents), then restart the mom services:

ulimit -H -s unlimited
ulimit -S -s 65536
ulimit -l unlimited
ulimit -n 10000

Otherwise:

  • try setting it in sysctl.conf and then running sysctl -p
  • set the hard and soft limits in /etc/security/limits.d/nofile.conf
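
A small sketch of why the ulimit lines in limits.pbs_mom can take effect where limits.conf does not: a limit set in a shell is inherited by every process that shell starts. This assumes the PBS init script sources limits.pbs_mom before launching pbs_mom, and that jobs then inherit the mom's limits rather than passing through a PAM login session. The helper name child_nofile_after is hypothetical.

```shell
#!/bin/sh
# Sketch: a limit set in a shell is inherited by the processes it starts.
# child_nofile_after is a hypothetical helper name, not part of PBS.
child_nofile_after() {
    # $1: soft open-files limit to set before launching a child process.
    # The parentheses run this in a subshell, so the caller's own limit
    # is left unchanged.
    ( ulimit -S -n "$1" && sh -c 'ulimit -S -n' )
}

# A plain assignment such as NOFILESLIM=10000 only creates a shell
# variable; it never calls ulimit, so it changes no limit.
child_nofile_after 256
```

The child shell reports the value set by its parent, provided the current hard limit allows that value.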

#3

Hi adarsh

I added those 4 ulimit commands to limits.pbs_mom on just one non-busy node, then ran /etc/init.d/pbs restart. Now an interactive PBS job does show the higher ulimit :)

So I should not have used “NOFILESLIM=10000” (it looked to me like NOFILESLIM was a PBS environment variable), and I should use the actual ulimit commands in limits.pbs_mom.

Do I need to do a full “/etc/init.d/pbs restart”, or would a HUP of the mom process be sufficient? i.e.

$ ps ax | grep pbs
 16461 ?        Ssl    0:00 /usr/pbs/sbin/pbs_mom
$ sudo kill -HUP 16461

A HUP would not affect running jobs, is that correct?
Would a restart of the mom affect running jobs?

I’ll check more tomorrow when at work.

Mike


#4

Hi Mike,

Please use ulimit commands in limits.pbs_mom.

A restart is required: either /etc/init.d/pbs restart or systemctl restart pbs.

Thank you
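
After the restart, one way to confirm that the running mom picked up the new value is to read its entry in /proc. This is a minimal sketch, assuming a Linux node; nofile_limits is a hypothetical helper name, and reading another user's /proc/<pid>/limits may need root.

```shell
#!/bin/sh
# Sketch: print the soft and hard "Max open files" limits recorded in a
# /proc/<pid>/limits file. nofile_limits is a hypothetical helper name.
nofile_limits() {
    # $1: path to a limits file, e.g. /proc/16461/limits
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Example use on a node (assumes pgrep is available and exactly one
# pbs_mom process is running):
#   nofile_limits "/proc/$(pgrep -x pbs_mom)/limits"
```

If the output still shows the old value, the mom was presumably started before limits.pbs_mom was changed.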


#5

I’ve been having similar issues. My solution is the following:

  • In /etc/systemd/system.conf, append “DefaultTasksMax=65536”.
  • In /etc/security/limits.conf, set the hard nofile limit with “* hard nofile 131072”.

Hope this helps,

Pete


#6

Hi all

Thanks for this help. I have rolled out those changes to several nodes that had no jobs on them, installing the new limits.pbs_mom and doing a full restart of the PBS mom. From my testing, that looks to have worked fine.
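
For anyone repeating this, the roll-out can be sketched as a dry run that only prints the commands for each idle node. The node names, the root ssh/scp access, and the local ./limits.pbs_mom copy are assumptions for illustration. The destination path and restart command are the ones used earlier in this thread.

```shell
#!/bin/sh
# Sketch: print (not run) the copy and restart commands for each node.
# rollout_cmds is a hypothetical helper; node names are placeholders.
LIMITS_DEST=/usr/pbs/lib/init.d/limits.pbs_mom

rollout_cmds() {
    # "$@": names of nodes that currently have no jobs running
    for node in "$@"; do
        printf 'scp ./limits.pbs_mom root@%s:%s\n' "$node" "$LIMITS_DEST"
        printf 'ssh root@%s /etc/init.d/pbs restart\n' "$node"
    done
}

# Dry run for two placeholder node names:
rollout_cmds node01 node02
```

Printing first makes it easy to review the list, since the restart should only go to nodes with no running jobs.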

Thanks
Mike