openrm returns SIGSEGV


#1

Hi, list,
I'm trying out the rm interface with PBS Pro 18.1.0.
My test code is below:

#include <sys/types.h>
#include <netinet/in.h>
#include <rm.h>
int main()
{
	openrm("debian",15002);
}

As you can see, my distribution is Debian Sid.
I compile it with:

${CC} -o a.out test.c -I${PBS_EXEC}/include -L${PBS_EXEC}/lib -lpbs -lnet 

At runtime:

# export LD_LIBRARY_PATH=$PBS_EXEC/lib:$LD_LIBRARY_PATH
#./a.out
Segmentation fault

Am I doing something wrong? I would expect openrm to return a negative value if the connection fails, rather than crash with SIGSEGV.


#2

You should generate a core file and analyze it with gdb. Make sure your core file limit is unlimited by running ulimit -c unlimited before you execute your binary. Once the core file is generated, analyze it by running gdb ./a.out <corefilename>. Once you see the gdb prompt, you may issue the command where or bt to print a backtrace. If you cut and paste the output, we can help you analyze it.

Thanks,

Mike


#3

Here is the output:

# gdb ./a.out core
GNU gdb (Debian 8.1-4+b1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...done.
[New LWP 5789]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb)

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x000055aa9fa7e5db in openrm (host=0x55aa9fa7f004 "debian", port=15002)
    at rm.c:177
#2  0x000055aa9fa7e1ea in main () at test.c:7
 

The source around rm.c:177:

                                break;
                }
        }
177 ->         stream = rpp_open(host, port);
#else
        if ((stream = socket(AF_INET, SOCK_STREAM, 0)) != -1) {
                int     tryport = IPPORT_RESERVED;
                struct  sockaddr_in     addr;

I examined the $PBS_EXEC directory carefully and found that neither the header (rpp.h) nor the library (librpp.a) exists there.


#4

Add a call to set_rpp_funcs(NULL) before you call openrm.
It takes a pointer to a function used to log messages. You can pass NULL if you do not want any logging.

You can also use pbs_rmget, found in the “${PBS_EXEC}/unsupported” directory. This binary does what you are trying to do.


#5

That did solve the SIGSEGV problem.
One thing I should mention: rpp.h can only be found in <source_code_location>/src/include, not in <INSTALL_PREFIX>/include, where it is missing. Linking against libpbs.so works fine, though.

It's good to hear there is a prebuilt pbs_rmget command. However, this is just practice for myself, for fun :).

Thanks


#6

Right, there is scope for improvement here. I think openrm should call this function internally to set up the rpp functions if they are not already set. A user writing a program against the rm library APIs should not have to do that.
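
For illustration only, here is a minimal sketch of the kind of guard being suggested. It is not taken from the PBS sources: every name in it (pfn_open, fake_open, set_funcs, guarded_open, init_calls) is a hypothetical stand-in, and it assumes the crash came from calling through a function pointer that was still NULL, which is what frame #0 at address 0x0 in the backtrace suggests.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the PBS sources. */
typedef int (*open_fn)(const char *host, unsigned int port);

static open_fn pfn_open = NULL;  /* stays NULL until the table is installed */
static int init_calls = 0;       /* counts how often the table was installed */

/* Pretend transport: always "succeeds" and returns a fake stream number. */
static int fake_open(const char *host, unsigned int port)
{
	(void)host;
	(void)port;
	return 3;
}

/* Plays the role of set_rpp_funcs(): installs the function pointer. */
static void set_funcs(void (*log_fn)(char *))
{
	(void)log_fn;            /* NULL here means "no logging" */
	pfn_open = fake_open;
	init_calls++;
}

/* Plays the role of openrm(), with the proposed guard added. */
static int guarded_open(const char *host, unsigned int port)
{
	if (host == NULL)
		return -1;           /* report bad input with a negative value */
	if (pfn_open == NULL)
		set_funcs(NULL);     /* lazy setup, so the caller need not do it */
	return pfn_open(host, port);
}
```

With a guard like this, the first call to guarded_open installs the pointer and later calls reuse it, so a caller who never calls the setup function gets a normal return value instead of a jump to address zero.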