Noob question: can't initialize the PBS dataservice

#1

Hi,
I’m trying PBS Pro for the first time, aiming to replace Torque, which is no longer open source.
I followed the INSTALL file instructions and can build the code from scratch on a CentOS 6.10 test machine.
However, when I try to start the service, it gives the following error:
[root@c6 libexec]# service pbs restart
Restarting PBS
Stopping PBS
PBS sched - was pid: 5019
PBS comm - was pid: 5004
Waiting for shutdown to complete
/sbin/chkconfig
Starting PBS
/cluster/pbspro/sbin/pbs_comm ready (pid=6007), Proxy Name:c6:17001, Threads:4
PBS comm
Creating usage database for fairshare.
PBS sched
Connecting to PBS dataservice…connected to PBS dataservice@c6
Server@c6: Server@c6, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation "pbs.job" does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
pbs_server startup failed, exit 255 aborting.
Did I miss something in the process?

Mike Chen
Research Assistant
Dept. of Atmospheric Science
National Taiwan University

#2

The problem is not present in v14.1.2.
I’m not sure about the cause; maybe a PostgreSQL version compatibility issue?
Since the target system is CentOS 6, I’ll test with v14.1.2 first.
Input is welcome.

#3

Hi,
I have not seen this problem myself, but since the error is related to the DB schema, you are right that a PostgreSQL incompatibility might be causing it.
Some database-specific changes were made in version 19, and they require the postgresql-contrib package to be installed on your system. Can you please install that package and try rebuilding and installing PBS again?

#4

Hi,
Thanks for the reply!
I tried with v18.1.3 and it works.
So, as you expected, the problem is in v19 only.
I tried v19.1.1 again, but found that I already had the postgresql-contrib package installed when the problem occurred.
Both postgresql and postgresql-contrib are at version 8.4.20-8.el6.9.

Mike

#5

Hi @mikescchen

Yes, I have the same problem.

I saw the same error with the latest commit in the master branch (a few weeks ago).
(My test VM environment is: config.vm.box = "centos/7")

Instead, I use the following commit on our cluster and test VM.

  • commit 9ad06407424b16c0f097f56df11fd309f39d52d5
    | Author: Bhroam Mann bmann@altair.com
    | Date: Thu Feb 1 10:23:23 2018 -0800

This version doesn’t give me the error.

#6

Hi~
I did a quick test with tag v19.1.1 and the latest master branch.
Neither of them has the problem on CentOS 7.6.
FYI~

Mike

#7

Hi

Which version of postgres works with v19.0.0?

Regards,
Nkwe

#8

Can you please try installing a newer postgres version (like 9.6), including the contrib package, on CentOS 6.10 and check again?

Thanks!
Arun

#9

The PostgreSQL in CentOS 7.5 repository is 9.2.24-1.
Meanwhile, the one in CentOS 6.10 is 8.4.20-8.
FYI~

Mike

#10

Hi @mikescchen,

Yes, you are right: PostgreSQL v8.4.20 is the reason you are facing this issue while starting the database.
In the latest version, PBS Pro v19.1, we started using the “hstore” module, which was not required in PBS Pro v14.1 and PBS Pro v18.1.

I tried on the same platform, CentOS 6, with the same postgres version, 8.4.20, alongside the contrib package, and hit the same issue while starting the services.

The current database schema and scripts are compatible with Postgres versions that support the “create extension” feature (say, from 9.4 onwards).

So please use the latest supported version of the Postgres database, as per https://www.postgresql.org/support/versioning/

I recommend using v9.6 with the latest PBS Pro v19.1.
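
To check a host before installing, the version requirement above can be tested in a script. The following is only a minimal sketch and not part of PBS Pro: pg_version_ok is a hypothetical helper name, the 9.4 default is the rough lower bound mentioned in this post, and "sort -V" assumes GNU coreutils.

```shell
# Hypothetical helper (not part of PBS Pro): succeeds when version $1 is
# at least the minimum $2. Relies on GNU sort's -V (version sort) option.
pg_version_ok() {
    have=$1
    min=${2:-9.4}   # assumption: rough lower bound mentioned in this thread
    # The smaller of the two versions sorts first; the check passes when
    # that smaller one is the required minimum.
    lowest=$(printf '%s\n%s\n' "$have" "$min" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

# Example: test the psql found in PATH (first output line ends with the version).
#   pg_version_ok "$(psql --version | awk 'NR==1 {print $NF}')" || echo "PostgreSQL too old"
```

With the versions reported in this thread, 8.4.20 and 9.2.24 fail the check, while 9.4.21 and 9.6.12 pass.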

Please write back if you need any more info. Thanks.

P.S. Many thanks for trying this on CentOS 6 and notifying us of this issue. We will add these DB version requirements to the RPM builds.

#11

Hi,
Great to have the cause confirmed!
Out of curiosity, I built PostgreSQL 9.4.21 from source on CentOS 6 and installed it in /opt/pgsql.
But the PBS Pro configure script can’t find the database headers:

checking for PBS database directory… configure: error: Database headers not found.

I tried passing CPPFLAGS to configure, but that did not work:

./configure --prefix=/opt/pbs CPPFLAGS=-I/opt/pgsql/include

Any ideas on how to build PBSpro with source-built PostgreSQL?

Mike

#12

Hi Mike,

To configure PBS Pro against pre-compiled libraries, we support “--with-…” flags.

Please configure like this: ./configure --with-database-dir=<path/to/compiled/pgsql/dir>

More info is available from ./configure --help
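
Before pointing configure at a source-built PostgreSQL, it can help to confirm that the directory really contains a complete install. The following is only a sketch: check_pg_prefix is a hypothetical helper, and the file list is an assumption (libpq-fe.h is PostgreSQL’s client header, which is the kind of file a “Database headers not found” check would plausibly look for).

```shell
# Hypothetical pre-flight check (not part of PBS Pro): verify that a
# PostgreSQL install prefix contains the files a build against it needs.
# Assumption: the header check is satisfied by the libpq client header.
check_pg_prefix() {
    prefix=$1
    missing=0
    for f in include/libpq-fe.h bin/psql bin/postgres bin/pg_config; do
        if [ ! -e "$prefix/$f" ]; then
            echo "missing: $prefix/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (paths taken from this thread):
#   check_pg_prefix /opt/pgsql && ./configure --prefix=/opt/pbs --with-database-dir=/opt/pgsql
```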

Regards,
Brem

#13

Hi!
Thanks for the info!
I can now compile pbspro with source-built PostgreSQL 9.4.21 on CentOS 6.
I did some searching, and it looks like PostgreSQL can’t be run as root.
So I created a system account, postgres, with:

useradd postgres -d /var/lib/pgsql -m -r

After running pbs_postinstall, I got this error when starting the pbs service:

PBS Home directory /var/spool/pbs needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.


*** psql command is not in PATH

But psql actually is in PATH:

[root@c6 pbspro]# which psql
/opt/pgsql/bin/psql

Anyway, following the message, I ran pbs_habitat.
It did not complete, but got stuck at this step:

Connecting to PBS dataservice…

From the “ps ax|grep post” output, I can see that the PostgreSQL server is started and the database is initialized:

[root@c6 pbspro]# ps ax|grep post
1260 ? Ss 0:00 /usr/libexec/postfix/master
27225 ? S 0:00 /opt/pgsql/bin/postgres -D /var/spool/pbs/datastore -p 15007
27228 ? Ss 0:00 postgres: logger process
27230 ? Ss 0:00 postgres: checkpointer process
27231 ? Ss 0:00 postgres: writer process
27232 ? Ss 0:00 postgres: wal writer process
27233 ? Ss 0:00 postgres: autovacuum launcher process
27234 ? Ss 0:00 postgres: stats collector process

So I have no idea why pbs_habitat gets stuck.
Looking for more help on this; I think I’m quite close.

Mike

#14

Hi Mike,

This might be a system issue: since, as you mentioned, the “psql” command is in the PATH, it should execute without any complaint.

My guess is that the following steps should help.

  1. Stop the running postgres.
  2. Remove the PBS HOME dir (rm -rf /var/spool/pbs)
  3. If you have installed the source-built pgsql in a directory (say, one containing the bin/include/lib/share sub-dirs), copy the whole directory to /opt/pbs/pgsql and chown it to the database user.
    steps:
    a) mkdir /opt/pbs/pgsql
    b) cp -R <path/to/source/built>/* /opt/pbs/pgsql
    c) chown -R postgres:postgres /opt/pbs/pgsql
  4. Run pbs_habitat (/opt/pbs/libexec/pbs_habitat)
  5. Run /opt/pbs/libexec/pbs_postinstall

Please share your feedback after following these steps.

Thanks,
Brem

#15

Also, I see from your earlier logs that you configured the prefix as “/cluster/pbspro/”, so please replace “/opt/pbs/” in my previous steps with your configured prefix directory. I gave the steps with the default path where PBS Pro installs.

The idea here is to provide another copy of the built binaries in the pgsql directory under the install prefix (/opt/pbs/pgsql by default), so that the pbs_habitat script can pick up the path from either of these locations.

-Brem

#16

Ouch!
I tried your steps, but the problem remains.
The postgres process is there, so I tried connecting to the DB directly, but hey:

[root@c6 postgresql-9.4.21]# psql -h c6 -p 15007
psql: could not connect to server: Network is unreachable
Is the server running on host "c6" (192.168.20.6) and accepting
TCP/IP connections on port 15007?

The firewall is off; I then found that the problem was in the name resolution.
The IP in /etc/hosts was incorrect.
pbs_habitat can connect to the DB now that the IP is fixed.
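
For anyone hitting the same hang, the name-resolution check described above can be sketched as follows. hosts_ip_for is a hypothetical helper that only reads a hosts-format file; it does not consult DNS or any other resolver source.

```shell
# Hypothetical helper: print the first IP address that a hosts-format file
# (default /etc/hosts) maps to the given name, ignoring comments.
hosts_ip_for() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Example: compare the mapped address with the addresses the host really has.
#   hosts_ip_for "$(hostname)"
#   ip -4 addr show
```

If the address printed for the server’s hostname is not one of the addresses configured on the machine, connections to the dataservice port can fail in the way shown above.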
However… now I have that “Failed to initialize PBS dataservice” problem again.
The postgres version is confirmed to be 9.4.21:

[root@c6 postgresql-9.4.21]# postgres --version
postgres (PostgreSQL) 9.4.21

All I did to configure the postgresql build was:

./configure --prefix=/opt/pgsql

Do I need extra options to enable the required features when building postgresql?
Meanwhile, I’ll try a newer version of postgresql.

Mike

#17

Hi,
Update:
Tried with postgres 9.6.12.
Still has this error:

[root@c6 postgresql-9.6.12]# /opt/pbs/libexec/pbs_habitat


*** Setting default queue and resource limits.


Connecting to PBS dataservice…connected to PBS dataservice@c6
Server@c6: Server@c6, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR: relation "pbs.job" does not exist
LINE 1: insert into pbs.job (ji_jobid,ji_state,ji_substate,ji_svrfla…
^]
*** Error starting pbs server

#18

Hi,

I would like to see your Postgres compilation steps.

I have just built Postgres 9.4.21 on CentOS 6 and tried to create the hstore extension. It worked fine; next I will proceed to build PBS Pro with this compiled pgsql.

To validate your pgsql compilation, please connect to the DB and try these steps.

postgres=# select version();
                                                    version
----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.21 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit
(1 row)

postgres=# create extension hstore;

postgres=# \dx
                           List of installed extensions
  Name   | Version |   Schema   |                   Description
---------+---------+------------+--------------------------------------------------
 hstore  | 1.3     | public     | data type for storing sets of (key, value) pairs
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

Please follow the instructions given in the Postgres INSTALL file, and also compile and install the extensions.
You need to run a) ./configure, b) make world, and c) make install-world.
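
A plain “make” followed by “make install” in the PostgreSQL source tree does not build the contrib modules, so hstore can be missing even on 9.4 or 9.6. One way to check an existing install without connecting to the DB is to look for the extension’s control file. This is only a sketch: has_hstore is a hypothetical helper, and the pg_config path in the example assumes the /opt/pgsql prefix used earlier in this thread.

```shell
# Hypothetical helper: succeed if the hstore extension's control file is
# present under the given PostgreSQL share directory.
has_hstore() {
    sharedir=$1
    [ -f "$sharedir/extension/hstore.control" ]
}

# Example (pg_config --sharedir prints the share directory of that install):
#   has_hstore "$(/opt/pgsql/bin/pg_config --sharedir)" || echo "hstore missing: run make world and make install-world"
```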

Regards,
Brem

#19

Sorry, I didn’t see your reply before posting my update!
To clear things up: now that (I think) I understand the process better, I recreated the test environment from scratch, starting with a fresh install of CentOS.
I have used --prefix=/opt/pbs since then.
So the recent test results are not from the environment I began this thread with.

It’s weird that the problem remains even if I install postgres directly into /opt/pbs/pgsql (and then chown it).
The symptom is the same with postgresql 8.4 from the repository, 9.4.21 from source, and 9.6.12 from source.
My guess is that the DB creation by postgres is okay and pbspro can connect to the DB, but the problem happens when it tries to create the DB structures.
We already know that postgres 8.4 lacks the features pbspro 19.x requires, but that can’t explain why it fails with the same message on source-built 9.4.21 and 9.6.12.
What did I miss? Do I have to install some extra features or modules of postgres?

Mike

#20

I have tried with the same machine setup:

OS : CentOS 6.8
Postgres DB : v9.4.21

Configuration:

  1. ./configure --prefix=/opt/pbs --with-database-dir=/home/pbs/pgsql --with-database-user=pbs
  2. make
  3. make install

Installation logs:

[root@selinux-training pbspro]# /opt/pbs/libexec/pbs_postinstall
*** PBS Installation Summary


*** Postinstall script called as follows:
*** /opt/pbs/libexec/pbs_postinstall ‘’


*** No configuration file found.
*** Creating new configuration file: /etc/pbs.conf
*** Replacing /etc/pbs.conf with /etc/pbs.conf.19.0.0
*** /etc/pbs.conf has been created.


*** Registering PBS Pro as a service.
*** Systemctl binary is not available; Failed to register PBS Pro as a service


*** PBS_HOME is /var/spool/pbs
*** Setting TZ from /etc/sysconfig/clock
*** Creating new file /var/spool/pbs/pbs_environment


*** The PBS Pro server has been installed in /opt/pbs/sbin.
*** The PBS Pro scheduler has been installed in /opt/pbs/sbin.


*** The PBS Pro communication agent has been installed in /opt/pbs/sbin.


*** The PBS Pro MOM has been installed in /opt/pbs/sbin.


*** The PBS commands have been installed in /opt/pbs/bin.


*** End of /opt/pbs/libexec/pbs_postinstall
[root@selinux-training pbspro]# vi /etc/pbs.conf
[root@selinux-training pbspro]# /etc/init.d/pbs start
/sbin/chkconfig
Starting PBS
PBS Home directory /var/spool/pbs needs updating.
Running /opt/pbs/libexec/pbs_habitat to update it.


*** Setting default queue and resource limits.


Connecting to PBS dataservice…connected to PBS dataservice@selinux-training
*** End of /opt/pbs/libexec/pbs_habitat
Home directory /var/spool/pbs updated.
/opt/pbs/sbin/pbs_comm ready (pid=45156), Proxy Name:selinux-training:17001, Threads:4
PBS comm
PBS mom
Creating usage database for fairshare.
PBS sched
Connecting to PBS dataservice…connected to PBS dataservice@selinux-training
Licenses valid for 10000000 Floating hosts
PBS server
[root@selinux-training pbspro]# cat /etc/*release*
CentOS release 6.8 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
cat: /etc/lsb-release.d: Is a directory
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)
cpe:/o:centos:linux:6:GA
