How to test database update

Soon, we’ll be upgrading our current PBSPro 13 to 19.2.1 and are concerned about carrying the database forward, due to unpleasant experiences seen doing a naive update on a test cluster.

Assume we want to keep the outage associated with this upgrade as short as possible. So, we want to try everything ahead of time on a test host. Something like:

  • Clone the production server to a test server, including PBS_EXEC and PBS_HOME.
  • Make sure the postgres “hstore” extension is available.
  • Change /etc/pbs.conf on the test server to point to itself.
  • Fiddle the network connections so the test server cannot talk to real MoMs or COMMs.
  • Go through the Installation Guide section 6.6 steps on the test server, skipping the parts that apply to MoMs and COMM-only hosts.
  • Note that we will build and install from source, not RPMs.
    • Should we manually run “pbs_postinstall server” after the install?
  • Repeat, tweaking procedures until database update works.

Does using a test host this way sound do-able? Is there a simpler way?

If you are not interested in preserving the job-id sequence and accounting logs , but only the configuration , then you can follow the below steps:

On the PBS Pro 13.x :

  1. qmgr -c “print server” > print_server.txt
  2. qmgr -c “print node @default” > print_node.txt
  3. qmgr -c “print sched @default” > print_sched.txt
  4. $PBS_HOME/server_priv/resourcedef # backup this file
  5. $PBS_HOME/sched_priv/sched_config # backup this file
  6. $PBS_HOME/server_priv/hooks #backup this file
  7. Take a backup of /etc/pbs.conf

Deploy PBS Pro 19.x on the test cluster

Make sure the PBS Server is up and running , so that you can import the configuration of PBS Pro 13.x as below into 19.x

a. qmgr < print_server.txt
b. qmgr < print_node.txt ( make sure the /etc/hosts file has all the hostname of the compute nodes)
c. qmgr < print_sched.txt
d. copy the $PBS_HOME/sched_priv/sched_config to the same location ( back-up the original)
e. adjust the attributes /etc/pbs.conf to suite the new cluster
d. restart the pbs services

Another way:

  1. Take a pgdump of the PBS Pro 13.x
  2. Install PBS Pro 19.x on the test cluster
  3. import the dump in the postgres

Hope this helps

As I expected, a straight update using the old database did not go well.

First, the password from the old database did not match what the new install created, so pbs_server would not start.

Next, pbs_ds_password.bin core dumped when trying to set the password. After fixing that, pbs_server was unable to start postgres for an unknown reason.

07/23/2019 09:27:45;0002;Server@mtestpbs;Svr;Server@mtestpbs;PBS dataservice not running:[Connection:  failed: could not connect to server: Connection refused
        Is the server running on host "127.0.0.1" and accepting
        TCP/IP connections on port 15007?]

So, I ran “pbs_dataservice start” to get it started. That worked, but then pbs_server complained and exited:

07/23/2019 09:29:47;0002;Server@mtestpbs;Svr;Server@mtestpbs;connected to PBS dataservice@localhost
07/23/2019 09:29:47;0001;Server@mtestpbs;Svr;Server@mtestpbs;Server@mtestpbs, Failed to initialize PBS dataservice:[Prepare of statement insert_job failed: ERROR:  column "attributes" of relation "job" does not exist
LINE 1: ...ji_4ash,ji_credtype,ji_qrank,ji_savetm,ji_creattm,attributes...
                                                             ^]
07/23/2019 09:29:48;0002;Server@mtestpbs;Svr;Server@mtestpbs;Stopping PBS dataservice

I’m pretty sure the issue is that the database is still in version 13 format and needs to be updated to version 19. What tools are available to do that conversion?

I am not sure whether there are any tools to update the database schema/changes to the latest release.

  1. either upgrading 13.x to 19.x and taking the dump would be the alternative way
  2. the best way is configuration migration with a fresh installation PBS Pro OSS 19.x . You would have the rest of the details in the accounting logs.

$PBS_EXEC/libexec/pbs_schema_upgrade will take care of the schema changes. Ideally it will get invoked when you start PBS using init script or systemctl.

1 Like

Success! After I patched pbs_schema_upgrade. That is, the upgrade script creates the hstore extension as part of the pbs schema. However, the 19 server expects hstore to be available generally.

Once I moved hstore from pbs to public, I was able to take a PBSPro 13 database and upgrade it all the way to one acceptable by PBSPro 19.

$ diff -u /PBS/libexec/pbs_schema_upgrade pbs_schema_upgrade 
--- /PBS/libexec/pbs_schema_upgrade     2019-08-05 11:02:17.371586175 -0700
+++ pbs_schema_upgrade  2019-08-12 14:34:37.517228831 -0700
@@ -161,9 +161,9 @@
 
        ${PGSQL_DIR}/bin/psql -p ${PBS_DATA_SERVICE_PORT} -d pbs_datastore -U ${PBS_DATA_SERVICE_USER} < /dev/null
 
-               CREATE EXTENSION hstore SCHEMA pbs;
+               CREATE EXTENSION hstore SCHEMA public;
 
-               ALTER TABLE pbs.job ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.job ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.job SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
@@ -172,7 +172,7 @@
                UPDATE pbs.job SET attributes='' WHERE attributes IS NULL;
                ALTER TABLE pbs.job ALTER COLUMN attributes SET NOT NULL;
 
-               ALTER TABLE pbs.node ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.node ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.node SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
@@ -181,7 +181,7 @@
                UPDATE pbs.node SET attributes='' WHERE attributes IS NULL;
                ALTER TABLE pbs.node ALTER COLUMN attributes SET NOT NULL;
 
-               ALTER TABLE pbs.queue ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.queue ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.queue SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
@@ -190,7 +190,7 @@
                UPDATE pbs.queue SET attributes='' WHERE attributes IS NULL;
                ALTER TABLE pbs.queue ALTER COLUMN attributes SET NOT NULL;
 
-               ALTER TABLE pbs.resv ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.resv ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.resv SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
@@ -199,7 +199,7 @@
                UPDATE pbs.resv SET attributes='' WHERE attributes IS NULL;
                ALTER TABLE pbs.resv ALTER COLUMN attributes SET NOT NULL;
 
-               ALTER TABLE pbs.scheduler ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.scheduler ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.scheduler SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
@@ -208,7 +208,7 @@
                UPDATE pbs.scheduler SET attributes='' WHERE attributes IS NULL;
                ALTER TABLE pbs.scheduler ALTER COLUMN attributes SET NOT NULL;
 
-               ALTER TABLE pbs.server ADD attributes pbs.hstore DEFAULT ''::pbs.hstore;
+               ALTER TABLE pbs.server ADD attributes hstore DEFAULT ''::hstore;
                UPDATE pbs.server SET attributes=(
                        SELECT hstore(array_agg(attr.key ), array_agg(attr.value))
                                FROM ( SELECT concat(attr_name, '.' , attr_resource) AS key,
2 Likes

Hi @dtalcott, thanks for following up here. This has been taken care of in master branch since July 26th: https://github.com/PBSPro/pbspro/pull/1213.