Cannot delete a vnode whose name is prefixed with # (hash)


#1

Hi All,

I accidentally created a vnode whose name is prefixed with a hash, e.g. #cn_vnode1,
as in the following version 2 config file of the MoM:

#cn_vnode1: resources_available.ncpus = 10
cn_vnode2: resources_available.ncpus = 10
blah blah …

Running the pbs_mom -s insert command successfully read the file and created #cn_vnode1 (not what I expected, actually).
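Since the file was accepted as written, one way to catch this kind of slip is to scan the vnode definition file before inserting it. The following is only a sketch: find_hash_vnodes is a hypothetical helper (not a PBS command), and it assumes definition lines have the shape "name: attribute = value" shown above.

```shell
# Sketch: flag vnode definition lines whose name begins with '#'.
# find_hash_vnodes is a hypothetical helper, not part of PBS.
find_hash_vnodes() {
    # Print "LINENO:line" for lines shaped like "#name: attribute = value".
    # A '#' followed by a space (an ordinary comment) is not reported.
    grep -nE '^[[:space:]]*#[^:[:space:]]+:[[:space:]]*[^=[:space:]]+[[:space:]]*=' "$1"
}
```

Running find_hash_vnodes on the file before pbs_mom -s insert would list the suspicious lines; it exits non-zero when nothing is found.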

qmgr can print #cn_vnode1 with print node @default,
but it seems it cannot be removed with the command delete node #cn_vnode1.
When I try to delete it, it reports the message
No Active Nodes, nothing done.

qmgr treats the # (hash) as the start of a comment and does not parse anything after it.
I have tried the \ (escape), single quotes, and double quotes, and they all failed (Syntax error).

Does anyone have an idea how to fix this? Is it a bug?
I have updated the MoM version 2 config file, but the ghost vnode still cannot be removed.
Any suggestions would be appreciated.

Thanks,


#2

Hi Chris,

You need to delete it from the pbs_datastore.

Connect to the database and remove the problematic node as below:

# $PBS_EXEC/pgsql/bin/psql -U pbsdata -p 15007 -d pbs_datastore
pbs_datastore=# set search_path to pbs;
pbs_datastore=# select * from pbs.node;
pbs_datastore=# DELETE FROM pbs.node WHERE pbs.node.nd_name='#cn_vnode1';
pbs_datastore=# select * from pbs.node;
pbs_datastore=# \q
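The node name in the DELETE statement has to be wrapped in plain ASCII single quotes, which are easy to lose when copying from a web page. As a rough sketch, the statement can be generated in the shell instead of typed by hand. node_delete_sql is a hypothetical helper; the table pbs.node and the column nd_name are taken from the steps above.

```shell
# Sketch: build the DELETE statement with plain ASCII quotes.
# node_delete_sql is a hypothetical helper, not part of PBS.
node_delete_sql() {
    # Double any single quote in the name so it stays inside the SQL string literal.
    name=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "DELETE FROM pbs.node WHERE nd_name = '%s';\n" "$name"
}
```

For example, node_delete_sql '#cn_vnode1' | psql -U pbsdata -p 15007 -d pbs_datastore would send the statement to the datastore. The single quotes around the argument keep the shell from treating # as a comment.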

Thank you


#3

Hi adarsh,

Thanks for your reply.
I lack SQL knowledge,
so my description below may have some issues.
There is no psql binary at $PBS_EXEC/pgsql/bin/psql, but there is one at /usr/bin/psql.
I think both are the same; we just need some kind of SQL client.
The problem is that when I type the command psql -U pbsdata -p 15007 -d
it asks for a password, and I don’t know it.
Is that something I needed to set when I configured pbspro?
I don’t even know whether my installation has the psql user pbsdata or not.

The problem I noticed is that I cannot delete #cn_vnode1 with qmgr, so I removed the datastore dir and reconfigured pbspro, which is a stupid method for now.
If you can explain the psql method in more detail, that would be appreciated.

Thanks


#4

You may use /opt/pbs/sbin/pbs_ds_passwd to change the password for the psql database. The remaining steps that @adarsh outlined are correct.

If you don’t mind starting from scratch, you may also do the following…

  1. Stop all PBS Pro services
  2. Delete /var/spool/pbs (the PBS_HOME directory)
  3. Start PBS and reconfigure the services

Keep in mind, this process will delete all configuration data from PBS Pro and you’ll effectively be starting with a fresh installation. Use this as a last resort.
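If starting from scratch is the route taken, a slightly safer variant of step 2 is to move the directory aside rather than delete it, so the old configuration can still be inspected later. This is only a sketch: set_aside is a hypothetical helper, and it assumes the PBS services are already stopped.

```shell
# Sketch: move a directory aside instead of deleting it outright.
# set_aside is a hypothetical helper, not part of PBS.
set_aside() {
    dir=$1
    # Strip a trailing slash and append a timestamped suffix.
    backup="${dir%/}.bak.$(date +%Y%m%d%H%M%S)"
    # Print the new location only if the move succeeded.
    mv -- "$dir" "$backup" && printf '%s\n' "$backup"
}
```

With the services stopped, set_aside /var/spool/pbs would rename the PBS_HOME directory and print where it went. The backup can be removed once the fresh installation is confirmed to work.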


#5

You could also try: qmgr -c "delete node @default"

Note: This will delete all of the nodes you’ve created.
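Because this removes every node, it may be worth saving the other node definitions first. The sketch below filters a saved qmgr listing so that the unwanted node is left out. drop_node is a hypothetical helper, and it assumes the listing uses lines shaped like "create node NAME" and "set node NAME attribute = value"; whether vnodes reported by a MoM can be recreated this way has not been checked here.

```shell
# Sketch: drop every qmgr line that refers to one node name.
# drop_node is a hypothetical helper, not part of PBS.
drop_node() {
    # Keep a line unless its second word is "node" and its third word is the given name.
    awk -v name="$1" '!($2 == "node" && $3 == name)'
}
```

One possible use: qmgr -c "print node @default" | drop_node '#cn_vnode1' > nodes.qmgr before the delete, then qmgr < nodes.qmgr afterwards to replay the remaining definitions. The file name nodes.qmgr is arbitrary.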


#6

@mkaro
Thanks for the answer about the psql password.

@lisa-altair
That’s also a solution, and a better alternative to deleting the datastore dir. Thanks.