Failover setup in DAS


#1

Dear All,

I would like to configure a failover setup on DAS storage formatted with the ext4 file system.

Can anyone help me out?

Thanks,
Ans.


#2

Please read this chapter from the PBS Pro Administrator guide:
Making Your Site More Robust
9.2.5.3 Host Configuration for Failover on Linux


#3

I have gone through the steps mentioned, but I get an error when I configure the setup using the ext4 file system.

I have also tried xfs. It gives an error when the primary goes down, and the secondary is not able to continue the jobs.

Thanks,
Ans


#4

Thank you Ans. Please share your configuration for the shared PBS_HOME directory (including whether file locking is enabled on this share), and whether the primary and the secondary PBS servers each work independently:

  • Stop the secondary PBS services, start the primary PBS services, and check whether the primary works.
  • Stop the secondary PBS services, stop the primary PBS services, start the secondary PBS services, and check whether the secondary works.

Please share the error logs, and also let us know:

  • How do you initiate the primary failure here?
  • What do you mean by "primary goes down"? Are the services down, is the system shut down (powered off), is the network down, etc.?
  • Which PBS services are active on the secondary when the primary goes down?
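
To collect the configuration details asked for above, something like the following could be run on both hosts. This is only a sketch: it assumes the failover settings are stored as KEY=VALUE lines in pbs.conf under the names PBS_HOME, PBS_PRIMARY and PBS_SECONDARY (please verify against the Administrator guide), and the hostnames in the sample text are made up for illustration.

```python
# Sketch: list failover-related entries from pbs.conf-style text.
# Assumption: the failover keys are PBS_HOME, PBS_PRIMARY and PBS_SECONDARY.
REQUIRED_KEYS = ("PBS_HOME", "PBS_PRIMARY", "PBS_SECONDARY")

# Hypothetical sample; the hostnames are placeholders, not from this thread.
SAMPLE_CONF = """\
# example pbs.conf
PBS_HOME=/var/spool/pbs
PBS_PRIMARY=primary.example.com
PBS_SECONDARY=secondary.example.com
"""


def parse_pbs_conf(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_failover_keys(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


settings = parse_pbs_conf(SAMPLE_CONF)
for key in REQUIRED_KEYS:
    print(key, "=", settings.get(key, "<not set>"))
print("missing:", missing_failover_keys(settings))
```

To use it on a real host, replace SAMPLE_CONF with the contents of the pbs.conf file on that host and compare the output from the primary and the secondary.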

#5

Thank you adarsh.
PBS_HOME=/var/spool/pbs, which is on the DAS storage formatted with ext4.
I did not enable any file locking.

To test the setup, I shut down the primary and killed the PBS services. The jobs still show as running, but they never complete.

The setup works fine when PBS_HOME is an NFS partition exported from another server.
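
Since the behaviour differs between the DAS and NFS cases, it may help to confirm on each host which file system PBS_HOME actually sits on. Below is a rough sketch that looks up the mount point and file system type for a path in /proc/mounts-style text. The sample mount table is invented for illustration, and the parsing ignores escaped spaces in mount points.

```python
import os

# Hypothetical mount table in /proc/mounts format (device, mount point, type, ...).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /var/spool/pbs ext4 rw,relatime 0 0
fileserver:/export/pbs /mnt/pbs nfs4 rw,relatime 0 0
"""


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path."""
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest mount point; later entries win ties.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_network_fs(fs_type):
    """True for the NFS types; local types such as ext4 and xfs return False."""
    return fs_type in ("nfs", "nfs4")


mount_point, fs_type = fs_type_for("/var/spool/pbs", SAMPLE_MOUNTS)
print(mount_point, fs_type, "network" if is_network_fs(fs_type) else "local")
```

On a Linux host, pass the contents of /proc/mounts instead of SAMPLE_MOUNTS. If PBS_HOME turns out to be a local ext4 or xfs file system on the primary, the secondary needs some way to see the same data after the primary fails, because these file systems are not designed to be mounted read-write by two hosts at the same time.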

Thanks,
Ans.


#6

Is there any heartbeat or other mechanism that will attach this storage to the secondary when the primary fails?
URL:

This is more commonly used if the above approach is expensive.

Thank you


#7

Thank you adarsh.

As of now there is no heartbeat mechanism configured. If one is required, I will need to configure it just for this.

Kindly let me know how I can use an ext4/xfs file system as PBS_HOME for a PBS failover setup.

Thanks,
Ans.


#8

There isn't any solution specific to a particular file system:

  • Basically, we have to make sure PBS_HOME is used by only one active system (primary or secondary) at any point in time.
  • There should also be some global file locking mechanism to keep the split-brain situation at bay, as we do not want both the primary and the secondary working at the same time.
  • An NFS solution would be the way to go.
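
To check whether file locking works on the share, a small probe like the one below can be run from both hosts against the same file. It is only a sketch using POSIX advisory locks through Python's fcntl module. It is not how PBS itself takes its locks, and the lock file name is a made-up example.

```python
import fcntl
import os

# Hypothetical probe file; place it on the shared file system being tested.
LOCK_PATH = "/var/spool/pbs/failover_probe.lock"


def try_exclusive_lock(path):
    """Try a non-blocking exclusive lock.

    Returns the open file descriptor on success, or None if another
    process already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release_lock(fd):
    """Release the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Call try_exclusive_lock(LOCK_PATH) on the first host and keep the process running, then call it on the second host. If the second call also succeeds while the first still holds the lock, locks are not being shared between the hosts, and nothing on that file system would stop both servers from becoming active together.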

PBS Pro 18.2 Administrator guide: 9.2.4.6 Shared Filesystem
The filesystem you use for the machines managed by PBS should be highly reliable. We recommend, in this order, the following filesystems:

  • HA DAS
  • DAS, such as xfs or gfs
  • HA NFS
  • NFS